Thursday, June 1, 2023

OLVM - Troubleshoot VM startup issue

 




Intro

OLVM is a powerful engine that reduces the administration overhead of managing multiple KVM hosts. As engineers, you are well aware of maintenance activities and VM recovery procedures. VM changes are sensitive, and there are cases where you may face startup issues.

In this article, I will elaborate on an issue I faced while moving a VM from one bond to another.

This environment was configured with two bonded networks. 

  • Management network - This network transfers all the storage traffic and carries the cluster heartbeat.
  • VM-Network - The network configured to access the VMs.

Note: We created a new VLAN on the management network and tried to move the VM to the management network. This VM was used as a proxy server for Veeam backups.

The procedure for changing the network on a VM (a scripted equivalent is sketched below):

  • Shut down the VM.
  • Change the network.
  • Reboot the VM.
Somehow, this newly created network did not work as expected.
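
For reference, the same shut-down / change-network / start sequence can also be scripted against the OLVM (oVirt) REST API. The sketch below is a minimal example only; the engine URL, credentials, VM ID, NIC ID, and vNIC profile ID are all hypothetical placeholders.

# Shut down the VM (hypothetical engine URL, credentials, and IDs throughout)
curl -ks -u admin@internal:password -H 'Content-Type: application/xml' \
  -X POST -d '<action/>' \
  'https://olvm.example.com/ovirt-engine/api/vms/<vm-id>/shutdown'

# Point the VM's NIC at a different vNIC profile (i.e. a different logical network)
curl -ks -u admin@internal:password -H 'Content-Type: application/xml' \
  -X PUT -d '<nic><vnic_profile id="<profile-id>"/></nic>' \
  'https://olvm.example.com/ovirt-engine/api/vms/<vm-id>/nics/<nic-id>'

# Start the VM again
curl -ks -u admin@internal:password -H 'Content-Type: application/xml' \
  -X POST -d '<action/>' \
  'https://olvm.example.com/ovirt-engine/api/vms/<vm-id>/start'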

Note: Do not make any changes to the management network; if changes are made, you will need to reinstall the KVM host.

https://ovirt.org/documentation/administration_guide/index.html#Searching_for_Networks




KVM Event after a management network change.


We made the change on both nodes to keep them identical, but the network change was not successful on both nodes.

KVM - Node-01


KVM - Node-02




Troubleshoot VM startup issue

We enabled port mirroring on the network side, changed the network on the VM, and rebooted the server. After the restart, the VM state was not updated on the OLVM engine.



Getting OLVM to reflect the correct VM state


As the VM does not reflect the correct state in the OLVM URL, we need to restart the services below at the KVM level to restore a clean service status.

Note: If fencing is configured, please uncheck (disable) it during the restart activity.
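
Fencing is normally unchecked under the host's Power Management settings in the Administration Portal. If you prefer to script it, the REST API offers one possible way; the engine URL, credentials, and host ID below are hypothetical placeholders.

# Temporarily disable power management (fencing) on a host (hypothetical IDs)
curl -ks -u admin@internal:password -H 'Content-Type: application/xml' \
  -X PUT -d '<host><power_management><enabled>false</enabled></power_management></host>' \
  'https://olvm.example.com/ovirt-engine/api/hosts/<host-id>'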

oVirt services explained


VDSMD: vdsmd is the service that manages the KVM guest VMs on the host.

Vdsm manages and monitors the host’s storage, memory, and networks as well as virtual machine creation, other host administration tasks, statistics gathering, and log collection.

For more details, read: https://www.ovirt.org/develop/developer-guide/vdsm/vdsm.html
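
To see what vdsm itself reports on a host, you can query it directly with the vdsm-client utility that ships with vdsm on oVirt/OLVM hosts; two read-only examples (assuming the vdsm-client package is installed):

# List the VMs that vdsmd currently tracks on this host
vdsm-client Host getVMList

# Show host statistics as seen by vdsm
vdsm-client Host getStats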

MOM: MoM (Memory Overcommitment Manager) is a policy-driven tool that manages memory overcommitment on the host, for example through ballooning and KSM.

For more details, read: https://www.ovirt.org/develop/projects/mom.html


systemctl status vdsmd.service

Expected output



[root@KVM01 vdsm]# systemctl status vdsmd.service
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2023-02-14 09:10:14 EST; 2 weeks 2 days ago
  Process: 4814 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 5099 (vdsmd)
    Tasks: 90 (limit: 3355442)
   Memory: 272.4M
   CGroup: /system.slice/vdsmd.service
           ├─   5099 /usr/bin/python3 /usr/share/vdsm/vdsmd
           ├─  20066 /usr/libexec/ioprocess --read-pipe-fd 69 --write-pipe-fd 68 --max-threads 10 --max-queued-requests 10
           ├─  20090 /usr/libexec/ioprocess --read-pipe-fd 86 --write-pipe-fd 85 --max-threads 10 --max-queued-requests 10
           ├─  20097 /usr/libexec/ioprocess --read-pipe-fd 90 --write-pipe-fd 89 --max-threads 10 --max-queued-requests 10
           ├─2026456 /usr/libexec/ioprocess --read-pipe-fd 53 --write-pipe-fd 52 --max-threads 10 --max-queued-requests 10
           ├─2026460 /usr/libexec/ioprocess --read-pipe-fd 59 --write-pipe-fd 58 --max-threads 10 --max-queued-requests 10
           └─2026466 /usr/libexec/ioprocess --read-pipe-fd 73 --write-pipe-fd 72 --max-threads 10 --max-queued-requests 10

Feb 28 13:22:03 KVM01.local.ca vdsm[5099]: ERROR FINISH destroy error=Found not_in_hw, expected ('???',)
Feb 28 13:41:58 KVM01.local.ca vdsm[5099]: ERROR FINISH destroy error=Found not_in_hw, expected ('???',)
Feb 28 14:02:32 KVM01.local.ca vdsm[5099]: ERROR FINISH destroy error=Found not_in_hw, expected ('???',)

(Each ERROR line is followed by the same Python traceback, ending in vdsm.network.tc._parser.TCParseError: Found not_in_hw, expected ('???',); the full traceback is reproduced under "Complete log output" below.)

First, note down the current service status, and then restart the services on the node where the VM was running.

Note down the current service status



systemctl status -n 0 libvirtd mom-vdsm vdsmd supervdsmd
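
Optionally, capture that output to a timestamped file so it can be compared with the status after the restart (a simple convenience, not a required step):

# Save the pre-restart service status for later comparison
systemctl status -n 0 libvirtd mom-vdsm vdsmd supervdsmd > /tmp/olvm-services-$(date +%F-%H%M).log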

Restart services



systemctl restart libvirtd mom-vdsm vdsmd supervdsmd
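
A quick loop can confirm that all four services came back up before reading the full status output below:

# Print the active/inactive state of each service
for s in libvirtd mom-vdsm vdsmd supervdsmd; do
  printf '%s: %s\n' "$s" "$(systemctl is-active "$s")"
done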

Validate the service status



[root@KVM01 ~]# systemctl status -n 0 libvirtd mom-vdsm vdsmd supervdsmd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/libvirtd.service.d
           └─unlimited-core.conf
   Active: active (running) since Thu 2023-03-02 14:32:43 EST; 4min 24s ago
     Docs: man:libvirtd(8)
           https://libvirt.org
 Main PID: 2031978 (libvirtd)
    Tasks: 21 (limit: 32768)
   Memory: 26.8M
   CGroup: /system.slice/libvirtd.service
           └─2031978 /usr/sbin/libvirtd

● mom-vdsm.service - MOM instance configured for VDSM purposes
   Loaded: loaded (/usr/lib/systemd/system/mom-vdsm.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2023-03-02 14:32:53 EST; 4min 15s ago
 Main PID: 2032631 (momd)
    Tasks: 7 (limit: 3355442)
   Memory: 21.0M
   CGroup: /system.slice/mom-vdsm.service
           └─2032631 /usr/libexec/platform-python /usr/sbin/momd -c /etc/vdsm/mom.conf

● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2023-03-02 14:32:48 EST; 4min 20s ago
  Process: 2032307 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 2032368 (vdsmd)
    Tasks: 72 (limit: 3355442)
   Memory: 89.5M
   CGroup: /system.slice/vdsmd.service
           ├─2032368 /usr/bin/python3 /usr/share/vdsm/vdsmd
           ├─2032664 /usr/libexec/ioprocess --read-pipe-fd 57 --write-pipe-fd 56 --max-threads 10 --max-queued-requests 10
           ├─2032720 /usr/libexec/ioprocess --read-pipe-fd 64 --write-pipe-fd 63 --max-threads 10 --max-queued-requests 10
           └─2033019 /usr/libexec/ioprocess --read-pipe-fd 88 --write-pipe-fd 87 --max-threads 10 --max-queued-requests 10

● supervdsmd.service - Auxiliary vdsm service for running helper functions as root
   Loaded: loaded (/usr/lib/systemd/system/supervdsmd.service; static; vendor preset: disabled)
   Active: active (running) since Thu 2023-03-02 14:32:43 EST; 4min 24s ago
 Main PID: 2031998 (supervdsmd)
    Tasks: 15 (limit: 3355442)
   Memory: 43.1M
   CGroup: /system.slice/supervdsmd.service
           └─2031998 /usr/bin/python3 /usr/share/vdsm/supervdsmd --sockfile /run/vdsm/svdsm.sock
[root@KVM01 ~]#

Log in to the OLVM URL and validate the VM status

Once you restart the services at the KVM level, log in to OLVM to check the VM status.
The VM should show the correct status before you perform any further activities.
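
The same check can also be done from the command line; a minimal sketch against the REST API, with a hypothetical engine URL, credentials, and VM ID (the API returns XML by default):

# Fetch the VM object and pull out its <status> element (hypothetical IDs)
curl -ks -u admin@internal:password \
  'https://olvm.example.com/ovirt-engine/api/vms/<vm-id>' \
  | grep -o '<status>[^<]*</status>'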



Error



This VM was not coming up because of port mirroring. The VM tried to unset port mirroring, but the operation was unsuccessful.

VM startup log:

line 2, in unsetPortMirroring 

Complete log output


03-01 11:56:17,034-0500 ERROR (jsonrpc/2) [api] FINISH destroy error=Found not_in_hw, expected ('???',) (api:134)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 302, in destroy
    res = self.vm.destroy(gracefulAttempts)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5453, in destroy
    result = self.doDestroy(gracefulAttempts, reason)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5472, in doDestroy
    return self.releaseVm(gracefulAttempts)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5354, in releaseVm
    nic.name)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in 
    **kwargs)
  File "", line 2, in unsetPortMirroring ------ Error
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
vdsm.network.tc._parser.TCParseError: Found not_in_hw, expected ('???',)
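
As the traceback shows (vdsm.network.tc._parser.TCParseError), vdsm implements port mirroring with Linux tc (traffic control) rules, and here it failed while parsing the tc output (the not_in_hw keyword) during cleanup. To inspect the mirroring rules manually on the host, something like the following can be used; the device name is a placeholder for the bridge/network used for mirroring:

# Show the qdiscs and ingress filters on the mirrored network device
# (replace <network> with the actual bridge/network device name)
tc qdisc show dev <network>
tc filter show dev <network> parent ffff: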

Solution

We need to perform the activities below to overcome this VM startup issue.

  • Detach the network card from the VM and try to start the VM (a scripted alternative is sketched after this list).
  • Attach a different network and try to start the VM.
  • To unset the port mirroring, recreate the VLAN with a different name.
  • Attach the newly created VLAN and start the VM.
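
If you want to script the first step instead of using the Administration Portal, the vNIC can be unplugged through the REST API; a minimal sketch with a hypothetical engine URL, credentials, and IDs:

# Unplug (deactivate) the stuck vNIC, then try to start the VM (hypothetical IDs)
curl -ks -u admin@internal:password -H 'Content-Type: application/xml' \
  -X POST -d '<action/>' \
  'https://olvm.example.com/ovirt-engine/api/vms/<vm-id>/nics/<nic-id>/deactivate'

curl -ks -u admin@internal:password -H 'Content-Type: application/xml' \
  -X POST -d '<action/>' \
  'https://olvm.example.com/ovirt-engine/api/vms/<vm-id>/start'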

