Monday, June 5, 2023

OLVM - Disk state stuck in finalizing state

 




Intro

It is essential to have a proper backup mechanism for your virtualization infrastructure. The backup and recovery method should also be tested at least once every three months to validate that everything works as expected, and documenting the recovery procedure helps avoid surprises when a real recovery is needed. Organizations should be ready to address unexpected failures at any time.

For Oracle Linux Virtualization Manager (OLVM) 4+ environments, you can use API v4 to invoke all backup-related tasks. The import/export mode defines how backups and restores are performed. OLVM (with API v4) supports three modes:

1. Disk attachment:

Exports VM metadata (in OVF format) with separate disk files (in RAW format) via a Proxy VM with the Node installed.

  • Supports OLVM 4.0+
  • No incremental backup
  • Proxy VM required in each cluster - used for the disk attachment process

2. Disk image transfer:

Exports VM metadata (in OVF format) with disk snapshot chains as separate files (in QCOW2 format).

  • Supports OLVM 4.2+/oVirt 4.2.3+
  • Supports incremental backup
  • Disk images are transferred directly from the API (no Proxy VM required)

3. SSH transfer:

Assumes that all data transfers are made directly from the hypervisor over the SSH protocol.

The URL below lets you filter the backup tools that support Oracle Linux Virtualization Manager.

Supported third-party backup tools:

https://apexapps.oracle.com/pls/apex/f?p=10263:17::::::



Figure 1: Third-party backup tools


In some cases, when the network connection is interrupted during a disk image transfer, the disk gets stuck in the finalizing state.

Note: If a disk is in the finalizing state, you cannot put the KVM host into maintenance mode.


Figure 2: Trying to put the KVM host into maintenance mode.

For a clear understanding of disk image transfer, refer to: https://storware.gitbook.io/backup-and-recovery/protecting-virtual-machines/virtual-machines/oracle-linux-virtualization-manager

Disk image transfer API:

This API allows individual snapshots to be exported directly from the OLVM manager. So instead of installing multiple Proxy VMs, you can have a single external Node installation that invokes the APIs via the OLVM manager.
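
To see which transfers the manager currently knows about, the image transfers collection of API v4 can be queried over HTTPS. This is only a minimal sketch: the engine FQDN, the API user, and the CA certificate file are assumptions you need to supply for your own environment, and the function names are made up for illustration.

```shell
# Hypothetical helper names; adjust the arguments to your environment.

# Build the image transfers collection URL for an engine FQDN ($1).
imagetransfers_url() {
    printf 'https://%s/ovirt-engine/api/imagetransfers' "$1"
}

# List the current image transfers as XML.
# $1 = engine FQDN, $2 = API user (for example admin@internal),
# $3 = CA certificate file used to verify the engine.
# curl prompts for the password because --user is given without one.
list_image_transfers() {
    curl --silent --show-error --cacert "$3" --user "$2" \
         --header 'Accept: application/xml' \
         "$(imagetransfers_url "$1")"
}
```

Each transfer in the response includes its phase, which gives a second view of the state that is read from the database later in this article.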


In this article, I will cover how to recover when a disk is stuck in the finalizing state.

The relevant My Oracle Support (MetaLink) note is: OLVM: Unable to put KVM host to maintenance mode due to Image transfer in progress (Doc ID 2915392.1)

Figure 3 shows how it looks when disks are stuck in the finalizing state.



Figure 3: Disk stuck in finalizing state.


The best approach is to query the image transfer state in the OLVM engine database. This shows which disks are stuck in the finalizing state.

Note: All commands should be executed on the OLVM engine server.


Identify the issue



[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
              command_id              | phase |               disk_id                |        last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
 dcc47178-ebb1-47c1-900b-bc9753e12378 |     7 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
 77a820ab-c580-4b46-9c0c-22102a0ce706 |     7 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)

[root@local-olvm-engine ~]#
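
On a busy engine the image_transfers table can also hold transfers that are still legitimately running, so it may help to narrow the query to phase 7 only. The following is a minimal sketch that assumes the same engine-psql.sh wrapper and column names used above; the function names are made up for illustration.

```shell
# Hypothetical helpers (the names are not part of OLVM).
# Path of the wrapper used throughout this article.
ENGINE_PSQL=/usr/share/ovirt-engine/dbscripts/engine-psql.sh

# Print the SQL that lists only transfers in phase 7 (finalizing),
# oldest first.
stuck_transfers_sql() {
    printf '%s' "select command_id, phase, disk_id, last_updated from image_transfers where phase = 7 order by last_updated;"
}

# Run the query on the engine server.
list_stuck_transfers() {
    "$ENGINE_PSQL" -c "$(stuck_transfers_sql)"
}
```

Calling list_stuck_transfers on the engine server should return only the rows that are candidates for the fix described next.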


Solution

As per the My Oracle Support note, you can update an image transfer that is stuck in phase 7 to either phase 9 (failed) or phase 10 (completed), depending on the situation. In this example, both transfers are set to phase 10.


[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE image_transfers SET phase = '10' WHERE command_id = 'dcc47178-ebb1-47c1-900b-bc9753e12378'; "
UPDATE 1
[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE image_transfers SET phase = '10' WHERE command_id = '77a820ab-c580-4b46-9c0c-22102a0ce706'; "
UPDATE 1
[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
              command_id              | phase |               disk_id                |        last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
 dcc47178-ebb1-47c1-900b-bc9753e12378 |    10 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
 77a820ab-c580-4b46-9c0c-22102a0ce706 |    10 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)

[root@local-olvm-engine ~]#
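
Because these statements are typed by hand against a production database, a small wrapper can reduce the risk of a typo. The sketch below is an illustration under assumptions, not part of the Oracle note: it checks that the command_id looks like a UUID, accepts only 9 or 10 as the target phase, and adds an "AND phase = 7" guard so that a row in any other phase is left untouched. The function names are hypothetical.

```shell
# Hypothetical helpers (the names are not part of OLVM).
ENGINE_PSQL=/usr/share/ovirt-engine/dbscripts/engine-psql.sh

# Succeed only if $1 is exactly 36 characters and looks like a UUID.
is_uuid() {
    [ "${#1}" -eq 36 ] &&
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Print an UPDATE for one transfer.
# $1 = command_id, $2 = target phase (9 or 10).
phase_update_sql() {
    is_uuid "$1" || { echo "not a UUID: $1" >&2; return 1; }
    case "$2" in
        9|10) ;;
        *) echo "target phase must be 9 or 10" >&2; return 1 ;;
    esac
    printf "UPDATE image_transfers SET phase = %s WHERE command_id = '%s' AND phase = 7;" "$2" "$1"
}

# Apply the update on the engine server.
set_transfer_phase() {
    sql=$(phase_update_sql "$1" "$2") || return 1
    "$ENGINE_PSQL" -c "$sql"
}
```

For example, set_transfer_phase dcc47178-ebb1-47c1-900b-bc9753e12378 10 would issue the same change as the first UPDATE above, with the extra phase guard.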

Validate

Run the command below to validate the transfer status. The disks should also change to the OK state in the OLVM web interface.


[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
              command_id              | phase |               disk_id                |        last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
 dcc47178-ebb1-47c1-900b-bc9753e12378 |    10 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
 77a820ab-c580-4b46-9c0c-22102a0ce706 |    10 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)
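
If you want to script this check instead of reading the table by eye, the output can be piped through a small filter. This is a minimal sketch that assumes the four-column table layout shown above (phase as the second column); the function name is made up for illustration.

```shell
# Hypothetical helper: read engine-psql.sh table output on stdin and
# print how many data rows still have phase 7.
# Header, separator, and "(N rows)" lines are ignored because their
# second field is missing or not equal to 7.
count_finalizing() {
    awk -F'|' 'NF == 4 && $2 + 0 == 7 { n++ } END { print n + 0 }'
}
```

Piping the validation query above into count_finalizing should print 0 once no transfer is left in the finalizing state.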

Conclusion

When an organization hosts critical VMs in an OLVM virtualization environment, it needs to plan its backup method. There may be situations where you have to recover a VM from backup, so backup and recovery need to be tested and documented.

Resolving the disk state error requires updating the Postgres database directly, so I recommend backing up the OLVM engine before making any changes. It is also better to consult an Oracle engineer to get a more precise understanding before changing the image_transfers phase.
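
One way to take that engine backup is the engine-backup utility that ships with the engine. The sketch below is only an illustration: the target directory, the file naming, and the function names are assumptions, and the directory is expected to exist with enough free space.

```shell
# Hypothetical helpers (the names and file layout are assumptions).

# Build the backup file path. $1 = target directory, $2 = label.
backup_file() {
    printf '%s/engine-backup-%s.bck' "$1" "$2"
}

# Back up the engine (all scopes) into directory $1, using a
# timestamp as the label and writing a log file next to the backup.
backup_engine() {
    stamp=$(date +%Y%m%d-%H%M%S)
    file=$(backup_file "$1" "$stamp")
    engine-backup --mode=backup --scope=all \
        --file="$file" --log="${file%.bck}.log"
}
```

Running backup_engine /var/tmp on the engine server before the UPDATE statements gives you a restore point if the change has unexpected side effects.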
