Monday, July 17, 2023

OLVM : 2 node with Gluster storage with Arbitrated Replicated Volumes

 






Intro 

Virtualization made a significant change in the IT (Information Technology) industry. It helped many organizations use server resources efficiently. Even though cloud technology is emerging, some companies are not ready to move their workloads to the cloud due to data sensitivity and business obstacles. For them, virtualization technology remains the main option to reduce IT infrastructure cost.

For small and medium-scale companies, IT budgets are really tight. With a limited budget, achieving storage stability alongside virtualization is a challenging job. When organizations plan for virtualization, the optimal architecture needs 3 nodes; a 3-node architecture gives proper fencing and high availability. Oracle Corporation combined open-source virtualization components and introduced Oracle Linux Virtualization Manager (OLVM). OLVM has no restriction against a 2-node architecture. When considering storage availability, there are three ways to achieve it with OLVM:

  • Fibre Channel storage - FC data domains.
  • Glusterfs storage - Gluster data domains.
  • iSCSI storage - iSCSI data domains.

For Glusterfs and iSCSI it is a must to have a 10G back-end network for management, because Glusterfs storage replication happens over the management network.

In a two-node architecture, storage stability can be achieved by implementing Glusterfs arbitrated replicated volumes. This mainly addresses avoiding storage split-brain conditions.


In this article, I would like to highlight the implementation steps of the Gluster storage arbiter, along with the implementation prerequisites.


Why Arbiter?

Split-brains in replica volumes 


When a file is in split-brain, there is an inconsistency in either data or metadata (permissions, uid/gid, extended attributes etc.) of the file amongst the bricks of a replica. We do not have enough information to authoritatively pick a copy as being pristine and heal to the bad copies, despite all bricks being up and online. For directories, there is also an entry-split brain where a file inside it has different gfids/ file-type (say one is a file and another is a directory of the same name) across the bricks of a replica.

What is a gluster arbitrator? 

The arbiter volume is a special subset of replica volumes that is aimed at preventing split brains and providing the same consistency guarantees as a normal replica 3 volume without consuming 3x space.
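
For context, a brand-new arbitrated replicated volume is created with a replica 3 arbiter 1 layout, where the third brick holds only metadata. A minimal sketch (the hostnames and brick paths below are illustrative, not the ones used later in this article):

gluster volume create gvol0 replica 3 arbiter 1 \
  kvm01:/bricks/brick1/gvol0 \
  kvm02:/bricks/brick2/gvol0 \
  arbiter-host:/bricks/arb_brick/gvol0
gluster volume start gvol0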

If you want to read up and get the complete picture, try the below-mentioned links: https://docs.gluster.org/en/v3/Administrator%20Guide/arbiter-volumes-and-quorum/

Redhat:  https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/creating_arbitrated_replicated_volumes


Arbitrator disk capacity?

The arbiter stores only the metadata of the files held on the data bricks. For example, when the replicated data brick is 1TB, you need only about 2MB of space on the arbiter side to store the metadata (assuming an average file size of 2GB, as in the calculation below).



minimum arbiter brick size = 4 KB * ( size in KB of largest data brick in volume or replica set / average file size in KB)

minimum arbiter brick size  = 4 KB * ( 1 TB / 2 GB )
                            = 4 KB * ( 1000000000 KB / 2000000 KB )
                            = 4 KB * 500
                            = 2000 KB
                            = 2 MB

Pre-Requisites

  • Host the arbitrator disk outside the OLVM environment.
  • At least a 1G network is needed to replicate metadata to the arbitrator disk.
  • To avoid any storage fencing issues, the OLVM management network should be 10G.
  • Open the required ports for glusterfs communication (see the sketch after this list).
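
On the arbiter host (the OLVM engine server in this example), the Gluster server packages must be installed, glusterd must be running, and the Gluster ports must be open before the KVM nodes can peer with it. A minimal sketch, assuming Oracle Linux 8 with firewalld and the Gluster repository already enabled:

dnf install -y glusterfs-server
systemctl enable --now glusterd

# firewalld ships a predefined "glusterfs" service; alternatively open 24007-24008/tcp
# (glusterd) and the brick port range manually
firewall-cmd --permanent --add-service=glusterfs
firewall-cmd --reload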

Implementation steps

For this example, we are going to host the arbitrator disk on the OLVM engine server.

Partition the disk

Partition the disk using fdisk and create an LVM volume on it. Our other Glusterfs bricks are hosted as LVMs, and I would recommend keeping the brick and arbitrator disk layouts identical.



[root@olvm-engine-01 ~]# lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    0   50G  0 disk
├─sda1        8:1    0    1G  0 part /boot
└─sda2        8:2    0   49G  0 part
  ├─ol-root 252:0    0   44G  0 lvm  /
  └─ol-swap 252:1    0    5G  0 lvm  [SWAP]
sdb           8:16   0  250G  0 disk
sdc           8:32   0  100G  0 disk
sdd           8:48   0  100G  0 disk
sr0          11:0    1 1024M  0 rom
[root@olvm-engine-01 ~]# fdisk /dev/sdc
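
The interactive fdisk dialogue is not shown above; a typical sequence for a single full-disk LVM partition (keystrokes are illustrative, adjust to your disk) would be:

# Inside fdisk /dev/sdc:
#   n   -> new partition, primary, number 1, accept default first/last sector
#   t   -> change partition type, select 8e (Linux LVM)
#   w   -> write the partition table and exit
partprobe /dev/sdc   # re-read the partition table so /dev/sdc1 appears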

Execute lsblk again to verify the new partition layout.



[root@olvm-engine-01 ~]# lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    0   50G  0 disk
├─sda1        8:1    0    1G  0 part /boot
└─sda2        8:2    0   49G  0 part
  ├─ol-root 252:0    0   44G  0 lvm  /
  └─ol-swap 252:1    0    5G  0 lvm  [SWAP]
sdb           8:16   0  250G  0 disk
sdc           8:32   0  100G  0 disk
└─sdc1        8:33   0  100G  0 part
sdd           8:48   0  100G  0 disk
sr0          11:0    1 1024M  0 rom


Setup LVM for arbitrator disk



[root@olvm-engine-01 ~]# pvcreate /dev/sdc1
  Physical volume "/dev/sdc1" successfully created.
  
[root@olvm-engine-01 ~]# pvs
  PV         VG Fmt  Attr PSize    PFree
  /dev/sda2  ol lvm2 a--  <49.00g    0

[root@olvm-engine-01 ~]# vgcreate GFS_DEV_VG /dev/sdc1
  Volume group "GFS_DEV_VG" successfully created

[root@olvm-engine-01 ~]# lvcreate -L 90G -n GFS_DEV_LV GFS_DEV_VG
  Logical volume "GFS_DEV_LV" created.

Create an XFS file system on the arbitrator disk and mount it as a persistent mount point.



[root@olvm-engine-01 ~]# mkfs.xfs -f -i size=512 -L glusterfs /dev/mapper/GFS_DEV_VG-GFS_DEV_LV
meta-data=/dev/mapper/GFS_DEV_VG-GFS_DEV_LV isize=512    agcount=4, agsize=5898240 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=23592960, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=11520, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
[root@olvm-engine-01 ~]#
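
The mount itself is not shown above; a short sketch of creating the mount point and making it persistent via /etc/fstab, assuming the brick path used later in this article (/nodirectwritedata/glusterfs/dev_arb_brick_03):

mkdir -p /nodirectwritedata/glusterfs/dev_arb_brick_03
echo '/dev/mapper/GFS_DEV_VG-GFS_DEV_LV /nodirectwritedata/glusterfs/dev_arb_brick_03 xfs defaults 0 0' >> /etc/fstab
mount -a
df -h /nodirectwritedata/glusterfs/dev_arb_brick_03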


Execute lsblk to validate the partitioned disk and its mount point.



[root@olvm-engine-01 ~]# lsblk
NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                         8:0    0   50G  0 disk
├─sda1                      8:1    0    1G  0 part /boot
└─sda2                      8:2    0   49G  0 part
  ├─ol-root               252:0    0   44G  0 lvm  /
  └─ol-swap               252:1    0    5G  0 lvm  [SWAP]
sdb                         8:16   0  250G  0 disk
sdc                         8:32   0  100G  0 disk
└─sdc1                      8:33   0  100G  0 part
  └─GFS_DEV_VG-GFS_DEV_LV 252:2    0   90G  0 lvm  /nodirectwritedata/glusterfs/dev_arb_brick_03
sdd                         8:48   0  100G  0 disk
└─sdd1                      8:49   0  100G  0 part
sr0                        11:0    1 1024M  0 rom
[root@olvm-engine-01 ~]#


Discover arbitrator node

Now probe the arbitrator node from the KVM hosts.



gluster peer probe olvm-engine-01.oracle.ca  -- Execute on both KVMs 

Expected output after peering



[root@KVM01 ~]# gluster peer probe olvm-engine-01.oracle.ca
peer probe: success

[root@KVM02 ~]# gluster peer probe olvm-engine-01.oracle.ca
peer probe: Host olvm-engine-01.oracle.ca port 24007 already in peer list
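
Optionally, verify the peering from either KVM node; the arbiter host should be listed as connected:

gluster pool list
gluster peer status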

Add arbitrator disk to glusterfs



[root@KVM01 ~]# gluster volume add-brick gvol0 replica 3 arbiter 1 olvm-engine-01.oracle.ca:/nodirectwritedata/glusterfs/arb_brick3/gvol0
volume add-brick: success
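
Once the arbiter brick is added, Gluster starts populating it with metadata for the existing files. It is worth confirming the heal completes before relying on the volume, for example:

gluster volume heal gvol0 info
gluster volume status gvol0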

Validate arbitrator replicated volume



[root@KVM120 ~]#  gluster volume info dev_gvol0

Volume Name: dev_gvol0
Type: Replicate
Volume ID: db1a8a7e-6709-4a1c-8839-ff0ab3cc4ebe
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: KVM120:/nodirectwritedata/glusterfs/dev_brick_01/dev_gvol0
Brick2: KVM121:/nodirectwritedata/glusterfs/dev_brick_02/dev_gvol0
Brick3: sofe-olvm-01.sofe.ca:/nodirectwritedata/glusterfs/dev_arb_brick_03/dev_gvol0 (arbiter)
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.data-self-heal: on
cluster.metadata-self-heal: on
cluster.entry-self-heal: on
cluster.favorite-child-policy: mtime

Conclusion

An OLVM two-node architecture with Glusterfs data domains has a high chance of running into split-brain issues. These split-brain issues can be completely avoided by implementing arbitrated replicated volumes.

An important factor is that you do not need a huge disk to implement the arbitrator brick. But remember that the OLVM management network should be 10G to support brick replication, because whatever changes happen at the storage level are replicated via the management network.


Friday, July 14, 2023

Virtualized ODA 19.13 - Scale down cpu core count on oda_base

 




Intro

I hope the previous blog posts were useful for addressing ODA upgrade issues on the virtualized platform.
If you are planning the 19.8 journey, please read the below-mentioned article to ease that journey.

Article about the upgrade and issues: 

        
19.13 will be the last release for the virtualized ODA platform; after that, Oracle is discontinuing OVM and moving to KVM-based virtualization. This is going to be a game changer for ODA performance.

Being compliant with your Oracle license is very important. When configuring a virtualized ODA, set the core count as per the purchased license core count.

If you misconfigured this, there is a way to correct it, but you need to take complete downtime to perform the activity.

In this article, I will illustrate how to adjust the CPU core count on a virtualized ODA.

Oracle documentation: https://docs.oracle.com/cd/E75549_01/doc.121/e74838/GUID-98E3071C-8278-420D-86F5-72E3B950918E.htm#CMTAR-GUID-98E3071C-8278-420D-86F5-72E3B950918


Adjust core count 

The ODA_BASE core count can be adjusted from dom0. Before making any changes, make sure to take a backup of ODA_BASE. For this CPU downscale we need complete downtime.

Pre-requisite steps (a shutdown sketch follows this list):

  • Gather the current CPU core count and configuration details.
  • Shut down all the VMs.
  • Shut down all the shared repositories.
  • Shut down ODA_BASE on both nodes.
  • Take a backup of ODA_BASE.
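
A sketch of the shutdown sequence run from dom0 (the VM and repo names are illustrative; repeat per VM, repo, and node as needed):

oakcli stop vm my_vm                # stop every running VM (illustrative name)
oakcli stop repo odarepo2 -node 0   # stop every shared repo on each node
oakcli stop repo odarepo2 -node 1
oakcli stop oda_base                # run on dom0 of both nodes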

Steps to perform:

  • Adjust the CPU core count for ODA_BASE (this will change the CPU core count on both nodes).
  • Start up ODA_BASE.
  • Validate the ODA_BASE running services (Oracle cluster and database services).

CPU Core count configuration

Note: Make sure ODA_BASE is down on both nodes before performing this activity.

Validate the ODA-BASE status.

First, gather the ODA-BASE details to note down the current CPU core count.

    

[root@ecl-oda-DOM0-1 ~]# oakcli show oda_base
ODA base domain
ODA base CPU cores      :12
ODA base domain memory  :80
ODA base template       :/OVS/NEVER_DELETE_12.1.2.12.tar.gz
ODA base vlans          :['net1', 'net2', 'vbr1']
ODA base current status :Running

[root@ecl-oda-DOM0-1 ~]# oakcli show oda_base
ODA base domain
ODA base CPU cores      :12
ODA base domain memory  :80
ODA base template       :/OVS/NEVER_DELETE_12.1.2.12.tar.gz
ODA base vlans          :['net1', 'net2', 'vbr1']
ODA base current status :Stopped
[root@ecl-oda-DOM0-1 ~]#
      

As per this example, I'm reducing the ODA_BASE CPU core count from 12 to 10.

The below-mentioned output shows the interactive oakcli configure oda_base session used to select the new core count while keeping the existing memory setting.



[root@ecl-oda-DOM0-1 Repo_backup]# oakcli configure oda_base
Core Licensing Options:
        1. 2 CPU Cores
        2. 4 CPU Cores
        3. 6 CPU Cores
        4. 8 CPU Cores
        5. 10 CPU Cores
        6. 12 CPU Cores
        7. 14 CPU Cores
        8. 16 CPU Cores
        9. 18 CPU Cores
        10. 20 CPU Cores
        11. 22 CPU Cores
        12. 24 CPU Cores
        13. 26 CPU Cores
        14. 28 CPU Cores
        15. 30 CPU Cores
        16. 32 CPU Cores
        17. 34 CPU Cores
        18. 36 CPU Cores
        Current CPU Cores       :12
        Selection[1 .. 18](default 36 CPU Cores)        : 5
        ODA base domain memory in GB(min 16, max 491)(Current Memory 80G)[default 160]  : 80G
WARNING: Please enter a valid option for memory size
        ODA base domain memory in GB(min 16, max 491)(Current Memory 80G)[default 160]  : 80
Additional vlan networks to be assigned to oda_base ? (y/n) [n]:
Vlan network to be removed from oda_base ? (y/n) [n]:
INFO: Configure VNC password for oda_base
Please input your password:
ERROR: Invalid password, password should have uppercase lowercase and special characters and numbers
ERROR: password length should be longer than 8
ERROR: please enter a valid password
Please input your password:
Please confirm your password:
INFO: Node 0:Configured oda base pool
INFO: Node 1:Configured oda base pool
INFO: Node 0:ODA Base configured with new memory
INFO: Node 0:ODA Base configured with new vcpus
INFO: Node 0:ODA Base configured with new VNC passwd
INFO: Changes will be incorporated after the domain is restarted on Node 0.
INFO: Node 1:ODA Base configured with new memory
INFO: Node 1:ODA Base configured with new vcpus
INFO: Node 1:ODA Base configured with new VNC passwd
INFO: Changes will be incorporated after the domain is restarted on Node 1.
INFO: Updating /etc/sysctl.conf in oda_base domain with parameter "vm.nr_hugepages=21520"
ERROR: Odabase Agent on node 0 is down
INFO: Please update /etc/sysctl.conf in oda_base domain on node 0 with parameter "vm.nr_hugepages=21520"
ERROR: Odabase Agent on node 1 is down
INFO: Please update /etc/sysctl.conf in oda_base domain on node 1 with parameter "vm.nr_hugepages=21520"
INFO: Updating /etc/security/limits.conf in oda_base domain with new memlock value 60000000
ERROR: Odabase Agent on node 0 is down
INFO: Please update /etc/security/limits.conf in oda_base domain on node 0 with new memlock value 60000000
ERROR: Odabase Agent on node 1 is down
INFO: Please update /etc/security/limits.conf in oda_base domain on node 1 with new memlock value 60000000
You have new mail in /var/spool/mail/root
[root@ecl-oda-DOM0-1 Repo_backup]#


Validate Core Count

The ERROR lines in the output above about the Odabase agent being down are expected, since ODA_BASE is shut down on both nodes; as the messages indicate, the vm.nr_hugepages and memlock values may need to be updated manually inside ODA_BASE once it is back up. After the change, run the show oda_base command to validate the new CPU core count.



[root@ecl-oda-DOM0-0 Repo_backup]# oakcli show oda_base
ODA base domain
ODA base CPU cores      :10
ODA base domain memory  :80
ODA base template       :/OVS/NEVER_DELETE_12.1.2.12.tar.gz
ODA base vlans          :['net1', 'net2', 'vbr1']
ODA base current status :Stopped
[root@ecl-oda-DOM0-0 Repo_backup]# oakcli start oda_base


[root@ecl-oda-DOM0-1 Repo_backup]# oakcli start oda_base
INFO: Starting ODA base domain...
INFO: Started ODA base domain
[root@ecl-oda-DOM0-1 Repo_backup]# oakcli show oda_base
ODA base domain
ODA base CPU cores      :10
ODA base domain memory  :80
ODA base template       :/OVS/NEVER_DELETE_12.1.2.12.tar.gz
ODA base vlans          :['net1', 'net2', 'vbr1']
ODA base current status :Running
[root@ecl-oda-DOM0-1 Repo_backup]#
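
Once ODA_BASE is back up on both nodes, it is worth confirming from inside ODA_BASE that Grid Infrastructure and the databases came up cleanly; a sketch (the Grid home path below is illustrative):

/u01/app/12.1.0.2/grid/bin/crsctl check cluster -all   # cluster stack status on both nodes
/u01/app/12.1.0.2/grid/bin/crsctl stat res -t          # database and listener resources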


Validate settings via dom0 log

I would recommend checking the dom0 oakcli logs, which give a clear picture of the CPU change. Note that the CPU pool in the log below is configured with 20 vCPUs, which corresponds to the 10 cores selected (two threads per core).



2023-07-12 15:45:37,800 [Cmd_EnvId] [Thread-4021] [odaBaseActions] [INFO] [176] Getting existing vlan list for ODA BASE
2023-07-12 15:45:37,801 [Cmd_EnvId] [Thread-4021] [odaBaseActions] [INFO] [186] Existing Vlan List ['net1', 'net2', 'vbr1']
2023-07-12 15:46:00,194 [Cmd_EnvId] [Thread-4023] [odaBaseActions] [INFO] [176] Getting existing vlan list for ODA BASE
2023-07-12 15:46:00,194 [Cmd_EnvId] [Thread-4023] [odaBaseActions] [INFO] [186] Existing Vlan List ['net1', 'net2', 'vbr1']
2023-07-12 15:47:56,907 [Cmd_EnvId] [Thread-4024] [odaBaseActions] [INFO] [130] ODA Base pool being configured with 20 cpus
2023-07-12 15:47:56,907 [Cmd_EnvId] [Thread-4024] [cpupoolactions] [DEBUG] [90] Checking if the pool odaBaseCpuPool exists
2023-07-12 15:47:56,908 [Cmd_EnvId] [Thread-4024] [cpupoolactions] [DEBUG] [93] Updating cpu pool odaBaseCpuPool to 20 cpus
2023-07-12 15:47:56,908 [Cmd_EnvId] [Thread-4024] [oakCpuPoolManage] [DEBUG] [75] Trying to set cpu pool odaBaseCpuPool to 20 cpus
2023-07-12 15:47:56,909 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [1327] getNumCpusDom0 called
2023-07-12 15:47:56,909 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [723] Executing command xm vcpu-list |grep Domain-0 | wc -l
2023-07-12 15:47:57,179 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [1334] Number of cpus pinned to dom0 is 20
2023-07-12 15:47:57,180 [Cmd_EnvId] [Thread-4024] [oakCpuPoolManage] [DEBUG] [81] Number of cpus allocated to dom0 is 20
2023-07-12 15:47:57,180 [Cmd_EnvId] [Thread-4024] [cpupoolDb] [DEBUG] [77] cpu 43 assigned to default-unpinned-pool pool
2023-07-12 15:47:57,180 [Cmd_EnvId] [Thread-4024] [cpupoolDb] [DEBUG] [77] cpu 42 assigned to default-unpinned-pool pool
2023-07-12 15:47:57,181 [Cmd_EnvId] [Thread-4024] [cpupoolDb] [DEBUG] [77] cpu 41 assigned to default-unpinned-pool pool
2023-07-12 15:47:57,181 [Cmd_EnvId] [Thread-4024] [cpupoolDb] [DEBUG] [77] cpu 40 assigned to default-unpinned-pool pool
2023-07-12 15:47:57,181 [Cmd_EnvId] [Thread-4024] [oakCpuPoolManage] [DEBUG] [124] Completed allocation of cpus
2023-07-12 15:47:57,182 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [181] Created xml string 0Cpupool configured
2023-07-12 15:47:57,184 [Cmd_EnvId] [Thread-4024] [cpupoolactions] [DEBUG] [53] Cpulist string for cpu pool odaBaseCpuPool is 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39
2023-07-12 15:47:57,184 [Cmd_EnvId] [Thread-4024] [cpupoolactions] [DEBUG] [62] Updating vmcfg file of  VM oakDom1 with cpus: 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39
2023-07-12 15:47:57,184 [Cmd_EnvId] [Thread-4024] [odaBaseActions] [INFO] [101] ODA Base configure operation called
2023-07-12 15:47:57,184 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [567] Updating cfg params values
2023-07-12 15:47:57,185 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [540] Post conversion, the configuration parameter updated to {'cpus': '20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39'}
2023-07-12 15:47:57,185 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [578] Updating parameter cpus to 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39
2023-07-12 15:47:57,185 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [624] Writing vmcfg to /OVS/Repositories/odabaseRepo/VirtualMachines/oakDom1/vm.cfg
2023-07-12 15:47:57,185 [Cmd_EnvId] [Thread-4024] [agentutils] [DEBUG] [181] Created xml string 0ODA Base configured vm
2023-07-12 15:48:07,223 [Check_Shared_Repo] [MainThread] [repoactions] [INFO] [162] Checking for shared repos
2023-07-12 15:48:07,224 [Cmd_EnvId] [Thread-4025] [odaBaseActions] [INFO] [56] Configuring memory for ODA BASE
2023-07-12 15:48:07,239 [Cmd_EnvId] [Thread-4025] [agentutils] [DEBUG] [488] Updating the memory parameter to 81920
2023-07-12 15:48:07,240 [Cmd_EnvId] [Thread-4025] [agentutils] [DEBUG] [624] Writing vmcfg to /OVS/Repositories/odabaseRepo/VirtualMachines/oakDom1/vm.cfg
2023-07-12 15:48:10,250 [Check_Shared_Repo] [MainThread] [repoactions] [ERROR] [182] Error encountered while checking for shared repos: OAKERR:7084The HAVIP 192.168.19.21 is not pingable
2023-07-12 15:48:10,251 [Check_Shared_Repo] [MainThread] [agentutils] [DEBUG] [181] Created xml string 0OAKERR:7084The HAVIP 192.168.19.21 is not pingable
2023-07-12 15:48:17,249 [Cmd_EnvId] [Thread-4026] [odaBaseActions] [INFO] [71] Configuring vcpus for ODA BASE
2023-07-12 15:48:17,249 [Cmd_EnvId] [Thread-4026] [agentutils] [DEBUG] [524] Updating the vcpus parameter to 20
2023-07-12 15:48:17,249 [Cmd_EnvId] [Thread-4026] [agentutils] [DEBUG] [624] Writing vmcfg to /OVS/Repositories/odabaseRepo/VirtualMachines/oakDom1/vm.cfg
2023-07-12 15:48:27,261 [Cmd_EnvId] [Thread-4027] [odaBaseActions] [INFO] [86] Configuring VNC passwd for ODA BASE
2023-07-12 15:48:27,261 [Cmd_EnvId] [Thread-4027] [agentutils] [DEBUG] [624] Writing vmcfg to /OVS/Repositories/odabaseRepo/VirtualMachines/oakDom1/vm.cfg
2023-07-12 15:48:57,313 [Check_Shared_Repo] [MainThread] [repoactions] [INFO] [162] Checking for shared repos
2023-07-12 15:48:57,313 [Cmd_EnvId] [Thread-4030] [odaBaseActions] [INFO] [176] Getting existing vlan list for ODA BASE
2023-07-12 15:48:57,329 [Cmd_EnvId] [Thread-4030] [odaBaseActions] [INFO] [186] Existing Vlan List ['net1', 'net2', 'vbr1']
2023-07-12 15:49:00,340 [Check_Shared_Repo] [MainThread] [repoactions] [ERROR] [182] Error encountered while checking for shared repos: OAKERR:7084The HAVIP 192.168.19.21 is not pingable
2023-07-12 15:49:00,341 [Check_Shared_Repo] [MainThread] [agentutils] [DEBUG] [181] Created xml string 0OAKERR:7084The HAVIP 192.168.19.21 is not pingable
2023-07-12 15:54:47,554 [SYS_DISC_-108646_OvmType] [Thread-4034] [repoactions] [DEBUG] [74] show all repos called
2023-07-12 15:54:47,555 [SYS_DISC_-108646_OvmType] [Thread-4034] [agentutils] [INFO] [79] Re-initializing repos
2023-07-12 15:54:47,555 [SYS_DISC_-108646_OvmType] [Thread-4034] [oakvmagentxml] [DEBUG] [56] Initializing oakagentxml object for repo /OVS/Repositories/odarepo2
2023-07-12 15:54:47,556 [SYS_DISC_-108646_OvmType] [Thread-4034] [repoactions] [DEBUG] [82] writing repo xml for odarepo2 repository to /tmp/fileApuAH8 file
2023-07-12 15:54:47,559 [SYS_DISC_-108646_OvmType] [Thread-4034] [repoactions] [DEBUG] [87] removing temporary file /tmp/fileApuAH8
2023-07-12 15:54:57,572 [Check_Shared_Repo] [MainThread] [repoactions] [INFO] [162] Checking for shared repos
2023-07-12 15:54:57,572 [SYS_DISC_-108646_OvmType] [Thread-4035] [vmactions] [INFO] [348] vm discovery operation called for odarepo2 repo
2023-07-12 15:54:57,589 [SYS_DISC_-108646_OvmType] [Thread-4035] [agentutils] [INFO] [79] Re-initializing repos
2023-07-12 15:54:57,591 [SYS_DISC_-108646_OvmType] [Thread-4035] [oakvmagentxml] [DEBUG] [56] Initializing oakagentxml object for repo /OVS/Repositories/odarepo2
2023-07-12 15:55:00,602 [Check_Shared_Repo] [MainThread] [repoactions] [ERROR] [182] Error encountered while checking for shared repos: OAKERR:7084The HAVIP 192.168.19.21 is not pingable
2023-07-12 15:55:00,603 [Check_Shared_Repo] [MainThread] [agentutils] [DEBUG] [181] Created xml string 



Conclusion

It's very important to be compliant with your Oracle license. There is a possibility that you lose track of the number of databases due to high business demand. Having the correct core count eases the planning of upcoming workloads. If anything goes beyond the licensed CPU count, take a small downtime window and adjust it as per the licensing policy.

OVM : Troubleshoot process of adding OVM hypervisor back to cluster.

 




Intro 

It has been ages since Oracle released its own hypervisor (OVM). OVM is based on paravirtualization and uses a Xen-based hypervisor. The latest OVM release available is version 3.4.6.3. Oracle announced extended support for OVM, starting in March 2021 and ending on March 31, 2024.

Oracle's next virtualization platform is based on KVM, combined with OLVM (Oracle Linux Virtualization Manager).

Here is the Oracle documentation for OLVM: https://docs.oracle.com/en/virtualization/oracle-linux-virtualization-manager.

There are customers still using OVM, and this is the right time for them to plan their journey to OLVM.

In this article, I will elaborate on issues we faced when we tried to map repositories to cluster node02.


Overview of the issue.

We faced a new issue with an OVM cluster environment, caused by a sudden data center power outage. Once everything was back online, we could not start the ovs-agent services on one OVM hypervisor. The only option left was to perform a complete reinstallation of the node.

When I tried to add the node back to the cluster, we faced an issue with mounting the repositories. The next option was to remove the node from the cluster again; this action was performed via the GUI.

In my previous blog, OVM - Remove stale cluster entries, I was able to fix the stale entry issue from the OVM hypervisor side.

But the issue was not resolved: when we tried to present repositories to node02, we got the below-mentioned error message.

OVMRU_002036E OVM-Repo2 - Cannot present the Repository to server: calavsovm02. The server needs to be in a cluster. [Thu Jun 22 10:25:34 EDT 2023]

We could mount the repositories manually as a test, but these repositories should mount automatically when a node is added to the cluster.

The GUI shows the node as being in the cluster, but the repositories are not visible on node02.

If the node is part of the cluster, I would recommend removing it from the cluster before making any changes. In this scenario, the node addition does not progress past the poolfs mount shown below. This log is from the ovs-agent.

Agent log output



"DEBUG (ocfs2:182) cluster debug: {'/sys/kernel/debug/o2dlm': [], 
'/sys/kernel/debug/o2net': ['connected_nodes', 'stats', 'sock_containers', 'send_tracking'],
'/sys/kernel/debug/o2hb': ['0004FB0000050000B705B4397850AAD6', 'failed_regions', 'quorum_regions', 'live_regions', 'livenodes'],
'service o2cb status': 'Driver for "configfs": Loaded\nFilesystem "configfs": Mounted\nStack glue driver: Loaded\nStack plugin "o2cb": 
Loaded\nDriver for "ocfs2_dlmfs": Loaded\nFilesystem "ocfs2_dlmfs": 
Mounted\nChecking O2CB cluster "f6f6b47b38e288e0": Online\n Heartbeat dead threshold: 61\n Network idle timeout: 60000\n Network keepalive delay: 2000\n Network reconnect delay: 2000\n Heartbeat mode: Global\nChecking O2CB heartbeat: Active\n 0004FB0000050000B705B4397850AAD6 /dev/dm-2\nNodes in O2CB cluster: 0 1 \nDebug file system at /sys/kernel/debug: mounted\n'}
[2023-06-22 11:10:25 12640] DEBUG (ocfs2:258) Trying to mount /dev/mapper/36861a6fddaa0481ec0dd3584514a8d62 to /poolfsmnt/0004fb0000050000b705b4397850aad6 "

/var/log/messages output



Jun 27 12:55:32 calavsovm02 kernel: [ 659.079952] o2net: Connection to node calavsovm01 (num 0) at 10.110.110.101:7777 shutdown, state 7
Jun 27 12:55:34 calavsovm02 kernel: [ 661.080005] o2net: Connection to node calavsovm01 (num 0) at 10.110.110.101:7777 shutdown, state 7
Jun 27 12:55:36 calavsovm02 kernel: [ 663.079916] o2net: Connection to node calavsovm01 (num 0) at 10.110.110.101:7777 shutdown, state 7
Jun 27 12:55:38 calavsovm02 kernel: [ 665.080167] o2net: Connection to node calavsovm01 (num 0) at 10.110.110.101:7777 shutdown, state 7
Jun 27 12:55:40 calavsovm02 kernel: [ 667.079905] o2net: No connection established with node 0 after 60.0 seconds, check network and cluster configuration.

Troubleshoot steps

Validate storage settings on both servers

First, validate that the storage is visible from both nodes. This can be done by running the multipath -ll command.


[root@ovs-node01 ~]# multipath -ll
36861a6fddaa0481ec0dd3584514a8d62 dm-0 EQLOGIC,100E-00
size=16G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 11:0:0:0 sdc 8:32 active ready running
36861a6fddaa0787dbeddf57e514abd8a dm-1 EQLOGIC,100E-00
size=3.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 10:0:0:0 sdb 8:16 active ready running
36861a6fddaa0d8306edd157b4d4aed23 dm-2 EQLOGIC,100E-00
size=2.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 9:0:0:0  sdd 8:48 active ready running


[root@ovs-node02 ~]# multipath -ll
36861a6fddaa0481ec0dd3584514a8d62 dm-1 EQLOGIC,100E-00
size=16G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 12:0:0:0 sde 8:64 active ready running
36861a6fddaa0787dbeddf57e514abd8a dm-2 EQLOGIC,100E-00
size=3.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 11:0:0:0 sdd 8:48 active ready running
36861a6fddaa0d8306edd157b4d4aed23 dm-0 EQLOGIC,100E-00
size=2.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 10:0:0:0 sdc 8:32 active ready running
[root@calavsovm02 oswatcher]#

Validate network Setting.

In this architecture, the storage connection is established via bond1. Both servers are configured to use jumbo frames (MTU 9000).

Node02 Network settings



[root@ovs-node02 network-scripts]# cat ifcfg-bond1
DEVICE=bond1
BONDING_OPTS="mode=6 miimon=250 use_carrier=1 updelay=500 downdelay=500 primary_reselect=2 primary=eth1"
BOOTPROTO=static
IPADDR=*.*.*.*
NETMASK=*.*.*.*
ONBOOT=yes
MTU=9000


-- ifcfg-eth1

[root@ovs-node02 network-scripts]# cat ifcfg-eth1
DEVICE="eth1"
BOOTPROTO="none"
DHCP_HOSTNAME="ovs-node02"
HWADDR="*.*.*.*"
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
UUID="36f65c92-3ad9-487b-b9ab-5bd792372d37"
MASTER=bond1
SLAVE=yes
MTU=9000

-- ifcfg-eth2

[root@ovs-node02 network-scripts]# cat ifcfg-eth2
DEVICE="eth2"
BOOTPROTO="none"
DHCP_HOSTNAME="ovs-node02"
HWADDR="*.*.*.*"
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
UUID="44589639-3b05-453b-9dc5-1b1a3b4d4c4d"
MASTER=bond1
SLAVE=yes
MTU=9000

Validate that jumbo frames are working

As per the above-mentioned logs, we observed a problem with network connectivity, so we decided to validate bond1 with 8000-byte and 1000-byte packets.

As shown below, for 8000-byte packets there is no reply, but we do get replies for 1000-byte packets. This indicates a configuration mismatch on the switch side.

Test with 8000-byte packets


[root@ovs-node02 ~]# ping -s 8000 -M do 10.110.210.201
PING 10.110.210.201 (10.110.210.201) 8000(8028) bytes of data.

Test with 1000-byte packets


[root@ovs-node02 ~]# ping -s 1000 -M do 10.110.210.201
PING 10.110.210.201 (10.110.210.201) 1000(1028) bytes of data.
1008 bytes from 10.110.210.201: icmp_seq=1 ttl=64 time=0.183 ms
1008 bytes from 10.110.210.201: icmp_seq=2 ttl=64 time=0.194 ms
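
The ping flags above set the ICMP payload size (-s) and forbid fragmentation (-M do), so a payload close to 9000 bytes only gets a reply if every hop in the path passes jumbo frames. If available, tracepath can also report the effective path MTU towards the peer:

tracepath -n 10.110.210.201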

Solution

When we initially tried to add the node to the cluster, it was sending 9000-byte packets over the storage network to node01. Since jumbo frames were not working as expected, the automatic storage mounting failed.

The solution is to change the network MTU on node02 to 1500 and restart the network service. To be on the safe side, you can reboot node02.

Change the MTU to 1500 on node02


[root@ovs-node02 network-scripts]# cat ifcfg-bond1
DEVICE=bond1
BONDING_OPTS="mode=6 miimon=250 use_carrier=1 updelay=500 downdelay=500 primary_reselect=2 primary=eth1"
BOOTPROTO=static
IPADDR=*.*.*.*
NETMASK=*.*.*.*
ONBOOT=yes
MTU=1500
[root@ovs-node02 network-scripts]#
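
To apply the change without a full reboot, the bond can be bounced and the effective MTU confirmed (a reboot of node02 remains the safer option, as noted above):

ifdown bond1 && ifup bond1         # or: service network restart
ip link show bond1 | grep -i mtu   # confirm the interface now reports mtu 1500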

Now try to add node02 to the cluster again. This solution worked in our environment; afterwards, map the other repositories to node02.


Conclusion

These issues are complex, and you need to spend time to understand them properly. I would recommend creating a service request with Oracle before making any changes.

Carefully look at the ovs-agent logs and /var/log/messages to understand the issue. Also, I would suggest collecting a sosreport from the problematic node.
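
A sketch of collecting that diagnostic bundle on the problematic node (the package name and options can vary by release):

yum install -y sos    # if sosreport is not already installed
sosreport --batch     # writes a compressed report under /tmp or /var/tmp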
