Overview of the issue
When a server is removed from an Oracle VM server pool, the OVM Manager GUI does not always clean up the corresponding O2CB cluster entries on the hypervisors. These stale entries can block re-adding the node to the pool, so they must be identified and removed manually.
How to identify stale entries
First, check the O2CB cluster status on the affected node:
[root@ovm-node02 ~]# service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Stack glue driver: Loaded
Stack plugin "o2cb": Loaded
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster "f6f6b47b38e288e0": Online
Heartbeat dead threshold: 61
Network idle timeout: 60000
Network keepalive delay: 2000
Network reconnect delay: 2000
Heartbeat mode: Global
Checking O2CB heartbeat: Active
0004FB0000050000B705B4397850AAD6 /dev/dm-2
Nodes in O2CB cluster: 0 1
Debug file system at /sys/kernel/debug: mounted
Both nodes are still registered in the cluster's configfs directory:
[root@ovm-node02 ovm-node02]# ls -lrth /sys/kernel/config/cluster/f6f6b47b38e288e0/node/
total 0
drwxr-xr-x 2 root root 0 Jun 23 09:28 ovm-node02
drwxr-xr-x 2 root root 0 Jun 23 09:33 ovm-node01
[root@ovm-node02 ovm-node02]#
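The configfs listing above can also be cross-checked programmatically. A minimal sketch in Python, using hypothetical sample data in place of the directory listing (on a live node the list would come from `os.listdir()` on the configfs path, and the expected members from the pool definition):

```python
# Sketch: flag stale O2CB node entries by comparing the configfs node
# directories against the servers the pool is expected to contain.
# Sample data below is hypothetical, mirroring the situation in this post.

def find_stale_nodes(configfs_nodes, expected_members):
    """Return node names present in configfs but not expected in the pool."""
    return sorted(set(configfs_nodes) - set(expected_members))

# ovm-node02 was removed from the pool, but its configfs entry survived.
configfs_nodes = ["ovm-node01", "ovm-node02"]
expected_members = ["ovm-node01"]

print(find_stale_nodes(configfs_nodes, expected_members))  # ['ovm-node02']
```

Any name this returns is a candidate for `o2cb remove-node`, as shown later in this post.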
The next step is to validate the OVS agent database entries from the master node (ovm-node01). The dump below shows two entries in pool_member_ip_list.
[root@ovm-node01]# ovs-agent-db dump_db server
{'cluster_state': 'DLM_Ready',
'clustered': True,
'fs_stat_uuid_list': ['0004fb000005000015c1fb14ef761f40',
'0004fb000005000079ae03177c3edc7e',
'0004fb000005000065985109f8834e8b'],
'is_master': True,
'manager_event_url': 'https://192.168.85.152:7002/ovm/core/wsapi/rest/internal/Server/08:00:20:ff:ff:ff:ff:ff:ff:ff:00:10:e0:ef:de:6a/Event',
'manager_ip': '192.168.85.152',
'manager_statistic_url': 'https://192.168.85.152:7002/ovm/core/wsapi/rest/internal/Server/08:00:20:ff:ff:ff:ff:ff:ff:ff:00:10:e0:ef:de:6a/Statistic',
'manager_uuid': '0004fb0000010000c8ecbd219dc6b1ee',
'node_number': 0,
'pool_alias': 'EclipsysOVM',
'pool_master_ip': '192.168.85.177',
'pool_member_ip_list': ['192.168.85.177', '192.168.85.178'],
'pool_uuid': '0004fb0000020000f6f6b47b38e288e0',
'poolfs_nfsbase_uuid': '',
'poolfs_target': '/dev/mapper/36861a6fddaa0481ec0dd3584514a8d62',
'poolfs_type': 'lun',
'poolfs_uuid': '0004fb0000050000b705b4397850aad6',
'registered_hostname': 'ovm-node01',
'registered_ip': '192.168.85.177',
'roles': set(['utility', 'xen'])}
[root@ovm-node01]#
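Because `ovs-agent-db dump_db server` prints the record as a Python dict literal, it can be parsed safely with `ast.literal_eval` and the membership fields checked in a script. A minimal sketch, with a trimmed, hypothetical excerpt of the dump embedded as a string (the real dump also contains fields like `roles = set([...])` that `ast.literal_eval` cannot parse, so only the membership fields are kept here):

```python
import ast

# Trimmed, hypothetical excerpt of `ovs-agent-db dump_db server` output.
dump = """{'pool_master_ip': '192.168.85.177',
 'pool_member_ip_list': ['192.168.85.177', '192.168.85.178'],
 'registered_ip': '192.168.85.177'}"""

server = ast.literal_eval(dump)

# If the second node has already been removed from the pool in OVM Manager,
# any non-master IP still present in pool_member_ip_list is a stale entry.
stale = [ip for ip in server['pool_member_ip_list']
         if ip != server['pool_master_ip']]
print(stale)  # ['192.168.85.178']
```

On a live system the dump would be captured with something like `subprocess.check_output(['ovs-agent-db', 'dump_db', 'server'])` rather than a hard-coded string.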
Remove the node from the cluster via the command line
Now we can remove ovm-node02 from the cluster, running the command from the master node (ovm-node01).
[root@ovm-node01]# o2cb remove-node f6f6b47b38e288e0 ovm-node02
Validate node entries
After removing ovm-node02, only one entry remains under the cluster's configfs node directory.
[root@ovm-node01]# ls /sys/kernel/config/cluster/f6f6b47b38e288e0/node/
ovm-node01
[root@ovm-node01]#
Validate using O2CB
First, restart the ovs-agent on both nodes and validate the o2cb cluster status from node01.
[root@ovm-node01]# service ovs-agent restart
Stopping Oracle VM Agent: [ OK ]
Starting Oracle VM Agent: [ OK ]
[root@ovm-node01 ~]# service ovs-agent status
log server (pid 32442) is running...
notificationserver server (pid 32458) is running...
remaster server (pid 32464) is running...
monitor server (pid 32466) is running...
ha server (pid 32468) is running...
stats server (pid 32470) is running...
xmlrpc server (pid 32474) is running...
fsstats server (pid 32476) is running...
apparentsize server (pid 32477) is running...
[root@ovm-node01 ~]#
I would also recommend rebooting node02 after the removal. Once the node is back online, validate /etc/ocfs2/cluster.conf:
[root@ovm-node01 ~]# cat /etc/ocfs2/cluster.conf
cluster:
        heartbeat_mode = global
        node_count = 1
        name = f6f6b47b38e288e0

node:
        number = 0
        cluster = f6f6b47b38e288e0
        ip_port = 7777
        ip_address = 10.110.110.101
        name = ovm-node01

heartbeat:
        cluster = f6f6b47b38e288e0
        region = 0004FB0000050000B705B4397850AAD6
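A quick consistency check on cluster.conf is to confirm that `node_count` in the `cluster:` stanza matches the number of `node:` stanzas. A minimal parser sketch, with the sample config embedded as a string (a live check would read `/etc/ocfs2/cluster.conf` instead):

```python
# Sketch: verify that node_count in an ocfs2 cluster.conf matches the
# number of node: stanzas. Sample config embedded for illustration.
CONF = """\
cluster:
        heartbeat_mode = global
        node_count = 1
        name = f6f6b47b38e288e0

node:
        number = 0
        cluster = f6f6b47b38e288e0
        ip_port = 7777
        ip_address = 10.110.110.101
        name = ovm-node01
"""

def check_node_count(text):
    """Return (declared node_count, actual number of node: stanzas)."""
    node_count = None
    stanzas = 0
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == 'node:':
            stanzas += 1
        elif stripped.startswith('node_count'):
            node_count = int(stripped.split('=')[1])
    return node_count, stanzas

count, stanzas = check_node_count(CONF)
print(count == stanzas)  # True
```

A mismatch here (for example `node_count = 2` with a single `node:` stanza, or a lingering stanza for the removed node) means the config was not cleaned up properly.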
Note: ovs-agent restart won't have any impact on running VMs.
Finally, dump the agent database again and confirm that pool_member_ip_list now contains only the master's IP.
[root@ovm-node01]# ovs-agent-db dump_db server
{'cluster_state': 'DLM_Ready',
'clustered': True,
'fs_stat_uuid_list': ['0004fb000005000015c1fb14ef761f40',
'0004fb000005000079ae03177c3edc7e',
'0004fb000005000065985109f8834e8b'],
'is_master': True,
'manager_event_url': 'https://192.168.85.152:7002/ovm/core/wsapi/rest/internal/Server/08:00:20:ff:ff:ff:ff:ff:ff:ff:00:10:e0:ef:de:6a/Event',
'manager_ip': '192.168.85.152',
'manager_statistic_url': 'https://192.168.85.152:7002/ovm/core/wsapi/rest/internal/Server/08:00:20:ff:ff:ff:ff:ff:ff:ff:00:10:e0:ef:de:6a/Statistic',
'manager_uuid': '0004fb0000010000c8ecbd219dc6b1ee',
'node_number': 0,
'pool_alias': 'EclipsysOVM',
'pool_master_ip': '192.168.85.177',
'pool_member_ip_list': ['192.168.85.177'],
'pool_uuid': '0004fb0000020000f6f6b47b38e288e0',
'poolfs_nfsbase_uuid': '',
'poolfs_target': '/dev/mapper/36861a6fddaa0481ec0dd3584514a8d62',
'poolfs_type': 'lun',
'poolfs_uuid': '0004fb0000050000b705b4397850aad6',
'registered_hostname': 'ovm-node01',
'registered_ip': '192.168.85.177',
'roles': set(['utility', 'xen'])}
[root@ovm-node01]#
Conclusion
There can be situations where the OVM Manager GUI does not remove the node entries from the hypervisor. Always validate the OVM database and configfs entries before retrying the node addition to the cluster, and make sure the cluster's shared repositories mount automatically once the node rejoins.