Thursday, December 9, 2021

DataGuard switchover in OCI/Oracle Cloud






Acronym - 

DR - Disaster recovery  

Organizations are investing colossal sum of money on DR (Disaster Recovery solutions) because it's has direct impact on business continuity. Many companies implement the DR solution and not testing the DR functionality frequently. It's business and IT infrastructure team's responsibility to test DR functionality at least twice a year. Some companies run their load in DR site for 6 months and change it to back to primary site.  Company should have a proper disaster recovery plan to be back online within short period of time.


This link has really good information about DR strategy and planning  https://www.iternalnetworks.com/what-is-the-best-method-for-disaster-recovery/


Building a Disaster Recovery Plan

Before you choose the best method for disaster recovery, you must first have a disaster recovery plan in place.

To ensure your business continuity, you’ll want to make sure that these best practices are followed:

  • Have an updated written or printed version of your disaster recovery plan in an easy-to-access location before a disaster occurs. 
  • Knowing where the plan is located will help you to quickly start the process without wasting precious time searching for the document
  • Ensure that your recent system backups are stored offsite, be it in the cloud or at another physical location. This helps to ensure that your backups aren’t affected by the same disaster which can lead to longer recovery times.
  • Plan for how your business will return to normalcy post-disaster
  • Update your disaster recovery plans frequently to reflect changes in your business
  • Test your plan to make sure it’s viable against disasters that are likely in your area

Once your disaster plan meets these criteria, you will be far more suited to choose a method of recovery that fits your business. You’ll also need to choose a method of backup storage, be it the cloud, local, removable storage media, or a mix of both.


Recovery Time Objective vs Recovery Point Objective


Organization must understand the differences between recovery time objective (RTO) and recovery point objective (RPO) in disaster recovery. RTO is how long it will take to get your critical infrastructure up and running after a disaster while RPO is the timeframe between the disaster and your last backup.

When RPO is zero, there is no data loss because your system is continually backing up the newest versions of data. 

When RTO is zero, there is no downtime – you barely notice that your systems went down in the first place.


OCI enables greater reflexibility to enable DR solution via DG (data guard) , This enables to create standby database on same region or different region. Also this enable easy option for switchover and failover. 

This article I will illustrate how we can test the switchover in OCI environments.

First let's gather primary and standby database details.

Gather database details

Primary


########## database details 

SQL> show parameter db_uni

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_unique_name                       string      PWSH01_yyz16x
SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PWHSE_PDB                      READ WRITE NO

Database Status



SQL> select INST_ID,FLASHBACK_ON from gv$database;

   INST_ID FLASHBACK_ON
---------- ------------------
         1 YES
         2 YES

select instance_name,status,HOST_NAME,to_char(startup_time,'dd/mm/yyyy hh24:mi') startup_time from  gv$instance;
		 
INSTANCE_NAME    STATUS       HOST_NAME                                                        STARTUP_TIME
---------------- ------------ ---------------------------------------------------------------- ----------------
PWSH011          OPEN         dbsdpl21                                                         26/11/2021 19:39
PWSH012          OPEN         dbsdpl22                                                         26/11/2021 19:39


Standby Database



########## standby server

SQL> how parameter db_uni

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_unique_name                       string      PWSH01_yyz1k6
SQL> select open_mode,database_role from gv$database;

OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
READ ONLY WITH APPLY PHYSICAL STANDBY
READ ONLY WITH APPLY PHYSICAL STANDBY

select instance_name,status,HOST_NAME,to_char(startup_time,'dd/mm/yyyy hh24:mi') startup_time from  gv$instance;

INSTANCE_NAME    STATUS       HOST_NAME                                                        STARTUP_TIME
---------------- ------------ ---------------------------------------------------------------- ----------------
PWSH011          OPEN         dbsdpl251                                                        26/11/2021 19:43
PWSH012          OPEN         dbsdpl252                                                        26/11/2021 19:43

Standby guaranteed restore point creation

Note: First create restore point on standby side, before that perform few log switches from primary to make it consistent.



alter system archive log current;

Error

This errors comes because MRP process is recovering the standby database. To create GRP we need to stop the MRP process.



SQL> create restore point Before_Switchover_stby guarantee flashback database;
create restore point Before_Switchover_stby guarantee flashback database
*
ERROR at line 1:
ORA-38784: Cannot create restore point 'BEFORE_SWITCHOVER_STBY'.
ORA-01153: an incompatible media recovery is active

Solution



########### solution

DGMGRL> edit database PWSH01_yyz1k6 set state='APPLY-OFF';
Succeeded.
DGMGRL> show database PWSH01_yyz1k6

Database - PWSH01_yyz1k6

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-OFF
  Transport Lag:      0 seconds (computed 1 second ago)
  Apply Lag:          16 seconds (computed 1 second ago)
  Average Apply Rate: (unknown)
  Real Time Query:    OFF
  Instance(s):
    PWSH011 (apply instance)
    PWSH012

Database Status:
SUCCESS

SQL> create restore point Before_Switchover_stby guarantee flashback database;

Restore point created.

Primary guaranteed restore point creation and verification.




create restore point Before_Switchover_stby guarantee flashback database;


      SCN GUA TIME                                                                        NAME
---------- --- --------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
   5940139 YES 26-NOV-21 08.29.09.000000000 PM                                             BEFORE_SWITCHOVER_PRI



Pre-checks before switchover.


Verify the connectivity between primary and standby.

############### standby connection 

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION=(SDU=65535)(SEND_BUF_SIZE=10485760)(RECV_BUF_SIZE=10485760)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.0.0.10)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=10.0.0.47)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=PWSH01_yyz1k6.sub08101943420.cnvcn01.oraclevcn.com)(UR=A)))
OK (0 msec)
[oracle@dbsdpl21 admin]$


Validate the data guard status 


 Note : It's really important to validate the data guard status before the switchover , dgmgrl utility has features to check the database readiness for the switchover.


[oracle@dbsdpl21 admin]$ dgmgrl sys/CHana88#_@PWSH01_YYZ1K6
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Fri Nov 26 20:34:49 2021
Version 19.11.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "PWSH01_yyz1k6"
Connected as SYSDBA.
DGMGRL>
DGMGRL> show database verbose
Object "verbose" was not found
DGMGRL>  show configuration;

Configuration - PWSH01_yyz16x_PWSH01_yyz1k6

  Protection Mode: MaxPerformance
  Members:
  PWSH01_yyz16x - Primary database
    PWSH01_yyz1k6 - Physical standby database

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 39 seconds ago)

Validate the database is ready for the switchover




############# Validate 


DGMGRL> edit database PWSH01_yyz1k6 set state='APPLY-ON';
Succeeded.
DGMGRL>
DGMGRL> validate database 'PWSH01_yyz1k6';

  Database Role:     Physical standby database
  Primary Database:  PWSH01_yyz16x

  Ready for Switchover:  Yes
  Ready for Failover:    Yes (Primary Running)

  Managed by Clusterware:
    PWSH01_yyz16x:  YES
    PWSH01_yyz1k6:  YES

  Standby Apply-Related Information:
    Apply State:      Running
    Apply Lag:        2 seconds (computed 0 seconds ago)
    Apply Delay:      0 minutes

DGMGRL>

Switchover 


OCI gives flexibility to switchover database using GUI and command link. As a dba I would prefer on command line and it gives more control.
 
Note: Make sure to run few log switches before the switchover and check standby database is fully synced. If everything looks perfect , perform the switchover. 

Switchover using gui :




commands line :


 
DGMGRL> switchover to 'PWSH01_yyz1k6';
Performing switchover NOW, please wait...
New primary database "PWSH01_yyz1k6" is opening...
Oracle Clusterware is restarting database "PWSH01_yyz16x" ...
Connected to "PWSH01_yyz16x"
Connected to "PWSH01_yyz16x"
Switchover succeeded, new primary is "PWSH01_yyz1k6"
DGMGRL>

Validate database role after the switchover


Once the switchover is successful , verify the database role on new primary
and standby.


SQL> show parameter db_uni

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_unique_name                       string      PWSH01_yyz1k6
SQL> select open_mode,database_role from gv$database;

OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
READ WRITE           PRIMARY
READ WRITE           PRIMARY

SQL> set lines 600
SQL> /

INSTANCE_NAME    STATUS       HOST_NAME                                                        STARTUP_TIME
---------------- ------------ ---------------------------------------------------------------- ----------------
PWSH011          OPEN         dbsdpl251                                                        26/11/2021 19:43
PWSH012          OPEN         dbsdpl252                                                        26/11/2021 19:43

SQL>

No comments:

Post a Comment

Exacs database creation using dbaascli

  Intro OCI (Oracle Cloud Infrastructure) provides robust automation capabilities for routine maintenance tasks such as patching, ...