Monday, January 5, 2026

Stay Secure Without Stopping: Leveraging Oracle TAC and Rolling Patching in the AI-Driven Era

 





Intro

We live in an era where businesses are powered by data, running their most critical systems on robust platforms like Oracle Exadata Cloud@Customer and Exadata Cloud Service. Yet, after migration, many organizations tend to overlook the rich set of features these platforms provide. 

Even fundamental capabilities such as Oracle RAC’s Transparent Application Continuity (TAC) are often forgotten or left unconfigured. Think of it this way: if you buy a Ferrari, you don’t drive it like a regular car. You learn its features, understand its capabilities, and drive it like a Ferrari. The same mindset applies to Exadata - to get the most out of it, you need to fully leverage its advanced features.

Why Transparent Application Continuity (TAC) Matters

As AI continues to evolve at a rapid pace, new types of software vulnerabilities are constantly emerging, putting sensitive data at risk. As DBAs, it’s our responsibility to ensure systems remain secure, resilient, and up to date with the latest patches and security updates.

However, downtime during patching cycles is no longer acceptable for most modern businesses. Oracle addresses this challenge through rolling patching, allowing updates to be applied without taking the system offline. But to truly achieve zero application disruption, you need to ensure your applications are compatible with and configured for Transparent Application Continuity (TAC).

What Is Oracle Transparent Application Continuity (TAC)?

Oracle Transparent Application Continuity (TAC) is a powerful high-availability feature designed to protect user sessions from disruptions caused by planned maintenance, recoverable outages, or instance failures, without requiring any changes to application code.

When TAC is enabled, the database automatically replays in-flight transactions after a failure, preserving session state, database calls, and application context. This allows users to continue working as if nothing happened, making the entire recovery process completely transparent to both users and applications.

How TAC Enhances Availability

TAC builds upon Oracle’s Application Continuity (AC) feature, adding more intelligence and automation. It offers:

  • Transparent support for most applications using Oracle drivers

  • Protection against both planned and unplanned outages

  • Elimination of custom retry logic in applications

  • Seamless integration with Oracle RAC and Exadata environments

In simple terms, TAC keeps your applications running smoothly, even during database patching or failover operations. It ensures a consistent user experience, minimizes downtime, and helps your business maintain true high availability.

In this article, I will explain how to configure Transparent Application Continuity (TAC) for Oracle RAC. We implemented this solution for one of our customers who could not afford any application downtime due to the business-critical nature of their Exadata infrastructure during patching cycles. As part of the configuration, we also created dedicated services to properly segregate workloads and ensure optimal failover behavior.

To begin, we divided the services across RAC nodes to effectively separate workloads. The current services are shown below.



[oracle@exanode-01]$ srvctl status service -d TEST2_EXA
Service test2_uat_iap_srv is running on instance(s) TEST21
Service test2_uat_kaf_srv is running on instance(s) TEST22


TAC Configuration

Here are the basic settings you can use to start testing this setup.



srvctl add service -d TEST2_EXA -s TEST2_UAT_KAF_SRV \
-r TEST22 \
-a TEST21 \
-failovertype AUTO \
-failover_restore AUTO \
-commit_outcome TRUE \
-failback YES \
-failovermethod BASIC


Parameter descriptions

  • -d TEST2_EXA: Specifies the database unique name where the service is being created. In this case, the service is for the database TEST2_EXA.

  • -s TEST2_UAT_KAF_SRV: The service name being created, in this case TEST2_UAT_KAF_SRV. Applications will connect using this service.

  • -r TEST22: Defines the preferred instance(s) for the service, i.e., where the service should normally run. Here, the preferred instance is TEST22.

  • -a TEST21: Defines the available (failover) instance(s). If the preferred instance (TEST22) goes down, the service fails over to TEST21.

  • -failovertype AUTO: Enables Transparent Application Continuity (TAC). This allows automatic replay of in-flight transactions after a failure, maintaining session continuity.

  • -failover_restore AUTO: Specifies that the session state is automatically restored after a successful replay, including PL/SQL package state and session context.

  • -commit_outcome TRUE: Ensures transaction outcome consistency. If a failure happens after a COMMIT, Oracle guarantees that the transaction outcome (committed or not) is correctly identified so the replay is safe.

  • -failback YES: When the preferred instance (TEST22) comes back online, the service automatically moves back to it.

  • -failovermethod BASIC: Specifies the Transparent Application Failover (TAF) behavior. BASIC means the client reconnects to the failover instance when a connection loss occurs, simple but effective for many applications.
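For TAC to work end to end, clients should also connect through the service using retry-friendly settings. Below is a minimal sketch of a TNS alias following Oracle’s general high-availability connect-string guidance; the SCAN hostname is a placeholder, so substitute your environment’s SCAN name:


# Minimal sketch - exa-scan.example.com is a placeholder for your SCAN name
TEST2_UAT_KAF =
  (DESCRIPTION =
    (CONNECT_TIMEOUT = 90)(RETRY_COUNT = 30)(RETRY_DELAY = 3)(TRANSPORT_CONNECT_TIMEOUT = 3)
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = exa-scan.example.com)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = TEST2_UAT_KAF_SRV)
    )
  )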



Using this method, whenever you patch Node 1, you can simply relocate the services to Node 2 to maintain continuity.

In modern ExaCS or ExaCC environments, you also have the option to temporarily increase OCPUs during the patching window. This helps minimize any performance impact on applications while one node is offline.

Below are the commands used to relocate a service.


-- node1 to node2
srvctl relocate service -d TEST2_EXA -s test2_uat_iap_srv -oldinst TEST21 -newinst TEST22

-- node2 to node1
srvctl relocate service -d TEST2_EXA -s test2_uat_kaf_srv -oldinst TEST22 -newinst TEST21
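After each relocation, a quick status check confirms the services landed where you expect; based on the commands above, the output should now show the instances swapped:


[oracle@exanode-01]$ srvctl status service -d TEST2_EXA
Service test2_uat_iap_srv is running on instance(s) TEST22
Service test2_uat_kaf_srv is running on instance(s) TEST21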


Note: I strongly recommend that database engineers and architects work closely with the application team to verify that there are no application-side disruptions before moving this approach into production.


Conclusion

High availability is no longer optional; it’s an essential requirement for any business operating in a data-driven world. Oracle Exadata platforms are engineered to deliver exceptional performance and resilience. Still, these benefits can only be fully realized when core features, especially those within Oracle Real Application Clusters (RAC), are properly implemented.

Transparent Application Continuity (TAC) is one of the most powerful RAC features available today. When enabled, it ensures that user sessions seamlessly survive node failures, planned maintenance, and rolling patching without requiring any application changes. This transforms routine maintenance into a zero-impact activity and allows businesses to stay secure and up to date without downtime.

By leveraging TAC, organizations unlock the true promise of Exadata and RAC: uninterrupted service, effortless scalability, and a smooth user experience, even in the face of disruptions. If you’ve invested in a high-performance system, make sure you’re using it to its fullest potential.

Tuesday, November 11, 2025

Unlocking Performance: Direct Storage Mapping with Oracle OLVM

 



Intro 

Virtualization is far from dying - in fact, cloud computing wouldn’t exist without it. In my view, virtualization was the real game-changer that opened the gateway to the cloud. While the cloud has been widely adopted, many organizations are now moving to hybrid cloud strategies to balance flexibility and control.

The recent resurgence in virtualization interest is largely due to Broadcom’s acquisition of VMware, which introduced new licensing models that pushed many small and medium-sized businesses to explore alternatives.

In my opinion, Oracle Linux Virtualization Manager (OLVM) stands out as the most viable option. Oracle has been in this space for years, providing robust support and continuous innovation. I recently discovered that OLVM now supports direct storage mapping, allowing workloads to bypass the hypervisor layer and write directly to storage, a major performance advantage for demanding enterprise environments.

We’ve seen many small and medium-sized businesses use OLVM to run their Oracle databases. When a database requires higher storage throughput, OLVM provides an excellent option to optimize storage performance via direct mapping.

In this article, I’ll demonstrate how easy it is to integrate storage using Direct Mapping in OLVM.

Before diving into the implementation steps, let’s first highlight a few disadvantages of using Direct Mapping:

  • Limited Portability:
    VMs using Direct LUNs are tightly tied to specific storage, making live migrations between hosts more complex.

  • Backup and Snapshot Limitations:
    Direct LUNs bypass the standard OLVM storage domain, so built-in snapshot and backup features may not be available.

  • Management Complexity:
    Each LUN must be manually mapped and managed, which increases administrative overhead compared to managed storage.

  • Reduced Flexibility:
    Since the LUN is directly attached, it cannot be easily cloned, resized, or moved across environments.

  • Dependency on Storage Configuration:
    Any misconfiguration or change on the storage side can directly impact VM availability and performance.

Now, navigate to Compute > Virtual Machines > Select the VM > Disks to begin the configuration.




                                                Figure 1: Navigate to the storage disk

Once you select a disk, you’ll see three main options:

  • Image (VM Disk)

  • Direct LUN

  • Managed Block

Next, go to the Direct Mapping tab and choose the appropriate host. You’ll then need to select the storage type, which can be one of the following:

  • iSCSI

  • FC (Fibre Channel)

                                                       

                                                      Figure 2: Add a direct mapping disk

In this example, we’ll add a storage type of FC. Once you change the storage type to FC, the system will display the available Fibre Channel disks. Before adding, verify the correct LUN ID with your storage administrator to ensure proper mapping.




                                                    Figure 3: Select disk 

You can now use the fdisk command to partition the direct-mapped disk.
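Before partitioning, it helps to confirm that the operating system actually sees the new LUN. A minimal sketch follows; the multipath WWID below is a placeholder, so substitute the device name shown in your own environment:


# identify the newly mapped FC LUN
multipath -ll
lsblk

# partition the direct-mapped disk (replace the WWID with your actual device)
fdisk /dev/mapper/360a98000324669436f2b45666c567946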

After adding the direct mapping, you can verify the configuration by selecting the Disk Type: Direct LUN to confirm the mapped LUN is correctly attached.


                                              Figure 4: Validate the direct mapping disk.


Conclusion

Direct mapping in Oracle Linux Virtualization Manager (OLVM) offers a powerful way to achieve higher storage performance by allowing VMs to access storage directly, bypassing the hypervisor layer. While it introduces some management and flexibility trade-offs, it’s an excellent option for workloads that demand low latency and high throughput, especially Oracle databases.

By understanding both the advantages and limitations, administrators can effectively leverage direct mapping to optimize storage performance while maintaining a stable and efficient OLVM environment.


Friday, October 10, 2025

Simplifying Lower Environment Refreshes with Oracle Exadata Cloud Service (ExaCS)

 





Intro

AI technologies are evolving at a rapid pace, but the landscape is still unstable. At the core of every AI system lies data, and validating AI models without proper test environments is simply not possible. To ensure reliability, lower environments must be refreshed regularly with production-scale data. Large datasets are essential for accuracy, and inadequate test data often leads to issues such as hallucinations, something many enterprises are already struggling with.

Refreshing databases from higher environments is therefore a critical part of any testing strategy. However, managing large-scale refresh operations is complex and requires careful preparation to ensure scripts, processes, and dependencies are fully aligned. With Oracle ExaCS (Exadata Cloud Service), this process is greatly simplified, allowing DBAs to execute refresh operations efficiently and with minimal disruption, bringing environment refreshes quite literally to your fingertips.


Creating a Lower Environment Using ExaCS

In this article, I’ll walk you through the process of creating a lower environment using Oracle Exadata Cloud Service (ExaCS).

Scenario:

  • Database Backups: Managed via the Automatic Recovery Service.

  • Database Location: Databases are hosted on EXACS.

  • Note: The Automatic Recovery Service utilizes Zero Data Loss Recovery Appliance (ZDLRA) technology for restoring backups.

Step 1: Navigate to the Database Backup

Begin by accessing your database backup configurations to initiate the environment setup process. Once you locate the backup you want to restore from, proceed with the steps below.

                                              Figure 1: Navigate to the backup


Step 2: Create the Database

This step is quite simple. Here, you’ll select the appropriate Pluggable Database (PDB) and the desired Availability Domain.

Make sure to carefully choose the correct Exadata Infrastructure, as selecting the wrong one can lead to deployment issues or connectivity problems later on.



                                               Figure 2: Create database using backup


Step 3: Select the Correct Cluster

The next step is to select the appropriate cluster. Remember, ExaCS runs VMs on top of Exadata infrastructure, which allows you to create multiple VMs and segregate different environments. I’ll cover this topic in more detail in a separate blog.



                                        Figure 3: Create database backup - 1


Step 4: Database Configuration and Duplication Setup

In Figure 4, you need to specify both the database name and the database unique name.

You can set the SYS password to any value that follows Oracle’s standard password conventions.

For a database duplication, ensure that you provide the TDE password, as the backup pieces will be directly retrieved from ZDLRA.


                                                  Figure 4: Set passwords.


An interesting piece of information I’d like to share is where to find the duplicate database logs.

  1. Log in to the target host where the duplicate database is running.

  2. The duplicate logs are located at: /var/opt/oracle/log/TEST2/dtrs/rman/bkup

  3. Sample log file: DBRS_IRESTORE_2025-08-08_18-41-56-5774831298590343329.log


[grid@exatestdb-node1 bkup]$ tail -200f DBRS_IRESTORE_2025-08-08_18-41-56-5774831298590343329.log
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467743375_198"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467743401_198"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467734225_219"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467734315_219"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467734317_219"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467734307_219"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467734316_219"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467734322_219"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467743306_254"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467743388_254"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467729735_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730363_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730614_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730825_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730875_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730901_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730919_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730923_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730940_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730986_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467730985_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467731010_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467731023_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467731036_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467731836_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467731341_271"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467743414_273"
new media label is "Recovery Appliance (raiadp4)" for piece "VB$_3990092206_467743537_273"
channel RA0: starting datafile backup set restore
channel RA0: specifying datafile(s) to restore from backup set
channel RA0: restoring datafile 00009 to +DATAC1/TEST_EXA/DATAFILE/reefer_svc_tbs.2509.1208630651
channel RA0: restoring section 1 of 18
channel RA0: reading from backup piece VB$_3890646435_481469587_9
channel RA1: starting datafile backup set restore
channel RA1: specifying datafile(s) to restore from backup set
channel RA1: restoring datafile 00010 to +DATAC1/TEST_EXA/DATAFILE/generic_sensor_hist_2024_q3.3874.1208630651
channel RA1: restoring section 1 of 10
channel RA1: reading from backup piece VB$_3890646435_481469600_10
channel RA2: starting datafile backup set restore
channel RA2: specifying datafile(s) to restore from backup set
channel RA2: restoring datafile 00011 to +DATAC1/TEST_EXA/DATAFILE/generic_sensor_hist_2024_q1.2488.1208630651
channel RA2: restoring section 1 of 10
channel RA2: reading from backup piece VB$_3890646435_481469605_11
channel RA3: starting datafile backup set restore
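Once the restore completes, a quick sanity check from the new VM confirms that the duplicated database is open and carries the expected unique name (a minimal sketch; the names will match what you entered in Step 4):


[oracle@exatestdb-node1 ~]$ sqlplus / as sysdba
SQL> SELECT name, db_unique_name, open_mode FROM v$database;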


Conclusion

AI systems rely heavily on data, making well-maintained test environments essential for validating models and avoiding issues like hallucinations. Refreshing lower environments with production-scale data is critical but complex, requiring careful coordination of scripts, dependencies, and processes. Oracle Exadata Cloud Service (ExaCS) simplifies this, enabling DBAs to perform efficient, secure, and low-downtime environment refreshes.

Wednesday, October 8, 2025

My Focus at Oracle AI World 2025 – AI, Security, and the Road Ahead

 



Intro

Navigating the Future of AI and Security at Oracle AI World 2025

As we step into the age of intelligent enterprises, AI and data security have become inseparable pillars of innovation and resilience. This year’s Oracle AI World 2025, taking place October 13–16 in Las Vegas, marks a defining shift in Oracle’s vision: from cloud transformation to AI-driven reinvention.

CloudWorld has been reimagined as Oracle AI World, reflecting Oracle’s commitment to embedding AI across every layer of its ecosystem, from applications to infrastructure to data platforms. The event will highlight Oracle’s latest advancements in agentic AI, AI infrastructure, and autonomous systems, reshaping how businesses operate, secure, and scale in the era of intelligent automation.

My Focus: AI and Security

For me, understanding AI and agentic AI is essential. However, my current focus is on the intersection of AI and security. With AI, it’s increasingly difficult to predict which weaknesses or vulnerabilities could be exposed, and data has become the lifeblood of AI. Businesses are driven by data, and without strong protection, even a hard-earned reputation can be compromised within hours.

Scripts and agentic AI are already having a profound impact, and it’s challenging to imagine the scale of threats we may face in the future regarding hacking and data theft. The stakes for securing data have never been higher.

Sessions I Recommend at Oracle AI World 2025

If you’re attending Oracle AI World, here are some sessions I highly recommend based on your area of interest:

1st Choice – For Those Focused on Security

  • Be a Data Security Superhero with Oracle Data Safe [HOL2820]

  • How Oracle Ensures Security and Compliance with Oracle Cloud Infrastructure [THR3119]

  • Build a Secure Landing Zone on Oracle Cloud Infrastructure [LRN2727]

2nd Choice – For Those Managing Large-Scale Workloads and Migrations

  • Deploying Mission-Critical: What’s New with Oracle Database 23ai, Exadata, Cloud [LRN1139]

  • Exadata Mastery: Maximize Performance, Reduce Risk, and Save Costs [LRN2980]

3rd Choice – For Those Exploring Virtualization Alternatives

With VMware licensing costs rising, many small and medium-sized businesses are rethinking their virtualization strategy. It’s the perfect time to evaluate Oracle Linux Virtualization Manager (OLVM) as a cost-effective, enterprise-grade alternative.

  • Migrating to Oracle Virtualization Using Oracle Cloud Migrations Service [LRN3085]

  • Oracle Virtualization: The Cost-Effective, High-Performance Alternative [LRN1119]


Final Thoughts

AI brings enormous opportunity, but also a new dimension of risk. As AI systems become more autonomous and interconnected, security must evolve at the same pace. At Oracle AI World 2025, I look forward to exchanging ideas with peers, customers, and Oracle experts on how to build intelligent, secure, and resilient data-driven enterprises.

Tuesday, October 7, 2025

Join Us at Oracle AI World 2025 – Optimizing and Securing Enterprise Databases

 







I’m excited to announce that I will be attending Oracle AI World 2025, taking place October 13–16 in Las Vegas, alongside the Eclipsys and DSP teams.

This year marks a significant milestone for Oracle’s flagship event: CloudWorld has been reimagined as Oracle AI World. AI is no longer a supporting element; it is now the foundation for Oracle’s strategy across applications, infrastructure, and data platforms.

Recent developments highlight Oracle’s leadership in the enterprise AI space:

  • Participation in the $500 billion “Stargate” AI infrastructure initiative with OpenAI and SoftBank, providing unprecedented compute capacity for AI workloads.

  • A $30 billion-per-year agreement with OpenAI to host large-scale model training clusters powered by tens of thousands of NVIDIA GPUs.

  • Support for Google Cloud’s Gemini AI models via Vertex AI, giving customers access to advanced multimodal AI tools directly in Oracle Cloud Infrastructure (OCI).

Oracle is also advancing agentic AI capabilities, enabling AI systems to autonomously plan, decide, and act. The Globally Distributed Exadata on Exascale service ensures mission-critical systems and agentic AI workloads remain resilient and high-performing across regions. Additionally, AI Agent Studio empowers customers and partners to build, extend, and orchestrate AI agents within Fusion Applications, creating autonomous workflows tailored to real business needs. Collectively, these innovations position Oracle AI World as a central hub for the enterprise AI era.

Why Database Optimization and Security Matter

Databases remain the backbone of modern enterprises. Optimizing them for performance, scalability, and security is more important than ever. With tools such as Oracle 23ai, Autonomous Database, and Oracle Linux Virtualization Manager (OLVM), organizations can achieve more intelligent, secure, and cost-efficient database operations across both cloud and on-premises environments.

At Eclipsys and DSP, we specialize in helping organizations:

  • Enhance database performance with Oracle Cloud and AI-driven optimization

  • Strengthen security using Oracle Data Safe, Audit Vault, and AI-powered threat detection

  • Modernize workloads leveraging Oracle 23ai, Autonomous Database, and containerized architectures

  • Simplify infrastructure management with OLVM for efficient VM provisioning, clustering, and high availability

  • Deliver end-to-end managed services to ensure reliability, compliance, and operational excellence

Let’s Connect at Oracle AI World

If you are attending Oracle AI World and are looking to optimize and secure your Oracle databases, implement OLVM, or leverage AI-driven Oracle services, we would be delighted to connect. We’ll share practical strategies, real-world use cases, and insights from across the Oracle ecosystem.

See you in Las Vegas, October 13–16, 2025 at Oracle AI World!

Tuesday, August 19, 2025

Optimizer Regression After Upgrade? Fix Slow Queries Using DBMS_SQLDIAG.CREATE_SQL_PATCH

 







Intro


We are in the era of AI, where increasing processing power is critical to managing rapidly growing and complex workloads. As enterprise systems continue to scale, database performance challenges have become increasingly common, especially under high-throughput operations. 

From my experience, Oracle Database stands out as the most flexible and robust platform for identifying, analyzing, and resolving performance issues, thanks to its advanced tooling and diagnostic capabilities.

Recently, we encountered a performance bottleneck in an Exadata Cloud Service (ExaCS) environment, where the database was processing a high volume of insert operations originating from a Kafka data stream.

In this case, the application had been running on Oracle 11g for more than eight years without major issues. However, after migrating to a newer database version, we faced multiple performance hiccups. While upgrades are usually straightforward, the optimizer behavior changes introduced in the newer version made performance tuning trickier and more time-consuming. Handling execution plan shifts and adapting SQL performance required careful analysis and fine-tuning to stabilize the environment.

It’s not easy to introduce changes directly on a long-running production database, especially one supporting critical business operations. In this example, most queries performed well under the 19c optimizer, but we encountered one major query that experienced significant slowness and failed to complete within the expected time window. Troubleshooting required deep investigation, plan comparisons, and precise tuning to restore performance without disrupting stable workloads.

In this blog, I will demonstrate how we resolved performance issues for specific queries without making any changes to the application code. The DBMS_SQLDIAG.CREATE_SQL_PATCH procedure provides an effective way to apply optimizer hints and fix problematic SQL statements without modifying the original SQL text.

For more details, you can also refer to Oracle’s documentation:
How to Create a SQL Patch to Add Hints to Application SQL Statements (Doc ID 1931944.1).

For one particularly problematic query, reverting the optimizer behavior to 11.2.0.4 resolved the issue, and we achieved this without modifying the application code by using a SQL Patch.

Sample SQL Patch Creation

Here’s an example of how to create a SQL Patch using the DBMS_SQLDIAG.CREATE_SQL_PATCH procedure:


DECLARE
  v_patch_name VARCHAR2(128);
BEGIN
  v_patch_name := DBMS_SQLDIAG.CREATE_SQL_PATCH (
    sql_id      => 'your_sql_id',      -- SQL_ID of the problematic statement
    hint_text   => 'USE_HASH(t1)',     -- hint text only, without the /*+ */ wrapper
    name        => 'patch_use_hash',
    description => 'Forcing a hash join to improve performance'
  );
END;
/
  


Key Parameters


  • sql_id: The SQL_ID of the problematic query, taken from AWR or SQL Monitor.

  • hint_text: The optimizer hint(s) you want to inject.

  • name: A custom name for your SQL patch.

  • description: A note describing the purpose of the patch.


Example: Forcing 11.2.0.4 Optimizer for a Specific SQL

In our case, we applied the 11.2.0.4 optimizer behavior to a specific SQL by using the following:


VARIABLE x VARCHAR2(128)
BEGIN
  :x := DBMS_SQLDIAG.CREATE_SQL_PATCH(
    sql_id    => 'grnbsvudp26j3',
    hint_text => 'OPTIMIZER_FEATURES_ENABLE(''11.2.0.4'')',
    name      => 'SQL_Patch_grnbsvudp26j3');
END;
/
    


How to Validate That the SQL Is Using the SQL Patch

You can confirm that the SQL Patch has been applied using:

SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('grnbsvudp26j3', NULL, 'OUTLINE BASIC NOTE'));
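You can also confirm the patch from the data dictionary, and drop it later once the underlying issue is fixed. A brief sketch:


-- list SQL patches registered in the database
SELECT name, status, created FROM dba_sql_patches;

-- drop the patch when it is no longer needed
EXEC DBMS_SQLDIAG.DROP_SQL_PATCH('SQL_Patch_grnbsvudp26j3');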

Sample Execution Plan Output

SQL> @perf_dplan_cursor_outline.sql
Enter value for SQL_ID : grnbsvudp26j3
Enter Child Number (Default is 0) :

+--------------------------------------------------------+
| Report   : Execution Plan for SQL_ID in Cursor Cache   |
| Instance : EXADB1                                      |
| SQL_ID   : grnbsvudp26j3                               |
+--------------------------------------------------------+

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------

Plan hash value: 2610349306

-------------------------------------------------------------------------
| Id  | Operation                            | Name                     |
-------------------------------------------------------------------------
|   0 | INSERT STATEMENT                     |                          |
|   1 |  LOAD TABLE CONVENTIONAL             | TEST_ASSET_ALERT_HISTORY |
|   2 |   CONCATENATION                      |                          |
|   3 |    NESTED LOOPS                      |                          |
|   4 |     NESTED LOOPS                     |                          |
|   5 |      TABLE ACCESS BY INDEX ROWID     | TEST_ASSET_WORK          |
|   6 |       INDEX RANGE SCAN               | IDX_JAW_JID_CID_JEID     |
|   7 |      PARTITION RANGE ITERATOR        |                          |
|   8 |       INDEX RANGE SCAN               | IDX_DGH_CAID_ET_FID_TT   |
|   9 |     TABLE ACCESS BY LOCAL INDEX ROWID| TEST_HISTORY             |
|  10 |    NESTED LOOPS                      |                          |
|  11 |     NESTED LOOPS                     |                          |
|  12 |      TABLE ACCESS BY INDEX ROWID     | TEST_ASSET_WORK          |
|  13 |       INDEX RANGE SCAN               | IDX_JAW_JID_CID_JEID     |
|  14 |      PARTITION RANGE ITERATOR        |                          |
|  15 |       INDEX RANGE SCAN               | IDX_DGH_CAID_ET_FID_TT   |
|  16 |     TABLE ACCESS BY LOCAL INDEX ROWID| TEST_HISTORY             |
-------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.4')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$58A6D7F6")
      MERGE(@"SEL$1" >"SEL$2")
      OUTLINE_LEAF(@"INS$1")
      OUTLINE_LEAF(@"SEL$58A6D7F6_1")
      USE_CONCAT(@"SEL$58A6D7F6" 8 OR_PREDICATES(8) PREDICATE_REORDERS((16 4)
              (8 6) (9 7) (10 8) (11 9) (12 10) (13 11) (14 12) (15 13) (4 14) (6 15)
              (7 16)))
      OUTLINE_LEAF(@"SEL$58A6D7F6_2")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      FULL(@"INS$1" "TEST_ASSET_ALERT_HISTORY"@"INS$1")
      INDEX_RS_ASC(@"SEL$58A6D7F6_1" "A"@"SEL$1" ("TEST_ASSET_WORK"."JOB_ID"
              "JOB_ASSET_WORK"."COMPANY_ID" "JOB_ASSET_WORK"."JOB_EVENT_ID"))
      INDEX(@"SEL$58A6D7F6_1" "DH"@"SEL$1" ("TEST_HISTORY"."COMPANY_ASSET_ID"
              "TEST_HISTORY"."EVENT_TIME" "TEST_HISTORY"."FACILITY_ID"
              "TEST_HISTORY"."TRANSACTION_TYPE"))
      INDEX_RS_ASC(@"SEL$58A6D7F6_2" "A"@"SEL$58A6D7F6_2" ("TEST_ASSET_WORK"."JOB_ID"
              "JOB_ASSET_WORK"."COMPANY_ID" "TEST_ASSET_WORK"."JOB_EVENT_ID"))
      INDEX(@"SEL$58A6D7F6_2" "DH"@"SEL$58A6D7F6_2" ("TEST_HISTORY"."COMPANY_ASSET_ID"
              "TEST_HISTORY"."EVENT_TIME" "TEST_HISTORY"."FACILITY_ID"
              "TEST_HISTORY"."TRANSACTION_TYPE"))
      LEADING(@"SEL$58A6D7F6_1" "A"@"SEL$1" "DH"@"SEL$1")
      LEADING(@"SEL$58A6D7F6_2" "A"@"SEL$58A6D7F6_2" "DH"@"SEL$58A6D7F6_2")
      USE_NL(@"SEL$58A6D7F6_1" "DH"@"SEL$1")
      NLJ_BATCHING(@"SEL$58A6D7F6_1" "DH"@"SEL$1")
      USE_NL(@"SEL$58A6D7F6_2" "DH"@"SEL$58A6D7F6_2")
      NLJ_BATCHING(@"SEL$58A6D7F6_2" "DH"@"SEL$58A6D7F6_2")
      END_OUTLINE_DATA
  */

Note
-----
   - SQL patch "SQL_Patch_grnbsvudp26j3" used for this statement


87 rows selected.

SQL>
Key highlight from the execution plan: the Note section confirms that the SQL patch "SQL_Patch_grnbsvudp26j3" was used and the OPTIMIZER_FEATURES_ENABLE('11.2.0.4') hint was applied successfully.

Summary

  • Problem: After upgrading to 19c, a few queries slowed down due to optimizer changes.

  • Solution: Created a SQL Patch to apply legacy optimizer features.

  • Advantage: No changes were made to the application code.

  • Result: Query performance was restored to expected levels.


Conclusion

Handling performance issues in long-running Oracle databases, especially after major version upgrades, can be challenging. Optimizer behavior often changes between versions, and while most queries benefit from the enhancements, certain critical queries may experience unexpected regressions.

In such scenarios, DBMS_SQLDIAG.CREATE_SQL_PATCH provides a powerful, non-intrusive solution to fix problematic queries without making any application changes. By injecting optimizer hints or enforcing a specific optimizer version, we can stabilize performance quickly and ensure business continuity.

This approach not only reduces risk but also saves significant time during troubleshooting, particularly in production environments where application changes may involve lengthy testing cycles. SQL patches act as a bridge between performance optimization and application stability, making them an essential tool for every Oracle DBA.

Thursday, August 7, 2025

Performance Issue: Handling SQL Version Count Issues with High-Volume Kafka Inserts on ExaCS

 






Intro 

We are in the era of AI, where increasing processing power is crucial for handling growing and complex workloads. As enterprise systems continue to scale, database performance challenges become increasingly common, especially under high-throughput operations. In my experience, no other database platform matches Oracle’s flexibility and tooling when it comes to identifying and resolving such performance issues.

Recently, we encountered a performance bottleneck in an Exadata Cloud Service (ExaCS) environment, where the database was handling a high volume of insert operations coming from a Kafka stream.

In this article, I’ll walk through the technical details of the SQL version count issue we faced and the solution we implemented to stabilize performance.

The Issue: Excessive SQL Versioning and Hard Parses

The ExaCS database was receiving a continuous stream of dynamically structured INSERT statements from Kafka. The column structure of these inserts varied significantly; some contained 100 columns, while others had up to 150. This variation was ongoing and unpredictable.

Due to these structural differences, Oracle’s optimizer treated each statement as a unique SQL. As a result, the database began to experience:
  • Excessive hard parses

  • High CPU utilization

  • Shared pool contention and pressure

Even though we were running Oracle 19.24, which includes enhancements to SQL plan management and version count handling, the optimizer still created new cursor versions for each structurally distinct INSERT, which led to rapid cursor growth and degraded overall performance.


                                    
                                        Figure 1: AWR report for Oracle SQL ID version count


Temporary Workaround: Manual Flushing of High Version Count Cursors

As an immediate mitigation step, we identified the SQLs with high version counts and manually flushed them from the shared pool using their memory IDs. This helped temporarily relieve pressure on CPU and memory by:

  • Reducing shared pool bloat

  • Freeing up memory consumed by excessive cursor versions

  • Preventing further hard parsing on the same overloaded SQL

However, it's important to note that this is only a temporary workaround. The relief is short-lived, as the issue resurfaces once new INSERT statements with varying structures continue streaming in from Kafka.

To clarify, this issue has not been resolved in Oracle 19.24, despite the version including several recent patches and updates. Here’s the output from the environment confirming the exact patch level:


[oracle@exaprd01-node01 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
34697081;NOT SHIPPING LIBAUTH_SDK_IAM.SO IN 23 SHIPHOME INSTALL
36538667;JDK BUNDLE PATCH 19.0.0.0.240716
36414915;OJVM RELEASE UPDATE: 19.24.0.0.240716 (36414915)
36587798;OCW RELEASE UPDATE 19.24.0.0.0 (36587798)
36582781;Database Release Update : 19.24.0.0.240716 (36582781)
OPatch succeeded.
[oracle@exaprd01-node01 ~]$
    


To monitor and identify SQL statements with high version counts, typically those contributing to shared pool pressure, you can use the following query:

SELECT version_count, sql_id, sql_text FROM v$sqlarea WHERE version_count > 512;

For any SQLs with unusually high version counts, manual flushing can be performed as a short-term mitigation using the following commands:

-- find the address and hash value of the cursor
select inst_id, address, hash_value from v$sqlarea where sql_id like '&sqlid';

-- purge the cursor from the shared pool ('C' indicates a cursor)
exec sys.DBMS_SHARED_POOL.PURGE ('<ADDRESS>, <HASH_VALUE>', 'C');

Note: Use manual flushing with caution, especially in production environments, as it may impact performance for frequently executed queries.

Permanent Solution

To address the high version count issue more permanently, Oracle provides specific guidance in the following My Oracle Support note:

High Version Counts For SQL Statements (>1024) Post Upgrade To 12.2 and Above Causing Database Slow Performance (Doc ID 2431353.1)

As per the recommendation, you should set the following initialization parameter: _cursor_obsolete_threshold = <recommended_value>

This parameter helps control the number of obsolete child cursors and can significantly reduce version count growth, improving shared pool performance and overall database stability.

If you’re running in a RAC environment, apply this change and restart the database instances in a rolling fashion to avoid downtime, as shown below.


alter system set "_cursor_obsolete_threshold"=1024 scope=spfile;

Conclusion

This issue showed us how dynamic SQL from systems like Kafka can create serious performance problems, even in powerful environments like Exadata Cloud Service (ExaCS). Because each INSERT had a slightly different structure, Oracle treated them as new statements, leading to too many cursor versions, high CPU usage, and shared pool pressure.

Even though we were on Oracle 19.24, the problem still occurred. The key was identifying the root cause and taking action: monitoring version counts, applying a temporary fix, and then implementing a permanent solution using the _cursor_obsolete_threshold parameter.

In short, managing SQL behavior and understanding how Oracle handles different workloads is critical for keeping your systems running smoothly, especially in today’s fast-moving, high-volume environments.
