Intro
Lab 1: Create Compartment, VCN, and MySQL HeatWave DB System while loading DB Data
2. Open heatwave-vcn
3. Click public subnet-heatwave-vcn
4. Click Default Security List for heatwave-vcn
5. Click Add Ingress Rules page under Ingress Rule
chanaka_ya@cloudshell:~ (ca-toronto-1)$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/chanaka_ya/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/chanaka_ya/.ssh/id_rsa.
Your public key has been saved in /home/chanaka_ya/.ssh/id_rsa.pub.
Lab 1: Create Compartment, VCN, and MySQL HeatWave DB System while loading DB Data
Create a computer instance called heatwave-client in a public subnet and copy the public key to the heatwave-client instance.
Now connect to the instance and install MySQL-shell, using MySQL-shell you can access the MySQL server.
chanaka_ya@cloudshell:.ssh (ca-toronto-1)$ ssh -i id_rsa opc@ip_address_of_client_instance
FIPS mode initialized
The authenticity of host '40.233.82.142 (40.233.82.142)' can't be established.
ECDSA key fingerprint is SHA256:q17nqoqKwfYCde2eIKk0V+jFwAbqoThHA0HPF5KFgq8.
ECDSA key fingerprint is SHA1:G1Hdn/PE7nezwyWgAcwWpeU4UAU.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '40.233.82.142' (ECDSA) to the list of known hosts.
Activate the web console with: systemctl enable --now cockpit.socket
[opc@heatwave-client ~]$
[opc@heatwave-client ~]$
[opc@heatwave-client ~]$ sudo yum install mysql-shell -y
Ksplice for Oracle Linux 8 (x86_64) 39 MB/s | 3.9 MB 00:00
MySQL 8.0 for Oracle Linux 8 (x86_64) 34 MB/s | 3.0 MB 00:00
MySQL 8.0 Tools Community for Oracle Linux 8 (x86_64) 8.0 MB/s | 491 kB 00:00
MySQL 8.0 Connectors Community for Oracle Linux 8 (x86_64) 187 kB/s | 30 kB 00:00
Oracle Software for OCI users on Oracle Linux 8 (x86_64) 72 MB/s | 100 MB 00:01
Oracle Linux 8 BaseOS Latest (x86_64) 70 MB/s | 64 MB 00:00
Oracle Linux 8 Application Stream (x86_64) 69 MB/s | 49 MB 00:00
Oracle Linux 8 Addons (x86_64) 20 MB/s | 6.9 MB 00:00
Latest Unbreakable Enterprise Kernel Release 7 for Oracle Linux 8 (x86_64) 63 MB/s | 23 MB 00:00
Dependencies resolved.
=========================================================================================================================================================================================================================================================================================================================
Package Architecture Version Repository Size
=========================================================================================================================================================================================================================================================================================================================
Installing:
mysql-shell x86_64 8.0.33-1.el8 ol8_MySQL80_tools_community 27 M
Installing dependencies:
python39-libs x86_64 3.9.16-1.module+el8.8.0+21116+ee8c18cf.1 ol8_appstream 8.2 M
python39-pip-wheel noarch 20.2.4-7.module+el8.6.0+20625+ee813db2 ol8_appstream 1.1 M
python39-setuptools-wheel noarch 50.3.2-4.module+el8.5.0+20364+c7fe1181 ol8_appstream 497 k
Installing weak dependencies:
python39 x86_64 3.9.16-1.module+el8.8.0+21116+ee8c18cf.1 ol8_appstream 33 k
python39-pip noarch 20.2.4-7.module+el8.6.0+20625+ee813db2 ol8_appstream 1.9 M
python39-setuptools noarch 50.3.2-4.module+el8.5.0+20364+c7fe1181 ol8_appstream 871 k
Enabling module streams:
python39 3.9
Transaction Summary
=========================================================================================================================================================================================================================================================================================================================
Install 7 Packages
Total download size: 39 M
Installed size: 229 M
Downloading Packages:
(1/7): python39-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64.rpm 376 kB/s | 33 kB 00:00
(2/7): python39-pip-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch.rpm 9.7 MB/s | 1.9 MB 00:00
(3/7): python39-libs-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64.rpm 22 MB/s | 8.2 MB 00:00
(4/7): python39-setuptools-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch.rpm 5.1 MB/s | 871 kB 00:00
(5/7): mysql-shell-8.0.33-1.el8.x86_64.rpm 33 MB/s | 27 MB 00:00
(6/7): python39-setuptools-wheel-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch.rpm 1.9 MB/s | 497 kB 00:00
(7/7): python39-pip-wheel-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch.rpm 1.6 MB/s | 1.1 MB 00:00
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 40 MB/s | 39 MB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : python39-setuptools-wheel-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch 1/7
Installing : python39-pip-wheel-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch 2/7
Installing : python39-libs-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64 3/7
Installing : python39-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64 4/7
Running scriptlet: python39-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64 4/7
Installing : python39-setuptools-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch 5/7
Running scriptlet: python39-setuptools-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch 5/7
Installing : python39-pip-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch 6/7
Running scriptlet: python39-pip-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch 6/7
Installing : mysql-shell-8.0.33-1.el8.x86_64 7/7
Running scriptlet: mysql-shell-8.0.33-1.el8.x86_64 7/7
Verifying : mysql-shell-8.0.33-1.el8.x86_64 1/7
Verifying : python39-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64 2/7
Verifying : python39-libs-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64 3/7
Verifying : python39-pip-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch 4/7
Verifying : python39-pip-wheel-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch 5/7
Verifying : python39-setuptools-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch 6/7
Verifying : python39-setuptools-wheel-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch 7/7
Installed:
mysql-shell-8.0.33-1.el8.x86_64 python39-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64 python39-libs-3.9.16-1.module+el8.8.0+21116+ee8c18cf.1.x86_64 python39-pip-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch
python39-pip-wheel-20.2.4-7.module+el8.6.0+20625+ee813db2.noarch python39-setuptools-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch python39-setuptools-wheel-50.3.2-4.module+el8.5.0+20364+c7fe1181.noarch
Complete!
[opc@heatwave-client ~]$
Now let's validate the connection to the MySQL server using the MySQL client server.
mysqlsh -uadmin -p -h mysql-instance-ip --sql
Lab 3: Upload data to Object Storage for HeatWave Lakehouse
Download the sample schema files to the MySQL client instance.
[opc@heatwave-client lakehouse]$ wget https://objectstorage.us-ashburn-1.oraclecloud.com/p/nnsIBVX1qztFmyAuwYIsZT2p7Z-tWBcuP9xqPCdND5LzRDIyBHYqv_8a26Z38Kqq/n/mysqlpm/b/plf_mysql_customer_orders/o/lakehouse/lakehouse-order.zip
--2023-10-23 15:06:46-- https://objectstorage.us-ashburn-1.oraclecloud.com/p/nnsIBVX1qztFmyAuwYIsZT2p7Z-tWBcuP9xqPCdND5LzRDIyBHYqv_8a26Z38Kqq/n/mysqlpm/b/plf_mysql_customer_orders/o/lakehouse/lakehouse-order.zip
Resolving objectstorage.us-ashburn-1.oraclecloud.com (objectstorage.us-ashburn-1.oraclecloud.com)... 134.70.28.1, 134.70.24.1, 134.70.32.1
Connecting to objectstorage.us-ashburn-1.oraclecloud.com (objectstorage.us-ashburn-1.oraclecloud.com)|134.70.28.1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 274987698 (262M) [application/x-www-form-urlencoded]
Saving to: ‘lakehouse-order.zip’
lakehouse-order.zip 100%[=================================================================================================================================================================================================>] 262.25M 83.0MB/s in 3.2s
2023-10-23 15:06:50 (83.0 MB/s) - ‘lakehouse-order.zip’ saved [274987698/274987698]
Now create a standard object storage and upload the CSV files. To upload the files you need to give written permission using create a pre-authenticated request.
Upload CSV files to the order folder and delivery vendor. pq files in main object storage.
curl -X PUT --data-binary '@delivery-orders-2.csv' https://objectstorage.us-ashburn-1.oraclecloud.com/p/RfXc55AGpLSu26UgqbmGxbWZwh4hPhLkVWYMg4f5pNerQx_1NghgSKJHLzE4IWxH/n/******/b/lakehouse-files/o/order/delivery-orders-2.csv
Validate uploaded files
Lab 4: Add HeatWave Cluster to the DB system
Lab 6: Load CSV data from OCI Object Store
Task 1 : Run Autoload to infer the schema and estimate capacity required for the DELIVERY table in the Object Store
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > SET @db_list = '["mysql_customer_orders"]';Query OK, 0 rows affected (0.0012 sec)
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > SET @dl_tables = '[{
'> "db_name": "mysql_customer_orders", '> "tables": [{ '> "table_name": "delivery_orders", '> "dialect": '> { '> "format": "csv", '> "field_delimiter": "\\t", '> "record_delimiter": "\\n" '> }, '> "file": [{"par": "(https://objectstorage.ca-toronto-1.oraclecloud.com/p/IZ7Lo3HMB42T0vfh5QPuCIkIZiTpayz3l1OBFC6kqQq3B_6yIFNPfhRYE2vcBfpd/n/yza00k7tuks0/b/lakehouse-files/o/delivery-orders-1.csv"}] '> }] }]';Query OK, 0 rows affected (0.0005 sec)MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > SET @options = JSON_OBJECT('mode', 'dryrun', 'policy', 'disable_unsupported_columns', 'external_tables', CAST(@dl_tables AS JSON));Query OK, 0 rows affected (0.0006 sec)
Now let's load tables to heatwave nodes using "CALL sys.heatwave_load(@db_list, @options)"
Once Autoload completes running, its output has several pieces of information: a. Whether the table exists in the schema you have identified. b. Auto schema inference determines the number of columns in the table. c. Auto schema sampling samples a small number of rows from the table and determines the number of rows in the table and the size of the table. d. Auto-provisioning determines how much memory would be needed to load this table into HeatWave and how much time loading this data take.
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > CALL sys.heatwave_load(@db_list, @options);
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > SET @options = JSON_OBJECT('mode', 'dryrun', 'policy', 'disable_unsupported_columns', 'external_tables', CAST(@dl_tables AS JSON));
Query OK, 0 rows affected (0.0005 sec)
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > CALL sys.heatwave_load(@db_list, @options);
+------------------------------------------+
| INITIALIZING HEATWAVE AUTO PARALLEL LOAD |
+------------------------------------------+
| Version: 2.20 |
| |
| Load Mode: dryrun |
| Load Policy: disable_unsupported_columns |
| Output Mode: normal |
| |
+------------------------------------------+
6 rows in set (9.4463 sec)
+--------------------------------------------------------------------------------------------------------------------+
| LAKEHOUSE AUTO SCHEMA INFERENCE |
+--------------------------------------------------------------------------------------------------------------------+
| Verifying external lakehouse tables: 1 |
| |
| SCHEMA TABLE TABLE IS RAW NUM. OF ESTIMATED SUMMARY OF |
| NAME NAME CREATED FILE SIZE COLUMNS ROW COUNT ISSUES |
| ------ ----- -------- --------- ------- --------- ---------- |
| `mysql_customer_orders` `delivery_orders` NO 1.10 GiB 7 30 M |
| |
| New schemas to be created: 0 |
| External lakehouse tables to be created: 1 |
| |
+--------------------------------------------------------------------------------------------------------------------+
10 rows in set (9.4463 sec)
+------------------------------------------------------------------------------------------------------------------------+
| OFFLOAD ANALYSIS |
+------------------------------------------------------------------------------------------------------------------------+
| Verifying input schemas: 1 |
| User excluded items: 0 |
| |
| SCHEMA OFFLOADABLE OFFLOADABLE SUMMARY OF |
| NAME TABLES COLUMNS ISSUES |
| ------ ----------- ----------- ---------- |
| `mysql_customer_orders` 1 7 3 table(s) are not loadable, 6 table(s) are already loaded |
| |
| Total offloadable schemas: 1 |
| |
+------------------------------------------------------------------------------------------------------------------------+
10 rows in set (9.4463 sec)
+-----------------------------------------------------------------------------------------------------------------------------+
| CAPACITY ESTIMATION |
+-----------------------------------------------------------------------------------------------------------------------------+
| Default encoding for string columns: VARLEN (unless specified in the schema) |
| Estimating memory footprint for 1 schema(s) |
| |
| TOTAL ESTIMATED ESTIMATED TOTAL DICTIONARY VARLEN ESTIMATED |
| SCHEMA OFFLOADABLE HEATWAVE NODE MYSQL NODE STRING ENCODED ENCODED LOAD |
| NAME TABLES FOOTPRINT FOOTPRINT COLUMNS COLUMNS COLUMNS TIME |
| ------ ----------- --------- --------- ------- ---------- ------- --------- |
| `mysql_customer_orders` 1 682.25 MiB 512.00 KiB 1 0 1 10.00 s |
| |
| Sufficient MySQL host memory available to load all tables. |
| Sufficient HeatWave cluster memory available to load all tables. |
| |
+-----------------------------------------------------------------------------------------------------------------------------+
12 rows in set (9.4463 sec)
+-------------------------------------------------------------------------------------------------------------+
| LOAD SCRIPT GENERATION |
+-------------------------------------------------------------------------------------------------------------+
| Dryrun mode only generates the load script |
| Set mode to "normal" in options to load tables |
| |
| Retrieve load script containing 2 generated DDL commands using the query below: |
| Deprecation Notice: "heatwave_load_report" will be deprecated, please switch to "heatwave_autopilot_report" |
| SELECT log->>"$.sql" AS "Load Script" FROM sys.heatwave_autopilot_report WHERE type = "sql" ORDER BY id; |
| |
| Applying changes will take approximately 10.00 s |
| |
| Caution: Executing the generated load script will affect the secondary engine flags in the schema |
| |
+-------------------------------------------------------------------------------------------------------------+
11 rows in set (9.4463 sec)
Query OK, 0 rows affected (9.4463 sec)
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL >
Modify the columns and create the table in heatwave. If you need to load this table again drop the table and load it again.
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > SELECT log->>"$.sql" AS "Load Script" FROM sys.heatwave_autopilot_report WHERE type = "sql" ORDER BY id;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Load Script |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CREATE TABLE `mysql_customer_orders`.`delivery_orders`( `col_1` int unsigned NOT NULL, `col_2` bigint unsigned NOT NULL, `col_3` tinyint unsigned NOT NULL, `col_4` varchar(9) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=VARLEN', `col_5` tinyint unsigned NOT NULL, `col_6` tinyint unsigned NOT NULL, `col_7` tinyint unsigned NOT NULL) ENGINE=lakehouse SECONDARY_ENGINE=RAPID ENGINE_ATTRIBUTE='{"file": [{"par": "https://yza00k7tuks0.objectstorage.ca-toronto-1.oci.customer-oci.com/p/CuWqb71FgRTcK37zrrYmOZQ_AokIUB2CCIg2fOO9tIpFSNEjrNfpRE5n7Q3lVhmO/n/yza00k7tuks0/b/lakehouse-files/o/delivery-orders-1.csv"}], "dialect": {"format": "csv", "field_delimiter": "\\t", "record_delimiter": "\\n"}}'; |
| ALTER TABLE `mysql_customer_orders`.`delivery_orders` SECONDARY_LOAD; |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.0008 sec)
As the table is loaded to heatwave now let's try to query the table. As this is directly coming from memory query execution drops to microseconds.
CREATE TABLE `mysql_customer_orders`.`delivery_orders`
( `orders_delivery` int unsigned NOT NULL, `order_id` bigint unsigned NOT NULL, `customer_id` tinyint unsigned NOT NULL,
`order_status` varchar(9) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=VARLEN', `store_id` tinyint unsigned NOT NULL,
`delivery_vendor_id` tinyint unsigned NOT NULL, `estimated_time_hours` tinyint unsigned NOT NULL) ENGINE=lakehouse SECONDARY_ENGINE=RAPID ENGINE_ATTRIBUTE='{"file": [{"par": "https://objectstorage.ca-toronto-1.oraclecloud.com/p/IZ7Lo3HMB42T0vfh5QPuCIkIZiTpayz3l1OBFC6kqQq3B_6yIFNPfhRYE2vcBfpd/n/yza00k7tuks0/b/lakehouse-files/o/delivery-orders-1.csv"}], "dialect": {"format": "csv", "field_delimiter": "\\t", "record_delimiter": "\\n"}}';
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > desc delivery_orders;
+----------------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| orders_delivery | int unsigned | NO | | NULL | |
| order_id | bigint unsigned | NO | | NULL | |
| customer_id | tinyint unsigned | NO | | NULL | |
| order_status | varchar(9) | NO | | NULL | |
| store_id | tinyint unsigned | NO | | NULL | |
| delivery_vendor_id | tinyint unsigned | NO | | NULL | |
| estimated_time_hours | tinyint unsigned | NO | | NULL | |
+----------------------+------------------+------+-----+---------+-------+
7 rows in set (0.0015 sec)
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL >
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > select count(*) from delivery_orders;
+----------+
| count(*) |
+----------+
| 29999998 |
+----------+
1 row in set (0.0621 sec)
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > select * from delivery_orders limit 5;
+-----------------+-----------+-------------+--------------+----------+--------------------+----------------------+
| orders_delivery | order_id | customer_id | order_status | store_id | delivery_vendor_id | estimated_time_hours |
+-----------------+-----------+-------------+--------------+----------+--------------------+----------------------+
| 5283230 | 368181698 | 84 | COMPLETE | 21 | 6 | 28 |
| 5283231 | 368181699 | 84 | COMPLETE | 22 | 5 | 48 |
| 5283232 | 368181700 | 84 | COMPLETE | 23 | 5 | 17 |
| 5283233 | 368181701 | 84 | COMPLETE | 23 | 6 | 58 |
| 5283234 | 368181702 | 84 | COMPLETE | 22 | 2 | 43 |
+-----------------+-----------+-------------+--------------+----------+--------------------+----------------------+
5 rows in set (0.0271 sec)
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL > select o.* ,d.* from orders o
-> join delivery_orders d on o.order_id = d.order_id
-> where o.order_id = 93751524;
+----------+---------------------+-------------+--------------+----------+-----------------+----------+-------------+--------------+----------+--------------------+----------------------+
| ORDER_ID | ORDER_DATETIME | CUSTOMER_ID | ORDER_STATUS | STORE_ID | orders_delivery | order_id | customer_id | order_status | store_id | delivery_vendor_id | estimated_time_hours |
+----------+---------------------+-------------+--------------+----------+-----------------+----------+-------------+--------------+----------+--------------------+----------------------+
| 93751524 | 2022-08-23 00:00:00 | 81 | COMPLETE | 3 | 1377370 | 93751524 | 81 | COMPLETE | 3 | 3 | 65 |
+----------+---------------------+-------------+--------------+----------+-----------------+----------+-------------+--------------+----------+--------------------+----------------------+
1 row in set (0.3301 sec)
MySQL 10.0.1.192:33060+ ssl mysql_customer_orders SQL >
Conclusion
In a nutshell, This is the data era, and data coming from various platforms, Better decision-making data engineers get all the data into the picture. Data warehouse needs fast retrieval of data for decision-making. Oracle my SQL heatwave is a game changer, reading data directly from CSV and providing query output in milliseconds is a huge contributive factor for organizations.
No comments:
Post a Comment