I am trying to update one column CODE from table TAB1 using REF_TAB table having 200 mn and 30 mn records respectively, but it is taking 3-4 hrs to update the records.
TAB1 -- 200000000(200mn)
ID VARCHAR2(10) PK on ID
T_ID VARCHAR2(10) --- duplicate values
X_CD VARCHAR2(2)
CODE VARCHAR2(2) -- list partition with 30 codes
REF_TAB --30000000(30 mn)
T_ID VARCHAR2(10) --- unique values
CODE VARCHAR2(2)
The unique index on TAB1.ID primary key is the only index.
I have tried MERGE. Is there any other way to improve the performance of the query or by any other means?
MERGE /*+ parallel(A,4) append */ INTO TAB1 A
USING (SELECT T_ID,CODE FROM REF_TAB) R
ON ( A.T_ID = R.T_ID
AND A.X_CD IS NOT NULL
)
WHEN MATCHED THEN
UPDATE
SET A.CODE = R.CODE;
COMMIT;
with append
Plan hash value: 3594611329
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 93M| 714M| 494K (4)| 00:00:03 | | | | | |
| 1 | MERGE | TAB1 | | | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10001 | 93M| 27G| 494K (4)| 00:00:03 | | | Q1,01 | P->S | QC (RAND) |
| 4 | VIEW | | | | | | | | Q1,01 | PCWP | |
|* 5 | HASH JOIN | | 93M| 27G| 494K (4)| 00:00:03 | | | Q1,01 | PCWP | |
| 6 | PX RECEIVE | | 32M| 374M| 8430 (5)| 00:00:01 | | | Q1,01 | PCWP | |
| 7 | PX SEND BROADCAST | :TQ10000 | 32M| 374M| 8430 (5)| 00:00:01 | | | Q1,00 | S->P | BROADCAST |
| 8 | PX SELECTOR | | | | | | | | Q1,00 | SCWC | |
| 9 | TABLE ACCESS FULL| REF_TAB | 32M| 374M| 8430 (5)| 00:00:01 | | | Q1,00 | SCWP | |
| 10 | PX BLOCK ITERATOR | | 93M| 26G| 484K (3)| 00:00:03 | 1 | 35 | Q1,01 | PCWC | |
|* 11 | TABLE ACCESS FULL | TAB1 | 93M| 26G| 484K (3)| 00:00:03 | 1 | 35 | Q1,01 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------------------
without append
Plan hash value: 3594611329
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 103M| 790M| 494K (4)| 00:00:03 | | | | | |
| 1 | MERGE | TAB1 | | | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10001 | 103M| 30G| 494K (4)| 00:00:03 | | | Q1,01 | P->S | QC (RAND) |
| 4 | VIEW | | | | | | | | Q1,01 | PCWP | |
|* 5 | HASH JOIN | | 103M| 30G| 494K (4)| 00:00:03 | | | Q1,01 | PCWP | |
| 6 | PX RECEIVE | | 32M| 374M| 8430 (5)| 00:00:01 | | | Q1,01 | PCWP | |
| 7 | PX SEND BROADCAST | :TQ10000 | 32M| 374M| 8430 (5)| 00:00:01 | | | Q1,00 | S->P | BROADCAST |
| 8 | PX SELECTOR | | | | | | | | Q1,00 | SCWC | |
| 9 | TABLE ACCESS FULL| REF_TAB | 32M| 374M| 8430 (5)| 00:00:01 | | | Q1,00 | SCWP | |
| 10 | PX BLOCK ITERATOR | | 103M| 29G| 485K (4)| 00:00:03 | 1 | 35 | Q1,01 | PCWC | |
|* 11 | TABLE ACCESS FULL | TAB1 | 103M| 29G| 485K (4)| 00:00:03 | 1 | 35 | Q1,01 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------------------
1) Add indices
You can add an index on column
T_IDin tabletab1. That should speed up the statement a lot. Or what also might help as you sayT_IDinREF_TABis unique if you make that column the primary key that should also help.In both cases you should get rid of one full table scan and replace it with an index or index range scan, which should be faster.
2) Create table solution
One other solution that should be faster is to create a new table
tab2:Add the indices, grants, constraints to tab2, drop tab1 and then rename tab2 to tab1. This way you don't store rollback information, which is the main resource problem when doing mass updates.
3) Use Foreign Key + Join instead of duplicate data (DB normalization)
You might also consider to add a foreign key on
tab1:And then drop
CODEcolumn from tab1 and read the value perJOIN- so you don't have to update the value, when it is changed.