Oracle中比对2张表之间数据是否一致的几种方法by Maclean.liuliu.maclean@About Mel Email & Gtalk:liu.maclean@l Blog:l QQ:47079569 QQ Group:23549328l Oracle Certified Database Administrator Master 10g and 11gl Over 6 years experience with Oracle DBA technology l Over 7 years experience with Linux technologyl Member Independent Oracle Users Groupl Member All China Oracle Users Groupl Presents for advanced Oracle topics: RAC, DataGuard, Performance Tuning and Oracle Internal.How To Find Maclean Liu?大约是2个星期前做一个夜班的时候,开发人员需要比对shareplex 数据同步复制软件在 源端和目标端的2张表上的数据是否一致,实际上后来想了下shareplex 本身应当具有这种数据校验功能, 但是还是希望从数据库的角度得出几种可用的同表结构下的数据比对方法。
注意以下几种数据比对方式适用的前提条件:1. 所要比对的表的结构是一致的2. 比对过程中源端和 目标端 表上的数据都是静态的,没有任何DML修改方式1:假设你所要进行数据比对的数据库其中有一个 版本为11g且该表上有相应的主键索引(primary key index)或者唯一非空索引(unique key ¬ null)的话,那么恭喜你! 你可以借助11g 新引入的专门做数据对比的PL/SQL Package dbms_comparison来实现数据校验的目的,如以下演示:Source 源端版本为11gR2 :conn maclean/macleanSQL> select * from v$version;BANNER--------------------------------------------------------------------------------Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit ProductionPL/SQL Release 11.2.0.3.0 - ProductionCORE 11.2.0.3.0 ProductionTNS for Linux: Version 11.2.0.3.0 - ProductionNLSRTL Version 11.2.0.3.0 - ProductionSQL> select * from global_name;GLOBAL_NAME & drop table test1;create table test1 tablespace users as select object_id t1,object_name t2 from dba_objects where object_id is not null;alter table test1 add primary key(t1);exec dbms_stats.gather_table_stats('MACLEAN','TEST1',cascade=>TRUE);create database link maclean connect to maclean identified by maclean using'G10R21';Database link created.以上源端数据库版本为11.2.0.3 , 源表结构为test1(t1 number primary key,t2 varchar2(128),透过dblink链接到版本为10.2.0.1的目标端conn maclean/macleanSQL> select * from v$versionBANNER----------------------------------------------------------------Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64biPL/SQL Release 10.2.0.1.0 - ProductionCORE 10.2.0.1.0 ProductionTNS for Linux: Version 10.2.0.1.0 - ProductionNLSRTL Version 10.2.0.1.0 - Productioncreate table test2 tablespace users as select object_id t1,object_name t2from dba_objects where object_id is not null;alter table test2 add primary key(t1);exec dbms_stats.gather_table_stats('MACLEAN','TEST2',cascade=>TRUE);目标端版本为10.2.0.1 , 表结构为test2(t1 number primary key,t2 varchar2(128))。
注意这里2张表上均必须有相同的主键索引或者伪主键索引(pseudoprimary key伪主键要求是唯一键且所有的成员列均是非空NOT NULL)。
实际创建comparison对象,并实施校验:begindbms_comparison.create_comparison(comparison_name => 'MACLEAN_TEST_COM',schema_name => 'MACLEAN',object_name => 'TEST1',dblink_name => 'MACLEAN',remote_schema_name => 'MACLEAN',remote_object_name => 'TEST2',scan_mode => dbms_comparison.CMP_SCAN_MODE_FULL); end;PL/SQL procedure successfully completed.SQL> set linesize 80 pagesize 1400SQL> select * from user_comparison where comparison_name='MACLEAN_TEST_COM'; COMPARISON_NAME COMPA SCHEMA_NAME------------------------------ ----- ------------------------------OBJECT_NAME OBJECT_TYPE REMOTE_SCHEMA_NAME------------------------------ ----------------- ------------------------------REMOTE_OBJECT_NAME REMOTE_OBJECT_TYP------------------------------ -----------------DBLINK_NAME--------------------------------------------------------------------------------SCAN_MODE SCAN_PERCENT--------- ------------CYCLIC_INDEX_VALUE--------------------------------------------------------------------------------NULL_VALUE--------------------------------------------------------------------------------LOCAL_CONVERGE_TAG--------------------------------------------------------------------------------REMOTE_CONVERGE_TAG--------------------------------------------------------------------------------MAX_NUM_BUCKETS MIN_ROWS_IN_BUCKET--------------- ------------------LAST_UPDATE_TIME---------------------------------------------------------------------------MACLEAN_TEST_COM TABLE MACLEANTEST1 TABLE MACLEANTEST2 TABLEMACLEANFULLORA$STREAMS$NV1000 1000020-DEC-11 01.08.44.562092 PM利用dbms_comparison.create_comparison创建comparison后,新建的comparison会出现在user_comparison视图中;以上我们完成了comparison的创建,但实际的校验仍未发生我们利用10046事件监控这个数据对比过程:conn maclean/macleanset timing on;alter system flush shared_pool;alter session set events '10046 trace name context forever,level 8';set serveroutput onDECLAREretval dbms_parison_type;BEGINIF dbms_pare('MACLEAN_TEST_COM', retval, perform_row_dif => TRUE) THENdbms_output.put_line('No Differences');ELSEdbms_output.put_line('Differences Found');END IF;END;/说Differences Found =====> 返回果结为Differences Found,明数据存在差异并不一致PL/SQL procedure successfully completed.Elapsed: 00:00:10.87===========================10046 tkprof result =========================SELECT MIN("T1"), MAX("T1")FROM"MACLEAN"."TEST1"SELECT MIN("T1"), MAX("T1")FROM"MACLEAN"."TEST2"@MACLEANSELECT COUNT(1)FROM"MACLEAN"."TEST1" s WHERE ("T1" >= :scan_min AND "T1" <= :scan_max )SELECT COUNT(1)FROM"MACLEAN"."TEST2"@MACLEAN s WHERE ("T1" >= :scan_min AND "T1" <= :scan_max ) SELECT q.wb1, min(q."T1") min_range1, max(q."T1") max_range1, count(*)num_rows, sum(q.s_hash) sum_range_hashFROM(SELECT /*+ FULL(s) */ width_bucket(s."T1", :scan_min1, :scan_max_inc1,:num_buckets) wb1, s."T1", ora_hash(NVL(to_char(s."T1"), 'ORA$STREAMS$NV'), 4294967295, ora_hash(NVL((s."T2"), 'ORA$STREAMS$NV'), 4294967295, 0))s_hash FROM "MACLEAN"."TEST1" s WHERE (s."T1">=:scan_min1 AND s."T1"<=:scan_max1) ) q GROUP BY q.wb1 ORDER BY q.wb1SELECT /*+ REMOTE_MAPPED */ q.wb1, min(q."T1") min_range1, max(q."T1")max_range1, count(*) num_rows, sum(q.s_hash) sum_range_hashFROM(SELECT /*+ FULL(s) REMOTE_MAPPED */ width_bucket(s."T1", :scan_min1,:scan_max_inc1, :num_buckets) wb1, s."T1", ora_hash(NVL(to_char(s."T1"),'ORA$STREAMS$NV'), 4294967295, ora_hash(NVL((s."T2"), 'ORA$STREAMS$NV'), 4294967295, 0)) s_hash FROM "MACLEAN"."TEST2"@MACLEAN s WHERE (s."T1">=:scan_min1 AND s."T1"<=:scan_max1) ) q GROUP BY q.wb1 ORDER BY q.wb1SELECT /*+ FULL(P) +*/ * FROM "MACLEAN"."TEST2" PSELECT /*+ FULL ("A1") */WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS),MIN("A1"."T1"),MAX("A1"."T1"),COUNT(*),SUM(ORA_HASH(NVL(TO_CHAR("A1"."T1"), 'ORA$STREAMS$NV'),4294967295,ORA_HASH(NVL("A1"."T2", 'ORA$STREAMS$NV'), 4294967295, 0)))FROM "MACLEAN"."TEST2" "A1"WHERE "A1"."T1" >= :SCAN_MIN1AND "A1"."T1" <= :SCAN_MAX1GROUP BY WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS) ORDER BY WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS)SELECT ROWID, "T1", "T2"FROM "MACLEAN"."TEST2" "R"WHERE "T1" >= :1AND "T1" <= :2--------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost(%CPU)| Time |--------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 126 | 3528 | 4 (0)| 00:00:01 ||* 1 | FILTER | | | | | || 2 | TABLE ACCESS BY INDEX ROWID| TEST2 | 126 | 3528 | 4 (0)| 00:00:01 ||* 3 | INDEX RANGE SCAN | SYS_C006255 | 227 | | 2 (0)| 00:00:01 |--------------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------1 - filter(TO_NUMBER(:1)<=TO_NUMBER(:2))3 - access("T1">=TO_NUMBER(:1) AND "T1"<=TO_NUMBER(:2))SELECT ll.l_rowid, rr.r_rowid, NVL(ll."T1", rr."T1") idx_valFROM(SELECT l.rowid l_rowid, l."T1", ora_hash(NVL(to_char(l."T1"),'ORA$STREAMS$NV'), 4294967295, ora_hash(NVL((l."T2"), 'ORA$STREAMS$NV'), 4294967295, 0)) l_hash FROM "MACLEAN"."TEST1" l WHERE l."T1">=:scan_min1AND l."T1"<=:scan_max1 ) ll FULL OUTER JOIN (SELECT /*+ NO_MERGEREMOTE_MAPPED */ r.rowid r_rowid, r."T1", ora_hash(NVL(to_char(r."T1"),'ORA$STREAMS$NV'), 4294967295, ora_hash(NVL((r."T2"), 'ORA$STREAMS$NV'), 4294967295, 0)) r_hash FROM "MACLEAN"."TEST2"@MACLEAN r WHERE r."T1">=:scan_min1 AND r."T1"<=:scan_max1 ) rr ON ll."T1"=rr."T1" WHERE ll.l_hashIS NULL OR rr.r_hash IS NULL OR ll.l_hash <> rr.r_hash----------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Inst |IN-OUT|----------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 190 | 754K| 9 (12)| 00:00:01 | | ||* 1 | VIEW | VW_FOJ_0 | 190 | 754K| 9 (12)| 00:00:01 | | ||* 2 | HASH JOIN FULL OUTER | | 190 | 754K| 9 (12)| 00:00:01 | | || 3 | VIEW | | 190 | 7220 | 4 (0)| 00:00:01 | | ||* 4 | FILTER | | || | | | || 5 | TABLE ACCESS BY INDEX ROWID| TEST1 | 190 | 5510 | 4 (0)| 00:00:01 | | ||* 6 | INDEX RANGE SCAN | SYS_C0013098 | 341 | | 2 (0)| 00:00:01 | | || 7 | VIEW | | 126 | 495K| 4 (0)| 00:00:01 | | || 8 | REMOTE | TEST2 | 126 | 3528 | 4 (0)| 00:00:01 | MACLE~ | R->S |----------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------1 - filter("LL"."L_HASH" IS NULL OR "RR"."R_HASH" IS NULL OR"LL"."L_HASH"<>"RR"."R_HASH")2 - access("LL"."T1"="RR"."T1")4 - filter(TO_NUMBER(:SCAN_MIN1)<=TO_NUMBER(:SCAN_MAX1))6 - access("L"."T1">=TO_NUMBER(:SCAN_MIN1) AND"L"."T1"<=TO_NUMBER(:SCAN_MAX1))Remote SQL Information (identified by operation id):----------------------------------------------------8 - SELECT ROWID,"T1","T2" FROM "MACLEAN"."TEST2" "R" WHERE "T1">=:1 AND"T1"<=:2 (accessing'MACLEAN' )可以看到以上过程中虽然没有避免对TEST1、TEST2表的全表扫描(FULL TABLE SCAN), 但是好在实际参与HASH JOIN FULL OUTER 的仅是访问索引后获得的少量数据,所以效率还是挺高的。