Pages

Sunday, June 5, 2011

How to compare the record differences of two similar tables - Part 2 of 2

Permalink: http://bit.ly/1ztV5sU



The rationale behind comparing tables versus using a CHECKSUM TABLE statement can be found in the first part of this entry.

Comparing the record differences of two similar tables can be useful when transferring records from an old database to a new one or when comparing backup tables against the original tables. Depending on specific requirements, it may be necessary to validate that the transfer was successful or to see which specific data in the records of the original and in-use tables have been updated, inserted, or deleted when compared to the backup. The query in the stored procedure below will show the differences caused by updates to current records as well as the record differences caused when new records are inserted or old ones are deleted in one table but not in the other:

DELIMITER $$
DROP PROCEDURE IF EXISTS `compareTables` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `compareTables`(
  dbTableNameA VARCHAR(133),
  dbTableNameB VARCHAR(133))
BEGIN
  DECLARE dbTableA, dbTableB VARCHAR(129);
  DECLARE dbA, tableA, dbB, tableB VARCHAR(64);
  DECLARE columnList TEXT;
  SET dbTableA = REPLACE(dbTableNameA, '`', '');
  SET dbTableB = REPLACE(dbTableNameB, '`', '');
  SET dbA = LEFT(dbTableA, LOCATE('.', dbTableA) - 1);
  IF (dbA = '')
  THEN
    SET dbA = DATABASE();
  END IF;
  SET tableA = RIGHT(dbTableA, LENGTH(dbTableA) - LOCATE('.', dbTableA));
  SET dbB = LEFT(dbTableB, LOCATE('.', dbTableB) - 1);
  IF (dbB = '')
  THEN
    SET dbB = DATABASE();
  END IF;
  SET tableB = RIGHT(dbTableB, LENGTH(dbTableB) - LOCATE('.', dbTableB));
  SET columnList = getColumnList(dbA, tableA);
  SET @compareTablesStmt = CONCAT(
  'SELECT MIN(`compareTables`) AS \'Tables Compared\', ',
     columnList, '
  FROM (
    SELECT \'', dbTableNameA, '\' AS `compareTables`, ',
     columnList, '
    FROM `', dbA, '`.`', tableA, '`
    UNION ALL
    SELECT \'', dbTableNameB, '\' AS `compareTables`, ',
     columnList, '
    FROM `', dbB, '`.`', tableB, '`)
  AS `compareTables`
  GROUP BY ', columnList, '
  HAVING COUNT(*) = 1'
  );
  PREPARE compareTablesStmt FROM @compareTablesStmt;
  EXECUTE compareTablesStmt;
  DEALLOCATE PREPARE compareTablesStmt;
END $$
DELIMITER ;

The stored procedure can be called by passing the two tables to be compared as parameters and can be in the forms: `db`.`table`, db.table, `table`, or table. Passing the names wrapped in backticks and specifying from which database the table is from is optional. If no database is specified, the SP will use the current one. The SP is dependent on a user defined function to get the column names dynamically. The getColumnList() function can be found here.

For TEMPORARY TABLES, you will have to manually input the columns of the tables compared. This cannot be done dynamically purely inside MySQL since as stated in a previous blog entry: "TEMPORARY TABLEs are invisible from the `information_schema` database (as of MySQL version 5.6)".

Here is a sample call:

 CALL compareTables(
 '`origDB`.`origTable`', 'backupTable'
 );

And a sample result set:

Tables ComparedcustomerNumbercontactLastNamecontactFirstName
`origDB`.`origTable`103fooCarine
backupTable103SchmittCarine
`origDB`.`origTable`999foobarfoobarz



The first two rows compares the records with the Primary Key or ID = 103 and shows that the data under `contactLastName` are different between the original and backup tables. The third row shows a record that is present in the original table but not in the backup table.

See part 1 of this entry for the rationale behind the SELECT statement using a UNION ALL clause used by the stored procedure.