Pages

Showing posts with label MySQL. Show all posts
Showing posts with label MySQL. Show all posts

Sunday, November 30, 2014

Automatic Logging of Table Data Changes and Creation of Backups via a Stored Procedure

Permalink: http://bit.ly/1rNFRsO



The stripped down stored procedure shown below will accept any Data Manipulation Language statement as its parameter and automatically log the statement and create table backup copies before the statement is executed. The logging functionality is similar to MySQL's binary log but exclusive to DML statements and is useful for table data recovery operations, such as undoing the last table data change or to revert databases back to a certain point in time. All this is done exclusively using stored routines (procedures and functions).

Its assumed that the databases and tables that will be used are already formed to specific business requirements since DDL statements will not be logged by the stored procedure. Though logging of table data changes can also be achieved using triggers, it is not practical to alter each and every trigger of every table if there are numerous tables to consider. Using a stored procedure allows the functionality to be portable to any existing database since no alteration is needed.


DELIMITER $$
DROP PROCEDURE IF EXISTS `$DML` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `$DML`(
  IN dmlString TEXT)
$dml: BEGIN
   DECLARE clearLogs BOOLEAN DEFAULT TRUE;
   DECLARE backupAllDB BOOLEAN DEFAULT TRUE;
   DECLARE dmlFilter BOOLEAN DEFAULT TRUE;
   DECLARE EXIT HANDLER
     FOR SQLWARNING, SQLEXCEPTION, NOT FOUND
     ROLLBACK;
   SET dmlString = TRIM(dmlString);
   IF (dmlString LIKE ('UNDO%'))
   THEN
     CALL $UNDO(dmlString);
     LEAVE $dml;
   END IF;
   initialize: BEGIN
   DECLARE selectStmtBackupAllDB  TEXT;
     IF (
       IF((
         EXISTS(
           SELECT *
           FROM `information_schema`.`TABLES`
           WHERE (`TABLE_SCHEMA` = '$backup'))
         ),
       1, 0)
     )
     THEN
       LEAVE initialize;
     END IF;
     CREATE DATABASE `$backup`;
     CREATE TABLE `$backup`.`history` (
       `dmlStringId` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
       `user` VARCHAR(255) NOT NULL,
       `dmlString` TEXT NOT NULL,
       `timestamp` TIMESTAMP NOT NULL DEFAULT NOW(),
       PRIMARY KEY (`dmlStringId`)
     )
     ENGINE = InnoDB;
     CREATE TABLE `$backup`.`tableref` (
       `tableRefId` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
       `dbName` VARCHAR(64) NOT NULL,
       `tableName` VARCHAR(64) NOT NULL,
       `tableAlias` VARCHAR(64) NOT NULL,
       `backupOf` INTEGER UNSIGNED,
       PRIMARY KEY (`tableRefId`)
     )
     ENGINE = InnoDB;
     IF (backupAllDB)
     THEN
       SET selectStmtBackupAllDB =
         "SELECT T.`TABLE_SCHEMA`, T.`TABLE_NAME`
           FROM `information_schema`.`TABLES` T
           WHERE (`TABLE_SCHEMA` != 'information_schema')
             AND (`TABLE_SCHEMA` != 'mysql')
             AND (`TABLE_SCHEMA` != 'performance_schema')
             AND (`TABLE_SCHEMA` != '$backup')
             AND (`TABLE_NAME` NOT LIKE 'dynamic_cursor%')";
     ELSE
       SET selectStmtBackupAllDB =
         "SELECT T.`TABLE_SCHEMA`, T.`TABLE_NAME`
           FROM `information_schema`.`TABLES` T
           WHERE (`TABLE_SCHEMA` =  DATABASE())
             AND (`TABLE_NAME` NOT LIKE 'dynamic_cursor%')";
     END IF;
     CALL dynamicCursor(
       selectStmtBackupAllDB,
       'initialize', @NULL, logging
     );
   END initialize;
   IF (dmlFilter)
     AND (dmlString NOT LIKE ('INSERT%'))
     AND (dmlString NOT LIKE ('UPDATE%'))
     AND (dmlString NOT LIKE ('DELETE%'))
     AND (dmlString NOT LIKE ('REPLACE%'))
     AND (dmlString NOT LIKE ('SELECT%'))
     AND (dmlString NOT LIKE ('SHOW%'))
   THEN
     SET dmlString = NULL;
   END IF;
   START TRANSACTION;
     SET @dmlStringStmt = dmlString;
     PREPARE dmlStringStmt FROM @dmlStringStmt;
     EXECUTE dmlStringStmt;
     DEALLOCATE PREPARE dmlStringStmt;
   COMMIT;
   IF (dmlString LIKE ('SELECT%'))
     OR (dmlString LIKE ('SHOW%'))
   THEN
     LEAVE $dml;
   ELSE
     INSERT `$backup`.`history`
     VALUES (DEFAULT, USER(), dmlString, DEFAULT);
   END IF;
END $dml $$
DELIMITER ;

The stored procedure is part of a small project, Safe DML, $DML().

Monday, June 13, 2011

How to Disable/Enable Triggers on Demand

Permalink: http://bit.ly/1kBCcQu



One of the sorely wanted features missing in MySQL is the ability to disable/enable triggers on the fly. By comparison, disabling/enabling Foreign Key constraints can be simply done by setting a server system variable:

SET FOREIGN_KEY_CHECKS = [TRUE|FALSE];

Now as of version 5.6, there is no built-in server system variable TRIGGER_CHECKS in MySQL. A simple workaround is to instead use a user-defined session variable. The setting for trigger checks stored into the session variable allows the setting to be seen by all statements, including all stored procedures and functions, as long as the user is connected, which for this workaround is in effect similar to using a server system variable.

Besides a session variable that switches the checks for all triggers, to make the control more flexible, we can also set a session variable for each trigger type. There is a total of 6 trigger types, a combination of [BEFORE|AFTER] with [INSERT|UPDATE|DELETE]. The total to consider would be 7 session variables:

#Affects all triggers 
#FALSE value overrides trigger type settings
SET @TRIGGER_CHECKS = [TRUE|FALSE];

#Trigger type settings
SET @TRIGGER_BEFORE_INSERT_CHECKS = [TRUE|FALSE];
SET @TRIGGER_AFTER_INSERT_CHECKS = [TRUE|FALSE];
SET @TRIGGER_BEFORE_UPDATE_CHECKS = [TRUE|FALSE];
SET @TRIGGER_AFTER_UPDATE_CHECKS = [TRUE|FALSE];
SET @TRIGGER_BEFORE_DELETE_CHECKS = [TRUE|FALSE];
SET @TRIGGER_AFTER_DELETE_CHECKS = [TRUE|FALSE];

For the disable/enable logic, the only value for the session variables that matters is the FALSE value. Anything that isn't FALSE or 0 is considered TRUE. Even NULL values are to be considered TRUE. It makes sense that the default value for trigger checks should be TRUE, and when the user disconnects, the session variable is reset back to NULL which re-enables trigger checking.

Implementing the ability to disable/enable on existing triggers could be hard if there are several hundred tables that need to be altered and with each table having their own unique triggers. This could especially be painful if there is no easy way to automate the updating of those existing triggers.

Triggers are special stored procedures and to implement the trigger disable/enable functionality, a simple logic needs to be inserted at the BEGIN statement. Here's an example of an existing BEFORE-INSERT TRIGGER type that has been patched so that it can be disabled/enabled:

DELIMITER $$
DROP TRIGGER IF EXISTS `yearCheck_beforeInsert` $$
CREATE DEFINER=`root`@`localhost` TRIGGER `yearCheck_beforeInsert`
BEFORE INSERT ON `movies` FOR EACH ROW

#Patch starts here
thisTrigger: BEGIN
  IF ((@TRIGGER_CHECKS = FALSE)
      OR (@TRIGGER_BEFORE_INSERT_CHECKS = FALSE))
    AND (USER() = 'root@localhost')
  THEN
    LEAVE thisTrigger;
  END IF;

#Original trigger body starts here
  IF (NEW.YEAR > YEAR(NOW()))
  THEN
    SET NEW.YEAR = NULL;
  END IF;
END $$
DELIMITER ;

Here's the rundown on the workaround code patch:
  • The BEGIN statement needs to have a label in order to LEAVE the TRIGGER if the trigger checking is SET to FALSE. This facilitates the patching of the functionality into existing TRIGGERs without needing to wrap the original body of the TRIGGER inside the IF-THEN construct.
  • The appropriate session variable trigger type needs to be correctly chosen, in order for the specific control variable to be able to turn on/off all the triggers intended for it. In this example, it is correctly coded as  @TRIGGER_BEFORE_INSERT_CHECKS.
  • We wouldn't want just any user to be able to disable TRIGGERs. Unfortunately, GRANT PRIVILEGES can't cover this case. Either we would need to hardcode the specific USER/s that can disable triggers into the TRIGGER procedure in order to have security or allow only USERs that have the SUPER privilege since disabling triggers is an administrative operation (see this comment for the condition change).

Triggers can CALL stored procedures and its possible that the trigger logic be moved into an external SP. However, this is not recommended since it would not be possible to use the special trigger-specific variables OLD.<columnName> and NEW.<columnName> in the SP.



When the database administrator logs into the MySQL server under the user 'root@localhost', they can run DML statements without invoking any trigger by setting the @TRIGGER_CHECKS session variable to FALSE. Once done, setting it back to TRUE re-enables triggers and they can continue with other tasks that require triggers to fire. They can simply disconnect immediately without setting the check back to TRUE since in their next session the triggers are enabled by default.

Another workaround on a missing MySQL functionality, support for dynamic cursors, can be found here.

Sunday, June 12, 2011

Truly Random and Complex Password Generator - Part 2 of 2

Permalink: http://bit.ly/1tMM9h2



In the first part of this entry, it was shown how its possible that a password from a normal user would significantly be weaker than that from a complex and randomly generated one.

Note: in the strictest sense, there is no such thing as an uncrackable password. Passwords can be uncrackable only in theory, i.e. the mathematical probability of a password being guessed correctly-- no matter how infinitesimally small the odds are, the possibility of a right guess is always present. Also, passwords are uncrackable only technically-- given enormous resources and time any password can be cracked.

Here is a function for a truly random and complex password generator which is based on the formulas given in the first part of this entry:

DELIMITER $$
DROP FUNCTION IF EXISTS `randomPasswordGenerator` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `randomPasswordGenerator`(
  ) RETURNS varchar(64) CHARSET utf8
BEGIN
  DECLARE charCount TINYINT(1) DEFAULT 0;
  DECLARE charDiceRoll TINYINT(2);
  DECLARE randomChar CHAR(1);
  DECLARE randomPassword CHAR(8) DEFAULT '';
  REPEAT
    SET charCount = charCount + 1;
    SET charDiceRoll = 1 + FLOOR(RAND() * 94);
    IF (charDiceRoll <= 32)
    THEN
      SET randomChar = ELT(charDiceRoll,
      '`', '~', '!', '@', '#', '$', '%', '^',
      '&', '*', '(', ')', '-', '=', '_', '+',
      '[', ']', '{', '}', '\\', '/', '|', '?',
      ';', ':', '\'', '"', ',', '.', '<', '>');
    ELSEIF (charDiceRoll >= 33)
      AND (charDiceRoll <= 68)
    THEN
      SET charDiceRoll = charDiceRoll - 33;
      SET randomChar = CONV(
        charDiceRoll,
        10, 36);
    ELSE
      SET charDiceRoll = charDiceRoll - 59;
      SET randomChar = LOWER(
        CONV(
          charDiceRoll,
          10, 36)
      );
    END IF;
    SET randomPassword = CONCAT(randomPassword, randomChar);
  UNTIL (charCount = 8)
  END REPEAT;
  RETURN randomPassword;
END $$
DELIMITER ;

This function will return an 8-character password string. Each character has an equal chance of 1/94 to be generated. Given a short period of time and a normal amount of resources, this qualifies as a theoretical technically uncrackable password. It can be modified to return a longer password length or even a random length, say between 8-12 characters long. A separate user defined function, randomRangePicker(), can be used, if refactoring is desired.



The output can be checked with a simple SELECT statement:

SELECT randomPasswordGenerator();

See the first part of this entry or a similar random string/name generator.

Wednesday, June 8, 2011

Truly Random and Complex Password Generator - Part 1 of 2

Permalink: http://bit.ly/1pJlpHz



Skip to the 2nd part for the code snippet.

Its an important matter of security to enforce complex passwords that have a sufficient length. From personal experience, if you ask a normal user to create their own passwords, their passwords will be based on a character set consisting of 36 case-insensitive alphanumeric characters: a-z, 0-9 instead of the full 94 character set typable on all keyboard layouts. Also, most normal users would use dictionary based passwords with a predictable pattern: dictionary words at the beginning and numbers at the end.

Relying solely on the client-side or front-end to enforce the creation of passwords of at least 8 characters long and the use of special characters will not be practical in preventing the use of dictionary words as well as the usage of a certain pattern. Whatever the mechanism is on the client-side, the backend MySQL database should complement it.

Assigning complex passwords to users will, in effect, increase the number of characters from 36 to 94. By making the password randomly generated, the predictability of dictionary words and pattern matching is removed. The number of possible passwords is substantially increased. For an 8-character password string, under a reasonable time limitation, say 6 hours, and using a single modern computer, this results to a theoretical technically uncrackable password:

SELECT FORMAT(POW(32, 8), 0); 
  -- Results to 1,099,511,627,776 possible combinations. Note that the number of possible combinations is greatly reduced when the user limits the password to use dictionary words and pattern matching. This results to a crackable password in a short period of time.  

  
 SELECT FORMAT(POW(94, 8), 0); 
  -- Results to 6,095,689,385,410,816 possible combinations. By being randomly generated, the number of combinations is not reduced as explained above. This results to a theoretical technically uncrackable password given a short period of time.  

A password generator, to be truly random, should satisfy the following:
  • The character set for the generator should include all the typable characters on any keyboard layout: 

    a-z, A-Z, 0-9,
    and ` ~ ! @ # $ % ^ & * ( ) - = _ + [ ] { } \ / | ? ; : ' " , . < >

    This results to 26 + 26 + 10 + 32 = 94 characters.
  • Each of the allowed characters should all have an equal chance of being generated.

For practical purposes, we'll take aside arguments on password complexity versus password length, and we'll assume an 8-character password string. To generate any of the 62 alphanumeric characters, we'll use a base 36 statement as the formula:

SELECT CONV(
          FLOOR(
            RAND() * 36),
      10, 36);

Using a base 36 statement gives us the most compact alphanumeric numeral system. The case sensitivity will be based on odds from a random number range in order to include the LOWER case of the alphabet.

The special characters can be generated by using the ELT function as the basis for the formula like:

SELECT ELT(1 + FLOOR(RAND() * 32),
      '`', '~', '!', '@', '#', '$', '%', '^',
      '&', '*', '(', ')', '-', '=', '_', '+',
      '[', ']', '{', '}', '\\', '/', '|', '?',
      ';', ':', '\'', '"', ',', '.', '<', '>');




In the continuation of this entry is an example of a true random and complex password generator function.

Sunday, June 5, 2011

How to compare the record differences of two similar tables - Part 2 of 2

Permalink: http://bit.ly/1ztV5sU



The rationale behind comparing tables versus using a CHECKSUM TABLE statement can be found in the first part of this entry.

Comparing the record differences of two similar tables can be useful when transferring records from an old database to a new one or when comparing backup tables against the original tables. Depending on specific requirements, it may be necessary to validate that the transfer was successful or to see which specific data in the records of the original and in-use tables have been updated, inserted, or deleted when compared to the backup. The query in the stored procedure below will show the differences caused by updates to current records as well as the record differences caused when new records are inserted or old ones are deleted in one table but not in the other:

DELIMITER $$
DROP PROCEDURE IF EXISTS `compareTables` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `compareTables`(
  dbTableNameA VARCHAR(133),
  dbTableNameB VARCHAR(133))
BEGIN
  DECLARE dbTableA, dbTableB VARCHAR(129);
  DECLARE dbA, tableA, dbB, tableB VARCHAR(64);
  DECLARE columnList TEXT;
  SET dbTableA = REPLACE(dbTableNameA, '`', '');
  SET dbTableB = REPLACE(dbTableNameB, '`', '');
  SET dbA = LEFT(dbTableA, LOCATE('.', dbTableA) - 1);
  IF (dbA = '')
  THEN
    SET dbA = DATABASE();
  END IF;
  SET tableA = RIGHT(dbTableA, LENGTH(dbTableA) - LOCATE('.', dbTableA));
  SET dbB = LEFT(dbTableB, LOCATE('.', dbTableB) - 1);
  IF (dbB = '')
  THEN
    SET dbB = DATABASE();
  END IF;
  SET tableB = RIGHT(dbTableB, LENGTH(dbTableB) - LOCATE('.', dbTableB));
  SET columnList = getColumnList(dbA, tableA);
  SET @compareTablesStmt = CONCAT(
  'SELECT MIN(`compareTables`) AS \'Tables Compared\', ',
     columnList, '
  FROM (
    SELECT \'', dbTableNameA, '\' AS `compareTables`, ',
     columnList, '
    FROM `', dbA, '`.`', tableA, '`
    UNION ALL
    SELECT \'', dbTableNameB, '\' AS `compareTables`, ',
     columnList, '
    FROM `', dbB, '`.`', tableB, '`)
  AS `compareTables`
  GROUP BY ', columnList, '
  HAVING COUNT(*) = 1'
  );
  PREPARE compareTablesStmt FROM @compareTablesStmt;
  EXECUTE compareTablesStmt;
  DEALLOCATE PREPARE compareTablesStmt;
END $$
DELIMITER ;

The stored procedure can be called by passing the two tables to be compared as parameters and can be in the forms: `db`.`table`, db.table, `table`, or table. Passing the names wrapped in backticks and specifying from which database the table is from is optional. If no database is specified, the SP will use the current one. The SP is dependent on a user defined function to get the column names dynamically. The getColumnList() function can be found here.

For TEMPORARY TABLES, you will have to manually input the columns of the tables compared. This cannot be done dynamically purely inside MySQL since as stated in a previous blog entry: "TEMPORARY TABLEs are invisible from the `information_schema` database (as of MySQL version 5.6)".

Here is a sample call:

 CALL compareTables(
 '`origDB`.`origTable`', 'backupTable'
 );

And a sample result set:

Tables ComparedcustomerNumbercontactLastNamecontactFirstName
`origDB`.`origTable`103fooCarine
backupTable103SchmittCarine
`origDB`.`origTable`999foobarfoobarz



The first two rows compares the records with the Primary Key or ID = 103 and shows that the data under `contactLastName` are different between the original and backup tables. The third row shows a record that is present in the original table but not in the backup table.

See part 1 of this entry for the rationale behind the SELECT statement using a UNION ALL clause used by the stored procedure.

Thursday, June 2, 2011

How to compare the record differences of two similar tables - Part 1 of 2

Permalink: http://bit.ly/1z7MNGQ



Click here to skip to the code snippet.

The CHECKSUM TABLE result sets of two similar tables only indicate if there are differences between the two tables. It does not tell you what exactly the differences are between the two. Does `tableA` have an updated value for one of its records that `tableB` does not have? Or does `tableA` have an extra row that `tableB` does not have? Depending on specific business requirements, the CHECKSUM TABLE statement may be sufficient. If you need to determine what the actual differences are, there is a way to go about doing this automatically and dynamically by creating a stored procedure.

This SELECT statement that uses a UNION ALL clause provides the same functionality in simply determining if there are differences between two tables:

  SELECT
  IF(
    EXISTS(
    SELECT *
    FROM (
      SELECT *
      FROM `tableA`
      UNION ALL
      SELECT *
      FROM `tableB`)
    AS `compareTables`
    GROUP BY `column1`, `column2`, `columnN`
    HAVING COUNT(*) = 1),
  TRUE, FALSE);

This statement provides an automated check in the sense that you won't have to manually compare the CHECKSUMs of `tableA` and `tableB`. Also, unlike using a CHECKSUM TABLE statement, a SELECT statement allows the result to be stored INTO a variable. If the statement returns TRUE then there is a difference found between the two tables, if not it will return FALSE. The GROUP BY clause will need a comma-separated list of all the columns of the tables since MySQL will not permit an asterisk (*) wildcard in this part of the statement.

To fully automate the above formula, it can be converted into a function and the writing of the column names for the GROUP BY clause can be retrieved using the getColumnList() user defined function. Then re-write the SELECT statement as a dynamic PREPARE-EXECUTE construct. The two tables `tableA` and `tableB` can be the parameters to be passed and the TRUE/FALSE values to be the RETURN.

To determine what the actual differences are, a more complex SELECT statement is needed. Based on the first iteration, we can derive from it and make it more useful:

SELECT MIN(`compareTables`) AS 'Tables Compared',
  `column1`, `column2`, `columnN`
FROM (
  SELECT '`tableA`' AS `compareTables`,
    `column1`, `column2`, `columnN`
  FROM `tableA`
  UNION ALL
  SELECT '`tableB`' AS `compareTables`,
    `column1`, `column2`, `columnN`
  FROM `tableB`)
AS `compareTables`
GROUP BY `column1`, `column2`, `columnN`
HAVING COUNT(*) = 1;

A sample result set will look like this:

Tables ComparedcustomerNumbercontactLastNamecontactFirstName
`tableA`103fooCarine
`tableB`103SchmittCarine
`tableA`999foobarfoobarz

The first two rows compared `tableA` and `tableB`. Note that the difference between the two tables was found in the values for the `contactLastName` column. The last row shows `tableA` with no matching comparison with `tableB`. This means that the record in `tableA` does not exist in `tableB`.



In the continuation of this entry, a stored procedure that automates the writing of the SELECT statement is given. The names of the tables to be compared and their column names can be dynamically generated so that the SELECT statement won't have to be re-written for each different case that a comparison of two similar tables is made.

Sunday, May 29, 2011

A better way to get Primary Key columns

Permalink: http://bit.ly/1o0NdpY



When an application asks MySQL for the Primary Key of a table, there are several ways to go about doing this. A fast way would be to use these statements:

DESCRIBE `dbName`.`tableName`;
-- or
SHOW INDEX FROM `dbName`.`tableName` 
WHERE `Key_name` = 'PRIMARY';

The result set would have to be parsed in order to get the column names. This is not a recommended way to get the PK columns due to its limited usefulness as the column names cannot be returned INTO a variable.

Another method often used is this SELECT statement that uses a table JOIN:

SELECT k.`COLUMN_NAME`
FROM `information_schema`.`TABLE_CONSTRAINTS` t
JOIN `information_schema`.`KEY_COLUMN_USAGE` k
USING (`CONSTRAINT_NAME`, `TABLE_SCHEMA`, `TABLE_NAME`)
WHERE t.`CONSTRAINT_TYPE` = 'PRIMARY KEY'
 AND t.`TABLE_SCHEMA` = 'dbName'
 AND t.`TABLE_NAME` = 'tableName';

Depending on the number of tables and databases, this method may be slightly slower than the first method shown above. However, this method is more flexible as it allows a return of the output INTO a variable.

A better way to get the Primary Key columns is this SELECT statement that does not use a table JOIN:

SELECT `COLUMN_NAME`
FROM `information_schema`.`COLUMNS`
WHERE (`TABLE_SCHEMA` = 'dbName')
  AND (`TABLE_NAME` = 'tableName')
  AND (`COLUMN_KEY` = 'PRI');

Limitations: This is faster than the second method that uses a table JOIN. However, only the first method using DESCRIBE or SHOW INDEX can retrieve the PK columns of a TEMPORARY TABLE. This is because TEMPORARY TABLEs are invisible from the `information_schema` database (as of MySQL version 5.6).

Using the above formula, a function that retrieves the PK columns by passing the database and table names as parameters can be done. See the snippet below:

DELIMITER $$
DROP FUNCTION IF EXISTS `getPKColumns` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `getPKColumns`(
  dbName VARCHAR(64),
  tableName VARCHAR(64)) RETURNS text CHARSET utf8
BEGIN
  DECLARE PKColumns TEXT;
  SELECT GROUP_CONCAT(`COLUMN_NAME` SEPARATOR '`, `')
  FROM `information_schema`.`COLUMNS`
  WHERE (`TABLE_SCHEMA` = dbName)
    AND (`TABLE_NAME` = tableName)
    AND (`COLUMN_KEY` = 'PRI')
    INTO PKColumns;
  SET PKColumns = CONCAT('`', PKColumns, '`');
  RETURN PKColumns;
END $$
DELIMITER ;

Here's a sample that uses a SELECT statement to pass the database and table names in order to use the function:

SELECT getPKColumns('db', 'table');

The Primary Key column names returned are wrapped in backticks in case the name of the column turns out to be a reserved word. The function will correctly wrap the backticks even for cases where the table has multi-column Primary Keys. This implementation allows the function's return to be usable in many different scenarios such as in dynamic SQL statements. The output will be like:

`primaryKeyColumn1`, `primaryKeyColumn2`



Since Primary Keys are typically set up to be a single column or for lookup tables usually no more than two columns, the GROUP_CONCAT function can safely rely on the default value of group_concat_max_len which is at 1024 characters. There is no need to increase group_concat_max_len since a column name cannot exceed more than 64 characters which means there is very little risk of the result being truncated should the character count of the output exceed 1024.

Wednesday, May 25, 2011

Safe DML Options

Permalink: http://bit.ly/VNLTQe



The Safe DML project provides automatic creation of backups and an undo functionality for MySQL. These two abilities do not rely on the command line shell and can simply be executed by queries.

$DML() Options

Inside the stored procedure, $DML(), you can find the following options that can be set:

-- Switches logging on/off
DECLARE logging BOOLEAN DEFAULT FALSE;
-- Clears the logs per call
DECLARE clearLogs BOOLEAN DEFAULT TRUE;
-- Set to FALSE to backup only the current db in use
DECLARE backupAllDB BOOLEAN DEFAULT TRUE;
-- Disables filtering out of unsupported statements
DECLARE dmlFilter BOOLEAN DEFAULT TRUE;

  • The logging option enables/disables logs written by Safe DML into the `debug` table in the `$backup` database. Logging is useful for development work when adding new features or updating Safe DML. Disable for production use to improve performance.
  • Additionally, you can set logs to clear for each time $DML() is used. Enable this option only when needed and only during development work since keeping this BOOLEAN set to TRUE can cause the `debug` table to consume too much disk space over time especially when used in a production environment.
  • $DML() can be used for just the current database by setting this option to FALSE. This is also useful for keeping disk space consumption of backup tables low if only the current database needs an UNDO functionality.
  • Statements not supported by Safe DML has an option to be filtered out from being passed. Only INSERT, UPDATE, DELETE, and REPLACE statements are currently supported. Since SELECT and SHOW statements do not modify data, you can pass these statements into $DML() if so desired but these will not be recorded in the `history` table inside the `$backup` database. 

Other statements that are allowed to be used inside transactions will be supported in a future version. Options on the size restriction on the maximum number of `history` table records and flushing schedule of table backups are under development.

See the change logs for updates on new features.

$UNDO() Options

For the $UNDO() stored procedure, you can also set its options for logging and clearLogs. To invoke an undo command, you can either use the UNDO keyword by calling the $DML() stored procedure or call the $UNDO() stored procedure directly.

To UNDO the last statement:

CALL $DML("UNDO");
-- or
CALL $DML("UNDO LAST STATEMENT");
-- or
CALL $UNDO('');
-- or
CALL $UNDO('LAST STATEMENT');

To UNDO all changes:

CALL $DML("UNDO ALL")
-- or
CALL $UNDO('ALL');

To UNDO multiple statements or to a specific point in time, first determine the commands you want to revert by parsing the `history` table for the id of the final command to be reverted:

SELECT * FROM `$backup`.`history`;
-- take note of the dmlStringId

Then call the desired stored procedures, where n is the dmlStringId of the final command to be reverted.

CALL $DML("UNDO n");
-- or
CALL $UNDO(n);



For example, there are 3 records in the `history` table with dmlStringIds 1, 2, and 3. The following call will have the effect of undoing the 3rd command in the record, then the 2nd command, in sequence, while keeping the 1st command with dmlStringId = 1, untouched:

CALL $DML("UNDO 2");

To be continued.

Monday, May 23, 2011

Safe DML

Permalink: http://bit.ly/1vSmnGm



There is no native undo ability inside MySQL. Thus, the common methods to workaround the problem is through creating backup dumps and enabling binary logging, using transactions, and requiring the WHERE clause in Data Manipulation Language commands by using the safe updates option. These methods have drawbacks:
  1. Creating backups via mysqldump and using binary logging to revert to a point in time will have the same effect as an undo functionality. However, these are executed via the command line shell. Since these tools are not executed inside MySQL, this method may not be convenient and presents limitations on when it can be used.
  2. Transactions allow you to "undo" as long as you have not committed your data manipulation changes. Imagine if you discover data manipulation changes that you wish to undo after the last transaction commit. It is impossible.
  3. Using the safe updates option when running MySQL is meant to be a way to prevent undesirable changes by requiring the WHERE clause. Requiring the WHERE clause does not provide an undo ability.

Without a feasible undo ability, mistakes can lead to frustrations and several hours of work lost to having to manually re-enter previous data.

This small project is aimed to provide the missing undo ability, all inside MySQL and allows the user to undo previous DML changes independent of the session. To fully automate the undo functionality, the DML command needs to be contained in a stored procedure. The example below shows a data manipulation statement encapsulated in quotation marks so that single quotes can be used by the statement itself. The statement is passed as a parameter of the $DML() SP:

CALL $DML(
  "UPDATE `dbName`.`tableName`
  SET `columnName` = 'updateValue'
  WHERE `primaryKey` = 100"
);

The Safe DML stored procedure, $DML(), automates the creation of backups by copying table records into a created database named `$backup`. The undo functionality can be invoked by calling the $UNDO() stored procedure:

CALL $UNDO('');
-- or
CALL $DML('UNDO');

This reverts the last DML command. There are options to undo all previous DML commands, by UNDO('ALL') or just up to a specific number of changes. UNDO options can be found in this blog entry: http://mysql-0v34c10ck.blogspot.com/2011/05/safe-dml-options.html.

How does it work? $DML() will accept any data manipulation statement no matter how complex. It will gracefully handle commands that are in the wrong syntax or that cause errors. In such cases, no changes will take effect and the appropriate MySQL error message will be thrown to indicate what the problem with the statement is.

Before executing the DML statement, $DML() will create snapshots of all tables from all databases. There is an option to create backup copies for only the current database and this is discussed in Safe DML Options. The creation of a snapshot only takes effect when $DML() is called for the first time when the `$backup` database does not exist yet.

Note: The backup copies are considered "snapshots" since these are data copies only. The primary key, indexes, and foreign key constraints of the tables are not backed up. The snapshots are stored in the `$backup` database and if any DDL statements are executed, for example, a new column was added in a table, it is recommended that the `$backup` database be dropped so that it can be appropriately re-created. This flush process may become automated in a future version of this project.

Inside the created `$backup` database, there will be 3 tables that are used by Safe DML: `history`, `tableref`, and `debug`. The `history` table contains records of the DML commands executed using $DML(). The `tableref` table contains records about each table snapshot. The third table, `debug`, contains logs written by Safe DML and is useful for development work when adding new features or updating Safe DML.

$DML() stored procedure does not manipulate any table from any database other than its own `$backup` database. This allows the user to create TRIGGERS and CONSTRAINTS in their tables since Safe DML does not rely on those.



The project, Safe DML, is currently in development. Only INSERT, UPDATE, DELETE, and REPLACE statements are currently supported. This will expand in a future version. 

To be continued.

Monday, May 16, 2011

Dynamic Cursor Fix

Permalink: http://bit.ly/1pl6kz5



This is the workaround to create dynamic cursors in MySQL and is meant to be used together with the general purpose dynamic cursor stored procedure.

Increase the compared value of colValN in the WHILE-DO construct to the maximum number of columns your dynamic cursor will be needing. This fixes MySQL's limitation on cursors by providing the dynamic cursor with consistent column names that are independent of the underlying SELECT statement and also provide a predictable number of columns.

The final result is that the temporarily created table will be constant, while the records that the cursor will traverse will remain dynamic. The code in part 3 of the general purpose dynamic cursor together with this fix creates a working dynamic cursor that can be called multiple times using different SELECT statement parameters with no stale table data returned.

DELIMITER $$
DROP PROCEDURE IF EXISTS `dynamicCursorFix` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `dynamicCursorFix`(
 IN tableName VARCHAR(64))
BEGIN
 DECLARE columnNo TINYINT DEFAULT 0;
 DECLARE columnName TEXT;
 DECLARE noMoreColumns BOOLEAN DEFAULT FALSE;
 DECLARE columnNameCursor CURSOR FOR
 SELECT `COLUMN_NAME`
  FROM `information_schema`.`COLUMNS`
  WHERE (`TABLE_SCHEMA` = DATABASE())
   AND (`TABLE_NAME` = tableName);
 DECLARE CONTINUE HANDLER
  FOR 1329
  SET noMoreColumns = TRUE;
 OPEN columnNameCursor;
 columnNameCursorLoop: LOOP
 FETCH columnNameCursor INTO columnName;
  IF (noMoreColumns)
  THEN
   CLOSE columnNameCursor;
   LEAVE columnNameCursorLoop;
  END IF;
  SET columnNo = columnNo + 1;
  SET @changeColStmt = CONCAT(
   'ALTER TABLE `', tableName, '`
   CHANGE `', columnName, '` `column', columnNo, '` TEXT');
  PREPARE changeColStmt FROM @changeColStmt;
  EXECUTE changeColStmt;
  DEALLOCATE PREPARE changeColStmt;
 END LOOP columnNameCursorLoop;
 #Increase as needed by the dynamic cursor
 WHILE columnNo < 3
 DO
  SET columnNo = columnNo + 1;
  SET @addColStmt = CONCAT(
   'ALTER TABLE `', tableName, '`
   ADD `column', columnNo, '` TINYINT');
  PREPARE addColStmt FROM @addColStmt;
  EXECUTE addColStmt;
  DEALLOCATE PREPARE addColStmt;
 END WHILE;
END $$
DELIMITER ;

Call this fix from 'dynamicCursor' by passing the table name of its temporarily created table 'dynamic_cursor'. This same fix should be called by any nested dynamic cursor like 'dynamicCursorNested' and passing the nested dynamic cursor's temporarily created table name such as 'dynamic_cursor_nested'. See the 3rd part of the general purpose dynamic cursor for an explanation on how to nest dynamic cursors.

See part 1 for the rationale on creating a dynamic cursor. Temporary tables are not the way to go due to MySQL's limitation of hiding temporary tables from the `information_schema` database. Part 2 discusses how MySQL might fetch stale table data if different SELECT statements are passed to a dynamic cursor. This problem also occurs when the tables referred to by a fixed cursor's SELECT statement are altered by changing the table's columns.



This fix applies for the final iteration found in part 3 of the general purpose dynamic cursor blog entry and by continuing to use normal (not temporary) tables, the column names are made visible in the `information_schema` database.

Sunday, May 15, 2011

A General Purpose Dynamic Cursor - Part 3 of 3

Permalink: http://bit.ly/1jxurF0



See part 1 for the rationale behind the code and part 2 to understand limitations faced by cursors in MySQL.

The code snippet below shows an example that performs an action with the cursor data as well as a good way to debug the stored procedure. This iteration also works around MySQL's cursor problem mentioned in part 2 by calling the dynamic cursor fix stored procedure:

DELIMITER $$
DROP PROCEDURE IF EXISTS `dynamicCursor` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `dynamicCursor`(
 IN selectStmt TEXT,
 IN whatAction VARCHAR(255),
 INOUT inOutParam VARCHAR(255))
BEGIN
 #Used by whatAction ='debug'
 DECLARE cursorRead TEXT DEFAULT '';
 #Cursor-specific variables
 DECLARE colVal1, colVal2, colVal3 TEXT;
 DECLARE noMoreRows BOOLEAN DEFAULT FALSE;
 DECLARE dynamicCursor CURSOR FOR
  SELECT *
  FROM `dynamic_cursor`;
 DECLARE CONTINUE HANDLER
  FOR 1329
 SET noMoreRows = TRUE;
 DROP TABLE IF EXISTS `dynamic_cursor`;
 SET @createTableStmt = CONCAT(
  'CREATE TABLE `dynamic_cursor` AS ',
  selectStmt);
 PREPARE createTableStmt FROM @createTableStmt;
 EXECUTE createTableStmt;
 DEALLOCATE PREPARE createTableStmt;
 CALL dynamicCursorFix('dynamic_cursor');
 OPEN dynamicCursor;
 dynamicCursorLoop: LOOP
 FETCH dynamicCursor INTO colVal1, colVal2, colVal3;
  IF (noMoreRows)
  THEN
   CLOSE dynamicCursor;
   LEAVE dynamicCursorLoop;
  END IF;
  CASE whatAction
  WHEN 'debug'
  THEN
   SET cursorRead = CONCAT_WS(', ', 
     cursorRead, colVal1, colVal2, colVal3);
  ELSE
   BEGIN
   END;
  END CASE;
 END LOOP dynamicCursorLoop;
 CASE whatAction
 WHEN 'debug'
 THEN
  SELECT cursorRead 'Values read by the cursor';
 ELSE
  BEGIN
  END;
 END CASE;
 DROP TABLE `dynamic_cursor`;
END $$
DELIMITER ;

Important: The workaround for MySQL's cursor limitation that causes cursors to read stale table data is implemented in a stored procedure dynamicCursorFix(). This solution is implemented externally from the dynamic cursor so that it may be called by other nested dynamic cursors if needed. You can find this fix at this link: http://mysql-0v34c10ck.blogspot.com/2011/05/dynamic-cursor-fix.html.

You can test the function and/or debug the dynamic cursor by passing the following SELECT statement and specifying the 'debug' action:

CALL dynamicCursor(
 "SELECT T.`TABLE_NAME`
  FROM `information_schema`.`TABLES` T
  WHERE `TABLE_SCHEMA` = DATABASE()",
 'debug',
 @NULL);

The INOUT parameter is not used by the 'debug' action so we have set it to a dummy variable @NULL. For the above given SELECT statement, expect the result set to contain 'dynamic_cursor' together with any existing tables in the current database to confirm that the stored procedure is working. This also demonstrates how to implement actions that will be performed when the cursor fetches data.

You will want to replace the SELECT statement to a specific SELECT statement you are analyzing to pass to the dynamic cursor. Remember that a temporarily created table is present inside the current database and you may want to filter it out of your SELECT statement by adding a WHERE clause that excludes 'dynamic_cursor%' or as the case may be.



You may opt to nest several dynamic cursors inside the case for the 'whatAction' parameter if needed. Simply copy the stored procedure and change its name to `dynamicCursorNested` or the like, and the table name `dynamic_cursor` to another name such as `dynamic_cursor_nested`. The nested cursors will need to depend on the same fix mentioned above.

Return to part 1 or part 2 or see another workaround on a missing MySQL feature, disabling/enabling triggers on demand.

Tuesday, May 10, 2011

A General Purpose Dynamic Cursor - Part 2 of 3

Permalink: http://bit.ly/RcRieg



Refer to part 1 for the rationale behind the code or you can skip to part 3 for a working example as well as how you can debug the stored procedure.

Important: The SP will create a table named `dynamic_cursor`. Make sure this table does not exist in the database where you will be storing the procedure. Here's the 1st iteration of a general purpose dynamic cursor:

DELIMITER $$
DROP PROCEDURE IF EXISTS `dynamicCursor` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `dynamicCursor`(
 IN selectStmt TEXT,
 IN whatAction VARCHAR(255),
 INOUT inOutParam VARCHAR(255))
BEGIN
 DECLARE colVal1, colVal2, colVal3 TEXT;
 DECLARE colCount TINYINT;
 DECLARE noMoreRows BOOLEAN DEFAULT FALSE;
 DECLARE dynamicCursor CURSOR FOR
  SELECT *
  FROM `dynamic_cursor`;
 DECLARE CONTINUE HANDLER
  FOR 1329
  SET noMoreRows = TRUE;
 DROP TABLE IF EXISTS `dynamic_cursor`;
 SET @createTableStmt = CONCAT(
  'CREATE TABLE `dynamic_cursor` AS ',
  selectStmt);
 PREPARE createTableStmt FROM @createTableStmt;
 EXECUTE createTableStmt;
 DEALLOCATE PREPARE createTableStmt;
 SELECT COUNT(*)
 FROM `information_schema`.`COLUMNS`
 WHERE (`TABLE_SCHEMA` = DATABASE())
  AND (`TABLE_NAME` = 'dynamic_cursor')
 INTO colCount;
 OPEN dynamicCursor;
 dynamicCursorLoop: LOOP
  CASE colCount
   WHEN 1 THEN
    FETCH dynamicCursor INTO colVal1;
   WHEN 2 THEN
    FETCH dynamicCursor INTO colVal1, colVal2;
   WHEN 3 THEN
    FETCH dynamicCursor INTO colVal1, colVal2, colVal3;
  END CASE;
  IF (noMoreRows)
  THEN
   CLOSE dynamicCursor;
   LEAVE dynamicCursorLoop;
  END IF;
  CASE whatAction
  WHEN 'doSomething'
  THEN
   BEGIN
   END;
  ELSE
   BEGIN
   END;
  END CASE;
 END LOOP dynamicCursorLoop;
 DROP TABLE `dynamic_cursor`;
END $$
DELIMITER ;

Regarding MySQL's limitations on cursors: As of version 5.5, the above stored procedure will only reliably work on the first call in the current session. Subsequent calls may fail if the SELECT statement passed is different from the first one. This is because the cursor is traversing the column names from the first call and is not aware that the columns it's supposed to read are now different. The cursor reads its own copy of the table defined from the first call, which results to stale data when the columns of the table are subsequently altered. Part 3 implements the workaround to this inherent MySQL cursor limitation. The bug is due to MySQL's optimization on the SELECT statement where it wrongly assumes that the columns are still the same.

The code snippet can accept a SELECT statement parameter that will produce a result set from 1-3 columns. You may want to increase the maximum to around 12 columns or even more. Simply increase the number of ColValN variables and create additional cases for colCount and write their FETCH statements.

The intention here is that the statements to execute using the data fetched from the cursor can be written inside the cases for the whatAction string parameter. You may want to create additional CASE constructs before the cursor's OPEN and/or after the LOOP is ended depending on your specific needs.

Additionally, you can add additional INOUT parameters as needed like inOutParam1, inOutParam2, ... inOutParamN. However, keep in mind that the procedure relies on MySQL's automatic data type conversion. This means that in order to keep the procedure able to accept a broad range of inputs and maintain its general purpose nature, you need to keep using generic data types such as VARCHAR or TEXT. The data will be converted at run-time if it isn't a VARCHAR or TEXT variable.



Part 3 shows an example that performs an action with the cursor data as well as a good way to debug the stored procedure.

Return to part 1 for the rationale behind the code.



Updated: 10/22/2012

Monday, May 9, 2011

A General Purpose Dynamic Cursor - Part 1 of 3

Permalink: http://bit.ly/TBKQL3



Skip to part 2 for the code snippet or to part 3 for a working example and how to debug the stored procedure.

Also another workaround on a missing MySQL functionality, enabling/disabling triggers, can be found here.

As of version 5.5, MySQL still does not have the native ability to execute a dynamic cursor. This can be worked around but the resulting stored procedure will have a few limitations.

This stored procedure is a general purpose dynamic cursor. It is general purpose, in the sense that, it qualifies as a template that you can modify to suite a specific purpose and/or expand to cover a broader range. It can be called by passing the following parameters:

CALL dynamicCursor(
 "SELECT * FROM `dbName`.`tableName`", 
 'whatAction', 
 @inOutParam);

The first parameter is the SELECT statement that the cursor is declared to use. Quotation marks can be used to encapsulate the string so that single quotes can be used by the statement itself if needed.

The second parameter is a string that specifies what action to do with the cursor traversal data-- the statements to execute are written inside the stored procedure as cases of the second parameter. What you do with the data is up to you.

The third parameter is an INOUT variable which will allow you to get a return or to pass additional parameters as required by your particular case. You can expand the number of INOUT variables to the needed amount by simply adding more in the stored procedure. If a return or passing additional variables are not needed, simply set the INOUT variable to a dummy variable such as @NULL.

The dynamic cursor logic can be achieved through a dynamic MySQL statement that creates a copy of the data from the SELECT statement of the cursor declaration into a created table. The cursor, instead of using the SELECT statement, is declared to SELECT the table that was created. This causes the table data to become the part which is dynamic and not the cursor itself.

The creation of this table then stems to two methods: create the table as a temporary table or to create the table as a normal (not temporary) table. Both methods have limitations:
  • If a normal table is created, MySQL will see it as a DML statement. An implicit commit will be triggered if this stored procedure is used inside transactions.
  • If a temporary table is created, it will be invisible for SHOW TABLES, SHOW OPEN TABLES, SHOW TABLE STATUS, and more importantly as stated in a previous blog entry, "it is not possible to get a column list into a variable if it comes from a temporary table. This is due to the fact that temporary tables are invisible within the `information_schema` database." This means that the cursor's FETCH statement will need to have a fixed number of variables.
  • Both methods have a performance impact on the system since we are creating a new table and copying data from a SELECT statement each time the stored procedure is called.

The advantages between the two are:
  • By creating a normal table, we can determine the number of columns from the `information_schema` database. The FETCH statement can be nested inside a condition construct and fetch the cursor traversal into the appropriate number of variables up to a reasonable maximum quantity.
  • By creating a temporary table, the stored procedure can be used inside transactions without causing an implicit commit.

Due to the fact that using the method of creating a temporary table will work only for SELECT statements with a fixed or predetermined number of columns, it defeats the purpose of being dynamic. A temporary table is not the way to go unless you absolutely need a dynamic cursor for a transaction.

The normal table method is what we will use, however, there is the issue of the maximum number of variables that the cursor's FETCH statement should have for practical coding purposes. This is because FETCH statements are not supported by prepared statements which can dynamically write the specific number of variables that need to be created and fetch data into. For example, an InnoDB table can permit up to a maximum of 1,000 columns. To simplify, this equates to 1,000 cases for each of the possible number of columns. It is not practical to write all the variables for each of the cases and the 1,000 cases themselves. It would be better to write a non-dynamic cursor construct for SELECT statements which produce result sets with a large number of columns.



Therefore, a decision as to when it is practical to use a dynamic cursor and when to use a fixed cursor needs to be made. A dynamic cursor should be used only for SELECT statements that will produce a small result set. The result set should also have a practical number of columns. Consider using a fixed cursor if these two conditions are not met.

The second part of this blog entry shows the code for a general purpose dynamic cursor stored procedure and the third part shows a working example and how to debug the cursor.



Updated: 10/20/2012

Saturday, May 7, 2011

True Random Database and Table Name Generator - Part 2 of 2

Permalink: http://bit.ly/QuBLVB



Read part 1 for the rationale behind the code.

As discussed in the first part of this blog entry, we'll be utilizing a statement that uses base 36 to generate the random name. We will be adding the $ and _ characters using the ELT function. Here is a true random database and table name generator:

DELIMITER $$
DROP FUNCTION IF EXISTS `randomNameGenerator` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `randomNameGenerator`(
 ) RETURNS varchar(64) CHARSET utf8
BEGIN
 DECLARE numberOfChars, charDiceRoll TINYINT(2);
 DECLARE charCount TINYINT DEFAULT 0;
 DECLARE randomChar CHAR(1);
 DECLARE randomName VARCHAR(64) DEFAULT '';
 SET numberOfChars = randomRangePicker(1, 64);
 REPEAT
  SET charCount = charCount + 1;
  SET charDiceRoll = randomRangePicker(1, 38);
  IF (charDiceRoll <= 2)
  THEN
   SET randomChar = ELT(charDiceRoll, '$', '_');
  ELSE
   SET charDiceRoll = charDiceRoll - 3;
   SET randomChar = LOWER(
    CONV(
     charDiceRoll,
     10, 36)
   );
  END IF;
  SET randomName = CONCAT(randomName, randomChar);
 UNTIL (charCount = numberOfChars)
 END REPEAT;
 RETURN randomName;
END $$
DELIMITER ;

Total of 38 characters: 36 case-insensitive alphanumeric characters a-z, 0-9, including $ and _. For code portability between Windows and Linux, the random name generated is all in lowercase characters. Each character has a 1/38 chance to be generated.

Note that we utilized the random number range picker function mentioned in a previous blog entry. Here's an explanation of the code:
  • The length of the name is random. Anywhere from 1 to 64 characters long can be generated. The maximum number of characters is based on the value of numberOfChars.
  • A character will be generated based on the result of a dice roll from 1 to 38. The 1 to 38 range represents each of the 38 characters.
  • The name will be created random character by random character by the CONCAT function until the length is equal to numberOfChars.

Here's a version of the above function that doesn't depend on the random number range picker function:

DELIMITER $$
DROP FUNCTION IF EXISTS `randomNameGenerator` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `randomNameGenerator`() RETURNS varchar(64) CHARSET utf8
BEGIN
 DECLARE numberOfChars, charDiceRoll TINYINT(2);
 DECLARE charCount TINYINT DEFAULT 0;
 DECLARE randomChar CHAR(1);
 DECLARE randomName VARCHAR(64) DEFAULT '';
 SET numberOfChars = 1 + FLOOR(RAND() * 64);
 REPEAT
  SET charCount = charCount + 1;
  SET charDiceRoll = 1 + FLOOR(RAND() * 38);
  IF (charDiceRoll <= 2)
  THEN
   SET randomChar = ELT(charDiceRoll, '$', '_');
  ELSE
   SET charDiceRoll = charDiceRoll - 3;
   SET randomChar = LOWER(
     CONV(
       charDiceRoll,
       10, 36)
   );
  END IF;
  SET randomName = CONCAT(randomName, randomChar);
 UNTIL (charCount = numberOfChars)
 END REPEAT;
 RETURN randomName;
END $$
DELIMITER ;

You can check out the function's output by:

SELECT randomNameGenerator();



This function can be modified to become a random string generator of fixed length or even to become a random password generator. Additional characters can be added to the ELT function, and the statement that uses base 36 can be converted to randomly generate uppercase and lowercase characters. If numbers are not desired to be in the random name, simply modify the range for the base 36 statement so that it will not generate 0-9.

Back to part 1.



Updated: 10/19/2012

Thursday, May 5, 2011

True Random Database and Table Name Generator - Part 1 of 2

Permalink: http://bit.ly/UZY7xT



Skip to part 2 to go straight to the code snippet.

See also a similar generator: Truly Random and Complex Password Generator

Database names and table names have certain restrictions in MySQL:
  • The maximum name length for both are 64 characters
  • Allowed characters are a-z, A-Z, 0-9, $, and _

It is possible to create a table or database with a dot (.) in its name, however this is not recommended as it will cause some of MySQL's built-in functions to not work as expected.

Using uppercase characters in names are also not recommended. The case sensitivity of the name is dependent on the underlying operating system where the MySQL server is installed. For example, in Linux, the name "dbName" is different from "dbname" but are both the same in Windows. For consistency and to allow the database to be portable between the two, as well as to future-proof your database in case at some point you suddenly need to port from Linux to Windows, using all lower case names is recommended.

A database or table random name generator, to be truly random, should satisfy the following:
  • The length of the name should be random, between 1 to 64 characters
  • The character set for the generator should include all the allowed characters, except uppercase characters since these are not recommended as explained above
  • Each of the allowed characters should all have an equal chance of being generated

Using the MD5() function will not satisfy the above conditions and it won't be a true random string generator. For a simple example:
SELECT MD5(RAND());

The above example will always be 32 characters and its first character will never be 0. Therefore, this violates the condition that every character should all have an equal chance of being generated as the character 0 has no chance of being generated as a first character. However, the main defect here is that the function MD5() uses only A-F and 0-9, a range of only 16 characters.

Another approach is to use the ELT function. This function can be used to generate a random character which can then be concatenated to form a random string. A simple example to create a random character:

SELECT ELT(1 + FLOOR(RAND() * 38), 
  '$', '_', 'a', 'b', 'c', 'd', 'e',
  'f', 'g', 'h', 'i', 'j', 'k', 'l',
  'm', 'n', 'o', 'p', 'q', 'r', 's',
  't', 'u', 'v', 'w', 'x', 'y', 'z', 
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9
);

This statement can be used in a user defined function to create a random string with a random length ranging from 1 to 64. However, this implementation is lengthy and there is a cleaner way to do the same by using the following formula:

SELECT LOWER(
  CONV(
    FLOOR(
      RAND() * 36
    ),
  10, 36)
);

This is a shorter way to generate a-z, 0-9. The characters $ and _ can be added as will be shown in the 2nd part of this blog entry.

Here's an explanation of the formula:
  • Implicit conversion between numbers and strings automatically occurs during expression evaluation at run-time.
  • The CONV statement converts the random integer from a base 10 number to a base 36 number. Using base 36 gives the most compact case-insensitive alphanumeric numeral system. This converts the number to range from 0-9 and A-Z.



The second part of this blog entry will show an example of a user defined function that uses the statement which uses base 36 as a true random database and table name generator.



Updated: 10/15/2012

Sunday, May 1, 2011

Random Number Range Picker

Permalink: http://bit.ly/R5rizQ



A function that can randomly pick an integer from a given range can be used by applications that need a number dice roll and random generation of strings, numbers, and even random generation of complex passwords. Let's say you would like to randomly pick a number from 1 to 10. This gives us 10 choices to randomly pick, not 9 as can be mistakenly thought of since 10 minus 1 equals 9. The same is with the range of choices from 0 to 10. This gives 11 possibilities, not 10. To illustrate:
  • The range of choices should include the value of the lower end of the range
  • It should also include the value of the higher end of the range

To generate a random number from a given lower value and a higher value, use the formula below, where minRange is the lower end of the range and maxRange is the higher end of the range:

minRange + FLOOR(RAND() * (maxRange - minRange + 1))

The formula gives minRange a random integer to add with, where the return from the  RAND() function is passed to the math flooring function. The return from RAND() is multiplied with the difference between minRange and maxRange to set the decimal range that can be randomly created. The + 1 added to the difference between the two values is needed so that the range includes the maxRange number itself. This addition of an extra 1 can easily be missed which results in erroneous results especially when the lower value for the range is a negative number.

To illustrate the common mistakes on how a formula for a simple random range picker often leads to a wrong analysis, let us first fill the values with minRange = 1 and maxRange = 10. This gives us:

SELECT 1 + FLOOR(RAND() * 10);

The common mistake here is that some devs will think a range from 1 to 10 should result in a formula that would include "RAND() * 9" when they use a range formula of 10 - 1 = 9.

And filling the values with minRange = -10 and maxRange = 10 gives us:

SELECT -10 + FLOOR(RAND() * 21);

The common mistake happens when a quick mental analysis of a range formula is made, some devs think that the solution for the second formula will result to "RAND() * 20".

The given formula is used in the function below. The integers for minRange and maxRange can be positive or negative, including zero. The range of numbers returned will correctly include the values of minRange and maxRange as was explained above.

DELIMITER $$
DROP FUNCTION IF EXISTS `randomRangePicker` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `randomRangePicker`(
 minRange INT,
 maxRange INT) RETURNS int(11)
BEGIN
 DECLARE pick INT;
 SET pick = minRange + FLOOR(RAND() * (maxRange - minRange + 1));
 RETURN pick;
END $$
DELIMITER ;



To use the function, simply pass the two needed parameters like:

SELECT randomRangePicker(-1, 1);

This will return either -1, 0, or 1.

See the following user defined functions for samples where randomRangePicker() can be used:



Updated: 10/14/2012

Thursday, April 28, 2011

A function to get all the columns of any table from any database

Permalink: http://bit.ly/VP174V



Certain complex MySQL SELECT and subquery statements will not allow the use of the * wildcard and you will need to fill in the entire column list of a given table. Consider the following simplified example, a SELECT statement that contains 3 columns. The asterisk here refers to all columns, which is actually the 3 columns listed in the GROUP BY clause:

SELECT
 IF(
  EXISTS(
  SELECT *
  FROM (
   SELECT *
   FROM `dbName_A`.`tableName_A`
   UNION ALL
   SELECT *
   FROM `dbName_B`.`tableName_B`
   AS `compareTables`
  GROUP BY `column_1`, `column_2`, `column_3`
  HAVING COUNT(*) = 1),
 1, 0);

Imagine if it were dozens of columns instead of just 3. You can't simply put in the * wildcard like 'GROUP BY * '. The above example will not work without the GROUP BY clause and you'll need to type in all the column names. A solution is to create a function that returns a list of the column names for you. Here it is:

DELIMITER $$
DROP FUNCTION IF EXISTS `getColumnList` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `getColumnList`(
  dbName VARCHAR(64),
  tableName VARCHAR(64)) RETURNS text CHARSET utf8
BEGIN
  DECLARE columnList TEXT;
  SET group_concat_max_len = 65533;
  SELECT GROUP_CONCAT(`COLUMN_NAME` SEPARATOR '`,`')
  FROM `information_schema`.`COLUMNS`
  WHERE (`TABLE_SCHEMA` = dbName)
    AND (`TABLE_NAME` = tableName)
  INTO columnList;
  SET columnList = CONCAT('`', columnList, '`');
  RETURN columnList;
END $$
DELIMITER ;

This function has 1 limitation: as of version 5.5 of MySQL, it is not possible to get a column list into a variable if it comes from a temporary table. This is due to the fact that temporary tables are invisible within the `information_schema` database. This will hopefully be resolved in a future version of MySQL.

To use the function, simply pass the database name and table name parameters to the function such as:

SET @columnList = getColumnList('dbName_A', 'tableName_A');

A list of the column names in the form similar to the above example will be returned:

`column_1`, `column_2`, `column_3` 

The first example can now be re-written in a simpler way. Now, it does not matter how many columns there are, the function will take care of it all:

SELECT
 IF(
  EXISTS(
  SELECT *
  FROM (
   SELECT *
   FROM `dbName_A`.`tableName_A`
   UNION ALL
   SELECT *
   FROM `dbName_B`.`tableName_B`
   AS `compareTables`
  GROUP BY @columnList
  HAVING COUNT(*) = 1),
 1, 0);



This function is especially useful for dynamic MySQL statements and is robust enough to handle a wide variety of usage types. Here's a run down on the function's underlying code:
  • TEXT variable type is used for the string return since it is the most practical character length among the variable choices. Smaller string types are too small at 255 maximum character length, while the next larger option, MEDIUMTEXT, is at 16,777,215 characters is just too big and would be a resource hog. TEXT data type can hold a string up to a maximum length of 65,535 characters.
  • group_concat_max_len has a default value of 1,024. If the column list concatenated is longer than 1,024 characters, it will be truncated. The max length of the GROUP_CONCAT clause should match the character length of the TEXT data type in order to maximize its usefulness. The value is set at 65,533 since 2 backticks (explained below) surrounding the string will add up to 65,535 characters. This change is not permanent and only remains valid for the duration of the active session. You may increase the maximum to a much larger value, but 65 thousand characters should pretty much handle most cases.
  • Column names are surrounded in backticks in case they contain reserved words. Not enclosing column names in backticks can be problematic since the function may be used in a variety of ways and may cause unpredictable results if not done so.



Updated: 10/11/2012