Pages

Thursday, May 5, 2011

True Random Database and Table Name Generator - Part 1 of 2

Permalink: http://bit.ly/UZY7xT



Skip to part 2 to go straight to the code snippet.

See also a similar generator: Truly Random and Complex Password Generator

Database names and table names have certain restrictions in MySQL:
  • The maximum name length for both are 64 characters
  • Allowed characters are a-z, A-Z, 0-9, $, and _

It is possible to create a table or database with a dot (.) in its name, however this is not recommended as it will cause some of MySQL's built-in functions to not work as expected.

Using uppercase characters in names are also not recommended. The case sensitivity of the name is dependent on the underlying operating system where the MySQL server is installed. For example, in Linux, the name "dbName" is different from "dbname" but are both the same in Windows. For consistency and to allow the database to be portable between the two, as well as to future-proof your database in case at some point you suddenly need to port from Linux to Windows, using all lower case names is recommended.

A database or table random name generator, to be truly random, should satisfy the following:
  • The length of the name should be random, between 1 to 64 characters
  • The character set for the generator should include all the allowed characters, except uppercase characters since these are not recommended as explained above
  • Each of the allowed characters should all have an equal chance of being generated

Using the MD5() function will not satisfy the above conditions and it won't be a true random string generator. For a simple example:
SELECT MD5(RAND());

The above example will always be 32 characters and its first character will never be 0. Therefore, this violates the condition that every character should all have an equal chance of being generated as the character 0 has no chance of being generated as a first character. However, the main defect here is that the function MD5() uses only A-F and 0-9, a range of only 16 characters.

Another approach is to use the ELT function. This function can be used to generate a random character which can then be concatenated to form a random string. A simple example to create a random character:

SELECT ELT(1 + FLOOR(RAND() * 38), 
  '$', '_', 'a', 'b', 'c', 'd', 'e',
  'f', 'g', 'h', 'i', 'j', 'k', 'l',
  'm', 'n', 'o', 'p', 'q', 'r', 's',
  't', 'u', 'v', 'w', 'x', 'y', 'z', 
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9
);

This statement can be used in a user defined function to create a random string with a random length ranging from 1 to 64. However, this implementation is lengthy and there is a cleaner way to do the same by using the following formula:

SELECT LOWER(
  CONV(
    FLOOR(
      RAND() * 36
    ),
  10, 36)
);

This is a shorter way to generate a-z, 0-9. The characters $ and _ can be added as will be shown in the 2nd part of this blog entry.

Here's an explanation of the formula:
  • Implicit conversion between numbers and strings automatically occurs during expression evaluation at run-time.
  • The CONV statement converts the random integer from a base 10 number to a base 36 number. Using base 36 gives the most compact case-insensitive alphanumeric numeral system. This converts the number to range from 0-9 and A-Z.



The second part of this blog entry will show an example of a user defined function that uses the statement which uses base 36 as a true random database and table name generator.



Updated: 10/15/2012

No comments: