Thursday, July 7, 2011

Relational Database Quirks with Auto Increment and Identity Columns

Recently, I was given the task of implementing a software solution to allow various databases to be migrated between one another. Under most circumstances, I would have opted for a third party solution to handle this issue, as numerous vendors exist. However, the project I was on did not allow for this option and I was forced to roll my own.

This first seemed rather simple.. read the data, insert the data, add the constraints. One of the first issues that I ran into was in how auto incrementing is handled between systems.

  • Oracle: Key generation is handled via a sequence and trigger method (others exist.. I stuck with the trigger before insert option)

  • MySQL: Supports AUTO_INCREMENT as a column property

  • SQL Server: Uses the Identity property

  • Postgres: Similar to oracle.. uses a sequence.

  • Derby: Supports auto increment better than most others. (Shocking.. didn't see that one coming)

    While moving between these systems, I noticed a few quirks in how things are handled.. Most seem to be post insert operations. Thus, the auto increment constraint is added after the data had been migrated into the new database. But, in Microsoft had dropped the ball in this category and requires that you use a feature called IDENTITY_INSERT, therfore you can define your column as an identity and before you being your inserts, you call: SET IDENTITY_INSERT ON, then turn it off once your inserts have completed. This allows you to force the value of an identity column - which preserves key integrity when migrating between database types.

    Another quirk that I ran into that was of note was in MySQL. MySQL seems to have an issue with an auto_incrementing value when it is zero. Therefore, if you have a column that is not defined as auto incrementing and add a key with the value of zero. Then, modify the column to make it auto incrementing. It will try to reorder the zero value into a one. If you already have a key value of one, it will create a duplicate key exception.

    I honestly think this is a bug in MySQL's handling of keys and auto_increment as it should never modify an existing value due to adding a constraint. I tried several options to resolve the issue so that any clients whom had primary key values of zero would not have their system hosed, including modifying the auto_increment value on the table before adding it to the column. But, the one solution that finally did work was to not key the column until you were ready to add the auto_increment. So, when the table is created, you omit the primary key flag, insert the data, then alter the column adding both the primary key and auto increment flags. This seemed to cause MySQL to behave correctly, it did not modify the existing values (Preserving the zero pk value) and allow future inserts to correctly increment off of the highest column value.