Sunday 21 February 2010

SQL Server - MySQL 4 integration - openquery does not work for tables with datetime values of 0000-00-00

I have been trying copy the contents of an old MySQL 4.0.12 database to SQL Server 2008 in order to archive its contents.

To archive this, I followed these steps:
  1. Create an database within SQL Server 2008 to hold the data called ARCHIVE_DB.
  2. Download and install ODBC driver for MySQL (available here).  Note that due to the version of the MySQL database I was unable to use the newer driver (version 5.1.xx), since this requires version 4.1 of MySQL.
  3. Set up a linked server in SQL Server 2008 using this driver to point to the data source.  I called this data source MYSQL_ls.
  4. I can now select data from this data source using the "openquery" TSQL syntax, eg:


    SELECT *
    FROM openquery([NADT-SQLHOST], 'select * from webdb.episode')
    

  5. I can extend this query using the TSQL "INTO" clause, which will build the SQL table from the source data, eg:


    SELECT * INTO ARCHIVE_DB
    FROM openquery([MYSQL_ls], 'select * from mydatabase.mytable')
    

  6. In theory the above will copy all the data from the MySQL database table called (in the MySQL database called ) into the SQL Server table (SQL Server database ).  I do not have to create the table schema, this will be created automatically by the INTO clause.  
  7. I can now create an SSIS package that loops through the tables in the MySQL database and runs the SQL statement show in number 5 against each table.  This should in theory copy each table across.
This technique works very well until you come to tables that include datetime values.

MySQL allows datetime values in the format of "0000-00-00", which is the equivalent of a NULL date in SQL server.  Unfortunately the "Select * Into From Openquery..." trick above does not deal with this situation, and you will recieve a version of the following error:

Msg 7342, Level 16, State 1, Line 4
An unexpected NULL value was returned for column "[MSDASQL].enddate" from OLE DB provider "MSDASQL" for linked server "MYSQL_ls". This column cannot be NULL.

This error is caused by using "openquery".  Fortunately there is a very easy work around for this:  where tables have datetimes in "0000-00-00" format, you must use an SSIS data flow task to copy the information, rather than the "openquery" described above:

  1. Set the data pump source to the MySQL table pointed to in the Linked server and click on "preview" and you will notice that it can return data.  This is not possible by running a query in the SQL Management studio, so clearly we are getting somewhere.  
  2. Now set the data pump target to "" and click on "Mappings" and you will notice that each column in the source table will be mapped to an identically named table in the destination table.
  3. Save the data pump and run.

If you run this data flow, the data will be copied across without reporing the datetime mismatch error.

3 comments:

  1. Could you please eloborate on this.

    Set the data pump source to the MySQL table pointed to in the Linked server and click on "preview" and you will notice that it can return data. This is not possible by running a query in the SQL Management studio, so clearly we are getting somewhere.

    ReplyDelete
  2. Hi Rajesh, set up a new data flow task in SSIS, and set the source of the data to an ADO connection to your MySQL database. Set the target to your SQL Server database. The reason I haven't elaborated on this is that this is a "standard" way of copying database from one location to another in SSIS, whereas the main technique descibed above is code based, and so can be used in a loop to copy all the data from one database to another.

    If this doesn't make any sense then please let me know and I will write a new blog about the data flow task with screenshots.

    ReplyDelete
  3. Keep posting such kind of information on your blog. I bookmarked it for continuous visit. Thanks once again.

    ReplyDelete