Sunday 21 February 2010

SQL Server - MySQL 4 integration - openquery does not work for tables with datetime values of 0000-00-00

I have been trying copy the contents of an old MySQL 4.0.12 database to SQL Server 2008 in order to archive its contents.

To archive this, I followed these steps:
  1. Create an database within SQL Server 2008 to hold the data called ARCHIVE_DB.
  2. Download and install ODBC driver for MySQL (available here).  Note that due to the version of the MySQL database I was unable to use the newer driver (version 5.1.xx), since this requires version 4.1 of MySQL.
  3. Set up a linked server in SQL Server 2008 using this driver to point to the data source.  I called this data source MYSQL_ls.
  4. I can now select data from this data source using the "openquery" TSQL syntax, eg:


    SELECT *
    FROM openquery([NADT-SQLHOST], 'select * from webdb.episode')
    

  5. I can extend this query using the TSQL "INTO" clause, which will build the SQL table from the source data, eg:


    SELECT * INTO ARCHIVE_DB
    FROM openquery([MYSQL_ls], 'select * from mydatabase.mytable')
    

  6. In theory the above will copy all the data from the MySQL database table called (in the MySQL database called ) into the SQL Server table (SQL Server database ).  I do not have to create the table schema, this will be created automatically by the INTO clause.  
  7. I can now create an SSIS package that loops through the tables in the MySQL database and runs the SQL statement show in number 5 against each table.  This should in theory copy each table across.
This technique works very well until you come to tables that include datetime values.

MySQL allows datetime values in the format of "0000-00-00", which is the equivalent of a NULL date in SQL server.  Unfortunately the "Select * Into From Openquery..." trick above does not deal with this situation, and you will recieve a version of the following error:

Msg 7342, Level 16, State 1, Line 4
An unexpected NULL value was returned for column "[MSDASQL].enddate" from OLE DB provider "MSDASQL" for linked server "MYSQL_ls". This column cannot be NULL.

This error is caused by using "openquery".  Fortunately there is a very easy work around for this:  where tables have datetimes in "0000-00-00" format, you must use an SSIS data flow task to copy the information, rather than the "openquery" described above:

  1. Set the data pump source to the MySQL table pointed to in the Linked server and click on "preview" and you will notice that it can return data.  This is not possible by running a query in the SQL Management studio, so clearly we are getting somewhere.  
  2. Now set the data pump target to "" and click on "Mappings" and you will notice that each column in the source table will be mapped to an identically named table in the destination table.
  3. Save the data pump and run.

If you run this data flow, the data will be copied across without reporing the datetime mismatch error.

Wednesday 17 February 2010

Use a variable as a table name in an SSIS Execute SQL Task

I spent quite a while today trying to build an SSIS package that iterated through a dataset that contained a set of table names usign a for-each container, then run a piece of SQL against each table.

Setting up the for-each loop was relatively simple, but I couldn't get the Execute SQL task to run a simple query of the type:
Select count(*) from @table
Where @table is the name of the parameter generated from the for each loop.

It appears that this is not possible in SSIS.  Eventually I found a solution:  use another variable to store the SQL statement and set up the SQL task to use this variable as its input source.

Assuming that the input parameter is already set up and called [User::TableName],  then the steps to achive this are:

  1. Set up a new variable (say called "SQLStatement"), make this a string type.
  2. In the properties box of the paremeter, Set EvaluateAsExpression to True
  3. In the properties box of the paremeter, click on the elipses (...) of the "Expression" property to build an expression.  Type in the SQL you need and drag the variable you want to use from the "variables" tree.
  4. Escape you SQL in this expression wwithin quotation marks.  e.g. the count statement above would look like:
  5. "Select count(*) from " + @[User:TableName] 
  6.  You can use the "Evaluate expression" button to make sure that this evaluates as real SQL.  If you are using a "for-each" container then you may want to give your [User::TableName] variable a sensible default value in order to see the results of the evaluated expression in a format you could use for testing.
  7. Drag an "Execute SQL Task" object into your control flow and open its properties page
  8. Set the "SQL Source Type" property to "Variable" to tell it that the SQL will come from a variable, rather from a typed in SQL statement.
  9. Set the "SourceVariable" property to the name of the variable you created in step 1 (in this example it would be "SQLStatement".
That's it!  

    Monday 8 February 2010

    Blogger.com code syntax highlighting, or always remember to save your code before you start playing around with it!

    I've decided to re-vamp the look of this blog to give it more width and make the code fragments easier to read. I hope you like the new look!

    When doing this I made one of the oldest mistakes in the programmer's book: I didn't save my work!

    This meant I had to add all the site specific code to render the code in syntax-highlighted format, and add Google analytics tracking code back into the HTML, which whilst not the worst job in the world, did take me some time to remember what I had done. REMEMBER TO SAVE YOUR WORK!

    This does give me the opportunity to thank the author of the fabulous syntax highlighter for Blogger.com that I have used, which can be found here: http://alexgorbatchev.com/wiki/SyntaxHighlighter. Thank you very much Alex, it's a superb bit of code!

    The code I used to enable the nifty highlighter for C# should be pasted into the HEAD section of your blog:


    <link href='http://alexgorbatchev.com/pub/sh/current/styles/shCore.css' rel='stylesheet' type='text/css'/>
    <link href='http://alexgorbatchev.com/pub/sh/current/styles/shThemeDefault.css' rel='stylesheet' type='text/css'/>
    <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shCore.js' type='text/javascript'></script>
    <!-- add brushes here -->
    <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJScript.js' type='text/javascript'></script>
    
    <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushBash.js' type='text/javascript'></script>
    <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushSql.js' type='text/javascript'></script>
    <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushXml.js' type='text/javascript'></script>
    <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCpp.js' type='text/javascript'></script>
    <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCSharp.js' type='text/javascript'></script>
    <script type='text/javascript'>
      SyntaxHighlighter.config.bloggerMode = true;
      SyntaxHighlighter.all();
    </script>
    
    

    To use it, make sure that the code you want to display has been HTML escaped (which can be done using the utility available here: http://www.string-functions.com/htmlencode.aspx.

    Now, to use this, just paste your escaped code into your blog and wrap it in the following HTML tags:
    <pre class="brush:csharp">
       ...your code here...
    </pre>
    

    Note that this will work for C Sharp code. For other types of code highlighting, simply change the brush type in the <pre> tag, and make sure the appropriate brush is included in the list of brish types defined in the code above in the "!-- add brushes here --" section.