Friday, July 07, 2006

SQLXMLBulkLoad Rocks!

For reasons that will become obvious in September (sly grin), I've been working a lot lately on uploading data to SQL Server 2005. There are a variety of mechanisms you can use to do this, but all of them have their shortcomings under certain conditions. For example, using a series of INSERT statements is slow while bulk insert doesn't work with memo fields. While looking for something else in SQL Server Books Online, I happened across a topic on bulk XML loading. After playing with it for a while, it seemed pretty easy to do, so I did some testing on a relatively large file (365,741 records) containing memo fields (which means I couldn't use bulk insert). Using other mechanisms, it took more than two hours to load the data into a SQL Server table. Using bulk XML load, it took 11 minutes. That's a 90% improvement in speed. I love making things faster!

Using bulk XML load is easy: you instantiate SQLXMLBulkLoad.SQLXMLBulkload.4.0, set its ConnectionString property to an OLE DB connection string, set some other properties appropriately (for example, KeepNulls to .T. to insert null rather than default values for missing column values), and call Execute, passing it the name of a schema file and the name of the XML data file. Execute throws an error if there's something wrong with either file, and import errors are logged to a file whose name is in the ErrorLogFile property.

There's one slight wrinkle in working with VFP data: while the CURSORTOXML() function can create a schema file, it needs to be tweaked slightly to work with bulk XML load. Also, DateTime fields have to be flagged with the sql:datatype="dateTime" attribute (the latter was documented, the former took some trial and error to resolve).

Here's a generic program that will load the data from an open VFP cursor into an existing SQL Server table. You can download it from http://www.stonefield.com/articles/other/bulkxmlload.zip.

*==============================================================================
* Function: BulkXMLLoad
* Purpose: Performs a SQL Server bulk XML load
* Author: Doug Hennig
* Last revision: 07/06/2006
* Parameters: tcAlias - the alias of the cursor to export
* tcTable - the name of the table to import into
* tcDatabase - the database the table belongs to
* tcServer - the SQL Server name
* tcUserName - the user name for the connection (optional:
* if it isn't specified, Windows Integrated Security is
* used)
* tcPassword - the password for the connection (optional:
* if it isn't specified, Windows Integrated Security is
* used)
* Returns: an empty string if the bulk load succeeded or the text of
* an error message if it failed
* Environment in: the alias specified in tcAlias must be open
* the specified table and database must exist
* the specified server must be accessible
* there must be enough disk space for the XML files
* Environment out: if an empty string is returned, the data was imported into
* the specified table
*==============================================================================


lparameters tcAlias, ;
tcTable, ;
tcDatabase, ;
tcServer, ;
tcUserName, ;
tcPassword
local lnSelect, ;
lcSchema, ;
lcData, ;
lcReturn, ;
loException as Exception, ;
lcXSD, ;
loBulkLoad


* Create the XML data and schema files.


lnSelect = select()
select (tcAlias)
lcSchema = forceext(tcTable, 'xsd')
lcData = forceext(tcTable, 'xml')
try
cursortoxml(alias(), lcData, 1, 512 + 8, 0, lcSchema)
lcReturn = ''
catch to loException
lcReturn = loException.Message
endtry


* Convert the XSD into a format acceptable by SQL Server. Add the SQL
* namespace, convert the start and end tags to ,
* use the sql:datatype attribute for DateTime fields, and specify the table
* imported into with the sql:relation attribute.


if empty(lcReturn)
lcXSD = filetostr(lcSchema)
lcXSD = strtran(lcXSD, ':xml-msdata">', ;
':xml-msdata" xmlns:sql="urn:schemas-microsoft-com:mapping-schema">')
lcXSD = strtran(lcXSD, 'IsDataSet="true">', ;
'IsDataSet="true" sql:is-constant="1">')
lcXSD = strtran(lcXSD, '<xsd:choice maxOccurs="unbounded">', ;
'<xsd:sequence>')
lcXSD = strtran(lcXSD, '</xsd:choice>', ;
'</xsd:sequence>')
lcXSD = strtran(lcXSD, 'type="xsd:dateTime"', ;
'type="xsd:dateTime" sql:datatype="dateTime"')
lcXSD = strtran(lcXSD, 'minOccurs="0"', ;
'sql:relation="' + lower(tcTable) + '" minOccurs="0"')
strtofile(lcXSD, lcSchema)


* Instantiate the SQLXMLBulkLoad object, set its ConnectionString and other
* properties, and call Execute to perform the bulk import.


try
loBulkLoad = createobject('SQLXMLBulkLoad.SQLXMLBulkload.4.0')
lcConnString = 'Provider=SQLOLEDB.1;Initial Catalog=' + tcDatabase + ;
';Data Source=' + tcServer + ';Persist Security Info=False;'
if empty(tcUserName)
lcConnString = lcConnString + 'Integrated Security=SSPI'
else
lcConnString = lcConnString + 'User ID=' + tcUserName + ;
';Password=' + tcPassword
endif empty(tcUserName)
loBulkLoad.ConnectionString = lcConnString
*** Can set the ErrorLogFile property to the name of a file to write import
*** errors to
loBulkLoad.KeepNulls = .T.
loBulkLoad.Execute(lcSchema, lcData)
lcReturn = ''
catch to loException
lcReturn = loException.Message
endtry


* Clean up.


erase (lcSchema)
erase (lcData)
endif empty(lcReturn)
select (lnSelect)
return lcReturn

21 comments:

  1. Anonymous1:26 PM

    Doug,

    Let me guess, you are creating a VFP to SQL upsizing wizzard that actually works!

    I know, you won't tell me if I'm right, so I'll assume I am.

    Bob Archer

    ReplyDelete
  2. Hey Bob, good to hear from you. That's a darn good guess!

    ReplyDelete
  3. Hi Ken.

    I'd check msdn.microsoft.com to see if there are any details about that error message.

    Doug

    ReplyDelete
  4. Anonymous8:11 AM

    Thanks for your contribution.One problem I could not solve is: fields that are empty but don´t accept null values. The SQLXMLBUlkLoad treat theses fields as NULL and prompt an error. Any ideas?

    TIA
    Moacyr Zalcman

    ReplyDelete
  5. Anonymous1:18 PM

    1100 individual XML Files to load into SQL Server becomes a LOT easier. Thanks Doug.

    ReplyDelete
  6. Anonymous3:05 PM

    Get error message Class definition SQLXMLBULKLOAD.SQLXMLBULKLOAD.4.0 is not found

    ReplyDelete
  7. Anonymous8:10 AM

    Hi Dough,
    i've tried the sample its working no error but nothing inserting to the SQL side :(

    ReplyDelete
  8. Did you try setting loBulkLoad.ErrorLogFile to the name of a file to log errors to?

    ReplyDelete
  9. Anonymous1:26 AM

    No, how would be this

    ReplyDelete
  10. Anonymous3:45 PM

    SQLXMLBulkLoad eats CR of CRLF in Memo fields when importing. Any ideas on a workaround?

    ReplyDelete
  11. Hello Doug

    I have also problems with NULL because my sql table don´t accept NULL.

    Could you give me na ideia
    Many thanks
    Luis Santos

    ReplyDelete
  12. If you have empty or null values in your data but the fields don't accept null values, then it sounds like you have a data issue you need to resolve.

    ReplyDelete
  13. Anonymous9:32 AM

    did the SQLXML component get deprecated?

    it doesn't install into windows 10

    the download page says
    Supported Operating System
    Windows 7, Windows Server 2003, Windows Server 2008, Windows Server 2008 R2, Windows Vista, Windows XP

    ReplyDelete
  14. Not sure -- I haven't used it in a while.

    ReplyDelete
  15. nigel gomm11:45 AM

    SQL server 2016 pages still refer to it... it just doesn't install into windows 10 client PCs (nor Win8 by the looks of it).
    Can't find anything relevant with a google search.....

    I'll see if i can find the .dll elsewhere and install manually.

    ReplyDelete
  16. nigel gomm3:28 PM

    i'm drawing a blank searching for the .dll

    would you be breaking copyright by putting it here for download?

    (assuming you still have it even)

    ReplyDelete
  17. I don't have it either and the download for it doesn't work on my machine (Windows 10). So maybe it's been deprecated after all.

    ReplyDelete
  18. nigel gomm11:47 AM

    Appreciate you looking for it...

    it wouldn't install on a windows 7 pc either and yet this sql server2016 page makes explicit reference to it

    https://msdn.microsoft.com/en-us/library/ms171744.aspx

    i'll keep looking....

    thanks again

    ReplyDelete
  19. nigel gomm1:43 PM

    update

    using the .msi downloaded from the page above and using lessmsierables (http://lessmsi.activescott.com/)

    i was able to extract
    sqlxml4.dll
    sqlxml4m.dll
    sqlxml4r.dll
    sqlxmlx.dll
    xblkld4.dll
    xblkld4r.dll

    then regsiter sqlxml4.dll and xblkld4.dll

    .... and it worked (uploading to an azure sql database)

    (and wow is it quick!)

    n

    ReplyDelete
  20. Thanks for the update, Nigel.

    ReplyDelete