Friday, July 07, 2006

SQLXMLBulkLoad Rocks!

For reasons that will become obvious in September (sly grin), I've been working a lot lately on uploading data to SQL Server 2005. There are a variety of mechanisms you can use to do this, but all of them have their shortcomings under certain conditions. For example, using a series of INSERT statements is slow while bulk insert doesn't work with memo fields. While looking for something else in SQL Server Books Online, I happened across a topic on bulk XML loading. After playing with it for a while, it seemed pretty easy to do, so I did some testing on a relatively large file (365,741 records) containing memo fields (which means I couldn't use bulk insert). Using other mechanisms, it took more than two hours to load the data into a SQL Server table. Using bulk XML load, it took 11 minutes. That's a 90% improvement in speed. I love making things faster!

Using bulk XML load is easy: you instantiate SQLXMLBulkLoad.SQLXMLBulkload.4.0, set its ConnectionString property to an OLE DB connection string, set some other properties appropriately (for example, KeepNulls to .T. to insert null rather than default values for missing column values), and call Execute, passing it the name of a schema file and the name of the XML data file. Execute throws an error if there's something wrong with either file, and import errors are logged to a file whose name is in the ErrorLogFile property.

There's one slight wrinkle in working with VFP data: while the CURSORTOXML() function can create a schema file, it needs to be tweaked slightly to work with bulk XML load. Also, DateTime fields have to be flagged with the sql:datatype="dateTime" attribute (the latter was documented, the former took some trial and error to resolve).

Here's a generic program that will load the data from an open VFP cursor into an existing SQL Server table. You can download it from http://www.stonefield.com/articles/other/bulkxmlload.zip.

*==============================================================================
* Function: BulkXMLLoad
* Purpose: Performs a SQL Server bulk XML load
* Author: Doug Hennig
* Last revision: 07/06/2006
* Parameters: tcAlias - the alias of the cursor to export
* tcTable - the name of the table to import into
* tcDatabase - the database the table belongs to
* tcServer - the SQL Server name
* tcUserName - the user name for the connection (optional:
* if it isn't specified, Windows Integrated Security is
* used)
* tcPassword - the password for the connection (optional:
* if it isn't specified, Windows Integrated Security is
* used)
* Returns: an empty string if the bulk load succeeded or the text of
* an error message if it failed
* Environment in: the alias specified in tcAlias must be open
* the specified table and database must exist
* the specified server must be accessible
* there must be enough disk space for the XML files
* Environment out: if an empty string is returned, the data was imported into
* the specified table
*==============================================================================


lparameters tcAlias, ;
tcTable, ;
tcDatabase, ;
tcServer, ;
tcUserName, ;
tcPassword
local lnSelect, ;
lcSchema, ;
lcData, ;
lcReturn, ;
loException as Exception, ;
lcXSD, ;
loBulkLoad


* Create the XML data and schema files.


lnSelect = select()
select (tcAlias)
lcSchema = forceext(tcTable, 'xsd')
lcData = forceext(tcTable, 'xml')
try
cursortoxml(alias(), lcData, 1, 512 + 8, 0, lcSchema)
lcReturn = ''
catch to loException
lcReturn = loException.Message
endtry


* Convert the XSD into a format acceptable by SQL Server. Add the SQL
* namespace, convert the start and end tags to ,
* use the sql:datatype attribute for DateTime fields, and specify the table
* imported into with the sql:relation attribute.


if empty(lcReturn)
lcXSD = filetostr(lcSchema)
lcXSD = strtran(lcXSD, ':xml-msdata">', ;
':xml-msdata" xmlns:sql="urn:schemas-microsoft-com:mapping-schema">')
lcXSD = strtran(lcXSD, 'IsDataSet="true">', ;
'IsDataSet="true" sql:is-constant="1">')
lcXSD = strtran(lcXSD, '<xsd:choice maxOccurs="unbounded">', ;
'<xsd:sequence>')
lcXSD = strtran(lcXSD, '</xsd:choice>', ;
'</xsd:sequence>')
lcXSD = strtran(lcXSD, 'type="xsd:dateTime"', ;
'type="xsd:dateTime" sql:datatype="dateTime"')
lcXSD = strtran(lcXSD, 'minOccurs="0"', ;
'sql:relation="' + lower(tcTable) + '" minOccurs="0"')
strtofile(lcXSD, lcSchema)


* Instantiate the SQLXMLBulkLoad object, set its ConnectionString and other
* properties, and call Execute to perform the bulk import.


try
loBulkLoad = createobject('SQLXMLBulkLoad.SQLXMLBulkload.4.0')
lcConnString = 'Provider=SQLOLEDB.1;Initial Catalog=' + tcDatabase + ;
';Data Source=' + tcServer + ';Persist Security Info=False;'
if empty(tcUserName)
lcConnString = lcConnString + 'Integrated Security=SSPI'
else
lcConnString = lcConnString + 'User ID=' + tcUserName + ;
';Password=' + tcPassword
endif empty(tcUserName)
loBulkLoad.ConnectionString = lcConnString
*** Can set the ErrorLogFile property to the name of a file to write import
*** errors to
loBulkLoad.KeepNulls = .T.
loBulkLoad.Execute(lcSchema, lcData)
lcReturn = ''
catch to loException
lcReturn = loException.Message
endtry


* Clean up.


erase (lcSchema)
erase (lcData)
endif empty(lcReturn)
select (lnSelect)
return lcReturn

21 comments:

Anonymous said...

Doug,

Let me guess, you are creating a VFP to SQL upsizing wizzard that actually works!

I know, you won't tell me if I'm right, so I'll assume I am.

Bob Archer

Doug Hennig said...

Hey Bob, good to hear from you. That's a darn good guess!

Doug Hennig said...

Hi Ken.

I'd check msdn.microsoft.com to see if there are any details about that error message.

Doug

Anonymous said...

Thanks for your contribution.One problem I could not solve is: fields that are empty but don´t accept null values. The SQLXMLBUlkLoad treat theses fields as NULL and prompt an error. Any ideas?

TIA
Moacyr Zalcman

Anonymous said...

1100 individual XML Files to load into SQL Server becomes a LOT easier. Thanks Doug.

Anonymous said...

Get error message Class definition SQLXMLBULKLOAD.SQLXMLBULKLOAD.4.0 is not found

Anonymous said...

Hi Dough,
i've tried the sample its working no error but nothing inserting to the SQL side :(

Doug Hennig said...

Did you try setting loBulkLoad.ErrorLogFile to the name of a file to log errors to?

Anonymous said...

No, how would be this

Anonymous said...

SQLXMLBulkLoad eats CR of CRLF in Memo fields when importing. Any ideas on a workaround?

Doug Hennig said...

Sorry, I don't.

Unknown said...

Hello Doug

I have also problems with NULL because my sql table don´t accept NULL.

Could you give me na ideia
Many thanks
Luis Santos

Doug Hennig said...

If you have empty or null values in your data but the fields don't accept null values, then it sounds like you have a data issue you need to resolve.

Anonymous said...

did the SQLXML component get deprecated?

it doesn't install into windows 10

the download page says
Supported Operating System
Windows 7, Windows Server 2003, Windows Server 2008, Windows Server 2008 R2, Windows Vista, Windows XP

Doug Hennig said...

Not sure -- I haven't used it in a while.

nigel gomm said...

SQL server 2016 pages still refer to it... it just doesn't install into windows 10 client PCs (nor Win8 by the looks of it).
Can't find anything relevant with a google search.....

I'll see if i can find the .dll elsewhere and install manually.

nigel gomm said...

i'm drawing a blank searching for the .dll

would you be breaking copyright by putting it here for download?

(assuming you still have it even)

Doug Hennig said...

I don't have it either and the download for it doesn't work on my machine (Windows 10). So maybe it's been deprecated after all.

nigel gomm said...

Appreciate you looking for it...

it wouldn't install on a windows 7 pc either and yet this sql server2016 page makes explicit reference to it

https://msdn.microsoft.com/en-us/library/ms171744.aspx

i'll keep looking....

thanks again

nigel gomm said...

update

using the .msi downloaded from the page above and using lessmsierables (http://lessmsi.activescott.com/)

i was able to extract
sqlxml4.dll
sqlxml4m.dll
sqlxml4r.dll
sqlxmlx.dll
xblkld4.dll
xblkld4r.dll

then regsiter sqlxml4.dll and xblkld4.dll

.... and it worked (uploading to an azure sql database)

(and wow is it quick!)

n

Doug Hennig said...

Thanks for the update, Nigel.