Monday, February 13, 2012

'concatenating' several columns together for Insert into an XML column

Hi,

I am inserting a row into a Table. One of the columns in this table is of XML datatype.

I am using a Select statement to provide the values that will be inserted into the row. However, I would like to combine some of these values and parse them into XML for insertion into the XML column.

An example might help:

The Table to insert into has the following columns:

Table name: myTable

ID int

Data1 string

Data2 string

Data3 string

DataXml xml


I am inserting into this table using the following:

Code Snippet

insertinto myTable

(Data1, Data2, Data3, DataXml)

select Data1, Data2, Data3,

(select Data4, Data5, Data6)from myOtherTable as inside

where inside.ID = outside.ID forxmlraw)

from myOtherTable as outside


This works but it strikes me as inefficient. There is an additional lookup for each row which seems unnecessary as we are already at that row of the same table.

If it was a simple concatenation, I would do something like 'Data4 + Data5 + Data6', and it seems this is not much different except that instead of concatenation everything is being wrapped in Xml.

Does anyone have any ideas of a better way to do this?

Any help much appreciated.

Holf:

I am definitely not an expert at XML. I too am learning. This is also pretty much the way that I do it. I am also interested in seeing if Martin has a better way of doing this. To me, it looks like you are doing it right.

Kent

|||

Kent,

Thanks for the response. Well, I've checked the query plan generated and I can confirm that there is defintely some nested looping going on. I was wondering if the optimizer would realize what is happening and do some shortcuts.

Given this, I'm thinking it is going to be more efficient to concatenate the XML bits in, e.g. '<row Data4="' + Data4 + '" />' etc.

I don't want to do this because to manipulate XML I'd like to use XML tools, but I'll happily concatenate if it works out to be faster.

Thanks again for the reply. I'm learning XML too, and it does seem as though there's a lot to learn just now!

Holf

|||Be aware that FOR XML will do necessary escaping (e.g. a value like "foo & bar" will be escaped as "foo &amp; bar") while SQL string concatenation will not do any escaping that XML requires.|||Thank you, Martin, I had forgotten about that particular side effect; I have experienced pain from this particular problem a couple of times.|||

Ah yes, that is a very good point.

Well, I may write a function to which I can pass my XML elements and which will respond with some nicely crafted XML. I am considering compiling and importing a .NET assembly specifically for this task, as in .NET there are some nice tools for building up valid XML from raw elements, and these tools take care of escaping invalid characters.

I know this sounds like overkill but I still think it will be more efficient than the subquery approach above. Of course, I will be testing to find out.

Better check that my hosting provider allows upload of .NET assemblies to SQL Server...

Thanks Martin and Kent for your thoughts on this.

|||

Well, I've checked with my hosting provider and they do not allow upload of .NET assemblies. This is not surprising really, given it is a shared database and with the rights to upload assemblies, users could compromise the entire database.

So, I've written a very simple function to escape the necessary characters (of which there are not many from what I read):

Code Snippet

SETANSI_NULLSON

GO

SETQUOTED_IDENTIFIERON

GO

ALTERFUNCTION [dbo].[fn_EscapeForXml]

(

@.stringToEscape asnvarchar(256)

)

RETURNSnvarchar(256)

AS

BEGIN

SELECT @.stringToEscape =REPLACE(@.stringToEscape,'&','&amp;')

SELECT @.stringToEscape =REPLACE(@.stringToEscape,'<','&lt;')

SELECT @.stringToEscape =REPLACE(@.stringToEscape,'>','&gt;')

SELECT @.stringToEscape =REPLACE(@.stringToEscape,'''','&apos;')

SELECT @.stringToEscape =REPLACE(@.stringToEscape,'"','&quot;')

RETURN(@.stringToEscape)

END

Note that the '&' has to be escaped first, or otherwise the '&'s resulting from the other replace operations are themselves replaced.

When I am concatenating my strings to make an XML document, I pass anything which I know may have invalid characters through this function first.

I have had some XML that I have treated in this way returned to a .NET environment. When the XML is deserialized these characters are resolved as they should be into their original form, so it all seems to be working.

No comments:

Post a Comment