XML - Pro SQL Server Internals

Database Reference

In-Depth Information

XML Data Type

An XML data type stores data in an internal format using UTF-16 encoding with some compression involved, and it

does not preserve the original XML document. Listing 11-3 shows an example of this.

Listing 11-3. XML data type does not preserve the original XML document

select cast(

N'<script>

<![CDATA[

function max(a,b)

{

if (a <= b) then { return b; }

else { return a; }

}]]>

</script>' as xml)

Result:

function max(a,b)

{

if (a <= b) then { return b; }

else { return a; }

}

</script>

As you see, there is no CDATA section in the output, and < character has been replaced with character entity < .

The total storage space used by XML data type varies. Even with compression, it can exceed the raw text size

when the original text uses UTF-8 encoding. Although with UTF-16 data, it could save some space compared to the

text representation.

There are two types of XML data available in SQL Server: untyped and typed . Untyped XML can store data as

long as it is in a valid format, while typed XML is bound by the XML schema. You can create an XML schema with a

CREATE XML SCHEMA COLLECTION statement and assign it to a column, parameter, or variable of the XML data type.

Typed XML allows SQL Server to take advantage of the data type information from the XML nodes. Although

it improves XQuery performance, it also introduces the overhead of the schema validation when data is inserted or

modified. Usually, you like to have XML typed in cases where the data conforms to specific XML schema, and you can

afford such overhead.

XML Schema is stored in the system tables in an internal format. As with regular XML data, SQL Server does

not persist the original schema definition. You need to store it separately, perhaps as a BLOB, in case you need to

reconstruct it in the future.

As I already mentioned, you can create indexes on XML data. There are two kinds of XML indexes: primary and

secondary . Primary XML indexes shred the XML data into a relational format, and it has either one or two rows for

each XML node. Secondary XML indexes are non-clustered indexes defined in the relational table that stores primary

XML index data. They can help with the performance of some operations against XML data.

Now let's create the table shown in Listing 11-4. We will insert one row of data using the XML from Listing 11-1.

Search WWH ::

Custom Search

Home