|
A Sample Chapter from
|
Professional XML Databases
|
|
Book Title: |
Professional XML Databases |
|
Published by: |
Wrox Publication |
|
Chapter Number: |
2 |
|
Chapter Title: |
XML Structures for Existing Databases |
|
Author: |
Wrox Author Team |
|
Price $: |
49.99 |
|
ISBN: |
1861003587 |
|
Buy It Now: |
Click here |
|
Additional Info: |
Wrox Publication |
|
|
XML Books
|
|
|
XML Structures for Existing Databases
In this chapter, we will examine some approaches for taking
an existing relational database and moving it to XML.
With much of our business data stored in relational databases, there are going to
be a number of reasons why we might want to expose that data as XML:
-
Sharing business data with other systems.
-
Interoperability with incompatible systems.
-
Exposing legacy data to applications that use XML.
-
Business-to-business transactions.
-
Object persistence using XML.
-
Content syndication.
Relational databases are a mature technology, which, as they have evolved, have
enabled users to model complex relationships between data that they need to store.
In this chapter, we will see how to model some of the complex data structures that
are stored in relational databases in XML documents.
To do this, we will be looking at some database structures, and then creating content
models using XML DTDs. We will also show some sample content for the data in XML
to illustrate this. In the process, we will come up with a set of guidelines that
will prove helpful when creating XML models for relational data.
Note that there are some mechanisms out there already that provide a "default"
way to derive XML from existing relational database structures. ADO 2.5 will return
a "flattened" recordset in an XML representation, while SQL Server 2000
provides direct extraction of joined structures as XML. However, these technologies
are still maturing, and can't handle more complex situations, like many-to-many
relationships, that must be represented by IDREF-ID pointers. In this chapter, we'll
see how structures can be handcrafted to properly represent these types of relationships.
We will tune our structures to maximize performance, and minimize document size.
Migrating a Database to XML
In this chapter, we'll be using an example to see how the rules we are creating
would be applied in a real-world situation. The structure we'll be migrating to
XML is an invoice tracking and reporting system, and looks like this:
Our invoice database maintains information about invoices submitted by customers,
and the parts ordered on those invoices, as well as some summary information about
those parts. We'll create our XML structure to hold this information.
Scoping the XML Document
The first rule when designing an XML structure to hold relational information is
to decide what the scope of the document is. The scope refers to the data
and relationships that we want to reproduce when creating our XML document
after all, when exposing the database content, we may not need all of the data that
the database stores.
If we think about executing a query against a database, we may only require a subset
of the information that it holds. For example, an e-commerce site stores data with
relationships that model everything the customer has bought in the past, as well
as current orders being processed. If we were writing a CRM application, we would
not necessarily need to retrieve all of their past purchases only those that
had recently been placed.
In short, the scope of the document that we are creating is driven by business
requirements what the data is going to be used for, and how it is going
to be used and these business requirements may vary widely.
For example, our business requirement could be to transmit information to our accounting
office about summarizing the monthly invoice totals, as well as a customer-by-customer
breakdown so that billing may be performed. In this case, we may want to send only
a certain subset of the information to our accounting office (the shaded tables):
An alternative business requirement might be to transmit an XML copy of an invoice
to a customer each time a new invoice is submitted, in which case the subset of
the information we would be transmitting might look like this:
Additionally, we might want to control the specific columns that are transmitted.
For example, say our customer wanted to query a product they had ordered; they have
their invoice number to identify their purchase, but they aren't necessarily going
to care about the invoice tracking number that our application uses internally.
The extra number may in fact confuse them more.
By identifying the specific set of tables and columns that are going to be transmitted,
we can start to get a feel for how the XML document needs to be laid out. If we
happen to have access to a logical data diagram of the database, such as an ErWIN
model, it can also be very helpful when constructing our XML.
Rule 1: Choose the Data to Include.
Based on the business requirement the XML document will be fulfilling, decide which
tables and columns from our relational database will need to be included in our
documents.
|
For the purposes of our example, we'll assume that all the information in our structure
is relevant to the process (with the exception of the system-generated keys, which
we can discard).
|