Adding ID Attributes
The next step is to create an ID attribute for each of the structural (nondata point)
elements we have defined so far in our XML database (with the exception of the root
element). This is to uniquely identify elements that need to be referred to by other
elements.
For the name of the attribute, we use the element name followed by ID. This might
cause name collisions with other attributes that have already been added to the
XML, in which case we need to change the names of these as appropriate. These should
be defined as being of type ID, and must be declared as #REQUIRED. If we add these
ID attributes to each element for now, we can optionally remove some from the created
XML structures when we come to model all of the relationships.
When populating these structures, a unique ID will need to be created for each instance
of an element that is generated. We need to ensure that these IDs are unique not
only across all elements of a specific type, but across all elements in our document.
One way to do this programmatically (assuming that we're using automatically incremented
integers for the primary keys in our database) is to use the primary key for the
row being created, prefixed by the name of the table in which it appears.
For example, for the customer in our database with the ID number 17, we might use
the string Customer17 for the value of the CustomerID attribute on the Customer
element. If we have nonnumeric keys in our database, or similar table names with
numeric suffixes (like Customer and Customer1), this may cause name collisions
as always, be on the look out for these.
In our example, then, we have:
|
<!ELEMENT SalesData EMPTY> |
|
<!ATTLIST SalesData |
|
Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED> |
|
<!ELEMENT Invoice EMPTY> |
|
<!ATTLIST Invoice |
|
InvoiceID ID #REQUIRED |
|
InvoiceNumber CDATA #REQUIRED |
|
TrackingNumber CDATA #REQUIRED |
|
OrderDate CDATA #REQUIRED |
|
ShipDate CDATA #REQUIRED> |
|
<!ELEMENT Customer EMPTY> |
|
<!ATTLIST Customer |
|
CustomerID ID #REQUIRED |
|
Name CDATA #REQUIRED |
|
Address CDATA #REQUIRED |
|
City CDATA #REQUIRED |
|
State CDATA #REQUIRED |
|
PostalCode CDATA #REQUIRED> |
|
<!ELEMENT Part EMPTY> |
|
<!ATTLIST Part |
|
PartID ID #REQUIRED |
|
PartNumber CDATA #REQUIRED |
|
Name CDATA #REQUIRED |
|
Color CDATA #REQUIRED |
|
Size CDATA #REQUIRED> |
|
|
|
|
|
<!ELEMENT MonthlyTotal EMPTY> |
|
<!ATTLIST MonthlyTotal |
|
MonthlyTotalID ID #REQUIRED |
|
Month CDATA #REQUIRED |
|
Year CDATA #REQUIRED |
|
VolumeShipped CDATA #REQUIRED |
|
PriceShipped CDATA #REQUIRED> |
|
<!ELEMENT MonthlyCustomerTotal EMPTY> |
|
<!ATTLIST MonthlyCustomerTotal |
|
MonthlyCustomerTotalID ID #REQUIRED |
|
VolumeShipped CDATA #REQUIRED |
|
PriceShipped CDATA #REQUIRED> |
|
<!ELEMENT MonthlyPartTotal EMPTY> |
|
<!ATTLIST MonthlyPartTotal |
|
MonthlyPartTotalID ID #REQUIRED |
|
VolumeShipped CDATA #REQUIRED |
|
PriceShipped CDATA #REQUIRED> |
|
<!ELEMENT LineItem EMPTY> |
|
<!ATTLIST LineItem |
|
LineItemID ID #REQUIRED |
|
Quantity CDATA #REQUIRED |
|
Price CDATA #REQUIRED> |
Rule 5: Add ID Attributes to the Elements
Add an ID attribute to each of the elements we have created in our XML structure
(with the exception of the root element). Use the element name followed by ID for
the name of the new attribute, watching as always for name collisions. Declare the
attribute as type ID, and #REQUIRED.
|
Handling Foreign Keys
In relational database structures, the only way to show a relationship between data
kept in different tables is via a foreign key. As we saw in the previous chapter,
there are two ways to show this relationship in XML. We can create hierarchical
structures, which allow us to use containment to show relationships between
data (where related information is nested inside a parent element). Alternatively,
if we want to keep the XML structures separate like the tables of a database
we can use an ID to point to a corresponding structure that has an
IDREF attribute.
Each way has its benefits and drawbacks. Pointing is more flexible than containment,
but pointing relationships may only typically be navigated in one direction by the
processor, and tend to be slower than navigating parent-child relationships.
The next thing we need to decide is whether to use containment or pointing to represent
the relationships between our tables. In addition, we need to add the enumerated
attributes that correspond to the lookup tables we are using. Let's see how to do
that first.
Add Enumerated Attributes for Lookup Tables
If we have a foreign key in a table that points to a lookup table, we need to add
an enumerated attribute to the element representing the table in which that foreign
key appears.
Before we can do so with our example transformation, we need to identify the nature
of the relationships between the tables we have selected to include in our XML structures.
For each relationship, we need to identify:
-
Whether it's a lookup or content relationship
-
Whether it is a content relationship, and if so the direction in which it will be
navigated
This is important in larger structures because some relationships can be navigated
in more than one direction, depending on how the relationships are arrived at. As
a general rule, relationships should be navigated in the same direction that a program
would most often navigate them. For example, in our case we're much more likely
to navigate from Invoice to LineItem than we are to navigate from LineItem to Invoice.
We need to determine the direction we'll be navigating between our elements because
it determines where our ID-IDREF relationships should be. Remember that these are
effectively unidirectional it's relatively easy to traverse a structure from
an IDREF value to an ID value, but not the other way around. Deciding how we will
normally be traversing our structures helps us to determine how we should structure
them. If we need to navigate between two elements in either direction, we may need
to add an IDREF in each element pointing back to the other element in the relationship.
However, this will increase document creation time and size.
Since we want our document to support invoices and monthly summary information,
we conclude that we want our relationships to provide a way to navigate from invoices
to associated information. For example, we want to be able to go from an invoice,
to its line items, to the part associated with each line item; or from an invoice,
to the customer who ordered it. Under other circumstances, we might order our relationships
differently for example, if we wanted a customer-centric XML document.
After assessing all the relationships in our structure, we can conclude that the
navigation between the tables looks something like this:
We have a foreign key called ShipMethodType pointing to a table called ShipMethod,
therefore we need to add an enumerated value for ShipMethod to the <Invoice>
element. Let's assume that in our database, the ShipMethod table contains the following
values:
|
ShipMethod |
Description |
|
1 |
US Postal Service |
|
2 |
Federal Express |
|
3 |
UPS |
|
The enumerated attribute should take the name of the lookup table, and should be
declared as #REQUIRED if the foreign key does not allow NULLS, or #IMPLIED otherwise.
The determination of the allowable values for the enumeration is a subjective process,
and will depend on other design constraints (such as size). The values allowed should
typically be human-readable versions of the description for each record.
So, creating an attribute with allowable enumerated values for the three possible
lookup values, and adding the attribute to the <Invoice> element, gives us
this:
|
... |
|
<!ELEMENT Invoice EMPTY> |
|
<!ATTLIST Invoice |
|
InvoiceID ID #REQUIRED |
|
InvoiceNumber CDATA #REQUIRED |
|
TrackingNumber CDATA #REQUIRED |
|
OrderDate CDATA #REQUIRED |
|
ShipDate CDATA #REQUIRED> |
|
ShipMethod (USPS | FedEx | UPS) #REQUIRED> |
|
<!ELEMENT Customer EMPTY> |
|
... |
Rule 6: Representing Lookup Tables.
For each foreign key that we have chosen to include in our XML structures that references
a lookup table:
1. Create an attribute on the element representing the table in which the foreign
key is found.
2. Give the attribute the same name as the table referenced by the foreign key,
and make it #REQUIRED if the foreign key does not allow NULLS or #IMPLIED otherwise.
3. Make the attribute of the enumerated list type. The allowable values should be
some human-readable form of the description column for all rows in the lookup table.
|
|