Chapter 4. XSLT Document Processing

aka: How it All Works

For an in-depth review of XSLT (W3C, 1999), you are urged to consult a good reference book, such as the XSLT Programmer's Reference (Kay, 2003).

With that under your belt, you will see that RefDB-lite works by first creating two global xsl:variable elements named refdb.citation.style and refdb-lite.bib.doc. The former loads the RefDB citation style; the latter is a temporary result-tree node-set that assembles the source document's citation references together with their resolved and sorted bibliographic data. Because these are global variables, they are evaluated in a pre-parse stage, typically before the target transformation's template matching rules are applied to the document root node, and certainly before any template matching rule encounters the first citation node in the source document.

The upshot is that by the time the first citation tag is encountered, both the citation formatting style and the relevant reference data are on hand to transform that citation into its desired output format and link it to the appropriate bibliographic entry.

The citations are collated in document order (for numerical schemes), alphabetically by citekey (for citekey-based schemes), or by author (for author-year schemes). They are saved to a temporary result tree as a list of somewhat abused DocBook (Walsh, Muellner and Stayton, 1999) xref tags, whose linkend attributes contain unique citation basenames (stripped of -X extensions), all encapsulated within a citations fragment.
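The three collation orders can be sketched outside XSLT. The following Python fragment is only an illustration, not the stylesheet's actual logic: the Citation class, the basename and collate functions, the scheme names, and the treatment of the extension as a literal trailing "-X" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical stand-in for one citation found in the source document."""
    endterm: str          # e.g. "ref1-X" (assumed form of the -X extension)
    database: str = ""    # empty when the citation names no database
    author_key: str = ""  # e.g. "SMITH2001", used by author-year schemes


def basename(endterm):
    """Strip a trailing "-X" extension, if present."""
    return endterm[:-2] if endterm.endswith("-X") else endterm


def collate(citations, scheme):
    """Return unique citations ordered for the given (assumed) scheme name."""
    seen = set()
    unique = []
    for cite in citations:  # first occurrence wins, preserving document order
        key = (cite.database, basename(cite.endterm))
        if key not in seen:
            seen.add(key)
            unique.append(cite)
    if scheme == "numeric":
        return unique
    if scheme == "citekey":
        return sorted(unique, key=lambda c: basename(c.endterm) + c.database)
    if scheme == "author-year":
        return sorted(unique, key=lambda c: c.author_key)
    raise ValueError(f"unknown scheme: {scheme}")
```

The citekey sort key here is the basename followed by the database name, which mirrors the sortkey values such as "ref3db2" seen on the xref elements of the temporary tree.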

The citations node also holds a complete DocBook bibliography containing fully resolved biblioentry elements. These hold the "raw" reference data obtained either from within the document or, if missing there, from the auxiliary bibliography database indicated by the DocBook xsl:param bibliography.collection. Any biblioentry elements that remain unresolved are optionally resolved further from a remote RefDB HTTP gateway server. Such URL resolution occurs on an entry-by-entry basis, with each access resolving a single biblioentry. In practice, the RefDB server serves RISX (Hoenicka, 2005d) documents, which are transformed by XSLT into a compliant DocBook "raw" format. Failure to resolve at this stage is considered fatal and terminates further stylesheet processing.
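The resolution order described above amounts to a three-step fallback. The sketch below models it in Python under stated assumptions: the function name resolve_entries, the dictionary lookups, and the fetch_remote callable are hypothetical stand-ins, and no real HTTP access or RISX transformation is performed.

```python
def resolve_entries(citekeys, in_document, collection, fetch_remote=None):
    """Resolve each citekey: document first, then collection, then remote.

    in_document and collection are plain dicts standing in for the source
    document's own entries and the auxiliary bibliography database.
    fetch_remote is an optional callable (hypothetical) that returns one
    entry per call, or None when the key is unknown.
    """
    resolved = {}
    for key in citekeys:
        if key in in_document:
            resolved[key] = in_document[key]
        elif key in collection:
            resolved[key] = collection[key]
        elif fetch_remote is not None:
            entry = fetch_remote(key)  # one access resolves one entry
            if entry is not None:
                resolved[key] = entry
        if key not in resolved:
            # An unresolved entry is fatal and stops all further processing.
            raise LookupError(f"unresolved bibliographic reference: {key}")
    return resolved
```

Raising on the first unresolved key reflects the fatal behaviour described above; a stylesheet would terminate rather than raise a Python exception.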

For a source document containing multiple bibliography elements, the temporary refdb-lite.bib.doc node-set resembles the following structure:


 <refdb-lite>
   <citations id="BIB1-parent-id">
     <xref linkend="ref1" database="" type="A" sortkey="ref1"/>  
     <xref linkend="ref2" database="" type="A" sortkey="ref2"/>  
     <xref linkend="ref3" database="db2" type="A" sortkey="ref3db2"/>  
     <bibliography id="bib1-id">
       <biblioentry id="ref1">
         <abbrev>ref1</abbrev>
         <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
         <biblioset>
           ...
         </biblioset>
       </biblioentry>
       <biblioentry id="ref2">
         <abbrev>ref2</abbrev>
         <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
         <biblioset>
           ...
         </biblioset>
       </biblioentry>
       <biblioentry id="db2-ref3">
         <abbrev>ref3</abbrev>
         <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
         <biblioset>
           ...
         </biblioset>
       </biblioentry>
     </bibliography>
   </citations> 
   <citations id="BIB2-parent-id">
     <xref linkend="ref4" database="db2" type="A" sortkey="ref4db2"/>  
     <xref linkend="ref2" database="" type="A" sortkey="ref2"/>
     <xref linkend="ref3" database="db2" type="A" sortkey="ref3db2"/>  
     <bibliography id="bib2-id">
       <biblioentry id="db2-ref4">
         <abbrev>ref4</abbrev>
         <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
         <biblioset>
           ...
         </biblioset>
       </biblioentry>
       <biblioentry id="ref2">
         <abbrev>ref2</abbrev>
         <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
         <biblioset>
           ...
         </biblioset>
       </biblioentry>
       <biblioentry id="db2-ref3">
         <abbrev>ref3</abbrev>
         <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
         <biblioset>
           ...
         </biblioset>
       </biblioentry>
     </bibliography>
   </citations> 
 </refdb-lite>

By the time citation elements are encountered and actively being transformed, the xref elements in the above listing are fully redundant, having served their purpose of ordering the biblioentry elements.

At transformation time, then, each source document citation/biblioref node is linked to a target biblioentry whose id attribute is built from the following components: {bibL-}{dbM-}{refN}. If the source document contains a single bibliography element, the {bibL-} component is omitted (L is an integer and "bib" is specified by the global parameter refdb-multi-bib-prefix). If the source citation <biblioref endterm="?-?-X"/> did not specify a database component, the {dbM-} component is also omitted.
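The id composition rule can be made concrete with a few lines of Python. This is a sketch for illustration only: the function biblioentry_id and its parameter names are hypothetical, and the default prefix "bib" simply stands in for whatever refdb-multi-bib-prefix is set to.

```python
def biblioentry_id(ref, db="", bib_index=None, bib_prefix="bib"):
    """Build an id of the form {bibL-}{dbM-}{refN}.

    bib_index is None for a document with a single bibliography, in which
    case the {bibL-} component is left out. An empty db leaves out {dbM-}.
    """
    parts = []
    if bib_index is not None:
        parts.append(f"{bib_prefix}{bib_index}")
    if db:
        parts.append(db)
    parts.append(ref)
    return "-".join(parts)
```

For example, biblioentry_id("ref1") gives "ref1", biblioentry_id("ref3", db="db2") gives "db2-ref3", and biblioentry_id("ref3", db="db2", bib_index=2) gives "bib2-db2-ref3".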

The content of each biblioentry in the above listing consists of DocBook "raw" bibliographic information. Thus, it can equally well be included in the source DocBook document or stored in an external bibliography resource file.

In addition to the "raw" DocBook data, the refdb.raw.biblist biblioentry data is augmented with a bibliomisc element carrying a role='sortkey' attribute. The sortkey is the uppercased concatenation of all PRIMARY author surnames OR SECONDARY editor surnames OR TERTIARY editor surnames OR the PRIMARY title, postfixed with the publication year. The appropriate sort order is very probably specific to the publication style as well as the reference type, so you may well wish to customize the refdb.add.raw.bibentry template in collation.xsl to accommodate your sorting requirements.
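The fallback chain behind the sortkey is easier to see in code. The Python sketch below illustrates the rule as stated here and is not a port of the refdb.add.raw.bibentry template: the function make_sortkey and its parameters are hypothetical, and how the real template treats spaces or punctuation in names and titles is not covered.

```python
def make_sortkey(authors=(), secondary_editors=(), tertiary_editors=(),
                 title="", year=""):
    """Return the uppercased sortkey: first available name group, then year.

    The name groups are tried in order: primary authors, secondary editors,
    tertiary editors. If all are empty, the primary title is used instead.
    """
    for surnames in (authors, secondary_editors, tertiary_editors):
        if surnames:
            stem = "".join(surnames)
            break
    else:
        # No surnames at any level: fall back to the title.
        stem = title
    return (stem + str(year)).upper()
```

With this sketch, make_sortkey(authors=["Kay"], year=2003) gives "KAY2003", and an entry with only a title, such as make_sortkey(title="XSLT", year=1999), gives "XSLT1999".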

During ordinary DocBook XSLT (Stayton, 2005) stylesheet processing, when the templates finally encounter a bibliography node, the RefDB-lite stylesheets override the default templates and call the refdb.process.bibliography template. This applies the previously requested publication style to all biblioentry elements contained within the corresponding bibliography in the refdb-lite.bib.doc node-set, matched by its id attribute.
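The lookup by id can be mimicked with Python's standard xml.etree.ElementTree module. This is a minimal sketch, assuming an abridged copy of the temporary tree with hypothetical, distinct bibliography ids; the function entries_for_bibliography is invented for the example and only lists the entries that a publication style would then be applied to.

```python
import xml.etree.ElementTree as ET

# Abridged, hypothetical instance of the refdb-lite.bib.doc structure.
BIB_DOC = """<refdb-lite>
  <citations id="BIB1-parent-id">
    <bibliography id="bib1-id">
      <biblioentry id="ref1"><abbrev>ref1</abbrev></biblioentry>
      <biblioentry id="db2-ref3"><abbrev>ref3</abbrev></biblioentry>
    </bibliography>
  </citations>
  <citations id="BIB2-parent-id">
    <bibliography id="bib2-id">
      <biblioentry id="db2-ref4"><abbrev>ref4</abbrev></biblioentry>
    </bibliography>
  </citations>
</refdb-lite>"""


def entries_for_bibliography(bib_doc, bibliography_id):
    """Return the biblioentry ids held by the bibliography with the given id."""
    root = ET.fromstring(bib_doc)
    bib = root.find(f".//bibliography[@id='{bibliography_id}']")
    if bib is None:
        return []
    # Entries are returned in stored order, which is the collated order.
    return [entry.get("id") for entry in bib.findall("biblioentry")]
```

Because the entries were already sorted when the temporary tree was built, the lookup only needs to preserve their stored order.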