Copyright © 2006 Doug du Boulay
Mar 19 2006
Abstract
Exploring the boundaries, fringes and possibilities of RefDB - the TEI and DocBook compatible reference database.
Two of the more popular dialects for XML authoring are those of the Text Encoding Initiative (1), and DocBook (2). A third, and relative newcomer on the block is the Darwin Information Typing Architecture (DITA) (3), though it remains outside the scope of this package. Both of the former dialects have extensive facilities for encoding or encapsulating bibliographic information, but their associated tools are devoid of comprehensive bibliographic authoring and collation facilities (to the best of the present authors' knowledge).
The program RefDB (4) fills that void. RefDB implements a relational database interface to various database management systems such as SQLite, MySQL and PostgreSQL. RefDB has a two-tier client-server architecture, providing methods for adding, retrieving and searching reference data in externally managed databases. RefDB also provides convenience commands and interfaces for editing and annotating the reference data contained therein and for formatting the citations and bibliographical data that are emitted. Typically the "cooked", i.e. preformatted, citations and bibliography entries are then simply included in, or with, the source TEI or DocBook documents, written in either XML or SGML format.
But as powerful as RefDB is, there are some instances where the overhead of a full database management system may be inexpedient, or might unnecessarily restrict the portability of the source document. Maybe you work in a small laboratory without the manpower to devote to comprehensive reference management, or perhaps a document needs to exist standalone in its entirety, or will be used in non-standard ways, such as submission to different journals (not that you would ever need that!), or inclusion in laboratory webpages and productivity manuals. In such situations it could well be useful to apply standard bibliographic and citation style formatting to the complete document, formatting it in different ways consistent with those different purposes.
For those somewhat esoteric reasons, some of the useful sorting and formatting features of the RefDB package have been externalised here as a CITESTYLE (5) driven eXtensible Stylesheet Language Transformation (6). RefDB-lite is XSLT 1.0 compatible but makes considerable use of the standard extension function, exsl:node-set(). In fact that is a fundamental prerequisite for any XSLT 1.0 engine intended to apply these stylesheets.
For the moment, RefDB-lite has been written explicitly with DocBook output in mind and, in fact, that is further restricted to XML documents rather than the combined XML and SGML support of RefDB. However, there are no compelling reasons why analogous XML transformation stylesheets could not be written to extend formatting to other documentation systems, such as TEI. For that matter, it is possible that the programmatic logic could be mapped into DSSSL to cater for SGML documents, though this is left as an exercise for the reader.
Good luck.
Using RefDB-lite involves little more than adding an import instruction to your existing XSLT customisation file.
<xsl:import href="./docbook-xsl-1.69.0/html/chunk.xsl"/>
<xsl:import href="./refdb-lite/xsl/docbook/html/biblio.xsl"/>
The import order above is quite crucial, because RefDB-lite overrides some of the processing templates of the standard DocBook XSL stylesheets. You would replace instances of html in the above with xhtml or fo for alternative output styles (if and when implemented).
It is also necessary to select a bibliography style file (CITESTYLE) such as Eur.J.Pharmacol.xml. This could be selected from command line arguments using xsltproc (1), for example:
--stringparam refdb.citation.style.file.name "Eur.J.Pharmacol.xml"
or alternatively by configuring it within your customisation file.
<xsl:param name="refdb.citation.style.file.name" select="'Eur.J.Pharmacol.xml'"/>
Of course, both of these selection mechanisms assume that your desired style exists within the RefDB-lite styles directory. If that isn't the case, you need to override the xsl:variable
<xsl:variable name="refdb.citation.style"
    select="document(concat('../../../styles/', $refdb.citation.style.file.name), document(''))/CITESTYLE"/>
The document() function used above locates the style file with respect to the refdb-lite/xsl/docbook/common/collation.xsl file. Replace document('') in the above with "/" in your customisation, to locate your style file with respect to your source document.
For use with DocBook you can specify an auxiliary reference database file. This is simply a single DocBook XML file containing a bibliography root node and filled with biblioentry elements, whose id attributes may or may not match those of your biblioref citation targets. To specify this file you should either add
<xsl:param name="bibliography.collection" select="'docbook.bib.data.xml'"/>
to your customisation file, or set it on the command line as an argument e.g. for xsltproc:
--stringparam bibliography.collection "./docbook.bib.data.xml"
Obviously this assumes that docbook.bib.data.xml is the pathname, relative to your source document, of your raw bibliography database.
For citation references that resolve neither internally nor to the external database file (bibliography.collection), it is possible to initiate an HTTP GET connection to a remotely running RefDB installation. XSLT 1.0 allows the document() function to read from URLs, potentially through a Common Gateway Interface (CGI) program that returns the missing items from your bibliography. Because this interface only ever needs to read references from RefDB and its underlying databases, it is best achieved by creating a "public" user account configured with read-only access permissions (http://refdb.sourceforge.net/manual/sect1-add-user.html), to be used by everyone intending to work with RefDB-lite. This would provide the greatest level of reference database security.
Notwithstanding the potentially severe security risk, a user intending to give RefDB-lite CGI access will need to set some or all of the following parameters in their XSLT customisation file:
<xsl:param name="refdb.server.address" select="'http://localhost/refdb/refdb-lite/refdb-lite-server.cgi'" />
<xsl:param name="refdb.server.username" select="'anon'" />
<xsl:param name="refdb.server.password" select="'password'" />
<xsl:param name="refdb.server.default.database" select="'refs'" />
<xsl:param name="refdb.server.data.format" select="'risx'" />
<xsl:param name="refdb.server.timeout" select="1"/> <!-- autodestroy session id in N minutes?? -->
In the interests of local security, you could add some or all of those parameters to an auxiliary customisation file with read/write permissions restricted to yourself (chmod 600 auxfile.xsl), and then include that within your primary customisation file:
<xsl:include href="auxfile.xsl"/>
While that will protect your RefDB access account details locally, your database password is still passed in cleartext as part of the HTTP access URL. It is easily intercepted in transit, and it is also recorded in the gateway server's access and error logs. (This needs further investigation; presumably a variable password encoding key of some description could be negotiated and implemented in XSLT?) Do tread very, very carefully!
The RefDB-lite stylesheets are, in essence, activated on matching
<citation role="REFDB"><biblioref endterm="SomeRef2003-X"/> </citation>
elements. For RefDB-lite there are no separate intermediate short-citation expansion and collation processing phases analogous to those required with the runbib or refdbib commands, documented in the RefDB manual (1). Note also that, in contrast to RefDB usage, we use the biblioref DocBook element rather than xref, and we use the corresponding endterm attribute rather than the linkend of the latter.
The utility of the role="REFDB" attribute is also debatable. The underlying philosophy was to permit coexistence with standard DocBook citation and cross-referencing mechanisms but, as it stands, that may not work and seems unlikely to be reliable, because RefDB-lite doctors the target bibliography substantially (and yet ...?).
The biblioref element was introduced ca. DocBook 4.3 and is a more appropriate container for bibliographic reference information. Moreover, the linkend attribute of xref is intended to be an IDREF (reference) to a specific unique ID, but for RefDB-lite those IDs may exist only in the transformed document, so that a stand-alone source XML document including unresolvable xref elements may technically be invalid.
Note: According to DocBook 5.0, biblioref has the following attributes:
  begin (token), e.g. start page number
  end (token), e.g. end page number
  endterm (IDREF)
  units (token), e.g. page
  xrefstyle, e.g. extra formatting style info
of which you will note that
You should note also that the -X endterm extensions are stripped and, in the simple case, SomeRef2003-X would refer to a <biblioentry id="SomeRef2003"/> entry in the transformed document's <bibliography/>.
Taking the above into consideration, citations are therefore made in RefDB "full" notation, where the ultimate citation format is governed by the trailing "-N" extension on the endterm. In keeping with the "full" notation, for a document that knowingly makes use of the RefDB server interface to resolve missing references, it is possible to prepend the database identification string to the front of the RefDB citekey identifier, for example: endterm="MyOtherRefDB-Smith00-X". That would only be necessary if you are resolving references from multiple RefDB databases and need to distinguish between them. The default database, if needed, should be identified in the XSLT customisation parameter refdb.server.default.database.
The following citation formatting rules are intended to apply:
endterm="Smith00-X" -> (Smith, Jones & Murphy, 2000)   INTEXTDEF first
endterm="Smith00-S" -> (Smith et al., 2000)            INTEXTDEF subsequent
endterm="Smith00-W" -> Smith, Jones & Murphy (2000)    combined AUTHORONLY/YEARONLY first
endterm="Smith00-U" -> Smith et al., (2000)            combined AUTHORONLY/YEARONLY subsequent
endterm="Smith00-A" -> Smith, Jones & Murphy           AUTHORONLY first
endterm="Smith00-Q" -> Smith et al.                    AUTHORONLY subsequent
endterm="Smith00-Y" -> (2000)                          YEARONLY
i.e. -X, -W and -A for initial citations and -S, -U and -Q for subsequent citations.
In fact -S, -U and -Q aren't needed at all, because with the source document at hand we can simply count the number of previous citations using the same key and adjust the style accordingly (when xsl:number works properly!). On the other hand, when RefDB itself provides active support for biblioref elements, compatibility could require them for rigorous citation formatting.
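The endterm convention and the "count previous citations" idea can be illustrated outside XSLT. The following is only a sketch in Python, not the stylesheet's actual logic: the function names are hypothetical, and it assumes that neither database names nor citekeys contain a hyphen, which the text above does not guarantee.

```python
from collections import Counter

# Subsequent-citation suffixes mapped to their first-citation equivalents,
# following the rules listed above (-S/-U/-Q pair with -X/-W/-A).
SUBSEQUENT_TO_FIRST = {"S": "X", "U": "W", "Q": "A"}


def parse_endterm(endterm):
    """Split '[database-]citekey-SUFFIX' into (database, citekey, suffix).

    Assumption: database names and citekeys contain no '-' characters.
    """
    parts = endterm.split("-")
    if len(parts) == 2:
        return "", parts[0], parts[1]
    if len(parts) == 3:
        return parts[0], parts[1], parts[2]
    raise ValueError("unexpected endterm: " + endterm)


def classify_citations(endterms):
    """Return (citekey, style, is_first) for each endterm in document order.

    Whether a citation is 'first' or 'subsequent' is derived by counting
    earlier uses of the same key, so -S/-U/-Q are folded into -X/-W/-A.
    """
    seen = Counter()
    result = []
    for endterm in endterms:
        database, citekey, suffix = parse_endterm(endterm)
        style = SUBSEQUENT_TO_FIRST.get(suffix, suffix)
        is_first = seen[(database, citekey)] == 0
        seen[(database, citekey)] += 1
        result.append((citekey, style, is_first))
    return result
```

For example, classify_citations(["Smith00-X", "Smith00-X"]) marks the second use of Smith00 as a subsequent citation without the author having to write -S.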
But in fact that is not the final word on citation formatting, because the specific citation renderings are specified in every RefDB CITESTYLE (2) file, e.g.:
<CITSTYLE>
  <INTEXTDEF><REFNUMBER/></INTEXTDEF>
  <YEARONLY><REFNUMBER/></YEARONLY>
  <AUTHORONLY> ... </AUTHORONLY>
</CITSTYLE>
With those specifications, the citations above could present as:
endterm="Smith00-X" -> [1]
endterm="Smith00-S" -> [1]
endterm="Smith00-W" -> Smith, Jones & Murphy (2000)    (should that be [1]?)
endterm="Smith00-U" -> Smith et al., (2000)            (should that be [1]?)
endterm="Smith00-A" -> Smith, Jones & Murphy
endterm="Smith00-Q" -> Smith et al.
endterm="Smith00-Y" -> [1]
and the corresponding bibliography would be ordered by citation sequence, and accordingly indexed.
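Ordering "by citation sequence" means that each reference takes the number of its first appearance in the document. A minimal Python sketch of that numbering follows; the function name is hypothetical and the real work is done by the stylesheets.

```python
def number_by_first_citation(citekeys):
    """Map each citekey to a 1-based index in order of first appearance."""
    numbers = {}
    for key in citekeys:
        if key not in numbers:
            # A new reference takes the next free number.
            numbers[key] = len(numbers) + 1
    return numbers
```

Repeated citations of the same key reuse the number assigned at its first appearance, which is why both Smith00-X and Smith00-S render as [1] above.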
I would also like to believe that a style configuration file resembling the following:
<CITSTYLE>
  <INTEXTDEF><CITEKEY/></INTEXTDEF>
  <YEARONLY><CITEKEY/></YEARONLY>
  ...
</CITSTYLE>
might permit citations of the form:
endterm="Smith00-X" -> [Smith00]
endterm="Smith00-S" -> [Smith00]
endterm="Smith00-W" -> Smith, Jones & Murphy (2000)
endterm="Smith00-U" -> Smith et al., (2000)
endterm="Smith00-A" -> Smith, Jones & Murphy
endterm="Smith00-Q" -> Smith et al.
endterm="Smith00-Y" -> [Smith00]
with the corresponding bibliography ordered alphabetically and indexed in terms of citation keywords. This would provide consistency with the standard DocBook bibliographic style as documented in DocBook: The Definitive Guide (3). As yet it is not supported.
Note: There is no facility for page-note, i.e. bottom of page, style bibliographies in DocBook and consequently none in RefDB-lite.
Personally I would like to see new, possibly configurable citation formats such as:
endterm="Walsh99-T"  -> DocBook: The Definitive Guide (hyperlinked to bib)
endterm="Smith00-F2" -> Some extended title/volume CITESTYLE specified free format
This would simplify writing extended citations for things like "DocBook: The Definitive Guide (3)" by abbreviating
<citetitle>DocBook: The Definitive Guide</citetitle> <citation role="REFDB"><biblioref endterm="Walsh99-X"/></citation>.
to just:
<citation role="REFDB"><biblioref endterm="Walsh99-TW"/></citation>.
That would be very useful for citing software programs too. But is it practical and likely to be compatible with some future RefDB? Dunno.
For an in depth review of XSLT (1), you are urged to consult a good reference book, such as the XSLT Programmer's Reference (2).
With that under your belt, you would clearly grasp that RefDB-lite works by initially creating two global xsl:variable elements named refdb.citation.style and refdb-lite.bib.doc. The former loads the RefDB citation style, while the latter is a temporary result-tree node-set that contains an assembly of source document citation references and their accompanying, resolved and sorted, bibliographic data. Because these are global variables, they are evaluated in a pre-parse stage, typically before the target transformation template matching rules are applied to the document root node, and certainly before any template matching rules encounter their first citation node in the source document.
The upshot of the foregoing is that by the time that first citation tag is encountered, both the citation formatting style and the relevant reference data are on hand to transform that citation to its desired output format and link it with an appropriate bibliographic entry.
The citations are collated in document order (for numerical schemes), alphabetically by citekey (for citekey based schemes) or by author (for author-year schemes), and saved to a temporary result tree as a list of somewhat abused DocBook (3) xref tags, with associated linkend attributes containing unique citation basenames (stripped of -X extensions), all encapsulated within a citations fragment.
The citations node also holds a complete DocBook bibliography, containing fully resolved biblioentry elements that hold the "raw" reference data obtained either from within the document or, if missing, from the auxiliary bibliography database indicated by the DocBook xsl:param bibliography.collection.
The remaining unresolved biblioentry elements are optionally further resolved from a remote RefDB HTTP gateway server. Such URL resolution occurs on an entry by entry basis, with each access resolving to a single biblioentry. In actuality, the RefDB server serves RISX (4) documents and these are XSL Transformed to a compliant DocBook "raw" format. Failure to resolve at this stage is considered fatal and terminates further stylesheet processing.
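The resolution order just described (the document itself, then the bibliography.collection file, then optionally the remote gateway, with failure being fatal) can be summarised in a short Python sketch. This is an illustration only: the function and parameter names are hypothetical, and plain dictionaries stand in for the node-sets that the stylesheets actually search.

```python
def resolve_entry(citekey, internal, collection, fetch_remote=None):
    """Return the raw data for citekey, trying each source in turn.

    internal and collection map citekeys to raw entry data; an "empty"
    internal biblioentry is modelled here by simply omitting the key.
    fetch_remote is an optional callable that stands in for the per-entry
    HTTP request and returns None when the server has no such entry.
    """
    for source in (internal, collection):
        if citekey in source:
            return source[citekey]
    if fetch_remote is not None:
        entry = fetch_remote(citekey)
        if entry is not None:
            return entry
    # Mirrors the behaviour described above: an unresolved key is fatal.
    raise LookupError("unresolved citation: " + citekey)
```

Passing fetch_remote=None corresponds to running without the CGI gateway configured, in which case anything missing from both local sources stops processing.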
For a source document containing multiple bibliography elements, the temporary refdb-lite.bib.doc node-set resembles the following structure:
<refdb-lite>
  <citations id="BIB1-parent-id">
    <xref linkend="ref1" database="" type="A" sortkey="ref1"/>
    <xref linkend="ref2" database="" type="A" sortkey="ref2"/>
    <xref linkend="ref3" database="db2" type="A" sortkey="ref3db2"/>
    <bibliography id="bib1-id">
      <biblioentry id="ref1">
        <abbrev>ref1</abbrev>
        <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
        <biblioset> ... </biblioset>
      </biblioentry>
      <biblioentry id="ref2">
        <abbrev>ref2</abbrev>
        <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
        <biblioset> ... </biblioset>
      </biblioentry>
      <biblioentry id="db2-ref3">
        <abbrev>ref3</abbrev>
        <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
        <biblioset> ... </biblioset>
      </biblioentry>
    </bibliography>
  </citations>
  <citations id="BIB2-parent-id">
    <xref linkend="ref4" database="db2" type="A" sortkey="ref4db2"/>
    <xref linkend="ref2" database="" type="A" sortkey="ref2"/>
    <xref linkend="ref3" database="db2" type="A" sortkey="ref3db2"/>
    <bibliography id="bib1-id">
      <biblioentry id="db2-ref4">
        <abbrev>ref4</abbrev>
        <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
        <biblioset> ... </biblioset>
      </biblioentry>
      <biblioentry id="ref2">
        <abbrev>ref2</abbrev>
        <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
        <biblioset> ... </biblioset>
      </biblioentry>
      <biblioentry id="db2-ref3">
        <abbrev>ref3</abbrev>
        <bibliomisc role="sortkey">AUTH1AUTH2AUTHnDATE</bibliomisc>
        <biblioset> ... </biblioset>
      </biblioentry>
    </bibliography>
  </citations>
</refdb-lite>
By the time that citation elements are encountered and actively being transformed, the xref elements in the above are fully redundant, having served their purpose of ordering the biblioentry elements.
At transformation time then, each source document citation/biblioref node is linked to a target biblioentry with an id attribute created from the following components: {bibL-}{dbM-}{refN}. If the source document contains a single bibliography element, then the {bibL-} component is omitted (L is an integer and "bib" is specified by the global parameter refdb-multi-bib-prefix). If the source citation <biblioref endterm="?-?-X"/> did not specify a database component, then the {dbM-} component is also omitted.
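The composition of the target id can be sketched as follows. This Python fragment is only an illustration of the {bibL-}{dbM-}{refN} pattern; the function name is hypothetical, and it assumes that the database component is the database name exactly as written in the endterm, as the db2-ref3 entries in the listing above suggest.

```python
def target_id(citekey, database="", bib_index=None, prefix="bib"):
    """Build a biblioentry id following {bibL-}{dbM-}{refN}.

    bib_index is None when the document has a single bibliography, in
    which case the {bibL-} component is omitted. prefix stands in for the
    value of the refdb-multi-bib-prefix parameter.
    """
    parts = []
    if bib_index is not None:
        parts.append(prefix + str(bib_index))
    if database:
        parts.append(database)
    parts.append(citekey)
    return "-".join(parts)
```

So a citation of ref3 from database db2 in a single-bibliography document links to db2-ref3, while the same citation in the second of several bibliographies would link to bib2-db2-ref3.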
The content of each biblioentry in the above listing comprises DocBook "raw" bibliographic information. Thus, it can equally well be included in the source DocBook document, or stored in an external bibliography resource file.
In addition to the "raw" DocBook data, the refdb.raw.biblist biblioentry data is augmented with a bibliomisc element carrying a role='sortkey' attribute. The sortkey comprises the uppercased concatenation of all PRIMARY author surnames OR SECONDARY editor surnames OR TERTIARY editor surnames OR the PRIMARY title, AND is then postfixed with the publication year. Very probably this is publication style specific as well as reference type specific. You could well wish to customize the collation.xsl refdb.add.raw.bibentry template to accommodate your sorting requirements.
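The sort key rule reads as a chain of fallbacks followed by the year. A minimal Python sketch of that rule is given below; the function name and argument layout are hypothetical, and the actual refdb.add.raw.bibentry template may treat whitespace or punctuation differently.

```python
def make_sortkey(primary_authors, secondary_editors, tertiary_editors,
                 primary_title, year):
    """Uppercased surnames (or title as last resort) followed by the year.

    The first non-empty list of surnames wins: PRIMARY authors, then
    SECONDARY editors, then TERTIARY editors, then the PRIMARY title.
    """
    names = (primary_authors or secondary_editors or tertiary_editors
             or [primary_title])
    return "".join(name.upper() for name in names) + str(year)
```

A customised rule, for example one keyed on series title and volume, would replace the fallback chain while keeping the same overall shape.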
During ordinary DocBook XSLT (5) stylesheet processing, when the templates finally encounter a bibliography node, the RefDB-lite stylesheets override the default templates and call the refdb.process.bibliography template. This applies the previously requested publication style to all biblioentry elements contained within the corresponding refdb-lite.bib.doc bibliography node-set, matched by its associated id attribute.
RefDB-lite requires an input source of biblioentry data in "raw" DocBook format. The organisation of DocBook bibliographies is incredibly flexible, so in fact it is necessary to restrict the potential formats to a consistent though legal DocBook subset. Otherwise the source document would not validate. The raw bibliography format itself is a tentative proposal and open for negotiation. It should capture most of the potential of RISX (1) and ultimately, perhaps, MODS (2).
Each biblioentry must have an id attribute corresponding to the basename of a citation biblioref element.
<biblioentry role='JOUR' id='FoxCoef'>
  <abbrev>FoxCoef</abbrev>
  <biblioset relation='JOUR'>
    <titleabbrev role='SECONDARY'>Acta Cryst.</titleabbrev>
  </biblioset>
  <biblioset relation='SERIES' role='TERTIARY'>
    <titleabbrev role='TERTIARY'>A</titleabbrev>
  </biblioset>
  <bibliomisc role='USERDEF1'>A</bibliomisc>
  <biblioset relation='ARTICLE'>
    <authorgroup role='PRIMARY'>
      <author> <firstname>A</firstname> <othername role='mi'>G</othername> <surname>Fox</surname> </author>
      <author> <firstname>M</firstname> <othername role='mi'>A</othername> <surname>O'Keefe</surname> </author>
      <author> <firstname>M</firstname> <othername role='mi'>A</othername> <surname>Tabbernor</surname> </author>
    </authorgroup>
    <title role='PRIMARY'>Relativistic Hartree-Fock X-ray and electron atomic scattering factors at high angles</title>
    <volumenum>45</volumenum>
    <pubdate role='PRIMARY'>1989</pubdate>
    <pagenums role='start'>786</pagenums>
    <pagenums role='end'>793</pagenums>
    <bibliosource class="uri">
      <ulink url='http://www.iucr.org/paper?hh0289'/>
    </bibliosource>
  </biblioset>
</biblioentry>
If the "raw" data exists in an external database, then the internal document bibliography could consist of "empty" biblioentry tags, one to match each citation basename used within the document. However, this is not strictly necessary, and is probably superfluous.
<bibliography>
  <title>Bibliography</title>
  <biblioentry id="Smith00"/>
  <biblioentry id="Walsh99"/>
  <biblioentry id="Stayton05"/>
</bibliography>
In this case the associated "raw" data must be accessible through the bibliography.collection variable or the RefDB CGI (web) interface.
Being an all XSLT solution, the options and possibilities for customisation are endless. To that extent, many of the actual formatting templates were written expressly to provide opportunities and hooks to capture the processing at different stages and to cater for different reference types in different fashions. These options exist over and above the style options provided by the CITESTYLE (1) file.
In addition, there are many possibilities to configure the gross behaviour and common processing features.
The XSLT parameter refdb.bibliography.collection.relative defines a node-set with respect to which the DocBook XSL (2) auxiliary bibliographic database file parameter, bibliography.collection, is located. The default definition of refdb.bibliography.collection.relative loads the bibliography.collection database with respect to your DocBook source document, which seems the logical choice. It could, however, be made with respect to the RefDB-lite stylesheet directory.
Conceivably the template that makes use of the refdb.bibliography.collection.relative parameter (that would be refdb.raw.biblist in file collation.xsl) could be overridden to access that bibliography database via HTTP. According to my understanding of Kay (3), the XSLT document() function only ever loads a given unique URL once, regardless of the modification status of that URL. So basically, it would not be inefficient to make multiple access calls to such a database file ...
One possibility is to customise citation matching to obviate the role="REFDB" requirement on citation. You would need to add the following template to your customisation file.
<xsl:template match="citation">
  <xsl:choose>
    <!-- xsl:when test="@role='REFDB' and child::biblioref" -->
    <xsl:when test="child::biblioref">
      <xsl:call-template name="refdb-render-citation"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:apply-imports/> <!-- normal DocBook XSL citations -->
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
You might also need this:
<xsl:template match="citation" mode="refdb-lite.collate.mode"> <!-- ... --> </xsl:template>
This is closely related to the previous customisation. Search for all biblioref elements and replace them with xref.
Another possibility is to permit uncited references to be included in the bibliography, by adding
<xsl:param name="refdb.use.uncited.references" select="1"/>
to your customisation file, or by setting it on the command line as an argument, e.g. for xsltproc:
--param refdb.use.uncited.references 1
Do some references need special treatment? Need a sort key based on series title/volume/date instead of author/date? Get in there and customize your own refdb.add.raw.bibentry template.
How are these things used in practice? Can we standardise some uses? For instance, the Acta.Cryst.xml style uses (I believe) USERDEF1 as a container for the abbreviated series title (that being A, B, C, D or E). Was that a good choice? Is it treated correctly?
You want to cite by title? or include a computer program title as part or all of a citation link? This would be a very useful customisation!
Dump your RefDB database as RISX and apply the RefDB-lite RISX-to-DocBook-raw XSL Transformation stylesheet as a once-off conversion.
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:import href="refdb-lite/xsl/docbook/risx/risx2dbk.xsl"/>
  <xsl:template match="/">
    <xsl:call-template name="RISX-to-DocBook-raw">
      <xsl:with-param name="risx" select="."/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
Write yourself a complete XSL Transformation stylesheet. It can't be that hard to do? Can it?
Actually, there may be problems of ambiguity.
You want to change citation styles midstream? Initially, that sounds a bit inconsistent and esoteric. Yet it could happen, particularly in, say, different parts of a book, or different books of a set, or to faithfully amalgamate published articles from journals, with differing styles, into a thesis.
It would require a significant reorganisation, rather than a trivial XSLT customisation. But it is not inconceivable.
Do you need to resolve DOI addresses or MODS via SRU? http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2006/02/26/opa-proxy-script I have no idea what that involves!
Do we have to handle citebiblioid as well?
Although installation of the RefDB-lite CGI gateway script is trivial, there are a number of reasonably intuitive, but essential prerequisites.
For testing, you need a client computer with XSLT capability, with both the DocBook and RefDB-lite stylesheets installed, and your DocBook XML source document, including citation elements.
You need a second computer running an HTTP server, such as Apache. This computer must have the RefDB client program refdbc installed and operational.
You need a third computer running the RefDB refdbd daemon process and hosting the SQL based bibliographic reference-database.
Installing RefDB is beyond the scope of this document. Consult the RefDB manual (1) for details.
In actual fact, all three of those processes can be run concurrently on the same computer, communicating via the loopback (lo) network interface. This would be the most secure scenario. Then, your RefDB CGI interface would be accessed via the localhost address, for example:
<xsl:param name="refdb.server.address" select="'http://localhost/refdb/refdb-lite-server.cgi'" />
Assuming your prerequisites are in place and fully operational, you need to install the RefDB-lite server script and configure your HTTP server to enable CGI execution permissions in the script's installation directory.
On your HTTP server computer, create a directory such as /usr/local/share/refdb/www/
Copy the RefDB-lite server script, refdb-lite/www/server/refdb-lite-server.cgi, to the directory /usr/local/share/refdb/www/. It has to be executable (chmod 555 refdb-lite-server.cgi).
Configure your HTTP server to serve files and enable CGI execution of .cgi files in the /usr/local/share/refdb/www/ directory.
For the Apache2 server running on a Debian GNU/Linux box, this amounts to adding the following to either /etc/apache2/httpd.conf or, better, a file in the /etc/apache2/sites-enabled/ directory, such as 000-default:
Alias /refdb/ "/usr/local/share/refdb/www/"
<Directory "/usr/local/share/refdb/www/">
    Options +ExecCGI
    AddHandler cgi-script .cgi
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
Follow that with /etc/init.d/apache2 restart and you should be away!
Warning: You use this script at your own risk! Potentially it is a very large security hole. Tread carefully. Read the source code. Figure out how to do it better!
To communicate with the RefDB refdbc client, the XSLT client should first provide a username and database password. This information is stored in a /tmp/refdb-sid* file to provide persistence of state over repeated transactions. Sadly, the temporary file permissions currently permit all bona fide user accounts on the server to read them (you thought I was joking about the security thing, didn't you!).
Worse than that, under a heavy HTTP server load, it is possible that your HTTP server thread could be reallocated to service a different HTTP client address and in doing so, reallocate your session state file to the new client. You'll know when it happens, because your RefDB references won't resolve and the stylesheet processing will abort with a message saying you aren't recognised any more.
I would like to think that people who properly knew what they were doing could improve on this relatively easily.
For any particular citation, the basic principle for deciding the appropriate bibliography to insert the associated biblioentry data into, and then to link to, is that it will be the first bibliography child of the closest ancestor that has one. The following examples should therefore demonstrate plausible DocBook document structures where RefDB-lite would operate in a perfectly logical and consistent manner.
Example 8.1. A perfectly valid book
<book>
  <chapter>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
    <bibliography id="bib1"/>
  </chapter>
  <chapter>
    <para><citation/></para>    <!-- definitely pointing to bib2 -->
    <bibliography id="bib2"/>
  </chapter>
  <appendix>
    <para><citation/></para>    <!-- definitely pointing to bib3 -->
    <bibliography id="bib3"/>
  </appendix>
</book>
Example 8.2. Another perfectly valid book
<book>
  <chapter>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </chapter>
  <chapter>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </chapter>
  <appendix>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </appendix>
  <bibliography id="bib1"/>
</book>
Example 8.3. A third perfectly valid book
<book>
  <bibliography id="bib1"/>
  <chapter>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </chapter>
  <chapter>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </chapter>
  <appendix>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </appendix>
</book>
Example 8.4. A perfectly valid part
<part>
  <chapter>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </chapter>
  <chapter>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </chapter>
  <bibliography id="bib1"/>
  <appendix>
    <para><citation/></para>    <!-- definitely pointing to bib2 -->
    <bibliography id="bib2"/>   <!-- but if bib2 was missing, then also to bib1 -->
  </appendix>
</part>
Example 8.5. A perfectly valid article
<article>
  <section>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </section>
  <section>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </section>
  <appendix>
    <para><citation/></para>    <!-- definitely pointing to bib1 -->
  </appendix>
  <bibliography id="bib1"/>
</article>
Example 8.6. A weird concoction that should still be valid
<article>
  <articleinfo>
    <para><!-- a citation here would have no target bibliography --></para>
  </articleinfo>
  <section>
    <para><citation/></para>      <!-- definitely pointing to bib3 -->
    <section>
      <para><citation/></para>    <!-- definitely pointing to bib1 -->
      <bibliography id="bib1"/>
    </section>
    <section>
      <para><citation/></para>    <!-- definitely pointing to bib2 -->
      <bibliography id="bib2"/>
    </section>
    <section>
      <para><citation/></para>    <!-- definitely pointing to bib3 -->
    </section>
    <section>
      <para><citation/></para>    <!-- definitely pointing to bib3 -->
    </section>
    <bibliography id="bib3"/>
  </section>
</article>
The astute author would note that DocBook permits bibliography elements within appendix, article, book, chapter, glossary, part, preface, sect1, sect2, sect3, sect4, sect5 and section parents. RefDB-lite should generally work quite happily with those. However, it is technically possible to create a complete nonsense hierarchy without too much effort, such as the following pseudo DocBook structure:
<set> <setinfo> <citation/> <!-- With no possible target bibliography --> </setinfo> <book> <bibliography id="bib1"/> <chapter> <citation/> <!-- definitely pointing to bib2 --> <bibliography id="bib2"/> </chapter> <chapter> <citation/> <!-- Which bibliography should this point to?--> </chapter> <bibliography id="bib3"/> <appendix> <citation/> <!-- Which bibliography should this point to ?--> </appendix> </book> </set>
Clearly there is ample scope for processing confusion. But, it is to be hoped that the wary author will exercise a modicum of restraint in their placement of bibliographies, despite the flexibility that DocBook provides.
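The placement rule implied by the comments in the examples above can be sketched in a few lines of Python. This is only an illustration under a stated assumption: that a citation belongs to the bibliography that is a direct child of its nearest enclosing ancestor, and that an ancestor holding more than one bibliography is ambiguous. The function name resolve_citations and the "AMBIGUOUS" marker are hypothetical, and nothing here claims to reproduce what the RefDB-lite XSL actually does.

```python
import xml.etree.ElementTree as ET

def resolve_citations(xml_text):
    """Return, in document order, the assumed target of every <citation>.

    Each entry is a bibliography id, None (no bibliography in scope), or
    "AMBIGUOUS" (nearest ancestor with a bibliography has more than one).
    """
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent = {child: node for node in root.iter() for child in node}
    results = []
    for citation in root.iter("citation"):
        node, target = citation, None
        while node in parent:
            node = parent[node]
            bibs = [c.get("id") for c in node if c.tag == "bibliography"]
            if bibs:
                target = bibs[0] if len(bibs) == 1 else "AMBIGUOUS"
                break
        results.append(target)
    return results

# The nonsense hierarchy from above, with its comments stripped.
NONSENSE = (
    '<set> <setinfo> <citation/> </setinfo> <book> <bibliography id="bib1"/> '
    '<chapter> <citation/> <bibliography id="bib2"/> </chapter> '
    '<chapter> <citation/> </chapter> <bibliography id="bib3"/> '
    '<appendix> <citation/> </appendix> </book> </set>'
)

print(resolve_citations(NONSENSE))
# [None, 'bib2', 'AMBIGUOUS', 'AMBIGUOUS']
```

Under that assumed rule, a check of this kind could be run over a document before formatting, to flag citations that have no bibliography in scope or more than one candidate.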
Another major pitfall lies in the mismatch between the RISX and DocBook "raw" bibliographic formats, and the application of CITESTYLE formatting to them. To be clear, RISX is a relatively simple mapping into XML of the aged, but widely supported, RIS bibliographic information format. Although useful, RIS is rather crude in terms of the detailed semantic and formatting information it can handle. The RefDB CITESTYLE formatting specification was designed to reflect the keys and elements of the RDBMS relational tables into which RefDB internally partitions and dissects the imported RIS data. In contrast, the capacity of DocBook to hold "raw" bibliographic information is almost limitless, though not necessarily clear, concise, easy to author or unique.
As a result there are several issues that will need careful consideration. In particular, can the representation of dates in the RefDB exported RISX be standardised in a numerical format? Are there better ways to handle journal series abbreviations? Can use of the USERDEFn and MISCn attributes be standardised across different citation styles? Could corporate authors, editors and publisher items be identified and transformed more consistently? And does the current RefDB-lite RISX (1) to DocBook "raw" conversion process rigorously capture all the intricacies of the source format?
Other issues will arise in the form of bugs and defects in the current XSL implementation of RefDB-lite. For instance, RISX uses an abbreviated journal title compaction scheme whose expansion hasn't yet been implemented here.
Another glaring omission is that the ordering of multiple references in a multi-biblioref citation, under a numeric citation style, does not apply ascending or descending date ordering. Of course you could do it yourself, but this is a computer, dammit. It automates things.
A third example arose with the handling of olinks in a handcrafted DocBook "raw" biblioentry, which exposed a deficiency in the underlying DocBook style sheets.
If olinks are added within an internal bibliography, resolving them in the DocBook style sheets requires customising the xref.xsl olink template to support <xsl:param name="context" select="/"/>. Then modify its document($target.database.filename,/) statement to use document($target.database.filename,$context). (Potentially this could be remedied in the DocBook XSL stylesheets, but of course no one ever needed it before.) You will have to pass the context parameter to the select.target.database template as well (called within the olink template) and then repeat the above for the common/olink.xsl select.target.database template too. The problem is that the olink is copied into a temporary result tree with an obscure base URI, whereas document() needs to resolve the olink with respect to the source DocBook document's identifier in the olink database (hmmm, maybe we could copy the document ID to the root node of the temporary tree?).
That's about it. Enjoy!
This is just a practical demonstration of using and formatting citations. Consider it a tutorial, but there is really nothing new to be learnt here.
We used the CITESTYLE INTEXTDEF format (-X) citation, for example "(1)", throughout this document. This so-called parenthetic allusion effortlessly switches presentation style between numerically ordered (numeric) and alphabetically sorted (author-year) bibliography formats. However, we could have used an AUTHORONLY (-A) format to say something like, "Walsh, Muellner, and Stayton wrote a tremendously useful book, DocBook: The Definitive Guide (1)", where we followed the explicit title with a YEARONLY (-Y) format citation. Like INTEXTDEF, the year can change from being a year (obviously) to a numerical reference, according to the specifics of the CITESTYLE file you choose. It really is a shame that we can't explicitly cite the reference title here, though. Anyway, those are the most portable citation formats across the different styles.
But if you know from the outset that you will only be using an author-year format, then it would be perfectly OK to go using Whole AUTHORYEAR (-W) style citations, willy-nilly, to your heart's content, so long as the sentence grammar remains intact and, for example, Walsh, Muellner, and Stayton (1) as well as the USA Library of Congress (2) do not object to having their names cast about gratuitously, just to illustrate a rather irrelevant issue. You should note that AUTHORYEAR is not supported by RefDB proper, so in the event you switch from RefDB-lite to full RefDB support, using the runbib command, you might be sadly disappointed.
Just for the record though, most bibliographic styles do not include a parenthetic AUTHORONLY mode, for the simple reason that it does not in general discriminate unambiguously between different references by the same author(s). You could however construct your own CITESTYLE to achieve this effect if it was really important to you.
Beyond the foregoing it starts to get tricky, as we move into the domain of multiple citations. For instance Markus appears to be really quite a prolific chap herein, with four references to his name (3-6), created as:
<citation role="REFDB"> <biblioref endterm="RefDB-X"/> <biblioref endterm="RefDB_Man-X"/> <biblioref endterm="RISX-X"/> <biblioref endterm="CITESTYLE-X"/> </citation>
But sadly this package, RefDB-lite, doesn't really do him justice, formatting-wise. A good formatting package, in an author-year mode, would have contracted that down to something like (Hoenicka, 2005; 2005b; 2005c; 2005d). Possibly we could simulate that by citing years on the last three, e.g. (3-6), but that is more fluffing around than is really desirable in an automated world. In a numeric CITESTYLE scheme, of course, that is irrelevant, as it contracts quite nicely to something like "(3-6)".
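To make the wished-for author-year contraction concrete, here is a minimal Python sketch. It assumes the year labels (2005, 2005b and so on) have already been assigned by the citation style, and that a semicolon separates every entry; the function name contract_author_year is hypothetical and is not part of RefDB or RefDB-lite.

```python
def contract_author_year(refs):
    """Contract (author, year_label) pairs, given in citation order.

    The author name is printed only when it differs from the author of the
    immediately preceding reference (an assumed convention).
    """
    parts, previous = [], None
    for author, year in refs:
        parts.append(year if author == previous else f"{author}, {year}")
        previous = author
    return "(" + "; ".join(parts) + ")"

refs = [
    ("Hoenicka", "2005"),
    ("Hoenicka", "2005b"),
    ("Hoenicka", "2005c"),
    ("Hoenicka", "2005d"),
]
print(contract_author_year(refs))
# (Hoenicka, 2005; 2005b; 2005c; 2005d)
```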
For the final citation formatting tests, it is useful to examine the treatment of multiple references, particularly sequential references in a numeric style. For example
<citation role="REFDB"> <biblioref endterm="Walsh99-X"/> <biblioref endterm="MODS-X"/> <biblioref endterm="RefDB_Man-X"/> <biblioref endterm="RISX-X"/> <biblioref endterm="CITESTYLE-X"/> <biblioref endterm="XSLT_1.0-X"/> </citation>
(One could really get carried away here). Now that parenthetic allusion renders as: "(1; 2; 4-7)". But watch what happens if we change the citation order to:
<citation role="REFDB"> <biblioref endterm="XSLT_1.0-X"/> <biblioref endterm="RefDB_Man-X"/> <biblioref endterm="Walsh99-X"/> <biblioref endterm="RISX-X"/> <biblioref endterm="MODS-X"/> <biblioref endterm="CITESTYLE-X"/> </citation>
Hopefully the citations "(1; 2; 4-7)" are now ordered and sorted identically to the previous example.
Note that these biblioref references explicitly used the portable -X formatting command. Watch out for that, or your carefully crafted document may unexpectedly transform into rubbish! Clearly there are certain issues to be settled regarding optional sorting and reordering of multiple bibliorefs, but that's a job for another day.
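The numeric renderings shown above, "(1; 2; 4-7)" and "(3-6)", are consistent with a simple rule: sort the reference numbers, then collapse any run of three or more consecutive numbers into a range. The Python sketch below implements that rule as an assumption inferred from those two outputs only; the function name format_numeric_citation, the min_run parameter and the reference numbers used in the example are hypothetical.

```python
def format_numeric_citation(numbers, min_run=3):
    """Sort reference numbers and collapse runs of min_run or more into 'a-b'."""
    ordered = sorted(set(numbers))
    parts, i = [], 0
    while i < len(ordered):
        j = i
        # Advance j to the last number of the consecutive run starting at i.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        if j - i + 1 >= min_run:
            parts.append(f"{ordered[i]}-{ordered[j]}")
        else:
            parts.extend(str(n) for n in ordered[i:j + 1])
        i = j + 1
    return "(" + "; ".join(parts) + ")"

# Assumed reference numbers for the two six-reference citations above.
print(format_numeric_citation([1, 2, 4, 5, 6, 7]))  # (1; 2; 4-7)
print(format_numeric_citation([7, 4, 1, 5, 2, 6]))  # (1; 2; 4-7)
```

Because the numbers are sorted before grouping, the order in which the bibliorefs are written has no effect on the result, which is the behaviour the reordered example hopes for.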
A better chap than I might also have demonstrated the presentation of accented characters and foreign language titles etc. But I don't know nuttin' bout such things.
So that concludes tonight's viewing. Goodnight.