TransQuery Specification

Evan Lenz

This version: 25 Oct 2001

Abstract

TransQuery comprises a small, flexible set of XSLT conventions and processing model constraints that enable the use of XSLT as a query language over multiple XML documents. This is a working draft for version 1.0 of TransQuery.


Table of Contents

Introduction
  Meta-document
  TransQuery Input
Conformance
  XSLT Stylesheets
  XSLT Processing Models
  XSLT Processors
Future Work
References

Introduction

Though [XSLT] is principally oriented toward transformation of a single source document, it has built-in mechanisms for accessing multiple source documents. It is the purpose of TransQuery to reconcile and harmonize the single-source view with a multiple-source view, using existing standard constructs in XSLT 1.0.

In XSLT 1.0, multiple source documents can be accessed in two ways:

  1. using the document() function (see [XSLT], section 12.1, Multiple Source Documents)

  2. passing a node-set in as the value of a top-level parameter (see [XSLT], section 11.4, Top-level Variables and Parameters)

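TransQuery defines two conventions, one for each of these mechanisms: the meta-document, which relies on the document() function, and the TransQuery input parameter, which relies on a top-level parameter.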
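
As a rough illustration of the two XSLT 1.0 mechanisms listed above, the following stylesheet sketch reads one extra document through document() and another set of nodes through a top-level parameter. The file name other.xml, the parameter name extra, and the result element names are made up for this example; how a node-set is actually bound to the parameter depends on the processor.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Mechanism 2: a top-level parameter expected to hold a node-set.
       "extra" is a hypothetical name. "/.." selects the parent of the
       root node, which is always an empty node-set, so the stylesheet
       still runs when no value is passed in. -->
  <xsl:param name="extra" select="/.."/>

  <xsl:template match="/">
    <sources>
      <!-- The ordinary single source document. -->
      <main elements="{count(//*)}"/>

      <!-- Mechanism 1: document() loads another document by URI.
           "other.xml" is a hypothetical URI. Given a string and no
           second argument, a relative URI is resolved against the
           base URI of the stylesheet. -->
      <by-document elements="{count(document('other.xml')//*)}"/>

      <!-- Mechanism 2 in use: count the elements under the nodes
           that were passed in. -->
      <by-parameter elements="{count($extra//*)}"/>
    </sources>
  </xsl:template>

</xsl:stylesheet>
```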

Meta-document

This convention involves the use of a default XML source tree, or "meta-document", that contains metadata about other XML documents. The word "default" is open to interpretation; in general, it applies to situations where multiple documents are treated equally as input to a query or stylesheet. There is no restriction on the type, extent, or format of the metadata, except that it must include the URIs of the documents so that the document() function can resolve them. This approach gives the query author direct access to metadata, application-level collection constructs, and similar information. It is flexible enough to accommodate a number of metadata paradigms, such as [RDF], [XLink], and [WebDAV].
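
To make the convention concrete, here is a minimal sketch of a query written against a meta-document. The meta-document vocabulary (a collection element containing doc elements with href and title attributes), the report and entry result elements, and the file names are all invented for this example; TransQuery does not prescribe any of them.

```xml
<?xml version="1.0"?>
<!-- Assumed meta-document, supplied as the source tree (hypothetical format):

       <collection>
         <doc href="orders-2000.xml" title="Orders, 2000"/>
         <doc href="orders-2001.xml" title="Orders, 2001"/>
       </collection>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/collection">
    <report>
      <xsl:for-each select="doc">
        <!-- The title is metadata read directly from the meta-document.
             document(@href) resolves the URI to the document itself;
             passing the attribute node rather than a string makes a
             relative URI resolve against the meta-document's base URI. -->
        <entry title="{@title}"
               elements="{count(document(@href)//*)}"/>
      </xsl:for-each>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

The same stylesheet works unchanged for any number of doc entries, which is the sense in which the documents are treated equally as input.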

Editor's Note

There is room for discussion on whether TransQuery should recommend or invent particular meta-document formats, URI schemes, etc.

TransQuery Input

This specification defines a namespace-qualified name, {http://www.xmlportfolio.com/transquery}input, for use as the name of a top-level parameter, the value of which must be a node-set consisting of zero or more root nodes. The set of root nodes is determined externally by the application. This convention facilitates the use of collection-oriented APIs (e.g. the [XML:DB API]), where the input available to a given query is determined outside of the query itself.
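
A stylesheet following this convention might look like the sketch below. The tq prefix, the empty default value, and the result element names (results, document) are choices made for this example; only the namespace URI and the local name input come from this specification.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tq="http://www.xmlportfolio.com/transquery"
    exclude-result-prefixes="tq">

  <!-- The TransQuery input parameter. The prefix "tq" is arbitrary;
       what matters is the expanded name
       {http://www.xmlportfolio.com/transquery}input.
       The default "/.." (an empty node-set) is an assumption of this
       sketch, used when the application binds nothing. -->
  <xsl:param name="tq:input" select="/.."/>

  <xsl:template match="/">
    <results count="{count($tq:input)}">
      <!-- Each node in $tq:input is the root node of one document. -->
      <xsl:for-each select="$tq:input">
        <!-- name(*) is the name of that document's document element. -->
        <document root-element="{name(*)}"/>
      </xsl:for-each>
    </results>
  </xsl:template>

</xsl:stylesheet>
```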

Conformance

XSLT Stylesheets

The purpose of this specification is to encourage a particular use of XSLT that is not antagonistic to any other use of XSLT. In fact, the meta-document convention is simply a codification of existing practice. It implies no restriction on the use of individual documents as source trees; a TransQuery system should provide equal support for meta-document queries and transformations of individual source documents. Accordingly, this specification does not define conformance requirements for XSLT stylesheets.

Note

While there are no TransQuery conformance requirements for XSLT stylesheets, it is considered bad practice to use names in the TransQuery namespace for purposes other than as described in this specification.

XSLT Processing Models

The term "processing model" is used here to denote the framework within which the XSLT processor itself operates. The processing model determines how the source tree is located, how parameters are passed to the stylesheet, what is done with the result tree, and so on.

Meta-document

This specification neither requires nor forbids the use of any particular URI scheme, fragment identifier, or document retrieval mechanism in the implementation of XSLT's document() function. As the XSLT recommendation allows, these are implementation-dependent. In addition, the name of the meta-document, where it is stored, and how it is retrieved are currently beyond the scope of this specification.

TransQuery Input

TransQuery-conformant processing models must provide some mechanism outside of the stylesheet for binding a parameter with the TransQuery input name to a node-set consisting of zero or more root nodes. No other type may be bound to this parameter.
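
The binding itself happens outside the stylesheet, so a stylesheet cannot enforce this constraint. As an optional sanity check, though, a query could verify that every node it received is a root node, along the lines of the sketch below. The check, the message text, and the documents result element are assumptions of this example and not requirements of TransQuery; the sketch also assumes that the bound value is a node-set in the first place.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tq="http://www.xmlportfolio.com/transquery"
    exclude-result-prefixes="tq">

  <!-- Default to an empty node-set when nothing is bound. -->
  <xsl:param name="tq:input" select="/.."/>

  <xsl:template match="/">
    <!-- In the XPath 1.0 data model only a root node has no parent,
         so any node in $tq:input that has a parent is not a root node. -->
    <xsl:if test="$tq:input[..]">
      <xsl:message terminate="yes">TransQuery input must contain only root nodes.</xsl:message>
    </xsl:if>

    <!-- An empty node-set is valid input: zero root nodes. -->
    <documents count="{count($tq:input)}"/>
  </xsl:template>

</xsl:stylesheet>
```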

XSLT Processors

The XSLT recommendation states that "XSLT does not define the mechanism by which parameters are passed to the stylesheet." [XSLT] Since the TransQuery conventions do not utilize any processor-specific extensions, the only requirement for an XSLT processor to be used in a TransQuery application is that it provide a mechanism by which an XPath node-set can be supplied as the value of a top-level parameter.

Future Work

The following items are under consideration for future discussion and experimentation:

  • Specific meta-document formats

  • Using XSLT as a document update language

References

[XSLT] James Clark, editor. XSL Transformations Version 1.0. W3C (World Wide Web Consortium), 1999.

[RDF] Ora Lassila, Ralph R. Swick, editors. Resource Description Framework (RDF) Model and Syntax Specification. W3C (World Wide Web Consortium), 1999.

[WebDAV] Y. Goland, E. Whitehead, A. Faizi, S. Carter, D. Jensen. HTTP Extensions for Distributed Authoring -- WEBDAV.

[XML:DB API] XML:DB API Working Group, XML:DB Initiative. Application Programming Interface for XML Databases.