Tangle

Part of Literate Programming in XML

Norman Walsh

$Id: tangle.xweb,v 1.4 2002/12/27 15:51:54 nwalsh Exp $

05 Oct 2001


Table of Contents

1. The Stylesheet
1.1. The tangle.xsl Stylesheet
1.2. The xtangle.xsl Stylesheet
2. Initialization
3. The Root Template
4. Processing Fragments
4.1. Convenience Variables
4.2. Handle First Node
4.3. Handle Last Node
4.4. Handle the Middle Nodes
5. Copying Elements
5.1. Copying src:passthrough
5.2. Copying src:fragref
5.3. Copying Disable-Output-Escaping Fragment References
5.4. Copying Everything Else
6. Copy XML Constructs

The tangle.xsl stylesheet transforms an XWEB document into a “source code” document. This is a relatively straightforward process: starting with the top fragment, all of the source fragments are simply stitched together, discarding any intervening documentation.

The resulting “tangled” document is ready for use by the appropriate processor.
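To make the stitching concrete, here is a minimal, hypothetical XWEB input. The article and para wrapper elements and the fragment IDs are invented for illustration, and the src namespace URI is assumed to be the one the litprog stylesheets bind to the src prefix; none of it is taken from the tangle sources themselves.

```xml
<?xml version="1.0"?>
<!-- Hypothetical XWEB document: element names outside the src
     namespace and all ID values are illustrative assumptions. -->
<article xmlns:src="http://nwalsh.com/xmlns/litprog/fragment">
<para>The program defines a function and then calls it.</para>
<src:fragment id="top">
<src:fragref linkend="greet"/>
greet()
</src:fragment>
<para>The function is documented separately from its call site.</para>
<src:fragment id="greet">
def greet():
    print("hello")
</src:fragment>
</article>
```

Tangling this document with tangle.xsl should discard both para elements, start at the fragment whose ID is top, and replace the src:fragref with the body of the greet fragment, so that the function definition precedes the call in the output.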

1. The Stylesheet

This XWEB document contains the source for two stylesheets, tangle.xsl and xtangle.xsl. Both stylesheets produce tangled sources; the latter is a simple customization of the former for producing XML vocabularies.

Each of these stylesheets performs some initialization, sets the output method appropriately, begins processing at the root template, and processes fragments, copying the content appropriately.

1.1. The tangle.xsl Stylesheet

The tangle stylesheet produces text output.

§1.1.1

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                exclude-result-prefixes="src"
                version="1.0">

  §2.1. Initialization

  <xsl:output method="text"/>

  §3.1. The Root Template
  §4.1. Processing Fragments
  §5.1. Copying Elements
</xsl:stylesheet>

1.2. The xtangle.xsl Stylesheet

The xtangle stylesheet produces XML output.

§1.2.1

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                exclude-result-prefixes="src"
                version="1.0">

  §2.1. Initialization

  <xsl:output method="xml"/>

  §3.1. The Root Template
  §4.1. Processing Fragments
  §5.1. Copying Elements
  §6.1. Copy XML Constructs
</xsl:stylesheet>

2. Initialization

The stylesheet initializes the processor by loading its version information (stored in a separate file because it is shared by several stylesheets) and telling the processor to preserve whitespace on all input elements.

The stylesheet also constructs a key for the ID values used on fragments. Because XWEB documents do not have to be valid according to any particular DTD or Schema, the stylesheet cannot rely on having the IDs identified as type ID in the source document. Finally, the top parameter names the ID of the fragment at which tangling starts; it defaults to 'top'.

§2.1: §1.1.1, §1.2.1

  <xsl:include href="VERSION"/>
  <xsl:preserve-space elements="*"/>

  <xsl:key name="fragment"
           match="src:fragment"
           use="@id"/>

  <xsl:param name="top"
             select="'top'"/>

3. The Root Template

The root template begins processing at the root of the XWEB document. It directs the processor to transform the src:fragment element whose ID is the value of the $top parameter.

Source code fragments in the XWEB document are not required to be sequential, so it is necessary to distinguish one fragment as the primary starting point.

§3.1: §1.1.1, §1.2.1

<xsl:template match="/">
  <xsl:apply-templates select="key('fragment', $top)"/>
</xsl:template>
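Because the starting fragment is selected by the $top parameter, a document whose main fragment has a different ID can still be tangled. One way to do that is a small customization layer, sketched here under the assumptions that tangle.xsl sits in the same directory and that the main fragment's ID is main (a made-up value):

```xml
<?xml version="1.0"?>
<!-- Sketch of a customization layer. The ID 'main' and the relative
     path to tangle.xsl are assumptions for illustration. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- xsl:import must come first; definitions in the importing
       stylesheet take precedence over imported ones. -->
  <xsl:import href="tangle.xsl"/>

  <!-- Override the default starting fragment ID ('top'). -->
  <xsl:param name="top" select="'main'"/>

</xsl:stylesheet>
```

Most processors also allow a top-level parameter to be set when the transformation is invoked, which avoids the wrapper entirely.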

4. Processing Fragments

In order to “tangle” an XWEB document, we need only copy the contents of the fragments to the result tree.

Processing src:fragment elements is conceptually easy: just copy their children. However, if we simply used:

<xsl:apply-templates mode="copy"/>

we'd copy the newlines at the beginning and end of a fragment that the author might have added for editing convenience. In environments where whitespace is significant (e.g., Python), this would introduce errors. We must avoid copying the first and last newlines.

§4.1: §1.1.1, §1.2.1

<xsl:template match="src:fragment">
  §4.1.1. Convenience Variables
  §4.2.1. Handle First Node
  §4.4.1. Handle the Middle Nodes
  §4.3.1. Handle Last Node
</xsl:template>

4.1. Convenience Variables

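For convenience, we store the first node, the last node, and all the middle nodes in variables. Note that $last-node is empty when the fragment has only one child, so a lone node is handled exactly once, as the first node.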

§4.1.1: §4.1

  <xsl:variable name="first-node"
                select="node()[1]"/>
  <xsl:variable name="middle-nodes"
                select="node()[position() &gt; 1 and position() &lt; last()]"/>
  <xsl:variable name="last-node"
                select="node()[position() &gt; 1 and position() = last()]"/>

4.2. Handle First Node

Handling the leading newline is conceptually a simple matter of looking at the first character of the first node and skipping it if it is a newline. A slight complexity is introduced by the fact that if the fragment contains only a single text node, the first node is also the last node and we may have to trim off a trailing newline as well. We separate that out as a special case.

4.2.1. Handle a Fragment that Contains a Single Node

If the $first-node is a text node and the fragment contains only a single child, then it is also the last node.

In order to deal with a single text node child, we must address four cases: the node has both leading and trailing newlines, the node has only leading newlines, only trailing newlines, or no newlines at all.

4.2.1.1. More Convenience Variables

For convenience, we calculate whether or not the node in question has leading and/or trailing newlines and store those results in variables.

§4.2.1.1.1: §4.2.1.1

        <xsl:variable name="leading-nl"
                      select="substring($first-node, 1, 1) = '&#xA;'"/>
        <xsl:variable name="trailing-nl"
                      select="substring($first-node, string-length($first-node), 1) = '&#xA;'"/>

4.2.1.2. Handle a Single Node with Leading and Trailing Newlines

If the node has both leading and trailing newlines, trim a character off each end.

§4.2.1.2.1: §4.2.1.1

          <xsl:when test="$leading-nl and $trailing-nl">
            <xsl:value-of select="substring($first-node, 2, string-length($first-node)-2)"/>
          </xsl:when>

4.2.1.3. Handle a Single Node with Only Leading Newlines

If the node has only leading newlines, trim off the first character.

§4.2.1.3.1: §4.2.1.1

          <xsl:when test="$leading-nl">
            <xsl:value-of select="substring($first-node, 2)"/>
          </xsl:when>

4.2.1.4. Handle a Single Node with Only Trailing Newlines

If the node has only trailing newlines, trim off the last character.

§4.2.1.4.1: §4.2.1.1

          <xsl:when test="$trailing-nl">
            <xsl:value-of select="substring($first-node, 1, string-length($first-node)-1)"/>
          </xsl:when>

4.2.1.5. Handle a Single Node with No Newlines

Otherwise, the node has no newlines and it is simply printed.

§4.2.1.5.1: §4.2.1.1

          <xsl:otherwise>
            <xsl:value-of select="$first-node"/>
          </xsl:otherwise>

4.2.2. Handle a First Node with a Leading Newline

If the first node is a text node and begins with a newline, trim off the first character.

§4.2.2.1: §4.2.1

      <xsl:when test="$first-node = text() and substring($first-node, 1, 1) = '&#xA;'">
        <xsl:value-of select="substring($first-node, 2)"/>
      </xsl:when>

4.2.3. Handle a First Node without a Leading Newline

Otherwise, the first node is not a text node or does not begin with a newline, so use the “copy” mode to copy it to the result tree.

§4.2.3.1: §4.2.1

      <xsl:otherwise>
        <xsl:apply-templates select="$first-node"
                             mode="copy"/>
      </xsl:otherwise>
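The branches above nest inside two xsl:choose elements that this section describes but does not show. The following standalone stylesheet is a reconstruction sketch of how they plausibly fit together, not the author's code: the outer xsl:choose, the count(node()) = 1 test, the simplified root template, and the src namespace URI are all assumptions inferred from the prose. It handles only the first node of each fragment.

```xml
<?xml version="1.0"?>
<!-- Reconstruction sketch: the outer xsl:choose and the single-node
     test are assumptions; the inner branches follow this section. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                exclude-result-prefixes="src"
                version="1.0">

  <xsl:output method="text"/>

  <!-- For the sketch, visit every fragment; the real root template
       starts at the fragment named by $top instead. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//src:fragment"/>
  </xsl:template>

  <xsl:template match="src:fragment">
    <xsl:variable name="first-node" select="node()[1]"/>

    <xsl:choose>
      <!-- Special case: a lone text node is both first and last. -->
      <xsl:when test="$first-node = text() and count(node()) = 1">
        <xsl:variable name="leading-nl"
                      select="substring($first-node, 1, 1) = '&#xA;'"/>
        <xsl:variable name="trailing-nl"
                      select="substring($first-node, string-length($first-node), 1) = '&#xA;'"/>
        <xsl:choose>
          <xsl:when test="$leading-nl and $trailing-nl">
            <xsl:value-of select="substring($first-node, 2, string-length($first-node) - 2)"/>
          </xsl:when>
          <xsl:when test="$leading-nl">
            <xsl:value-of select="substring($first-node, 2)"/>
          </xsl:when>
          <xsl:when test="$trailing-nl">
            <xsl:value-of select="substring($first-node, 1, string-length($first-node) - 1)"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="$first-node"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>

      <!-- First of several nodes: only a leading newline matters. -->
      <xsl:when test="$first-node = text() and substring($first-node, 1, 1) = '&#xA;'">
        <xsl:value-of select="substring($first-node, 2)"/>
      </xsl:when>

      <!-- Not a text node, or no leading newline: copy as-is. -->
      <xsl:otherwise>
        <xsl:apply-templates select="$first-node" mode="copy"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The order of the inner xsl:when branches matters: the test for both newlines must come first, because a node with both would also satisfy the leading-only and trailing-only tests.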

4.3. Handle Last Node

Handling the last node is roughly analogous to handling the first node, except that we know this code is only evaluated if the last node is not also the first node.

If the last node is a text node and ends with a newline, strip it off. Otherwise, just copy the content of the last node using the “copy” mode.

§4.3.1: §4.1

    <xsl:choose>
      <xsl:when test="$last-node = text() and substring($last-node, string-length($last-node), 1) = '&#xA;'">
        <xsl:value-of select="substring($last-node, 1, string-length($last-node)-1)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="$last-node"
                             mode="copy"/>
      </xsl:otherwise>
    </xsl:choose>

4.4. Handle the Middle Nodes

The middle nodes are easy: just copy them using the “copy” mode.

§4.4.1: §4.1

    <xsl:apply-templates select="$middle-nodes"
                         mode="copy"/>

5. Copying Elements

Copying elements to the result tree can be divided into four cases: copying passthrough elements, copying normal fragment references, copying fragment references that disable output escaping, and copying everything else.

5.1. Copying src:passthrough

Passthrough elements contain text that is intended to appear literally in the result tree. We use XSLT “disable-output-escaping” to copy it without interpretation:

§5.1.1: §5.1

<xsl:template match="src:passthrough"
              mode="copy">
  <xsl:value-of disable-output-escaping="yes"
                select="."/>
</xsl:template>
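As an illustration, a fragment might use src:passthrough to emit markup that cannot be written as well-formed element content, such as a document type declaration. The fragment below is hypothetical: the book element and the DTD name are invented, and the src namespace URI is assumed.

```xml
<!-- Hypothetical fragment: names and the DTD reference are illustrative. -->
<src:fragment xmlns:src="http://nwalsh.com/xmlns/litprog/fragment" id="top">
<src:passthrough>&lt;!DOCTYPE book SYSTEM "book.dtd"&gt;</src:passthrough>
<book>
  <title>Example</title>
</book>
</src:fragment>
```

With output escaping disabled, xtangle.xsl would be expected to write the string value of src:passthrough as a literal declaration rather than as escaped text. XSLT 1.0 makes support for disable-output-escaping optional, so the result depends on the processor and on the result tree being serialized directly.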

5.2. Copying src:fragref

With one exception, copying fragment references is straightforward: find the fragment that is identified by the cross-reference and process it.

The single exception arises only in the processing of src:fragref elements in the weave.xweb document. There is a single template in the “weave” program that needs to copy a literal src:fragref element to the result tree. That is the only time the §5.3.1. Copying Disable-Output-Escaping Fragment References branch is executed.

§5.2.1: §5.1

<xsl:template match="src:fragref"
              mode="copy">
  <xsl:variable name="node"
                select="."/>
  <xsl:choose>
    §5.3.1. Copying Disable-Output-Escaping Fragment References
    §5.2.1.1. Copying Normal Fragment References
  </xsl:choose>
</xsl:template>

5.2.1. Copying Normal Fragment References

To copy a normal fragment reference, identify what the linkend attribute points to, make sure it is valid, and process it.

§5.2.1.1: §5.2.1

    <xsl:otherwise>
      <xsl:variable name="fragment"
                    select="key('fragment', @linkend)"/>
      §5.2.1.1.1. Fragment is Unique
      §5.2.1.2.1. Fragment is a src:fragment
      <xsl:apply-templates select="$fragment"/>
    </xsl:otherwise>

5.2.1.1. Fragment is Unique

Make sure that the linkend attribute points to exactly one node in the source tree. It is an error if no element exists with that ID value or if more than one exists.

§5.2.1.1.1: §5.2.1.1

      <xsl:if test="count($fragment) != 1">
        <xsl:message terminate="yes">
          <xsl:text>Link to fragment "</xsl:text>
          <xsl:value-of select="@linkend"/>
          <xsl:text>" does not uniquely identify a single fragment.</xsl:text>
        </xsl:message>
      </xsl:if>

5.2.1.2. Fragment is a src:fragment

Make sure that the linkend attribute points to a src:fragment element.

FIXME: this code should test the namespace name of the $fragment

§5.2.1.2.1: §5.2.1.1

<xsl:if test="local-name($fragment) != 'fragment'">
  <xsl:message terminate="yes">
    <xsl:text>Link "</xsl:text>
    <xsl:value-of select="@linkend"/>
    <xsl:text>" does not point to a src:fragment.</xsl:text>
  </xsl:message>
</xsl:if>
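One possible way to address the FIXME, offered only as a sketch, is to let a name test on the self axis compare the namespace name and the local name together. The standalone stylesheet below is not part of tangle.xsl: the check mode and the driver template are hypothetical, and the src namespace URI is assumed.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a stricter variant of the check above. The 'check'
     mode, the root template, and the namespace URI are assumptions. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                exclude-result-prefixes="src"
                version="1.0">

  <xsl:output method="text"/>

  <xsl:key name="fragment" match="src:fragment" use="@id"/>

  <!-- Driver for the sketch: check every reference in the document. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//src:fragref" mode="check"/>
  </xsl:template>

  <xsl:template match="src:fragref" mode="check">
    <xsl:variable name="fragment" select="key('fragment', @linkend)"/>
    <!-- self::src:fragment matches on namespace URI and local name
         together, so a same-named element elsewhere is rejected. -->
    <xsl:if test="not($fragment/self::src:fragment)">
      <xsl:message terminate="yes">
        <xsl:text>Link "</xsl:text>
        <xsl:value-of select="@linkend"/>
        <xsl:text>" does not point to a src:fragment.</xsl:text>
      </xsl:message>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Like the original test, this one also fires when $fragment is empty, because not() of an empty node set is true.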

5.3. Copying Disable-Output-Escaping Fragment References

A src:fragref that specifies disable-output-escaping is treated essentially as if it were any other element. The only exception is that the disable-output-escaping attribute is not copied.

Because tangle and weave are XSLT stylesheets that process XSLT stylesheets, processing src:fragref poses a unique challenge.

In ordinary tangle processing, src:fragref elements are expanded and replaced with the content of the fragment that they point to. But when weave.xweb is tangled, a src:fragref that belongs to the woven output must be copied through literally. The disable-output-escaping attribute provides the hook that allows this.

§5.3.1: §5.2.1

    <xsl:when test="@disable-output-escaping='yes'">
      <xsl:element name="{name(.)}"
                   namespace="{namespace-uri(.)}">
        §5.4.1.1. Copy Namespaces
        <xsl:for-each select="@*">
          <xsl:if test="not(name(.) = 'disable-output-escaping')">
            <xsl:copy/>
          </xsl:if>
        </xsl:for-each>
        <xsl:apply-templates mode="copy"/>
      </xsl:element>
    </xsl:when>
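For example, a fragment in weave.xweb might contain a reference like the one below. The linkend value is made up and the src namespace URI is assumed; the comment describes the result that the branch above would be expected to produce.

```xml
<!-- Input, as written inside a fragment (linkend value invented): -->
<src:fragref xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
             linkend="some-fragment"
             disable-output-escaping="yes"/>
<!-- Expected shape of the tangled result: the same src:fragref element
     with linkend="some-fragment" as its only attribute. The fragment
     it names is not expanded. -->
```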

5.4. Copying Everything Else

Everything else is copied verbatim. This is a five-step process:

  1. Save a copy of the context node in $node so that we can refer to it later from inside an xsl:for-each.

  2. Construct a new node in the result tree with the same qualified name and namespace as the context node.

  3. Copy the namespace nodes on the context node to the new node in the result tree. We must do this manually because the XWEB file may have broken the content of this element into several separate fragments. Breaking things into separate fragments makes it impossible for the XSLT processor to always construct the right namespace nodes automatically.

  4. Copy the attributes.

  5. Copy the children.

§5.4.1: §5.1

<xsl:template match="*"
              mode="copy">
  <xsl:variable name="node"
                select="."/>
  <xsl:element name="{name(.)}"
               namespace="{namespace-uri(.)}">
    §5.4.1.1. Copy Namespaces
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates mode="copy"/>
  </xsl:element>
</xsl:template>

For non-XML source documents, this template will never match because there will be no XML elements in the source fragments.

5.4.1. Copy Namespaces

Copying the namespaces is a simple loop over the nodes on the namespace axis, with one wrinkle.

It is an error to copy a namespace node onto an element if a namespace node is already present for that namespace. The fact that we're running this loop in a context where we've constructed the result node explicitly in the correct namespace means that attempting to copy that namespace node again will produce an error. We work around this problem by explicitly testing for that namespace and not copying it.

§5.4.1.1: §5.3.1, §5.4.1

  <xsl:for-each select="namespace::*">
    <xsl:if test="string(.) != namespace-uri($node)">
      <xsl:copy/>
    </xsl:if>
  </xsl:for-each>
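A case where the copied namespace nodes matter is a prefix that appears only inside an attribute value. The excerpt below is a hypothetical XWEB document (the article wrapper is invented and the src namespace URI is assumed) whose fragment contains an XSLT template:

```xml
<?xml version="1.0"?>
<!-- Hypothetical excerpt: both prefixes are declared on an ancestor
     of the fragment, not on the copied element itself. -->
<article xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<src:fragment id="top">
<xsl:template match="src:fragment">
  <xsl:apply-templates/>
</xsl:template>
</src:fragment>
</article>
```

When xsl:template is rebuilt with xsl:element, the processor must declare the xsl namespace because the new element is in it, and that is why the loop skips it. Nothing obliges the processor to declare src, which occurs only in the match attribute's value. Copying the remaining namespace nodes carries that binding into the result, so the tangled stylesheet can still resolve the prefix.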

6. Copy XML Constructs

In the xtangle.xsl stylesheet, we also want to preserve XML constructs (processing instructions and comments) that we encounter in the fragments.

Note that many implementations of XSLT discard comments before building the source tree, in which case the comments cannot be preserved.

§6.1: §1.2.1

<xsl:template match="processing-instruction()"
              mode="copy">
  <xsl:processing-instruction name="{name(.)}">
    <xsl:value-of select="."/>
  </xsl:processing-instruction>
</xsl:template>

<xsl:template match="comment()"
              mode="copy">
  <xsl:comment>
    <xsl:value-of select="."/>
  </xsl:comment>
</xsl:template>
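As a hypothetical illustration, the fragment below mixes a processing instruction, a comment, and ordinary elements. The element names, the processing instruction, and the src namespace URI are assumptions made for the example.

```xml
<!-- Hypothetical fragment for xtangle.xsl: all names are illustrative. -->
<src:fragment xmlns:src="http://nwalsh.com/xmlns/litprog/fragment" id="top">
<?xml-stylesheet href="style.css" type="text/css"?>
<!-- Generated file: edit the XWEB source instead. -->
<doc>
  <title>Example</title>
</doc>
</src:fragment>
```

xtangle.xsl would be expected to reproduce the processing instruction and, if the processor keeps comments in the source tree, the comment as well. Without these two templates both would be dropped, because XSLT's built-in rules for processing instructions and comments do nothing.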