Tag Library PDF Documentation Generator

Unlike other people, I like tag libraries: they are nice, compact and deliver the kind of reusability that characterises good OO design. Unfortunately, there are some very basic rules which seem to be skipped over in a lot of the tutorials out there, little things that would ensure that you don’t encounter the problems that turn people away from them. This is not one of the posts where I will delve into the ins and outs of tag libraries. Instead I will use it to ‘scratch an itch’ of mine to do with tag libraries and their documentation.

  1. I like hard-copy, kill-a-tree documentation[1]: something to hold in my hand for reference while marking up some HTML, rather than switching between editor windows on the monitor.
  2. XML, whilst it is a human-readable format, is still hard to read[2]. Most of the XML is elements, with whitespace for formatting. I have become relatively good at looking through a .tld file to find what I need.
  3. I also like cheat-sheets – I have used the JSTL for long enough now to know pretty much all of the tags and functions – however, there are some which I don’t use frequently enough, and for those I still have to refer to some sort of documentation.
  4. XML, whether you like it or not, can be very verbose[3].

Because a Tag Library Descriptor (.tld) file is just a plain old XML file, it is not easy to add markup to the <description /> element for the individual tags and the header information. Of course, you could add a CDATA section to the element, which would then allow almost any type of formatting; however, you will run into the following problems:

  1. Hard for a human to parse when looking at the text file
  2. Only contains one type of markup – be it HTML or XSL-FO


From the above list, the markup side should satisfy the following:

  • Be human readable in text format (e.g. when looking at it through an IDE, or text editor)
  • Be usable for multiple output formats[4]

On the project side, the following goals should be satisfied:

  • Be easily extensible
  • Work both in stand-alone and Ant modes
  • Be easy to install and use
  • Well documented

Which brings us to…


Rather than having inline CDATA sections, which only support a single markup format, a light-weight markup was designed to be human readable and easy to process for PDF, HTML or any other output format.

See the example page for more information.

Implementation details

Having given some thought to how this should be implemented, I have the basic task set up and running. However, I feel that I may have broken the ‘easy to install and use’ mantra, as it is quite onerous to satisfy the requirements for the FOP conversion – oh well.

There are two ways that I can think of to parse the text.

The first involves tokenising the input element and writing out valid XML into a working copy. The tokenising approach is far more powerful and, being written in Java, gives me greater control over everything. I have already written quite a few tokenisers, so it shouldn’t be too much of a challenge to write another one – just mind-numbingly similar to other code.
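To make the tokenising idea concrete, here is a minimal sketch, not the project's actual code. It assumes a made-up markup in which *text* means bold and _text_ means emphasis; the real light-weight markup is described on the example page and may look nothing like this. The class name, token types and output element names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DescriptionTokeniser {
    // Hypothetical token types for a made-up markup: *bold* and _emphasis_.
    enum Type { TEXT, BOLD, EMPHASIS }

    static final class Token {
        final Type type;
        final String value;
        Token(Type type, String value) { this.type = type; this.value = value; }
    }

    // Split the description text into plain-text and marked-up tokens.
    static List<Token> tokenise(String input) {
        List<Token> tokens = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '*' || c == '_') {
                int close = input.indexOf(c, i + 1);
                // Only treat it as markup if a closing marker exists and the span is not empty.
                if (close > i + 1) {
                    if (text.length() > 0) {
                        tokens.add(new Token(Type.TEXT, text.toString()));
                        text.setLength(0);
                    }
                    Type type = (c == '*') ? Type.BOLD : Type.EMPHASIS;
                    tokens.add(new Token(type, input.substring(i + 1, close)));
                    i = close + 1;
                    continue;
                }
            }
            // Unmatched markers fall through and are kept as literal text.
            text.append(c);
            i++;
        }
        if (text.length() > 0) {
            tokens.add(new Token(Type.TEXT, text.toString()));
        }
        return tokens;
    }

    // Escape the characters that would break well-formed XML.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Write the tokens out as XML, ready for a later PDF or HTML stylesheet.
    static String toXml(String input) {
        StringBuilder xml = new StringBuilder("<description>");
        for (Token t : tokenise(input)) {
            switch (t.type) {
                case BOLD:
                    xml.append("<b>").append(escape(t.value)).append("</b>");
                    break;
                case EMPHASIS:
                    xml.append("<em>").append(escape(t.value)).append("</em>");
                    break;
                default:
                    xml.append(escape(t.value));
            }
        }
        return xml.append("</description>").toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml("Use *c:out* to print a value, _not_ a scriptlet."));
    }
}
```

The point of the intermediate XML is that the same working copy could then be fed to different stylesheets, one per output format.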

The second… wait for it… involves using XSLT only, with a helluva lot of recursive[5] parsing algorithms. This is going to be my preferred way, as it involves a little bit of a challenge without the hack-i-ness of using temporary files.
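For a flavour of what recursion in XSLT 1.0 looks like, the sketch below embeds a tiny stylesheet in a Java class and runs it with the JDK's built-in transformer. It is an illustration only: the *bold* markup, the template name and the class name are assumptions, not the project's real stylesheet. The named template peels one marked-up span off the front of the string and then calls itself on the remainder.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class RecursiveXsltSketch {
    // Hypothetical stylesheet: turns *text* into <b>text</b> by recursion.
    // The marker lives in a variable so no nested quotes are needed.
    static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "  <xsl:output method='xml' omit-xml-declaration='yes'/>",
        "  <xsl:variable name='mark'>*</xsl:variable>",
        "  <xsl:template match='description'>",
        "    <description>",
        "      <xsl:call-template name='bold'>",
        "        <xsl:with-param name='text' select='string(.)'/>",
        "      </xsl:call-template>",
        "    </description>",
        "  </xsl:template>",
        "  <xsl:template name='bold'>",
        "    <xsl:param name='text'/>",
        "    <xsl:variable name='rest' select='substring-after($text, $mark)'/>",
        "    <xsl:choose>",
        "      <xsl:when test='contains($rest, $mark)'>",
        "        <xsl:value-of select='substring-before($text, $mark)'/>",
        "        <b><xsl:value-of select='substring-before($rest, $mark)'/></b>",
        "        <xsl:call-template name='bold'>",
        "          <xsl:with-param name='text' select='substring-after($rest, $mark)'/>",
        "        </xsl:call-template>",
        "      </xsl:when>",
        "      <xsl:otherwise>",
        "        <xsl:value-of select='$text'/>",
        "      </xsl:otherwise>",
        "    </xsl:choose>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Apply the stylesheet to an XML string and return the serialised result.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<description>Use *c:out* and *c:if*.</description>"));
    }
}
```

Each pass handles one pair of markers, and a lone unmatched marker is left in the text untouched. Everything happens in memory, which is the attraction over a tokeniser that writes a working copy to disk.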

Ant integration

First, the library path for all of the required FOP libraries:

<!-- Classpath holding the FOP jars that the task needs -->
<path id="taskpath">
    <fileset dir="${util.lib.dir}">
        <include name="*.jar" />
    </fileset>
</path>

Now for the actual task definition and invocation itself:

<taskdef name="tagdoc" classname="synapticloop.ant.TagDocTask" classpathref="taskpath" classpath="${dist.dir}/${ant.project.name}.jar" />
<tagdoc verbose="false" pdf="true" outputDir="tlddocs">
    <fileset dir="src/main/java/">
        <include name="**/*.*"/>
    </fileset>
    <fileset dir="src/test">
        <include name="**/*.*"/>
    </fileset>
</tagdoc>

Command line usage

This use case scenario may be dropped – I am not really sure that it is necessary.


  1. That being said – I very much do like the HTML documentation set for JavaDoc – go figure.
  2. Dostoyevsky writes some great books – but they are hard to read.
  3. I just wanted to add this as I don’t think I have mentioned it enough!
  4. Even though this only generates PDF output for files, from an extensibility aspect it should be able to output to almost anything – be it HTML, XML or even an image format. For HTML output see Tag Library Documentation Generator.
  5. You haven’t really lived until you have done recursion through XSLT.
