[OPEN-ILS-DOCUMENTATION] More semantic markup - good project for an enthusiastic volunteer

Dan Scott dan at coffeecode.net
Sat Sep 11 01:47:29 EDT 2010


Hey:

It's awesome to see so much happening with the documentation! I
created a Gitorious project for the Evergreen documentation over at
http://gitorious.org/evergreen-documentation if anyone is interested
in what I'm up to. I'll track Robert's repo, but hope to be able to
help out as time permits.

In a few of the sections I poked at, there's currently a lot of use of
<emphasis> tags; these are presentation-oriented rather than semantic.
For the purposes of generating HTML and PDF, that's definitely cool in
the short term, but over the longer term adopting more semantic markup
gives us the ability to have more fine-grained control over the
stylesheets and resulting output. It also makes it a bit easier to
focus on marking up _what_ the thing is rather than trying to remember
how it's supposed to look.

As a couple of quick examples, in one file <emphasis role="bold"> is
used to mark up usernames, commands, file paths, and file names;
<emphasis> is used to mark up command options (--foo) and commands
(/bin/foo); and double-quotes are also used to denote commands. But
DocBook gives us ways to differentiate between all of those things:

  * <command> is used for command names
(http://www.docbook.org/tdg5/en/html/command.html)
  * <filename> is used for file paths and file names
(http://www.docbook.org/tdg5/en/html/filename.html)
  * <option> is used for command options
(http://www.docbook.org/tdg5/en/html/option.html)
  * <systemitem class="username"> is used for user names
(http://www.docbook.org/tdg5/en/html/systemitem.html)
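
As a rough sketch of the difference (the sentence, the opensrf user
name, and the /tmp/foo.log path are made up for illustration, not
taken from the actual docs), presentation-oriented markup like this:

<para>As the <emphasis role="bold">opensrf</emphasis> user, run
<emphasis>/bin/foo</emphasis> with the <emphasis>--foo</emphasis> option
and check <emphasis role="bold">/tmp/foo.log</emphasis>.</para>

could become something like:

<para>As the <systemitem class="username">opensrf</systemitem> user, run
<command>/bin/foo</command> with the <option>--foo</option> option
and check <filename>/tmp/foo.log</filename>.</para>

The text is identical; only the element names change, so a stylesheet
can later decide how each kind of thing should look.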

In other areas, values that users are expected to input are not marked
up at all - but could be marked up as <userinput>. The contents of
<screen> sections could often be broken down into <prompt> and
<userinput> elements. I
have to give kudos to Sitka for their DocBook style guide
(http://coconut.pines.bclibrary.ca:21080/docbook/Style/draft/html/ch01.html)
- that kind of resource goes a long way towards keeping a single body
of documentation coherent.
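
To illustrate the <screen> suggestion above, here is a minimal sketch
(the $ prompt and the command line are hypothetical placeholders, not
from any existing file):

<screen>
<prompt>$</prompt> <userinput>/bin/foo --foo</userinput>
</screen>

That separates what the reader sees on the terminal from what they are
expected to type.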

Now, it's a bit hypocritical for a person like me who has been writing
primarily in AsciiDoc for the past year to suggest using more semantic
markup. I'm a big fan of getting things done, though, and the DIG is
definitely getting things done, so keep writing, first and foremost!
As the subject suggests, someone else can always come along and
enhance the markup, but pulling quality content together is job #1.
I'm willing to tackle one big chunk of content for a semantic cleanup;
I'm looking at the installation document, as I've been one of the
maintainers of that information in the wiki and am familiar with the
turf.

Ah, and one quick tip that might save you some pain: if you have long
<programlisting> or <screen> sections containing a lot of XML, and you
find yourself changing every < to &lt; to avoid XML errors, you can
wrap the whole thing in <![CDATA[ .... ]]> instead. The XML parser
treats everything inside as literal text rather than markup (the one
thing the content can't contain is the closing ]]> sequence itself).
For example:

<programlisting language="xml">
<![CDATA[
<!-- Example of an app-specific setting override -->
<opensrf.persist>
  <app_settings>
    <dbfile>/tmp/persist.db</dbfile>
  </app_settings>
</opensrf.persist>
]]>
</programlisting>

It's a lot easier to copy and paste that as a writer than to copy and
paste it, then run a regular expression to escape the worrisome
characters!
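
For comparison, the hand-escaped version of that same listing would
look roughly like this (just a sketch of the equivalent markup; it
should render the same way, it's only harder to read and maintain):

<programlisting language="xml">
&lt;!-- Example of an app-specific setting override --&gt;
&lt;opensrf.persist&gt;
  &lt;app_settings&gt;
    &lt;dbfile&gt;/tmp/persist.db&lt;/dbfile&gt;
  &lt;/app_settings&gt;
&lt;/opensrf.persist&gt;
</programlisting>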

