Sunday, November 29, 2009

Textml - an xml shorthand for your text files

The idea

A shorthand that can be used to mark up text in text files but which is easier to handle than writing full-blown xml.

Why bother?

Sometimes I'd like to be able to document things in a text file - from a todo list, to a plan, to documentation on something. I'd like to be able to add slightly richer structures to my text documents. If you use text documents and text editors, you might find this a reasonable idea.

Quickstart

You create tags like this:

  @plan {
    Here is a plan.
  }
  => <plan>Here is a plan.</plan>

Add attributes like this:

  @plan {
    @date: 24-Nov-09
    Here is a plan.
  }
  => <plan date="24-Nov-09" >Here is a plan.</plan>
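
The two rules above can be sketched as a tiny converter. This is only an illustration, assuming one tag, attribute or closing brace per line; the function name textml_to_xml and the regular expressions are made up for this example and are not part of any existing tool.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Assumed line shapes for this sketch: "@name {" opens a tag,
# "@name: value" sets an attribute, "}" closes the current tag.
TAG_OPEN = re.compile(r"^@([\w-]+)\s*\{$")
ATTR = re.compile(r"^@([\w-]+):\s*(.*)$")


def textml_to_xml(source):
    """Convert a small subset of textml to an XML string."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    pos = 0

    def parse_block(name):
        nonlocal pos
        attrs, children = [], []
        while pos < len(lines):
            line = lines[pos]
            pos += 1
            if line == "}":
                break
            m_open = TAG_OPEN.match(line)
            m_attr = ATTR.match(line)
            if m_open:
                children.append(parse_block(m_open.group(1)))  # nested tag
            elif m_attr:
                attrs.append((m_attr.group(1), m_attr.group(2)))
            else:
                children.append(escape(line))  # plain text content
        attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs)
        return f"<{name}{attr_text}>{' '.join(children)}</{name}>"

    out = []
    while pos < len(lines):
        line = lines[pos]
        pos += 1
        m = TAG_OPEN.match(line)
        out.append(parse_block(m.group(1)) if m else escape(line))
    return "\n".join(out)


print(textml_to_xml("@plan {\n  @date: 24-Nov-09\n  Here is a plan.\n}"))
# prints: <plan date="24-Nov-09">Here is a plan.</plan>
```

The output omits the space before ">" shown in the hand-written example; the two forms are equivalent xml.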

I often want to create a nested context whose name is long-winded, for example:

  @tasks {
    @finish-by: 24-Nov-09

    hosting agreement {
      - do X
    }
  }
  

In this example "hosting agreement" is not necessarily something I want to be a tag. It's not so much the tag as the content of the tag or one of its attributes. As a result, I leave off the '@'.

Now it's up to the context to define how we interpret this structure. Suppose we were writing an xml converter for this textml document. We might then define a rule like this: "For non-'@' sections under @tasks, convert to:"

   <task finish-by="24-Nov-09" >
     <issue>hosting agreement</issue>
     <ul>
       <li>do X</li>
     </ul>
   </task>
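
A rule like this could be written as a small function. The sketch below assumes the section has already been split into a title, its bullets and the attributes inherited from @tasks; the name section_to_task and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET


def section_to_task(title, bullets, inherited_attrs):
    """Apply the rule: a non-'@' section under @tasks becomes a <task>."""
    task = ET.Element("task", inherited_attrs)   # e.g. finish-by from @tasks
    ET.SubElement(task, "issue").text = title    # the section name
    ul = ET.SubElement(task, "ul")
    for bullet in bullets:                       # each "- ..." line
        ET.SubElement(ul, "li").text = bullet
    return task


task = section_to_task("hosting agreement", ["do X"], {"finish-by": "24-Nov-09"})
print(ET.tostring(task, encoding="unicode"))
# prints: <task finish-by="24-Nov-09"><issue>hosting agreement</issue><ul><li>do X</li></ul></task>
```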
  

A lot of the time you probably don't ever want to convert textml, so your brain applies something similar to the above rule, although without all the detail. In your mind you simply say "This is a section about ... ".

I use this shorthand a lot, since I don't want to write in the extra context every time I start a new section, and textml is already doing a good job of providing the context within which the shorthand is used.

Paragraphs

The main motivation for textml was to work with text files, so the issue of whether to insert paragraph tags around each paragraph is a non-issue. You don't bother. You might use a blank line to break them up instead.

However, if you wanted to convert to real xml, you might want to consider some sort of hinting system. A hint would work within the tag where you set it and would probably be inherited by nested tags. This hint would direct the parser to insert paragraph tags under certain conditions, or not to.

  
    @some-tag {
      #'auto-paragraph 1

      Sentence 1.
      Sentence 2.

      Sentence 3.
    }
    =>
    <some-tag>
      <p>Sentence 1.  Sentence 2.</p>
      <p>Sentence 3.</p>
    </some-tag>
  
  

In the above example "#'" signifies a hint which has been set for @some-tag. This hint tells the parser to generate paragraph tags around content and to start a new paragraph after 1 empty line.
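
One way such a hint might behave is sketched below, assuming the parser hands over the raw content lines of the tag. The function auto_paragraph and its blank_lines parameter are invented for this illustration; blank_lines plays the role of the number after 'auto-paragraph.

```python
from xml.sax.saxutils import escape


def auto_paragraph(lines, blank_lines=1):
    """Wrap runs of text in <p> tags, starting a new paragraph
    after at least `blank_lines` consecutive empty lines."""
    paragraphs, current, blanks = [], [], 0
    for raw in lines:
        line = raw.strip()
        if not line:
            blanks += 1
            continue
        if current and blanks >= blank_lines:
            paragraphs.append(current)  # enough empty lines: close paragraph
            current = []
        blanks = 0
        current.append(line)
    if current:
        paragraphs.append(current)
    return ["<p>" + escape(" ".join(p)) + "</p>" for p in paragraphs]


print(auto_paragraph(["Sentence 1.", "Sentence 2.", "", "Sentence 3."]))
# prints: ['<p>Sentence 1. Sentence 2.</p>', '<p>Sentence 3.</p>']
```

With blank_lines=2 the same input would come out as a single paragraph, since one empty line is no longer enough to split it.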

Bullets

Before I used textml, I used bullets - usually hyphens or asterisks in my text files. The same applies in textml. Use hyphens to create unordered lists. Indentation is assumed to be important. Note how the 2nd @description is indented the same as the string "item 1".

  - item 1
    - item 1.1
      @description {
        Description about item 1.1
      }
    @description {
      Description about item 1.
    }
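
To show how indentation could decide ownership, here is a rough sketch. It assumes hyphen bullets of the form "- text", and that an @description block belongs to the item whose text starts in the same column as the "@". The name parse_bullets and the dictionary layout are made up for this example.

```python
def parse_bullets(text):
    """Build a tree of bullet items; @description blocks attach to the
    item whose text column matches their indentation."""
    root = {"text": None, "items": [], "descriptions": []}
    stack = [(-1, root)]  # (column where the item's text starts, node)
    lines = iter(text.splitlines())
    for raw in lines:
        stripped = raw.lstrip()
        if not stripped:
            continue
        indent = len(raw) - len(stripped)
        # Drop items whose text starts to the right of this line.
        while stack[-1][0] > indent:
            stack.pop()
        owner = stack[-1][1]
        if stripped.startswith("- "):
            node = {"text": stripped[2:].strip(), "items": [], "descriptions": []}
            owner["items"].append(node)
            stack.append((indent + 2, node))  # text starts after "- "
        elif stripped.startswith("@description"):
            body = []
            for inner in lines:  # consume lines up to the closing brace
                if inner.strip() == "}":
                    break
                body.append(inner.strip())
            owner["descriptions"].append(" ".join(body))
    return root


sample = """\
  - item 1
    - item 1.1
      @description {
        Description about item 1.1
      }
    @description {
      Description about item 1.
    }
"""
tree = parse_bullets(sample)
print(tree["items"][0]["descriptions"])
# prints: ['Description about item 1.']
```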
  

The following attempts to make it more explicit and to remove reliance on the significance of indentation, but I don't think it will catch on:

  -{ item 1
     -{ item 1.1
        @description {
          Description about item 1.1
        }
      }
      @description {
        Description about item 1.
      }
   }
  

Numbered bullets might take the following form:

  # item 1
    # item 1.1
      @description {
        Description about item 1.1
      }
    @description {
      Description about item 1.
    }
  

Converting to xml

The point of textml is to allow you to add structure to your text files. For the sorts of things I work on, I don't need to publish what I'm doing into xml.

However, it would be fairly straightforward to write a script or parser to generate the appropriate xml. You could then transform that xml using xslt into other xml-based formats such as xhtml.

Benefits

  • It's easier to date things, sign things, or otherwise add some sort of metadata to give you extra information about what you were doing in your text file, when, why and so on.
  • Makes it easier to explore using xml to create structure or semi-structure. I'm interested in this because I'm looking at xml to represent certain types of knowledge and learning - think of dialectics, assignments, forums, q & a. Textml makes it easy to play with some of this straight away in your text editor.
  • Once you've defined a "context" - a parent tag or hierarchy with attributes - you don't have to constantly remind your audience about it. Textml allows you to create structure within your text documents which defines context; once that's out of the way you can concentrate on the specific issue. For instance, suppose my tag is:
      @plan {
        @start-date: 24-Nov-09
        - Do X
        - Do Y
      }
     
    A context of a plan and its start date has already been set. Everything I write is within that context.

Limitations

  • A text file is not a web-based document so you can't show images.
  • You are confined to the linear format of a text file; in a browser you can potentially break things up a bit more as well as insert hyperlinks. If you want to do that, you should work with real xml or xhtml in a browser or other capable user-agent.