Wednesday, December 23, 2009

Strict reversibility in XSugar

XSugar is a tool for bidirectional transformations between two file formats. This is particularly useful for providing a common API to configuration files under Linux. For example, here is the result of applying a stylesheet to the /etc/hosts file:

<hosts xmlns="http://usherbrooke.ca/">
<record>
<ipaddr>127.0.0.1</ipaddr>
<canonical>localhost</canonical>
</record>
</hosts>

This file can be converted back to its flat format. But, as you may notice, the indentation of the original file does not appear in the XML, so it is lost: spacing is reset to a default value. The round trip between the hosts file and the XML format preserves the semantics but loses the formatting. Even if nothing is modified, diff will show changes once the file is written back. Once the spacing has been reset, further round trips behave as the identity function, i.e. the strings will be exactly the same.
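To make the loss concrete, here is a minimal Python sketch of the same round trip. It does not use XSugar; the function names and the single-space default separator are assumptions chosen for illustration only.

```python
DEFAULT_SEP = " "  # assumed default spacing used when writing the file back


def hosts_to_records(text):
    """Parse hosts-style lines into (ipaddr, canonical) pairs; spacing is discarded."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            records.append((fields[0], fields[1]))
    return records


def records_to_hosts(records):
    """Write the records back using the default separator."""
    return "".join(f"{ip}{DEFAULT_SEP}{name}\n" for ip, name in records)


original = "127.0.0.1\t   localhost\n"
once = records_to_hosts(hosts_to_records(original))
twice = records_to_hosts(hosts_to_records(once))

print(once == original)  # the first round trip changes the spacing
print(twice == once)     # later round trips leave the text unchanged
```

The parsed records are the same before and after, which is the semantic part of the round trip; only the concrete spacing differs.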

One way to overcome this problem is to add to the XML every element that would otherwise be lost. This can be done by labeling terminal elements and adding the corresponding nodes to the XML part of the stylesheet. For example, this rule loses the optional "a" header:

A = [a]*
X = [x]+
n : [A] [X x] "z" = <x> [X x] </x>

Providing the input "aaaaxxz" gives the following XML:
<x>xx</x>
Converting it back to non-XML yields the string "xxz". Since the empty string matches "[a]*", it is the default string that is returned.

Now, let's label the terminal "A":

A = [a]*
X = [x]+
n : [A a] [X x] "z" = <x> [X x] <a> [A a] </a> </x>

Now we get the XML
<x>
  xx
  <a>aaaa</a>
</x>

and converting it back to the non-XML format yields "aaaaxxz", exactly the same string as the original input.
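The difference between the two rules can be imitated in a few lines of Python. This is only a hand-written model of the two stylesheets above, not XSugar itself: the regular expression and the four function names are assumptions made for the illustration.

```python
import re
from xml.etree import ElementTree as ET

# Hand-written stand-in for the non-XML side: [A] [X] "z"
RULE = re.compile(r"(a*)(x+)z")


def _match(s):
    m = RULE.fullmatch(s)
    if m is None:
        raise ValueError(f"input does not match the rule: {s!r}")
    return m


def to_xml_lossy(s):
    """First rule: only X is labeled, so the 'a' header is dropped."""
    root = ET.Element("x")
    root.text = _match(s).group(2)
    return ET.tostring(root, encoding="unicode")


def from_xml_lossy(xml):
    """A is unlabeled, so the default (empty) string is used for it."""
    root = ET.fromstring(xml)
    return (root.text or "") + "z"


def to_xml_strict(s):
    """Second rule: A is labeled and stored in an <a> child element."""
    m = _match(s)
    root = ET.Element("x")
    root.text = m.group(2)
    ET.SubElement(root, "a").text = m.group(1)
    return ET.tostring(root, encoding="unicode")


def from_xml_strict(xml):
    root = ET.fromstring(xml)
    a = root.find("a")
    return (a.text or "") + (root.text or "") + "z"


print(from_xml_lossy(to_xml_lossy("aaaaxxz")))    # xxz
print(from_xml_strict(to_xml_strict("aaaaxxz")))  # aaaaxxz
```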

Preserving the semantics of the file is the simple bidirectionality property. If, in addition, the stylesheet preserves the concrete representation of the input, I call this strict bidirectionality.

Strict bidirectionality can be achieved by labeling every unlabeled terminal and adding the corresponding element to the XML part. I wrote a small prototype of this algorithm, which augments a given stylesheet accordingly. Hence, any stylesheet can be made strictly bidirectional.
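The augmentation step can be sketched as follows. This is not the prototype mentioned above, only a simplified Python illustration that works on the two sides of a single rule given as strings. The function name make_strict, the generated label names (a0, a1, ...) and the assumption that the XML side ends with a closing tag are all hypothetical.

```python
import re

# Matches only unlabeled items such as [A]; labeled ones like [X x] are skipped.
UNLABELED = re.compile(r"\[\s*(\w+)\s*\]")


def make_strict(terminals, lhs, xml):
    """Label every unlabeled regex terminal in lhs and add a matching element to xml.

    terminals: names declared as regular-expression terminals
    lhs:       non-XML side of the rule, e.g. '[A] [X x] "z"'
    xml:       XML side of the rule, assumed to end with a closing tag
    """
    extra = []

    def add_label(match):
        name = match.group(1)
        if name not in terminals:
            return match.group(0)  # not a regex terminal, leave untouched
        tag = f"{name.lower()}{len(extra)}"  # hypothetical generated label
        extra.append(f"<{tag}> [{name} {tag}] </{tag}>")
        return f"[{name} {tag}]"

    new_lhs = UNLABELED.sub(add_label, lhs)
    if not extra:
        return new_lhs, xml
    # Insert the new elements just before the last closing tag.
    head, sep, tail = xml.rpartition("</")
    return new_lhs, head + " ".join(extra) + " " + sep + tail


lhs, xml = make_strict({"A", "X"}, '[A] [X x] "z"', "<x> [X x] </x>")
print(lhs)  # [A a0] [X x] "z"
print(xml)  # <x> [X x] <a0> [A a0] </a0> </x>
```

A real implementation would also have to avoid clashes between generated labels and labels already present in the stylesheet, which this sketch does not attempt.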

This raises the question: can we statically verify that a stylesheet is strictly bidirectional? Fortunately yes, and it is really simple. We first do the basic check that the stylesheet is bidirectional, and then verify that all regular-expression terminals are labeled. This way, we are sure that every variable concrete string is represented in the XML.
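The second half of that check, verifying that every regular-expression terminal is labeled, might look like the Python sketch below. It assumes the basic bidirectionality check has already been done by the tool, and it works on a simplified string form of the non-XML side of one rule; the function name unlabeled_terminals is hypothetical.

```python
import re

# An item is [NAME] or [NAME label]; the label group is empty when absent.
ITEM = re.compile(r"\[\s*(\w+)(?:\s+(\w+))?\s*\]")


def unlabeled_terminals(terminals, rule_lhs):
    """Return the regex terminals used without a label in the non-XML side of a rule."""
    return [
        name
        for name, label in ITEM.findall(rule_lhs)
        if name in terminals and not label
    ]


terminals = {"A", "X"}
print(unlabeled_terminals(terminals, '[A] [X x] "z"'))    # ['A']
print(unlabeled_terminals(terminals, '[A a] [X x] "z"'))  # []
```

An empty result for every rule means that, under these assumptions, no variable string is silently dropped. Literal terminals such as "z" need no label because their text is fixed.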

Automatic strict bidirectionality for stylesheets, together with static validation of this property, will be useful to provide the behavior a system administrator would expect from a tool that modifies configuration files under Linux. Let's go on!
