Thursday, September 10, 2009

Configuration as XML

Augeas provides an XPath-like interface and API to access and modify configuration files. But there are a few limitations:
  • Augeas is limited to regular grammars, so it can't parse documents with arbitrarily nested structure. I looked into whether regular approximations of context-free grammars could be used, but in this case that's not possible. A regular expression matcher uses only a finite automaton, and for a general context-free grammar we need a stack to keep track of the nesting level of the document.
  • Augeas doesn't produce a complete XML document from a configuration file, and hence can't take advantage of all the XML processing libraries available.
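To make the stack argument in the first point concrete, here is a minimal sketch in Python. It is not Augeas code; the function name and the bracket set are my own assumptions for illustration. It checks that blocks in a config-like text are properly nested, and the `stack` list is exactly the piece of memory a finite automaton doesn't have.

```python
# Hypothetical bracket pairs for a config-like syntax (an assumption,
# not tied to any particular file format).
PAIRS = {"{": "}", "[": "]", "(": ")"}
CLOSERS = set(PAIRS.values())


def is_well_nested(text):
    """Return True if every opening bracket is closed in the right order."""
    stack = []  # expected closing brackets, innermost last
    for ch in text:
        if ch in PAIRS:
            # Entering a nested block: remember which closer we expect.
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            # Leaving a block: it must match the innermost open one.
            if not stack or stack.pop() != ch:
                return False
    # Anything left on the stack is an unclosed block.
    return not stack


print(is_well_nested("server { listen [80] }"))  # True
print(is_well_nested("server { listen [80 } ]"))  # False
```

The stack can grow without bound as the nesting gets deeper, which is why no fixed number of automaton states is enough for the general case.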
We need a parser that can handle a general context-free grammar. Grammars should be easy to write, and LR or LALR parsers are too hard to use, since the grammar must be written to avoid ambiguities and some types of recursion. The Earley parsing algorithm handles any context-free grammar, including ambiguous and recursive ones.
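As an illustration of why Earley is attractive here, this is a minimal sketch of an Earley recognizer in Python. It is my own toy version, not the XSugar implementation: the function name, the grammar encoding (a dict from nonterminal to a list of tuples) and the sample grammar are all assumptions. Terminals are single characters, so there is no separate tokenizer, and it only answers yes or no without building a parse tree.

```python
def earley_recognize(grammar, start, text):
    """Return True if `text` can be derived from `start`.

    `grammar` maps each nonterminal to a list of bodies; a body is a
    tuple of symbols. Any symbol that is not a key of `grammar` is
    treated as a single-character terminal.
    """
    # chart[i] holds items (head, body, dot, origin) valid at position i.
    chart = [set() for _ in range(len(text) + 1)]
    for body in grammar[start]:
        chart[0].add((start, body, 0, 0))

    for i in range(len(text) + 1):
        # Predict and complete until chart[i] stops growing.
        changed = True
        while changed:
            changed = False
            for head, body, dot, origin in list(chart[i]):
                if dot < len(body):
                    sym = body[dot]
                    if sym not in grammar:
                        continue  # terminal: handled by the scan step
                    # Predict: expand the nonterminal after the dot.
                    new = {(sym, b, 0, i) for b in grammar[sym]}
                else:
                    # Complete: advance items that were waiting for `head`.
                    new = {
                        (h, b, d + 1, o)
                        for (h, b, d, o) in chart[origin]
                        if d < len(b) and b[d] == head
                    }
                if not new <= chart[i]:
                    chart[i] |= new
                    changed = True
        # Scan: consume the next input character.
        if i < len(text):
            for head, body, dot, origin in chart[i]:
                if dot < len(body) and body[dot] == text[i]:
                    chart[i + 1].add((head, body, dot + 1, origin))

    return any(
        h == start and d == len(b) and o == 0
        for (h, b, d, o) in chart[len(text)]
    )


# Hypothetical grammar for nested blocks. It is both ambiguous and
# left-recursive ("S -> S S"), which an LALR generator would reject.
GRAMMAR = {
    "S": [("S", "S"), ("{", "S", "}"), ("{", "}"), ("a",)],
}

print(earley_recognize(GRAMMAR, "S", "{a{a}}"))  # True
print(earley_recognize(GRAMMAR, "S", "{a"))      # False
```

The point of the sample grammar is that it can be written the obvious way, without refactoring it to remove ambiguity or left recursion.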

The XSugar project is exactly what I was looking for. First, it implements a tokenless Earley parser that has relatively acceptable performance on config files. Second, XSugar is able to do bidirectional transformation between a concrete file and an XML document. There are a few issues that must be resolved, though.
  • The bidirectional relation doesn't preserve the formatting of the config file. The reversibility property is hence only approximate: a round trip yields the same result except for spaces and indentation. You have to preserve formatting manually in the stylesheet, and this can be tedious. There is also no way to verify that the stylesheet captures every character of the input. For config files, a strict round trip that reproduces the file exactly is required.
  • Ignorable elements, like nodes that keep spaces and indentation, have to be present in the XML file, otherwise the unparsing fails. The problem is that clients that modify the XML will have to add formatting nodes. One of the main benefits of using XML was to abstract away formatting, and this requirement on the XML breaks that abstraction. Ignorable elements must be optional, and when they are not provided, a default value should be used.
  • The order of elements matters in the XML. If nodes are not provided in the right order, the unparsing fails. The client has to know in which order to provide elements, and it would be better if the client did not have to worry about it.
Those are the main issues I see standing in the way of a new day for configuration management.
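To show what the last two points could look like in practice, here is a rough sketch in Python using the standard `xml.etree.ElementTree` module. It is not how XSugar works; the template, the element names (`indent`, `key`, `sep`, `value`, `eol`) and the function name are all hypothetical. The unparser walks a fixed template, looks each child up by name rather than by position, and falls back to a default for ignorable elements that the client left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical template for one "key = value" line: (tag, default text).
# A default of None marks an element the client must provide; the
# others are ignorable formatting elements with a default value.
ENTRY_TEMPLATE = [
    ("indent", ""),
    ("key", None),
    ("sep", " = "),
    ("value", None),
    ("eol", "\n"),
]


def unparse_entry(entry):
    """Turn an <entry> element back into one line of config text."""
    parts = []
    for tag, default in ENTRY_TEMPLATE:
        # Look the child up by name, so the order in the XML is free.
        child = entry.find(tag)
        if child is not None:
            parts.append(child.text or "")
        elif default is not None:
            # Ignorable element missing: use the default formatting.
            parts.append(default)
        else:
            raise ValueError(f"missing required element <{tag}>")
    return "".join(parts)


# The client gives only the meaningful nodes, in the "wrong" order.
minimal = ET.fromstring("<entry><value>8080</value><key>port</key></entry>")
print(repr(unparse_entry(minimal)))  # 'port = 8080\n'

# Formatting nodes produced by the parser are still honored if present.
parsed = ET.fromstring(
    "<entry><key>port</key><sep>=</sep><value>8080</value></entry>"
)
print(repr(unparse_entry(parsed)))  # 'port=8080\n'
```

With this kind of lookup, a client only has to deal with formatting nodes when it actually cares about formatting.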

To test those concepts, I created a new project called Noesis. It means "insight", which I thought was a fitting name for this project. More news soon.
