I’ve just discovered generateDS (via Uche Ogbuji’s column).

generateDS.py generates Python data structures (for example, class definitions) from an XML Schema document. These data structures represent the elements in an XML document described by the XML Schema. It also generates parsers that load an XML document into those data structures. In addition, a separate file containing subclasses (stubs) is optionally generated. The user can add methods to the subclasses in order to process the contents of an XML document.
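To make that description concrete, here is a hand-written sketch of the general pattern: a class per element type, a builder that loads it from a minidom node, and a stub subclass where application methods go. This is not actual generateDS.py output. The names `Person`, `PersonSub`, `build`, `child_text`, and `parse_people`, as well as the sample document, are all hypothetical and chosen only for illustration.

```python
from xml.dom import minidom

# Hypothetical sample document, invented for this sketch.
SAMPLE = """<people>
  <person id="1"><name>Alice</name><age>30</age></person>
  <person id="2"><name>Bob</name><age>25</age></person>
</people>"""


def child_text(node, tag):
    """Return the text of the first direct child element named `tag`."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == tag:
            return "".join(t.data for t in child.childNodes
                           if t.nodeType == t.TEXT_NODE)
    return None


class Person:
    """Hypothetical stand-in for a generated class (one per element type)."""

    def __init__(self, id=None, name=None, age=None):
        self.id = id
        self.name = name
        self.age = age

    @classmethod
    def build(cls, node):
        # Attributes and simple-typed sub-elements become Python values.
        return cls(
            id=int(node.getAttribute("id")),
            name=child_text(node, "name"),
            age=int(child_text(node, "age")),
        )


class PersonSub(Person):
    """Hypothetical stub subclass: the place to add application methods."""

    def describe(self):
        return "%s (%d)" % (self.name, self.age)


def parse_people(xml_text, person_class=PersonSub):
    """Load every <person> under the root into an instance of person_class."""
    doc = minidom.parseString(xml_text)
    return [person_class.build(n)
            for n in doc.documentElement.childNodes
            if n.nodeType == n.ELEMENT_NODE and n.tagName == "person"]


people = parse_people(SAMPLE)
for person in people:
    print(person.describe())
```

The point of the split is that the base classes could be regenerated whenever the schema changes, while the hand-written methods in the subclasses stay untouched.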

The limitations section notes that:

  • it supports the following XML Schema constructs (which are described as a small subset of XML Schema):
    • Attributes of types xs:string, xs:integer, xs:float, and xs:boolean.
    • Repeated sub-elements specified with maxOccurs="unbounded".
    • Sub-elements of simple types xs:string, xs:integer, and xs:float.
    • Sub-elements of complex types defined separately in the XML Schema document.
  • generateDS.py generates two kinds of parsers: one kind is based on SAX and the other is built on minidom.
    • the SAX parser is described as pretty broken, and the advice is not to use it.
    • both styles of parsers construct instances of the data structures generated by generateDS.py. This means that, even when the SAX parser is used, generateDS.py may not be well-suited for applications that read large XML documents, although what “large” means depends on the hardware involved.
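The last point, that SAX does not help with memory if the handler still builds an object for every element, can be illustrated with a small sketch. This is not generateDS.py code. The `Item`, `BuildingHandler`, and `StreamingHandler` classes and the `<items>` document are hypothetical, and only the standard `xml.sax` module is used.

```python
import xml.sax


class Item:
    """Hypothetical data structure for one <item> element."""

    def __init__(self, value):
        self.value = value


class BuildingHandler(xml.sax.ContentHandler):
    """Builds one Item per <item>, so memory grows with document size."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._text = None  # collects character data while inside <item>

    def startElement(self, name, attrs):
        if name == "item":
            self._text = []

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append(Item(int("".join(self._text))))
            self._text = None


class StreamingHandler(BuildingHandler):
    """Keeps only a running total, so memory stays roughly constant."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def endElement(self, name):
        if name == "item":
            self.total += int("".join(self._text))
            self._text = None


# Hypothetical document with 1000 <item> elements.
doc = (b"<items>"
       + b"".join(b"<item>%d</item>" % i for i in range(1000))
       + b"</items>")

building = BuildingHandler()
xml.sax.parseString(doc, building)

streaming = StreamingHandler()
xml.sax.parseString(doc, streaming)

print(len(building.items), streaming.total)
```

Both handlers read the document as a stream, but only the second avoids holding the whole document's worth of objects at once. A parser that always constructs the full data structure behaves like the first handler regardless of which underlying XML API it uses.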