Groklaw reports that the (US) State of Massachusetts is going to require all agencies to store public documents in nonproprietary formats such as HTML or PDF.

Acceptable formats, according to the state's Information Technology Division, are now Rich Text Format v. 1.7 (.rtf); Plain Text Format (.txt); Hypertext Document Format (.htm); Portable Document Format (.pdf), reference version 1.5; and Extensible Markup Language (XML) v. 1.0 (Third Edition) or v. 1.1 "when necessary".
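
As a rough illustration of how an archive might screen files against this list, here is a minimal Python sketch. It looks only at filename extensions. The names `ACCEPTABLE_EXTENSIONS`, `is_acceptable`, and `unacceptable` are made up for this example, and treating `.html` and `.xml` as acceptable variants is an assumption on my part. It does not check format versions (such as PDF 1.5) or the actual file contents.

```python
from pathlib import Path

# Extensions taken from the list above; ".html" and ".xml" are assumed variants.
ACCEPTABLE_EXTENSIONS = {".rtf", ".txt", ".htm", ".html", ".pdf", ".xml"}


def is_acceptable(filename: str) -> bool:
    """Return True if the file's extension is on the acceptable list."""
    return Path(filename).suffix.lower() in ACCEPTABLE_EXTENSIONS


def unacceptable(filenames):
    """Return the filenames whose extensions are not on the list."""
    return [name for name in filenames if not is_acceptable(name)]
```

For example, `unacceptable(["minutes.pdf", "budget.doc", "notes.TXT"])` would flag only `budget.doc`, since the extension comparison is case-insensitive.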

This article has some meeting notes which encapsulate the issue nicely:

"It should be reasonably obvious for a lay person who looks at the concept of Public Documents that we've got to keep them independent and free forever because it is an overriding imperative of the American democratic system. That we cannot have our public documents locked up in some kind of proprietary format or locked up in a format that you need to get a proprietary system to use some time in the future. So, one of the things that we're incredibly focused on is insuring that the public records remain independent of underlying systems and applications insuring their accessibility over very long periods of time. In the IT business a long period of time is about 18 months, in government it's about 300 years, so we have slightly different perspective."

There is even some comment in the article that Microsoft may come to the party and alter the license terms associated with its (Word 2003) XML document format. The issue here is that while the document format itself is open, the semantic interpretation of it is wrapped up in a very restrictive license. (There is considerable discussion of this licensing issue in the Groklaw article.) The bottom line is that if Microsoft chose to enforce those license conditions, it is conceivable that a piece of open source software which read Word 2003 documents could be found to be illegal …

… meanwhile, until the MS Office schemas are publicly available under an appropriate license, there will be no Microsoft documents held in our long-term archive without PDF copies … (or, of course, until MS moves to a proper semantic markup for its documents).

In the meantime, well done Massachusetts!