Office 2007 & IS29500 Conformance…
ZDNet UK this morning carried a story headlined “Microsoft Office 2007 fails OOXML conformance test”, and its subheading explains:
Word documents generated by today’s version of Microsoft Office 2007 do not conform to the Office Open XML standard under development by the International Organization for Standardization, according to tests run by a document standards specialist.
Given that the recent ISO process to standardize OpenXML took in a significant amount of feedback on the details of the specification, and made changes based on that feedback, this shouldn’t really be all that much of a revelation.
Knowing that the specification evolved significantly as it moved from Ecma-376 to IS29500, there should be no dispute that there are differences between the Ecma-376 documents produced by Microsoft Office today and the final IS29500 specification.
The ZDNet news story is based on a blog post that Alex Brown published on his company’s site a few days ago. He opens with:
I was excited to receive from Murata Makoto a set of the RELAX NG schemas for the (post-BRM) revision of OOXML, and thought it would be interesting to validate some real-world content against them, to get a rough idea of how non-conformant the standardisation of 29500 had made MS Office 2007.
Alex goes on to look at how the specification document for Ecma-376 (an Office 2007 document) conforms to the post-BRM details of the specification now known as ISO/IEC IS29500.
He states in his post that he is doing this “to get a rough idea of how non-conformant the standardisation of 29500 had made MS Office 2007”, and, much as you might expect, the document does not conform to a standard that didn’t exist when the document was originally created.
At the end of his post Dr Brown asks “What’s next?”
To repeat the exercise with ISO/IEC 26300:2006 (ODF 1.0) and a popular implementation of OpenDocument. Will anybody be brave enough to predict what kind of result that exercise will have?
We’ll see.
My colleague Doug Mahugh took a closer look at Dr. Brown’s post and offers some of his own commentary on the unsurprising outcome of the tests that Alex did.
He opens with…
It’s an interesting question. Office 2007 supported the ECMA-376 standard, but many changes were made during the evolution from ECMA-376 to IS29500. How many of those changes affect the content in a typical large document?
… and later in the post goes on to say…
The results were predictable: the document was not conformant to either class. Changes made at the BRM are not yet reflected in any existing implementations, and in this case the Ecma spec was created over a year before those changes were made. Here are the totals:
- Validation against the strict schemas: 122,000 errors
- Validation against the transitional schemas: 84 errors
Office 2007 was designed to be highly compatible with existing documents, so it uses features of Open XML that provide backward compatibility, including many of the elements and attributes that were moved to “transitional status” as a result of the BRM. So the test of strict conformance, although interesting, is a bit abstract: it’s testing whether a document conforms to a subset of the spec that was defined after the document was created.
The second number is the more meaningful one. Those are places in the test document where something is done in a way that doesn’t match the final IS29500 spec. Alex provides one specific example, to show the types of changes caught by that test: an attribute with a value of “on” that should say “true” instead, due to “one of the many tidying-up exercises performed at the BRM.”
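To make the “on” versus “true” example concrete, here is a minimal sketch in Python of the kind of check that would surface such values. It is not the RELAX NG validation Alex ran: the sample markup, the example namespace, and the `find_legacy_booleans` helper are all hypothetical, and the check is not schema-aware (it flags any attribute whose value is “on” or “off”).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; element names, attribute names and the
# namespace URI are illustrative only, not taken from either specification.
SAMPLE = """<doc xmlns:w="http://example.invalid/w">
  <w:b w:val="on"/>
  <w:i w:val="true"/>
  <w:u w:val="off"/>
</doc>"""

# Assumed mapping from the older spellings to the boolean spellings.
LEGACY_BOOLEANS = {"on": "true", "off": "false"}


def find_legacy_booleans(xml_text):
    """Return (tag, attribute, old value, suggested value) for each match."""
    findings = []
    # iter() walks the root and all descendants in document order.
    for element in ET.fromstring(xml_text).iter():
        for name, value in element.attrib.items():
            if value in LEGACY_BOOLEANS:
                findings.append((element.tag, name, value, LEGACY_BOOLEANS[value]))
    return findings


findings = find_legacy_booleans(SAMPLE)
for tag, name, old, new in findings:
    print(f"{tag} {name}: {old!r} -> {new!r}")
```

A real conformance test validates the whole document against the published schemas, so it would catch many other kinds of difference as well; this only illustrates the single category of change that Alex’s example describes.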
So where exactly does Microsoft stand on the issue of conformance of Microsoft Office 2007 with the final IS29500 specification?
Chris Capossela (Microsoft’s Senior VP for Office) addressed this in an open letter around a month ago, saying:
We’ve listened to the global community and learned a lot, and we are committed to supporting the Open XML specification that is approved by ISO/IEC in our products.
It is coming… Microsoft is committed to supporting the ISO/IEC standard for OpenXML!
As an aside, Doug’s post closes with a comment about two of this year’s winners of the Google Summer of Code:
Google recently unveiled the winning entries in Google’s Summer of Code 2008, a program that offers student developers stipends to write code for various open source projects. Two of this year’s winners are enhancements to the Open XML implementation in AbiWord.
I’m now seeing reports that a vote has been taken and the standard IS29500 has been adopted. Microsoft voted yes to approve the Standard.
I heard through the grapevine (Twitter) via @ErikaEC (site manager for the MSDN Office Developer Center) that the Standard has been approved, and also that Microsoft has announced “Office 2010”, with a tech preview scheduled for the end of 2009 and release scheduled for the middle of 2010.