Archive for Standards

OpenXML PowerTools released to CodePlex

12 June 2008

I see Eric White is carrying details of the release of the OpenXML Powertools on his blog today.

If you want to be able to generate OpenXML documents on the server, without an installation of Microsoft Office, then this is the way to do it.

You’ll find the details on Eric’s blog by following this link;

Processing Open XML documents server-side using PowerShell is a powerful approach for creating, modifying, and transforming Open XML documents. The PowerTools for Open XML are examples and guidance that show how to do this. They consist of PowerShell cmdlets, and a number of example scripts that demonstrate the use of the cmdlets. Examples include automated word processing document and spreadsheet generation, and preparing documents for distribution external to a company, including removing comments, accepting revisions, applying a uniform theme to them, and applying a watermark to them.

His blog links to a video that explains how to install and use PowerTools for OpenXML in conjunction with the release version of the OpenXML SDK.

The three scenarios covered in the linked video are;

  • Developers who need to automatically generate documents programmatically. For example, developers may need to generate word processing documents from an XML file containing customer data.
  • IT professionals who often need to send reports, charts, and spreadsheets that summarize the state of their network, servers, computers, and more.
  • Information workers who need to prepare documents for publication outside of their company. To present a consistent appearance of documents, information workers may want to accept all revisions in the document, remove all comments, give a consistent style to the documents, digitally sign them, add a watermark, and more.

He also has a collection of other information that you will find helpful if you are looking for a way to generate or play with OpenXML documents on the server, or on a desktop without Microsoft Office installed.

OpenXML SDK v1.0 Now Available

10 June 2008

Back in March of this year, Doug Mahugh talked about the roadmap for the OpenXML SDK, an important set of tools that will allow developers to quickly develop applications that read and write OpenXML (ECMA-376) documents.


This first version of the SDK, which is available as of today, includes a set of APIs capable of manipulating Open XML Formats at the part level. This version of the SDK is a fully supported release that developers can use to build and deploy shipping solutions.

Version 2 will contain all the necessary components of the Open XML API architecture and the first CTP will be available in late summer on the MSDN download site.

Hundreds of solutions have been created by developers worldwide building on the 2007 Microsoft Office system. There are currently nearly 150 partners who have developed Open XML solutions. You can see profiles of some of them in the MSDN Partner directory, including Captaris, Intergen and Xinnovation just to name a few.

Through the Open XML SDK’s sample code and how-to articles on the programming object model, developers will be able to decrease their development time for scenarios such as:

  • Creating documents programmatically
  • Customizing parts within documents
  • Adding and inspecting custom XML within documents
  • Working with and customizing document properties
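To make "part level" a little more concrete: an Open XML file is a ZIP package whose entries are the parts (a content-types listing, document properties, the main document body and so on). The sketch below is not the OpenXML SDK, which is a .NET library; it is only a minimal Python illustration of the packaging idea. The helper names (build_demo_package, list_parts, read_title) and the sample content are assumptions made up for this example, and the demo package is a stripped-down placeholder rather than a file that Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List, Optional

# Namespaces used by the core-properties part of an Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_demo_package() -> bytes:
    """Build a tiny ZIP laid out like an Open XML package.

    Illustration only: the content-types and document parts are placeholders,
    so this is not a complete, openable word processing document.
    """
    core = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<cp:coreProperties xmlns:cp="%s" xmlns:dc="%s">'
        "<dc:title>Quarterly report</dc:title>"
        "<dc:creator>Demo</dc:creator>"
        "</cp:coreProperties>" % (NS["cp"], NS["dc"])
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")    # placeholder part
        package.writestr("docProps/core.xml", core)            # document properties
        package.writestr("word/document.xml", "<document/>")   # placeholder body
    return buf.getvalue()


def list_parts(package_bytes: bytes) -> List[str]:
    """Return the names of the parts (ZIP entries) in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.namelist()


def read_title(package_bytes: bytes) -> Optional[str]:
    """Read the dc:title document property from the core-properties part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    node = root.find("dc:title", NS)
    return node.text if node is not None else None


if __name__ == "__main__":
    demo = build_demo_package()
    print(list_parts(demo))
    print(read_title(demo))
```

Working directly on the ZIP like this is fine for a quick look inside a file; the point of the SDK is to do the same kind of part-level work with proper handling of relationships and content types.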

You can download the OpenXML SDK v1.0 from here, you will find more reading material on the MSDN site here, or you can participate in the MSDN discussion forums for the SDK here.

A Standard For Corporate Governance of IT

8 June 2008

Several blogs this week caught the ratification of IS 38500 by the ISO, a standard providing a framework for Corporate Governance of Information Technology.

Serge Thorn talks a little about the origins of this work, discussing how it draws from an Australian Standard, AS8015;

Establish clearly understood responsibilities for ICT (eg, ensure individuals understand and accept their responsibilities)

Plan ICT to best support the organisation (eg, ensure ICT plans fit current and future needs and the organisation’s corporate plans)

Acquire ICT validly (eg, ICT acquisitions should be made for approved reasons and in the approved way; on the basis of ongoing analysis)

Ensure ICT performs well, whenever required (eg, ensure ICT is fit for its purpose and is responsive to changing requirements)

Ensure ICT conforms with formal rules (eg, ensure compliance with external regulations and internal policies and practices)

Ensure ICT use respects human factors (eg, ensure ICT meets the evolving needs of the ‘people in the process’)

Consistent governance of information technology in corporate environments helps drive several different agendas, including the goal of interoperability between systems across diverse organizations, which becomes easier as organizations become more predictable.

If you want to find out more about IS 38500 then Rod Drury provides a way to contact the ISO Chair for JTC1/SC7, Alison Holt, or you can download the standard from ISO directly by following this link.

ODF Support added to the Microsoft Office System - Additional Reading

23 May 2008

The laziest type of blog post is one that just quotes a bunch of other people and adds little value in its own right. I tend to use this blog as a combination of a place to document some of my own views and a place to store my own notes as various events of interest take place, so I know that from time to time I’m guilty of over quoting.

This post is a combination of the two. Whenever Microsoft makes an announcement that is blog worthy there are generally two types of post that get generated: initially there is quick commentary on the announcement itself, but then shortly afterwards more considered words start to appear as people take time to think through the details.

This morning I thought it might be worthwhile bundling together some of the other posts that are out there, positive and negative.

So, here are a few of the more notable entries that are floating around this morning, for those interested in the topic of Microsoft and our support for both interoperability and document formats I think this makes for a reasonable round up of many of the views that are out there.

In each case I have pulled out a small quote from the posts I have linked, I would encourage you to follow the links and read the whole post though - there will always be more to digest than just my brief extract.

OpenMalaysiaBlog, Yoon Kit Yong - “Microsoft Office Supports ODF? AYE!”

However, I am an optimist, and I do hope that the Microsofties driving ODF support in core Microsoft applications are sincere in their intent. So far, I don’t see too much of the smarmy doublespeak this time in their press release, and I really applaud the guys for that. Although they tried to dilute the ODF subject with PDF (didnt they already have that last year?) and XPS (who really uses that?) and UOF (ni hui jiang ODF ma?), the message is quite clear.

So overall, its very encouraging. I hope Microsoft follows through with this announcement, and does not mess it up when they finally release the patch.

Before today, it used to be very hard in taking these statements seriously  …

“…  it is very important that customers have the freedom to choose from a range of technologies to meet their diverse needs.”
July 2006, Jean Paoli, GM of Interoperability and XML architecture at Microsoft

… but now its definitely reads a lot less hypocritical.

Kudos Microsofties, and I wish your team and efforts well

Strong and supportive words indeed from one of the louder voices driving for ODF adoption here in the region. Yoon Kit and Ditesh (two of the principal bloggers at OpenMalaysia) frequently bring a blunt sense of reality to the way that the work that we do across the region is received by the FOSS community.

I’m pretty pleased to see Yoon Kit carrying a sense of optimism around what we’re doing here, but would encourage them both to keep our feet held close to the fire as we deliver on the promises we’re making!

NOOOXML.org - “Microsoft finally playing nice?”

A press release from Microsoft now promises native ODF support in the next service pack for Office 2007, while full support for the ISO version of OOXML will have to wait until the next major release of Office. Have they finally realized that their “format war” was a lost cause, and that the formal ISO acceptance of DIS29500 was a victory only on paper? If this is an honest attempt to play nice, it is a very welcome move. Of course, only time will tell if they will deliver on this promise, but the tone has changed dramatically, and this might actually be a good time to celebrate. We wish to welcome Microsoft to the party, even though they are very late and managed to make a fool of themselves in the process of trying to fight this outcome in every way possible. Had they only made this move a year ago, it would have saved many people a lot of trouble, including themselves. It is probably safe to assume that it was the strong opposition that forced them to the ODF table.

Pretty encouraging words from a site that was originally set up to oppose the work that we did to standardize the OpenXML file format. It is unfortunate that this is still seen as some kind of “Document Format War”. I still hold a strong view that different document formats serve different purposes. Our announcement yesterday demonstrates that point of view: support for ODF adds to the 20+ formats that Microsoft Office already supports, and as additional customer demand comes forth I would not be surprised to see that list continue to grow over the years.

Arnaud Le Hors - “My take on why Microsoft finally decided to support ODF”

One trick they could try and pull for instance would be to put just enough support for ODF to claim that they support it but not enough for people to really use it systematically. They could then tell customers who complain something isn’t working that it’s because ODF isn’t powerful enough, and if they want the full power of Office they need to use OOXML. That’d be a sneaky way to fulfill the ODF requirement set by customers and then force people into using OOXML anyway. Sneaky but not unlike Microsoft unfortunately. So, beware.

Reading assumptions about why we’re doing what we’re doing is always fun, I would like to think that just about everything possible is on the table and out in public view at this point, anything beyond that is just conjecture.

Maybe we’re also planning to take total control of the world’s chocolate supply, after all we do have an office in Switzerland. Next time I’m attending a planning meeting in our secret Pacific island volcano I’ll ask around and see what I can uncover.

Francois Ragnet - “Microsoft opens up Office to open document formats”

Interesting to see how Microsoft have moved away from their proprietary document formats, which were previously considered as their “crown jewels”, and now focus their innovation efforts on the applications themselves. More surprising though, is the fact that Office 2007 will not support OOXML, Microsoft’s own competing format for ODF, which they recently “fast-tracked” through ISO approval. In any event, this is great news for the Future of Documents, as this is a major step towards one open document format for easy interchange between applications.

François raises an interesting point. I totally agree, innovation around office suites in general from now on will come through improvements in usability, accessibility and the role of the suite as a developer platform.

The value of an office suite will increasingly be measured through a combination of increased individual and group productivity and the role of that suite in integrated and diverse business processes.

Jesper Lund Stocholm - “No reason anymore to mandate anything but ODF?”

A lot of people are now spinning information about this move pulling the rug under OOXML and that ODF should be mandated everywhere - but nothing could be further from the truth. The reason why we approved OOXML still stands and the incompatible feature-sets of OOXML and ODF did not suddenly become compatible. There are still stuff in OOXML that cannot be persisted in ODF and vice versa. The backwards compatibility to the content in the existing corpus of binary documents is still a core value of OOXML and this incompatibility of ODF has not disappeared. You will still loose information and functionality when you choose to persist an OOXML-file in ODF … just as you would when persisting it to old WordPerfect formats. Insisting that having ODF-support in Microsoft Office (12 SP2) makes the need for OOXML go away is a moot point - since I am sure no one would argue to replace OOXML with TXT - simply because TXT is a supported format in Microsoft Office.

Ditto. Microsoft’s support and commitment to the OpenXML format is as strong as it ever was. As Jesper highlights for Denmark, OpenXML provides functionality that is key for customers, partners and the IT ecosystem as a whole. Support for one document format will never negate the need for another that is designed for a different purpose.

Groklaw, Pamela Jones - “Microsoft supporting ODF? –Close, But No Cigar”

I wish I could wholeheartedly applaud the Microsoft announcement about native support for ODF, but I can’t. Of course, it’s better to have native support for ODF, no matter what motives may have influenced Microsoft’s announcement, and I’m glad about that for the sake of end users. But it hasn’t happened yet. Was the word ‘vaporware’ not coined for Microsoft? In any case, I’m in the “I will believe it when I see it” category when it comes to Microsoft. They’ve earned my caution.

Fair enough Pamela, it is up to us to deliver from here, no disagreement there. Enough said. You might want to look up the word ‘vaporware’ though, it wasn’t coined for Microsoft!

Alex Brown - “Microsoft Moves to Support ODF Standard”

Whatever Microsoft’s motivations, users are set to benefit from a world in which MS Office, easily the most used office software, has aligned itself with open, documented standards. But while announcements are all well and good the true test of Microsoft’s commitment will be found in the byte-by-byte details of the files that Office reads and writes. ODF lays down some strict rules for how these XML documents must be in order to be conformant, and software exists for testing them – I look forward on this blog to holding the magnifying glass to Microsoft’s efforts to see if what is claimed to be Standard really is so. Success will deserve praise; failure will deserve correction.

Alex has an interesting (as in genuinely interesting, not as in curious) role to play in the evolution of both OpenXML and ODF through his position in SC34. Ultimately ISO/JTC1 SC34 will be the working group that not only leads the evolution of both of these formats, but also helps the world understand what interoperability between formats actually means and how it can be best achieved.

Sheri McLeish - “Microsoft Crashing The Party: Announces Intent to Support ODF And Join Standards Boards”

Wow. Microsoft opened up today, taking a nearly 180-degree turn to announce its intent to support ODF, PDF, and XPS. Overall, this is a great, positive move. While unexpected, it’s not surprising. Microsoft has been moving toward more open standards, like with its recent DAISY XML initiative. But it’s also a no-brainer. Sticking exclusively with its competing Open XML was divisive, complicating IT’s efforts to leverage the benefits that open source XML provides.

This is back to that point around the value of Microsoft Office supporting multiple standards. What I see in Microsoft’s moves is a position that is driven by market and customer demands. Following needs of the community of companies and people who use our software seems to be the right route to take, and that is really what the addition of ODF to the list of supported formats in Microsoft Office is all about. 

Glyn Moody - “Microsoft and ODF: Has Hades Gone Sub-Zero?”

As Microsoft well knows, these markets are where most of the future growth can be expected. If they are bent on ODF adoption regardless of ISO ratification for OOXML, Microsoft will effectively be shut out of the hottest markets unless it builds some bridges (one of its favourite metaphors at the moment). Supporting this view is the fact that Microsoft’s latest announcement also includes news support for the less well-known (in the West, at least) Chinese national document file format standard, Uniform Office Format (UOF).

Again - more commentary on following customer and market demand. UOF (Uniform Office Format) is a big deal for us here in Asia, our neighbor to the north is keen to see it as the principal format for documents produced in China.

Rick Jelliffe - “Success has a thousand fathers…”

Developers/standarizers on both sides need to be whacked on their heady heads with a mackeral that Not Invented Here is not acceptable. I think people accept that until now there have been reasonable excuses: that Office could not implement ODF before it existed, that Office could not use ODF as its default format until ODF had even minimal features and completeness, that OpenFormula could be syntactically incompatible with everyone else’s spreadsheet syntax, that ODF’s graphics could cherry pick SVG without really providing actual SVG compatibility (SVG Tiny please?), and so on. (Actually, I don’t mean NIH in the sense that there absolutely cannot be multiple syntaxes or technologies for the same thing if there is some historical reason or feature difference, I am primarily talking about rejecting features merely because of their provenance.) The state of the schemas for DIS 29500 mark 1 and ODF 1.0 just reveal their level of maturity and production-level adoption, and there is nothing wrong with being an adolescent. ODF and OOXML will grow up, and they need the partisan spirit and the NIH attitude to be kept under control to do so.

I left this one until last because I think it goes to the heart of where we all need to go, and how we should think about operating from here.

Choice in document formats isn’t a war, it is a discussion, different people and groups hold different views and none can be considered wrong. Participation in the development of ODF and OpenXML provides a platform for these discussions, and a forum for resolution of the technical, political and in some cases ideological issues that need to be resolved.

I personally think we’re on a good path at the moment, but will agree with Pamela’s comment that it is principally up to us to deliver from here…

More Interop for Microsoft Office (ODF, PDF, PDF/A, XPS)

22 May 2008

There is no shortage of press and blog stories this morning sharing the news that Microsoft has committed to supporting version 1.1 of the Open Document Format in SP2 of Office 2007.

As the announcement happened while those of us here in Asia were sleeping peacefully, pretty much everything that could have been said on the topic has already been said, so I thought it might be more useful to present more of a round up of what I’ve been reading this morning.

First of all a little about the announcement itself.

There is a lot more to this than just support for ODF in the Microsoft Office product, although obviously the native support for ODF is a focus for many of the words that have been written overnight.

The company also announced plans to offer greater support for a number of alternative document formats - including Open Document Format (ODF) v1.1, Adobe Portable Document Format (PDF) 1.5, PDF/A and XML Paper Specification (XPS) - within Word 2007, Excel 2007 and PowerPoint 2007.  

In addition, Microsoft will support the future maintenance and evolution of these format standards by participating on the standards committees charged with these activities. This means that Microsoft folks will join the OASIS ODF TC and participate alongside IBM, Sun, Novell and everybody else present.

Finally ODF will be added to the list of specifications that are covered by the Open Specification Promise, ensuring that every developer has access to any intellectual property that Microsoft might put forwards during these maintenance processes.

The Microsoft blogs that first carried the announcement were the usual folks.

Jason Matusow looks at this announcement in the context of the company’s continuing commitment to interoperability as a tenet of the way we design products and collaborate with the rest of the industry. Jason and I share views on the issue of so-called “single standards” and he eloquently explains that further in his post.

This is not about any one document format “winning” – it is about enabling customers to evaluate and use document formats that make the most sense for them. Just as the MS deal with JBOSS didn’t mean we were saying that J2 was better than .NET – it is that we want our customers to have the most positive experience possible when using our product.

Doug Mahugh talks about some of the more technical details of the announcement, as well as discussing what this means to existing initiatives. He talks about our continued commitment to the translator projects for ODF, DAISY, UOF etc. and links to the ODF Translator team blog where they have just kicked off version two of that project.

Finally Doug answers a question I was asked over dinner earlier this week… we’ll be adding APIs that allow third parties to intercept the ODF load and save paths so if anybody disagrees with our implementation then all the tools are available for them to write their own.

Gray Knowlton digs around the “Why?” question, again one that came up in my dinner conversation earlier this week. Why now? Why when OpenXML just got approval? etc.

Success in our industry (like a lot of other industries) boils down to successfully addressing the needs of customers. By offering greater choice for file formats, our products address more scenarios and provide greater flexibility in enabling specific solutions. From a pragmatic standpoint, adding ODF to Office allows us to re-focus Office on product capabilities rather than a debate about file formats. We’re quite comfortable when we compete in the marketplace on these merits.

Looking around the blogosphere this morning the announcement appears to be very well received by just about everybody, as I said earlier in this post most people seem to be focused on the component of this announcement that talks about native ODF support in Microsoft Office, but it is important to recognize that this is bigger than just that one item.

The announcement, in my view, demonstrates a strong commitment to the Interoperability Principles that we shared earlier this year. As always there is still much work to be done, but this is a great step in the right direction.

If you want to read a little more then here are some links that you might find useful. There is a lot more out there, feel free to link anything additional that you find in the comments of this post.

Press: PC World NZ, Information Week, CNet News, SD Times, New York Times, itWire, Slashdot(!)

Blogs: Stephen McGibbon (MS), Jerry Fishenden (MS), Brian Jones (MS), Jesper Lund Stocholm, Richard Koman, Andy Updegrove, Bob Sutor, Ed Brill, GeekZone NZ, Joe Wilcox, Eric White (MS), Savio Rodrigues

On a final note, I feel compelled to pull one paragraph out of Bob Sutor’s (IBM) post;

There is no reason for more governments and organizations not to start mandating the use of ODF. If you are not using ODF today, you should put adoption plans in place.

This is an area where Microsoft and IBM seem to disagree.

My own personal view on this, which appears to be shared by a majority of the customers I work with, is that mandating a single standard for anything IT related is generally not a great move for government.

IT standards, like any area of technology, move on.

Governments need to remain ready to move with the technology that is in use by their citizens and businesses, mandates for information technology standards often do little more than operate as a hurdle to doing this.

OpenXML/DaisyXML Translator Now Available

7 May 2008

Cast your mind back to last November and you may remember Microsoft committing to working with the Digital Accessible Information SYstem (DAISY) Consortium to produce a translator to their DAISY XML file format (translating WordprocessingML to Daisy DTBooks format). This allows anybody with OpenXML files to convert them for use with a wide array of assistive technologies.

I’m pleased to say that as of today the translator is available, and will run either in the shell in Windows (right click to translate, just like the ODF translator) or will integrate well with Microsoft Office.

From the Microsoft press release;

Microsoft Corp. today joined with industry and advocacy group leaders worldwide to launch new software that will make it easier for anyone to create documents and content that will be accessible for blind and print-disabled individuals. The new “Save as DAISY XML” add-in, designed for Microsoft Office Word 2007, Word 2003 and Word XP, will allow users to save Open XML-based text files into DAISY XML, the foundation of the globally accepted DAISY standard for reading and publishing navigable multimedia content (www.daisy.org).

It is also worth noting that the code for the translator is up on SourceForge if anybody wants to go take a look for themselves. Again from the press release;

The “Save as DAISY XML” add-in was created through an open source project with Microsoft, Sonata Software Ltd. and the Digital Accessible Information SYstem (DAISY) Consortium and can be downloaded by Microsoft Office Word users for free at http://www.openxmlcommunity.org/daisy.

The open source nature of the Open XML to DAISY XML translation project enables technologists to utilize the source code and other resources for their own applications. As Open XML adoption continues to expand across the software industry for use on various platforms, including Linux, Windows, Mac OS and the Palm OS, solution providers interested in creating their own Open XML to DAISY XML translators can reference information available through the SourceForge open source project site at http://sourceforge.net/projects/openxml-daisy.

The Game Of Jing Pong

6 May 2008

Almost a week ago now Alex Brown posted the details of his “smoke test” looking at an ODF document produced by OpenOffice 2.4.0, checking conformance with IS26300 with the ODF 1.0 RelaxNG schemas, using Jing.

For most of last week nobody really seemed to care, there were a couple of press stories but nothing like the coverage of his similar test with an IS29500 schema and document produced by Microsoft Office a week earlier.

Then a couple of days ago IBM’s Rob Weir jumped in with an extremely long post that he titled “ODF Validation for Dummies“. I’ll let you read the details for yourself, while I’m interested in the detail I’m more concerned by the tone of the overall post itself - I’ll come to that further down in this text.

For flavour, here is the opening line from Rob’s post;

Alex Brown has a problem. He can’t figure out how to validate ODF documents.

As you might expect, Dr. Brown felt the need to respond to this comment and posted a similarly long post of his own, digging deeper into his objectives, his findings and his intent.

Alex’s post, titled “ODF validation for the cognoscenti” responds to several parts of Rob’s monologue, as I read through it a part headed “Negativity” caught my eye;

Amid the general downer that is Rob’s blog entry, is an assumption that I share such negative thoughts. I find myself described as “someone who would be well served if he could show that all consortia standards are junk, and that only SC34 (and he himself) could make them good”. Hmmmmm - where did that come from?

For the record, I am an enthusiastic supporter of consortia and consortium standards and know from experience that consortia contain great people who are producing some of the best standards work in the planet: XML 1.0, ODF, XSLT, UBL, OOXML (ha!) – the list goes on. Most recently I was very pleased to see a new working draft of the important new W3C XProc specification – something that SC 34 is specifically deferring to rather than attempt something similar itself. I thoroughly disapprove of the kind of oppositional mindset that sees things in a polarised “ISO vs OASIS” or “ISO vs W3C” way. In my view that mode of thinking already did enough damage during the DIS 29500 project.

Rob’s response - a hand crafted piece of XML that will validate as an IS26300 document.

Well, Yahoo! (am I allowed to use that word?)

So here is my concern.

There are literally over a billion users of office suites in the world today. These users are self selecting their favourite office suite, and at the same time choosing whatever document format is right for them.

While the debate around document formats has been an interesting one for those of us embroiled in it we have to remember that these users are the reason why we’re having the conversations, not because we have nothing else to do other than bicker with one another.

It is fascinating to watch the back and forth ping pong on blogs as points are scored, but the mentality of directly attacking an individual with the goal of proving that you’re right (regardless of the facts) really does not help anybody.

At this point it feels like we are still a long way from a scenario where somebody from the OASIS TC might reach out to Alex or another member of JTC1/SC34 to discuss the challenges that arose during Alex’s simple test, instead the goal seems to be to prove something in the blogosphere. (I’m not sure what)

Common goals around interoperability, long term sustainability of documents and simplicity for users are often articulated by all parties - but if we’re going to achieve any of those goals then the blog based fun has to end, and professional dialogue has to begin.

OpenOffice.org 2.4.0 and IS26300 Conformance…

2 May 2008

As he promised last week, Alex Brown has gone ahead and tested an ODF file saved by OpenOffice 2.4.0 against the RelaxNG schema for IS26300, and as you would expect the test failed. (just like his test of Office 2007)

Clearly there is still work to be done.

Again, only tentative conclusions can be drawn from a smoke test (readers unfamiliar with this term as applied to software testing are recommended to read the Wikipedia article on it before grumbling about the depth of the test, please).

  • For ISO/IEC 26300:2006 (ODF) in general, we can say that the standard itself has a defect which prevents any document claiming validity from being actually valid. Consequently, there are no XML documents in existence which are valid to ISO ODF.
  • Even if the schema is fixed, we can see that OpenOffice.org 2.4.0 does not produce valid XML documents. This is to be expected and is a mirror-case of what was found for MS Office 2007: while MS Office has not caught up with the ISO standard, OpenOffice has rather bypassed it (it aims at its consortium standard, just as MS Office does).

Just like the Microsoft Office 2007 and IS29500 test that he did last week the results are not at all surprising, both Microsoft and the OpenOffice.org project have work to do before either suite produces ISO compliant document files in either IS26300 or IS29500 format.
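For anybody wondering what a smoke test of this kind involves mechanically, the rough shape is: unzip the ODF package, pull out its XML streams, and hand each one to a RELAX NG validator such as Jing along with the schema. The Python sketch below is only an illustration of that shape and makes no claim to reproduce Alex’s exact procedure. The function names, the jing.jar location and the odf-schema.rng file name are all hypothetical placeholders, and the list of XML streams is simply the set commonly found in an ODF package.

```python
import io
import zipfile
from typing import Dict, List

# XML streams commonly found inside an ODF package (assumed list for this sketch).
XML_PARTS = ("content.xml", "styles.xml", "meta.xml", "settings.xml")


def extract_xml_parts(odf_bytes: bytes) -> Dict[str, bytes]:
    """Return the XML streams that are present in an ODF package (a ZIP file)."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        present = set(package.namelist())
        return {name: package.read(name) for name in XML_PARTS if name in present}


def jing_command(jing_jar: str, schema: str, xml_file: str) -> List[str]:
    """Build a Jing command line: java -jar jing.jar <schema> <xml file>.

    The paths passed in are the caller's; nothing here checks that they exist.
    """
    return ["java", "-jar", jing_jar, schema, xml_file]


def build_demo_odf() -> bytes:
    """Build a tiny stand-in for an ODF package (placeholder content, not a real document)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", "<placeholder/>")
        package.writestr("meta.xml", "<placeholder/>")
    return buf.getvalue()


if __name__ == "__main__":
    parts = extract_xml_parts(build_demo_odf())
    for name in parts:
        # Hypothetical paths; in practice each stream would be written to disk
        # and the command run with subprocess.run(), checking the exit status.
        print(" ".join(jing_command("jing.jar", "odf-schema.rng", name)))
```

A non-zero exit status from Jing, together with the messages it prints, is what tells you a stream failed validation. That is all a smoke test needs: a quick pass or fail, not a full conformance audit.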

It will be interesting to see if the press try to make the same drama out of this non-event as they did out of last week’s non-event.

If you want to read more on the topic then Jesper Lund Stocholm has some details of his own testing posted here, and Doug Mahugh has some additional commentary that you will find here.

Office 2007 & IS29500 Conformance…

22 April 2008

ZDNet UK this morning carried a story headlined “Microsoft Office 2007 fails OOXML conformance test”, and in a subheading explains;

Word documents generated by today’s version of Microsoft Office 2007 do not conform to the Office Open XML standard under development by the International Organization for Standardization, according to tests run by a document standards specialist.

Given that the recent ISO process to standardize OpenXML took in significant amounts of feedback on the details of the specification, and made changes based on that feedback, this shouldn’t really be all that much of a revelation.

Knowing that the specification evolved significantly as it moved from Ecma-376 to IS29500, there should be no dispute that there are differences between the Ecma-376 documents produced by Microsoft Office today and the final IS29500 specification.

The ZDNet news story is based on a blog post that Alex Brown published on his company’s site a few days ago. He opens with;

I was excited to receive from Murata Makoto a set of the RELAX NG schemas for the (post-BRM) revision of OOXML, and thought it would be interesting to validate some real-world content against them, to get a rough idea of how non-conformant the standardisation of 29500 had made MS Office 2007.

Alex goes on to look at how the specification document for Ecma-376 (an Office 2007 document) conforms to the post-BRM details of the specification now known as ISO/IEC IS29500.

He states in his post that he is doing this “to get a rough idea of how non-conformant the standardisation of 29500 had made MS Office 2007”, and much as you might expect, the document does not conform to a standard that didn’t exist at the time the document was originally created.

At the end of his post Dr Brown asks “What’s next?”

To repeat the exercise with ISO/IEC 26300:2006 (ODF 1.0) and a popular implementation of OpenDocument. Will anybody be brave enough to predict what kind of result that exercise will have?

We’ll see.

My colleague Doug Mahugh took a closer look at Dr. Brown’s post and offers some of his own commentary on the unsurprising outcome of the tests that Alex did. 

He opens with…

It’s an interesting question. Office 2007 supported the ECMA-376 standard, but many changes were made during the evolution from ECMA-376 to IS29500. How many of those changes affect the content in a typical large document?

… and later in the post goes on to say…

The results were predictable: the document was not conformant to either class. Changes made at the BRM are not yet reflected in any existing implementations, and in this case the Ecma spec was created over a year before those changes were made. Here are the totals:

  • Validation against the strict schemas: 122,000 errors
  • Validation against the transitional schemas: 84 errors

Office 2007 was designed to be highly compatible with existing documents, so it uses features of Open XML that provide backward compatibility, including many of the elements and attributes that were moved to “transitional status” as a result of the BRM. So the test of strict conformance, although interesting, is a bit abstract: it’s testing whether a document conforms to a subset of the spec that was defined after the document was created.

The second number is the more meaningful one. Those are places in the test document where something is done in a way that doesn’t match the final IS29500 spec. Alex provides one specific example, to show the types of changes caught by that test: an attribute with a value of “on” that should say “true” instead, due to “one of the many tidying-up exercises performed at the BRM.”
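To make that one example concrete, here is a rough sketch of how a script might flag that kind of value. It is only an illustration: the XML fragment, the element and attribute names, and the `find_legacy_booleans` helper are all hypothetical, and the “off” to “false” mapping is an assumption added for symmetry (the example quoted above only mentions “on” and “true”). A real check would be driven by the schema, which knows which attributes are boolean, rather than by matching literal values.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; element and attribute names are illustrative only.
SAMPLE = """<body>
  <run bold="on" italic="true"/>
  <run bold="off"/>
  <run caps="1"/>
</body>"""

# "on" -> "true" comes from the quoted example; "off" -> "false" is assumed.
LEGACY_TO_BOOLEAN = {"on": "true", "off": "false"}


def find_legacy_booleans(xml_text):
    """Return (tag, attribute, value, suggested value) for each legacy value found."""
    findings = []
    for element in ET.fromstring(xml_text).iter():
        for name, value in element.attrib.items():
            if value in LEGACY_TO_BOOLEAN:
                findings.append((element.tag, name, value, LEGACY_TO_BOOLEAN[value]))
    return findings


findings = find_legacy_booleans(SAMPLE)
for tag, name, value, suggested in findings:
    print(f'<{tag}> {name}="{value}" would become {name}="{suggested}"')
```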

So where exactly does Microsoft stand on the issue of conformance of Microsoft Office 2007 with the final IS29500 specification?

Chris Capossela (Microsoft’s Senior VP for Office) addressed this in an open letter around a month ago by saying;

We’ve listened to the global community and learned a lot, and we are committed to supporting the Open XML specification that is approved by ISO/IEC in our products.

It is coming… Microsoft is committed to supporting the ISO/IEC standard for OpenXML!

As an aside, Doug’s post closes with a comment about two of this year’s winners of the Google Summer of Code;

Google recently unveiled the winning entries in Google’s Summer of Code 2008, a program that offers student developers stipends to write code for various open source projects. Two of this year’s winners are enhancements to the Open XML implementation in AbiWord.

Home At Last… Tired And Drowning

21 April 2008

Well, vacations can be fun but at some point the holiday has to end and you have to return to work…

I’m sat here trying to catch up on everything from over the last two weeks.

For the first time in as long as I can remember, this vacation involved me leaving both my laptop and phone at home and in effect falling completely off the face of the digital planet.

It was as good and relaxing as I expected to be away from communication for a few days, but I’m not sure I fully thought through the implications of the workload I would face when I returned!

Several thousand emails, several thousand unread articles on the Internet and what feels like several thousand hours’ worth of meeting requests hovering around.

Sifting through the OpenXML related headlines, not a lot appears to have changed in the last couple of weeks. Groklaw and Slashdot are still faintly buzzing with the same stories as at the start of last month.

Jason Matusow has been engaged in a number of vibrant discussions around the role of IP in standards creation and usage, and the comments on several of his recent posts are making for a good overall discussion. His most recent entry brings some of those conversations together.

One of the most challenging aspects to the threads I’ve been reading in the responses to my post (and I see this in the Groklaw post as well) is that many issues are getting squashed together - and that is the very basis of misunderstanding these issues.

He also has posted some details of the next steps in Microsoft’s commitments to the Interoperability Principles that were published earlier this year.

Continuing with the theme of publishing protocol-related information for high-volume products, some 14,000 more pages of documentation have been posted to the web.

The documentation is for:

  • Protocols between Microsoft Office SharePoint Server 2007 and Office client applications
  • Protocols between Microsoft Office SharePoint Server 2007 and other Microsoft server products
  • Protocols between Microsoft Exchange Server 2007 and Microsoft Office Outlook
  • Protocols between Microsoft Office 2007 client applications and other Microsoft server products

Alex Brown has written about the process that ISO’s SC34 is going through to take full control of OpenXML following on from the recent SC34 meeting in Oslo;

Now however, the whole process moves forward into a much more significant stage. At the just-finished SC 34 meeting in Oslo a number of resolutions were passed relating to 29500. The most significant of these is resolution 4, “Creation of Ad Hoc Group 1 on ISO/IEC 29500 Maintenance”, and it’s worth looking at it in some detail. I will go through the complete resolution below with some explanation of my own …

You should read Alex’s full post for further information. As might be expected, there is a lively conversation taking place in the comments.

On a “less process, more technical” note, the April CTP version of the OpenXML SDK is now available, and Erika Ehrli has all the details.

The Open XML Format SDK Technology Preview simplifies the task of manipulating Open XML packages. The Open XML Application Programming Interface (API) encapsulates many common tasks that developers perform on Open XML packages, so you can perform complex operations with just a few lines of code. Using this API, you can programmatically generate and manipulate Word 2007 documents, Excel 2007 spreadsheets, and PowerPoint 2007 presentations. The programming model uses managed code, so it’s safe for server-side scenarios.
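To give a feel for the package-level plumbing that an API like this takes off your hands, here is a minimal sketch that builds a bare-bones word processing package by hand. It uses only Python's standard library, not the SDK, and the `build_docx` helper is a made-up name for illustration. The three parts shown (content types, root relationships, main document) are the ones needed to locate the main document part. Real documents carry many more parts, and this sketch has not been checked against any particular application.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)

DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document '
    'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>'
    '</w:document>'
)


def build_docx(text):
    """Return the bytes of a one-paragraph package (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        # Escape the text so characters such as & and < stay valid XML.
        package.writestr("word/document.xml", DOCUMENT.format(text=escape(text)))
    return buffer.getvalue()


docx_bytes = build_docx("Hello from the server")
with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
    part_names = package.namelist()
    document_xml = package.read("word/document.xml").decode("utf-8")
print(part_names)
```

Even for a single paragraph there is a fair amount of bookkeeping here, which is the kind of work the quoted description says the API encapsulates.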

… Erika goes on to talk a little about the future of the SDK…

The Open XML API will release in two versions. Open XML API Version 1.0 is the updated version of the CTP in June 2007 and will only contain the Open XML Packaging API. Open XML API Version 2.0 will contain all of the Open XML API components, including the Open XML Packaging API with further updates. It will enforce validity of the content either in the original Open XML documents or being generated through this API. The purpose of this plan is to give out the long awaited Go-Live license of the existing Open XML Packaging API to external developers.

She is looking for feedback from developers on the path that the SDK is taking, so please consider joining the conversation on her blog.

Finally, ISO/IEC have published an FAQ that talks about the process that IS29500 has been through. You’ll find it here.

The FAQ looks at many of the questions that have been raised over the last fifteen months and offers a direct ISO/IEC response.

It ends with a high level question about the process itself.

Will ISO and IEC review how ISO/IEC 29500 was adopted?

We reviewed the process before it started, all the while during its course and afterwards as well. While the voting on ISO/IEC 29500 has attracted exceptional publicity, it needs to be put in context. ISO and IEC have collections of more than 17 000 and 7 000 successful standards respectively, these being revised and added to every month. This suggests that the standards development process is credible, works well and is delivering the standards needed, and widely implemented, by the market. Because continual improvement is an underlying aim of standardization, ISO and IEC will certainly be continuing to review and improve its standards development procedures.

I guess it is time I stopped hiding in Live Writer and reading other people’s blogs. I should get back to clearing out my overflowing Inbox… more tomorrow.