Interoperability And A Role in SOA…

12 December 2007

In discussions about Open XML, one of the questions I get asked most frequently is one that was originally raised by noooxml.org and our colleagues at a large and well-known competitor: “We already have an international standard for document formats, why do we need another?”

The answer is straightforward: Open XML has different design objectives from other document standards, and those differences are set out in the first few paragraphs of the Office Open XML specification.

  • Open XML provides for the migration of existing Microsoft Office binary documents to the new XML format, providing new levels of transparency and access to existing data.
  • Open XML provides a mechanism for the use of Custom XML schema as part of the document format.

Of those two requirements, the first is backward looking and protects the huge existing investment in binary documents. The second is very forward looking, and for the developer community it is probably one of the most exciting features of the Office Open XML specification.

I’ve talked a little before about custom schema in Open XML, but only recently realized that it is not obvious what is unique about this part of the Ecma-376 specification.

Two weeks ago I was in Beijing, presenting at an international standards conference organized by OASIS. My talk covered Open XML and how Microsoft proposed to interoperate with the Open Document Format (ODF) and the Chinese office document standard, the Uniform Office Format (UOF).

After my presentation I found myself chatting to a well-known blogger in the standards and open document space, and as part of the conversation he and I were discussing the difference between ODF and Open XML at a very conceptual level. He drew a diagram on a piece of paper that looked a little like the graphic below. Personally, I think it gives a very clear picture of the role that custom schema plays, which is one of the key differences between Open XML and ODF, and of how the two document formats could interoperate in the longer term.

[Image: OpenXML and ODF]

The graphic shows the role that Office Open XML compliant applications have in wider business processes within an organization, moving well beyond the traditional office automation file format that we have all become used to over the last twenty or so years. It also shows how converter technology will assist users of the ODF file format (along with other file formats) who want to convert their documents to Open XML or vice versa.

The same applies to UOF, for which conversion tools between Open XML and UOF already exist. The Beijing event was a great opportunity to demonstrate this additional converter to a few people working in this space in China.

Getting Started with Open XML

3 December 2007

For a while now I’ve been meaning to put together a list of some of the resources that exist today to help developers get off the ground with Open XML.

My inbox this morning contained a link to a post on James Newton-King’s blog, and he appears to have saved me the trouble. James is a developer at Intergen in New Zealand. You might remember an earlier discussion about the tool they published on CodePlex earlier this year that converts IIS logs to XLSX so you can work with them in Excel.

The post “Getting Started with Open XML” is a great roundup of several starting points for anybody wanting to learn more about the file format and understand how to get started developing their own Open XML based tools and applications.

If you really want to get into the details of Open XML, then the specification itself is probably your best starting point. As Doug Mahugh pointed out at TechEd in Malaysia earlier this year, a good place to start is part 3 of the current Ecma-376 specification, which is designed to be a primer for developers. Part 3 represents less than 8% of the overall material and is a great entry point for anybody who is keen to work with the Open XML file format.

RosettaNet Uses Ecma Open XML To Reach SMEs

25 October 2007

If you cast your mind back just over a year or so, you might remember this agreement that Microsoft announced with Intel and RosettaNet to develop the next generation of their supply chain standards around Ecma Open XML.

It turns out that RosettaNet do a lot of their development work in their labs in Malaysia, and we are starting to see the fruits of their efforts appear in the market.

This week there have been several stories in the press in Malaysia about the development work that has been taking place, and this morning you will find a regional article on ZDNet Asia.

From the article:

“Having proven itself to be a successful standard with large enterprises and multinationals operating in Malaysia, RosettaNet is now moving into its next phase of encouraging local Small Medium Industries (SMIs) to automate their procurement processes,” Foong said.

“Open XML opens up exciting opportunities for RAE based solutions, such as its support for custom defined schemas which facilitates wider success of e-commerce, while assuring users of long-term preservation of data.

“Another benefit of Open XML for the SMI community is its capability of storing and managing business data in documents, resulting in lower costs for implementing business process automation that enhances global competitiveness,” Foong added.

Foong Heng Huo is the director of RosettaNet Malaysia.

This work with RosettaNet very clearly demonstrates the value that the Ecma Open XML draft brings to the implementation of business process systems, something that you just can’t do with “other” formats.

Open XML: Custom Schema Support

3 October 2007

 Another very long post I’m afraid… I promise to try harder in future.

 A couple of days ago the comments on another post strayed into the area of custom schema support in the Ecma Open XML specification. In my reply I documented a scenario that explained how this functionality might be put to use by developers. This is a really exciting part of the Open XML functionality, so I thought it might be worthwhile pulling the comment out into a post in its own right… so here it is.

The scenario below looks at how the Open XML specification may be used in a medical environment, where data that starts off being typed into a document passes through a series of systems, some internal and some provided externally.

First of all, forget about the file format as an office automation document for a moment, and think about it as a container for data in its raw form. The Ecma Open XML spec defines a way to embed custom schema into the document that can represent just about any data you like, and to guarantee that it will remain intact as one application or another opens the document, works with it, then saves it out.
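To make the container idea concrete, here is a minimal Python sketch. It assumes only that an Open XML file is a ZIP-based package in which custom data lives in its own part. The part name `customXml/item1.xml`, the `urn:example:clinic` namespace and the element names are illustrative assumptions, and the stub package leaves out the content-type and relationship parts that a real package needs.

```python
import io
import zipfile

# Hypothetical payload: the namespace and element names are made up for illustration.
CUSTOM_XML = b'<patient xmlns="urn:example:clinic"><symptom>cough</symptom></patient>'


def build_package() -> bytes:
    """Build a stripped-down package: one document part plus one custom XML part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("word/document.xml", b"<document>draft</document>")
        pkg.writestr("customXml/item1.xml", CUSTOM_XML)
    return buf.getvalue()


def rewrite_document(package: bytes, new_body: bytes) -> bytes:
    """Replace the document part and copy every other part through byte for byte."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for name in src.namelist():
                data = new_body if name == "word/document.xml" else src.read(name)
                dst.writestr(name, data)
    return out.getvalue()


def read_part(package: bytes, name: str) -> bytes:
    """Return the raw bytes of a single part."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        return pkg.read(name)
```

An application that edits the visible document this way never has to parse the custom data, which is one simple way the data could stay intact across a save.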

Now bringing it back to being a document format again, Open XML allows you to bind elements from those custom schema back to properties in the document if you choose, so not only can the custom schema be manipulated by automated systems, but also by a user through a form in their office automation application.
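As a rough illustration of binding, the sketch below reads a WordprocessingML fragment in which a content control carries a `w:dataBinding` element, and looks up the value it points to in a custom XML part. The fragment, the store item ID, the `urn:example:clinic` namespace and the helper names are all assumptions made for the example. It uses Python's ElementTree, which handles only a small subset of XPath, so it should be read as a sketch rather than a faithful binding resolver.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical fragment of a document part: one content control bound to custom XML.
DOCUMENT_XML = f"""
<w:body xmlns:w="{W}">
  <w:sdt>
    <w:sdtPr>
      <w:dataBinding w:prefixMappings="xmlns:c='urn:example:clinic'"
                     w:xpath="/c:patient/c:symptom"
                     w:storeItemID="{{00000000-0000-0000-0000-000000000001}}"/>
    </w:sdtPr>
    <w:sdtContent><w:p/></w:sdtContent>
  </w:sdt>
</w:body>
"""

# Hypothetical custom XML part that the binding points into.
CUSTOM_XML = (
    '<patient xmlns="urn:example:clinic">'
    "<symptom>persistent cough</symptom>"
    "</patient>"
)


def parse_prefixes(mappings: str) -> dict:
    """Turn "xmlns:c='urn:x' xmlns:d='urn:y'" into {"c": "urn:x", "d": "urn:y"}."""
    result = {}
    for item in mappings.split():
        name, _, uri = item.partition("=")
        result[name.split(":", 1)[1]] = uri.strip("'\"")
    return result


def bound_values(document_xml: str, custom_xml: str) -> dict:
    """Map each binding's XPath to the text it selects in the custom XML."""
    doc = ET.fromstring(document_xml)
    # ElementTree cannot evaluate absolute paths on an element, so the custom
    # XML root is wrapped in a holder and the path is made relative to it.
    holder = ET.Element("holder")
    holder.append(ET.fromstring(custom_xml))
    values = {}
    for binding in doc.iter(f"{{{W}}}dataBinding"):
        xpath = binding.get(f"{{{W}}}xpath")
        prefixes = parse_prefixes(binding.get(f"{{{W}}}prefixMappings", ""))
        node = holder.find("." + xpath, prefixes)
        values[xpath] = node.text if node is not None else None
    return values
```

Calling `bound_values(DOCUMENT_XML, CUSTOM_XML)` returns a dictionary keyed by the binding's XPath, which is the kind of lookup a form in an office application would perform when it displays or updates the bound field.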

If we apply that to our health-care scenario, then you can imagine the Open XML document being used in a diagnosis process. A clinician opens up an office automation app and documents the patient’s symptoms in custom fields in the document. When the document is saved, the patient data is stored independently as a custom schema in the docx file.

As a next step, a billing system picks up the newly created document and embeds a second custom schema into the document that includes invoice information that will eventually make its way back to the patient’s health-care insurer. This addition to the scenario matters only insofar as it shows that a single document can have multiple embedded custom schemas. The billing system only needs code to work with the Open Packaging Conventions (OPC) package; it does not need to deal with the document or the diagnosis information.
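The billing step can be sketched in a few lines of Python that work purely at the package level. This is an illustration under assumptions, not a complete OPC implementation: the `customXml/itemN.xml` naming is a convention assumed here, the function names are hypothetical, and a real package would also need a relationship and content-type entry for the new part.

```python
import io
import re
import zipfile


def next_custom_part_name(names) -> str:
    """Pick customXml/itemN.xml with the lowest N that is not already taken."""
    used = set()
    for name in names:
        match = re.fullmatch(r"customXml/item(\d+)\.xml", name)
        if match:
            used.add(int(match.group(1)))
    n = 1
    while n in used:
        n += 1
    return f"customXml/item{n}.xml"


def add_custom_part(package: bytes, xml_bytes: bytes):
    """Append one custom XML part without opening any existing part.

    Returns the updated package bytes and the name of the new part.
    """
    buf = io.BytesIO(package)
    # Append mode adds a new entry and leaves the existing entries untouched.
    with zipfile.ZipFile(buf, "a", zipfile.ZIP_DEFLATED) as pkg:
        name = next_custom_part_name(pkg.namelist())
        pkg.writestr(name, xml_bytes)
    return buf.getvalue(), name
```

Nothing in `add_custom_part` reads the document part or the clinician's data, which mirrors the point that the billing system only has to understand the package.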

As a final step, I’ll submit my encapsulated patient transaction to a web service somewhere that analyzes the custom XML document describing the patient’s symptoms and, as a result, drops a third custom schema into the OPC package that details a possible diagnosis and some suggested medication. Again, no office automation is involved, and there is no need for the web service to understand the document or the billing schema.
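One way a service like this could find the data it cares about is to look at the root namespace of each custom XML part and ignore everything else. The Python sketch below assumes that approach. The `urn:example:clinic` and `urn:example:diagnosis` namespaces, the element names and the toy "analysis" are all hypothetical stand-ins.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CLINIC_NS = "urn:example:clinic"        # hypothetical namespace for symptom data
DIAGNOSIS_NS = "urn:example:diagnosis"  # hypothetical namespace for the result


def find_part_by_namespace(package: bytes, namespace: str):
    """Return (part name, root element) for the first custom part in the namespace."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        for name in pkg.namelist():
            if not (name.startswith("customXml/") and name.endswith(".xml")):
                continue  # the document part is never opened
            root = ET.fromstring(pkg.read(name))
            if root.tag.startswith("{" + namespace + "}"):
                return name, root
    return None, None


def suggest_diagnosis(symptoms_root) -> bytes:
    """Toy stand-in for the real analysis: echo the symptoms into a new XML part."""
    symptoms = [
        (s.text or "") for s in symptoms_root.findall("c:symptom", {"c": CLINIC_NS})
    ]
    result = ET.Element(f"{{{DIAGNOSIS_NS}}}diagnosis")
    based_on = ET.SubElement(result, f"{{{DIAGNOSIS_NS}}}basedOn")
    based_on.text = ", ".join(symptoms)
    return ET.tostring(result)
```

The bytes returned by `suggest_diagnosis` would then be written back into the package as a third custom part. The billing part is skipped because its root element sits in a different namespace.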

The original clinician can then reopen the document in their original office automation app and work with all the new information that has been added by various systems.

What is important about the way Open XML deals with this is the segmentation of the data and the ability for the developer to decide upon the structure of the custom embedded schema. This means that the Open XML spec is not dictating how this data is stored, and developers can embed any one of the thousands of XML based business schema standards that exist in the world today. In the example above, for the US, a developer might choose to embed the HL7 schema into the Open XML file.

This capability has a lot more to do with being able to use Open XML in end point systems in a larger SOA environment, and less to do with what you might traditionally think of as office automation apps, although of course the office automation app still has a key role to play whenever the document reaches a user.

I use a health-care example, but it could be any business process, and I’m already seeing examples of enterprise organizations doing this sort of work in scenarios such as supply chain or banking processes.

Can you do this with other doc formats? Well, most of them are simply not designed to do this. The majority of the document formats that are out there are designed with pure office automation or document presentation in mind. For those that do allow the embedding of custom data elements, it isn’t clear to me that this data would be protected as it passes through different applications, or that it would be possible to implement in a form that allows the segmentation of the data and conformance to existing business process schema standards.

You can see a demo of how this might look for the user in this video. The demo is based upon work in Microsoft Word, but it could be done in any other application that supports Open XML embedded custom schema.