Another very long post I’m afraid… I promise to try harder in future.
A couple of days ago the comments on another post strayed into the area of custom schema support in the Ecma Open XML specification. In my reply I documented a scenario that explained how this functionality might be put to use by developers. This is a really exciting part of the Open XML functionality, so I thought it might be worthwhile pulling the comment out into a post in its own right… so here it is.
The scenario below shows how the Open XML specification might be used in a medical environment, following some data that starts off as input to a document and then passes through a series of systems, some internal and some provided externally.
First of all, forget about the file format as an office automation document for a moment, and think about it as a container for data in its raw form. The Ecma Open XML spec defines a way to embed custom schemas into the document that can represent just about any data you like, and then guarantees that the data will remain intact as one application or another opens the document, works with it, and saves it out again.
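To make the "container" idea concrete, here is a minimal sketch in Python that builds a small package in memory and then reads the custom data back without looking at the document part at all. The patient namespace, element names and values are made up for illustration, and I have assumed the conventional part names (word/document.xml, customXml/item1.xml). Treat it as a sketch of the package layout rather than a reference implementation.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

PKG = "http://schemas.openxmlformats.org/package/2006"
OD_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Hypothetical patient data: namespace and element names are invented here.
PATIENT_XML = (
    '<patient xmlns="urn:example:patient">'
    "<name>Example Patient</name><symptom>cough</symptom>"
    "</patient>"
)

# Part name -> content. [Content_Types].xml is written first.
PARTS = {
    "[Content_Types].xml": (
        f'<Types xmlns="{PKG}/content-types">'
        '<Default Extension="rels" ContentType='
        '"application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="xml" ContentType="application/xml"/>'
        '<Override PartName="/word/document.xml" ContentType="application/'
        'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
        "</Types>"
    ),
    "_rels/.rels": (
        f'<Relationships xmlns="{PKG}/relationships">'
        f'<Relationship Id="rId1" Type="{OD_REL}/officeDocument" '
        'Target="word/document.xml"/></Relationships>'
    ),
    "word/document.xml": (
        '<w:document xmlns:w='
        '"http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
        "<w:body><w:p><w:r><w:t>Patient record</w:t></w:r></w:p></w:body>"
        "</w:document>"
    ),
    # The relationship that ties the custom XML part to the document.
    "word/_rels/document.xml.rels": (
        f'<Relationships xmlns="{PKG}/relationships">'
        f'<Relationship Id="rId1" Type="{OD_REL}/customXml" '
        'Target="../customXml/item1.xml"/></Relationships>'
    ),
    "customXml/item1.xml": PATIENT_XML,
}


def build_package() -> bytes:
    """Zip the parts above into an in-memory package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, xml in PARTS.items():
            zf.writestr(name, xml)
    return buf.getvalue()


def read_custom_part(package: bytes, name: str = "customXml/item1.xml") -> ET.Element:
    """Parse one custom XML part; the document part is never opened."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return ET.fromstring(zf.read(name))


patient = read_custom_part(build_package())
print(patient.find("{urn:example:patient}symptom").text)  # prints "cough"
```

The point of the sketch is that the patient data lives in its own part, so a system that only cares about the data never has to parse the document markup.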
Now, bringing it back to being a document format again, Open XML allows you to bind elements from those custom schemas back to properties in the document if you choose, so the custom data can be manipulated not only by automated systems but also by a user through a form in their office automation application.
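As a rough illustration of what such a binding can look like in the document markup, the sketch below generates a WordprocessingML content control whose data binding points at an element in a custom XML part via an XPath expression. The patient namespace, the XPath and the store item ID are all hypothetical, and a real package would also need a matching properties part carrying that same item ID, which is not shown here.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def w(tag: str) -> str:
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def bound_control(xpath: str, prefix_mappings: str, store_item_id: str,
                  placeholder: str = "") -> ET.Element:
    """Build an inline content control bound to a node in a custom XML part."""
    sdt = ET.Element(w("sdt"))
    props = ET.SubElement(sdt, w("sdtPr"))
    ET.SubElement(props, w("dataBinding"), {
        w("prefixMappings"): prefix_mappings,  # prefixes used in the XPath
        w("xpath"): xpath,                     # node the control is bound to
        w("storeItemID"): store_item_id,       # identifies the custom XML part
    })
    content = ET.SubElement(sdt, w("sdtContent"))
    run = ET.SubElement(content, w("r"))
    text = ET.SubElement(run, w("t"))
    text.text = placeholder
    return sdt


# All three argument values below are invented for this example.
control = bound_control(
    xpath="/p:patient/p:symptom[1]",
    prefix_mappings="xmlns:p='urn:example:patient'",
    store_item_id="{00000000-0000-0000-0000-000000000001}",
    placeholder="Enter symptom",
)
print(ET.tostring(control, encoding="unicode"))
```

When the user edits the control, a consuming application that supports the binding writes the value into the bound element of the custom XML part rather than only into the document text.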
If we apply that to our health-care scenario then you can imagine the Open XML document being used in a diagnosis process. A clinician opens up an office automation app and records the patient’s symptoms in custom fields in the document. When the document is saved, the patient data is stored independently as a custom schema in the docx file.
As a next step, a billing system picks up the newly created document and embeds a second custom schema into it, containing invoice information that will eventually make its way back to the patient’s health-care insurer. This addition matters only in so much as it shows that a single document can hold multiple embedded custom schemas. The billing system only needs code to work with the Open Packaging Conventions (OPC) package; it does not need to deal with the document or the diagnosis information.
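The billing step could be sketched along these lines in Python, working purely at the package level. The function adds a new custom XML part under the next free customXml/itemN.xml name and registers it in the document's relationships part. I am assuming that the main document part is word/document.xml and that the package's content types already cover the .xml extension. The invoice namespace and values in the usage lines are invented.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CUSTOM_XML_REL = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXml"
)
# Assumption: the main document part lives at word/document.xml.
DOC_RELS = "word/_rels/document.xml.rels"

# Serialize relationship elements in the default namespace, without a prefix.
ET.register_namespace("", REL_NS)


def add_custom_part(package: bytes, xml: bytes) -> bytes:
    """Return a copy of the package with one more custom XML part."""
    with zipfile.ZipFile(io.BytesIO(package)) as src:
        names = src.namelist()

        # Pick the next unused customXml/itemN.xml name.
        n = 1
        while f"customXml/item{n}.xml" in names:
            n += 1
        part_name = f"customXml/item{n}.xml"

        # Load the document's relationships, or start an empty set.
        if DOC_RELS in names:
            rels = ET.fromstring(src.read(DOC_RELS))
        else:
            rels = ET.Element(f"{{{REL_NS}}}Relationships")

        # Pick a relationship Id that is not already taken.
        used = {rel.get("Id") for rel in rels}
        k = 1
        while f"rId{k}" in used:
            k += 1
        ET.SubElement(rels, f"{{{REL_NS}}}Relationship", {
            "Id": f"rId{k}",
            "Type": CUSTOM_XML_REL,
            "Target": f"../{part_name}",
        })

        # Copy every existing part byte for byte, then add the new ones.
        out = io.BytesIO()
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for name in names:
                if name != DOC_RELS:
                    dst.writestr(name, src.read(name))
            dst.writestr(DOC_RELS, ET.tostring(rels))
            dst.writestr(part_name, xml)
    return out.getvalue()


# Usage with a stand-in package; the invoice schema is hypothetical.
seed = io.BytesIO()
with zipfile.ZipFile(seed, "w") as zf:
    zf.writestr("word/document.xml", "<doc/>")
    zf.writestr("customXml/item1.xml", '<patient xmlns="urn:example:patient"/>')
invoice = b'<invoice xmlns="urn:example:billing"><total>120.00</total></invoice>'
updated = add_custom_part(seed.getvalue(), invoice)
```

Note that the function never parses the document part or the existing patient data; it copies them through untouched, which is the segmentation the scenario relies on.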
As a final step, I’ll submit my encapsulated patient transaction to a web service somewhere that analyzes the custom XML describing the patient’s symptoms and, as a result, drops a third custom schema into the OPC package that details a possible diagnosis and some suggested medication. Again, there is no office automation involved, and no need for the web service to understand the document or the billing schema.
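The read side of such a web service might look something like the sketch below: it scans the custom XML parts, picks the one whose root element is in the namespace it understands, and ignores everything else in the package. The customXml/ folder name is a convention I am assuming, and both namespaces and the sample data are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def find_custom_part(package: bytes, namespace: str):
    """Return (part name, root element) for the first custom XML part whose
    root element is in `namespace`, or None if there is no such part."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        for name in zf.namelist():
            # Assumption: custom XML parts follow the customXml/item*.xml naming.
            if not (name.startswith("customXml/item") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            if root.tag.startswith("{" + namespace + "}"):
                return name, root
    return None


# Stand-in package holding a patient part and a billing part.
PATIENT_NS = "urn:example:patient"  # hypothetical namespace
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", "<doc/>")
    zf.writestr(
        "customXml/item1.xml",
        f'<patient xmlns="{PATIENT_NS}">'
        "<symptom>cough</symptom><symptom>fever</symptom></patient>",
    )
    zf.writestr("customXml/item2.xml", '<invoice xmlns="urn:example:billing"/>')

name, patient = find_custom_part(buf.getvalue(), PATIENT_NS)
symptoms = [s.text for s in patient.iter(f"{{{PATIENT_NS}}}symptom")]
print(name, symptoms)  # prints: customXml/item1.xml ['cough', 'fever']
```

Selecting the part by namespace rather than by file name means the service does not care how many other custom schemas earlier systems have added, or in what order.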
The original clinician can then reopen the document in their original office automation app and work with all the new information that has been added by various systems.
What is important about the way Open XML deals with this is the segmentation of the data and the ability for the developer to decide upon the structure of the custom embedded schema. This means that the Open XML spec is not dictating how this data is stored, and developers can embed any one of the thousands of XML-based business schema standards that exist in the world today. In the example above, for the US, a developer might choose to embed an HL7 schema into the Open XML file.
This capability in itself has a lot more to do with being able to use Open XML in end-point systems in a larger SOA environment, and less to do with what you might traditionally think of in terms of office automation apps, although of course the office automation app still has a key role to play whenever the document reaches a user.
I use a health-care example, but it could be any business process, and I’m already seeing examples of enterprise organizations doing this sort of work in scenarios such as supply chain or banking processes.
Can you do this with other doc formats? Well, most of them are simply not designed to do this. The majority of the document formats that are out there are designed with pure office automation or document presentation in mind. For those that do allow the embedding of custom data elements, it isn’t clear to me that this data would be protected as it passes through different applications, or that it would be possible to implement in a form that allows the segmentation of the data and conformance to existing business process schema standards.
You can see a demo of how this might look for the user in this video. The demo is based upon work in Microsoft Word, but it could be done in any other application that supports Open XML embedded custom schema.