OpenXML/DaisyXML Translator Now Available

Posted on May 7, 2008
Filed Under General, Standards | 1 Comment

Cast your mind back to last November and you may remember Microsoft committing to work with the Digital Accessible Information SYstem (DAISY) Consortium to produce a translator to their DAISY XML file format (translating WordprocessingML to the DAISY DTBook format). This allows anybody with OpenXML files to convert them for use with a wide array of assistive technologies.

I’m pleased to say that as of today the translator is available, and will run either in the shell in Windows (right click to translate, just like the ODF translator) or will integrate well with Microsoft Office.

From the Microsoft press release;

Microsoft Corp. today joined with industry and advocacy group leaders worldwide to launch new software that will make it easier for anyone to create documents and content that will be accessible for blind and print-disabled individuals. The new “Save as DAISY XML” add-in, designed for Microsoft Office Word 2007, Word 2003 and Word XP, will allow users to save Open XML-based text files into DAISY XML, the foundation of the globally accepted DAISY standard for reading and publishing navigable multimedia content (www.daisy.org).

It is also worth noting that the code for the translator is up on SourceForge if anybody wants to take a look for themselves. Again from the press release;

The “Save as DAISY XML” add-in was created through an open source project with Microsoft, Sonata Software Ltd. and the Digital Accessible Information SYstem (DAISY) Consortium and can be downloaded by Microsoft Office Word users for free at http://www.openxmlcommunity.org/daisy.

The open source nature of the Open XML to DAISY XML translation project enables technologists to utilize the source code and other resources for their own applications. As Open XML adoption continues to expand across the software industry for use on various platforms, including Linux, Windows, Mac OS and the Palm OS, solution providers interested in creating their own Open XML to DAISY XML translators can reference information available through the SourceForge open source project site at http://sourceforge.net/projects/openxml-daisy.


The Game Of Jing Pong

Posted on May 6, 2008
Filed Under Standards | 2 Comments

Almost a week ago now Alex Brown posted the details of his “smoke test” looking at an ODF document produced by OpenOffice 2.4.0, checking conformance with IS26300 with the ODF 1.0 RelaxNG schemas, using Jing.
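
For anybody wondering what a smoke test like this involves, here is a minimal sketch. It is not Alex's actual procedure. It assumes only that an ODF file is a ZIP package whose XML parts are what get checked against the schema; the list of part names, the schema file name and the jar location are all placeholders of mine, and the command line follows the usual "java -jar jing.jar schema file" form of invoking Jing. The sketch builds the commands but does not run them.

```python
import io
import zipfile

# XML streams normally found at the top level of an ODF package.
# Assumption: these four names; a real package may hold more parts.
ODF_XML_PARTS = ("content.xml", "styles.xml", "meta.xml", "settings.xml")


def extract_xml_parts(odf_bytes):
    """Return {part name: raw XML bytes} for the XML parts present in the package."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        names = set(package.namelist())
        return {name: package.read(name) for name in ODF_XML_PARTS if name in names}


def jing_command(schema_path, xml_path, jing_jar="jing.jar"):
    """Build (but do not run) a Jing command line; every path is a placeholder."""
    return ["java", "-jar", jing_jar, schema_path, xml_path]


def build_demo_package():
    """Create a tiny stand-in package in memory so the sketch runs without a real file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", "<doc/>")
        package.writestr("styles.xml", "<styles/>")
    return buffer.getvalue()


if __name__ == "__main__":
    parts = extract_xml_parts(build_demo_package())
    for name in sorted(parts):
        # "odf-schema.rng" is a hypothetical file name for the RelaxNG schema.
        print(" ".join(jing_command("odf-schema.rng", name)))
```

With a real document you would write each extracted part to disk and run the printed command against it; any errors Jing reports are the raw material for the kind of post Alex wrote.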

For most of last week nobody really seemed to care. There were a couple of press stories, but nothing like the coverage of his similar test with an IS29500 schema and a document produced by Microsoft Office a week earlier.

Then a couple of days ago IBM’s Rob Weir jumped in with an extremely long post that he titled “ODF Validation for Dummies”. I’ll let you read the details for yourself. While I’m interested in the detail, I’m more concerned by the tone of the overall post itself - I’ll come to that further down in this text.

For flavour, here is the opening line from Rob’s post;

Alex Brown has a problem. He can’t figure out how to validate ODF documents.

As you might expect, Dr. Brown felt the need to respond to this comment and posted a similarly long post of his own, digging deeper into his objectives, his findings and his intent.

Alex’s post, titled “ODF validation for the cognoscenti”, responds to several parts of Rob’s monologue. As I read through it a part headed “Negativity” caught my eye;

Amid the general downer that is Rob’s blog entry, is an assumption that I share such negative thoughts. I find myself described as “someone who would be well served if he could show that all consortia standards are junk, and that only SC34 (and he himself) could make them good”. Hmmmmm - where did that come from?

For the record, I am an enthusiastic supporter of consortia and consortium standards and know from experience that consortia contain great people who are producing some of the best standards work in the planet: XML 1.0, ODF, XSLT, UBL, OOXML (ha!) – the list goes on. Most recently I was very pleased to see a new working draft of the important new W3C XProc specification – something that SC 34 is specifically deferring to rather than attempt something similar itself. I thoroughly disapprove of the kind of oppositional mindset that sees things in a polarised “ISO vs OASIS” or “ISO vs W3C” way. In my view that mode of thinking already did enough damage during the DIS 29500 project.

Rob’s response - a hand crafted piece of XML that will validate as an IS26300 document.

Well, Yahoo! (am I allowed to use that word?)

So here is my concern.

There are literally over a billion users of office suites in the world today. These users are self selecting their favourite office suite, and at the same time choosing whatever document format is right for them.

While the debate around document formats has been an interesting one for those of us embroiled in it we have to remember that these users are the reason why we’re having the conversations, not because we have nothing else to do other than bicker with one another.

It is fascinating to watch the back and forth ping pong on blogs as points are scored, but the mentality of directly attacking an individual with the goal of proving that you’re right (regardless of the facts) really does not help anybody.

At this point it feels like we are still a long way from a scenario where somebody from the OASIS TC might reach out to Alex or another member of JTC1/SC34 to discuss the challenges that arose during Alex’s simple test, instead the goal seems to be to prove something in the blogosphere. (I’m not sure what)

Common goals around interoperability, long term sustainability of documents and simplicity for users are often articulated by all parties - but if we’re going to achieve any of those goals then the blog based fun has to end, and professional dialogue has to begin.


Hotel Hypocrisy

Posted on May 3, 2008
Filed Under General | Leave a Comment

I’m generally pretty easy going, there isn’t much that annoys me in life, but like most people I have a couple of unexplainable pet peeves.

This week has involved a series of meetings in Berkeley, CA and the hotel that I’ve been staying in has spent the week doing one of those annoying things.

Stay in pretty much any hotel today and you will find that they are making some token gestures towards environmental sustainability. Usually you’ll find a card that needs to be put on the bed if you want the sheets changed, and instructions in the bathroom suggesting that you hang up your towels in the morning and use them again.

I’m a big supporter of every small step that any organization takes in this regard, and the idea of hotels using a little less detergent or energy on a daily basis is a very good thing in my opinion.

At the same time though many hotels have not quite got their complete environmental sustainability agenda worked out.

A small (but frustrating to me) example in many hotels is the newspaper that sits on the floor outside of my room when I wake up in the morning, I never read it and I doubt many other guests read it either. Most of them probably pile up in a corner of the room until the day the guest checks out then they get thrown out.

Whenever this happens I’ll generally stop by the front desk and ask the hotel to stop delivering the daily paper. This place, like a number of other hotels I’ve stayed in this year, agreed to stop, but when I opened the door the following morning there was ANOTHER newspaper sitting there.

If you’re staying in a hotel anywhere in the world then I’d ask that you join me in raising this with the management when it happens, we should push hotels to move forwards and complete the agenda of environmental sustainability that they have started with the linens in your room, but don’t yet seem to have worked out in other parts of their business.


OpenOffice.org 2.4.0 and IS26300 Conformance…

Posted on May 2, 2008
Filed Under Standards | 1 Comment

As he promised last week, Alex Brown has gone ahead and tested an ODF file saved by OpenOffice 2.4.0 against the RelaxNG schema for IS26300, and as you would expect the test failed. (just like his test of Office 2007)

Clearly there is still work to be done.

Again, only tentative conclusions can be drawn from a smoke test (readers unfamiliar with this term as applied to software testing are recommended to read the Wikipedia article on it before grumbling about the depth of the test, please).

  • For ISO/IEC 26300:2006 (ODF) in general, we can say that the standard itself has a defect which prevents any document claiming validity from being actually valid. Consequently, there are no XML documents in existence which are valid to ISO ODF.
  • Even if the schema is fixed, we can see that OpenOffice.org 2.4.0 does not produce valid XML documents. This is to be expected and is a mirror-case of what was found for MS Office 2007: while MS Office has not caught up with the ISO standard, OpenOffice has rather bypassed it (it aims at its consortium standard, just as MS Office does).

Just like the Microsoft Office 2007 and IS29500 test that he did last week, the results are not at all surprising: both Microsoft and the OpenOffice.org project have work to do before either suite produces ISO compliant document files in either IS26300 or IS29500 format.

It will be interesting to see if the press try to make the same drama out of this non-event as they did out of last week’s non-event.

If you want to read more on the topic then Jesper Lund Stocholm has some details of his own testing posted here, and Doug Mahugh has some additional commentary that you will find here.


The Digital Divide

Posted on April 29, 2008
Filed Under General | Leave a Comment

As an industry we have a lot more work to do to close the aptly named Digital Divide. Some great work has taken place around a number of underserved groups, but we need to work out how we help all areas of society in all parts of the world get access to the technology that they need to live their lives, compete for business, learn and grow.

The Digital Divide has been a focus of many projects over the last two decades, looking at broadband connectivity to rural areas, access to information for disenfranchised individuals and more recently projects such as the “One Laptop per Child” (OLPC) project which has been looking at the provision of hardware for educational purposes.

Microsoft has done a lot of work in this area as well, with examples such as Windows Starter Edition, which provides a low cost version of Windows for those who need it, and the Local Language Program, a project that works with local communities to localize our products into many more languages than we would ever reach on a regular commercial basis.

Finally the Digital Divide is a huge area of focus for the NGOs here in the region, many of whom are administering projects to provide technology to individuals or establishing funding programs that allow individuals to provide their own access to technology that most of us take for granted.

In many of these instances the divide is defined around the individual, and very often that definition is carried in terms of the developing world and the access to technology that individuals have a right to, enabling them to learn, work, play and live in the same way as those living in any other part of the world.

I’m of the opinion that in the world that we live in today we should be thinking about the scope of this definition and considering that there are several areas where we should expand that scope. It is a given that every one of the 6bn individuals around the globe has a right to the same level and type of information, but there are several ways of delivering on this vision.

There is no question that we need to continue to push everything that is currently underway around the individual, but we can do more.

Thinking beyond the individual we should also be looking at solutions to the problems faced by emerging companies, established companies, city administrations, government agencies and even entire national governments.

The examples here are common to the current scenarios that we work with today, but need to be expanded to think about this issue in new ways, providing creative new solutions.

As a child, the technology coupled with improved teaching techniques helps you learn faster, and in parallel with other students elsewhere in the world.

As an emerging business you need access to the same information, along with management and logistical technology as other businesses elsewhere in the world. Companies and commerce are just as reliant on the availability of broadband, and access to the Internet as individuals.

As a government, at city, state or federal level, access to technology helps run a more efficient administration, using new technologies to reach citizens, integrate with your local businesses and communicate with other parts of your own government.

As we race towards 2010 we need to see more projects addressing the whole of society at all levels, we need to provide connectivity, transactional capability, common training and user education to any part of society that needs it regardless of the demographic of the area in question.

Access to many aspects of technology today is seen as a mediating factor, enabling financial growth, education, healthcare and general human well being.

As an industry and as a society we need to be ready to provide these same opportunities to all of the 6bn people on the planet. Expanding the definition of the Digital Divide will push us to think about how one European city competes with another European city, how a business in Thailand competes with a business in New Zealand, how a city in the United States provides the same standard of living, education and healthcare to its citizens as any other city on the continent.

We can’t rest here, we have work to do.


Office 2007 & IS29500 Conformance…

Posted on April 22, 2008
Filed Under Standards | 1 Comment

ZDNet UK this morning carried a story headlined “Microsoft Office 2007 fails OOXML conformance test”, and a subheading explains;

Word documents generated by today’s version of Microsoft Office 2007 do not conform to the Office Open XML standard under development by the International Organization for Standardization, according to tests run by a document standards specialist.

Given that the recent ISO process to standardize OpenXML took significant amounts of feedback around the details of the specification and made changes based on that feedback this shouldn’t really be all that much of a revelation.

Knowing that the specification evolved significantly as it moved from Ecma 376 to IS29500, there should be no dispute that there are differences between the Ecma-376 documents produced by Microsoft Office today and the final IS29500 specification.

The ZDNet news story is based on a blog post that Alex Brown posted on his company’s site a few days ago. He opens with;

I was excited to receive from Murata Makoto a set of the RELAX NG schemas for the (post-BRM) revision of OOXML, and thought it would be interesting to validate some real-world content against them, to get a rough idea of how non-conformant the standardisation of 29500 had made MS Office 2007.

Alex goes on to look at how the specification document for Ecma-376 (an Office 2007 document) conforms to the post BRM details of the specification now known as ISO/IEC IS29500.

He states in his post that he is doing this “to get a rough idea of how non-conformant the standardisation of 29500 had made MS Office 2007”, and much as you might expect the document does not conform to a standard that didn’t exist at the time when the document was originally created.

At the end of his post Dr Brown asks “What’s next?”

To repeat the exercise with ISO/IEC 26300:2006 (ODF 1.0) and a popular implementation of OpenDocument. Will anybody be brave enough to predict what kind of result that exercise will have?

We’ll see.

My colleague Doug Mahugh took a closer look at Dr. Brown’s post and offers some of his own commentary on the unsurprising outcome of the tests that Alex did. 

He opens with…

It’s an interesting question. Office 2007 supported the ECMA-376 standard, but many changes were made during the evolution from ECMA-376 to IS29500. How many of those changes affect the content in a typical large document?

… and later in the post goes on to say…

The results were predictable: the document was not conformant to either class. Changes made at the BRM are not yet reflected in any existing implementations, and in this case the Ecma spec was created over a year before those changes were made. Here are the totals:

  • Validation against the strict schemas: 122,000 errors
  • Validation against the transitional schemas: 84 errors

Office 2007 was designed to be highly compatible with existing documents, so it uses features of Open XML that provide backward compatibility, including many of the elements and attributes that were moved to “transitional status” as a result of the BRM. So the test of strict conformance, although interesting, is a bit abstract: it’s testing whether a document conforms to a subset of the spec that was defined after the document was created.

The second number is the more meaningful one. Those are places in the test document where something is done in a way that doesn’t match the final IS29500 spec. Alex provides one specific example, to show the types of changes caught by that test: an attribute with a value of “on” that should say “true” instead, due to “one of the many tidying-up exercises performed at the BRM.”
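
To make that last example concrete, here is a small sketch of how you might hunt for that kind of change yourself. It is an illustration only: the element and attribute names in the sample are made up, and the only spellings it knows about are "on" and "off", which is my assumption rather than the full list of BRM changes. It also flags any attribute carrying one of those values, which is cruder than a real schema check.

```python
import xml.etree.ElementTree as ET

# Legacy spellings mapped to the spellings the final spec expects.
# Assumption: only "on"/"off" are handled; the real change list is longer.
LEGACY_BOOLEANS = {"on": "true", "off": "false"}


def find_legacy_booleans(xml_text):
    """Return (element tag, attribute name, old value, suggested value) tuples."""
    findings = []
    # iter() walks the tree in document order, starting with the root element.
    for element in ET.fromstring(xml_text).iter():
        for attribute, value in element.attrib.items():
            if value in LEGACY_BOOLEANS:
                findings.append((element.tag, attribute, value, LEGACY_BOOLEANS[value]))
    return findings


# Hypothetical sample markup, not taken from any real document.
SAMPLE = '<doc><bold val="on"/><italic val="true"/><caps val="off"/></doc>'

if __name__ == "__main__":
    for tag, attribute, old, new in find_legacy_booleans(SAMPLE):
        print(f'<{tag}> {attribute}="{old}" should be "{new}"')
```

A scan like this only finds candidates; whether a given attribute really is a boolean is something only the schema can tell you.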

So where exactly does Microsoft stand on the issue of conformance of Microsoft Office 2007 with the final IS29500 specification?

Chris Capossela (Microsoft’s Senior VP for Office) addressed this in an open letter around a month ago by saying;

We’ve listened to the global community and learned a lot, and we are committed to supporting the Open XML specification that is approved by ISO/IEC in our products.

It is coming… Microsoft is committed to supporting the ISO/IEC standard for OpenXML!

As an aside, Doug’s post closes with a comment about two of this year’s winners of the Google Summer of Code;

Google recently unveiled the winning entries in Google’s Summer of Code 2008, a program that offers student developers stipends to write code for various open source projects. Two of this year’s winners are enhancements to the Open XML implementation in AbiWord.


The World Clock

Posted on April 22, 2008
Filed Under General | Leave a Comment

A link to this “World Clock” was sent to me in mail by a colleague recently. I’m not sure why I liked it so much, probably my uncontrollable fascination with numbers.


Home At Last… Tired And Drowning

Posted on April 21, 2008
Filed Under Standards | Leave a Comment

Well, vacations can be fun but at some point the holiday has to end and you have to return to work…

I’m sat here trying to catch up on everything from over the last two weeks.

For the first time in as long as I remember this vacation involved me leaving both my laptop and phone at home and in effect falling completely off the face of the digital planet.

It was as good and relaxing as I expected to be away from communication for a few days, but I’m not sure I fully thought through the implications of the workload I would face when I returned!

Several thousand emails, several thousand unread articles on the Internet and what feels like several thousand hours worth of meeting requests hovering around.

Sifting through the OpenXML related headlines not a lot appears to have changed in the last couple of weeks. Groklaw and Slashdot are still faintly buzzing with the same stories as the start of last month.

Jason Matusow has been engaged in a number of vibrant discussions around the role of IP in standards creation and usage, the comments being expressed on several of his recent posts are making for a good overall discussion. His most recent entry brings some of those conversations together.

One of the most challenging aspects to the threads I’ve been reading in the responses to my post (and I see this in the Groklaw post as well) is that many issues are getting squashed together - and that is the very basis of misunderstanding these issues.

He also has posted some details of the next steps in Microsoft’s commitments to the Interoperability Principles that were published earlier this year.

Continuing with the theme of publishing protocol related information for high volume products, some 14,000 more pages of documentation have been posted to the web.

The documentation is for:

  • Protocols between Microsoft Office SharePoint Server 2007 and Office client applications
  • Protocols between Microsoft Office SharePoint Server 2007 and other Microsoft server products
  • Protocols between Microsoft Exchange Server 2007 and Microsoft Office Outlook
  • Protocols between Microsoft Office 2007 client applications and other Microsoft server products

Alex Brown has written about the process that ISO’s SC34 is going through to take full control of OpenXML following on from the recent SC34 meeting in Oslo;

Now however, the whole process moves forward into a much more significant stage. At the just-finished SC 34 meeting in Oslo a number of resolutions were passed relating to 29500. The most significant of these is resolution 4, “Creation of Ad Hoc Group 1 on ISO/IEC 29500 Maintenance”, and it’s worth looking at it in some detail. I will go through the complete resolution below with some explanation of my own …

You should read Alex’s full post for further information, as might be expected there is a lively conversation taking place in the comments.

On a “less process, more technical” note, the April CTP version of the OpenXML SDK is now available, and Erika Ehrli has all the details.

The Open XML Format SDK Technology Preview simplifies the task of manipulating Open XML packages. The Open XML Application Programming Interface (API) encapsulates many common tasks that developers perform on Open XML packages, so you can perform complex operations with just a few lines of code. Using this API, you can programmatically generate and manipulate Word 2007 documents, Excel 2007 spreadsheets, and PowerPoint 2007 presentations. The programming model uses managed code, so it’s safe for server-side scenarios.

… Erika goes on to talk a little about the future of the SDK…

The Open XML API will release in two versions. Open XML API Version 1.0 is the updated version of the CTP in June 2007 and will only contain the Open XML Packaging API. Open XML API Version 2.0 will contain all of the Open XML API components, including the Open XML Packaging API with further updates. It will enforce validity of the content either in the original Open XML documents or being generated through this API. The purpose of this plan is to give out the long awaited Go-Live license of the existing Open XML Packaging API to external developers.

She is looking for feedback from developers on the path that the SDK is taking, so please consider joining the conversation on her blog.

Finally, ISO/IEC have published an FAQ that talks about the process that IS29500 has been through, you’ll find it here.

The FAQ looks at many of the questions that have been raised over the last fifteen months and offers a direct ISO/IEC response.

It ends with a high level question about the process itself.

Will ISO and IEC review how ISO/IEC 29500 was adopted?

We reviewed the process before it started, all the while during its course and afterwards as well. While the voting on ISO/IEC 29500 has attracted exceptional publicity, it needs to be put in context. ISO and IEC have collections of more than 17 000 and 7 000 successful standards respectively, these being revised and added to every month. This suggests that the standards development process is credible, works well and is delivering the standards needed, and widely implemented, by the market. Because continual improvement is an underlying aim of standardization, ISO and IEC will certainly be continuing to review and improve its standards development procedures.

I guess it is time I stopped hiding in Live Writer and reading other people’s blogs. I should get back to clearing out my overflowing Inbox… more tomorrow.


Searching For Conspiracies

Posted on April 5, 2008
Filed Under Standards | 4 Comments

As we know, last Wednesday ISO/IEC announced that DIS29500 had gained enough votes for it to pass the ratification process and become IS29500.

The last week has been a quiet one for me personally, I have spent most of it clearing up many of my badly neglected admin tasks interspersed with reading news stories and blogs that document the ongoing OpenXML conversations.

Groklaw and Slashdot are buzzing away quietly with various stories, looking for conspiracies in the darkest corners of the internet. Some of the news stories are focusing on the facts, some are praising Microsoft for the steps we have taken over the last two years, others are predictably saying that we have not yet gone far enough. Give us time on the last point, we still have a lot to learn and even more to do.

A few of the blog entries caught my attention along the way.

Jan van den Beld, the former Secretary General of Ecma has been blogging for a while now, I have highlighted a couple of his posts already.

His most recent post reflects on the process, how strong the support for OpenXML is globally and what he describes as the hypocrisy of those who are still pushing back on the tremendous amount of hard work that has been applied by so many over the last fifteen months.

This is a resounding collective voice of support from countries around the world, including the four largest IT markets: the US, Japan, Germany and the UK. Is there any other document format standard that has received such widespread support from the global community? No. This was not a close vote – Open XML won by a healthy margin. Only ten markets voted against ratification, and in a number of those there were strong voices in support of Open XML. By any measure this is a clear statement of support for ratification after a very careful review process that rivals any other standards review in history.

… he goes on to talk about some of the negative voices that are currently echoing around the blogosphere.

These direct attacks on the integrity of national standards bodies are without merit. They reflect a lack of understanding of how standards are developed and how standards bodies operate, or are a cynical attempt to spin things now that 61 countries have decided not to follow their hotly delivered directions. Understandably, national standards bodies are striking back, protecting their hard-earned and well deserved reputations from this smear campaign.

Jason Matusow has posted some related comments, he talks a little about participation in the national standards bodies and highlights the fact that lots of people from all parts of the industry have come to the table to participate in the conversations around DIS29500, many of whom were not there two years ago.

He highlights some examples where IBM and Google have come late to the party, along with the fact that they have as valid a voice as anybody else at the table, providing the NB rules allow their participation.

In Norway when IBM and Google join the committee 2 days before the final vote…or when IBM brings a subsidiary company to the table with them in Italy effectively giving one company 2 votes…or when Oracle and Red Hat join the US V1 committee just before it votes….that is participation, right? I actually believe that to be true. It is no different than Microsoft or its business partners coming to the table to have their voices be heard in the process. As long as the participation is within the context of the rules for a given NB, then it is legitimate participation.

From a personal point of view, I say the more the merrier.

My only hope from here is that the many industry voices who have turned up for these discussions in local committees stay engaged as further specifications are brought to the table, regardless of who submits them or how they’re submitted.

Finally, Miguel de Icaza talks a little about the progress that has been made by Microsoft during the process to standardize OpenXML, I would like to think that whatever your personal views are around the process or the specification that some goodness has come out of this for the entire community.

Speaking as a Microsoft employee I see an amazingly strong commitment at all levels of the company to get our participation in the standards processes right, to change the company in whatever way is needed and to ensure that we continue to have a strong, interoperable and participatory role in the community and the industry as it evolves from here.

Here are just some of those steps, as highlighted by Miguel.

1. The specifications for the old binary file formats were published under the OSP (February of 2008).

2. The above documents were backed up by the British Library in case Microsoft ever stops publishing them (announcement).

3. Microsoft is funding the development of a translator between the old binary file formats and OOXML which should assist folks that have experience in one format and want to understand the other, or just want to convert documents back and forth. If your app lacked support for OOXML, but had support for the old formats, you can use these tools.

4. Microsoft agreed that future versions of OOXML will be covered by the OSP a concern that some people had about future versions of the document.

5. Microsoft pledged to modify future versions of Office to implement the ISO version of OOXML.

6. A working group was created to look into harmonization of OOXML and ODF, something that many developers involved in office suites have been advocating for a long time.

7. Microsoft pledged to support features to support other file formats as native file formats in their office suite:

Last year we sponsored a translator project that gave people the ability to read and write ODF files from Microsoft Office. Last month we announced that we would update the Office product so that the ODF translators could natively plug into Office and give people the same options they get from the other file formats. People will be able to set ODF as the default format in Office if that’s what they want by simply installing the translators and then changing their settings.

8. Lots of clarifications went into the spec, and people should be happy about that.

9. And finally, now that OOXML is an ISO standard, as Patrick Durusau implied there are many winners.

If you’re looking for a real conspiracy - for the last two years my wife and I have been planning a trip to Africa, we leave tonight and will be gone for two weeks… with the process to standardize OpenXML now complete it will not conflict with my vacation.

I can assure you that the completion date for the OpenXML standardization process was not planned with my vacation in mind, or was it?


Final: ISO/IEC DIS 29500 receives necessary votes for approval as an International Standard

Posted on April 2, 2008
Filed Under Standards | Leave a Comment

ISO/IEC have now posted their press release, click this link for the full text.

ISO/IEC DIS 29500, Information technology – Office Open XML file formats, has received the necessary number of votes for approval as an ISO/IEC International Standard.

Approval required at least 2/3 (i.e. 66.66 %) of the votes cast by national bodies participating in the joint technical committee ISO/IEC JTC 1, Information technology, to be positive; and no more than 1/4 (i.e. 25 %) of the total number of ISO/IEC national body votes cast to be negative. These criteria have now been met with 75 % of the JTC 1 participating member votes cast positive and 14 % of the total of national member body votes cast negative.

The 30-day period during which ISO/IEC national bodies had the opportunity to reconsider their votes on the draft ISO/IEC DIS 29500 closed at midnight on Saturday, 29 March 2008, with the result that the criteria for approval of the document as an ISO/IEC International Standard have now been met.
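
The two criteria in the press release are easy to check for yourself. The sketch below simply encodes them as stated above ("at least 2/3" positive among participating members, "no more than 1/4" negative among all votes cast) and plugs in the published 75 % and 14 % figures; the function and parameter names are my own.

```python
from fractions import Fraction

# Thresholds as quoted in the press release.
MIN_P_MEMBER_POSITIVE = Fraction(2, 3)   # at least 2/3 of participating-member votes positive
MAX_TOTAL_NEGATIVE = Fraction(1, 4)      # no more than 1/4 of all national body votes negative


def is_approved(p_member_positive, total_negative):
    """Apply both approval criteria; each argument is a fraction of the votes cast."""
    return (p_member_positive >= MIN_P_MEMBER_POSITIVE
            and total_negative <= MAX_TOTAL_NEGATIVE)


if __name__ == "__main__":
    # Figures published for ISO/IEC DIS 29500: 75 % positive, 14 % negative.
    print(is_approved(Fraction(75, 100), Fraction(14, 100)))
```

Exact fractions are used rather than floats so that a result sitting precisely on the 2/3 boundary is not tripped up by rounding.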

The breakdown of the votes for those of us here in Asia can be found in an earlier post.
