Binary File Format Specifications Under the OSP And A New Open Source Converter Project

Several of my Microsoft colleagues are blogging this morning about a comment that Brian Jones posted yesterday. Brian announces two things that the company is doing to support developers and organizations that want to understand the relationship between the binary file formats and OpenXML and/or manage conversions between the two.

1. The specifications for the binary file formats will be placed under the Open Specification Promise (OSP). This means that any developer can now access these specifications without contacting Microsoft or signing any agreement. The file format documentation has been available since 2006 under a RAND-Z licence through the process described in this knowledge base article; applying the OSP to the documentation simplifies access further.

2. A project will be established on SourceForge to build a converter between the binary files (.doc, .xls, .ppt) and the new OpenXML format (.docx, .xlsx, .pptx). This means there will be libraries available under the BSD license that clearly demonstrate the mapping between the two file formats, which any developer can likewise use as a reference.

Brian’s post carries the text from one of TC45’s proposed dispositions that relates to this decision:

We believe that Interoperability between applications conforming to DIS 29500 is established at the Office Open XML-to-Office Open XML file construct level only.

Prescriptive guidance on, or tools to enable, transformation from Microsoft Office “binary” file formats (i.e., .doc, .xls, and .ppt) (the “Binary Formats”) to Office Open XML formatted files is not the intention or in scope of DIS 29500. As a result this request is outside the bounds of this process.

It is important to note that substantial use is being made of both the Binary Formats and Office Open XML in the marketplace today.  Many products (such as OpenOffice.org) support the Binary Formats. Microsoft has indicated that many companies and public institutions have received the documentation for the Binary Formats, and are working with it at this time, and can create mappings between the Binary Formats and Office Open XML. Translators from the Binary Formats  to XML formats such as ODF have already been developed and are in wide use. For example, the Sun ODF Plug-in for Microsoft Office (http://sun.systemnews.com/articles/112/3/sw/18208) states that  “The plug-in allows users the ability to seamlessly convert Microsoft Office documents to and from ODF. The ODF plug-in supports Microsoft Word, Excel and Powerpoint”.

Likewise, there is widespread use of Office Open XML in the marketplace today across platforms and applications.  A few examples include the implementations released by Apple (Mac OS X Leopard, iWork 08, iPhone), Adobe (InDesign), Microsoft (Office 2007, Office 2003, Office XP, Office 2000, Office 2008 Mac OS X), Novell (Suse Open Office), Google (Search / Preview), Mindjet (MindManager), Intergen, OpenXML/ODF Translator (Open Source project on Sourceforge), Dataviz (DocumentsToGo on Palm OS, MacLinkPlus on Mac OS X Leopard), NeoOffice, Altova (XMLSpy), MarkLogic (XML Content Server), Datawatch (Monarch Pro), QuickOffice  (QuickOffice Premier 5.0 on Symbian), Altsoft (XML2PDF Server 2007) and those under development by Corel (WordPerfect), AbiWord, Gnome (GNumeric),  Xandros, Linspire, Turbolinux and others.  These implementations are now available on many platforms, including Linux, the Macintosh, Windows, and handheld devices (PalmOS, Symbian, iPhone, and Windows Mobile).

The widespread use of both Binary Formats and Office Open XML formats indicates that, at this time, third parties can use both formats and build mappings between them.

Nonetheless, Ecma International discussed this subject with Microsoft Corporation, the author of the Binary Formats.  To make it even easier for third party conversion of Binary Format-to-DIS 29500, Microsoft agreed to:

· Initiate a Binary Format-to-ISO/IEC JTC 1 DIS 29500 Translator Project on the open source software development web site SourceForge (http://sourceforge.net/ ) in collaboration with independent software vendors.  The Translator Project will create software tools, plus guidance, showing how a document written using the Binary Formats can be translated to DIS 29500.  The Translator will be available under the open source Berkeley Software Distribution (BSD) license, and anyone can use the mapping, submit bugs and feedback, or contribute to the Project.  The Translator Project will start on February 15, 2008.

· Make it even easier to get access to the  Binary Formats documentation by posting it and making it available for a direct download on the Microsoft web site no later than February 15, 2008.  The Binary Formats have been under a covenant not to sue and Microsoft will also make them available under its Open Specification Promise (see www.microsoft.com/interop/osp) by the time they are posted.

We will modify DIS 29500 to include an informative reference to the SourceForge project.

This is great news for a number of developers in the Asia Pacific region that I have worked with over the last year. The request for this has been raised both as part of the ISO process and by developers who are already working with the file formats outside of that process.

Categories: Standards