Where To Find The Microsoft Office Binary File Format Specifications
26 February 2008
A short while ago I mentioned that Microsoft had committed to releasing the file format specifications for the Microsoft Office Binary files under the Open Specification Promise and making them generally available, removing any of the complications that developers previously had to go through to get hold of these documents.
So, the only remaining question is where to look for these documents. A few organizations are stepping forward to host and archive them.
The first location is an obvious one, and it is Microsoft. The documents can be found on Microsoft.com by following this link.
There you will find:
- Word 97-2007 Binary File Format (.doc) Specification PDF | XPS
- PowerPoint 97-2007 Binary File Format (.ppt) Specification PDF | XPS
- Excel 97-2007 Binary File Format (.xls) Specification PDF | XPS
- Office Drawing 97-2007 Binary Format Specification PDF | XPS
Microsoft has also made specifications for a number of supporting technologies available under the OSP. These include:
- Windows Compound Binary File Format Specification PDF | XPS
- Windows Metafile Format (.wmf) Specification PDF | XPS
- Ink Serialized Format (ISF) Specification PDF | XPS
The other part of the announcement about the binary file formats was the creation of a translator project on SourceForge that would look at translating older Microsoft Office documents from the binary file formats to the new OpenXML format.
The project is now live, and can be found here.
At the same time, two other organizations have agreed to host these specifications. The first of these is the British Library; below is a small excerpt from the page on which they are hosted:
The British Library believes that it is essential to archive and, where possible, provide access to the specifications of digital file formats. These specifications are important today for people developing applications that work with digital file formats, but archived copies will be even more critical in the future when today’s applications are long obsolete.
You will find the specifications on this page on the British Library site.
The second third-party organization that will host the documents is the United States Library of Congress. Here is an excerpt from their site that again highlights the intention to preserve access to these documents for generations to come:
Listed here are selected specifications made available for downloading by the Library of Congress with the permission of their owners and the intention of ensuring permanent access to the specifications for the digital preservation community and other users. Also listed are URLs for sources of freely downloadable specifications for digital formats from standards organizations.
You will find the documents on the Library of Congress digital preservation site here.
All in all, this means that the documents are available for developers who want access to them today, and are preserved for future generations by a combination of the perpetual nature of the OSP and the efforts of the Library of Congress and the British Library to host this specification documentation on an equally perpetual basis…