Archive for December, 2007

Enjoy The Holidays And Have A Great Start To The New Year!

24 December 2007

It has been a fun year. It seems that I picked a near-perfect time to step out of the corporate fortress and head back into the real world, with the Open XML debate underway and all.

That said, I can’t think of a single individual I’ve met in the last year that I have not thoroughly enjoyed engaging with. In the Open XML debate alone (and I’m pleased to say that this year has involved a lot more than just that single thread!) I have met a whole range of folks that Microsoft employees just don’t generally get an opportunity to spend time with.

As the year has gone on, some of these individuals have moved from being adversaries to being folks that I have enjoyed some slightly more social time with… and that is worth a fortune, in terms of my own personal growth, my increasing understanding of the true diversity of cultures represented in this region, and how I represent Microsoft.

I think I’ll leave the year with a link to a blog post that I came across earlier this week on Jesper Lund’s blog. The post started off as a tongue-in-cheek look at some of the blogging techniques that have been used, specifically by our friends at IBM, over the last year. In itself the post is mildly amusing; the interesting part, though, is the discussion that then ensues between Rob Weir and Gary Edwards in the comments section. It makes for an interesting read, and a fun way to burn some of the spare time that we all have during this season!

You will find the post linked here. Below I’ve copied a small extract of one of the comments. I don’t know that any of this has any value, but the drama is certainly entertaining.

Don’t click the link unless you have about an hour to spare.

In the words of Gary Edwards from the comments section (just a teaser):

Interestingly, IBM joined OASIS ODF two days prior to the final vote period. IBM joined the ODF technical Committee on April 14th, 2005, right before the final vote on ODF 1.0 which took place between April 16th and April 30th of 2005. Two days! No meetings. No work. No contributions. Zero participation. But oh what a motherload of media coverage IBM continues to reap.

I’m sure next year holds even more than this year has. I know Microsoft has some big plans for continuing to engage in the standards world, and I’m personally looking forward to being a part of it.

Enjoy the holidays and have a great New Year!

Third Batch Of Proposed DIS29500 Dispositions Posted

23 December 2007

A very quick update for the various technical teams working on OpenXML around the region.

The third batch of proposed technical dispositions of the comments on OpenXML has been published on the Ecma International web site for review by the National Standards Bodies and their representatives.

As with the last batch, there are some pretty substantial changes proposed. Brian Jones has the full information in a post over on his blog.

From Brian’s post:

At a high level, we’re responding to feedback that requested simplification for those people who only wanted to implement a subset of the spec. We are also removing from the main specification functionality that is necessary for completeness but is not necessarily appropriate for documents created in the future. We’re moving that functionality to an annex. And finally, we fixed some concerns that National Bodies had about how to handle some legacy issues, Dates in particular with the infamous “date leap bug”.

Over 20 Million Successful Downloads Of The Open XML Compatibility Pack

21 December 2007

The compatibility pack comes up in conversation pretty frequently these days. In past years, when Microsoft has upgraded to a new file format, users of older versions of Office products have had to deal with some well-documented issues when files are sent to them in the newer format.

With the advent of Open XML and the implementation that we have in Office 2007, Microsoft took one further step and delivered a piece of code that we call the “Compatibility Pack for Open XML”. The compatibility pack allows users of Office 2003 and Office XP to work with Open XML files without any need to upgrade to a newer version of Office.

The program owner for this work in Microsoft is a lead by the name of Gray Knowlton. Some of you will have met him, as he got involved in several technical discussions around the region, including the technical workshop in New Zealand last August.

Gray has a wealth of knowledge in the office productivity and file formats space. He has spent a few years with Microsoft now, and before that was with Adobe working in similar areas. A couple of weeks ago somebody finally managed to talk Gray into starting his own blog. He is more of a product type and less of a standards type, so we should start to see some useful info from him around the impact of Open XML and related technologies.

One of his first posts covers the download stats of the compatibility pack, mentioning that we just passed a milestone of 20 million successful downloads.

In many of the public debates that I have participated in around Open XML I frequently get told that Open XML isn’t being adopted. The example cited is usually based on the number of DOCX files Google has indexed or some similar measure. Most users are likely to use the files produced by Microsoft Office inside a firewall and, if a document is going to be used externally, to convert it to PDF or some other publishing format. Given that, these numbers from Google and other search engines are not much of a surprise, or of much real use.

Watching the download counters on the compatibility pack, the converter projects on SourceForge and a number of other real metrics does show us that the usage of Open XML is high. Users are choosing to install the compatibility pack either to create and manage Open XML documents of their own or to exchange documents with the many millions of users of Office 2007 and of the other office and software packages that now support Open XML.

Anyway, Gray’s post on the topic is a lot more interesting than mine. If you want to have a read for yourself it is linked here, and I’ve copied a small excerpt below:

We decided to make it available as a manual download, and not as an automatic update, and during the first 12 months of its release, the compatibility pack has been successfully downloaded over 20 million times. This means that 20 million people have elected to manually download this 26.2MB software to their computer. This is a significant number of people adding Open XML to their environment.

Now is a good time to get past the denial phase that some quarters still seem to be stuck in and accept that Open XML, like PDF, ODF and a number of other office document formats, has broad adoption in the market today.

Standardizing Open XML alongside these other formats provides a strong base from which we can collectively start to look at conversations such as interoperability and document fidelity in a way that will help our mutual customers.

Haansoft Announces Support For Both OpenXML and ODF

18 December 2007

Earlier this week Korea’s largest producer of Office Automation software announced that they will support both Open XML and ODF in the next version of their product.

This is extremely significant for Korean users of Haansoft’s Hangul package, where as I understand it the number of users of Hangul outweighs users of any other office package.

You’ll find an English version of the announcement on ZDNet’s Korean site:

On Thursday 13, it announced that it planed to support not only ‘ODF’(Open Document Format), the standard of International Organization for Standardization (IOS), but also ‘Open XML’, promoted by ECMA International, in its next version of Hangul software. 

By supporting internationally recognized open type documentation standards in its new version of Office, Haansoft plans to cultivate its competitive power and lead the standardization of domestic office documentation.

There is substantial coverage of this in the Korean press. If you are interested in reading more, the following links (in Korean) will help.

Chosun, Money Today, ETNews, FNNews, ZDNet, DDaily, HeraldBiz, INews24 and eToday.

ICEGov ‘07 & Open XML Discussions In Australia

15 December 2007

Like many other recent weeks, most of the last seven days has been consumed by travel, interspersed with real work at a couple of really interesting events. On the plus side, I did get to undertake part of that travel on SIA’s new Airbus A380, a stunning plane and a stunning experience. Returning home on one of their 777s from Christchurch next week just won’t be the same.

Anyway, the first part of the week was spent with colleagues from the United Nations University in Macau attending their ICEGov event. The second part was spent in Sydney, where I got the opportunity to participate in the symposium that the University of New South Wales was hosting, looking at the technical and legal aspects of Open XML as they pertain to the needs of Australian users, developers and business.

And to round things off I’m now sat in a hotel room in Sydney, trying to catch up on the events of the week and clear my inbox down to a point where it becomes manageable again. As I am sure you have read before, we exchange a LOT of email inside Microsoft.

The ICEGov conference was pretty unique in its makeup. We have been working with a couple of members of the faculty for a little while now on some research questions around eGovernment and interoperability, but this was the first opportunity I have had to visit the school and gain a wider view of the work that is going on there.

Unfortunately I could only stay for the first two days of the event. The sessions I attended looked at applying formal engineering techniques to eGovernment development, interoperability through decisions around architecture and technology, eGovernment policy management, and eParticipation, an area of eGovernment where I personally believe we will see an increasing focus in years to come.

Usually at this type of conference sessions consist of various government or industry leaders presenting best practice based upon recent projects that they have been involved in. These types of events are interesting, it is always good to learn what is going on elsewhere in the world, but every government differs in terms of technology use, social structure, culture and related government policy so it is sometimes hard to see how these best practices can be picked up and put to good use in another jurisdiction.

The format of ICEGov was far more academic in its approach, with each of the sessions being closer to half a day and the format of the content being constructed more as a topic tutorial, drawing on occasional cases where needed. I found every session I participated in helpful, and in every case walked out of the session with a handful of new ideas that I hadn’t walked in with.

Great stuff, and a big congratulations to the organizing team who I know put a lot of effort into pulling this together.

The second event was equally interesting. The symposium at the University of New South Wales’ CyberLaw Centre had been arranged for some time, and about 30 people took part in both halves of the day. The first half was a technical discussion; the second half looked at the legal coverage for the specification.

As is always the case with these events it was a spirited but constructive discussion, with Rick Jelliffe and Matthew Cruickshank facilitating conversations around the technical aspects of Open XML and then Colin Jackson presenting the views of the New Zealand Government on the topic of Open XML and open documents in general.

The conversations during the afternoon session were led by Ronald Yu and Microsoft’s Steve Mutkoski. Good points were made on all sides of the debate, and several of us agreed that a post-February beer or two might be a good idea.

I would really like to see more of these types of event in the region. The debate on the Internet sometimes consists of one side throwing a grenade over the wall at the other, then the other side throwing one back. Events like the one at UNSW give everybody a chance to spend time getting into the technical, legal and standardization questions. I know that I learned a few things on the day and I would like to think that some of the other participants did as well. It was good fun, there is always a lot to be gained from open conversation.

Open XML: TC45 Begins Addressing The More Complex Comments

12 December 2007

A number of sites are reporting that a further set of proposed dispositions have been posted to the Ecma web site for review by the national standards bodies and various technical committees.

Brian Jones, Microsoft’s representative on TC45 within Ecma, mentions on his blog that the committee has now looked at over half of the submitted comments and has constructive responses that describe how those comments can be addressed.

With this drop of dispositions Ecma has begun to look at the more complex comments that were raised during the technical review phase of the fast track process. Some of these will represent substantial changes to the specification, and part of the TC45 status report on Ecma’s site details these suggestions.

You can find the full TC45 status report here. For ease of reading I have copied the text that talks about the specific changes that are being proposed:

Allowing for ISO-8601 Dates

ECMA-376, the original Open XML standard adopted by Ecma, assigned a unique numeric value to each date in a spreadsheet, in order to improve the speed of date calculations. Based on the comments received from some National Bodies on this issue, DIS 29500 will be updated to allow date values to be stored using the format defined by the ISO 8601 standard.
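
To make the date change concrete, here is a minimal Python sketch of converting a numeric spreadsheet date into an ISO 8601 string. It is an illustration only, not text from the status report: the function name serial_to_iso is hypothetical, and it assumes the widely documented 1900 date system, in which serial 1 is 1 January 1900 and serial 60 stands for the non-existent 29 February 1900 (the "date leap bug").

```python
from datetime import date, timedelta


def serial_to_iso(serial: int) -> str:
    """Convert a 1900-system spreadsheet serial date to an ISO 8601 string.

    Assumption: serial 1 is 1900-01-01 and serial 60 is the fictitious
    1900-02-29, so every later serial is shifted by one day.
    """
    if serial < 1:
        raise ValueError("serial dates start at 1")
    if serial == 60:
        raise ValueError("serial 60 is the non-existent 1900-02-29")
    # Before the phantom leap day the epoch is 1899-12-31; after it,
    # one day earlier, to absorb the extra serial number.
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return (epoch + timedelta(days=serial)).isoformat()


print(serial_to_iso(39440))
```

Storing the ISO 8601 string directly, as the proposal allows, avoids the need for every implementation to reproduce this special case.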

Internationalized handling of weekdays and weekends

ECMA-376 allowed for a week that begins on Sunday or Monday, but not a week that begins on any other day, such as Saturday. Ecma is proposing a comprehensive range of options for what is defined as the first day of the week, and what is defined as the weekend.
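
As a rough illustration of why a configurable weekend matters, the Python sketch below counts working days in a range with the weekend passed in as a parameter. It is not taken from the specification or the status report; the function name count_workdays and its parameters are hypothetical, and weekdays are numbered the way Python numbers them (Monday is 0, Sunday is 6).

```python
from datetime import date, timedelta

SAT_SUN = frozenset({5, 6})  # Saturday and Sunday
FRI_SAT = frozenset({4, 5})  # Friday and Saturday


def count_workdays(start: date, end: date, weekend=SAT_SUN) -> int:
    """Count the days from start to end (inclusive) that fall outside the weekend."""
    total_days = (end - start).days + 1
    return sum(
        1
        for offset in range(total_days)
        if (start + timedelta(days=offset)).weekday() not in weekend
    )


# Friday 7 December 2007 to Saturday 8 December 2007
print(count_workdays(date(2007, 12, 7), date(2007, 12, 8)))
print(count_workdays(date(2007, 12, 7), date(2007, 12, 8), weekend=FRI_SAT))
```

The same two dates give a different answer depending on which days count as the weekend, which is the kind of regional variation the proposal is meant to accommodate.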

Language tags

ECMA-376 used a set of integer values to identify the language applied to regions of a document. Ecma is proposing that the language tags specified in the DIS should instead leverage an internationally recognized practice for representing languages, IETF BCP 47. IETF BCP 47 is a Best Current Practices document that incorporates use of the ISO 639 standard for languages, ISO 15924 for scripts, and ISO 3166 for regions. This proposal directly follows recommendations from National Bodies in several countries.
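
The Python sketch below illustrates the difference between an integer language identifier and a BCP 47 style tag. It is an illustration only and not part of the status report: the lookup table assumes the integers resemble Windows locale identifiers, the names LCID_TO_TAG, to_bcp47 and split_tag are hypothetical, and the pattern covers only a simplified language-script-region subset of what BCP 47 permits.

```python
import re

# Assumed sample of integer identifiers and their tag equivalents.
LCID_TO_TAG = {
    1031: "de-DE",
    1033: "en-US",
    1041: "ja-JP",
    1042: "ko-KR",
    2052: "zh-CN",
}

# Simplified shape: language (ISO 639), optional script (ISO 15924),
# optional region (ISO 3166 alpha-2 or a three-digit area code).
TAG_PATTERN = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[A-Z][a-z]{3}))?"
    r"(?:-(?P<region>[A-Z]{2}|\d{3}))?$"
)


def to_bcp47(lcid: int) -> str:
    """Map an integer identifier to a tag, or 'und' (undetermined) if unknown."""
    return LCID_TO_TAG.get(lcid, "und")


def split_tag(tag: str) -> dict:
    """Split a simple tag into its language, script and region subtags."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a simple language-script-region tag: {tag!r}")
    return match.groupdict()


print(to_bcp47(1042), split_tag("zh-Hant-TW"))
```

A tag carries its own meaning in its subtags, whereas an integer is only useful to software that ships the same lookup table.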

Page Borders

ECMA-376 included support for a variety of graphical elements that could be used as page borders. Several National Bodies noted that this closed list of graphical elements was not sufficiently diverse and global in its contents. Based on that feedback, Ecma is proposing to change the Open XML standard to allow for custom page borders. This will enable implementers to determine the best option for including borders relevant to their applications.

Usage of ISO standards for grammars

ECMA-376 used its own notation for defining the grammar for some of the more advanced functionality, such as spreadsheet formulas and word processing fields. Several National Bodies noted that the existing grammars in ECMA-376 are non-standard and were not fully described within the DIS. In response to this concern, Ecma proposes to revise the notation for spreadsheet formulas and fields to use an existing ISO standard. Formula notation will now use ISO/IEC 14977:1996 – Syntactic metalanguage – Extended BNF. This proposal improves the ability for implementers to test and validate conformance to the specification.

Interoperability And A Role in SOA…

12 December 2007

In discussions about Open XML one of the questions I get asked most frequently is one that was originally presented by noooxml.org and our colleagues at a large and well known competitor. “We already have an international standard for document formats, why do we need another?”

The answer is a straightforward one: Open XML has different design objectives from other document standards, and those differences are defined in the first few paragraphs of the Office Open XML specification.

  • Open XML provides for the migration of existing Microsoft Office binary documents to the new XML format, providing new levels of transparency and access to existing data.
  • Open XML provides a mechanism for the use of Custom XML schema as part of the document format.
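
To give a feel for the second point, here is a minimal Python sketch that treats a document as a ZIP package carrying business data in its own schema alongside the document content. It is an assumption-laden illustration rather than a valid Open XML file: the invoice namespace is hypothetical, the part name customXml/item1.xml follows the convention Word commonly uses, and the content types and relationship parts that a real package needs are left out.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

INVOICE_NS = "urn:example:invoice"  # hypothetical custom schema namespace


def build_package(custom_xml: str) -> bytes:
    """Build a toy in-memory package holding a document part and a custom XML part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<document/>")  # placeholder body
        package.writestr("customXml/item1.xml", custom_xml)
    return buffer.getvalue()


def read_custom_parts(package_bytes: bytes) -> dict:
    """Return every part under customXml/ parsed as an XML element."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return {
            name: ET.fromstring(package.read(name))
            for name in package.namelist()
            if name.startswith("customXml/")
        }


invoice = (
    f'<invoice xmlns="{INVOICE_NS}">'
    '<total currency="NZD">125.00</total>'
    "</invoice>"
)
parts = read_custom_parts(build_package(invoice))
total = parts["customXml/item1.xml"].find(f"{{{INVOICE_NS}}}total")
print(total.text, total.get("currency"))
```

The point of the sketch is that a line-of-business system can read the invoice data with ordinary ZIP and XML tools, without needing to understand the word processing markup around it.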

When you look at those two unique requirements, the first is backwards looking, protecting the huge existing investment in binary documents. The second is not only very forward looking, but for the developer community it is probably one of the most exciting features of the Office Open XML specification.

I’ve talked a little before about custom schema in Open XML, but only recently realized that it was not obvious what this unique proposition is that is offered by the Ecma-376 specification.

Two weeks ago I was in Beijing, presenting at an international standards conference organized by OASIS. My topic was Open XML and how Microsoft proposed to interoperate with the Open Document Format and the Chinese office document standard, the Uniform Office Format.

After my presentation I found myself chatting to a well known blogger in the standards and open document space, and as part of the conversation he and I were discussing the difference between ODF and Open XML at a very conceptual level. He drew a diagram on a piece of paper that looked a little like the graphic below. Personally I think it gives a very clear picture of the role that custom schema plays, one of the key differences between Open XML and ODF, and of how the two document formats could interoperate in the longer term.

[Graphic: OpenXML and ODF]

The graphic talks about the role that Office Open XML compliant applications have in wider business processes within an organization, moving well beyond the traditional office automation file format that we have all become used to over the last twenty or so years. It also shows how converter technology will assist users of the ODF file format (along with other file formats) who want to convert their documents to Open XML or vice versa.

The same applies to UOF, for which conversion tools between Open XML and UOF already exist. The Beijing event was a great opportunity to demonstrate this additional converter to a few people working in this space in China.

What do a VIC20, A Sinclair Spectrum, An IBM PC and GT Power Have In Common?

9 December 2007

The answer, very selfishly, is that these are the technologies that got me progressively more interested in what was at the time the emerging field of personal computing. 

I have to admit that when it comes down to it I’m a geek in every sense of the word. I always have been and hopefully always will be. I love technology, sometimes for the innovative new way that it can be used to accomplish something and, although I know it isn’t in vogue to say this, sometimes just for the sake of it.

This is not a recent phenomenon for me.

My first introduction to computers came with my first, very innocent, introduction to girls when I was about 8 years old. A girl in my year at school gave a wonderful recital of Beethoven’s Für Elise in assembly one morning. My father had just bought me a cassette tape of classical music that had this piece on it, and I somehow ended up chatting to her afterwards about how much I enjoyed it. On the basis of that we became very firm friends for a while.

One Saturday I was invited around to her house to play and I discovered she had a Commodore VIC20 which we spent most of the day playing various primitive games on. After a few weekends I think she worked out that I was more excited by the 8-bit computer she had plugged into her living room TV than I was in playing games outdoors and oddly I didn’t get invited back all that often after that.

The following Christmas my parents and my grandfather clubbed together to buy me a Sinclair Spectrum 48k. Sir Clive Sinclair did an enormous amount of good work in promoting micro computers for the home in the UK, and this was his second or third generation device. It was an amazing machine, running on a Z80 processor. Sinclair had managed to work out that a successful computer needed to have a plethora of supporting partners delivering tools, games, educational applications and so forth, and as a result there was a wide array of applications for the Spectrum, along with support for a native programming language called Sinclair BASIC. I eventually donated the computer to my school library, having written a very simple application that tracked books as they were taken out by students and printed out a long list of return due dates on a dot matrix printer.

An uncle of mine worked for IBM, and he first turned up at the house with an original IBM PC when I was about 10 years old, complete with the click click keyboard and a monochrome orange (or maybe green) screen. At that point I started by writing the same game that everybody probably wrote in their early days of programming in IBM BASIC: it generated a random number between 1 and 100, then gave hints as the user tried to guess what it was. At the time it felt like total genius.
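
For anybody who never wrote that game, the logic is roughly the following. This is a Python sketch rather than the original BASIC, which I no longer have; the function names new_secret and play are made up for the illustration, and the guesses are passed in as a list so that the logic can be run without a keyboard.

```python
import random


def new_secret() -> int:
    """Pick the number to be guessed, between 1 and 100 inclusive."""
    return random.randint(1, 100)


def play(secret: int, guesses) -> list:
    """Return the hint given for each guess, stopping at the first correct one."""
    hints = []
    for guess in guesses:
        if guess < secret:
            hints.append("higher")
        elif guess > secret:
            hints.append("lower")
        else:
            hints.append("correct")
            break
    return hints


print(play(42, [50, 25, 42]))
```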

The IBM PC felt much more like an industrial machine. My father was a small business owner and we eventually bought one with a 10MB hard drive for the family business. The application that I wrote at the time was similar to my library application: it tracked all of the company’s customers, when they had been in, who had spoken to them and when they needed to be contacted next for various reasons of service and support.

Then the fun began, modems started to become available to the consumer and computers moved on from being stand alone devices and morphed into a communication tool, although it was obviously only one computer talking to one other computer in those early days.

The obvious thing to play with next was the Bulletin Board Systems that were emerging at the time. This was amazing technology, and brought with it the ability to talk to people far away through discussion forums and direct person to person email. The fact that it might take a week or more for an email to reach the other party and garner a reply really seemed irrelevant at the time, it was a big step forwards for communication. At the time I remember trying to explain to a friend that there would be a day when we would all be able to email each other, he told me recently that he immediately marked me down as being a little bit “out there”.

The particular BBS network I got involved with was called GT Power. It was a niche network and the primary developers had their own ideas about transfer of mail and discussion information in a manner that didn’t directly integrate with FidoNet, the dominant player at the time.

Somehow I ended up playing two roles in the network.

First of all I rented a service called a “Night Line” from British Telecom. At the time long distance person to person calls were pretty expensive, the night line service cost about three hundred pounds a quarter (expensive for me in my mid teens) and for that BT would turn off the billing between midnight and 6am for me daily. Using this service I scheduled calls to every GT node in the UK twice a night, one round to pick up any mail that was waiting for delivery, I would then process and “re-bag” it, then make a second round of calls to deliver whatever I had collected and repackaged. This meant that by 6am every morning the entire network of about 50 UK nodes was entirely up to date. Email was faster, and more exciting to use!
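
In today’s terms the nightly routine was a simple store-and-forward hub. The Python sketch below shows the shape of it: one round to collect, a re-bagging step, then a second round to deliver. It is a reconstruction for illustration only, not the original software; the node names and the nightly_run function are hypothetical, and mail addressed to a node the hub does not call is silently dropped to keep the sketch short.

```python
from collections import defaultdict


def nightly_run(outboxes: dict) -> dict:
    """Collect mail from every node, re-bag it by destination, and deliver it.

    outboxes maps a node name to a list of (destination, text) pairs.
    Returns a mapping of node name to the (sender, text) pairs delivered to it.
    """
    # Round one: call every node and pick up whatever is waiting.
    collected = []
    for node, outbox in outboxes.items():
        collected.extend((node, destination, text) for destination, text in outbox)
        outbox.clear()

    # Re-bag: sort the collected mail into one bag per destination node.
    bags = defaultdict(list)
    for sender, destination, text in collected:
        bags[destination].append((sender, text))

    # Round two: call every node again and hand over its bag.
    return {node: bags.get(node, []) for node in outboxes}


outboxes = {
    "london": [("leeds", "hi")],
    "leeds": [("london", "hello"), ("york", "news")],
    "york": [],
}
print(nightly_run(outboxes))
```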

Secondly I started making calls overseas for the same purpose, but only once a day for these. In the UK at that time making an overseas call was quite a rare thing, and I remember my parents being very confused about the whole deal. Regardless, daily calls to friends in the US, in Australia and various European countries got the job done and kept the network functioning.

Looking back on it, I think the whole process taught me a great deal about systems design and management. Not the best education but certainly one that was very hands on.

At the same time I also started writing freeware utilities for GT Power, they were a combination of management and reporting tools. If you search hard enough on the internet today you will still find a couple of them available for download. Quite why you might want them, or what you might be able to do with them today after obtaining them I’m not really sure.

Interestingly for me, and probably a reflection of the unique state of the industry at that point in time, this then led to my first introduction to corporate computing. The lead for the GT Power network in Germany got in touch one day to talk about how some of my utilities managed the sequential delivery of entries into discussion groups; it was obviously important that discussions arrived in sequence for them to make sense.

He was working on a data distribution tool for a company in southern Germany that would extract data from an IBM mainframe and deliver it in sequence to a series of machines to be used by sales representatives around the country, and he was curious to know if the code that I had written could be applied to his problem rather than just simple mail transfer.

I spent the next six months or so working with him on this problem, we eventually deployed a pretty interesting system based mostly on freeware to a major German corporation.

The rest, as they say, is history.

Oddly, over the last couple of weeks a few of the people involved in this twenty or more year chain of events have been turning up in my inbox again. Tools like Plaxo, LinkedIn and Facebook are slowly bringing us all back together. It is fascinating to see what has become of these early influences on my interests in technology, the unsung pioneers from the early days of consumer computer communications. They range from employees of Cisco Systems and IBM (and Microsoft of course!), to people running their own independent software vendors, and in one case somebody living in England’s beautiful Lake District, as far away from technology as he can be.

It has been fun to track these folks down. If anybody else is out there who recognizes any part of this story, please drop me a line!

One Day, Then The Next, Then A Totally Different One…

7 December 2007

The work with Open XML over the last year has been enlightening in several respects. One of the strangest things I have encountered is how much the conversation about Ecma-376 changes depending upon the environment that I happen to be in, and how little relationship there is between the different modes in which I encounter Open XML.

I’ll give you some examples:

Scenario #1 - Discussing Open XML with Developers and ISVs. There is already a wide array of developers doing work with Open XML today, some building document related apps and others using the custom schema capability to build a wide range of business integration tools. Conversations with these folks tend to be technical, generally very focused on how to just get on with using the existing spec from Ecma. I’ve talked about a few of them in earlier posts on this blog, and there is a lot more work underway in the region.

Scenario #2 - Discussing Open XML with various standards experts and some of the national bodies. As the Regional Technology Officer for Microsoft in Asia I have been a member of conversations in various countries as Open XML slowly progresses through the steps at ISO. Generally I get involved in helping our local teams answer technical or process questions. These conversations are always pretty clear cut, the process of standardization seems to be well understood and everybody seems to know exactly what the rules are and where we are in that process. I have learned over the last year that if I have a question about the ISO process then the definitive source for information is most likely a member of staff of one of the regional national standards bodies.

Scenario #3 - Discussing Open XML in various Internet forums. Frankly as you might expect this varies, the ISO process is a complex one and without the experience of being directly involved some of the terms and modes within which the organization operates can be confusing and appear to be open to some interpretation. The Internet is a place where you can shout and scream, have a bunch of fun doing it and probably meet some like-minded friends along the way. The conversations on blogs, on /. and a number of other forums are generally quite constructive with some obvious (yet still very entertaining) exceptions.

Over the year I’ve gained a great deal of respect for individuals that I’ve worked with in all three of these scenarios, frequently gaining a lot from every conversation.

I have learned that there are clearly experts in all three of these domains, but as I have also said the conversations in each generally seem to head off down different routes and rarely have any direct relationship to each other, beyond the common thread of Open XML.

Now for the part where it starts to become an out of body experience. Every now and again I stumble across events where somebody who has direct experience in one of these scenarios decides to say or post something in another forum that just does not add up given the direct experience or expertise that they have. 

A couple of examples from the last week. Here is somebody who was pretty complimentary after his direct experience of the work Microsoft was doing in one of the national standards body technical committee meetings here in the region, but who now appears to have decided that what he has read on the Internet is more likely to be the truth than his own experience. Second is a post from Rob Weir (IBM) a couple of days ago that relates to Ecma’s proposal for the maintenance of Open XML. Given his experience, I have to assume he knows full well that the proposal is much more robust than the management of some other document standards. On the same basis, I have to guess he also knows that a proposal is a proposal, and that it’s a pretty smart way to begin a conversation.

In some cases there are deep philosophical differences with Microsoft, and I can respect and understand that, as a company we need to prove our intentions which I sincerely hope we will be given a chance to do. In other cases, I have to guess that it is just good fun to make things up and post them on the net to see how much havoc you can cause. Flamebait anyone?

Getting Started with Open XML

3 December 2007

For a while now I’ve been meaning to put together a list of some of the resources that exist today to help developers get off the ground with Open XML.

My inbox this morning contained a link to a post on James Newton-King’s blog, and he appears to have saved me the trouble. James is a developer at Intergen in New Zealand. You might remember an earlier discussion about the tool that they published on CodePlex earlier this year, which allows you to convert your IIS logs to XLSX so you can work with them in Excel.

The post “Getting Started with Open XML” is a great round up of several starting points for anybody wanting to learn more about the file format and understand how to get started developing their own Open XML based tools and applications.

If you really want to get into the details of Open XML then the specification itself is probably your best starting point. As Doug Mahugh pointed out at TechEd in Malaysia earlier this year, you would probably start with part 3 of the current Ecma-376 specification, which is designed to be a primer for developers. Part 3 represents less than 8% of the overall material and is a great entry point for anybody who is keen to work with the Open XML file format.