Archive for the ‘OOXML’ Category

TiVoisation and the ISO

Monday, April 7th, 2008

<GPLv2> I know my rights; I want my phone call!
<DRM> What use is a phone call… if you are unable to speak?

Over the years I’ve explained Open Source to many people and my favourite analogy is around doctors. People wouldn’t want to be operated on by someone who had just come up with their ideas in isolation. They wouldn’t want a new operation performed on them that hadn’t been in a peer-reviewed journal. They would want someone who had been taught by a community of doctors throughout their career. For critical software, being Open Source allows analysis and feedback: it’s the scientific method applied to software. The fundamental (or bare minimum) meaning of Open Source is an openness and opportunity for review and feedback in order to foster improvement.

The spirit of the Open Source license was open, but unfortunately the wording of the license had some loopholes that could be used to keep software closed. One loophole allowed people to restrict it by using software patents. Another loophole allowed people to make hardware that prohibits changes to Open Source software. The TV recording device TiVo used hardware in this way and so this loophole was called TiVoisation. TiVo sell TV recording devices at below cost and they recover the losses by enforcing (through software) a $12.95 monthly subscription. At some later date they are able to recoup their costs and start making a profit. Removing TiVoisation would mean, to TiVo, that customers could change the software and avoid the monthly subscription cost. Openness of software was a threat to their business, so naturally they valued the loophole.

When some lawyers were helping draft a new version of the open source license they wanted to remove this loophole and one lawyer, Eben Moglen, received yet another phone call from TiVo pleading with him not to remove it. TiVo then tried to bargain with him. They offered to remove DRM from the recorded video if they could keep their TiVoisation. Strangely, Moglen responded by explaining their business model to them: he said that this was the real cause of their woes.

Moglen explained that by using Open Source they were benefiting from hundreds of thousands of developer-hours worth of work. It’s free software but it’s supposed to be open and that conflicts with their business model. Change the business model and solve the problem. TiVo could be up-front about it and get people to sign a 12-month contract, or they could charge what the hardware is worth.

What does this have to do with the ISO?

The strangest thing for me in the ISO process was part of the ISO’s business model, and how that conflicted with my technical review of OOXML. I never received a final specification to review and neither did anyone else in the world. For the ISO this was business as usual.

When any National Body is about to vote they have these two things for review:

  1. A copy of the original specification (for OOXML, this was an old version from early 2007)
  2. A list of editor instructions. These editor instructions are just like patches to software, except they’re patches to the text of the specification. Some patches might be exacting instructions such as “replace paragraph 15 on page 5874 with this text…” but others can be vague like “fix bad references” or “make features required as appropriate”. In the case of OOXML these patches were typically half-a-page in length. OOXML had over 1000 of these.

In order to review the quality of OOXML one would need to take the original specification and to understand the ramifications of 1000+ patches. By doing this you were supposed to be able to derive the final text and each person would do this individually to the best of their ability. This was a mammoth task, mostly due to the poor quality of the original specification and of the patches.

For example, patches 222 and 691 were in conflict with each other, yet both were to be applied. These two patches were about whether the text or the XML Schema has precedence. One patch reads:

“If discrepancies exist between the electronic version of a schema and its corresponding representation as published in this part, Part 2, the electronic version is the definitive version.”

…meaning that the XML Schema has precedence over the text, but patch 691 comes to an opposite conclusion. As a consequence reviewers don’t know whether the text or the XML Schemas (both normative) are to be followed in any part of the final specification. All national bodies in the world were expected to evaluate these two nonsensical patches, and they all faced this same dilemma.

Patch 1 (yes, the very first patch) says that there will be an “editorial pass” over the document to fix usage of the terms MAY, MUST, SHALL, SHOULD, etc. These words have a special meaning in standards because they distinguish between what’s required, what’s recommended, or what is merely suggested. This is not necessarily a straightforward fix that could go unreviewed in an editorial pass because defining what’s required or not goes to the very heart of a standard. Again, this patch affects any review of OOXML in any part of the specification.

There are numerous other contradictory or nonsensical patches that were to be applied to the final specification. While I did my best to review OOXML I was stifled by a process that didn’t give me a final specification.

A core part of the ISO’s business model involves selling standards. If they gave away a final copy of the standard for nations (and their advisors, like me) to review before voting then a large portion of their market would already have the standard and wouldn’t need to buy it. Not revealing a final standard for review (and voting upon) is intentional — it’s their business model.

I don’t know how the ISO should sustain itself. Should participants pay up front to be involved? Should national bodies pay an annual fee? Should the organisation proposing the standard pay instead? Should the ISO sell pleather-bound paper copies but give away the files for free? I don’t know how to solve this but I do know that the current process must change because it conflicts with any analysis of a proposed standard.

That’s the blog post. I’ll leave you with the Standards Council of Canada’s Final Position Statement which reads:

“ISO/IEC DIS 29500 OOXML – Fast Track
Canadian Final Position Statement

Canada has carefully reviewed the results of the ISO/IEC DIS 29500 OOXML Fast Track Ballot Resolution Meeting and determined after detailed analysis that Canada will maintain its Disapprove vote.

Canada notes that major enhancements had been made to ISO/IEC 29500 during the Ballot Resolution Meeting, but the general quality of the standard was not yet what was expected of an ISO/IEC Standard, and that there were still too many unknowns.

Canada states that the inappropriate use of the fast track process for this DIS has rendered it impossible to ascertain whether in fact 29500 meets the standard of quality and correctness required in an International Standard.

Canada further recommends that the ISO/IEC JTC 1 Fast Track procedures and processes be reviewed and enhanced to ensure that this situation does not arise again in the future, and bring disrepute to the whole ISO and IEC International Standards process.

Finally, Canada recommends that the ISO/IEC DIS 29500 OOXML Fast Track documents and materials, plus the enhancements made at the Ballot Resolution Meeting be submitted to ISO/IEC JTC 1/SC 34 as a New Work Item for processing via the normal standards development processes.”

(emphasis mine)

Update (later that day): I said I’d write about the Standards New Zealand process that I was involved in for a year. Well, I’m very happy (proud, even) with how Standards NZ ran the process, and of course I agree with their final decision to vote No. But as for my fellow participants it’s actually quite hard to write about the meetings and the arguments without talking about them as individuals and their behaviour. Rather than writing a blog post about that I think I’ll just send some personal thank-you emails to a few of the people involved. Cheers :)

The ISO Standardisation of OOXML in 17 Easy Steps

Monday, March 31st, 2008

(This isn’t a nuanced opinion, that would be very long!)

  1. We have had over 15 years of secret file-formats changing with every version of Microsoft Office in order to stifle competition and force annual upgrades to compatible software (the upgrade treadmill).
  2. It’s a principle of government that they should be vendor neutral. If a government said “All Ford trucks can drive 20 kilometres faster than all other cars” there would be outrage! In the late 1990s governments all around the world realized that web sites shouldn’t favour Microsoft Internet Explorer, and that they must use vendor-neutral standards.
  3. This argument is then extended to Office Suites and their secret file-formats. For vendor-neutrality/competition some governments propose moving away from Microsoft Office’s format to a new standard called OpenDocument (ODF) which is used by OpenOffice.org, KOffice and many others. ODF was approved by ISO under the ‘PAS’ process.
  4. Microsoft are concerned that they’ll lose their government sales because their Office Suite doesn’t use a standard. If governments start using a competitor and putting money into them then maybe something like Firefox will spring up to take them on in Office Suites. Their Microsoft Office cash-cow that earns them (something like) 3.8 billion every 3 months is under threat!
  5. Microsoft respond not by supporting ODF but by proposing a competing faux-standard, OOXML (Office Open XML). They hurriedly rush through some poorly written documentation with hundreds (if not thousands) of mistakes that can’t be implemented in full. This is good enough for Ecma International, who approve it as a standard called ECMA-376. ECMA-376 is a complete mess — inconsistent, buggy, inflexible, ugly (non-mixed content model, OLE, DEVMODE).
  6. ECMA-376 is submitted to the ISO under the ‘Fast Track’ process, and is now given the name DIS-29500. It’s not a normal process that allows time for improvement; it’s a brief 9-month review of 6000 pages (that’s a lot).
  7. Lobbying begins internationally. To stereotype the process into two camps, it’s the people who want to get out from the monopoly Vs those who benefit from the monopoly (Microsoft and business partners).
  8. Every country gets a vote in the ISO, so New Zealand is as big as the United States, China, India … and each country has 9 months to comment on OOXML. The proposed standard is soon recognized as being technically awful, broken, not-cross-platform, designed to confer the appearance of standardisation but without the detail necessary.
  9. The ISO doesn’t necessarily decide on technical merit; there are a lot of non-techies who are open to all kinds of arguments other than the quality of the standard. They’re not the IETF either; they don’t need implementations to prove the standard. The ‘Fast Track’ can just approve stuff.
  10. Process irregularities come out in favour of Microsoft. There are accusations of corruption. They’re caught stuffing the ballot in Sweden. Lots of small African nations suddenly sign-up and favour Microsoft. Public perception is that the ISO process itself is quite hackable.
  11. Microsoft lose the late 2007 vote, but there’s another final chance.
  12. Microsoft make some changes to OOXML in response to national comments, but a 9-month review has only touched the surface of the problems within OOXML.
  13. They probably will win this current vote (March 2008) and gain ISO approval for OOXML.
  14. A lot more accusations of process irregularities, some by people from within the process.
  15. If OOXML gains approval then the ISO’s reputation will be in tatters within the technical community.
  16. The backlash against Microsoft and the ISO will be strong. This Slashdot post summarises this well: Slashdot: Microsoft’s Miscalculation.
  17. But really we’re just back to Microsoft Office and its secrets (due to the poor quality of OOXML). We have the same task ahead of us. We need to promote open standards as a way out from the upgrade treadmill. We need to get people to switch software. The work has only just begun.

Coming up next… my involvement in the New Zealand process.

Edit: corrected estimations of Microsoft revenue on their office suite (it’s only 3.8 Billion every 3 months).

A review of Rick Jelliffe’s talk at Catalyst on the 26th of July, 2007

Tuesday, February 19th, 2008

NOTE: This was originally an email on the NZOSS mailing list. I’m posting this here upon request by several readers.

———————–

Hi folks,

As you know Rick Jelliffe has been traveling around New Zealand talking to SSC and various other groups about Microsoft’s proposed OOXML, the ISO Standards process, and “Wiki-gate” (which for those who don’t know was Microsoft looking to hire Rick to edit Wikipedia’s OOXML entry and the resulting media furore). Rick has a history in various ISO standards, Schematron (XML validation), and the original XML working group at W3C.

Last Thursday Rick did a talk at Catalyst IT here in Wellington and afterwards I was asked by a few people to post a review of the evening (I’m an SGML/XML guy too, I do docvert.org and analysis of ODF and OOXML).

Rick’s (and Microsoft’s) explanation of Wiki-gate is that he was paid to tell the truth — to correct inaccuracies about OOXML. Rather than Microsoft editing the article themselves (which is apparently against Wikipedia’s “Conflict of Interest” rules) they say they paid someone to correct mistakes.

While correcting mistakes is good, Rick challenges people to accuse him of “Microsoft [payment] for my opinion: i.e. that I would change my opinion to suit them or whoever paid me”.

So he says he’ll get his meal-ticket angry if he wants because he’s an independent man, and to say otherwise is to attack his credibility.

In my opinion the Wikipedia thing was no big deal, and I think to focus on whether he’s been bought or not is a VERY weak argument because in theory he could be posting what he wanted regardless of money.

I say “in theory” because the real issue is that he doesn’t post balanced stuff which should really be the focus — not the payment.

The talk was due to start at 4:30, which gave the 30 or so people in attendance (mostly Catalyst folks) time to have a drink and mingle. I was introduced to Rick and he seemed an approachable, friendly guy. We talked a little about how if Government want to use OOXML or ODF they need to define a “profile” (a subset so that people don’t embed proprietary binaries or use poorly supported parts of the spec) in the same way that they do with the Web Standards specification which constrains use of HTML and CSS. Soon after this it was time for the talk.

The first topic was “Wiki-gate”. Rick explained how the event unraveled and took on a life of its own: the initial email where Microsoft were worried about their competitors changing Wikipedia; how he posted Microsoft’s offer on his blog before any contractual agreement; and how news sites took that as Microsoft buying him. Rick pointed out quite clearly that the ‘Conflict of Interest’ rules as shown at http://urltea.com/13b9?wikipedia-coi applied to biased views, not what Rick was asked to do. Certainly there was a great deal of misinformation and many people had accused him of accepting Microsoft bribes (he had screenshots).

Rick never denied taking payments, he only said that he would have the same opinion regardless (and this may well be true, he comes off as a likeable and sincere guy).

IBM’s Rob Weir had this to say about “Wiki-gate” (it’s a tired cliche, that *-gate): Rob Weir’s blog: Crocodile tears

(go read that post by Rob, I’ll wait…)

So this post by Rob Weir was interesting (the approach of seeing what the Wikipedia OOXML entry looked like before this: was it as bad as Microsoft said?), but in particular read the comments that followed the post where Rob and Rick talk it out and Rick sees accusations of bribery where there were none. Rick posts under his full name, so it’s easy enough to find his posts.

What I wanted to see was whether Rick would present a balanced view of OOXML and ODF, because his online posts didn’t give me much hope…

Rick Jelliffe writes,

The OOXML specification requires conforming implementations to accept and understand various legacy office applications

Several readers respond,

I’m not sure if I’ve heard that one. I know that in the OOXML spec just about everything is optional, so the compliance issue isn’t really the point.

A more relevant point is that real-world OOXML files will contain stuff from all over the spec, and even external file formats like RTF. There’s no reasonable way that I, as a software developer, can handle that. This is a practical concern, not a legal issue.

[...]

Perhaps I’m missing something but I thought the purpose of a standard was for people to create interoperable implementations. Having something like “strings” spew out bits of ascii text from an ODF document would be “conformant” under your definition it would just not implement 99% of the “bits” and say so. This seems even more absurd.

Rick got it quite wrong here. OOXML conformance is defined in the spec as “A conforming consumer shall not reject any conforming documents of the document type expected by that application” (OOXML, Part 1, Section 2.5 ‘Application Conformance’). In other words conformance is defined in terms of whatever the application decides to read, rather than any feature set or compliance with the standard. See http://urltea.com/13cw?rob-weir-ooxml-conformance

Rick Jelliffe writes,

On the other hand, the kind of openness that a completed external specification like OOXML can have is different from the kind of openness that a work-in-progress external specification like ODF affords

Several readers respond,

Er, um, ODF is an ISO standard — ISO/IEC 26300 — hardly a “work in progress”.

[...]

Every format is a work in progress, even Microsoft are working on improvements to OOXML. What’s that kind of snide comment supposed to mean Rick?

Heh… the slant he’s put on that one speaks for itself.

Rick Jelliffe writes,

As I have mentioned before on this blog, I think OOXML has attributes that distinguish it: ODF has simply not been designed with the goal of being able to represent all the information possible in an MS Office document

A reader responds,

This is inaccurate to the extent that it requires qualification. ODF was not designed specifically for MS Office, but it _was_ designed to be able to represent any office document, including, but not limited to, old, current, and future functionality of MS Office. This, of course, requires an explanation and I will explain.

Before I explain, I want to note that I know the ODF spec well, and I’m familiar with the EOXML spec. The only thing I’ve noticed in EOXML that can’t be done with ODF features is inserting RTF files.

Now my explanation:

ODF can be extended. ODF section 1.5 explicitly allows extensions to the spec, with only minor conditions (essentially, you have to use your own namespace).

Microsoft is free to use ODF features for the bulk document and create their own, separate namespace for functionality they couldn’t map. This is permitted. This doesn’t make the file invalid, and I know of one other ODF application that does this too.

Of course, it is highly preferable that MS standarize their extension through a standards body, but that is a separate argument now.

So, as I was saying, I didn’t hold much hope for hearing a balanced view at this Catalyst talk, which this blog post should now get back to…

The next topic Rick covered was the OOXML format and in particular BITMASKS

A bitmask is a technique to encode multiple values inside a single variable, by assigning a meaning to each individual bit of the variable. For example, the binary 10110001 (decimal 177) would mean Yes/No/Yes/Yes/No/No/No/Yes and contain the answers to 8 different yes/no questions.
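As a rough illustration of that definition, here is a minimal Python sketch (not anything taken from the OOXML specification) that reads the eight answers back out of 177 using bitwise operations. The function name `decode_bitmask` and the 8-bit width are assumptions made up for this example.

```python
def decode_bitmask(value, width=8):
    """Return one boolean per bit, most significant bit first.

    Hypothetical helper for illustration only.
    """
    # Shift the bit of interest down to position 0, then mask it off.
    return [bool((value >> shift) & 1) for shift in range(width - 1, -1, -1)]


answers = decode_bitmask(0b10110001)  # 0b10110001 is decimal 177
print("/".join("Yes" if bit else "No" for bit in answers))
```

The point of the sketch is that a language with bitwise operators can do this in one line, which is exactly what XML tooling lacks.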

Rick brought up bitmasks because they’re a popular argument among OOXML detractors. He said that the complaints about OOXML using bitmasks for storing the code pages that a given font supports were exaggerated and that this was an acceptable use of bitmasks, that XML processors wouldn’t touch this, and that it was like a hexadecimal RGB value in that it had its own non-XML string syntax. There was, he thought, only a single other place that used bitmasks in the spec (when actually there are at least 8 other places, but I’ll get to that).

XML itself avoids endian issues and is encoding independent; it doesn’t have ways of referring to bits. You can however store strings of “010101101”s and then validate these strings, but as W3C Schema has no bitwise operators this will be an arduous task of string comparison. This is completely true, and I brought this up with Rick. His response was to say that you can validate a bitmask serialized as a string of “010110110” by doing lots of string comparisons. For example, in XSLT you could write…

<xsl:choose>
<xsl:when test="element[substring(@attr, 1, 1) = '0']">
….
</xsl:when>
<xsl:when test="element[substring(@attr, 1, 1) = '1']">
….
</xsl:when>
<xsl:when test="element[substring(@attr, 2, 1) = '0']">
….
</xsl:when>
<xsl:when test="element[substring(@attr, 2, 1) = '1']">
….
</xsl:when>
<xsl:when test="element[substring(@attr, 3, 1) = '0']">
….
</xsl:when>
<xsl:when test="element[substring(@attr, 3, 1) = '1']">
….
</xsl:when>

[...]

</xsl:choose>

As opposed to something half that size which was more readable/maintainable/extensible/bugfree because they’d used attribute values. Rick’s specific suggestion is validating via “regular expressions or enumerations or unions or numeric ranges”. He also suggests a somewhat cleaner Schematron schema here, http://urltea.com/13cf
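To give a feel for the regular-expression route, here is a sketch in Python rather than W3C Schema or Schematron. It assumes the bitmask is serialized as exactly eight “0”/“1” characters; that width, and the names `is_valid_bitmask` and `flag_is_set`, are made up for illustration and are not taken from the spec.

```python
import re

# Assumption: a serialized bitmask is exactly eight '0' or '1' characters.
BITMASK_PATTERN = re.compile(r"[01]{8}")


def is_valid_bitmask(text):
    """Validate the whole string in one go with a regular expression."""
    return BITMASK_PATTERN.fullmatch(text) is not None


def flag_is_set(text, position):
    """Test one flag by string comparison.

    position is 1-based, mirroring XPath's substring(@attr, position, 1).
    """
    return text[position - 1] == "1"
```

Note that even here each flag is still read by comparing characters, not by bitwise arithmetic, which is the arduous part.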

The other areas that define bitmasks are these,

  • Section 2.3.1.18, Paragraph conditional formatting (page 842).
  • Section 2.4.7, Table cell conditional formatting (page 1085).
  • Section 2.4.8, Table row conditional formatting (page 1087).
  • Section 2.4.51, Table style conditional formatting settings (page 1211).
  • Section 2.4.52, Table style conditional formatting settings exceptions (page 1213)
  • Section 2.15.1.86, Suggested filtering for list of document styles (page 2034)
  • Section 2.15.1.87, Suggested sorting for list of document styles (page 2036)
  • Section 6.1.2.7, tableproperties attribute of shape group (page 5227)

That’s the kind of stuff I do in Docvert so it’s definitely something that I’d want to be able to manipulate (people use such formatting to suggest headings which Docvert then converts into DocBook <section> tags and such).

But even if I didn’t want to manipulate a bitmask (or an RGB value) I may want to validate it. For the RGB example it’s good to be able to notice that a value of “#FF00GF” is invalid.
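A minimal sketch of that kind of check, in Python and assuming the six-digit “#RRGGBB” form (the name `is_valid_rgb` is hypothetical):

```python
import re

# Assumption: colours are written as '#' followed by exactly six hex digits.
HEX_COLOUR = re.compile(r"#[0-9A-Fa-f]{6}")


def is_valid_rgb(text):
    """Return True only when the whole string is a well-formed hex colour."""
    return HEX_COLOUR.fullmatch(text) is not None


print(is_valid_rgb("#FF00FF"))  # True
print(is_valid_rgb("#FF00GF"))  # False: 'G' is not a hexadecimal digit
```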

Personally I consider bitmasks to be one of the weaker arguments against OOXML. It’s a poor design decision but at least it’s documented, unlike many other parts of the specification.

But enough about bitmasks,

NOW TO ISO STANDARDS

Someone in the audience asked a great question: whether OOXML can achieve interoperability with other implementations.

Rick said that Standards are not about interoperability but rather about people with common interests coming together and agreeing something mutually useful.

I was a little shocked at Rick’s view, as were others in the room. Fortunately the ISO/IEC appear to disagree with Rick here and they set higher standards…

A purpose of IT standardization is to ensure that products available in the marketplace have characteristics of interoperability, portability and cultural and linguistic adaptability. Therefore, standards which are developed shall reflect the requirements of the following Common Strategic Characteristics

● Interoperability;
● Portability;
● Cultural and linguistic adaptability;

- http://urltea.com/13bg

He mentioned that ODF didn’t define the ZIP format, and that OOXML did (the Open Packaging Conventions, or OPC as it’s called). This was actually a good point! Even though ZIP is well known, ODF should have defined the container format.

After the presentation we all transformed (*Insert cool Transformers noise here*) into groups.

I had a chat with him about the legal issues of OOXML. Microsoft have granted patents over the required parts of the standard, but not the non-required parts — effectively restricting competitors to only implementing a subset of OOXML.

His argument was that it didn’t matter if Microsoft would only grant patents over the required parts of the spec. Microsoft wouldn’t win any court case because courts would frown on defining a standard only to legally restrict people from implementing it (bait and switch), and there were several court cases to do with pipe fittings that established precedent.

So the obvious response is that precedent is good if it goes to court but the effect can start much earlier (e.g., Microsoft claiming that Linux violates 235 patents without ever saying which patents they are or taking it to court). Similarly I see any OOXML patents having a life of their own outside the court for marketing and legal threats. Many others agree with this idea…

http://www.youtube.com/watch?v=6YExl9ojclo

I also asked what he thought about “Office Open XML” having a similar name to OpenOffice.org and he didn’t have a problem with it. He said that what the ISO names it won’t affect what Microsoft call their product/feature, so the ISO doesn’t have much influence there. British Standards Institution panelists have suggested the name “RODDL” (see http://urltea.com/12nw ) but this won’t affect what Microsoft calls their effort.

There was a question about ‘stacking the vote’: how companies interested in OOXML had suddenly joined standards groups all around the world in order to vote for it. What happened was this:

“As you can see, at the start of the year, V1’s membership consisted of seven organizations, six of whom on Friday voted “Disapproval, with comments”, and one (Microsoft) who voted “Approval, with comments”.

The membership spurt came at the very end, in the last month, when 16 new members joined V1. Of these 16 new members, 14 of them voted, “Approval, with comments” on Friday.”

- http://urltea.com/13cu?rob-weir-blog

Rick excused this by way of saying that as the voting process allows companies to join it was acceptable.

And I don’t have a good response to that. I don’t know what I could suggest that would be better. As Rick quite rightly said, are you going to disallow companies who are interested in the tech from voting? If so, many people are guilty, not just Microsoft and OOXML supporters.

Finally he talked a little more about whether the OOXML and ODF standards can be merged, and again he quoted an ODF author as saying they couldn’t be. Luckily Microsoft’s Alan Yates and Tim Bray (co-lead developer of XML1) think it can and I do too… see http://urltea.com/12nt .

…and that was the talk.

Rick is a smart guy and — as I hope I’ve described — he’s very pleasant and likable. Wikigate was overblown — I think we can all agree on that. I didn’t write this to personally hassle Rick, just to explain his arguments as I understood them and if I got any of it wrong I’ll say so.

I wrote this to point out that his analysis of OOXML is typically one-sided and it ignores established facts (again he talked about OOXML requiring legacy support at the Catalyst talk).

I think Rick is very wrong in his analysis of OOXML, and I hope I’ve got enough references here to prove that.