[odf-discuss] .ott file format

Daniel Carrera daniel.carrera at zmsl.com
Wed Feb 21 09:47:14 EST 2007


On Wed, 2007-21-02 at 21:22 +0700, Damon Anderson wrote:
> A technical question about ODF. How does the specification define the
> transition between ASCII, extended ASCII, and UTF-16HEX? I see OOo making
> mistakes, for example where they convert the ampersand (&) to HTML (&amp;)
> rather than UTF-16HEX. That can't be allowed in the ODF spec, surely?

Yes, it's allowed, and it's actually mandated. The ampersand thing is
part of XML itself, not HTML. In XML, the characters & < and > have
special meaning, and so you need "entities" for when you want to
denote those characters literally in your document.

You know that XML uses <tags> <like> <this>. What if your document
actually contains a '>'? You have to call it &gt; (and < becomes &lt;).
But to make that work, ampersand (&) has to be a special character too,
so you need an entity for ampersand as well, and so you get &amp;.
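
To make the escaping above concrete, here is a minimal sketch using Python's
standard library. It only illustrates the general XML rule; it is not how OOo
itself writes its files, and the sample string is made up.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical text that a user might type into a document.
raw = "AT&T says 1 < 2 > 0"

# Writing the text into XML: &, < and > are replaced by their entities.
encoded = escape(raw)
print(encoded)  # AT&amp;T says 1 &lt; 2 &gt; 0

# Reading it back: the entities turn into the original characters again.
decoded = unescape(encoded)
print(decoded == raw)  # True
```

The ampersand is escaped first, which is why the & inside &lt; and &gt; is not
escaped a second time.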

Most other entities you know (&euro; &pound; etc.) come from HTML. But
&lt; &gt; and &amp; (along with &quot; and &apos;) are predefined by XML itself.
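
One way to see the difference between XML's own entities and the HTML ones is
to feed both to a plain XML parser. The sketch below uses Python's
xml.etree.ElementTree with no DTD, which is an assumption about the setup; a
document that declares extra entities in a DTD would behave differently.

```python
import xml.etree.ElementTree as ET

# The entities predefined by XML parse without any DTD
# (XML predefines &quot; and &apos; as well as the three above).
text = ET.fromstring("<p>&lt; &gt; &amp; &quot; &apos;</p>").text
print(text)  # < > & " '

# An HTML entity such as &euro; is undefined in plain XML,
# so the parser rejects the document.
try:
    ET.fromstring("<p>&euro;</p>")
    euro_ok = True
except ET.ParseError:
    euro_ok = False
print(euro_ok)  # False
```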

This issue has nothing to do with ASCII. The ampersand is in ASCII. It
is character 38. And ODF is not limited to ASCII. ODF documents are
generally UTF-8, but AFAIK even that is not mandated.
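
A short sketch of the point about ASCII and encodings, again in Python and
assuming a document encoded as UTF-8. The ampersand is code point 38 whichever
way you write it, and non-ASCII characters can appear directly in the file.

```python
import xml.etree.ElementTree as ET

# The ampersand is character 38 in ASCII (and in Unicode).
amp_code = ord("&")
print(amp_code)  # 38

# &amp; and the numeric references &#38; / &#x26; all denote that character.
same = ET.fromstring("<p>&amp;&#38;&#x26;</p>").text
print(same)  # &&&

# A non-ASCII character such as the euro sign (U+20AC) is three bytes in UTF-8.
euro_bytes = "\u20ac".encode("utf-8")
print(euro_bytes)  # b'\xe2\x82\xac'

# In a UTF-8 document it can be written directly or as a numeric reference.
doc = "<p>\u20ac &#8364;</p>".encode("utf-8")
euro_text = ET.fromstring(doc).text
print(euro_text == "\u20ac \u20ac")  # True
```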


Cheers,
Daniel.
-- Catalan is essentially Spanish and French spoken at the same time.