
DOCX is not pure XML

Hi,

So, one more post bashing OOXML from a TDF member? I see it more as a reflection: for a homework at my university I had to create a non-binary file format (do not even mention binary formats for saving important data). And I realized that I had made the very same mistake; it is just easier to explain in my case.

Anyway, I will start with the bashing 😉 The example I took is from here.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<xml xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel">
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1"/>
</o:shapelayout>
<v:shapetype id="_x0000_t202" coordsize="21600,21600" o:spt="202" path="m,l,21600r21600,l21600,xe">
<v:stroke joinstyle="miter"/>
<v:path gradientshapeok="t" o:connecttype="rect"/>
</v:shapetype>
<v:shape id="_x0000_s1025" type="#_x0000_t202" style="position:absolute;
margin-left:203.25pt;margin-top:37.5pt;width:96pt;height:55.5pt;z-index:1;
visibility:hidden" fillcolor="#ffffe1" o:insetmode="auto">
<v:fill color2="#ffffe1"/>
<v:shadow on="t" color="black" obscured="t"/>
<v:path o:connecttype="none"/>
<v:textbox style="mso-direction-alt:auto">
<div style="text-align:left"/>
</v:textbox>
<x:ClientData ObjectType="Note">
<x:MoveWithCells/>
<x:SizeWithCells/>
<x:Anchor>
4, 15, 2, 10, 6, 15, 6, 4</x:Anchor>
<x:AutoFill>False</x:AutoFill>
<x:Row>3</x:Row>
<x:Column>3</x:Column>
</x:ClientData>
</v:shape>
</xml>

So, did you find the mistake? What is not really XML here? It is this line:

<x:Anchor>4, 15, 2, 10, 6, 15, 6, 4</x:Anchor>
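To make the point concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not taken from any Office tooling; the namespace URI is copied from the example above and the variable names are my own. The XML parser hands back one opaque string, and turning it into eight numbers is a second parsing step that only the application knows about.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example document above.
NS = {"x": "urn:schemas-microsoft-com:office:excel"}

snippet = (
    '<x:ClientData xmlns:x="urn:schemas-microsoft-com:office:excel" ObjectType="Note">'
    "<x:Anchor>4, 15, 2, 10, 6, 15, 6, 4</x:Anchor>"
    "</x:ClientData>"
)

root = ET.fromstring(snippet)
anchor_text = root.find("x:Anchor", NS).text

# The XML layer stops here: all it can tell us is "this element contains text".
print(repr(anchor_text))

# Splitting on commas is knowledge about the application format, not about XML.
anchor_values = [int(part) for part in anchor_text.split(",")]
print(anchor_values)
```

What the eight numbers mean is still unknown after this step; the code only shows that a schema or an XPath query cannot reach the individual values.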

So please keep that code fragment in mind. I will continue with the story of my program, which is a sudoku program (see Wikipedia for Sudoku). What I want to achieve (you can skip this part, it is not strictly necessary): a sudoku consists of x boxes, where x is also the total number of distinct numbers. Each box of a sudoku has a height h and a width w, with x = w*h. Both w and h have to be natural numbers of at least 2. This also means you have to store the width and the height, because x alone is ambiguous: a box with 12 numbers can have a height of 2, 3, 4 or 6. The following shows the final XML file for a sudoku:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sudoku>
<innerWidth>3</innerWidth><innerHeight>3</innerHeight>
<row>6 0 0 0 1 0 5 0 0 </row>
<row>8 0 3 0 0 0 0 0 0 </row>
<row>0 0 0 0 6 0 0 2 0 </row>
<row>0 3 0 1 0 8 0 9 0 </row>
<row>1 0 0 0 9 0 0 0 4 </row>
<row>0 5 0 2 0 3 0 1 0 </row>
<row>0 7 0 0 3 0 0 0 0 </row>
<row>0 0 0 0 0 0 3 0 6 </row>
<row>0 0 4 0 5 0 0 0 9 </row>
</sudoku>
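Reading this file back already shows where the real work happens. Here is a minimal Python sketch, assuming the element names from the file above; the function name load_sudoku and the size check are my own additions, and the XML declaration is left out of the embedded string for simplicity.

```python
import xml.etree.ElementTree as ET

SUDOKU_XML = """<sudoku>
<innerWidth>3</innerWidth><innerHeight>3</innerHeight>
<row>6 0 0 0 1 0 5 0 0 </row>
<row>8 0 3 0 0 0 0 0 0 </row>
<row>0 0 0 0 6 0 0 2 0 </row>
<row>0 3 0 1 0 8 0 9 0 </row>
<row>1 0 0 0 9 0 0 0 4 </row>
<row>0 5 0 2 0 3 0 1 0 </row>
<row>0 7 0 0 3 0 0 0 0 </row>
<row>0 0 0 0 0 0 3 0 6 </row>
<row>0 0 4 0 5 0 0 0 9 </row>
</sudoku>"""


def load_sudoku(xml_text):
    """Return (width, height, grid) for a sudoku stored in the format above."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("innerWidth"))
    height = int(root.findtext("innerHeight"))
    size = width * height  # x = w * h

    # XML only delivers one string per row; split() is the hidden second parser.
    grid = [[int(token) for token in row.text.split()] for row in root.findall("row")]

    if len(grid) != size or any(len(line) != size for line in grid):
        raise ValueError("grid does not match innerWidth * innerHeight")
    return width, height, grid


width, height, grid = load_sudoku(SUDOKU_XML)
print(width, height, grid[0])
```

The XML parser validates nothing inside a row: a row with eight or ten numbers is still perfectly well-formed, so the check has to live in the program.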

So, do you see the parallel between this file (or a row element of it) and the DOCX file? Although it should not be too difficult to get a sudoku out of this, it is not real XML. Why? Because a row is in fact application encoded: the XML parser only sees one text string, and the program has to split it into cells itself. So

<row>1 0 0 0 9 0 0 0 4 </row>

should in a perfect world be

<row>
<cell>1</cell>
<cell/>
<cell/>
<cell/>
<cell>9</cell>
<cell/>
<cell/>
<cell/>
<cell>4</cell>
</row>
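The conversion from the compact form to the element form is short. This is a sketch only: the function name row_to_cells is hypothetical, and treating 0 as "empty cell" is an assumption read off the example.

```python
import xml.etree.ElementTree as ET


def row_to_cells(row_text):
    """Turn a space-separated row string into a <row> with one <cell> per value."""
    row = ET.Element("row")
    for token in row_text.split():
        cell = ET.SubElement(row, "cell")
        # Assumption: 0 marks an empty cell, so it becomes an empty element.
        if token != "0":
            cell.text = token
    return row


row = row_to_cells("1 0 0 0 9 0 0 0 4 ")
print(ET.tostring(row, encoding="unicode"))
```

With this form, every cell is addressable by standard XML tools, at the price of a larger file.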

I am quite sure that you noticed it in the sudoku context, where it should not be a big problem. But what about this one?

<x:Anchor>4, 15, 2, 10, 6, 15, 6, 4</x:Anchor>

This could be 4 points x1,y1,x2,y2,x3,y3,x4,y4, but that would be a rectangle and not an anchor. I really do not know what it is; I just want to give you an idea of how tricky it could be.

Just before the end, please have a look at this line:

<v:shapetype id="_x0000_t202" coordsize="21600,21600" o:spt="202" path="m,l,21600r21600,l21600,xe">

coordsize="21600,21600": OK, let us just hope it is in pixels or a distance from some corner of the page.

But you cannot tell me that this is not application encoded. Do you have a clue what it should be? I guess it is a curve (you can see the coordsize values here again):

path="m,l,21600r21600,l21600,xe">
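Just to show how much hidden grammar sits inside that one attribute, here is a rough Python tokenizer. It rests on two assumptions of mine, not on anything the file itself tells you: every lowercase letter is a one-letter command, and a number that is left out stands for 0. The function name tokenize_vml_path is hypothetical.

```python
import re


def tokenize_vml_path(path):
    """Split a path string into (command, [numbers]) pairs."""
    tokens = []
    # Assumption: each lowercase letter starts a new one-letter command.
    for command, raw_args in re.findall(r"([a-z])([^a-z]*)", path):
        if raw_args:
            # Assumption: an omitted number between commas means 0.
            args = [int(part) if part.strip() else 0 for part in raw_args.split(",")]
        else:
            args = []
        tokens.append((command, args))
    return tokens


tokens = tokenize_vml_path("m,l,21600r21600,l21600,xe")
for command, args in tokens:
    print(command, args)
```

If the letters carry the usual VML meaning (m = move, l = line, r = relative line, x = close, e = end), the string would describe a few straight lines across the 21600 coordinate range rather than a curve, but none of that can be learned from the XML alone.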

I have written two articles about Microsoft Office; you may find the first here. As always, your thoughts are very much appreciated 😀

LibO vs. MS Commercial

[Image: shot-2013-07-09_09-09-00]

(Screenshot taken from: whygetasurfacepro.com)

So, here is the very first real post in the category “LibO vs”. I would like to start by analyzing the commercial for the Surface Pro, which runs standard Windows 8, so that you are able to use all of your Windows 7 programs.

So let’s compare the functionality of LibreOffice with MS Office (MS Office prices from http://office.microsoft.com/en-us/buy/ )

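Full office suites compared:

  • LibreOffice 4.1 (0$)
  • MS Office Home & Student 2013 (139,99$)
  • Office 365 Home Premium (9,99$ / month or 99,99$ / year) or Office Professional 2013 ($399)
  • Office Professional 2010 ($???)
  • Office Online (0$)

Create Presentation (Easy to add effects and organize them in a beautiful way)

  • LibreOffice 4.1: YES (Personally not that easy). Score: +
  • Home & Student 2013: YES (If you are OK with Ribbon [UI since 2007] then absolutely). Score: +
  • Office 365 / Professional 2013: YES (If you are OK with Ribbon [UI since 2007] then absolutely). Score: +
  • Professional 2010: YES (If you are OK with Ribbon [UI since 2007] then absolutely). Score: +
  • Office Online: YES (Basics only). Score: +

Read and save OOXML documents (e.g.: docx, pptx) [1]

  • LibreOffice 4.1: YES (See MS Office for some incompatibility issues). Score: +
  • Home & Student 2013: YES (If saved in “strict”). Score: +
  • Office 365 / Professional 2013: YES (If saved in “strict”). Score: +
  • Professional 2010: YES (Keep in mind: default format since 2007, so the OOXML this program saves is not the standard)
  • Office Online: YES (Basic functionality, no strict mode)

Read ODF files (e.g. odt, odp)

  • LibreOffice 4.1: 1.2 extended. Score: ++
  • Home & Student 2013: 1.2. Score: +
  • Office 365 / Professional 2013: 1.2. Score: +
  • Professional 2010: 1.1 (That’s why some ODF files are “damaged” for MS Office)
  • Office Online: YES (Do not know which ODF version, assume 1.2). Score: +

Open | Edit MS Publisher files

  • LibreOffice 4.1: YES | YES (No possibility to save .pub files). Score: +
  • Home & Student 2013: NO | NO
  • Office 365 / Professional 2013: YES | YES. Score: ++
  • Professional 2010: YES | YES. Score: ++
  • Office Online: NO | NO

Open | Edit MS Visio files

  • LibreOffice 4.1: YES | YES. Score: ++
  • Home & Student 2013: NO | NO (I do not own this program, it is not advertised on the website)
  • Office 365 / Professional 2013: YES | NO (I guessed this one, I am sure about 2010 Professional; I do not own this program, it is not advertised on the website). Score: +
  • Professional 2010: YES | NO (Not advertised, but I own this program). Score: +
  • Office Online: NO | NO

Save an editable copy of a document inside of a PDF

  • LibreOffice 4.1: YES. Score: ++
  • Home & Student 2013: NO
  • Office 365 / Professional 2013: NO
  • Professional 2010: NO
  • Office Online: NO

Where can I use it?

  • LibreOffice 4.1: Windows, Mac and Linux (iOS and Android, as well as an online version, are in the works). Score: +++
  • Home & Student 2013: Windows and Mac (There is a version for Windows Phone, Windows RT and online, but with reduced functionality). Score: ++
  • Office 365 / Professional 2013: Windows (There is a version for Windows Phone, Windows RT and online, but with reduced functionality). Office 365 additionally: iOS (reduced functionality). Score: +++
  • Professional 2010: Windows (There is a version for Windows Phone and online, but with reduced functionality). Score: +
  • Office Online: Online connection required, can be used in addition to the normal suites. Score: ++

Touch optimized

  • LibreOffice 4.1: NO
  • Home & Student 2013: NO (not really)
  • Office 365 / Professional 2013: NO (not really)
  • Professional 2010: NO
  • Office Online: NO

E-mail client included

  • LibreOffice 4.1: NO (You may use the open-source Thunderbird by Mozilla)
  • Home & Student 2013: NO
  • Office 365 / Professional 2013: YES. Score: ++
  • Professional 2010: YES. Score: ++
  • Office Online: NO (Outlook.com is another online service, but no desktop app)

End, final result and rank

  • LibreOffice 4.1 (0$): 12 + / 3, final result +9, rank 1
  • Home & Student 2013 (139$): 4 + / 7, final result -3, rank 4
  • Office 365 / Professional 2013 (9,99$ / month or 99,99$ / year or 399$ one-time): 11 + / 2, final result +9, rank 1
  • Professional 2010 (???$): 7 + / 6, final result +1, rank 3
  • Office Online (0$): 4 + / 9, final result -5, rank 5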
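The final results and ranks can be recomputed from the “End” tallies. This small Python sketch assumes that the second number of each tally is a minus count that gets subtracted from the pluses (the post does not say so explicitly, but the arithmetic works out), and that tied suites share a rank.

```python
# (plus count, assumed minus count) per suite, copied from the "End" row.
tallies = {
    "LibreOffice 4.1": (12, 3),
    "MS Office Home & Student 2013": (4, 7),
    "Office 365 Home Premium / Office Professional 2013": (11, 2),
    "Office Professional 2010": (7, 6),
    "Office Online": (4, 9),
}

# Final result = pluses minus the second tally.
results = {name: plus - minus for name, (plus, minus) in tallies.items()}

# Shared ranking: rank = 1 + number of suites with a strictly better result.
ranks = {
    name: 1 + sum(other > score for other in results.values())
    for name, score in results.items()
}

for name in tallies:
    print(f"{name}: {results[name]:+d}, rank {ranks[name]}")
```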

Disclaimer: I have done my best to research this. If you think I am wrong, please comment!!!

So, please use the full Office Suite: LibreOffice

[en] Migrationsleitfaden des Bundes

In Germany there is a document with this name, the migration guide of the federal government (see Rainer’s blog post).

It does not really contain a to-do list, but I will try to create one. If a point is missing, please comment and I will have a look!

  • Better linguistic tools (grammar + spell)
  • Better ODF compatibility
  • Fewer crashes when opening .docx 😉
  • Better Office 2007/10 format compatibility
  • HTML 4.01 (Some people said that the code is… difficult to read) | xHTML 1 (See Excel | Calc)
  • I would really like to see SUCH a tool in LibreOffice (a PDF compatibility tool)

Some (German) quotes, with English descriptions and some questions of mine:

Desktop-Datenbanken Insbesondere der Einsatz von Desktop-Datenbanken wie Microsoft Access
oder LibreOffice Base sollte künftig vermieden oder wenigstens stark zurückgefahren werden, weil einerseits
ihre Dateiformate nicht standardisiert sind und daher auch künftige Migrationen erschweren,
und weil andererseits die damit möglichen „persönlichen Datenbanken“ regelmäßig nicht von der IT Abteilung
erfasst sind.

That means:

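It is strongly recommended to avoid, or at least sharply reduce, the use of desktop databases such as LibO Base or MS Access, because their file formats are not standardized (which makes future migrations harder) and because such “personal databases” usually escape the IT department’s oversight.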

Why don’t they consider using Base or Access as a SQL front end?

——

From: Migrationsleitfaden P 137

171 Für Mac OS X bietet Microsoft eine eigene Office-Suite an.
172 ODF kann bei der Installation anstelle von OOXML als Standard-Dokumentenformat vorgegeben werden.

171 Microsoft offers its own office suite for Mac OS X.
172 During installation, ODF can be set as the default document format instead of OOXML.

That is quite good, we are better than MS Office ( At least here 😉 )

——-

Word | Writer

Migrationsleitfaden des Bundes P 142

194 Über Plug-In
195 Über Etiketten- oder Serienbriefdruck
196 Laut ODF Validator wird ODF v1.1 von Word 2010 nicht korrekt umgesetzt, siehe Seite 140.
197 Teilweise Abstürze, geringe Interoperabilität

194: Via plug-in
195: Via label printing or mail merge
196: According to the ODF Validator, Word 2010 does not implement ODF v1.1 correctly (see page 140)
197: Occasional crashes, low interoperability

They “hate” our spellchecker:

Die mitgelieferte Rechtschreibprüfung hat vor allem bei der Komposita-Bildung Probleme. Sie erkennt
z.B. „Sortierfunktionen“ nicht und bietet „Tortierfunktionen“ als Alternative an.

That means: compound words (I hope that is the right term) are not handled well enough. For example, “Sortierfunktionen” is not recognized and “Tortierfunktionen” is suggested instead. That is the reason for the note “via plug-in” in the table above.

They also dislike that LibreOffice crashes when opening a “Word 2007 XML” (quasi-DOCX) file…

——

Excel | Calc

Migrationsleitfaden P144

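What shocked me: no ODF conformance in Calc??

Calc speichert Tabellen im ODS-Format, je nach Einstellung im Format ODF v1.2 oder ODF 1.0 / 1.1.
Hinsichtlich der Schema-Validierung treten bei der mit Calc erstellten Testdatei mit Formeln, Bildern
und Diagrammen gemäß ODF Validator für alle ODF-Versionen Fehler auf. In der Einstellung ODF 1.0
/ 1.1 werden für beide ODF-Versionen jeweils 4 x nicht erlaubte Werte für das Attribut „chart:label-cell-address“,
bei der strikten Validierung zudem 2 x nicht erlaubte Werte für das Attribut „style:text-position“
bemängelt. Letztere beide werden auch in der Einstellung ODF v1.2 kritisiert; zusätzlich wird 4
x der nicht erlaubte Tag-Name „chartooo:coordinate-region“ als Fehler festgestellt. Office-o-tron hingegen
meldet auch in diesem Fall keine Validitäts-Verletzungen

That means:

Calc saves spreadsheets in the ODS format, as ODF v1.2 or ODF 1.0 / 1.1 depending on the setting. According to the ODF Validator, the test file created with Calc (with formulas, images and charts) produces schema validation errors for all ODF versions. With the ODF 1.0 / 1.1 setting, 4 disallowed values for the attribute “chart:label-cell-address” are reported for both versions, and strict validation additionally reports 2 disallowed values for the attribute “style:text-position”. The latter two are also reported with the ODF v1.2 setting, plus 4 occurrences of the disallowed tag name “chartooo:coordinate-region”. Office-o-tron, on the other hand, reports no validity violations in this case either.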

There is also a note:

Die Prüfung einer in ODF v1.2 gespeicherten ODS-Datei ergibt bei nicht gesetzter Option „Force validation of ODF against
ISO/IEC 26300“ statt eines Validierungs-Ergebnisses eine java.lang.NullPointerException

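That means: when an ODS file saved as ODF v1.2 is checked without the option “Force validation of ODF against ISO/IEC 26300”, the validator itself fails with a java.lang.NullPointerException instead of reporting a result (more on page 144).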

So, now the second part of this table

Migrationsleitfaden P145

199 Konformitäts-Fehler, siehe Seite 143
200 Konformitäts-Fehler, siehe Seite 144
201 Dieselben Konformitäts-Fehler wie für ODF v1.1.
202 Geringe Interoperabilität
203 Eingeschränkte Steuerungsmöglichkeiten beim PDF-Export.
204 Diagramme erscheinen unverhältnismäßig groß.
205 Teilweise mehrfacher Export derselben Diagramme und Bilder.

199 Conformance errors, see page 143
200 Conformance errors, see page 144
201 The same conformance errors as for ODF v1.1
202 Low interoperability
203 Limited control options for PDF export
204 Charts appear disproportionately large
205 In some cases the same charts and pictures are exported more than once

No comments here.

Both pictures are from the Migrationsleitfaden, p146f


211 Konformitäts-Fehler, siehe Seite 145
212 Konformitäts-Fehler, siehe Seite 146
213 Konformitäts-Fehler, siehe Seite 146
214 Geringe Interoperabilität
215 Eingeschränkte Steuerungsmöglichkeiten beim PDF-Export.
216 Sehr gute Steuerungsmöglichkeiten beim HTML-Export.

211: Conformance errors, see page 145
212-213: Conformance errors, see page 146
214: Low interoperability
215: Limited control options for PDF export
216: Very good control options for HTML export

That is all for now 😉