So, I recently discovered that by renaming an Open XML document to .zip allowed me to find elements within it. It turns out that there's a lot more you can do with it and following an email conversation with Doug Mahugh (our resident Open XML guru) I discovered that there's a bunch of other cool tricks you can play because you can get at the data directly.

You can edit any of the XML parts by just dragging them to the desktop, opening in IE, View Source and make changes, save, drag back into the ZIP package, and rename back to DOCX/XLSX/PPTX.

The first one of those is separation of presentation and data in spreadsheets

Create a spreadsheet with a column of random numeric data in it.  For example, type =RAND()*100 into A1, then drag the lower right corner of A1 down to A20.  Then click on Conditional Formatting and pick something you like.  Save the XLSX, open as a ZIP, and check out xl\worksheets\sheet1.xml.  The sheetData element contains all the data, and the conditionalFormatting element afterward contains the conditional formatting definition (and the range it applies to, in the sqref attribute).

This demonstrates nice clean separation of presentation and data, which occurs many places in the Open XML formats.  You can edit the sheetData or conditionalFormatting separately, or delete the conditionalFormatting (or change its range) without touching the data.

This gives you great opportunities to programmatically update the data (or presentation) from, say, a web application and then push the resulting file back to the user.

If you're working with Open XML documents from a server you're not limited to using the Microsoft platform, it's equally possible from PHP etc and most of the heavy lifting is already bundled into easy to use libraries for popular platforms.

More to follow...