In trying to modify multiple Author-it topics (okay, 5,119 topics) with variable assignments, I have had to work with the XML output that Author-it exports.
To export to XML, you select a topic or multi-select topics, then right-click on the selection and choose XML > Save to file.
Turns out, Author-it outputs its XML encoded with UTF-16, but apparently most Windows applications understand UTF-8. When I tried to open my freshly export XML file, XML Copy Editor gave me an error.
So I had to discover how to convert the XML from UTF-16 encoding to UTF-8 (and you can’t just open it in Notepad on Windows and change the 16 to 8, there are other embedded characters indicating the encoding, and, well, it’s encoded.)
First, I used the identity transform documented many places, my favorite place being the XSLT Cookbook, to convert the Author-it output to UTF-8. Here’s the XSLT code for that:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:output encoding="utf-8"/> <xsl:template match="@*|node()"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> </xsl:stylesheet>
I just ran the above transform against the AuthorIT Objects.xml file I exported, using the Instant Saxon XSLT processor.
Then, I wanted to remove all <VariableAssigments> elements, effectively removing an entire node. Again, the identity transform (or copy transform) was effective. And, I learned that I had to identify the AuthorIT namespace thanks to this excellent helper article, Handling Default Namespaces on topxml.com.
<?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:ait ="http://www.authorit.com/xml/authorit"> <!--Match everything using identity copy--> <xsl:template match="@*|node()"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> <!--Remove all the VariableAssignment nodes--> <xsl:template match="ait:VariableAssignment"> <xsl:comment>Removed VariableAssignment element</xsl:comment> </xsl:template> </xsl:stylesheet>
This transform is pretty dangerous, though, because it’s taking away all the VariableAssignment elements, and you might have valuable metadata stored that you would quickly blow away. So use with care.
This workaround is likely also useful for DITA and XHTML files, because Author-it outputs its DITA and XHTML files as UTF-16. I haven’t investigated its usefulness for those areas, but a quick search on the Author-it-users group on Yahoo revealed that sometimes people want UTF-8 rather than UTF-16. So, I hope this helps.