In the past, I’ve always hacked my own XML output functions. The result wasn’t always good XML, and it took a lot of fprintf()-massaging.

Then I needed a DOM parser for C (not C++ or C#), and the only one I really liked was libxml. It’s got the proper license for me to use it, it’s simple to use, and it has both DOM and SAX parsers.

Here’s a libxml example of how to make your own XML output, taken from the Eressea II sources (you’ll find examples for making your own parser everywhere):

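    #include <libxml/tree.h>

    xmlChar* xml_s(const char * str); /* iconv-based converter, see the iconv example */

    int main(int argc, char** argv)
    {
      if (argc < 2) return 1; /* need an output filename */
      xmlDocPtr doc = xmlNewDoc(BAD_CAST "1.0");
      xmlNodePtr node = xmlNewNode(NULL, BAD_CAST "eressea");
      xmlNewProp(node, BAD_CAST "game", xml_s("Ümläutß"));
      /* the child element's name is cut off in the source; "..." is a placeholder */
      xmlAddChild(node, xmlNewNode(NULL, BAD_CAST "..."));
      xmlDocSetRootElement(doc, node);
      xmlKeepBlanksDefault(0);            /* let libxml indent the output */
      xmlSaveFormatFile(argv[1], doc, 1); /* 1 = pretty-print */
      xmlFreeDoc(doc);
      return 0;
    }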

That BAD_CAST is just a macro that casts char* to xmlChar* without converting anything, and you write it whenever you think that your input is already good UTF-8 and are too lazy to convert. Please see Joel’s article on Unicode first. For places where I don’t have that guarantee, my code uses iconv, a character conversion library, to convert the internal char* to UTF-8. Here’s an iconv example for the xml_s function used above:

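    #include <iconv.h>
    #include <stdio.h>
    #include <string.h>
    #include <libxml/xmlstring.h>

    iconv_t utf8;

    /* Convert str to UTF-8. Returns a static buffer that the next call overwrites. */
    xmlChar* xml_s(const char * str)
    {
      static char buffer[1024]; /* it's enough */
      char * inbuf = (char *)str; /* POSIX iconv() takes char**, not const char** */
      char * outbuf = buffer;
      size_t inbytes = strlen(str)+1, outbytes = sizeof(buffer);
      iconv(utf8, &inbuf, &inbytes, &outbuf, &outbytes);
      return (xmlChar*)buffer;
    }

    int main(int argc, char** argv)
    {
      utf8 = iconv_open("UTF-8", ""); /* same variable as the global above */
      puts((const char *)xml_s("ä߀"));
      iconv_close(utf8);
      return 0;
    }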
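The xml_s above ignores what iconv returns and trusts that 1024 bytes are enough. If you want to know when a conversion failed, something like the following might do. It is only a sketch: to_utf8 is a name I made up for illustration, it takes the source charset as an explicit parameter instead of relying on a global, and it opens and closes the converter on every call, which is simple but not fast.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the Eressea sources): convert the
 * NUL-terminated string `in` from `from_charset` to UTF-8, writing at most
 * `outsize` bytes to `out`. Returns 0 on success, -1 if the charset is
 * unknown, the input is invalid, or the output buffer is too small. */
int to_utf8(const char *from_charset, const char *in, char *out, size_t outsize)
{
    iconv_t cd = iconv_open("UTF-8", from_charset);
    if (cd == (iconv_t)-1)
        return -1;                      /* unsupported conversion */

    char *inbuf = (char *)in;           /* iconv() does not modify the input */
    size_t inleft = strlen(in) + 1;     /* +1 converts the terminating NUL too */
    char *outbuf = out;
    size_t outleft = outsize;

    size_t rc = iconv(cd, &inbuf, &inleft, &outbuf, &outleft);
    iconv_close(cd);

    return (rc == (size_t)-1) ? -1 : 0; /* errno says why, e.g. E2BIG or EILSEQ */
}
```

A caller would pass its own buffer, for example to_utf8("ISO-8859-1", name, buf, sizeof(buf)), and fall back to something sensible when it gets -1.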

That’s so much more fun than fprintf-wrangling.
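For the BAD_CAST case, where you only think the input is already good UTF-8, a small check can turn "think" into "know" before you cast. This is a sketch of my own, not part of libxml or the Eressea code, and is_valid_utf8 is a made-up name. It walks the string byte by byte and rejects bad lead bytes, missing continuation bytes, overlong encodings, surrogates and anything above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the NUL-terminated string s is
 * well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const char *s)
{
    /* smallest code point that legitimately needs 0, 1, 2 or 3 extra bytes */
    static const unsigned long min_cp[] = { 0x0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned long cp;
        int extra; /* number of continuation bytes that must follow */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return 0; /* stray continuation byte or invalid lead byte */
        p++;

        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return 0; /* not a continuation byte (also catches an early NUL) */
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min_cp[extra]) return 0;           /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* UTF-16 surrogate */
        if (cp > 0x10FFFF) return 0;                /* beyond Unicode */
    }
    return 1;
}
```

If the check fails, the string is a candidate for the iconv route instead of BAD_CAST.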
