LGF Comment: re: #157 Charles Not really a guru, ...

Breitbart Shocked, Shocked to Find That Birtherism is Going on at TeaBagCon

SixDegrees 2/06/2010 1:51:35 pm PST

re: #157 Charles

Not really a guru, but I know a bit. What’s the question?

We’ve been parsing a particular type of XML document for some time, without trouble, using Xerces-C++. We got a failure the other day because every tag in the document suddenly has a namespace attached to it, which we’ve never seen before. The namespace is the same on every tag and appears to be machine generated. There is a single tag at the top of the document (I don’t recall its name) containing a huge list of namespace definitions of the form xmlns:foo="urn:gdyn:ims:bar"; ‘foo’ is the namespace found in the document tags, although there are many other namespaces listed.

I’m assuming that somewhere, there is a document that states one of two things: the namespace we’re supposed to be interested in, or the urn we’re supposed to be interested in. Is this correct? It seems impossible to me to attach such metainformation to the document itself, so we need some external source for the mapping between namespaces/urns and things we actually care about.
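
To make the namespace/urn distinction concrete, here is a minimal sketch. It uses Python's xml.dom.minidom purely as a stand-in for the Xerces-C++ DOM, which is an assumption on my part; the element names 'doc' and 'record' and the second prefix 'x' are made up for illustration. Under the XML Namespaces rules, the part before the colon is only a local alias for the urn it is declared against, so two documents can use different prefixes and still mean the same thing.

```python
from xml.dom import minidom

# Two hypothetical documents: same urn, different prefixes.
# 'doc' and 'record' are placeholder element names, not from the real files.
DOC_A = '<foo:doc xmlns:foo="urn:gdyn:ims:bar"><foo:record/></foo:doc>'
DOC_B = '<x:doc xmlns:x="urn:gdyn:ims:bar"><x:record/></x:doc>'


def describe_root(xml_text):
    """Return (prefix, local name, namespace URI) of the root element."""
    root = minidom.parseString(xml_text).documentElement
    return root.prefix, root.localName, root.namespaceURI


def declared_namespaces(xml_text):
    """Map each prefix declared on the root element to its URI."""
    root = minidom.parseString(xml_text).documentElement
    return {
        name.split(":", 1)[1]: value
        for name, value in root.attributes.items()
        if name.startswith("xmlns:")
    }


a = describe_root(DOC_A)
b = describe_root(DOC_B)
print(a)
print(b)
print(declared_namespaces(DOC_A))
```

If that holds for our documents too, the urn would be the stable thing to match on, and the prefix-to-urn mapping would come from the xmlns declarations in the document itself.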

Also, since we’ve never seen namespaces before this, how do we set up our code to parse both types of document, ones with namespaces defined and ones without? It seems as though the only solution is to look for the namespace setting and vary the function call based on its presence or absence, using either getElementsByTagName() or getElementsByTagNameNS() as appropriate. Is this also correct, or is there a way to tell the latter to use the “default” (non-explicit) namespace, maybe by using an empty string?
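
Here is a rough sketch of the two approaches I can think of, again using Python's xml.dom.minidom as a stand-in for Xerces-C++ (an assumption; the 'doc' and 'record' element names and the helper functions are hypothetical). The first helper tries the namespace-aware lookup and falls back to the plain one. The second relies on the "*" wildcard that DOM Level 2 defines for getElementsByTagNameNS(), which matches any namespace; I have not verified how Xerces-C++ treats elements with no namespace at all in that case.

```python
from xml.dom import minidom

# The urn from the xmlns declaration in the new-style documents.
NS_URI = "urn:gdyn:ims:bar"

# Hypothetical old-style (no namespace) and new-style (prefixed) documents.
OLD_DOC = "<doc><record>1</record><record>2</record></doc>"
NEW_DOC = (
    '<foo:doc xmlns:foo="urn:gdyn:ims:bar">'
    "<foo:record>1</foo:record><foo:record>2</foo:record>"
    "</foo:doc>"
)


def find_records(xml_text, local_name="record"):
    """Look up by (urn, local name); fall back to the plain tag name."""
    doc = minidom.parseString(xml_text)
    found = doc.getElementsByTagNameNS(NS_URI, local_name)
    if not found:
        # Nothing in our namespace: assume an old-style document.
        found = doc.getElementsByTagName(local_name)
    return [node.firstChild.data for node in found]


def find_records_any_ns(xml_text, local_name="record"):
    """Match the local name in any namespace via the "*" wildcard."""
    doc = minidom.parseString(xml_text)
    found = doc.getElementsByTagNameNS("*", local_name)
    return [node.firstChild.data for node in found]


print(find_records(OLD_DOC), find_records(NEW_DOC))
print(find_records_any_ns(OLD_DOC), find_records_any_ns(NEW_DOC))
```

The fallback version is stricter, since it only accepts the one urn we care about or no namespace at all; the wildcard version is shorter but would also pick up a 'record' element from any of the other namespaces declared at the top of the document.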