Basics of validating XML documents against a given schema in C++.
1. Create a parser instance.
SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
2. Set the required features on the parser instance as follows.
// Enable the parser's schema support.
parser->setFeature(XMLUni::fgXercesSchema, true);
// Schema validation requires namespace processing to be turned on.
parser->setFeature(XMLUni::fgSAX2CoreValidation, true);
parser->setFeature(XMLUni::fgSAX2CoreNameSpaces, true);
3. Set the schema location with the setProperty API call, using the property that matches whether the schema has a namespace. If we want to use the ‘ExternalSchemaLocation’ property, the value must be the namespace, followed by a space character, followed by the schema file path.
// Define the location of the schema.
XMLCh* schemaLocation = XMLString::transcode("/directory/path/myschema.xsd");
parser->setProperty(XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation, schemaLocation);
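To make the two value formats concrete, here is a minimal sketch that builds the property name and value as plain strings, assuming the property URIs quoted in the complete listing later in this post. The helper name schemaLocationProperty and the example namespace are made up for illustration; the result would still need to be transcoded and passed to setProperty.

```cpp
#include <string>
#include <utility>

// Property URIs as quoted in the complete listing of this post.
const std::string kExternalSchemaLocation =
    "http://apache.org/xml/properties/schema/external-schemaLocation";
const std::string kExternalNoNamespaceSchemaLocation =
    "http://apache.org/xml/properties/schema/external-noNameSpaceSchemaLocation";

// Hypothetical helper: returns {property name, property value}.
// An empty targetNamespace means the schema declares no target namespace.
std::pair<std::string, std::string> schemaLocationProperty(
    const std::string& targetNamespace, const std::string& schemaPath) {
    if (targetNamespace.empty()) {
        // No namespace: the value is just the schema location.
        return {kExternalNoNamespaceSchemaLocation, schemaPath};
    }
    // With a namespace: "<namespace> <schema location>", separated by a space.
    return {kExternalSchemaLocation, targetNamespace + " " + schemaPath};
}
```

For a schema with the (assumed) target namespace http://example.com/ns, the value becomes "http://example.com/ns" plus a space plus the schema location; for a schema without one, the location is passed through unchanged.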
The current parser version requires the path in the format below.
XMLCh* propertyValue = XMLString::transcode("myschema.xsd");
// ArrayJanitor frees the transcoded string when it goes out of scope.
ArrayJanitor<XMLCh> janValue(propertyValue);
parser->setProperty(XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation, propertyValue);
Another important thing to remember: the file path should always be given in “file:///” form. If you don’t follow this format, the parser won’t complain, but validation/parsing will not work correctly. Fortunately, if you are working in Java, the File class provides the toURI() call (and the older toURL()) to get the path in file-protocol form.
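C++ has no direct counterpart to that Java call in the standard library, so the conversion has to be done by hand. The following is a minimal sketch, assuming the input is already an absolute path; toFileUrl is a hypothetical helper name, and it only normalizes separators and adds the scheme (it does not percent-encode spaces or other reserved characters).

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: turns an absolute path into a file:/// URL.
// It only normalizes separators and adds the scheme; it does not
// percent-encode spaces or other reserved characters.
std::string toFileUrl(std::string absolutePath) {
    // Windows-style separators become forward slashes.
    std::replace(absolutePath.begin(), absolutePath.end(), '\\', '/');
    // Leave values that already carry the file scheme untouched.
    if (absolutePath.rfind("file:", 0) == 0) {
        return absolutePath;
    }
    // "/dir/x.xsd" already starts with the third slash of "file:///";
    // "C:/dir/x.xsd" needs that slash added.
    if (!absolutePath.empty() && absolutePath.front() == '/') {
        return "file://" + absolutePath;
    }
    return "file:///" + absolutePath;
}
```

With this, toFileUrl("/directory/path/myschema.xsd") gives "file:///directory/path/myschema.xsd", which can then be transcoded and used as the property value.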
4. Now set the content handler and the error handler on the parser instance. Always use a custom handler that inherits from DefaultHandler when setting these. For the error handler, the inherited methods warning(), error(), and fatalError() need to be overridden; otherwise parsing/validation errors can go unnoticed, with no exception raised for you to catch.
parser->setContentHandler((ContentHandler*) myContentHandler);
parser->setErrorHandler((ErrorHandler*) myContentHandler);
5. Finally, call the parse API.
// Do the parse.
parser->parse(*xmlInputSource);
6. The complete code is as follows.
SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
parser->setFeature(XMLUni::fgSAX2CoreValidation, true);
parser->setFeature(XMLUni::fgSAX2CoreNameSpaces, true);

// Enable strict validation.
parser->setFeature(XMLUni::fgSAX2CoreNameSpacePrefixes, true);
parser->setFeature(XMLUni::fgXercesValidationErrorAsFatal, true);

// Enable the parser's schema support.
parser->setFeature(XMLUni::fgXercesSchema, true);
parser->setFeature(XMLUni::fgXercesSchemaFullChecking, true);
parser->setFeature(XMLUni::fgXercesDynamic, false);

XMLCh* propertyValue = XMLString::transcode(m_sDefSchema.getMBCSCopy());
ArrayJanitor<XMLCh> janValue(propertyValue);

// Define the location of the XML schema (with/without namespace).
if (isNS) {
    // Property name - http://apache.org/xml/properties/schema/external-schemaLocation
    parser->setProperty(XMLUni::fgXercesSchemaExternalSchemaLocation, propertyValue);
} else {
    // Property name - http://apache.org/xml/properties/schema/external-noNameSpaceSchemaLocation
    parser->setProperty(XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation, propertyValue);
}
Now let’s look at the common errors we face while developing validation. For most of them, first check whether the schema has a target namespace, because the parser/validator behaves differently depending on that.
1. Character '<' is grammatically unexpected
Cause: a required tag is missing.
2. cvc-elt.1: Cannot find the declaration of element
Cause: no namespace is found in the XML, or the parser’s schema location property doesn’t have the namespace attached, or the schema files are missing at the path given in the parser’s schema location property.
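Since most of these errors come down to whether the schema has a target namespace, a quick check of the schema file can save time. The sketch below is only a rough illustration: findTargetNamespace is a hypothetical helper name, and it does a plain text search rather than a real XML parse, so a targetNamespace mentioned inside a comment would fool it.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the value of the first targetNamespace
// attribute found in the schema text, or an empty string if none is found.
// This is a plain text search, not an XML parse.
std::string findTargetNamespace(const std::string& schemaText) {
    // Matches targetNamespace="..." or targetNamespace='...',
    // allowing whitespace around the equals sign.
    static const std::regex pattern(
        R"re(targetNamespace\s*=\s*(["'])(.*?)\1)re");
    std::smatch match;
    if (std::regex_search(schemaText, match, pattern)) {
        return match[2].str();
    }
    return std::string();
}
```

An empty result suggests using the no-namespace schema location property; a non-empty result is the namespace that has to be placed in front of the schema path in the ‘ExternalSchemaLocation’ value.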
This list will be updated in the future.