26 October 2014, 9:06 AM
XMLLINT command in Linux: a validating XML parser

xmllint - command line XML tool

FORMAT
       xmllint [--format --dtdvalid ..] {XML-FILE(S)... | -}

For help:       xmllint --help

 

DESCRIPTION

  • The xmllint program parses one or more XML files, specified on the command line as XML-FILE (or the standard input if the filename provided is - ).
  • It prints various types of output, depending upon the options selected. It is useful for detecting errors both in XML code and in the XML parser itself.
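
As a small sketch of the two input forms described above, the following parses one document named on the command line and then the same document from standard input using "-". The file name sample.xml and its content are made up for illustration; --noout (see the option list further down) suppresses the result tree so that only the exit status remains.

```shell
# Scratch directory and a hypothetical sample document for this demonstration.
workdir=$(mktemp -d)
printf '%s\n' '<student><name>shankar</name></student>' > "$workdir/sample.xml"

# 1) Parse a file named on the command line.
file_status=0
xmllint --noout "$workdir/sample.xml" || file_status=$?

# 2) Parse standard input by passing "-" as the file name.
stdin_status=0
xmllint --noout - < "$workdir/sample.xml" || stdin_status=$?

echo "file: exit status $file_status, stdin: exit status $stdin_status"
```

An exit status of 0 in both cases means the document was parsed without errors.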

Examples:

We will use the same file, testxml.xml, that we tested in the xmlwf article. An error has been introduced on purpose: the start tag <student on line 2 is missing its closing >. We will test the file using the xmllint command.

shanky@localhost:/home/shanky/test:> cat testxml.xml
<?xml version="1.0" standalone="yes"?>
<student
<name>
shankar
</name>
<roll>
11234
</roll>
<marks>
99
</marks>
</student>

shanky@localhost:/home/shanky/test:> xmllint  testxml.xml
testxml.xml:3: parser error : error parsing attribute name
<name>
^
testxml.xml:3: parser error : attributes construct error
<name>
^
testxml.xml:3: parser error : Couldn't find end of Start Tag student line 2
<name>
^
testxml.xml:3: parser error : Extra content at the end of the document
<name>
^

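The output says that the parser could not find the end of the start tag for student, which begins on line 2. The closing > is missing, so the parser tries to read <name> on line 3 as an attribute. Now we will correct the error and retry.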
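
Before correcting the file, note that a script can detect the same failure without reading the messages, because xmllint exits with a non-zero status when parsing fails. A minimal sketch, assuming a scratch file broken.xml (a made-up name) that reproduces the missing ">":

```shell
workdir=$(mktemp -d)

# Reproduce the error: the start tag <student is never closed with ">".
cat > "$workdir/broken.xml" <<'EOF'
<?xml version="1.0" standalone="yes"?>
<student
<name>shankar</name>
</student>
EOF

# --noout suppresses the result tree; 2>/dev/null hides the parser messages.
status=0
xmllint --noout "$workdir/broken.xml" 2>/dev/null || status=$?

if [ "$status" -ne 0 ]; then
    echo "broken.xml is not well-formed (exit status $status)"
else
    echo "broken.xml is well-formed"
fi
```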

shanky@localhost:/home/shanky/test:> cat testxml.xml
<?xml version="1.0" standalone="yes"?>
<student><name>shankar</name><roll>11234</roll><marks>99</marks></student>

Here you may notice that we have rectified the parser error. But we have also put all the content on a single line, so the content is not well formatted. Now we run the command again, first without and then with the "--format" option:

shanky@localhost:/home/shanky/test:> xmllint testxml.xml
<?xml version="1.0" standalone="yes"?>
<student><name>shankar</name><roll>11234</roll><marks>99</marks></student>
shanky@localhost:/home/shanky/test:> xmllint --format testxml.xml
<?xml version="1.0" standalone="yes"?>
<student>
  <name>shankar</name>
  <roll>11234</roll>
  <marks>99</marks>
</student>

 

 

Can you see the difference between the two outputs?

The first is just the content of the XML file as it is, and the second is neatly indented. This is what the --format option does.

If we have a very big XML file that is not formatted, it's difficult to read, so we can use this command for better visibility:

xmllint --format xml-file
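
For a large file it is usually more convenient to write the formatted result to a new file than to the terminal. A minimal sketch using the --output option from the list further down; the names big.xml and pretty.xml are placeholders chosen for this example:

```shell
workdir=$(mktemp -d)

# Stand-in for a large, unformatted document (everything on one line).
printf '%s\n' '<student><name>shankar</name><roll>11234</roll><marks>99</marks></student>' \
    > "$workdir/big.xml"

# Write the indented result to pretty.xml instead of stdout.
status=0
xmllint --format --output "$workdir/pretty.xml" "$workdir/big.xml" || status=$?

# Show the formatted copy only if xmllint succeeded.
if [ "$status" -eq 0 ]; then
    cat "$workdir/pretty.xml"
fi
```

Writing to a separate file keeps the original document untouched.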


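--dtdvalid and --valid options:

We can supply the DTD file to validate against using the --dtdvalid option. The --valid option additionally validates the document against the DTD named in its own DOCTYPE declaration, so it can be left out when the document has none. See the command below: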

xmllint --dtdvalid dtdfile --valid xmlfile
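
As a concrete sketch of --dtdvalid, the script below writes a small DTD and two documents, then validates each against it. The DTD and the file names (student.dtd, good.xml, bad.xml) are invented for this example, and only the exit status is inspected.

```shell
workdir=$(mktemp -d)

# Hypothetical DTD: a student has one name, one roll and one marks, in that order.
cat > "$workdir/student.dtd" <<'EOF'
<!ELEMENT student (name, roll, marks)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT roll (#PCDATA)>
<!ELEMENT marks (#PCDATA)>
EOF

# Conforms to the DTD.
printf '%s\n' '<student><name>shankar</name><roll>11234</roll><marks>99</marks></student>' \
    > "$workdir/good.xml"

# Well-formed but invalid: the <marks> element is missing.
printf '%s\n' '<student><name>shankar</name><roll>11234</roll></student>' \
    > "$workdir/bad.xml"

# Validate both documents; 2>/dev/null hides the validity messages for bad.xml.
good_status=0
xmllint --noout --dtdvalid "$workdir/student.dtd" "$workdir/good.xml" || good_status=$?
bad_status=0
xmllint --noout --dtdvalid "$workdir/student.dtd" "$workdir/bad.xml" 2>/dev/null || bad_status=$?

echo "good.xml: exit status $good_status"
echo "bad.xml:  exit status $bad_status"
```

A zero status means the document validated; a non-zero status means it did not.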

--path and --valid options:

We can also give the directories in which xmllint should look for the DTD files to validate against. The paths are separated by spaces (or colons), and a space-separated list should be enclosed in single or double quotes. See the command below:

xmllint --path 'path1 path2' --valid xmlfile
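
The following sketch shows --path with a document whose DOCTYPE names its DTD by a bare file name while the DTD itself is stored in a different directory. The directory layout (docs, dtds, other) and the file names are assumptions made for this example. The scratch directory must not contain spaces or colons, since those separate the entries of the path list.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/docs" "$workdir/dtds" "$workdir/other"

# The DTD lives in dtds/, not next to the document.
cat > "$workdir/dtds/student.dtd" <<'EOF'
<!ELEMENT student (name, roll, marks)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT roll (#PCDATA)>
<!ELEMENT marks (#PCDATA)>
EOF

# The DOCTYPE refers to the DTD by file name only.
cat > "$workdir/docs/student.xml" <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE student SYSTEM "student.dtd">
<student><name>shankar</name><roll>11234</roll><marks>99</marks></student>
EOF

# Two search directories in one quoted, space-separated list.
status=0
xmllint --noout --valid --path "$workdir/other $workdir/dtds" \
    "$workdir/docs/student.xml" || status=$?

echo "validation with --path: exit status $status"
```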


 


All OPTIONS: xmllint accepts the following options 

       --auto Generate a small document for testing purposes.

       --catalogs
              Use the SGML catalog(s) from SGML_CATALOG_FILES. Otherwise XML catalogs starting from /etc/xml/catalog are used by default.

       --chkregister
              Turn on node registration. Useful for developers testing libxml(3) node tracking code.

       --compress
              Turn on gzip(1) compression of output.

       --copy Test the internal copy implementation.

       --c14n Use the W3C XML Canonicalisation (C14N) to serialize the result of parsing to stdout. It keeps comments in the result.

       --dtdvalidfpi FPI
              Use the DTD specified by a Formal Public Identifier FPI for validation, note that this will require a catalog exporting that Formal Public Identifier to work.

       --debug
              Parse a file and output an annotated tree of the in-memory version of the document.

       --debugent
              Debug the entities defined in the document.

       --dropdtd
              Remove DTD from output.

       --dtdattr
              Fetch external DTD and populate the tree with inherited attributes.

       --encode ENCODING
              Output in the given encoding.

       --html Use the HTML parser.

       --htmlout
              Output results as an HTML file. This causes xmllint to output the necessary HTML tags surrounding the result tree output so the results can be displayed/viewed in a browser.

       --insert
              Test for valid insertions.

       --loaddtd
              Fetch an external DTD.

       --load-trace
              Display all the documents loaded during the processing to stderr.

       --maxmem NNBYTES
              Test the parser memory support.  NNBYTES is the maximum number of bytes the library is allowed to allocate.
              This can also be used to make sure batch processing of XML files will not exhaust the virtual memory of the server running them.

       --memory
              Parse from memory.

       --noblanks
              Drop ignorable blank spaces.

       --nocatalogs
              Do not use any catalogs.

       --nocdata
              Substitute CDATA section by equivalent text nodes.

       --noent
              Substitute entity values for entity references. By default, xmllint leaves entity references in place.

       --nonet
              Do not use the Internet to fetch DTDs or entities.

       --noout
              Suppress output. By default, xmllint outputs the result tree.

       --nowarning
              Do not emit warnings from the parser and/or validator.

       --nowrap
              Do not output HTML doc wrapper.

       --noxincludenode
              Do XInclude processing but do not generate XInclude start and end nodes.

       --nsclean
              Remove redundant namespace declarations.

       --output FILE
              Define a file path where xmllint will save the result of parsing. Usually the program builds a tree and writes it to stdout; with this option the resulting XML instance is saved to a file instead.

       --pattern PATTERNVALUE
              Used to exercise the pattern recognition engine, which can be used with the reader interface to the parser.
              It allows selecting some nodes in the document based on an XPath (subset) expression. Used for debugging.

       --postvalid
              Validate after parsing has completed.

       --push Use the push mode of the parser.

       --recover
              Output any parsable portions of an invalid document.

       --relaxng SCHEMA
              Use RelaxNG file named SCHEMA for validation.

       --repeat
              Repeat 100 times, for timing or profiling.


       --shell
              Run a navigating shell. The commands available in shell mode are described in the "SHELL COMMANDS" section of the xmllint manual page.

       --stream
              Use streaming API - useful when used in combination with --relaxng or --valid options for validation of files that are too large to be held in memory.

       --testIO
              Test user input/output support.

       --timing
              Output information about the time it takes xmllint to perform the various steps.

       --valid
              Determine if the document is a valid instance of the included Document Type Definition (DTD). A DTD to be validated against also can be specified at the command line using the --dtdvalid option. By default, xmllint also checks to determine if the document is well-formed.

       --version
              Display the version of libxml(3) used.

       --walker
              Test the walker module, which is a reader interface for a document tree. Instead of using the reader API on an unparsed document, it works on an existing in-memory tree. Used for debugging.

       --xinclude
              Do XInclude processing.

       --xmlout
              Used in conjunction with --html. Usually when HTML is parsed the document is saved with the HTML serializer.
              But with this option the resulting document is saved with the XML serializer. This is primarily used to generate XHTML from HTML input.
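
To make one of the options above concrete, here is a small sketch of --noent. The document note.xml and its entity "who" are made up for illustration. Without the option the entity reference is kept in the output; with it, the reference is replaced by the entity's value.

```shell
workdir=$(mktemp -d)

# Hypothetical document that declares and uses an internal entity.
cat > "$workdir/note.xml" <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY who "shankar">
]>
<note>Hello &who;</note>
EOF

# Default: the reference &who; is left in place.
plain=$(xmllint "$workdir/note.xml") || true

# With --noent: &who; is substituted by its value.
expanded=$(xmllint --noent "$workdir/note.xml") || true

echo "--- default ---"
printf '%s\n' "$plain"
echo "--- with --noent ---"
printf '%s\n' "$expanded"
```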

 
 

Category: Open System-Linux | Added by: shanky