XMLWF(1)							     XMLWF(1)



NAME
       xmlwf - Determines if an XML document is well-formed

SYNOPSIS
       xmlwf  [ -s]  [ -n]  [ -p]  [ -x]  [ -e encoding]  [ -w]	 [ -d output-
       dir]  [ -c]  [ -m]  [ -r]  [ -t]	 [ -v]	[ file ...]


DESCRIPTION
       xmlwf uses the Expat library to determine if an XML document is	well-
       formed.	It is non-validating.

       If  you	do  not specify any files on the command-line, and you have a
       recent version of xmlwf, the input file will  be	 read  from  standard
       input.

WELL-FORMED DOCUMENTS
       A well-formed document must adhere to the following rules:

       * The file begins with an XML declaration.  For instance,
         <?xml version="1.0" standalone="yes"?>.  NOTE: xmlwf does not
         currently check for a valid XML declaration.

       * Every start tag is either empty (<tag/>) or has a corresponding
         end tag (</tag>).

       * There is exactly one root element.  This element must contain all
         other elements in the document.  Only comments, white space, and
         processing instructions may come after the close of the root
         element.

       * All elements nest properly.

       * All attribute values are enclosed in quotes (either single or
         double).

       If the document has a DTD, and it strictly  complies  with  that	 DTD,
       then the document is also considered valid.  xmlwf is a non-validating
       parser -- it does not check the DTD.  However, it does support  exter-
       nal entities (see the -x option).
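
       The rules above can be tried out without xmlwf itself.  The sketch
       below uses Python's standard xml.parsers.expat module, which wraps
       the same Expat library.  It is an illustration, not xmlwf's code; the
       function name check_well_formed is made up for this example.

```python
import xml.parsers.expat


def check_well_formed(data: bytes):
    """Return (True, None) if data parses cleanly, else (False, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        return False, f"line {err.lineno}, column {err.offset}: {err}"
    return True, None


if __name__ == "__main__":
    samples = [
        b"<doc><item/></doc>",   # well-formed
        b"<doc><item></doc>",    # start tag without a matching end tag
        b"<a/><b/>",             # more than one root element
        b"<doc id=1/>",          # attribute value not quoted
    ]
    for sample in samples:
        print(sample, check_well_formed(sample))
```

       Like xmlwf, this checks well-formedness only; it does not validate
       against a DTD.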

OPTIONS
       When  an	 option	 includes  an  argument, you may specify the argument
       either separately  ("-d	output")  or  concatenated  with  the  option
       ("-doutput").  xmlwf supports both.

       -c     If the input file is well-formed and xmlwf doesn't encounter
              any errors, the input file is simply copied to the output
              directory unchanged.  This implies no namespaces (turns off -n)
              and requires -d to specify an output directory.

       -d output-dir
	      Specifies a directory to contain transformed representations of
	      the  input files.	 By default, -d outputs a canonical represen-
	      tation (described below).	 You can select different output for-
	      mats using -c and -m.

	      The  output  filenames  will  be	exactly the same as the input
	      filenames or "STDIN" if  the  input  is  coming  from  standard
	      input.   Therefore,  you	must  be careful that the output file
	      does not go into the same directory as the input file.   Other-
	      wise,  xmlwf will delete the input file before it generates the
	      output file (just like running  cat  <  file  >  file  in	 most
	      shells).

	      Two  structurally equivalent XML documents have a byte-for-byte
	      identical canonical XML representation.	Note  that  ignorable
	      white  space  is	considered significant and is treated equiva-
	      lently to	 data.	 More  on  canonical  XML  can	be  found  at
	      http://www.jclark.com/xml/canonxml.html .
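
              A script that drives xmlwf -d can guard against the
              overwrite hazard described above before it runs the command.
              The following is a minimal sketch; the helper name
              output_would_clobber is hypothetical, and it assumes the
              output file takes the base name of the input file.

```python
import os


def output_would_clobber(input_path: str, output_dir: str) -> bool:
    """True if writing the input's base name into output_dir hits the input."""
    target = os.path.join(output_dir, os.path.basename(input_path))
    # realpath resolves symlinks and relative parts so that "." and the
    # input's own directory compare equal.
    return os.path.realpath(target) == os.path.realpath(input_path)


if __name__ == "__main__":
    print(output_would_clobber("doc.xml", "."))       # same directory
    print(output_would_clobber("doc.xml", "canon"))   # separate directory
```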

       -e encoding
	      Specifies	 the  character encoding for the document, overriding
	      any document encoding declaration.  xmlwf supports four  built-
	      in  encodings:  US-ASCII,	 UTF-8, UTF-16, and ISO-8859-1.	 Also
	      see the -w option.
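
              The effect of overriding the declared encoding can be seen
              with Python's xml.parsers.expat binding, whose ParserCreate
              function accepts an encoding argument with the same
              overriding behaviour.  This is a sketch of the idea, not of
              xmlwf's own code; text_content is a made-up helper name.

```python
import xml.parsers.expat


def text_content(data: bytes, encoding=None) -> str:
    """Parse data and return its character data, optionally forcing encoding."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate(encoding)
    parser.CharacterDataHandler = chunks.append  # may fire several times
    parser.Parse(data, True)
    return "".join(chunks)


if __name__ == "__main__":
    # The declaration claims UTF-8, but byte 0xE9 is only valid as Latin-1.
    doc = b'<?xml version="1.0" encoding="UTF-8"?><a>\xe9</a>'
    print(repr(text_content(doc, "ISO-8859-1")))
    try:
        text_content(doc)  # no override: the declared UTF-8 is used
    except xml.parsers.expat.ExpatError as err:
        print("without override:", err)
```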

       -m     Outputs some strange sort of XML file that completely describes
              the input file, including character positions.  Requires -d
              to specify an output directory.

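       -n     Turns on namespace processing.  With namespace processing on,
              element and attribute prefixes are resolved against their
              xmlns declarations, and a prefix with no declaration is
              reported as an error.  -c disables namespaces.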
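
              What namespace processing changes can be shown with Python's
              xml.parsers.expat binding, which turns it on when a
              namespace_separator is given.  This sketch only illustrates
              the Expat behaviour; element_names is a hypothetical helper
              and the URI urn:example is a placeholder.

```python
import xml.parsers.expat


def element_names(data: bytes, namespaces: bool) -> list:
    """Return element names as Expat reports them, with or without namespaces."""
    names = []
    separator = " " if namespaces else None
    parser = xml.parsers.expat.ParserCreate(namespace_separator=separator)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)
    return names


if __name__ == "__main__":
    doc = b'<r xmlns:x="urn:example"><x:e/></r>'
    print(element_names(doc, namespaces=False))  # prefixes left as written
    print(element_names(doc, namespaces=True))   # prefix replaced by its URI
```

              With namespaces off, a name such as x:e is just a name that
              happens to contain a colon, so an undeclared prefix is not an
              error in that mode.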

       -p     Tells xmlwf to process external DTDs and parameter entities.

	      Normally xmlwf never parses parameter entities.  -p tells it to
	      always parse them.  -p implies -x.

       -r     Normally xmlwf memory-maps the XML file  before  parsing;	 this
	      can  result  in faster parsing on many platforms.	 -r turns off
	      memory-mapping and uses  normal  file  IO	 calls	instead.   Of
	      course, memory-mapping is automatically turned off when reading
	      from standard input.

	      Use of memory-mapping can cause some platforms to	 report	 sub-
	      stantially  higher  memory usage for xmlwf, but this appears to
	      be a matter of the  operating  system  reporting	memory	in  a
	      strange way; there is not a leak in xmlwf.

       -s     Prints  an error if the document is not standalone.  A document
	      is standalone if it has no external subset and no references to
	      parameter entities.
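
              Expat exposes this check as a "not standalone" callback.  The
              sketch below uses it through Python's xml.parsers.expat
              binding to classify a document.  It shows the mechanism under
              that assumption and is not a copy of xmlwf's code; the names
              is_standalone and d.dtd are invented for the example.

```python
import xml.parsers.expat


def is_standalone(data: bytes) -> bool:
    """True unless Expat signals that the document is not standalone."""
    flagged = []

    def not_standalone():
        flagged.append(True)
        return 1  # nonzero: keep parsing rather than raising an error

    parser = xml.parsers.expat.ParserCreate()
    parser.NotStandaloneHandler = not_standalone
    parser.Parse(data, True)
    return not flagged


if __name__ == "__main__":
    print(is_standalone(b"<d/>"))
    # The external subset is referenced but never fetched here.
    print(is_standalone(b'<!DOCTYPE d SYSTEM "d.dtd"><d/>'))
```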

       -t     Turns  on	 timings.  This tells Expat to parse the entire file,
	      but not perform any processing.  This gives a  fairly  accurate
	      idea  of the raw speed of Expat itself without client overhead.
	      -t turns off most of the output options (-d, -m, -c, ...).

       -v     Prints the version of the Expat library being  used,  including
	      some  information	 on  the  compile-time	configuration  of the
	      library, and then exits.

       -w     Enables support for Windows code pages.  Normally,  xmlwf	 will
	      throw  an	 error	if  it runs across an encoding that it is not
	      equipped to handle itself.  With -w, xmlwf will try  to  use  a
	      Windows code page.  See also -e.

       -x     Turns on parsing external entities.

	      Non-validating  parsers  are  not	 required to resolve external
	      entities, or even expand entities at all.	 Expat always expands
	      internal	entities  (?),	but  external  entity parsing must be
	      enabled explicitly.

	      External entities are simply entities that  obtain  their	 data
	      from outside the XML file currently being parsed.

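              This is an example of an internal entity:

              <!ENTITY vers '1.0.2'>

              And here are some examples of external entities:

              <!ENTITY header SYSTEM "header-&vers;.xml">   (parsed)
              <!ENTITY logo SYSTEM "logo.png" NDATA PNG>    (unparsed)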
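
              To show what enabling external entity parsing involves, the
              sketch below resolves a parsed external entity through
              Python's xml.parsers.expat binding.  Instead of reading
              files, it looks system identifiers up in a dictionary; that
              dictionary, the helper name expand_text, and the identifier
              ext.xml are assumptions made for the example.

```python
import xml.parsers.expat


def expand_text(data: bytes, entities: dict) -> str:
    """Return character data with external entities drawn from `entities`.

    `entities` maps system identifiers to replacement bytes; it stands in
    for the file system in this example.
    """
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def resolve(context, base, system_id, public_id):
        # The child parser inherits the parent's handlers.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(entities[system_id], True)
        return 1  # nonzero tells Expat the entity was handled

    parser.ExternalEntityRefHandler = resolve
    parser.Parse(data, True)
    return "".join(chunks)


if __name__ == "__main__":
    doc = b'<!DOCTYPE d [<!ENTITY ext SYSTEM "ext.xml">]><d>[&ext;]</d>'
    print(expand_text(doc, {"ext.xml": b"hello"}))
```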

       --     (Two  hyphens.)	Terminates the list of options.	 This is only
	      needed if a filename starts with a hyphen.  For example:

	      xmlwf -- -myfile.xml

	      will run xmlwf on the file -myfile.xml.

       Older versions of xmlwf do not support reading from standard input.

OUTPUT
       If an input file is not well-formed, xmlwf prints a single line
       describing the problem to standard output.  If a file is well-formed,
       xmlwf outputs nothing.  Note that the result code is not set.

BUGS
       According to the W3C standard, an XML file without  a  declaration  at
       the  beginning  is  not considered well-formed.	However, xmlwf allows
       this to pass.

       xmlwf returns a 0 - noerr result, even if the file is not well-formed.
       There  is  no  good  way for a program to use xmlwf to quickly check a
       file -- it must parse xmlwf's standard output.
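
       Given that behaviour, a calling program has to treat any text on
       standard output as the failure signal.  A minimal sketch of that
       workaround follows.  The helper name reports_problem is
       hypothetical, and the sketch assumes only what this page states:
       no output means well-formed, any output means a problem.

```python
import subprocess


def reports_problem(argv: list) -> bool:
    """Run argv and treat any text on standard output as an error report."""
    result = subprocess.run(argv, stdout=subprocess.PIPE, text=True)
    # The exit status is ignored on purpose: it is not a reliable signal here.
    return bool(result.stdout.strip())


if __name__ == "__main__":
    # "--" guards against file names that start with a hyphen.
    if reports_problem(["xmlwf", "--", "document.xml"]):
        print("document.xml is not well-formed")
```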

       The errors should go to standard error, not standard output.

       There should be a way to get -d to send its output to standard  output
       rather than forcing the user to send it to a file.

       I  have	no  idea  why  anyone  would  want  to use the -d, -c, and -m
       options.	 If someone could explain it to me,  I'd  like	to  add	 this
       information to this manpage.

ALTERNATIVES
       Here are some XML validators on the web:

       http://www.hcrc.ed.ac.uk/~richard/xml-check.html
       http://www.stg.brown.edu/service/xmlvalid/
       http://www.scripting.com/frontier5/xml/code/xmlValidator.html
       http://www.xml.com/pub/a/tools/ruwf/check.html

SEE ALSO
       The Expat home page:	   http://www.libexpat.org/
       The W3 XML specification:   http://www.w3.org/TR/REC-xml

AUTHOR
       This  manual  page  was written by Scott Bronson 
       for the Debian GNU/Linux system (but may be used by others).   Permis-
       sion  is granted to copy, distribute and/or modify this document under
       the terms of the GNU Free Documentation License, Version 1.1.



			       24 January 2003			     XMLWF(1)

