Your file, whether autogenerated or somehow “mangled”, is poorly indented, and you want to fix this.
There are different solutions to this problem:
a “pretty-print” stylesheet
the XML parser xmllint
the xmlformat command
The XML parser xmllint offers the --format option, which turns on indentation for each element:
xmllint --format XMLFILE
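As a quick illustration, the following shell sketch writes a small, badly indented file and reformats it. The file names sample.xml and sample-pretty.xml are made up for this example. The XMLLINT_INDENT environment variable and the --output option are taken from the xmllint manual page; check your installed version if they behave differently.

```shell
# Create a small, badly indented test file (hypothetical name: sample.xml).
cat > sample.xml <<'EOF'
<?xml version="1.0"?>
<article><title>Test</title>
      <section><title>First</title>
<para>Some text.</para></section></article>
EOF

# Reformat only if xmllint is installed on this system.
if command -v xmllint >/dev/null 2>&1; then
  # Print the reindented document to standard output.
  xmllint --format sample.xml

  # Use four spaces per level and write the result to a new file.
  XMLLINT_INDENT="    " xmllint --format --output sample-pretty.xml sample.xml
fi
```

Writing to a separate file keeps the original untouched, which is convenient when you want to compare both versions afterwards.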
The simplest stylesheet for indentation is shown in Example 3.2, “pretty.xsl”. It relies on the copy.xsl stylesheet.
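For readers who only want the general idea, here is a minimal sketch of an indenting stylesheet. It is not the pretty.xsl from the example. It assumes that an identity transformation (written inline here instead of importing copy.xsl) combined with indent="yes" is enough for your document. The file name indent-sketch.xsl is hypothetical, and XMLFILE stands for your own document.

```shell
# Write a minimal indenting stylesheet (hypothetical name: indent-sketch.xsl).
cat > indent-sketch.xsl <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Ask the processor to indent the serialized output. -->
  <xsl:output method="xml" indent="yes"/>
  <!-- Drop whitespace-only text nodes so the old indentation does not survive. -->
  <xsl:strip-space elements="*"/>
  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
EOF

# Apply the stylesheet if xsltproc is installed and the input file exists.
if command -v xsltproc >/dev/null 2>&1 && [ -f XMLFILE ]; then
  xsltproc --output XMLFILE.pretty indent-sketch.xsl XMLFILE
fi
```

Note that stripping whitespace-only text nodes from every element also removes a single space that separates two inline elements, so a sketch like this is only safe for documents without such mixed content.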
The xmlformat tool is a Perl script which is available from the Web site http://www.kitebird.com/software/xmlformat/.
The tool distinguishes between block elements, inline elements, and verbatim elements (similar to DocBook). The types differ in how whitespace is normalized.
Block elements typically begin on a new line, and their children are indented. Spacing before and after can be controlled too.
Inline elements occur inside block elements. Normalization and line-wrapping are applied with respect to the enclosing block element.
Verbatim elements are not formatted at all. That is, the content of the output element is identical to the content of the input element, including all whitespace.
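The three element types could be mapped to DocBook elements roughly as in the following sketch. The option names (format, entry-break, exit-break, normalize, wrap-length) and the --config-file option are written from memory of the xmlformat documentation and should be checked against it. The choice of elements and the file name docbook-sketch.conf are assumptions made for illustration only.

```shell
# Write a small xmlformat configuration (hypothetical name: docbook-sketch.conf).
cat > docbook-sketch.conf <<'EOF'
# Block elements: start on a new line, children are indented.
article section
  format = block
  entry-break = 1
  exit-break = 1

# Block elements with text: normalize whitespace and wrap long lines.
para title
  format = block
  normalize = yes
  wrap-length = 72

# Inline elements: formatted as part of the enclosing block.
emphasis literal
  format = inline

# Verbatim elements: content is passed through untouched.
screen programlisting
  format = verbatim
EOF

# Run xmlformat if it is installed; XMLFILE stands for your own document.
if command -v xmlformat >/dev/null 2>&1 && [ -f XMLFILE ]; then
  xmlformat --config-file docbook-sketch.conf XMLFILE > XMLFILE.pretty
fi
```

Elements that are not listed fall back to the tool's defaults, so a real configuration would normally cover every element used in your documents.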
The xmllint command with its --format option is the easiest candidate but lacks customization. It is useful if you do not have any other tools at hand and want a quick and rough reformatting.
The pretty.xsl stylesheet is a pure XSLT solution. As such, it works on every platform that has an XSLT processor. It is adaptable to your needs, but mixed content (as in para) is problematic, because whitespace added or removed between text and inline elements changes the content itself.
The most adaptable method is xmlformat.