Changing the structure of an XML file
Besides cat(1), one of the most useful shell commands for interactive
use is head(1), which truncates its input after a few lines. There
are multiple generalizations of this idea for XML documents.
The xml-head(1) command has three main switches. The -t switch
truncates the tags, i.e. it displays only the first few tags (but still
generates well-formed XML). The -c switch truncates the text fields,
i.e. it displays only the first few characters wherever text is present,
but leaves the tags as they are. The -n switch truncates lines, so that
each text field does not exceed a certain number of lines. All three
main switches can be combined.
% xml-head -t 3 People.xml
<?xml version="1.0"?>
<People>
<Person Name="Fred Davis">
<Address>
<LineOne>4 Bushy Street</LineOne>
</Address>
</Person>
</People>
% xml-head -c 2 People.xml
<?xml version="1.0"?>
<People>
<Person Name="Fred Davis">
<Address>
<LineOne>4 </LineOne>
<LineTwo>Gr</LineTwo>
<County>Ma</County>
<Country>Ir</Country>
</Address>
<TelNo>+3</TelNo>
</Person>
</People>
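The idea behind the -c switch can be sketched in a few lines of Python. This is only an illustration built on the standard xml.etree.ElementTree module, not the way xml-head(1) is implemented; the function name truncate_text and the shortened sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample loosely based on People.xml above.
SAMPLE = (
    '<People><Person Name="Fred Davis"><Address>'
    "<LineOne>4 Bushy Street</LineOne><County>Mayo</County>"
    "</Address></Person></People>"
)

def truncate_text(root, limit):
    """Cut every non-blank text field to its first `limit` characters.

    Tags and attributes are left untouched, so the result stays well-formed.
    """
    for elem in root.iter():
        # Text directly after the opening tag of this element.
        if elem.text and elem.text.strip():
            elem.text = elem.text[:limit]
        # Text that follows the closing tag of this element.
        if elem.tail and elem.tail.strip():
            elem.tail = elem.tail[:limit]
    return root

root = truncate_text(ET.fromstring(SAMPLE), 2)
result = ET.tostring(root, encoding="unicode")
print(result)
```

As with the -c switch, only the character data shrinks; every tag that was present in the input is still present in the output.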
Another way to modify the structure of an XML file is with xml-cut(1).
In traditional Unix, the cut(1) command prints columns from an input
file that is viewed as a table (the exact meaning of a column is
determined by switches). To understand xml-cut(1), think of a fully
indented XML file, where each level of indentation is printed in its
own column:
0                     | 1    | 2    | 3    | 4
------------------------------------------------
<?xml version="1.0"?> |      |      |      |
                      | <a>  |      |      |
                      |      | <b>  |      |
                      |      |      | <c>  |
                      |      |      |      | xyz
                      |      |      | </c> |
                      |      | </b> |      |
                      | </a> |      |      |
Now we can print only columns 2 and 4 as follows:
% xml-echo -e '[a/b/c]xyz' | xml-cut -t 2,4
<?xml version="1.0"?>
<root>
<b>
xyz
</b>
</root>
Note that the closing tag </b> in this example is out of
alignment. This makes sense once you realize that the "xyz" text
field really begins with the first newline after <c> and
contains all the whitespace before </c>. As usual, xml-fmt(1)
can be used to align the tags if necessary.
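The column picture above can be reproduced with a short Python sketch. It is an illustration of the depth model only, not of how xml-cut(1) works internally: the helper name columns is an assumption, and unlike xml-cut(1) the sketch strips whitespace from text fields and does not wrap the result in a new root tag.

```python
import xml.etree.ElementTree as ET

def columns(elem, depth=1):
    """Yield (column, token) pairs for a subtree.

    A tag sits in the column given by its nesting depth; the text it
    contains sits one column further to the right.
    """
    yield depth, f"<{elem.tag}>"
    if elem.text and elem.text.strip():
        yield depth + 1, elem.text.strip()
    for child in elem:
        yield from columns(child, depth + 1)
        # Text after a child's closing tag still belongs to this element.
        if child.tail and child.tail.strip():
            yield depth + 1, child.tail.strip()
    yield depth, f"</{elem.tag}>"

root = ET.fromstring("<a><b><c>xyz</c></b></a>")
table = list(columns(root))

# Keep only the tokens that fall in columns 2 and 4.
selected = [token for column, token in table if column in {2, 4}]
print(selected)  # ['<b>', 'xyz', '</b>']
```

The tokens that survive are the same ones that appear between the root tags in the xml-cut output above.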
Structural surgery can also be performed using xml-rm(1), xml-cp(1),
and xml-mv(1). These commands remove, copy, and move entire subtrees
of an XML document.
% xml-rm food.xml :/products/product[2]
<products>
<product price="3">Chicken</product>
<product price=".20">Apple</product>
<product price="1.09">Milk (2 litres)</product>
</products>
% xml-cp food.xml :/products/product[2]/ \
People.xml ://TelNo/
<?xml version="1.0"?>
<People>
<Person Name="Fred Davis">
<Address>
<LineOne>4 Bushy Street</LineOne>
<LineTwo>Green Road</LineTwo>
<County>Mayo</County>
<Country>Ireland</Country>
</Address>
<TelNo>Lobster</TelNo>
</Person>
</People>
% xml-mv food.xml :/products/product[3] \
food.xml :/products/product[1]/
<products>
<product price="3"><product price=".20">Apple</product></product>
<product price="11.50">Lobster</product>
<product price="1.09">Milk (2 litres)</product>
</products>
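For comparison, the same kind of subtree surgery can be sketched with Python's standard xml.etree.ElementTree module. This is not how xml-rm(1) or xml-mv(1) are implemented. The contents of FOOD are reconstructed from the outputs shown above, and the helper names remove_nth and move_into are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# food.xml as it can be reconstructed from the command outputs above.
FOOD = """<products>
<product price="3">Chicken</product>
<product price="11.50">Lobster</product>
<product price=".20">Apple</product>
<product price="1.09">Milk (2 litres)</product>
</products>"""

def remove_nth(parent, tag, n):
    """Detach and return the n-th (1-based) child of `parent` named `tag`."""
    target = parent.findall(tag)[n - 1]
    parent.remove(target)
    return target

def move_into(parent, src, dst):
    """Detach `src` from `parent` and make it the only content of `dst`."""
    parent.remove(src)
    src.tail = None          # drop the newline that followed src
    dst.text = None          # discard the old text of dst
    for child in list(dst):  # discard any old children of dst
        dst.remove(child)
    dst.append(src)

# Like the xml-rm example: drop the second product.
removed_root = ET.fromstring(FOOD)
remove_nth(removed_root, "product", 2)
remaining = [p.text for p in removed_root]
print(remaining)

# Like the xml-mv example: move the third product into the first.
moved_root = ET.fromstring(FOOD)
items = moved_root.findall("product")
move_into(moved_root, items[2], items[0])
moved = ET.tostring(moved_root, encoding="unicode")
print(moved)
```

In both cases a whole element is handled as one unit, together with its attributes and content, which is the same granularity the three commands operate on.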