XPath: Ignore nodes with certain inner text - xpath

How can you ignore nodes that have a certain inner text when you don't know the inner text of the other nodes?
<row>
<column>test</column>
</row>
<row>
<column>???</column>
</row>
This is what I tried, but it didn't work:
row/column[not(.='test')]
row/column[.!='test']
row/column[not(text()='test')]
row/column[text()!='test']
row[column[text()!='test']]/column

This will get you the rows where the first <column> is not test.
//row[column[1][. != 'test']]
See http://www.xpathtester.com/obj/1ddc1930-ad7f-424c-9800-85df95fe6af3
(hit "Test!") to run it
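For readers who are not working in an XPath engine, the same filter can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration: the <rows> wrapper element and the helper name rows_without_first_column are assumptions added here, not part of the question.

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the question. The <rows> root is an assumption,
# added because a well-formed XML document needs a single root element.
XML = """
<rows>
  <row><column>test</column></row>
  <row><column>???</column></row>
</rows>
"""

def rows_without_first_column(root, unwanted):
    """Return the <row> elements whose first <column> is not `unwanted`."""
    kept = []
    for row in root.iter("row"):
        first = row.find("column")  # first <column> child, or None
        if first is None:
            continue
        # Join all descendant text, like the XPath string-value used by `.`
        if "".join(first.itertext()) != unwanted:
            kept.append(row)
    return kept

root = ET.fromstring(XML)
kept = rows_without_first_column(root, "test")
print([row.find("column").text for row in kept])  # ['???']
```

Like the XPath above, this skips rows that have no <column> child at all.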

Related

XQuery group by does not eliminate duplicated items

I have an XML document,
<resultsets>
<row>
<first_name>Georgi</first_name>
<last_name>Facello</last_name>
</row>
<row>
<first_name>Bezalel</first_name>
<last_name>Simmel</last_name>
</row>
<row>
<first_name>Bezalel</first_name>
<last_name>Hass</last_name>
</row>
</resultsets>
I want to sort first names and remove duplicated first names to produce this:
<resultsets>
<row>
<first_name>Bezalel</first_name>
<last_name>Simmel</last_name>
</row>
<row>
<first_name>Georgi</first_name>
<last_name>Facello</last_name>
</row>
</resultsets>
Following is the code I wrote:
for $last_name at $count1 in doc("employees.xml")//last_name,
$first_name at $count2 in doc("employees.xml")//first_name
let $f := $first_name
where ( $count1=$count2 )
group by $f
order by $f
return
<row>
{$f}
{$last_name}
</row>
However, this code sorts the XML document by first name but fails to remove the duplicated first name ('Bezalel'). It returns:
<resultsets>
<row>
<first_name>Bezalel</first_name>
<last_name>Simmel</last_name>
</row>
<row>
<first_name>Bezalel</first_name>
<last_name>Hass</last_name>
</row>
<row>
<first_name>Georgi</first_name>
<last_name>Facello</last_name>
</row>
</resultsets>
I know how to solve this using two FLWOR expressions. The group by behavior seems odd; could you please explain why it does not remove the duplicates?
Is there any way to solve this problem using ONE FLWOR expression and ONLY the two variables $first_name and $last_name? Thanks.
I would simply group the row elements by the first_name child and then output the first item in each group to ensure you don't get duplicates:
<resultsets>
{
for $row in resultsets/row
group by $fname := $row/first_name
order by $fname
return
$row[1]
}
</resultsets>
http://xqueryfiddle.liberty-development.net/jyyiVhf
As to how the group by clause works, see https://www.w3.org/TR/xquery-31/#id-group-by which says:
The group by clause assigns each pre-grouping tuple to a group, and
generates one post-grouping tuple for each group. In the post-grouping
tuple for a group, each grouping key is represented by a variable that
was specified in a GroupingSpec, and every variable that appears in
the pre-grouping tuples that were assigned to that group is
represented by a variable of the same name, bound to a sequence of all
values bound to the variable in any of these pre-grouping tuples.
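The quoted rule can be illustrated outside XQuery. The following Python sketch is an analogy, not XQuery semantics in full: each (first_name, last_name) pair stands for a pre-grouping tuple, the dictionary key plays the role of the grouping key, and the list holds the sequence of all values bound to the non-key variable. The function name group_by_first_name is hypothetical.

```python
# Pre-grouping tuples: one (first_name, last_name) pair per <row>.
rows = [
    ("Georgi", "Facello"),
    ("Bezalel", "Simmel"),
    ("Bezalel", "Hass"),
]

def group_by_first_name(pairs):
    """Map each grouping key to the sequence of last names in its group."""
    groups = {}
    for first, last in pairs:
        groups.setdefault(first, []).append(last)
    return groups

groups = group_by_first_name(rows)

# One post-grouping tuple per key; taking item [0] mirrors $row[1] above.
result = [(first, lasts[0]) for first, lasts in sorted(groups.items())]
print(result)  # [('Bezalel', 'Simmel'), ('Georgi', 'Facello')]
```

After grouping, the non-key variable holds every value from the group, so duplicates disappear only when you pick a single item from that sequence.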

Freemarker : Expression inside Expression

Is there any way to use an expression inside another expression in FreeMarker?
Example:
XML file:
<Document>
<Row>
<item_date>01/01/2015</item_date>
</Row>
<Row>
<item_date>02/01/2015</item_date>
</Row>
</Document>
<#list 0..1 as i>
${Document.Row[${i}].item_date}
</#list>
I want to print as below
01/01/2015
02/01/2015
Any idea?
Thanks in Advance
Like this:
${Document.Row[i].item_date}
Note that if you are using an up-to-date version, you get this error message, which explains why:
You can't use "${" here as you are already in
FreeMarker-expression-mode. Thus, instead of ${myExpression}, just
write myExpression. (${...} is only needed where otherwise static text
is expected, i.e, outside FreeMarker tags and ${...}-s.)

How do I get the text from a node of a specific preceding sibling

If my XML is like this:
<sql result="success">
<row>
<column>
<name>USER_ID</name>
<value>TEST</value>
</column>
<column>
<name>EMAIL_ADDRESS</name>
<value>xxx@yyyy.com</value>
</column>
</row>
</sql>
How do I extract just the text of the node retrieved with this XPath:
//value[preceding-sibling::name[1][. = 'USER_ID']]
Just append /text() to select the text child of the element:
//value[preceding-sibling::name[1][. = 'USER_ID']]/text()
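If the lookup has to happen in application code, a rough equivalent can be written with Python's xml.etree.ElementTree. It supports only a subset of XPath (no preceding-sibling axis), so this sketch selects the <column> whose <name> child matches and then reads its <value>. The helper name value_for is an assumption, and the column name is interpolated into the path without escaping, so it suits trusted input only.

```python
import xml.etree.ElementTree as ET

XML = """
<sql result="success">
  <row>
    <column><name>USER_ID</name><value>TEST</value></column>
    <column><name>EMAIL_ADDRESS</name><value>xxx@yyyy.com</value></column>
  </row>
</sql>
"""

def value_for(root, column_name):
    """Return the text of <value> in the <column> whose <name> matches."""
    # [name='...'] is part of ElementTree's limited XPath subset.
    return root.findtext(f".//column[name='{column_name}']/value")

root = ET.fromstring(XML)
user_id = value_for(root, "USER_ID")
print(user_id)  # TEST
```

findtext returns None when no matching element exists, so callers can tell a missing column apart from an empty value.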

How can I merge two XML files into one?

I need to merge two XML files. I saw this question before, but that poster wanted to simply concatenate the two files. I want to merge based on a specific child element, in this case, id.
I have two XML files that have the following structure:
File #1:
<document>
<row>
<id>1</id>
<data_field1>aaaa</data_field1>
<data_field2>bbbb</data_field2>
</row>
</document>
File #2:
<document>
<row>
<id>1</id>
<data_field3>cccc</data_field3>
</row>
</document>
And I want them to be merged into File #3:
<document>
<row>
<id>1</id>
<data_field1>aaaa</data_field1>
<data_field2>bbbb</data_field2>
<data_field3>cccc</data_field3>
</row>
</document>
Where it uses the id element to join each XML entry.
The code below does this using XML::Twig.
It works with more than two documents, and it works even if not all ids are present in every document. It loads all the files into memory, though; if you need to handle documents too big to fit in memory, the code will be a bit more complex. Rows appear in the same order as in the first document, followed by the rows that appear only in the second one.
Since it is written as a test, you can make the test case more complex or add more tests, which would probably be a good idea.
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;
use XML::Twig;
# normally you would read the documents from file,
# but it's easier to write a self-contained test
my $d1='
<document>
<row>
<id>1</id>
<data_field1>aaaa</data_field1>
<data_field2>bbbb</data_field2>
</row>
</document>
';
my $d2='
<document>
<row>
<id>1</id>
<data_field3>cccc</data_field3>
</row>
</document>
';
my $merged=
'<document>
<row>
<id>1</id>
<data_field1>aaaa</data_field1>
<data_field2>bbbb</data_field2>
<data_field3>cccc</data_field3>
</row>
</document>
';
$merged=~ s{\n}{}g; # remove \n's,
# if you want the result indented, look at the pretty_print option
is( merged( $d1, $d2), $merged, 'one test to rule them all');
done_testing();
sub merged
{
my @docs= map { XML::Twig->new->parse( $_) } @_;
my $merged= XML::Twig->new->parse( '<document></document>');
my %row_id; # hash id => row_element
foreach my $doc (@docs)
{ foreach my $row ($doc->root->children( 'row'))
{ my $eid= $row->first_child( 'id');
my $id= $eid->text;
# if the row hasn't been created in the merged doc, do it
if( ! $row_id{$id})
{ $row_id{$id}= $merged->root->insert_new_elt( last_child => 'row');
$row_id{$id}->insert_new_elt( last_child => id => $id);
}
# move the data fields to the end of the row
foreach my $data_field ($eid->next_siblings)
{ $data_field->move( last_child => $row_id{$id}); }
}
}
return $merged->sprint;
}
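The same join-by-id idea can be sketched in Python with the standard library, for comparison. This is an illustration under assumptions, not a port of the Perl code: the function name merge_documents is made up here, the input is passed as strings, and every non-id child of a row is copied, whereas the Perl version moves only the siblings that follow <id>.

```python
import xml.etree.ElementTree as ET

DOC1 = ("<document><row><id>1</id><data_field1>aaaa</data_field1>"
        "<data_field2>bbbb</data_field2></row></document>")
DOC2 = ("<document><row><id>1</id><data_field3>cccc</data_field3>"
        "</row></document>")

def merge_documents(*xml_strings):
    """Merge <row> elements from several documents, joined on <id>."""
    merged = ET.Element("document")
    rows_by_id = {}  # id text -> <row> element in the merged document
    for xml_string in xml_strings:
        for row in ET.fromstring(xml_string).findall("row"):
            row_id = row.findtext("id")
            target = rows_by_id.get(row_id)
            if target is None:
                # First time this id is seen: create its row in the output.
                target = ET.SubElement(merged, "row")
                ET.SubElement(target, "id").text = row_id
                rows_by_id[row_id] = target
            # Append the data fields to the end of the merged row.
            for child in row:
                if child.tag != "id":
                    target.append(child)
    return ET.tostring(merged, encoding="unicode")

merged_xml = merge_documents(DOC1, DOC2)
print(merged_xml)
```

As in the Perl version, rows keep the order in which their id first appears, and all documents are held in memory.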

XPath matching attribute and content of an element

Can anyone help with the following XPath question? Given the node-set:
<table>
<rows>
<row>
<value column="Product">Coal</value>
<value column="Quantity">10000</value>
</row>
<row>
<value column="Product">Iron</value>
<value column="Quantity">5000</value>
</row>
<row>
<value column="Product">Ore</value>
<value column="Quantity">4000</value>
</row>
</rows>
</table>
I want to query to find the node sub-set with a given product name. Note that the product name is being supplied by an attribute of the current node being processed (i.e. "@name"). So when the @name attribute has the value of "Coal" I would expect this to be returned:
<row>
<value column="Product">Coal</value>
<value column="Quantity">10000</value>
</row>
This is what I've come up with; I know it's wrong, because I don't get anything back.
$table/rows/row[value[@column='Product'][text()=@name]]
You are missing the current() function:
$table/rows/row[value[@column='Product'] = current()/@name]
Within an XPath predicate (i.e. within square brackets), the context node is the node the predicate is applied to.
In your case, when you say $table/rows/row[x=@name], @name refers to the @name attribute of row. A row has no @name attribute, so the predicate evaluates to false for every node.
current() returns the current XSLT context node to help in exactly this case.
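The difference between the predicate's context node and the outer "current" node may be easier to see in procedural code. In this Python sketch (an analogy, not XSLT), the wanted name is read from the outer node once, before the rows are filtered, which is the role current()/@name plays in the expression above. The <item name="Coal"/> element and the function name rows_for_product are hypothetical.

```python
import xml.etree.ElementTree as ET

TABLE = ET.fromstring("""
<table>
  <rows>
    <row><value column="Product">Coal</value><value column="Quantity">10000</value></row>
    <row><value column="Product">Iron</value><value column="Quantity">5000</value></row>
    <row><value column="Product">Ore</value><value column="Quantity">4000</value></row>
  </rows>
</table>
""")

def rows_for_product(table, current_node):
    """Return rows whose Product value equals current_node's name attribute."""
    wanted = current_node.get("name")  # analogous to current()/@name
    if wanted is None:
        return []
    return [
        row for row in table.findall("rows/row")
        # Inside this test the "context" is the row, not current_node.
        if row.findtext("value[@column='Product']") == wanted
    ]

current = ET.Element("item", {"name": "Coal"})  # hypothetical outer node
matches = rows_for_product(TABLE, current)
print([value.text for value in matches[0].findall("value")])  # ['Coal', '10000']
```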
