Given the hpricot xml at the bottom of this post, how would I select the "item" without having to use .each? Every single piece of documentation uses a variation of
@res.items.each do |item|
# do stuff
end
Which is pointless in this case because there is only ever one "item". I've been tearing my hair out for ages trying to get this right.
Edited to add more information:
Ok so judging from the early comments, I'm missing the point somewhere so I'll provide more information.
I'm using a ruby gem called amazon-ecs to retrieve product information from Amazon. On the gem's site it is described as
A generic Ruby Amazon Product Advertising API (previously known as E-commerce REST API) using Hpricot. It uses Response and Element wrapper classes for easy access to the REST API XML output. It is generic, so you can extend Amazon::Ecs to support the other not-implemented operations easily; and the response object just wraps around Hpricot element object, instead of providing one-to-one object/attributes to XML elements map.
Now, to be honest, I don't really understand what that means, but I suspect the bit about the wrapping Response object is what's making this difficult!
Basically, when I do this:
@res = Amazon::Ecs.item_lookup(ean, options_hash)
and then I print out "debug @res", I get what I have below.
Hope that helps!
End edit
Hpricot xml:
<Amazon::Ecs::Response:0xa4449cc @doc=#<Hpricot::Doc
{xmldecl "<?xml version=\"1.0\" ?>"}
{elem <itemlookupresponse xmlns="http://webservices.amazon.com/AWSECommerceService/2005-10-05">
{elem <operationrequest>
{elem <httpheaders>
{emptyelem <header name="UserAgent" value="Ruby">}
</HTTPHeaders>}
{elem <requestid> "b89bad91-f5a1-4daf-87f2-d309dded35d6" </RequestId>}
{elem <arguments>
{emptyelem <argument name="Operation" value="ItemLookup">}
{emptyelem <argument name="SearchIndex" value="Books">}
{emptyelem <argument name="Signature" value="dasdasdsadsadsafdfdsfsdsasadsadsd">}
{emptyelem <argument name="ItemId" value="9780307463746">}
{emptyelem <argument name="IdType" value="ISBN">}
{emptyelem <argument name="AWSAccessKeyId" value="sdasdsadsadsadsadsadd">}
{emptyelem <argument name="Timestamp" value="2011-02-17T15:08:09Z">}
{emptyelem <argument name="Service" value="AWSECommerceService">}
</Arguments>}
{elem <requestprocessingtime> "0.0252220000000000" </RequestProcessingTime>}
</OperationRequest>}
{elem <items>
{elem <request>
{elem <isvalid> "True" </IsValid>}
{elem <itemlookuprequest>
{elem <condition> "New" </Condition>}
{elem <deliverymethod> "Ship" </DeliveryMethod>}
{elem <idtype> "ISBN" </IdType>}
{elem <merchantid> "Amazon" </MerchantId>}
{elem <offerpage> "1" </OfferPage>}
{elem <itemid> "9780307463746" </ItemId>}
{elem <responsegroup> "Small" </ResponseGroup>}
{elem <reviewpage> "1" </ReviewPage>}
{elem <searchindex> "Books" </SearchIndex>}
</ItemLookupRequest>}
</Request>}
{elem <item>
{elem <asin> "0307463745" </ASIN>}
{elem <detailpageurl> "http://www.amazon.com/Rework-Jason-Fried/dp/0307463745%3FSubscriptionId%3DAKIAIV6GP6CJC3AINUUQ%26tag%3Dws%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D0307463745" </DetailPageURL>}
{elem <smallimage>
{elem <url> "http://ecx.images-amazon.com/images/I/41Qz6afdrLL._SL75_.jpg" </URL>}
{elem <height units="pixels"> "75" </Height>}
{elem <width units="pixels"> "50" </Width>}
</SmallImage>}
{elem <mediumimage>
{elem <url> "http://ecx.images-amazon.com/images/I/41Qz6afdrLL._SL160_.jpg" </URL>}
{elem <height units="pixels"> "160" </Height>}
{elem <width units="pixels"> "106" </Width>}
</MediumImage>} {elem <largeimage> {elem <url> "http://ecx.images-amazon.com/images/I/41Qz6afdrLL.jpg" </URL>} {elem <height units="pixels"> "500" </Height>} {elem <width units="pixels"> "331" </Width>} </LargeImage>} {elem <imagesets> {elem <imageset category="primary"> {elem <swatchimage> {elem <url> "http://ecx.images-amazon.com/images/I/41Qz6afdrLL._SL30_.jpg" </URL>} {elem <height units="pixels"> "30" </Height>} {elem <width units="pixels"> "20" </Width>} </SwatchImage>} {elem <smallimage> {elem <url> "http://ecx.images-amazon.com/images/I/41Qz6afdrLL._SL75_.jpg" </URL>} {elem <height units="pixels"> "75" </Height>} {elem <width units="pixels"> "50" </Width>} </SmallImage>} {elem <mediumimage> {elem <url> "http://ecx.images-amazon.com/images/I/41Qz6afdrLL._SL160_.jpg" </URL>} {elem <height units="pixels"> "160" </Height>} {elem <width units="pixels"> "106" </Width>} </MediumImage>} {elem <largeimage> {elem <url> "http://ecx.images-amazon.com/images/I/41Qz6afdrLL.jpg" </URL>} {elem <height units="pixels"> "500" </Height>} {elem <width units="pixels"> "331" </Width>} </LargeImage>} </ImageSet>} </ImageSets>}
{elem <itemattributes>
{elem <author> "Jason Fried" </Author>}
{elem <author> "David Heinemeier Hansson" </Author>}
{elem <manufacturer> "Crown Business" </Manufacturer>}
{elem <productgroup> "Book" </ProductGroup>}
{elem <title> "Rework" </Title>}
</ItemAttributes>}
</Item>}
</Items>}
</ItemLookupResponse>}
First, extract the Hpricot object from @res (from the docs).
doc = @res.doc
Then, you should be able to use the Hpricot object:
puts (doc/:item).inner_html
You could do something like this
item = (doc/:header).first
The above should get you the first header node in the XML document. It's not tested, so I would test it before relying on it.
I'm having a problem integrating the header mini cart into my custom theme. Below are the screenshots of the issue. Any help is much appreciated. Thanks in advance :)
Screenshots:
Below is the default.xml code.
<referenceContainer name="header.container">
<container name="header-wrapper" label="Page Header" as="header-wrapper" htmlTag="div" htmlClass="top-header">
<!-- top links with cart -->
<container name="topcartoptions" label="Top Cart Options" htmlTag="div" htmlClass="top-cart-options text-right" before="-">
<block class="Magento\Cms\Block\Block" name="block-top-links">
<arguments>
<argument name="block_id" xsi:type="string">block-top-links</argument>
</arguments>
</block>
</container>
<!-- top menu -->
<container name="mainnavigation" label="Main Navigation" htmlTag="div" htmlClass="main-navigation" after="topcartoptions">
<container name="main-navigation-container" label="Main Navigation" htmlTag="div" htmlClass="container">
<container name="main-navigation-container-row" label="Main Navigation" htmlTag="div" htmlClass="row">
<container name="main-nav-row-bootstrap-class" htmlTag="div" htmlClass="col-lg-12 col-md-12 col-sm-12 col-xs-12 text-center">
<block class="Magento\Cms\Block\Block" name="block-main-nav">
<arguments>
<argument name="block_id" xsi:type="string">block-main-nav</argument>
</arguments>
</block>
</container>
</container>
</container>
</container>
<block class="Magento\Checkout\Block\Cart\Sidebar" name="minicart" as="minicart" after="logo" template="cart/minicart.phtml">
<arguments>
<argument name="jsLayout" xsi:type="array">
<item name="types" xsi:type="array"/>
<item name="components" xsi:type="array">
<item name="minicart_content" xsi:type="array">
<item name="component" xsi:type="string">Magento_Checkout/js/view/minicart</item>
<item name="config" xsi:type="array">
<item name="template" xsi:type="string">Magento_Checkout/minicart/content</item>
</item>
<item name="children" xsi:type="array">
<item name="subtotal.container" xsi:type="array">
<item name="component" xsi:type="string">uiComponent</item>
<item name="config" xsi:type="array">
<item name="displayArea" xsi:type="string">subtotalContainer</item>
</item>
<item name="children" xsi:type="array">
<item name="subtotal" xsi:type="array">
<item name="component" xsi:type="string">uiComponent</item>
<item name="config" xsi:type="array">
<item name="template" xsi:type="string">Magento_Checkout/minicart/subtotal</item>
</item>
</item>
</item>
</item>
<item name="extra_info" xsi:type="array">
<item name="component" xsi:type="string">uiComponent</item>
<item name="config" xsi:type="array">
<item name="displayArea" xsi:type="string">extraInfo</item>
</item>
</item>
<item name="promotion" xsi:type="array">
<item name="component" xsi:type="string">uiComponent</item>
<item name="config" xsi:type="array">
<item name="displayArea" xsi:type="string">promotion</item>
</item>
</item>
</item>
</item>
</item>
</argument>
</arguments>
<container name="minicart.addons" label="Mini-cart promotion block"/>
</block>
</container>
</referenceContainer>
I have records in an XML file like the ones below. I need to search for <keyword>SEARCH</keyword> and, if it is present,
take the entire record (from <record> to </record>) and write it to another file.
Below is my awk code, which runs inside a loop. $1 holds each record's content one line at a time.
if(index($1,"SEARCH")>0)
{
print $1>> "output.txt"
}
This logic has two problems:
It writes only the <keyword>SEARCH</keyword> element to output.txt, not the whole record (from <record> to </record>).
SEARCH can also be present in the <detail> tag, and this code will write that tag to output.txt as well.
XML File:
<record category="xyz">
<person ssn="" e-i="E">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<names>
<first_name/>
<last_name></last_name>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>SEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is present in abc for xyz reason</detail>
</external_sources>
</details>
</record>
<record category="abc">
<person ssn="" e-i="F">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<names>
<first_name/>
<last_name></last_name>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>DONTSEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is not present in abc for xyz reason</detail>
</external_sources>
</details>
</record>
Use GNU awk for multi-char RS:
$ awk -v RS='</record>\n' '{ORS=RT} /<keyword>SEARCH<\/keyword>/' file
<record category="xyz">
<person ssn="" e-i="E">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<names>
<first_name/>
<last_name></last_name>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>SEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is present in abc for xyz reason</detail>
</external_sources>
</details>
</record>
If you need to search for any of multiple keywords then simply list them as such:
$ awk -v RS='</record>\n' '{ORS=RT} /<keyword>(SEARCH1|SEARCH2|SEARCH3)<\/keyword>/' file
$ cat x.awk
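/<record / { i=1 }
i { a[i++]=$0 }
/<\/record>/ {
# i-1 lines were buffered for this record. Using that count instead of
# length(a) avoids printing stale lines left over from a longer earlier record.
n=i-1
if (found) {
for (j=1; j<=n; ++j) print a[j] > "output.txt"
}
i=0
found=0
}
/<keyword>SEARCH<\/keyword>/ { found=1 }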
$ awk -f x.awk x.xml
$ cat output.txt
<record category="xyz">
<person ssn="" e-i="E">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<names>
<first_name/>
<last_name></last_name>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>SEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is present in abc for xyz reason</detail>
</external_sources>
</details>
</record>
You seem to have cross posted this question from Unix & Linux - I give the same answer here as I did there:
I'm going to assume that what you've posted is a sample, because it isn't valid XML. If this assumption isn't valid, my answer doesn't hold... but if that is the case, you really need to hit the person who gave you the XML with a rolled up copy of the XML spec, and demand they 'fix it'.
But really - awk and regular expressions are not the right tool for the job. An XML parser is. And with a parser, it's absurdly simple to do what you want:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
#parse your file - this will error if it's invalid.
my $twig = XML::Twig -> new -> parsefile ( 'your_xml' );
#set output format. Optional.
$twig -> set_pretty_print('indented_a');
#iterate all the 'record' nodes off the root.
foreach my $record ( $twig -> get_xpath ( './record' ) ) {
#if - beneath this record - we have a node anywhere (that's what // means)
#with a tag of 'keyword' and content of 'SEARCH'
#print the whole record.
if ( $record -> get_xpath ( './/keyword[string()="SEARCH"]' ) ) {
$record -> print;
}
}
xpath is quite a lot like regular expressions - in some ways - but it's more like a directory path. That means it's context aware, and can handle XML structures.
In the above: ./ means 'below current node' so:
$twig -> get_xpath ( './record' )
Means any 'top level' <record> tags.
But .// means "at any level, below current node" so it'll do it recursively.
$twig -> get_xpath ( './/search' )
Would get any <search> nodes at any level.
And the square brackets denote a condition: that's either a function (e.g. text() to get the text of the node) or an attribute test. For example, //category[@name] would find any category with a name attribute, and //category[@name="xyz"] would filter those further.
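For anyone who wants to try the same path ideas outside Perl, here is a minimal sketch using Python's standard xml.etree.ElementTree, which supports only a subset of XPath. The sample document is a simplified stand-in I made up for illustration: it leaves out the xsi:nil attributes because the xsi prefix is never declared in the posted data, and a namespace-aware parser would reject it.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the records in the question
# (hypothetical sample; xsi:nil attributes are omitted on purpose).
SAMPLE = """
<XML>
  <record category="xyz">
    <keywords><keyword>SEARCH</keyword></keywords>
    <detail>SEARCH is present in abc for xyz reason</detail>
  </record>
  <record category="abc">
    <keywords><keyword>DONTSEARCH</keyword></keywords>
    <detail>SEARCH is not present in abc for xyz reason</detail>
  </record>
</XML>
"""

root = ET.fromstring(SAMPLE)

# './record' : only <record> elements directly below the current node.
top_level = root.findall("./record")

# './/keyword' : <keyword> elements at any depth below the current node.
all_keywords = root.findall(".//keyword")

# "[@category='xyz']" : a condition on an attribute.
xyz_records = root.findall("./record[@category='xyz']")

# Records that contain a <keyword> whose text is exactly SEARCH.
# The <detail> text is never inspected, so it cannot cause a false match.
matching = [
    rec for rec in top_level
    if any(kw.text == "SEARCH" for kw in rec.iterfind(".//keyword"))
]

print([rec.get("category") for rec in matching])
```

The keyword test is written as a Python loop rather than a single path expression because ElementTree's predicate support is limited; a full XPath engine could express it in one step, as the Perl code does.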
XML used for testing:
<XML>
<record category="xyz">
<person ssn="" e-i="E">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<details>
<names>
<first_name/>
<last_name></last_name>
</names>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>SEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is present in abc for xyz reason</detail>
</external_sources>
</details>
</person>
</record>
<record category="abc">
<person ssn="" e-i="F">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<details>
<names>
<first_name/>
<last_name></last_name>
</names>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>DONTSEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is not present in abc for xyz reason</detail>
</external_sources>
</details>
</person>
</record>
</XML>
Output:
<record category="xyz">
<person
e-i="E"
ssn="">
<title xsi:nil="true" />
<position xsi:nil="true" />
<details>
<names>
<first_name/>
<last_name></last_name>
</names>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true" />
<keyword>SEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is present in abc for xyz reason</detail>
</external_sources>
</details>
</person>
</record>
Note: the above just prints each matching record to STDOUT. In my opinion that's not such a great idea, not least because it doesn't print the enclosing XML structure, so the output isn't well-formed XML if you have more than one record (there's no "root" node).
So instead, to accomplish exactly what you're asking, I would do this:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> new -> parsefile ('your_file.xml');
$twig -> set_pretty_print('indented_a');
foreach my $record ( $twig -> get_xpath ( './record' ) ) {
if ( not $record -> findnodes ( './/keyword[string()="SEARCH"]' ) ) {
$record -> delete;
}
}
open ( my $output, '>', "output.txt" ) or die $!;
print {$output} $twig -> sprint;
close ( $output );
This inverts the logic instead: it deletes the records you don't want from the parsed data structure in memory, then prints the whole remaining structure (including the XML headers) to a new file called "output.txt".
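The same "delete what you don't want, then write out the whole tree" approach can be sketched in Python with the standard library. This is only an illustration under my own assumptions: the function name keep_matching_records and the tiny sample document are hypothetical, and the sample is deliberately smaller than the real data.

```python
import xml.etree.ElementTree as ET


def keep_matching_records(xml_text, keyword):
    """Return the document as a string with non-matching <record>s removed."""
    root = ET.fromstring(xml_text)
    # findall() returns a list, so removing children inside the loop is safe.
    for record in root.findall("./record"):
        texts = [kw.text for kw in record.iterfind(".//keyword")]
        if keyword not in texts:
            root.remove(record)
    # Serialize the pruned tree, root element included, so the result
    # stays well-formed even when several records survive.
    return ET.tostring(root, encoding="unicode")


# Hypothetical, simplified sample data.
SAMPLE = (
    "<XML>"
    "<record category='xyz'><keyword>SEARCH</keyword></record>"
    "<record category='abc'><keyword>DONTSEARCH</keyword>"
    "<detail>SEARCH is not present</detail></record>"
    "</XML>"
)

result = keep_matching_records(SAMPLE, "SEARCH")
print(result)
```

Writing the returned string to output.txt is then an ordinary file write. The second record is dropped even though its <detail> text mentions SEARCH, because only <keyword> elements are compared.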
I have the following site.xml:-
<project xmlns="http://maven.apache.org/DECORATION/1.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/DECORATION/1.1.0 http://maven.apache.org/xsd/decoration-1.1.0.xsd">
<bannerLeft>
<name>Project Title</name>
<href>http://maven.apache.org/</href>
</bannerLeft>
<skin>
<groupId>org.apache.maven.skins</groupId>
<artifactId>maven-fluido-skin</artifactId>
<version>1.4</version>
</skin>
<custom>
<fluidoSkin>
<sourceLineNumbersEnabled>true</sourceLineNumbersEnabled>
<breadcrumbDivider>»</breadcrumbDivider>
</fluidoSkin>
</custom>
<body>
<links>
<item name="Link 1" href="#"/>
<item name="Link 2" href="#"/>
<item name="Link 3" href="#"/>
</links>
<breadcrumbs>
<item name="Crumb 1" href="#"/>
<item name="Crumb 2" href="#"/>
<item name="Crumb 3" href="#"/>
</breadcrumbs>
<menu ref="reports"/>
</body>
</project>
... and mvn site generates the following look and feel:-
How do I move the "Last Published" and "Version" to the right side, just like what I see from Apache Maven Fluido Skin itself?
Thank you.
You can add the following tags as children of <project>:
<publishDate position="right"/>
<version position="right"/>
The documentation for the things which are in the site descriptor can be found here:
https://maven.apache.org/plugins/maven-site-plugin/examples/sitedescriptor.html
There is this XSLT code in one of my projects
<xsl:template match="strategic-objectives/list/item" mode="table-of-contents">
<div class="toc">
<h3 class="report">Strategic objective: <xsl:value-of select="title"/></h3>
<xsl:variable name="flex_value" select="flex_value"/>
<xsl:apply-templates select="//root/data/operational-outcomes/list/item[sf = $flex_value]" mode="table-of-contents"/>
</div>
</xsl:template>
However much I try, the <operational-outcomes> items are not matched by sf = $flex_value, although sf = '020000' does match.
I have checked that <flex_value> is correct in the XML for the <strategic-objectives> item. It does in fact have 020000 as one of its values.
Data:
<data xsql-timing="3131">
<time>20 of April, 2014 (14:22) </time>
<strategic-objectives>
<list type="strategic_objective" xsql-timing="81">
<item num="1">
<context>PB08V6 </context>
<flex_value_set_name>ILO_AFF_SF </flex_value_set_name>
<attribute5>10 </attribute5>
<flex_value>010000 </flex_value>
<hierarchy_level>1 </hierarchy_level>
<title>Policy Making </title>
</item>
<item num="4">
<context>PB08V6 </context>
<flex_value_set_name>ILO_AFF_SF </flex_value_set_name>
<attribute5>10 </attribute5>
<flex_value>200000 </flex_value>
<hierarchy_level>1 </hierarchy_level>
<title>Employment </title>
<description> </description>
<text> </text>
</item>
</list>
</strategic-objectives>
<operational-outcomes>
<list type="outcome" xsql-timing="477">
<item num="9">
<context>PB08V6 </context>
<flex_value_set_name>ILO_AFF_SF </flex_value_set_name>
<flex_value>220025 </flex_value>
<parent_flex_value>220000 </parent_flex_value>
<hierarchy_level>3 </hierarchy_level>
<status>10 </status>
<attribute5>40 </attribute5>
<title>REVISED - Policies for growth, employment and poverty reduction </title>
<description></description>
<text> </text>
<sf>200000 </sf>
</item>
<item num="10">
<context>PB08V6 </context>
<flex_value_set_name>ILO_AFF_SF </flex_value_set_name>
<flex_value>740050 </flex_value>
<parent_flex_value>740000 </parent_flex_value>
<hierarchy_level>3 </hierarchy_level>
<status>10 </status>
<attribute5>40 </attribute5>
<title>DELETED - Internal Administration and Security </title>
<sf>700000 </sf>
</item>
</list>
</operational-outcomes>
</data>
Looking at your XML source, there's a white-space problem:
<flex_value>200000 </flex_value>
Every character matters, 200000␣␣␣␣ is not the same as 200000.
normalize-space() strips leading and trailing whitespace, and collapses each run of whitespace characters within the string into a single space. Example:
<xsl:variable name="flex_value" select="normalize-space(flex_value)"/>
␣␣foo␣␣␣bar␣␣ becomes foo␣bar.
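To see why the raw comparison fails and the normalized one succeeds, here is a small Python sketch that approximates what normalize-space() does. The helper name and the two padded sample strings are my own assumptions for illustration; the exact amount of padding in the real data may differ.

```python
import re


def normalize_space(text):
    """Approximate XPath normalize-space() for a Python string."""
    # XPath whitespace is space, tab, carriage return and line feed.
    collapsed = re.sub(r"[ \t\r\n]+", " ", text)
    return collapsed.strip(" ")


# Illustrative values: same digits, different trailing padding.
flex_value = "200000    "
sf = "200000 "

raw_equal = flex_value == sf
normalized_equal = normalize_space(flex_value) == normalize_space(sf)

print(raw_equal)
print(normalized_equal)
print(repr(normalize_space("  foo   bar  ")))
```

If both sides can carry padding, it is probably safest to normalize both of them in the stylesheet predicate as well, not just the variable.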
This is the structure of the file I need to import.
<channel>
<item>
<type>image</type>
<title>title image</title>
<id>1</id>
<image_url>url_to_image</image_url>
</item>
<item>
<type>page</type>
<title>node title</title>
<id>2</id>
<ref>
<entity>image_ref</entity>
<ref_value>1</ref_value>
</ref>
<ref>
<entity>category</entity>
<ref_value>5</ref_value>
</ref>
</item>
</channel>
In the page item, the <ref_value> tag contains the id of the image item.
How do I add the image url from the image item to the page item?
I'm trying to use
/channel/item[id=ref/ref_value[../entity/text() = 'image_ref']]/image_url but it does not work...
What's the XPath expression to not import the image item but just the page item?
Thanks in advance
Use:
/*/item[type='image' and id=../item[type='page']
/ref[entity = 'image_ref']/ref_value]
/image_url/text()
XSLT - based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/item[type='image' and id=../item[type='page']
/ref[entity = 'image_ref']/ref_value]
/image_url/text()"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<channel>
<item>
<type>image</type>
<title>title image</title>
<id>1</id>
<image_url>url_to_image</image_url>
</item>
<item>
<type>page</type>
<title>node title</title>
<id>2</id>
<ref>
<entity>image_ref</entity>
<ref_value>1</ref_value>
</ref>
<ref>
<entity>category</entity>
<ref_value>5</ref_value>
</ref>
</item>
</channel>
the XPath expression is evaluated and the result of this evaluation is copied to the output:
url_to_image
Update:
The OP has implied in comments that there may be many "page items" and "image items" and that he needs an expression, getting the image url for only a specific page.
This XPath expression:
/*/item[type='image'
and id=../item[type='page'][1]
/ref[entity = 'image_ref']/ref_value
]
/image_url/text()
produces the wanted image url for the first "page item" in the following XML document:
<channel>
<item>
<type>image</type>
<title>title image</title>
<id>1</id>
<image_url>url_to_image</image_url>
</item>
<item>
<type>image</type>
<title>title image</title>
<id>2</id>
<image_url>url2_to_image</image_url>
</item>
<item>
<type>page</type>
<title>node title</title>
<id>3</id>
<ref>
<entity>image_ref</entity>
<ref_value>1</ref_value>
</ref>
<ref>
<entity>category</entity>
<ref_value>5</ref_value>
</ref>
</item>
<item>
<type>page</type>
<title>node title</title>
<id>4</id>
<ref>
<entity>image_ref</entity>
<ref_value>2</ref_value>
</ref>
<ref>
<entity>category</entity>
<ref_value>5</ref_value>
</ref>
</item>
</channel>
The result produced is:
url_to_image
To get the wanted url for the second page item, we simply modify the above XPath expression to:
/*/item[type='image'
and id=../item[type='page'][2]
/ref[entity = 'image_ref']/ref_value
]
/image_url/text()
and now the result is:
url2_to_image
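If the lookup has to run outside an XSLT processor, the same two-step resolution (find the n-th page item, then the image item whose id equals its image_ref value) can be sketched with Python's standard xml.etree.ElementTree. The function name image_url_for_page is hypothetical, and the document is a trimmed copy of the one above with the <title> elements left out.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the two-page sample document.
CHANNEL = """
<channel>
  <item><type>image</type><id>1</id><image_url>url_to_image</image_url></item>
  <item><type>image</type><id>2</id><image_url>url2_to_image</image_url></item>
  <item><type>page</type><id>3</id>
    <ref><entity>image_ref</entity><ref_value>1</ref_value></ref>
    <ref><entity>category</entity><ref_value>5</ref_value></ref>
  </item>
  <item><type>page</type><id>4</id>
    <ref><entity>image_ref</entity><ref_value>2</ref_value></ref>
    <ref><entity>category</entity><ref_value>5</ref_value></ref>
  </item>
</channel>
"""


def image_url_for_page(root, page_number):
    """Return the image URL referenced by the n-th page item (1-based), or None."""
    pages = root.findall("./item[type='page']")
    if not 1 <= page_number <= len(pages):
        return None
    page = pages[page_number - 1]
    # The <ref> whose <entity> is image_ref holds the image item's id.
    ref_value = page.findtext("./ref[entity='image_ref']/ref_value")
    if ref_value is None:
        return None
    for image in root.findall("./item[type='image']"):
        if image.findtext("id") == ref_value:
            return image.findtext("image_url")
    return None


root = ET.fromstring(CHANNEL)
print(image_url_for_page(root, 1))
print(image_url_for_page(root, 2))
```

The join is done in a Python loop because ElementTree cannot compare two node sets inside one predicate the way the XPath 1.0 expression does.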