YQL Losing HTML Element Attributes? - yahoo

Query:
select * from html where url='http://www.cbs.com/shows/big_brother/video/' and xpath='//div[@id="cbs-video-metadata-wrapper"]/div[@class="cbs-video-share"]/a'
Returns:
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="1" yahoo:created="2011-07-09T23:14:02Z" yahoo:lang="en-US">
<diagnostics>
<publiclyCallable>true</publiclyCallable>
<url execution-time="146" proxy="DEFAULT"><![CDATA[http://www.cbs.com/shows/big_brother/video/]]></url>
<user-time>163</user-time>
<service-time>146</service-time>
<build-version>19262</build-version>
</diagnostics>
<results>
<a class="twitter-share-button" href="http://twitter.com/share"/>
</results>
</query>
Should Return Something Similar To:
<results>
</results>
If I back out the query one level, it totally strips out the element, which I could also use to get the data I need.

We have a new html parser that recognizes custom attributes now.
Add compat="html5" to trigger the new parser.
e.g.:
select * from html where url = "http://mydomain.com" and compat="html5"


Magento 2 How to set custom data / option in a quote item?

I would like to add some data into the quote item, not product.
my approach now is
$quoteItems = $this->cart->getItems();
foreach ($quoteItems as $eachQuoteItem) {
$eachQuoteItem->setCustomname('aaaa');
$eachQuoteItem->setIsSuperMode(true);
$eachQuoteItem->save();
};
I can use $eachQuoteItem->getCustomname() to get 'aaaa' back on the same page, but I cannot get the data in any other request.
Any suggestions?
Thanks
The other answers address converting quote items to order items, but it sounds like you're asking how to set the data on the quote item in the first place.
You can do this as follows:
a) get the items from the quote using getAllVisibleItems();
b) call setData('field', val) on each item;
c) set the updated items on the quote using setItems(items);
d) save the quote.
It's late, but you may need to create a plugin for this case, as suggested here.
You can save a custom field value in the quote_item table as follows.
Create ModuleName/CompanyName/etc/di.xml:
<?xml version="1.0"?>
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:framework:ObjectManager/etc/config.xsd">
<type name="Magento\Quote\Model\Quote\Item">
<plugin name="to_save_custom_field_to_quote_item" type="ModuleName\CompanyName\Plugin\ToQuoteItem" />
</type>
</config>
Create ModuleName\CompanyName\Plugin\ToQuoteItem.php
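<?php
namespace ModuleName\CompanyName\Plugin;

use Magento\Quote\Model\Quote\Item;

class ToQuoteItem
{
    // "after" plugin on Item::beforeSave(): runs once the original beforeSave() has finished.
    public function afterBeforeSave(Item $subject, $result)
    {
        $subject->setCustomField("YOUR_VALUE");
        // An after plugin must return the result of the original method.
        return $result;
    }
}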

name of node when you know an attribute using path?

I have some XML where I know an attribute (in my case an ID#). I can get the node I'm looking for using //*[@id='v6969482']. But is there a way to get the name of the element that carries this id? (I'm trying to have it return 'title' in my case.) I know it has to do with name(), but I can't find the right syntax for returning the name when I have the id attribute.
<?xml version="1.0" encoding="UTF-8"?>
<topic id="v6969481">
<title id="v6969482">CR - ASE | AXX2500>Engines>EIOA>EIOAn>GMACn>Ingress</title>
<body id="v6969483">
<p id="v6969484">
<table id="v6153057" frame="all" colsep="1" rowsep="1">
<desc id="v6049915">Global ingress attributes for EIOA engine GMAC ports.</desc>
You need the name of the parent node of the attribute, its parent element:
name(//*[@id='v6969482'])
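For comparison, here is a minimal sketch of the same lookup outside a full XPath engine. It assumes Python's standard-library ElementTree, which has no name() function, so the element name is read from the tag property of the matched element instead. The XML is a shortened, well-formed stand-in for the excerpt in the question, and the text values are placeholders.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the document in the question.
XML = """<topic id="v6969481">
  <title id="v6969482">Ingress</title>
  <body id="v6969483">
    <p id="v6969484">Global ingress attributes.</p>
  </body>
</topic>"""

root = ET.fromstring(XML)

# Find the first descendant element whose id attribute matches.
match = root.find(".//*[@id='v6969482']")

# ElementTree exposes the element name as .tag (there is no name() function).
element_name = match.tag if match is not None else None
print(element_name)
```

Note that .//* searches only the descendants of root, so this sketch would not find an id placed on the root element itself.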

How to remove namespace from xml

I have XML in the following format:
<Body xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/" xmlns="http://schemas.xmlsoap.org/soap/envelope/">
<TransactionAcknowledgement xmlns="">
<TransactionId>HELLO </TransactionId>
<UserId>MC</UserId>
<SendingPartyType>SE</SendingPartyType>
</TransactionAcknowledgement>
</Body>
I want to use an XQuery or XPath expression for this.
I want to remove only the
xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/"
namespace declaration from the XML.
Is there any way to achieve this?
Thanks
Try to use functx:change-element-ns-deep:
let $xml := <Body xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/" xmlns="http://schemas.xmlsoap.org/soap/envelope/">
<TransactionAcknowledgement xmlns="">
<TransactionId>HELLO </TransactionId>
<UserId>MC</UserId>
<SendingPartyType>SE</SendingPartyType>
</TransactionAcknowledgement>
</Body>
return functx:change-element-ns-deep($xml, "http://schemas.xmlsoap.org/soap/envelope/", "")
But, as Dimitre Novatchev said, this function doesn't change the namespace of the source XML; it creates a new XML document.
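The "creates a new XML" point can be illustrated outside XQuery. The sketch below is not functx. It assumes Python's standard-library ElementTree, and without_namespace is a hypothetical helper name. It deep-copies the tree and drops the envelope namespace from element names in the copy, leaving the source tree untouched. This goes further than the question asks, because Body itself ends up in no namespace.

```python
import copy
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

SOURCE = """<Body xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/" xmlns="http://schemas.xmlsoap.org/soap/envelope/">
<TransactionAcknowledgement xmlns="">
<TransactionId>HELLO </TransactionId>
<UserId>MC</UserId>
<SendingPartyType>SE</SendingPartyType>
</TransactionAcknowledgement>
</Body>"""


def without_namespace(root, namespace):
    """Return a copy of the tree with `namespace` removed from element names."""
    new_root = copy.deepcopy(root)  # the source tree is never modified
    qualified_prefix = "{" + namespace + "}"
    for element in new_root.iter():
        # ElementTree stores namespaced names as "{uri}local".
        if element.tag.startswith(qualified_prefix):
            element.tag = element.tag[len(qualified_prefix):]
    return new_root


original = ET.fromstring(SOURCE)
stripped = without_namespace(original, SOAP_NS)
print(ET.tostring(stripped, encoding="unicode"))
```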

Parsing an XML file with Nokogiri?

<DataSet xmlns="http://www.atcomp.cz/webservices">
<xs:schema xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" id="file_mame">...</xs:schema>
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<alldata xmlns="">
<category diffgr:id="category1" msdata:rowOrder="0">
<category_code>P...</category_code>
<category_name>...</category_name>
<subcategory diffgr:id="subcategory1" msdata:rowOrder="0">
<category_code>...</category_code>
<subcategory_code>...</subcategory_code>
<subcategory_name>...</subcategory_name>
</subcategory>
....
How can I obtain all categories and subcategories data?
I am trying something like:
reader.xpath('//DataSet/diffgr:diffgram/alldata').each do |node|
But this gives me:
undefined method `xpath' for #<Nokogiri::XML::Reader:0x000001021d1750>
Nokogiri's Reader parser does not support XPath. Try using Nokogiri's in-memory Document parser instead.
On another note, to query namespaced elements with XPath, you need to provide a namespace mapping, like this:
doc = Nokogiri::XML(my_document_string_or_io)
namespaces = {
'default' => 'http://www.atcomp.cz/webservices',
'diffgr' => 'urn:schemas-microsoft-com:xml-diffgram-v1'
}
doc.xpath('//default:DataSet/diffgr:diffgram/alldata', namespaces).each do |node|
# ...
end
Or you can remove the namespaces:
doc.remove_namespaces!
doc.xpath('//DataSet/diffgram/alldata').each { |node| }
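The namespace-mapping idea is not specific to Nokogiri. As a rough sketch, the same traversal is shown here with Python's standard-library ElementTree instead of Ruby. The element values (C1, S1, and the names) are made-up placeholders, since the question elides them. Only diffgr needs a prefix in the mapping, because alldata and its children reset the default namespace with xmlns="".

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the document in the question; values are placeholders.
XML = """<DataSet xmlns="http://www.atcomp.cz/webservices">
  <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
                   xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <alldata xmlns="">
      <category diffgr:id="category1" msdata:rowOrder="0">
        <category_code>C1</category_code>
        <category_name>First category</category_name>
        <subcategory diffgr:id="subcategory1" msdata:rowOrder="0">
          <category_code>C1</category_code>
          <subcategory_code>S1</subcategory_code>
          <subcategory_name>First subcategory</subcategory_name>
        </subcategory>
      </category>
    </alldata>
  </diffgr:diffgram>
</DataSet>"""

# Prefix-to-URI mapping used when evaluating paths.
NAMESPACES = {"diffgr": "urn:schemas-microsoft-com:xml-diffgram-v1"}

root = ET.fromstring(XML)  # root is the DataSet element

categories = []
for category in root.findall("diffgr:diffgram/alldata/category", NAMESPACES):
    # Collect the subcategory names that belong to this category.
    subcategory_names = [
        sub.findtext("subcategory_name") for sub in category.findall("subcategory")
    ]
    categories.append((category.findtext("category_name"), subcategory_names))

print(categories)
```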

Use of text() function when using xPath in dom4j

I have inherited an application that parses xml using dom4j and xPath:
The xml being parsed is similar to the following:
<cache>
<content>
<transaction>
<page>
<widget name="PAGE_ID">WRK_REGISTRATION</widget>
<widget name="TRANS_DETAIL_ID">77145</widget>
<widget name="GRD_ERRORS" />
</page>
<page>
<widget name="PAGE_ID">WRK_REGISTRATION</widget>
<widget name="TRANS_DETAIL_ID">77147</widget>
<widget name="GRD_ERRORS" />
</page>
<page>
<widget name="PAGE_ID">WRK_PROCESSING</widget>
<widget name="TRANS_DETAIL_ID">77152</widget>
<widget name="GRD_ERRORS" />
</page>
</transaction>
</content>
</cache>
Individual Nodes are being searched using the following:
String xPathToGridErrorNode = "//cache/content/transaction/page/widget[@name='PAGE_ID'][text()='WRK_DNA_REGISTRATION']/../widget[@name='TRANS_DETAIL_ID'][text()='77147']/../widget[@name='GRD_ERRORS_TEMP']";
org.dom4j.Element root = null;
SAXReader reader = new SAXReader();
Document document = reader.read(new BufferedInputStream(new ByteArrayInputStream(xmlToParse.getBytes())));
root = document.getRootElement();
Node gridNode = root.selectSingleNode(xPathToGridErrorNode);
where xmlToParse is a String of xml similar to the excerpt provided above.
The code is trying to obtain the GRD_ERRORS node for the page with the PAGE_ID and TRANS_DETAIL_ID provided in the xPath.
I am seeing an intermittent (~1-2%) failure (returned node is null) of this selectSingleNode request even though the requested node is in the xml being searched.
I know there are some gotchas associated with using text()= in xPath and was wondering if there was a better way to format the xPath string for this type of search.
From your snippets, there is a mismatch between the XML and the XPath: GRD_ERRORS vs. GRD_ERRORS_TEMP and WRK_REGISTRATION vs. WRK_DNA_REGISTRATION.
Ignoring that, I would suggest rewriting
//cache/content/transaction/page
/widget[@name='PAGE_ID'][text()='WRK_DNA_REGISTRATION']
/../widget[@name='TRANS_DETAIL_ID'][text()='77147']
/../widget[@name='GRD_ERRORS_TEMP']
as
//cache/content/transaction/page
[widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']]
[widget[@name='TRANS_DETAIL_ID'][text()='77147']]
/widget[@name='GRD_ERRORS']
Just because it makes the code, in my eyes, easier to read, and expresses what you seem to mean more clearly: “the page element that has children with these conditions, and then take the widget with this @name.” Or, if that is closer to how you think about it,
//cache/content/transaction/page/widget[@name='GRD_ERRORS']
[preceding-sibling::widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']]
[preceding-sibling::widget[@name='TRANS_DETAIL_ID'][text()='77147']]
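The intent of the rewritten expression is to pick the page whose PAGE_ID and TRANS_DETAIL_ID widgets match and then take its GRD_ERRORS widget. That can also be written procedurally, which makes the matching rules explicit. The sketch below uses Python's standard-library ElementTree rather than dom4j, and find_grid_errors is a hypothetical helper name. It also trims whitespace before comparing, which is an assumption on my part and is looser than the exact text()= comparison in the XPath.

```python
import xml.etree.ElementTree as ET

XML = """<cache>
  <content>
    <transaction>
      <page>
        <widget name="PAGE_ID">WRK_REGISTRATION</widget>
        <widget name="TRANS_DETAIL_ID">77145</widget>
        <widget name="GRD_ERRORS" />
      </page>
      <page>
        <widget name="PAGE_ID">WRK_REGISTRATION</widget>
        <widget name="TRANS_DETAIL_ID">77147</widget>
        <widget name="GRD_ERRORS" />
      </page>
    </transaction>
  </content>
</cache>"""


def find_grid_errors(root, page_id, trans_detail_id):
    """Return the GRD_ERRORS widget of the page matching both ids, or None."""
    for page in root.findall("./content/transaction/page"):
        # Index this page's widgets by their name attribute.
        widgets = {w.get("name"): w for w in page.findall("widget")}

        def text_of(name):
            widget = widgets.get(name)
            if widget is None:
                return None
            # Assumption: surrounding whitespace is not significant.
            return (widget.text or "").strip()

        if text_of("PAGE_ID") == page_id and text_of("TRANS_DETAIL_ID") == trans_detail_id:
            return widgets.get("GRD_ERRORS")
    return None


root = ET.fromstring(XML)  # root is the cache element
node = find_grid_errors(root, "WRK_REGISTRATION", "77147")
print(node is not None)
```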
