Is it possible to associate a stylesheet with a document using Nokogiri, to create this structure?
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://www.my-site.com/sitemap.xsl"?>
<root>
...
</root>
OMG, there is so much fail here that I am breaking the unofficial policy of Team Nokogiri and am providing the correct, sane answer to this question:
require "nokogiri"
doc = Nokogiri::XML "<root>foo</root>"
doc.root.add_previous_sibling Nokogiri::XML::ProcessingInstruction.new(doc, "xml-stylesheet", 'type="text/xsl" href="foo.xsl"')
puts doc.to_xml
# => <?xml version="1.0"?>
# <?xml-stylesheet type="text/xsl" href="foo.xsl"?>
# <root>foo</root>
In the future, please ask questions about Nokogiri on the nokogiri-talk mailing list (http://groups.google.com/group/nokogiri-talk), get the correct answer in a timely fashion, and save everyone a little effort.
There is not.
The way I did it:
xml.gsub!("<?xml version=\"1.0\"?>") do |head|
result = head
result << "\n"
result << "<?xml-stylesheet type=\"text/xsl\" href=\"#{stylesheet}\"?>"
end
Cheers.
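The gsub! pattern above only matches the exact declaration <?xml version="1.0"?>, so it silently does nothing when the declaration carries an encoding attribute. Here is a slightly more tolerant sketch of the same string-based idea. It assumes the serialized document starts with an XML declaration and that href needs no escaping; add_stylesheet is a hypothetical helper name, not part of Nokogiri.

```ruby
# Hypothetical helper: inserts an xml-stylesheet processing instruction
# right after the XML declaration of an already serialized document.
# Assumption: href contains no characters that need escaping.
def add_stylesheet(xml, href)
  pi = %(<?xml-stylesheet type="text/xsl" href="#{href}"?>)
  # Match any leading declaration, with or without an encoding attribute.
  # If there is no declaration, the string is returned unchanged.
  xml.sub(/\A<\?xml\s[^?]*\?>/) { |decl| "#{decl}\n#{pi}" }
end

source = %(<?xml version="1.0" encoding="UTF-8"?>\n<root>foo</root>\n)
result = add_stylesheet(source, "foo.xsl")
puts result
```

Building the processing instruction as a node, as in the Nokogiri answer above, avoids string matching altogether and is probably the safer route when the document is already parsed.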
I need to compare two XML files and display the differences in an HTML report. To do this, I installed the Ruby gem Diffy (and the gems rspec and diff-lcs, as directed by the Diffy documentation), but it does not seem to be working properly: no differences are returned.
I have two XML files I want to compare.
XML file one:
<?xml version="1.0" encoding="UTF-8"?>
<SourceDetails>
<Origin>Origin</Origin>
<Identifier>Identifier</Identifier>
<Timestamp>2001-12-31T12:00:00</Timestamp>
</SourceDetails>
<AsOfDate>2001-01-01</AsOfDate>
<Instrument>
<ASXExchangeSecurityIdentifier>ASX</ASXExchangeSecurityIdentifier>
</Instrument>
<Rate>0.0</Rate>
XML file two:
<?xml version="1.0" encoding="UTF-8"?>
<SourceDetails>
<Origin>FEED</Origin>
<Identifier>IR</Identifier>
<Timestamp>2017-01-01T02:11:01Z</Timestamp>
</SourceDetails>
<AsOfDate>2017-01-02</AsOfDate>
<Instrument>
<CommonCode>GB0</CommonCode>
</Instrument>
<Rate>0.69</Rate>
When I supply the two XML files as arguments to Diffy:
puts Diffy::Diff.new('xmldoc1', 'xmldoc2', :source => 'files').to_s(:html)
no differences are returned. When I store the contents of the two XML files in String variables and pass those variables to Diffy:
puts Diffy::Diff.new(doc1, doc2, :include_plus_and_minus_in_html => true).to_s(:html)
again no differences are returned. To figure out whether my XML was causing the problem, I also tried passing two different strings to Diffy:
puts Diffy::Diff.new("Hello how are you", "Hello how are you\nI'm fine\nThat's great\n")
but this also returned nothing, even though there are clear differences.
Does anyone know what the problem may be?
I have a sample XML file (let's call it example.xml for the sake of this question) and want to turn it into a Nokogiri object.
According to documentation and lots of other online sources, this should work:
xml = Nokogiri::XML(File.read("example.txt"))
But the value of xml.to_xml is only:
"<?xml version=\"1.0\"?>\n"
In other words, it's ignoring the rest of the file. There are many tags afterwards and none of them are in the xml object.
How do I get Nokogiri to get all the tags?
Here's the XML I'm using:
<? xml version="1.0" encoding="UTF-8" ?>
<Document>
<Test>Test</Test>
</Document>
It looks like you are trying to parse an invalid XML doc.
This can be fixed by removing the spaces in the XML declaration:
<?xml version="1.0" encoding="UTF-8"?>
<Document>
<Test>Test</Test>
</Document>
How I figured this out
By default, when Nokogiri has errors parsing a document it populates an errors array.
xml = Nokogiri::XML(File.read("example.txt"))
p xml.errors
# => [#<Nokogiri::XML::SyntaxError: xmlParsePI : no target name>, #<Nokogiri::XML::SyntaxError: Start tag expected, '<' not found>]
You can also configure Nokogiri to raise an exception if it has parsing errors:
xml = Nokogiri::XML(File.read("example.txt")) do |config|
config.strict
end
Both of these cases show that there were issues parsing the document.
I am using Ruby 1.9.3 with Rails 3.1. I need to parse a file like the one below. When I open it in a browser, the tags are not laid out in order, and the data after <item> is run together. This declaration is present:
<?xml version="1.0" encoding="utf-8"?>
When I open the file in Sublime Text, it shows this after the <item>:
<![CDATA[<?xml version="1.0" encoding="utf-8"?>
and after the </item> there is a ]]>. The data that needs to be parsed is inside this <item></item>. Nokogiri's parse_file only calls start_element and end_element. When we edit the file manually and remove the statements above, it also calls the characters method to fetch the data. Below is an example. Is there any other way?
<batch transactionType="HC"><item><?xml version="1.0" encoding="utf-8"?><C><CI><Ve>00501</Ve></CI></C></item></batch>
You can do it easily using "xml-simple". Assuming your XML file name is "test.xml", first install the gem:
gem install xml-simple
Then, you can try:
require "xmlsimple"
abc = XmlSimple.xml_in File.read("test.xml")
puts abc['item']
The output should be:
{"C"=>[{"CI"=>[{"Ve"=>["00501"]}]}]}
This is probably an XML namespace newbie question, but I can't figure out how to get an XPath to work with the following truncated XML with this particular root element:
<?xml version="1.0" encoding="UTF-8"?>
<CreateOrUpdateEventsRequest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://dhamma.org" version="3-0-0">
<LanguageKey>
<IsoCode>en</IsoCode>
</LanguageKey>
<Publish>
<Value>true</Value>
</Publish>
<Events>
<Event>
<EventKey>
<LocationKey>
<SubDomain>rasmi</SubDomain>
</LocationKey>
<EventId>10DayPDFStdTag</EventId>
</EventKey>
</Event>
</Events>
</LanguageKey>
</CreateOrUpdateEventsRequest>
Using Ruby and Nokogiri (with a just updated libxml2), it works fine with XPath only if I delete all the extra info in the root element, making it:
<CreateOrUpdateEventsRequest>
Otherwise nothing works:
$> @doc.xpath("//CreateOrUpdateEventsRequest") #=> [] with original header, an array of nodes with modified header
$> @doc.xpath("//LanguageKey") #=> [] with the original header, an array of nodes with modified header
$> @doc.xpath("//xmlns:LanguageKey") #=> undefined namespace prefix with the original
How do I address namespaces like this with XPath?
Many thanks for the help.
The answer seems to be that the XML re-declared XMLNS when it should have declared the namespace with a prefix as in xmlns:myns.
From www.w3.org:
The XML specification reserves all names beginning with the letters 'x', 'm', 'l' in any combination of upper- and lower-case for use by the W3C. To date three such names have been given definitions—although these names are not in the XML namespace, they are listed here as a convenience to readers and users:
xml: See http://www.w3.org/TR/xml/#NT-XMLDecl and http://www.w3.org/TR/xml-names/#xmlReserved
xmlns: See http://www.w3.org/TR/xml-names/#ns-decl
xml-stylesheet: See The xml-stylesheet processing instruction
I don't use Nokogiri nor Ruby,
but you need to register a prefix for namespace http://dhamma.org
When I read http://nokogiri.org/tutorials/searching_a_xml_html_document.html
I understand you must do something like
$> @doc.xpath('//dha:LanguageKey', 'dha' => 'http://dhamma.org')
Here's some code to consider. Starting with code to create a Nokogiri::XML::Document:
require 'nokogiri'
XML = <<EOT
<?xml version="1.0" encoding="UTF-8"?>
<CreateOrUpdateEventsRequest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://dhamma.org" version="3-0-0">
<LanguageKey>
<IsoCode>en</IsoCode>
</LanguageKey>
<Publish>
<Value>true</Value>
</Publish>
<Events>
<Event>
<EventKey>
<LocationKey>
<SubDomain>rasmi</SubDomain>
</LocationKey>
<EventId>10DayPDFStdTag</EventId>
</EventKey>
</Event>
</Events>
</LanguageKey>
</CreateOrUpdateEventsRequest>
EOT
doc = Nokogiri::XML(XML)
Here's the root node's name:
doc.root.name # => "CreateOrUpdateEventsRequest"
The docs say:
When using CSS, if the namespace is called “xmlns”, you can even omit the namespace name.
doc.at('CreateOrUpdateEventsRequest').name # => "CreateOrUpdateEventsRequest"
doc.at('LanguageKey').to_xml # => "<LanguageKey>\n <IsoCode>en</IsoCode>\n </LanguageKey>"
Using XPath, we can specify the default namespace as:
doc.at('//xmlns:LanguageKey').to_xml # => "<LanguageKey>\n <IsoCode>en</IsoCode>\n </LanguageKey>"
Sometimes, if there are a lot of namespaces it makes sense to use collect_namespaces and pass them in:
name_spaces = doc.collect_namespaces # =>
doc.at('//xmlns:LanguageKey', name_spaces).to_xml # => "<LanguageKey>\n <IsoCode>en</IsoCode>\n </LanguageKey>"
You'll need to look through the documentation for Nokogiri::XML::Node for more information on the various methods.
I recommend using CSS selectors for simplicity and readability over XPath, as a first try. I think XPath has more functionality but it makes my eyes bug out sometimes, so I prefer CSS.
I have XML like the one shown below. I am trying to obtain the value of cardNumber using the following expression.
XPATH :
paymentService/ns0:submit/ns0:order/ns0:paymentDetails/ns0:VISA-SSL/cardNumber
But it's giving me an error. Can anyone guide me on this?
<?xml version="1.0" encoding="UTF-8"?>
<paymentService version="1.0">
<ns0:submit xmlns:ns0="http://www.tibco.com/ns/no_namespace_schema_location/Payment/PaymentProcessors/WorldPay_CC/SharedResources/Schemas/paymentService_v1.dtd">
<ns0:order>
<description>description</description>
<amount value="500" currencyCode="EUR" exponent="2"/>
<ns0:paymentDetails>
<ns0:VISA-SSL>
<cardNumber>00009875083428500</cardNumber>
<expiryDate>
<date month="02" year="2008"/>
</expiryDate>
<cardHolderName>test</cardHolderName>
</ns0:VISA-SSL>
<session shopperIPAddress="192.165.22.35" id=""/>
</ns0:paymentDetails>
<shopper>
<browser>
<acceptHeader>text/html</acceptHeader>
<userAgentHeader>mozilla 5.0</userAgentHeader>
</browser>
</shopper>
</ns0:order>
</ns0:submit>
</paymentService>
Thanks
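The ns0 prefix is declared on <ns0:submit> itself rather than on the root element. The XML is still well-formed, since a namespace declaration is in scope for the element that carries it, but the XPath evaluator does not know the ns0 prefix when it starts at the root, hence the error.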
Edit:
If you need to use this XPath in PHP, you'll either have to declare the namespace before use:
<?xml version="1.0" encoding="UTF-8"?>
<paymentService version="1.0" xmlns:ns0="http://www.tibco.com/ns/no_namespace_schema_location/Payment/PaymentProcessors/WorldPay_CC/SharedResources/Schemas/paymentService_v1.dtd">
<ns0:submit>
Or register the namespace before evaluating your XPath (thanks to @MiMo for pointing this out):
$xml->registerXPathNamespace("ns0","http://www.tibco.com/ns/no_namespace_schema_location/Payment/PaymentProcessors/WorldPay_CC/SharedResources/Schemas/paymentService_v1.dtd");
Also add a slash before your XPATH:
/paymentService/ns0:submit/ns0:order/ns0:paymentDetails/ns0:VISA-SSL/cardNumber
The problem is the node
<paymentService version="1.0">
Since this is not closed, you have to comment it out or close it properly.
If you comment it out, try this XPath:
/ns0:submit/ns0:order/ns0:paymentDetails/ns0:VISA-SSL/cardNumber
You need to register the namespace before evaluating the XPath:
$xml->registerXPathNamespace('ns0', 'http://www.tibco.com/ns/no_namespace_schema_location/Payment/PaymentProcessors/WorldPay_CC/SharedResources/Schemas/paymentService_v1.dtd');
where $xml is a SimpleXMLElement variable containing your XML.
Your XPath needs to start with /, as per Passerby's answer:
/paymentService/ns0:submit/ns0:order/ns0:paymentDetails/ns0:VISA-SSL/cardNumber