How to resolve "RSpec::Expectations::ExpectationNotMetError: expected" error? - ruby

I am extracting a value from XML and checking whether that value exists in a PDF file.
The XML I have is:
<RealTimeLetter>
<Customer>
<RTLtr_Acct>0163426</RTLtr_Acct>
<RTLtr_CustomerName>LSIH JHTWVZ</RTLtr_CustomerName>
<RTLtr_CustomerAddress1>887 YPCLY THYZO SU</RTLtr_CustomerAddress1>
<RTLtr_CustomerAddress2 />
<RTLtr_CustomerCity>WOODSTOCK,</RTLtr_CustomerCity>
<RTLtr_CustomerState>GA</RTLtr_CustomerState>
<RTLtr_CustomerZip>30188</RTLtr_CustomerZip>
<RTLtr_ADAPreference>NONE</RTLtr_ADAPreference>
<RTLtr_Addressee>0</RTLtr_Addressee>
</Customer>
</RealTimeLetter>
The PDF file has the customer name and address:
LSIH JHTWVZ
887 YPCLY THYZO SU
WOODSTOCK, GA 30188
I am using the PDF Reader and Nokogiri gems to read the text from the PDF, extract the customer name from the XML, and check whether the PDF includes the customer name.
The PDF is parsed as:
require 'pdf_reader'
def parse_pdf
PDF::Reader.new(@filename)
end
@reader = file('C:\Users\ecz560\Desktop\30004_Standard.pdf').parse_pdf
require 'nokogiri'
@xml = Nokogiri::XML(File.open('C:\Users\ecz560\Desktop\30004Standard.xml'))
@CustName = @xml.xpath("//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName").map(&:text).to_s
page_index = 0
@reader.pages.each do |page|
page_index = page_index+1
if expect(page.text).to include @CustName
valid_text = "Given text is present in -- #{page_index}"
puts valid_text
end
end
But I am getting an error:
RSpec::Expectations::ExpectationNotMetError: expected "LSIH JHTWVZ\n 887 YPCLY THYZO SU\n WOODSTOCK, GA 30188\n Page 1 of 1" to include "[\"LSIH JHTWVZ\"]"
Diff:
@@ -1,2 +1,80 @@
-["LSIH JHTWVZ"]
+ LSIH JHTWVZ
+ 887 YPCLY THYZO SU
+ WOODSTOCK, GA 30188
./features/step_definitions/Letters/Test1_Letters.rb:372:in `block (2 levels) in <top (required)>'
./features/step_definitions/Letters/Test1_Letters.rb:370:in `each'
./features/step_definitions/Letters/Test1_Letters.rb:370:in `/^I validate the PDF content$/'
C:\Users\ecz560\Documents\GitHub\ATDD Local\features\FeatureFiles\Letters\Test1_Letters.feature:72:in `Then I validate the PDF content'
My understanding is that the issue is with the way I am comparing @CustName.
How do I resolve this?

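One thing I see is that your XPath selector isn't working. The sample XML has no RTLtr_Loancust element (the account number is in RTLtr_Acct), so the predicate never matches: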
//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName
Testing it:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<RealTimeLetter>
<Customer>
<RTLtr_Acct>0163426</RTLtr_Acct>
<RTLtr_CustomerName>LSIH JHTWVZ</RTLtr_CustomerName>
<RTLtr_CustomerAddress1>887 YPCLY THYZO SU</RTLtr_CustomerAddress1>
<RTLtr_CustomerAddress2 />
<RTLtr_CustomerCity>WOODSTOCK,</RTLtr_CustomerCity>
<RTLtr_CustomerState>GA</RTLtr_CustomerState>
<RTLtr_CustomerZip>30188</RTLtr_CustomerZip>
<RTLtr_ADAPreference>NONE</RTLtr_ADAPreference>
<RTLtr_Addressee>0</RTLtr_Addressee>
</Customer>
</RealTimeLetter>
EOT
doc.search("//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName").to_xml # => ""
Using a bit of a modification finds the <Customer> node:
doc.search('//Customer/RTLtr_Acct/text()[contains(., "0163426")]/../..').to_xml
# => "<Customer>\n <RTLtr_Acct>0163426</RTLtr_Acct>\n <RTLtr_CustomerName>LSIH JHTWVZ</RTLtr_CustomerName>\n <RTLtr_CustomerAddress1>887 YPCLY THYZO SU</RTLtr_CustomerAddress1>\n <RTLtr_CustomerAddress2/>\n <RTLtr_CustomerCity>WOODSTOCK,</RTLtr_CustomerCity>\n <RTLtr_CustomerState>GA</RTLtr_CustomerState>\n <RTLtr_CustomerZip>30188</RTLtr_CustomerZip>\n <RTLtr_ADAPreference>NONE</RTLtr_ADAPreference>\n <RTLtr_Addressee>0</RTLtr_Addressee>\n</Customer>"
At that point it's easy to grab content from elements inside <Customer>:
customer = doc.search('//Customer/RTLtr_Acct/text()[contains(., "0163426")]/../..')
customer.at('RTLtr_Acct').text # => "0163426"
customer.at('RTLtr_CustomerAddress1').text # => "887 YPCLY THYZO SU"

if expect(page.text).to include @CustName
expect is not used this way.
An expectation is used in testing, to verify that your code is working the way it should. It should not be used in normal code.
An expectation raises an exception and halts if it fails. It doesn't return true/false, so you can't continue after a failure: it raises the exception (correctly), as it did in your code, and everything after it stops.
What you probably want to do is just check the value like this:
if page.text.include?(@CustName)
(Note: I have not tested that against your code, so you may need to adjust it to fit.)
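As a rough sketch of the difference (plain Ruby, no RSpec; the variable names page_text, names and as_string are made up for illustration), String#include? returns true or false rather than raising, and calling to_s on an array produces the bracketed, quoted string that shows up in the error message:

```ruby
# Page text as it appears in the error message from the question.
page_text = "LSIH JHTWVZ\n 887 YPCLY THYZO SU\n WOODSTOCK, GA 30188\n Page 1 of 1"

# xpath(...).map(&:text) returns an Array of strings, not a single String.
names = ["LSIH JHTWVZ"]

# Array#to_s gives the inspect form, brackets and quotes included.
as_string = names.to_s
puts as_string                        # prints ["LSIH JHTWVZ"]

# String#include? returns a boolean instead of raising on a mismatch.
puts page_text.include?(as_string)    # prints false
puts page_text.include?(names.first)  # prints true
```

Taking a single element (names.first here) instead of calling to_s on the whole array gives a plain string that can actually appear in the page text.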

Related

Ruby Savon - Parse XML string

I have exhausted google on this subject and I just can't seem to get it right..
I have the following XML payload returned from Savon:
<?xml version='1.0' encoding='UTF-8'?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns:listGFUsersResponse xmlns:ns="http://ws.fds.com">
<ns:return>
<responseCode>0000</responseCode><responseDescription>No Errors-DWI</responseDescription><user><login>aa1283</login><name>Andrew Alonzo</name><team>DIALER</team><secLev>-1</secLev><maxDiscount>0.00</maxDiscount><phoneSystemId></phoneSystemId></user><user><login>aaronc</login><name>Aaron Callison</name><team></team><secLev>-1</secLev><maxDiscount>0.00</maxDiscount><phoneSystemId></phoneSystemId></user>
</ns:return>
</ns:listGFUsersResponse>
</soapenv:Body>
</soapenv:Envelope>
I would like to parse out ALL values of <name> * </name> and <login> * </login>
A few of my attempts here:
response1 = client1.call(
:list_gf_users,
message: message)
doc = Nokogiri::XML(response1.to_s)
pp doc
p doc.search('/name').text
p doc.search('/login').text
Nothing returned...
doc = Nokogiri::XML(response1.to_s)
value = doc.xpath('/name').map(&:text)
puts value
Nada....
doc = Nokogiri::XML(response1.to_s)
value = doc.xpath('/user[name]').map(&:text)
puts value
Zilch...
would love to be able to see:
name: Andrew Alonzo
login: aa1283
or even better a Hash?
{"aa1283" => "Andrew Alonzo"}
Getting 0 results such as:
""
[]
nil
Figured it out... probably not most efficient but gets the job done:
Convert the Savon response to a string (scan can't be used on the Savon output directly):
doc = response1.to_s
subFile = doc.gsub("&lt;", "<") # Replace the escaped characters
Run scan using regex capture groups:
@user = subFile.scan(/<user><login>(.+?)<\/login><name>(.*?)<\/name>.+?><\/user>/)
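For the Hash asked for in the question, the pairs returned by scan can be converted with Array#to_h. This is only a sketch on a trimmed, already-unescaped copy of the payload (the payload and users variables are illustrative), and a regex like this is fragile compared with a real XML parser:

```ruby
# A trimmed copy of the unescaped payload from the question.
payload = "<user><login>aa1283</login><name>Andrew Alonzo</name><team>DIALER</team></user>" \
          "<user><login>aaronc</login><name>Aaron Callison</name><team></team></user>"

# Each match is a [login, name] pair; Array#to_h turns the pairs into a Hash.
users = payload.scan(%r{<login>(.+?)</login><name>(.*?)</name>}).to_h

users.each { |login, name| puts "#{login}: #{name}" }
```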
In your comments you have
doc = response1.doc
which gives you a Nokogiri document. With that you should be able to do the following:
doc.xpath("//user").each do |user|
login = user.at("login")&.text
name = user.at("name")&.text
puts "#{login}: #{name}"
end
The output is
aa1283: Andrew Alonzo
aaronc: Aaron Callison
I used the XML from your comment:
<root>
<responseCode>0000</responseCode>
<responseDescription>No Errors-DWI</responseDescription>
<user>
<login>aa1283</login>
<name>Andrew Alonzo</name>
<team>DIALER</team>
<secLev>-1</secLev>
<maxDiscount>0.00</maxDiscount>
<phoneSystemId></phoneSystemId>
</user>
<user>
<login>aaronc</login>
<name>Aaron Callison</name>
<team></team>
<secLev>-1</secLev>
<maxDiscount>0.00</maxDiscount>
<phoneSystemId></phoneSystemId>
</user>
</root>
Note that I had to convert this to plaintext. You have some non-printing unicode characters sprinkled throughout the document in seemingly random places (which makes me wonder if that's actually the cause of your problems).

A JSON text must at least contain two octets! (JSON::ParserError)

I'm working with a Ruby script that reads a .json file.
Here is the JSON file:
{
"feed.xml": "93d5b140dd2b4779edef0347ac835fb1",
"index.html": "1cbe25936e392161bad6074d65acdd91",
"md5.json": "655d7c1dbf83a271f348a50a44ba4f6a",
"test.sh": "9be192b1b5a9978cb3623737156445fd",
"index.html": "c064e204040cde216d494776fdcfb68f",
"main.css": "21b13d87db2186d22720e8c881a78580",
"welcome-to-jekyll.html": "01d7c7d66bdeecd9cd69feb5b4b4184d"
}
It is completely valid, and is checked for its existence before trying to read from it. Example:
if File.file?("md5.json")
puts "MD5s exists"
mddigests = File.open("md5.json", "r")
puts "MD5s" + mddigests.read
items = JSON.parse(mddigests.read) # <--- Where it all goes wrong.
puts items["feed.xml"]
Everything works up until that point:
MD5s exists
MD5s{
"feed.xml": "93d5b140dd2b4779edef0347ac835fb1",
"index.html": "1cbe25936e392161bad6074d65acdd91",
"md5.json": "655d7c1dbf83a271f348a50a44ba4f6a",
"test.sh": "9be192b1b5a9978cb3623737156445fd",
"index.html": "c064e204040cde216d494776fdcfb68f",
"main.css": "21b13d87db2186d22720e8c881a78580",
"welcome-to-jekyll.html": "01d7c7d66bdeecd9cd69feb5b4b4184d"
}
common.rb:156:in `initialize': A JSON text must at least contain two octets! (JSON::ParserError)
I've searched and tried a lot of different things, to no avail. I'm stumped. Thanks!
You call read() twice at the point where it all goes wrong. Store the result of the first read() in a variable and pass that variable to JSON.parse instead of calling read() again, and all should be fine.
This code should work like you'd expect:
if File.file?("md5.json")
puts "MD5s exists"
mddigests = File.open("md5.json", "r")
digests = mddigests.read
puts "MD5s" + digests
items = JSON.parse(digests) #<--- This should work now!
puts items["feed.xml"]
end
The reason is that the file pointer is moved after the first read(), and by the second read(), it's at the end of file, hence the message requiring at least 2 octets.
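The file-pointer behaviour can be seen in a small sketch. StringIO stands in for the opened file here (an assumption made so the example needs no file on disk), and the JSON is cut down to one key:

```ruby
require 'json'
require 'stringio'

# StringIO behaves like an open file for the purposes of read and rewind.
io = StringIO.new('{"feed.xml": "93d5b140dd2b4779edef0347ac835fb1"}')

first  = io.read   # consumes everything and leaves the pointer at the end
second = io.read   # nothing left to read, so this is an empty string

puts first.empty?   # prints false
puts second.empty?  # prints true

# Rewinding moves the pointer back to the start, so the data can be read again.
io.rewind
items = JSON.parse(io.read)
puts items["feed.xml"]  # prints 93d5b140dd2b4779edef0347ac835fb1
```

Reading once into a variable, as in the corrected code, avoids the need for rewind altogether.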

How to use Nokogiri to combine multiple like-formatted XML files into CSV

I want to parse multiple like-formatted XML files into a CSV file.
I searched on Google, nokogiri.org, and on SO but I haven't been able to find an answer.
I have ten XML files in identical format in terms of node/element structure, that reside in the current directory.
After combining the XML files into a single XML file, I need to pull out specific elements of the advisory node. I would like to output the link, title, location, os -> language -> name, and reference -> name data to the CSV file.
My code is only able to parse a single XML document and I'd like it to take into account 1:many:
# Parse the XML file into a Nokogiri::XML::Document object
@doc = Nokogiri::XML(File.open("file.xml"))
# Gather the 5 specific XML elements out of the 'advisory' top-level node
data = @doc.search('advisory').map { |adv|
[
adv.at('link').content,
adv.at('title').content,
adv.at('location').content,
adv.at('os > language > name').content,
adv.at('reference > name').content
]
}
# Loop through each array element in the object and write out as CSV row
CSV.open('output_file.csv', 'wb') do |csv|
# Explicitly set headers until you figure out how to get them programmatically
csv << ['Link', 'Title', 'Location', 'OS Name', 'Reference Name']
data.each do |row|
csv << row
end
end
I tried changing the code to support multiple XML files and get them into Nokogiri::XML::Document objects:
xml_docs = []
Dir.glob("*.xml").each do |file|
xml = Nokogiri::XML(File.new(file))
xml_docs << Nokogiri::XML::Document.new(xml)
end
This successfully creates an array xml_docs with the correct objects in it, but I don't know how to convert these six objects into a single object.
This is sample XML. All XML files use the same node/element structure:
<advisories>
<title> Not relevant </title>
<customer> N/A </customer>
<advisory id="12345">
<link> https://www.google.com </link>
<release_date>2016-04-07</release_date>
<title> The Short Description Would Go Here </title>
<location> Location Name Here </location>
<os>
<product>
<id>98765</id>
<name>Product Name</name>
</product>
<language>
<id>123</id>
<name>en</name>
</language>
</os>
<reference>
<id>00029</id>
<name>Full</name>
<area>Not Defined</area>
</reference>
</advisory>
<advisory id="98765">
<link> https://www.msn.com </link>
<release_date>2016-04-08</release_date>
<title> The Short Description Would Go Here </title>
<location> Location Name Here </location>
<os>
<product>
<id>12654</id>
<name>Product Name</name>
</product>
<language>
<id>126</id>
<name>fr</name>
</language>
</os>
<reference>
<id>00052</id>
<name>Partial</name>
<area>Defined</area>
</reference>
</advisory>
</advisories>
The code leverages Nokogiri::XML::Document but if Nokogiri::XML::Builder will work better for this, I am more than willing to adjust my code accordingly.
I'd handle the first part, of parsing one XML file, like this:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<advisories>
<advisory id="12345">
<link> https://www.google.com </link>
<title> The Short Description Would Go Here </title>
<location> Location Name Here </location>
<os>
<language>
<name>en</name>
</language>
</os>
<reference>
<name>Full</name>
</reference>
</advisory>
<advisory id="98765">
<link> https://www.msn.com </link>
<release_date>2016-04-08</release_date>
<title> The Short Description Would Go Here </title>
<location> Location Name Here </location>
<os>
<language>
<name>fr</name>
</language>
</os>
<reference>
<name>Partial</name>
</reference>
</advisory>
</advisories>
EOT
Note: This has nodes removed because they weren't important to the question. Please remove fluff when asking as it's distracting.
With this being the core of the code:
doc.search('advisory').map{ |advisory|
link = advisory.at('link').text
title = advisory.at('title').text
location = advisory.at('location').text
os_language_name = advisory.at('os > language > name').text
reference_name = advisory.at('reference > name').text
{
link: link,
title: title,
location: location,
os_language_name: os_language_name,
reference_name: reference_name
}
}
That could be DRY'd but was written as an example of what to do.
Running that results in an array of hashes, which would be easily output via CSV:
# => [
{:link=>" https://www.google.com ", :title=>" The Short Description Would Go Here ", :location=>" Location Name Here ", :os_language_name=>"en", :reference_name=>"Full"},
{:link=>" https://www.msn.com ", :title=>" The Short Description Would Go Here ", :location=>" Location Name Here ", :os_language_name=>"fr", :reference_name=>"Partial"}
]
Once you've got that working then fit it into a modified version of your loops to output CSV and read the XML files. This is untested but looks about right:
CSV.open('output_file.csv', 'w',
headers: ['Link', 'Title', 'Location', 'OS Name', 'Reference Name'],
write_headers: true
) do |csv|
Dir.glob("*.xml").each do |file|
xml = Nokogiri::XML(File.read(file))
# parse a file and get the array of hashes
end
# pass the array of hashes to CSV for output
end
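The "pass the array of hashes to CSV" step might look something like the sketch below. It assumes rows shaped like the array of hashes shown earlier (whitespace stripped for brevity), and it uses CSV.generate so the result is a String that can be inspected; CSV.open with the same options would write a file instead. The rows, headers and keys names are illustrative:

```ruby
require 'csv'

# Rows shaped like the array of hashes produced by the parsing step.
rows = [
  { link: 'https://www.google.com', title: 'The Short Description Would Go Here',
    location: 'Location Name Here', os_language_name: 'en', reference_name: 'Full' },
  { link: 'https://www.msn.com', title: 'The Short Description Would Go Here',
    location: 'Location Name Here', os_language_name: 'fr', reference_name: 'Partial' }
]

headers = ['Link', 'Title', 'Location', 'OS Name', 'Reference Name']
keys    = %i[link title location os_language_name reference_name]

# values_at fixes the column order regardless of how each hash was built.
output = CSV.generate(headers: headers, write_headers: true) do |csv|
  rows.each { |row| csv << row.values_at(*keys) }
end

puts output
```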
Note that you were using a file mode of 'wb'. You rarely need b with CSV as CSV is supposed to be a text format. If you are sure you will encounter binary data then use 'b' also, but that could lead down a path containing dragons.
Also note that this is using read. read is not scalable, which means it doesn't care how big a file is, it's going to try to read it into memory, whether or not it'll actually fit. There are lots of reasons to avoid that, but the best one is that it'll bring your program to its knees. If your XML files could exceed the available free memory for your system then you'll want to rewrite using a SAX parser, which Nokogiri supports. How to do that is a different question.
it was actually an Array of array of hashes. I'm not sure how I ended up there but I was easily able to use array.flatten
Meditate on this:
foo = [] # => []
foo << [{}] # => [[{}]]
foo.flatten # => [{}]
You probably wanted to do this:
foo = [] # => []
foo += [{}] # => [{}]
Any time I have to use flatten I look to see if I can create the array without it being an array of arrays of something. It's not that they're inherently bad, because sometimes they're very useful, but you really wanted an array of hashes so you knew something was wrong and flatten was a cheap way out, but using it also costs more CPU time. It's better to figure out the problem and fix it and end up with faster/more efficient code. (And some will say that's a wasted effort or is premature optimization, but writing efficient code is a very good trait and goal.)
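Applied to the multi-file case, one way to avoid the nested array is to append each file's rows with Array#concat, or to build the list with flat_map. In this sketch rows_for is a hypothetical stand-in for "parse one XML file into an array of hashes", and the file names are made up:

```ruby
# Hypothetical stand-in for parsing one XML file into an array of hashes.
def rows_for(file)
  [{ file: file, n: 1 }, { file: file, n: 2 }]
end

files = ['a.xml', 'b.xml']

# Array#concat appends the elements themselves, so no nesting is created.
rows = []
files.each { |file| rows.concat(rows_for(file)) }

# Enumerable#flat_map builds the same flat list in one step.
same_rows = files.flat_map { |file| rows_for(file) }

puts rows.length        # prints 4
puts rows == same_rows  # prints true
```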

Xml formatting using Node

Following is the method used to write an entry to xml file
def write_entry(entry)
node = Nokogiri::XML::Node.new("url", @xml_document)
node["loc"]= entry[:url]
node["lastmod"]= entry[:lastmod].to_s
node["changefreq"] = entry[:frequency].to_s
node["priority"] = entry[:priority].to_s
node.to_xml
end
The entry looks like this:
<urlset>
<url loc="http://www.experteer.co.uk/vacaturebank/banen/vacatures/xing-ag" lastmod="2011-11-23 16:58:27 UTC" changefreq="0.8" priority="monthly"/>
</urlset>
I want the entry of xml to be like this
<urlset>
<url>
<loc> http://www.experteer.co.uk/vacaturebank/banen/vacatures/xing-ag </loc>
<lastmod> 2011-11-23 16:58:27 UTC </lastmod>
<changefreq> 0.8 </changefreq>
<priority> monthly </priority>
</url>
</urlset>
Is it possible using Node, or do I have to use Builder?
If it is possible with Node, then how?
And if I have to use Builder, it writes a header for each entry. How can I stop it from writing a header for each entry?
You can use << or add_child to append child nodes to a node.
def write_entry(entry)
url = Nokogiri::XML::Node.new( "url" , @xml_document )
%w{loc lastmod changefreq priority}.each do |node|
url << Nokogiri::XML::Node.new( node, @xml_document ).tap do |n|
n.content = entry[ node.to_sym ].to_s
end
end
url.to_xml
end
For this to work correctly, you have to change entry[:url] to entry[:loc]. and entry[:frequency] to entry[:changefreq], which shouldn't be a bad thing (it's best to have the same name for the same thing everywhere, isn't it ?).
Alternatively, if your entry hash only contains what you need to convert to xml, use entry.each do |key,value| instead of the array.

How to replace a particular line in xml with the new one in ruby

I have a requirement where I need to replace an element's value with a new one, and I don't want any other modification to be made to the file.
<mtn:test-case title='Power-Consist-Message'>
<mtn:messages>
<mtn:message sequence='4' correlation-key='0x0F04'>
<mtn:header>
<mtn:protocol-version>0x4</mtn:protocol-version>
<mtn:message-type>0x0F04</mtn:message-type>
<mtn:message-version>0x01</mtn:message-version>
<mtn:gmt-time-switch>false</mtn:gmt-time-switch>
<mtn:crc-calc-switch>1</mtn:crc-calc-switch>
<mtn:encrypt-switch>false</mtn:encrypt-switch>
<mtn:compress-switch>false</mtn:compress-switch>
<mtn:ttl>999</mtn:ttl>
<mtn:qos-class-of-service>0</mtn:qos-class-of-service>
<mtn:qos-priority>2</mtn:qos-priority>
<mtn:qos-network-preference>1</mtn:qos-network-preference>
This is what the XML file looks like. I want to replace the value 999 in <mtn:ttl> with "some other value", but when I do that using the formatter in Ruby, other unwanted modifications take place. The code I am using is below:
File.open(ENV['CadPath1']+ "conf\\cad-mtn-config.xml") do |config_file|
# Open the document and edit the file
config = Document.new(config_file)
testField=config.root.elements[4].elements[11].elements[1].elements[1].elements[1].elements[11]
if testField.to_s.match(/<mtn:qos-network-preference>/)
test=config.root.elements[4].elements[11].elements[1].elements[1].elements[1].elements[8].text="2"
# Write the result to a new file.
formatter = REXML::Formatters::Default.new
File.open(ENV['CadPath1']+ "conf\\cad-mtn-config.xml", 'w') do |result|
formatter.write(config, result)
end
end
end
When I write the modifications to the new file, the XML file size changes from 79 KB to 78 KB. Is there any way to replace just that particular line in the XML file and save the change without affecting the rest of the file?
Please let me know soon...
I prefer Nokogiri as my XML/HTML parser of choice:
require 'nokogiri'
xml =<<EOT
<mtn:test-case title='Power-Consist-Message'>
<mtn:messages>
<mtn:message sequence='4' correlation-key='0x0F04'>
<mtn:header>
<mtn:protocol-version>0x4</mtn:protocol-version>
<mtn:message-type>0x0F04</mtn:message-type>
<mtn:message-version>0x01</mtn:message-version>
<mtn:gmt-time-switch>false</mtn:gmt-time-switch>
<mtn:crc-calc-switch>1</mtn:crc-calc-switch>
<mtn:encrypt-switch>false</mtn:encrypt-switch>
<mtn:compress-switch>false</mtn:compress-switch>
<mtn:ttl>999</mtn:ttl>
<mtn:qos-class-of-service>0</mtn:qos-class-of-service>
<mtn:qos-priority>2</mtn:qos-priority>
<mtn:qos-network-preference>1</mtn:qos-network-preference>
EOT
Notice that the XML is malformed, i.e., it doesn't terminate correctly.
doc = Nokogiri::XML(xml)
I'm using CSS accessors to find the ttl node. Because of some magic, Nokogiri's CSS ignores XML namespaces, simplifying finding nodes.
doc.at('ttl').content = '1000'
puts doc.to_xml
# >> <?xml version="1.0"?>
# >> <test-case title="Power-Consist-Message">
# >> <messages>
# >> <message sequence="4" correlation-key="0x0F04">
# >> <header>
# >> <protocol-version>0x4</protocol-version>
# >> <message-type>0x0F04</message-type>
# >> <message-version>0x01</message-version>
# >> <gmt-time-switch>false</gmt-time-switch>
# >> <crc-calc-switch>1</crc-calc-switch>
# >> <encrypt-switch>false</encrypt-switch>
# >> <compress-switch>false</compress-switch>
# >> <ttl>1000</ttl>
# >> <qos-class-of-service>0</qos-class-of-service>
# >> <qos-priority>2</qos-priority>
# >> <qos-network-preference>1</qos-network-preference>
# >> </header></message></messages></test-case>
Notice that Nokogiri replaced the content of the ttl node. It also stripped the XML namespace info because the document didn't declare it correctly, and, finally, Nokogiri has added closing tags to make the document syntactically correct.
If you want the namespace to be declared in the output, you'll need to make sure it's there in the input.
If you need to just literally replace that value without affecting anything else about the XML file, even if (as pointed out by the Tin Man above) that would mean leaving the original XML file malformed, you can do that with direct string manipulation using a regular expression.
Assuming there is guaranteed to only be one <mtn:ttl> tag in your XML document, you could just do:
doc = File.read("somefile.xml")
doc.sub!(/<mtn:ttl>.+?<\/mtn:ttl>/, "<mtn:ttl>some other value</mtn:ttl>")
File.open("somefile.xml", "w") {|fh| fh.write(doc)}
If there might be more than one <mtn:ttl> tag, then this is trickier; how much trickier depends on how you want to figure out which tag(s) to change.
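To make the one-tag versus many-tags distinction concrete, here is a small sketch on an in-memory string. The <mtn:other> element and the replacement value 1000 are made up for illustration. sub changes only the first match, while gsub changes every match, and in both cases the text outside the matches is left untouched:

```ruby
# Two ttl elements with an unrelated (made-up) element between them.
xml = "<mtn:ttl>999</mtn:ttl>\n<mtn:other>1</mtn:other>\n<mtn:ttl>999</mtn:ttl>\n"

pattern     = %r{<mtn:ttl>.+?</mtn:ttl>}
replacement = '<mtn:ttl>1000</mtn:ttl>'

# sub touches only the first match.
first_only = xml.sub(pattern, replacement)

# gsub touches every match.
all = xml.gsub(pattern, replacement)

puts first_only.scan(replacement).length  # prints 1
puts all.scan(replacement).length         # prints 2
```

Picking out one specific tag among several would need more context in the pattern (for example, text that appears only near the target), which depends on the actual file.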

Resources