How to use XPath in Groovy to return tags and values - xpath

I have the following XML, and in SOAPUI Groovy I am attempting to capture a set of XML with its tags and values, for example:
<telephoneNumbers>
  <telephone>
    <id>125042741</id>
    <areaCode>0161</areaCode>
    <phoneNumber>4804420</phoneNumber>
    <extension>1234</extension>
    <usage>Work</usage>
  </telephone>
</telephoneNumbers>
I'm trying to return the following outcome (tags and values):
<telephone>
  <id>125042741</id>
  <areaCode>0161</areaCode>
  <phoneNumber>4804420</phoneNumber>
  <extension>1234</extension>
  <usage>Work</usage>
</telephone>
Here is the Groovy:
def groovyUtils = new com.eviware.soapui.support.GroovyUtils( context )
def Recall = groovyUtils.getXmlHolder( "Recall#Response" )
def telephone = Recall[ "//telephone//*" ] as String
String returnXml = ""
if ( Recall["//restrict"] != null ) {
    returnXml = telephone
}
return returnXml

Similar to this answer to "How do I print a groovy Node with namespace preserved?", you could parse the XML using XmlSlurper and afterwards print the XML nodes you wish to keep with StreamingMarkupBuilder.
String xml = Recall.getXml()
def telephoneNumbers = new XmlSlurper().parseText(xml)
def outputBuilder = new groovy.xml.StreamingMarkupBuilder()
String telephoneXml = outputBuilder.bind { mkp.yield telephoneNumbers.telephone }
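For illustration, here is the same approach as a self-contained sketch, with a literal XML string standing in for the SOAPUI response:
import groovy.xml.StreamingMarkupBuilder

// A literal XML string stands in for the "Recall#Response" holder.
def xml = '''<telephoneNumbers>
  <telephone>
    <id>125042741</id>
    <areaCode>0161</areaCode>
    <phoneNumber>4804420</phoneNumber>
    <extension>1234</extension>
    <usage>Work</usage>
  </telephone>
</telephoneNumbers>'''

def telephoneNumbers = new XmlSlurper().parseText(xml)
// bind { } serializes the selected GPath node back to markup.
String telephoneXml = new StreamingMarkupBuilder().bind { mkp.yield telephoneNumbers.telephone }
println telephoneXml
// => <telephone><id>125042741</id>...<usage>Work</usage></telephone>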
I posted a runnable example here: http://groovyconsole.appspot.com/script/1245001 (not using SOAPUI, but a simple string for demonstration). Hope this helps!

Related

Get value from XML and calculate

I am getting the amounts from an XML file, but I need to sum them as a check.
I am using Ruby on Rails with the Nokogiri gem.
Example from the XML file:
<cfdi:Conceptos>
  <cfdi:Concepto ClaveProdServ="15101514" NoIdentificacion="PL/762/EXP/ES/2015-16665610" Cantidad="52.967" ClaveUnidad="LTR" Descripcion="MAGNA (LT)" ValorUnitario="16.34" Importe="865.74">
    <cfdi:Impuestos>
      <cfdi:Traslados>
        <cfdi:Traslado Base="842.59" Impuesto="002" TipoFactor="Tasa" TasaOCuota="0.160000" Importe="134.81"/>
      </cfdi:Traslados>
    </cfdi:Impuestos>
  </cfdi:Concepto>
  <cfdi:Concepto ClaveProdServ="15101514" NoIdentificacion="PL/767/EXP/ES/2015-8515840" Cantidad="35.045" ClaveUnidad="LTR" Descripcion="MAGNA (LT)" ValorUnitario="16.34" Importe="572.80">
    <cfdi:Impuestos>
      <cfdi:Traslados>
        <cfdi:Traslado Base="557.49" Impuesto="002" TipoFactor="Tasa" TasaOCuota="0.160000" Importe="89.20"/>
      </cfdi:Traslados>
    </cfdi:Impuestos>
  </cfdi:Concepto>
  <cfdi:Concepto ClaveProdServ="15101514" NoIdentificacion="PL/762/EXP/ES/2015-16665910" Cantidad="21.992" ClaveUnidad="LTR" Descripcion="MAGNA (LT)" ValorUnitario="16.34" Importe="359.45">
    <cfdi:Impuestos>
      <cfdi:Traslados>
        <cfdi:Traslado Base="349.84" Impuesto="002" TipoFactor="Tasa" TasaOCuota="0.160000" Importe="55.97"/>
      </cfdi:Traslados>
    </cfdi:Impuestos>
  </cfdi:Concepto>
  <cfdi:Concepto ClaveProdServ="15101514" NoIdentificacion="PL/762/EXP/ES/2015-16665560" Cantidad="25.002" ClaveUnidad="LTR" Descripcion="MAGNA (LT)" ValorUnitario="16.34" Importe="408.62">
    <cfdi:Impuestos>
      <cfdi:Traslados>
        <cfdi:Traslado Base="397.69" Impuesto="002" TipoFactor="Tasa" TasaOCuota="0.160000" Importe="63.63"/>
      </cfdi:Traslados>
    </cfdi:Impuestos>
  </cfdi:Concepto>
I managed to obtain all the amounts and taxes with these lines of code:
array = []
array_i = []
file = Nokogiri::XML(File.open(params[:consumption][:factura]))
doc_pass = file.xpath("//cfdi:Comprobante/cfdi:Conceptos/cfdi:Concepto")
doc_pass.each do |pass|
  hash_importe = {}
  hash_importe[:importe] = pass['Importe']
  array << hash_importe
end
doc_pass2 = file.xpath("//cfdi:Comprobante/cfdi:Conceptos/cfdi:Concepto/cfdi:Impuestos/cfdi:Traslados/cfdi:Traslado")
doc_pass2.each do |pass2|
  hash_impuesto = {}
  hash_impuesto[:importe] = pass2['Importe']
  array_i << hash_impuesto
end
These are the results I get from the XML file:
(byebug) array
[{:importe=>"865.74"}, {:importe=>"572.80"}, {:importe=>"359.45"}, {:importe=>"408.62"}, {:importe=>"324.48"}, {:importe=>"649.64"}, {:importe=>"823.45"}, {:importe=>"545.15"}, {:importe=>"428.02"}, {:importe=>"527.21"}, {:importe=>"487.67"}, {:importe=>"331.72"}, {:importe=>"511.64"}, {:importe=>"406.67"}, {:importe=>"820.81"}, {:importe=>"1635.54"}, {:importe=>"484.14"}, {:importe=>"564.83"}, {:importe=>"1463.30"}]
(byebug) array_i
[{:importe=>"134.81"}, {:importe=>"89.20"}, {:importe=>"55.97"}, {:importe=>"63.63"}, {:importe=>"50.52"}, {:importe=>"101.18"}, {:importe=>"128.21"}, {:importe=>"84.88"}, {:importe=>"66.73"}, {:importe=>"82.10"}, {:importe=>"75.90"}, {:importe=>"51.58"}, {:importe=>"79.67"}, {:importe=>"63.33"}, {:importe=>"127.80"}, {:importe=>"254.69"}, {:importe=>"75.36"}, {:importe=>"87.92"}, {:importe=>"227.84"}]
Now what I want is to sum both values (importe + impuesto), for example:
865.74 + 134.81
572.80 + 89.20
359.45 + 55.97
I am new to Rails; I would appreciate your help.
You can return an array with the results if both arrays have the same size (which I think they do), like this:
(0..array.size - 1).each_with_object([]) { |i, obj| obj << array[i][:importe].to_f + array_i[i][:importe].to_f }
result:
[1000.55, 662.0, 415.41999999999996, 472.25, 375.0, 750.8199999999999, 951.6600000000001, 630.03, 494.75, 609.3100000000001, 563.57, 383.3, 591.31, 470.0, 948.6099999999999, 1890.23, 559.5, 652.75, 1691.1399999999999]
Use the zip method to combine the values at corresponding indexes of the two arrays:
result = array.zip(array_i)
              .map { |importe, impuesto| importe[:importe].to_f + impuesto[:importe].to_f }
Or it can be simplified further for your concrete data structure:
result = array.zip(array_i).map { |hashes| hashes.sum { |h| h[:importe].to_f } }
A better approach would be to extract a Concepto object with the Impuesto and Importe values directly from the XML; then you don't need to combine different arrays, but can use a nicely structured object, as sketched below.
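A minimal sketch of that idea (not from the original answer; it assumes the cfdi namespace is declared on the document root, so the same //cfdi:... paths work):
# Build one hash per Concepto, pairing its Importe with its Traslado tax.
file = Nokogiri::XML(File.open(params[:consumption][:factura]))
conceptos = file.xpath("//cfdi:Comprobante/cfdi:Conceptos/cfdi:Concepto").map do |concepto|
  importe  = concepto['Importe'].to_f
  impuesto = concepto.xpath("cfdi:Impuestos/cfdi:Traslados/cfdi:Traslado").sum { |t| t['Importe'].to_f }
  { importe: importe, impuesto: impuesto, total: importe + impuesto }
end
conceptos.first # => { importe: 865.74, impuesto: 134.81, total: 1000.55 }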

XPath could not recognize predicate for a tag

I am trying to use Scrapy's XPath to scrape a page, but it seems it cannot capture tags with predicates when I use a for loop:
# This package will contain the spiders of your Scrapy project
from cunyfirst.items import CunyfirstSectionItem
import scrapy
import json

class CunyfristsectionSpider(scrapy.Spider):
    name = "cunyfirst-section-spider"
    start_urls = ["file:///Users/haowang/Desktop/section.htm"]

    def parse(self, response):
        url = response.url
        yield scrapy.Request(url, self.parse_page)

    def parse_page(self, response):
        n = -1
        for section in response.xpath("//a[contains(@name, 'MTG_CLASS_NBR')]"):
            print(response.xpath("//a[@name = 'MTG_CLASSNAME$10']/text()"))
            n += 1
            class_num = section.xpath('text()').extract_first()
            # print(class_num)
            classname = "MTG_CLASSNAME$" + str(n)
            date = "MTG_DAYTIME$" + str(n)
            instr = "MTG_INSTR$" + str(n)
            print(classname)
            class_name = response.xpath("//a[@name = classname]/text()")
I am looking for <a> tags whose name is "MTG_CLASSNAME$" + str(n), with n being 0, 1, 2, ..., and I am getting empty output from my XPath query. Not sure why...
PS.
I am basically trying to scrape course and their info from https://hrsa.cunyfirst.cuny.edu/psc/cnyhcprd/GUEST/HRMS/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL?FolderPath=PORTAL_ROOT_OBJECT.HC_CLASS_SEARCH_GBL&IsFolder=false&IgnoreParamTempl=FolderPath%252cIsFolder&PortalActualURL=https%3a%2f%2fhrsa.cunyfirst.cuny.edu%2fpsc%2fcnyhcprd%2fGUEST%2fHRMS%2fc%2fCOMMUNITY_ACCESS.CLASS_SEARCH.GBL&PortalContentURL=https%3a%2f%2fhrsa.cunyfirst.cuny.edu%2fpsc%2fcnyhcprd%2fGUEST%2fHRMS%2fc%2fCOMMUNITY_ACCESS.CLASS_SEARCH.GBL&PortalContentProvider=HRMS&PortalCRefLabel=Class%20Search&PortalRegistryName=GUEST&PortalServletURI=https%3a%2f%2fhome.cunyfirst.cuny.edu%2fpsp%2fcnyepprd%2f&PortalURI=https%3a%2f%2fhome.cunyfirst.cuny.edu%2fpsc%2fcnyepprd%2f&PortalHostNode=ENTP&NoCrumbs=yes
with filter applied: Kingsborough CC, fall 18, BIO
Thanks!
Well... I've visited the website you put in the question description, used element inspection, searched for "MTG_CLASSNAME", and got 0 matches...
So I will give you some tools:
In your settings.py, set:
LOG_FILE = "log.txt"
LOG_STDOUT = True
Then print the response body (response.body) where appropriate (at the top of the parse_page function in this case) and search for it in log.txt.
Check there whether what you are looking for is present.
If it is, use https://www.freeformatter.com/xpath-tester.html (or similar) to check your XPath statement.
In addition, change for section in response.xpath("//a[contains(@name,'MTG_CLASS_NBR')]"):
to for section in response.xpath("//a[contains(@name,'MTG_CLASS_NBR')]").extract():; this will raise an error once you actually receive the data you are looking for.
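As a side note beyond the original answer: the last line of the question compares @name against an XPath child element named classname, not against the value of the Python variable, which by itself explains the empty result. A minimal sketch of that fix:
# Interpolate the Python variable into the XPath string; a bare
# `classname` token in XPath refers to a child element of that name,
# which never exists here.
classname = "MTG_CLASSNAME$" + str(n)
class_name = response.xpath('//a[@name = "%s"]/text()' % classname).extract_first()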

For loop while using Scrapy

I am trying to crawl a website. I want to do it for different dates, so I am storing the dates in a list. But when accessing the items of the list, the crawler works only for the first value. Please help. Following is my code:
class SpidyQuotesViewStateSpider(scrapy.Spider):
    name = 'retail_price'

    def start_requests(self):
        print "start request"
        urls = "http://fcainfoweb.nic.in/PMSver2/Reports/Report_Menu_web.aspx"
        yield scrapy.Request(url=urls, callback=self.parse)

    def parse(self, response):
        dated = ["05/03/2017", "04/03/2017"]
        urls = "http://fcainfoweb.nic.in/PMSver2/Reports/Report_Menu_web.aspx"
        #frmdata =
        cookies1 = {}
        val = response.headers.getlist('Set-Cookie')
        print "login session values", response.headers.getlist('Set-Cookie')
        if len(val) != 0:
            cookies1['ASP.NET_SessionId'] = str(response.headers.getlist('Set-Cookie')[0].split(";")[0].split("=")[1])
            cookies1['path'] = str(response.headers.getlist('Set-Cookie')[0].split(";")[1].split("=")[1])
            print cookies1
        for i in range(len(dated)):
            yield scrapy.FormRequest(url=urls, callback=self.parse1, formdata={'ctl00$MainContent$btn_getdata1':"Get Data",
'ctl00$MainContent$Txt_FrmDate':dated[i],
'ctl00$MainContent$Ddl_Rpt_Option0':"Daily Prices",
'ctl00$MainContent$Rbl_Rpt_type':"Price report",
'ctl00$MainContent$ddl_Language':"English",
'ctl00$MainContent$Ddl_Rpt_type':"Retail",
'__EVENTVALIDATION':"IwZyKgfTXVzxiHxiPXGk/W8XQZBDb0EOPxJh6s8hofq0ffqOpiHSH77CafcxySF3PbkYgSMNFCJhLM2cGnL6SxT0PJuGDCJtV0V8Y4a94UErUCiSANiin+4uKckk9v9Ux8JqTVeaipppmlH+wyks2U9SgPfkNUsqw4eHCkDyB5akNNZImRIixOHHVY3JSXGkwXn7ueK9w+AgnqJzpXaWdMr9J1++M4VAFImSNF8brFSfPHe5kb/qzkGIwUr/KRouaRYK8WLWZh/Mbl9xwREwhDSxWJSOdihSE0WWoaqSMtpaR99rDDCsD3mdJqfu0aPIlREupTZRzlrmztXU0eS3949YW+ywdTRvykaMNgOW2Q4saYP5j/niKbRW6GiDnaLV2A38X/HW80+trrsjwJr9tjTKVFyikf6s/3gzyiTp11ivSkwIY2b3hutjYn7OfTDo",
#'__EVENTVALIDATION':"HqVo2xHk04clYwnBposXbZGhbIr181A7RbyeZv74Cia7rXSKmpOpbeSnn3XXnoDJKRxMK0W9nxKZFfkNje+P/K7gE5HVjHJr9Gr0Gs46TntzKDsvzyii8jZ7e0fdZgQCJKoXxQNgR2vNkWqChKcEldBuMHCOgJRqCNCF/JPFKpdKZoIWr7GU8rhzwLijf/Gkm+FuTULs/fl2HHK6Z1QQEozzEHFsDwzl0G4IiN//eNYfHuUBXKZ3wdZzPqG0s53WHEuSBzhqBC9AtCJOs4ZZhdtwFh8iyTJ4PlsLP9DLHYHRCOAd72UO0UH8gT7gAkKVo1I4L540DilowOR9SttH7MM/oOs9qhKlnG61FgqkYGW8zGzF/yNEXO+beVAK1RVvuO+FDnuq/g36TRnUieei5GpAZ+96CSoCIxykdvHx8R+smTNF/5erlowV4ci+tcI7",
'__VIEWSTATEENCRYPTED':"",
'__VIEWSTATEGENERATOR':"85862B00",
'__VIEWSTATE':"+a+3jrBEKxDdkPOzx2wXwKaTMWvCB60WPaHRfJUAZQrdFIpxSqFr5VseTclpGzeHXdxaFnxJe/PkxDKYa7sj3Wiv/os1bNeX0IEB3s45eFsHYWGiU8cvsXCGa5z7rrGRDL5hotg7k/MuUWj8w27xXZO423MN5OsHS+wh+tC/5/Xix+w3zxuQhi8jR5DnreimHbhGZn1sYaKYIGCc8mDIDRNl+w1OZ058F+3LAx96QUu5BYiMYOmrlyxrb9b2yPTmmIrI4NtC4ClBQlxuST5wMDP3vUqqWMhn4auk8ev5gHyPestCRrsAXWs07wDNnikemMwo/4wPiTEbnZQV6SLcDUw0gZpXjXwLI7mhsVjEyVNaQnJp6+Wi6FLsAEEMlFYmQut3JecpVIUkjF9uYSN2GLIbXHPs37AiEXPeQ8E/GyBMx3z1X5l8sw/xSNmFgYQC3riajn8V0+SdkuV2PbNbYKtc+uoSCNLppLYCqiOv5eWanGvAQro2Q67FBA4w2xY+V/K8mzHaGMLoDBxJxLslWyJpL5cX0C6qoXVUu8B028auAQM4eVzH1YPF5qrJiCDo",
#'__VIEWSTATE':"W+m8kNAS6QHiRPo+zFj00EDs/Dbq+y/XvtCmSNwOIkGKlikAlphT8HBAWQDskSm1vdNterBuo0Hy7m4xPbXMOnyEm6IlseXO3jPw+ofnI2WHAKknLil+GeS0IfMWGeoD5aNyiz3zh1jZkKU7R7hQsxwARoHRyjhf8UCooFbkVvL6ddHVYZbH5LcocmCF1BTOCqYN5y5yzfDfYbp3KNW9kH53pdmwCsjiEirdxxUGDoG1Ke3JBEXfSl+4XubirHSR8z+VlFmPPXZGU8mMogwq9Eg822RYjvbwvZG74djcf7kdfB9KXCPO9u6cWIjLiW+cfXHSXD+1XYFVf9ATU2/NV4YbUzsI4PJRwoGD4BryUNIm2JFeT4c8F4REYTA16shxz5mDTFQ6rbmg6SmqP8G9gAc2Hr9ABD8+2BUNabGhNZ8wDIZArfYS4pl5DNrlPlpqeCjhmvv0znKAJSOac3pCUej8G90ZGwQKOPORWbNVzQShoH7QvrXV8pCklcia6psuAGO+Oj72oDWPxedE4DjdjX5TbLoW4bzsk/YNfUv4JpjGR8DWpG8IFYJG9CCjMEYb",
'__LASTFOCUS':"",
'__EVENTARGUMENT':"",
'__EVENTTARGET':"",
'ctl00_MainContent_ToolkitScriptManager1_HiddenField':";;AjaxControlToolkit,+Version=4.1.51116.0,+Culture=neutral,+PublicKeyToken=28f01b0e84b6d53e:en-US:fd384f95-1b49-47cf-9b47-2fa2a921a36a:475a4ef5:addc6819:5546a2b:d2e10b12:effe2a26:37e2e5c9:5a682656:c7029a2:e9e598a9"},method='POST',cookies = cookies1)
    def parse1(self, response):
        path1 = "id('Panel1')"
        value1 = response.xpath(path1).extract_first()
        print value1
First of all, you are sending the spider to the same site multiple times, though with different form parameters. You therefore have to use dont_filter=True in the request, otherwise Scrapy blocks the duplicate calls.
Then it seems to me that the site you are scraping doesn't allow you to make more than one request during the same session. Try, for example, going to http://fcainfoweb.nic.in/PMSver2/Reports/Report_Menu_web.aspx with your browser, filling in the form, getting the data and then going back to the initial page: it's impossible. So you have to modify your spider. Here's some very rough code just to give you an idea. It works for me, but please don't use it in production!
class SpidyQuotesViewStateSpider(scrapy.Spider):
    name = 'retail_price'
    urls = "http://fcainfoweb.nic.in/PMSver2/Reports/Report_Menu_web.aspx"

    def start_requests(self):
        dated = ["01/03/2017", "05/03/2017", "04/03/2017"]
        for i in dated:
            request = scrapy.Request(url=self.urls, dont_filter=True, callback=self.parse)
            request.meta['question'] = i
            yield request

    def parse(self, response):
        thedate = response.meta['question']
        cookies1 = {}
        val = response.headers.getlist('Set-Cookie')
        print("login session values", response.headers.getlist('Set-Cookie'))
        if len(val) != 0:
            cookies1['ASP.NET_SessionId'] = str(str(response.headers.getlist('Set-Cookie')[0]).split(";")[0].split("=")[1])
            cookies1['path'] = str(str(response.headers.getlist('Set-Cookie')[0]).split(";")[1].split("=")[1])
        yield scrapy.FormRequest(url=self.urls, dont_filter=True, callback=self.parse1, formdata={'ctl00$MainContent$btn_getdata1':"Get Data",
'ctl00$MainContent$Txt_FrmDate': thedate,
'ctl00$MainContent$Ddl_Rpt_Option0':"Daily Prices",
'ctl00$MainContent$Rbl_Rpt_type':"Price report",
'ctl00$MainContent$ddl_Language':"English",
'ctl00$MainContent$Ddl_Rpt_type':"Retail",
'__EVENTVALIDATION':"IwZyKgfTXVzxiHxiPXGk/W8XQZBDb0EOPxJh6s8hofq0ffqOpiHSH77CafcxySF3PbkYgSMNFCJhLM2cGnL6SxT0PJuGDCJtV0V8Y4a94UErUCiSANiin+4uKckk9v9Ux8JqTVeaipppmlH+wyks2U9SgPfkNUsqw4eHCkDyB5akNNZImRIixOHHVY3JSXGkwXn7ueK9w+AgnqJzpXaWdMr9J1++M4VAFImSNF8brFSfPHe5kb/qzkGIwUr/KRouaRYK8WLWZh/Mbl9xwREwhDSxWJSOdihSE0WWoaqSMtpaR99rDDCsD3mdJqfu0aPIlREupTZRzlrmztXU0eS3949YW+ywdTRvykaMNgOW2Q4saYP5j/niKbRW6GiDnaLV2A38X/HW80+trrsjwJr9tjTKVFyikf6s/3gzyiTp11ivSkwIY2b3hutjYn7OfTDo",
#'__EVENTVALIDATION':"HqVo2xHk04clYwnBposXbZGhbIr181A7RbyeZv74Cia7rXSKmpOpbeSnn3XXnoDJKRxMK0W9nxKZFfkNje+P/K7gE5HVjHJr9Gr0Gs46TntzKDsvzyii8jZ7e0fdZgQCJKoXxQNgR2vNkWqChKcEldBuMHCOgJRqCNCF/JPFKpdKZoIWr7GU8rhzwLijf/Gkm+FuTULs/fl2HHK6Z1QQEozzEHFsDwzl0G4IiN//eNYfHuUBXKZ3wdZzPqG0s53WHEuSBzhqBC9AtCJOs4ZZhdtwFh8iyTJ4PlsLP9DLHYHRCOAd72UO0UH8gT7gAkKVo1I4L540DilowOR9SttH7MM/oOs9qhKlnG61FgqkYGW8zGzF/yNEXO+beVAK1RVvuO+FDnuq/g36TRnUieei5GpAZ+96CSoCIxykdvHx8R+smTNF/5erlowV4ci+tcI7",
'__VIEWSTATEENCRYPTED':"",
'__VIEWSTATEGENERATOR':"85862B00",
'__VIEWSTATE':"+a+3jrBEKxDdkPOzx2wXwKaTMWvCB60WPaHRfJUAZQrdFIpxSqFr5VseTclpGzeHXdxaFnxJe/PkxDKYa7sj3Wiv/os1bNeX0IEB3s45eFsHYWGiU8cvsXCGa5z7rrGRDL5hotg7k/MuUWj8w27xXZO423MN5OsHS+wh+tC/5/Xix+w3zxuQhi8jR5DnreimHbhGZn1sYaKYIGCc8mDIDRNl+w1OZ058F+3LAx96QUu5BYiMYOmrlyxrb9b2yPTmmIrI4NtC4ClBQlxuST5wMDP3vUqqWMhn4auk8ev5gHyPestCRrsAXWs07wDNnikemMwo/4wPiTEbnZQV6SLcDUw0gZpXjXwLI7mhsVjEyVNaQnJp6+Wi6FLsAEEMlFYmQut3JecpVIUkjF9uYSN2GLIbXHPs37AiEXPeQ8E/GyBMx3z1X5l8sw/xSNmFgYQC3riajn8V0+SdkuV2PbNbYKtc+uoSCNLppLYCqiOv5eWanGvAQro2Q67FBA4w2xY+V/K8mzHaGMLoDBxJxLslWyJpL5cX0C6qoXVUu8B028auAQM4eVzH1YPF5qrJiCDo",
#'__VIEWSTATE':"W+m8kNAS6QHiRPo+zFj00EDs/Dbq+y/XvtCmSNwOIkGKlikAlphT8HBAWQDskSm1vdNterBuo0Hy7m4xPbXMOnyEm6IlseXO3jPw+ofnI2WHAKknLil+GeS0IfMWGeoD5aNyiz3zh1jZkKU7R7hQsxwARoHRyjhf8UCooFbkVvL6ddHVYZbH5LcocmCF1BTOCqYN5y5yzfDfYbp3KNW9kH53pdmwCsjiEirdxxUGDoG1Ke3JBEXfSl+4XubirHSR8z+VlFmPPXZGU8mMogwq9Eg822RYjvbwvZG74djcf7kdfB9KXCPO9u6cWIjLiW+cfXHSXD+1XYFVf9ATU2/NV4YbUzsI4PJRwoGD4BryUNIm2JFeT4c8F4REYTA16shxz5mDTFQ6rbmg6SmqP8G9gAc2Hr9ABD8+2BUNabGhNZ8wDIZArfYS4pl5DNrlPlpqeCjhmvv0znKAJSOac3pCUej8G90ZGwQKOPORWbNVzQShoH7QvrXV8pCklcia6psuAGO+Oj72oDWPxedE4DjdjX5TbLoW4bzsk/YNfUv4JpjGR8DWpG8IFYJG9CCjMEYb",
'__LASTFOCUS':"",
'__EVENTARGUMENT':"",
'__EVENTTARGET':"",
'ctl00_MainContent_ToolkitScriptManager1_HiddenField':";;AjaxControlToolkit,+Version=4.1.51116.0,+Culture=neutral,+PublicKeyToken=28f01b0e84b6d53e:en-US:fd384f95-1b49-47cf-9b47-2fa2a921a36a:475a4ef5:addc6819:5546a2b:d2e10b12:effe2a26:37e2e5c9:5a682656:c7029a2:e9e598a9"},method='POST',cookies = cookies1)
    def parse1(self, response):
        path1 = "id('Panel1')"
        value1 = response.xpath(path1).extract_first()[:574]
        print(value1)

How to sort a list of text+date strings in Groovy

I have a list of strings, each one contains text with date like this:
"foo_6.7.2016"
"foo_5.10.2016"
"foo_6.30.2016"
"foo_6.23.2016"
"foo_6.2.2016"
"foo_5.22.2016"
I need to sort them by date and get this:
"foo_6.30.2016"
"foo_6.23.2016"
"foo_6.7.2016"
"foo_6.2.2016"
"foo_5.22.2016"
"foo_5.10.2016"
An alternative might be:
def foos = [
    "foo_6.7.2016",
    "foo_5.10.2016",
    "foo_6.30.2016",
    "foo_6.23.2016",
    "foo_6.2.2016",
    "foo_5.22.2016"
]
def sorted = foos.sort(false) { Date.parse('M.d.yyyy', it - 'foo_') }.reverse()
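A quick check against the expected order from the question:
assert sorted == [
    "foo_6.30.2016",
    "foo_6.23.2016",
    "foo_6.7.2016",
    "foo_6.2.2016",
    "foo_5.22.2016",
    "foo_5.10.2016"
]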
For a quick answer that needs substantial cleanup:
def dates = [
    "foo_6.7.2016",
    "foo_5.10.2016",
    "foo_6.30.2016",
    "foo_6.23.2016",
    "foo_6.2.2016",
    "foo_5.22.2016"
]
def prefix = "foo_"
java.text.SimpleDateFormat sdf = new java.text.SimpleDateFormat("M.d.yyyy")
def sorted_dates = dates.collect { sdf.parse(it, new java.text.ParsePosition(prefix.length())) }.sort().reverse()
def newDates = sorted_dates.collect { "${prefix}${sdf.format(it)}" }
println newDates

How to find a value with XPath in Groovy

Please advise how to find and output cust_JiraTaskId. I need the value of cust_JiraTaskId from the cust_JiraTask node with the max externalCode inside it; in this example it'll be 111111.
I managed to find the max externalCode, and now I need the cust_JiraTaskId value.
<SFOData.cust_JiraReplication>
  <cust_HRISId>J000009</cust_HRISId>
  <externalCode>7</externalCode>
  <cust_JiraTask>
    <externalCode>3</externalCode>
    <cust_JiraTaskId>12345</cust_JiraTaskId>
  </cust_JiraTask>
  <cust_JiraTask>
    <externalCode>5</externalCode>
    <cust_JiraTaskId>111111</cust_JiraTaskId>
  </cust_JiraTask>
</SFOData.cust_JiraReplication>
My script is below:
// Create an XPath statement to search for the
// element or elements you care about:
XPath x;
x = XPath.newInstance("//cust_JiraTask/externalCode");
myElements = x.selectNodes(doc);
String maxvalue = "";
for (Element myElement : myElements) {
    if (myElement.getValue() > maxvalue) {
        maxvalue = myElement.getValue();
    }
}
props.setProperty("document.dynamic.userdefined.externalCode", maxvalue);
Thanks for the help.
This works for me with Groovy 2.4.5:
def xml = """
<SFOData.cust_JiraReplication>
  <cust_HRISId>J000009</cust_HRISId>
  <externalCode>7</externalCode>
  <cust_JiraTask>
    <externalCode>3</externalCode>
    <cust_JiraTaskId>12345</cust_JiraTaskId>
  </cust_JiraTask>
  <cust_JiraTask>
    <externalCode>5</externalCode>
    <cust_JiraTaskId>111111</cust_JiraTaskId>
  </cust_JiraTask>
</SFOData.cust_JiraReplication>
"""
def xs = new XmlSlurper().parseText(xml)
def nodes = xs.cust_JiraTask.cust_JiraTaskId
def maxNode = nodes.max { it.text() as int }
assert 111111 == maxNode.text() as int
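If the selection should instead be keyed on the max externalCode, as the question describes, a small variation on the same GPath result works; a sketch building on the snippet above:
// Pick the cust_JiraTask with the largest externalCode, then read its task id.
def maxTask = xs.cust_JiraTask.max { it.externalCode.text() as int }
assert 111111 == maxTask.cust_JiraTaskId.text() as int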
