I'm looking into the possibilities of jmeter, and it looks just great. However, one of the things my testing script should be able to do, is search for some values, and click on a random resulting link.
So what I would need to automate is:
Entering the values in the searchbox (I could do this by using the correct GET url in a second page, but how do I do this 5000 times?)
Clicking on one of the results listed.
Thanks for the help!
This can be done using the CSV Data Set, it loads a CSV file, and feeds the content into the variables of your choice:
http://jmeter.apache.org/usermanual/component_reference.html#CSV_Data_Set_Config
After that, you can use the regular expression extractor to extract the URL's you want from the resulting HTML, and follow those links:
http://jmeter.apache.org/usermanual/component_reference.html#Regular_Expression_Extractor
You could use a HTML Link Parser for your second part (the clicking on one of the results):
http://jmeter.apache.org/usermanual/component_reference.html#HTML_Link_Parser
See http://theworkaholic.blogspot.co.at/2009/11/randomly-clicking-links-in-jmeter.html for an example of using the link parser in a context similar to your question.
Related
I am trying to use Octoparse to extract the podcast details from Marie Brown's "Beyond the kitchen table" website. https://beyondthekitchentable.co.uk/podcast/
I'm using Octoparse's free version which allows for scraping locally. The problem is that while Octoparse will automatically auto-detect the Title, Title_URL, and Content webpage data and correctly set up the Pagination, Scroll Page, and Loop item workflow to extract (Title, Title_URL, and Content fields), it does not auto-detect the 'Date' and 'Podcast time duration' fields of each individual podcast as these pieces appear to be getting embedded from an iframe. However, while I am able to custom add Date and Podcast time duration using an Absolute Xpath i.e. //div[#class="cfm-episodes-list"]/div[1]/div[2]/div[1]/iframe[1]. This results in the same value copied for each record. So when I attempt to fix this by using the Relative XPath setting in Octoparse to loop each item //span[#class="cp-episode-date"] in order to gather all individually unique, it does not get any values even though this relative Xpath //span[#class="cp-episode-date"] is finding all items when I use WebDevTools to search and find all occurrences seen within Chrome. I saw what might be another helpful post on Stackexchange about this but I was not able to make sense of it.
This portion //span[#class="cp-episode-date"] is relative Xpath as it finds multiple Date items in Chrome WebDevTools but it is not complete and I am not sure how to implement the unique Iframe traversal for the Date and Podcast time duration custom added fields I added that Octoparse's Relative XPath settings are looking for. I even tried to install the SelectorsHub Chrome browser extension but it didn't pull up the nested SelectorHub to query the Xpath the way the SelectorHub Youtube video demonstrates - it only showed me the relative Xpath I already am showing below.
Please have a look at this site using Octoparse and see if it is possible. If so, how can I do it?
When Absolute Path is used - //div[#class="cfm-episodes-list"]/div[1]/div[2]/div[1]/iframe[1]
vs.
When Relative Path is used - //span[#class="cp-episode-date"]
There are plenty of iframes inside the webpage. I don't know if Octoparse could handle this. Choose another starting point.
For example, use Apple Podcast :
https://podcasts.apple.com/gb/podcast/the-website-coach/id1587503231
Dates could be recovered with the following XPath :
//div[#class="l-row"]//time[#class]/#aria-label
Other possibility, scrape the following page :
https://feeds.captivate.fm/the-website-coach/
Dates could be recovered with the following XPath :
//h4/text()
Even easier, get directly the data from this URL (.json file) :
https://itunes.apple.com/lookup?id=1587503231&media=podcast&entity=podcastEpisode&limit=100
I know there are similar articles like PDF file conversion in JMeter but they do not answer the actual problem which is "when converting a PDF to a variable/object/property then back to PDF the document whilst the correct number of pages is 'whit on white'= blank.
is there a way to :
Create a runtime variable/object/property from an existing PDF file that can be used in a subsequent action.
Here other actions happen in the Test Plan but they do not
Convert the variable/object/property back to a pdf so that when viewed it does not contain just blanks.
Notes: I do not just wish to just copy a to pdf to pdf.
I have also tried creating a UDV form the pdf using the following posted on here without success too.
${__groovy(vars.putObject("hoping_its_a_pdf"), new File("my_original.pdf"))}
Reading other posts here I have also noticed strange character strings like "%âãÏÓ" when using both putObect and props.put when viewing them post creation but as the article said, most probably page break characters or similar so I have ignored those for now as I assumed it is the conversion and not the reason for the blank content.
Can someone please assist as this is now 4 weeks in and I still have white pdf's.
We cannot state what's wrong with the code which you're copying and pasting from some random sources.
There is not problem with storing the PDF file in JMeter Variables or properties and creating the file back from them.
Demo:
There are 2 problems the only piece of "code" you're sharing:
Your way of using vars.putObject() function is wrong, it takes 2 parameters: variable name and the object value. See Top 8 JMeter Java Classes You Should Be Using with Groovy article for more information on this and other JMeter API shorthands
Apart from this the function itself is syntactically incorrect, you need to escape any comma in the function with a backslash
So if you change your:
${__groovy(vars.putObject("hoping_its_a_pdf"), new File("my_original.pdf"))}
to
${__groovy(vars.putObject('hoping_its_a_pdf'\, new File('my_original.pdf')),)}
at least this bit will start working as you expect.
Problem Summary:
Hi, I'm trying to learn to use the Scrapy Framework for python (available at https://scrapy.org). I'm following along with a tutorial I found here: https://www.scrapehero.com/scrape-alibaba-using-scrapy/, but I was going to use a different site for practice rather than just copy them on Alibaba. My goal is to get game data from https://www.mlb.com/scores.
So I need to use Xpath to tell the spider which parts of the html to scrape, (I'm about halfway down on that tutorial page on the scrapehero site, at the "Construct Xpath selectors for the product list" section). Problem is I'm having a hell of a time figuring out what syntax should actually be to get the pieces I want? I've been going over xpath examples all morning trying to figure out the right syntax but I haven't been able to get it.
Background info:
So what I want is- from https://www.mlb.com/scores, I want an xpath() command which will return an array with all the games displayed.
Following along with the tutorial, what I understand about how to do this is I'd want to inspect the elements from the webpage, determine their class/id, and specific that in the xpath command.
I've tried a lot of variations to get the data but all are returning empty arrays.
I don't really have any training in XPath so I'm not sure if my syntax is just off somewhere or what, but I'd really appreciate any help on getting this command to return the objects I'm looking for. Thanks for taking the time to read this.
Code:
Here are some of the attempts that didn't work:
response.xpath("//div[#class='g5-component--mlb-scores__game-wrapper']")
response.xpath("//div[#class='g5-component]")
response.xpath("//li[#class='mlb-scores__list-item mlb-scores__list-item--game']")
response.xpath("//li[#class='mlb-scores__list-item']")
response.xpath("//div[#!data-game-pk-id > 0]")'
response.xpath("//div[contains(#class, 'g5-component')]")
Expected Results and Actual Results
I want an XPath command that returns an array containing a selector object for each game on the mlb.com/scores page.
So far I've been able to get generic returns that aren't actually what I want (I can get a selector that returns the whole page by just leaving out the predicates, but whenever I try to specify I end up with an empty array).
So for all my attempts I either get the wrong objects or an empty array.
You need to always check HTML source code (Ctrl+U in a browser) for the data you need. For MLB page you'll find that content you are want to parse is loaded dynamically using JavaScript.
You can try to use Scrapy-Splash to get target content from your start_urls or you can find direct HTTP request used to get information you want (using Network tab of Chrome Developer Tools) and parse JSON:
https://statsapi.mlb.com/api/v1/schedule?sportId=1,51&date=2019-06-26&gameTypes=E,S,R,A,F,D,L,W&hydrate=team(leaders(showOnPreview(leaderCategories=[homeRuns,runsBattedIn,battingAverage],statGroup=[pitching,hitting]))),linescore(matchup,runners),flags,liveLookin,review,broadcasts(all),decisions,person,probablePitcher,stats,homeRuns,previousPlay,game(content(media(featured,epg),summary),tickets),seriesStatus(useOverride=true)&useLatestGames=false&language=en&leagueId=103,104,420
I would want to get a structured version of a Wikiquote page via JSON (basically I need all phrases)
Example: http://en.wikiquote.org/wiki/Fight_Club_(film)
I tried with: http://en.wikiquote.org/w/api.php?format=xml&action=parse&page=Fight_Club_(film)&prop=text
but I get all HTML source code. I need each pharse as an element of an Array
How could I achieve that with DBPEDIA?
For one thing Iam not sure whether you can query wiki quotes using DBpedia and secondly, DBpedia gives you only info box data in a structured way, it does not in a any way the article content in a structured way. Instead with a little bit of trouble you can use the Media wiki api to get the data
EDIT
The URI you are trying gives you a text so this will make things easier, but not completely.
Try this piece of code in your console:
require 'Nokogiri'
content = JSON.parse(open("http://en.wikiquote.org/w/api.php?format=json&action=parse&page=Fight_Club_%28film%29&prop=text").read)
data = content['parse']['text']['*']
xpath_data = Nokogiri::HTML data
xpath_data.xpath("//ul/li").map{|data_node| data_node.text}
This is the closest I have come to an answer, of course this is not completely right because you will get a lot on unnecessary data. But if you dig into Nokogiri and xpath and find out how to pin point the nodes you need you can get a solution which will give you correct quotes at least 90% of the time.
Just change the format to JSON. Look up the Wikipedia API for more details.
http://en.wikiquote.org/w/api.php?format=json&action=parse&page=Fight_Club_(film)&prop=text
I am using JMeter for a real estate application when I am selecting a plot it is generating a dynamic value like this 1305003402565. It is incrementing like this 1305003280751 per request to request I need to capture this value and I am not able to find it in the source code.
You may be able to force your application to show the dynamic value in the source code by requesting the page as a GET (instead of POST). Then, using Tree View, copy the source into your favorite regular expression extractor to write your regex to extract the value.