HtmlAgilityPack get AJAX value

I am trying to get the value of a timer using HtmlAgilityPack. However, when I get the InnerText of the element by its ID, it returns --:--:--
Is there any way to get the time value, given that the page fills it in using AJAX?

When you load an HtmlAgilityPack.HtmlDocument from the web, it makes a single HTTP request to the website, and what you receive is plain text. No AJAX/JavaScript is executed and no images are loaded. You don't notice this in a browser, because after the browser receives the response it goes on to load images and animations and to execute the JavaScript code. HtmlAgilityPack, however, uses HttpWebRequest to get the source, and it doesn't execute any JavaScript.
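To illustrate the point, here is a minimal Python sketch (Python rather than C#, purely for brevity) showing that a parser working on the raw HTML sees only the placeholder text. The markup, the element id "timer", and the names TextById and inner_text are made up for the example; the real page will differ.

```python
from html.parser import HTMLParser

# Hypothetical page source as the server sends it: the timer starts as a
# placeholder and is only filled in later by JavaScript running in a browser.
RAW_HTML = """
<html><body>
  <span id="timer">--:--:--</span>
  <script>
    // A browser would run this; an HTML parser never does.
    document.getElementById("timer").textContent = "00:12:34";
  </script>
</body></html>
"""


class TextById(HTMLParser):
    """Collects the text inside elements that carry the given id.

    Deliberately simple: it does not special-case void elements such as <br>.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # > 0 while we are inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def inner_text(html, element_id):
    parser = TextById(element_id)
    parser.feed(html)
    return "".join(parser.chunks).strip()


# Prints the placeholder, because the script block is never executed.
print(inner_text(RAW_HTML, "timer"))
```

The script block is treated as inert text, which is the same situation HtmlAgilityPack is in after downloading the page source.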
I suggest downloading a really brilliant tool for inspecting HTTP traffic: Fiddler2. It lets you set breakpoints and see exactly what the response looks like, and you will see that those things are handled by the browser itself.
I don't really know what the purpose of getting the value of an AJAX timer is, but I think you could use WatiN to load the page in Internet Explorer (hiding it, because otherwise an IE window will appear on screen while the page loads). At the moment you need the timer value, get the source from WatiN and load it into an HtmlDocument using LoadHtml(string html).

Related

Find navigation/redirect request with DevTools after button click that executes javascript/ajax

The question is probably easily misunderstood, so I'll go into more detail:
I am trying to automate a task in a certain (very outdated) browser-based idle game written in PHP, in order to polish my portfolio with somewhat more varied projects.
I used DevTools to reverse most of the requests and wrote a small C# Request wrapper to test them. I can get most of the actions I want to work, using the respective ajax get requests and the correct cookies/headers - not really part of the problem.
Example:
Attacking an enemy:
https://somebrowsergame.com/game/ajax.php?mod=location&submod=attack&location=3&stage=2&premium=0&sh=****mysessionhash****
Making a GET request to this URI with the correct headers and cookies, I can perform the in-game action programmatically and successfully from my C# console application and see that the fight has taken place when visiting the site in the browser.
The problem:
When monitoring all requests after clicking the "attack" button, via DevTools, even with preserve logs enabled, I don't see any redirects or way of determining how my browser gets told where to navigate to.
Findings
I found out that the button calls a JavaScript function attack() in its onClick event, and I tried debugging the JavaScript in DevTools to find out where something happens (such as setting window.location.href or something similar), but while debugging I ran into a seemingly infinite loop of setInterval and setTimeout handlers in the call stack.
I also cleared the Network tab after the onClick event (and after the ajax request which I could find during Debugging) but the only request/response I got was the document GET request for the final page, no request telling my browser which site to navigate to.
Monitoring requests
The request made to initiate the action (via button click on website or ajax GET request as outlined above)
The document response / site navigated to
What I want to know is how my browser was told which page to navigate to, as the request URI for the document request (getting the HTML of the target page) has a parameter generated on the server side (logId).
I have also used the "All" request type in DevTools, as well as negative filters when monitoring requests, but I was never able to see how my browser knows which page to navigate to. I tried source breakpoints on "beforeunload" and tried inspecting the JavaScript source connected to the onclick event of the button (which didn't give me anything, as the JS is minified and barely readable; I am not even sure whether the navigation is done via window.location.href). I also googled this question in all possible wordings, which led me nowhere.
I am not too versed in web development, but I am sure my browser has to be told where to navigate to in some fashion after clicking that button?

JMeter: JavaScript is not returning exact data in Response Data Sets

As far as I know, JMeter does not execute JavaScript. Is there an alternative way to extract data from a response that is produced by JavaScript (JMeter does not get the same response a browser shows) using the Regular Expression Extractor, and to inject it as a parameter into another HTTP Request?
Note: In the response page getting message as "JavaScript is required. This web browser does not support JavaScript or JavaScript in this web browser is not enabled."
I think you are looking at HTML view. As documentation states:
The HTML view attempts to render the response as HTML. The rendered HTML is likely to compare poorly to the view one would get in any web browser; however, it does provide a quick approximation that is helpful for initial result evaluation.
Images, style-sheets, etc. aren't downloaded.
In your case that view is not very helpful, since the page has a <noscript> tag, which ensures that you only see a message about missing JavaScript. So don't look at it; use Text mode instead, which gives you the actual page source.
Another confusion you seem to have is that JavaScript has some sort of "response data". It does not. JavaScript is a client-side technology, while JMeter works directly with HTTP requests and responses. So when the client issues a new HTTP request (which could be the result of JavaScript code, a user operation, or anything else), JMeter's representation of that request is always the same: an HTTP Sampler, which has some response data, which, as I said, is best viewed in Text mode.
So bottom line is: likely you have no problem with recording or playback of your script, you are just not checking it correctly.
If you send the same request as the browser, you should get the same response. If you are receiving only the error message about JavaScript not being enabled, your test is not working properly and doesn't mimic all the requests sent by a real browser with 100% accuracy (i.e. you are sending only the main request with JMeter, while the browser makes a few more AJAX requests which fetch data from the server and actually render the content).
It also means that your test does not make a lot of sense as each JMeter virtual user needs to represent real user using real browser as close as possible with all its stuff (cookies, headers, cache, think times, etc.)
So I would recommend the next steps:
Make sure you have correlation in place.
Make sure you are following recommendations from the How to make JMeter behave more like a real browser article.
Once done, compare the requests sent by the browser and by JMeter using a sniffer tool like Fiddler or Wireshark. The requests should be exactly the same (apart from dynamic data, which needs to be correlated). If there are inconsistencies or missing requests, you need to amend the JMeter configuration so that the JMeter requests exactly match the browser ones.
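The correlation step can be sketched outside JMeter. The following Python snippet mimics what a Regular Expression Extractor does: it pulls a dynamic value out of the raw response text (including text that sits inside a script block) and injects it as a parameter into the next request. The response body, the sessionToken variable name, the token parameter, and the example.com URL are all hypothetical.

```python
import re
from urllib.parse import urlencode

# Hypothetical raw response body, i.e. what the Text view would show.
RESPONSE_TEXT = """
<html><head>
<script>var sessionToken = "a1b2c3d4";</script>
<noscript>JavaScript is required.</noscript>
</head><body></body></html>
"""

# Capture group 1 holds the dynamic value, as in a Regular Expression Extractor.
TOKEN_PATTERN = re.compile(r'sessionToken\s*=\s*"([^"]+)"')


def extract_token(text):
    """Return the captured token, or None when the pattern does not match."""
    match = TOKEN_PATTERN.search(text)
    return match.group(1) if match else None


def next_request_url(base_url, token):
    """Inject the extracted value as a query parameter of the next request."""
    return base_url + "?" + urlencode({"token": token})


token = extract_token(RESPONSE_TEXT)
print(next_request_url("https://example.com/api/data", token))
```

The regular expression runs on the page source as plain text, so it does not matter that the value lives inside JavaScript; nothing needs to be executed to read it.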

Getting the JSON file when triggering AJAX

I'm writing a crawler to get the content from a website which uses AJAX.
There is a "show more" button at the bottom of the page. My original approach was to use Selenium with PhantomJS to simulate a web browser, but it works on some websites and not on others.
I'm wondering if there is some way I can directly get the underlying JSON file of the AJAX action. Please give me some details, thanks.
By the way, I'm using Python.
I understand this is less of a Python problem than a scraping problem in general (and I understand you meant "scraping" instead of "crawling", as a scraper reads/parses/processes one page whereas a crawler processes multiple pages and their relation to each other).
You can get the JSON file directly, provided you know its URL. If you don't (for example because the URL changes from time to time), you might need to search through the JavaScript files on the page manually to find out how the URL is generated.
Once you know the JSON file's URL, it's quite simple. As you already seem to know how to get the HTML of the "main" page, you can use your existing code to get the JSON file.
I'm not familiar with PhantomJS, but I reckon it's easier to get the JSON file immediately instead of simulating an AJAX request (if that's even possible with Phantom).
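As a rough sketch of requesting the JSON directly, assuming you have already found the endpoint (for example in the browser's network tab): the endpoint URL and the page parameter below are hypothetical, and a real site may also expect cookies or extra headers.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_page_url(endpoint, page):
    """Build the URL for one batch of results (the 'page' name is assumed)."""
    return endpoint + "?" + urlencode({"page": page})


def fetch_json(url, timeout=10):
    """Request the URL and decode the response body as JSON."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage against a hypothetical endpoint, left commented out on purpose:
# data = fetch_json(build_page_url("https://example.com/api/items", 2))
```

Each click on "show more" would then correspond to one more call with the next page value, without any browser simulation.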

Firefox strange iframe behavior

I have a site that normally embeds all content in an iframe. If you were to try to access the same content directly through the browser, we load the site framework and instead load that content in the iframe for you (this is all handled by referer determining if it's an internal or external request).
This works just fine in Google Chrome, but Firefox seems to refuse to request content in an iframe if it's the same as the parent window URL. Is this expected? I could imagine them doing this to prevent infinite loops, but I can't find it documented anywhere. The strange part is I can work around it by adding anything additional to the query string. Of course, I'd prefer not to have to do this.
And if this is expected behavior, is what I'm doing not such a good idea?
Using iFrames is in general not the hottest plan, but it may be justified. Firefox's behavior is to be expected, however. Your two options are:
1) When you detect a user loading an inner frame alone, redirect (via HTTP-HEADER) to the parent page and use a query string to tell that page what inner frame to load.
2) Do what you're doing now, and add a query string full of random data (&framebuster=231784783243253426543) to keep things nice and separate.
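Option 2 could look something like the sketch below, which appends a random framebuster value to the iframe URL while keeping any existing query parameters. The function name and the example URL are made up; only the framebuster parameter name comes from the suggestion above.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_frame_buster(url, param="framebuster"):
    """Return url with a fresh random value for the given query parameter."""
    parts = urlsplit(url)
    # Keep existing parameters, but drop any earlier buster value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, secrets.token_hex(8)))  # 16 random hex characters
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_frame_buster("https://example.com/page?id=7"))
```

Because the value changes on every call, the iframe URL never equals the parent window URL.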

Iframe vs normal / ajax get request

I have a page that gathers environment status from a couple of IBM WebSphere servers using iframes similar to this:
<iframe src="http://server:9060/ibm/console/status?text=true&type=server&node=NODE&name=ServerName_server_NODE"></iframe>
and it happily prints out "Started" or "Unavailable" etc. But if I load the same URL directly in a browser, sometimes it works and sometimes it does not. Some of the servers show a login page, while others simply return HTTP code 500.
So what's the difference between loading the page through an iframe and loading it directly in a browser?
I can tell you that the iframe solution works no matter which machine I use, so I do not believe it has anything to do with the user who is opening the page. And before you ask why I don't keep the solution that works: it's because the page with the iframes takes a long time to open compared with a page where everything is requested through AJAX.
Update: Using jQuery to perform the ajax call returns "error" and "undefined" for the servers that I can't see in a normal browser.
One difference is an iframe has to render the view while XHR would not.
An iframe is essentially the same as opening with the browser. In both cases the browsers credentials are used, so there will be no difference between the two.
Secondly, loading something in an iframe should take the same amount of time as requesting it through XHR, since in both cases the browser makes an HTTP request and waits for the response. Although I should add that an iframe will take time to render the content onto the page. However if you plan on displaying it with ajax anyways, an iframe/xhr solution will be more or less the same.
With an AJAX request, the same-origin policy (which restricts cross-domain calls) comes into the picture, so you can't make a cross-domain call using XHR unless the target server explicitly allows it. An alternative is to embed a Flex SWF file in your page as an ActiveX control and call into Flex from JavaScript; Flex is then responsible for making the cross-domain call (which it can do if the target domain allows it via crossdomain.xml) and hands the result back to JavaScript for rendering.
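For reference, two URLs share an origin only when scheme, host, and port all match. The sketch below shows that comparison in Python; it is a simplified illustration, not the browser's actual implementation, and the helper names are made up.

```python
from urllib.parse import urlsplit

# Ports implied by the scheme when the URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname or "", port)


def same_origin(url_a, url_b):
    return origin(url_a) == origin(url_b)


# A page served from port 80 calling the console on port 9060 is cross-origin.
print(same_origin("http://server/page.html",
                  "http://server:9060/ibm/console/status"))
```

So a status page on a different port from the WebSphere console counts as a different origin, even when the host name is identical.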
