Browserless web scraping of an AJAX page

I have tried using Selenium after reading a tutorial on web scraping.
The aim is to scrape a page that loads the required data through an Ajax call made after the initial page load.
The second aim is to run the Selenium code in the background (without opening a browser window) so that the page loads (including the Ajax call), then retrieve the final HTML and process it locally.
The code so far is as follows (based on the tutorial at http://www.geekonweb.com/net/web-page-scraping-using-selenium-and-net/):
public ActionResult Index()
{
    // The path below should contain chromedriver.exe
    var chrome = new ChromeDriver(@"file path");
    // Setting Url loads the page, so no separate Navigate() call is needed
    chrome.Url = @"<url>";
    // Extract the HTML
    // var retval = chrome.ExecuteScript("return document.body.outerHTML");
    string result = chrome.PageSource;
    return View();
}
So far I have not been able to find a way to run Selenium silently (without a GUI). Kindly advise if that can be done.
Secondly, how can Selenium be told to wait for the Ajax call to finish before retrieving the data?
Regards,

Here is a question on how to wait until an element is present. Waiting for an element that the AJAX call creates is a way to wait for the call itself.
Here is a question on whether it's possible to run Selenium headless.
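The idea behind "wait until an element is present" is a polling loop with a timeout. A minimal sketch of that loop in plain JavaScript follows; `waitUntil`, its parameters, and its defaults are hypothetical names chosen for illustration and are not part of Selenium's API.

```javascript
// Hypothetical helper: polls `condition` until it returns a truthy value
// or `timeoutMs` elapses. Names and defaults are illustrative assumptions.
function waitUntil(condition, timeoutMs = 10000, intervalMs = 250) {
    return new Promise((resolve, reject) => {
        const deadline = Date.now() + timeoutMs;
        const check = async () => {
            try {
                const value = await condition();
                if (value) {
                    resolve(value);
                    return;
                }
            } catch (err) {
                // Treat errors (e.g. "element not found") as "not ready yet".
            }
            if (Date.now() >= deadline) {
                reject(new Error("Timed out waiting for condition"));
                return;
            }
            // Not ready and not timed out: try again after a short pause.
            setTimeout(check, intervalMs);
        };
        check();
    });
}
```

The condition passed in would be whatever lookup tells you the AJAX content has arrived, for example a check that a specific element now exists in the page.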

Related

How to call an external API in the TruClient protocol of LoadRunner

I am recording a script using the TruClient protocol. In my script, I need to call an external API that generates the password. The password is extracted using correlation and used as input for the login.
However, I am unable to call the external API using the TruClient protocol.
Could anybody please suggest how to call an external API in the TruClient protocol?
Have you tried the Evaluate JavaScript step? You can post the message to the server and get the generated password at runtime. XHR and the fetch API should be supported in Chrome and Firefox; TCIE should support XHR.
Sure. Here are the detailed steps:
Drag and drop an Evaluate JavaScript step from TruClient
Open the script editor
Add this code. Make sure to use a synchronous XHR so that the password is returned before the step ends:
var xhr = new XMLHttpRequest();
// The third argument (false) makes the request synchronous
xhr.open("POST", '/server', false);
// Send the proper header information along with the request
xhr.setRequestHeader("xxx", "value");
xhr.send();
if (xhr.status === 200) {
    // Request finished. Do processing here.
}
var password = xhr.response;
Change the login password step from plain text to JS and use
ArgsContext.password
to reference the previously received password.
If you have any other questions, please let me know. For how to use the argument context, you can refer to this link.
By the way, the window and document objects of the page can be referenced as AUT.window and AUT.document in TruClient.
Please check the help document here.

How to edit the URL in the current browser using WatiN

I need to navigate to a new URL from the currently open browser using WatiN code. Let me know if anyone has tried the same scenario. I also need to get the URL of the currently open browser.
using (IE browser = new IE())
{
    browser.GoTo("www.google.co.uk");
    string currentUrl = browser.Url;
}
If the browser is already open, use the static AttachTo method:
http://watinandmore.blogspot.com/2010/01/browserattachto-and-iattachto.html
HTH!

Monitor file change through AJAX, how?

I'm looking for a way to monitor a file for changes in real time through AJAX (not via a JS framework!). If changes were made to that file, I need it to show an alert message. I'm a total AJAX noob, so please be gentle. ;-)
Edit: let me explain the purpose in a bit more detail. I'm using a chat script I've written in PHP for a webshop, and I want to monitor the chat requests from an admin module. The chats are stored in text files, and when someone starts a chat session a new file is created. When that happens, I want to see it in the admin module in real time.
Does that make sense?
To monitor a file for changes with AJAX you could do something like this.
var previous = "";
setInterval(function() {
    var ajax = new XMLHttpRequest();
    ajax.onreadystatechange = function() {
        if (ajax.readyState == 4) {
            if (ajax.responseText != previous) {
                alert("file changed!");
                previous = ajax.responseText;
            }
        }
    };
    ajax.open("POST", "foo.txt", true); // Use POST to avoid caching
    ajax.send();
}, 1000);
I just tested it, and it works pretty well, but I still maintain that AJAX is not the way to go here. Comparing file contents will be slow for big files. Also, you mentioned no framework, but you should use one for AJAX, if only to handle the cross-browser inconsistencies.
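One detail of the snippet above is that the very first response differs from the empty initial value, so it raises an alert even though nothing has changed yet. A minimal sketch of a comparison that treats the first response as a baseline is shown here; `createChangeDetector` and `hasChanged` are hypothetical names introduced only for illustration.

```javascript
// Hypothetical helper: remembers the last content seen and reports whether
// a new response differs from it. The first response only sets the baseline.
function createChangeDetector() {
    let previous = null;
    return function hasChanged(content) {
        const changed = previous !== null && content !== previous;
        previous = content;
        return changed;
    };
}

// Possible use inside the polling callback:
//   var hasChanged = createChangeDetector();
//   if (hasChanged(ajax.responseText)) { alert("file changed!"); }
```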
AJAX is just JavaScript running in the browser, so by definition it has no way to access the file directly. It can only learn about a change if some other service on the server answers an AJAX request or otherwise notifies the page.
I've done that from scratch recently.
I don't know how much of a noob you are with PHP (it's the only server-side scripting language I know), but I'll try to be as brief as possible. Feel free to ask if anything is unclear.
I'm using long polling, which works like this:
Create a PHP script that checks the content of the file periodically and only responds when it sees a change (it could include a description of the change in the response)
Create your XHR object
Include your notification code as a callback function (it can use the description)
Make the request
The PHP script will start checking the file, but won't reply until there is a change
When it responds, the callback will be called and your notification code will launch
If you don't care about the content of the file, only that it has been changed, you can check the last-modified time instead of the content in the PHP script.
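The client side of the steps above can be sketched roughly as follows. This is only an illustration under assumptions: `requestChange` is a hypothetical function that returns a Promise resolving with the server's description of the change (for example, a wrapper around an XHR to the PHP script), and `keepGoing` is a hypothetical hook that lets the caller stop the loop.

```javascript
// Sketch of a long-polling client loop. `requestChange`, `onChange`, and
// `keepGoing` are illustrative names, not part of any library.
async function longPoll(requestChange, onChange, keepGoing = () => true) {
    while (keepGoing()) {
        try {
            // The server holds this request open until the file changes.
            const description = await requestChange();
            // Run the notification code with the server's description.
            onChange(description);
        } catch (err) {
            // On a timeout or network error, pause briefly before retrying.
            await new Promise((resolve) => setTimeout(resolve, 1000));
        }
    }
}
```

Each response triggers the callback and the loop immediately issues the next request, so the page is always waiting on exactly one open connection.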
EDIT: from a comment I see there is something called FAM for monitoring file changes, which seems to be the way to go for the PHP script.

Chrome extension AJAX/XHR request handling

A couple of questions about implementing an XHR request:
I am trying to make a form in popup.html that lets the user fill a box with text, submits that text to a remote website using GET, and replaces the contents of the box with the JSON returned by the PHP script.
Here's the code so far:
Any idea why nothing happens when I click submit?
Here are the manifest permissions as well:
    "permissions": [
        "https://*/",
        "https://*/*"
    ]
}
Forms don't need permissions at all to do a cross-domain post (in theory). That being said, popup.html never reloads in a browser action (or page action) when a form is submitted.
An easy approach is to capture onsubmit on the form and send an XMLHttpRequest with the form attached, as described on the Mozilla MDC site.
In summary (copied from Mozilla):
var formElement = document.getElementById("myFormElement");
var xhr = new XMLHttpRequest();
xhr.open("POST", "submitform.php");
xhr.send(new FormData(formElement));
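Since the question asks for GET rather than POST, the field values would go into the query string instead of the request body. A minimal sketch of building such a URL is shown here; `buildGetUrl`, the endpoint, and the field name are placeholders invented for illustration.

```javascript
// Hypothetical helper: appends form fields to a base URL as a query string.
function buildGetUrl(baseUrl, fields) {
    const url = new URL(baseUrl);
    for (const [name, value] of Object.entries(fields)) {
        // searchParams takes care of encoding spaces and special characters.
        url.searchParams.append(name, value);
    }
    return url.toString();
}

// Example with a placeholder endpoint and field name:
//   buildGetUrl("https://example.com/lookup.php", { q: "hello world" })
```

The resulting string can then be passed to xhr.open("GET", url), followed by xhr.send() with no body.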

jQuery load() not working in Internet Explorer

I am trying to use the jQuery load() function to get content from another page via AJAX. It works in Firefox and Google Chrome, but not in Internet Explorer 7 and 8.
Here is the page I am developing: http://139.82.74.22/70anos/no-tempo
All the jQuery code is working normally in Internet Explorer, but the specific part that should bring the destination page isn't. To understand the problem, one must click the "Há 80 anos" or "Há 70 anos" block and click any of the links inside it. It should open a panel underneath the timeline with the content of the block.
Here is the code that pulls the external content:
jQuery('a.link-evento').click(function() {
    var strUrl = jQuery(this).attr('href');
    var objBlocoConteudo = jQuery(this).parents('div.view-content').next().find('div.conteudo-evento');
    objBlocoConteudo.css('display','block').animate({ opacity: 1}, {duration: 350}).load(strUrl + ' #area-conteudo-evento');
    return false;
});
With this code I am grabbing the URL of the destination page and telling the browser not to make a normal request, but to load it using the jQuery load() function.
Any help fixing this in IE is appreciated. Thank you.
I'm pretty sure AJAX requests have to be made to a domain name in IE as a security precaution. If you map a domain to your 139.82.74.22 address, your problem should go away.
You can't call .load("http://139.82.74.22/.."); it would have to be .load("http://mysite.com/mypage").

Resources