I'm using HtmlUnit to parse an HTML page that contains JavaScript. The structure of the page (as shown in Chrome Developer Tools) is:
And my code is as follows:
WebClient wc=new WebClient(BrowserVersion.INTERNET_EXPLORER_11);
wc.getOptions().setUseInsecureSSL(true);
wc.getOptions().setJavaScriptEnabled(true);
wc.getOptions().setCssEnabled(false);
wc.getOptions().setThrowExceptionOnScriptError(false);
wc.getOptions().setTimeout(10000);
wc.getOptions().setDoNotTrackEnabled(false);
HtmlPage page= wc.getPage(address);
List<HtmlDivision> items=(List<HtmlDivision>)page.getByXPath(
    "/html/body/div[@id='wrapper']/div[@class='content_main']/div[@class='search_result']/div[@id='resultData']");
System.out.println(items);
if(items!=null && items.size()>0){
    HtmlDivision resultMain=items.get(0);
    List<HtmlDivision> appDivList=(List<HtmlDivision>)resultMain.getByXPath(".//div[contains(@class,'search_one')]");
    System.out.println(appDivList);
    for(HtmlDivision resultItem:appDivList){
        try{
            DomElement appImgInfo=resultItem.getFirstElementChild();
            List<HtmlDivision> appInfoList=(List<HtmlDivision>)resultItem.getByXPath("./div[@class='one_right']");
            String appName=null;
The problem is that when I debug this code, it works fine. But when I run it normally, the line
List<HtmlDivision> appDivList=(List<HtmlDivision>)resultMain.getByXPath(".//div[contains(@class,'search_one')]");
doesn't work: appDivList is empty. When I step through in the debugger, appDivList is not empty.
Does anyone know why?
Update:
I added a Thread.sleep call before
List<HtmlDivision> appDivList=(List<HtmlDivision>)resultMain.getByXPath(".//div[contains(@class,'search_one')]");
The updated code is:
HtmlDivision resultMain=items.get(0);
try{
    Thread.sleep(10000);
}catch(Exception e){}
List<HtmlDivision> appDivList=(List<HtmlDivision>)resultMain.getByXPath(".//div[contains(@class,'search_one')]");
System.out.println(appDivList);
It works! How does this happen?
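A likely explanation, though nothing in the thread confirms it, is that the page fills in resultData from JavaScript that is still running when getPage() returns, so a debugger pause or a sleep simply gives those scripts time to finish. Instead of a fixed sleep, one option is to poll until the elements show up. The sketch below is plain Java with no HtmlUnit dependency; Polling and waitUntil are hypothetical names, and the real condition would be something like () -> !resultMain.getByXPath(...).isEmpty(). HtmlUnit's own WebClient.waitForBackgroundJavaScript(long timeoutMillis) may be the simpler choice.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit: polls a condition instead of sleeping a fixed time.
final class Polling {
    private Polling() {}

    /**
     * Re-checks the condition every pollMillis until it holds or timeoutMillis elapses.
     * Returns true if the condition held, false on timeout or if the thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "the search_one divs have been added to the page".
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2000, 50);
        System.out.println("ready = " + ready);
    }
}
```

Compared with Thread.sleep(10000), this returns as soon as the condition holds and still gives up after a bounded time.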
When testing with HtmlUnit, I've come across a case where it would be useful NOT to execute the JavaScript automatically on page load, and instead have it wait until I tell it to start executing.
My specific use case is testing a page whose JavaScript runs some checks and then does a location replace to send the user on to another page. I want to check some headers that I'm returning for testing/validation, and then let the JS execute as usual.
My current thought is to pass a flag to the page when testing that stops the JS from running automatically, so that it waits until I call a JS function from the Java code via webClient.getJavaScriptEngine().execute().
While you can't specifically pause JavaScript before it is invoked, it may be worthwhile to use the WebConnectionWrapper class to inspect or modify the outgoing requests and the response data, which effectively gives you a chance to execute your own code before the JavaScript is invoked.
An example usage of this is as follows:
try (final WebClient webClient = new WebClient()) {
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    // set more options
    // create a WebConnectionWrapper with a (subclassed) getResponse() impl
    new WebConnectionWrapper(webClient) {
        public WebResponse getResponse(WebRequest request) throws IOException {
            WebResponse response = super.getResponse(request);
            if (request.getUrl().toExternalForm().contains("my_url")) {
                String content = response.getContentAsString();
                // intercept and/or change content
                WebResponseData data = new WebResponseData(content.getBytes(),
                        response.getStatusCode(), response.getStatusMessage(), response.getResponseHeaders());
                response = new WebResponse(data, request, response.getLoadTime());
            }
            return response;
        }
    };
    // use the client as usual
    HtmlPage page = webClient.getPage(uri);
}
The above code is from the official documentation here:
How to modify the outgoing request or incoming response?
The getResponse() method that you override is called for each request, so you can change the outgoing WebRequest before passing it to super.getResponse(). It also lets you modify the WebResponse object that is passed back to WebClient for further processing.
Sorry, but at the moment (version 2.43.0) we have no such option. Feel free to open an issue on GitHub for this.
I guess other test tools might also benefit from this function.
Pretty basic test:
[TestClass]
public class ApiClientTest
{
    private RestClient _client;
    [TestInitialize()]
    public virtual void TestInitialize()
    {
        _client = new RestClient("http://localhost:24144");
        _client.CookieContainer = new System.Net.CookieContainer();
    }
    [TestMethod]
    public void ApiClientTestCRUD()
    {
        // 1. Log out twice. Verify Unauthorized.
        var response = LogOut();
        response = LogOut();
        Assert.AreEqual(response.StatusCode, HttpStatusCode.Unauthorized);
        // Error here:
Result Message: Assert.AreEqual failed. Expected:<0>.
Actual:< Unauthorized >.
I get <0>, which isn't even something that my WebAPI returns.
I think the issue is with my use of RestSharp, because if I debug one time it passes, and then subsequent runs pass. Any clue what's going on?
To be clear - this occurs when I open up my solution and attempt to run the test for the first time. I can fix it by debugging once, watching it pass, and then running without debugging as much as I want. I can reproduce this by closing VS and opening up the solution again - and running the test without debugging first.
Here's the LogOut method in my WebAPI:
[Authorize]
public HttpResponseMessage LogOut()
{
    try
    {
        if (User.Identity.IsAuthenticated)
        {
            WebSecurity.Logout();
            return Request.CreateResponse(HttpStatusCode.OK, "logged out successfully.");
        }
        return Request.CreateResponse(HttpStatusCode.Conflict, "already done.");
    }
    catch (Exception e)
    {
        return Request.CreateResponse(HttpStatusCode.InternalServerError, e);
    }
}
UPDATE:
I ended up running the tests with Trace.WriteLine:
// 1. Log out twice. Verify Unauthorized.
Trace.WriteLine("ENTERING FIRST LOGOUT");
var response = LogOut();
Trace.WriteLine("Content: " + response.Content);
Trace.WriteLine("ErrorMessage: " + response.ErrorMessage);
Trace.WriteLine("ResponseStatus: " + response.ResponseStatus);
Trace.WriteLine("StatusCode: " + response.StatusCode);
Trace.WriteLine("StatusDescription: " + response.StatusDescription);
response = LogOut();
Trace.WriteLine("COMPLETED LOGOUTS");
Assert.AreEqual(response.StatusCode, HttpStatusCode.Unauthorized);
And I found the following:
ENTERING FIRST LOGOUT
Content:
ErrorMessage: Unable to connect to the remote server
ResponseStatus: Error
StatusCode: 0
StatusDescription:
COMPLETED LOGOUTS
My solution has a test project with this RestSharp test, and a WebAPI project that's supposed to be accepting these requests. If I debug, the RestClient connects. If not, it times out. Any tips?
When the debugger can't help you solve the problem, fall back on the old-fashioned way.
Add Trace.WriteLine calls (or even append text to a C:\temp.txt file).
Write a string before every return in the LogOut method, then try writing some more information (for the last return, write the exception message; for the second return, write the identity information).
Hope this helps.
How are you hosting the server? I see that you're using port 24144. Maybe in debug mode you're running the IIS Express web server on that port, but in non-debug mode it isn't running?
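One way to test the port theory without attaching the debugger is to check whether anything is listening on the port before the test runs. The thread's code is C#; the sketch below uses Java only to illustrate the check, and PortCheck and isListening are hypothetical names. It attempts a TCP connection and reports whether it succeeded.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: reports whether a TCP server accepts connections on host:port.
final class PortCheck {
    private PortCheck() {}

    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;  // connection accepted, something is listening
        } catch (IOException e) {
            return false; // refused, unreachable, or timed out
        }
    }

    public static void main(String[] args) {
        // 24144 is the port from the question; the result depends on what is running locally.
        System.out.println("listening on 24144: " + isListening("localhost", 24144, 1000));
    }
}
```

If this reports false when the test is run without debugging, the Web API project is probably not being started in that mode, which would match the "Unable to connect to the remote server" trace output.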
I started learning AJAX recently and am trying a very simple project that involves capturing some form data and sending it to two servers.
The first server hosts the website and the server-side PHP handling. This works fine.
The second server is a basic Python HTTP server that handles only the POST request sent from AJAX. This works, but behaves a bit strangely.
Let me explain.
Here is my AJAX code, which is absolutely straightforward.
function xml_http_post(url, data) {
    var req = false;
    try {
        // Firefox, Opera 8.0+, Safari
        req = new XMLHttpRequest();
    }
    catch (e) {
        // Internet Explorer
        try {
            req = new ActiveXObject("Msxml2.XMLHTTP");
        }
        catch (e) {
            try {
                req = new ActiveXObject("Microsoft.XMLHTTP");
            }
            catch (e) {
                alert("Your browser does not support AJAX!");
                return false;
            }
        }
    }
    req.onreadystatechange = function() {
        if (req.readyState == 4) {
            // callback(req);
        }
    }
    req.open("POST", url, true);
    req.setRequestHeader("Content-type","text/plain");
    req.send(data);
}
Since I do not intend to send back any response, my callback function on ready state change is empty.
But when I execute this code (triggered by onclick on a button), the POST doesn't work and the server doesn't seem to receive anything.
The most surprising thing is that if I set a breakpoint at req.open() and then step through manually, it always works. So I guess there is some timing issue that needs to be resolved.
It works fine without breakpoints if the third parameter, "async", is set to false, but that is undesirable, so I want to make it work with async = true.
Any help would be greatly appreciated.
Thanks
Shyam
As I figured out, the form page was being unloaded by a PHP script that the first server invoked as the form's action. As a result, the JavaScript code was executed only partially or not at all.
So I concluded that synchronous XHR is the only way for me.
I'm trying to download a regular JSON string from this URL, https://valueboxtest.lb.dk/mobile/categories, in a Windows Phone 7 application.
I have tried both WebClient and HttpWebRequest. They both throw an exception:
“The remote server returned an error: NotFound”
This is the code using WebClient:
var webClient = new WebClient();
webClient.DownloadStringCompleted += (client_DownloadStringCompleted);
webClient.DownloadStringAsync(new Uri("https://valueboxtest.lb.dk/mobile/categories"));
The event handler then just shows the content, but e.Result throws the above-mentioned exception:
void client_DownloadStringCompleted(object sender, DownloadStringCompletedEventArgs e)
{
    if (e.Error == null && !e.Cancelled) MessageBox.Show(e.Result);
}
For the HttpWebRequest my code looks as follows:
var httpReq = (HttpWebRequest)WebRequest.Create(new Uri("https://valueboxtest.lb.dk/mobile/categories"));
httpReq.BeginGetResponse(HTTPWebRequestCallBack, httpReq);
With the following callback:
private void HTTPWebRequestCallBack(IAsyncResult result)
{
    var httpRequest = (HttpWebRequest)result.AsyncState;
    var response = httpRequest.EndGetResponse(result);
    var stream = response.GetResponseStream();
    var reader = new StreamReader(stream);
    this.Dispatcher.BeginInvoke(
        new delegateUpdate(update),
        new Object[] { reader.ReadToEnd() }
    );
}
And with the delegate method:
delegate void delegateUpdate(string content);
private void update(string content)
{
    MessageBox.Show(content);
}
Running it in a console application
Everything works just fine and the JSON string is returned with no problems and I am able to print the result to the console.
Different URL does work on WP7
The weird thing is that the URL http://mobiforge.com/rssfeed actually works fine in both of the above-mentioned scenarios.
This issue occurs both in the emulator and on an actual device.
What could be wrong? Is the REST service returning the data in a misbehaving way? I really hope you can help me!
Note: I'm not running Fiddler2 at the same time!
The reason is that the site does not have a valid certificate. Just try it in Mobile Internet Explorer and you'll get a prompt about an issue with the certificate.
How to ignore SSL certificates
Mobile devices are stricter when it comes to SSL certificates.
If you want to get this app into a production environment, you'll need to either write a wrapper for this server (if it's not your own) or get a valid certificate. In the short term, for testing, you can add the certificate to your device.
Here's a tool which might help you install a certificate.
I have the following problem. I need to create a JS widget and put it on a blog, for example any blog on blogger.com. There you can add a box for JavaScript, and I will paste the JS into this box.
The problem I don't know how to solve is that the script should do AJAX polling for, say, 60 seconds. But how do I execute an AJAX call when the target host is not the same as the host where the JS is included?
The easiest way to explain it is with an example: there is a search box, and when anyone searches for anything, the script should stream the results for 60 seconds from the server I have set in the script, which is different from the host where the JS is included, without running into the JS same-origin restriction.
Another example is a chat client, where the client is hosted on one host and the server on another.
Can anyone give me an idea or send me an example of how to do this?
Thanks
Nik
Well, is it possible with this example, but without JSONP?
function asyncreq(url) {
    var xmlhttp = false;
    try {
        xmlhttp = new XMLHttpRequest();
    } catch (trymicrosoft) {
        try {
            xmlhttp = new ActiveXObject("Msxml2.XMLHTTP");
        } catch (othermicrosoft) {
            try {
                xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
            } catch (failed) {
                xmlhttp = false;
            }
        }
    }
    if (xmlhttp){
        try {
            xmlhttp.open("GET", url);
            xmlhttp.onreadystatechange=function() {
                document.getElementById('mydiv').innerHTML = xmlhttp.responseText;
            }
            xmlhttp.send(null);
        }
        catch (failed) {
            xmlhttp = false;
        }
    }
}
If you send the response in chunks, then everything is fine. But this is again an AJAX call, and when I use it on a different host, I can't call the URL because of the same-origin policy.
Is there another way?
I found a very interesting example here.
Take a look at the bottom; there is a job search box. If you investigate a little, you will see that it uses a class RSL() to make the request. How does this class make the request without AJAX? I can't understand how this class works. Can anyone show me a better example?
There are two main options:
Put an iframe where you want the widget to go. Its src URL would be on the same server that will receive the AJAX call.
Use JSONP, which consists of inserting a script tag into the page to bypass the same-origin policy. This requires the AJAX server to wrap its JSON output in ?(...), where ? is the callback name passed in the URL as callback=?. Then, as soon as a response has been received, start another request.
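On the server side, the JSONP wrapping described above could be sketched roughly as follows. This is plain Java with hypothetical names (Jsonp, wrap); it assumes the callback name arrives in a callback query parameter and that the JSON payload is already serialized. Validating the callback name is an added precaution that the thread does not mention.

```java
import java.util.regex.Pattern;

// Hypothetical helper: turns a JSON payload into a JSONP response body.
final class Jsonp {
    // Accept only plain (optionally dotted) JavaScript identifiers as callback names.
    private static final Pattern SAFE_CALLBACK =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*(\\.[A-Za-z_$][A-Za-z0-9_$]*)*");

    private Jsonp() {}

    /** Wraps already-serialized JSON in a call to the client-supplied callback. */
    static String wrap(String callback, String json) {
        if (callback == null || !SAFE_CALLBACK.matcher(callback).matches()) {
            throw new IllegalArgumentException("unsafe callback name: " + callback);
        }
        return callback + "(" + json + ");";
    }

    public static void main(String[] args) {
        // "handleResults" stands in for the value of the callback query parameter.
        System.out.println(wrap("handleResults", "{\"jobs\":[\"a\",\"b\"]}"));
        // prints: handleResults({"jobs":["a","b"]});
    }
}
```

The widget would load this response through a script tag whose src includes callback=handleResults, and the server would send the body with a JavaScript content type so the browser executes it.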