PyQt4: get page info including info that is generated by AJAX

I want to use PyQt4 to get all of the information on a page, including the parts generated by AJAX requests. I tried a few examples, but none of them achieved my goal.
For example, on this Amazon page the "126 new" item is generated by an AJAX request. I want to get the whole page, including the AJAX-generated items, through one request, with all of the AJAX requests carried out automatically.
Here is a simple code sample; it couldn't get the AJAX-generated items:
#!/usr/bin/env python
#coding: utf-8
import sys
from PyQt4.QtCore import QUrl, SIGNAL
from PyQt4.QtGui import QApplication
from PyQt4.QtWebKit import QWebView
app = QApplication(sys.argv)
web = QWebView()
web.load(QUrl('http://www.amazon.com/gp/product/B0042FV2SI/ref=s9_pop_gw_g107_ir01?pf_rd_m=ATVPDKIKX0DER&pf_rd_s=center-2&pf_rd_r=1MKC5JJV07SG4WG5PJXY&pf_rd_t=101&pf_rd_p=1263340922&pf_rd_i=507846'))
web.show()
sys.exit(app.exec_())
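A common approach with QtWebKit is to connect to the loadFinished signal and read the rendered DOM once the page's JavaScript has run, waiting a little longer for late AJAX responses. Below is a minimal sketch of that pattern; the Render helper class and the fixed 3-second wait are illustrative assumptions, not code from the question:
#!/usr/bin/env python
#coding: utf-8
import sys
from PyQt4.QtCore import QUrl, QTimer
from PyQt4.QtGui import QApplication
from PyQt4.QtWebKit import QWebView

class Render(QWebView):
    """Load a URL, let QtWebKit execute its JavaScript, then grab the DOM."""
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebView.__init__(self)
        self.loadFinished.connect(self._load_finished)
        self.load(QUrl(url))
        self.app.exec_()

    def _load_finished(self, ok):
        # loadFinished fires when the main document is done, but AJAX
        # requests may still be in flight; the 3-second grace period
        # is an assumption -- tune it for the target page.
        QTimer.singleShot(3000, self._grab)

    def _grab(self):
        # toHtml() returns the rendered DOM, including AJAX-inserted nodes
        self.html = unicode(self.page().mainFrame().toHtml())
        self.app.quit()

# Shortened form of the Amazon product URL from the question
r = Render('http://www.amazon.com/gp/product/B0042FV2SI')
print(r.html.encode('utf-8'))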

Related

What is the keep_alive library and will it work?

I saw a question about a Discord bot that used a library called "keep_alive". How can I import it into my bot so that it keeps the bot online?
The code below is the keep_alive code, which prevents your Repl from dying by creating a Flask server. You should use UptimeRobot to ping the server at regular intervals so that it keeps running for a long time without stopping.
from flask import Flask
from threading import Thread

app = Flask('')

@app.route('/')
def main():
    return "Your bot is alive!"

def run():
    # Bind to all interfaces so the hosting platform can reach the server
    app.run(host="0.0.0.0", port=8080)

def keep_alive():
    # Run Flask in a background thread so it doesn't block the bot
    server = Thread(target=run)
    server.start()
You should create a keep_alive.py file, paste the above code into it, and then call keep_alive.keep_alive() in your main file to make it work.
You can refer to this YouTube video to understand how to use it.
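For completeness, a minimal sketch of how the main bot file would use it; the discord.py 1.x-style client and the DISCORD_TOKEN environment variable are assumptions, not part of the original answer:
# main.py -- assumes discord.py 1.x and a token in DISCORD_TOKEN
import os
import discord
from keep_alive import keep_alive

client = discord.Client()

@client.event
async def on_ready():
    print("Logged in as {0}".format(client.user))

keep_alive()  # start the Flask web server first
client.run(os.environ["DISCORD_TOKEN"])  # then block on the bot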

JMeter - client-side for mobile in Safari

I have the Safari browser on Windows with a webdriver installed, and the following code in a JSR223 sampler for Safari:
import org.openqa.selenium.safari.SafariOptions;
import org.openqa.selenium.safari.SafariDriver;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;
import java.util.concurrent.TimeUnit;
Map<String, Object> mobileEmulation = new HashMap<>();
mobileEmulation.put("userAgent", "vars.get("userAgent")");
Map<String, Object> safariOptions = new HashMap<>();
safariOptions.put("mobileEmulation", mobileEmulation);
SafariOptions safari = new SafariOptions();
options.setExperimentalOption("mobileEmulation", mobileEmulation);
SafariDriver driver = new SafariDriver(options);
driver.get("url");
WebDriverWait wait = new WebDriverWait(driver, 20);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.xpath("xpath")));
driver.findElement(By.xpath("xpath")).click();
vars.putObject("driver", driver);
The error messages are the following:
Response code: 500
Response message: javax.script.ScriptException: org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
Script19.groovy: 2: unable to resolve class org.openqa.selenium.safari.SafariOptions
 @ line 2, column 1.
   import org.openqa.selenium.safari.SafariOptions;
   ^
Script19.groovy: 3: unable to resolve class org.openqa.selenium.safari.SafariDriver
 @ line 3, column 1.
   import org.openqa.selenium.safari.SafariDriver;
   ^
2 errors
Could you please help me find what I'm missing?
With regards to the error you're getting: it seems you don't have selenium-safari-driver in the JMeter classpath, so you need to download the appropriate .jar, drop it into the "lib" folder of your JMeter installation, and restart JMeter to pick the library up.
With regards to your "Safari browser on Win" stanza: are you absolutely sure this is something you really want to be doing? The Windows version of Safari reached end-of-life in 2012, and I don't think you should be investing in automated testing of an 8-year-old browser.
Going forward I think this line:
mobileEmulation.put("userAgent", "vars.get("userAgent")");
should look like:
mobileEmulation.put("userAgent", vars.get("userAgent"));
I also fail to see a valid use case for JMeter here. If you need to simulate different browsers during a performance test, you can simply send the relevant User-Agent header using an HTTP Header Manager and generate the load with normal JMeter HTTP Request samplers. And if you're just creating automated functional tests, you don't need JMeter at all.

Crawl A Web Page with Scrapy and Python 2.7

Link: http://content.time.com/time/covers/0,16641,19230303,00.html
[Screenshots of the cover-page HTML tag and the next-button tag appeared here.]
How do I get that image src into JSON and download the images? I want to scrape these two elements using Scrapy. Any help!
I need to write a method to download the images and click on the next-page button, running in a loop until the final image is downloaded (the final page). How to download the rest, I'll figure out myself.
I followed this tutorial: https://www.pyimagesearch.com/2015/10/12/scraping-images-with-python-and-scrapy/ (the DOM it targets is already outdated).
I've already set up all the files and pipelines for the project.
For the record, I tried several different methods: XPath, CSS selectors, response methods.
https://github.com/Dhawal1306/Scrapy
Everything is done; the solution is on GitHub. We got somewhere around 4700 images, along with the JSON as well.
As for a tutorial, if you have any question you just have to ask!
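For readers who don't want to dig through the repository, a hedged outline of such a spider follows; the XPath expressions and the use of Scrapy's built-in ImagesPipeline are assumptions based on the question, not code taken from the repository:
import scrapy

class TimeCoversSpider(scrapy.Spider):
    name = 'timecovers'
    start_urls = ['http://content.time.com/time/covers/0,16641,19230303,00.html']

    def parse(self, response):
        # These XPaths are assumptions -- inspect the live page to confirm them
        cover_src = response.xpath('//img[contains(@src, "cover")]/@src').extract_first()
        if cover_src:
            # ImagesPipeline downloads anything listed under image_urls
            yield {'image_urls': [response.urljoin(cover_src)]}
        # Follow the next-page button until the final page
        next_page = response.xpath('//a[contains(., "Next")]/@href').extract_first()
        if next_page:
            yield response.follow(next_page, callback=self.parse)
With ITEM_PIPELINES = {'scrapy.pipelines.images.ImagesPipeline': 1} and an IMAGES_STORE path in settings.py, the yielded image_urls would trigger the actual downloads.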
I know this is not Scrapy, but I found it easier using BS4. So you have to "pip install beautifulsoup4". Here is a sample:
import requests
from bs4 import BeautifulSoup
import os

r = requests.get("https://mouradcloud.westeurope.cloudapp.azure.com/blog/blog/category/food/")
data = r.text
soup = BeautifulSoup(data, "lxml")

# Print the src attribute of every <img> tag on the page
for link in soup.find_all('img'):
    image_url = link.get("src")
    print(image_url)
It worked like a charm
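The sample above only prints the URLs, while the question also asks to download the images. Here is a hedged extension of the same approach; the output directory and the filename handling are assumptions:
import os
import requests
from bs4 import BeautifulSoup

out_dir = "covers"  # output directory is an assumption -- use any path
if not os.path.exists(out_dir):
    os.makedirs(out_dir)

r = requests.get("http://content.time.com/time/covers/0,16641,19230303,00.html")
soup = BeautifulSoup(r.text, "lxml")

for img in soup.find_all('img'):
    src = img.get("src")
    if not src:
        continue
    # Resolve relative and protocol-relative URLs against the page URL
    image_url = requests.compat.urljoin(r.url, src)
    filename = os.path.join(out_dir, os.path.basename(image_url.split("?")[0]))
    resp = requests.get(image_url)
    with open(filename, "wb") as f:
        f.write(resp.content)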

Scrapy: '//select/option' xpath not yielding any results

I've been trying Scrapy and absolutely love it. However, one of the things I'm testing it on does not seem to work.
I'm trying to scrape a page (apple.com, for example) and save a list of the available keyboard options, using the simple XPath
//select/option
When using the Chrome console, the website below comes back with an array of selections that I can easily iterate through. However, if I use response.xpath('//select/option') via the scraper, or via the Scrapy shell, I get nothing back.
My code for the scraper looks a bit like the below (edited for simplicity)
import scrapy
from scrapy.linkextractors import LinkExtractor
from lxml import html
from apple.items import AppleItem

class ApplekbSpider(scrapy.Spider):
    name = 'applekb'
    allowed_domains = ['apple.com']
    start_urls = ('http://www.apple.com/ae/shop/buy-mac/imac?product=MK482&step=config#', )

    def parse(self, response):
        for sel in response.xpath('//select/option'):
            item = AppleItem()
            item['country'] = sel.xpath('//span[@class="as-globalfooter-locale-name"]/text()').extract()
            item['kb'] = sel.xpath('text()').extract()
            item['code'] = sel.xpath('@value').extract()
            yield item
As you can see, I'm trying to get the code and text of each option, along with the site's "Locale Name" (country).
As a side note, I've tried CSS selectors to no avail. Does anyone know what I'm missing?
Thanks a lot in advance,
A
The problem is the webpage's use of JavaScript. When you open the URL in Chrome, the browser executes the JavaScript code, which generates the drop-down menu with the keyboard options.
You should check out a headless browser (PhantomJS etc.) which will do the JavaScript execution for you. With Splash, Scrapy offers its own headless browser, which can be easily integrated via the scrapyjs.SplashMiddleware downloader middleware:
https://github.com/scrapy-plugins/scrapy-splash
The reason //select/option does not find anything is that there is no select tag in the website when you load it with Scrapy: the JavaScript is not executed, so the dropdown is never filled with values.
Try disabling JavaScript in your Chrome developer tools' settings and you should see the same empty website that Scrapy sees when it scrapes the page.
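To make the integration concrete, here is a sketch of the configuration described in the scrapy-splash README; it assumes a Splash instance running locally on port 8050 (for example via Docker), and uses the scrapy_splash module name from current releases rather than the older scrapyjs:
# settings.py
SPLASH_URL = 'http://localhost:8050'

DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}
SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}
DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
Requests are then routed through Splash so the JavaScript runs before parsing:
from scrapy_splash import SplashRequest

def start_requests(self):
    for url in self.start_urls:
        # 'wait' gives the page's JavaScript time to fill the dropdowns
        yield SplashRequest(url, self.parse, args={'wait': 2})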

Webkit under Windows with PyQt doesn't get remote resources via xhr

I would like to write a Qt application which uses WebKit as its GUI to get data from a server and display it. I got it working under Linux and OS X without problems, but under Windows the XMLHttpRequest always returns status 0 and I don't know why. Here is the PyQt code I use:
import sys, os
from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *
app = QApplication(sys.argv)
web = QWebView()
web.page().settings().setAttribute(QWebSettings.LocalContentCanAccessRemoteUrls, True)
path = os.path.abspath(os.path.join(os.path.dirname(__file__), 'index.html'))
url = "file://localhost/" + path
web.load(QUrl(url))
web.show()
sys.exit(app.exec_())
And here is the HTML/JS I use to test it:
<!DOCTYPE html>
<title>TEST</title>
<h1>TEST</h1>
<div id="test"></div>
<script type="text/javascript">
  function t(text) { document.getElementById("test").innerHTML = text }

  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function() {
    if (this.status != 0)
      t(this.responseText)
    else
      t("Status is 0")
  }
  xhr.open("GET", "https://jeena.net/")
  xhr.send()
</script>
On Linux it opens a new window with a WebKit view in it, loads the local index.html file, and renders it, which shows the TEST headline. After that it runs the XMLHttpRequest code to get a website's content and sets it via innerHTML into the prepared div.
On Windows it loads and shows the title, but then when it runs the XHR code, the status is always just 0 and never changes, no matter what I do.
As far as I understand, LocalContentCanAccessRemoteUrls should make it possible for the XHR to get content from the remote website even on Windows. Any idea why this is not working? I am using Qt version 4.9.6 on my Windows machine and Python v2.7.
I think there are two simple things to try to solve this problem.
My first thought is that it could be due to a cross-domain request. It seems there is no easy way to disable cross-domain protection in QtWebKit. I got this information from this Stack Overflow question:
QtWebkit Same-Origin-policy
As stated in the accepted answer:
"By default, Qt doesn't expose method to disable / whitelist the same origin policy. Extended the same (qwebsecurityorigin.cpp) and able to get it working."
But since you've got everything working on Linux and Mac, the above may not be the cause.
Another possibility is that you don't have OpenSSL enabled with your Qt on Windows. I noticed your request goes to an https page, which requires OpenSSL. You can change the page to an http one to quickly test this possibility.
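You can also check the OpenSSL theory directly from Python with Qt's own SSL capability flag; this small check uses QSslSocket.supportsSsl() from QtNetwork:
# Prints False when the OpenSSL libraries are missing or cannot be loaded
from PyQt4.QtNetwork import QSslSocket
print(QSslSocket.supportsSsl())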
