I have an NSWindow with a WebView.
My program takes in a search query and executes a Google search with it, the results being displayed in the WebView, like a browser.
Instead of displaying the search results in the WebView, I'd like to automatically open the first link and display the contents of that result instead.
To put it more concretely: how do I display the contents of the first Google result in a WebView?
Is this even possible?
Any help greatly appreciated. Thanks!
You could use the Google Custom Search API directly. That would be more convenient.
https://developers.google.com/custom-search/v1/cse/list?hl=de-DE
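A minimal sketch of that API approach, in Python for brevity (the same HTTPS request works from Objective-C with NSURLConnection); the key and engine ID below are placeholders, not real values:

import json
import urllib.parse
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder: create one in the Google API console
CX = "YOUR_ENGINE_ID"     # placeholder: your custom search engine ID

def first_result_url(query):
    params = urllib.parse.urlencode({"key": API_KEY, "cx": CX, "q": query})
    url = "https://www.googleapis.com/customsearch/v1?" + params
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return data["items"][0]["link"]  # the URL to load in your WebView

print(first_result_url("dogs"))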
You could also try making a Google request the way the "I'm Feeling Lucky" button does, which redirects you automatically to the first search result.
If you have to parse the HTML, you need to look at the HTML structure of the Google results page. Look for specific id and class CSS properties on the div and a tags; once you have found the ones containing the actual results, you can start parsing that content. I would also guess it is easier to put together some JavaScript that finds the first result and opens it (easier than parsing the HTML in Objective-C). You can evaluate JavaScript in the WebView using [myWebView stringByEvaluatingJavaScriptFromString:@"put your js code here"].
Sure, it is possible.
The first approach that comes to mind is to parse the HTML response from Google, then load the first link you extract in the WebView.
Take a look at regular expressions to make it easy.
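A rough sketch of that approach in Python; Google's markup changes often, so the regex below is purely illustrative and needs to be adapted to the current results page:

import re
import urllib.parse
import urllib.request

query = urllib.parse.quote_plus("dogs")
req = urllib.request.Request(
    "https://www.google.com/search?q=" + query,
    headers={"User-Agent": "Mozilla/5.0"},  # the default UA tends to get blocked
)
html = urllib.request.urlopen(req).read().decode("utf-8", errors="replace")

# Illustrative only: grab the first absolute link in the results markup.
match = re.search(r'<a href="(https?://[^"]+)"', html)
if match:
    print(match.group(1))  # load this URL in the WebView instead of the results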
Related
So I would go to an Instagram account, say https://www.instagram.com/foodie/, to copy the XPath that gives me the number of posts, number of followers, and number of following.
I would then run this command in a scrapy shell:
response.xpath('//*[@id="react-root"]/section/main/article/header/section/ul')
to grab the elements in that list, but scrapy keeps returning an empty list. Any thoughts on what I'm doing wrong here? Thanks in advance!
This site is a Single Page Application (SPA), so its DOM is rendered by JavaScript that has not yet run at the time your downloader fetches the page.
When you use view(response), the JavaScript your downloader collected can continue rendering in your browser, so you see the page with the DOM rendered (though you can't interact with the site's API). Look at the downloaded content via response.text and you will see the difference.
In this case, you can use Selenium plus PhantomJS to produce a rendered page for your spider.
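A minimal sketch of that (PhantomJS was current at the time; a headless Chrome or Firefox driver works the same way, and Instagram's markup may have changed since):

from selenium import webdriver

driver = webdriver.PhantomJS()
driver.get("https://www.instagram.com/foodie/")

# By now the JavaScript has run, so the <ul> the question's XPath targets exists.
for li in driver.find_elements_by_xpath(
        '//*[@id="react-root"]/section/main/article/header/section/ul/li'):
    print(li.text)

driver.quit()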
Another trick: you can use a regular expression to select the JSON part of the script, parse it into a JSON object, and read the attribute values you need (number of posts, following, ...) from it.
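For example, in the scrapy shell (where response is already defined), something like this; the window._sharedData script and the key path below reflect Instagram's page at the time and are assumptions, not a stable API:

import json
import re

# Instagram embedded its data in a script like: window._sharedData = {...};
# The regex and the key path below may no longer match the live page.
m = re.search(r'window\._sharedData\s*=\s*(\{.+?\});</script>', response.text)
if m:
    data = json.loads(m.group(1))
    user = data["entry_data"]["ProfilePage"][0]["graphql"]["user"]
    print(user["edge_owner_to_timeline_media"]["count"])  # posts
    print(user["edge_followed_by"]["count"])              # followers
    print(user["edge_follow"]["count"])                   # following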
I am trying to scrape some data from the following website: https://xrpcharts.ripple.com/
The data I am interested in is Total XRP, which you can see immediately below or to the side (depending on your browser) of the circle diagram. So the first thing I did was inspect the element I am interested in. I see that it sits inside <div class="stat">, in <span ng-bind="totalXRP | number:2" class="ng-binding">99,993,056,930.18</span>.
The number 99,993,056,930.18 is what I am interested in.
So I started in a scrapy shell and wrote:
fetch("https://xrpcharts.ripple.com")
I then used Chrome to copy the XPath by right-clicking on that piece of HTML code; the result Chrome gave me was:
/html/body/div[5]/div[3]/div/div/div[2]/div[3]/ul/li[1]/div/span
Then I used this XPath expression to extract the text:
response.xpath('/html/body/div[5]/div[3]/div/div/div[2]/div[3]/ul/li[1]/div/span/text()').extract()
but this gave me an empty list []. I really do not understand what I am doing wrong here. I think I am making an obvious mistake, but I don't see it. Thanks in advance!
The bottom line is: you cannot expect the page you see in the browser to be the same page Scrapy would download and have available to work with. Scrapy is not a browser.
This page is quite dynamic and complex and is constructed with the help of multiple asynchronous requests bringing in both the logic and the data. There is also JavaScript executed in the browser that plays an important role in forming and supporting the HTML document object tree.
Scrapy does not have all of that; what you get from fetch() is just the initial "bare bones" HTML page, without all the "dynamic content".
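You can see this for yourself in the scrapy shell, and the usual fix is to find the request that actually carries the data (browser dev tools, Network tab, XHR filter) and fetch that endpoint directly:

fetch("https://xrpcharts.ripple.com")

# The Angular template is probably in the bare HTML, so this is likely True:
"totalXRP" in response.text
# ...but the rendered number never arrives, so this is False:
"99,993,056,930" in response.text

# Once you spot the XHR that delivers the figure, request it directly.
# The URL below is a placeholder, not the real endpoint.
fetch("https://example.com/api/total-xrp")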
I was using the Google Chrome inspector on the page http://7pay.in/to_phone?phone=9101235577. When I scroll down to the Bitcoin icon, something happens via AJAX. I don't know how AJAX works at all, but I really want to parse the Bitcoin address, which is not present in the HTML when you fetch the link given above until you press the Bitcoin button (it's not there when I curl or wget the page).
Could you please give me a hint on how to work with that kind of AJAX element? I understand that some parts of the page are hidden by JavaScript by default, but I can't quite figure out what I should do to get at them using PHP Simple DOM Parser.
Not sure if what I want to do is possible, but what I am hoping to do is somehow gather certain pieces of text from a website, remove the header, footer, background, all formatting, and place it into my application in a scrollview or something similar...
I'll give you an example... Imagine I was making Wikipedia's iPhone app: I want to download the information from the article on dogs, without the header, sidebars, etc., just the text. How would I go about doing this?
I understand that for this I have not provided any example code or what I've tried or started, but that's just because in this case I'm lost! That doesn't mean I want full chunks of code either. Any help will do. If this doesn't work, I will just have to make a 'mobile optimised' version of the webpages I want to include in my app.
Thanks
(Edit: the term I was trying to use was 'strip the web page of its HTML coding')
You may be going about this the wrong way, or perhaps even asking the wrong question.
Does the target website have an API or datafeed of some kind?
Can you get the information you need in JSON or XML format directly from the site?
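In your Wikipedia example the answer is yes: the MediaWiki API can hand you just the article text. A quick sketch in Python (the same request works from any HTTP client on iOS):

import json
import urllib.parse
import urllib.request

# The "extracts" endpoint returns plain article text: no header, no
# sidebars, no markup.
params = urllib.parse.urlencode({
    "action": "query",
    "prop": "extracts",
    "explaintext": 1,  # plain text instead of HTML
    "exintro": 1,      # just the lead section
    "titles": "Dog",
    "format": "json",
})
url = "https://en.wikipedia.org/w/api.php?" + params
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

page = next(iter(data["query"]["pages"].values()))
print(page["extract"])  # clean text, ready for a scroll view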
I think you've misunderstood the technology. HTML is merely the framework on which the formatting and data are hung.
Parsing the HTML page seems like an awfully big headache, and I doubt you'll ever get it to work reliably, because almost all sites these days are partially or wholly generated on the server side; the page is only the result.
Some sites keep the information in memory and others fetch it dynamically through AJAX, for example, which means that simply parsing the HTML will get you zero data.
Another issue you should be aware of though, is that simply copying the data from generated websites may open yourself up to copyright issues.
You have to parse the HTML code, search for the part you want, and "throw away" the parts you do not need. This is more or less brute force, and the site's markup must not change, otherwise you are screwed; with this method you have to write the parser by hand. But maybe there is an Atom or RSS feed you can parse instead. That would be much easier, and you would not depend on the website layout, because an RSS/Atom feed is just the data. For parsing RSS you could try NSXMLParser.
And then you have to make a valid HTML page out of the data and present it in the UIWebView.
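A sketch of the feed idea, in Python for brevity (the feed URL is a placeholder; NSXMLParser on iOS walks the same elements, just event by event):

import urllib.request
import xml.etree.ElementTree as ET

# An RSS feed is just the data: every <item> carries a title, link, and
# description, with no page chrome to strip away.
with urllib.request.urlopen("https://example.com/feed.rss") as resp:
    root = ET.parse(resp).getroot()

for item in root.iter("item"):
    print(item.findtext("title"), item.findtext("link"))
    print(item.findtext("description"))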
I'll explain:
I have a picture gallery, the first page is display.php.
Users can flip through pictures using arrows, when you click an arrow it sends an Ajax request to retrieve the next picture from the db. Now I want the URL to change according to the picture displayed.
So if the first picture is:
www.mydomain.com/display.php?picture=Paris at night
I'll flip to the next one and the URL would be
www.mydomain.com/display.php?picture=The Big Ben
How do I do this?
The trick here is URIs with an anchor fragment.
The part before the '#' points to a resource on the internet; the part after it normally designates an anchor on the page.
The browser does not refresh if the resource is the same, but it moves to the anchor's position when one is present.
This way you keep the convenience of browser history, from a usability point of view, while replacing certain parts of the page with AJAX for a fast and responsive user interface.
Using a plugin like jQuery History (as suggested by others) is really easy: you decorate certain elements with a rel attribute, and the plugin takes care of the rest.
Also kinda related to this topic is something called 'hijax', and it's something I really like.
This means generating HTML just like you would in the old days before AJAX, then hijacking certain behavior, such as links, and requesting the content with AJAX, replacing only the necessary parts. Combined with the technique above, this makes for really SEO-friendly and accessible web pages.
You can use the jQuery History plugin, for example.
Changing the search (query string) part of the URL triggers a full load of the changed URL, which is why such plugins work with the fragment instead.
See also this Stack Overflow question: "JavaScript: changing the GET parameter without redirecting".
Do you really want to use AJAX here?
A traditional web request would work like this...
User navigates to display.php
User clicks "next" and location is updated to "display.php?picture=Big-Ben"
Big Ben is shown to the user, along with a link to "display.php?picture=Parliament"
User clicks "next" and the location is updated to "display.php?picture=Parliament"
And so on.
With AJAX, you essentially replace that GET with a "behind the scenes" GET that replaces just a portion of your page. You would do this to make things faster... for example...
User navigates to display.php
User clicks "next" and the next image location is obtained using an AJAX request
The image (and image description) is changed to the next image
What you are suggesting is that you retrieve the "next" URL using AJAX and then also perform a GET on the whole page. You would be much better off sending the "next" image link along with each page and not using AJAX at all.
I think this describes it all best: http://ajaxpatterns.org/Unique_URLs