How to use Data scraping (UIPATH) to get data from a certain range of pages (page 11 to page 20)?


I'm trying to use the data scraping wizard in UiPath to get the details of books from pages 11 to 20, after applying filters, and input them in an Excel file. I've tried putting the data scraping activity in a loop, I've tried using a counter for the selector of the page, but none of them worked. Can anyone help?
url of book store

First, navigate to the start page (for example, page 11) and set up data scraping for that single page.
Instead of indicating the Next button in the data scraping wizard, leave the scraping limited to one page.
Then wrap these steps in a "For Each" or "While" loop in which you change the navigation path dynamically and attach to the browser element.
The loop can be driven by an integer counter that runs from 11 to 20.
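
The counter-driven loop described above is not specific to UiPath. As a rough sketch in Python, assuming the store exposes the page number in its URL (the template below is hypothetical, since the real URL is not given in the question), the navigation part could look like this:

```python
def page_urls(template: str, first: int, last: int) -> list[str]:
    """Build one URL per page number in the inclusive range [first, last]."""
    if first > last:
        raise ValueError("first page must not exceed last page")
    return [template.format(page=n) for n in range(first, last + 1)]


# Hypothetical URL template; replace it with the filtered listing URL of the real site.
TEMPLATE = "https://books.example.com/catalogue/page-{page}.html"

urls = page_urls(TEMPLATE, 11, 20)
for url in urls:
    # In UiPath, this is the point where the workflow would navigate to `url`
    # and run the single-page data scraping, appending the rows to one DataTable.
    print(url)
```

In a UiPath workflow the same idea maps to an Int32 counter variable that is concatenated into the URL passed to the navigation activity on each iteration.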

Related

How to load a specific number of records per page and add a "Load more" button

On my page I would like to output all records of a specific folder
but the number should initially be limited to a certain quantity (to reduce the loading times). With a "Load more" button further records should be loaded.
Does anyone have a hint on how I can achieve this?
I have already found several approaches on the web in connection with AJAX, but since I'm not familiar with this yet, more questions than answers have emerged ...
For info: I use an own Template Extension / Distribution under Typo3 9.5.8
Thank you in advance for any help!!

The state-of-the-art solution is AJAX: you load only the required records from the server and modify the page on the fly.
Another option is a URL parameter that is evaluated by your extension.
With the parameter, the full list is shown;
without it, only the first N records are shown, plus a button that links to the same URL including the parameter for the full list.
Make sure the parameter is handled correctly and generates a separate cached version of the page (keyword: cHash).
As you now have two pages with partially identical content, don't forget to tell search engines that the short variant should not be indexed.

You could use the Paginate widget as documented here: https://docs.typo3.org/other/typo3/view-helper-reference/9.5/en-us/typo3/fluid/latest/Widget/Paginate.html
By overriding the paginate template file and rendering only the pagination.nextPage link, you could load the next page via AJAX.

Copying the xpath from Instagram inspect (using chrome) returns an empty list

I go to an Instagram account, say https://www.instagram.com/foodie/, and copy the XPath of the element that shows the number of posts, followers, and following.
I then run this command in the Scrapy shell:
response.xpath('//*[@id="react-root"]/section/main/article/header/section/ul')
to grab the elements in that list, but Scrapy keeps returning an empty list. Any thoughts on what I'm doing wrong here? Thanks in advance!

This site is a single-page application (SPA): the DOM is rendered by JavaScript, and it has not been rendered yet at the time your downloader fetches the page.
When you use view(response), your browser continues running the JavaScript that the downloader collected, so you see the page with the DOM rendered (although it cannot interact with the site's API). You can confirm this by looking at the downloaded content in response.text.
In this case, you can use Selenium with PhantomJS to produce a rendered page for your spider.
Another trick: use a regular expression to select the JSON part of the script, parse it into a JSON object, and read the attribute values you need (number of posts, following, ...) from it.
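
The regular-expression trick can be sketched roughly as follows. The variable name `window._sharedData`, the sample HTML, and the JSON layout are all assumptions made for illustration; check `response.text` to see what the real page embeds.

```python
import json
import re

# Hypothetical page source: the variable name and the JSON keys are assumed
# for illustration and are not a documented Instagram API.
SAMPLE_HTML = """
<html><body>
<script type="text/javascript">window._sharedData = {"user": {"posts": 120, "followers": 4500, "following": 310}};</script>
</body></html>
"""

# Capture the object literal assigned to the variable, up to the closing tag.
SHARED_DATA_RE = re.compile(
    r"window\._sharedData\s*=\s*(\{.*?\});</script>", re.DOTALL
)


def extract_shared_data(html: str) -> dict:
    """Return the JSON object embedded in the page's script tag."""
    match = SHARED_DATA_RE.search(html)
    if match is None:
        raise ValueError("embedded JSON not found in page source")
    return json.loads(match.group(1))


data = extract_shared_data(SAMPLE_HTML)
user = data["user"]
print(user["posts"], user["followers"], user["following"])
```

Inside a Scrapy callback you would pass `response.text` instead of `SAMPLE_HTML`. No JavaScript rendering is needed for this approach, because the JSON is already present in the downloaded source.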

Does using AJAX on your website drop your page views while ranking?

Since it's related to AJAX technology, I thought this was the best place to ask.
I display 5 articles at a time to the user on my website, and when they click 'Next' I load the next 5 articles using AJAX without reloading the entire page. The result is that the user always stays on the same page.
One of my friends told me that website ranking depends on the number of page views, and I think this obviously reduces my page views.
Should I not use AJAX then?
(This might be a stupid question but I seriously have no idea about ranking and SEO so please help)

If you load your content dynamically, Google will not see the entire page, only the part that is loaded initially. So, if Google rank is important to you, it's better not to use an infinite loader.

Actually, it is not a good idea to navigate pages using AJAX. Consider this scenario:
you display 5 articles first, then clicking the Next button loads the next 5 items, and so on. With this approach the page will not be search engine friendly.
In this case the search engine can't locate your content exactly and will crawl only the initial content.
But with some effort you can make AJAX navigation search engine friendly; see example here.
Loading page content dynamically in this way is currently not a good idea for an SEO-friendly web page, but consider other AJAX navigation schemes that can keep the page both dynamic and search engine friendly.
Some suggested AJAX navigation schemes are listed below:
http://nickjohnson.com/b/how-to-make-ajax-search-engine-friendly-seo
http://ajax.rswebanalytics.com/
http://www.symatix.co.uk/articles/ajax/search-engine-friendly-ajax-navigation

Browser Plugin to fill a large html form with test data

In one of the flows in a Java web application, I have a form page which captures around 50 fields. To test a code change in the last page of this flow, I have to fill almost all the fields in all the pages that come before it (approximately 75 fields). This takes a lot of effort in creating the test data and testing the change.
Most of the time I enter the same data in these fields for testing. Any suggestions to automate this, something like a Firefox plugin which could save the form data within the browser and populate it again the next time I want to?
I tried searching over the internet but I could only find Charles Proxy which isn't what I need exactly.

You can use Selenium or iMacros for Firefox - https://addons.mozilla.org/en-US/firefox/addon/imacros-for-firefox/.

selecting a page content in apple script

I want to go through a MS Word document page by page and generate images of all pages. I am stuck in the very beginning. Although I can compute total number of pages I cannot get the content of say page 1 into selection object.
I want something like
select page 1 of active document
or
set myRange to create range active document page 1
or
create range active document start (start of page 1) end (end of page 1)
Of course I have the page count and I want to loop on it and generate images page by page but please help me first to get page content into a selection object so that I can proceed.
If anybody has another idea for accomplishing the job, then I am all for it.

I have solved it.
The steps are:
1. Convert the Word document to PDF through AppleScript.
2. Make a workflow in Automator to render the PDF as images.
3. Call the workflow through a shell command from AppleScript.
I will make a tutorial in a few days and post it somewhere.
