Cursor vs. Start parameter in Scopus API - scopus

I am working on a project that uses the Scopus API to get document names or journal names under different scenarios. I am using the ScopusSearch API (https://dev.elsevier.com/documentation/ScopusSearchAPI.wadl) and the SerialTitle API (https://dev.elsevier.com/documentation/SerialTitleAPI.wadl) for this purpose.
However, the total number of documents I am able to retrieve using these APIs is very small. I want to increase the number of documents being fetched. I've been through the documentation of these APIs several times, but I am confused about the use of the start parameter and the cursor parameter.
Take, for example, the ScopusSearch API, under its query params section:
start parameter
cursor parameter
Can someone please help me understand the difference between these two? And more specifically when to use the start and when to use the cursor parameter?

If you use pybliometrics, as your tag suggests, then you don't need to care about this.
The basic idea behind this pagination (that's what you're after) is:
1. Run a query with an unlimited number of results, with cursor set to "*".
2. Set start to 0 and get the first count results.
3. Set start to start+count and get the next count results.
4. Repeat step 3 until all results are fetched.

Related

Google fhir store alters query string in search results next link

I query with a count less than the total to make it paginate:
https://healthcare.googleapis.com/v1/projects//locations//datasets//fhirStores//fhir/Encounter?_sort=date&_count=5&practitioner=abcdefg&subject:missing=false&patient:Patient.name=John&patient:Patient.name=Doe&_include=Encounter:patient
And the returned next link has combined the 2 Patient.name values, making it an OR instead of an AND:
.../?_count=5&_include=Encounter%3Apatient&_sort=-date&patient%3APatient.name=John%2CDoe&practitioner=abcdefg&subject%3Amissing=false&_page_token=
Is it right that it combines the 2 values for Patient.name? I still want the next page of results to have those 2 conditions ANDed together, not ORed. How do I get that?
This looks like a bug. I see that it works without the chained search, e.g. if I do Patient?name=John&name=Doe, I get a next link that has the correct AND.
For the chained search, the actual results seem to be from the AND query but the pagination links are incorrectly converted to OR.
I have reported this issue internally.
As a workaround, it appears that if you use the _page_token value from the link and run the query with the conditions you want and &_page_token=[value], it does return the correct next page.
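
A minimal sketch of that workaround, assuming the token is carried in a _page_token query parameter as in the link above. The function name next_page_url and the example.test URLs are made up for illustration; only the standard library is used, and nothing here calls the FHIR store.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def next_page_url(original_url: str, next_link: str) -> str:
    """Keep the original query string and add only the _page_token from next_link."""
    # parse_qs drops blank values by default, so an empty token falls back to "".
    token = parse_qs(urlsplit(next_link).query).get("_page_token", [""])[0]
    if not token:
        raise ValueError("next link carries no _page_token")
    separator = "&" if urlsplit(original_url).query else "?"
    return original_url + separator + urlencode({"_page_token": token})


# Hypothetical URLs, shortened for readability.
original = ("https://example.test/fhir/Encounter?_count=5"
            "&patient:Patient.name=John&patient:Patient.name=Doe")
returned = ("https://example.test/fhir/Encounter?_count=5"
            "&patient%3APatient.name=John%2CDoe&_page_token=abc123")
print(next_page_url(original, returned))
```

The two separate Patient.name conditions from the original request are preserved because the returned link is used only as a source for the token.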

Web Scraping returning empty data table UiPath

I’m using Data Scraping to scrape product information (i.e., product name, URL, price, model) from a shopping website.
When I search for a product, I want it to scrape the data of whichever item comes first, so I have set the maximum number of results to 1. The problem is that it sometimes returns an empty data table, and I cannot figure out why.
My guess is that if the current search results match the elements I selected in the Data Scraping wizard, it returns the data table, and if they don't match, it returns an empty data table.
For example, while I was selecting elements in the Data Scraping wizard, the search results were Samsung monitors. When I ran the project and searched for Dell monitors, it returned a data table, but when I searched for Samsung series or Dell series, it returned an empty data table. What is wrong with this?
You need to say what output you actually expect.
But if your output is empty, the reason is usually one of the following:
Make sure the timeout is high enough; set it to 30000 if you are unsure.
Use a robust selector that keeps working even when the website changes.
For me it works properly with a sufficient timeout and a flexible selector that uses a * wildcard.

oracle - can I use contain and near with a clob? Need to speed up query

We have a query that takes 48 minutes to run a search on a clob. The query is written as if it is not a clob column and uses contains and near. This search for 3 words within a certain word distance of each other is important. I need to speed this up and want to create an index on the clob, but I don't know if that would work and don't fully understand how to do it. I found this from Tom Burleson:
http://www.dba-oracle.com/t_clob_search_query.htm OR https://asktom.oracle.com/pls/apex/asktom.search?tag=oracle-text-contains-search-with-near-is-very-slow
but I can't figure out how to do it with contains and near to enable the search for 3 words within a certain distance of each other.
current script:
SELECT clob_field
FROM clob_table
WHERE contains(clob_field,'NEAR (((QUICK),(FOX),(LAZY)),5)') > 0;
Want to use something like this if it will act like indexing:
SELECT clob_field
FROM clob_table
WHERE contains(dbms_lob.substr(clob_field,'near(((QUICK),(FOX),(LAZY)),5)')) > 0;
If not, I need to do indexing, but I don't quite understand how to use CTXCAT and CONTEXT (https://docs.oracle.com/cd/A91202_01/901_doc/text.901/a90122/ind4.htm). I also don't like what I read here that says that if one uses CTXCAT for indexing a clob you have to use CONTEXT, or something like that. It can't affect the other queries that are done on this field.
Thanks in advance!
CONTAINS won't work unless the column is globally indexed, so I had to index the field, and then I could get the original query working.

Count how many times one post has been read

Is it possible to show how many times one post has been read? In WordPress there is a plug-in, https://wordpress.org/plugins/wp-postviews/
I don't know whether there is such a plug-in in Anypic of Parse to count the times?
Of course it will be nice if it can display who has read a post as well.
Thanks
I'm not sure which language you are working in.
But in any case you need to create:
An Array column in Parse.com
Then run a query that adds the user's name to it in viewWillAppear.
Now you can count the array to get the number of views as an integer, and you can display the viewers' names from the array.
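
The array-of-viewers idea can be illustrated without any backend. This is a plain in-memory sketch, not Parse SDK code: the ViewLog class and its method names are made up, and a dictionary of lists stands in for the Array column on each post.

```python
from collections import defaultdict
from typing import Dict, List


class ViewLog:
    """In-memory stand-in for an array-of-viewers column stored on each post."""

    def __init__(self) -> None:
        self._viewers: Dict[str, List[str]] = defaultdict(list)

    def record_view(self, post_id: str, user: str) -> None:
        # Append on every view, the way the view-appearing hook would.
        self._viewers[post_id].append(user)

    def view_count(self, post_id: str) -> int:
        # The length of the array is the total number of views.
        return len(self._viewers.get(post_id, []))

    def unique_viewers(self, post_id: str) -> List[str]:
        # Distinct names, sorted so the order is stable for display.
        return sorted(set(self._viewers.get(post_id, [])))


log = ViewLog()
for name in ["ann", "bob", "ann"]:
    log.record_view("post-1", name)
```

Here log.view_count("post-1") counts every view, while log.unique_viewers("post-1") lists each reader once.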
Two options are:
Add a viewcount column and increment it whenever needed.
Add an actions table that records all actions within your webpage or app. This way you can store more data (custom analytics) in it, such as button presses. When you want to check the view count, you can just count the objects of a specific type. In the iOS SDK, countObjectsInBackgroundWithBlock does this job.

How to identify a new pattern in a URL with a machine learning algorithm (Text mining)

I am trying to identify new patterns after analyzing a number of URLs. So let's say, I am investigating the hypothetical website Yoohle.com and their URLs have the following structure.
domain = yoohle.com
q= search phrase
lan= language used
pr= partner_id
br= browser_id
so a sample url will look like this
www.yoohle.com/test_folder/test_page?q=hello+world&lan=en&pr=stackoverflow&br=chrome
If I am investigating the web traffic of this website and seeing abnormal increase month over month, I would like to find out what's causing this. In this example I can just parse out the URL and look at the pr= value since it will tell me if there is a new partnership (maybe stackoverflow is going to be powered by yoohle.com and that drives the increase etc.)
The question is, how can I build something robust that can compare 2 (or more) months and tell me exactly what's driving the increase. I want to get something like, "we are seeing an increase and it is driven by the following pattern"
www.yoohle.com/test_folder/test_page%pr=stackoverflow%
The tricky part is that, unlike in this example, you do not know anything about what the tokens mean: I will not know which token stands for partner_id. Another issue is that looking token by token will be misleading, because lan=en will also go up with a new partner, assuming the users still have English as their language.
My idea is to analyze the tokens by looking at all the combinations but it is very costly, (4! in this example and probably 10+! for other websites). Also analyzing tokens itself is not going to solve the problem since I still need to analyze the values of the tokens.
I tried k-means clustering and the apriori algorithm, and did some research on URL/text mining, but could not get what I want. Any ideas about how to approach building an algorithm would be helpful.
Imagine that you are seeing realtime data, so we are talking about analyzing around 100K URLs in a given month.
I would go the following way. You can create the following table:
URL
time
time_month -- time rounded to month, for demonstration purpose
q_bol -- boolean flag whether question parameter was used
q -- question parameter value
lan -- language parameter value
lan_bol -- boolean flag whether language parameter was used
pr -- partner parameter value
pr_bol -- boolean flag whether partner parameter was used
br -- browser parameter value
br_bol -- boolean flag whether browser parameter was used
Now, you can write some query.
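with t as (
    -- count of URLs per month and per combination of parameters used
    select
        time_month,
        q_bol, lan_bol, pr_bol, br_bol, count(*) as cnt
    from
        urldata
    where
        time_month >= '2013-02-01'::date and time_month < '2013-04-01'::date -- last two months data
    group by
        time_month, q_bol, lan_bol, pr_bol, br_bol
)
, u as (
    select
        q_bol, lan_bol, pr_bol, br_bol,
        coalesce(t1.cnt, 0) as cnt_prev,
        coalesce(t2.cnt, 0) as cnt_curr,
        coalesce(t2.cnt, 0) - coalesce(t1.cnt, 0) as abs_change, -- change in pattern MoM
        case when t1.cnt is null then 0
             else coalesce(t2.cnt, 0)::numeric / t1.cnt end as relchange -- relative change
    from
        (select * from t where time_month = '2013-02-01'::date) t1 -- previous month
        full outer join
        (select * from t where time_month = '2013-03-01'::date) t2 -- current month
        using (q_bol, lan_bol, pr_bol, br_bol)
)
select * from u where abs_change > 5000 or relchange > 3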
The query above gives you the parameter patterns whose count changed by more than 5000 month over month, or whose count grew to more than three times the previous month's count. If your SQL system supports group by rollup, it would also give you higher-level aggregations (combinations of three parameters, two parameters, one parameter).
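
The same month-over-month comparison can be sketched outside the database. This is a rough Python illustration, not a replacement for the query: the function names presence_pattern and pattern_changes are made up, the two months arrive as plain lists of URLs, and the thresholds are arguments so that small samples can be tried.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple
from urllib.parse import parse_qs, urlsplit

Pattern = Tuple[str, ...]


def presence_pattern(url: str) -> Pattern:
    """Sorted tuple of the parameter names that appear in the URL's query string."""
    return tuple(sorted(parse_qs(urlsplit(url).query, keep_blank_values=True)))


def pattern_changes(prev_urls: Iterable[str], curr_urls: Iterable[str],
                    min_abs: int = 5000, min_ratio: float = 3.0) -> List[Dict]:
    """Flag patterns whose count rose by more than min_abs or by a factor above min_ratio."""
    prev = Counter(presence_pattern(u) for u in prev_urls)
    curr = Counter(presence_pattern(u) for u in curr_urls)
    flagged = []
    for pattern in sorted(set(prev) | set(curr)):
        before, after = prev[pattern], curr[pattern]
        abs_change = after - before
        # Like the query, report 0 when the pattern did not exist last month.
        rel_change = after / before if before else 0.0
        if abs_change > min_abs or rel_change > min_ratio:
            flagged.append({"pattern": pattern, "abs_change": abs_change,
                            "rel_change": rel_change})
    return flagged


# Tiny made-up sample: a new "pr" parameter shows up in the second month.
last_month = ["www.yoohle.com/p?q=a&lan=en"] * 2
this_month = ["www.yoohle.com/p?q=a&lan=en"] * 2 + ["www.yoohle.com/p?q=a&lan=en&pr=so"] * 3
changes = pattern_changes(last_month, this_month, min_abs=2)
```

With this sample only the pattern that includes pr is flagged, because the plain q and lan pattern did not change between the two months.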
You can do pretty much the same with the values of the parameters. Because you do not know in advance which tokens will be present, you can parse each URL into the following table structure:
-- urls
id_url
url
time
-- parameters
id_url
token
value
Then you will need to rewrite the query above in some way, e.g. you can use array aggregation function in PostgreSQL array_agg().
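
Splitting a URL into rows for the parameters table can be sketched as follows. The function name url_to_parameter_rows is made up, and id_url is assumed to be assigned by whatever loads the urls table; only the standard library is used.

```python
from typing import List, Tuple
from urllib.parse import parse_qsl, urlsplit


def url_to_parameter_rows(id_url: int, url: str) -> List[Tuple[int, str, str]]:
    """Return one (id_url, token, value) row per query-string parameter."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return [(id_url, token, value) for token, value in pairs]


sample = ("www.yoohle.com/test_folder/test_page"
          "?q=hello+world&lan=en&pr=stackoverflow&br=chrome")
rows = url_to_parameter_rows(1, sample)
```

Each row keeps the token name next to its value, so no prior knowledge of what a token means is needed before counting.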
