SonarQube API Issue search is only returning 100 results - sonarqube

Using SonarQube 5.1, I have been trying to use the API search feature to gather all of the issues for my current project and display them on a radiator. On the web interface, SonarQube indicates there are 71 major issues and 161 minor issues.
Using this search string:
https://sonarqube.url.com/api/issues/search?projectKeys=myproject'skey
I get back a response with exactly 100 results. When I filter those results for only OPEN items, I get a total of 55 issues: 36 major and 19 minor.
This is done with a PowerShell script that authenticates to the SonarQube server, passes in the query, and deserializes the response into an array I can process (counting major/minor issues).
With the background out of the way, the meat of my question is: does anyone know why the responses I am receiving are capped at 100? In my research I saw others indicating that a response to an issue search would be capped at 500 due to an outstanding bug. However, the number of issues I expect is far below that. The API's documentation indicates that it will return the first 10,000 issues. Is there a server-side setting that restricts how many results a search query returns?
Thanks in advance,

The web service docs show that 100 is the default value of the ps (page size) parameter. You can set the value higher, but it is still capped at a maximum.
You might have noticed a "paging" element in the JSON response. You can use it to calculate how many pages of results there are and loop through them using the p parameter to specify page number.
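The paging loop described above can be sketched roughly as follows. This is a minimal sketch, not the SonarQube client itself: `fetch_page` is a hypothetical helper that performs the authenticated HTTP call for a given `p` and `ps` and returns the decoded JSON, and the `pageSize` and `total` field names inside "paging" are assumptions to check against your server's actual response.

```python
import math
from typing import Callable, Dict, List


def fetch_all_issues(fetch_page: Callable[[int, int], Dict],
                     page_size: int = 100) -> List[Dict]:
    """Collect issues from every page of an api/issues/search-style response.

    fetch_page(p, ps) is a hypothetical helper that performs the HTTP call
    (including authentication) and returns the decoded JSON as a dict.
    """
    first = fetch_page(1, page_size)
    issues = list(first["issues"])
    paging = first["paging"]  # assumed shape: {"pageIndex", "pageSize", "total"}
    # Use the page size the server reports, since it may cap what was requested.
    effective_size = paging["pageSize"]
    total_pages = math.ceil(paging["total"] / effective_size)
    # Page 1 is already loaded; request the remaining pages with the p parameter.
    for p in range(2, total_pages + 1):
        issues.extend(fetch_page(p, page_size)["issues"])
    return issues
```

With 232 issues and the default page size of 100, this would request three pages instead of stopping after the first 100 results.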

Related

Is there any way to get more than 60 results from Google Places API?

I can't figure out how to get more than 60 results from Google Places API.
Are there any workarounds, or is it actually impossible to get more than 60?
According to the documentation, the maximum number of results you can get is 60. Note that the API returns a paginated response with up to 20 results per page. It is not possible to get more than that, since there is no option or parameter you can configure to set the number of results you want.
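To walk the paginated response up to that limit, the pages can be followed one at a time. The sketch below assumes the response carries a `next_page_token` field that is passed back on the next request (as in the Places web service), and `fetch_page` is a hypothetical helper that makes the actual HTTP call; neither name comes from the answer above.

```python
from typing import Callable, Dict, List, Optional


def collect_places(fetch_page: Callable[[Optional[str]], Dict],
                   max_pages: int = 3) -> List[Dict]:
    """Follow next_page_token until it is absent or max_pages is reached.

    fetch_page(token) is a hypothetical helper: it calls the search endpoint
    (sending the token as the page token when it is not None) and returns
    the decoded JSON as a dict.
    """
    results: List[Dict] = []
    token: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(token)
        results.extend(page.get("results", []))
        token = page.get("next_page_token")
        if not token:
            break  # no further pages are offered
    return results
```

At 20 results per page and three pages, this tops out at the 60 results mentioned above. In practice a newly issued token may need a short delay before it is accepted, so a real `fetch_page` would likely have to wait or retry.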

ElasticSearch document refresh=true does not appear to work

In order to speed up searches on our website, I have created a small elastic search instance which keeps a copy of all of the "searchable" fields from our database. It holds only a couple million documents with an average size of about 1KB per document. Currently (in development) we have just 2 nodes, but will probably want more in production.
Our application is a "primarily read" application: maybe 1000 documents/day get updated, but they get read and searched tens of thousands of times/day.
Each document represents a case in a ticketing system, and the case may change status during the day as users research and close cases. If a researcher closes a case and then immediately refreshes his queue of open work, we expect the case to disappear from their queue, which is driven by a query to our Elastic Search instance, filtering by status. The status is a field in the case index.
The complaint we're getting is that when a researcher closes a case, upon immediate refresh of his queue, the case still comes back when filtering on "in progress" cases. If he refreshes the view a second or two later, it's gone.
In an effort to work around this, I added refresh=true when updating the document, e.g.
curl -XPUT 'https://my-dev-es-instance.com/cases/_doc/11?refresh=true' -d '{"status":"closed", ... }'
But still the problem persists.
Here's the response I got from the above request:
{"_index":"cases","_type":"_doc","_id":"11","_version":2,"result":"updated","forced_refresh":true,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":70757,"_primary_term":1}
The response seems to verify that the forced_refresh request was received, although it does say out of total 2 shards, 1 was successful and 0 failed. Not sure about the other one, but since I have only 2 nodes, does this mean it updated the secondary?
According to the doc:
To refresh the shard (not the whole index) immediately after the operation occurs, so that the document appears in search results immediately, the refresh parameter can be set to true. Setting this option to true should ONLY be done after careful thought and verification that it does not lead to poor performance, both from an indexing and a search standpoint. Note, getting a document using the get API is completely realtime and doesn’t require a refresh.
Are my expectations reasonable? Is there a better way to do this?
After more testing, I have concluded that my issue was due to application logic error, and not a problem with ElasticSearch. The refresh flag is behaving as expected. Apologies for the misinformation.

Is Elasticsearch Scroll API not recommended for real-time pagination?

I understand that the Elasticsearch Scroll API is not intended for real-time user requests. But would it be bad if it were used for that? I have a requirement to implement paginated results (to be displayed on a web frontend), and the from/size approach is returning duplicates across pages, presumably because I have a sharded setup (with no replicas at all). I've tried setting preference but it did not help.
Scroll API does not seem to have this issue, I'm wondering if it's really bad to use it for my use case?
Thanks
Results from a scrolling search reflect the state of the index at the time of the initial search request. Subsequent indexing or document changes only affect later search and scroll requests. This means your pagination is based on the moment you issued the initial search, so you won't see new documents and may still see deleted ones in your results. Also, as of ES 7.x the Scroll API is no longer recommended for deep pagination. You can find more information in the Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/scroll-api.html
On the question of why you get duplicate results, I think this is caused by intermediate indexing. When doing independent search calls with pagination, each call runs independently (still using some caching). So if you ask for the first 100, you get the first 100 at that time. When you then ask for the "next" 100 x seconds later, you get results 100 to 199 as they are x seconds later. If a new document that logically fits in the first 100 was indexed in the meantime, it pushes the rest further down. This way, result 100 (the first one in the second page) might have been #99 in the first call. When the pages are then glued together in the UI, you see the same result twice.
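The effect can be reproduced without Elasticsearch at all. The following is a toy simulation, assuming a plain Python list stands in for the ranked result set and slicing stands in for from/size; the document names are made up.

```python
from typing import List


def page(ranked: List[str], offset: int, size: int) -> List[str]:
    """Return one from/size style page of an already ranked result list."""
    return ranked[offset:offset + size]


# Ranked results at the time of the first request.
index = ["doc-a", "doc-b", "doc-c", "doc-d", "doc-e", "doc-f"]
first_page = page(index, 0, 3)

# A new document that ranks first is indexed before the second request.
index = ["doc-new"] + index
second_page = page(index, 3, 3)

# doc-c was last on page one and is now first on page two.
duplicates = [d for d in second_page if d in first_page]
```

Here `duplicates` ends up containing `doc-c`, which is exactly the "same result twice" symptom once both pages are shown together.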
Both scroll and search_after let you continue from a known point instead of recounting offsets: scroll keeps the state of the original call, while search_after continues from the sort values of the last hit you received.
I have not found a good explanation, though, of why search_after is better than scroll.
I assume that scroll is optimized for the use case where you will go through the entire set anyway, so the pagination is there to avoid overloading the client, and the pipe between ES and the client, with chunks that are too big. search_after, on the other hand, seems optimized for the use case where you are likely to go only a few pages deep. Human users tend to stay on the first page, with a quickly dropping frequency of going further, because beyond that you force your eyes to find something in an overwhelming amount of information. Implementing good filters in the user interface is the much better approach.
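For the "few pages deep" case, a search_after request can be built from the last hit of the previous page. This is a minimal sketch that only constructs the request body; sending it to Elasticsearch is left out, and the field names in the usage note are hypothetical.

```python
from typing import Any, Dict, List, Optional


def build_page_request(query: Dict[str, Any],
                       sort: List[Dict[str, str]],
                       size: int,
                       last_hit: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a search body for the first page or for the page after last_hit."""
    body: Dict[str, Any] = {"query": query, "sort": sort, "size": size}
    if last_hit is not None:
        # Each hit of a sorted search carries its sort values in "sort";
        # passing them back as search_after resumes right after that hit.
        body["search_after"] = last_hit["sort"]
    return body
```

For example, with a sort such as `[{"updated_at": "desc"}, {"case_id": "asc"}]` (both names are assumptions), the second field acts as a unique tiebreaker, so the position after the last hit is unambiguous and no result is repeated or skipped when several documents share the same `updated_at`.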

How to get all results for api/issues/search (not just first 500)?

I am trying to use the SonarQube web service API api/issues/search to extract the information for all issues. But I see that the maximum number of results from the API is only 500, even with parameters like pageSize.
Is there a different way of using this API so that I can get all the issues in the result list?
The web service results are paginated. Use ps (page size) and p to step through the result set.
That said, there's a hard limit of 10k.
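Given that hard limit, it can help to work out up front how many pages are actually reachable for a given page size. A small sketch, assuming the limit applies to the total number of results across all pages of one query:

```python
import math

MAX_RESULTS = 10_000  # hard limit mentioned above


def reachable_pages(total: int, page_size: int,
                    max_results: int = MAX_RESULTS) -> int:
    """Number of pages that can be requested before hitting the result cap."""
    reachable = min(total, max_results)
    return math.ceil(reachable / page_size)
```

For instance, with ps=500 and 25,000 matching issues only 20 pages are reachable. One possible workaround, offered here as an assumption rather than documented behavior, is to split the search with additional filters (such as severity) so that each slice stays under the limit.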

Clarifications on google custom search API v1 parameters

I am experimenting with the google custom search API (free version) for performing image search. I would like to commence with the paid version. However, I have some difficulties in understanding the pricing and some documented query parameters in the API calls at https://developers.google.com/custom-search/json-api/v1/using_rest#api-specific_query_parameters
1) In the free version, we have 100 queries/day. If I understood well, 1 query means a single API call. This call can return a maximum of 10 (since the parameter 'num' takes a maximum value of 10) results only. Is this both for free and paid versions? Or is it possible to retrieve more results per API request in the paid version? Precisely, can 'num' take values greater than 10?
2) The parameter 'start' is documented as the index of the first result to return. In the free version, I cannot get more than 100 results for a specific query (parameter 'q'). To be precise, I can get 10 results per API call, with 'start' taking the values 1, 11, ..., 91 and the same value for 'q' on each call. The API call returns an error for any value of 'start' greater than 91. Isn't the free version supposed to allow 100 API calls? Or perhaps this restriction is there to prevent retrieving more than 100 results per search term 'q'?
3) In the paid version, are API calls which return non-200 responses billed for as well?
4) In the paid version, how many API calls can be made for a specific search term 'q'?
5) Do you think there are particular restrictions with respect to the number of results that apply specific to image search only?
Thanks in advance for your help.
The results are paginated, with 10 results per page. If you want more, set start to 11 to get the next 10, and so on. This mirrors what happens in the Google search UI: if it is unclear, run a search on Google and observe the results, which should match almost exactly. The num parameter is the number of results per page.
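The sequence of start values from the question (1, 11, ..., 91) can be generated rather than hard-coded. A minimal sketch, assuming 10 results per call and the 100-result ceiling the question observed; the function name is made up for illustration.

```python
from typing import List


def start_values(num: int = 10, max_results: int = 100) -> List[int]:
    """1-based 'start' values that cover max_results in pages of num."""
    return list(range(1, max_results + 1, num))
```

Each value would be sent as the start parameter of one API call with the same q, so covering all 100 results for one search term costs 10 queries out of the daily quota.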
In the free version you have 100 free queries/day. Anything beyond that costs 0.5 cents per request, and you cannot make more than 10k calls per day. So free is not actually free.
In the "paid" version you can buy in bulk. AFAIK there is no daily limit. You can "buy", say, 11000 requests for $55 (11000 * 0.5 cents) and use them all up in one day. But the paid version will be ended soon :( . Please check this blog for info: https://customsearch.googleblog.com/
