Elasticsearch pagination: does ES count from 0 or from 1?

I'm using pagination in a series of ES filters. On the first page I set from = 0, size = 10000. My question: on the next page, do I use from = 10000, size = 20000, or do I use from = 10001? I suspect it's from = 10000, but I don't want to duplicate or drop a hit.

GET /_search?size=5
GET /_search?size=5&from=5
GET /_search?size=5&from=10
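from is a zero-based offset: it is the number of hits to skip, not the number of the first hit. With size=5 and from=5, the search skips the first 5 hits and returns the 6th through 10th.
More detail: https://www.elastic.co/guide/en/elasticsearch/guide/current/pagination.html
So in your case, from=10000 skips hits 1 to 10000 and the page starts at the 10001st hit, with nothing duplicated or dropped. Keep size=10000 on every page; size is the page length, not an end position.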
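To make the arithmetic concrete, here is a minimal sketch that derives the from/size pair for a zero-based page number. The helper name page_params is hypothetical and not part of any Elasticsearch client; it only illustrates the offset calculation.

```python
def page_params(page, size):
    """Return the from/size pair for a zero-based page number (hypothetical helper)."""
    if page < 0 or size <= 0:
        raise ValueError("page must be >= 0 and size must be > 0")
    # "from" is the number of hits to skip, so it is page * size;
    # "size" is the page length and stays the same on every page.
    return {"from": page * size, "size": size}


first = page_params(0, 10000)   # skips nothing, returns hits 1..10000
second = page_params(1, 10000)  # skips 10000 hits, returns hits 10001..20000
print(first, second)
```

Note that recent Elasticsearch versions reject requests where from + size exceeds index.max_result_window (10,000 by default), so paging this deep at this page size typically needs search_after or the scroll API instead.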

Related

How to click a particular Row in WpfCalendar of Flights Reservation Application in UFT

I want to select the second row of the FlightsGrid table.
With the code below I get a RowCount of 6, but I am not able to click the third row.
Set ODesc = Description.Create
oDesc("micclass").value = "WpfTable"
Set objchild = WpfWindow("HPMyFlightSampleApplication").WpfTable("Table_FlightsGrid")
objCount = objchild.rowcount
objCount(2).click
The WpfTable object is not a collection, it doesn't support indexing. Did you try using its SelectRow method?
First, why use DP (descriptive programming) here when you are already able to get the table's row count?
The following two lines give you the row count of the table:
Set objchild = WpfWindow("HPMyFlightSampleApplication").WpfTable("Table_FlightsGrid")
objCount = objchild.rowcount
Second, try using its SelectRow method to select required row.
As @Motti suggested, you can use SelectRow. If you want to go further and select a particular cell (which also selects the entire row), you can use SelectCell like this:
'Row and column indexes are 0-based
Set oTable = WpfWindow("devname:=HPE MyFlight Sample Application").WpfTable("devname:=flightsDataGrid")
iCols = oTable.ColumnCount
iRows = oTable.RowCount
sFlightNum = "12274 NW"
'Valid row indexes run from 0 to iRows - 1
For i = 0 To iRows - 1
    If oTable.GetCellData(i, 4) = sFlightNum Then
        oTable.SelectCell i, 4
        Exit For
    End If
Next

XQuery/XPath LIMIT a sum

I am new to XQuery. I need to limit a result to one item, but I'm having trouble.
I am trying to find the winner of a tournament by summing each player's scores to get a total score. I then sort the totals in descending order and try to keep only the top one, but I can't work out how to limit the result in XQuery.
I want only the top score. I have tried subsequence($sequence, $starting-item, $number-of-items) and [position()], but neither seems to work.
for $pair_ids in distinct-values(fn:doc("tourny.xml")/Competition[@date = "2012-12-12"]/Tennis/Partner/@pair_id)
let $total_scores := sum(fn:doc("tourny.xml")/Competition[@date = "2012-12-12"]/Tennis/Partner[@pair_id = $pair_ids]/@score)
order by $total_scores descending
return
$total_scores
the output is giving me:
$total_scores:
34
11
20
How can I limit the result so I only get 34 as the highest score?
Thanks
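You can use the fn:max() function as follows. The order by clause is no longer needed, because fn:max() does not depend on the order of its input:
fn:max(
  for $pair_ids in distinct-values(fn:doc("tourny.xml")/Competition[@date = "2012-12-12"]/Tennis/Partner/@pair_id)
  let $total_scores := sum(fn:doc("tourny.xml")/Competition[@date = "2012-12-12"]/Tennis/Partner[@pair_id = $pair_ids]/@score)
  return $total_scores
)
Alternatively, keep the order by clause, wrap the whole FLWOR expression in parentheses, and append [1] to take only the first item of the sorted sequence.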
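For readers who want to check the logic outside XQuery, the same "sum per pair, then take the maximum" computation can be sketched in Python. The sample XML is an assumption made up to match the element and attribute names in the question (Competition, Tennis, Partner, pair_id, score) and the totals it reports; the real tourny.xml may be structured differently.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data; the real tourny.xml is not shown in the question.
SAMPLE = """
<Competition date="2012-12-12">
  <Tennis>
    <Partner pair_id="1" score="20"/>
    <Partner pair_id="1" score="14"/>
    <Partner pair_id="2" score="11"/>
    <Partner pair_id="3" score="20"/>
  </Tennis>
</Competition>
""".strip()


def totals_by_pair(xml_text, date):
    """Sum the score attribute per pair_id for the competition on the given date."""
    root = ET.fromstring(xml_text)
    totals = defaultdict(int)
    if root.tag == "Competition" and root.get("date") == date:
        for partner in root.findall("./Tennis/Partner"):
            totals[partner.get("pair_id")] += int(partner.get("score"))
    return dict(totals)


totals = totals_by_pair(SAMPLE, "2012-12-12")
# Equivalent of fn:max(...): the highest total, regardless of order.
best = max(totals.values())
# Equivalent of (... order by ... descending ...)[1]: sort, then take the first item.
ranked = sorted(totals.values(), reverse=True)
print(best, ranked[0])
```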

Count with a basic query

I'm running a paging query with a simple filter, and it works like a charm.
var result = client.Search<MyMetaData>(
x => x.Index("MyIndex")
.Type("MyType")
.QueryString(filtro)
.From(from)
.Size(size)
);
But I need the total number of results, ignoring paging, so that I can show it to users.
I tried to get it with the Count method, but without success.
In ES, Size limits the number of records returned, but the Total field of the response holds the total number of matching documents on the server, even if only 100 records are returned (as in my sample below). You do not need a separate Count call.
var result = ElasticClient.Search<PackingConfigES>(x =>
x.Size(100)
.MatchAll()
);
var totalResults = result.Total;
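Once the total is available, the number of pages to show users follows directly. The sketch below works on the raw JSON shape of a search response rather than on the NEST objects above; the helper names total_hits and page_count are hypothetical. It assumes the response body has been parsed into a dict, and it handles both the plain integer that older Elasticsearch versions return in hits.total and the object form used from version 7 on.

```python
import math


def total_hits(response):
    """Read the total hit count from a parsed search response (hypothetical helper)."""
    total = response["hits"]["total"]
    # Elasticsearch 7+ returns {"value": N, "relation": ...};
    # older versions return a plain integer.
    return total["value"] if isinstance(total, dict) else total


def page_count(response, size):
    """Number of pages needed to show every hit at the given page size."""
    return math.ceil(total_hits(response) / size)


# Hand-written stand-ins for real responses.
old_style = {"hits": {"total": 1234, "hits": []}}
new_style = {"hits": {"total": {"value": 1234, "relation": "eq"}, "hits": []}}
print(page_count(old_style, 100), page_count(new_style, 100))
```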

How to retrieve total view count of large number of pages combined from the GA API

We are interested in combined statistics for a set of pages from the Google Analytics Core Reporting API. The only way I have found to query statistics for multiple pages at the same time is to create a filter like so:
ga:pagePath==page?id=a,ga:pagePath==page?id=b,ga:pagePath==page?id=c
This gets escaped inside the filter parameter of the GET query.
However when the GET query gets over 2000 characters I get the following response:
414. That’s an error.
The requested URL /analytics/v3/data/ga... is too large to process. That’s all we know.
Note that, just as in the example call, the only part that differs per page is a GET parameter in the pagePath, yet for every page we have to OR in a new filter that repeats both the dimension (pagePath) and the part of the path that is always identical.
Is there any way to specify a large number of different pages to query without hitting this limit in the GET query (I can't find any documentation for doing POST requests)? Or are there alternatives to creating batches of a max of X different pages per query and adding them up on my end?
Instead of using ga:pagePath as part of a filter you should use it as a dimension. You can get up to 10,000 rows per query this way and paginate to get all results. Then parse the results client side to get what you need. Additionally use a filter to scope the results down if possible based on your site structure or page names.
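The client-side step could look roughly like the sketch below. It assumes the query was made with ga:pagePath as the dimension and a single metric such as ga:pageviews, and that the rows come back as lists of strings; the function name total_views and the sample rows are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def total_views(rows, wanted_ids):
    """Sum the metric column for rows whose pagePath has an id in wanted_ids."""
    total = 0
    for page_path, pageviews in rows:
        # Pull the "id" query parameter out of paths such as "page?id=a".
        ids = parse_qs(urlsplit(page_path).query).get("id", [])
        if ids and ids[0] in wanted_ids:
            total += int(pageviews)
    return total


# Made-up rows in [pagePath, pageviews] form.
rows = [["page?id=a", "10"], ["page?id=b", "5"], ["page?id=z", "99"], ["other", "7"]]
combined = total_views(rows, {"a", "b", "c"})
print(combined)
```

Because the set of wanted IDs lives on the client, its size no longer affects the length of the request URL.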
I am sharing sample code that fetches more than 10,000 records by paging through the results with StartIndex and MaxResults:
private void GetDataofPpcInfo(DateTime dtStartDate, DateTime dtEndDate, AnalyticsService gas, List<PpcReportData> lstPpcReportData, string strProfileID)
{
    int intStartIndex = 1;      // start-index is 1-based
    int intIndexCnt = 0;        // number of pages fetched so far
    int intMaxRecords = 10000;  // rows requested per page
    var metrics = "ga:impressions,ga:adClicks,ga:adCost,ga:goalCompletionsAll,ga:CPC,ga:visits";
    var r = gas.Data.Ga.Get("ga:" + strProfileID, dtStartDate.ToString("yyyy-MM-dd"), dtEndDate.ToString("yyyy-MM-dd"),
        metrics);
    r.Dimensions = "ga:campaign,ga:keyword,ga:adGroup,ga:source,ga:isMobile,ga:date";
    r.MaxResults = intMaxRecords;
    r.Filters = "ga:medium==cpc;ga:campaign!=(not set)";
    while (true)
    {
        r.StartIndex = intStartIndex;
        var dimensionOneData = r.Fetch();
        // Stop as soon as a page comes back without rows
        if (dimensionOneData == null || dimensionOneData.Rows == null) break;
        intIndexCnt++;
        foreach (var lstFirst in dimensionOneData.Rows)
        {
            var objPPCReportData = new PpcReportData();
            objPPCReportData.Campaign = lstFirst[dimensionOneData.ColumnHeaders.IndexOf(dimensionOneData.ColumnHeaders.FirstOrDefault(h => h.Name == "ga:campaign"))];
            objPPCReportData.Keywords = lstFirst[dimensionOneData.ColumnHeaders.IndexOf(dimensionOneData.ColumnHeaders.FirstOrDefault(h => h.Name == "ga:keyword"))];
            lstPpcReportData.Add(objPPCReportData);
        }
        // Next page starts right after the rows already fetched: 1, 10001, 20001, ...
        intStartIndex = intIndexCnt * intMaxRecords + 1;
    }
}
The one remaining problem is that the query length still should not exceed roughly 2,000 characters.
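The paging pattern in that code is independent of the client library: start at index 1, request a page, and advance the start index by the number of rows received until a page comes back empty. A minimal sketch, in which fetch_page stands in for whatever call actually hits the API (it is a hypothetical callback, not a Google client function):

```python
def fetch_all(fetch_page, max_results):
    """Collect every row from an API that pages with a 1-based start index."""
    rows = []
    start_index = 1  # the first row is number 1, not 0
    while True:
        page = fetch_page(start_index, max_results)
        if not page:
            break  # an empty page means everything has been read
        rows.extend(page)
        # Advance by what was actually returned, in case the server
        # hands back fewer rows than requested.
        start_index += len(page)
    return rows


# Fake data source standing in for the real API.
DATA = list(range(25))
calls = []


def fake_fetch(start_index, max_results):
    calls.append(start_index)
    begin = start_index - 1  # convert the 1-based index to a 0-based slice
    return DATA[begin:begin + max_results]


result = fetch_all(fake_fetch, 10)
print(calls)
```

Contrast this with Elasticsearch's from parameter, which is a 0-based offset: there the second page of 10 starts at from=10, while here it starts at start index 11.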

Google calendar query returns at most 25 entries

I'm trying to delete all calendar entries from today forward. I run a query then call getEntries() on the query result. getEntries() always returns 25 entries (or less if there are fewer than 25 entries on the calendar). Why aren't all the entries returned? I'm expecting about 80 entries.
As a test, I tried running the query, deleting the 25 entries returned, running the query again, deleting again, etc. This works, but there must be a better way.
Below is the Java code that only runs the query once.
CalendarQuery myQuery = new CalendarQuery(feedUrl);
DateFormat dfGoogle = new SimpleDateFormat("yyyy-MM-dd'T00:00:00'");
Date dt = Calendar.getInstance().getTime();
myQuery.setMinimumStartTime(DateTime.parseDateTime(dfGoogle.format(dt)));
// Make the end time far into the future so we delete everything
myQuery.setMaximumStartTime(DateTime.parseDateTime("2099-12-31T23:59:59"));
// Execute the query and get the response
CalendarEventFeed resultFeed = service.query(myQuery, CalendarEventFeed.class);
// !!! This returns 25 (or less if there are fewer than 25 entries on the calendar) !!!
int test = resultFeed.getEntries().size();
// Delete all the entries returned by the query
for (int j = 0; j < resultFeed.getEntries().size(); j++) {
CalendarEventEntry entry = resultFeed.getEntries().get(j);
entry.delete();
}
PS: I've looked at the Data API Developer's Guide and the Google Data API Javadoc. These sites are okay, but not great. Does anyone know of additional Google API documentation?
You can increase the number of results with myQuery.setMaxResults(). The server still caps how many results it returns per request, though, so fetch the rest with multiple queries ('paged' results) by varying myQuery.setStartIndex().
http://code.google.com/apis/gdata/javadoc/com/google/gdata/client/Query.html#setMaxResults(int)
http://code.google.com/apis/gdata/javadoc/com/google/gdata/client/Query.html#setStartIndex(int)
Based on the answers from Jim Blackler and Chris Kaminski, I enhanced my code to read the query results in pages. I also do the delete as a batch, which should be faster than doing individual deletions.
I'm providing the Java code here in case it is useful to anyone.
CalendarQuery myQuery = new CalendarQuery(feedUrl);
DateFormat dfGoogle = new SimpleDateFormat("yyyy-MM-dd'T00:00:00'");
Date dt = Calendar.getInstance().getTime();
myQuery.setMinimumStartTime(DateTime.parseDateTime(dfGoogle.format(dt)));
// Make the end time far into the future so we delete everything
myQuery.setMaximumStartTime(DateTime.parseDateTime("2099-12-31T23:59:59"));
// Set the maximum number of results to return for the query.
// Note: A GData server may choose to provide fewer results, but will never provide
// more than the requested maximum.
myQuery.setMaxResults(5000);
int startIndex = 1;
int entriesReturned;
List<CalendarEventEntry> allCalEntries = new ArrayList<CalendarEventEntry>();
CalendarEventFeed resultFeed;
// Run our query as many times as necessary to get all the
// Google calendar entries we want
while (true) {
    myQuery.setStartIndex(startIndex);
    // Execute the query and get the response
    resultFeed = service.query(myQuery, CalendarEventFeed.class);
    entriesReturned = resultFeed.getEntries().size();
    if (entriesReturned == 0) {
        // We've hit the end of the list
        break;
    }
    // Add the returned entries to our local list
    allCalEntries.addAll(resultFeed.getEntries());
    startIndex = startIndex + entriesReturned;
}
// Delete all the entries as a batch delete
CalendarEventFeed batchRequest = new CalendarEventFeed();
for (int i = 0; i < allCalEntries.size(); i++) {
    CalendarEventEntry entry = allCalEntries.get(i);
    BatchUtils.setBatchId(entry, Integer.toString(i));
    BatchUtils.setBatchOperationType(entry, BatchOperationType.DELETE);
    batchRequest.getEntries().add(entry);
}
// Get the batch link URL and send the batch request
Link batchLink = resultFeed.getLink(Link.Rel.FEED_BATCH, Link.Type.ATOM);
CalendarEventFeed batchResponse = service.batch(new URL(batchLink.getHref()), batchRequest);
// Ensure that all the operations were successful
boolean isSuccess = true;
StringBuffer batchFailureMsg = new StringBuffer("These entries in the batch delete failed:");
for (CalendarEventEntry entry : batchResponse.getEntries()) {
    String batchId = BatchUtils.getBatchId(entry);
    if (!BatchUtils.isSuccess(entry)) {
        isSuccess = false;
        BatchStatus status = BatchUtils.getBatchStatus(entry);
        batchFailureMsg.append("\nID: " + batchId + " Reason: " + status.getReason());
    }
}
if (!isSuccess) {
    throw new Exception(batchFailureMsg.toString());
}
There is a small quote on the API page
http://code.google.com/apis/calendar/data/1.0/reference.html#Parameters
Note: The max-results query parameter for Calendar is set to 25 by default, so that you won't receive an entire calendar feed by accident. If you want to receive the entire feed, you can specify a very large number for max-results.
So to get all events from a Google Calendar feed, we do this:
google.calendarurl.com/.../basic?max-results=999999
In the Java API you can do the same with setMaxResults(999999).
I got here while searching for a Python solution;
Should anyone be stuck in the same way, the important line is the fourth:
query = gdata.calendar.service.CalendarEventQuery(cal, visibility, projection)
query.start_min = start_date
query.start_max = end_date
query.max_results = 1000
Unfortunately, Google limits the maximum number of results you can retrieve per query. This keeps requests within their guidelines (HTTP requests are not allowed to take more than 30 seconds, for example). They've built their whole architecture around this, so you might as well keep the paging logic you have.
