Elasticsearch query to return a limited number of results (10), with 2 for each specified keyword - elasticsearch

I have articles stored in Elasticsearch, and I've been wondering whether I can query by date but have the result contain a specific number of articles from each publisher. More specifically, I have 5 different publishers and I want to get the 10 latest articles, 2 from each publisher. I'm storing the publisher's name as a keyword field in Elasticsearch.
The only idea I've come up with is to run a separate query for each publisher, limit each result to the first 2, and then merge the results programmatically, but I think it would be more efficient if there is a way to do this in a single query.
Thanks

This sounds like a case for field collapsing.
You would collapse on the publisher field (as long as it is a keyword or a number) and then request inner_hits, the actual articles.
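A minimal sketch of such a request body, built as a Python dict so it can be sent to the `_search` endpoint with any client. The field names `publisher` and `published_at` and the inner hits name `latest_articles` are assumptions for illustration, not taken from the question.

```python
import json


def latest_per_publisher(publisher_field="publisher", date_field="published_at",
                         per_publisher=2, publishers=5):
    """Build a search body that collapses hits on the publisher keyword field.

    Field names are hypothetical; adjust them to the real mapping.
    """
    newest_first = [{date_field: {"order": "desc"}}]
    return {
        # One collapsed hit per publisher, so size is the number of publishers.
        "size": publishers,
        "query": {"match_all": {}},
        "sort": newest_first,
        "collapse": {
            "field": publisher_field,
            # inner_hits carries the actual articles for each publisher.
            "inner_hits": {
                "name": "latest_articles",
                "size": per_publisher,
                "sort": newest_first,
            },
        },
    }


body = latest_per_publisher()
print(json.dumps(body, indent=2))
```

Each top-level hit then stands for one publisher, and its two newest articles are found under `inner_hits.latest_articles` in the response.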

Related

Is there a limit on the number of filters that can be passed to a DynamoDB query?

I’d like to search for something like
Field “Id” has a value in [Long list of IDs]
This long list of Ids can hit over 1000 Ids.
Should I expect a problem with that? Is there a limit on how long the query can be?
I am looking at CloudSearch, which seems to have a limit of 1024 clauses, and I am wondering whether this should just be done in DynamoDB if it has no such limit.
At that point, I guess I should also ask whether Elasticsearch/OpenSearch has such limits.
You can review the various DynamoDB limits at https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ServiceQuotas.html. Here are two that will affect you:
The maximum length of any expression string is 4 KB.
The maximum number of operands for the IN comparator is 100.
Elasticsearch limits the maximum number of clauses a query can have, as explained in the search settings documentation. But if you are only using the list to filter the data, as you mention in your question, then you can simply use the terms query, which accepts a long list of IDs to filter on. The same search settings document advises this.
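As a rough sketch, a terms filter over a long ID list could be built like this. The field name `Id` comes from the question; the helper name `ids_filter` and the sample ID range are made up for illustration.

```python
def ids_filter(ids, field="Id"):
    """Build a search body that filters on a (possibly long) list of IDs.

    The list goes into a single terms query inside a bool filter, so it
    counts as one clause rather than one clause per ID.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"terms": {field: list(ids)}}
                ]
            }
        }
    }


# 1500 hypothetical IDs, well over the 1000 mentioned in the question.
body = ids_filter(range(1, 1501))
print(len(body["query"]["bool"]["filter"][0]["terms"]["Id"]))
```

The terms query has its own cap, the `index.max_terms_count` index setting, which as far as I know defaults to 65536, so a list of a thousand or so IDs should fit comfortably.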
Field “Id” has a value in [Long list of IDs]
This long list of Ids can hit over 1000 Ids.
Is Id your primary key?
If so, no: in DynamoDB you can use the BatchGetItem operation and pass it the IDs. Each call accepts at most 100 keys, so for 1000 IDs you will have to make 10 concurrent or sequential calls.
If it's not the primary key, i.e. it is the key of a secondary index, then you will have to run 1000 concurrent query operations to check for the presence of each key.
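To illustrate the batching, here is a small sketch that splits the ID list into groups of 100 and builds one `RequestItems` payload per group, in the shape the low-level BatchGetItem API expects. The table name `MyTable` and the assumption that `Id` is a numeric key are hypothetical; no call to AWS is made here.

```python
BATCH_GET_LIMIT = 100  # maximum number of keys per BatchGetItem call


def batch_get_requests(ids, table_name="MyTable", key_name="Id"):
    """Split ids into chunks of at most 100 and build one payload per chunk.

    Assumes a numeric partition key, hence the {"N": "..."} attribute values.
    """
    ids = list(ids)
    requests = []
    for start in range(0, len(ids), BATCH_GET_LIMIT):
        chunk = ids[start:start + BATCH_GET_LIMIT]
        keys = [{key_name: {"N": str(item_id)}} for item_id in chunk]
        requests.append({table_name: {"Keys": keys}})
    return requests


requests = batch_get_requests(range(1, 1001))
print(len(requests))  # one payload per BatchGetItem call
```

Each payload would be passed as the `RequestItems` argument of a BatchGetItem call. A real caller also has to retry whatever comes back under `UnprocessedKeys` in the response.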

Possible to use GroupBy in ElasticSearch querystring?

I have a few records in my Elasticsearch collection and I want to use a GroupBy aggregation in an Elasticsearch query string.
I want to know whether this is possible, because when I tried to google it, the results were always about this
I want to use something like this in the query string, which can give me the records per group.
For example:
http://localhost:9200/_all/tweets/_count?q=user:Pu*+user:Kim*
This gives me the count of all the records whose name starts with Pu or Kim.
But I want to know how many records have a name starting with Pu and how many have a name starting with Kim.
Aggregations need to be specified separately in the body of the search request; you cannot specify them as part of a query string query.
You could also just execute two queries to find out this particular requirement...
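One way to get both counts in a single request is a filters aggregation in the request body, sketched below as a Python dict. This assumes `user` is indexed as an exact-value (keyword or not_analyzed) field; on an analyzed text field the prefixes would have to be lowercased. The aggregation name `by_prefix` is made up.

```python
def prefix_counts_body(field="user", prefixes=("Pu", "Kim")):
    """Build a search body that counts documents per name prefix.

    Uses one named bucket per prefix inside a filters aggregation.
    """
    return {
        "size": 0,  # only the bucket counts are needed, not the hits
        "aggs": {
            "by_prefix": {
                "filters": {
                    "filters": {
                        prefix: {"prefix": {field: prefix}}
                        for prefix in prefixes
                    }
                }
            }
        },
    }


body = prefix_counts_body()
print(sorted(body["aggs"]["by_prefix"]["filters"]["filters"]))
```

The response would then contain one `doc_count` per bucket, keyed by `Pu` and `Kim`.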

Messages aggregation in elasticsearch

For example, I have the following documents.
{sourceIP:1.1.1.1, destIP:2.2.2.2}
{sourceIP:1.1.1.1, destIP:3.3.3.3}
{sourceIP:1.1.1.1, destIP:4.4.4.4}
Is there any way to automatically aggregate them into one document containing the following data?
{sourceIP:1.1.1.1, destIP:{2.2.2.2,3.3.3.3,4.4.4.4}}
So it is like GROUP BY in SQL, but it would generate new documents in Elasticsearch in place of the old ones.
I don't think there is any way to auto-merge documents at indexing time.
However, whatever result you are planning to query should be achievable with one of the querying options Elasticsearch offers, while still indexing the documents separately.
For example:
You can index separate documents, query by sourceIP, and use aggregations to get the destIP values
Take the count of documents if you just need to find the destIPs for a sourceIP
Also, if you want to avoid duplicate sourceIP + destIP combinations, you can concatenate the two and use the result as the _id of the document
Hope this helps.
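A small sketch of those two ideas: a nested terms aggregation that groups destination IPs under each source IP, and a helper that derives a deterministic `_id` from the pair. The field names `sourceIP` and `destIP` come from the question and are assumed to be exact-value (keyword or ip) fields; the aggregation names, bucket sizes, and `_` separator are arbitrary choices.

```python
def dest_ips_by_source_body(max_sources=100, max_dests=100):
    """Build a search body that lists destIP values per sourceIP."""
    return {
        "size": 0,  # buckets only, no individual hits
        "aggs": {
            "by_source": {
                "terms": {"field": "sourceIP", "size": max_sources},
                "aggs": {
                    "dest_ips": {
                        "terms": {"field": "destIP", "size": max_dests}
                    }
                },
            }
        },
    }


def doc_id(source_ip, dest_ip):
    """Concatenate the pair so re-indexing the same pair overwrites it."""
    return f"{source_ip}_{dest_ip}"


body = dest_ips_by_source_body()
print(doc_id("1.1.1.1", "2.2.2.2"))
```

Indexing each message under `doc_id(...)` means a repeated source and destination pair replaces the existing document instead of creating a duplicate.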

In Elasticsearch, how can I get results from each type in an index when the query is limited to 10 results?

I have four types in my index. I am searching for a keyword and the result is limited to 10. I need to get records from all types. Is this possible?
If you mean getting the first 10 docs per type, I'd use the multisearch API.
See https://www.elastic.co/guide/en/elasticsearch/reference/2.3/search-multi-search.html
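For illustration, a multi search body is newline-delimited JSON: a header line followed by a query line for each search. The sketch below builds one search per type; the index name `myindex`, the type names, and the use of a `query_string` query are assumptions, since the question doesn't give them.

```python
import json


def msearch_body(index, types, keyword, size=10):
    """Build an NDJSON _msearch body with one search per document type."""
    lines = []
    for doc_type in types:
        # Header line: which index and type this search targets.
        lines.append(json.dumps({"index": index, "type": doc_type}))
        # Body line: the same keyword query, limited to `size` hits.
        lines.append(json.dumps({
            "query": {"query_string": {"query": keyword}},
            "size": size,
        }))
    # The multi search API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


payload = msearch_body("myindex", ["type_a", "type_b", "type_c", "type_d"], "keyword")
print(payload)
```

The response contains one entry per search, in the same order, so each type contributes up to 10 hits of its own.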

Representing summary data in a Kibana Data Table

Using Kibana, is it possible to display one row of data which is a summary of other rows?
This is our requirement:
Given an entry in an index with the following structure:
string requestId
boolean raisedException
boolean requiredExternalLookup
We want to create a tabular output with the following structure
requestId numberRaisedException numberNoException numberRequiredLookup
So, if there were three rows (or entries) in the index for the same request id, two where an exception was raised, the output may look like this:
requestId    numberRaisedException  numberNoException  numberRequiredLookup
REQUEST_123  2                      1                  3
Presumably the correct Kibana visualization widget to represent this would be a Data Table. But how would one create a row like the above in Kibana, one that summarizes several rows, somewhat akin to a SQL GROUP BY clause? Is it possible at all?
You can probably do this with 'scripted_fields', but the status of that feature in Kibana isn't clear. I think it was recently blocked in Kibana for security reasons, since leaving it open is dangerous: a script can do anything.
If you have access to your Elasticsearch cluster, then you might be able to create the field on your Elasticsearch index.
You can read about it here: http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html
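If querying the cluster directly is an option, the GROUP BY style summary can also be expressed as an aggregation rather than a script field. This is a different technique from the one suggested above, sketched under the assumption that `requestId` is an exact-value field; the sub-aggregation names mirror the column names from the question, and `by_request` is made up.

```python
def request_summary_body(max_requests=100):
    """Build a search body with one bucket per requestId and three counts."""

    def count_where(field, value):
        # A filter aggregation reports how many docs in the bucket match.
        return {"filter": {"term": {field: value}}}

    return {
        "size": 0,  # only the per-request counts are needed
        "aggs": {
            "by_request": {
                "terms": {"field": "requestId", "size": max_requests},
                "aggs": {
                    "numberRaisedException": count_where("raisedException", True),
                    "numberNoException": count_where("raisedException", False),
                    "numberRequiredLookup": count_where("requiredExternalLookup", True),
                },
            }
        },
    }


body = request_summary_body()
print(sorted(body["aggs"]["by_request"]["aggs"]))
```

Each bucket in the response corresponds to one row of the desired table, with the `doc_count` of each sub-aggregation giving the column value.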
