How to retrieve and list the first element of a field using an Elasticsearch query (to compare, find, and delete duplicate documents in the same index)? - elasticsearch

In my Elasticsearch index, all logs have a field called RES, and its structure looks like this:
Number:"12131", amount:8, referenceNumber:"140102129728883", expire:"1365", securityControl:0
I want to compare Number across all indexed documents and delete the duplicated documents.
Can anybody help me?

Related

ElasticSearch: how to search from multiple indexes

I have a situation where I need to search from multiple indexes (products and users). Below is a sample query I am using to do that search
http://localhost:9200/_all/_search?q=*wood*
http://localhost:9200/users,products/_search?q=*wood*
The above API requests only return search results from the products index. But if I search using the request below, it returns search results from the users index:
http://localhost:9200/users/_search?q=*wood*
As you can see, I am passing the same value for the "q" parameter. I need to search both the products and users indexes and check whether the word "wood" appears in any attribute in either index. How can I achieve this?
Using _all also searches indices you don't intend to search. Instead, you can pass multiple index names as a comma-separated list, like
http://localhost:9200/users,products/_search?q=*wood*
That said, _all should also fetch the results from the users index that you get when you specify its name, so you need to debug why that isn't happening. Try increasing the size param to 1000: by default Elasticsearch returns only 10 results, and it seems that with _all the top results all come from the products index.
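As a rough illustration of the suggestion above, the request URL with an explicit index list and a larger size could be built like this. It is only a sketch: the helper name build_search_url is hypothetical, and the host and index names are the ones from the question.

```python
from urllib.parse import urlencode

def build_search_url(base_url, indices, query, size=10):
    """Build a URI search request against a comma-separated list of indices.

    `size` overrides the default page size of 10 hits.
    """
    index_path = ",".join(indices)
    # Keep '*' unescaped so the wildcard query stays readable in the URL.
    params = urlencode({"q": query, "size": size}, safe="*")
    return f"{base_url}/{index_path}/_search?{params}"

url = build_search_url("http://localhost:9200", ["users", "products"], "*wood*", size=1000)
print(url)
```

If hits from the users index show up once size is raised, the earlier result was simply the top 10 hits all coming from products.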

Screen out document results that share the same property value, except the first one

I have a DB of documents. Every document has a property (keyword) called index (nothing to do with the Elasticsearch index) and a property (keyword) named superIndex. There can be multiple documents with the same index and multiple documents with the same superIndex in the DB; these fields are not unique.
I run a compound query searching free text on the text content of these documents, with sorting, and get the results I want. However, I get many documents having the same index and/or superIndex. Currently I programmatically filter the result list and take only the first result from each index and superIndex. My requirement is that at the end I'm left with the top results from the sort, the first from each index and superIndex.
Can this be done using an Elasticsearch query? If so, how?
Field collapsing allows you to collapse all search results having the same value in a field (e.g. index). (See Elasticsearch Reference: Field Collapsing)
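A minimal sketch of what such a request body might look like, built as a Python dict. The text field name content and the helper name are assumptions, not taken from the question. Note that the top-level collapse takes a single field, so this sketch collapses on index only; superIndex would still need a separate step, such as the existing client-side filter.

```python
import json

def build_collapsed_search(text, text_field="content", collapse_field="index", size=10):
    """Build a search body that keeps only the best-sorted hit per collapse_field value.

    text_field is a hypothetical name for the free-text field.
    """
    return {
        "query": {"match": {text_field: text}},
        # One hit per distinct value of collapse_field; the kept hit is the
        # first one according to the sort below.
        "collapse": {"field": collapse_field},
        "sort": ["_score"],
        "size": size,
    }

body = build_collapsed_search("some free text")
print(json.dumps(body, indent=2))
```

The body would be sent to the _search endpoint of the index holding the documents; the collapse field must be a keyword or numeric field, which matches the keyword properties described in the question.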

Intersectional search from two indices

I have two indices in Elasticsearch: Tracks and Messages.
Each index has ServiceId field.
Search in the Tracks index is always performed by fields (exact value) in the index.
Search in the Messages is always performed by free text search in the index.
I need to implement an intersectional search across both indices: find ServiceIds from the Tracks index by exact field values, then find ServiceIds from the Messages index by free text search, and then intersect the results to get an array of ServiceIds.
The actual result shown to the user is the list of Tracks for the ServiceIds returned by the intersect step.
How can I do it? Each search (on Tracks and on Messages) would have to return all matching documents before the intersection is performed. But as I understand it, Elasticsearch cannot return all the documents (suppose I have hundreds of millions of documents in each index), so an intersection over partial results would not be correct...
This could be achieved by using aliases. Refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html for more information.
You need a common field in both of your indices; in this case, it is ServiceId.
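For illustration, the body of an aliases request that puts both indices behind one alias might be built as below. This is a sketch under assumptions: the alias name services and the lowercase index names tracks and messages are hypothetical, and the alias only lets one search span both indices; it does not compute the ServiceId intersection by itself.

```python
import json

def build_alias_actions(indices, alias):
    """Build the body for POST /_aliases that adds every index to one alias."""
    return {
        "actions": [{"add": {"index": index, "alias": alias}} for index in indices]
    }

# Hypothetical names: Elasticsearch index names must be lowercase.
body = build_alias_actions(["tracks", "messages"], "services")
print(json.dumps(body, indent=2))
```

A search against the alias then runs over both indices, and the shared ServiceId field can be used to group or filter the combined hits.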

Elasticsearch get document count for all fields of index type

I would like to get an aggregation with the document count for every field of an index. Is this possible, or do I have to define every field within a missing aggregation to get something similar?

Messages aggregation in elasticsearch

For example, I have the following documents:
{sourceIP:1.1.1.1, destIP:2.2.2.2}
{sourceIP:1.1.1.1, destIP:3.3.3.3}
{sourceIP:1.1.1.1, destIP:4.4.4.4}
Is there any way to automatically aggregate them into one document containing the following data?
{sourceIP:1.1.1.1, destIP:{2.2.2.2,3.3.3.3,4.4.4.4}}
So it looks like GROUP BY in SQL, but it generates new documents in Elasticsearch in place of the old ones.
I don't think there is any way to auto-merge documents at indexing time.
However, whatever result you plan to query should be achievable with one of the querying options Elasticsearch offers, while still indexing the documents separately.
For example:
You can index separate documents, query by sourceIP, and use aggregations to get the destIP values.
You can take the count of documents if you only need to find the destIPs for a sourceIP.
Also, if you want to avoid duplicate sourceIP + destIP combinations, you can concatenate them and use the result as the _id of the document.
Hope this helps.
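The suggestions above can be sketched as follows. This assumes sourceIP and destIP are mapped as keyword or ip fields so that term queries and terms aggregations work on them; the helper names, the aggregation name dest_ips, and the underscore separator are hypothetical choices.

```python
import json

def build_dest_ips_query(source_ip, max_dests=100):
    """Build a search body listing the distinct destIP values for one sourceIP."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"term": {"sourceIP": source_ip}},
        "aggs": {
            "dest_ips": {"terms": {"field": "destIP", "size": max_dests}}
        },
    }

def pair_id(source_ip, dest_ip):
    """Derive a document _id from the pair so that re-indexing the same
    sourceIP + destIP combination overwrites instead of duplicating."""
    return f"{source_ip}_{dest_ip}"

print(json.dumps(build_dest_ips_query("1.1.1.1"), indent=2))
print(pair_id("1.1.1.1", "2.2.2.2"))
```

Each bucket in the dest_ips aggregation response carries a key (the destIP) and a doc_count, which gives the grouped view from the question without creating a merged document.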
