Python OpenSearch retrieve records based on element in a list - elasticsearch

I need to retrieve records based on a field called "cash_transfer_ids", which is a Python list.
I want to retrieve all records whose cash_transfer_ids contain a specific id value (a string).
What should the query look like? Should I use a match or a term query?
Example: I want to retrieve any record whose cash_transfer_ids field contains 'abc'.
Then I may get records such as
record 1: cash_transfer_ids:['abc']
record 2: cash_transfer_ids:['dfdfd', 'abc']
etc...
Thanks very much for any help!

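If cash_transfer_ids is mapped as type keyword, you can filter with a term query. On an array field it matches when any element equals the value.
term = "abc"
query = {
    "query": {
        "term": {
            "cash_transfer_ids": {
                "value": term
            }
        }
    }
}
# get_client_es() is the answerer's own helper that returns a client instance
response = get_client_es().search(index="idx_test", body=query)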
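A minimal sketch of the same query with the field name made a parameter. The helper build_term_query is hypothetical (it is not part of any client library) and only builds the request body. The second call assumes the index was created with default dynamic mapping, in which case the exact-value variant of the field is usually the cash_transfer_ids.keyword sub-field; check your own mapping before relying on that.

```python
def build_term_query(field: str, value: str) -> dict:
    """Build a term query body matching documents whose `field`
    (a single value or an array of values) contains exactly `value`."""
    return {"query": {"term": {field: {"value": value}}}}


# Case 1: the field is explicitly mapped as keyword.
keyword_query = build_term_query("cash_transfer_ids", "abc")

# Case 2 (assumption): the field was dynamically mapped as text
# with a .keyword sub-field, so the exact value lives there.
subfield_query = build_term_query("cash_transfer_ids.keyword", "abc")
```

Either body can be passed to the client's search call in the same way as the query above. A match query would also find 'abc' on a text field, but it analyzes the input first, so a term query on a keyword field is the safer choice for ids.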

Related

Exact match over decimal values

I want to perform an exact match over decimal values.
I have submitted two applications, the first with an annual salary of 99999868.10 and the other with 99999868.99.
When I query for 99999868 or for 99999868.10, both documents are returned, whereas I expect only the exact match.
The query I am executing is:
GET index/_search
{
  "query": {
    "term": {
      "Annual Salary": {
        "value": "99999868"
      }
    }
  }
}
Change the mapping of the salary field to a numeric type and reindex the data.
Numeric type reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html
Alternatively, try a match_phrase query and see whether it solves your problem.
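Which numeric type you pick matters here. One possible cause of the behaviour in the question, assuming the field is mapped as float (the question does not show the mapping, so this is a guess), is that a 32-bit float cannot tell these values apart. The sketch below uses only the Python standard library to round-trip the values through a 32-bit float; the mapping at the end is an illustrative suggestion for recent Elasticsearch versions, not the asker's actual mapping.

```python
import struct


def to_float32(x: float) -> float:
    """Round-trip a Python float through a 32-bit IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


first = to_float32(99999868.10)
second = to_float32(99999868.99)

# Near 1e8 adjacent 32-bit floats are 8 apart, so both salaries
# collapse onto the same stored value.
same_as_float32 = first == second

# Illustrative mapping (assumption): store the salary in cents precision.
suggested_mapping = {
    "mappings": {
        "properties": {
            "Annual Salary": {"type": "scaled_float", "scaling_factor": 100}
        }
    }
}
```

A double field would also keep the two values distinct; scaled_float is simply a common choice for money-like values with a fixed number of decimals.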

Group by field in found document

The best way to explain what I want to accomplish is by example.
Let us say that I have an object with the fields name, color and transaction_id. I want to search for documents where name and color match the specified values, which I can accomplish easily with boolean queries.
But I do not want only the documents found by the search query. I also want the whole transaction those documents belong to, which is identified by transaction_id. For example, if a document has been found with transaction_id equal to 123, I want my query to return all documents with transaction_id equal to 123.
Of course, I can do that with two queries: the first fetches all documents that match the criteria, and the second returns all documents that have one of the transaction_id values found by the first query.
But is there any way to do it in a single query?
You can use a parent-child relationship between the transaction and your object, or denormalize your data and nest the objects inside the transactions. Otherwise you'll have to do an application-side join, meaning two queries.
Try an index mapping similar to the following, and include a parent_id in the objects.
{
  "mappings": {
    "transaction": {},
    "object": {
      "_parent": {
        "type": "transaction"
      }
    }
  }
}
Further reading:
https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child-mapping.html
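The application-side join mentioned in the answer can be sketched as follows. The three helper functions are hypothetical and only build or inspect plain dictionaries; the term clauses assume that name and color are keyword fields, and fake_response is a hand-written stand-in for what a search call would return. Note also that the _parent mapping shown above belongs to older Elasticsearch versions; newer versions use a join field instead.

```python
def build_match_query(name: str, color: str) -> dict:
    """First query: documents whose name and color match exactly."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"name": name}},
                    {"term": {"color": color}},
                ]
            }
        }
    }


def extract_transaction_ids(response: dict) -> list:
    """Collect the distinct transaction_id values from a search response."""
    hits = response.get("hits", {}).get("hits", [])
    return sorted({hit["_source"]["transaction_id"] for hit in hits})


def build_transaction_query(transaction_ids: list) -> dict:
    """Second query: every document belonging to those transactions."""
    return {"query": {"terms": {"transaction_id": transaction_ids}}}


# Hand-written stand-in for the response to the first query.
fake_response = {
    "hits": {
        "hits": [
            {"_source": {"name": "a", "color": "red", "transaction_id": 123}},
            {"_source": {"name": "a", "color": "red", "transaction_id": 456}},
            {"_source": {"name": "a", "color": "red", "transaction_id": 123}},
        ]
    }
}

first_query = build_match_query("a", "red")
ids = extract_transaction_ids(fake_response)
second_query = build_transaction_query(ids)
```

In a real application the first response would come from the search client, and the second query would be sent with the ids extracted from it.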

Searching for multiple values in a String array in Elastic

I have a field that I am indexing into Elasticsearch that is an array of strings. So, for example, here is what the string array will look like in two records:
Record 1: {"str1", "str2", "str3", "str4", "str5"}
Record 2: {"str1", "str2", "str6", "str7", "str8"}
Question 1: I want to be able to query for multiple strings in this array. For example, my query has "str1", "str2", "str3" as the search parameters. I want to find records where the string array contains any of these three strings.
Question 2: In the scenario above, will Record 1 be returned with a higher score than Record 2 (since all three strings are in the array for Record 1 but only two are in Record 2)?
Is this possible at all? Can you please help with what the query should look like and whether the scoring works the way I described?
You can index them as an array, such as:
{
  "myArrayField": [ "str1", "str2", "str3", "str4", "str5" ],
  ...
}
You would then be able to query a number of ways, the simplest for your case being a match query (which is analyzed):
{
  "match" : {
    "myArrayField" : "str1 str2 str3"
  }
}
Or a terms query (which is not analyzed):
{
  "terms" : {
    "myArrayField" : [ "str1", "str2", "str3" ]
  }
}
And yes, with the match query, documents that match more of the query terms receive a higher score, so Record 1 would be scored higher than Record 2.
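One caveat worth checking against your version: to my knowledge the terms query gives every matching document the same score, so it answers Question 1 but not Question 2. If you want exact (non-analyzed) matching and a score that grows with the number of matched strings, one option is a bool query with one term clause per string under should. The helper names below are hypothetical; count_matches only illustrates the ranking intuition locally and is not the score Elasticsearch computes.

```python
def build_should_query(field: str, values: list) -> dict:
    """Bool query with one scoring term clause per wanted value.
    A document must match at least one clause to be returned."""
    return {
        "query": {
            "bool": {
                "should": [{"term": {field: value}} for value in values],
                "minimum_should_match": 1,
            }
        }
    }


def count_matches(record: list, wanted: list) -> int:
    """Local illustration: how many wanted strings a record contains."""
    return len(set(record) & set(wanted))


wanted = ["str1", "str2", "str3"]
record_1 = ["str1", "str2", "str3", "str4", "str5"]
record_2 = ["str1", "str2", "str6", "str7", "str8"]

query = build_should_query("myArrayField", wanted)
matches_1 = count_matches(record_1, wanted)
matches_2 = count_matches(record_2, wanted)
```

Record 1 satisfies three should clauses and Record 2 only two, which is why Record 1 would be expected to rank first with this query shape.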

Project the sum of all fields in a document that match a regular expression, in elasticsearch

In Elasticsearch, I know I can specify the fields I want to return from documents that match my query using {"fields":["fieldA", "fieldB", ..]}.
But how do I return the sum of all fields that match a particular regular expression (as a new field)?
For example, if my documents look like this:
{
  "documentid": 1,
  "documentStats": {
    "foo_1_1": 1,
    "foo_2_1": 5,
    "boo_1_1": 3
  }
}
and I want the sum of all stats that match _1_ per document?
You can define an artificial field, a so-called script field, that contains a small Groovy script which will do the job for you.
So after your query, you can add a script_fields section like this:
{
  "query" : {
    ...
  },
  "script_fields" : {
    "sum" : {
      "script" : "_source.documentStats.findAll{ it.key =~ '_1_'}.collect{it.value}.sum()"
    }
  }
}
The script simply retrieves all the fields in documentStats whose name matches _1_ and sums their values. In this case you'll get 4 (foo_1_1 + boo_1_1 = 1 + 3; foo_2_1 does not contain _1_).
Make sure to enable dynamic scripting in elasticsearch.yml and restart your ES node before trying this out.
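If enabling dynamic scripting is not an option (and Groovy scripting is no longer available in newer Elasticsearch versions), the same sum can be computed client-side once the documents have been fetched. This is a minimal sketch in plain Python; the function name sum_matching_stats is made up for illustration, and the document is the example from the question.

```python
import re


def sum_matching_stats(document: dict, pattern: str) -> int:
    """Sum the values in documentStats whose key contains `pattern`."""
    stats = document.get("documentStats", {})
    return sum(value for key, value in stats.items() if re.search(pattern, key))


doc = {
    "documentid": 1,
    "documentStats": {
        "foo_1_1": 1,
        "foo_2_1": 5,
        "boo_1_1": 3,
    },
}

total = sum_matching_stats(doc, "_1_")
```

Like the Groovy script, this matches the pattern anywhere in the key, so foo_1_1 and boo_1_1 are counted and foo_2_1 is not.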

To Select documents having same startDate and endDate

I have some documents, each of which has startDate and endDate date fields. I need all documents where both values are the same. I couldn't find any query that helps me do this.
Elasticsearch supports script filters, which you can use in this case.
Something like this is what you will need:
POST /<yourIndex>/<yourType>/_search
{
  "query": {
    "filtered": {
      "filter": {
        "script": {
          "script": "doc['startDate'].value == doc['endDate'].value"
        }
      }
    }
  }
}
This can be achieved in two ways:
Index solution - While indexing, add an additional field called isDateSame and set it to true or false based on the values of startDate and endDate. Then you can easily query on that field. This is the best-optimized solution.
Script solution - Elasticsearch keeps the indexed data in field data, which is more like a reversed inverted index. Using a script you can access any indexed field and do the comparison. This is pretty fast, but not as good as the first option. You can use a script filter query for this.
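The index solution above can be sketched in Python as follows. The helper with_date_flag is hypothetical; it assumes both dates are stored in the same string format, so that plain equality is a valid comparison, and it treats a document with a missing date as not matching.

```python
def with_date_flag(doc: dict) -> dict:
    """Return a copy of `doc` with an isDateSame flag, to be indexed
    instead of the original document."""
    start = doc.get("startDate")
    end = doc.get("endDate")
    flagged = dict(doc)
    # Missing dates never count as equal.
    flagged["isDateSame"] = start is not None and start == end
    return flagged


# Query body that selects the flagged documents after indexing.
same_date_query = {"query": {"term": {"isDateSame": True}}}

docs = [
    {"startDate": "2015-01-01", "endDate": "2015-01-01"},
    {"startDate": "2015-01-01", "endDate": "2015-02-01"},
    {"startDate": "2015-01-01"},
]
flagged_docs = [with_date_flag(d) for d in docs]
```

The cost of the comparison is paid once at index time, and the search itself becomes a plain term filter on a boolean field.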
