Search for documents with exactly different fields values - elasticsearch

I'm adding documents with the following strutucte
{
"proposta": {
"matriculaIndicacao": 654321,
"filial": 100,
"cpf": "12345678901",
"idStatus": "3",
"status": "Reprovada",
"dadosPessoais": {
"nome": "John Five",
"dataNascimento": "1980-12-01",
"email": "fulanodasilva#fulano.com.br",
"emailValidado": true,
"telefoneCelular": "11 99876-9999",
"telefoneCelularValidado": true,
"telefoneResidencial": "11 2211-1122",
"idGenero": "1",
"genero": "M"
}
}
}
I'm trying to perform a search with multiple field values.
I can successfull search for a document with a specific cpf atribute with the following search
{
"query": {
"term" : {
"proposta.cpf" : "23798770823"
}
}
}
But now I need to add an AND clause, like
{
"query": {
"term" : {
"proposta.cpf" : "23798770823"
,"proposta.dadosPessoais.dataNascimento": "1980-12-01"
}
}
}
but it's returning an error message.
P.S: If possible I would like to perform a search where if the field doesn't exist, it returns the document that matches only the proposta.cpf field.
I really appreciate any help.

The idea is to combine your constraints within a bool/should query
{
"query": {
"bool": {
"should": [
{
"term": {
"proposta.cpf": "23798770823"
}
},
{
"term": {
"proposta.dadosPessoais.dataNascimento": "1980-12-01"
}
}
]
}
}
}

Related

Elasticsearch: How to filter results with a specific word in a value using elasticsearch

I need to add a parameter to my search that filters results containing a specific word in a value. The query is searching for user history records and contains a url key. I need to filter out /history and any other url containing that string.
Here's my current query:
GET /user_log/_search
{
"size" : 50,
"query": {
"match": {
"user_id": 56678
}
}
}
Here's an example of a record, boiled down to just the value we're looking at:
"_source": {
"url": "/history?page=2&direction=desc",
},
How can the parameters of the search be changed to filter out this result.
You can use the filter param of boolean query in Elasticsearch.
if your url field is of type keyword, you can use the below query
{
"query": {
"bool": {
"must": {
"match": {
"user_id": 56678
}
},
"filter": { --> note filter
"term": {
"url": "/history"
}
}
}
}
}
I found a way to solve my specific issue. Instead of filtering on the url I'm filtering on a different value. Here's what I'm using now:
{
"size" : 50,
"query": {
"bool" : {
"must" : {
"match" : { "user_id" : 56678 }
},
"must_not": {
"match" : { "controller": "History" }
}
}
}
}
I'm still going to leave this question open for a while to see if anyone has other ways of solving the original problem.

Searching upon multiple date fields in ElasticSearch

I have a document like this:
{
"listings": {
"mappings": {
"listing": {
"properties": {
"auctionOn": {
"type": "date"
},
"inspections": {
"type": "nested",
"properties": {
"endsOn": {
"type": "date"
},
"startsOn": {
"type": "date"
}
}
},
// more fields.. snipped for brevity
}
}
}
}
}
and i would like to perform the following search: (needs to be a bool filter.. no scoring req'd)
return documents any of the inspections.startsOn matches any of the dates provided (if they are provided)
OR
return documents where auctionOn matches the date provided (if it's provided)
they can also specify to search for a) inspections only, b) auctions only. if not provided, either of the dates need to match.
So in other words, possible searches:
Search where there are any inspections/auctions
Search where there are any inspections
Search where there are any auctions
Search where there are any inspections/auctions on the dates provided
Search where there are any inspections on the dates provided
Search where there are any auctions on the dates provided
Now, i'm already in a bool query filter:
{
"query":
{
"bool":
{
"filter":[{"terms":{"location.suburb":["Camden"]}}
}
}
}
and i need this new filter to be seperate. so.. this is like a nested or filter, within a main bool filter?
So if provided "Suburb = Camden, Dates = ['2018-11-01','2018-11-02']'
then it should return documents where the suburb = Camden and either the inspections or auction date includes one of the dates provided.
I'm kinda stumped on how to do it, so any help would be much appreciated!
There will lot of bool query combinations for the cases you mentioned in the question. Taking the example you mention i.e.
So if provided "Suburb = Camden, Dates = ['2018-11-01','2018-11-02']'
then it should return documents where the suburb = Camden and either
the inspections or auction date includes one of the dates provided.
Assuming your location filter is working as expected, for dates part in the above e.g. additions to the query will be:
{
"query": {
"bool": {
"filter": [
{
"terms": {
"location.suburb": [
"Camden"
]
}
},
{
"bool": {
"should": [
{
"terms": {
"auctionOn": [
"2018-11-01",
"2018-11-02"
]
}
},
{
"nested": {
"path": "inspections",
"query": {
"bool": {
"should": [
{
"terms": {
"inspections.startsOn": [
"2018-11-01",
"2018-11-02"
]
}
},
{
"terms": {
"inspections.endsOn": [
"2018-11-01",
"2018-11-02"
]
}
}
]
}
}
}
}
]
}
}
]
}
}
}

Elasticsearch: Search in an array of JSONs

I'm using Elasticsearch with the python library and I have a problem using the search query when the object become a little bit complex. I have objects build like that in my index:
{
"id" : 120,
"name": bob,
"shared_status": {
"post_id": 123456789,
"text": "This is a sample",
"urls" : [
{
"url": "http://test.1.com",
"displayed_url": "test.1.com"
},
{
"url": "http://blabla.com",
"displayed_url": "blabla.com"
}
]
}
}
Now I want to do a query that will return me this document only if in one of the displayed URL's a substring "test" and there is a field "text" in the main document. So I did this query:
{
"query": {
"bool": {
"must": [
{"exists": {"field": "text"}}
]
}
}
}
}
But I don't know what query to add for the part: one of the displayed URL's a substring "test"
Is that posssible? How does the iteration on the list works?
If you didn't define an explicit mapping for your schema, elasticsearch creates a default mapping based on the data input.
urls will be of type object
displayed_url will be of type string and using standard analyzer
As you don't need any association between url and displayed_url, the current schema will work fine.
You can use a match query for full text match
GET _search
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "text"
}
},
{
"match": {
"urls.displayed_url": "test"
}
}
]
}
}
}

Boolean AND with exact matches oin Elasticsearch

In our Elasticsearch collection of products, we have an an array of hashes, called "nutrients". A partial example of the data would be:
"_source": {
"quantity": "150.0",
"id": 1001,
"barcode": "7610809001066",
"nutrients": [
{
"per_hundred": "1010.0",
"name_fr": "Énergie",
"per_portion": "758.0",
"name_de": "Energie",
"per_day": "9.0",
"name_it": "Energia",
"name_en": "Energy"
},
{
"per_hundred": "242.0",
"name_fr": "Énergie (kCal)",
"per_portion": "181.0",
"name_de": "Energie (kCal)",
"per_day": "9.0",
"name_it": "Energia (kCal)",
"name_en": "Energy (kCal)"
},
{
"per_hundred": "18.0",
"name_fr": "Matières grasses",
"per_portion": "13.5",
"name_de": "Fett",
"per_day": "19.0",
"name_it": "Grassi",
"name_en": "Fat"
},
In the search, we are trying to bring back the products based on an exact match of two of the fields contained in the nutrients array. What I am finding is the conditions seemed to be OR and not AND.
The two attempts have been:
"query": {
"bool": {
"must": [
{ "match": { "nutrients.name_fr": "Énergie" } },
{ "match": { "nutrients.per_hundred": "242.0" } }
]
}
}
}
and
"query": {
"filtered": {
"filter": {
"and": [
{ "term": { "nutrients.name_fr": "Énergie" } },
{ "term": { "nutrients.per_hundred": "242.0" } }
]
}
}
}
Both of these are in fact bringing back entries with Énergie and 242.0, but are also match on different name_fr, eg:
{
"per_hundred": "242.0",
"name_fr": "Acide folique",
"per_portion": "96.0",
"name_de": "Folsäure",
"per_day": "48.0",
"name_it": "Acido folico",
"name_en": "Folic acid"
},
They are also matching on a non exact match, i.e: matching also on "Énergie (kCal)" when we want to match only on "Énergie"
On your first problem:
You have to make the nutrients field nested, so you can query each object inside it for itself Elasticsearch Nested Objects.

Querystring search on array elements in Elastic Search

I'm trying to learn elasticsearch with a simple example application, that lists quotations associated with people. The example mapping might look like:
{
"people" : {
"properties" : {
"name" : { "type" : "string"},
"quotations" : { "type" : "string" }
}
}
}
Some example data might look like:
{ "name" : "Mr A",
"quotations" : [ "quotation one, this and that and these"
, "quotation two, those and that"]
}
{ "name" : "Mr B",
"quotations" : [ "quotation three, this and that"
, "quotation four, those and these"]
}
I would like to be able to use the querystring api on individual quotations, and return the people who match. For instance, I might want to find people who have a quotation that contains (this AND these) - which should return "Mr A" but not "Mr B", and so on. How can I achieve this?
EDIT1:
Andrei's answer below seems to work, with data values now looking like:
{"name":"Mr A","quotations":[{"value" : "quotation one, this and that and these"}, {"value" : "quotation two, those and that"}]}
However, I can't seem to get a query_string query to work. The following produces no results:
{
"query": {
"nested": {
"path": "quotations",
"query": {
"query_string": {
"default_field": "quotations",
"query": "quotations.value:this AND these"
}
}
}
}
}
Is there a way to get a query_string query working with a nested object?
Edit2: Yes it is, see Andrei's answer.
For that requirement to be achieved, you need to look at nested objects, not to query a flattened list of values but individual values from that nested object. For example:
{
"mappings": {
"people": {
"properties": {
"name": {
"type": "string"
},
"quotations": {
"type": "nested",
"properties": {
"value": {
"type": "string"
}
}
}
}
}
}
}
Values:
{"name":"Mr A","quotations":[{"value": "quotation one, this and that and these"}, {"value": "quotation two, those and that"}]}
{"name":"Mr B","quotations":[{"value": "quotation three, this and that"}, {"value": "quotation four, those and these"}]}
Query:
{
"query": {
"nested": {
"path": "quotations",
"query": {
"bool": {
"must": [
{ "match": {"quotations.value": "this"}},
{ "match": {"quotations.value": "these"}}
]
}
}
}
}
}
Unfortunately there is no good way to do that.
https://web.archive.org/web/20141021073225/http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/complex-core-fields.html
When you get a document back from Elasticsearch, any arrays will be in
the same order as when you indexed the document. The _source field
that you get back contains exactly the same JSON document that you
indexed.
However, arrays are indexed — made searchable — as multi-value fields,
which are unordered. At search time you can’t refer to “the first
element” or “the last element”. Rather think of an array as a bag of
values.
In other words, it is always considering all values in the array.
This will return only Mr A
{
"query": {
"match": {
"quotations": {
"query": "quotation one",
"operator": "AND"
}
}
}
}
But this will return both Mr A & Mr B:
{
"query": {
"match": {
"quotations": {
"query": "this these",
"operator": "AND"
}
}
}
}
If scripting is enabled, this should work:
"script": {
"inline": "for(element in _source.quotations) { if(element == 'this' && element == 'these') {return true;} }; return false;"
}

Resources