Elasticsearch: get all documents where array contains one of many values

I have the following document data structure in Elasticsearch:
{
"topics": [ "a", "b", "c", "d" ]
}
I have a selection list where the user can filter which topics to show. When the user is happy with their filter, they will be presented with all documents that have any of the selected topics in the array "topics".
I've tried the query
{
"query": {
"terms": {
"topics": ["a", "b"]
}
}
}
but this returns no results.
To expand on the query: for example, the list ["a", "b"] would match the first, second and third objects in the array below.
Is there a good way to do this in Elasticsearch? Obviously I could do multiple "match" queries, but that's verbose as I have hundreds of topics.
Edit: my mapping
{
"fb-cambodia-post": {
"mappings": {
"scrapedpost": {
"properties": {
"topics": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}

As @Filip cordas mentioned, you can use topics.keyword like this:
{
"query": {
"terms": {
"topics.keyword": [
"A" , "B"
]
}
}
}
This does a case-sensitive search and looks for an exact match. If you want a case-insensitive search, you can use query_string like this:
{
"query": {
"query_string": {
"default_field": "topics",
"query": "A OR B"
}
}
}

Here is some more background on the problem. The query works with the data you posted ("a", "b", "c"), but it fails once the topics contain uppercase letters or multiple words. This is due to the analyzer applied to the topics field. When you index a string value, Elasticsearch uses the standard analyzer by default. The terms query does not analyze its input; it compares the values exactly as given against the indexed terms. So if a document contains "Topic1" and you search with "terms": {"topics": ["Topic1"]}, nothing is returned, because the standard analyzer lowercased the indexed term; the query that does return it is "terms": {"topics": ["topic1"]}. As of 5.0, Elastic added a default "keyword" sub-field that stores the value as is, with no transformation applied. A terms query on that field, "terms": {"topics.keyword": ["Topic1"]}, finds the document, but "terms": {"topics.keyword": ["topic1"]} does not. The match query, by contrast, applies the same analysis to the input string as well, which is why it returns the right result.
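The behaviour described above can be illustrated with a small Python sketch. It does not call Elasticsearch; `simple_analyze` is only a rough stand-in for the standard analyzer (split into word tokens, lowercase), and the sample topics and all function names are made up for illustration.

```python
import re

def simple_analyze(text):
    """Rough approximation of the standard analyzer: word tokens, lowercased."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def index_text(values):
    """Terms stored for an analyzed text field holding an array of strings."""
    terms = set()
    for value in values:
        terms.update(simple_analyze(value))
    return terms

def index_keyword(values):
    """Terms stored for a keyword sub-field: every value is kept verbatim."""
    return set(values)

def terms_query_matches(indexed_terms, wanted):
    """A terms query matches when any wanted value equals a stored term exactly."""
    return any(value in indexed_terms for value in wanted)

# Hypothetical document: {"topics": ["Topic1", "Machine Learning"]}
doc_topics = ["Topic1", "Machine Learning"]
text_terms = index_text(doc_topics)        # topic1, machine, learning
keyword_terms = index_keyword(doc_topics)  # Topic1, Machine Learning

print(terms_query_matches(text_terms, ["Topic1"]))     # False: stored term is lowercased
print(terms_query_matches(text_terms, ["topic1"]))     # True
print(terms_query_matches(keyword_terms, ["Topic1"]))  # True: exact, case-sensitive
print(terms_query_matches(keyword_terms, ["topic1"]))  # False
```

The same reasoning explains why a multi-word topic such as "Machine Learning" can only be matched as a whole through the keyword sub-field.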

Related

ElasticSearch - Search for value on nested object under any key

I have documents indexed in ES with the following structure:
doc 1:
{
"map": {
"field1": ["foo"],
"field2": ["bar"]
}
}
doc2
{
"map": {
"fieldN": ["foo"]
}
}
I need to search all the documents that match a specific value under any key in the "map" object. Since the fields in "map" are dynamic, the value can be found under any key.
I tried different queries but none of them seems to work since it looks like for all the cases, I need to specify the field explicitly (ex.: map.field1 = "foo")
I would hope to be able to do a search like this:
{
"fields": ["map.*"],
"query": "foo"
}
Any recommendations on how to approach this type of search?
You can use a multi_match query to search for a query term across all map.* fields:
{
"query": {
"multi_match" : {
"query": "foo",
"fields": [ "map.*" ]
}
}
}

Exact match search on text field

I'm using Elasticsearch to search data. My data contains a text field, and when I run a match query on an input value, it returns that value along with another string.
_mapping
"direction": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
Elastic Data
[
{
direction: "North"
},
{
direction: "North East"
}
]
Query
{
match: {
"direction" : {
query: "North",
operator : "and"
}
}
}
Result
[
{
direction: "North"
},
{
direction: "North East"
}
]
Expected Result
[
{
direction: "North"
}
]
Note: it should return only the exact match for direction.
You may want to look at Term Queries which are used on keyword datatype to perform exact match searches.
POST <your_index_name>/_search
{
"query": {
"term": {
"direction.keyword": {
"value": "North"
}
}
}
}
You see this result because you are querying a text field with a match query. The values of a text field are broken down into tokens, which are then stored in the inverted index. This process is called analysis. Text fields are not meant to be used for exact matching.
Also note that whatever words you pass to a match query go through the same analysis phase before the query is executed.
Hope it helps!
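To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is a simulation rather than real Elasticsearch code: `simple_analyze` only approximates the standard analyzer, and the function names are invented for this example.

```python
import re

def simple_analyze(text):
    """Rough approximation of the standard analyzer: word tokens, lowercased."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def match_and(field_value, query_text):
    """match with operator "and": every analyzed query token must be
    present among the tokens of the analyzed field value."""
    doc_tokens = set(simple_analyze(field_value))
    return all(tok in doc_tokens for tok in simple_analyze(query_text))

def term_on_keyword(field_value, query_value):
    """term on a keyword sub-field: the whole value must be identical."""
    return field_value == query_value

directions = ["North", "North East"]

matched_by_match = [d for d in directions if match_and(d, "North")]
matched_by_term = [d for d in directions if term_on_keyword(d, "North")]

print(matched_by_match)  # ['North', 'North East']
print(matched_by_term)   # ['North']
```

The match query finds the token north in both documents, while the term query on the keyword value only accepts the document whose whole value is "North".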
Based on your mapping, you should not search on the field direction but on direction.keyword if you want an exact match. The field direction is of type text and gets analyzed, in your case into the words north and east.
Try this:
{ "query" : { "bool" : { "must": { "term": { "direction.keyword": "North" } } } } }

handling both exact and partial search on the same search string

I want to define the schema which can tackle the partial as well as the exact search for the same search value.
The exact search should always return the "exact match", ES should not break the search string into tokens in this case.
For partial matching the property's data type should be text, and for exact matching it should be keyword. To support both partial and exact search without indexing the data into separate properties, you can use fields (multi-fields), which index the same data in different ways.
So, let's say you want to index the names of persons and be able to search both partially and exactly. In that case the mapping would be:
PUT test
{
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
}
Let's index a few docs:
PUT test/_doc/1
{
"name": "Nishant Saini"
}
PUT test/_doc/2
{
"name": "Nishant Kumar"
}
For partial search, query the name field, which is of type text:
GET test/_doc/_search
{
"query": {
"query_string": {
"query": "Nishant Saini",
"fields": [
"name"
]
}
}
}
The above query returns both docs (1 and 2) because one token, Nishant, appears in the name field of both documents.
For exact search we need to query name.keyword. To perform an exact match we can use a term query, as below:
{
"query": {
"term": {
"name.keyword": "Nishant Saini"
}
}
}
This would match doc 1 only.
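As a sketch of how an application might switch between the two modes, the following Python helper builds either request body. The helper name `build_search_body` and its parameters are hypothetical, and it assumes the keyword sub-field is called "keyword" as in the mapping above; sending the body to Elasticsearch is left out.

```python
import json

def build_search_body(field, text, exact, keyword_subfield="keyword"):
    """Build a search body for one field.

    exact=True  -> term query on the keyword sub-field (no analysis).
    exact=False -> query_string on the analyzed text field (partial match).
    """
    if exact:
        return {"query": {"term": {f"{field}.{keyword_subfield}": text}}}
    return {"query": {"query_string": {"query": text, "fields": [field]}}}

partial_body = build_search_body("name", "Nishant Saini", exact=False)
exact_body = build_search_body("name", "Nishant Saini", exact=True)

print(json.dumps(partial_body, indent=2))
print(json.dumps(exact_body, indent=2))
```

Keeping both queries behind one function means the same user input can be sent to either field without re-indexing anything.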

Elasticsearch with AND query in DSL

This drives me crazy. I have no clue why this Elasticsearch query does not return any value.
I indexed the values like this:
PUT /customer/person-test/1?pretty
{
"name": "John Doe",
"personId": 153,
"houseHoldId": 6191136,
"quarter": "2016_Q1"
}
PUT /customer/person-test/2?pretty
{
"name": "John Doe",
"personId": 153,
"houseHoldId": 6191136,
"quarter": "2016_Q2"
}
and when I query like this, it does not return any value:
GET /customer/person-test/_search
{
"query": {
"bool": {
"must" : [
{
"term": {
"name": "John Doe"
}
},
{
"term": {
"quarter": "2016_Q1"
}
}
]
}
}
}
I copied this query from "A simple AND query with Elasticsearch".
I just want to get the person with "John Doe" AND "2016_Q1". Why did this not work?
You should use match instead of term:
GET /customer/person-test/_search
{
"query": {
"bool": {
"must" : [
{
"match": {
"name": "John Doe"
}
},
{
"match": {
"quarter": "2016_Q1"
}
}
]
}
}
}
Explanation
Why doesn’t the term query match my document ?
String fields can be of type text (treated as full text, like the body
of an email), or keyword (treated as exact values, like an email
address or a zip code). Exact values (like numbers, dates, and
keywords) have the exact value specified in the field added to the
inverted index in order to make them searchable.
However, text fields are analyzed. This means that their values are
first passed through an analyzer to produce a list of terms, which are
then added to the inverted index.
There are many ways to analyze text: the default standard analyzer
drops most punctuation, breaks up text into individual words, and
lower cases them. For instance, the standard analyzer would turn the
string “Quick Brown Fox!” into the terms [quick, brown, fox].
This analysis process makes it possible to search for individual words
within a big block of full text.
The term query looks for the exact term in the field’s inverted
index — it doesn’t know anything about the field’s analyzer. This
makes it useful for looking up values in keyword fields, or in numeric
or date fields. When querying full text fields, use the match query
instead, which understands how the field has been analyzed.
...
It's not working because you are using the default standard analyzer for 'name' and 'quarter'.
You have two more options:
1) Change the mapping (this is the pre-5.0 syntax; on 5.0 and later, use "type": "keyword" instead):
"name": {
"type": "string",
"index": "not_analyzed"
},
"quarter": {
"type": "string",
"index": "not_analyzed"
}
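2) Try this: lowercase your values, since the standard analyzer applies the Lower Case Token Filter by default. Note that the analyzer also splits "John Doe" into the two terms john and doe, so each one needs its own term clause:
{
"query": {
"bool": {
"must" : [
{
"term": {
"name": "john"
}
},
{
"term": {
"name": "doe"
}
},
{
"term": {
"quarter": "2016_q1"
}
}
]
}
}
}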

Elastic search highlight not working

I am new to Elasticsearch and I am trying to highlight the matched keywords:
GET /{index}/_search
{
"query": {
"match": {
"_all": "first"
}
},
"highlight": {
"fields": {
"*": {}
},
"require_field_match": false
}
}
My output is a nested object. I also tried without the "require_field_match" parameter.
You can use one of the two methods mentioned in the link below to search and highlight on all fields:
A field can only be used for highlighting if the original string value
is available, either from the _source field or as a stored field.
The _all field is not present in the _source field and it is not
stored or enabled by default, and so cannot be highlighted. There are
two options. Either store the _all field or highlight the original
fields.
Highlight all fields
You can't produce a highlight with a search on the _all field.
You have to search in an actual field for it to work:
GET /{index}/_search
{
"query": {
"match": {
"title": "first"
}
},
"highlight": {
"fields": {
"title": {}
}
}
}
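If several real fields should be searched and highlighted at once, one option is to list them explicitly in both the query and the highlight section. The Python sketch below only builds such a request body; the helper name `build_highlight_search` and the second field name "body" are assumptions made for illustration, not part of the original question.

```python
import json

def build_highlight_search(text, fields):
    """Search the given concrete fields with multi_match and request
    highlighting for each of them (instead of relying on _all)."""
    fields = list(fields)
    return {
        "query": {"multi_match": {"query": text, "fields": fields}},
        "highlight": {"fields": {name: {} for name in fields}},
    }

# "body" is a hypothetical second field, used only as an example.
body = build_highlight_search("first", ["title", "body"])
print(json.dumps(body, indent=2))
```

Because every highlighted field is also a field that is actually searched and present in _source, the original string value is available for the highlighter.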
