ElasticSearch - Search for value on nested object under any key

I have documents indexed in ES with the following structure:
doc 1:
{
  "map": {
    "field1": ["foo"],
    "field2": ["bar"]
  }
}
doc 2:
{
  "map": {
    "fieldN": ["foo"]
  }
}
I need to search all the documents that match a specific value under any key in the "map" object. Since the fields in "map" are dynamic, the value can be found under any key.
I have tried different queries, but none of them seems to work, since in every case it looks like I need to specify the field explicitly (e.g. map.field1 = "foo").
I would hope to be able to do a search like this:
{
  "fields": ["map.*"],
  "query": "foo"
}
Any recommendations on how to approach this type of search?

You can use a multi_match query to search for the query term across the map.* fields:
{
  "query": {
    "multi_match": {
      "query": "foo",
      "fields": [ "map.*" ]
    }
  }
}
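For example, assuming an index named my-index (a placeholder name) with default dynamic mapping, you could index the two sample documents and run the query like this:
PUT my-index/_doc/1
{
  "map": {
    "field1": ["foo"],
    "field2": ["bar"]
  }
}

PUT my-index/_doc/2
{
  "map": {
    "fieldN": ["foo"]
  }
}

GET my-index/_search
{
  "query": {
    "multi_match": {
      "query": "foo",
      "fields": [ "map.*" ]
    }
  }
}
Both documents should be returned, since map.field1 and map.fieldN are both picked up by the map.* field pattern.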

Related

How to use Wildcards in Elastic search query to skip some prefix values

"I am searching in a elasticsearch cluster GET request on the basis of sourceID tag with value :- "/A/B/C/UniqueValue.xml" and search query looks like this:-"
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "source_id": {
              "value": "/A/B/C/UniqueValue.xml"
            }
          }
        }
      ]
    }
  }
}
"How can i replace "/A/B/C" from any wildcard or any other way as i just have "UniqueValue.xml" as an input for this query. Can some please provide the modified search Query for this requirement? Thanks."
The following search returns documents where the source_id field contains a term that ends with UniqueValue.xml.
{
  "query": {
    "wildcard": {
      "source_id": {
        "value": "*UniqueValue.xml"
      }
    }
  }
}
Note that wildcard queries are expensive. If you need fast suffix search, you could add a multi-field to your mapping which includes a reverse token filter. Then you can use prefix queries on that reversed field.
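A rough sketch of that approach (the index name, analyzer name and sub-field name below are placeholders, not part of your existing mapping):
PUT suffix-demo
{
  "settings": {
    "analysis": {
      "analyzer": {
        "reversed_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [ "reverse" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "source_id": {
        "type": "keyword",
        "fields": {
          "reversed": {
            "type": "text",
            "analyzer": "reversed_keyword"
          }
        }
      }
    }
  }
}
The source_id.reversed sub-field stores each value reversed, so "/A/B/C/UniqueValue.xml" is indexed as "lmx.eulaVeuqinU/C/B/A/". A suffix search then becomes a prefix query on that sub-field; since prefix is a term-level query and is not analyzed, you reverse the suffix yourself before sending it:
GET suffix-demo/_search
{
  "query": {
    "prefix": {
      "source_id.reversed": {
        "value": "lmx.eulaVeuqinU"
      }
    }
  }
}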

Is there a "match phrase any" query in Elasticsearch?

In Elasticsearch, the match_phrase query matches the full phrase.
The match_phrase_prefix query matches the phrase as a prefix.
For example:
"my_field": "confidence ab"
will match "confidence above" and "confidence about".
Is there a query for "match phrase any", like the example below:
"my_field": "dence ab"
which should also match "confidence above" and "confidence about"?
Thanks
There are two ways you can do this:
1. Store the field values as-is in ES by applying the keyword analyzer type in the mapping, then do a wildcard search, or
2. Store the field using an ngram tokenizer, then search your data based on your requirement, with or without the standard or keyword search analyzers (a rough sketch of this option follows below).
Keep in mind that wildcard searches are usually inefficient in terms of performance.
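A rough sketch of the ngram option (index name, analyzer name and gram sizes are arbitrary choices here, to be tuned to your data):
PUT partial-test
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_ngrams": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 3,
          "token_chars": [ "letter", "digit", "whitespace" ]
        }
      },
      "analyzer": {
        "ngram_analyzer": {
          "type": "custom",
          "tokenizer": "my_ngrams",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_field": {
        "type": "text",
        "analyzer": "ngram_analyzer"
      }
    }
  }
}

GET partial-test/_search
{
  "query": {
    "match": {
      "my_field": {
        "query": "dence ab",
        "operator": "and"
      }
    }
  }
}
Because "confidence above" and "confidence about" both contain every 2-3 character gram of "dence ab", both would be returned. The trade-offs are a larger index and possible false positives when all grams occur in a document but not contiguously.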
Please let me know how you progress with the above suggestions so that I can help you further if needed.
You need to define the mapping of your field as keyword, like below:
PUT test
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      }
    }
  }
}
Then search over this field using a wildcard query, like below:
GET test/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "*dence ab*"
      }
    }
  }
}
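For example, if you index a couple of documents into the test index created above:
PUT test/_doc/1
{
  "name": "confidence above"
}

PUT test/_doc/2
{
  "name": "confidence about"
}
Both documents are returned by the *dence ab* wildcard, because the keyword field stores each value as a single unanalyzed string and the pattern can match anywhere inside it. Note that a wildcard with a leading * has to inspect every term in the field, which can be slow on large indices.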
Please let me know if you have any problem with this.
In your case, the simplest solution is using a query_string query or a simple_query_string query. The latter is less strict about query syntax errors.
First, make sure that your field is mapped with type text. The example below shows the mapping for a field named my_field in the test-index index.
{
  "test-index": {
    "mappings": {
      "properties": {
        "my_field": {
          "type": "text"
        }
      }
    }
  }
}
Then, for searching, use a query_string query with wildcards:
{
  "query": {
    "query_string": {
      "fields": ["my_field"],
      "query": "*dence ab*"
    }
  }
}
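To try it out with the same example data (test-index as mapped above):
PUT test-index/_doc/1
{
  "my_field": "confidence above"
}

PUT test-index/_doc/2
{
  "my_field": "confidence about"
}
Both documents should come back from the query_string search above: the standard analyzer indexes the terms confidence, above and about, and the clauses *dence and ab* match those terms (they are combined with OR by default).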

Handling both exact and partial search on the same search string

I want to define a schema that can handle both partial and exact search for the same search value.
The exact search should always return the exact match; ES should not break the search string into tokens in this case.
For partial matching the data type of the property should be text, and for exact matching it should be keyword. To have both partial and exact search without having to index the data into different properties, you can leverage multi-fields (fields). This lets you index the same data in different ways.
So, let's say you want to index names of persons and have the ability to do both partial and exact search. In that case the mapping would be:
PUT test
{
  "mappings": {
    "_doc": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
Let's index a few docs:
PUT test/_doc/1
{
  "name": "Nishant Saini"
}

PUT test/_doc/2
{
  "name": "Nishant Kumar"
}
For partial search we have to query the name field, which is of type text.
GET test/_doc/_search
{
  "query": {
    "query_string": {
      "query": "Nishant Saini",
      "fields": [
        "name"
      ]
    }
  }
}
The above query will return both docs (1 and 2) because one token, i.e. Nishant, appears in the name field of both documents.
For exact search we need to query name.keyword. To perform an exact match we can use a term query as below:
{
  "query": {
    "term": {
      "name.keyword": "Nishant Saini"
    }
  }
}
This would match doc 1 only.
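If you also want a partial search that requires all of the entered words to be present (the query_string above combines tokens with OR by default), a match query with the operator set to and on the analyzed name field is another option:
GET test/_doc/_search
{
  "query": {
    "match": {
      "name": {
        "query": "Nishant Saini",
        "operator": "and"
      }
    }
  }
}
This returns doc 1 only, since doc 2 does not contain the token saini, while still tolerating differences in case and word order that the exact term query on name.keyword would not.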

Elasticsearch: get all documents where array contains one of many values

I have the following document data structure in Elasticsearch:
{
  "topics": [ "a", "b", "c", "d" ]
}
I have a selection list where the user can filter which topics to show. When the user is OK with their filter, they will be presented with all documents that have any of the topics they selected in the "topics" array.
I've tried the query
{
  "query": {
    "terms": {
      "topics": ["a", "b"]
    }
  }
}
but this returns no results.
To expand on the query: for example, the list ["a", "b"] would match the first, second and third objects in the array below.
Is there a good way to do this in Elasticsearch? Obviously I could do multiple match queries, but that's verbose, as I have hundreds of topics.
Edit: my mapping
{
  "fb-cambodia-post": {
    "mappings": {
      "scrapedpost": {
        "properties": {
          "topics": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
As @Filip cordas mentioned, you can use topics.keyword, like:
{
  "query": {
    "terms": {
      "topics.keyword": [
        "A", "B"
      ]
    }
  }
}
This will do a case-sensitive search; it will look for an exact match. In case you want a case-insensitive search, you can use query_string like:
{
  "query": {
    "query_string": {
      "default_field": "topics",
      "query": "A OR B"
    }
  }
}
I will give some more info on the problem. The terms query with the data you added ("a", "b", "c") will work, but if the topics contain uppercase characters or multiple words it won't. This is due to the analyzer applied to the topics field. When you add a string value to Elasticsearch, it will by default use the standard analyzer, while the terms query only compares raw terms exactly as they are given. So if you have something like "Topic1" in the document and you search with "terms": { "topics": ["Topic1"] }, it won't return anything, because the indexed term has been lowercased; the query that will return the document is "terms": { "topics": ["topic1"] }. As of 5.0, Elasticsearch added the default "keyword" sub-field, which stores the data with the keyword analyzer, i.e. as-is, with no transformation applied. A terms query on that field, "terms": { "topics.keyword": ["Topic1"] }, will get you the document, but "terms": { "topics.keyword": ["topic1"] } won't. What the match query does is apply the same analysis to the input string as well, which is why it gives you the expected result.
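To illustrate with the mapping above, assuming a document whose topics array contains "Topic1":
GET fb-cambodia-post/_search
{
  "query": {
    "terms": {
      "topics.keyword": [ "Topic1" ]
    }
  }
}

GET fb-cambodia-post/_search
{
  "query": {
    "terms": {
      "topics": [ "topic1" ]
    }
  }
}
The first query matches because the keyword sub-field stores the value unchanged; the second matches because the analyzed topics field holds the lowercased term. Searching "topics": [ "Topic1" ] or "topics.keyword": [ "topic1" ] would return nothing.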

How to make use of `gt` and `fields` in the same query in Elasticsearch

In my previous question, I was introduced to the fields parameter of a query_string query and how it can help me search nested fields of a document.
{
  "query": {
    "query_string": {
      "fields": ["*.id", "id"],
      "query": "2"
    }
  }
}
But that only works for matching; what if I want to do some comparison? After some reading and testing, it seems queries like range do not support fields. Is there any way I can perform a range query, e.g. on a date, over a field that can be located anywhere in the document hierarchy?
i.e. considering the following document:
{
  "id": 1,
  "Comment": "Comment 1",
  "date": "2016-08-16T15:22:36.967489",
  "Reply": [ {
    "id": 2,
    "Comment": "Inner comment",
    "date": "2016-08-16T16:22:36.967489"
  } ]
}
Is there a query over the date field (like date > '2016-08-16T16:00:00.000000') that matches the given document because of the nested field, without explicitly giving the path Reply.date? Something like this (I know the following query is incorrect):
{
  "query": {
    "range": {
      "date": {
        "gte": "2016-08-16T16:00:00.000000"
      },
      "fields": ["date", "*.date"]
    }
  }
}
The range query itself doesn't support this. However, you can leverage the query_string query (again), the fact that you can use wildcards in field names, and its support for range expressions, to achieve what you need:
{
  "query": {
    "query_string": {
      "query": "\\*date:[2016-08-16T16:00:00.000Z TO *]"
    }
  }
}
The above query will return your document because Reply.date matches the *date field pattern.
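A variant that should behave the same way, and is closer to what you originally tried, is to put the wildcarded field names in the fields parameter and keep only the range expression in the query text (a sketch, not tested against your exact mapping):
{
  "query": {
    "query_string": {
      "fields": [ "date", "*.date" ],
      "query": "[2016-08-16T16:00:00.000Z TO *]"
    }
  }
}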
