URL search using Elasticsearch not giving the right result order

GET firm_id_number/_search
{
"size": 20,
"from": 0,
"highlight": {
"order": "score",
"pre_tags": [
"<mark>"
],
"post_tags": [
"</mark>"
],
"fields": {
"question_text": {
"number_of_fragments": 0
},
"response_text_original": {
"number_of_fragments": 0
},
"question_text.shingles": {
"number_of_fragments": 0
},
"image_url": {
"number_of_fragments": 0
},
"response_text_original.shingles": {
"number_of_fragments": 0
},
"tags_text": {}
}
},
"docvalue_fields": [
"question_id",
"expiry_date",
"response_id",
"associated_entity.keyword",
"entity_type",
"is_verified",
"strategy_id",
"response_created_at",
"tags_text.keyword",
"associated_entity_type",
"response_type",
"associated_template.keyword",
"associated_investor_id",
"associated_entity_id",
"tags",
"associated_investor.keyword",
"duediligence_id",
"associated_template_id",
"fund_id",
"fund_firm_id",
"response_created_by",
"diligence_type",
"response_created_by_name.keyword",
"is_active",
"child_section_id",
"parent_section_id",
"verified_at",
"response_is_na",
"grid_id",
"verified_by_name.keyword",
"verified_by"
],
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"bool": {
"must": [
{
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{
"match": {
"comments.comment_text_original": "https://dvtestenvstorage.blob.core.windows.net/images/firm49012/financial-stock-market-graph-chart-investment-trading-stock-exchange-trading-market-screen-at-night-time-W38FWJ_1656075621.jpg"
}
}
]
}
},
"inner_hits": {
"_source": false,
"highlight": {
"pre_tags": [
"<mark>"
],
"post_tags": [
"</mark>"
],
"fields": {
"comments.comment_text_original": {}
}
}
}
}
}
]
}
},
{
"bool": {
"must": [
{
"multi_match": {
"query": "https://dvtestenvstorage.blob.core.windows.net/images/firm49012/financial-stock-market-graph-chart-investment-trading-stock-exchange-trading-market-screen-at-night-time-W38FWJ_1656075621.jpg",
"type": "cross_fields",
"fields": [
"response_text_original^2",
"response_text_original.shingles",
"image_url"
]
}
}
]
}
}
]
}
},
{
"exists": {
"field": "question_text_vector"
}
}
],
"should": [],
"must_not": [],
"filter": [
{
"terms": {
"diligence_type": [
1243,
1242,
1241,
2,
-1,
0
]
}
}
]
}
},
"sort": [],
"_source": {
"includes": [
"response_text_original",
"question_text",
"comments",
"tags.tag_id",
"tags.tag_name",
"tags_text",
"question_sme",
"section_sme",
"comments",
"question_help_text",
"notes",
"grid_response",
"checkbox_response",
"attachment_response"
]
}
}
The query above returns the documents that match URLs stored in the Elasticsearch index, but the order of the results is wrong. The query is used for image search based on the image URLs stored in the index. The document containing the exact same URL does not come back as the first result. What might be wrong with this query?

Related

Limit the size per index when searching multiple index in Elastic

I have been following the guidelines from this post: "Full text Search with Multiple index in Elastic Search using NEST C#". I get the desired output, but how can I limit the number of results for each index within the same DSL query?
POST http://localhost:9200/componenttypeindex%2Cprojecttypeindex/Componenttype%2CProjecttype/_search?pretty=true&typed_keys=true
{
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"term": {
"_index": {
"value": "componenttypeindex"
}
}
}
],
"must": [
{
"multi_match": {
"fields": [
"Componentname",
"Summary^1.1"
],
"operator": "or",
"query": "test"
}
}
]
}
},
{
"bool": {
"filter": [
{
"term": {
"_index": {
"value": "projecttypeindex"
}
}
}
],
"must": [
{
"multi_match": {
"fields": [
"Projectname",
"Summary^0.3"
],
"operator": "or",
"query": "test"
}
}
]
}
}
]
}
}
}
With your given query, you could use aggregations to group the hits by index and limit the number of hits per index (in this case, to 5):
{
"size": 0,
"query": {
... Same query as above ...
},
"aggs": {
"index_agg": {
"terms": {
"field": "_index",
"size": 20
},
"aggs": {
"hits_per_index": {
"top_hits": {
"size": 5
}
}
}
}
}
}
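The aggregation request above can also be assembled programmatically. The following is a minimal Python sketch, not part of the original answer. The helper names limit_hits_per_index and hits_by_index are hypothetical, the match_all query is only a placeholder for the bool query from the question, and hits_by_index assumes the search was sent without typed_keys=true (with typed keys, aggregation names in the response carry a type prefix).

```python
import json


def limit_hits_per_index(query, per_index=5, max_indices=20):
    """Wrap a query in a terms aggregation on _index with a top_hits sub-aggregation."""
    return {
        "size": 0,  # suppress the flat hit list; hits come from the aggregation
        "query": query,
        "aggs": {
            "index_agg": {
                # one bucket per index, at most max_indices buckets
                "terms": {"field": "_index", "size": max_indices},
                "aggs": {
                    # at most per_index hits inside each bucket
                    "hits_per_index": {"top_hits": {"size": per_index}}
                },
            }
        },
    }


def hits_by_index(response):
    """Map each index name to the list of top hits found in its bucket."""
    buckets = response["aggregations"]["index_agg"]["buckets"]
    return {b["key"]: b["hits_per_index"]["hits"]["hits"] for b in buckets}


# Placeholder query; substitute the bool/should query from the question.
body = limit_hits_per_index({"match_all": {}}, per_index=5)
print(json.dumps(body, indent=2))
```

The size on the terms aggregation caps how many indices get a bucket, while the size on top_hits caps the number of hits returned inside each bucket.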

Elasticsearch multiple fields wildcard bool query

I am currently using a bool query that searches the "Name" field for both input words together or for either one of them. How can I search on multiple fields using wildcards?
POST inventory_dev/_search
{"from":0,"query":{"bool":{"must":[{"bool":{"should":[{"term":{"Name":{"value":"dove"}}},{"term":{"Name":{"value":"3.75oz"}}},{"bool":{"must":[{"wildcard":{"Name":{"value":"*dove*"}}},{"wildcard":{"Name":{"value":"*3.75oz*"}}}]}}]}}]}},"size":10,"sort":[{"_score":{"order":"desc"}}]}
You can use query_string in place of the wildcard query to search on multiple fields (field1 below appears to be a placeholder for an additional field name):
{
"from": 0,
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"Name": {
"value": "dove"
}
}
},
{
"term": {
"Name": {
"value": "3.75oz"
}
}
},
{
"bool": {
"must": [
{
"query_string": {
"query": "*dove*",
"fields": [
"field1",
"Name"
]
}
},
{
"query_string": {
"query": "*3.75oz*",
"fields": [
"field1",
"Name"
]
}
}
]
}
}
]
}
}
]
}
},
"size": 10,
"sort": [
{
"_score": {
"order": "desc"
}
}
]
}
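For more than two input words, the same request body can be generated rather than written by hand. This is a minimal Python sketch under stated assumptions: the function names wildcard_clauses and build_query are hypothetical, field1 is the placeholder field name from the answer, and the input terms are assumed to contain no characters reserved by the query_string syntax (such characters would need escaping first).

```python
def wildcard_clauses(terms, fields):
    """Build one query_string clause per term, each wrapped in * wildcards."""
    return [
        {"query_string": {"query": f"*{term}*", "fields": list(fields)}}
        for term in terms
    ]


def build_query(terms, fields, exact_field="Name"):
    """Exact term match on any word, or a wildcard match on all words across fields."""
    exact = [{"term": {exact_field: {"value": t}}} for t in terms]
    all_wildcards = {"bool": {"must": wildcard_clauses(terms, fields)}}
    return {
        "from": 0,
        "size": 10,
        "query": {"bool": {"must": [{"bool": {"should": exact + [all_wildcards]}}]}},
        "sort": [{"_score": {"order": "desc"}}],
    }


query = build_query(["dove", "3.75oz"], ["field1", "Name"])
```

Leading wildcards such as *dove* force Elasticsearch to scan many terms, so this pattern can be slow on large indices.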

Combining missing and term query in nested document in Elasticsearch

I have these 3 documents, where fields is of type nested:
{
"fields": [
{"field_id": 23, "value": "John Doe"},
{"field_id": 92, "value": null}
]
}
{
"fields": [
{"field_id": 23, "value": "Ada Lovelace"}
]
}
{
"fields": [
{"field_id": 23, "value": "Jack Daniels"},
{"field_id": 92, "value": "jack#example.com"}
]
}
I need to search for documents where:
(`field_id` = `92` AND `value` is `null`) OR (no entry with `field_id` `92` exists)
Combining a terms and missing query leads to only the document with the null value being returned:
...
"nested": {
"path": "fields",
"filter": {
"bool": {
"bool": {
"must": [
{
"missing": {
"field": "fields.value"
}
},
{
"terms": {
"fields.field_id": [92]
}
}
]
}
}
}
}
...
How can I do this?
You already have a query for the first condition. Let's call it A. For the second condition, check for fields.field_id: 92 in the nested documents. Let's call that B. Your requirement, however, is that fields.field_id: 92 should not exist, so wrap B in must_not to get B'.
What is required is A OR B'.
So the final query will be:
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "fields",
"query": {
"bool": {
"must": [
{
"term": {
"fields.field_id": 92
}
}
],
"must_not": [
{
"exists": {
"field": "fields.value"
}
}
]
}
}
}
},
{
"bool": {
"must_not": [
{
"nested": {
"path": "fields",
"query": {
"term": {
"fields.field_id": 92
}
}
}
}
]
}
}
]
}
}
}
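To check the A OR B' logic without a cluster, it can be modelled in plain Python against the three documents from the question. This is only an illustrative sketch: nested_any and matches are hypothetical helper names, and the model assumes the default Elasticsearch behaviour in which a null value is treated as a missing field (no null_value set in the mapping).

```python
docs = [
    {"fields": [{"field_id": 23, "value": "John Doe"},
                {"field_id": 92, "value": None}]},
    {"fields": [{"field_id": 23, "value": "Ada Lovelace"}]},
    {"fields": [{"field_id": 23, "value": "Jack Daniels"},
                {"field_id": 92, "value": "jack#example.com"}]},
]


def nested_any(doc, predicate):
    """Model a nested query: true if at least one nested object satisfies predicate."""
    return any(predicate(obj) for obj in doc["fields"])


def matches(doc, field_id=92):
    # A: some nested object has the field_id and no value (null or absent).
    a = nested_any(doc, lambda o: o["field_id"] == field_id and o.get("value") is None)
    # B': no nested object has the field_id at all.
    not_b = not nested_any(doc, lambda o: o["field_id"] == field_id)
    return a or not_b


matching = [i for i, d in enumerate(docs) if matches(d)]
print(matching)  # [0, 1]
```

The first document matches through A, the second through B', and the third is excluded because its field 92 has a value.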

Elasticsearch query using more_like_this field renders a failed to parse search source. expected field name but got [START_OBJECT] error

We're using Elasticsearch 2.4.5. We have an application that can generate fairly complicated queries. I'm trying to add a more_like_this field to the query like so:
{
"query": {
"more_like_this": {
"fields": [
"title"
],
"ids": [
1234
],
"min_term_freq": 1,
"max_query_terms": 25
},
"function_score": {
"query": {
"bool": {
"must": [
{
"query_string": {
"default_operator": "AND",
"fields": [
"title",
"author"
],
"query": "((title:(\"Tale of Two Cities\"^2)))",
"lenient": true
}
}
],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"geo_distance": {
"distance": "50mi",
"location": {
"lat": 49.32,
"lon": -45.67
},
"distance_type": "plane",
"_cache": true
}
}
]
}
},
{
"term": {
"merged": 0
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "title_type"
}
}
}
}
]
}
}
}
},
"functions": [
{
"field_value_factor": {
"field": "quality_score",
"factor": 1,
"missing": 0
}
}
]
}
},
"filter": {
"bool": {
"must": []
}
},
"sort": "_score",
"size": 20,
"from": 0
}
I'm getting a "failed to parse search source. expected field name but got [START_OBJECT]" error when I try to run the above query. When I remove the more_like_this block, the query executes correctly. I've looked at the documentation and other examples of more_like_this usage, and I can't determine what's wrong with my query. I'm assuming it has something to do with the way the rest of the query is formed.

How to query multiple parameters in a nested field in elasticsearch

I'm trying to search for a keyword and then add nested queries for amenities, which is a nested field holding an array of objects.
With the query below I can search when I match only one amenity id, but when I match more than one it returns nothing.
Does anyone have an idea what is wrong with my query?
{
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"_geo_distance": {
"geolocation": [
100,
10
],
"order": "asc",
"unit": "m",
"mode": "min",
"distance_type": "sloppy_arc"
}
}
],
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": [
"name^2",
"city",
"state",
"zip"
],
"fuzziness": 5,
"query": "complete"
}
},
{
"nested": {
"path": "amenities",
"query": {
"bool": {
"must": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}
]
}
}
}
}
]
}
}
}
When you do:
"must": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}]
What you are actually asking for is any document with a nested amenity where "amenities.id"="1" and "amenities.id"="2" at the same time. Each nested object is matched on its own, so this cannot work unless a single amenity's "amenities.id" is a list containing both values.
What you probably want to say is: find me any document where "amenities.id"="1" or "amenities.id"="2".
To do that, use should instead of must:
"should": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}]
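The difference between must and should inside a nested query can be modelled in plain Python. This is a sketch, not Elasticsearch code: nested_match and require_all_amenities are hypothetical names and the sample document is made up for illustration. The last variant, one nested clause per id combined in the outer must, is an alternative to consider if the intent really is that a document have both amenities. It is not part of the answer above.

```python
def nested_match(doc, predicate, path="amenities"):
    """Model a nested query: true if one nested object alone satisfies predicate."""
    return any(predicate(obj) for obj in doc[path])


doc = {"name": "complete", "amenities": [{"id": "1"}, {"id": "2"}]}

# must inside one nested query: a single amenity needs both ids at once.
must_in_one = nested_match(doc, lambda a: a["id"] == "1" and a["id"] == "2")

# should inside one nested query: any amenity with either id is enough.
should_in_one = nested_match(doc, lambda a: a["id"] == "1" or a["id"] == "2")

# One nested query per id, all required at the outer level.
separate = all(
    nested_match(doc, lambda a, wanted=w: a["id"] == wanted) for w in ("1", "2")
)

print(must_in_one, should_in_one, separate)  # False True True


def require_all_amenities(ids):
    """Build one nested clause per id, to be placed in the outer bool must list."""
    return [
        {"nested": {"path": "amenities", "query": {"term": {"amenities.id": i}}}}
        for i in ids
    ]
```

With should, the document matches as soon as any amenity has one of the ids. With separate nested clauses, every id must be present on some amenity.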
