Need some help with query_string in Elasticsearch 6.4.3

Suppose I want to count the number of matching results with
POST /_count
and the following JSON body:
{
"size": "1",
"from": "0",
"track_scores": true,
"sort": [
{
"employee_id": "asc"
}
],
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"content",
"title"
],
"query": "Winter is coming"
}
},
"filter": {
"range": {
"employee_id": {
"gte": "34222232"
}
}
}
}
}
}
What does the following part mean?
"query_string": {
"fields": [
"content",
"title"
],
"query": "Winter is coming"
}
And this one?
"filter": {
"range": {
"employee_id": {
"gte": "34222232"
}
}
}
Any comment would be appreciated. Thanks

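The query_string query searches for text in one or more fields. Here the text Winter is coming is analyzed into tokens and looked up in the content and title fields; with the default OR operator, a document matches if either field contains at least one of the tokens.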
"query_string": {
"fields": [
"content",
"title"
],
"query": "Winter is coming"
}
The range query is a term-level query that filters on the value of a field. Here it keeps only documents whose employee_id is greater than or equal to (gte) 34222232.
"filter": {
"range": {
"employee_id": {
"gte": "34222232"
}
}
}
Together, they match documents whose employee_id is greater than or equal to 34222232 and whose title or content field matches Winter is coming.
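One caveat for 6.4.3: the filtered query was removed in Elasticsearch 5.0, so on that version the same logic has to be written as a bool query with must and filter clauses. As far as I know, the _count endpoint also accepts only a query in its body, so size, from, sort and track_scores are left out here. A minimal Python sketch that builds such a body (the helper name build_count_body is made up for illustration, and nothing is sent to a cluster):

```python
import json


def build_count_body(text, fields, min_employee_id):
    """Build a _count request body using bool instead of the removed filtered query."""
    return {
        "query": {
            "bool": {
                # Scored full-text part: the same query_string as in the question.
                "must": [
                    {"query_string": {"fields": fields, "query": text}}
                ],
                # Non-scoring part: keep only employee_id >= min_employee_id.
                "filter": [
                    {"range": {"employee_id": {"gte": min_employee_id}}}
                ],
            }
        }
    }


body = build_count_body("Winter is coming", ["content", "title"], "34222232")
print(json.dumps(body, indent=2))
```

The printed JSON is what would go into the POST /_count request.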

Related

Elastic Search : Search keyword results of a specific category

I'm trying to build a query where I'm trying to search for names of people of a specific country. If I provide input as John and USA, I should only find results of people by the name John (by the property : name) from USA (by the property : country) and results from other countries shouldn't appear in the results.
What I have tried :
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"fuzziness": "AUTO",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
With the above query, the problem is that the results also include people who are not named John but are from the USA.
Expectation: filter the results for the given keyword down to the given country.
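Use a must clause instead of should for the name query. When a bool query has a filter clause and no must clause, its should clauses are optional and only affect scoring, which is why people who are not named John still match.
The query below should give you the expected results (note the must clause). See the official bool query documentation for the difference, with examples.
"query": {
"bool": {
"must": [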
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"fuzziness": "AUTO",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
Your query uses a should clause, which is why it does not work. Use must instead of should and the issue will be resolved.
You can also use "type": "phrase_prefix" to match Jo with John.
Change your query as shown below:
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"type":"phrase_prefix",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
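Both answers come down to the same shape: the name match goes into must, the country and citizenship checks stay in filter. A small Python sketch of that shape, assuming the field names from the question (the helper name build_person_query is hypothetical):

```python
def build_person_query(name, country):
    """Require a name match and restrict the results to one country."""
    return {
        "query": {
            "bool": {
                # must: the name has to match, and it contributes to the score.
                "must": [
                    {
                        "multi_match": {
                            "query": name,
                            "fields": ["username", "first_name", "last_name"],
                            "fuzziness": "AUTO",
                            "minimum_should_match": "50%",
                        }
                    }
                ],
                # filter: yes/no conditions that do not affect the score.
                "filter": [
                    {"match": {"country": country}},
                    {"match": {"is_citizen": True}},
                ],
            }
        }
    }


query = build_person_query("John", "USA")
```

With this body, a document from the USA that does not match John in any of the three name fields is no longer returned.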

Query String in Elasticsearch 6.4.3

I want to know the difference between these queries
{
"size": "1",
"from": "0",
"track_scores": true,
"sort": [
{
"employee_id": "asc"
}
],
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"content",
"title"
],
"query": "\"Macro Medium\""
}
}
}
}
}
How does the query above compare to this one?
{
"size": "1",
"from": "0",
"track_scores": true,
"sort": [
{
"employee_id": "asc"
}
],
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"content",
"title"
],
"query": "Macro Medium"
}
}
}
}
}
I want to know the difference between "query": "\"Macro Medium\"" and "query": "Macro Medium" in Elasticsearch 6.4.3. Any feedback would be appreciated.
Thanks
The query_string query analyzes the text with the field's analyzer (the standard analyzer by default), which splits Macro Medium into two terms, macro and medium. By default the two terms are combined with OR, so a document matching either term is returned. You can change this to AND with "default_operator": "AND".
With \"Macro Medium\" you force Elasticsearch to treat the text as a phrase: the terms must appear next to each other, in that order.
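The difference can be illustrated with a toy model in Python. This is not how Elasticsearch is implemented; it only mimics the matching rules on a single field, and the tokenizer is a rough stand-in for the standard analyzer. All function names are made up for this sketch.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumerics (rough stand-in for the standard analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_any_term(doc, query):
    """Unquoted query with the default OR operator: one shared term is enough."""
    return bool(set(tokenize(doc)) & set(tokenize(query)))


def matches_all_terms(doc, query):
    """Unquoted query with default_operator AND: every term must occur somewhere."""
    return set(tokenize(query)) <= set(tokenize(doc))


def matches_phrase(doc, query):
    """Quoted query: the terms must occur next to each other, in order."""
    doc_tokens, query_tokens = tokenize(doc), tokenize(query)
    n = len(query_tokens)
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


doc = "Medium format cameras and macro lenses"
print(matches_any_term(doc, "Macro Medium"))   # True
print(matches_all_terms(doc, "Macro Medium"))  # True
print(matches_phrase(doc, "Macro Medium"))     # False
```

The sample document contains both words, so it matches the unquoted query under OR and under AND, but not the quoted phrase, because the words are neither adjacent nor in the right order.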

Elasticsearch: Conditionally filter query on fields if they exist in multi-index query

I have a query for a general search which spans multiple indices. Some of the indices have a field called is_published and some have a field called date_review, some have both.
I'm struggling to write a query which will search across fields and filter on the fields mentioned above but only if they exist. I have managed to achieve what I want on the individual fields using missing and/or exists, but it excludes the other variants.
In English, I want to keep documents in the result where:
is_published is true OR the field does not exist
date_review is in the future OR the field does not exist
So, if a document has is_published and it's false, remove it. If a document has date_review in the past, remove it. If it has is_published == false and date_review is in the future, remove it.
I hope this makes sense?
For the purpose of answering, assume the documents might look like this:
// Has `is_published` flag
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"is_published": true
}
// Has `date_review` flag
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"date_review": "2017-01-01"
}
// Has both `is_published` and `date_review` flags
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"is_published": true,
"date_review": "2017-01-01"
}
At the moment, my [unfiltered] query looks like this:
{
"index": "index-1,index-2,index-3",
"type": "item",
"body": {
"query": {
"filtered": {
"query": {
"multi_match": {
"query": "my serach phrase",
"type": "phrase_prefix",
"fuzziness": null,
"fields": [
"label^3",
"body"
]
}
},
"filter": []
}
}
}
}
Very grateful for any pointers.
Thanks.
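You can try a query like the one below. The filter is a bool with two must clauses, one per field; each of them is itself a bool that lets a document through if the field is missing or if it satisfies the condition (is_published is true, date_review is later than now).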
{
"query": {
"filtered": {
"query": {
"multi_match": {
"query": "my serach phrase",
"type": "phrase_prefix",
"fuzziness": null,
"fields": [
"label^3",
"body"
]
}
},
"filter": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"missing": {
"field": "is_published"
}
},
{
"term": {
"is_published": true
}
}
]
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"missing": {
"field": "date_review"
}
},
{
"range": {
"date_review": {
"gt": "now"
}
}
}
]
}
}
]
}
}
}
}
}
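The filtered query and the missing filter used above were both removed in Elasticsearch 5.0. On newer versions the same "field is absent OR condition holds" logic can be written with bool, must_not and exists. A hedged Python sketch that builds such a body (the helper names absent_or and build_search_body are made up, and the field names are taken from the question):

```python
def absent_or(field, condition):
    """Match documents where `field` does not exist OR `condition` holds."""
    return {
        "bool": {
            "should": [
                # Replacement for the removed `missing` filter.
                {"bool": {"must_not": {"exists": {"field": field}}}},
                condition,
            ],
            "minimum_should_match": 1,
        }
    }


def build_search_body(phrase):
    """Full-text search plus the two conditional filters from the question."""
    return {
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": phrase,
                        "type": "phrase_prefix",
                        "fields": ["label^3", "body"],
                    }
                },
                # Both conditions must pass; neither affects the score.
                "filter": [
                    absent_or("is_published", {"term": {"is_published": True}}),
                    absent_or("date_review", {"range": {"date_review": {"gt": "now"}}}),
                ],
            }
        }
    }


body = build_search_body("my search phrase")
```

Keeping the "absent or condition" pattern in one helper makes it easy to add a third optional field later.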

Elasticsearch multiple queries with limit

I am trying to write an Elasticsearch query that matches multiple phrases in my title and description. The code below works, but it returns all the articles matching those phrases. I need 4 articles per query phrase, e.g. 4 results for Tim Cook and 4 for Steve Jobs.
{
"query": {
"multi_match": {
"query": ["Tim Cook","Steve Jobs"],
"fields": ["Title", "Description" ],
"operator":"AND"
}
}
}
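The top_hits aggregation is what you are looking for.
Define two filter aggregations, one per phrase, and nest a top_hits aggregation inside each of them. Note that top_hits returns 3 hits per bucket by default, so set its size to 4 to get four articles per phrase.
Something like the following should work: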
{
"size": 0,
"query": {
"multi_match": {
"query": [
"Tim Cook",
"Steve Jobs"
],
"fields": [
"Title",
"Description"
],
"operator": "AND"
}
},
"aggs": {
"tim": {
"aggs": {
"top_articles": {
"top_hits": {
"size": 4
}
}
},
"filter": {
"query": {
"multi_match": {
"query": [
"Tim Cook"
],
"fields": [
"Title",
"Description"
],
"operator": "AND"
}
}
}
},
"steve": {
"aggs": {
"top_articles": {
"top_hits": {
"size": 4
}
}
},
"filter": {
"query": {
"multi_match": {
"query": [
"Steve Jobs"
],
"fields": [
"Title",
"Description"
],
"operator": "AND"
}
}
}
}
}
}
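The same idea generalises to any number of phrases. Below is a minimal Python sketch that builds the request body, assuming Elasticsearch 5.0 or later, where a filter aggregation takes the query directly. The function name, the aggregation names phrase_0, phrase_1, ... and top_articles are all arbitrary choices for this example.

```python
def build_top_hits_body(phrases, fields, per_phrase=4):
    """One filter aggregation per phrase, each returning its own top hits."""

    def phrase_query(phrase):
        return {"multi_match": {"query": phrase, "fields": fields, "operator": "AND"}}

    return {
        "size": 0,  # the documents are read from the aggregations, not from hits
        "query": {
            # Restrict the search to documents matching at least one phrase.
            "bool": {
                "should": [phrase_query(p) for p in phrases],
                "minimum_should_match": 1,
            }
        },
        "aggs": {
            f"phrase_{i}": {
                "filter": phrase_query(p),
                "aggs": {"top_articles": {"top_hits": {"size": per_phrase}}},
            }
            for i, p in enumerate(phrases)
        },
    }


body = build_top_hits_body(["Tim Cook", "Steve Jobs"], ["Title", "Description"])
```

In the response, the four articles for each phrase are found under aggregations.phrase_N.top_articles.hits.hits.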

elasticsearch boosting slowing query

This is a very novice question, but I'm trying to understand how
boosting certain elements in a document works.
I started with this query:
{
"from": 0,
"size": 6,
"fields": [
"_id"
],
"sort": {
"_score": "desc",
"vendor.name.stored": "asc",
"item_name.stored": "asc"
},
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"_all"
],
"query": "Calprotectin",
"default_operator": "AND"
}
},
"filter": {
"and": [
{
"query": {
"query_string": {
"fields": [
"targeted_countries"
],
"query": "All US"
}
}
}
]
}
}
}
}
Then I needed to boost certain elements in the document more than the others,
so I did this:
{
"from": 0,
"size": 60,
"fields": [
"_id"
],
"sort": {
"_score": "desc",
"vendor.name.stored": "asc",
"item_name.stored": "asc"
},
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"item_name^4",
"vendor^4",
"id_plus_name",
"category_name^3",
"targeted_countries",
"vendor_search_name^4",
"AdditionalProductInformation^0.5",
"AskAScientist^0.5",
"BuyNowURL^0.5",
"Concentration^0.5",
"ProductLine^0.5",
"Quantity^0.5",
"URL^0.5",
"Activity^1",
"Form^1",
"Immunogen^1",
"Isotype^1",
"Keywords^1",
"Matrix^1",
"MolecularWeight^1",
"PoreSize^1",
"Purity^1",
"References^1",
"RegulatoryStatus^1",
"Specifications/Features^1",
"Speed^1",
"Target/MoleculeDescriptor^1",
"Time^1",
"Description^2",
"Domain/Region/Terminus^2",
"Method^2",
"NCBIGeneAliases^2",
"Primary/Secondary^2",
"Source/ExpressionSystem^2",
"Target/MoleculeSynonym^2",
"Applications^3",
"Category^3",
"Conjugate/Tag/Label^3",
"Detection^3",
"GeneName^3",
"Host^3",
"ModificationType^3",
"Modifications^3",
"MoleculeName^3",
"Reactivity^3",
"Species^3",
"Target^3",
"Type^3",
"AccessionNumber^4",
"Brand/Trademark^4",
"CatalogNumber^4",
"Clone^4",
"entrezGeneID^4",
"GeneSymbol^4",
"OriginalItemName^4",
"Sequence^4",
"SwissProtID^4",
"option.AntibodyProducts^4",
"option.AntibodyRanges&Modifications^1",
"option.Applications^4",
"option.Conjugate^3",
"option.GeneID^4",
"option.HostSpecies^3",
"option.Isotype^3",
"option.Primary/Secondary^2",
"option.Reactivity^4",
"option.Search^1",
"option.TargetName^1",
"option.Type^4"
],
"query": "Calprotectin",
"default_operator": "AND"
}
},
"filter": {
"and": [
{
"query": {
"query_string": {
"fields": [
"targeted_countries"
],
"query": "All US"
}
}
}
]
}
}
}
}
The query slowed down considerably. Am I doing this correctly? Is there a
way to speed it up? I'm currently working on doing the boosting when I index the document, but doing it in the query is what suits the way my application runs best. Any help is much appreciated.
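Query-time boosting is used to assign a larger weight to a term at search time. If you want to permanently boost a field, use index-time boosting. If you don't want to use this boosting all the time, then it makes sense to create a separate mapping just for it with store: "no" set. Note also that your second query searches dozens of individual fields instead of the single _all field, which probably adds work on its own, regardless of the boosts.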
