How do I construct this elasticsearch query object? - elasticsearch

My documents are indexed like this:
{
title: "stuff here",
description: "stuff here",
keywords: "stuff here",
score1: "number here",
score2: "number here"
}
I want to perform a query that:
Uses the title, description, and keywords fields for matching the text terms.
It doesn't have to be complete match. Eg. If someone searches "I have a big nose", and "nose" is in one of the document titles but "big" is not, then this document should still be returned.
Edit: I tried this query and it works. Can someone confirm if this is the right way to do it? Thanks.
{
query:{
'multi_match':{
'query': q,
'fields': ['title^2','description', 'keywords'],
}
}
}

Your way is definitely the way to go!
The multi_match query is usually the one that you want to expose to the end users, while the query_string is similar, but also more powerful and dangerous since it exposes the lucene query syntax. Rule of thumb: don't use query string if you don't need it.
Also, searching on multiple fields is easy just providing the list of fields you want to search on, as you did, without the need for a bool query.

Below is the code that will create the query you can use. I wrote it in c# but it will work in other languages in the same way.
What you need to do is to create a BooleanQuery and set that at least 1 of its condition has to match. Then add a condition for every document field you want to be checked with Occur.SHOULD enum value:
BooleanQuery searchQuery = new BooleanQuery();
searchQuery.SetMinimumNumberShouldMatch(1);
QueryParser parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "title", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
Query titleQuery = parser.Parse("I have a big nose");
searchQuery.Add(titleQuery, BooleanClause.Occur.SHOULD);
parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "description", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
Query descriptionQuery = parser.Parse("I have a big nose");
searchQuery.Add(titleQuery, BooleanClause.Occur.SHOULD);
parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "keywords", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
Query keywordsQuery = parser.Parse("I have a big nose");
searchQuery.Add(titleQuery, BooleanClause.Occur.SHOULD);
Query queryThatShouldBeExecuted.Add(searchQuery, BooleanClause.Occur.MUST);
Here is the link to an example in java http://www.javadocexamples.com/java_source/org/apache/lucene/search/TestBooleanMinShouldMatch.java.html

The according JSON object to perform a HTTP Post request would be this:
{
"bool": {
"should": [
{
"query_string": {
"query": "I have a big nose",
"default_field": "title"
}
},
{
"query_string": {
"query": "I have a big nose",
"default_field": "description"
}
},
{
"query_string": {
"query": "I have a big nose",
"default_field": "keywords"
}
}
],
"minimum_number_should_match": 1,
"boost": 1
}
}

Related

How can we make few tokens to be phrase in elastic search query

I want to search part of query to be considered as phrase .For e.g. I want to search "Can you show me documents for Hospitality and Airline Industry"
Here I want Airline Industry to be considered as phrase.I dont find any such settings in multi_match .
Even when we try to use multi_match query using "Can you show me documents for Hospitality and \"Airline Industry\"" .Default analyser breaks it into separate tokens.I dont want to change settings of my analyser.Also I have found that we can do this in simple_query_string but that has consequences that we can not apply filter option as we have in multi_match boolean query because I want to apply filter on certain feilds as well.
search_text="Can you show me documents for Hospitality and Airline Industry" Now I Want to pass Airline Industry as a phrase to search my indexed document against 2 fields.
okay so say I have existing code like this.
If filter:
qry={
“query":{
“bool”:{
“must”:{
"multi_match":{
"query":search_text,
"type":"best_fields",
"fields":["TITLE1","TEXT"],
"tie_breaker":0.3,
}
},
“filter”:{“terms”:{“GRP_CD”:[“1234”,”5678”] }
}
}
else:
qry={
"query":{
"multi_match":{
"query":search_text',
"type":"best_fields",
"fields":["TITLE1",TEXT"],
"tie_breaker":0.3
}
}
}
'But then I have realised this code is not handling Airline Industry as a phrase even though I am passing search string like this
"Can you show me documents for Hospitality and \"Airline Industry\""
As per elastic search document I came to know there is this query which might handle this
qry={"query":{
"simple_query_string":{
"query":"Can you show me documents for Hospitality and \"Airline Industry\"",
"fields":["TITLE1","TEXT"] }
} }
But now my issue is what if user want to apply filter..with filter query as above I can not pass phrase and boolean query is not possible with simple_query_string'
You can always combine queries using boolean query. Lets understand this case by case. Before going to the cases I would like to clarify one thing which is about filter. The filter clause of boolean query behave just like a must clause but the difference is that any query (even another boolean query with a must/should clause(s)) inside filter clause have filter context. Filter context means, that part of query will not be considered for score calculation.
Now lets move on to cases:
Case 1: Only query and no filters.
{
"query": {
"bool": {
"must": [
{
"simple_query_string": {
"query": "Can you show me documents for Hospitality and \"Airline Industry\"",
"fields": [
"TITLE1",
"TEXT"
]
}
}
]
}
}
}
Notice that the query is same as specified by you in the question. All I have done here is that I wrapped it in a bool query. This doesn't make any logical change to the query but doing so will make it easier to add queries to filter clause programmatically.
Case 2: Phrase query with filter.
{
"query": {
"bool": {
"must": [
{
"simple_query_string": {
"query": "Can you show me documents for Hospitality and \"Airline Industry\"",
"fields": [
"TITLE1",
"TEXT"
]
}
}
],
"filter": [
{
"terms": {
"GRP_CD": [
"1234",
"5678"
]
}
}
]
}
}
}
This way you can combine query(query context) with the filters.

How can I achieve this type of queries in ElasticSearch?

I have added a document like this to my index
POST /analyzer3/books
{
"title": "The other day I went with my mom to the pool and had a lot of fun"
}
And then I do queries like this
GET /analyzer3/_analyze
{
"analyzer": "english",
"text": "\"The * day I went with my * to the\""
}
And it successfully returns the previously added document.
My idea is to have quotes so that the query becomes exact, but also wildcards that can replace any word. Google has this exact functionality, where you can search queries like this, for instance "I'm * the university" and it will return page results that contain texts like I'm studying in the university right now, etc.
However I want to know if there's another way to do this.
My main concern is that this doesn't seem to work with other languages like Japanese and Chinese. I've tried with many analyzers and tokenizers to no avail.
Any answer is appreciated.
Exact matches on the tokenized fields are not that straightforward. Better save your field as keyword if you have such requirements.
Additionally, keyword data type support wildcard query which can help you in your wildcard searches.
So just create a keyword type subfield. Then use the wildcard query on it.
Your search query will look something like below:
GET /_search
{
"query": {
"wildcard" : {
"title.keyword" : "The * day I went with my * to the"
}
}
}
In the above query, it is assumed that title field has a sub-field named keyword of data type keyword.
More on wildcard query can be found here.
If you still want to do exact searches on text data type, then read this
Elasticsearch doesn't have Google like search out of the box, but you can build something similar.
Let's assume when someone quotes a search text what they want is a match phrase query. Basically remove the \" and search for the remaining string as a phrase.
PUT test/_doc/1
{
"title": "The other day I went with my mom to the pool and had a lot of fun"
}
GET test/_search
{
"query": {
"match_phrase": {
"title": "The other day I went with my mom to the pool and had a lot of fun"
}
}
}
For the * it's getting a little more interesting. You could just make multiple phrase searches out of this and combine them. Example:
GET test/_search
{
"query": {
"bool": {
"must": [
{
"match_phrase": {
"title": "The"
}
},
{
"match_phrase": {
"title": "day I went with my"
}
},
{
"match_phrase": {
"title": "to the"
}
}
]
}
}
}
Or you could use slop in the phrase search. All the terms in your search query have to be there (unless they are being removed by the tokenizer or as stop words), but the matched phrase can have additional words in the phrase. Here we can replace each * with 1 other words, so a slop of 2 in total. If you would want more than 1 word in the place of each * you will need to pick a higher slop:
GET test/_search
{
"query": {
"match_phrase": {
"title": {
"query": "The * day I went with my * to the",
"slop": 2
}
}
}
}
Another alternative might be shingles, but this is a more advanced concept and I would start off with the basics for now.

Search in two fields on elasticsearch with kibana

Assuming I have an index with two fields: title and loc, I would like to search in this two fields and get the "best" match. So if I have three items:
{"title": "castle", "loc": "something"},
{"title": "something castle something", "loc": "something,pontivy,something"},
{"title": "something else", "loc": "something"}
... I would like to get the second one which has "castle" in its title and "pontivy" in its loc. I tried to simplify the example and the base, it's a bit more complicated. So I tried this query, but it seems not accurate (it's a feeling, not really easy to explain):
GET merimee/_search/?
{
"query": {
"multi_match" : {
"query": "castle pontivy",
"fields": [ "title", "loc" ]
}
}
}
Is it the right way to search in various field and get the one which match the in all the fields?
Not sure my question is clear enough, I can edit if required.
EDIT:
The story is: the user type "castle pontivy" and I want to get the "best" result for this query, which is the second because it contains "castle" in "title" and "pontivy" in "loc". In other words I want the result that has the best result in both fields.
As the other posted suggested, you could use a bool query but that might not work for your use case since you have a single search box that you want to query against multiple fields with.
I recommend looking at a Simple Query String query as that will likely give you the functionality you're looking for. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html
So you could do something similar to this:
{
"query": {
"simple_query_string" : {
"query": "castle pontivy",
"fields": ["title", "loc"],
"default_operator": "and"
}
}
}
So this will try to give you the best documents that match both terms in either of those fields. The default operator is set as AND here because otherwise it is OR which might not give you the expected results.
It is worthwhile to experiment with other options available for this query type as well. You might also explore using a Query String query as it gives more flexibility but the Simple Query String term works very well for most cases.
This can be done by using bool type of query and then matching the fields.
GET _search
{
"query":
{
"bool": {"must": [{"match": {"title": "castle"}},{"match": {"loc": "pontivy"}}]
}
}
}

How to query two different fields with different query terms in same request. ElasticSearch 5.x

new to ElasticSearch - start loving it. I am working on a Rails application (using elasticsearch-rails / elasticsearch-model).
I have two fields - both strings consisting of Tags.
about_me & about_you
Now I was to query the about_you of another user with the current users about_me.
At the same time, I wish to query the about_me of the other users with the about_you of the current user.
Does this make sense? Like two fields, two queries and each query is aimed at a particular field.
I just need a hint how this can be achieved in ES. For the sake of completeness, here is the part method I created in my rails model - it is incomplete:
def home_search(query_you, query_me)
search_definition =
{
query: {
multi_match: {
query: query_me,
fields: ['about_you']
}
..... SOMETHINGs MISSING HERE ..... ?
},
suggest: {
text: query,
about_me: {
term: {
size: 1,
field: :about_me
}
},
about_you: {
term: {
size: 1,
field: :about_you
}
}
}
}
self.class.__elasticsearch__.search(search_definition)
end
Any help, link or donations are welcome. Thank you!
I'm not sure I've understood your question but I can suggest two options:
First Use a bool query of type should and minimum_should_match=1. In this case you can write two queries for you'r searches. and If you want to distinguish between results you can pass a _name parameter in each query. something like this:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"multi_match": {
"query": "query_me",
"fields": [
"about_you"
],
"_name": "about_you"
}
},
{
"multi_match": {
"query": "query_you",
"fields": [
"about_me"
],
"_name": "about_you"
}
}
]
}
}
}
By providing _name you can see which queries are hitted in your search result.
The second approach could be a _msearch query which in which you can pass multiple queries to the endpoint and get the results back.
Here are some useful links:
Bool Query
Named Queries

Scope Elasticsearch Results to Specific Ids

I have a question about the Elasticsearch DSL.
I would like to do a full text search, but scope the searchable records to a specific array of database ids.
In SQL world, it would be the functional equivalent of WHERE id IN(1, 2, 3, 4).
I've been researching, but I find the Elasticsearch query DSL documentation a little cryptic and devoid of useful examples. Can anyone point me in the right direction?
Here is an example query which might work for you. This assumes that the _all field is enabled on your index (which is the default). It will do a full text search across all the fields in your index. Additionally, with the added ids filter, the query will exclude any document whose id is not in the given array.
{
"bool": {
"must": {
"match": {
"_all": "your search text"
}
},
"filter": {
"ids": {
"values": ["1","2","3","4"]
}
}
}
}
Hope this helps!
As discussed by Ali Beyad, ids field in the query can do that for you. Just to complement his answer, I am giving an working example. In case anyone in the future needs it.
GET index_name/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"field": "your query"
}
},
{
"ids" : {
"values" : ["0aRM6ngBFlDmSSLpu_J4", "0qRM6ngBFlDmSSLpu_J4"]
}
}
]
}
}
}
You can create a bool query that contains an Ids query in a MUST clause:
https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-ids-query.html
By using a MUST clause in a bool query, your search will be further limited by the Ids you specify. I'm assuming here by Ids you mean the _id value for your documents.
According to es doc, you can
Returns documents based on their IDs.
GET /_search
{
"query": {
"ids" : {
"values" : ["1", "4", "100"]
}
}
}
With elasticaBundle symfony 5.2
$query = new Query();
$IdsQuery = new Query\Ids();
$IdsQuery->setIds($id);
$query->setQuery($IdsQuery);
$this->finder->find($query, $limit);
You have two options.
The ids query:
GET index/_search
{
"query": {
"ids": {
"values": ["1, 2, 3"]
}
}
}
or
The terms query:
GET index/_search
{
"query": {
"terms": {
"yourNonPrimaryIdField": ["1", "2","3"]
}
}
}
The ids query targets the document's internal _id field (= the primary ID). But it often happens that documents contain secondary (and more) IDs which you'd target thru the terms query.
Note that if your secondary IDs contain uppercase chars and you don't set their field's mapping to keyword, they'll be normalized (and lowercased) and the terms query will appear broken because it only works with exact matches. More on this here: Only getting results when elasticsearch is case sensitive

Resources