difference between match and query_string - elasticsearch

What is the difference between the match query and the query_string query? Say I have the following two queries: do they behave the same?
GET /_search
{
"query": {
"match" : {
"_all" : "this is a test"
}
}
}
and:
GET /_search
{
"query": {
"query_string" : {
"query" : "this is a test"
}
}
}
I am assuming that when query_string is used without specifying a field, the _all field is searched automatically.

From the Elasticsearch documentation:
Comparison match query to query_string / field
The match family of queries does
not go through a "query parsing" process. It does not support field
name prefixes, wildcard characters, or other "advanced" features. For
this reason, chances of it failing are very small / non existent, and
it provides an excellent behavior when it comes to just analyze and
run that text as a query behavior (which is usually what a text search
box does). Also, the phrase_prefix type can provide a great "as you
type" behavior to automatically load search results.
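To make the "query parsing" point concrete, here is a minimal Python sketch that builds both request bodies from raw user input. With match the text is passed through untouched, while for query_string characters such as : or ( are syntax and need escaping. The helper names and the "title" field are made up for this example, and the reserved-character list is what I recall from the query_string documentation, so treat it as an assumption and check it against your version.

```python
import json

# Characters that query_string treats as syntax (assumed from the
# query_string docs; verify against your Elasticsearch version).
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')


def escape_query_string(text):
    """Backslash-escape reserved characters; drop < and >, which cannot be escaped."""
    out = []
    for ch in text:
        if ch in "<>":
            continue  # range operators can only be removed, not escaped
        if ch in RESERVED:
            out.append("\\")
        out.append(ch)
    return "".join(out)


def build_match_body(field, text):
    """match: the text is only analyzed, so no escaping is needed."""
    return {"query": {"match": {field: text}}}


def build_query_string_body(text):
    """query_string: the text is parsed, so user input is escaped first."""
    return {"query": {"query_string": {"query": escape_query_string(text)}}}


user_input = "name:foo (test)"  # hypothetical search-box input
print(json.dumps(build_match_body("title", user_input)))
print(json.dumps(build_query_string_body(user_input)))
```

Without the escaping step, the same input sent to query_string would be read as a search on a field called name, which is usually not what a plain search box intends.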

Related

Elasticsearch query filter/term not working when special characters are involved

The following query does not work when "metadata.name" contains "-", as in "demo-application-child3". If I remove the "-" and query for "demoapplicationchild3", it works. The same happens with the metadata.version field. I have data for both demoapplicationchild3 and demo-application-child3. Any suggestions?
{
"query": {
"bool": {
"filter": [
{"term": { "metadata.name": "demo-application-child3" }},
{"term": { "metadata.version": "00.00.100" }}]
}
}
}
term queries are not analyzed; the official documentation states this clearly:
Returns documents that contain an exact term in a provided field.
This suggests that at index time you are using a custom analyzer that removes the - and joins the tokens, i.e. for demo-application-child3 your analyzer would generate the token demoapplicationchild3. You can easily confirm this with the Analyze API.
To get results, either change the term query to a match query, use the .keyword suffix on your field if the mapping was generated dynamically, or create another field of type keyword, which uses a no-op analyzer.
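A rough Python sketch of why the term query misses. It does not call Elasticsearch; rough_standard_analyze is a hypothetical stand-in that lowercases and splits on non-alphanumerics, which only approximates the standard analyzer (the real one follows Unicode word-boundary rules, and the custom analyzer guessed at above would produce different tokens again). The point is only that a term query compares the literal input against the stored tokens.

```python
import re


def rough_standard_analyze(text):
    """Very rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_matches(indexed_value, term, analyzed=True):
    """A term query matches only if `term` equals one indexed token exactly."""
    tokens = rough_standard_analyze(indexed_value) if analyzed else [indexed_value]
    return term in tokens


value = "demo-application-child3"
print(rough_standard_analyze(value))               # tokens stored for a text field
print(term_matches(value, value))                  # term query against the text field
print(term_matches(value, value, analyzed=False))  # term query against a keyword field
```

Under these assumptions the text field stores demo, application and child3, so the literal value is not among the tokens and the term query finds nothing, while the keyword variant stores the value whole and matches.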

How to boost Elasticsearch results based on another field?

A fairly simple use case, but I cannot come up with a good solution.
I have two indexed fields: content and keywords (keyword tokenizer), where content is a long text field and keywords contains the important terms within that content. When I query with some long text, I want to boost results based on the keywords present in the matching document.
I tried querying the complete text against both the content and keywords fields, but it is too slow, or it throws a too_many_clauses error for text longer than 40 words.
{"query": {
"match": {
"keywords": {
"query": "some long text",
"analyzer": "custom_analyzer"
}
}
}}
Is there any better way? Would percolator work here?
I can relate this to my application, which is similar to Stack Overflow: it consists of questions and answers, and each question has a subject, body, tags, etc.
The subject corresponds to your keywords field and the body corresponds to your content field. Normally the subject contains the important keywords of the post, which is also the case for you.
Now for the solution:
We solve it by querying both the subject and body fields, but boosting subject by a factor of 15, which is configurable.
The ES query we use:
{
"query": {
"multi_match" : {
"query" : "this is a test",
"fields" : [ "subject^15", "message" ]
}
}
}
The Elasticsearch documentation for multi_match has a similar example, in which the subject field is boosted by a factor of 3.
Let me know if you have any questions.
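Since the boost is meant to be configurable, the request body can be generated from a field-to-boost map. A minimal Python sketch; build_boosted_multi_match is a made-up helper name, and it only builds the JSON body, it does not send anything to Elasticsearch.

```python
import json


def build_boosted_multi_match(text, boosts):
    """Build a multi_match body; `boosts` maps field name -> boost factor (1 = no boost)."""
    fields = [
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in boosts.items()
    ]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# Same fields and boost as the query above.
body = build_boosted_multi_match("this is a test", {"subject": 15, "message": 1})
print(json.dumps(body, indent=2))
```

Keeping the boosts in one dictionary makes it easy to tune the factor per field without touching the query structure.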

Elastic exact matching and substring matching together

I know that Elasticsearch has the "keyword" type for exact matching. For example:
"address": { "type": "keyword"}
That's cool, exact matching works!
But I would like to have both exact matching and sub-string matching, so I decided to create the following mapping:
"address": { "type": "text" , "index": true }
Problem
If I have a "text" type, how can I search for an exact string match (not a sub-string)? I've tried several ways, but none of them work:
GET testing_index/_search
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"address" : "washington"
}
}
}
}
}
or
GET testing_index/_search
{
"query": {
"match": {
"address" : "washington"
}
}
}
I just need a single universal mapping that can:
find an exact string
find sub-strings
I hope Elasticsearch can do this.
By default, text fields use the standard analyzer, which drops most punctuation, breaks text up into individual words, and lowercases them. For instance, the standard analyzer would turn the string “Quick Brown Fox!” into the terms [quick, brown, fox]. As you can imagine, this makes it difficult to write an exact-match query against a text field. For your use case, I suggest one of two options:
1. Store the field as keyword and get sub-string-like matching with wildcard or fuzzy queries. Wildcard queries, in particular those with a leading wildcard, are notoriously slow, so proceed with caution.
2. Store the field twice: once as keyword and once as text. The obvious downside is a larger index.
For more background, see the "Term Query" Elasticsearch documentation, and in particular the section on "Why doesn’t the term query match my document?": https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
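One common way to do the second option is a multi-field, where the same value is indexed both as text and as a keyword sub-field. A Python sketch of the mapping and the two queries, as plain dictionaries; the sub-field name raw is my own choice, and the type-less mapping shape assumes a recent Elasticsearch version (older versions expect a type name under mappings).

```python
import json

# Mapping sketch: "address" is analyzed text, "address.raw" keeps the exact value.
# The sub-field name "raw" is an assumption made for this example.
mapping = {
    "mappings": {
        "properties": {
            "address": {
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},
            }
        }
    }
}


def exact_query(value):
    """Exact match: term query on the keyword sub-field."""
    return {"query": {"term": {"address.raw": value}}}


def partial_query(value):
    """Word-level match: match query on the analyzed text field."""
    return {"query": {"match": {"address": value}}}


print(json.dumps(mapping, indent=2))
print(json.dumps(exact_query("washington")))
print(json.dumps(partial_query("washington")))
```

Note that the match query finds whole words inside the address, not arbitrary character sub-strings; for those you would still need something like the wildcard approach from the first option.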

ElasticSearch: Using match_phrase for all fields

As a user of ElasticSearch 5, I have been using something like this to search for a given phrase in all fields:
GET /my_index/_search
{
"query": {
"match_phrase": {
"_all": "this is a phrase"
}
}
}
Now, the _all field is going away, and match_phrase does not seem to work like query_string, where you can simply use something like this to run a search for all fields:
"query": {
"query_string": {
"query": "word"
}
}
What is the alternative for an exact phrase search across all fields without using the _all field from version 6.0?
I have many fields per document so specifying all of them in the query is not really a solution for me.
You can find the answer in the Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-all-field.html
It says:
Use a custom field and the mapping copy_to parameter
So you have to define a custom field in the mapping and copy all the other fields into it.
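With many fields, the copy_to entries can be generated rather than written by hand. A Python sketch under a few assumptions: the field names title and body and the catch-all name all_text are invented for the example, add_copy_to is a hypothetical helper, and the mapping shape without a type name assumes a recent version (6.x expects a type name under mappings).

```python
import json


def add_copy_to(properties, target="all_text"):
    """Return a new properties dict where every field copies into `target`."""
    updated = {
        name: {**spec, "copy_to": target}
        for name, spec in properties.items()
    }
    updated[target] = {"type": "text"}  # the catch-all field itself
    return updated


# Hypothetical document fields; replace with your own mapping.
fields = {"title": {"type": "text"}, "body": {"type": "text"}}
mapping = {"mappings": {"properties": add_copy_to(fields)}}

# The phrase search then targets the custom field instead of _all.
phrase_query = {"query": {"match_phrase": {"all_text": "this is a phrase"}}}

print(json.dumps(mapping, indent=2))
print(json.dumps(phrase_query))
```

The mapping has to be in place before documents are indexed, so existing data would need to be reindexed for the custom field to be populated.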

How to specify certain fields only in the query property

I am using a service which wraps requests to Elastic Search. This service only allows me to send the query property to Elastic Search. I want to tell Elastic Search to look only for matches in a certain field in a document.
For example, if this is my document:
{
name: 'foo',
value: 'true'
}
Then I want to tell Elastic Search to look only for documents where name equals foo.
The Elastic Search documentation says to do this by using the fields property like so:
{
"multi_match" : {
"query" : "this is a test",
"fields" : [ "subject^3", "message" ]
}
}
But I can ONLY access the query property, so I can't specify fields. Lower down on the page, under best fields it says that this is equivalent to doing something like +first_name:will +first_name:smith. But when I put this, it's looking for text that actually matches +first_name:will +first_name:smith in the value, rather than looking for a first_name field that has a value will.
Is it possible to specify what field to search in with Elastic Search using only the query property?
This sounds like a perfect match for the query_string query (https://www.elastic.co/guide/en/elasticsearch/reference/1.x/query-dsl-query-string-query.html). You can do something like this with it:
"query_string" : {
"query" : "subject:whatever OR message:whatever"
}
So, if you can change multi_match to query_string this would be what you are looking for.
Lucene supports fielded data. When performing a search you can either specify a field, or use the default field. The field names and default field is implementation specific.
You can search any field by typing the field name followed by a colon ":" and then the term you are looking for.
{
"query": {
"query_string": {
"query": "Name:\"foo bar cook\"",
"default_operator" : "or"
}
}
}
Set default_operator to and to require all of the values (AND), or to or to match any of them (OR).
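If the field name and value come from code, the field:"phrase" string can be assembled with a small helper. A Python sketch; fielded_phrase and build_query are names made up for this example, and only backslashes and double quotes inside the phrase are escaped here.

```python
import json


def fielded_phrase(field, phrase):
    """Build field:"phrase" for query_string, escaping backslashes and quotes in the phrase."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def build_query(field, phrase, operator="or"):
    """Wrap the fielded phrase in a query_string body with the given default_operator."""
    return {
        "query": {
            "query_string": {
                "query": fielded_phrase(field, phrase),
                "default_operator": operator,
            }
        }
    }


print(json.dumps(build_query("Name", "foo bar cook")))
```

As far as I know, default_operator only applies between separate terms or clauses, so it has no effect on the words inside a single quoted phrase; drop the quotes if you want the operator to decide how the words combine.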
