Search keyword using double quotes to get exact match in elasticsearch - elasticsearch

If a user puts double quotes around a search phrase, such as "flowers and mulch", only exact matches should be displayed.
I tried query_string, which almost works, but I am not satisfied with the results.
Can anyone help me out, please?
{
"query": {
"query_string": {
"fields": ["body"],
"query": "\"flowers and mulch\""
}
}
}

You should be using match_phrase for exact matches of phrases:
{
"query": {
"match_phrase": {
"body": "flowers and mulch"
}
}
}
Phrase matching
In the same way that the match query is the “go-to” query for standard
full text search, the match_phrase query is the one you should reach
for when you want to find words that are near to each other.
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/phrase-matching.html
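As a minimal sketch of the quoted-input case from the question, the Python below picks match_phrase when the user wraps the input in double quotes and falls back to match otherwise. The helper name build_query and the fallback to match are assumptions made for illustration, not part of the original answer; the body field comes from the question.

```python
import json


def build_query(user_input, field="body"):
    """Return a search body: match_phrase for quoted input, match otherwise."""
    text = user_input.strip()
    # Assumption: input wrapped in double quotes is a request for a phrase match.
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return {"query": {"match_phrase": {field: text[1:-1]}}}
    # Anything else is treated as ordinary full-text input.
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    print(json.dumps(build_query('"flowers and mulch"'), indent=2))
```

The returned dictionary can be serialized and sent as the body of a search request.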

As I noted in a comment on the question, it would help to know what the OP found unsatisfying about query_string. I would recommend query_string for these cases. Note that it has several options, such as auto_generate_phrase_queries, split_on_whitespace, and quote_field_suffix, which make it quite versatile.
A case like one "two three" can be handled with the default query_string parameters.
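A small sketch of how such mixed input might be passed to query_string from client code. The helper name query_string_body is hypothetical; the point is only that the user's text goes through unchanged and the JSON encoder takes care of escaping the inner quotes.

```python
import json


def query_string_body(user_input, fields):
    """Wrap raw user input in a query_string body without altering it."""
    return {"query": {"query_string": {"fields": list(fields), "query": user_input}}}


body = query_string_body('one "two three"', ["body"])
# json.dumps escapes the inner double quotes, so no manual backslashes are needed.
payload = json.dumps(body)

if __name__ == "__main__":
    print(payload)
```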

Related

Elastic Match Nginx Status Code with match phrase

I'm looking to query Elasticsearch for messages that match only the 409 response code.
Yet it matches too many results: any URL containing /409/ or -409-, for example, as if the match_phrase query did not care about leading and trailing spaces.
{
"bool": {
"should": [{ "match_phrase": { "message": " 409 " } }],
"minimum_should_match": 1
}
}
Thanks in advance for any comment, clue, notice, enlightenment :)
Regards
TL;DR
As per the documentation, the match_phrase query does not perform an exact match.
It analyzes the input into a phrase of tokens and looks for that phrase anywhere in the text.
So in your case, matching /409/ is expected.
If you are looking for an exact match, use a keyword field and search it with a term query.
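To see why the surrounding spaces make no difference, here is a rough Python simulation. The tokenize function is a crude stand-in for the standard analyzer (lowercase, split on anything that is not a letter or digit), not its real implementation, and the sample log lines are made up.

```python
import re


def tokenize(text):
    """Crude analyzer stand-in: lowercase and split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_matches(phrase, text):
    """True if the phrase's tokens appear consecutively in the text's tokens."""
    wanted, tokens = tokenize(phrase), tokenize(text)
    if not wanted:
        return False
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )


if __name__ == "__main__":
    # Spaces around 409 vanish during analysis, so the URL line matches too.
    print(phrase_matches(" 409 ", "GET /items/409/detail HTTP/1.1 200"))
    print(phrase_matches(" 409 ", 'GET /index.html HTTP/1.1" 409 512'))
```

Under this approximation both lines match, because " 409 " and "/409/" both reduce to the single token 409.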

Elastic search filename search not working with dots in filename

I have elasticsearch mapping as follows:
{
"info": {
"properties": {
"timestamp": {"type":"date","format":"epoch_second"},
"user": {"type":"keyword" },
"filename": {"type":"text"}
}
}
}
When I run a match query on filename, it works properly as long as the search input contains no dot, but when a dot is included, it returns many false results.
I learned that the standard analyzer is the issue: it breaks the search input on dots and then searches. Which analyzer can I use in this case? There can be millions of filenames, and I don't want something that takes a lot of memory and time. Please suggest.
As you are talking about filenames here, I would suggest using the keyword analyzer. It does not split the string and indexes it as is.
You could also just change your mapping from text to keyword instead.
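A sketch of the second suggestion, written as Python dictionaries: the mapping from the question with filename switched to keyword, plus a term query for a whole filename. Using a term query here is an assumption (the answer does not name a query type), and the helper name filename_query and the sample filename are hypothetical.

```python
import json

# Mapping from the question, with filename changed from text to keyword.
mapping = {
    "info": {
        "properties": {
            "timestamp": {"type": "date", "format": "epoch_second"},
            "user": {"type": "keyword"},
            "filename": {"type": "keyword"},
        }
    }
}


def filename_query(name):
    """Build a term query that looks up the whole, unanalyzed filename."""
    return {"query": {"term": {"filename": name}}}


if __name__ == "__main__":
    print(json.dumps(filename_query("report.v2.pdf"), indent=2))
```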

Elasticsearch - text type regexp

Does elasticsearch support regex search on text type string?
I created a document like below.
{
"T": "a$b$c$d"
}
and I tried to search this document with below query.
{
"query": {
"query_string": {
"query": "T:/a.*/"
}
}
}
It seems to work, BUT when I query with the '$' symbol, it is unable to find the document.
{
"query": {
"query_string": {
"query": "T:/a$.*/"
}
}
}
What should I do to find the document? This field must be of type text (not keyword), since its value can be longer than the keyword max length.
You should be aware of a few things here:
If your field is analyzed (and tokenized in the process), you will only find matches in fields containing a token (not the whole "text") that matches your RegExp. If you want the whole content of the field to match, you must use a keyword field or at least a Keyword Analyzer that doesn't tokenize your text.
The $ symbol has a special meaning in Regular Expressions (it marks the end of a string), so you'll have to escape it: a\$.*
Your RegExp must match a whole token to get a hit. That's why there's no point in using $ as a (non-escaped) RegExp symbol: your RegExp must match a whole token from beginning to end anyway. So (to stick to your example) to match fields where a is followed by c, you'd need .*?a[^c]*c.*, or if you need the $s in there, escape them: .*?a\$[^c]*c\$.*
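The escaping point can be tried out with Python's re module as a stand-in. This is only an illustration: Python's regex dialect is not Lucene's, and the value is treated as one unanalyzed token, as it would be in a keyword field.

```python
import re

value = "a$b$c$d"  # the document value from the question, as a single token

# Unescaped: "$" is an end-of-string anchor in Python's re, so nothing may follow it.
unescaped = re.fullmatch(r"a$.*", value)

# Escaped: "\$" is a literal dollar sign; fullmatch requires the whole value to match.
escaped = re.fullmatch(r"a\$.*", value)

# re.escape produces the escaped form of a literal prefix.
prefix_pattern = re.escape("a$") + ".*"

if __name__ == "__main__":
    print(unescaped, escaped, prefix_pattern)
```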

How do you search for exact terms (which may include special characters) with trailing/leading wildcard matching in Elasticsearch?

I am trying to figure out how to create Elasticsearch queries that allow for exact matches containing reserved characters while supporting trailing or leading wildcard expansion. I am using logstash dynamic templates, which also automatically create a raw field for each of my terms.
To sum up as concisely as possible, I want to create queries that can support two generic types of matching across all values:
Searching terms such as 'abc' to return results like 'abc.xyz.com'. In this case, the standard analyzer keeps 'abc.xyz.com' as a single token, and wildcard matching can succeed using the following command:
{
"query": {
"wildcard": {
"_all": "*abc*"
}
}
}
Searching terms such as fullpaths like '/Intel/1938138191(1).zip' to return results like 'C:/Program Files (x86)/Intel/1938138191(1).zip'. In this case, even if I backslash all of the reserved characters, doing a wildcard match like
{
"query": {
"wildcard": {
"_all": "*/Intel/1938138191(1).zip*"
}
}
}
will not work. And this is because _all defaults to using the standard analyzer, so the path will be split and an exact match cannot be made. However, if I SPECIFICALLY query the raw field like below (both when I escape / do not escape the special characters), I get the correct result:
{
"query": {
"wildcard": {
"field.raw": "*/Intel/1938138191(1).zip*"
}
}
}
So my question is, is there any way to support calling wildcard queries across both tokens analyzed by the standard analyzers and the raw fields which are not analyzed at all, in one query? That is some way of generically encapsulating searched terms so that in both of my above examples, I would get the correct result? For reference I am using Elasticsearch version 1.7. I have also tried looking into query string matching and term matching, all to no avail.

Elastic search query string regex

I am having an issue querying a field (title) using a query string regex.
This works: "title:/test/"
This does not: "title:/^test$/"
However, the documentation says it is supported: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax
My goal is to do an exact match: the match should not be partial, it should match the whole field value.
Does anybody have an idea what might be wrong here?
From the documentation:
The Lucene regular expression engine is not Perl-compatible but supports a smaller range of operators.
You are using the anchors ^ and $, which are not supported because there is no need for them. Again from the docs:
Lucene’s patterns are always anchored. The pattern provided must match the entire string
If you are looking for a phrase-query kind of match, you could use double quotes like this:
{
"query": {
"query_string": {
"default_field": "title",
"query": "\"test phrase\""
}
}
}
but this would also match documents with a title like test phrase someword.
If you want an exact match, you should look at term queries: map your title field with "index" : "not_analyzed", or use the keyword analyzer with a lowercase filter for a case-insensitive match. Your query would look like this:
{
"query": {
"term": {
"title": {
"value": "my title"
}
}
}
}
This will give you an exact match.
Usually in regex, the ^ and $ symbols indicate that the text must be located at the start/end of the string. This is called anchoring. Lucene regex patterns are anchored by default.
So the pattern "test" in Elasticsearch is the equivalent of "^test$" in, say, Java.
You have to work to "unanchor" your pattern, for example by using "te.*" to match "test", "testing" and "teeth", because the pattern "test" would only match "test".
Note that this requires that the field is not analyzed, and also note that it has terrible performance. For an exact match, use a term filter as described in the answer by ChintanShah25.
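The anchoring behavior described above can be imitated in Python, where re.fullmatch plays the role of an always-anchored pattern and re.search the role of an unanchored one. This is an analogy only, not Lucene's engine; the three words come from the answer.

```python
import re

words = ["test", "testing", "teeth"]


def anchored_hits(pattern, values):
    """Values matched in full by the pattern, like an always-anchored regex."""
    return [v for v in values if re.fullmatch(pattern, v)]


def unanchored_hits(pattern, values):
    """Values containing a match anywhere, like a Perl-style unanchored regex."""
    return [v for v in values if re.search(pattern, v)]


if __name__ == "__main__":
    print(anchored_hits("test", words))
    print(anchored_hits("te.*", words))
    print(unanchored_hits("test", words))
```

With full matching, "test" hits only the word test, while "te.*" hits all three words.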

Resources