Elasticsearch escape hyphenated field in groovy script

I am trying to add a field to a document with a scripted update, similar to https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html#_scripted_updates. However, I am running into problems because the field name is hyphen-separated rather than underscore-separated, and the hyphens appear to be treated as minus signs.
Example body below:
{"script":"ctx._source.path.to.hyphen-separated-field = \"new data\""}
I tried escaping the hyphens with a backslash, but with no luck.

You can access the field using square brackets, i.e. simply do it like this:
{"script": "ctx._source.path.to['hyphen-separated-field'] = \"new data\""}

This one worked for me on 2.x (and may work on other versions as well):
{
  "script": {
    "inline": "ctx._source.path.to[field] = val",
    "params": {
      "val": "This is the new value",
      "field": "hyphen-separated-field"
    }
  }
}

Or this will also work:
{"script": "ctx._source.path.to.'hyphen-separated-field' = 'new data'"}
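
For anyone sending the update from client code, the parameterized form quoted above can be sketched in Python. This is a minimal sketch under assumptions: the helper name build_update_body is hypothetical, and the "inline"/"params" layout is simply the 2.x shape from the answer; other versions may expect different keys.

```python
import json


def build_update_body(parent_path, field, value):
    """Build an update body that sets parent_path[field] = value.

    build_update_body is a hypothetical helper, not part of any client library.
    The field name travels as a script parameter, so hyphens in it are never
    parsed as minus signs.
    """
    return {
        "script": {
            "inline": "ctx._source.%s[field] = val" % parent_path,
            "params": {"field": field, "val": value},
        }
    }


body = build_update_body("path.to", "hyphen-separated-field", "new data")
print(json.dumps(body, indent=2))
```

Passing the field name as a parameter also keeps the script text identical across calls, so only the params change from one update to the next.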

Related

Elastic search filename search not working with dots in filename

I have an Elasticsearch mapping as follows:
{
  "info": {
    "properties": {
      "timestamp": {"type": "date", "format": "epoch_second"},
      "user": {"type": "keyword"},
      "filename": {"type": "text"}
    }
  }
}
When I run a match query on filename, it works properly as long as the search input contains no dot, but when a dot is included, it returns many false results.
I learned that the standard analyzer is the issue: it breaks the search input on dots and then searches for the pieces. What analyzer can I use in this case? There can be millions of filenames, and I don't want something that takes a lot of memory and time. Please suggest.

Since you are talking about filenames here, I would suggest using the keyword analyzer. It does not split the string and indexes it as is.
You could also just change your mapping from text to keyword instead.
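
The second suggestion can be sketched as Python dictionaries, so the request bodies can be built and inspected in code. It assumes the same info mapping type as in the question; the sample filename is made up for illustration, and the term query is one common way to look up an exact value in a keyword field.

```python
import json

# Same mapping as in the question, with filename switched from text to keyword
# so the whole filename is indexed as a single, unanalyzed value.
mapping = {
    "info": {
        "properties": {
            "timestamp": {"type": "date", "format": "epoch_second"},
            "user": {"type": "keyword"},
            "filename": {"type": "keyword"},
        }
    }
}


def exact_filename_query(filename):
    """Build a term query that matches the filename exactly, dots included."""
    return {"query": {"term": {"filename": filename}}}


# "report.final.pdf" is a made-up example value.
print(json.dumps(exact_filename_query("report.final.pdf")))
```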

Elasticsearch - text type regexp

Does Elasticsearch support regex search on a text type string?
I created a document like the one below.
{
  "T": "a$b$c$d"
}
and I tried to search for this document with the query below.
{
  "query": {
    "query_string": {
      "query": "T:/a.*/"
    }
  }
}
It seems to work for me, BUT when I query with the '$' symbol, it is unable to find the document.
{
  "query": {
    "query_string": {
      "query": "T:/a$.*/"
    }
  }
}
What should I do to find the document? This key's data should be text type (not keyword) since it can be longer than the keyword max length.

You should be aware of a few things here:
If your field is analyzed (and tokenized in the process) you will only find matches in fields containing a token (not the whole "text") that matches your RegExp. If you want the whole content of the field to match, you must use a keyword field or at least a Keyword Analyzer that doesn't tokenize your text.
The $ symbol has a special meaning in Regular Expressions (it marks the end of a string), so you'll have to escape it: a\$.*
Your RegExp must match a whole token to get a hit. That's why there's no point in using $ as an (unescaped) RegExp symbol: your RegExp must match a whole token from beginning to end anyway. So (to stick to your example) to match fields where a is followed by c, you'd need .*?a[^c]*c.*, or if you need the $s in there, escape them: .*?a\$[^c]*c\$.*
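
The first and third points can be illustrated offline with Python's re module. This is only an analogy: Python's regex syntax is not Lucene's, re.fullmatch stands in for matching a whole token from beginning to end, and the token list is an assumption about what an analyzer that drops the $ characters would produce for the sample document.

```python
import re

original = "a$b$c$d"
# Assumed analyzer output: the "$" characters are dropped and each letter
# becomes its own token. A keyword field would keep the single original value.
analyzed_tokens = ["a", "b", "c", "d"]
keyword_tokens = [original]


def matching_tokens(pattern, tokens):
    """Return the tokens that the pattern matches from beginning to end."""
    return [t for t in tokens if re.fullmatch(pattern, t)]


# "a.*" matches the token "a", so the first query finds the document.
print(matching_tokens(r"a.*", analyzed_tokens))
# No analyzed token contains "$", so the escaped pattern finds nothing.
print(matching_tokens(r"a\$.*", analyzed_tokens))
# Against the unanalyzed value, the escaped pattern matches the whole string.
print(matching_tokens(r"a\$.*", keyword_tokens))
```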

Elastic search query string regex

I am having an issue querying a field (title) using a query string regex.
This works: "title:/test/"
This does not: "title:/^test$/"
However, the documentation says it is supported: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax
My goal is to do an exact match: the match should not be partial, it should cover the whole field value.
Does anybody have an idea what might be wrong here?

From the documentation:
The Lucene regular expression engine is not Perl-compatible but supports a smaller range of operators.
You are using the anchors ^ and $, which are not supported because there is no need for them. Again from the docs:
Lucene’s patterns are always anchored. The pattern provided must match the entire string
If you are looking for a phrase-query kind of match, you could use double quotes like this:
{
  "query": {
    "query_string": {
      "default_field": "title",
      "query": "\"test phrase\""
    }
  }
}
but this would also match documents with a title like test phrase someword.
If you want an exact match, you should look at term queries: map your title field with "index" : "not_analyzed", or use the keyword analyzer with a lowercase filter for a case-insensitive match. Your query would look like this:
{
  "query": {
    "term": {
      "title": {
        "value": "my title"
      }
    }
  }
}
This will give you an exact match.

In regex, the ^ and $ symbols usually indicate that the text should be located at the start/end of the string. This is called anchoring. Lucene regex patterns are anchored by default.
So the pattern "test" in Elasticsearch is the equivalent of "^test$" in, say, Java.
You have to work to "unanchor" your pattern, for example by using "te.*" to match "test", "testing" and "teeth", because the pattern "test" would only match "test".
Note that this requires that the field is not analyzed, and also note that it has terrible performance. For an exact match, use a term filter as described in the answer by ChintanShah25.
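
The anchoring behaviour described above can be mimicked in Python, where re.fullmatch plays the role of Lucene's implicitly anchored matching. This illustrates the idea only and is not Lucene's engine; the word list is taken from the answer's own example.

```python
import re

words = ["test", "testing", "teeth"]


def anchored_matches(pattern, values):
    """Keep the values that the pattern matches in full, as if wrapped in ^...$."""
    return [v for v in values if re.fullmatch(pattern, v)]


# The bare pattern behaves like ^test$ and matches only the exact word.
print(anchored_matches("test", words))
# Appending ".*" "unanchors" the end of the pattern.
print(anchored_matches("te.*", words))
```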

Search keyword using double quotes to get exact match in elasticsearch

If a user searches with quotes around the keywords, like "flowers and mulch", then only exact matches should be displayed.
I tried using query_string, which almost works, but I am not satisfied with the results.
Can anyone help me out, please?
{
  "query": {
    "query_string": {
      "fields": ["body"],
      "query": "\"flowers and mulch\""
    }
  }
}

You should be using match_phrase for exact matches of phrases:
{
  "query": {
    "match_phrase": {
      "body": "flowers and mulch"
    }
  }
}
Phrase matching
In the same way that the match query is the “go-to” query for standard
full text search, the match_phrase query is the one you should reach
for when you want to find words that are near to each other.
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/phrase-matching.html
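
One way to wire this up on the application side is sketched below in Python. The helper build_body_query is hypothetical, and falling back to a plain match query for unquoted input is an assumption about the desired behaviour, not something stated in the question.

```python
import json


def build_body_query(user_input, field="body"):
    """Return a match_phrase query for quoted input, else a match query.

    build_body_query is a hypothetical helper for illustration.
    """
    text = user_input.strip()
    is_quoted = len(text) >= 2 and text.startswith('"') and text.endswith('"')
    if is_quoted:
        # Strip the surrounding quotes and search for the words as a phrase.
        return {"query": {"match_phrase": {field: text[1:-1]}}}
    return {"query": {"match": {field: text}}}


print(json.dumps(build_body_query('"flowers and mulch"')))
print(json.dumps(build_body_query("flowers and mulch")))
```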

As I said in a comment on the question, it would be great to know what the OP found unsatisfying about query_string. I would recommend using query_string for these cases. Note that there are multiple options that can be set, such as auto_generate_phrase_queries, split_on_whitespace, or quote_field_suffix, which make it quite versatile.
The case one "two three" could be handled with the default parameters of query_string.
