Elasticsearch escape slash - elasticsearch

I am using elasticsearch 7 and I am trying to build up a search request in this way:
{
"query": {
"prefix": {
"document": {
"value": "/home/myfolder"
}
}
}
}
in order to find all documents whose path starts with /home/myfolder (the "document" field stores a path such as "/home/myfolder/file.txt"). I have tried many approaches but have not found a way to escape the "/" character properly. In other threads, people suggested using "\/home/myfolder" or "/home/myfolder", but neither works.
Many thanks for any help.

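Since you are trying to match on the / character, query the .keyword sub-field as below (this assumes the default dynamic mapping, which adds a .keyword sub-field to text fields).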
{
"query": {
"prefix": {
"document.keyword": {
"value": "/home/myfolder"
}
}
}
}
This is because, when you don't use .keyword, you are matching against an analyzed field, and the default (standard) analyzer splits the text at each / and discards it.
Try running this to see how the text is broken at each slash (/) into the tokens that go into the inverted index.
POST /_analyze
{
"text" :"/home/myfolder/document.txt"
}
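To make the difference concrete, here is a minimal Python sketch of what the prefix query sees in each case. It does not talk to Elasticsearch: `rough_standard_analyze` is a hypothetical, crude stand-in for the standard analyzer (it only splits on slashes and whitespace and lowercases), and `prefix_matches` is an assumed simplification of how a prefix query compares its unanalyzed value with the indexed terms.

```python
import re

def rough_standard_analyze(text):
    """Crude stand-in for the standard analyzer: split on '/' and whitespace, lowercase."""
    return [tok.lower() for tok in re.split(r"[/\s]+", text) if tok]

def prefix_matches(indexed_terms, prefix):
    """The prefix value is not analyzed; it is compared with each indexed term as-is."""
    return any(term.startswith(prefix) for term in indexed_terms)

path = "/home/myfolder/file.txt"
text_terms = rough_standard_analyze(path)  # terms for the analyzed "document" field
keyword_terms = [path]                     # single term for "document.keyword"

print(text_terms)
print(prefix_matches(text_terms, "/home/myfolder"))     # no term starts with a slash
print(prefix_matches(keyword_terms, "/home/myfolder"))  # the whole path is one term
```

None of the analyzed terms begins with a slash, so no amount of escaping in the query can make the prefix match on the analyzed field.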

Related

Kibana search pattern issue

I am trying to create an Elasticsearch query for one of my library projects. I am trying to use a regex but I do not get any results. This is the regexp query I am running:
GET /manifestation_v1/_search
{
"query": {
"regexp": {
"bibliographicInformation.title": {
"value": "python access*"
}
}
}
}
The * after access is meant as a wildcard, so I want a query that matches titles starting with "python access", not only the exact string "python access".
Can anyone who already has some experience with Kibana help me out?
You can try a wildcard query:
{
"query": {
"wildcard": {
"bibliographicInformation.title": {
"value": "saba safavi*"
}
}
}
}
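You need to run the regexp query on a keyword field and use .* instead of *. In a regular expression, * on its own only repeats the preceding character, while .* matches any sequence of characters. The example keeps the field name from the question; substitute the keyword field from your mapping.
For example: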
GET /manifestation_v1/_search
{
"query": {
"regexp": {
"bibliographicInformation.title": {
"value": "python access.*"
}
}
}
}
Regexp queries are slower; you can also try a prefix query:
{
"query": {
"prefix": {
"bibliographicInformation.title": {
"value": "python access"
}
}
}
}
If the field is of nested type, you need to use a nested query.
Update
For the "text" type, the field is stored as tokens, i.e.
"python access" is stored as ["python","access"]. Your query is trying to match "python access*" against each of these tokens individually. You need to query a keyword field, which stores the whole value as the single term "python access".

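The two points from the answer above (a regexp has to match a whole indexed term, and * is not the same as .*) can be sketched in Python. This is only an illustration: Python's `re` syntax differs from Lucene's regular expressions in general, although these simple patterns behave the same way, and the title "python access control" and its token list are hypothetical examples.

```python
import re

def regexp_matches(indexed_terms, pattern):
    """A regexp query has to match an entire term, so fullmatch is used here."""
    return any(re.fullmatch(pattern, term) for term in indexed_terms)

# Hypothetical title, indexed two ways
text_terms = ["python", "access", "control"]  # tokens of an analyzed text field
keyword_terms = ["python access control"]     # single term of a keyword field

print(regexp_matches(text_terms, "python access.*"))     # no single token fits
print(regexp_matches(keyword_terms, "python access*"))   # "s*" only repeats the s
print(regexp_matches(keyword_terms, "python access.*"))  # ".*" covers " control"
```

Only the last combination matches: the pattern needs .* and the field needs to hold the whole title as one term.
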
Elastic query bool must match issue

Below is the query part of an Elasticsearch GET request that I run from the command line inside an OpenShift pod. In the 2000 documents fetched, I get the matching documents as well as non-matching ones. How can I limit the results to only the matching documents?
I specifically want to get {\"kubernetes.container_name\":\"xyz\"}} only.
Any suggestions will be appreciated.
-d ' {\"query\": { \"bool\" :{\"must\" :{\"match\" :{\"kubernetes.container_name\":\"xyz\"}},\"filter\" : {\"range\": {\"@timestamp\": {\"gte\": \"now-2m\",\"lt\": \"now-1m\"}}}}},\"_source\":[\"@timestamp\",\"message\",\"kubernetes.container_name\"],\"size\":2000}'"
For exact matches there are two things you need to do:
Make use of term queries.
Ensure that the field is of the keyword datatype.
The text datatype goes through an analysis phase.
For example, if your data is This is a beautiful day, then during ingestion the text datatype breaks the sentence into tokens, lowercases them to [this, is, a, beautiful, day], and adds them to the inverted index. This is done by the standard analyzer, which is the default analyzer applied to text fields.
When you query, the analyzer is applied again at query time, and the search checks whether the resulting words are present in the respective documents. As a result, documents appear even when they are not an exact match.
To do an exact match, you need to use keyword fields, as they do not go through the analysis phase.
What I'd suggest is to create a keyword sibling field for your text field in the manner below and then re-ingest all the data:
Mapping:
PUT my_sample_index
{
"mappings": {
"properties": {
"kubernetes":{
"type": "object",
"properties": {
"container_name": {
"type": "text",
"fields":{
"keyword":{
"type": "keyword"
}
}
}
}
}
}
}
}
Note the fields block: it defines the container_name.keyword sub-field. I'm also assuming you are making use of the object type.
Request Query:
POST my_sample_index/_search
{
"query":{
"bool": {
"must": [
{
"term": {
"kubernetes.container_name.keyword": {
"value": "xyz"
}
}
}
]
}
}
}
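Since the original request was hand-escaped for the shell, it may be easier to build the body in code and let a JSON library do the quoting. The following is a minimal sketch, not the only way to do it: `build_container_query` is a hypothetical helper, the timestamp field is assumed to be named `@timestamp`, and the term clause is placed in the filter context next to the range clause because neither needs scoring.

```python
import json

def build_container_query(container_name, size=2000):
    """Build a search body that matches one exact container name in a time window."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Exact match against the keyword sub-field from the mapping
                    {"term": {"kubernetes.container_name.keyword": container_name}},
                    # Assumed timestamp field name; same window as in the question
                    {"range": {"@timestamp": {"gte": "now-2m", "lt": "now-1m"}}},
                ]
            }
        },
        "_source": ["@timestamp", "message", "kubernetes.container_name"],
        "size": size,
    }

body = json.dumps(build_container_query("xyz"))
print(body)
```

The printed string can be used as the request body of a _search call without any manual escaping.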
Hope this helps!

How do I not match a bare hyphen in Elasticsearch?

I am querying apache logs stored in Elasticsearch. I want to return log entries from a given hostname that has a hyphen and with a populated auth field.
These strings should be an exact match: "hostname": "example-dev" and not "auth": "-".
My questions are:
How do I correctly remap a type in Elasticsearch to allow a hyphen to be part of the matched string?
How do I correctly query a type in Elasticsearch with a bare hyphen?
The hyphen is a reserved character in Elasticsearch, so I understand it takes special effort. However, I'm having what seems like a lot of trouble figuring out how to include it in my query.
I have tried to remap the type to be not_analyzed. It looks like the format has recently changed. The old way of defining the index ("analyzed", "not_analyzed", and "no") makes sense to me. The new way (true or false) does not. In either case, I cannot seem to get remapping to work.
Here is my attempt at remapping:
DELETE /search
PUT search
{
"mappings" : {
"beat" : {
"properties" : {
"hostname" : {
"type" : "text",
"norms" : false,
"index" : false
}
}
}
}
}
I have not included the remapping of the auth field because it only returns a mapper_parsing_exception.
I am using json to query Elasticsearch. Here is my query:
GET _search
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"match": {
"beat.hostname": "example-dev"
}
}
],
"must_not": [
{
"match": {
"auth.keyword": "-"
}
}
]
}
}
}
}
}
I have tried escaping the hyphen with \\- but that returns results that match "auth": "-". The hostname still does not match exactly. The hostname query also matches something like "example-prod".
I have tried using "term" rather than "match"; that returns no results.
I can match a specific string for "auth", for example "must": { "match": { "auth": "foo" } } returns all entries for auth = "foo". That is opposite of what I need, but it does work. The hostname is still not exactly matched if it includes a hyphen.
The log entries are parsed into Elasticsearch using ELK stack, however this will be a report that is generated outside of Kibana for legacy reasons.
I have read the documentation and examples, but there is a lot to dig through. Many of the examples I have found are for older versions of Elasticsearch, which is understandable, but confusing.
I am new to Elasticsearch. It feels like I am just overlooking something, but the problem might stem from a basic misunderstanding of how Elasticsearch is doing things.
After spending some more time with Elasticsearch queries, I think I have it figured out.
Splitting the hostname string into two separate strings and matching both filters the hostname as expected. Using an empty string for the negative match also seems to work as expected.
Here is the updated query:
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"match": {
"beat.hostname": "example"
}
},
{
"match": {
"beat.hostname": "dev"
}
}
],
"must_not": [
{
"match_phrase": {
"auth.keyword": ""
}
}
]
}
}
}
}
}
I will do a bit more testing to make sure this is actually returning what I need.
I was trying too hard to make Elasticsearch fit what I expected. Instead of working with Elasticsearch, I was trying to fight against it.
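A rough Python sketch of why the original query behaved the way it did, under the assumption that beat.hostname is an analyzed text field. Nothing here calls Elasticsearch: `rough_standard_analyze` is a hypothetical approximation of the standard analyzer (split on non-alphanumeric characters, lowercase), and `match_query` and `term_query` are simplified models of how those queries compare values with indexed terms.

```python
import re

def rough_standard_analyze(text):
    """Crude approximation: split on non-alphanumeric characters and lowercase."""
    return [tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", text) if tok]

def match_query(doc_value, query_text, operator="or"):
    """Model of a match query: the query text is analyzed, then terms are compared."""
    doc_terms = set(rough_standard_analyze(doc_value))
    hits = [term in doc_terms for term in rough_standard_analyze(query_text)]
    return all(hits) if operator == "and" else any(hits)

def term_query(indexed_terms, value):
    """Model of a term query: the value is compared with indexed terms unchanged."""
    return value in indexed_terms

print(match_query("example-prod", "example-dev"))                     # "example" is shared
print(match_query("example-prod", "example-dev", "and"))              # "dev" is missing
print(term_query(rough_standard_analyze("example-dev"), "example-dev"))  # no such token
print(term_query(["example-dev"], "example-dev"))                     # keyword-style term
```

The two separate match clauses in the updated query behave like the "and" case above. A term query against a keyword version of the field would be the exact-match alternative.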

elasticsearch - confused on how to searching items that a field contains string

This query works fine and returns only one item, "steve_jobs".
{
"query": {
"constant_score": {
"filter": {
"term": {
"name":"steve_jobs"
}
}
}
}
}
So, now I want to get all people with name prefix steve_. So I try this:
{
"query": {
"constant_score": {
"filter": {
"term": {
"name": "steve_"
}
}
}
}
}
This is returning nothing. Why?
I'm confused about when to use term query / term filter / terms filter / querystring query.
What you need is Prefix Query.
If you are indexing your document like so:
POST /testing_nested_query/class/
{
"name": "my name is steve_jobs"
}
And you are using the default analyzer, then the problem is that steve_jobs will be indexed as one term. So your term query will never be able to find any docs matching steve_, as there is no such term in the index. A prefix query solves your problem by searching for a prefix among all the indexed terms.
You can solve the same problem by defining a custom analyzer so that steve_jobs is stored as steve and jobs.
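As a small illustration of the difference, here is a Python sketch that treats the index as a plain list of terms. The term list is an assumption about what the default analyzer produces for the sample document, and `term_hits` and `prefix_hits` are hypothetical helpers, not Elasticsearch APIs.

```python
def term_hits(indexed_terms, value):
    """A term query only finds terms that are exactly equal to the value."""
    return [term for term in indexed_terms if term == value]

def prefix_hits(indexed_terms, value):
    """A prefix query finds every term that starts with the value."""
    return [term for term in indexed_terms if term.startswith(value)]

# Assumed terms for "my name is steve_jobs"
indexed_terms = ["my", "name", "is", "steve_jobs"]

print(term_hits(indexed_terms, "steve_jobs"))
print(term_hits(indexed_terms, "steve_"))
print(prefix_hits(indexed_terms, "steve_"))
```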

Case insensitivity does not work

I can't figure out why my searches are case sensitive. Everything I've read says that ES is case insensitive by default. I have mappings that specify the standard analyzer for indexing and search, but some things still seem to be case sensitive, e.g. wildcard:
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"name": {
"value": "Rae*"
}
}
}
]
}
}
}
This fails but "rae*" works as wanted. I need to use wildcard for 'starts-with' type searches (I presume).
I'm using NEST from a .Net app and am specifying the analyzers when I create the index thus:
var settings = new IndexSettings();
settings.NumberOfReplicas = _configuration.Replicas;
settings.NumberOfShards = _configuration.Shards;
settings.Add("index.refresh_interval", "10s");
settings.Analysis.Analyzers.Add(new KeyValuePair<string, AnalyzerBase>("keyword", new KeywordAnalyzer()));
settings.Analysis.Analyzers.Add(new KeyValuePair<string, AnalyzerBase>("simple", new SimpleAnalyzer()));
In this case it's using the simple analyzer but the standard one has the same result.
The mapping looks like this:
name: {
type: string
analyzer: simple
store: yes
}
Does anyone have any idea what's wrong here?
Thanks
From the documentation,
"[The wildcard query] matches documents that have fields matching a wildcard expression (not analyzed)".
Because the search term is not analyzed, you'll essentially need to run the analysis yourself before generating the search query. In this case, this just means that your search term needs to be lowercase. Alternatively, you could use query_string:
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "name:Rae*"
}
}
]
}
}
}
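If you want to stay with the wildcard query, running the analysis yourself can be as simple as lowercasing the input before building the request. The sketch below assumes the analyzer on the field does nothing more than lowercase single words. `starts_with_query` is a hypothetical helper, and `fnmatchcase` is used only to imitate case-sensitive wildcard matching against the indexed term.

```python
from fnmatch import fnmatchcase

def starts_with_query(field, user_input):
    """Build a wildcard query body, lowercasing the input as the analyzer would."""
    return {"query": {"wildcard": {field: {"value": user_input.lower() + "*"}}}}

indexed_term = "rae"  # what the analyzer stores for the name "Rae"

print(fnmatchcase(indexed_term, "Rae*"))  # unanalyzed pattern misses the lowercase term
print(fnmatchcase(indexed_term, "rae*"))  # lowercased pattern matches
print(starts_with_query("name", "Rae"))
```

Any * or ? typed by the user would still act as a wildcard here, so real input may need escaping as well.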

Resources