I have a use case where I have to store documents with their field names changed by some processing. But for search purposes I want those documents to be returned with an alias of my choosing.
This is specifically about removing dots "." from the input field names while keeping the search results oblivious to the change.
Example:
Field name received: My.Field.Name
Processed name in ES: My<Separator>Field<Separator>Name
Expected Search Result: My.Field.Name
I am assuming that field aliases are not supported by Elasticsearch right now. Is there any workaround for this?
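To make the example concrete, here is a minimal Python sketch of one possible client-side workaround, assuming the application controls both the indexing step and the handling of search results. The separator "__DOT__" and the function names are hypothetical, and the round trip is only safe if the separator never occurs in a real field name.

```python
SEPARATOR = "__DOT__"  # hypothetical separator; must never occur in a real field name


def encode_keys(doc):
    """Return a copy of doc with '.' in every key replaced by SEPARATOR."""
    if isinstance(doc, dict):
        return {k.replace(".", SEPARATOR): encode_keys(v) for k, v in doc.items()}
    if isinstance(doc, list):
        return [encode_keys(v) for v in doc]
    return doc


def decode_keys(doc):
    """Reverse encode_keys on a returned document (for example a hit's _source)."""
    if isinstance(doc, dict):
        return {k.replace(SEPARATOR, "."): decode_keys(v) for k, v in doc.items()}
    if isinstance(doc, list):
        return [decode_keys(v) for v in doc]
    return doc


stored = encode_keys({"My.Field.Name": "value"})   # what would be indexed
returned = decode_keys(stored)                     # what the caller would see
```

Queries would need the same encode step applied to any field names they mention.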
I’m trying to tag my data according to a lookup table.
The lookup table has these fields:
• Key: represents the field name in the data I want to tag.
In the real data the field is a subfield of the “Headers” field.
An example for the “Key” field:
“*Server*” (* is a wildcard)
• Value: represents the wanted value of the field mentioned above.
The value in the lookup table is only a part of the string in the real data value.
An example for the “Value” field:
“Avtech”.
• Vendor: the value I want to add to the real data if a field-value combination is found in a document.
An example for combination in the real data:
“Headers.Server : Linux/2.x UPnP/1.0 Avtech/1.0”
A match with that document in the lookup table will be:
Key= Server (with wildcard on both sides).
Value= Avtech (with wildcard on both sides)
Vendor= Avtech
So basically I’ll need to add a field to that document with the value “Avtech”.
The subfields in “Headers” are dynamic fields that change from document to document.
If a match is not found, I’ll need to add the tag field with the value “Unknown”.
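The intended tagging rule can be written out in plain Python to pin down the semantics. This is only a sketch of the desired behaviour outside Elasticsearch, not an enrich processor configuration; the table contents are taken from the example above, and the function name tag_vendor is hypothetical.

```python
from fnmatch import fnmatchcase

# Lookup table rows; '*' is the wildcard, as in the example above.
LOOKUP = [
    {"Key": "*Server*", "Value": "*Avtech*", "Vendor": "Avtech"},
]


def tag_vendor(doc, lookup=LOOKUP, default="Unknown"):
    """Return the Vendor of the first row whose Key and Value patterns
    both match some subfield of Headers, or the default if none match."""
    headers = doc.get("Headers", {})
    for name, value in headers.items():
        for row in lookup:
            if fnmatchcase(name, row["Key"]) and fnmatchcase(str(value), row["Value"]):
                return row["Vendor"]
    return default


doc = {"Headers": {"Server": "Linux/2.x UPnP/1.0 Avtech/1.0"}}
vendor = tag_vendor(doc)
```

Matching here is case-sensitive; whether that is wanted depends on the data.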
I’ve tried to use the enrich processor, with the lookup table as the source data, “Value” as the match field and “Vendor” as the enrich field.
In the enrich processor I didn’t know how to refer to the field, since it’s dynamic and I want to check whether the value appears anywhere in the “Headers” subfields.
Also, I don’t think there will be a match between the “Value” in the lookup table and the value of the Headers subfield, since the “Value” field in the lookup table is a substring with wildcards on both sides.
I could use some help accomplishing what I’m trying to do, and with how to search with wildcards inside an enrich processor.
Or maybe you have another idea besides the enrich processor, such as parent-child or the terms lookup mechanism.
Thanks!
Adi.
There are two ways to accomplish this:
1. Using a combination of Logstash and Elasticsearch
2. Using only the Elasticsearch ingest node
Constraint: You need to know the position at which the Vendor term occurs in the Headers field.
Approach 1
If you do, you can use the Grok filter to extract the term and then, based on the term found, do a lookup to get the corresponding value.
Reference
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html
https://www.elastic.co/guide/en/logstash/current/plugins-filters-jdbc_static.html
https://www.elastic.co/guide/en/logstash/current/plugins-filters-jdbc_streaming.html
Approach 2
Create an index consisting of KV pairs. In the ingest node, create a pipeline that consists of a Grok processor followed by an Enrich processor. The Grok processor works the same way as described in Approach 1. And you seem to have the Enrich part working already.
Reference
https://www.elastic.co/guide/en/elasticsearch/reference/current/grok-processor.html
If you are able to isolate the subfield within Headers where the term of interest is present, it will make things easier for you.
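As a rough sketch of Approach 2, the pipeline body could be assembled like this in Python. Everything named here is an assumption for illustration: the policy name "vendor-lookup" (an enrich policy assumed to exist already, with "Value" as its match field and "Vendor" as its enrich field), the field names vendor_term and vendor_info, and the Grok pattern, which assumes the vendor term is the third space-separated token of Headers.Server as in the example. Please verify it against your Elasticsearch version before relying on it.

```python
import json


def build_vendor_pipeline(policy_name="vendor-lookup"):
    """Build an ingest pipeline body: Grok extracts the term, Enrich looks it
    up, and a final Set fills in 'Unknown' when no vendor was added."""
    return {
        "description": "Tag documents with a vendor taken from a lookup index",
        "processors": [
            {
                "grok": {
                    "field": "Headers.Server",
                    # Assumes input like "Linux/2.x UPnP/1.0 Avtech/1.0".
                    "patterns": ["%{NOTSPACE} %{NOTSPACE} %{WORD:vendor_term}/%{NOTSPACE}"],
                    "ignore_failure": True,
                }
            },
            {
                "enrich": {
                    "policy_name": policy_name,
                    "field": "vendor_term",
                    "target_field": "vendor_info",
                    "ignore_missing": True,
                }
            },
            {
                # override=False leaves an existing vendor_info.Vendor untouched.
                "set": {"field": "vendor_info.Vendor", "value": "Unknown", "override": False}
            },
        ],
    }


pipeline_json = json.dumps(build_vendor_pipeline(), indent=2)
```

The resulting JSON would be the request body when creating the pipeline.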
I have graphs defined from an Elasticsearch source with long field names such as private_data.systemMetrics.systemData.cpu.usage_user. I would like to set an alias for the CPU usage fields that displays only the field name suffix (in the example above, usage_user).
Using Grafana v4.5.2
I found a way to display a short alias, but it is costly: I split the query with multiple fields into a list of queries, each on a single field with an explicit alias.
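That splitting can be pictured with a small Python sketch. The dictionary layout is hypothetical and is not Grafana's query schema; it only shows how the short alias is derived from each long field name.

```python
def split_into_aliased_queries(fields):
    """One entry per field, aliased by the last dot-separated segment of its name."""
    return [{"field": f, "alias": f.rsplit(".", 1)[-1]} for f in fields]


queries = split_into_aliased_queries([
    "private_data.systemMetrics.systemData.cpu.usage_user",
])
```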
Is there a better way to do it?
I have a problem with the unique count feature.
I get data from Elasticsearch, for example a computer name (PC-01) in a field.
When I want to use a unique count visualisation, Kibana turns "DESKTOP-2D562R2" into "DESKTOP" and "2D562R2" as separate entries.
See this split field:
The data Kibana gets from Elasticsearch looks like this entry:
The problem is that "2d562r2" and "desktop" show up as two different entries in a Kibana table or in a unique count.
Your field is being analyzed (split into tokens). Change the mapping (or template, depending on how you're creating the indexes) to make this field not_analyzed.
Note that, as a hack, logstash's default template creates a ".raw" version of string fields that is not analyzed. You could refer to enterys.raw.
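For illustration, a mapping body of that kind could be built like this in Python. It is a sketch for the older Elasticsearch versions this answer refers to, where string fields take "index": "not_analyzed" (from 5.0 onward the keyword type plays that role). The type name "logs" and field name "computer_name" are hypothetical.

```python
import json


def not_analyzed_mapping(type_name, field_name):
    """Build a legacy mapping body that indexes field_name as one untokenized term."""
    return {
        "mappings": {
            type_name: {
                "properties": {
                    field_name: {"type": "string", "index": "not_analyzed"}
                }
            }
        }
    }


mapping_json = json.dumps(not_analyzed_mapping("logs", "computer_name"), indent=2)
```

A mapping change like this only affects newly created indexes, so existing data would need to be reindexed.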
How can we search on multiple fields in Elasticsearch?
For example, I want to search on subcategory and region. It works for one field, but how do we search on multiple fields?
The link below works fine, since I am using only one field for the search:
http://34c512ba34534fffdfd12abfd69f2458.us-east-1.aws.found.io:9200/episodes/episode/_search?q=sub_cat_seo_url:english-news&sort=pubdate_timestamp:desc
But when I try to search on multiple fields, for example sub_cat_seo_url and region, it does not work.
See these links (not working):
http://34c512ba34534fffdfd12abfd69f2458.us-east-1.aws.found.io:9200/episodes/episode/_search?q=sub_cat_seo_url:english-news,region:1&sort=pubdate_timestamp:desc
http://34c512ba34534fffdfd12abfd69f2458.us-east-1.aws.found.io:9200/episodes/episode/_search?q=sub_cat_seo_url:english-news&region:1&sort=pubdate_timestamp:desc
According to documentation, it should work
See http://www.elasticsearch.org/guide/reference/query-dsl/query-string-query.html
That being said, you can also use the following:
http://34c512ba34534fffdfd12abfd69f2458.us-east-1.aws.found.io:9200/episodes/episode/_search?q=%2Bsub_cat_seo_url%3Aenglish-news+%2Bregion%3A1&sort=pubdate_timestamp:desc
NOTE:
With the existing mapping, your field "sub_cat_seo_url" is analyzed with the standard analyzer. Hence, when you search for "english-news", it is tokenized into "english" and "news", so any document matching either "english" or "news" is a valid match. For example, "telugu-news" is a valid match for your query. Not sure if that is intentional.
In your mapping you need to mark the field as "not_analyzed" for an exact match.
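The effect can be imitated in a few lines of Python. This is a very rough stand-in for the standard analyzer (lowercase, split on anything that is not a letter or digit), not the real tokenizer, and the function names are made up for the example.

```python
import re


def rough_tokens(text):
    """Rough approximation of analysis: lowercase and split on non-alphanumerics."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def analyzed_match(query, field_value):
    """True if any query token is among the field's tokens (OR semantics)."""
    return bool(rough_tokens(query) & rough_tokens(field_value))


def exact_match(query, field_value):
    """What a not_analyzed field gives you: the whole value must be identical."""
    return query == field_value


print(analyzed_match("english-news", "telugu-news"))  # True, via the shared token "news"
print(exact_match("english-news", "telugu-news"))     # False
```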
Note: %2B is decoded as '+', whereas '+' is decoded as ' '.
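The encoding in the working URL above can be reproduced with Python's standard library, which is a convenient way to check what a query string will look like after decoding. The query text is the one from that URL.

```python
from urllib.parse import quote_plus, unquote_plus

# Lucene query string: '+' marks each clause as required.
query = "+sub_cat_seo_url:english-news +region:1"

encoded = quote_plus(query)
print(encoded)                         # %2Bsub_cat_seo_url%3Aenglish-news+%2Bregion%3A1
print(unquote_plus(encoded) == query)  # True

# A literal '+' that is not percent-encoded is decoded as a space.
print(repr(unquote_plus("+region:1")))  # ' region:1'
```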
I have a use case which is a bit similar to the ES example of dynamic_template where I want certain strings to be analyzed and certain not.
My document fields don't have such a convention and the decision is made based on an external schema. So currently my flow is:
I grab the input document from the DB
I grab the appropriate schema (same database, currently using logstash for import)
I adjust the name in the document accordingly (using logstash's ruby mutator):
if not analyzed I don't change the name
if analyzed I change it to ORIGINALNAME_analyzed
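The renaming step in this flow can be sketched in Python as follows. It mirrors the logic only, not the actual logstash ruby code, and the function name and sample field names are hypothetical.

```python
def rename_for_indexing(doc, analyzed_fields, suffix="_analyzed"):
    """Append the suffix to every top-level field the schema marks as analyzed."""
    return {
        (name + suffix if name in analyzed_fields else name): value
        for name, value in doc.items()
    }


renamed = rename_for_indexing(
    {"title": "Some text", "sku": "AB-123"},
    analyzed_fields={"title"},
)
```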
This handles the analyzed/not_analyzed problem thanks to the dynamic_template I set, but now the user doesn't know which fields are analyzed, so there's no easy way for them to write queries because they don't know the name of the field.
I wanted to use field name aliases but apparently ES doesn't support them. Are there any other mechanisms I'm missing I could use here like field rename after indexation or something else?
For example this ancient thread mentions that field.sub.name can be queried as just name but I'm guessing this has changed when they disallowed . in the name some time ago since I cannot get it to work?
Let the user only create queries with the original name. I believe you have some code that converts this user query to Elasticsearch query. When converting to Elasticsearch query, instead of using the field name provided by the user alone use both the field names ORIGINALNAME as well as ORIGINALNAME_analyzed. If you are using a match query, convert it to multi_match. If you are using a term query, convert it to a bool should query. I guess you get where I am going with this.
Elasticsearch won't mind if a field does not exist. This can be a problem if there is already a field with _analyzed appended in its original name. But with some tricks that can be fixed too.
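A minimal Python sketch of that conversion, assuming the user query has already been parsed into a field name and a value. The function names are hypothetical; the query bodies use the standard multi_match and bool/should forms.

```python
def expand_match(field, text, suffix="_analyzed"):
    """Turn a match on `field` into a multi_match over both possible field names."""
    return {"multi_match": {"query": text, "fields": [field, field + suffix]}}


def expand_term(field, value, suffix="_analyzed"):
    """Turn a term query on `field` into a bool/should over both possible field names."""
    return {
        "bool": {
            "should": [
                {"term": {field: value}},
                {"term": {field + suffix: value}},
            ],
            "minimum_should_match": 1,
        }
    }


match_query = expand_match("title", "quick fox")
term_query = expand_term("sku", "AB-123")
```

Only one of the two field names exists for any given field, so the clause on the missing one simply matches nothing.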