ElasticSearch Logstash JDBC: How to aggregate into different column names

I am new to Elasticsearch and I am trying to use Logstash to load data into an index. Following is part of my Logstash config:
filter {
  aggregate {
    task_id => "%{code}"
    code => "
      map['campaignId'] = event.get('CAM_ID')
      map['country'] = event.get('COUNTRY')
      map['countryName'] = event.get('COUNTRYNAME')
      # etc
    "
    push_previous_map_as_event => true
    timeout => 5
  }
}
output {
  elasticsearch {
    document_id => "%{code}"
    document_type => "company"
    index => "company_v1"
    codec => "json"
    hosts => ["127.0.0.1:9200"]
  }
}
I was expecting the aggregation to map, for instance, the column 'CAM_ID' into a property named 'campaignId' in the Elasticsearch index. Instead, it is creating a property named 'cam_id', which is the column name in lowercase. The same happens with the rest of the properties.
Following is the index definition after Logstash has executed:
{
"company_v1": {
"aliases": {
},
"mappings": {
"company": {
"properties": {
"#timestamp": {
"type": "date"
},
"#version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"cam_id": {
"type": "long"
},
"campaignId": {
"type": "long"
},
"cam_type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"campaignType": {
"type": "text"
}
}
}
},
"settings": {
"index": {
"creation_date": "1545905435871",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "Dz0x16ohQWWpuhtCB3Y4Vw",
"version": {
"created": "6050399"
},
"provided_name": "company_v1"
}
}
}
}
'campaignId' and 'campaignType' were created by me when I created the index, but Logstash created the other two.
Can someone explain to me how to configure Logstash to customize the property names of the index documents when data is being loaded?
Thank you very much.
Best Regards
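For what it is worth, a minimal sketch of one possible direction, not a confirmed fix: the Logstash JDBC input lowercases column names by default, so event.get('CAM_ID') may never see the original column name, and the untouched lowercase fields end up in the indexed document. Either disable that lowercasing or rename the fields explicitly; the field names below come from the question, everything else is an assumption.
input {
  jdbc {
    # ... connection settings elided ...
    # Option 1 (assumption): keep the original column case so that
    # event.get('CAM_ID') in the aggregate filter actually finds a value.
    lowercase_column_names => false
  }
}
filter {
  # Option 2 (assumption): keep the default lowercasing and rename
  # the lowercase fields to the desired property names instead.
  mutate {
    rename => {
      "cam_id"      => "campaignId"
      "countryname" => "countryName"
    }
  }
}
With the second option, the aggregate filter would read event.get('campaignId') rather than event.get('CAM_ID').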

Related

Elasticsearch: How to define Contexts property of nested completion field?

I've got the following mapping for an ES index (I'm not including config for analyzer and other things):
{
"mappings": {
"properties": {
"topCustomer": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "autocomplete_search",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"topCustomer_suggest": {
"type": "completion",
"contexts": [
{
"name": "index_name",
"type": "category"
}
]
},
"customer": {
"type": "nested",
"include_in_root": "true",
"properties": {
"customer_name": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "autocomplete_search",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
},
"customer_name_suggest": {
"type": "completion",
"contexts": [
{
"name": "index_name",
"type": "category"
}
]
}
}
},
"customer_level": {
"type": "integer"
}
}
}
}
}
}
Also, I have the following logstash configuration file:
input {
  jdbc {
    # Input config elided
  }
}
filter {
  mutate {
    remove_field => ["@version"]
  }
  ruby {
    code => "
      input = event.get('topCustomer').strip.gsub(/[\(\)]+/, '').split(/[\s\/\-,]+/);
      event.set('[topCustomer_suggest][input]', input);
      contexts = { 'index_name' => [event.get('type')] };
      event.set('[topCustomer_suggest][contexts]', contexts);
      input = event.get('[customer][cutomer_name]').strip.gsub(/[\(\)]+/, '').split(/[\s\/\-,]+/);
      event.set('[customer][customer_name][fields][customer_name_suggest][input]', input);
      contexts = { 'index_name' => [event.get('type')] };
      event.set('[customer][customer_name][fields][customer_name_suggest][contexts]', contexts);
    "
  }
}
output {
  elasticsearch {
    index => "%{type}"
    manage_template => false
    hosts => ["localhost:9200"]
  }
}
Now, when I try to refresh my index, to apply the changes that I made to one of these files, I get the following error:
Could not index event to Elasticsearch ...
:response=>{"index"=>{"index"=>"customers", "_type"=>"_doc",
"_id"=>"...", "status"=>400,
"error"=>{"type"=>"illegal_argument_exception", "reason"=>"Contexts
are mandatory in context enabled completion field
[customer.customer_name.customer_name_suggest]"}}}}
I tried to modify my config file so that the set events (in the ruby filter section) match the format that the error displays to access the field; I also tried many more combinations to see if this was causing the error.
As you can see, I defined another completion field in the mapping. This field works as expected. The difference is that this is not a nested field.
Notice that the customer_name_suggest is a sub-field and not an 'independent' field like the topCustomer_suggest field. Is this the correct way of doing it or should I not make customer_name_suggest a sub field? I really don't understand why I'm getting the error as I'm defining the contexts property in the mapping.
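One possible way out, offered as an assumption rather than a confirmed fix: a completion field declared under fields is a multi-field, so its value is copied from customer_name itself and there is no document path through which the mandatory contexts could be supplied. Declaring the suggester as its own property inside the nested customer object, for example as follows, would let the ruby filter set both input and contexts on it directly:
"customer": {
  "type": "nested",
  "include_in_root": "true",
  "properties": {
    "customer_name": {
      "type": "text",
      "analyzer": "autocomplete",
      "search_analyzer": "autocomplete_search"
    },
    "customer_name_suggest": {
      "type": "completion",
      "contexts": [
        { "name": "index_name", "type": "category" }
      ]
    },
    "customer_level": {
      "type": "integer"
    }
  }
}
The ruby filter would then target '[customer][customer_name_suggest][input]' and '[customer][customer_name_suggest][contexts]' instead of the fields path used in the question.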

Sorting in elastic search using new java api

I am using the latest Java API client for communication with the Elasticsearch server.
I need to retrieve data in sorted order.
SortOptions sort = new SortOptions.Builder().field(f -> f.field("customer.keyword").order(SortOrder.Asc)).build();
List<SortOptions> list = new ArrayList<SortOptions>();
list.add(sort);
SearchResponse<Order> response = elasticsearchClient.search(b -> b.index("order").size(100).sort(list)
.query(q -> q.bool(bq -> bq
.filter(fb -> fb.range(r -> r.field("orderTime").
gte(JsonData.of(timeStamp("01-01-2022-01-01-01")))
.lte(JsonData.of(timeStamp("01-01-2022-01-01-10")))
)
)
// .must(query)
)), Order.class);
I have written the above code to get search results sorted by customer.
I am getting the below error when I run the program.
Exception in thread "main" co.elastic.clients.elasticsearch._types.ElasticsearchException: [es/search] failed: [search_phase_execution_exception] all shards failed
at co.elastic.clients.transport.rest_client.RestClientTransport.getHighLevelResponse(RestClientTransport.java:281)
at co.elastic.clients.transport.rest_client.RestClientTransport.performRequest(RestClientTransport.java:147)
at co.elastic.clients.elasticsearch.ElasticsearchClient.search(ElasticsearchClient.java:1487)
at co.elastic.clients.elasticsearch.ElasticsearchClient.search(ElasticsearchClient.java:1504)
at model.OrderDAO.fetchRecordsQuery(OrderDAO.java:128)
Code runs fine if I remove .sort() method.
My index is configured in the following format.
{
"order": {
"aliases": {},
"mappings": {
"properties": {
"customer": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"orderId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"orderTime": {
"type": "long"
},
"orderType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"settings": {
"index": {
"routing": {
"allocation": {
"include": {
"_tier_preference": "data_content"
}
}
},
"number_of_shards": "1",
"provided_name": "order",
"creation_date": "1652783550822",
"number_of_replicas": "1",
"uuid": "mrAj8ZT-SKqC43-UZAB-Jw",
"version": {
"created": "8010299"
}
}
}
}
}
Please let me know what is wrong here; also, if possible, please send me the correct syntax for using sort() in the new Java API.
Thanks a lot.
As you have confirmed in the comments, customer is a text-type field, and this is the reason you are getting the above error: sorting cannot be applied to a text field.
Your index should be configured like the below for the customer field so that sorting can be applied:
{
"mappings": {
"properties": {
"customer": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
Once you have an index mapping like the above, you can use customer.keyword as the field name for sorting and customer as the field name for free-text search.
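As a hedged sketch of what that looks like with the new Java client (the index name, field name, and the Order document class are taken from the question; the helper method itself is illustrative, not a prescribed pattern):
import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch._types.SortOrder;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import java.io.IOException;

public class SortedOrderSearch {
    // Order is the document class from the question.
    static SearchResponse<Order> fetchSortedByCustomer(ElasticsearchClient client) throws IOException {
        return client.search(s -> s
                .index("order")
                .size(100)
                // Sort on the keyword sub-field, not the analyzed text field.
                .sort(so -> so.field(f -> f.field("customer.keyword").order(SortOrder.Asc))),
            Order.class);
    }
}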

Logstash. Nested object mapping with ruby script

I have a problem with nested object mapping using the ruby filter plugin.
My object should have a field cmds, which is an array of objects like this:
"cmds": [
{
"number": 91,
"errors": [],
"errors_count": 0
},
{
"number": 92,
"errors": ["ERROR_1"],
"errors_count": 1
}]
In Elasticsearch I need to find objects where number = 91 and errors_count > 0, so the object above shouldn't be a correct result. But my query (below) matches it.
GET /logs/default/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "cmds.number": 91
          }
        },
        {
          "range": {
            "cmds.errors_count": {
              "gt": 0
            }
          }
        }
      ]
    }
  }
}
I know this happens because the JSON document is flattened into a simple key-value format, and I should map the cmds field as type nested instead of type object.
The problem is I have no idea how to do that in my Logstash ruby script with event.set.
I have folowing code:
for t in commandTexts do
commandv = Command.new(t)
if i==0
event.set("[cmds]", ["[number]" => commandv.hexnumber,
"[command_text]" => commandv.command_text,
"[errors]" => commandv.errors,
"[has_error]" => commandv.has_error,
"[errors_count]" => commandv.errors_count])
else
event.set("[cmds]", event.get("cmds") + ["[number]" => commandv.hexnumber,
"[command_text]" => commandv.command_text,
"[errors]" => commandv.errors,
"[has_error]" => commandv.has_error,
"[errors_count]" => commandv.errors_count])
end
i+=1
end
end
I'm new to Ruby and my code is not perfect, but the "cmds" field looks fine in Elasticsearch. The only problem is that it is not nested. Please help.
OK, I did it. I'm still new to ELK, and I'm sometimes confused about where (Logstash/Kibana/Ruby scripts) I should do what is needed.
My code is okay. Using Kibana, I deleted my index and made a new one with the correct mapping:
PUT /logs?pretty
{
"mappings": {"default": {
"properties": {
"cmds" : {
"type" : "nested",
"properties": {
"command_text": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"errors": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"errors_count": {
"type": "long"
},
"has_error": {
"type": "boolean"
},
"number": {
"type": "long"
}
}
}
}
}}
}
Earlier I was trying to create the new index just by setting "type" to "nested":
PUT /logs?pretty
{
"mappings": {"default": {
"properties": {
"cmds" : {
"type" : "nested"
}
}
}}}
But it wasn't working correctly (the "cmds" field was not added to Elasticsearch), so I did it with the full mapping (all properties).
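As a follow-up sketch, not part of the original answer: with cmds mapped as nested, the query from the question presumably also needs to wrap both conditions in a nested query so they are evaluated against the same array element. Assuming the same index and field names:
GET /logs/default/_search
{
  "query": {
    "nested": {
      "path": "cmds",
      "query": {
        "bool": {
          "must": [
            { "match": { "cmds.number": 91 } },
            { "range": { "cmds.errors_count": { "gt": 0 } } }
          ]
        }
      }
    }
  }
}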

Unable to see data in Kibana 4 on Ubuntu

I am trying to visualize my data file using Kibana.
The format of my file is as follows:
timeStamp;elapsed;label;responseCode;responseMessage;threadName;success;failureMessage;bytes;grpThreads;allThreads;Latency;SampleCount;ErrorCount;Hostname
2016-01-16 02:27:17,565;912;HTTP Request;200;OK;Thread Group 1-5;true;;78854;10;10;384;1;0;sundeep-Latitude-E6440
timeStamp;elapsed;label;responseCode;responseMessage;threadName;success;failureMessage;bytes;grpThreads;allThreads;Latency;SampleCount;ErrorCount;Hostname
2016-01-16 02:27:17,565;912;HTTP Request;200;OK;Thread Group 1-5;true;;78854;10;10;384;1;0;sundeep-Latitude-E6440
To map the above data, my logstash config is as follows:
input {
file {
path => [ "/home/sundeep/data/test.csv"]
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
if ([message] =~ "responseCode") {
drop { }
} else {
csv {
separator => ";"
columns => ["timeStamp", "elapsed", "label", "responseCode","responseMessage","threadName",
"success","failureMessage", "bytes", "grpThreads", "allThreads", "Latency",
"SampleCount", "ErrorCount", "Hostname"]
}
}
}
output {
elasticsearch { hosts => ["localhost:9200"]
index => "aa-%{+yyyy-MM-dd}"
}
}
The template file is as follows:
{
"template": "aa-*",
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"index.refresh_interval": "5s"
},
"mappings": {
"logs": {
"properties": {
"timeStamp": {
"index": "analyzed",
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss,SSS"
},
"elapsed": {
"type": "long"
},
"dummyfield": {
"type": "long"
},
"label": {
"type": "string"
},
"responseCode": {
"type": "integer"
},
"threadName": {
"type": "string"
},
"success": {
"type": "boolean"
},
"failureMessage":{
"type": "string"
},
"bytes": {
"type": "long"
},
"grpThreads": {
"type": "long"
},
"allThreads": {
"type": "long"
},
"Latency": {
"type": "long"
},
"SampleCount": {
"type": "long"
},
"ErrorCount": {
"type": "long"
},
"Hostname": {
"type": "string"
}
}
}
}
}
Now, as you can see, a new index is created in Elasticsearch as soon as I start Logstash with the config file.
The newly created index name starts with aa-, which is expected.
Now, I search for the index in Kibana and I can see it listed (screenshot omitted).
However, I cannot see any data when I try to plot a line chart.
Things which I have tried:
Deleting the index from Sense and then creating it again via Sense (did not work).
Changing the timestamp of the log file did not work, even though the import was successful.
Tried the solution from a similar question.
Also, I was able to visualize another dataset from a blog post.
Trace log:
[2016-01-16 02:45:41,105][INFO ][cluster.metadata ] [Hulk 2099] [aa-2016-01-15] deleting index
[2016-01-16 02:46:01,370][INFO ][cluster.metadata ] [Hulk 2099] [aa-2016-01-15] creating index, cause [auto(bulk api)], templates [aa], shards 1/[0], mappings [logs]
[2016-01-16 02:46:01,451][INFO ][cluster.metadata ] [Hulk 2099] [aa-2016-01-15] update_mapping [logs]
ELK Stack
ElasticSearch - 2.1
Logstash - 2.1
Kibana - 4.3.1.1
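One thing commonly checked in this kind of setup, offered only as an assumption rather than a confirmed fix for this question: without a date filter, Logstash stamps events with the ingestion time instead of the timeStamp column, so the Kibana time picker may not line up with the data. A minimal sketch using the format from the sample file:
filter {
  # Hedged sketch: parse the JMeter timeStamp into @timestamp so the
  # Kibana time picker matches the data (assumption, not a confirmed fix).
  date {
    match => ["timeStamp", "yyyy-MM-dd HH:mm:ss,SSS"]
    target => "@timestamp"
  }
}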

Default elasticsearch configuration for docker container

What is the best way to configure an ES index template with mappings in a Docker container? I expected to use a template file, but it seems that from version 2 this is not possible. Executing an HTTP request also won't work, because the process is not running during container creation. It could be done on each container launch with a script that starts ES and then executes an HTTP request against it, but that looks really ugly.
You can configure a template with mappings by executing an HTTP PUT request from a Linux terminal, as follows:
curl -XPUT http://ip:port/_template/logstash -d '
{
"template": "logstash-*",
"settings": {
"number_of_replicas": 1,
"number_of_shards": 8
},
"mappings": {
"_default_": {
"_all": {
"store": false
},
"_source": {
"enabled": true,
"compress": true
},
"properties": {
"_id": {
"index": "not_analyzed",
"type": "string"
},
"_type": {
"index": "not_analyzed",
"type": "string"
},
"field1": {
"index": "not_analyzed",
"type": "string"
},
"field2": {
"type": "double"
},
"field3": {
"type": "integer"
},
"xy": {
"properties": {
"x": {
"type": "double"
},
"y": {
"type": "double"
}
}
}
}
}
}
}
'
The "logstash-*" is your index name, you can have a try.
If you are using Logstash, you can make the template part of your Logstash pipeline config.
pipeline/logstash.conf
input {
...
}
filter {
...
}
output {
elasticsearch {
hosts => "elasticsearch:9200"
template => "/usr/share/logstash/templates/logstash.template.json"
template_name => "logstash"
template_overwrite => true
index => "logstash-%{+YYYY.MM.dd}"
}
}
Reference: https://www.elastic.co/guide/en/logstash/6.1/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-template
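To get that template file into the container in the first place, a docker-compose sketch along these lines may help; the image tag and mount paths are illustrative assumptions, with /usr/share/logstash/pipeline being the default pipeline directory of the official Logstash image:
services:
  logstash:
    image: docker.elastic.co/logstash/logstash:6.1.3
    volumes:
      # Pipeline config and the index template referenced by the output above.
      - ./pipeline/logstash.conf:/usr/share/logstash/pipeline/logstash.conf:ro
      - ./templates/logstash.template.json:/usr/share/logstash/templates/logstash.template.json:ro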
