"index": "not_analyzed" in elasticsearch - elasticsearch

I have deleted the mapping with this command:
curl -XDELETE 'http://localhost:9200/logstash_log*/'
In my conf, I have defined the index as follows:
output {
elasticsearch {
hosts => localhost
index => "logstash_log-%{+YYYY.MM.dd}"
}
}
and tried to create a new mapping, but I got this error:
#curl -XPUT http://localhost:9200/logstash_log*/_mapping/log -d '
{
"properties":{
"#timestamp":"type":"date","format":"strict_date_optional_time||epoch_millis"},
"message":{"type":"string"},
"host":{"type":"ip"},
"name":{"type":"string","index": "not_analyzed"},
"type":{"type":"string"}
}
}'
{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"logstash_log*","index":"logstash_log*"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"logstash_log*","index":"logstash_log*"},"status":404}
How can I fix it?
Any help will be appreciated!

You need to re-create your index like this:
# curl -XPUT http://localhost:9200/logstash_log -d '{
"mappings": {
"log": {
"properties": {
"#timestamp": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"message": {
"type": "string"
},
"host": {
"type": "ip"
},
"name": {
"type": "string",
"index": "not_analyzed"
},
"type": {
"type": "string"
}
}
}
}
}'
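If you want to double-check that the mapping was applied (a quick sanity check, not part of the original answer), you can fetch it back:
curl -XGET 'http://localhost:9200/logstash_log/_mapping?pretty'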
Since it looks like you're creating daily indices from Logstash, though, you're probably better off creating a template instead. Store the following content inside index_template.json:
{
"template": "logstash-*",
"mappings": {
"log": {
"properties": {
"#timestamp": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"message": {
"type": "string"
},
"host": {
"type": "ip"
},
"name": {
"type": "string",
"index": "not_analyzed"
},
"type": {
"type": "string"
}
}
}
}
}
And then modify your logstash configuration like this:
output {
elasticsearch {
hosts => localhost
index => "logstash_log-%{+YYYY.MM.dd}"
manage_template => true
template_name => "logstash"
template => "/path/to/index_template.json"
template_overwrite => true
}
}

* is an invalid character in an index name.
Index names must not contain any of the following characters: \, /, *, ?, ", <, >, |, space, ,
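As a minimal illustration (the date below is only an example), creating an index with a literal name works, while a name containing * is rejected:
# works: concrete index name
curl -XPUT 'http://localhost:9200/logstash_log-2016.01.16'
# rejected: wildcard in the index name
curl -XPUT 'http://localhost:9200/logstash_log*'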

Related

ElasticSearch Logstash JDBC: How to aggregate into different column names

I am new to Elasticsearch and I am trying to use Logstash to load data into an index. The following is part of my Logstash config:
filter {
aggregate {
task_id => "%{code}"
code => "
map['campaignId'] = event.get('CAM_ID')
map['country'] = event.get('COUNTRY')
map['countryName'] = event.get('COUNTRYNAME')
# etc
"
push_previous_map_as_event => true
timeout => 5
}
}
output {
elasticsearch {
document_id => "%{code}"
document_type => "company"
index => "company_v1"
codec => "json"
hosts => ["127.0.0.1:9200"]
}
}
I was expecting the aggregation to map, for instance, the column 'CAM_ID' into a property named 'campaignId' in the Elasticsearch index. Instead, it is creating a property named 'cam_id', which is the column name in lowercase. The same happens with the rest of the properties.
The following is the index mapping after Logstash has been executed:
{
"company_v1": {
"aliases": {
},
"mappings": {
"company": {
"properties": {
"#timestamp": {
"type": "date"
},
"#version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"cam_id": {
"type": "long"
},
"campaignId": {
"type": "long"
},
"cam_type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"campaignType": {
"type": "text"
}
}
}
},
"settings": {
"index": {
"creation_date": "1545905435871",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "Dz0x16ohQWWpuhtCB3Y4Vw",
"version": {
"created": "6050399"
},
"provided_name": "company_v1"
}
}
}
}
'campaignId' and 'campaignType' were created by me when I created the index, but Logstash created the other two.
Can someone explain how to configure Logstash so that I can customize the document property names when the data is being loaded?
Thank you very much.
Best regards
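As a hedged sketch only (the approach and field names below are assumptions based on the question, not a confirmed fix): one common way to control property names in Logstash is to rename the incoming fields with a mutate filter before they reach Elasticsearch, for example:
filter {
mutate {
# hypothetical renames; assumes the JDBC input emits lowercase column names
rename => {
"cam_id" => "campaignId"
"cam_type" => "campaignType"
}
}
}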

How do I alter the schema without destroying data in elasticsearch?

This is my current schema
{
"mappings": {
"historical_data": {
"properties": {
"continent": {
"type": "string",
"index": "not_analyzed"
},
"country": {
"type": "string",
"index": "not_analyzed"
},
"description": {
"type": "string"
},
"funding": {
"type": "long"
},
"year": {
"type": "integer"
},
"agency": {
"type": "string"
},
"misc": {
"type": "string"
},
"university": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
I have 700k records uploaded. Without destroying the data, how can I change the university field so that it is no longer "not_analyzed", in such a way that the change is reflected in my existing data?
The mapping for an existing field cannot be modified.
However, you can achieve the desired outcome in two ways.
Create another field. Adding fields is free using the PUT _mapping API:
curl -XPUT localhost:9200/YOUR_INDEX/_mapping/historical_data -d '{
"properties": {
"new_university": {
"type": "string"
}
}
}'
Use multi-fields: add a sub-field to your not_analyzed field.
curl -XPUT localhost:9200/YOUR_INDEX/_mapping/historical_data -d '{
"properties": {
"university": {
"type": "string",
"index": "not_analyzed",
"fields": {
"university_analyzed": {
"type": "string" // <-- ANALYZED sub field
}
}
}
}
}'
In both cases, you need to reindex in order to populate the new field. Use the _reindex API:
curl -XPOST localhost:9200/_reindex -d '{
"source": {
"index": "YOUR_INDEX"
},
"dest": {
"index": "YOUR_INDEX"
},
"script": {
"inline": "ctx._source.university = ctx._source.university"
}
}'
You are not exactly forced to "destroy" your data; what you can do is reindex your data as described in this article (I'm not going to rip off the examples, as they are particularly clear in the section "Reindexing your data with zero downtime").
For reindexing, you can also take a look at the reindex API, the simplest form being:
POST _reindex
{
"source": {
"index": "twitter"
},
"dest": {
"index": "new_twitter"
}
}
Of course it will take some resources to perform this operation, so I would suggest that you take a complete look at the changes you want to introduce in your mapping, and perform the operation when you have the least amount of activity on your servers (e.g. during the weekend, or at night...)
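As a rough sketch of the zero-downtime pattern mentioned above (the index and alias names are placeholders, not taken from the question): create a new index with the corrected mapping, reindex into it, and then switch an alias in a single atomic step so that clients never notice the change:
# create the new index with the desired mapping (mapping body omitted)
curl -XPUT 'localhost:9200/my_index_v2' -d '{"mappings": { ... }}'
# copy the data over
curl -XPOST 'localhost:9200/_reindex' -d '{
"source": { "index": "my_index_v1" },
"dest": { "index": "my_index_v2" }
}'
# atomically repoint the alias used by the application
curl -XPOST 'localhost:9200/_aliases' -d '{
"actions": [
{ "remove": { "index": "my_index_v1", "alias": "my_index" } },
{ "add": { "index": "my_index_v2", "alias": "my_index" } }
]
}'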

ElasticSearch create an index with dynamic properties

Is it possible to create an index while restricting indexing of a parent property?
For example,
$ curl -XPOST 'http://localhost:9200/actions/action/' -d '{
"user": "kimchy",
"message": "trying out Elasticsearch",
"actionHistory": [
{ "timestamp": 123456789, "action": "foo" },
{ "timestamp": 123456790, "action": "bar" },
{ "timestamp": 123456791, "action": "buz" },
...
]
}'
I don't want actionHistory to be indexed at all. How can this be done?
For the above document, I believe the index would be created as follows:
$ curl -XPOST localhost:9200/actions -d '{
"settings": {
"number_of_shards": 1
},
"mappings": {
"action": {
"properties" : {
"user": { "type": "string", "index" : "analyzed" },
"message": { "type": "string": "index": "analyzed" },
"actionHistory": {
"properties": {
"timestamp": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"action": { "type": "string", "index": "analyzed" }
}
}
}
}
}
}'
Would removing properties from actionHistory and replacing it with "index": "no" be the proper solution?
This is an example; however, my actual situation involves documents with dynamic properties (i.e. actionHistory contains various custom, non-repeating properties across all documents), and my mapping definition for this particular type has over 2000 different properties, making searches extremely slow (i.e. worse than full-text search from the database).
You can probably get away with using dynamic templates: match all actionHistory sub-fields and set "index": "no" for all of them.
PUT actions
{
"mappings": {
"action": {
"dynamic_templates": [
{
"actionHistoryRule": {
"path_match": "actionHistory.*",
"mapping": {
"type": "{dynamic_type}",
"index": "no"
}
}
}
]
}
}
}
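Another hedged option (not part of the answer above, though supported by Elasticsearch for object fields) is to disable the actionHistory object entirely, so its content is kept in _source but never parsed or indexed:
PUT actions
{
"mappings": {
"action": {
"properties": {
"actionHistory": {
"type": "object",
"enabled": false
}
}
}
}
}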

Unable to see data in Kibana 4 on Ubuntu

I am trying to visualize my data file using Kibana.
The format of my file is as follows:
timeStamp;elapsed;label;responseCode;responseMessage;threadName;success;failureMessage;bytes;grpThreads;allThreads;Latency;SampleCount;ErrorCount;Hostname
2016-01-16 02:27:17,565;912;HTTP Request;200;OK;Thread Group 1-5;true;;78854;10;10;384;1;0;sundeep-Latitude-E6440
timeStamp;elapsed;label;responseCode;responseMessage;threadName;success;failureMessage;bytes;grpThreads;allThreads;Latency;SampleCount;ErrorCount;Hostname
2016-01-16 02:27:17,565;912;HTTP Request;200;OK;Thread Group 1-5;true;;78854;10;10;384;1;0;sundeep-Latitude-E6440
To map the above data, my logstash config is as follows:
input {
file {
path => [ "/home/sundeep/data/test.csv"]
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
if ([message] =~ "responseCode") {
drop { }
} else {
csv {
separator => ";"
columns => ["timeStamp", "elapsed", "label", "responseCode","responseMessage","threadName",
"success","failureMessage", "bytes", "grpThreads", "allThreads", "Latency",
"SampleCount", "ErrorCount", "Hostname"]
}
}
}
output {
elasticsearch { hosts => ["localhost:9200"]
index => "aa-%{+yyyy-MM-dd}"
}
}
The template file is as follows:
{
"template": "aa-*",
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"index.refresh_interval": "5s"
},
"mappings": {
"logs": {
"properties": {
"timeStamp": {
"index": "analyzed",
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss,SSS"
},
"elapsed": {
"type": "long"
},
"dummyfield": {
"type": "long"
},
"label": {
"type": "string"
},
"responseCode": {
"type": "integer"
},
"threadName": {
"type": "string"
},
"success": {
"type": "boolean"
},
"failureMessage":{
"type": "string"
},
"bytes": {
"type": "long"
},
"grpThreads": {
"type": "long"
},
"allThreads": {
"type": "long"
},
"Latency": {
"type": "long"
},
"SampleCount": {
"type": "long"
},
"ErrorCount": {
"type": "long"
},
"Hostname": {
"type": "string"
}
}
}
}
}
Now, as you can see, a new index is created in Elasticsearch as soon as I start Logstash with the config file.
The newly created index matches aa-*, which is expected.
Now, I search for the index in Kibana and I can see it listed.
However, I cannot see any data when I try to plot a line chart.
Things which I have tried:
Deleting the index from Sense and then creating it again via Sense (did not work)
Changing the timestamp of the log file; did not work, as the import was successful
Tried the solution from a similar question
Also, I was able to visualize another dataset from a blog post
Trace log:
[2016-01-16 02:45:41,105][INFO ][cluster.metadata ] [Hulk 2099] [aa-2016-01-15] deleting index
[2016-01-16 02:46:01,370][INFO ][cluster.metadata ] [Hulk 2099] [aa-2016-01-15] creating index, cause [auto(bulk api)], templates [aa], shards 1/[0], mappings [logs]
[2016-01-16 02:46:01,451][INFO ][cluster.metadata ] [Hulk 2099] [aa-2016-01-15] update_mapping [logs]
ELK Stack
ElasticSearch - 2.1
Logstash - 2.1
Kibana - 4.3.1.1
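As a hedged diagnostic only (not something attempted above), it can help to confirm that documents actually reached the index and to look at the stored timestamps directly, since Kibana only displays documents whose time field falls inside the selected time range:
curl -XGET 'http://localhost:9200/aa-*/_search?size=1&pretty'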

Default elasticsearch configuration for docker container

What is the best way to configure an ES index template with mappings in a Docker container? I expected to use a template file, but it seems that from version 2 this is no longer possible. Sending an HTTP request also won't work, because at container creation the process isn't started yet. It could be done on each container launch with a script that starts ES and then sends the HTTP request to it, but that looks really ugly.
You can configure a template with mappings by executing an HTTP PUT request in a Linux terminal, as follows:
curl -XPUT http://ip:port/_template/logstash -d '
{
"template": "logstash-*",
"settings": {
"number_of_replicas": 1,
"number_of_shards": 8
},
"mappings": {
"_default_": {
"_all": {
"store": false
},
"_source": {
"enabled": true,
"compress": true
},
"properties": {
"_id": {
"index": "not_analyzed",
"type": "string"
},
"_type": {
"index": "not_analyzed",
"type": "string"
},
"field1": {
"index": "not_analyzed",
"type": "string"
},
"field2": {
"type": "double"
},
"field3": {
"type": "integer"
},
"xy": {
"properties": {
"x": {
"type": "double"
},
"y": {
"type": "double"
}
}
}
}
}
}
}
'
The "logstash-*" is your index name, you can have a try.
If you are using Logstash, you can make the template part of your Logstash pipeline config.
pipeline/logstash.conf
input {
...
}
filter {
...
}
output {
elasticsearch {
hosts => "elasticsearch:9200"
template => "/usr/share/logstash/templates/logstash.template.json"
template_name => "logstash"
template_overwrite => true
index => "logstash-%{+YYYY.MM.dd}"
}
}
Reference: https://www.elastic.co/guide/en/logstash/6.1/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-template
