Elasticsearch - how to bulk insert index + add index to alias? - elasticsearch

I have a script that creates indices which are generated periodically using the _bulk API. For example, it would create the following:
/foo_2017-06-01/bar/abcd
/foo_2017-05-31/bar/efgh
....
But I want an alias that covers all of these indices, including newer ones. For example, I want:
/foo --> [/foo_2017-06-01, /foo_2017-05-31, ...]
Is this possible? Or what can I do to achieve the same thing?

You can use Templates to do this:
https://www.elastic.co/guide/en/elasticsearch/reference/5.4/indices-templates.html
Templates let you apply settings to newly created indexes automatically. For example, the following template applies to every new index whose name starts with 'foo_' and adds each one to the foo alias.
PUT _template/template_1
{
"template" : "foo_*",
"settings" : {
"number_of_shards" : 1
},
"aliases" : {
"foo" : {}
}
}
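Since the question creates its indices through the _bulk API, a minimal Python sketch of building such a request body may help show where the template comes in: the first write to foo_2017-06-01 creates the index, and a template like the one above would attach the foo alias at that moment. The helper name bulk_body and the "message" field are made up for illustration; the "_type" entry matches the 5.x-era paths in the question and would likely need to be dropped on recent versions.

```python
import json


def bulk_body(index, doc_type, docs):
    """Return a newline-delimited _bulk body (hypothetical helper).

    docs maps a document id to its source dict. Every document takes two
    lines: an action line, then the source line.
    """
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The _bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Assumed sample document; only the index/type/id come from the question.
body = bulk_body("foo_2017-06-01", "bar", {"abcd": {"message": "hello"}})
print(body, end="")
```

The resulting text would be sent as the body of POST _bulk; nothing alias-related appears in it, because the template handles that on index creation.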

Related

comparing data between different mappings

I am relatively new to Elasticsearch, so I apologize if the terms are not accurate. I have a few indexes, plus a few almost identical indexes with fewer fields in the mapping.
(The original indexes contain data; the new ones with fewer fields are empty.)
How can I compare the data and insert the relevant documents into the new indexes with fewer fields?
For example, the original index mapping:
{
"first_name" : "Dana",
"last_name" : "Leon",
"birth_date" : "1990-01-09",
"social_media" : {
"facebook_id" : "K8426dN",
"google_id" : "8764873",
"linkedin_id" : "Gdna"
}
}
New mapping with fewer fields:
{
"first_name" : "Dana",
"last_name" : "Leon",
"social_media" : {
"facebook_id" : "K8426dN",
"google_id" : "8764873",
"linkedin_id" : "Gdna"
}
}
Thanks
You can use reindex by script:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
In the "script" you'll need to specify the fields that you want to remove, like:
ctx._source.remove("birth_date")
The second option is to use an ingest pipeline with the "remove" processor:
https://www.elastic.co/guide/en/elasticsearch/reference/current/remove-processor.html
You would then reindex with that pipeline defined as the default pipeline in the index settings, but this will be harder to implement.
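As a rough sketch of the first option, the request body for _reindex could be assembled like this in Python. The index names people_v1 and people_v2 and the helper reindex_body are hypothetical, and the script key is an assumption about your version: recent releases use "source", while 5.x releases used "inline", as far as I know.

```python
import json


def reindex_body(source_index, dest_index, fields_to_remove):
    """Build a _reindex body whose script drops the given fields (sketch)."""
    # One Painless statement per field, e.g. ctx._source.remove("birth_date");
    script = " ".join(
        "ctx._source.remove(%s);" % json.dumps(field) for field in fields_to_remove
    )
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {"lang": "painless", "source": script},
    }


# Hypothetical index names; birth_date is the field missing from the new mapping.
body = reindex_body("people_v1", "people_v2", ["birth_date"])
print(json.dumps(body, indent=2))  # send this as the body of POST _reindex
```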

Rolling indices in Elasticsearch

I see a lot of topics on how to create rolling indices in Elasticsearch using Logstash.
But is there a way to achieve the same thing, i.e. create indices on a daily basis in Elasticsearch, without Logstash?
I came across a post that suggests running a cron job to create the indices as the date rolls over, but that is a manual job I would have to maintain. I am looking for out-of-the-box options, if any are available in Elasticsearch.
Yes, use index templates (which is what Logstash uses internally to achieve the creation of rolling indices).
Simply create a template with a name pattern like this. Then every time you index a document into an index whose name matches that pattern and does not exist yet, ES will create the index for you with the template's settings and mappings:
curl -XPUT localhost:9200/_template/my_template -d '{
"template" : "logstash-*",
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"my_type" : {
"properties": {
...
}
}
}
}'
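To make the daily roll-over concrete without Logstash, here is a small Python sketch of the client side: derive the index name from the date and write to it, letting the template apply on first use. The helper daily_index_name is made up for illustration, the date format mirrors Logstash's default logstash-YYYY.MM.dd naming, and fnmatchcase is only a local approximation of how Elasticsearch matches a simple trailing-wildcard pattern.

```python
from datetime import date
from fnmatch import fnmatchcase


def daily_index_name(prefix, day):
    """Return an index name such as logstash-2017.06.01 (hypothetical helper)."""
    return prefix + day.strftime("%Y.%m.%d")


name = daily_index_name("logstash-", date(2017, 6, 1))
print(name)  # logstash-2017.06.01

# Local approximation of the template's "logstash-*" pattern check.
print(fnmatchcase(name, "logstash-*"))  # True
```

In a real indexer you would pass date.today() and index documents into the returned name, so no cron job is needed to pre-create anything.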

Query two indexes simultaneously in Kibana 4?

Whenever I create a visualization, Kibana 4 asks me to select the index for doing the search. My project requires searching data that is present in multiple indexes and hence I am stuck. I wish to search two indexes for my data and then visualize them. Any help would be valuable.
Kibana can create a visualization from multiple indexes, but the indexes (or their aliases) must have similar names. For example, you can pull data from the indexes logstash-2015-01-01 and logstash-2015-01-02 using the pattern logstash-*.
It would be handy, though, if we could write something like index1,another_index.
A solution that works in any case: create an alias in Elasticsearch for the indexes you want to query simultaneously and then use the alias as an index-pattern in Kibana.
In the Marvel plugin, through the Sense interface, you can create an alias for multiple indexes with this request:
POST _aliases
{
"actions" : [
{ "add" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}
Or using curl:
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions" : [
{ "add" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}'
Then, you just need to add an index-pattern in Kibana for "alias1" and create your visualizations.
For more information on aliases, see https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html
Thanks for all the help, but I figured out a way in which this can be done.
In Kibana 4, create an index pattern named _all. This index pattern covers all the indexes present in your Elasticsearch cluster. When you create a new visualization, simply select the _all index pattern, and the data fields from all the indexes become accessible for building visualizations.
If I understand what you are asking correctly, then it may depend on how you've named your indexes.
I can query multiple Logstash indexes by selecting my pattern 'logstash-*'. When you set up your indexes, Kibana gives you the option to specify a pattern.
(Settings => Indices => Index Pattern => Add New)
I hope that helps.
Two wildcards (i.e. *-*) work for me in Kibana 4.
I'm not sure I understand correctly, but I think your best option is to create that visualization separately on each of the two indexes, and then build a dashboard that includes both visualizations.
Kibana can't display a single visualization with searches from two separate indexes.

Why are Elasticsearch aliases not unique

The Elasticsearch documentation describes aliases as a feature for reindexing data with zero downtime:
1. Create a new index and index the whole data
2. Let your alias point to the new index
3. Delete the old index
This would be a great feature if aliases were unique, but one alias can point to multiple indexes. If the deletion of the old index fails, my application might talk to two indexes that are not in sync. Even worse, the application would not know about it.
Why is it possible to reuse an alias?
It allows you to easily have several indexes that are both used individually and together with other indexes. This is useful for example when having a logging index where sometimes you want to query the most recent (logs-recent alias) and sometimes want to query everything (logs alias). There are probably lots of other use cases but this one pops up as the first for me.
As per the documentation you can send both the remove and add in one request:
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions" : [
{ "remove" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}'
After that succeeds you can remove your old index, and if that fails you will just have an extra index taking up some space until it's cleaned out.
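The same swap can be sketched from Python with only the standard library. The helper names swap_alias_actions and post_json and the localhost URL are assumptions for illustration; post_json is defined but not called here, since it needs a running cluster.

```python
import json
import urllib.request


def swap_alias_actions(alias, old_index, new_index):
    """Build one _aliases body that removes and adds in a single request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def post_json(url, payload):
    """POST payload as JSON and return the decoded reply (needs a live cluster)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


body = swap_alias_actions("alias1", "test1", "test2")
print(json.dumps(body))
# Assumed local node: post_json("http://localhost:9200/_aliases", body)
```

Because both actions travel in one request, the alias should never point at both indexes, or at neither, in between.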

Updating the default index number_of_replicas setting for new indices

I've tried updating the number of replicas as follows, according to the documentation:
curl -XPUT 'localhost:9200/_settings' -d '
{ "index" : { "number_of_replicas" : 4 } }'
This correctly changes the replica count for existing indices. However, when Logstash creates a new index the following day, number_of_replicas is set to the old value.
Is there a way to permanently change the default value for this setting without updating all the elasticsearch.yml files in the cluster and restarting the services?
FWIW I've also tried
curl -XPUT 'localhost:9200/logstash-*/_settings' -d '
{ "index" : { "number_of_replicas" : 4 } }'
to no avail.
Yes, you can use index templates. Index templates are a great way to set default settings (including mappings) for new indices created in a cluster.
Index Templates
Index templates allow to define templates that will automatically be
applied to new indices created. The templates include both settings
and mappings, and a simple pattern template that controls if the
template will be applied to the index created.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html
For your example:
curl -XPUT 'localhost:9200/_template/logstash_template' -d '
{
"template" : "logstash-*",
"settings" : {"number_of_replicas" : 4 }
} '
This will set the default number of replicas to 4 for all new indexes that match the name "logstash-*". Note that this will not change existing indexes, only newly created ones.
