Comparing data between different mappings - Elasticsearch

I am relatively new to Elasticsearch, so I apologize if the terms are not accurate. I have a few indexes, plus a few almost identical indexes that have fewer fields in their mappings.
(The original indexes contain data; the new ones with fewer fields are empty.)
How can I compare the data and insert the relevant documents into the new indexes with fewer fields?
For example, the original index mapping:
{
  "first_name" : "Dana",
  "last_name" : "Leon",
  "birth_date" : "1990-01-09",
  "social_media" : {
    "facebook_id" : "K8426dN",
    "google_id" : "8764873",
    "linkedin_id" : "Gdna"
  }
}
New mapping with fewer fields:
{
  "first_name" : "Dana",
  "last_name" : "Leon",
  "social_media" : {
    "facebook_id" : "K8426dN",
    "google_id" : "8764873",
    "linkedin_id" : "Gdna"
  }
}
Thanks

You can use the Reindex API with a script:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
In the "script" you need to remove the fields that you do not want, for example:
ctx._source.remove("birth_date")
The second option is to use an ingest pipeline with the "remove" processor:
https://www.elastic.co/guide/en/elasticsearch/reference/current/remove-processor.html
Set that pipeline as the default pipeline in the new index's settings (index.default_pipeline) and then reindex, but this is harder to implement.
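Both options can be sketched as request bodies. The Python below only builds the JSON you would send to POST _reindex and to PUT _ingest/pipeline/<name>; it does not contact a cluster. The helper names and the index names "people_v1" and "people_v2" are assumptions made for illustration, and the script builder assumes plain top-level field names without quotes in them.

```python
import json


def build_reindex_body(source_index, dest_index, fields_to_remove):
    """Build a _reindex request body that drops the given top-level fields."""
    # One Painless statement per field; ctx._source is the document being copied.
    script = " ".join(
        'ctx._source.remove("{}");'.format(name) for name in fields_to_remove
    )
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {"lang": "painless", "source": script},
    }


def build_remove_pipeline(fields_to_remove):
    """Build an ingest pipeline definition that uses the "remove" processor."""
    return {
        "description": "Drop fields that the new mapping does not have",
        "processors": [
            {
                "remove": {
                    "field": list(fields_to_remove),
                    # Skip documents that lack the field instead of failing.
                    "ignore_missing": True,
                }
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical index names, used only for this example.
    print(json.dumps(build_reindex_body("people_v1", "people_v2", ["birth_date"]), indent=2))
    print(json.dumps(build_remove_pipeline(["birth_date"]), indent=2))
```

The script variant keeps everything in one request, which is probably why it is the simpler of the two to set up.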

Related

Kibana display number after comma on metric

I'm trying to display all the digits after the decimal point in my Kibana data table, but even with the JSON input format it does not display as expected.
Do you have an idea how to do this?
Here, for example, I have 2.521, but it can be 0.632 or 0.194.
I only see 0 in the Min, Max and Avg columns.
In my C# code the value is a double, and it is indexed as a number in the Kibana index.
How can I do this, please?
Thanks a lot and best regards
This usually means that your field has been mapped as integer or long. If that's the case, 0.632 is stored as 0 and 2.521 as 2.
You need to make sure that those fields are mapped as float or double in your mapping.
PS: you cannot change the mapping type of a field once the index has been created; you need to create a new index and reindex your data.
You need to pre-create your index with the right mapping types before sending the first document:
PUT webapi-myworkspace-test
{
  "mappings": {
    "properties": {
      "GraphApiResponseTime" : {
        "type" : "double"
      }
    }
  }
}
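As a rough illustration of why the Min, Max and Avg columns show 0: with a long or integer mapping, the fractional part of each number is cut off for indexing. The Python sketch below only imitates that truncation locally; index_as_long is a hypothetical helper, not an Elasticsearch API.

```python
def index_as_long(value):
    """Imitate a long/integer mapping: keep the whole part, drop the fraction."""
    return int(value)


samples = [2.521, 0.632, 0.194]
truncated = [index_as_long(v) for v in samples]

print(truncated)  # [2, 0, 0]
print(min(truncated), max(truncated))  # 0 2
```

With a double mapping the fractional digits are kept, so the aggregations operate on 2.521, 0.632 and 0.194 themselves.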

ElasticSearch - Unique Tags for multiple documents (indexing)

We would like a unique tag with multiple values in Elasticsearch. To be clearer: we need to draw a time-series graph, so we fetch values between two dates. But of course we have different kinds of data; that is where our tags come in. We want to search our tags with autocompletion, then choose our values by date.
{
  tag : ["sdfsf", "fddsfsd", "fsdfsf"],
  {
    values : 145.45,
    date : "2004-10-23"
  },
  {
    values : 556.09,
    date : "2010-02-13"
  }
}
After a bit of research we found the parent/child technique, but because we want to do completion on the tag (in the parent), we need an aggregation, which is impossible in ES with "has_parent".
Our solution is to do:
{
  {
    tag : ["sdfsf", "fddsfsd", "fsdfsf"],
    values : 145.45,
    date : "2004-10-23"
  },
  {
    tag : null,
    values : 556.09,
    date : "2010-02-13"
  }, {etc...}
}
So we only have one tag, which is easy to check with completion. But it's kind of "ugly".
Does anybody know a proper way to do what we want?
Thanks in advance.

How to get Elasticsearch actual result size

I am not asking for the hit count of the search response. What I am asking for is the size of the result (_source) on Elasticsearch's disk. Is it possible to find that? The reason I am asking is that I need to find which type of source takes up the most space in an index. Thanks in advance.
You can enable the _size field in your mapping. Its value is computed at index time. (In Elasticsearch 2.0 and later, _size is provided by the mapper-size plugin.)
{
"tweet" : {
"_size" : {"enabled" : true, "store" : true }
}
}
Check out the size field documentation.
Then you can return this field by adding it to the fields list in the query.
See the Fields documentation for how to do that.
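If you only need a rough comparison between types before enabling _size, the byte length of each document's JSON can be estimated on the client. The sketch below is an approximation under stated assumptions: it measures compact UTF-8 JSON, which is not the same as the space Elasticsearch uses on disk, and source_size_bytes and largest_by_type are hypothetical helper names.

```python
import json


def source_size_bytes(doc):
    """Approximate size in bytes of a document serialized as compact UTF-8 JSON."""
    text = json.dumps(doc, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def largest_by_type(docs_by_type):
    """Return (type_name, total_bytes) for the type whose documents are largest in total."""
    totals = {
        type_name: sum(source_size_bytes(d) for d in docs)
        for type_name, docs in docs_by_type.items()
    }
    winner = max(totals, key=totals.get)
    return winner, totals[winner]


if __name__ == "__main__":
    # Made-up sample documents, grouped by a made-up type name.
    sample = {
        "tweet": [{"text": "hello"}],
        "user": [{"name": "Dana"}, {"name": "Leon", "bio": "example"}],
    }
    print(largest_by_type(sample))
```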

Query two indexes simultaneously in Kibana 4?

Whenever I create a visualization, Kibana 4 asks me to select the index for doing the search. My project requires searching data that is present in multiple indexes and hence I am stuck. I wish to search two indexes for my data and then visualize them. Any help would be valuable.
Kibana can create a visualization from multiple indexes, but the indexes should have similar names, or aliases with similar names. For example, you can grab data from the indexes logstash-2015-01-01 and logstash-2015-01-02 using the pattern logstash-*.
But yes, it would be handy if we could write something like index1,another_index.
A solution that works in any case: create an alias in Elasticsearch for the indexes you want to query simultaneously and then use the alias as an index-pattern in Kibana.
With the Marvel plugin, through the Sense interface, you can create an alias for multiple indexes with this request:
POST _aliases
{
"actions" : [
{ "add" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}
Or using CURL:
curl -XPOST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d '
{
"actions" : [
{ "add" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}'
Then, you just need to add an index-pattern in Kibana for "alias1" and create your visualizations.
For more information on aliases, see https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html
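When more than two indexes are involved, the actions list can be generated instead of typed by hand. This is a minimal sketch that only builds the body for POST _aliases; build_alias_actions is a hypothetical helper name, and the alias and index names are the ones from the example above.

```python
import json


def build_alias_actions(alias, indexes):
    """Build a POST _aliases body that adds every given index to one alias."""
    return {
        "actions": [
            {"add": {"index": name, "alias": alias}} for name in indexes
        ]
    }


if __name__ == "__main__":
    body = build_alias_actions("alias1", ["test1", "test2"])
    print(json.dumps(body, indent=2))
```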
Thanks for all the help, but I figured out a way in which this can be done.
In the Index Pattern settings of Kibana 4, create an index pattern named _all. This index pattern covers all the indexes present in your Elasticsearch. When you create a new visualization, simply select the _all index pattern, and the data fields from all the indexes are accessible for building visualizations.
If I understand what you are asking correctly, then it may depend on how you've named your indexes.
I can query multiple logstash indexes by selecting my pattern 'logstash-*'. When you set up your indexes, Kibana gives you the option to specify a pattern.
(Settings => Indices => Index Pattern => Add New)
I hope that helps.
Two wildcards (i.e. *-*) works for me in Kibana 4.
I'm not sure I understand correctly, but I think your best option is to create the visualization on each of the two indexes separately, and build a dashboard that includes both visualizations.
Kibana can't display a single visualization with searches from two separate indexes.

Why are Elasticsearch aliases not unique

The Elasticsearch documentation describes aliases as a feature for reindexing data with zero downtime:
1. Create a new index and index all the data
2. Let your alias point to the new index
3. Delete the old index
This would be a great feature if aliases were unique, but one alias can point to multiple indexes. If, say, the deletion of the old index fails, my application might talk to two indexes that are not in sync. Even worse: the application doesn't know about that.
Why is it possible to reuse an alias?
It allows you to easily have several indexes that are used both individually and together with other indexes. This is useful, for example, with logging indexes, where sometimes you want to query only the most recent logs (a logs-recent alias) and sometimes everything (a logs alias). There are probably lots of other use cases, but this is the first one that comes to mind.
As per the documentation you can send both the remove and add in one request:
curl -XPOST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d '
{
"actions" : [
{ "remove" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}'
After that succeeds you can remove your old index, and if that fails you will just have an extra index taking up some space until it's cleaned out.
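The swap, and a check for the situation the question worries about, can be sketched in Python. build_alias_swap produces the remove-plus-add body for a single POST _aliases request, and indexes_behind_alias reads an already parsed GET _alias/<alias> response, which is keyed by index name. Both helper names are hypothetical, and nothing here talks to a cluster.

```python
def build_alias_swap(alias, old_index, new_index):
    """Build one POST _aliases body that moves the alias from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def indexes_behind_alias(alias_response, alias):
    """List the indexes carrying the alias, given a parsed GET _alias/<alias> response."""
    return sorted(
        index
        for index, info in alias_response.items()
        if alias in info.get("aliases", {})
    )


if __name__ == "__main__":
    print(build_alias_swap("alias1", "test1", "test2"))
    # Example response shape for an alias that still points at two indexes.
    response = {
        "test1": {"aliases": {"alias1": {}}},
        "test2": {"aliases": {"alias1": {}}},
    }
    print(indexes_behind_alias(response, "alias1"))  # ['test1', 'test2']
```

An application that expects exactly one index behind the alias could run such a check at startup and refuse to continue when the list has more than one entry.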
