Elasticsearch - Reindex to add doc_values

What am I trying to do?
Add doc_values to a field in an existing index.
What have I tried?
Created index and document
POST /my_index-1/my_type/1
{
"my_prop": "my_value"
}
Added a template
PUT /_template/my_template
{
"id": "my_template",
"template": "my_index-*",
"mappings": {
"_default_": {
"dynamic_templates": [
{
"my_prop_template": {
"mapping": {
"index": "not_analyzed",
"doc_values": true,
"fielddata": {
"format": "doc_values"
},
"type": "string"
},
"match": "my_prop",
"match_mapping_type": "string"
}
}
]
}
}
}
Reindexed
./stream2es es --source http://localhost:9200/my_index-1 --target http://localhost:9200/my_index-2
What went wrong?
In the new index my_index-2 the property did not receive "doc_values": true:
...
"properties": {
"my_prop": {
"type": "string"
}
}
...
Just for the sanity, I have also tried adding the same document to my_index-3, and it got "doc_values": true.
My question
How can I reindex my old index with "doc_values": true?

Thanks @Val! Logstash indeed solved the problem.
Both stream2es and elasticsearch-reindex created new mapping without "doc_values": true.
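To check the outcome of a reindex without reading the mapping by eye, the mapping JSON can be walked programmatically. The following is only a sketch and not part of the original thread: the helper name fields_missing_doc_values is hypothetical, and it assumes the response shape of GET /my_index-2/_mapping in Elasticsearch 1.x/2.x (index, then mappings, then type, then properties).

```python
import json


def fields_missing_doc_values(mapping_response):
    """Return sorted 'index/type/field' paths of string fields without "doc_values": true."""
    missing = []

    def walk(properties, prefix):
        for name, spec in properties.items():
            path = prefix + name
            if "properties" in spec:
                # Object field: descend into its sub-fields.
                walk(spec["properties"], path + ".")
            elif spec.get("type") == "string" and spec.get("doc_values") is not True:
                missing.append(path)

    for index, body in mapping_response.items():
        for doc_type, type_mapping in body.get("mappings", {}).items():
            walk(type_mapping.get("properties", {}), f"{index}/{doc_type}/")
    return sorted(missing)


# Sample shaped like the mapping fragment shown in the question.
sample = json.loads("""
{"my_index-2": {"mappings": {"my_type": {"properties": {
    "my_prop": {"type": "string"}
}}}}}
""")
print(fields_missing_doc_values(sample))
```

Running the same check against my_index-3, where the template was applied, should report nothing for my_prop.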

Related

How do I alter the schema without destroying data in elasticsearch?

This is my current schema
{
"mappings": {
"historical_data": {
"properties": {
"continent": {
"type": "string",
"index": "not_analyzed"
},
"country": {
"type": "string",
"index": "not_analyzed"
},
"description": {
"type": "string"
},
"funding": {
"type": "long"
},
"year": {
"type": "integer"
},
"agency": {
"type": "string"
},
"misc": {
"type": "string"
},
"university": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
I have 700k records uploaded. Without destroying the data, how can I change the university field so that it is no longer "not_analyzed", with the change reflected in my existing data?
The mapping for an existing field cannot be modified.
However, you can achieve the desired outcome in two ways.
Create another field. Adding fields is free using put _mapping API
curl -XPUT localhost:9200/YOUR_INDEX/_mapping -d '{
"properties": {
"new_university": {
"type": "string"
}
}
}'
Use multi-fields, add a sub-field to your not_analyzed field.
curl -XPUT localhost:9200/YOUR_INDEX/_mapping -d '{
"properties": {
"university": {
"type": "string",
"index": "not_analyzed",
"fields": {
"university_analyzed": {
"type": "string" // <-- ANALYZED sub field
}
}
}
}
}'
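In both cases, you need to reindex the existing documents in order to populate the new field. The _reindex API refuses to write into the index it is reading from, so to reindex in place use the _update_by_query API (available since Elasticsearch 2.3). The script below fills new_university; it can be dropped for the multi-field option.
curl -XPOST 'localhost:9200/YOUR_INDEX/_update_by_query?conflicts=proceed' -d '{
"script": {
"inline": "ctx._source.new_university = ctx._source.university"
}
}'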
You are not exactly forced to "destroy" your data. What you can do is reindex your data as described in this article (I'm not going to copy the examples here, as they are particularly clear in the section "Reindexing your data with zero downtime").
For reindexing, you can also take a look at the reindexing API, the simplest way being:
POST _reindex
{
"source": {
"index": "twitter"
},
"dest": {
"index": "new_twitter"
}
}
Of course it will take some resources to perform this operation, so I would suggest that you take a complete look at the changes you want to introduce in your mapping, and perform the operation when you have the least amount of activity on your servers (e.g. during the weekend, or at night...)
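The zero-downtime approach mentioned above usually relies on an index alias: applications query the alias, and once the new index is populated, the alias is moved in a single atomic _aliases call. A minimal sketch of that request body follows. The alias name historical and the index names historical_v1 and historical_v2 are hypothetical and do not come from the original question.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names, for illustration only.
body = alias_swap_body("historical", "historical_v1", "historical_v2")
print("POST /_aliases")
print(json.dumps(body, indent=2))
```

Both actions are applied together, so searches through the alias never see a moment where it points at no index.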

Elasticsearch 1.5 Won't Add _timestamp

I am using this request when creating my index:
PUT some_name
{
"mappings": {
"_default_": {
"_timestamp" : {
"enabled": true,
"store": true
},
"properties": {
"properties": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
}
However, the _timestamp field is not returned when I add a document (without any time field) and request it back. I am running Elasticsearch 1.5, and I have tried both "store": "yes" and "store": "true".
What am I doing wrong? Thanks.
You need to ask for that field explicitly with "fields": ["_timestamp"], because it is not commonly returned and is not included in the _source (which is what is returned by default):
GET /some_name/_search
{
"query": {
"match_all": {}
},
"fields": ["_timestamp"]
}

not analyzed string in elasticsearch

I want to write a template in Elasticsearch that changes all strings to not analyzed. The official documentation shows that I can do that using
"properties": {
"host_name": {
"type": "string",
"index": "not_analyzed"
},
"created_at": {
"type": "date",
"format": "EEE MMM dd HH:mm:ss Z YYYY"
}
}
But the problem here is that I need to do this for every field like it is done here for host_name. I tried using _all and __all but it did not seem to work. How can I change all the strings to not analyzed using a custom template?
For an existing index, you cannot change the mapping of fields that already exist and, even if you could, you would need to reindex all documents so that they obey the new mapping rules.
Otherwise, if you are creating the index from scratch, register an index template with a dynamic template before the index is created:
PUT /_template/not_analyzed_strings
{
"template": "xxx-*",
"order": 0,
"mappings": {
"_default_": {
"dynamic_templates": [
{
"string_fields": {
"mapping": {
"index": "not_analyzed",
"type": "string"
},
"match_mapping_type": "string",
"match": "*"
}
}
]
}
}
}
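To make the limitation concrete, here is a deliberately simplified model of when the string_fields dynamic template above would apply. It is an illustration under assumptions, not Elasticsearch's implementation: the helper names are hypothetical, only top-level fields are considered, and it ignores date detection (strings that look like dates may be detected as dates rather than strings).

```python
import re


def simple_match(pattern, name):
    """Wildcard match where '*' is the only special character."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def fields_hit_by_template(doc, already_mapped, match="*"):
    """Top-level fields that are new to the mapping, hold a JSON string, and match the pattern."""
    return sorted(
        name
        for name, value in doc.items()
        if isinstance(value, str)
        and name not in already_mapped
        and simple_match(match, name)
    )


doc = {"host_name": "web-1", "status": "ok", "bytes": 512}
# "status" is already in the mapping, so the template no longer affects it.
print(fields_hit_by_template(doc, already_mapped={"status"}))
```

Fields that already have a mapping keep it, which is why existing indices need a reindex into a new index created after the template is in place.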

Kibana doesn't show results on tile map

I have approximately 3300 documents with geo_point typed field filled.
When I try to visualize my documents on the tile map, kibana says "no results found".
I've already tried putting coordinates as:
- geohash in string
- [lon, lat] array
- object with "lat" and "lon" properties
- string "lat,lon"
All these ways of setting geo_point are allowed according to ES docs.
Kibana detects this field as geo_point (there is a globe icon near field name), but nothing shows up on tile map.
What's wrong with me?
I'm using Kibana 4.2, elasticsearch 2.0.0
I've managed it.
It was happening because I had my geo_point typed field inside of the field with "type": "nested" parameter.
I've changed this outer field to "dynamic": "true" and now I can visualize my locations!
I was able to have a nested geo_point by removing the "type": "nested" from the mapping. No "dynamic":"true" needed. My mapping looks like this:
"mappings": {
"_default_": {
"_all": {
"enabled": true
},
"_ttl": {
"enabled": true,
"default": "12m"
},
"dynamic_templates": [{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "analyzed",
"omit_norms": true,
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
}
}
}],
"properties": {
"@version": {
"type": "string",
"index": "not_analyzed"
},
"user_data": {
"properties": {
"user_geolocation": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
}
}
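Since both posters traced the empty tile map to a geo_point sitting under a "type": "nested" field, it can help to scan a mapping for that situation. The sketch below is an assumption-based helper (the function name is hypothetical) that takes the "properties" section of a type mapping and lists geo_point fields with a nested ancestor.

```python
def geo_points_under_nested(properties, prefix="", inside_nested=False):
    """Return dotted paths of geo_point fields that have a "type": "nested" ancestor."""
    found = []
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("type") == "geo_point" and inside_nested:
            found.append(path)
        # Children are "inside nested" if this field or any ancestor is nested.
        nested_here = inside_nested or spec.get("type") == "nested"
        found.extend(
            geo_points_under_nested(spec.get("properties", {}), path + ".", nested_here)
        )
    return sorted(found)


# Hypothetical mapping where the outer field is declared nested.
problem = {
    "user_data": {
        "type": "nested",
        "properties": {
            "user_geolocation": {
                "properties": {"location": {"type": "geo_point"}}
            }
        },
    }
}
print(geo_points_under_nested(problem))
```

With the mapping shown above, where user_data is a plain object, the same call returns an empty list.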

Changing elasticsearch index's shard-count on the next index-rotation

I have an ELK (Elasticsearch-Logstash-Kibana) stack in which the Elasticsearch node has the default shard count of 5. Logs are pushed to it in logstash format (logstash-YYYY.MM.DD), which - correct me if I am wrong - are indexed date-wise.
Since I cannot change the shard count of an existing index without reindexing, I want to increase the number of shards to 8 when the next index is created. I figured that the ES-API allows on-the-fly persistent changes.
How do I go about doing this?
You can use the "Template Management" features in Elasticsearch: http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/indices-templates.html
Create a new logstash template by using:
curl -XPUT localhost:9200/_template/logstash -d '
{
"template": "logstash-*",
"settings": {
"number_of_replicas": 1,
"number_of_shards": 8,
"index.refresh_interval": "5s"
},
"mappings": {
"_default_": {
"_all": {
"enabled": true
},
"dynamic_templates": [
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "analyzed",
"omit_norms": true,
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
}
}
}
],
"properties": {
"@version": {
"type": "string",
"index": "not_analyzed"
},
"geoip": {
"type": "object",
"dynamic": true,
"path": "full",
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
}'
The next time the index that matches your pattern is created, it will be created with your new settings.
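If the only goal is to change the shard count, a smaller template carrying just that setting may be enough, because Elasticsearch merges every template that matches an index name and lets templates with a higher order override lower ones. The sketch below builds such a request without sending it. The template name logstash-shards and the order value 1 are assumptions chosen for illustration; the host localhost:9200 is taken from the curl command above.

```python
import json
import urllib.request


def shard_template_request(host, name, pattern, shards, order=1):
    """Build (but do not send) a PUT /_template/<name> request that only sets the shard count."""
    body = {
        "template": pattern,
        "order": order,
        "settings": {"number_of_shards": shards},
    }
    return urllib.request.Request(
        url=f"http://{host}/_template/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical template name; pattern and shard count come from the question.
req = shard_template_request("localhost:9200", "logstash-shards", "logstash-*", 8)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it against a running node: urllib.request.urlopen(req)
```

Existing indices keep their shard count; only indices created after the template is stored are affected.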
The setting lives in your Elasticsearch configuration. Edit the config file config/elasticsearch.yml,
set index.number_of_shards: 8, and restart Elasticsearch. New indices will then pick up the new configuration and be created with 8 shards, as you want.
The best option is to use templates. To add one, I would recommend the Kopf plugin, found here: https://github.com/lmenezes/elasticsearch-kopf
You can of course use the API:
curl -XPUT $ELASTICSEARCH-MASTER$:9200/_template/$TEMPLATE-NAME$ -d '$TEMPLATE-CONTENT$'
In the plugin: on the top left corner click on more -> Index templates and then create a new template and make sure you have the following settings as part of your template:
{
"order": 0,
"template": "logstash*",
"settings": {
"index": {
"number_of_shards": "5",
"number_of_replicas": "1"
}
},
"mappings": {### your mapping ####},
"aliases": {}
}
The above settings make sure that any new index whose name matches logstash* is created with 5 shards and 1 replica.
