Changing elasticsearch mapping

Take the simplest case of indexing the following document in elasticsearch
{
  "name": "Mark",
  "age": 28
}
With automatic mapping, the mapping for this index would now look like:
"properties" : {
  "doc" : {
    "properties" : {
      "age" : { "type" : "long" },
      "name" : { "type" : "string" }
    }
  }
}
But say I then wanted to allow the case where this document should be indexed:
{
  "name": "Bill",
  "age": "seven"
}
If I try this, the mapping does not update and Elasticsearch throws an error, since there is a conflict with the type of the age property.
Is there any way to do this so both docs could be automatically indexed and consequently queryable?

Mappings are defined per type, so what you could do is have two types in your index:
numeric
alphabetical
And split the documents according to the value in the age field. If you run a query, you can query both types.
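A minimal sketch of that setup on a pre-6.0 cluster (where an index could still hold multiple mapping types; the index name and field types here are illustrative):
PUT /people
{
  "mappings" : {
    "numeric" : {
      "properties" : {
        "age" : { "type" : "long" }
      }
    },
    "alphabetical" : {
      "properties" : {
        "age" : { "type" : "string" }
      }
    }
  }
}
A search against /people/_search (no type in the path) would then hit documents of both types.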

You can add new fields to a mapping, but you cannot change the type of an existing field. To do that, you need to drop the index, create a new mapping, and index the data again.
For more info, refer to this reference.

You can't change an existing mapping; you can only add new fields to it.
Otherwise you have to delete the old mapping and create a new mapping for that particular index.
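A rough sketch of that drop-and-recreate flow (the index and field names are placeholders, and on versions before 7.0 the mapping body would also need a type wrapper):
DELETE /my_index

PUT /my_index
{
  "mappings" : {
    "properties" : {
      "age" : { "type" : "keyword" }
    }
  }
}
After that, reindex your documents into the fresh index.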

Related

How to update data type of a field in elasticsearch

I am publishing data to Elasticsearch using fluentd. It has a field Data.CPU which is currently set to string. The index name is health_gateway.
I have made some changes in the Python code which generates the data, so this field Data.CPU has now become an integer. But Elasticsearch is still showing it as a string. How can I update its data type?
I tried running the below command in Kibana Dev Tools:
PUT health_gateway/doc/_mapping
{
  "doc" : {
    "properties" : {
      "Data.CPU" : { "type" : "integer" }
    }
  }
}
But it gave me the below error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
  },
  "status" : 400
}
There is also this document which says the data type can be converted using mutate, but I am not able to understand it properly.
I do not want to delete the index and recreate it, as I have created a visualization based on this index and deleting the index would delete that too. Can anyone please help with this?
The short answer is that you can't change the mapping of a field that already exists in a given index, as explained in the official docs.
The specific error you got is because you included /doc/ in your request path (you probably wanted /<index>/_mapping), but fixing this alone won't be sufficient.
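For reference, the typeless form of the request would be the following, though per the docs it would still be rejected with a mapper conflict, because the field already exists with a different type:
PUT health_gateway/_mapping
{
  "properties" : {
    "Data.CPU" : { "type" : "integer" }
  }
}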
Finally, I'm not sure you really have a dot in the field name there. Last I heard it wasn't possible to use dots in field names.
Nevertheless, there are several ways forward in your situation... here are a couple of them:
Use a scripted field
You can add a scripted field to the Kibana index-pattern. It's quick to implement, but has major performance implications. You can read more about them on the Elastic blog here (especially under the heading "Match a number and return that match").
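As a very loose sketch, such a scripted field in Painless could parse the string on the fly, assuming the value is reachable through doc values (the Data.CPU.keyword sub-field name is an assumption):
Integer.parseInt(doc['Data.CPU.keyword'].value)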
Add a new multi-field
You could add a new multi-field. The example below assumes that CPU is an object sub-field under Data, rather than really being called Data.CPU with a literal dot:
PUT health_gateway/_mapping
{
  "properties": {
    "Data": {
      "properties": {
        "CPU": {
          "type": "keyword",
          "fields": {
            "int": {
              "type": "short"
            }
          }
        }
      }
    }
  }
}
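Documents indexed after this mapping change would populate the new sub-field, which could then be queried numerically; a sketch under the same naming assumptions:
GET health_gateway/_search
{
  "query": {
    "range": {
      "Data.CPU.int": { "gte": 80 }
    }
  }
}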
Reindex your data within ES
Use the Reindex API. Be sure to set the correct mapping on the target index.
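A minimal sketch, assuming a new index named health_gateway_v2 (a hypothetical name) has already been created with Data.CPU mapped as integer:
POST _reindex
{
  "source": { "index": "health_gateway" },
  "dest": { "index": "health_gateway_v2" }
}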
Delete and reindex everything from source
If you are able to regenerate the data from source in a timely manner, without disrupting users, you can simply delete the index and reingest all your data with an updated mapping.
You can update the mapping by indexing the same field in multiple ways, i.e. by using multi-fields.
Using the below mapping, Data.CPU.raw will be of integer type:
{
  "mappings": {
    "properties": {
      "Data": {
        "properties": {
          "CPU": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "integer"
              }
            }
          }
        }
      }
    }
  }
}
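With that mapping in place (and documents reindexed against it), Data.CPU.raw can be used anywhere an integer field is expected, for example to sort (a sketch; the index name is taken from the question):
GET health_gateway/_search
{
  "sort": [
    { "Data.CPU.raw": "desc" }
  ]
}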
Or you can create a new index with the correct mapping and reindex the data into it using the Reindex API.

Changing type of property in index type's mapping

I have index mapping for type 'T1' as below:
"T1" : {
"properties" : {
"prop1" : {
"type" : "text"
}
}
}
And now I want to change the type of prop1 from text to keyword. I don't want to delete the index. I have also read people suggesting to create another property with the new type and replace it, but then I would have to update the old documents, which I am not interested in doing. I tried to use the PUT API as below, but it never works.
PUT /indexName/T1/_mapping -d
{
  "T1" : {
    "properties" : {
      "prop1" : {
        "type" : "keyword"
      }
    }
  }
}
Is there any way to achieve this?
A mapping cannot be modified, hence the PUT API you have used will not work. A new index will have to be created with the updated mapping, and all the data reindexed into that new index.
To prevent downtime you can always use an alias:
https://www.elastic.co/blog/changing-mapping-with-zero-downtime
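The gist of the alias swap described in that post, with placeholder index and alias names:
POST /_aliases
{
  "actions" : [
    { "remove" : { "index" : "my_index_v1", "alias" : "my_index" } },
    { "add" : { "index" : "my_index_v2", "alias" : "my_index" } }
  ]
}
Both actions are applied atomically, so clients querying the alias never observe a gap.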
A mapping cannot be updated once it is persisted. The only option is to create a new index with the correct mappings and reindex your data using the reindex API provided by ES.
You can read about the reindex API here:
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/docs-reindex.html

Is there a way to apply the synonym token filter in ElasticSearch to field names rather than the value?

Consider the following JSON file:
{
"titleSony": "Matrix",
"cast": [
{
"firstName": "Keanu",
"lastName": "Reeves"
}
]
}
Now, I know in ElasticSearch, you can apply a synonym token filter to field values as given in the following link: Elasticsearch Analysis: Synonym token filter.
Hence, I can create a "synonym.txt" file with Matrix => Matx, then if I search for titleSony:Matx, it will return the documents with Matrix as well.
Now, what I would like is to create a synonym for the field name titleSony. For example - titleSony => titleAll, such that when I search for titleAll, I should get all documents with titleSony as well.
Is there any way to accomplish this in ElasticSearch?
Now, what I would like is to create a synonym for the field name "titleSony". For example - titleSony => titleAll , hence when I search for "titleAll", I should get all documents with "titleSony" as well.
Yes, somewhat. Elasticsearch has some default behavior very similar to this, which I'll touch on in a bit.
The feature you're looking for is called "Copy to field." It allows you to specify that the terms in one field should be copied into another. This is useful for consolidating terms you expect to match into a single field, to help simplify your query when you would like to match against any one of a number of fields.
In this example, you would specify in your mapping that the terms in the titleSony field ought to be copied into the titleAll field. Presumably you'd have other fields (say, titleDisney) which also copy into that field as well. So a search against titleAll will effectively match the other fields whose terms are copied into it.
An excerpt of your mapping might look something like this:
{
  "movies" : {
    "properties" : {
      "titleSony" : { "type" : "string", "copy_to" : "titleAll" },
      "titleDisney" : { "type" : "string", "copy_to" : "titleAll" },
      "titleAll" : { "type" : "string" },
      "cast" : { ... },
      ...
    }
  }
}
I mentioned earlier that Elasticsearch does something like this. By default it creates a special field called _all into which all the document's terms are copied. This field lets you construct very simple queries to match against terms that occur in any field on the document. So as you see, this is a fairly common convention in Elasticsearch. (Elasticsearch mapping: _all field.)
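A search against the consolidated field would then look something like this (the index name is taken from the mapping excerpt above):
GET /movies/_search
{
  "query" : {
    "match" : { "titleAll" : "Matrix" }
  }
}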

How to add multiple object types to elasticsearch using jdbc river?

I'm using the jdbc river to successfully add one object type, "contacts", to elasticsearch. How can I add another contact type with different fields? I'd like to add "companies" as well.
What I have is below. Do I need to do a separate PUT statement? If I do, no new data appears to be added to elasticsearch.
PUT /_river/projects_river/_meta
{
  "type" : "jdbc",
  "index" : {
    "index" : "ALL",
    "type" : "project",
    "bulk_size" : 500,
    "max_bulk_requests" : 1,
    "autocommit" : true
  },
  "jdbc" : {
    "driver" : "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "poll" : "30s",
    "strategy" : "poll",
    "url" : "jdbc:sqlserver://connectionstring",
    "user" : "username",
    "password" : "password",
    "sql" : "select ContactID as _id, * from Contact"
  }
}
Also, when search returns results, how can I tell if they are of type contact or company? Right now they all have a type of "jdbc", and changing that in the code above throws an error.
You can achieve what you want by adding several columns to your SQL query.
Just like ContactID AS _id, you can also define indexName AS _index and indexType AS _type in your SQL query.
Also, if you need another river, add rivers with different _river types.
In your case, something like:
PUT /_river/projects_river2/_meta + Query ....
PUT /_river/projects_river3/_meta + Query ....
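As a sketch, the second river could label its rows through those column aliases (the Company table and the index/type names here are hypothetical):
PUT /_river/projects_river2/_meta
{
  "type" : "jdbc",
  "jdbc" : {
    "driver" : "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "url" : "jdbc:sqlserver://connectionstring",
    "user" : "username",
    "password" : "password",
    "sql" : "select CompanyID as _id, 'ALL' as _index, 'company' as _type, * from Company"
  }
}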
Anyone else who stumbles across this, please see the official documentation for the syntax first: https://github.com/jprante/elasticsearch-river-jdbc/wiki/How-bulk-indexing-isused-by-the-JDBC-river
Here's the final put statement I used:
PUT /_river/contact/_meta
{
  "type":"jdbc",
  "jdbc": {
    "driver":"com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "url":"connectionstring",
    "user":"username",
    "password":"password",
    "sql":"select ContactID as _id,* from Contact",
    "poll": "5m",
    "strategy": "simple",
    "index": "contact",
    "type": "contact"
  }
}

Exact (not substring) matching in Elasticsearch

{"query":{
"match" : {
"content" : "2"
}
}} matches all the documents whole content contains the number 2, however I would like the content to be exactly 2, no more no less - think of my requirement in a spirit of Java's String.equals.
Similarly for the second query I would like to match when the document's content is exactly '3 3' and nothing more or less. {"query":{
"match" : {
"content" : "3 3"
}
}}
How could I do exact (String.equals) matching in Elasticsearch?
Without seeing your index type mapping and sample data, it's hard to answer this directly - but I'll try.
Offhand, I'd say this is similar to this answer here (https://stackoverflow.com/a/12867852/382774), where you simply set the content field's index option to not_analyzed in your mapping:
"url" : {
"type" : "string",
"index" : "not_analyzed"
}
Edit: I wasn't clear enough with my original answer, shown above. I did not mean to imply that you should add the example code to your query, I meant that you need to specify in your index type mapping that the url field is of type string and it is indexed but not analyzed (not_analyzed).
This tells Elasticsearch to not bother analyzing (tokenizing or token filtering) the field when you're indexing your documents - just store it in the index as it exists in the document. For more information on mappings, see http://www.elasticsearch.org/guide/reference/mapping/ for an intro and http://www.elasticsearch.org/guide/reference/mapping/core-types/ for specifics on not_analyzed (tip: search for it on that page).
Update:
The official documentation tells us that in newer versions of Elasticsearch you can't define a field as "not_analyzed"; instead you should use the "keyword" type.
For older versions of Elasticsearch:
{
  "foo": {
    "type": "string",
    "index": "not_analyzed"
  }
}
For newer versions:
{
  "foo": {
    "type": "keyword",
    "index": true
  }
}
Note that this functionality (the keyword type) is available from Elasticsearch 5.0, and the backward compatibility layer was removed in the Elasticsearch 6.0 release.
Official Doc
You should use a filter instead of match:
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "term" : {
          "content" : 2
        }
      }
    }
  }
}
This way you get docs whose content is exactly 2, instead of 20 or 2.1.
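On current Elasticsearch versions, the same idea is usually expressed as a term query against a keyword field; a sketch, assuming content has the default .keyword sub-field:
GET /_search
{
  "query" : {
    "term" : {
      "content.keyword" : "3 3"
    }
  }
}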
