Cannot open data from 5.x in 6.x : Mappings - elasticsearch

I've created a routine that updates ES clients from 5.x to 6.x and finally 7.x
Somehow some clients cannot be updated.
Loading existing data in 6.8 fails.
Appearently some mappings are causing this.
But there are not templates applied and I cannot see any difference to the other clients, were everything works just fine.
I know that ES has dropped string type and is using text now but where does this type string come from? Why doesn't it occur on the other clients then? And finally - how would I solve this? I cannot change type from string to text in 5.x and I cannot apply templates in 6.x because it's not starting up.
Caused by: org.elasticsearch.index.mapper.MapperParsingException: Failed to parse mapping [datapoint]: No handler for type [string] declared on field [batchId]
UPDATE:
this is my current mapping for batchId
http://localhost:9200/_mapping
"batchId":{"type":"keyword"}

it seems that you forgot to change the datatype from string to text in your mapping which caused MapperParsingException and its really good that exception is telling you that probalmatic field is batchId, just change it to text datatype and it should work.
Please refer this elastic blog that talks about this string to text change and provides some tips on how to handle it while upgrading.

The problem was something else:
Some clients had unexpected indices which caused the problem.
After deleting them, ES 6.x started fine.

Related

How to configure elastic search correct on Shopware 6?

I'm trying to set up elastic search on our server. Actually it seems to work, but when i run the product importer (own plugin), then it has exceptions.
So the config is done like described in the documentation: https://docs.shopware.com/en/shopware-6-en/enterprise-extensions/enterprise-search?category=shopware-6-en/enterprise-extensions
But even if all product fields are set to "Not searched" in the settings, the importer runs into this exception:
ERROR [app] object mapping for [customFields.product_processing_fields] tried to parse field [product_processing_fields] as object, but found a concrete value
In ElasticsearchIndexer.php line 210:
[Shopware\Elasticsearch\Exception\ElasticsearchIndexingException]
Following errors occurred while indexing:
object mapping for [customFields.product_processing_fields] tried to parse field [product_proc
essing_fields] as object, but found a concrete value
If i turn off elastic search in the .env file, then the importer runs trough without any issue.
Does anyone has an idea how to solve?
The problem is that the value of the custom field does not correlate to what type it was configured as. Check the type property of the config column in the custom_field table. If it is bool then the value mustn't be a string for example. You can see where the type is mapped from what is set in your custom field's config to what Elasticsearch will expect by looking at the method CustomFieldUpdater::getTypeFromCustomFieldType.

Upgrate elasticsearch2.4 to elasticsearch7.x

Can somebody please guide me that following questions while upgrading es2 to es7
Is it possible to directly upgrade to es2 to es7 ?
do we need to do mapping ?
how to handle _parent fields that are already exist in es2.
1. Is it possible to directly upgrade to es2 to es7 ?
No, it's not possible, ES provides the backward compatibility only till last major version, as mentioned in ES upgrade docs and there are a lot of breaking changes, from es2 to es7.
2. do we need to do mapping ?
It's not clear what exactly you mean, but yes, as mentioned there are lot of breaking changes in mapping as well, like removal of types, string data-type is changed to text to name a few, so it's better you define your new mapping according to latest syntax.
3.how to handle _parent fields that are already exist in es2.?
Not much familiar with this but you can read the docs and migration guide on it.

Elasticsearch issue types removal

I am trying to run the below code in Python using Elasticsearch Ver 7.1, however the following errors come up:
ElasticsearchDeprecationWarning: [types removal] Using include_type_name in put mapping requests is deprecated. The parameter will be removed in the next major version.
client.indices.put_mapping(index=indexName,doc_type='diseases', body=diseaseMapping, include_type_name=True)
followed by:
ElasticsearchDeprecationWarning: [types removal] Specifying types in document index requests is deprecated, use the typeless endpoints instead (/{index}/_doc/{id}, /{index}/_doc, or /{index}/_create/{id}).
client.index(index=indexName,doc_type=docType, body={"name": disease,"title":currentPage.title,"fulltext":currentPage.content})
How I am supposed to amend my code to make it (see here) work in line with Elasticsearch 7X version? Any kind of help would be much appreciated.
This is just a warning right now, but it will become an error in Elasticsearch 8.
From last few version, Elasticsearch has been planning the removal of index types inside an index
ES5 - Setting index.mapping.single_type: true on an index will enable the single-type-per-index behavior which will be enforced in 6.0.
In ES6 - you can't have more than 1 index type inside 1 index
In ES7 - the concept of types inside an index has been deprecated
In ES8 - it will be removed, and you can't use types for query or while inserting documents
My suggestion would be to design an application and mapping in such a way that it doesn't include type parameter in index
To know the reason why elastic search has done this here is a link: https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#_why_are_mapping_types_being_removed
A common issue (and difficult to spot) for this error message could be misspelling the endpoint, e.g.:
Misspelled:
/search
Correct:
/_search
Double check if your endpoint is correct as ElasticSearch may think you are trying to manipulate (add, update, remove) a document and you are giving a type, which is not the case (you are trying to call an endpoint).

Indexing Errors when using quarkus logging-gelf extension and ELK stack

I have setup logging like described in https://quarkus.io/guides/centralized-log-management with an ELK Stack using version 7.7.
My logstash pipeline looks like the proposed example:
input {
gelf {
port => 12201
}
}
output {
stdout {}
elasticsearch {
hosts => ["http://elasticsearch:9200"]
}
}
Most Messages are showing up in my Kibana using logstash.* as an Index pattern. But some Messages are dropped.
2020-05-28 15:30:36,565 INFO [io.quarkus] (Quarkus Main Thread) Quarkus 1.4.2.Final started in 38.335s. Listening on: http://0.0.0.0:8085
The Problem seems to be, that the fields MessageParam0, MessageParam1, MessageParam2 etc. are mapped to the type that first appeared in the logs but actually contain multiple datatypes. The Elasticsearch log shows Errors like ["org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [MessageParam1].
Is there any way in the Quarkus logging-gelf extension to correctly map the values?
ELK can auto-create your Elasticsearch index mapping by looking at the first indexed document. This is a very convenient functionality, but it comes with some drawback.
For example, if you have a field that can contains numbers or strings, if the first document contains a number for this field, the mapping will be created with a number field so you will not be able to index a document containing a String inside this field ...
The only workaround for this is to create the mapping upfront (you can only defines the fields that causing the issue, the other fields will be created automatically).
This is an ELK issue, there is nothing we can do at Quarkus side.

Resulting all the indices while trying to get only those indices which are associated with specific alias

I have recently upgraded application from Elastic search 5.3.1 to 6.0.
My requirement was to get all the indices which were associated with specific alias.
I used below mentioned snippet to fetch all indices associated with specific alias . This snippet is working fine in 5.3.1 and giving only those indices which are associated with that specific alias.
GetAliasesResponse r = client.admin().indices().getAliases(new
GetAliasesRequest("givenalias")).actionGet();
But after ES 6.0 , the same snippet giving all the indices which are created in system .
Ideally it should only return those indices which are associated with given alias .Not others. This was working in Elastic search 5.3.1.
TL;DR: It is an intended breaking change in the Java API of Elasticsearch (though it is not explicitly mentioned in the "Breaking changes in 6.0 ยป Java API changes" page).
The following is the story of discovering this fact. (Note: the original answer was heavily edited, hence comments might be out of date.)
Breaking changes in the REST API in 6.0
First I noticed that this part of the REST API changed in Elasticsearch 6.0. There are two reported breaking changes concerning aliases:
GET /_aliases,_mappings syntax was removed in favor of GET /_aliases/GET /_mappings
Indices aliases api resolves indices expressions only against indices
Though there isn't mentioned anything concerning OP's case.
From what I have seen doing queries, this query works in Elasticsearch 5:
GET /alias1/_aliases
And does not work in Elasticsearch 6, giving the following error:
{
"error": "Incorrect HTTP method for uri [/alias1/_aliases] and method [GET], allowed: [PUT]",
"status": 405
}
Interestingly, GET /alias1/_alias works in both versions and returns same result.
Moreover, I didn't manage to find an example of GET /alias1/_aliases in the documentation of nor 5.6, neither 6.0.
Reproducing the bug
After having realised that OP is actually using the Java API, I managed to reproduce the exact same behavior.
The following code:
GetAliasesResponse alias1 = client.admin().indices()
.getAliases(new GetAliasesRequest("alias1")).actionGet();
In ES 5 produces this in the IntelliJ debugger:
And for ES 6 I've got the following:
As you can see, there are extra keys in the second output, which have empty values.
Diving into the source code
Quick search over the elasticsearch codebase gave me the final explanation. In ES 5 there was a test testIndicesGetAliases which was checking that list of indices returned for a test alias has exactly one element (IndexAliasesIT.java#L554):
logger.info("--> getting alias1");
GetAliasesResponse getResponse = admin().indices().prepareGetAliases("alias1").get();
assertThat(getResponse, notNullValue());
assertThat(getResponse.getAliases().size(), equalTo(1));
And in 6.0 it checks that the size is 5! (IndexAliasesIT.java#L573)
logger.info("--> getting alias1");
GetAliasesResponse getResponse = admin().indices().prepareGetAliases("alias1").get();
assertThat(getResponse, notNullValue());
assertThat(getResponse.getAliases().size(), equalTo(5));
This change was introduced in this commit, which is related to these issues:
Remove comma-separated feature parsing for GetIndicesAction #24723
_alias API no longer accepts index wildcards #25090
This is actually interesting, because one of the reported REST API breaking changes that we have see above also broke compatibility of some Java API calls.
What you can do
In short term, you need just to filter out those keys with empty values.
In longer term I think it makes sense to migrate to Java High Level REST Client since Elastic plan to deprecate the TransportClient in version 7.0:
We plan on deprecating the TransportClient in Elasticsearch 7.0 and
removing it completely in 8.0. Instead, you should be using the Java
High Level REST Client, which executes HTTP requests rather than
serialized Java requests.
In general, Elasticsearch breaks compatibility quite often, so it's better to stay away from the dark corners of it like Java API.
Thanks for reading.
Hope that helps!

Resources