I'm following the example online to import a json.gz wikipedia dump into elasticsearch: https://www.elastic.co/blog/loading-wikipedia.
After executing the following
curl -s 'https://'$site'/w/api.php?action=cirrus-mapping-dump&format=json&formatversion=2' |
jq .content |
sed 's/"index_analyzer"/"analyzer"/' |
sed 's/"position_offset_gap"/"position_increment_gap"/' |
curl -XPUT $es/$index/_mapping/page?pretty -d @-
I get an error:
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "Unknown Similarity type [arrays] for field [category]"
}
],
"type" : "mapper_parsing_exception",
"reason" : "Unknown Similarity type [arrays] for field [category]"
},
"status" : 400
}
Anybody got any ideas? I'm not able to ingest the Wikipedia content using the method described. I wish the company would at least update their tutorial page.
Elasticsearch version: 8.3.3
Indexing was performed using the following Elasticsearch API.
curl -X POST "localhost:9200/bulk_meta/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"index": { "_id": "1"}}
{"mydoc": "index action, id 1 "}
{"index": {}}
{"mydoc": "index action, id 2"}
'
In this case, the following error occurred.
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "failed to parse"
}
],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Malformed content, found extra data after parsing: START_OBJECT"
}
},
"status" : 400
}
I've seen posts asking to add \n, but that didn't help.
You need to remove _doc from the request path.
curl -X POST "localhost:9200/bulk_meta/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"index":{"_id":"1"}}
{"mydoc":"index action, id 1 "}
{"index":{}}
{"mydoc":"index action, id 2"}
'
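The corrected call works because the bulk body is newline-delimited JSON: each action line is followed by an optional source line, and the whole body must end with a newline or Elasticsearch rejects the last operation. A minimal Python sketch (standard library only) that builds the same payload as the curl call above:

```python
import json

# Pairs of (action metadata, document source) mirroring the curl example above.
ops = [
    ({"index": {"_id": "1"}}, {"mydoc": "index action, id 1 "}),
    ({"index": {}}, {"mydoc": "index action, id 2"}),
]

# Each line must be a complete JSON object on its own line, and the body
# must end with a trailing newline.
lines = []
for action, source in ops:
    lines.append(json.dumps(action))
    lines.append(json.dumps(source))
body = "\n".join(lines) + "\n"

print(body, end="")
```

Send this string with Content-Type: application/x-ndjson to POST localhost:9200/bulk_meta/_bulk (via curl --data-binary, urllib, or the official client).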
Hi,
I am trying to modify the date format in an elasticsearch index (operate-operation-0.26.0_). But I get the following error.
{
"took" : 148,
"errors" : true,
"items" : [
{
"index" : {
"_index" : "operate-operation-0.26.0_",
"_type" : "_doc",
"_id" : "WBGhSXcB_hD8-yfn-Rh5",
"status" : 400,
"error" : {
"type" : "strict_dynamic_mapping_exception",
"reason" : "mapping set to strict, dynamic introduction of [dynamic] within [_doc] is not allowed"
}
}
}
]
}
The json file I am using is bulk6.json:
{"index":{}}
{"dynamic":"strict","properties":{"date":{"type":"date","format":"yyyy-MM-dd'T'HH:mm:ss.SSSZZ"}}}
The command I am running is
curl -H "Content-Type: application/x-ndjson" -XPOST 'localhost:9200/operate-operation-0.26.0_/_bulk?pretty&refresh' --data-binary #"bulk6.json"
The _bulk API endpoint is not meant for changing mappings. You need to use the _mapping API endpoint like this:
The JSON file mapping.json should contain:
{
"dynamic": "strict",
"properties": {
"date": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"
}
}
}
And then the call can be made like this:
curl -H "Content-Type: application/json" -XPUT 'localhost:9200/operate-operation-0.26.0_/_mapping?pretty' --data-binary @"mapping.json"
However, this is still not going to work as you're not allowed to change the date format after the index has been created. You're going to get the following error:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Mapper for [date] conflicts with existing mapper:\n\tCannot update parameter [format] from [strict_date_optional_time||epoch_millis] to [yyyy-MM-dd'T'HH:mm:ss.SSSZZ]"
}
],
"type" : "illegal_argument_exception",
"reason" : "Mapper for [date] conflicts with existing mapper:\n\tCannot update parameter [format] from [strict_date_optional_time||epoch_millis] to [yyyy-MM-dd'T'HH:mm:ss.SSSZZ]"
},
"status" : 400
}
You need to create a new index with the desired correct mapping and reindex your data.
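The create-new-index-then-reindex path boils down to two requests: a PUT that creates the replacement index with the desired mapping, and a POST to _reindex that copies the documents over. A hedged sketch of the two request bodies, assuming a made-up target index name operate-operation-0.26.1_:

```python
import json

# Mapping for the replacement index; the date format is the one from the
# question above.
new_index_body = {
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "date": {
                "type": "date",
                "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZZ",
            }
        },
    }
}

# _reindex body copying everything from the old index into the new one.
# The target name "operate-operation-0.26.1_" is only an illustrative choice.
reindex_body = {
    "source": {"index": "operate-operation-0.26.0_"},
    "dest": {"index": "operate-operation-0.26.1_"},
}

print(json.dumps(new_index_body, indent=2))
print(json.dumps(reindex_body, indent=2))
```

PUT the first body to localhost:9200/operate-operation-0.26.1_, then POST the second to localhost:9200/_reindex; once reindexing finishes you can point clients (or an alias) at the new index.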
I'm trying to create a composite aggregation per the documentation here:
https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-aggregations-bucket-composite-aggregation.html
I'm basically following this example:
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"my_buckets": {
"composite" : {
"sources" : [
{ "product": { "terms" : { "field": "product" } } }
]
}
}
}
}
'
but every time I try to run the code I get the below error regardless of which field I try to aggregate on:
{
"error" : {
"root_cause" : [
{
"type" : "unknown_named_object_exception",
"reason" : "Unknown BaseAggregationBuilder [composite]",
"line" : 5,
"col" : 27
}
],
"type" : "unknown_named_object_exception",
"reason" : "Unknown BaseAggregationBuilder [composite]",
"line" : 5,
"col" : 27
},
"status" : 400
}
I did some digging around and haven't seen the error 'Unknown BaseAggregationBuilder [composite]' come up anywhere else, so I thought I'd post this question here to see if anyone has run into a similar issue. Cardinality and regular terms aggregations work fine. Also, to clarify, I'm running on v6.8.
Composite aggregations were released in 6.1.0. The error suggests you're not actually running >= 6.1 but some older version.
What version.number do you get when you run curl -X GET "localhost:9200"?
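The version check above can be automated: GET / returns a JSON body whose version.number field you can compare against the 6.1.0 minimum. A small sketch, using a hard-coded sample response in place of the live call:

```python
import json

# Sample of the kind of body GET localhost:9200 returns (fields trimmed);
# in practice you'd fetch this from the cluster.
info = json.loads('{"version": {"number": "6.0.2"}}')

def supports_composite(version_number: str) -> bool:
    # Composite aggregations shipped in Elasticsearch 6.1.0.
    major, minor, *_ = (int(p) for p in version_number.split("."))
    return (major, minor) >= (6, 1)

print(supports_composite(info["version"]["number"]))  # prints False for 6.0.2
```

If this prints False, the node answering on port 9200 predates composite aggregations, which would explain the Unknown BaseAggregationBuilder error despite the client "being on" 6.8.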
While using Elasticsearch to load a dataset with the following curl command:
curl -H "Content-Type: application/x-ndjson" -XPOST "localhost:9200/shakespeare/doc/_bulk?pretty" --data-binary @$shakespeare_6.0
the following warning is encountered:
Warning: Couldn't read data from file "$shakespeare_6.0", this makes an empty
Warning: POST.
{
"error" : {
"root_cause" : [
{
"type" : "parse_exception",
"reason" : "request body is required"
}
],
"type" : "parse_exception",
"reason" : "request body is required"
},
"status" : 400
}
My data is:
{"index":{"_index":"shakespeare","_id":0}}
{"type":"act","line_id":1,"play_name":"Henry IV", "speech_number":"","line_number":"","speaker":"","text_entry":"ACT I"}
What is the root cause of this warning? I am using 64 bit Windows 10.
Also, please let me know what the different ways are to send data into Elasticsearch? I am a noob.
You provided a wrong file name. The name of that file is shakespeare_6.0.json, not $shakespeare_6.0. This is the correct command:
curl -H "Content-Type: application/x-ndjson" -XPOST "localhost:9200/shakespeare/doc/_bulk?pretty" --data-binary @shakespeare_6.0.json
This assumes that the file is in the current directory.
I have created a mapping eventlog in Elasticsearch 5.1.1. The mapping was added successfully; however, while adding data under it I get an illegal_argument_exception with reason unknown setting [index._id]. Listing the indices gives yellow open eventlog sX9BYIcOQLSKoJQcbn1uxg 5 1 0 0 795b 795b.
My mapping is:
{
"mappings" : {
"_default_" : {
"properties" : {
"datetime" : {"type": "date"},
"ip" : {"type": "ip"},
"country" : { "type" : "keyword" },
"state" : { "type" : "keyword" },
"city" : { "type" : "keyword" }
}
}
}
}
and I am adding the data using
curl -u elastic:changeme -XPUT 'http://localhost:8200/eventlog' -d '{"index":{"_id":1}}
{"datetime":"2016-03-31T12:10:11Z","ip":"100.40.135.29","country":"US","state":"NY","city":"Highland"}';
If I don't include the {"index":{"_id":1}} line, I get Illegal_argument_exception with reason unknown setting [index.apiKey].
The problem was arising with sending the data from the command line as a string. Keeping the data in a JSON file and sending it as binary solved it. The correct command is:
curl -u elastic:changeme -XPUT 'http://localhost:8200/eventlog/_bulk?pretty' --data-binary @eventlogs.json
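Generating the bulk file programmatically avoids shell-quoting problems entirely. A small sketch that writes the eventlogs.json file used above, with the action line and source line the bulk API expects:

```python
import json

# The document from the question above.
doc = {
    "datetime": "2016-03-31T12:10:11Z",
    "ip": "100.40.135.29",
    "country": "US",
    "state": "NY",
    "city": "Highland",
}

# One action line per document, then the document itself; the file must
# end with a newline for the bulk API to accept the last operation.
with open("eventlogs.json", "w") as f:
    f.write(json.dumps({"index": {"_id": 1}}) + "\n")
    f.write(json.dumps(doc) + "\n")
```

The resulting file can then be sent exactly as in the answer, with --data-binary @eventlogs.json, so curl preserves the newlines instead of the shell mangling an inline string.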