I'm not using a bulk index request in my code.
Below is the Jest code used for indexing the document:
Index.Builder builder = new Index.Builder(source).index(indexName).type(typeName)
        .id(optionalDocId == null ? UUID.randomUUID().toString() : optionalDocId);
The datatype of source is Map.
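For context, the action is built and executed roughly like this (a minimal sketch; the jestClient variable and the surrounding error handling are assumptions, not the original code):

Index indexAction = builder.build();
DocumentResult result = jestClient.execute(indexAction); // synchronous single-document index
if (!result.isSucceeded()) {
    // The rejection below surfaces here as the result's error message.
    throw new Exception("ES Error Message while writing to raw: " + result.getErrorMessage());
}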
Caused by: java.lang.Exception: ES Error Message while writing to raw: {"root_cause":[{"type":"remote_transport_exception","reason":"[datanodes-364022667-13-482069107.azure-ebf.opsds.cp.prod-az-westus-1.prod.us.walmart.net][10.12.10.171:9300][indices:data/write/bulk[s]]"}],"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [557850852][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[sceventstorev1][15]] containing [index {[SCEventStore][scevents][93adca0b-7404-4405-8f72-9fa5e32a167c], source[n/a, actual length: [2.1kb], max length: 2kb]}], target allocation id: iTbBHe7vT_ihTHdJwqVRhA, primary term: 7 on EsThreadPoolExecutor[name = datanodes-364022667-13-482069107.azure-ebf.opsds.cp.prod-az-westus-1.prod.us.walmart.net/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor#4d2bc57e[Running, pool size = 8, active threads = 8, queued tasks = 200, completed tasks = 148755948]]"}
Here is the index mapping:
{
"sceventstore_v1": {
"mappings": {
"scevents": {
"properties": {
"eventID": {
"type": "keyword"
},
"eventId": {
"type": "keyword"
},
"eventName": {
"type": "keyword"
},
"message": {
"type": "text"
},
"producerName": {
"type": "keyword"
},
"receivedTimestamp": {
"type": "date",
"format": "epoch_millis"
},
"timestamp": {
"type": "date",
"format": "epoch_millis"
}
}
}
}
}
}
This section of the error message explains the cause:
source[n/a, actual length: [2.1kb], max length: 2kb]
It means a maximum source length of 2kb is configured somewhere in your application, and you are sending more than that (2.1kb). Inspect your document and see which field value is crossing this limit. Note also that the rejection itself is an es_rejected_execution_exception: the write thread pool's queue (capacity 200) already held 200 queued tasks, so the request was rejected.
More resources can be found at https://www.elastic.co/blog/why-am-i-seeing-bulk-rejections-in-my-elasticsearch-cluster and https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html
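One common mitigation for transient rejections like this (where the write queue drains between spikes) is to retry with exponential backoff instead of failing outright. Below is a minimal sketch of that idea with the Jest client; the retry policy, limits, and helper name are assumptions, not part of the original answer:

import io.searchbox.client.JestClient;
import io.searchbox.core.DocumentResult;
import io.searchbox.core.Index;

public class RetryingIndexer {
    // Retries an index action while the cluster reports es_rejected_execution_exception.
    static DocumentResult indexWithRetry(JestClient client, Index action) throws Exception {
        long backoffMillis = 100;
        for (int attempt = 0; attempt < 5; attempt++) {
            DocumentResult result = client.execute(action);
            if (result.isSucceeded()) {
                return result;
            }
            String error = result.getErrorMessage();
            if (error == null || !error.contains("es_rejected_execution_exception")) {
                throw new Exception("Indexing failed: " + error);
            }
            Thread.sleep(backoffMillis);
            backoffMillis *= 2; // back off before the next attempt
        }
        throw new Exception("Indexing still rejected after retries");
    }
}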
Related
I'm using the go-elasticsearch API in my application to create indices in an Elastic.co cloud cluster. The application dynamically creates an index with a template and then starts indexing documents. The template includes an alias name and looks like this:
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"title": {
"type": "text"
},
"created_at": {
"type": "date"
},
"updated_at": {
"type": "date"
},
"status": {
"type": "keyword"
}
}
},
"aliases": {
"rollout-nodes-f0776f0": {}
}
}
The name of the alias can change, so we pass it to the template when we create a new index. This is done with the Create indices API in Go:
indexTemplate := getIndexTemplate()
res, err := n.client.Indices.Create(
indexName,
n.client.Indices.Create.WithBody(indexTemplate),
n.client.Indices.Create.WithContext(ctx),
n.client.Indices.Create.WithTimeout(time.Second),
)
In my testing, this code works on localhost (without security enabled) but does not work with the cluster on Elastic.co: the index is created, but not the alias.
I think the problem is related to either the API key permissions or some configuration on the server, but I have not yet been able to find which permission I'm missing.
For more context, this is the API Key I'm using:
{
"id": "fakeID",
"name": "index-service-key",
"creation": 1675350573126,
"invalidated": false,
"username": "fakeUser",
"realm": "cloud-saml-kibana",
"metadata": {},
"role_descriptors": {
"logstash_writer": {
"cluster": [
"monitor",
"transport_client",
"read_ccr",
"read_ilm",
"manage_index_templates"
],
"indices": [
{
"names": [
"*"
],
"privileges": [
"all"
],
"allow_restricted_indices": false
}
],
"applications": [],
"run_as": [],
"metadata": {},
"transient_metadata": {
"enabled": true
}
}
}
}
Any ideas? I know I can use the POST _aliases API, but creating the alias at index-creation time should work too.
I'm creating a winston-elasticsearch logger using Kibana, where I need to save everything that happens in the app (requests, API calls, ...). I used this (https://github.com/vanthome/winston-elasticsearch). I defined a JSON mapping template, and also a transformer.js file where I define a transformer function to transform log data as provided by winston into a message structure more appropriate for indexing in ES.
Then I use winston-elasticsearch to create a new ES transport instance:
exports.mappingTemplate = require('../../../index-template-mapping.json');
const TransportersElastic = new Elasticsearch({
index: 'soapserver',
level: 'info',
indexPrefix: 'logs',
transformer: transformer,
ensureMappingTemplate: true,
mappingTemplate: mappingTemplate,
flushInterval: 2000,
waitForActiveShards: 1,
handleExceptions: false,...
But I keep getting Elasticsearch index errors of type 'mapper_parsing_exception', with reasons like 'failed to parse field [fields.result.status] of type [text]', and 'illegal_argument_exception' with reason 'failed to parse field [fields.rows.tmcode] of type [long]'.
Here is transformer.js:
exports.mappingTemplate = require('../../../index-template-mapping.json');
exports.transformer = (logData) => {
const transformed = {};
transformed['#timestamp'] = new Date().toISOString();
transformed.source_host = os.hostname();
transformed.message = logData.message;
if (typeof transformed.message === 'object') {
transformed.message = JSON.stringify(transformed.message);
}
// ... (remaining assignments omitted in the post)
return transformed;
};
I need help or suggestions on how to resolve those errors and get the mapping to succeed.
Here is index-template-mapping.json:
{
"mapping": {
"_doc": {
"properties": {
"#timestamp": {
"type": "date"
},
"fields": {
"ignore_malformed": true,
"dynamic": "true",
"properties": {}
},
"message": {
"type": "text"
},
"severity": {
"type": "keyword"
}
}
}
}
}
Here is the partial mapping of my listing index in elasticsearch 2.5 (I know I have to upgrade to a newer version and start using Painless; let's keep that aside for this question):
"name": { "type": "string" },
"los": {
"type": "nested",
"dynamic": "strict",
"properties": {
"start": { "type": "date", "format": "yyyy-MM" },
"max": { "type": "integer" },
"min": { "type": "integer" }
}
}
I have only one document in my storage and that is as follows:
{
"name": 'foobar',
"los": [{
"max": 12,
"start": "2018-02",
"min": 1
},
{
"max": 8,
"start": "2018-03",
"min": 3
},
{
"max": 10,
"start": "2018-04",
"min": 2
},
{
"max": 12,
"start": "2018-05",
"min": 1
}
]
}
I have a groovy script in my elasticsearch query as follows:
los_map = [doc['los.start'], doc['los.max'], doc['los.min']].transpose()
return los_map.size()
This groovy script ALWAYS returns 0, which should not be possible: I have one document, as mentioned above (and even if I add multiple documents, it still returns 0), and the los field is guaranteed to be present in every doc with multiple objects in it. So it seems the transpose I am doing is not working correctly?
I also tried changing the line los_map = [doc['los.start'], doc['los.max'], doc['los.min']].transpose() to los_map = [doc['los'].start, doc['los'].max, doc['los'].min].transpose(), but then I get the error "No field found for [los] in mapping with types [listing]".
Does anyone have any idea how to get the transpose to work?
By the way, if you are curious, my complete script is as follows:
losMinMap = [:]
losMaxMap = [:]
los_map = [doc['los.start'], doc['los.max'], doc['los.min']].transpose()
los_map.each {st, mx, mn ->
losMinMap[st] = mn
losMaxMap[st] = mx
}
return los_map['2018-05']
Thank you in advance.
I have a managed cluster hosted by elastic.co. Here is the configuration:
Platform: Amazon Web Services
Memory: 4 GB
Storage: 96 GB
SSD: Yes
High availability: Yes (2 data centers)
Each index in this cluster contains the log data of exactly one day. The average index size is 15 MB and the average doc count is 15,000. The cluster is not under any kind of pressure (JVM, indexing & search times, and disk space are all well within the comfort zone).
When I opened a previously closed index, the cluster turned RED. Here are some metrics I found by querying elasticsearch:
GET /_cluster/allocation/explain
{
"index": "some_index_name", # 1 Primary shard , 1 replica shard
"shard": 0,
"primary": true
}
Response :
"unassigned_info": {
"reason": "ALLOCATION_FAILED"
"failed_allocation_attempts": 3,
"details": "failed recovery, failure RecoveryFailedException[[some_index_name][0]: Recovery failed on {instance-*****}{Hash}{HASH}{IP}{IP}{logical_availability_zone=zone-1, availability_zone=***, region=***}]; nested: IndexShardRecoveryException[failed to fetch index version after copying it over]; nested: IndexShardRecoveryException[shard allocated for local recovery (post api), should exist, but doesn't, current files: []]; nested: IndexNotFoundException[no segments* file found in store(mmapfs(/app/data/nodes/0/indices/MFIFAQO2R_ywstzqrfbY4w/0/index)): files: []]; ",
"last_allocation_status": "no_valid_shard_copy"
},
"can_allocate": "no_valid_shard_copy",
"allocate_explanation": "cannot allocate because all found copies of the shard are either stale or corrupt",
"node_allocation_decisions": [
{
"node_name": "instance-***",
"node_decision": "no",
"store": {
"in_sync": false,
"allocation_id": "RANDOM_HASH",
"store_exception": {
"type": "index_not_found_exception",
"reason": "no segments* file found in SimpleFSDirectory#/app/data/nodes/0/indices/RANDOM_HASH/0/index lockFactory=org.apache.lucene.store.NativeFSLockFactory#346e1b99: files: []"
}
}
},
{
"node_name": "instance-***",
"node_attributes": {
"logical_availability_zone": "zone-0",
},
"node_decision": "no",
"store": {
"found": false
}
}
I've tried rerouting the shards to a node, even setting the data-loss flag to true.
POST _cluster/reroute
{
"commands" : [
{"allocate_stale_primary" : {
"index" : "some_index_name", "shard" : 0,
"node" : "instance-***",
"accept_data_loss" : true
}
}
]
}
Response:
"acknowledged": true,
"state": {
"version": 338190,
"state_uuid": "RANDOM_HASH",
"master_node": "RANDOM_HASH",
"blocks": {
"indices": {
"restored_**: {
"4": {
"description": "index closed",
"retryable": false,
"levels": [
"read",
"write"
]
}
},
"restored_**": {
"4": {
"description": "index closed",
"retryable": false,
"levels": [
"read",
"write"
]
}
}
}
},
"routing_table": {
"indices": {
"SOME_INDEX_NAME": {
"shards": {
"0": [
{
"state": "INITIALIZING",
"primary": true,
"relocating_node": null,
"shard": 0,
"index": "SOME_INDEX_NAME",
"recovery_source": {
"type": "EXISTING_STORE"
},
"allocation_id": {
"id": "HASH"
},
"unassigned_info": {
"reason": "ALLOCATION_FAILED",
"failed_attempts": 4,
"delayed": false,
"details": "same as explanation above ^ ",
"allocation_status": "no_valid_shard_copy"
}
},
{
"state": "UNASSIGNED",
"primary": false,
"node": null,
"relocating_node": null,
"shard": 0,
"index": "some_index_name",
"recovery_source": {
"type": "PEER"
},
"unassigned_info": {
"reason": "INDEX_REOPENED",
"delayed": false,
"allocation_status": "no_attempt"
}
}
]
}
},
Any kind of suggestion is welcome. Thanks and regards.
This occurs when the master node is brought down abruptly.
Here are the steps I took to resolve the same issue when I encountered it.
Step 1: Check the allocation
curl -XGET http://localhost:9200/_cat/allocation?v
Step 2: Check the shard stores
curl -XGET http://localhost:9200/_shard_stores?pretty
Look out for "index", "shard" and "node" that has the error that you displayed.
The ERROR should be --> "no segments* file found in SimpleFSDirectory#/...."
Step 3: Now reroute that index as shown below
curl -XPOST 'http://localhost:9200/_cluster/reroute?master_timeout=5m' \
-H 'Content-Type: application/json' \
-d '{ "commands": [ { "allocate_empty_primary": { "index": "IndexFromStep2", "shard": ShardFromStep2, "node": "NodeFromStep2", "accept_data_loss": true } } ] }'
Step 4: Repeat Step2 and Step3 until you see this output.
curl -XGET 'http://localhost:9200/_shard_stores?pretty'
{
"indices" : { }
}
Your cluster should go green soon.
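If you would rather script Step 3 than run curl by hand, here is the same reroute call from Java using the Elasticsearch low-level REST client (a sketch; the host and the IndexFromStep2/NodeFromStep2 placeholders must be filled in from Step 2):

import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class RerouteEmptyPrimary {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200)).build()) {
            Request request = new Request("POST", "/_cluster/reroute");
            request.addParameter("master_timeout", "5m");
            // Same body as the curl command in Step 3.
            request.setJsonEntity("{ \"commands\": [ { \"allocate_empty_primary\": {"
                    + " \"index\": \"IndexFromStep2\", \"shard\": 0,"
                    + " \"node\": \"NodeFromStep2\", \"accept_data_loss\": true } } ] }");
            Response response = client.performRequest(request);
            System.out.println(response.getStatusLine());
        }
    }
}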
I have been stuck on this for a couple of days; any help will be highly appreciated.
How do we actually retrieve the documents from the result object of the Jest client?
I am using the code below:
List<SearchResult.Hit<Part, Void>> hits = searchresult.getHits(Part.class);
for (SearchResult.Hit<Part, Void> hit : hits) {
    Part part = hit.source; // each hit's source is deserialized into the POJO
}
In the Part POJO class I have a variable attributes:
private Object attributes;
Here is the mapping of the attributes field; it is of object type:
"attributes": {
"properties": {
"AttrCatgId": {
"type": "long"
},
"AttrCatgText": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"AttrText": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"AttributeId": {
"type": "long"
}
}
}
Now when I retrieve the documents from the result object, the numeric values come back in double format (.0 is appended by default):
{AttrText=3/4 TON, AttrCatgText=LOAD RATING, AttrCatgId=3.0, AttributeId=11.0}
How do I get the values in this format instead: AttrText=3/4 TON, AttrCatgText=LOAD RATING, AttrCatgId=3, AttributeId=11?
Thanks for your help in advance.
Here is the sample document:
{
"catalogLineCode": "G12",
"supplierId": [
5904
],
"partId": 5278493,
"partGrpName": "Thermostat, Gasket & Housing",
"terminologyName": "Rear Right Wheel Cylinder",
"catalogName": "EPE",
"catId": [
10
],
"perCarQty": 1,
"catalogId": 1,
"partGrpId": 10,
"regionId": 1,
"partNumber": "12T1B",
"attributes": [
{
"AttrText": "3/4 TON",
"AttrCatgText": "LOAD RATING",
"AttrCatgId": 3,
"AttributeId": 11
},
{
"AttrText": "M ENG CODE",
"AttrCatgText": "ENG SOURCE CODE",
"AttrCatgId": 16,
"AttributeId": 111
},
{
"AttrText": "4 WHEEL/ALL WHEEL DRIVE",
"AttrCatgText": "DRIVE TYPE",
"AttrCatgId": 27,
"AttributeId": 168
}
],
"vehicleId": [
5274
],
"extendedDescription": "ALTERNATE TEMP - ALL MAKES-STD LINE~STD CAB - BASE",
"terminologyId": 20