Elasticsearch error resource_already_exists_exception when the index definitely doesn't exist

I use a random index name for new indices:
async import_index(alias_name, mappings, loadFn) {
  const index = `${alias_name}_${+new Date()}`
  console.log('creating new index: ', index)
  await this.esService.indices.create({
    index: index,
    body: {
      "settings": this.index_settings(),
      "mappings": mappings
    }
  }).then(res => {
    console.log('index created: ', index)
  }).catch(async (err) => {
    console.error(alias_name, ": creating new index", JSON.stringify(err.meta, null, 2))
    throw err
  });
  // ... rest of the method (e.g. loadFn) omitted in the question
}
I believe an index with this name cannot already exist, yet ES returns this error:
"error": {
"root_cause": [
{
"type": "resource_already_exists_exception",
"reason": "index [brands_1637707367610/bvY5O_NjTm6mU3nQVx7QiA] already exists",
"index_uuid": "bvY5O_NjTm6mU3nQVx7QiA",
"index": "brands_1637707367610"
}
],
"type": "resource_already_exists_exception",
"reason": "index [brands_1637707367610/bvY5O_NjTm6mU3nQVx7QiA] already exists",
"index_uuid": "bvY5O_NjTm6mU3nQVx7QiA",
"index": "brands_1637707367610"
},
"status": 400
}
ES is installed in Kubernetes using the Bitnami Helm chart, with 3 master nodes running. The client is connected to the master service URL. My theory is that the client sends the request to all nodes at the same time, but I cannot prove it.
Please help.

We experienced the same error with the Python client, but from what I can see the JavaScript client is written in a similar way.
The client has a retry option, which is most likely enabled in your case (it can be reconfigured). You pass a large mapping to this.esService.indices.create, the operation takes too long and times out, a retry is issued, but meanwhile the index has already been created on the cluster.
You need to send a larger timeout to the Elasticsearch create index API (default 30s), and also set the same timeout on the HTTP connection.
These are two separate settings:
Client HTTP connection requestTimeout. Default: 30s
https://github.com/elastic/elasticsearch-js/blob/v7.17.0/lib/Transport.js#L435
https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/7.17/basic-config.html
Server-side timeout via the Create Index API. Default: 30s
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/indices-create-index.html
this.esService = new Client({
  // ....
  maxRetries: 5,
  requestTimeout: 60000 // client-side HTTP timeout in ms
})
// ....
this.esService.indices.create({
  index: index,
  body: {
    "settings": this.index_settings(),
    "mappings": mappings
  },
  timeout: "60s" // server-side Elasticsearch time string
})
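One further option, not from the original answer but a sketch assuming the 7.x JavaScript client's ResponseError shape (err.meta.body.error.type): if you keep retries enabled, the catch handler can treat resource_already_exists_exception as success, since with a timestamp-based name the only way the index can already exist is that an earlier attempt of the same request created it.
.catch(async (err) => {
  // Hypothetical guard: a client retry raced the original create request,
  // so the index already exists and the error can be treated as success.
  if (err.meta && err.meta.body && err.meta.body.error &&
      err.meta.body.error.type === 'resource_already_exists_exception') {
    console.warn('index was already created by an earlier attempt: ', index)
    return
  }
  console.error(alias_name, ": creating new index", JSON.stringify(err.meta, null, 2))
  throw err
});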

Related

AWS Appsync subscription with postman - No Protocol Error

I am using Postman to connect to an AWS AppSync subscription as per https://docs.aws.amazon.com/appsync/latest/devguide/real-time-websocket-client.html
with the below config, but the connection fails with this response:
{ "payload": { "errors": [ { "message": "NoProtocolError", "errorCode": 400 } ] }, "type": "connection_error" }
The problem occurs due to a missing header:
Sec-WebSocket-Protocol: graphql-ws
If you get this error in Unity, use this approach:
using System.Net.WebSockets;

var webSocket = new ClientWebSocket();
webSocket.Options.UseDefaultCredentials = false;
// Register the subprotocol so the Sec-WebSocket-Protocol header is sent
webSocket.Options.AddSubProtocol("graphql-ws");
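In plain JavaScript the same fix is passing the subprotocol as the second argument of the WebSocket constructor. A minimal sketch, assuming a placeholder realtime endpoint; an actual AppSync connection still needs the base64-encoded header/payload query parameters described in the linked docs:
// Hypothetical endpoint; replace with your AppSync realtime URL plus the
// required header/payload query parameters from the AWS docs.
const url = 'wss://example123.appsync-realtime-api.us-east-1.amazonaws.com/graphql'

// The second argument makes the client send Sec-WebSocket-Protocol: graphql-ws
const socket = new WebSocket(url, 'graphql-ws')

socket.onopen = () => socket.send(JSON.stringify({ type: 'connection_init' }))
socket.onmessage = (event) => console.log('received:', event.data)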

How to create a duplicate index in ElasticSearch from existing index?

I have an existing index with mappings and data in ElasticSearch which I need to duplicate for testing new development. Is there any way to create a temporary/duplicate index from the already existing one?
Coming from an SQL background, I am looking at something equivalent to
SELECT *
INTO TestIndex
FROM OriginalIndex
WHERE 1 = 0
I have tried the Clone API but can't get it to work.
I'm trying to clone using:
POST /originalindex/_clone/testindex
{
}
But this results in the following exception:
{
  "error": {
    "root_cause": [
      {
        "type": "invalid_type_name_exception",
        "reason": "Document mapping type name can't start with '_', found: [_clone]"
      }
    ],
    "type": "invalid_type_name_exception",
    "reason": "Document mapping type name can't start with '_', found: [_clone]"
  },
  "status": 400
}
I'm sure someone can point me in the right direction quickly. Thanks in advance, all you wonderful folks.
First you have to set the source index to be read-only:
PUT /originalindex/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}
Then you can clone
POST /originalindex/_clone/testindex
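Note (not part of the original answer): once the clone has completed, you will probably want to remove the write block from the source index again; the cloned index can inherit the block from the source as well, so check testindex too. A sketch:
PUT /originalindex/_settings
{
  "index.blocks.write": null
}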
If you need to copy documents to a new index, you can use the reindex API:
curl -X POST "localhost:9200/_reindex?pretty" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "someindex"
  },
  "dest": {
    "index": "someindex_copy"
  }
}
'
(See: https://wrossmann.medium.com/clone-an-elasticsearch-index-b3e9b295d3e9)
Shortly after posting the question, I figured out a way.
First, get the properties of the original index:
GET originalindex
Copy the properties and PUT them to a new index:
PUT /testindex
{
"aliases": {...from the above GET request},
"mappings": {...from the above GET request},
"settings": {...from the above GET request}
}
Now I have a new index for testing.

elasticsearch search request size limit error

My Elasticsearch 7.9.3 (running on Ubuntu) holds one index per day (logs),
so when a query needs to include, for example, data from 2020-01-01 until 2020-11-20,
the search URL looks like this (which returns error 400):
http://localhost:9200/log_2020-02-14,log_2020-02-26,log_2020-02-27,log_2020-04-24,log_2020-04-25,log_2020-07-17,log_2020-08-01,log_2020-09-09,log_2020-09-21,log_2020-10-06,log_2020-10-07,log_2020-10-08,log_2020-10-16,log_2020-10-17,log_2020-10-18,log_2020-10-21,log_2020-10-22,log_2020-11-12/_search?pretty
I know I can split the request into two, but I don't see why I should (4096 bytes over HTTP is not that big).
Is there any way to configure around this?
response:
{
  "error": {
    "root_cause": [
      {
        "type": "too_long_frame_exception",
        "reason": "An HTTP line is larger than 4096 bytes."
      }
    ],
    "type": "too_long_frame_exception",
    "reason": "An HTTP line is larger than 4096 bytes."
  },
  "status": 400
}
URLs cannot exceed a certain size depending on the medium; Elasticsearch limits the HTTP request line to 4096 bytes by default (http.max_initial_line_length).
Since you seem to want to query all 2020 indices from January 1st until today (Nov 20), you can use a wildcard like this:
http://localhost:9200/log_2020*/_search?pretty
Another way is to leverage aliases and put all your 2020 indices behind a log_2020 alias:
POST /_aliases
{
  "actions": [
    { "add": { "index": "log_2020*", "alias": "log_2020" } }
  ]
}
After running that, you can query the alias directly:
http://localhost:9200/log_2020/_search?pretty
If you want to make sure that all your daily indices get the alias upon creation, you can add an index template:
PUT _index_template/my-logs
{
  "index_patterns": ["log_2020*"],
  "template": {
    "aliases": {
      "log_2020": {}
    }
  }
}
UPDATE
If you need to query between 2020-03-04 and 2020-09-21, you can query the log_2020 alias with a range query on your date field:
POST log_2020/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2020-03-04",
        "lt": "2020-09-22"
      }
    }
  }
}

Why am I getting constant time-out errors with Teams Shifts API? (Microsoft Graph)

When querying a schedule in Teams Shifts that has a sizeable number of shifts (say, 100+ shifts over a month, which is roughly a 30-person team working every day), we are constantly getting 504 errors (gateway timeouts).
We've tried using $top and limiting the number of days we return to reduce the response size, but the Graph API for Shifts is very limited in terms of search and filtering capability.
REQUEST EXAMPLE (FROM MS FLOW, via custom connector).
{
  "inputs": {
    "host": {
      "connection": {
        "name": "#parameters('$connections')['shared_medicus365-5fconnector-5f455cfd1c6d1a89ed-5fce5269b428f1d481']['connectionId']"
      }
    },
    "method": "get",
    "path": "/beta/teams/#{encodeURIComponent(items('fe_team')?['TeamID'])}/schedule/shifts",
    "queries": {
      "$filter": "sharedShift/startDateTime ge #{outputs('composeStartOfDay')} and sharedShift/endDateTime le #{body('6monthsAhead')}",
      "$top": "1000"
    },
    "authentication": "#parameters('$authentication')"
  },
  "metadata": {
    "flowSystemMetadata": {
      "swaggerOperationId": "ListTeamsShifts"
    }
  }
}
We're using exactly the methods described in the Microsoft documentation.
We get a 504 response from Graph (gateway timeout, again) any time the response covers more than a few weeks' worth of shifts.
{
  "error": {
    "code": "UnknownError",
    "message": "",
    "innerError": {
      "request-id": "437ad3be-be70-4bbe-b972-f9e24b588b5c",
      "date": "2019-09-03T19:01:57"
    }
  }
}

S3 throws 500 when trying to restore snapshot to AWS ES 6.2 [cross-region]

I'm trying to restore a v5.3 ES snapshot from S3 to ES 6.2. The snapshot bucket is in us-east-1 and I'm trying to restore it to an Amazon ES cluster in us-west-2. It's a cross-account, cross-region restore operation.
I registered the snapshot repository in the us-west-2 ES cluster as below:
{
  "type": "s3",
  "settings": {
    "bucket": "valid-bucket-name",
    "server_side_encryption": "true",
    "endpoint": "s3.amazonaws.com",
    "region": "us-east-1",
    "role_arn": "valid-role"
  }
}
I got this response:
{
"acknowledged": true
}
But then, when I try to restore a specific snapshot, S3 throws a 301:
{
  "error": {
    "root_cause": [
      {
        "type": "amazon_s3_exception",
        "reason": "amazon_s3_exception: The bucket is in this region: us-east-1. Please use this region to retry the request (Service: Amazon S3; Status Code: 301; Error Code: PermanentRedirect; Request ID: D72EFE8A89F76F57; S3 Extended Request ID: in03KW452re297MDp3GQQRFjJhMRXeP4md+FU99CHZ7D4TQKz8PBuSZKoO3+IFd+wAxNApztG5Y=)"
      }
    ],
    "type": "amazon_s3_exception",
    "reason": "amazon_s3_exception: The bucket is in this region: us-east-1. Please use this region to retry the request (Service: Amazon S3; Status Code: 301; Error Code: PermanentRedirect; Request ID: D72EFE8A89F76F57; S3 Extended Request ID: in03KW452re297MDp3GQQRFjJhMRXeP4md+FU99CHZ7D4TQKz8PBuSZKoO3+IFd+wAxNApztG5Y=)"
  },
  "status": 500
}
The repository is already configured with region us-east-1, so the error message is not helpful.
If I just specify the endpoint, as @Michael-sqlbot suggested, it throws the following error:
{
  "error": {
    "root_cause": [
      {
        "type": "amazon_s3_exception",
        "reason": "amazon_s3_exception: The bucket is in this region: us-east-1. Please use this region to retry the request (Service: Amazon S3; Status Code: 301; Error Code: 301 Moved Permanently; Request ID: CC1853D0EF68B5F7; S3 Extended Request ID: nrbWmI3OiPLdrMcRT6FiOHJineYv6clmSf+GcXtBBwKSzfIEV2gmMZjWEDtyCIRQUg+dM/Vmawg=)"
      }
    ],
    "type": "blob_store_exception",
    "reason": "Failed to check if blob [master.dat-temp] exists",
    "caused_by": {
      "type": "amazon_s3_exception",
      "reason": "amazon_s3_exception: The bucket is in this region: us-east-1. Please use this region to retry the request (Service: Amazon S3; Status Code: 301; Error Code: 301 Moved Permanently; Request ID: CC1853D0EF68B5F7; S3 Extended Request ID: nrbWmI3OiPLdrMcRT6FiOHJineYv6clmSf+GcXtBBwKSzfIEV2gmMZjWEDtyCIRQUg+dM/Vmawg=)"
    }
  },
  "status": 500
}
Update: I can confirm that it's a region/endpoint-related issue with the S3 snapshot plugin. I created another cluster in us-east-1 (the same region as the bucket) and it worked without any issue.
This seems potentially relevant:
Important
If the S3 bucket is in the us-east-1 region, you need to use "endpoint": "s3.amazonaws.com" instead of "region": "us-east-1". (emphasis added)
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html
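Applied to the repository settings from the question, that would mean registering the repository with the endpoint only and no region setting (a sketch; the bucket and role values are the question's placeholders):
{
  "type": "s3",
  "settings": {
    "bucket": "valid-bucket-name",
    "server_side_encryption": "true",
    "endpoint": "s3.amazonaws.com",
    "role_arn": "valid-role"
  }
}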
I can't comment on Michael's answer due to lack of reputation :-)
I tried their suggested workaround to set up a snapshot repo on a new 7.9 cluster in us-west-2 with a bucket located in eu-west-2, and it worked! It doesn't seem to be documented in Amazon's developer guide anymore, though. I'll submit feedback to them.
BTW, if you're trying to migrate data from another cluster, don't forget to add "readonly": true to the settings object. This is just to prevent accidentally writing to that repo.
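For example (a sketch with the same placeholder values as above), the settings object would then look like:
"settings": {
  "bucket": "valid-bucket-name",
  "endpoint": "s3.amazonaws.com",
  "role_arn": "valid-role",
  "readonly": true
}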
