Unassigned Shards in Cluster - elasticsearch

I have an ES cluster which is playing up. At one point I had all primary and replica shards correctly assigned to 4 of my 5 nodes, but in trying to get some onto the 5th node I have once again lost my replica shards. Now my primary shards exist only on 3 nodes.
I am trying to get to the bottom of the issue. When I try a forced allocation such as:
{
  "commands": [
    {
      "allocate": {
        "index": "group7to11poc",
        "shard": 7,
        "node": "SPOCNODE1"
      }
    }
  ]
}
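(For context, this body is what gets POSTed to the cluster reroute endpoint; a sketch, assuming the node is reachable on localhost:9200 and the body above is saved as reroute.json:)

curl -XPOST 'localhost:9200/_cluster/reroute' -d @reroute.json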
I get the following response and am having trouble pinpointing the exact problem:
explanations: [
  {
    command: "allocate",
    parameters: {
      index: "group7to11poc",
      shard: 7,
      node: "SPOCNODE5",
      allow_primary: true
    },
    decisions: [
      { decider: "same_shard", decision: "YES", explanation: "shard is not allocated to same node or host" },
      { decider: "filter", decision: "NO", explanation: "node does not match index include filters [_id:\"4rZYPBOGRMK4y9YG6p7E2w\"]" },
      { decider: "replica_after_primary_active", decision: "YES", explanation: "primary is already active" },
      { decider: "throttling", decision: "YES", explanation: "below shard recovery limit of [2]" },
      { decider: "enable", decision: "YES", explanation: "allocation disabling is ignored" },
      { decider: "disable", decision: "YES", explanation: "allocation disabling is ignored" },
      { decider: "awareness", decision: "YES", explanation: "no allocation awareness enabled" },
      { decider: "shards_limit", decision: "YES", explanation: "total shard limit disabled: [-1] <= 0" },
      { decider: "node_version", decision: "YES", explanation: "target node version [1.3.2] is same or newer than source node version [1.3.2]" },
      { decider: "disk_threshold", decision: "YES", explanation: "disk usages unavailable" },
      { decider: "snapshot_in_progress", decision: "YES", explanation: "shard not primary or relocation disabled" }
    ]
  }
]

Finally sorted this. The filter decider's NO above was the culprit: somehow the index had gotten an allocation filter applied to it, which prevented shards from being allocated or moved.
I removed the filter and the cluster began behaving:
curl -XPUT 'localhost:9200/test/_settings' -d '{
  "index.routing.allocation.include._id": ""
}'
This sets the _id filter to empty. It was previously populated with a node ID that no node could ever match!
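To double-check that the filter is really gone, the settings can be read back (assuming the same host and index as above):

curl -XGET 'localhost:9200/test/_settings?pretty'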

Related

Sort a list of dictionaries by a specific key within a nested list of dictionaries

I have a list of dictionaries, with each entry having the following structure:
{
  "id": 0,
  "type": "notification",
  "name": "jane doe",
  "loc": {
    "lat": 38.8239,
    "long": 104.7001
  },
  "data": [
    {
      "type": "test",
      "time": "Fri Aug 13 09:17:16 2021",
      "df": 80000000,
      "db": 1000000,
      "tp": 92
    },
    {
      "type": "real",
      "time": "Sat Aug 14 09:21:30 2021",
      "df": 70000000,
      "db": 2000000,
      "tp": 97
    }
  ]
}
I need to be able to sort this list by any of these keys: name, type, time, tp, and return it in memory.
I understand how to sort by top-level keys, e.g. sorted(json_list, key=lambda k: k['name']), or even nested keys, e.g. by lat: sorted(json_list, key=lambda k: k['loc']['lat']).
So currently I have a function that works for the case of sorting by name:
def sort_by(self, param, rev=False):
    if param == NAME:
        self.json_list = sorted(self.json_list, key=lambda k: k[param], reverse=rev)
    else:
        pass  # need help here
I'm having trouble sorting by type, time, and tp. Notice the data key is also a list of dictionaries. I would like to leverage existing methods built into the standard library if possible. I can provide more clarification if necessary.
Update:
def sort_by(self, param, rev=False):
    if param == NAME:
        self.json_list = sorted(self.json_list, key=lambda k: k[param], reverse=rev)
    else:
        self.json_list = sorted(self.json_list, key=lambda k: k['data'][0][param], reverse=rev)
    return self.json_list
This works fine only if there is a single item in the data list.
If json_list[i]['data'] (for each i) contains only one dict, then the following should work; otherwise modifications are required:
sorted(json_list, key=lambda k: (
    k['name'], k['data'][0]['type'], k['data'][0]['time'], k['data'][0]['tp']
))
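If data can hold several dicts, the question doesn't say which entry should drive the ordering, so any rule here is an assumption. A minimal sketch that follows the asker's update (the first data entry wins) and additionally parses the time strings so they sort chronologically rather than alphabetically:

from datetime import datetime

def sort_by(json_list, param, rev=False):
    # Top-level keys sort directly; anything else is read from the first
    # dict in each entry's data list (an assumption: the question doesn't
    # say which data entry should drive the ordering).
    if param in ('id', 'name'):
        return sorted(json_list, key=lambda k: k[param], reverse=rev)

    def data_key(entry):
        value = entry['data'][0][param]
        if param == 'time':
            # "Fri Aug 13 09:17:16 2021" is ctime-style; parse it so the
            # list sorts chronologically rather than alphabetically.
            return datetime.strptime(value, '%a %b %d %H:%M:%S %Y')
        return value

    return sorted(json_list, key=data_key, reverse=rev)

Picking a different convention, such as ranking each record by the minimum value across all its data entries, is a one-line change inside data_key.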

TextMate Grammar - Problem with `end` expression

I'm having huge problems with the end portion of a regex in TextMate:
It looks like the end match becomes part of the text captured between begin and end.
Trying to apply multiple endings with one negative lookbehind proves unsuccessful.
Here is an example code:
property_name: {
test1: [1, 50, 5000]
test2: something ;;
test3: [
1,
50,
5000
]
test4: "string"
test5: [
"text",
"text2"
]
test6: something2
test7: something3
}
I'm using the following code:
"begin": "\\b([a-z_]+):",
"beginCaptures": {
"1": {
"name" : "parameter.name"
}
}
"end": "(?<!,)\\n(?!\\])",
"patterns": [
{
"name": "parameter.value",
"match": "(.+)"
}
]
My logic for the end regular expression is to consider the value ended when there's a newline, but only if it's not preceded by a comma (a value list inside an array) or followed by a closing square bracket (the last value in an array).
Unfortunately it's not working as expected.
What I would like to achieve is that property_name and all the test# keys are matched as parameter.name and the values are matched as parameter.value, apart from ;;.
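One thing worth checking (an observation, not a confirmed fix): TextMate-style engines generally tokenize one line at a time, so an end pattern built around \n may never see the newline at all. To at least confirm what the regex itself would do against the raw text, here is a quick sanity check using Python's re, which supports the same lookaround constructs:

import re

# Excerpt of the sample being highlighted
sample = """property_name: {
test1: [1, 50, 5000]
test2: something ;;
test3: [
1,
50,
5000
]
test4: "string"
}"""

# The grammar's end pattern: a newline not preceded by ',' and not followed by ']'
end = re.compile(r'(?<!,)\n(?!\])')
for m in end.finditer(sample):
    line_start = sample.rfind('\n', 0, m.start()) + 1
    print('end would fire after:', sample[line_start:m.start()])

Run against the excerpt, the pattern also fires right after test3: [ , meaning the scope would close before the multi-line array body, which is one way the match can end earlier than intended.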

Painless Scripting Kibana 6.4.2 not matching using matcher, but matches using expression conditional

Hello, I'm trying to take a substring of a log message using regex in Kibana scripted fields. I've run into an interesting scenario that doesn't add up. I converted the message field to a keyword so I could do scripted-field operations on it.
When I match with a conditional such as:
if (doc['message'].value =~ /(\b(?:\d{1,3}\.){3}\d{1,3}\b)/) {
    return "match"
} else {
    return "no match"
}
This correctly detects that there is an IP in the message. However, whenever I try the matcher function, which splits the matched text into substrings, it doesn't find any matches.
Following the guide on Elastic's documentation for doing this located here:
https://www.elastic.co/blog/using-painless-kibana-scripted-fields
This is the example script they give to match the first octet of an IP in a log message. However, it returns no matches even though there are IP addresses in the log messages. I can't even match plain text characters; no matter what I do it returns 0 matches.
I have enabled regex in elasticsearch.yml on my cluster as well.
def m = /^([0-9]+)\..*$/.matcher(doc['message'].value);
if (m.matches()) {
    return Integer.parseInt(m.group(1))
} else {
    return m.matches() + " - " + doc['message'].value;
}
This returns 0 matches. Even if I use the same expression used for the conditional:
/(\b(?:\d{1,3}\.){3}\d{1,3}\b)/
the matcher will still return false.
Any idea what I'm doing wrong here? According to the documentation this should work.
I tried using substrings when the value exists in the if conditional, but there are too many variations between the log messages. I also don't see a way to split and look through the list of outputs to pick the one with the IP if I just use a conditional for the scripted field.
Any idea on how to solve this?
Here is an example of what is returned from:
def m = /^([0-9]+)\..*$/.matcher(doc['message'].value);
if (m.matches()) {
    return Integer.parseInt(m.group(1))
} else {
    return m.matches() + " - " + doc['message'].value;
}
The funny part is they all return false, and this is essentially just looking for numbers followed by a dot. I've tried all kinds of regex combinations with no luck.
[
  {
    "_id": "VRYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - #Version: 1.0"
    ]
  },
  {
    "_id": "VhYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - 2019-02-17 00:34:11 127.0.0.1 GET /status/web - 8611 - 127.0.0.1 ELB-HealthChecker/2.0 - 200 0 0 31"
    ]
  },
  {
    "_id": "VxYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - #Software: Microsoft Internet Information Services 10.0"
    ]
  },
  {
    "_id": "WBYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - #Date: 2019-03-26 00:00:08"
    ]
  },
  {
    "_id": "WRYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      127.0.0.1 ELB-HealthChecker/2.0 - 200 0 0 15"
    ]
  }
]
The solution ended up being the following:
if (doc["message"].value != null) {
    def m = /(\b(?:\d{1,3}\.){3}\d{1,3}\b)/.matcher(doc["message"].value);
    if (m.find()) {
        return m.group(1)
    } else {
        return "no match"
    }
} else {
    return "NULL"
}
The key difference is that matches() only succeeds when the pattern consumes the entire string, while find() looks for the pattern anywhere within it; since these log lines have text before and after the IP, matches() always returned false.

Elasticsearch update script - 'noop' flushes entire script

I want to update 2 fields in a document in a single update request, using an inline painless script:
{
  "script": {
    "inline": "ctx._source.counter1++; ctx._source.counter2 == 0 ? ctx.op = 'noop' : ctx._source.counter2++"
  }
}
The problem is that if the condition is met and ctx.op = 'noop' is set, then the first part of the script (ctx._source.counter1++;) is also discarded.
How would you recommend I do this?
I could split the operation into 2 update requests, which would double my DB calls (though maybe a 'noop' call is extremely fast).
I also tried to swap the 2 parts of the script (the conditional first, the increment second), but then I get a compilation error:
"script_stack": [
" ctx._source. ...",
" ^---- HERE"
],
"script": " ctx._source.counter2 > 0 ? ctx.op = 'noop' : ctx._source.counter2++ ; ctx._source.counter1++ ",
"lang": "painless",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Not a statement."
}
Any ideas?
Thanks
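For what it's worth, ctx.op = 'noop' tells Elasticsearch to skip writing the document entirely, so every _source change the script made is thrown away, not just the branch that set it. A minimal sketch of a single-request alternative, assuming the intent is to always increment counter1 and to bump counter2 only when it is non-zero (no noop needed):

{
  "script": {
    "inline": "ctx._source.counter1++; if (ctx._source.counter2 != 0) { ctx._source.counter2++ }"
  }
}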

ElasticSearch all shards remain unassigned (with routing.allocation set to all)

Why are none of my shards being assigned? (ES 2.3)
Create index:
curl -XPUT 'host:9200/entities?pretty' -d '{
  "mappings": {
    x
  },
  "settings": {
    "index": {
      "number_of_shards": 6,
      "number_of_replicas": 1
    }
  }
}'
Cluster Settings:
GET 'host:9200/_cluster/settings?pretty'
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "enable": "all"
        }
      }
    }
  },
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "enable": "all"
        }
      }
    }
  }
}
Cluster:
host master
node3 m
node2 m
node1 *
Shards
GET 'host:9200/_cat/shards?v'
index shard prirep state docs store ip node
entities 5 p UNASSIGNED
entities 5 r UNASSIGNED
entities 1 p UNASSIGNED
entities 1 r UNASSIGNED
entities 4 p UNASSIGNED
entities 4 r UNASSIGNED
entities 2 p UNASSIGNED
entities 2 r UNASSIGNED
entities 3 p UNASSIGNED
entities 3 r UNASSIGNED
entities 0 p UNASSIGNED
entities 0 r UNASSIGNED
I'm able to assign shards directly through the routing API, but that doesn't seem to be the way to go.
If I set up the cluster differently, with 1 master node and 2 data nodes, the problem doesn't occur.
Turns out I had misinterpreted the node.master and node.data settings. I thought a node had to be either one or the other.
I set all three nodes to node.master: true and node.data: true, and now it's working like a charm.
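For reference, the relevant elasticsearch.yml lines on each of the three nodes would look like this (a sketch; in ES 2.x both settings actually default to true, so writing them out just makes the intent explicit):

# elasticsearch.yml on all three nodes
node.master: true
node.data: true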
