Elastic: mismatched input '-' - elasticsearch

I'm running this query and getting a "mismatched input '-'" error, and I can't figure out what's wrong with it:
POST /_sql/translate
{
  "query": "SELECT duracao, elementoRede, hostname, ingest_time FROM reta-* WHERE duracao > 0 ORDER BY duracao DESC LIMIT 10",
  "fetch_size": 100
}

You need to escape the special character (the hyphen in the index name) by quoting it. Try this:
POST /_sql/translate
{
  "query": "SELECT duracao, elementoRede, hostname, ingest_time FROM \"reta-*\" WHERE duracao > 0 ORDER BY duracao DESC LIMIT 10",
  "fetch_size": 100
}

Related

This query works well in MySQL but in Postgres it gives an error

This query worked fine when I was using MySQL; now that we've migrated to Postgres, it's giving an error. Where is the problem?
public function scopeClosestTo(\Illuminate\Database\Eloquent\Builder $query, array $coord = null)
{
    if ($coord && isset($coord['longitude'], $coord['latitude'])) {
        return $query->select([
                '*',
                'distance' => DB::table('offers as offers_temp')->selectRaw(
                    'ST_Distance_Sphere(point(`longitude`, `latitude`), point(?, ?)) AS distance',
                    [$coord['longitude'], $coord['latitude']]
                )->whereColumn('offers_temp.id', 'offers.id')
            ])
            ->withCount(['favoriteOffers'])
            ->where('published', '=', true)
            ->where('active', '=', true)
            ->whereNotNull('longitude')
            ->whereNotNull('latitude')
            ->whereDate('expires_at', '>', \Carbon\Carbon::now())
            ->orWhereNull('expires_at')
            ->orderBy('distance');
    }

    return $query;
}
Error:
"SQLSTATE[42601]: Syntax error: 7 ERROR: syntax error at or near ","\nLINE 1: ...ct , (select ST_Distance_Sphere(point(longitude, latitud...\n ^ (SQL: select *, (select ST_Distance_Sphere(point(longitude, latitude`), point(-43.3722344, -22.7867144)) AS distance from "offers" as "offers_temp" where "offers_temp"."id" = "offers"."id") as "distance", (select count() from "favorite_offers" where "offers"."id" = "favorite_offers"."offer_id" and "favorite_offers"."deleted_at" is null) as "favorite_offers_count" from "offers" where (("published" = 1 and "active" = 1 and "longitude" is not null and "latitude" is not null and "expires_at"::date > 2022-03-28 or "expires_at" is null) and "longitude" is not null and "latitude" is not null and exists (select * from "offer_categories" inner join "offers_offer_categories" on "offer_categories"."id" = "offers_offer_categories"."offer_category_id" where "offers"."id" = "offers_offer_categories"."offer_id" and "offers_offer_categories"."offer_category_id" in (1) and "offer_categories"."deleted_at" is null) and "to_companies" = 0 and "published" = 1 and "active" = 1 and "expires_at"::date > 2022-03-28 or "expires_at" is null) and "offers"."deleted_at" is null order by "distance" asc limit 15 offset 0)"
Your query uses backticks to escape column names, which works in MySQL. However, PostgreSQL uses double quotes to escape column names.
Change
point(`longitude`, `latitude`)
To
point("longitude", "latitude")
However, the words longitude and latitude are not reserved words in postgres, so there should be no reason you need to quote them.
See this article on the PostgreSQL wiki for more about moving from MySQL to PostgreSQL.

Painless Scripting Kibana 6.4.2 not matching using matcher, but matches using expression conditional

Hello, I'm trying to take a substring of a log message using regex in Kibana scripted fields. I've run into an interesting scenario that doesn't add up. I converted the message field to a keyword so I could do scripted-field operations on it.
When I match with a conditional such as:
if (doc['message'].value =~ /(\b(?:\d{1,3}\.){3}\d{1,3}\b)/) {
    return "match"
} else {
    return "no match"
}
This matches the IP and correctly returns that there is an IP in the message. However, whenever I try to use the matcher function, which splits the matched text into substrings, it doesn't find any matches.
I'm following the guide from Elastic's documentation for doing this, located here:
https://www.elastic.co/blog/using-painless-kibana-scripted-fields
This is the example script they give to match the first octet of an IP in a log message. However, it returns no matches even though there are IP addresses in the log messages. I can't even match plain text characters; no matter what I do, it returns 0 matches.
I have also enabled regex in elasticsearch.yml on my cluster.
def m = /^([0-9]+)\..*$/.matcher(doc['message'].value);
if (m.matches()) {
    return Integer.parseInt(m.group(1))
} else {
    return m.matches() + " - " + doc['message'].value;
}
This returns 0 matches. Even if I use the same expression used for the conditional:
/(\b(?:\d{1,3}\.){3}\d{1,3}\b)/
the matcher will still return false.
Any idea what I'm doing wrong here? According to the documentation this should work.
I tried using substrings when the value exists in the if conditional, but there are too many variations between the log messages. I also don't see a way to split and look through the list of outputs to pick the one with the IP if I just use a conditional for the scripted field.
Any idea on how to solve this?
Here is an example of what is returned from:
def m = /^([0-9]+)\..*$/.matcher(doc['message'].value);
if (m.matches()) {
    return Integer.parseInt(m.group(1))
} else {
    return m.matches() + " - " + doc['message'].value;
}
The funny part is they all return false, and this is essentially just looking for numbers followed by a dot; I've tried all kinds of regex combinations with no luck.
[
  {
    "_id": "VRYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - #Version: 1.0"
    ]
  },
  {
    "_id": "VhYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - 2019-02-17 00:34:11 127.0.0.1 GET /status/web - 8611 - 127.0.0.1 ELB-HealthChecker/2.0 - 200 0 0 31"
    ]
  },
  {
    "_id": "VxYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - #Software: Microsoft Internet Information Services 10.0"
    ]
  },
  {
    "_id": "WBYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - #Date: 2019-03-26 00:00:08"
    ]
  },
  {
    "_id": "WRYK_2kB0_nHZ_3qyRwt",
    "Source-IP": [
      "false - … 127.0.0.1 ELB-HealthChecker/2.0 - 200 0 0 15"
    ]
  }
]
The solution ended up being the following. The key difference is that matches() requires the regex to match the entire string, whereas find() looks for the pattern anywhere in it:
if (doc["message"].value != null) {
def m = /(\b(?:\d{1,3}\.){3}\d{1,3}\b)/.matcher(doc["message"].value);
if (m.find()) { return m.group(1) }
else { return "no match" }
}
else { return "NULL"}

Elasticsearch: scroll between specified time frame

I have some data in Elasticsearch, as shown in the image. I used the example from the gist below to do the scrolling:
https://gist.github.com/drorata/146ce50807d16fd4a6aa
page = es.search(
    index=INDEX_NAME,
    scroll='1m',
    size=1000,
    body={"query": {"match_all": {}}})
sid = page['_scroll_id']
scroll_size = page['hits']['total']
count = 0

# Start scrolling
print("Scrolling...")
while scroll_size > 0:
    count += 1
    print("Page: ", count)
    page = es.scroll(scroll_id=sid, scroll='10m')
    # Update the scroll ID
    sid = page['_scroll_id']
    # Update the scroll size so the loop ends when no hits are left
    scroll_size = len(page['hits']['hits'])
    for hit in page['hits']['hits']:
        pass  # some code processing here
Currently my requirement is that I want to scroll, but I also want to specify a start timestamp and an end timestamp.
I need help with how to do this using scroll.
Simply replace
body={"query": {"match_all": {}}})
with
body={"query": {"range": {"timestamp": {"gte": "2018-08-05T05:30:00Z", "lte": "2018-08-06T05:30:00Z"}}}})
Example code is below; the time range should be part of the ES query. Also, you should process the first query result (the initial search response), not only the pages returned by scroll.
es_query_dict = {"query": {"range": {"timestamp": {
    "gte": "2018-08-01T00:00:00Z", "lte": "2018-08-17T00:00:00Z"}}}}

def get_es_logs():
    es_client = Elasticsearch([source_es_ip], port=9200, timeout=300)
    total_docs = 0
    page = es_client.search(scroll=scroll_time,
                            size=scroll_size,
                            body=json.dumps(es_query_dict))
    while True:
        sid = page['_scroll_id']
        details = page["hits"]["hits"]
        doc_count = len(details)
        if doc_count > 0:
            total_docs += doc_count
            print("scroll size: " + str(doc_count))
            print("start bulk index docs")
            # index_bulk(details)
            print("end success")
        else:
            break
        page = es_client.scroll(scroll_id=sid, scroll=scroll_time)
    print("total docs: " + str(total_docs))
Also have a look at elasticsearch.helpers.scan, which already implements the scroll loop for you; just pass it query={"query": {"range": {"timestamp": {"gt": ..., "lt": ...}}}}, for example:
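A minimal sketch of that approach (the index name is an assumption; the timestamp field and range follow the examples above):

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()
INDEX_NAME = "my-index"  # assumption: replace with your actual index name

query = {"query": {"range": {"timestamp": {
    "gte": "2018-08-05T05:30:00Z",
    "lte": "2018-08-06T05:30:00Z"}}}}

# helpers.scan handles the scroll_id bookkeeping and keeps yielding hits
# until the scroll is exhausted.
for hit in helpers.scan(es, index=INDEX_NAME, query=query, scroll='10m'):
    print(hit["_source"])  # process each document here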

Elasticsearch update script - 'noop' flushes entire script

I want to update 2 fields in a document in a single update request, using an inline painless script:
{
  "script": {
    "inline": "ctx._source.counter1 ++ ; ctx._source.counter2 == 0 ? ctx.op = 'noop' : ctx._source.counter2 ++"
  }
}
The problem is that if the condition is met and ctx.op = 'noop' is set, then the first part of the script (ctx._source.counter1++;) is also not applied.
How would you recommend I do this?
I could split the operation into two update requests, which would double my DB calls (but maybe a 'noop' call is extremely fast).
I also tried to swap the two parts of the script (the conditional first, the increment second), but then I get a compilation error:
"script_stack": [
" ctx._source. ...",
" ^---- HERE"
],
"script": " ctx._source.counter2 > 0 ? ctx.op = 'noop' : ctx._source.counter2++ ; ctx._source.counter1++ ",
"lang": "painless",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Not a statement."
}
Any ideas?
Thanks
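One possible workaround, sketched below with the Python client (the index, document id, and field names are assumptions, and "source" may need to be "inline" on older Elasticsearch versions): setting ctx.op = 'noop' discards every change the script made, so instead of a noop you can guard only the second increment with an if statement, which leaves counter1 always updated:

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Painless: always bump counter1, bump counter2 only when it is non-zero.
script = """
ctx._source.counter1++;
if (ctx._source.counter2 != 0) {
    ctx._source.counter2++;
}
"""

es.update(index="my-index", id="1", body={
    "script": {"lang": "painless", "source": script}
})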

Querying a parameter that’s not an index on DynamoDb

TableName : people
id | name | age | location
id_1 | A | 23 | New Zealand
id_2 | B | 12 | India
id_3 | C | 26 | Singapore
id_4 | D | 30 | Turkey
keys: id -> hash and age->range
Question 1
I'm trying to execute a query: "Select * from people where age > 25"
I can get queries like "Select age from people where id = id_1 and age > 25" to work, which is not what I need; I just want to select all matching records.
And if I don't need age to be a range index, how should I modify my query params to just return the list of records matching the criterion age > 25?
Question 2
AWS throws an error when either the KeyConditionExpression line or the KeyConditions block in the code below is commented out:
: Query Error: ValidationException: Either the KeyConditions or KeyConditionExpression parameter must be specified in the request.
status code: 400, request id: []
Is the KeyConditions/KeyConditionExpression parameter required? Does that mean I cannot query the table on an attribute that's not part of the index?
func queryDynamo() {
	log.Println("Enter queryDynamo")
	svc := dynamodb.New(nil)

	params := &dynamodb.QueryInput{
		TableName: aws.String("people"), // Required
		Limit:     aws.Long(3),
		// IndexName: aws.String("localSecondaryIndex"),
		ExpressionAttributeValues: map[string]*dynamodb.AttributeValue{
			":v_age": { // Required
				N: aws.String("25"),
			},
			":v_ID": {
				S: aws.String("NULL"),
			},
		},
		FilterExpression: aws.String("age >= :v_age"),
		// KeyConditionExpression: aws.String("id = :v_ID and age >= :v_age"),
		KeyConditions: map[string]*dynamodb.Condition{
			"age": { // Required
				ComparisonOperator: aws.String("GT"), // Required
				AttributeValueList: []*dynamodb.AttributeValue{
					{ // Required
						N: aws.String("25"),
					},
					// More values...
				},
			},
			"id": { // Required
				ComparisonOperator: aws.String("EQ"), // Required
				// AttributeValueList: []*dynamodb.AttributeValue{
				//	S: aws.String("NOT_NULL"),
				// },
			},
			// More values...
		},
		Select:           aws.String("ALL_ATTRIBUTES"),
		ScanIndexForward: aws.Boolean(true),
	}

	// Get the response and print it out.
	resp, err := svc.Query(params)
	if err != nil {
		log.Println("Query Error: ", err.Error())
	}

	// Pretty-print the response data.
	log.Println(awsutil.StringValue(resp))
}
DynamoDB is a NoSQL based system so you will not be able to retrieve all of the records based on a condition on a non-indexed field without doing a table scan.
A table scan will cause DynamoDB to go through every single record in the table, which for a big table will be very expensive in either time (it is slow) or money (provisioned read IOPS).
Using a filter is the correct approach and will allow the operation to complete if you switch from a query to a scan. A query must always specify the hash key.
A word of warning though: if you plan on using a scan operation on a table of more than just a few (less than 100) items that is exposed in a front end, you will be disappointed with the results. If this is some type of cron job or backend reporting task where response time doesn't matter, this is an acceptable approach, but be careful not to exhaust all of your IOPS and impact front-end applications.
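A minimal sketch of that scan-plus-filter approach, shown with Python's boto3 rather than the Go SDK from the question (the table and attribute names follow the example above; everything else is an assumption):

import boto3

dynamodb = boto3.client("dynamodb")

scan_kwargs = {
    "TableName": "people",
    # Filter on the non-key attribute; the filter is applied after items are read,
    # so the scan still consumes read capacity for every item it touches.
    "FilterExpression": "age > :v_age",
    "ExpressionAttributeValues": {":v_age": {"N": "25"}},
}

items = []
while True:
    response = dynamodb.scan(**scan_kwargs)
    items.extend(response.get("Items", []))
    # Scan results are paginated; keep going until the whole table has been read.
    if "LastEvaluatedKey" not in response:
        break
    scan_kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]

print(items)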

Resources