Elasticsearch truncate string field in query - elasticsearch

To display recent exceptions on a Grafana dashboard, I am querying for exceptions in log files. Grafana doesn't seem to have an option to limit the length of a string value in its table view, and the stack traces are of course huge.
So I came up with the idea of limiting this field in the Lucene query being used, but I am not sure how to do this. I tried it using a Painless script:
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "message_short": {
      "script": {
        "lang": "painless",
        "inline": "return doc['message'].value.substring(50);"
      }
    }
  }
}
I don't get any error, but there's also no additional field "message_short", which I would have expected. Do I have to enable scripting support somehow? I'm running v6.1.2.

I implemented a workaround with a drilldown URL ("Render value as link" in the Grafana table panel): I render a link to my Kibana instance and use the Grafana variable $__cell, which references the document_id I get from the underlying Elasticsearch query:
https://mykibana.host/app/kibana#/doc/myindex-*/myindex-prod-*/logs?id=$__cell&_g=h#8b5b71a
Not perfect, but it keeps my dashboard readable and allows access to more info if needed. Even better would be to add a shortened field to the ES index, but that is not possible for me currently.
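For reference, a script_fields request along the following lines should produce the shortened field (a sketch only, not verified on 6.1.2, using the myindex-* pattern from the Kibana link above). Two things to note: doc['message'] needs doc values or fielddata, which a plain text field doesn't have, so the sketch reads the value from _source instead; and substring(50) returns everything after the 50th character, so the two-argument form is needed to keep only the first 50. On 6.x the script body also goes under source rather than the deprecated inline key.
GET myindex-*/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "message_short": {
      "script": {
        "lang": "painless",
        "source": "String s = params._source.message; return s == null ? null : s.substring(0, Math.min(50, s.length()));"
      }
    }
  }
}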

Related

Insert data when no match by update_by_query in elastic search

I have this command that doesn't match any data in Elasticsearch, and I want to insert the data when that happens.
//localhost:9200/my_index/my_topic/_update_by_query
{
  "script": {
    "source": "ctx._source.NAME = params.NAME",
    "lang": "painless",
    "params": {
      "NAME": "kevin"
    }
  },
  "query": {
    "terms": {
      "_id": [
        999
      ]
    }
  }
}
I tried using upsert, but it returns the error Unknown key for a START_OBJECT in [upsert].
I don't want to use update + doc_as_upsert because there are cases where I won't send an id in my update query.
How can I insert this with _update_by_query? Thank you.
If Elasticsearch doesn't support this, I think I will check whether I have an id or not, and use the Index API to create and the Update API to update.
_update_by_query runs on existing documents contained in an existing index. What _update_by_query does is scroll over all documents in your index (that optionally match a query) and perform some logic on each of them via a script or an ingest pipeline.
Hence, logically, you cannot create/upsert data that doesn't already exist in the index. The Index API will always overwrite your document. Upsert only works in conjunction with the _update endpoint, which is what you should probably use.
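For example, a minimal sketch of such an upsert through the _update endpoint, reusing the script and id from the question (on 6.x the path includes the type, e.g. my_index/my_topic/999/_update; on 7.x and later it is my_index/_update/999). If document 999 doesn't exist, the upsert document is indexed instead of running the script:
POST my_index/my_topic/999/_update
{
  "script": {
    "source": "ctx._source.NAME = params.NAME",
    "lang": "painless",
    "params": {
      "NAME": "kevin"
    }
  },
  "upsert": {
    "NAME": "kevin"
  }
}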

Filtering documents by an unknown value of a field

I'm trying to create a query to filter my documents by one (it can be any one) value of a field (in my case "host.name"). The point is that I don't know the unique values of this field in advance. I need to find these and choose one to be used in the query.
I tried the query below using a Painless script, but I have not been able to achieve the goal.
{
  "sort": [{"#timestamp": "desc"}, {"host.name": "asc"}],
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
              String k = doc['host.name'][0];
              return doc['host.name'].value == k;
            """,
            "lang": "painless"
          }
        }
      }
    }
  }
}
I'd appreciate it if anyone could help me improve this idea or suggest a new one.
TL;DR you can't.
The script query context operates on one document at a time, so you won't have access to the other docs' field values. You could use a scripted_metric aggregation, which does allow iterating over all docs, but it's just that -- an aggregation -- and not a query.
I'd suggest first running a simple terms agg to figure out what values you're working with and then building your queries accordingly.
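For example, something along these lines (the index name is a placeholder, and host.name is assumed to be a keyword field or to have a keyword sub-field):
GET your-index/_search
{
  "size": 0,
  "aggs": {
    "hostnames": {
      "terms": {
        "field": "host.name",
        "size": 100
      }
    }
  }
}
Pick one of the returned keys and use it in a plain term query or filter instead of the script.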

Return which field got matched in Elastic Search

I am trying to find out what actually got matched for a search, i.e. the specific field for which the doc is returned.
Ex. I have a table index where there are fields called table_name and column_name...
My search query searches both of those fields. Now if I fire a search query and either one of them gets matched, I want to know which one got matched, whether it's column_name or table_name.
I am aware of the Explain API but that will require me to call another API...
You don't need to call the Explain API; the search API supports the explain flag:
GET stackoverflow/_search?explain=true
This will return the _explanation section along with the _source section.
Update
Another solution would be to use highlighting. I've used this before for manually evaluating queries. It's an easy way to get some feedback on what matched:
GET stackoverflow/_search
{
  "query": {
    "match": {
      "FIELD": "TEXT"
    }
  },
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}
Of course, you can have the explain flag set as well.
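For example, combining both in a single request (reusing the placeholder index and field names from above):
GET stackoverflow/_search?explain=true
{
  "query": {
    "match": {
      "FIELD": "TEXT"
    }
  },
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}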

ElasticSearch - Delete documents by specific field

This seemingly simple task is not well-documented in the ElasticSearch documentation:
We have an ElasticSearch instance with an index that has a field in it called sourceId. What API call would I make to first GET all documents with 100 in the sourceId field (to verify the results before deletion), and then to DELETE those same documents?
You probably need to make two API calls here: the first to view the count of documents, the second to perform the deletion.
The query would be the same; however, the endpoints are different. Also, I'm assuming sourceId is of type keyword.
Query to Verify
POST <your_index_name>/_search
{
  "size": 0,
  "query": {
    "term": {
      "sourceId": "100"
    }
  }
}
Execute the above term query and take note of hits.total in the response.
Remove "size": 0 from the above query if you want to view the actual documents in the response.
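Alternatively, if you only need the number of matching documents, the same query body can be sent to the _count API:
POST <your_index_name>/_count
{
  "query": {
    "term": {
      "sourceId": "100"
    }
  }
}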
Once you have the details, you can go ahead and perform the deletion using the same query, as shown below; notice the different endpoint though.
Query to Delete
POST <your_index_name>/_delete_by_query
{
  "query": {
    "term": {
      "sourceId": "100"
    }
  }
}
Once you execute the delete by query, check the deleted field in the response. It should show the same number.
I've used term queries here; however, you can also make use of a match query or any complex bool query. Just make sure the query is correct.
Hope it helps!
Delete all the documents of an index without deleting the mapping and settings:
POST /my_index/_delete_by_query?conflicts=proceed&pretty
{
  "query": {
    "match_all": {}
  }
}
See: https://opster.com/guides/elasticsearch/search-apis/elasticsearch-delete-by-query/

how to log or print python elasticsearch-dsl query that gets invoked

I am using elasticsearch-dsl for my python application to query elastic search.
To debug what query is actually getting generated by the elasticsearch-dsl library, I need to log or print the final query that goes to Elasticsearch, but I am unable to.
For example, I'd like to see the request body sent to Elasticsearch, like this:
{
  "query": {
    "query_string": {
      "query": "Dav*",
      "fields": ["name", "short_code"],
      "analyze_wildcard": true
    }
  }
}
I tried setting the Elasticsearch log level to TRACE. Even then, I was unable to see the queries that got executed.
Take a look at my blog post here, in the "Slowlog settings at index level" section. Basically, you can use the slowlog to print the queries to a separate log file that Elasticsearch generates. I suggest using a very low threshold to be able to see all the queries.
For example, something like this, for a specific index:
PUT /test_index/_settings
{
  "index": {
    "search.slowlog.level": "trace",
    "search.slowlog.threshold.query.trace": "1ms"
  }
}
Or
PUT /_settings
{
  "index": {
    "search.slowlog.level": "trace",
    "search.slowlog.threshold.query.trace": "1ms"
  }
}
as a cluster-wide setting, for all the indices.
The queries will then be logged under your /logs location, in a file called [CLUSTER_NAME]_index_search_slowlog.log.
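As a client-side alternative (a sketch, assuming the 6.x/7.x Python client; host, index, and field names are placeholders): an elasticsearch-dsl Search object exposes the request body via to_dict(), and the low-level elasticsearch-py client logs curl-style requests and responses on the elasticsearch.trace logger, so something like the following prints the query without changing any cluster settings:
import json
import logging

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

# The low-level client logs curl-style requests/responses on this logger
tracer = logging.getLogger("elasticsearch.trace")
tracer.setLevel(logging.DEBUG)
tracer.addHandler(logging.StreamHandler())

client = Elasticsearch(["localhost:9200"])

# Build the same query_string search as in the question
s = Search(using=client, index="my-index").query(
    "query_string",
    query="Dav*",
    fields=["name", "short_code"],
    analyze_wildcard=True,
)

# Print the exact request body elasticsearch-dsl will send
print(json.dumps(s.to_dict(), indent=2))

response = s.execute()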
