How to check Elasticsearch query performance?

I need to check Elasticsearch query performance, but because of caching I am unable to measure the actual query time. Is there any way to disable caching?
I tried _cache/clear as suggested in the document below.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-clearcache.html
$ curl -XPOST 'http://localhost:9200/_cache/clear'
I also tried setting index.cache.filter.type to none in elasticsearch.yml:
index.cache.filter.type : none
I am using Sense to run the Elasticsearch queries.
Is there any other way to do this?

Maybe restart your Elasticsearch cluster, then run some queries that hit more or less the same data but are not the actual query you want to test, and then run the query you want to test.
I have also noticed that the first query you run against a restarted cluster is slow, but after that everything tends to be fast.
It's very possible that Elasticsearch isn't even caching the query you're trying to get performance data on, and it's just really, really fast ;)
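One way to take some of the caching out of the measurement is to ask Elasticsearch to skip the shard request cache for the search and to read the server-side `took` value from the response instead of timing the client. The following is a minimal sketch, not a definitive benchmark harness. It assumes a node at `http://localhost:9200`, a hypothetical index name `my_index`, and a version recent enough to support the `request_cache` URL parameter; it does nothing about the operating system's filesystem cache, so a warm-up run as described above is still worthwhile.

```python
import json
import statistics
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local single-node cluster


def build_search_request(index, query, base_url=BASE_URL):
    """Build (but do not send) a search request that bypasses the request cache."""
    params = urllib.parse.urlencode({"request_cache": "false"})
    url = f"{base_url}/{index}/_search?{params}"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def took_ms(response_body):
    """Extract the server-side execution time (milliseconds) from a search response."""
    return json.loads(response_body)["took"]


def summarize(took_values):
    """Summarize repeated timings; a single run is too noisy to rely on."""
    return {
        "runs": len(took_values),
        "min_ms": min(took_values),
        "median_ms": statistics.median(took_values),
        "max_ms": max(took_values),
    }


def time_query(index, query, runs=5):
    """Send the same query several times and summarize the reported timings."""
    timings = []
    for _ in range(runs):
        with urllib.request.urlopen(build_search_request(index, query)) as resp:
            timings.append(took_ms(resp.read()))
    return summarize(timings)
```

Calling `time_query("my_index", {"match_all": {}})` against a running node returns the minimum, median, and maximum `took` values, which gives a steadier picture than one isolated run.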

Related

elasticsearch query statistics and analysis in near real time

I am pretty new to Elasticsearch and I want to create statistics and Kibana dashboards on the queries sent to an Elasticsearch index. What is the best approach? Any advice or recommendations would be highly appreciated.
The idea is to analyze all queries sent to the index and do some performance optimisation in the future as the user base grows.
For the moment I am planning to store the logs in a different index, but parsing them seems to be a complex task.
Ideally I need to have:
-Counting of user queries
-Counting of queries that returned no results
-Logging of all search terms
-Sorting of queries, and queries that returned no results, by most frequently contained search term
-A view of top queries, including the search term not found results for and the exact query
-A view of top queries returning no results, including the search term not found results for and the exact query
Thanks
There is no out-of-the-box functionality in Elasticsearch for search analysis, but there are some workarounds that can give you the information you are asking for.
First option: you can enable the slow log with a threshold of 0s by executing the command below, so that every search request to that index is logged.
PUT /my-index-000001/_settings
{
"index.search.slowlog.threshold.query.info": "0s",
"index.search.slowlog.threshold.fetch.info": "0s"
}
Second option: you can log all queries at the application layer, or at whatever intermediate layer your application uses to talk to Elasticsearch.
Once you have the logs, you can configure Logstash / Filebeat / Fleet to read them, transform them, and index them into Elasticsearch. Logstash provides different kinds of filters (for example the grok filter) that you can use to transform plain-text logs into structured logs.
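If the queries are logged at the application layer, the counts asked for in the question can be computed directly from those records. The sketch below is only an illustration: it assumes the application writes one JSON object per line with the hypothetical fields `term` (the search term) and `hits` (the number of results returned), which is not a format Elasticsearch produces itself.

```python
import json
from collections import Counter


def summarize_query_log(lines, top_n=10):
    """Aggregate JSON-lines records of the assumed form {"term": str, "hits": int}."""
    total = 0
    zero_result_total = 0
    term_counts = Counter()
    zero_result_terms = Counter()

    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log file
        record = json.loads(line)
        # Normalize so that "Pizza" and "pizza " count as the same term.
        term = record["term"].strip().lower()
        total += 1
        term_counts[term] += 1
        if record["hits"] == 0:
            zero_result_total += 1
            zero_result_terms[term] += 1

    return {
        "total_queries": total,
        "zero_result_queries": zero_result_total,
        "top_terms": term_counts.most_common(top_n),
        "top_zero_result_terms": zero_result_terms.most_common(top_n),
    }
```

In practice the same aggregation could be done inside Elasticsearch with a terms aggregation once the records are indexed; the point here is only that logging the term and the hit count per search is enough to answer the questions in the list above.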

Implementing popular keywords in Elasticsearch

I'm using Elasticsearch on AWS EC2,
and I want to implement a "today's popular keywords" feature in ES.
There are 3 indexes (place, genre, name), and I want to see today's popular keywords for the name index only.
I tried to use the ES slow log and Logstash, but the slow log writes one entry per shard.
(For example, with 5 shards, 5 log entries are saved for a single query.)
Is there a good and easy way to implement popular keywords in ES?
As far as I know, this is not supported by Elasticsearch and you need to build your own custom solution.
The design you mentioned using the slow log is not a good fit. As you noted, it logs on a per-shard basis, and even if you do extra computation to merge the entries and relate them to a single search at the index level, it would still not be good, because:
You have to change the slow log configuration, and every index needs its own threshold. You can set it to 0ms to make sure all search queries end up in the slow log, but that would take a huge amount of disk space and would not be good for Elasticsearch performance.
You have to parse the slow log in your application, and doing that at runtime would be very costly.
I think you can instead maintain a distributed cache in your application where you store the top searched keywords, like the leaderboard of a multi-player gaming app. A leaderboard changes very frequently, but in your case you don't even have to update the cache that often. I won't go into implementation details, but a simple hash map with the search term as key and the count as value would solve the issue.
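The hash map idea can be sketched as follows. This is a single-process, in-memory illustration only, not the distributed cache itself: the class name `PopularKeywords` and its methods are hypothetical, and a real deployment would keep the counts in a shared store so that all application instances see the same numbers.

```python
from collections import Counter, defaultdict
from datetime import date


class PopularKeywords:
    """In-memory per-day counter of search terms (hypothetical helper)."""

    def __init__(self):
        # Maps a calendar day to a Counter of normalized search terms.
        self._counts = defaultdict(Counter)

    def record(self, term, day=None):
        """Count one search for `term` on `day` (defaults to today)."""
        day = day or date.today()
        normalized = term.strip().lower()
        if normalized:
            self._counts[day][normalized] += 1

    def top(self, n=10, day=None):
        """Return the `n` most searched terms for `day` as (term, count) pairs."""
        day = day or date.today()
        return self._counts.get(day, Counter()).most_common(n)

    def drop_before(self, day):
        """Discard counts for days earlier than `day` to bound memory use."""
        for old_day in [d for d in self._counts if d < day]:
            del self._counts[old_day]
```

The application would call `record()` only for searches against the name index, which sidesteps the per-shard duplication seen in the slow log because each user search is counted once where it originates.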
Hope this helps. Let me know if you have questions.

How can I find the most used query from Elasticsearch?

I have an Elasticsearch cluster running on an AWS Elasticsearch instance. It has been up and running for a few months. I'd like to know the most used query requests over the last few months. Does Elasticsearch save all queries somewhere I can search? Or do I have to programmatically save the requests for analysis?
As far as I'm aware, Elasticsearch doesn't by default save a record or frequency histogram of all queries. However, there's a way you could have it log all queries, and then ship the logs somewhere to be aggregated and searched for the top results (incidentally, this is something you could use Elasticsearch for :D). Sadly, you'll only be able to track queries after you configure this; I doubt that you'll be able to find any record of your historical queries from the last few months.
To do this, you'd take advantage of Elasticsearch's slow query log. The default thresholds are designed to only log slow queries, but if you set the thresholds to 0s then Elasticsearch will log every query as a slow query, giving you a record of all queries. See the slow log documentation for detailed instructions. On versions that accept index-level settings in the yaml configuration file, you could set this for the whole cluster with a line like
index.search.slowlog.threshold.fetch.debug: 0s
or set it dynamically per-index with
PUT /<my-index-name>/_settings
{
"index.search.slowlog.threshold.query.debug": "0s"
}
To be clear, the log level you choose doesn't strictly matter, but using debug for this allows you to keep logging genuinely slow queries at the more severe levels like info and warn, which you might find useful.
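The dynamic setting can also be applied from a script, which makes it easy to switch full logging on for a while and off again. The sketch below only builds the request; it assumes a node at `http://localhost:9200` and a hypothetical index name, and the helper name `slowlog_settings_request` is made up for illustration. Passing `"-1"` as the threshold disables logging at that level again.

```python
import json
import urllib.request


def slowlog_settings_request(index, level="debug", threshold="0s",
                             base_url="http://localhost:9200"):
    """Build a PUT /<index>/_settings request for the search slow log thresholds."""
    settings = {
        # The query and fetch phases have separate thresholds; set both.
        f"index.search.slowlog.threshold.query.{level}": threshold,
        f"index.search.slowlog.threshold.fetch.{level}": threshold,
    }
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=json.dumps(settings).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def apply_settings(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())
```

For example, `apply_settings(slowlog_settings_request("my-index-name"))` would turn on logging of every search at debug level, and repeating the call with `threshold="-1"` would turn it off.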
I'm not familiar with how to configure an AWS Elasticsearch cluster, but since the above are core Elasticsearch settings in all the versions I'm aware of, there should be a way to do it.
Happy searching!

Swapping out one index for another in Elasticsearch

I am running Elasticsearch on a personal machine that only has so much memory. I'd like to use all of the memory at any given time for whatever problem I'm working on, but make it easy to switch between projects.
For example, I have a project involving a large text corpus, and a different project with geospatial data. I'd like to switch Elasticsearch from indexing one to the other without reindexing all the documents.
Is there an easier way to do this than to do a backup/reload of the index?
ES has an open/close index API:
curl -XPOST 'localhost:9200/my_index/_close'
curl -XPOST 'localhost:9200/my_index/_open'

Why are queries not being logged?

I've got an environment set up on Dev that should keep a log of every query run, but it's not writing anything. I'm using the slow log feature for it.
These are my thresholds on the elasticsearch.yml:
http://pastebin.com/raw.php?i=qfwnruhD
And this is my whole logging.yml:
http://pastebin.com/raw.php?i=aXg8xHNE
I'm using Elasticsearch 1.3.1 in this environment.
You should set the threshold to 0ms if you want to log all queries. On a smaller index I was testing on, lots of queries were taking less than 1ms.
If that doesn't work, perhaps Elasticsearch isn't using the config file you are updating.
