Elasticsearch check if an index is receiving read traffic - elasticsearch

We are using elasticsearch 6.0.
We have an index in our cluster and would like to know if there are any clients who are querying it.
Is there a way to know if an index is receiving read( get, search, aggregation etc) on an index?

If you don't have monitoring enabled on your cluster please have a look on the index stats api. You'll find a lot of metrics worth to monitor.

You can see which thread has completed how many write tasks completed with this command:
GET /_cat/thread_pool/write?v&h=id,node_name,active,rejected,completed
or for getting get task:
GET /_cat/thread_pool/get?v&h=id,node_name,active,rejected,completed

Related

How to get current using shards in Elasticsearch or Opensearch

My opensearch sometimes reaches this error when i adding new index:
Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;
So i have to increase cluster.max_shards_per_node larger.
I wonder if is there any way to check current shards we are using to avoid this error happening?
The best way to see indexing and search activity is by using a monitoring system. And the best monitoring system for Elasticsearch is Opster. You can try it for free at the following link.
https://opster.com/
For the manual check and sort, you can try the following APIs.
You can sort your indices according to the creation date string (cds). It will help you to understand which one is the old one. So you can have an idea about your indices (shards).
GET _cat/indices?v&h=index,cds&s=cds
Also, you check the indices stats to see if is there any activity in searching or indexing.
To check all indices you can use GET _all/_stats
To check only one index you can use GET index_name/_stats

Does ElasticSearch Keep Count The Number Of Times A Record Is Returned In A Given Period Of Time?

I have an ElasticSearch instance and it does one type of search - it takes a few parameters and returns the companies in its index that match the parameters given.
I'd like to be able to pull some stats that essentially says "This company has been returned from search queries X number of times in the past week".
Does ElasticSearch store metadata that will allow to pull this kind of info from it? If this kind of data isn't stored in ES out of the box, is there a way to enable it?
Elasticsearch (not ElasticSearch ;) ) does not do this natively, no. you can build something using the slow log, where you set the timing to 0 to get it to log everything, but that then logs everything which may not be useful/too noisy
things like https://www.elastic.co/enterprise-search, built on top of Elasticsearch, do provide this sort of insight

How can I find the most used query from Elasticsearch?

I have a Elasticsearch cluster running on AWS Elasticsearch instance. It is up running for a few months. I'd like to know the most used query requests over the last few months. Does Elasticsearch save all queries somewhere I can search? Or do I have to programmatically save the requests for analysis?
As far as I'm aware, Elasticsearch doesn't by default save a record or frequency histogram of all queries. However, there's a way you could have it log all queries, and then ship the logs somewhere to be aggregated/searched for the top results (incidentally this is something you could use Elasticsearch for :D). Sadly, you'll only be able to track queries after you configure this, I doubt that you'll be able to find any record of your historical queries the last few months.
To do this, you'd take advantage of Elasticsearch's slow query log. The default thresholds are designed to only log slow queries, but if you set those defaults to 0s then Elasticsearch would log any query as a slow query, giving you a record of all queries. See that link above for detailed instructions how, you could set this for a whole cluster in your yaml configuration file like
index.search.slowlog.threshold.fetch.debug: 0s
or set it dynamically per-index with
PUT /<my-index-name>/_settings
{
"index.search.slowlog.threshold.query.debug": "0s"
}
To be clear the log level you choose doesn't strictly matter, but utilizing debug for this would allow you to keep logging actually slow queries at the more dangerous levels like info and warn, which you might find useful.
I'm not familiar with how to configure an AWS elasticsearch cluster, but as the above are core Elasticsearch settings in all the versions I'm aware of there should be a way to do it.
Happy searching!

Using job to delete data from Elastic seach asynchronously

I would like to delete data from Elastic search using API (curl).
I would like to start the deletion process and later query about the progress of deletion process.
Is it possible to use job to do it?
I tried looking at relevant documentation but the amount of examples is very low.
Would appreciate any relevant information or links.
You have two solutions:
Using the delete-by-query API using a range query that you can then monitor using the Task API
Use daily indices (e.g. my-logs-2018-09-10, my-logs-2018-09-11, etc) so deleting data in the past is simply a matter of deleting the indices for the days you want to ditch. No need to monitor as this happens instantaneously

Track what documents are coming from which shards in Elasticsearch

I have enabled routing and all my sets of documents are going to same shard. Now i need to directly hit that machines and see if there is performance gain . But then i haven't found a mechanism to find what document went to which shard. Kindly let me know if there is any way to achieve this.
You can use Search Shards API.
Sample Syntax:
GET /index/type/_search_shards?routing={routing_id}

Resources