How to retrieve all existing indices in Painless - elasticsearch

I want to retrieve the number of indices in my ES cluster from within a scripted field of an aggregation.
I know you can access some context values with ctx._source, but does anyone know how to get the total number of indices in my cluster?
Thanks!

That's not possible. The ctx context knows nothing about the state of your cluster; it only has access to the document currently being iterated.
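If all you need is the index count, it has to come from a cluster-level API call outside the script. As a minimal sketch, either of these should do it:

# Number of indices, from the cluster stats API
GET /_cluster/stats?filter_path=indices.count

# Or list the index names with the cat API
GET /_cat/indices?h=index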

Related

Get last document from index in Elasticsearch

I'm playing around with the package github.com/olivere/elastic; everything works fine, but I have a question: is it possible to get the last N inserted documents?
The From statement defaults to 0 as the starting point for the Search action, and I couldn't work out whether it can be omitted in the search.
TL;DR
I am not aware of a feature in the Elasticsearch API that retrieves the most recently inserted documents.
There is, however, a way to achieve something similar if you store the ingest time of the documents:
sort on the ingest time and retrieve the top N documents.
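As a rough sketch (the pipeline name, field name ingested_at and index my-index are made up here): stamp each document with its ingest time via an ingest pipeline, then sort on that field descending and take the top N.

# Hypothetical pipeline that adds the ingest timestamp to every document
PUT _ingest/pipeline/add-ingest-time
{
  "processors": [
    { "set": { "field": "ingested_at", "value": "{{_ingest.timestamp}}" } }
  ]
}

# Fetch the 10 most recently ingested documents
GET my-index/_search
{
  "size": 10,
  "sort": [ { "ingested_at": { "order": "desc" } } ]
}

With olivere/elastic, the same thing should be expressible by chaining a descending field sort and a size onto the search service.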

How to search specific index names from Elasticsearch?

In my ES cluster I have indices like 'movies-2011', 'movies-2012', 'movies-2013', ...
I want to fetch the list of indices whose names start with "movies".
Is there any way to achieve this?
ES Documentation Link
Try the following:
GET /_cat/indices/movies*
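If you need the list programmatically, the same cat endpoint can return JSON and only the columns you care about (these are standard _cat options):

# JSON output, index names only
GET /_cat/indices/movies*?format=json&h=index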

How can I send StormCrawler content to multiple Elasticsearch indices, based on host?

I currently have a successful StormCrawler instance crawling about 20 sites, and indexing the content to one Elasticsearch index. Is it possible, either in ES or via StormCrawler, to send each host's content to its own unique content index?
Out of curiosity: why do you need to do that? Having one index per host seems rather wasteful. You can filter the results based on a field like host if you want to provide results for a particular host.
To answer your question: there is no direct way of doing it currently, as the IndexerBolt is connected to a single index. You could declare one IndexerBolt per index you need and add a custom bolt to fan out based on the value of the host metadata, but this is not dynamic and rather heavy-handed. There might be a way of doing it using ingest pipelines in ES, but I'm not sure.
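To sketch the ingest-pipeline idea (untested; it assumes each indexed document carries a host field and that an index naming scheme like content-<host> is acceptable), a script processor can rewrite the target index at ingest time:

PUT _ingest/pipeline/route-by-host
{
  "description": "Route each document to an index named after its host",
  "processors": [
    {
      "script": {
        "source": "ctx._index = 'content-' + ctx.host"
      }
    }
  ]
}

You would still have to manage mappings and housekeeping for the resulting per-host indices yourself.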

Elasticsearch and Kibana: aggregation to find the name of the most rewarded miner, daily

I created an index from a Storm topology to ElasticSearch (ES). The index map is basically:
index: btc-block
miner: text
reward: double
datetime: date
From those documents I would like to create a histogram of the richest miner, on a daily scale.
I am wondering whether I should aggregate first in Storm and only use ES and Kibana to store, query and display the data, or whether ES and Kibana can handle such requests on their own.
I have been looking at Transforms, in the index management section, which allow creating new indices from queries and aggregations in continuous mode, but I couldn't get to the expected result.
Any help will be appreciated.
Sometimes we need to ask a question to find the answer...
I kept looking at the documentation and eventually solved the issue by using a sibling pipeline aggregation in the visualization: in my case, a max bucket aggregation of the sum of reward on the Y-axis.
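For reference, the equivalent raw query looks roughly like this (assuming miner has a keyword sub-field to aggregate on; older ES versions use interval instead of calendar_interval). A daily date_histogram with a terms/sum sub-aggregation and a max_bucket sibling picks the richest miner per day:

GET btc-block/_search
{
  "size": 0,
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "datetime", "calendar_interval": "day" },
      "aggs": {
        "per_miner": {
          "terms": { "field": "miner.keyword" },
          "aggs": {
            "total_reward": { "sum": { "field": "reward" } }
          }
        },
        "richest_miner": {
          "max_bucket": { "buckets_path": "per_miner>total_reward" }
        }
      }
    }
  }
}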
In this case I get about 6 records/hour, so I guess it's fine to let Kibana and ES do the work. What if I had a lot more data? Would it not be wiser to aggregate in Storm?

ElasticSearch: optimise the storage in indexes and the time response for requests

In a Kafka server I have N types of messages, one for each IoT application. I want to store these messages in Elasticsearch in different indexes. Do you know the best approach for this use case in order to get the lowest response time for requests on each message type?
Furthermore, it is advised to create an index per day, like "messageType-%{+YYYY.MM.dd}"; is this a good approach for my use case?
Finally, with that scheme, if I have a request with a time range, for instance from 2016.06.01 to 2016.07.04, does Elasticsearch search directly in the indexes "messageType-2016.06.01", "messageType-2016.06.02", ..., "messageType-2016.07.04"?
Thanks in advance,
J
If you plan to purge documents after a certain time, creating indexes based on time is a good idea, because you can simply drop whole indexes once they expire.
You can search against all indexes, but preferably you should specify the indexes you want to search against.
For example, you could search against /index1,index2/_search, where you determine index1 and index2 from the query, or you can just hit /_search, which searches all indexes (slower).
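With daily indexes you can target only the days you need, either explicitly or with a wildcard (the index names below follow the pattern from the question):

# Explicit list of daily indexes
GET /messageType-2016.06.01,messageType-2016.06.02/_search

# Or a wildcard covering a whole month
GET /messageType-2016.06.*/_search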
