What does "source: name" in a filter mean? - elasticsearch-curator

I have been studying Curator for the past few days and came across the filter type "age".
The official documentation says that a name-based age filter looks for a timestring within the index or snapshot name and converts that into an epoch timestamp.
That is not quite clear to me.
If I specify
source: name
what "name" does Curator refer to?
Does it refer to the name of a particular index, and if so, how do I specify the name of that index?
It would be really helpful if anyone could suggest some more documentation on Curator.
Thanks in advance ^^

Yes, source: name reads the index name and looks for a time/date value matching timestring. For example, if you had an index named indexname-2019.06.01, you might build a filter like this:
- filtertype: age
  source: name
  timestring: '%Y.%m.%d'
  unit: days
  unit_count: 30
  direction: older
This filter (if not following other filters in a list) will look through the names of all indices in Elasticsearch for a Year.month.day pattern, convert it to an epoch time stamp, and see if that date is more than 30 days older than the epoch time stamp at the time Curator is executed. If that is true, that index name will remain in the actionable list to do whatever action the filter is associated with.
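The name-based check described above can be sketched in Python. This is only an illustration of the idea, not Curator's actual implementation: the helper names `timestring_to_regex` and `name_is_older` are hypothetical, only the %Y, %m and %d directives are handled, and the date in the name is assumed to be UTC.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from strftime directives to regex fragments.
DIRECTIVE_PATTERNS = {"%Y": r"\d{4}", "%m": r"\d{2}", "%d": r"\d{2}"}


def timestring_to_regex(timestring):
    """Turn a timestring such as '%Y.%m.%d' into a regex that finds it in a name."""
    pattern = re.escape(timestring)  # literal characters like '.' must match exactly
    for directive, fragment in DIRECTIVE_PATTERNS.items():
        pattern = pattern.replace(re.escape(directive), fragment)
    return re.compile(pattern)


def name_is_older(name, timestring, unit_count, now):
    """True if the date embedded in `name` is more than `unit_count` days before `now`."""
    match = timestring_to_regex(timestring).search(name)
    if match is None:
        return False  # no date in the name: the index leaves the actionable list
    try:
        stamp = datetime.strptime(match.group(0), timestring)
    except ValueError:
        return False  # digits matched but do not form a real date
    stamp = stamp.replace(tzinfo=timezone.utc)
    return stamp < now - timedelta(days=unit_count)


if __name__ == "__main__":
    now = datetime(2019, 7, 15, tzinfo=timezone.utc)
    for name in ["indexname-2019.06.01", "indexname-2019.07.10", "no-date-here"]:
        print(name, name_is_older(name, "%Y.%m.%d", 30, now))
```

Because the date is searched for anywhere in the name, a name such as prefix-2019.06.01-suffix matches too.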
Now, this by itself can be a dangerous filter. It will match indexname-2019.06.01 or 2019.06.01-anything or even prefix-2019.06.01-suffix. Filters in Curator were made to go together in a chain. To specify which indices you want Curator to consider, it might be wise to do a pattern filter before the age filter:
- filtertype: pattern
  kind: prefix
  value: indexname
- filtertype: age
  source: name
  timestring: '%Y.%m.%d'
  unit: days
  unit_count: 30
  direction: older
Now this filter list will only look for indices which begin with indexname and have a Year.month.day time string after that. Filters in Curator are always ANDed together.
The official Curator documentation is the ultimate source of truth for all things Curator. If you have further requests for explanation, I’m happy to answer them (full disclosure: I am the author and maintainer of Curator).

Related

Advice on ElasticSearch query design

I've got ES documents that look like this:
{
  "auctionOn": "2018-01-01",
  "inspections": [
    {
      "startsOn": "2018-01-02 09:00",
      "endsOn": "2018-01-02 10:00"
    }
  ]
}
I need the following answers from a search (or multiple searches):
number of documents with an auctionOn in the future (e.g. > now)
number of documents with an inspection.startsOn in the future (e.g. > now)
date histogram (day breakdown) of the next 7 days, with # of documents with an auctionOn on that day
date histogram (day breakdown) of the next 7 days, with # of documents with an inspection.startsOn on that day
So, I'm trying to figure out how to get these answers efficiently. I know I can/should test out all the different approaches, but I'm relatively new to ES, so that is easier said than done.
Can someone give me some advice (or ideally, a query) on how to get these 4 values?
Ideas I had:
Query for all documents with an inspection/auction in the future. Create date histogram aggregations filtered to the next 7 days for both auctions and inspections. Use range aggregations to get the number of docs with auction/inspection > today.
Pros: one search for all answers. Cons: lots of documents to aggregate over?
Create separate searches (e.g. msearch) for:
query all documents with an inspection in the next 7 days. aggregate by day.
query all documents with an auction in the next 7 days. aggregate by day.
query all documents with an inspection in the future. use hits to get total.
query all documents with an auction in the future. use hits to get total.
Pros: queries are simpler, more cache hits? Cons: 4 separate searches.
Can someone please guide me down the right path, and give me hints on how to do the query/aggregations?
Thanks
1. Use a range query on the auctionOn field, setting the lower bound to the current date and leaving the upper bound open (null).
2. Use a range query inside a nested query on the inspection.startsOn field, as above.
3. Use a date histogram aggregation with a day interval.
4. Same as 3, but inside a nested aggregation.
You can combine all of these in one query.
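One way the combined request could look is sketched below as a Python function that builds the search body. It rests on several assumptions not stated in the thread: the inspections field is mapped as a nested type (the path follows the sample document, `inspections`), both date fields are mapped as dates with a format matching the stored strings, and the histogram interval key is `calendar_interval` (older Elasticsearch versions use `interval`, which can be passed instead). The aggregation names and the function name are made up for illustration.

```python
import json


def build_upcoming_query(interval_key="calendar_interval"):
    """Build one search body that covers all four counts via aggregations."""

    def per_day(field):
        return {"date_histogram": {"field": field, interval_key: "day"}}

    def future(field):
        # Open-ended range: only a lower bound, no upper bound.
        return {"range": {field: {"gt": "now"}}}

    def next_week(field):
        # From the start of today up to (not including) 7 days later.
        return {"range": {field: {"gte": "now/d", "lt": "now/d+7d"}}}

    return {
        "size": 0,  # only aggregation results are needed, not hits
        "aggs": {
            "future_auctions": {"filter": future("auctionOn")},
            "auctions_next_7_days": {
                "filter": next_week("auctionOn"),
                "aggs": {"per_day": per_day("auctionOn")},
            },
            "inspections": {
                "nested": {"path": "inspections"},
                "aggs": {
                    "future": {
                        "filter": future("inspections.startsOn"),
                        # reverse_nested counts parent documents, not inspections
                        "aggs": {"documents": {"reverse_nested": {}}},
                    },
                    "next_7_days": {
                        "filter": next_week("inspections.startsOn"),
                        "aggs": {
                            "per_day": {
                                **per_day("inspections.startsOn"),
                                "aggs": {"documents": {"reverse_nested": {}}},
                            }
                        },
                    },
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_upcoming_query(), indent=2))
```

The printed JSON can be sent as the body of a search request. Inside the nested aggregation the bucket counts refer to inspection entries, so the `documents` sub-aggregation is what gives the number of parent documents.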

How do time filter shortcuts work in Kibana?

Kibana provides time filter shortcuts like Today, Yesterday, Last 10 days, etc. in the dashboard. I want to know how they work. When I click Today in Kibana, which field is used in the query? How can I configure these links to use custom timestamp fields?
When you first create your index pattern, you choose a Time filter field name.
That field is used when you choose Last 15 minutes or any other time filter.
By default it is the timestamp, but if you have another time field in your fields list you can use that instead.

elasticsearch curator delete "all" indices older than 7 days

background:
elasticsearch version 6.2
curator version 5.4.1.
I can already use Curator to delete one index that is older than 7 days, but I have more than one index and I don't want to create more than one action.yml like this one:
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 7 days (based on index name), for student-prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: student=
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 7
With this action.yml, Curator deletes student=2017-XX-XX.
But I have many other indices, such as teacher, parent and so on.
I replaced student= with *= but that doesn't work.
So what can I do?
Thank you very much.
You can try a few things. A few examples include:
You can omit the pattern filtertype, leaving only the age filter. This might delete other indices with %Y-%m-%d patterns, however. In that case, you might use a different pattern filter, one that excludes the patterns you don't want to delete:
- filtertype: pattern
  kind: prefix
  value: omit_me
  exclude: true
Replacing your pattern filter with this will delete all indices with %Y-%m-%d that are older than 7 days, except indices starting with omit_me.
You might set up a regex instead of a prefix. For example:
- filtertype: pattern
  kind: regex
  value: '^(student|parent|teacher).*$'
This will match indices starting with student, parent, or teacher.
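To check what the two pattern options above would keep before running Curator, the same matching can be tried in plain Python. This is only a sketch of the selection logic, not Curator code; the function names and the sample index names are made up for illustration.

```python
import re

# The regex from the answer: names starting with student, parent or teacher.
INCLUDE = re.compile(r"^(student|parent|teacher).*$")


def keep_by_regex(names):
    """Keep only names matching the regex (kind: regex)."""
    return [name for name in names if INCLUDE.match(name)]


def drop_by_prefix(names, prefix):
    """Drop names starting with `prefix` (kind: prefix with exclude: true)."""
    return [name for name in names if not name.startswith(prefix)]


if __name__ == "__main__":
    indices = [
        "student=2017-01-01",
        "teacher=2017-01-02",
        "parent=2017-01-03",
        "omit_me=2017-01-01",
    ]
    print(keep_by_regex(indices))
    print(drop_by_prefix(indices, "omit_me"))
```

Either result would then still pass through the age filter, since Curator applies every filter in the list in turn.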

ElasticSearch query specifying an index name using today's date

I'm using Logstash to populate ES with a number of metrics from our live services across a number of machines. Logstash creates a new index each day, and I am finding that querying ES without specifying the index runs slowly (I currently keep 5 days of indices). If I specify the specific index, e.g. today's:
.es(index=logstash-2018.01.15, q= examplequery)
it runs very quickly.
Is there a way I can specify today's index using the date field?
e.g.
.es(index=logstash-'get date', q= examplequery)
You can use date math in the index name to query today's index:
.es(index='<logstash-{now/d}>')
An interesting read covering all the options Elasticsearch offers for date math in index names:
https://www.elastic.co/guide/en/elasticsearch/reference/current/date-math-index-names.html
Judging by the syntax, I guess you are using Timelion or something else that uses a query string. There is a good tutorial here that includes specifying index patterns:
https://www.elastic.co/blog/timelion-tutorial-from-zero-to-hero
In your case it would be
.es(index=logstash-*, q= examplequery)
or
.es(index=logstash-2018.01.*, q= examplequery)
if you need January of this year and the index pattern is 'logstash-YYYY.MM.dd'.
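If the index name has to be computed on the client side instead, the same daily name can be built with a few lines of Python. This is a minimal sketch, assuming the Logstash default naming of logstash-YYYY.MM.dd in UTC; the function name `daily_index_name` is made up for illustration.

```python
from datetime import datetime, timezone


def daily_index_name(prefix="logstash-", when=None):
    """Return the daily index name for `when` (defaults to the current UTC time)."""
    when = when or datetime.now(timezone.utc)
    return prefix + when.strftime("%Y.%m.%d")


if __name__ == "__main__":
    # Fixed date so the result matches the index from the question.
    print(daily_index_name(when=datetime(2018, 1, 15, tzinfo=timezone.utc)))
```

The resulting string can be used wherever a concrete index name is expected, for example in place of logstash-2018.01.15 in the expression from the question.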

How to facet/histogram the following in elasticsearch?

Say I have the following fields:
timestamp: (elasticsearch date field)
ice-cream-flavor: (e.g. Chocolate, Vanilla, Strawberry)
container: (e.g. Cup, Cone)
I want to be able to use some kind of facet to be able to give me a count based on:
timestamp: (bucketed by day)
ice-cream-flavor count (how many Chocolate on that day? How many Vanilla that day?)
Could I take it a step further and do:
I want to be able to use some kind of facet to be able to give me a count based on:
timestamp: (bucketed by day)
ice-cream-flavor count (how many Chocolate on that day? How many Vanilla that day?)
container count (based on time bucketed by day, could I get the count of how many of each ice cream flavor were stored in a container?)
Is this possible? What kind of facet is this? Could you provide an example? I tried using the DateHistogram and Histogram facet but it appears that if I specify the field to be a date, I get some random key with some random count that makes no sense....
What I've tried...
Given a date histogram with a specified "field=timestamp", I get the following output, which appears to make no sense. It expects a key field and a value field, and the value field has to be an integer? That doesn't make much sense, and it does not take into account the specific conditions I want.
myhistogram: {
  _type: "date_histogram",
  entries: [
    {},
    {
      count: 1,
      time: 634579946870400000
    },
    {
      count: 1,
      time: 634580073100800000
    }
  ]
}
The date histogram supports a "value_field" parameter for bucketing, but it only supports numeric values.
There is a plugin elasticfacets which seems to allow using the date histogram facet with any other facet to bucket on.
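In later Elasticsearch releases, where aggregations replaced facets, this kind of day / flavor / container breakdown can be expressed by nesting terms aggregations inside a date histogram. The sketch below builds such a request body in Python. It is an assumption-laden illustration rather than something from this thread: it assumes an Elasticsearch version with aggregations, that ice-cream-flavor and container are indexed as exact (not analyzed / keyword) values, and that the interval key is `calendar_interval` (older versions use `interval`). The aggregation names are made up.

```python
import json


def build_flavor_histogram(interval_key="calendar_interval"):
    """Build a body that buckets by day, then flavor, then container."""
    return {
        "size": 0,  # only the bucket counts are needed
        "aggs": {
            "per_day": {
                "date_histogram": {"field": "timestamp", interval_key: "day"},
                "aggs": {
                    "flavors": {
                        "terms": {"field": "ice-cream-flavor"},
                        "aggs": {
                            "containers": {"terms": {"field": "container"}}
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_flavor_histogram(), indent=2))
```

Each day bucket then carries a doc count per flavor, and each flavor bucket a doc count per container.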
