Elastic Cloud | How and why to handle Kibana resources when creating deploy - elasticsearch

I am wondering which would be the reasons/use-cases that would require to assign Kibana high RAM, CPU and zone resources.
I mean, it's clear what does this means for Elasticsearch component (I/O efficiency). But how these variables affect Kibana performance? What different use cases there are where these resources might/must be handled differently?
Thank you in advance.

Below are the some of the you can consider while creating Kibana instance:
Number of simultaneous connetions to Kibana
simultaneous editing request the dashboards
the number of accesses to the shared dashboard
high Reporting functionality (which requires CPU)
large reporting jobs / alert jobs
number of space you are creating
Basically, you need to consider what all functionality you are going to use from Kibana. If you have limited usage of kibana then you can go with either first or second option from list.

Related

Limiting Elasticsearch data retention below disk space

Scenario:
We use Elasticsearch & logstash to do application logging for a moderately high traffic system
This system generates ~200gb of logs every single day
We use 4 instances sharded; and want to retain roughly last 3 days worth of logs
So, we implemented a "cleanup" system, running daily, which removes all data older than 3 days
So far so good. However, a few days ago, some subsystem generated a persistent spike of data logs, resulting in filling up all available disk space within a few hours, which turned the cluster red. This also meant, that the cleanup system wasn't able to connect to ES, as the entire cluster was down -on account of disk being full. This is extremely problematic, as it limits our visibility into what's going on -and blocks our ability to see what caused this in the first place.
Doing root cause analysis here, a few questions pop out:
How can we look at the system in eg Kibana when the cluster status is red?
How can we tell ES to throw away (oldest-first) logs if there is no more space, rather than going status=red?
In what ways can we make sure this does not happen ever again?
Date based index patterns are tricky with spiky loads. There are two things to combine this for a smooth setup without needing manual intervention:
Switch to rollover indices. You can then define that you want to create a new index once your existing one has reached X GB. Then you don't care about the log volume per day any more, but you can simply keep as many indices around as you have disk space (and leave some buffer / fine tune the watermarks).
To automate the rollover, removal of indices, and optionally setting of an alias, we have Elastic Curator:
Example for rollover
Example for delete index, but you want to combine this with the count filtertype
PS: There will be another solution soon, called Index Lifecycle Management. It's built into Elasticsearch directly and can be configured through Kibana, but it's only around the corner at the moment.
How can we look at the system in eg Kibana when the cluster status is red?
Kibana can't connect to ES if it's already down. Best to poll Cluster health API to get cluster's current state.
How can we tell ES to throw away (oldest-first) logs if there is no more space, rather than going status=red?
This option is not inbuilt within Elasticsearch. Best way is to monitor disk space using Watcher or some other tool and have your monitoring send out an alert + trigger a job that cleansup old logs if the disk usage goes below a specified threshold.
In what ways can we make sure this does not happen ever again?
Monitor the disk space of your cluster nodes.

Elastic search API Vs Spring data Vs logstash

I am planing to use elastic search for our dashboard using spring boot based rest services. After research i see top 3 options
Option A:
Use Elastic Search Java API ( from comment looks like going to go away)
Use Elastic Search Java Rest Client
Use spring-data-elasticsearch ( planing to use es 5.6 but challenging for latest es 6 as I don't see it's supports right now)
Option B:
Or shall I use logstash approach to
Sync data between postgressql and elastic search using logstash ?
Which one among them will be long term approach to get near real time data from ES in high load scenario ??
Usecase: I need to save some data from postgresql table to elastic search for my dashboard (near real time )
Update is frequent for both tables and es
to maintain current state
Load is going to increase in couple of week
The options you listed, in essence, are: should you go with a ready to use solution (logstash) or should you implement your own.
Try logstash first to see if it works for you - it'll take less time than implementing your own solution, and you can get working solution in minutes (if it's not hundreds of tables)
If you want near-real time, then you need to figure out if it allows you to:
handle incremental updates, i.e. if its 'tracking_column' configuration will work for your data structure and it will only load updated records in each run, not the whole table.
run it at the desired frequency
and in general, satisfies your latency requirements
If you decide to go with your own solution, keep in mind that spring-data-elasticsearch is a higher level wrapper for underlying elasticsearch client. If there are latency goals, then working on the lower level (elasticsearch clients) may give you better control and more options to tune the pipeline.
Otherwise, the client choice will not matter that much as data feed features (volume/update frequency) and db/es cluster configuration.

Are there conventions for naming/organizing Elasticsearch indexes which store log data?

I'm in the process of setting up Elasticsearch and Kibana as a centralized logging platform in our office.
We have a number of custom utilities and plug-ins which I would like to track the usage of and if users are encountering any errors. Not to mention there are servers, and scheduled jobs I would like to keep track of as well.
So if I have a number of different sources for log data all going to the same elasticsearch cluster what are the conventions or best practices for how this is organized into indexes and document types?
The default index value used by Logstash is "logstash-%{+YYYY.MM.dd}". So it seems like it's best to suffix any index names with the current date, as this makes it easy to purge old data.
However, Kibana allows for adding multiple "index patterns" that can be selected from in the UI. Yet all the tutorials I've read only mention creating a single pattern like logstash-*.
How are multiple index patterns used in practice? Would I just give names for all the sources for my data? Such as:
BackupUtility-%{+YYYY.MM.dd}
UserTracker-%{+YYYY.MM.dd}
ApacheServer-%{+YYYY.MM.dd}
I'm using nLog in a number of my tools which has an elastic search target. The convention for nLog and other similar logging frameworks is to have a "logger" for each class in the source code. Should these logger translate to indexes in elastic search?
MyCompany.CustomTool.FooClass-%{+YYYY.MM.dd}
MyCompany.CustomTool.BarClass-%{+YYYY.MM.dd}
MyCompany.OtherTool.BazClass-%{+YYYY.MM.dd}
Or is this too granular for elasticsearch index names, and it would be better to stick to just to a single dated index for the application?
CustomTool-%{+YYYY.MM.dd}
In my environment we're working through a similar question. We have a mix of system logs, metric alerts from Prometheus, and application logs from both client and server applications. In addition, we have some shared variables between the client and server apps that let us correlate the two (e.g., we know what server logs match some operation on the client that made requests to said server). We're experimenting with the following scheme to help Kibana answer questions for us:
logs-system-{date}
logs-iis-{date}
logs-prometheus-{date}
logs-app-{applicationName}-{date}
Where:
{applicationName} is the unique name of some application we wrote (these could be client or server side)
{date} is whatever date-based scheme you use for indexes
This way we can set up Kibana searches against logs-app-* and quickly search for logs among any of our applications. This is still new for us, but we started without this type of scheme and are already regretting it. It makes searching for correlated logs across applications much harder than it should be.
In my company we have worked lot about this topic. We agree the following convention:
Customer
-- Product
--- Application
---- Date
In any case, it is neccesary to review both how the data is organized and how the data is consulted inside the organization
Kind Regards
Dario Rodriguez
I am not aware of such conventions, but for my environment, we used to create two different type of indexes logstash-* and logstash-shortlived-*depending on the severity level. In my case, I create index pattern logstash-* as it will satisfy both kind of indices.
As these indices will be stored at Elasticsearch and Kibana will read them, I guess it should give you the options of creating the indices of different patterns.
Give it a try on your local machine. Why don't you try logstash-XYZ if you want more granularity otherwise you can always create indices with your custom name.

Is it safe to expose the Elasticsearch Search API directly through your application's API?

I am developing an AngularJS app with a Java/Spring Boot API. It uses Spring Data Elasticsearch to provide access to Elasticsearch's Search API for searching. Here is an example:
Page<Address> page = addressSearchRepository.search(simpleQueryStringQuery(query), pageable);
The variable query is a user's search string. pageable is an object that specifies page number, page size, and sorting. I can use QueryBuilders to build other Elasticsearch queries and expose them as different API endpoints.
Another option is to use QueryBuilders.wrapperQuery and send Elasticsearch queries directly from JavaScript. Here is an example where jsonQuery is a string containing a full Elasticsearch query:
Page<Address> page = addressSearchRepository.search(wrapperQuery(jsonQuery), pageable);
This would be a secure endpoint that only authenticated users can access. This seems to be equivalent to exposing an Elasticsearch index's Search API directly. Assuming that any data in the index is safe to show the user, would this be a security risk?
In my research so far I've found that it may be possible to crash Elasticsearch using a query, but it isn't that big of a problem in newer versions: https://www.elastic.co/blog/found-crash-elasticsearch#arbitrary-large-size-parameter
Maybe limiting the page size or using the scan and scroll API when the page size is very large would mitigate this.
I know that script fields should be avoided at all costs, but they are disabled by default (as of v1.4.3).
You can still crash Elasticsearch if you know how to do it. For example, if you start building a 10 deep nested aggregations, you might very well go and take a break. It will either take a lot of time, or be very expensive, use a lot of memory, make the JVM do a lot of garbage collection (which basically freezes all other threads running in the JVM), reclaim back small amounts of memory. It can make the cluster unresponsive in this way.
I'm not saying that whatever aggregations you take and create a 10 deep nested aggregations you'll cripple the cluster, but under normal circumstances a cluster built for a certain SLA and deal with a certain amount of data, given some heavy aggregations (for example terms on analyzed string fields), will be very highly computational for the nodes.
Maybe the nodes will not run out of memory, but the nodes will barely be responsive.
Elastic's team is trying to implement other circuit breakers and to add default limits to certain types of queries and aggregations (a huge task). But if your aim is for your users not to crash ES, while they have full access to all queries, I think there are ways to crash it. I, personally, wouldn't expose ES and let my users do whatever they want with whatever queries they create.
Depending on how your wrapper is configured, I'd only allow my users certain types of queries/aggregations and for those I'd impose some limits (applicable for those queries/aggs that accept limits).

elastic search index strategies under high traffic

We use ElasticSearch for our tool's real time metrics and analytics part. ElasticSearch is very cool and fast when we are query our data. (statiticial facets and terms facet)
But we have problem when we try to index our hourly data. We collect every our metric data from other services. First we collect data from other services and save them RabbitMQ process. But when queue worker runs our all hourly data not index to ES. Usually %40 of data index in ES and other them lost.
So what is your idea about when index ES under high traffic ?
I've posted answers to other similar questions:
Ways to improve first time indexing in ElasticSearch
Performance issues using Elasticsearch as a time window storage (latter part of my answer applies)
Additionally, instead of a custom 'queue worker' have you considered using a 'river'? For more information see:
http://www.elasticsearch.org/blog/the-river/
http://www.elasticsearch.org/guide/reference/river/

Resources