One Elasticsearch/Logstash instance with multiple applications, each with a unique index - elasticsearch

We use Swisscom's Application Cloud and are currently evaluating the new Elasticsearch service. We set it up, including Logstash and Kibana.
We have now added a user-provided service to each of our apps so that they use the common Elasticsearch/Logstash/Kibana instance. When we first logged in to Kibana we saw there was an index called logstash-, where all the logs of all applications go.
What we want now is an index for each of the apps that writes to our ELK instance. Let's say we have three apps (app1, app2, app3); we'd like to have three indices (app1-..., app2-..., and app3-...). Any ideas on how we can achieve that?
Is that a configuration that has to be done using ENV variables on Cloud Foundry, or is it something we have to configure within our Java and Node.js apps (app1-..., ...)?
Thanks in advance for your help.

You can use the Elasticsearch output plugin for Logstash, which is the recommended method of storing logs in Elasticsearch. This plugin has a configuration option called index, which defines the name of the index to write events to. The default index name is logstash-%{+YYYY.MM.dd}.
Use it along with an if conditional to assign an index name for each app based on its type, like this:
output {
  if [type] == "apache" {
    elasticsearch {
      index => "apache-website-index"
    }
  } else if [type] == "nginx" {
    elasticsearch {
      index => "nginx-website-index"
    }
  }
}
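As a rough sketch of how the type field might be populated in the first place (the ports and type values below are made up for illustration, not part of the original answer): if each app ships its logs through its own input, type can be set there and reused directly in the index name, so a single elasticsearch output covers all apps:
input {
  tcp {
    port => 5001
    type => "app1"   # hypothetical port for app1
  }
  tcp {
    port => 5002
    type => "app2"   # hypothetical port for app2
  }
}
output {
  elasticsearch {
    # one index per app per day, e.g. app1-<date>
    index => "%{type}-%{+YYYY.MM.dd}"
  }
}
With this variant the index name never has to be hard-coded in the Java or Node.js applications themselves; it is derived from the input the events arrive on.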
Please have a look at this answer as well
Please comment if you have any questions.

Related

Indexing Errors when using quarkus logging-gelf extension and ELK stack

I have set up logging as described in https://quarkus.io/guides/centralized-log-management with an ELK stack using version 7.7.
My Logstash pipeline looks like the proposed example:
input {
  gelf {
    port => 12201
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
  }
}
Most messages are showing up in my Kibana using logstash.* as an index pattern, but some messages are dropped, such as:
2020-05-28 15:30:36,565 INFO [io.quarkus] (Quarkus Main Thread) Quarkus 1.4.2.Final started in 38.335s. Listening on: http://0.0.0.0:8085
The problem seems to be that the fields MessageParam0, MessageParam1, MessageParam2, etc. are mapped to the type that first appeared in the logs, but they actually contain multiple data types. The Elasticsearch log shows errors like ["org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [MessageParam1].
Is there any way in the Quarkus logging-gelf extension to correctly map the values?
ELK can auto-create your Elasticsearch index mapping by looking at the first indexed document. This is a very convenient functionality, but it comes with some drawbacks.
For example, if you have a field that can contain numbers or strings, and the first document contains a number for this field, the mapping will be created with a numeric field, so you will not be able to index a document containing a string inside this field ...
The only workaround for this is to create the mapping upfront (you only need to define the fields that are causing the issue; the other fields will be created automatically).
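As a sketch only (the template name and index pattern are assumptions, not part of the original answer), a legacy index template could map every MessageParam* field to keyword, which accepts numeric as well as string values:
PUT _template/quarkus-gelf
{
  "index_patterns": ["logstash-*"],
  "mappings": {
    "dynamic_templates": [
      {
        "message_params_as_keyword": {
          "match": "MessageParam*",
          "mapping": { "type": "keyword" }
        }
      }
    ]
  }
}
A template only takes effect for indices created after it is installed, so existing logstash-* indices would keep their old mapping.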
This is an ELK issue, there is nothing we can do at Quarkus side.

Elasticsearch Dynamic Field Mapping and JSON Dot Notation

I'm trying to write logs to an Elasticsearch index from a Kubernetes cluster. Fluent-bit is being used to read stdout and it enriches the logs with metadata including pod labels. A simplified example log object is
{
  "log": "This is a log message.",
  "kubernetes": {
    "labels": {
      "app": "application-1"
    }
  }
}
The problem is that a few other applications deployed to the cluster have labels of the following format:
{
  "log": "This is another log message.",
  "kubernetes": {
    "labels": {
      "app.kubernetes.io/name": "application-2"
    }
  }
}
These applications are installed via Helm charts and the newer ones are following the label and selector conventions as laid out here. The naming convention for labels and selectors was updated in Dec 2018, seen here, and not all charts have been updated to reflect this.
The end result of this is that depending on which type of label format makes it into an Elastic index first, trying to send the other type in will throw a mapping exception. If I create a new empty index and send in the namespaced label first, attempting to log the simple app label will throw this exception:
object mapping for [kubernetes.labels.app] tried to parse field [kubernetes.labels.app] as object, but found a concrete value
The opposite situation, posting the namespaced label second, results in this exception:
Could not dynamically add mapping for field [kubernetes.labels.app.kubernetes.io/name]. Existing mapping for [kubernetes.labels.app] must be of type object but found [text].
What I suspect is happening is that Elasticsearch sees the periods in the field name as JSON dot notation and is trying to flesh it out as an object. I was able to find this PR from 2015 which explicitly disallows periods in field names however it seems to have been reversed in 2016 with this PR. There is also this multi-year thread from 2015-2017 discussing this issue but I was unable to find anything recent involving the latest versions.
My current thought on moving forward is to standardize the Helm charts we are using so that all of the labels follow the same convention. This seems like a band-aid on the underlying issue, though, which is that I feel like I'm missing something obvious in the configuration of Elasticsearch and dynamic field mappings.
Any help here would be appreciated.
I opted to use the Logstash mutate filter with the rename option as described here:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-rename
The end result looked something like this:
filter {
  mutate {
    rename => {
      '[kubernetes][labels][app]' => '[kubernetes][labels][app.kubernetes.io/name]'
      '[kubernetes][labels][chart]' => '[kubernetes][labels][helm.sh/chart]'
    }
  }
}
Although personally I've never encountered the exact same issue, I had similar problems when I indexed some test data and afterwards changed the structure of the document that should have been indexed (especially when "unflattening" data structures).
Your interpretation of the error message is correct. When you first index the document
{
  "log": "This is another log message.",
  "kubernetes": {
    "labels": {
      "app.kubernetes.io/name": "application-2"
    }
  }
}
Elasticsearch will recognize the app field as an object/structure due to dynamic mapping.
When you then try to index the document
{
  "log": "This is a log message.",
  "kubernetes": {
    "labels": {
      "app": "application-1"
    }
  }
}
the previously (dynamically) created mapping defines the field app as an object with sub-fields, but Elasticsearch encounters a concrete value, namely "application-1".
I suggest that you set up an index template to define the correct mappings. For the 'outdated' logging versions I suggest pre-processing the particular documents, either through an Elasticsearch ingest pipeline or with e.g. Logstash, to get the documents into the correct format.
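As an illustrative sketch (the template name and index pattern are assumptions, and it presumes the old-style labels are first renamed to the namespaced convention, as in the mutate filter above), such a template could pin the namespaced label down as a keyword; note how Elasticsearch expands the dots in app.kubernetes.io/name into nested objects:
PUT _template/k8s-logs
{
  "index_patterns": ["logstash-*"],
  "mappings": {
    "properties": {
      "kubernetes": {
        "properties": {
          "labels": {
            "properties": {
              "app": {
                "properties": {
                  "kubernetes": {
                    "properties": {
                      "io/name": { "type": "keyword" }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}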
Hope that helps.

How to write to a given index pattern in Kibana?

I have a .NET Core project set up with Elasticsearch, Kibana and Logstash. Right now I just spit out random data to the log, but the thing is that when I run the app, everything ends up under the same index pattern, shown under '_index' in the Discover section in Kibana.
The question is: how, in my code, do I define which index pattern I want to write to? I presume it is inside Program.cs, inside Main, but I am not sure how.
I want to be able to decide myself, inside the app code, which index pattern I want to log to, if that makes sense.
Currently I am using Serilog sinks. Is that the direction in which I should fix it, or am I looking in the wrong direction?
Update (trying to implement code from link given by mike b)
var connectionSettings = new ConnectionSettings()
    .DefaultIndex("defaultindex")
    .DefaultMappingFor<Project>(m => m.IndexName("mycustomindex"));
var elasticClient = new ElasticClient(connectionSettings);
var searchResponse = elasticClient.Search<Project>();
I have also created an index inside Kibana, which shows up when I enter "GET _cat/indices", but when running the project and seeing that logs are received by Kibana, I still see that they are registered under the same old index ("httplog").
How do I change that, or what am I doing wrong?
PS: I can see the created index in the Discover drop-down menu, but it has no logs in it. Instead, my httplog index is filled with logs...
If you're using the .NET Elasticsearch client, it'll infer the name it thinks you want for an index. You can override that behavior or specify index names for specific index actions.
See: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/index-name-inference.html
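Since the question mentions Serilog, note that if the logs reach Elasticsearch through the Serilog Elasticsearch sink rather than through NEST directly, the index name is usually governed by the sink configuration. A minimal sketch, assuming the Serilog.Sinks.Elasticsearch package (the URL and index name below are placeholders):
// requires the Serilog and Serilog.Sinks.Elasticsearch packages
using Serilog;
using Serilog.Sinks.Elasticsearch;

Log.Logger = new LoggerConfiguration()
    .WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri("http://localhost:9200"))
    {
        // {0} is substituted with the event timestamp to produce a daily index
        IndexFormat = "mycustomindex-{0:yyyy.MM.dd}"
    })
    .CreateLogger();
If the existing "httplog" index name comes from a setting like this (or a similar one in appsettings.json), changing it here is what moves new log events to the new index.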

Single permissions in Search Guard

I went through the latest documentation of Search Guard, https://docs.search-guard.com/latest/roles-permissions, and could only find a short explanation of single permissions:
Single permissions either start with cluster: or indices:, followed by a REST-style path that further defines the exact action the permission grants access to.
So one permission could be on cluster level or on indices level.
indices:data/read/search
So the part before the colon can be indices or cluster, but I'm not clear on how to understand the parts after the colon, and what the parts separated by "/" mean.
Can someone please explain this to me in more detail, or point me to some documentation which I may have missed?
Thanks
Dingjun
This is actually not related to Search Guard but to Elasticsearch. When you interact with Elasticsearch, e.g. indexing data or searching data, what you are really doing is executing actions. Each action has a name, and this name is in the format you outlined.
For example, have a look at org.elasticsearch.action.search.SearchAction:
public class SearchAction extends Action<SearchRequest, SearchResponse, SearchRequestBuilder> {

    public static final SearchAction INSTANCE = new SearchAction();
    public static final String NAME = "indices:data/read/search";

    private SearchAction() {
        super(NAME);
    }
    ...
}
The permission schema of Search Guard is based on these Elasticsearch action names, so the naming convention is actually an Elasticsearch naming convention.
Elastic does not publish a list of these action names anymore. AFAIK in X-Pack Security you cannot actually go down to this level of detail regarding permissions, which is probably why they stopped doing so.
The Search Guard Kibana plugin comes with a list of permissions you might use as a guideline:
https://github.com/floragunncom/search-guard-kibana-plugin/blob/6.x/public/apps/configuration/permissions/indexpermissions.js
https://github.com/floragunncom/search-guard-kibana-plugin/blob/6.x/public/apps/configuration/permissions/clusterpermissions.js
The question is whether you really need to implement security at such a fine-grained level. That's what the action groups are for: these are pre-defined sets of permissions and should cover most use cases. They are also updated with each release if the permissions in Elasticsearch change, so it's safer to use action groups than to rely on single permissions. But as always, it depends on the use case.

How to remove the json. prefix on my Elasticsearch fields

I use ELK to get some info on my rabbitmq stuff.
Here is my config on the Logstash side:
json {
  source => "message"
}
But in Kibana I have to prefix all my fields with json.xxx:
json.sender, json.sender.raw, json.programId, json.programId.raw ...
How can I get rid of this json. prefix in my field names, so that I just have sender, programId, etc.?
Best regards and thanks for your help!
Bonus question: what are all these .raw fields I must use in Kibana?
According to the doc:
By default it will place the parsed JSON in the root (top level) of the Logstash event, but this filter can be configured to place the JSON into any arbitrary event field, using the target configuration.
So it feels like your JSON is wrapped in a container named "json", or you're setting the "target" option in Logstash without showing us.
As for ".raw": the default Elasticsearch mapping will analyze the data you put in a field, so "/var/log/messages" becomes three terms ([var, log, messages]), which can make it hard to search. To keep you from having to worry about this at the beginning, Logstash creates a ".raw" version of each string field, which is not analyzed.
You'll eventually make your own mappings, and you can make the original field not_analyzed, so you won't need the .raw versions anymore.
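As a concrete sketch of such a mapping, using the legacy (pre-5.x) syntax that the .raw convention belongs to (the template name is an assumption; the field names are taken from the question):
PUT _template/rabbitmq-logs
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "sender":    { "type": "string", "index": "not_analyzed" },
        "programId": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}
With a mapping like this in place for newly created indices, Kibana can search and aggregate on sender and programId directly, without the .raw variants.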
