I've set up a working ELK stack on Debian Wheezy and configured Nxlog to collect Windows logs. I can see the logs in Kibana and everything works fine, but I'm getting too much data and want to cut it down by removing some fields I don't need.
I've added a filter section, but it isn't working at all. What could be the reason?
The config is below:
input {
tcp {
type => "eventlog"
port => 3515
format => "json"
}
}
filter {
type => "eventlog"
mutate {
remove => { "Hostname", "Keywords", "SeverityValue", "Severity", "SourceName", "ProviderGuid" }
remove => { "Version", "Task", "OpcodeValue", "RecordNumber", "ProcessID", "ThreadID", "Channel" }
remove => { "Category", "Opcode", "SubjectUserSid", "SubjectUserName", "SubjectDomainName" }
remove => { "SubjectLogonId", "ObjectType", "IpPort", "AccessMask", "AccessList", "AccessReason" }
remove => { "EventReceivedTime", "SourceModuleName", "SourceModuleType", "#version", "type" }
remove => { "_index", "_type", "_id", "_score", "_source", "KeyLength", "TargetUserSid" }
remove => { "TargetDomainName", "TargetLogonId", "LogonType", "LogonProcessName", "AuthenticationPackageName" }
remove => { "LogonGuid", "TransmittedServices", "LmPackageName", "ProcessName", "ImpersonationLevel" }
}
}
output {
elasticsearch {
cluster => "wisp"
node_name => "io"
}
}
I think you are trying to remove fields that do not exist in some logs.
Do all your logs contain all the fields you're trying to remove?
If not, you have to identify your logs before removing fields.
Your filter config will look like this:
filter {
  if [type] == "eventlog" {
    if [somefield] == "somevalue" {
      mutate {
        remove_field => [ "specificfieldtoremove1", "specificfieldtoremove2" ]
      }
    }
  }
}
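For example, to apply the removal only to events that actually carry one of those fields, you could key off one of them. A minimal sketch using field names from the question; the marker field is just an assumption, use whatever reliably identifies those events:
filter {
  if [type] == "eventlog" {
    # only strip the subject fields when the event actually has them
    if [SubjectUserSid] {
      mutate {
        remove_field => [ "SubjectUserSid", "SubjectUserName", "SubjectDomainName", "SubjectLogonId" ]
      }
    }
  }
}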
I am working on a project, ingesting CVE data from NVD into Elasticsearch.
My input data looks something like this:
{
"resultsPerPage":20,
"startIndex":0,
"totalResults":189227,
"result":{
"CVE_data_type":"CVE",
"CVE_data_format":"MITRE",
"CVE_data_version":"4.0",
"CVE_data_timestamp":"2022-11-23T08:27Z",
"CVE_Items":[
{...},
{...},
{...},
{
"cve":{
"data_type":"CVE",
"data_format":"MITRE",
"data_version":"4.0",
"CVE_data_meta":{
"ID":"CVE-2022-45060",
"ASSIGNER":"cve#mitre.org"
},
"problemtype":{...},
"references":{...},
"description":{
"description_data":[
{
"lang":"en",
"value":"An HTTP Request Forgery issue was discovered in Varnish Cache 5.x and 6.x before 6.0.11, 7.x before 7.1.2, and 7.2.x before 7.2.1. An attacker may introduce characters through HTTP/2 pseudo-headers that are invalid in the context of an HTTP/1 request line, causing the Varnish server to produce invalid HTTP/1 requests to the backend. This could, in turn, be used to exploit vulnerabilities in a server behind the Varnish server. Note: the 6.0.x LTS series (before 6.0.11) is affected."
}
]
}
},
"configurations":{
"CVE_data_version":"4.0",
"nodes":[
{
"operator":"OR",
"children":[
],
"cpe_match":[
{
"vulnerable":true,
"cpe23Uri":"cpe:2.3:a:varnish-software:varnish_cache_plus:6.0.8:r2:*:*:*:*:*:*",
"cpe_name":[
]
},
{
"vulnerable":true,
"cpe23Uri":"cpe:2.3:a:varnish-software:varnish_cache_plus:6.0.8:r1:*:*:*:*:*:*",
"cpe_name":[
]
},
{
"vulnerable":true,
"cpe23Uri":"cpe:2.3:a:varnish_cache_project:varnish_cache:7.2.0:*:*:*:*:*:*:*",
"cpe_name":[
]
}
]
}
]
},
"impact":{...},
"publishedDate":"2022-11-09T06:15Z",
"lastModifiedDate":"2022-11-23T03:15Z"
}
]
}
}
Each query to the API returns a result like this, containing 20 individual CVEs, and each CVE contains multiple configurations. What I want to achieve is one document per configuration, where all the other data stays the same and only the cpe23Uri field changes. I've tried using the split filter in several different ways, with no success.
My pipeline looks like this:
filter {
json {
source => "message"
}
split {
field => "[result][CVE_Items]"
}
split {
field => "[result][CVE_Items][cve][description][description_data]"
}
kv {
source => "message"
field_split => ","
}
mutate {
remove_field => [...]
rename => ["[result][CVE_Items][configurations][nodes]", "Affected_Products"]
.
.
.
rename => ["[result][CVE_data_timestamp]", "CVE_Timestamp"]
}
fingerprint {
source => ["CVE", "Affected Products"]
target => "fingerprint"
concatenate_sources => "true"
method => "SHA256"
}
}
For splitting I've tried simply using:
split {
field => "[result][CVE_Items][configurations][nodes][cpe_match]"
}
and also with adding [cpe23Uri] to the end,
as well as:
split {
field => "[result][CVE_Items][configurations][nodes][cpe_match]"
target => "cpeCopy"
remove_field => "[result][CVE_Items][configurations][nodes][cpe_match]"
}
if [result][CVE_Items][configurations][nodes][cpe_match] {
ruby {
code => "event['cpeCopy'] = event['[result][CVE_Items][configurations][nodes][cpe_match]'][0]"
remove_field => "[result][CVE_Items][configurations][nodes][cpe_match]"
}
}
if [cpeCopy] {
mutate {
rename => { "cpeCopy" => "CPE" }
}
}
As I said, I've tried multiple approaches; I won't list them all, but unfortunately nothing has worked so far.
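For what it's worth, one pattern often used for this kind of nested structure is to chain split filters from the outermost array down to cpe_match, so that each cpe23Uri ends up in its own event. A minimal sketch, with field paths taken from the sample document above and not tested against the full NVD feed:
filter {
  json {
    source => "message"
  }
  # one event per CVE
  split {
    field => "[result][CVE_Items]"
  }
  # one event per configuration node
  split {
    field => "[result][CVE_Items][configurations][nodes]"
  }
  # one event per cpe_match entry, i.e. per cpe23Uri
  split {
    field => "[result][CVE_Items][configurations][nodes][cpe_match]"
  }
  mutate {
    rename => { "[result][CVE_Items][configurations][nodes][cpe_match][cpe23Uri]" => "CPE" }
  }
}
Each split replaces the array with the single element belonging to that event, which is why the next split in the chain can address a path inside it.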
Does the Elasticsearch output plugin support elasticsearch's _update_by_query?
https://www.elastic.co/guide/en/logstash/6.5/plugins-outputs-elasticsearch.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html
The elasticsearch output plugin can only make calls to the _bulk endpoint, i.e. using the Bulk API.
If you want to call the Update by Query API, you need to use the http output plugin and construct the query inside the event yourself. If you explain what you want to achieve, I can update my answer with some more details.
Note: There's an issue requesting this feature, but it's still open after two years.
UPDATE
So if your input event is {"cname":"wang", "cage":11} and you want to update by query all documents with "cname":"wang" to set "cage":11, your query needs to look like this:
POST your-index/_update_by_query
{
"script": {
"source": "ctx._source.cage = params.cage",
"lang": "painless",
"params": {
"cage": 11
}
},
"query": {
"term": {
"cname": "wang"
}
}
}
So your Logstash config should look like this (your input may vary but I used stdin for testing purposes):
input {
stdin {
codec => "json"
}
}
filter {
mutate {
add_field => {
"[script][lang]" => "painless"
"[script][source]" => "ctx._source.cage = params.cage"
"[script][params][cage]" => "%{cage}"
"[query][term][cname]" => "%{cname}"
}
remove_field => ["host", "#version", "#timestamp", "cname", "cage"]
}
}
output {
http {
url => "http://localhost:9200/index/doc/_update_by_query"
http_method => "post"
format => "json"
}
}
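With format => "json", the http output serializes whatever is left of the event as the request body; that is why, in this example, the original cname and cage fields and Logstash's own metadata are removed above, leaving exactly the script and query sections expected by _update_by_query.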
The same result can be obtained with standard elasticsearch plugins:
input {
elasticsearch {
hosts => "${ES_HOSTS}"
user => "${ES_USER}"
password => "${ES_PWD}"
index => "<your index pattern>"
size => 500
scroll => "5m"
docinfo => true
}
}
filter {
...
}
output {
elasticsearch {
hosts => "${ES_HOSTS}"
user => "${ES_USER}"
password => "${ES_PWD}"
action => "update"
document_id => "%{[@metadata][_id]}"
index => "%{[@metadata][_index]}"
}
}
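Here docinfo => true is what makes each source document's _index and _id available under [@metadata] (with the plugin's default docinfo_target), which the output's document_id and index settings then reference so that the exact same documents are updated.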
For more than a week I have been struggling to log, into an Elasticsearch index, information about the queries I run, so that I can compare performance between different types of queries. I have configured this config file in the Logstash home directory:
input {
beats {
port => 5044
}
}
filter {
if "search" in [request]{
grok {
match => { "request" => ".*\n\{(?<query_body>.*)"}
}
grok {
match => { "path" => "\/(?<index>.*)\/_search"}
}
if [index] {
} else {
mutate {
add_field => { "index" => "All" }
}
}
mutate {
update => { "query_body" => "{%{query_body}" }
}
}
}
output {
if "search" in [request] and "ignore_unmapped" not in [query_body]{
elasticsearch {
hosts => "http://localhost:9200"
}
}
}
and I also installed and configured packetbeat.yml, setting its Logstash hosts to http://localhost:9200.
The tutorial I followed also says that after starting Packetbeat it will listen for packets on port 9200, send them to Logstash and from there to the monitoring Elasticsearch cluster, where they will be indexed into indexes like logstash-2016.05.24. But these indexes do not exist.
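For reference, the output.logstash section of packetbeat.yml normally points at the port the beats input listens on (5044 in the Logstash config above), not at Elasticsearch's 9200. A minimal sketch, assuming a local setup; the exact YAML nesting depends on the Beats version:
# packetbeat.yml (sketch)
output.logstash:
  # must match the port of the beats input in the Logstash pipeline
  hosts: ["localhost:5044"]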
I have an index in Elasticsearch that has a field named locationCoordinates. It's being sent to Elasticsearch from Logstash.
The data in this field looks like this...
-38.122, 145.025
When this field appears in Elasticsearch it is not coming up as a geo_point.
I know that if I apply the mapping below manually, it works.
{
"mappings": {
"logs": {
"properties": {
"http_request.locationCoordinates": {
"type": "geo_point"
}
}
}
}
}
But what I would like to know is how I can change my logstash.conf file so that it does this at startup.
At the moment my logstash.conf looks a bit like this...
input {
# Default GELF input
gelf {
port => 12201
type => gelf
}
# Default TCP input
tcp {
port => 5000
type => syslog
}
# Default UDP input
udp {
port => 5001
type => prod
codec => json
}
file {
path => [ "/tmp/app-logs/*.log" ]
codec => json {
charset => "UTF-8"
}
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
json{
source => "message"
}
}
output {
elasticsearch {
hosts => "elasticsearch:9200"
}
}
And the field ends up in Kibana without the little Geo sign next to it.
You simply need to modify your elasticsearch output to configure an index template in which you can add your additional mapping.
output {
elasticsearch {
hosts => "elasticsearch:9200"
template_overwrite => true
template => "/path/to/template.json"
}
}
And then in the file at /path/to/template.json you can add your additional geo_point mapping
{
"template": "logstash-*",
"mappings": {
"logs": {
"properties": {
"http_request.locationCoordinates": {
"type": "geo_point"
}
}
}
}
}
If you want to keep the official logstash template, you can download it and add your specific geo_point mapping to it.
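For instance, you could pull the template Logstash has already installed and use it as a starting point (logstash is the default template name; the response wraps the template body in that name, so trim the outer key before reusing it):
GET _template/logstash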
I followed the steps in this document and I was able to get some reports on the Shakespeare data.
I want to do the same thing with a remotely installed Elasticsearch. I tried configuring the "host" in the config file, but the queries still run against the local host as opposed to the remote one. This is my config file:
input {
stdin{
type => "stdin-type" }
file {
type => "accessLog"
path => [ "/Users/akushe/Downloads/requests.log" ]
}
}
filter {
grok {
match => ["message","%{COMMONAPACHELOG} (?:%{INT:responseTime}|-)"]
}
kv {
source => "request"
field_split => "&?"
}
if [lng] {
kv {
add_field => [ "location" , ["%{lng}","%{lat}"]]
}
}else if [lon] {
kv {
add_field => [ "location" , ["%{lon}","%{lat}"]]
}
}
}
output {
elasticsearch {
host => "slc-places-qa-es3001.slc.where.com"
port => 9200
}
}
You need to add protocol => http to make it use the HTTP transport rather than joining the cluster via multicast.
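Applied to the output block above, that would look like this (a sketch; host and port copied from the question, option names as in the Logstash 1.x elasticsearch output):
output {
  elasticsearch {
    host => "slc-places-qa-es3001.slc.where.com"
    port => 9200
    protocol => "http"
  }
}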