Logstash filter to identify address matches - elasticsearch

I have a CSV file with customer addresses. I also have an Elasticsearch index with my own addresses. I use Logstash to import the CSV file. I'd like to use a Logstash filter to check in my index whether the customer address already exists. All I found is the default elasticsearch filter ("Copies fields from previous log events in Elasticsearch to current events"), which doesn't look like the correct one for my problem. Does another filter exist for this?
Here is my configuration file so far:
input {
  file {
    path => "C:/import/Logstash/customer.CSV"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
filter {
  csv {
    columns => [
      "Customer",
      "City",
      "Address",
      "State",
      "Postal Code"
    ]
    separator => ";"
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "customer-unmatched"
  }
  stdout {}
}

You don't normally have access to the data in Elasticsearch while processing your Logstash event. Consider using a pipeline on an ingest node instead.
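For the Logstash side of that approach, the elasticsearch output can hand each event to an ingest pipeline via its pipeline option. A minimal sketch, assuming a hypothetical pipeline called "address-match" that you would define in Elasticsearch to do the lookup:
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "customer-unmatched"
    # "address-match" is a hypothetical ingest pipeline created in Elasticsearch;
    # it runs on the ingest node before each customer document is indexed.
    pipeline => "address-match"
  }
  stdout {}
}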

Related

Logstash pipeline data is not pushed to Elasticsearch

input {
  file {
    path => "/home/blusapphire/padma/sampledata.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => [ "First_Name", "Last_Name", "Age", "Salary", "Emailid", "Gender" ]
  }
}
output {
  elasticsearch {
    hosts => ["${ES_INGEST_HOST_02}:9200"]
    index => "network"
    user => "adcd"
    password => "adcbdems"
  }
}
This is my Logstash config file. When I run it, I'm not seeing the CSV data in Elasticsearch, and the index is not being created. Is there any mistake in the configuration?
Based on your screenshot, it is clear that Logstash has noticed your file. To get Logstash to feel it is making "first contact" with the file, try:
shut down Logstash
delete the sincedb file
start up Logstash
Alternatively, move the file outside the monitored folder, execute the above steps and then drop the file back in.
Can you also confirm that your setup works with other files, just in case?
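If you want Logstash to re-read the file on every run while you test, a minimal sketch (mirroring the first question above) is to point sincedb_path at the null device so no state is kept:
input {
  file {
    path => "/home/blusapphire/padma/sampledata.csv"
    start_position => "beginning"
    # For testing only: keep no sincedb state, so the file is re-read from the
    # beginning on every run. Use "NUL" instead of "/dev/null" on Windows.
    sincedb_path => "/dev/null"
  }
}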

reading .gz files using logstash

I am trying to use Logstash 5.5 to analyze archived (.gz) files generated every minute. Each .gz file contains a CSV file. My .conf file looks like this:
input {
  file {
    type => "gzip"
    path => [ "C:\data*.gz" ]
    start_position => "beginning"
    sincedb_path => "gzip"
    codec => gzip_lines
  }
}
filter {
  csv {
    separator => ","
    columns => ["COL1","COL2","COL3","COL4","COL5","COL6","COL7"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "mydata"
    document_type => "zipdata"
  }
  stdout {}
}
Initially I was getting an error for the missing gzip_lines plugin, so I installed it. After installing the plugin, Logstash says "Successfully started Logstash API endpoint", but nothing gets indexed. I do not see any indexing of data into Elasticsearch in the Logstash logs. When I look for the index in Kibana, it is not available there. It means that Logstash is not putting data into Elasticsearch.
Maybe I am using the wrong configuration. Please suggest the correct way of doing this.
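For reference, here is a syntax-level cleanup of the same config, a minimal sketch that assumes the .gz files live under C:/data. It only fixes straight quotes, the path glob, and the sincedb_path, and does not by itself guarantee that the gzip_lines codec works with the file input on 5.5:
input {
  file {
    type => "gzip"
    # Forward slashes avoid Windows escaping problems; straight quotes only.
    path => [ "C:/data/*.gz" ]
    start_position => "beginning"
    # sincedb_path should be a writable state file (or "NUL" while testing), not "gzip".
    sincedb_path => "C:/data/sincedb"
    codec => gzip_lines
  }
}
filter {
  csv {
    separator => ","
    columns => ["COL1","COL2","COL3","COL4","COL5","COL6","COL7"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "mydata"
    document_type => "zipdata"
  }
  stdout {}
}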

Plot a Tile map with the ELK stack

I'm trying to create a tile map with Kibana. My Logstash conf file works correctly and generates everything Kibana needs to plot a tile map. This is my Logstash conf:
input {
  file {
    path => "/home/ec2-user/part.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["kilo_bytes_total","ip","session_number","request_number_total","duration_minutes_total","referer_list","filter_match_count_avg","request_number_avg","duration_minutes_avg","kilo_bytes_avg","segment_duration_avg","req_by_minute_avg","segment_mix_rank_avg","offset_avg_avg","offset_std_avg","extrem_interval_count_avg","pf0_avg","pf1_avg","pf2_avg","pf3_avg","pf4_avg","code_0_avg","code_1_avg","code_2_avg","code_3_avg","code_4_avg","code_5_avg","volume_classification_filter_avg","code_classification_filter_avg","profiles_classification_filter_avg","strange_classification_filter_avg"]
  }
  geoip {
    source => "ip"
    database => "/home/ec2-user/logstash-5.2.0/GeoLite2-City.mmdb"
    target => "geoip"
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    add_tag => "geoip"
  }
  mutate {
    convert => [ "[geoip][coordinates]", "float" ]
  }
}
output {
  elasticsearch {
    index => "geotrafficip"
  }
}
This is what it generates, and it looks cool. But when I try to create my tile map, I get an error message.
What should I do?
It seems that I must add the possibility to use dynamic templates somewhere. Should I create a template and add it to my Logstash conf file?
Can anybody give me some feedback? Thanks!
If you look in the Kibana settings for your index, you'll need at least one field to show up with a type of geo_point to be able to get anything on a map.
If you don't already have a geo_point field, you'll need to re-index your data after setting up an appropriate mapping for the geoip.coordinates field. For example: https://stackoverflow.com/a/42004303/2785358
If you are using a relatively new version of Elasticsearch (2.3 or later), it's relatively easy to re-index your data. You need to create a new index with the correct mapping, use the re-index API to copy the data to the new index, delete the original index and then re-index back to the original name.
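If you prefer to stay inside Logstash rather than calling the re-index API directly, the copy can also be done with an elasticsearch input feeding an elasticsearch output. A minimal sketch, where the target index name is an assumption and must already exist with the coordinates field mapped as geo_point:
input {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "geotrafficip"
    # Pull every document out of the badly mapped index.
    query => '{ "query": { "match_all": {} } }'
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    # Hypothetical target index; create it first with a geo_point mapping for the
    # coordinates field, then re-index back if you need the original name.
    index => "geotrafficip-fixed"
  }
}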
You are using the geoip filter incorrectly and trying to convert the longitude and latitude to float. Get rid of your mutate filter and change the geoip filter to this:
geoip {
  source => "ip"
  fields => ["latitude","longitude"]
  add_tag => "geoip"
}
This will create the appropriate fields and the required GeoJSON object.

Logstash Update a document in elasticsearch

I am trying to update a specific field in Elasticsearch through Logstash. Is it possible to update only a set of fields through Logstash?
Please find the code below:
input {
  file {
    path => "/**/**/logstash/bin/*.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    type => "multi"
  }
}
filter {
  csv {
    separator => "|"
    columns => ["GEOREFID", "COUNTRYNAME", "G_COUNTRY", "G_UPDATE", "G_DELETE", "D_COUNTRY", "D_UPDATE", "D_DELETE"]
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-data-monitor"
    query => "GEOREFID:%{GEOREFID}"
    fields => [["JSON_COUNTRY","G_COUNTRY"],
               ["XML_COUNTRY","D_COUNTRY"]]
  }
  if [G_COUNTRY] {
    mutate {
      update => { "D_COUNTRY" => "%{D_COUNTRY}" }
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-data-monitor"
    document_id => "%{GEOREFID}"
  }
}
We are using the above configuration. When we use it, the null-value fields get removed instead of the null-value update being skipped.
Data comes from two different sources: one is an XML file and the other is a JSON file.
XML log format : GEO-1|CD|23|John|892|Canada|31-01-2017|QC|-|-|-|-|-
JSON log format : GEO-1|AS|33|-|-|-|-|-|Mike|123|US|31-01-2017|QC
When the first log is added, a new document gets created in the index. When the second log file is read, the existing document should get updated. The update should happen only in the first 5 fields if the log file is XML and in the last 5 fields if the log file is JSON. Please suggest how to do this in Logstash.
I tried with the above code. Can anyone help with how to fix this?
For the Elasticsearch output to do any action other than index, you need to tell it to do something else.
elasticsearch {
  hosts => ["localhost:9200"]
  index => "logstash-data-monitor"
  action => "update"
  document_id => "%{GEOREFID}"
}
This should probably be wrapped in a conditional to ensure you're only updating records that need updating. There is another option, though: doc_as_upsert.
elasticsearch {
  hosts => ["localhost:9200"]
  index => "logstash-data-monitor"
  action => "update"
  doc_as_upsert => true
  document_id => "%{GEOREFID}"
}
This tells the plugin to insert if it is new, and update if it is not.
However, you're attempting to use two inputs to define a document. This makes things complicated. Also, you're not providing both inputs, so I'll improvise. To provide different output behavior, you will need to define two outputs.
input {
  file {
    path => "/var/log/xmlhome.log"
    [other details]
  }
  file {
    path => "/var/log/jsonhome.log"
    [other details]
  }
}
filter { [some stuff] }
output {
  if [path] == '/var/log/xmlhome.log' {
    elasticsearch {
      [XML file case]
    }
  } else if [path] == '/var/log/jsonhome.log' {
    elasticsearch {
      [JSON file case]
      action => "update"
    }
  }
}
Setting it up like this will allow you to change the Elasticsearch behavior based on where the event originated.
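Filled in with concrete values, the sketch below shows what that output section might look like; the log paths and the choice of doc_as_upsert on the XML side are assumptions:
output {
  if [path] == "/var/log/xmlhome.log" {
    # XML events create the document (or update it if it already exists).
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "logstash-data-monitor"
      action => "update"
      doc_as_upsert => true
      document_id => "%{GEOREFID}"
    }
  } else if [path] == "/var/log/jsonhome.log" {
    # JSON events only update the existing document with their own fields.
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "logstash-data-monitor"
      action => "update"
      document_id => "%{GEOREFID}"
    }
  }
}
To keep the "-" placeholder columns from overwriting real values, you would also drop those fields in the filter section (for example with mutate's remove_field) before the event reaches the output.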

Multiple logs in single config file to elasticsearch

I want to send logs from different locations to Elasticsearch using a Logstash conf file.
input {
  file {
    path => "C:/Users/611166850/Desktop/logstash-5.0.2/logstash-5.0.2/logs/logs/ce.log"
    type => "CE"
    start_position => "beginning"
  }
  file {
    path => "C:/Users/611166850/Desktop/logstash-5.0.2/logstash-5.0.2/logs/logs/spovp.log"
    type => "SP"
    start_position => "beginning"
  }
  file {
    path => "C:/Users/611166850/Desktop/logstash-5.0.2/logstash-5.0.2/logs/logs/ovpportal_log"
    type => "OVP"
    start_position => "beginning"
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => "localhost:9200"
    index => "multiple"
    codec => json
    workers => 1
  }
}
This is the config file I use, but Kibana is not recognising this index. Can someone help with this?
Thanks in advance, Rashmi
Check Logstash's log file for errors (maybe your config file is not correct).
Also search ES directly for the index; maybe the problem is not Kibana, and you don't have any index with this name.
Try starting Logstash in debug mode to see if there are any logs in it.
You can also try to get the Logstash output to a file on the local system rather than sending it directly to Elasticsearch. Uncomment a block as per your requirement:
# block-1
# if "_grokparsefailure" not in [tags] {
#   stdout {
#     codec => rubydebug { metadata => true }
#   }
# }
# block-2
# if "_grokparsefailure" not in [tags] {
#   file {
#     path => "/tmp/out-try1.logstash"
#   }
# }
By either of these methods you can get the output to the console or to a file. Comment out the _grokparsefailure condition in case you don't see any output in the file.
Note: in Kibana, default indices have @timestamp in their fields, so check:
1. whether Kibana is able to recognize the index if you uncheck the checkbox on the page where you create the new index
2. whether your logs are properly parsed. If not, you need to work out grok filters with patterns matching your logs (see the sketch below)
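As a sketch of what such a grok filter can look like, assuming a simple "timestamp level message" layout (adjust the pattern to your actual log lines):
filter {
  grok {
    # Hypothetical pattern for lines like: 2017-02-01 10:15:30 INFO something happened
    match => { "message" => "%{TIMESTAMP_ISO8601:log_time} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
  date {
    # Point @timestamp at the parsed time so Kibana's time filter works.
    match => [ "log_time", "yyyy-MM-dd HH:mm:ss" ]
  }
}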
All Elasticsearch indices are visible at http://elasticsearch-ip:9200/_cat/indices?v (your Elasticsearch IP), so try that too and share what you find.
