How to get ElasticSearch output? - elasticsearch

I want to add my log document to Elasticsearch and then check the document in Elasticsearch.
Following is the content of the log file:
Jan 1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>
Feb 2 06:25:43 mailserver15 postfix/cleanup[21403]: BEF25A72999: message-id=<20130101142543.5828399CCAF@mailserver15.example.com>
Mar 3 06:25:43 mailserver16 postfix/cleanup[21403]: BEF25A72998: message-id=<20130101142543.5828399CCAF@mailserver16.example.com>
I am able to run my Logstash instance with the following Logstash configuration file:
input {
  file {
    path => "/Myserver/mnt/appln/somefolder/somefolder2/testData/fileValidator-access.LOG"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    patterns_dir => ["/Myserver/mnt/appln/somefolder/somefolder2/logstash/pattern"]
    match => { "message" => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:syslog_message}" }
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    document_id => "test"
    index => "testindex"
    action => "update"
  }
  stdout { codec => rubydebug }
}
I have defined my own grok pattern as:
POSTFIX_QUEUEID [0-9A-F]{10,11}
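For illustration, here is roughly what the grok filter above should extract from the first sample line and print via stdout { codec => rubydebug }; this is a sketch based on %{SYSLOGBASE} and the custom pattern (Logstash would also add fields such as @timestamp, @version, host and path), not the poster's actual output:
{
           "message" => "Jan 1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>",
         "timestamp" => "Jan 1 06:25:43",
         "logsource" => "mailserver14",
           "program" => "postfix/cleanup",
               "pid" => "21403",
          "queue_id" => "BEF25A72965",
    "syslog_message" => "message-id=<20130101142543.5828399CCAF@mailserver14.example.com>"
}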
When I run the Logstash instance, the data is successfully sent to Elasticsearch.
Now I have the index stored in Elasticsearch under testindex, but when I use curl -X GET "localhost:9200/testindex" I get the following output:
{
  "depositorypayin" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1547795277865",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "5TKW2BfDS66cuoHPe8k5lg",
        "version" : {
          "created" : "6050499"
        },
        "provided_name" : "depositorypayin"
      }
    }
  }
}
This is not what is stored inside the index. I want to query the documents inside the index. Please help. (PS: please forgive me for the typos)

The API you used above only returns information about the index itself (docs here). You need to use the Query DSL to search the documents. The following Match All Query will return all the documents in the index testindex:
curl -X GET "localhost:9200/testindex/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all": {}
  }
}
'
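Once the documents are indexed, you can also search on the fields the grok filter extracted. For example, a sketch of a query on the queue_id field from the question (the value is taken from the first sample log line):
curl -X GET "localhost:9200/testindex/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": { "queue_id": "BEF25A72965" }
  }
}
'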

Actually, I have edited my config file, which looks like this now:
input {
  . . .
}
filter {
  . . .
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "testindex"
  }
}
And now I am able to fetch the data from Elasticsearch using
curl 'localhost:9200/testindex/_search'
I don't know how it works, but it does now.
Can anyone explain why?
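A likely explanation (my reading, not confirmed in the thread): with document_id => "test" and action => "update", every event targeted one and the same document with _id "test", and an update request fails when that document does not exist yet, so no documents showed up. Without those options each event is indexed as a separate document with an auto-generated _id, which is what _search now returns. If one document per Postfix queue ID were actually wanted, a sketch along these lines should work (queue_id comes from the grok pattern; doc_as_upsert is an existing option of the elasticsearch output that creates the document when it is missing):
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "testindex"
    document_id => "%{queue_id}"
    action => "update"
    doc_as_upsert => true
  }
}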

Related

Elasticsearch, upsert a document with script when the index does not exist

I'm receiving some payloads in Logstash that I push to Elasticsearch into a monthly rolling index, using a script that lets me override fields depending on the order in which the statuses of those payloads arrive.
Example:
{
  "id" : "abc",
  "status" : "OPEN",
  "field1" : "foo",
  "opening_ts" : 1234567
}
{
  "id" : "abc",
  "status" : "CLOSED",
  "field1" : "bar",
  "closing_ts": 7654321
}
I want that, even if I receive the OPEN payload after the CLOSED one for the id "abc", my Elasticsearch document ends up as:
{
  "_id" : "abc",
  "status": "CLOSED",
  "field1" : "bar",
  "closing_ts": 7654321,
  "opening_ts" : 1234567
}
In order to guarantee that, I have added a script to my elasticsearch output plugin in Logstash:
script => "
  if (ctx._source['status'] == 'CLOSED') {
    for (key in params.event.keySet()) {
      if (ctx._source[key] == null) {
        ctx._source[key] = params.event[key]
      }
    }
  } else {
    for (key in params.event.keySet()) {
      ctx._source[key] = params.event[key]
    }
  }
"
But adding this script also adds an extra step in front of the implicit "PUT" on the index, and if the target index does not exist, the script fails and the whole document is never created (nor is the index).
Do you know how I could handle an error in this script?
You need to resort to scripted upsert:
output {
  elasticsearch {
    index => "your-index"
    document_id => "%{id}"
    action => "update"
    scripted_upsert => true
    script => "... your script..."
  }
}
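Roughly speaking, with scripted_upsert => true the script also runs when the document does not exist yet, starting from an empty _source, so the else branch of the script from the question copies all event fields and creates the document (and the index) on the first payload seen for an id. A combined sketch, with hosts as a placeholder since the question does not show them:
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "your-index"
    document_id => "%{id}"
    action => "update"
    scripted_upsert => true
    script => "
      if (ctx._source['status'] == 'CLOSED') {
        for (key in params.event.keySet()) {
          if (ctx._source[key] == null) {
            ctx._source[key] = params.event[key]
          }
        }
      } else {
        for (key in params.event.keySet()) {
          ctx._source[key] = params.event[key]
        }
      }
    "
  }
}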

Elasticsearch index not being created with settings from logstash template

I have a bulk upload for a new index that I'm sending to my ES cluster from logstash. As such I want replication and refreshing turned off until the load is done, and I'll re-enable those values after the upload is complete.
I have a config file that looks like the following
input {
  stdin { type => stdin }
}
filter {
  csv {
    separator => " "
    columns => [ ...]
  }
}
output {
  amazon_es {
    hosts => ["my-domain.us-east-1.es.amazonaws.com"]
    index => "my-index"
    template => "conf/my-index-template.json"
    template_name => "my-index-template-name"
    region => "us-east-1"
  }
}
And the template file looks like
{
  "template" : "my-index-template-name",
  "mappings" : {
    ...
  },
  "settings" : {
    "index" : {
      "number_of_shards" : "48",
      "number_of_replicas" : "0",
      "refresh_interval": "-1"
    }
  }
}
And when I run Logstash and look at the settings for that index, the mappings from this template are all respected, which is good, but everything in the settings section is ignored and the index takes on default values (i.e. number_of_shards=5 and number_of_replicas=1).
Some investigation notes:
If I get the template after it's installed from ES itself I see the proper values in the template (for both mappings and settings). They just don't seem to be applying to the index
Also if I take the contents of the template file and create the index manually w/ a PUT it shows up as I would expect
My logstash version is 7.3.0 and my elasticsearch version is 6.7
Not sure what I'm doing wrong here
Your index name is my-index, but the template setting in your template file uses my-index-template-name; it needs to be either a pattern that matches your index name (e.g. my-index*) or the exact index name.
Since you are using Elasticsearch 6.7 you should use index_patterns instead of template in your template file.
{
  "index_patterns" : ["my-index"],
  "mappings" : {
    ...
  },
  "settings" : {
    "index" : {
      "number_of_shards" : "48",
      "number_of_replicas" : "0",
      "refresh_interval": "-1"
    }
  }
}
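Two follow-up checks that may help (index and template names taken from the question): index templates are only applied at index creation time, so the existing my-index has to be deleted or the data reindexed under a new name before the corrected settings take effect, and the template API shows what is actually installed:
curl -X GET "localhost:9200/_template/my-index-template-name?pretty"
curl -X DELETE "localhost:9200/my-index"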

Logstash does not send anything to ES

I have IIS failed-request log XML files and I am trying to read, parse, and send them to ES, but my Logstash does not send anything.
I could not find any solution. Thanks for your help.
input {
  file {
    path => "C:\Users\name\Desktop\Log2\WebApplication2\*.xml"
  }
}
filter {
  xml {
    source => "message"
    store_xml => false
    target => "target"
    xpath => ["/failedRequest/_url/#text", "clasification"]
    remove_field => "message"
  }
}
filter {
  mutate {
    add_field => { "class" => "%{target}" }
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "logfromstash"
  }
}
I edited the XML tree. IIS is using freb.xsl, can that cause an error?
object{1}
  failedRequest{16}
    Event[161]
    _xmlns:freb : http://schemas.microsoft.com/win/2006/06/iis/freb
    _url : https://localhost:44324/api/employee/14
    _siteId : 2
    _appPoolId : Clr4IntegratedAppPool
    _processId : 8240
    _verb : GET
    _remoteUserName :
    _userName :
    _tokenUserName : name
    _authenticationType : anonymous
    _activityId : {800000B5-0002-F900-B63F-84710C7967BB}
    _failureReason : STATUS_CODE
    _statusCode : 200
    _triggerStatusCode : 200
    _timeTaken : 47

How can I configure a custom field to be aggregatable in Kibana?

I am new to running the ELK stack. I have Logstash configured to feed my webapp log into Elasticsearch. I am trying to set up a visualization in Kibana that will show the count of unique users, given by the user_email field, which is parsed out of certain log lines.
I am fairly sure that I want to use the Unique Count aggregation, but I can't seem to get Kibana to include user_email in the list of fields which I can aggregate.
Here is my Logstash configuration:
filter {
  if [type] == "wl-proxy-log" {
    grok {
      match => {
        "message" => [
          "(?<syslog_datetime>%{SYSLOGTIMESTAMP}\s+%{YEAR})\s+<%{INT:session_id}>\s+%{DATA:log_message}\s+license=%{WORD:license}\&user=(?<user_email>%{USERNAME}\@%{URIHOST})\&files=%{WORD:files}",
        ]
      }
      break_on_match => true
    }
    date {
      match => [ "syslog_datetime", "MMM dd HH:mm:ss yyyy", "MMM d HH:mm:ss yyyy" ]
      target => "@timestamp"
      locale => "en_US"
      timezone => "America/Los_Angeles"
    }
    kv {
      source => "uri_params"
      field_split => "&?"
    }
  }
}
output {
  elasticsearch {
    ssl => false
    index => "wl-proxy"
    manage_template => false
  }
}
Here is the relevant mapping in Elasticsearch:
{
  "wl-proxy" : {
    "mappings" : {
      "wl-proxy-log" : {
        "user_email" : {
          "full_name" : "user_email",
          "mapping" : {
            "user_email" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }
  }
}
Can anyone tell me what I am missing?
BTW, I am running CentOS with the following versions:
Elasticsearch Version: 6.0.0, Build: 8f0685b/2017-11-10T18:41:22.859Z, JVM: 1.8.0_151
Logstash v.6.0.0
Kibana v.6.0.0
Thanks!
I figured it out. The configuration was correct, AFAICT. The issue was that I simply hadn't refreshed the list of fields in the index in the Kibana UI.
Management -> Index Patterns -> Refresh Field List (the refresh icon)
After doing that, the field began appearing in the list of aggregatable terms, and I was able to create the necessary visualizations.
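As a side note, because user_email is mapped as text with a keyword sub-field, the aggregatable field Kibana offers is user_email.keyword, and that is the one to pick for a Unique Count. A sketch of the equivalent query against Elasticsearch directly (index name taken from the question):
curl -X GET "localhost:9200/wl-proxy/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "unique_users": {
      "cardinality": { "field": "user_email.keyword" }
    }
  }
}
'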

Logstash - elasticsearch getting only new data

I want to run a Logstash process that grabs real-time data with a certain value in a field and outputs it to the screen. So far I've come up with this configuration:
input {
  elasticsearch {
    hosts => "localhost"
    user => "logstash"
    password => "logstash"
    size => 100
    query => '{ "query" : { "bool" : { "must" : { "bool" : { "should" : [ {"match": {"field": "value2"}}, {"match": {"field": "value1"}} ] } } } } }'
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
What I've learned from running this config is that:
Logstash outputs the data in batches, whose size is determined by the size parameter.
There's a few seconds' delay between each batch.
Logstash grabs the existing data first.
My question: is there any configuration that can change the process so that Logstash only listens for new data and outputs it as soon as the data comes into Elasticsearch? Any help would be appreciated.
