I am sending an HTTP GET request to an Elasticsearch server and I want the response in CSV format. In Solr we can specify wt=csv; is there a similar way in Elasticsearch?
My query is:
http://elasticServer/_search?q=RCE:"some date" OR VENDOR_NAME:"Anuj"&from=0&size=5&sort=@timestamp
After that, I want to force the server to return the response in CSV format.
By default, ES supports only two data formats: JSON and YAML. However, if you're open to using Logstash, you can achieve what you want very easily like this:
input {
  elasticsearch {
    hosts => ["localhost:9200"]
    query => 'RCE:"some date" OR VENDOR_NAME:"Anuj"'
    size => 5
  }
}
filter {}
output {
  csv {
    fields => ["field1", "field2", "field3"]
    path => "/path/to/data.csv"
  }
}
Since the elasticsearch input uses scrolling, you cannot specify any sorting. So if sorting is really important to you, you can use the http_poller input instead of the elasticsearch one, like this:
input {
  http_poller {
    urls => {
      es => {
        method => get
        url => 'http://elasticServer/_search?q=RCE:"some date" OR VENDOR_NAME:"Anuj"&from=0&size=5&sort=@timestamp'
        headers => {
          Accept => "application/json"
        }
      }
    }
    codec => "json"
  }
}
filter {}
output {
  csv {
    fields => ["field1", "field2", "field3"]
    path => "/path/to/data.csv"
  }
}
There is an Elasticsearch plugin on GitHub called Elasticsearch Data Format Plugin that should satisfy your requirements.
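If neither Logstash nor a plugin is an option, the JSON response can also be converted on the client side. Below is a minimal Python sketch of that idea, not an Elasticsearch feature. It assumes the usual _search response shape (hits.hits[]._source); the helper name hits_to_csv, the field names field1..field3, and the sample response are made up for illustration.

```python
import csv
import io


def hits_to_csv(response, fields):
    """Flatten the _source of every hit in a search response into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)  # header row
    for hit in response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        # Missing fields become empty cells instead of raising KeyError.
        writer.writerow([source.get(name, "") for name in fields])
    return buffer.getvalue()


# Made-up response in the shape returned by the _search endpoint.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"field1": "a", "field2": 1, "field3": "x"}},
            {"_source": {"field1": "b", "field2": 2}},
        ]
    }
}

csv_text = hits_to_csv(sample_response, ["field1", "field2", "field3"])
print(csv_text)
```

In a real script, sample_response would be replaced by the parsed JSON body of the GET request from the question.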
Using http_poller as the input plugin and elasticsearch as the output plugin, can we send data from different URLs into different indices in Elasticsearch using one single Logstash config file?
Yes, you can do it using if conditions in the output section:
input {
  http_poller {
    ...
    add_tag => "source1"
  }
  http_poller {
    ...
    add_tag => "source2"
  }
}
filter {
  ...
}
output {
  if "source1" in [tags] {
    elasticsearch {
      ...
      index => "index1"
    }
  } else if "source2" in [tags] {
    elasticsearch {
      ...
      index => "index2"
    }
  }
}
Hi everyone. I'm new to ELK and I have a question about Logstash.
I have some services and each one writes 4 or 6 logs, which means a document in Elasticsearch may hold 4 or 6 logs.
I want to read these logs and, if they have the same id, put them in one Elasticsearch document.
I must specify that every log carries an "id": each request has a unique id, and every log that refers to that request has the same id. Each log also has a specific type.
I want to put together every log that has the same id, grouped by type, like this:
{
  "_id": "123",
  "Type1": {},
  "Type2": [{}, {}],
  "Type3": [{}, {}],
  "Type4": {}
}
Among the logs for the same request, some must go into the same group because they share a type. In the example above, Type2 is a JSON array that contains two JSON objects. I want to use Logstash to read every log and classify them this way.
Imagine that our document currently looks like the JSON below:
{
  "_id": "123",
  "Type1": {},
  "Type2": [{}, {}],
  "Type3": {}
}
Now a new log arrives with id 123 and type Type4. The document must be updated like this:
{
  "_id": "123",
  "Type1": {},
  "Type2": [{}, {}],
  "Type3": {},
  "Type4": {}
}
Then another log arrives with id 123 and type Type3. The document is updated like this:
{
  "_id": "123",
  "Type1": {},
  "Type2": [{}, {}],
  "Type3": [{}, {}],
  "Type4": {}
}
I tried with a script, but I didn't succeed. Sample input:
{
  "id": 1,
  "Type2": {}
}
The script is:
input {
  stdin {
    codec => json_lines
  }
}
output {
  elasticsearch {
    hosts => ["XXX.XXX.XXX.XXX:9200"]
    index => "ss"
    document_id => "%{requestId}"
    action => "update" # update if possible instead of overwriting
    document_type => "_doc"
    script_lang => "painless"
    scripted_upsert => true
    script_type => "inline"
    script => 'if (ctx._source.Type3 == null) { ctx._source.Type3 = new ArrayList() } if(!ctx._source.Type3.contains("%{Type3}")) { ctx._source.Type3.add("%{Type3}")}'
  }
}
Now my problem is that this script handles just one type; what would it look like if it had to work for multiple types?
There is one more problem. Some of my logs don't have an id, and others have an id but no type. I want these logs in Elasticsearch too; what should I do?
You can have a look at the aggregate filter plugin for Logstash. And, as you mentioned, if some of the logs don't have an id, you can use the fingerprint filter plugin to create one, which you can then use to update the document in Elasticsearch.
E.g.:
input {
  stdin {
    codec => json_lines
  }
}
filter {
  fingerprint {
    source => "message"
    target => "[@metadata][id]"
    method => "MURMUR3"
  }
}
output {
  elasticsearch {
    hosts => ["XXX.XXX.XXX.XXX:9200"]
    index => "ss"
    document_id => "%{[@metadata][id]}"
    action => "update" # update if possible instead of overwriting
  }
}
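Before porting the update rule to a Painless script, it may help to pin down the intended behaviour in plain code. The Python sketch below only illustrates the merge rule described in the question (first log of a type is stored as an object, a second log of the same type turns the value into an array); it is not Logstash or Painless code, and the function name merge_log is hypothetical.

```python
def merge_log(doc, log_type, payload):
    """Merge one log payload into the document under the key log_type."""
    existing = doc.get(log_type)
    if existing is None:
        # First log of this type: store it as a single object.
        doc[log_type] = payload
    elif isinstance(existing, list):
        # Already an array: append the new log.
        existing.append(payload)
    else:
        # Second log of this type: promote the single object to an array.
        doc[log_type] = [existing, payload]
    return doc


# Document state taken from the question, before the two new logs arrive.
doc = {"_id": "123", "Type1": {}, "Type2": [{}, {}], "Type3": {}}
merge_log(doc, "Type4", {})  # new type: added as an object
merge_log(doc, "Type3", {})  # existing type: becomes an array of two
print(doc)
```

Because the type name is passed in as a value rather than hard-coded, the same rule covers any number of types, which is the generalization the question asks about.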
I would like to extract fields in Kibana from the @message field, which contains JSON.
For example:
Audit{
  uuid='xxx-xx-d3sd-fds3-f43',
  action='/v1.0/execute/super/method',
  resultCode='SUCCESS',
  browser='null',
  ipAddress='192.168.2.44',
  application='application1',
  timeTaken='167'
}
With the "action" and "application" fields available, I hope to be able to find the top 5 requests that hit the application.
I started with something similar to this:
filter {
  if ([message] =~ "Audit") {
    grok {
      match => {
        "message" => "%{WORD:uuid}, %{WORD:action}, %{WORD:resultCode}, %{WORD:browser}, %{WORD:ipAddress}, %{WORD:application}, %{NUMBER:timeTaken}"
      }
      add_field => ["action", "%{action}"]
      add_field => ["application", "%{application}"]
    }
  }
}
But it seems to be too far from reality.
If the content of "Audit" is really in JSON format, you can use the "json" filter plugin:
json {
  source => "Audit"
}
It will do the parsing for you and create all the fields. You don't need grok or add_field.
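Note that the sample in the question is not strict JSON: it has an Audit prefix, unquoted keys, and single-quoted values, so a JSON parser would reject it as shown. To make clear what the extraction has to do in that case, here is a small Python sketch that pulls out the key='value' pairs with a regular expression. It only illustrates the parsing logic and is not a Logstash filter; the names PAIR and parse_audit are hypothetical.

```python
import re

# Matches pairs of the form key='value' as they appear in the Audit{...} sample.
PAIR = re.compile(r"(\w+)='([^']*)'")


def parse_audit(message):
    """Return the key='value' pairs of an Audit message as a dict of strings."""
    return dict(PAIR.findall(message))


sample = (
    "Audit{\n"
    "uuid='xxx-xx-d3sd-fds3-f43',\n"
    "action='/v1.0/execute/super/method',\n"
    "resultCode='SUCCESS',\n"
    "browser='null',\n"
    "ipAddress='192.168.2.44',\n"
    "application='application1',\n"
    "timeTaken='167'\n"
    "}"
)

fields = parse_audit(sample)
print(fields["action"], fields["application"])
```

Once "action" and "application" exist as separate fields, the top 5 requests per application become a simple count over those two values.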
I use logstash-logback-encoder to send Java log files to Logstash, and then to Elasticsearch. To parse the message in the Java log, I use the following filter to dissect the message:
input {
  file {
    path => "/Users/MacBook-201965/Work/java/logs/oauth-logstash.log"
    start_position => "beginning"
    codec => "json"
  }
}
filter {
  if "EXECUTION_TIME" in [tags] {
    dissect {
      mapping => {
        "message" => "%{endpoint} timeMillis:[%{execution_time_millis}] data:%{additional_data}"
      }
    }
    mutate {
      convert => { "execution_time_millis" => "integer" }
    }
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "elk-%{+YYYY}"
    document_type => "log"
  }
  stdout {
    codec => json
  }
}
It dissects the message so I can get the value of execution_time_millis. However, the data type is string. I created the index using a Kibana index pattern. How can I change the data type of execution_time_millis to long?
Here is a sample JSON message from logback:
{
  "message": "/tests/{id} timeMillis:[142] data:2282||0:0:0:0:0:0:0:1",
  "logger_name": "com.timpamungkas.oauth.client.controller.ElkController",
  "level_value": 20000,
  "endpoint": "/tests/{id}",
  "execution_time_millis": "142",
  "@version": 1,
  "host": "macbook201965s-MacBook-Air.local",
  "thread_name": "http-nio-8080-exec-7",
  "path": "/Users/MacBook-201965/Work/java/logs/oauth-logstash.log",
  "@timestamp": "2018-01-04T11:20:20.100Z",
  "level": "INFO",
  "tags": [
    "EXECUTION_TIME"
  ],
  "additional_data": "2282||0:0:0:0:0:0:0:1"
}
{
  "message": "/tests/{id} timeMillis:[110] data:2280||0:0:0:0:0:0:0:1",
  "logger_name": "com.timpamungkas.oauth.client.controller.ElkController",
  "level_value": 20000,
  "endpoint": "/tests/{id}",
  "execution_time_millis": "110",
  "@version": 1,
  "host": "macbook201965s-MacBook-Air.local",
  "thread_name": "http-nio-8080-exec-5",
  "path": "/Users/MacBook-201965/Work/java/logs/oauth-logstash.log",
  "@timestamp": "2018-01-04T11:20:19.780Z",
  "level": "INFO",
  "tags": [
    "EXECUTION_TIME"
  ],
  "additional_data": "2280||0:0:0:0:0:0:0:1"
}
Thank you
If you have already indexed the documents, you'll have to reindex the data after changing the data type of any field.
However, you can use something like this to change the type of execution_time_millis from string to integer (long is not supported by this option):
https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-convert
Also, try defining an Elasticsearch index template before creating the index if you are going to create multiple indices whose names follow a pattern. Otherwise, you can define the mapping of your index beforehand and then start indexing.
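To make the expected result concrete, the dissect mapping and the integer conversion from the question can be mimicked outside Logstash. The Python sketch below is only an illustration of what the event should look like before it is indexed; the regular expression is my own approximation of the dissect pattern, and the name dissect_message is hypothetical.

```python
import re

# Approximates: "%{endpoint} timeMillis:[%{execution_time_millis}] data:%{additional_data}"
PATTERN = re.compile(
    r"^(?P<endpoint>\S+) timeMillis:\[(?P<execution_time_millis>[^\]]*)\] "
    r"data:(?P<additional_data>.*)$"
)


def dissect_message(message):
    """Split the log message into fields and convert the timing to an integer."""
    match = PATTERN.match(message)
    if match is None:
        return None  # message does not follow the expected layout
    fields = match.groupdict()
    # Equivalent in spirit to: mutate { convert => { ... => "integer" } }
    fields["execution_time_millis"] = int(fields["execution_time_millis"])
    return fields


event = dissect_message("/tests/{id} timeMillis:[142] data:2282||0:0:0:0:0:0:0:1")
print(event)
```

If the event already carries a number at this point but the field still shows up as a string in Kibana, the existing index mapping is the likely cause, which is why reindexing comes up above.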
I copied
{"name":"myapp","hostname":"banana.local","pid":40161,"level":30,"msg":"hi","time":"2013-01-04T18:46:23.851Z","v":0}
from https://github.com/trentm/node-bunyan and saved it as my logs.json. I am trying to import only two fields (name and msg) into Elasticsearch via Logstash. The problem is that I need some sort of filter that I have not been able to get working. I have successfully imported the line as a single message, but that is not useful in my real case.
That said, how can I import only name and msg into Elasticsearch? I tested several alternatives using http://grokdebug.herokuapp.com/ to arrive at a useful filter, with no success at all.
For instance, %{GREEDYDATA:message} will bring in the entire line as a single message, but how do I split it and ignore everything other than the name and msg fields?
In the end, I am planning to use this:
input {
  file {
    type => "my_type"
    path => [ "/home/logs/logs.log" ]
    codec => "json"
  }
}
filter {
  grok {
    match => { "message" => "data=%{GREEDYDATA:request}" }
  }
  #### some extra lines here probably
}
output {
  elasticsearch {
    codec => json
    hosts => "http://127.0.0.1:9200"
    index => "indextest"
  }
  stdout { codec => rubydebug }
}
I have just gone through the list of available Logstash filters. The prune filter should match your need.
Assuming you have installed the prune filter, your config file should look like:
input {
  file {
    type => "my_type"
    path => [ "/home/logs/logs.log" ]
    codec => "json"
  }
}
filter {
  prune {
    # Entries are regular expressions; "name" and "msg" are anchored
    # so that "name" does not also keep "hostname".
    whitelist_names => [
      "@timestamp",
      "type",
      "^name$",
      "^msg$"
    ]
  }
}
output {
  elasticsearch {
    codec => json
    hosts => "http://127.0.0.1:9200"
    index => "indextest"
  }
  stdout { codec => rubydebug }
}
Please note that you will want to keep type so that Elasticsearch indexes the event into the correct type. @timestamp is required if you want to view the data in Kibana.
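One detail worth checking with a whitelist like this: as far as I know, the prune filter treats each whitelist entry as a regular expression matched against the field name, so an unanchored "name" would also keep "hostname". The Python sketch below mimics that matching on the bunyan sample line to show the difference; it is an illustration under that assumption, not the plugin's code, and prune_whitelist is a hypothetical helper.

```python
import re


def prune_whitelist(event, patterns):
    """Keep only the fields whose name matches at least one regex pattern."""
    compiled = [re.compile(p) for p in patterns]
    return {
        key: value
        for key, value in event.items()
        if any(rx.search(key) for rx in compiled)
    }


# The sample line from the question, already decoded from JSON.
event = {
    "name": "myapp",
    "hostname": "banana.local",
    "pid": 40161,
    "level": 30,
    "msg": "hi",
    "time": "2013-01-04T18:46:23.851Z",
    "v": 0,
}

loose = prune_whitelist(event, ["name", "msg"])       # unanchored patterns
strict = prune_whitelist(event, ["^name$", "^msg$"])  # anchored patterns
print(sorted(loose), sorted(strict))
```

With unanchored patterns "hostname" survives alongside "name" and "msg"; anchoring with ^ and $ keeps exactly the two requested fields.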