I'm trying to use NEST to search through Elasticsearch indexes that were created with Logstash (basically logstash-*).
I have set up NEST with the following code:
Node = new Uri("http://localhost:9200");
Settings = new ConnectionSettings(Node);
Settings.DefaultIndex("logstash-*");
Client = new ElasticClient(Settings);
This is how I try to get results:
var result = Client.Search<Logstash>(s => s
.Query(p => p.Term("Message", "*")));
and I get 0 hits:
http://screencast.com/t/d2FB9I4imE
Here is an example of an entry I would like to find:
{
"_index": "logstash-2016.06.20",
"_type": "logs",
"_id": "AVVtswJxpdkh1tFPP9S5",
"_score": null,
"_source": {
"timestamp": "2016-06-20 14:04:55.6650",
"logger": "xyz",
"level": "debug",
"message": "Processed command service method SearchService.SearchBy in 65 ms",
"exception": "",
"url": "",
"ip": "",
"username": "",
"user_id": "",
"role": "",
"authentication_provider": "",
"application_id": "",
"application_name": "",
"application": "ZBD",
"#version": "1",
"#timestamp": "2016-06-20T12:04:55.666Z",
"host": "0:0:0:0:0:0:0:1"
},
"fields": {
"#timestamp": [
1466424295666
]
},
"sort": [
1466424295666
]
}
I'm using Elasticsearch 5.0.0-alpha3, and the NEST client is currently at alpha2.
This is because of
...
"_type": "logs",
...
When you run a query like yours, it will hit the logstash type, not the logs type, because NEST infers the type name from the generic parameter. You have two options to solve this problem.
Tell NEST to map the Logstash type to the logs type whenever making a request to Elasticsearch, by setting this mapping in the client's settings:
var settings = new ConnectionSettings()
.MapDefaultTypeNames(m => m.Add(typeof(Logstash), "logs"));
var client = new ElasticClient(settings);
Override the default behaviour by setting the type explicitly in the request parameters:
var result = Client.Search<Logstash>(s => s
.Type("logs")
.Query(p => p.Term("message", "*")));
Also notice you should use message, not Message, in the term descriptor, as you don't have such a field in the index. Second thing is that, as far as I know, wildcards are not supported in term queries. You may want to use a query_string query instead.
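For example, a minimal query_string sketch in NEST (assuming the same Logstash POCO and the lowercase message field; the search text "Processed*" is just a placeholder):

var result = Client.Search<Logstash>(s => s
    .Type("logs")
    .Query(q => q
        .QueryString(qs => qs
            .Fields(f => f.Field("message"))
            .Query("Processed*"))));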
Hope it helps.
Related
I'm trying out the new machine learning module in X-Pack. I'm trying to identify rare response codes in HTTP access logs over time. My logs are stored in Elasticsearch as below:
{
"_index": "logstash-2017.05.18",
"_type": "Accesslog",
"_id": "AVxvVfFGdMmRr-0X-J5P",
"_version": 1,
"_score": null,
"_source": {
"request": "/web/Q123/images/buttons/asdf.gif",
"server": "91",
"auth": "-",
"ident": "-",
"verb": "GET",
"type": "Accesslog",
"path": "/path/to/log",
"#timestamp": "2017-05-18T10:20:00.000Z",
"response": "304",
"clientip": "1.1.1.1",
"#version": "1",
"host": "ip-10-10-10-10",
"httpversion": "1.1",
"timestamp": "18/May/2017:10:20:00 +0530"
},
"fields": {
"#timestamp": [
1495102800000
]
}
I added a detector where I selected the function as 'rare' and the 'by_field_name' as 'response'. But when I save the job I get the following error:
Save failed: [illegal_argument_exception] Can't merge a non object mapping [response] with an object mapping [response]
Please help.
The error message means that you are trying to change an existing mapping. However, that is not possible in Elasticsearch. Once a mapping has been created, it cannot be changed.
As explained by Shay Banon himself:
You can't change existing mapping type, you need to create a new index
with the correct mapping and index the data again.
So you must create a new index to create this mapping. Depending on the situation, you either
create an additional index, or
delete the current index and re-create it from scratch.
Of course in the latter case you will lose all data in the index, so prepare accordingly.
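For illustration, a minimal sketch of that using the Python Elasticsearch client (the new index name, the Accesslog type and the keyword field type below are assumptions based on the document shown above, not taken from your cluster): create the new index with an explicit mapping for response, then copy the data over with the _reindex API.

from elasticsearch import Elasticsearch

es = Elasticsearch(['localhost:9200'])

# New index with "response" mapped explicitly (here as keyword).
es.indices.create(index='accesslog-v2', body={
    'mappings': {
        'Accesslog': {
            'properties': {
                'response': {'type': 'keyword'}
            }
        }
    }
})

# Re-index the existing documents into the new index.
es.reindex(body={
    'source': {'index': 'logstash-2017.05.18'},
    'dest': {'index': 'accesslog-v2'}
})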
I'd like to import a text file into Elasticsearch. The text file contains 3 values per line. After spending several hours struggling, I didn't get it done. Help is greatly appreciated.
Elasticsearch 5.4.0 with Logstash installed.
Sample data:
username email hash
username email hash
username email hash
username email hash
username email hash
I also built a Python script, but it's too slow:
import requests
import json
from elasticsearch import Elasticsearch
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
i = 1
with open("my2") as fileobject:
for line in fileobject:
username, email, hash = line.strip('\n').split(' ')
body = {"username": username, "email": email, "password": hash}
es.index(index='dbs', doc_type='db1', id=i, body=body)
i += 1
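A minimal sketch of the same import using the bulk helper from the elasticsearch Python package, which usually indexes far faster than one es.index() call per line (the index and type names are taken from the script above; the rest is an untested sketch):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

def generate_actions(path):
    # Yield one bulk action per line of the input file.
    with open(path) as fileobject:
        for i, line in enumerate(fileobject, start=1):
            username, email, pw_hash = line.strip('\n').split(' ')
            yield {
                '_index': 'dbs',
                '_type': 'db1',
                '_id': i,
                '_source': {'username': username, 'email': email, 'password': pw_hash},
            }

# Send the documents in batches instead of one HTTP request per line.
bulk(es, generate_actions('my2'))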
Edit:
Thanks, it works, but I guess my filter is bad, because I want it to look like this:
{
"_index": "logstash-2017.06.01",
"_type": "db",
"_id": "AVxinqK5XRvft8kN7Q6M",
"_version": 1,
"_score": null,
"_source": {
"username": "Marlb0ro",
"email": "Marlb0ro#site.com",
"hash": "123456",
}
and it puts the data in like this:
{
"_index": "logstash-2017.06.01",
"_type": "logs",
"_id": "AVxinqK5XRvft8kN7Q6M",
"_version": 1,
"_score": null,
"_source": {
"path": "C:/Users/user/Desktop/user/log.txt",
"#timestamp": "2017-06-01T07:46:22.488Z",
"#version": "1",
"host": "DESKTOP-FNGSJ6C",
"message": "username email password",
"tags": [
"_grokparsefailure"
]
},
"fields": {
"#timestamp": [
1496303182488
]
},
"sort": [
1496303182488
]
}
Simply put this in a file called grok.conf:
input {
file {
path => "/path/to/your/file.log"
start_position => beginning
sincedb_path => "/dev/null"
}
}
filter {
grok {
match => {"message" => "%{WORD:username} %{WORD:email} %{WORD:hash}" }
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
}
}
Then run Logstash with bin/logstash -f grok.conf and you should be ok.
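Note that if your real lines contain e-mail addresses (as in the desired output above), %{WORD:email} will not match them, since WORD only covers word characters, not @ or dots. A hedged variant using the standard NOTSPACE pattern:

filter {
  grok {
    match => { "message" => "%{WORD:username} %{NOTSPACE:email} %{NOTSPACE:hash}" }
  }
}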
I cannot get the message field to be decoded from my JSON log line when receiving it via Filebeat.
Here is the line in my logs:
{"levelname": "WARNING", "asctime": "2016-07-01 18:06:37", "message": "One or more gateways are offline", "name": "ep.management.commands.import", "funcName": "check_gateway_online", "lineno": 103, "process": 44551, "processName": "MainProcess", "thread": 140735198597120, "threadName": "MainThread", "server": "default"}
Here is the Logstash config. I tried with and without the codec. The only difference is that the message is escaped when I use the codec.
input {
beats {
port => 5044
codec => "json"
}
}
filter {
json{
source => "message"
}
}
Here is the JSON as it arrives in Elasticsearch:
{
"_index": "filebeat-2016.07.01",
"_type": "json",
"_id": "AVWnpK519vJkh3Ry-Q9B",
"_score": null,
"_source": {
"#timestamp": "2016-07-01T18:07:13.522Z",
"beat": {
"hostname": "59b378d40b2e",
"name": "59b378d40b2e"
},
"count": 1,
"fields": null,
"input_type": "log",
"message": "{\"levelname\": \"WARNING\", \"asctime\": \"2016-07-01 18:07:12\", \"message\": \"One or more gateways are offline on server default\", \"name\": \"ep.controllers.secure_client\", \"funcName\": \"check_gateways_online\", \"lineno\": 80, \"process\": 44675, \"processName\": \"MainProcess\", \"thread\": 140735198597120, \"threadName\": \"MainThread\"}",
"offset": 251189,
"source": "/mnt/ep_logs/ep_.json",
"type": "json"
},
"fields": {
"#timestamp": [
1467396433522
]
},
"sort": [
1467396433522
]
}
What I would like is for the contents of the message field to be decoded.
Many thanks
When that happens, it's usually because your Filebeat instance is configured to send documents directly to ES.
In your filebeat configuration file, make sure to comment out the elasticsearch output.
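For example, in filebeat.yml, something along these lines (the exact layout depends on your Filebeat version, and the Logstash host/port are assumptions matching the beats input above):

output:
  # elasticsearch:           # comment out the direct Elasticsearch output
  #   hosts: ["localhost:9200"]
  logstash:                  # and ship events to Logstash instead
    hosts: ["localhost:5044"]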
I am a beginner with Grafana and Elasticsearch. I added an Elasticsearch data source in Grafana. Now I would like to do a query to show my data.
An example of one document in Elasticsearch:
{
"_index": "shinken-2016.04.08",
"_type": "shinken-logs",
"_id": "AVP0GFeTmLuZ9eaw1Bjp",
"_score": 1.0,
"_source":
{
"comment": "",
"plugin_output": "",
"attempt": 0,
"message": "[1460089115] SERVICE NOTIFICATION: shinken;hostname_test.com;MySQL - TCP;CRITICAL;notify-service-by-email;connect to address 10.11.12.13 and port 1234: No route to host",
"logclass": 3,
"options": "",
"state_type": "CRITICAL",
"state": 2,
"host_name": "hostname_test.com",
"#timestamp": "2016-04-08T04:18:35Z",
"time": 1460089115,
"service_description": "MySQL - TCP",
"logobject": 2,
"type": "SERVICE NOTIFICATION",
"contact_name": "shinken",
"command_name": "notify-service-by-email"
}
},
My goal is to show in Grafana the state number of one service (here MySQL - TCP) for each day (here 2016-04-08).
My question is: how do I do a query in Grafana with Elasticsearch as the data source?
After processing data with input | filter | output > Elasticsearch, the format it gets stored in is somewhat like:
"_index": "logstash-2012.07.02",
"_type": "stdin",
"_id": "JdRaI5R6RT2do_WhCYM-qg",
"_score": 0.30685282,
"_source": {
"#source": "stdin://dist/",
"#type": "stdin",
"#tags": [
"tag1",
"tag2"
],
"#fields": {},
"#timestamp": "2012-07-02T06:17:48.533000Z",
"#source_host": "dist",
"#source_path": "/",
"#message": "test"
}
I filter/store most of the important information in specific fields. Is it possible to leave out default fields like @source_path and @source_host? In the near future it's going to store 8 billion logs/month, and I would like to run some performance tests with these default fields excluded (I just don't use these fields).
This removes fields from output:
filter {
mutate {
# remove duplicate fields
# this leaves timestamp from message and source_path for source
remove => ["#timestamp", "#source"]
}
}
Some of that will depend on what web interface you are using to view your logs. I'm using Kibana, and a custom logger (C#) that indexes the following:
{
"_index": "logstash-2013.03.13",
"_type": "logs",
"_id": "n3GzIC68R1mcdj6Wte6jWw",
"_version": 1,
"_score": 1,
"_source":
{
"#source": "File",
"#message": "Shalom",
"#fields":
{
"tempor": "hit"
},
"#tags":
[
"tag1"
],
"level": "Info"
"#timestamp": "2013-03-13T21:47:51.9838974Z"
}
}
This shows up in Kibana, and the source fields are not there.
To exclude certain fields you can use the prune filter plugin.
filter {
prune {
blacklist_names => [ "#timestamp", "#source" ]
}
}
The prune filter is not a default Logstash plugin and must be installed first:
bin/logstash-plugin install logstash-filter-prune