How to search on a URL exactly in ElasticSearch / Kibana - elasticsearch

I have imported an IIS log file; the data moves through Logstash (1.4.2) into ElasticSearch (1.3.1) and is then displayed in Kibana.
My filter section is as follows:
filter {
grok {
match =>
["message" , "%{TIMESTAMP_ISO8601:iisTimestamp} %{IP:serverIP} %{WORD:method} %{URIPATH:uri} - %{NUMBER:port} - %{IP:clientIP} - %{NUMBER:status} %{NUMBER:subStatus} %{NUMBER:win32Status} %{NUMBER:timeTaken}"]
}
}
When using a Terms panel in Kibana on "uri" (one of my captured fields from Logstash), it matches the tokens within the URI rather than the whole value. Therefore it is matching items like:
'Scripts'
'/'
'EN
Q: How do I display the 'Top URLs' in their full form?
Q: How do I tell ElasticSearch that the field should be 'not_analyzed'? I don't mind having 2 fields, for example:
uri - The tokenized URI
uri.raw - the fully formed URL.
Can this be done Logstash side, or is this a mapping that needs to be set up in ElasticSearch?
The mapping is as follows:
//http://localhost:9200/iislog-2014.10.09/_mapping?pretty
{
"iislog-2014.10.09" : {
"mappings" : {
"iislogs" : {
"properties" : {
"#timestamp" : {
"type" : "date",
"format" : "dateOptionalTime"
},
"#version" : {
"type" : "string"
},
"clientIP" : {
"type" : "string"
},
"device" : {
"type" : "string"
},
"host" : {
"type" : "string"
},
"id" : {
"type" : "string"
},
"iisTimestamp" : {
"type" : "string"
},
"logFilePath" : {
"type" : "string"
},
"message" : {
"type" : "string"
},
"method" : {
"type" : "string"
},
"name" : {
"type" : "string"
},
"os" : {
"type" : "string"
},
"os_name" : {
"type" : "string"
},
"port" : {
"type" : "string"
},
"serverIP" : {
"type" : "string"
},
"status" : {
"type" : "string"
},
"subStatus" : {
"type" : "string"
},
"tags" : {
"type" : "string"
},
"timeTaken" : {
"type" : "string"
},
"type" : {
"type" : "string"
},
"uri" : {
"type" : "string"
},
"win32Status" : {
"type" : "string"
}
}
}
}
}
}

In your Elasticsearch mapping:
"uri": {
"type": "string",
"index": "not_analyzed"
}
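If you want to keep both the tokenized and the exact value (the uri / uri.raw layout the question mentions), a multi-field mapping along these lines should work on Elasticsearch 1.x; this is a sketch rather than the exact mapping:
"uri": {
  "type": "string",
  "fields": {
    "raw": { "type": "string", "index": "not_analyzed" }
  }
}
The Terms panel can then be pointed at uri.raw to get whole URLs.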

The problem is that the iislog- index name does not match the logstash- pattern, and hence did not pick up the default template:
My index format was iislog-YYYY.MM.dd, so it did not use Logstash's out-of-the-box mappings. When using the logstash- index format, Logstash creates a pair of fields for each string. For example, uri becomes:
uri (appears in Kibana)
uri.raw (does not appear in Kibana)
Note that uri.raw will not appear in Kibana, but it is queryable.
So the solution is:
Don't bother with an alternative index name! Use the default index format of logstash-%{+YYYY.MM.dd}
Add a "type" to the file input to help you filter the correct logs in Kibana (whilst using the logstash- index format)
input {
file {
type => "iislog"
....
}
}
Apply filtering in Kibana based on that type (e.g. type: "iislog")
OR
If you really do want a different index format:
Copy the default index template to a new file, say iislog-template.json
Reference the template file in the elasticsearch_http output like this:
output {
elasticsearch_http {
host => localhost
template_name => "iislog-template.json"
template => "<path to template>"
index => "iislog-%{+YYYY.MM.dd}"
}
}
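For reference, the relevant part of such a template could look roughly like the sketch below. It is adapted from the default Logstash template with the iislog-* index pattern substituted; treat it as a starting point rather than the exact file:
{
  "template": "iislog-*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "fields": {
                "raw": { "type": "string", "index": "not_analyzed", "ignore_above": 256 }
              }
            }
          }
        }
      ]
    }
  }
}
With this in place, every string field indexed into iislog-* gets an analyzed version plus a not_analyzed .raw sub-field, the same behaviour the logstash-* indices get by default.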

Related

Nest Aggregation with dynamic fields - elasticsearch

Is it possible to use NEST to create buckets with keywords/fields that are not strongly typed?
Due to the nature of this project, I do not have any root objects to pass in.
Below is an example.
var result = client.Search<PortalDoc>(s => s
.Aggregations(a => a
.Terms("agg_objecttype", t => t.Field(l => "CUSTOM_FIELD_HERE"))
)
);
string implicitly converts to Field, so you can pass a string for any field name:
var result = client.Search<PortalDoc>(s => s
.Aggregations(a => a
.Terms("agg_objecttype", t => t
.Field("CUSTOM_FIELD_HERE")
)
)
);
Yes, something like that is possible. Look here for my solution using nested fields. It allows you to do all the operations on "dynamic" fields, but with somewhat greater effort (nested fields are harder to manipulate). The gist has some proofs for search, but I implemented aggregations as well.
curl -XPOST localhost:9200/something -d '{
"mappings" : {
"live" : {
"_source" : { "enabled" : true },
"dynamic" : false,
"properties" : {
"customFields" : {
"type" : "nested",
"properties" : {
"fieldName" : { "type" : "string", "index" : "not_analyzed" },
"stringValue": {
"type" : "string",
"fields" : {
"raw" : { "type" : "string", "index" : "not_analyzed" }
}
},
"integerValue": { "type" : "long" },
"floatValue": { "type" : "double" },
"datetimeValue": { "type" : "date", "format" : "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd" },
"booleanValue": { "type" : "boolean" }
}
}
}
}
}
}'
Searches should be done with a single nested query that ANDs the conditions together, and aggregations should be done inside a nested aggregation.
I made it for dynamic fields, but it can probably be adjusted for something else. I doubt there can be much more flexibility with searchable/aggregatable fields, given the principles of how indices work.
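As a rough illustration of that idea (the index and type names match the mapping above, while the field name "color" and value "red" are invented for the example), a nested query combined with a nested terms aggregation might look like this:
curl -XPOST 'localhost:9200/something/live/_search' -d '{
  "query": {
    "nested": {
      "path": "customFields",
      "query": {
        "bool": {
          "must": [
            { "term": { "customFields.fieldName": "color" } },
            { "term": { "customFields.stringValue.raw": "red" } }
          ]
        }
      }
    }
  },
  "aggs": {
    "custom_fields": {
      "nested": { "path": "customFields" },
      "aggs": {
        "names": {
          "terms": { "field": "customFields.fieldName" },
          "aggs": {
            "values": { "terms": { "field": "customFields.stringValue.raw" } }
          }
        }
      }
    }
  }
}'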

CSV to nested JSON using NiFi

I want to create nested JSON from CSV using NiFi.
CSV file:
"Foo",12,"newyork","North avenue","123213"
"Foo1",12,"newyork","North avenue","123213"
"Foo2",12,"newyork","North avenue","123213"
Required JSON:
{
"studentName":"Foo",
"Age":"12",
"address__city":"newyork",
"address":{
"address__address1":"North avenue",
"address__zipcode":"123213"
}
}
I am using the NiFi 1.4 ConvertRecord processor with the Avro schema below, but I am not able to get the nested JSON.
Avro schema:
{
"type" : "record",
"name" : "MyClass",
"namespace" : "com.test.avro",
"fields" : [ {
"name" : "studentName",
"type" : "string"
}, {
"name" : "Age",
"type" : "string"
}, {
"name" : "address__city",
"type" : "string"
}, {
"name" : "address",
"type" : {
"type" : "record",
"name" : "address",
"fields" : [ {
"name" : "address__address1",
"type" : "string"
}, {
"name" : "address__zipcode",
"type" : "string"
} ]
}
} ]
}
You will need to:
Split your flowfile into individual records using SplitRecord
Convert from CSV to flat JSON files using ConvertRecord
Use the JoltTransformJSON processor to transform your flat JSON into nested JSON objects in your desired format (see the sketch below).
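For the JOLT step, a shift spec along these lines should produce the nested shape shown above. It assumes the flat JSON coming out of ConvertRecord uses the field names studentName, Age, address__city, address__address1 and address__zipcode (i.e. a flat Avro schema), so adjust the left-hand side to whatever your flat schema actually emits:
[
  {
    "operation": "shift",
    "spec": {
      "studentName": "studentName",
      "Age": "Age",
      "address__city": "address__city",
      "address__address1": "address.address__address1",
      "address__zipcode": "address.address__zipcode"
    }
  }
]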

How to add default values while adding a new field in existing mapping in elasticsearch

This is my existing mapping in Elasticsearch for one of the child documents:
"sessions" : {
"_routing" : {
"required" : true
},
"properties" : {
"operatingSystem" : {
"index" : "not_analyzed",
"type" : "string"
},
"eventDate" : {
"format" : "dateOptionalTime",
"type" : "date"
},
"durations" : {
"type" : "integer"
},
"manufacturer" : {
"index" : "not_analyzed",
"type" : "string"
},
"deviceModel" : {
"index" : "not_analyzed",
"type" : "string"
},
"applicationId" : {
"type" : "integer"
},
"deviceId" : {
"type" : "string"
}
},
"_parent" : {
"type" : "userinfo"
}
}
In the above mapping, the "durations" field is an integer array. I need to update the existing mapping by adding a new field called "durationCount" whose default value should be the size of the durations array.
PUT sessions/_mapping
{
"properties" : {
"sessionCount" : {
"type" : "integer"
}
}
}
Using the above I am able to update the existing mapping, but I am not able to figure out how to assign a value (which would vary for each session document; it should be the durations array size) while updating the mapping. Any ideas?
Two recommendations here:
Instead of adding a default value, you can account for it at query time using a missing filter. Let's say you want to search based on a match query: instead of just the match query, use a bool query whose should clause contains both the match and the missing filter wrapped in a filtered query (see the sketch below). This way, documents that do not have the field are also accounted for.
If you absolutely need the value in that field for existing documents, you need to reindex the whole set of documents, or use the out-of-the-box update-by-query plugin.
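A sketch of the first recommendation (Elasticsearch 1.x syntax; the index name your_index and the match value 3 are placeholders for the example):
POST /your_index/sessions/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "durationCount": 3 } },
        {
          "filtered": {
            "filter": { "missing": { "field": "durationCount" } }
          }
        }
      ]
    }
  }
}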

Specify Routing on Index Alias's Term Lookup Filter

I am using Logstash, ElasticSearch and Kibana to allow multiple users to log in and view the log data they have forwarded. I have created index aliases for each user. These restrict their results to contain only their own data.
I'd like to assign users to groups, and allow users to view data for the computers in their group. I created a parent-child relationship between the groups and the users, and I created a term lookup filter on the alias.
My problem is, I receive a RoutingMissingException when I try to apply the alias.
Is there a way to specify the routing for the term lookup filter? How can I lookup terms on a parent document?
I posted the mapping and alias below, but a full gist recreation is available at this link.
curl -XPUT 'http://localhost:9200/accesscontrol/' -d '{
"mappings" : {
"group" : {
"properties" : {
"name" : { "type" : "string" },
"hosts" : { "type" : "string" }
}
},
"user" : {
"_parent" : { "type" : "group" },
"_routing" : { "required" : true, "path" : "group_id" },
"properties" : {
"name" : { "type" : "string" },
"group_id" : { "type" : "string" }
}
}
}
}'
# Create the logstash alias for cvializ
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions" : [
{ "remove" : { "index" : "logstash-2014.04.25", "alias" : "cvializ-logstash-2014.04.25" } },
{
"add" : {
"index" : "logstash-2014.04.25",
"alias" : "cvializ-logstash-2014.04.25",
"routing" : "intern",
"filter": {
"terms" : {
"host" : {
"index" : "accesscontrol",
"type" : "user",
"id" : "cvializ",
"path" : "group.hosts"
},
"_cache_key" : "cvializ_hosts"
}
}
}
}
]
}'
In attempting to find a workaround for this error, I submitted a bug to the ElasticSearch team, and received an answer from them. It was a bug in ElasticSearch where the filter is applied before the dynamic mapping, causing some erroneous output. I've included their workaround below:
PUT /accesscontrol/group/admin
{
"name" : "admin",
"hosts" : ["computer1","computer2","computer3"]
}
PUT /_template/admin_group
{
"template" : "logstash-*",
"aliases" : {
"template-admin-{index}" : {
"filter" : {
"terms" : {
"host" : {
"index" : "accesscontrol",
"type" : "group",
"id" : "admin",
"path" : "hosts"
}
}
}
}
},
"mappings": {
"example" : {
"properties": {
"host" : {
"type" : "string"
}
}
}
}
}
POST /logstash-2014.05.09/example/1
{
"message":"my sample data",
"#version":"1",
"#timestamp":"2014-05-09T16:25:45.613Z",
"type":"example",
"host":"computer1"
}
GET /template-admin-logstash-2014.05.09/_search

Kibana 3 bettermap ploting lat/long in wrong location on map

I am trying to use Bettermap in Kibana 3 to see lat/long data. My geo location is shown on the map in Africa, whereas the lat/long I have specified is for the US.
Any help would be appreciated.
my mapping file is
{"mappings" : {
"livestats" : {
"_source" : {
"enabled" : true
},
"_timestamp" : {
"enabled" : true
},
"_all" : {
"enabled" : false
},
"properties" : {
"state" : { "type" : "string", "index" : "not_analyzed", "store" : "yes" },
"LatLng" : { "type" : "geo_point" , "index" : "analyzed", "store" : "yes"}
}
}
}
}
and dummy data is
{"create":{"_index":"livestats","_type":"livestats"}}
{"state":"CA","LatLng":[-118.252,34.0433]}
thanks
Try with lon,lat order in the dummy data.
The tooltip on Bettermap says: "geoJSON array! Long,Lat NOT Lat,Long"
The problem is that you have to specify the full field name in Bettermap (Kibana 3.0.0). The autosuggested field names, shown when you preload the field names of your index, are misleading.
Here is my suggestion:
Replace the mapping of LatLng with:
"LatLng" : { "type" : "geo_point"}
Make sure you comply with GeoJSON Specification: [lon,lat] and not [lat,lon]
In Kibana, use the following field:
"livestats.LatLng"
That should do the trick!
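Putting those pieces together, recreating the index could look roughly like this (host, port and names are simply the ones from the question; the coordinates are illustrative and already in [lon,lat] order):
curl -XPUT 'http://localhost:9200/livestats' -d '{
  "mappings": {
    "livestats": {
      "properties": {
        "state": { "type": "string", "index": "not_analyzed", "store": "yes" },
        "LatLng": { "type": "geo_point" }
      }
    }
  }
}'
curl -XPOST 'http://localhost:9200/livestats/livestats' -d '{"state":"CA","LatLng":[-118.252,34.0433]}'
Then select livestats.LatLng as the field in the Bettermap panel.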
