Nested object in Elasticsearch Using NEST

We have created a nested object in our index mapping as shown:
.Nested<Bedroom>(n => n
    .Name(c => c.Beds)
    .IncludeInParent(true)
    .Properties(pp => pp
        .Number(d => d.Name(c => c.BedId).Type(NumberType.Long))
        .Number(d => d.Name(c => c.PropertyId).Type(NumberType.Long))
        .Number(d => d.Name(c => c.SingleDoubleShared).Type(NumberType.Integer))
        .Number(d => d.Name(c => c.Price).Type(NumberType.Integer))
        .Number(d => d.Name(c => c.RentFrequency).Type(NumberType.Integer))
        .Date(d => d.Name(c => c.AvailableFrom))
        .Boolean(d => d.Name(c => c.Ensuite))
    )
)
However, we are experiencing two problems.
1- The AvailableFrom field does not get included in the index mapping (the following list from the Kibana index pattern page shows the field missing):
beds.bedId
beds.ensuite
beds.price
beds.propertyId
beds.rentFrequency
beds.singleDoubleShared
Thanks JFM for a constructive comment. This is the relevant part of the mapping in Elasticsearch:
"beds" : {
"type" : "nested",
"include_in_parent" : true,
"properties" : {
"availableFrom" : {
"type" : "date"
},
"bedId" : {
"type" : "long"
},
"ensuite" : {
"type" : "boolean"
},
"price" : {
"type" : "integer"
},
"propertyId" : {
"type" : "long"
},
"rentFrequency" : {
"type" : "integer"
},
"singleDoubleShared" : {
"type" : "integer"
}
}
I can see availableFrom here, but not in the index pattern. Why?
2- When we try to index a document with a nested object, the whole application (MVC Core 3) crashes.
Would appreciate any assistance.
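For problem 2, capture the actual exception before anything else; a whole-process crash often hides a serialization or transport error that NEST can report. A minimal diagnostic sketch, assuming a recent NEST client (client, property, and logger are placeholders for your own instances):
try
{
    // NEST does not throw for most failures by default; it returns a
    // response whose validity should be checked.
    var response = client.IndexDocument(property);
    if (!response.IsValid)
    {
        // DebugInformation contains the request, the response, and any
        // underlying exception, which should point at the failing field.
        logger.LogError(response.DebugInformation);
    }
}
catch (Exception ex)
{
    // A failure thrown during serialization of the nested Beds collection
    // would land here instead of taking down the MVC Core app.
    logger.LogError(ex, "Indexing failed");
}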

Related

Program maps all fields instead of the ones I choose when I use JsonNetSerializer

I have some problems with mapping. Instead of the default serializer, I use JsonNetSerializer with the following settings:
var connectionSettings =
    new ConnectionSettings(pool, sourceSerializer: (builtin, settings) => new JsonNetSerializer(
        builtin, settings,
        () => new JsonSerializerSettings
        {
            NullValueHandling = NullValueHandling.Ignore,
            ReferenceLoopHandling = ReferenceLoopHandling.Ignore
        },
        resolver => resolver.NamingStrategy = new CamelCaseNamingStrategy()
    ))
    .BasicAuthentication(userName, password);
client = new ElasticClient(connectionSettings);
I map Lecturer like this:
private static CreateIndexDescriptor GetLecturerMap(string indexName)
{
    CreateIndexDescriptor map = new CreateIndexDescriptor(indexName);
    map.Mappings(M => M
        .Map<Lecturer>(m => m
            .Properties(prop => prop
                .Text(s => s
                    .Name(n => n.FullName)
                )
                .Boolean(o => o
                    .Name(s => s.IsActive)
                )
                .Number(s => s
                    .Name(n => n.Id)
                    .Type(NumberType.Integer)
                )
                .Date(d => d
                    .Name(n => n.User.LastLogin)
                )
                .Object<User>(u => u
                    .Name(n => n.User)
                    .Properties(pr => pr
                        .Text(t => t
                            .Name(n => n.SkypeContact)
                        )
                    )
                )
            )
        )
    );
    return map;
}
And call it like this:
public int InitializeLecturers()
{
    string lecturersIndexName = LECUTRERS_INDEX_NAME;
    client.Indices.Create(GetLecturerMap(lecturersIndexName));
    List<Lecturer> lecturers = GetLecturers();
    client.IndexMany(lecturers, lecturersIndexName);
    return lecturers.Count;
}
I get the lecturers from the database using the following method:
private List<Lecturer> GetLecturers()
{
    using (Context context = new Context(connectionString))
    {
        return context.Lecturers
            .ToList<Lecturer>();
    }
}
The program creates the following mapping:
{
  "lecturers" : {
    "mappings" : {
      "properties" : {
        "firstName" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "fullName" : {
          "type" : "text"
        },
        "id" : {
          "type" : "integer"
        },
        "isActive" : {
          "type" : "boolean"
        },
        "isLecturerHasGraduateStudents" : {
          "type" : "boolean"
        },
        "isNew" : {
          "type" : "boolean"
        },
        "isSecretary" : {
          "type" : "boolean"
        },
        "lastLogin" : {
          "type" : "date"
        },
        "lastName" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "middleName" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "skill" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "user" : {
          "properties" : {
            "skypeContact" : {
              "type" : "text"
            }
          }
        }
      }
    }
  }
}
So I can't understand why it ignores my mapping and adds all fields instead of the ones I chose. Please tell me how to fix it. Or maybe I have to create the mapping in a different way?
What's likely happening is that:
1. The explicit mapping you define on index creation is applied.
2. Elasticsearch adds new field mappings for properties it sees in JSON documents that do not have a mapping, and infers the field mapping type for them.
Point 2 is the default behaviour of Elasticsearch, but it can be changed with the dynamic property when creating the index and mapping.
Based on the contents of the question, it looks like you're using Elasticsearch 6.x, for which this would be:
var client = new ElasticClient(settings);
client.CreateIndex("index_name", c => c
    .Mappings(ms => ms
        .Map<Lecturer>(m => m
            // don't map or index fields that are not explicitly mapped
            .Dynamic(false)
            .Properties(prop => prop
                .Text(s => s
                    .Name(n => n.FullName)
                )
                .Boolean(o => o
                    .Name(s => s.IsActive)
                )
                .Number(s => s
                    .Name(n => n.Id)
                    .Type(NumberType.Integer)
                )
                .Date(d => d
                    .Name(n => n.User.LastLogin)
                )
                .Object<User>(u => u
                    .Name(n => n.User)
                    .Properties(pr => pr
                        .Text(t => t
                            .Name(n => n.SkypeContact)
                        )
                    )
                )
            )
        )
    )
);
Per the documentation, a dynamic value of false will ignore new fields: no new field mappings are created and the fields are not indexed, but they will still be present in the _source document. You may also want to apply the [JsonIgnore] attribute to properties of Lecturer that should be ignored, so that they are not serialized and sent to Elasticsearch at all.
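For example, a sketch of what that could look like (the extra property names below are assumptions inferred from the generated mapping; [JsonIgnore] is Newtonsoft.Json's attribute, and it applies here because JsonNetSerializer is the configured source serializer):
using Newtonsoft.Json;

public class Lecturer
{
    public int Id { get; set; }
    public string FullName { get; set; }
    public bool IsActive { get; set; }
    public User User { get; set; }

    // Kept out of the serialized document entirely, so Elasticsearch
    // never sees these fields and cannot dynamically map them.
    [JsonIgnore]
    public string FirstName { get; set; }

    [JsonIgnore]
    public string LastName { get; set; }

    [JsonIgnore]
    public string Skill { get; set; }
}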

Nest Aggregation with dynamic fields

Is it possible to use NEST to create buckets with keywords/fields that are not strongly typed?
Due to the nature of this project, I do not have any root objects to pass in.
Below is an example.
var result = client.Search<PortalDoc>(s => s
    .Aggregations(a => a
        .Terms("agg_objecttype", t => t.Field(l => "CUSTOM_FIELD_HERE"))
    )
);
string implicitly converts to Field, so you can pass a string for any field name:
var result = client.Search<PortalDoc>(s => s
    .Aggregations(a => a
        .Terms("agg_objecttype", t => t
            .Field("CUSTOM_FIELD_HERE")
        )
    )
);
Yes, something like that is possible. Look here for my solution using nested fields. It allows all the operations on "dynamic" fields, but with somewhat greater effort (nested fields are harder to manipulate). The gist has some proofs for search, but I implemented aggregations as well.
curl -XPOST localhost:9200/something -d '{
  "mappings" : {
    "live" : {
      "_source" : { "enabled" : true },
      "dynamic" : false,
      "properties" : {
        "customFields" : {
          "type" : "nested",
          "properties" : {
            "fieldName" : { "type" : "string", "index" : "not_analyzed" },
            "stringValue" : {
              "type" : "string",
              "fields" : {
                "raw" : { "type" : "string", "index" : "not_analyzed" }
              }
            },
            "integerValue" : { "type" : "long" },
            "floatValue" : { "type" : "double" },
            "datetimeValue" : { "type" : "date", "format" : "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd" },
            "booleanValue" : { "type" : "boolean" }
          }
        }
      }
    }
  }
}'
Search should be done with the same nested query using AND, and aggregations should be done inside a nested aggregation.
I made it for dynamic fields, but it can probably be adjusted for something else. I doubt there can be more flexibility with searchable/aggregatable fields, due to the principles of how indices work.
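For example, a NEST sketch of a terms aggregation over one dynamic field in that mapping (2.x-style fluent syntax; "objectType" is a made-up field name for illustration): first narrow the nested customFields documents to the field in question, then bucket its raw values.
var result = client.Search<PortalDoc>(s => s
    .Size(0)
    .Aggregations(a => a
        .Nested("custom_fields", n => n
            .Path("customFields")
            .Aggregations(na => na
                .Filter("by_field_name", f => f
                    // Keep only the nested docs belonging to one dynamic field.
                    .Filter(q => q.Term("customFields.fieldName", "objectType"))
                    .Aggregations(fa => fa
                        // Bucket the not_analyzed raw values of that field.
                        .Terms("values", t => t.Field("customFields.stringValue.raw"))
                    )
                )
            )
        )
    )
);
The same filter-then-aggregate shape works for the typed value fields (integerValue, floatValue, and so on) with the corresponding terms or metric aggregation.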

No handler for type [string] declared on field [name]

When a type is declared as string, Elasticsearch 6.0 will show this error:
"name" => [
"type" => "string",
"analyzer" => "ik_max_word"
]
Elasticsearch has dropped the string type and now uses text. So your code should be something like this:
"name" => [
"type" => "text",
"analyzer" => "ik_max_word"
]

Aggregating on generic nested array in Elasticsearch with NEST

I'm trying to analyse data with Elasticsearch. I started working with Elasticsearch and NEST about four months ago, so I might have missed some obvious stuff. All examples are simplified or altered, but the core is the same.
The data contains an array of nested objects, each of which also contains an array of nested objects, and again, each of those contains an array of nested objects. The data is obtained from an information request which contains XML messages. The messages are parsed, and each element containing (multiple) text elements is saved with its element name, its location, and an array of all text element names and values, under the message name. I'm thinking this set-up might make analyzing the data easier.
Mapping example:
{
  "data" : {
    "properties" : {
      "id" : { "type" : "string" },
      "action" : { "type" : "string" },
      "result" : { "type" : "string" },
      "details" : {
        "type" : "nested",
        "properties" : {
          "description" : { "type" : "string" },
          "message" : {
            "type" : "nested",
            "properties" : {
              "name" : { "type" : "string" },
              "nodes" : {
                "type" : "nested",
                "properties" : {
                  "name" : { "type" : "string" },
                  "value" : { "type" : "string" }
                }
              },
              "source" : { "type" : "string" }
            }
          }
        }
      }
    }
  }
}
Data example:
{
  "id" : "123456789",
  "action" : "GetInformation",
  "result" : "Success",
  "details" : [{
      "description" : "Request",
      "message" : [{
          "name" : "Body",
          "source" : "Message|Body",
          "nodes" : [{
              "name" : "Action",
              "value" : "GetInformation"
            }, {
              "name" : "Identity",
              "value" : "1234"
            }
          ]
        }
      ]
    }, {
      "description" : "Response",
      "message" : [{
          "name" : "Object",
          "source" : "Message|Body|Object",
          "nodes" : [{
              "name" : "ID",
              "value" : "123"
            }, {
              "name" : "Name",
              "value" : "Jim"
            }
          ]
        }, {
          "name" : "Information",
          "source" : "Message|Body|Information",
          "nodes" : [{
              "name" : "Type",
              "value" : "Birth City"
            }, {
              "name" : "City",
              "value" : "Los Angeles"
            }
          ]
        }, {
          "name" : "Information",
          "source" : "Message|Body|Information",
          "nodes" : [{
              "name" : "Type",
              "value" : "City of Residence"
            }, {
              "name" : "City",
              "value" : "New York"
            }
          ]
        }
      ]
    }
  ]
}
XML Example:
<Message>
<Body>
<Object>
<ID>123</ID>
<Name>Jim</Name>
</Object>
<Information>
<Type>Birth City</Type>
<City>Los Angeles</City>
<Information>
<Information>
<Type>City of Residence</Type>
<City>New York</City>
<Information>
</Body>
</Message>
I want to analyse the Name and Value properties of Nodes so I can get an overview of each city in the index that functions as a birthplace, and of how many people were born in each. Something like:
Dictionary<string, int> birthCities = new Dictionary<string, int> {
    { "Los Angeles", 400 }, { "New York", 800 },
    { "Detroit", 500 }, { "Michigan", 700 } };
The code I have so far:
var response = client.Search<Data>(search => search
    .Query(query => query
        .Match(match => match
            .OnField(data => data.Action)
            .Query("GetInformation")
        )
    )
    .Aggregations(a1 => a1
        .Nested("Messages", messages => messages
            .Path(data => data.Details.FirstOrDefault().Message)
            .Aggregations(a2 => a2
                .Terms("Sources", termSource => termSource
                    .Field(data => data.Details.FirstOrDefault().Message.FirstOrDefault().Source)
                    .Aggregations(a3 => a3
                        .Nested("Nodes", nodes => nodes
                            .Path(data => data.Details.FirstOrDefault().Message.FirstOrDefault().Nodes)
                            .Aggregations(a4 => a4
                                .Terms("Names", termName => termName
                                    .Field(data => data.Details.FirstOrDefault().Message.FirstOrDefault().Nodes.FirstOrDefault().Name)
                                    .Aggregations(a5 => a5
                                        .Terms("Values", termValue => termValue
                                            .Field(data => data.Details.FirstOrDefault().Message.FirstOrDefault().Nodes.FirstOrDefault().Value)
                                        )
                                    )
                                )
                            )
                        )
                    )
                )
            )
        )
    )
);
var dict = new Dictionary<string, long>();
var sAggr = response.Aggs.Nested("Messages").Terms("Sources");
foreach (var item in sAggr.Items)
{
    if (item.Key.Equals("information"))
    {
        var nAggr = item.Nested("Nodes").Terms("Names");
        foreach (var nItem in nAggr.Items)
        {
            if (nItem.Key.Equals("city"))
            {
                var vAgg = nItem.Terms("Values");
                foreach (var vItem in vAgg.Items)
                {
                    if (!dict.ContainsKey(vItem.Key))
                    {
                        dict.Add(vItem.Key, 0);
                    }
                    dict[vItem.Key] += vItem.DocCount;
                }
            }
        }
    }
}
This code gives me every city and how many times it occurs, but since birth cities and cities of residence are saved with the same element name and at the same location (both of which I'm not able to change), I've found no way to distinguish between them.
Specific types for each action are sadly not an option. So my question is: how can I count all occurrences of a city name with the Birth City type, preferably without having to import and go through all documents?
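One possible approach (a sketch, written against a newer NEST 2.x-style fluent API and not tested against this set-up; it assumes the name and value fields are not_analyzed, or that the queries are pointed at not_analyzed .raw sub-fields, since with the analyzed string mapping above exact matching on "Birth City" and whole-value city buckets will not work): because each {name, value} pair is its own nested document, first filter the nested message documents to those containing a node with name "Type" and value "Birth City", then aggregate the values of the sibling nodes named "City".
var response = client.Search<Data>(s => s
    .Size(0)
    .Aggregations(a => a
        .Nested("Messages", n => n
            .Path("details.message")
            .Aggregations(ma => ma
                .Filter("BirthCityMessages", f => f
                    // Each {name, value} pair is its own nested document, so a
                    // single nested query can require name == "Type" AND
                    // value == "Birth City" within one node.
                    .Filter(fq => fq
                        .Nested(nq => nq
                            .Path("details.message.nodes")
                            .Query(q => q
                                .Bool(b => b
                                    .Must(
                                        mu => mu.Term("details.message.nodes.name", "Type"),
                                        mu => mu.Term("details.message.nodes.value", "Birth City")
                                    )
                                )
                            )
                        )
                    )
                    .Aggregations(fa => fa
                        // Re-enter the nodes of the matching messages and bucket
                        // the values of the sibling "City" nodes.
                        .Nested("Nodes", nn => nn
                            .Path("details.message.nodes")
                            .Aggregations(na => na
                                .Filter("CityNodes", cf => cf
                                    .Filter(cq => cq.Term("details.message.nodes.name", "City"))
                                    .Aggregations(ca => ca
                                        .Terms("Cities", t => t.Field("details.message.nodes.value"))
                                    )
                                )
                            )
                        )
                    )
                )
            )
        )
    )
);
The dictionary can then be filled from the "Cities" buckets, much like the existing loop, without fetching and scanning the documents themselves.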

How to search on a URL exactly in ElasticSearch / Kibana

I have imported an IIS log file and the data has moved through Logstash (1.4.2) into Elasticsearch (1.3.1), and is then displayed in Kibana.
My filter section is as follows:
filter {
  grok {
    match =>
      ["message" , "%{TIMESTAMP_ISO8601:iisTimestamp} %{IP:serverIP} %{WORD:method} %{URIPATH:uri} - %{NUMBER:port} - %{IP:clientIP} - %{NUMBER:status} %{NUMBER:subStatus} %{NUMBER:win32Status} %{NUMBER:timeTaken}"]
  }
}
When using a Terms panel in Kibana on "uri" (one of my captured fields from Logstash), it matches the tokens within the URI, so it shows items like:
'Scripts'
'/'
'EN'
Q: How do I display the 'Top URLs' in their full form?
Q: How do I inform Elasticsearch that the field should be 'not_analyzed'? I don't mind having 2 fields, for example:
uri - the tokenized URI
uri.raw - the fully formed URL
Can this be done on the Logstash side, or is this a mapping that needs to be set up in Elasticsearch?
The mapping is as follows:
//http://localhost:9200/iislog-2014.10.09/_mapping?pretty
{
  "iislog-2014.10.09" : {
    "mappings" : {
      "iislogs" : {
        "properties" : {
          "#timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
          "#version" : { "type" : "string" },
          "clientIP" : { "type" : "string" },
          "device" : { "type" : "string" },
          "host" : { "type" : "string" },
          "id" : { "type" : "string" },
          "iisTimestamp" : { "type" : "string" },
          "logFilePath" : { "type" : "string" },
          "message" : { "type" : "string" },
          "method" : { "type" : "string" },
          "name" : { "type" : "string" },
          "os" : { "type" : "string" },
          "os_name" : { "type" : "string" },
          "port" : { "type" : "string" },
          "serverIP" : { "type" : "string" },
          "status" : { "type" : "string" },
          "subStatus" : { "type" : "string" },
          "tags" : { "type" : "string" },
          "timeTaken" : { "type" : "string" },
          "type" : { "type" : "string" },
          "uri" : { "type" : "string" },
          "win32Status" : { "type" : "string" }
        }
      }
    }
  }
}
In your Elasticsearch mapping:
"uri" : {
  "type" : "string",
  "index" : "not_analyzed"
}
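If you want the two-field set-up from the question, a multi-field mapping along these lines should work on Elasticsearch 1.x (a sketch; the index and type names are taken from the mapping above). Note that a .raw sub-field added to an existing field only applies to documents indexed afterwards, so existing data would need reindexing:
curl -XPUT localhost:9200/iislog-2014.10.09/_mapping/iislogs -d '{
  "iislogs" : {
    "properties" : {
      "uri" : {
        "type" : "string",
        "fields" : {
          "raw" : { "type" : "string", "index" : "not_analyzed" }
        }
      }
    }
  }
}'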
The problem is that the iislog- index name is not compliant with the logstash- format, and hence did not pick up the template:
My index format was iislog-YYYY.MM.dd, and this did not use the out-of-the-box mappings applied by Logstash. When using the logstash- index format, Logstash will create a pair of fields for each string. For example, uri becomes:
uri (appears in Kibana)
uri.raw (does not appear in Kibana)
Note that uri.raw will not appear in Kibana's field list, but it is queryable.
So the simplest solution is to:
Don't bother with a custom index name! Use the default index format of logstash-%{+YYYY.MM.dd}
Add a "type" to the file input to help you filter the correct logs in Kibana (whilst using the logstash- index format)
input {
  file {
    type => "iislog"
    ....
  }
}
Apply filtering in Kibana based on the type
OR
If you really, really do want a different index format:
Copy the default template file to a new file, say iislog-template.json
Reference the template file in the elasticsearch output like this:
output {
  elasticsearch_http {
    host => localhost
    template_name => "iislog-template.json"
    template => "<path to template>"
    index => "iislog-%{+YYYY.MM.dd}"
  }
}
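For reference, the default template that ships with Logstash 1.4 looks roughly like this (trimmed; copy the real file rather than this sketch), with the template pattern changed to iislog-*. Its string_fields dynamic template is what produces the analyzed field plus the not_analyzed .raw sub-field for every string:
{
  "template" : "iislog-*",
  "settings" : { "index.refresh_interval" : "5s" },
  "mappings" : {
    "_default_" : {
      "_all" : { "enabled" : true },
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true,
            "fields" : {
              "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 }
            }
          }
        }
      } ]
    }
  }
}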
