Script fields in nested objects specificaly geo_shapes - elasticsearch

Part of my document mapping consisits of the mapping below
"locations": {
"type": "nested",
"properties": {
"point": {
"type": "geo_shape",
"tree": "quadtree",
"precision": "100m"
}
}
}
When I attempt to issue a script_field as part of a query Elasticsearch is returning an error
failed to run inline script [doc['locations.point'].distanceInMiles(53.4791,-2.2441)] using lang [groovy]
With a reason of:
failed to find field data builder for field locations.point, and type geo_shape
I'm assuming this is because the field is nested (it has a few (geo) points inside the field and the search matches on any one of them, however as it's nested the context of the path locations.point is obviously wrong, it needs to be something like locations.point[10] (for the 11th one perhaps - this is dependant on the context of the matched item in the query).
So, does anyone know a way to perform this properly? Is there a special operator I can tell the script so that it knows it needs to look at the matched point from the field?
Thanks in advance.

Turns out it's actually not possible to do this with geo_shape's

Related

Kafka Connect JDBC sink - write Avro field into PG JSONB

I'm trying to build a pipeline where Avro data is written into a Postgres DB. Everything works fine with simple schemas and the AvroConverter for the values. However, I would like to have a nested field written into a JSONB column. There are a couple of problems with this. First, it seems that the Connect plugin does not support STRUCT data. Second, the plugin cannot write directly into the JSONB column.
The second problem should be avoided by adding a cast in PG, as described in this issue. The first problem is proving more diffult. I have tried different transformations but have not been able to get the Connect plugin to interpret one complex field as a string. The schema in questions looks something like this (in practice there would be more fields on the first level besides the timestamp):
{
"namespace": "test.schema",
"name": "nested_message",
"type": "record",
"fields": [
{
"name": "timestamp",
"type": "long"
},
{
"name": "nested_field",
"type": {
"name": "nested_field_record",
"type": "record",
"fields": [
{
"name": "name",
"type": "string"
},
{
"name": "prop",
"type": "float",
"doc": "Some property"
}
]
}
}
]
}
The message is written in Kafka as
{"timestamp":1599493668741396400,"nested_field":{"name":"myname","prop":377.93887}}
In order to write the contents of nested_field into a single DB column, I would like to interpret this entire field as a string. Is this possible? I have tried the cast transformation, but this only supports prmitive Avro types. Something along the lines of HoistField could work, but I don't see a way to limit this to a single field. Any ideas or advice would be greatly appreciated.
A completely different approach would be to use two connect plugins and UPSERT into the table. One plugin would use the AvroConverter for all fields save the nested one, while the second plugin uses the StringConverter for the nested field. This feels wrong in all kinds of ways though.

Elastic Beats - Changing the Field Type of Default Fields in Beats Documents?

I'm still fairly new to the Elastic Stack and I'm still not seeing the entire picture from what I'm reading on this topic.
Let's say I'm using the latest versions of Filebeat or Metricbeat for example, and pushing that data to Logstash output, (which is then configured to push to ES). I want an "out of the box" field from one of these beats to have its field type changed (example: change beat.hostname from it's current default "text" type to "keyword"), what is the best place/practice for configuring this? This kind of change is something I would want consistent across multiple hosts running the same Beat.
I wouldn't change any existing fields since Kibana is building a lot of visualizations, dashboards, SIEM,... on the exptected fields + data types.
Instead extend (add, don't change) the default mapping if needed. On top of the default index template, you can add your own and they will be merged. Adding more fields will require some more disk space (and probably memory when loading), but it should be manageable and avoids a lot of drawbacks of other approaches.
Agreed with #xeraa. It is not advised to change the default template since that field might be used in any default visualizations.
Create a new template, you can have multiple templates for the same index pattern. All the mappings will be merged.The order of the merging can be controlled using the order parameter, with lower order being applied first, and higher orders overriding them.
For your case, probably create a multi-field for any field that needs to be changed. Eg: As shown here create a new keyword multifield, then you can refer the new field as
fieldname.raw
.
"properties": {
"city": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
The other answers are correct but I did the below in Dev console to update the message field from text to text & keyword
PUT /index_name/_mapping
{
"properties": {
"message": {
"type": "match_only_text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 10000
}
}
}
}
}

Specifying Field Types Indexing from Logstash to Elasticsearch

I have successfully ingested data using the XML filter plugin from Logstash to Elasticsearch, however all the field types are of the type "text."
Is there a way to manually or automatically specify the correct type?
I found the following technique good for my use:
Logstash would filter the data and change a field from the default - text to whatever form you want. The documentation would be found here. The example given in the documentation is:
filter {
mutate {
convert => { "fieldname" => "integer" }
}
}
This you add in the /etc/logstash/conf.d/02-... file in the body part. I believe the downside of this practice is that from my understanding it is less recommended to alter data entering the ES.
After you do this you will probably get the this problem. If you have this problem and your DB is a test DB that you can erase all old data just DELETE the index until now that there would not be a conflict (for example you have a field that was until now text and now it is received as date there would be a conflict between old and new data). If you can't just erase the old data then read into the answer in the link I linked.
What you want to do is specify a mapping template.
PUT _template/template_1
{
"index_patterns": ["te*", "bar*"],
"settings": {
"number_of_shards": 1
},
"mappings": {
"type1": {
"_source": {
"enabled": false
},
"properties": {
"host_name": {
"type": "keyword"
},
"created_at": {
"type": "date",
"format": "EEE MMM dd HH:mm:ss Z YYYY"
}
}
}
}
}
Change the settings to match your needs such as listing the properties to map what you want them to map to.
Setting index_patterns is especially important because it tells elastic how to apply this template. You can set an array of index patterns and can use * as appropriate for wildcards. i.e logstash's default is to rotate by date. They will look like logstash-2018.04.23 so your pattern could be logstash-* and any that match the pattern will receive the template.
If you want to match based on some pattern, then you can use dynamic templates.
Edit: Adding a little update here, if you want logstash to apply the template for you, here is a link to the settings you'll want to be aware of.

ElasticSearch 6, copy_to with dynamic index mappings

Maybe I'm missing something simple, but still could not figure out the following thing:
As of ES 6.x the _all field is deprecated, and instead it's suggested to use the copy_to instruction (https://www.elastic.co/guide/en/elasticsearch/reference/current/copy-to.html).
However, I got an impression that you need to explicitly specify the fields which you want to copy to the custom _all field. But if I use dynamic mappings, I don't know the fields in advance, and therefore cannot use copy_to?
Any way I can tell ES to copy all encountered fields to the custom _all field so that I can search across all fields?
Thanks in advance!
You could use Dynamic Templates. Basically create an index, add the custom catch_all field and then specify that particular property for all the fields that are strings. (Haven't done this before, but I believe this is the only way now. Since the field catch_all will be already present when you put the dynamic template, it will not match the catch_all - meaning that the catch_all will not copy to itself, but check it out yourself to make sure).
PUT my_index
{
"mappings": {
"_doc": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"type": "text",
"copy_to": "catch_all"
}
}
}
]
}
}
}

Unanalyzed fields on Kibana

i need help to correct kibana field. when I try to visualizing the fields, shown me the following warning:
Careful! The field contains Analyzed selected strings. Analyzed
strings are highly unique and can use a lot of memory to visualize.
Values: such as bar will be foo-foo and bar broken into. See Core
Mapping Types for more information on setting esta field Analyzed as
not
Elasticsearch default dynamic mapping is to analyze any string field (break the field into tokens, for instance: aaa_bbb_ccc will be break down into aaa,bbb and ccc).
If you do not want such behavior you must change the mapping settings
before any document was pushed into the index.
You have two options to do that:
Change the mapping for a particular index using mapping API, in a static way or dynamic way (dynamic means that the mapping will be applies also to fields that still does not exist in the index)
You can change the behavior of any index according to a pattern, using the template API
This example shows a template that changes the mapping for any index that starts with "app", applying "not analyze" to any field in any type and make sure "timestamp" is a date (good for cases in with the timestamp is represented as a number of seconds from 1970):
{
"template": "myindciesprefix*",
"mappings": {
"_default_": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "not_analyzed"
}
}
},
{
"timestamp_field": {
"match": "timestamp",
"mapping": {
"type": "date"
}
}
}
]
}
}
}
Really you dont have any problem is only a message of info, but if you dont want analyzed fields when you build your index in elasticsearch you must indicate that one field is a not analyzed field.

Resources