Elasticsearch SQL: query an object in an array

I am trying out the new Elasticsearch SQL feature with Elasticsearch 6.3.0.
It seems I am not able to query an object in an array with this SQL feature.
For example, I have indexed a doc as below:
PUT /test/_doc/1
{
  "orderId": "123456",
  "items": [
    {
      "itemId": "1234",
      "name": "ipad"
    }
  ]
}
and I try a query:
POST /_xpack/sql?format=txt
{
  "query": "select items.itemId from test"
}
it gives me the error below:
{
  "error": {
    "root_cause": [
      {
        "type": "sql_illegal_argument_exception",
        "reason": "Cannot extract value [items.itemId] from source"
      }
    ],
    "type": "sql_illegal_argument_exception",
    "reason": "Cannot extract value [items.itemId] from source"
  },
  "status": 500
}
May I know if there is a way to query the data in objects of an array?

Related

How to calculate the lag between the time a log message was generated at the application end and the time it was ingested into Elasticsearch?

Elasticsearch experts, I need your help to achieve the goal mentioned below.
Goal:
Trying to find a way to calculate the lag between the time a log message was generated at the application end (@timestamp field) and the time it was ingested into Elasticsearch (ingest_time field).
Current Setup:
I am using FluentD to capture the logs and send them to Kafka. Then I use Kafka Connect (Elasticsearch connector) to send the logs on to Elasticsearch. Since I have a layer of Kafka between FluentD and Elasticsearch, I want to calculate the lag between the log message generation time and the ingestion time.
The log message generation time is stored in the timestamp field of the log and is set when the application generates the log. Please find below how the log message looks at the Kafka topic end.
{
  "message": "ServiceResponse - Throwing non 2xx response",
  "log_level": "ERROR",
  "thread_id": "http-nio-9033-exec-21",
  "trace_id": "86d39fbc237ef7f8",
  "user_id": "85355139",
  "tag": "feedaggregator-secondary",
  "@timestamp": "2022-06-18T23:30:06+0530"
}
I have created an ingest pipeline to add the ingest_time field to every doc inserted into the Elasticsearch index.
PUT _ingest/pipeline/ingest_time
{
  "description": "Add an ingest timestamp",
  "processors": [
    {
      "set": {
        "field": "_source.ingest_time",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}
Once a document gets inserted into the index from Kafka using Kafka Connect (ES sink connector), this is how my message looks in Kibana in JSON format.
{
  "_index": "feedaggregator-secondary-2022-06-18",
  "_type": "_doc",
  "_id": "feedaggregator-secondary-2022-06-18+2+7521337",
  "_version": 1,
  "_score": null,
  "_source": {
    "thread_id": "http-nio-9033-exec-21",
    "trace_id": "86d39fbc237ef7f8",
    "@timestamp": "2022-06-18T23:30:06+0530",
    "ingest_time": "2022-06-18T18:00:09.038032Z",
    "user_id": "85355139",
    "log_level": "ERROR",
    "tag": "feedaggregator-secondary",
    "message": "ServiceResponse - Throwing non 2xx response"
  },
  "fields": {
    "@timestamp": [
      "2022-06-18T18:00:06.000Z"
    ]
  },
  "sort": [
    1655574126000
  ]
}
Now, I want to calculate the difference between the @timestamp and ingest_time fields. For this I added a script to the ingest pipeline, which adds a field lag_seconds and sets its value to the difference between the ingest_time and @timestamp fields.
PUT _ingest/pipeline/calculate_lag
{
  "description": "Add an ingest timestamp and calculate ingest lag",
  "processors": [
    {
      "set": {
        "field": "_source.ingest_time",
        "value": "{{_ingest.timestamp}}"
      }
    },
    {
      "script": {
        "lang": "painless",
        "source": """
          if (ctx.containsKey("ingest_time") && ctx.containsKey("@timestamp")) {
            ctx['lag_in_seconds'] = ChronoUnit.MILLIS.between(ZonedDateTime.parse(ctx['@timestamp']), ZonedDateTime.parse(ctx['ingest_time'])) / 1000;
          }
        """
      }
    }
  ]
}
Error:
But since my ingest_time and @timestamp fields are in different formats, it gave a DateTimeParseException error.
{
  "error": {
    "root_cause": [
      {
        "type": "exception",
        "reason": "java.lang.IllegalArgumentException: ScriptException[runtime error]; nested: DateTimeParseException[Text '2022-06-18T23:30:06+0530' could not be parsed, unparsed text found at index 22];",
        "header": {
          "processor_type": "script"
        }
      }
    ],
    "type": "exception",
    "reason": "java.lang.IllegalArgumentException: ScriptException[runtime error]; nested: DateTimeParseException[Text '2022-06-18T23:30:06+0530' could not be parsed, unparsed text found at index 22];",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "ScriptException[runtime error]; nested: DateTimeParseException[Text '2022-06-18T23:30:06+0530' could not be parsed, unparsed text found at index 22];",
      "caused_by": {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "java.base/java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:2049)",
          "java.base/java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1948)",
          "java.base/java.time.ZonedDateTime.parse(ZonedDateTime.java:598)",
          "java.base/java.time.ZonedDateTime.parse(ZonedDateTime.java:583)",
          "ctx['lag_in_seconds'] = ChronoUnit.MILLIS.between(ZonedDateTime.parse(ctx['@timestamp']), ZonedDateTime.parse(ctx['ingest_time']))/1000;\n }",
          " ^---- HERE"
        ],
        "script": " if(ctx.containsKey(\"ingest_time\") && ctx.containsKey(\"@timestamp\")) {\n ctx['lag_in_seconds'] = ChronoUnit.MILLIS.between(ZonedDateTime.parse(ctx['@timestamp']), ZonedDateTime.parse(ctx['ingest_time']))/1000;\n }",
        "lang": "painless",
        "caused_by": {
          "type": "date_time_parse_exception",
          "reason": "Text '2022-06-18T23:30:06+0530' could not be parsed, unparsed text found at index 22"
        }
      }
    },
    "header": {
      "processor_type": "script"
    }
  },
  "status": 500
}
So, I need your help to find the lag_seconds between the @timestamp and ingest_time fields.
Using managed Elasticsearch by AWS (OpenSearch), Elasticsearch version 7.1.
I can see a Java date parsing problem for the @timestamp field. ctx['@timestamp'] will return the value "2022-06-18T23:30:06+0530", which is an ISO_OFFSET_DATE_TIME. You would need to parse this using OffsetDateTime.parse(ctx['@timestamp']). Alternatively, you could try to access @timestamp from the fields block. You can read up on date parsing in Java at https://howtodoinjava.com/java/date-time/zoneddatetime-parse/.
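To make that concrete, here is a minimal sketch of the second pipeline with the parsing adjusted. It keeps the field names from the question; the explicit DateTimeFormatter pattern is my own assumption (the 'Z' pattern letter accepts colon-less offsets such as +0530), offered in case the default ISO parsers still stumble on that offset, so treat it as a starting point rather than a verified fix.
PUT _ingest/pipeline/calculate_lag
{
  "description": "Add an ingest timestamp and calculate ingest lag",
  "processors": [
    {
      "set": {
        "field": "_source.ingest_time",
        "value": "{{_ingest.timestamp}}"
      }
    },
    {
      "script": {
        "lang": "painless",
        "source": """
          if (ctx.containsKey("ingest_time") && ctx.containsKey("@timestamp")) {
            // @timestamp carries a colon-less offset (+0530), so parse it with an explicit pattern.
            DateTimeFormatter f = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssZ");
            ZonedDateTime generated = ZonedDateTime.parse(ctx['@timestamp'], f);
            // ingest_time is an ISO-8601 UTC instant, which the default parser handles.
            ZonedDateTime ingested = ZonedDateTime.parse(ctx['ingest_time']);
            ctx['lag_in_seconds'] = ChronoUnit.MILLIS.between(generated, ingested) / 1000;
          }
        """
      }
    }
  ]
}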

Elasticsearch query nested object

I have this record in Elasticsearch:
{
  "FirstName": "Winona",
  "LastName": "Ryder",
  "Notes": "<p>she is an actress</p>",
  "Age": "40-50",
  "Race": "Caucasian",
  "Gender": "Female",
  "HeightApproximation": "No",
  "Armed": false,
  "AgeCategory": "Adult",
  "ContactInfo": [
    {
      "ContactPoint": "stranger@gmail.com",
      "ContactType": "Email",
      "Details": "Details of tv show"
    }
  ]
}
I want to query inside the ContactInfo object, and I used the query below, but I don't get any results back:
{
  "query": {
    "nested": {
      "path": "ContactInfo",
      "query": {
        "match": { "ContactInfo.Details": "Details of tv show" }
      }
    }
  }
}
I also tried:
{
  "query": {
    "term": { "ContactInfo.ContactType": "email" }
  }
}
Here is the mapping for ContactInfo:
"ContactInfo": {
  "type": "object"
}
I think I know the issue: the field is not set as nested in the mapping. Is there a way to still run a nested query without changing the mapping? I just want to avoid re-indexing the data if possible.
I'm pretty new to Elasticsearch, so I need your help.
Thanks in advance.
Elasticsearch has no concept of inner objects.
Some important points from the official Elasticsearch documentation on the nested field type:
The nested type is a specialized version of the object data type that allows arrays of objects to be indexed in a way that they can be queried independently of each other.
If you need to index arrays of objects and to maintain the independence of each object in the array, use the nested datatype instead of the object data type.
Internally, nested objects index each object in the array as a separate hidden document, such that each nested object can be queried independently of the others with the nested query.
Refer to this SO answer to get more details on this.
Adding a working example with index mapping, search query, and search result
You have to reindex your data after applying the nested data type; a reindex sketch follows the search result below.
Index Mapping:
{
  "mappings": {
    "properties": {
      "ContactInfo": {
        "type": "nested"
      }
    }
  }
}
Search Query:
{
  "query": {
    "nested": {
      "path": "ContactInfo",
      "query": {
        "match": { "ContactInfo.Details": "Details of tv show" }
      }
    }
  }
}
Search Result:
"hits": [
{
"_index": "stof_64269180",
"_type": "_doc",
"_id": "1",
"_score": 1.1507283,
"_source": {
"FirstName": "Winona",
"LastName": "Ryder",
"Notes": "<p>she is an actress</p>",
"Age": "40-50",
"Race": "Caucasian",
"Gender": "Female",
"HeightApproximation": "No",
"Armed": false,
"AgeCategory": "Adult",
"ContactInfo": [
{
"ContactPoint": "stranger#gmail.com",
"ContactType": "Email",
"Details": "Details of tv show"
}
]
}
}
]
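As noted above, the existing documents have to be copied into an index that carries the nested mapping. A minimal sketch with the _reindex API follows; the index names contacts_v1 and contacts_v2 are placeholders, not names from the question.
PUT /contacts_v2
{
  "mappings": {
    "properties": {
      "ContactInfo": {
        "type": "nested"
      }
    }
  }
}

POST /_reindex
{
  "source": { "index": "contacts_v1" },
  "dest": { "index": "contacts_v2" }
}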

How to create a multi-type index in Elasticsearch?

Several pages in the Elasticsearch documentation mention how to query a multi-type index.
But I failed to create one in the first place.
Here is my minimal example (on an Elasticsearch 6.x server):
PUT /myindex
{
  "settings": {
    "number_of_shards": 1
  }
}

PUT /myindex/people/123
{
  "first name": "John",
  "last name": "Doe"
}

PUT /myindex/dog/456
{
  "name": "Rex"
}
Index creation and the first insert went well, but the dog type insert attempt failed:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Rejecting mapping update to [myindex] as the final mapping would have more than 1 type: [people, dog]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Rejecting mapping update to [myindex] as the final mapping would have more than 1 type: [people, dog]"
  },
  "status": 400
}
But this is exactly what I'm trying to do, buddy! Having "more than 1 type" in my index.
Do you know what I have to change in my calls to achieve this?
Many thanks.
Multiple mapping types are not supported from Elasticsearch 6.0.0 onwards. See the breaking changes for details.
You can still effectively use multiple types by implementing your own custom type field.
For example:
{
  "mappings": {
    "doc": {
      "properties": {
        "type": {
          "type": "keyword"
        },
        "first_name": {
          "type": "text"
        },
        "last_name": {
          "type": "text"
        }
      }
    }
  }
}
This is described in the removal of types documentation.
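A rough sketch of how that custom type field might be used follows; the index name myindex2 is a placeholder for an index created with the mapping above, and the documents and query are only illustrative: each document carries its own type value, and searches filter on it with a term query.
PUT /myindex2/doc/123
{
  "type": "people",
  "first_name": "John",
  "last_name": "Doe"
}

PUT /myindex2/doc/456
{
  "type": "dog",
  "name": "Rex"
}

GET /myindex2/_search
{
  "query": {
    "term": { "type": "dog" }
  }
}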

Can't update mapping in elasticsearch

When putting an analyzer into the mapping using PUT /job/_mapping/doc/ I get conflicts.
But there isn't an analyzer in the existing mapping.
PUT /job/_mapping/doc/
{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "ik_smart",
      "search_analyzer": "ik_smart"
    }
  }
}
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Mapper for [title] conflicts with existing mapping in other types:\n[mapper [title] has different [analyzer]]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Mapper for [title] conflicts with existing mapping in other types:\n[mapper [title] has different [analyzer]]"
  },
  "status": 400
}
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"fielddata": true
},
The Logstash output config is like this.
output {
  elasticsearch {
    hosts => ["<Elasticsearch Hosts>"]
    user => "<user>"
    password => "<password>"
    index => "<table>"
    document_id => "%{<MySQL_PRIMARY_KEY>}"
  }
}
You can't update a mapping in Elasticsearch; you can add to a mapping, but not update existing fields. Elasticsearch uses the mapping at indexing time, which is why you can't update the mapping of an existing field. The analyzer is part of the mapping (if you don't specify one, Elasticsearch uses a default), and it tells Elasticsearch how to index the documents. To change it:
create a new index with your new mappings (including the analyzer)
reindex your documents from your existing index to the new one (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html)
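A minimal sketch of that first step for the title field from the question; job_v2 is a placeholder name, the keyword sub-field mirrors the existing mapping, and it assumes the ik analysis plugin is installed. The existing documents would then be copied across with the _reindex API linked above.
PUT /job_v2
{
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}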
Updating Mapping:
Once a document is indexed into an index, i.e. the mapping is generated under a given type (as in our case, the mapping of EmployeeCode, EmployeeName and isDeveloper is generated under the type "customtype"), we cannot modify it afterwards. If we want to modify it, we need to delete the index first, apply the modified mapping manually, and then re-index the data. However, adding a new property under a given type is feasible. For example, the document attached to our index "inkashyap-1002" under type "customtype" is as follows:
{
  "inkashyap-1002": {
    "mappings": {
      "customtype": {
        "properties": {
          "EmployeeCode": {
            "type": "long"
          },
          "isDeveloper": {
            "type": "boolean"
          },
          "EmployeeName": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
Now let's add another property, "Grade":
curl -XPUT localhost:9200/inkashyap-1002/customtype/2 -d '{
  "EmployeeName": "Vaibhav Kashyap",
  "EmployeeCode": 13629,
  "isDeveloper": true,
  "Grade": 5
}'
Now hit the GET mapping API. In the results, you can see there is another field added called "Grade".
Common Error:
In the index "inkashyap-1002", we have indexed 2 documents so far. Both documents had the same type for the field "EmployeeCode", which was long. Now let us try to index a document like the one below:
curl -XPUT localhost:9200/inkashyap-1002/customtype/3 -d '{
  "EmployeeName": "Vaibhav Kashyap",
  "EmployeeCode": "onethreesixtwonine",
  "isDeveloper": true,
  "Grade": 5
}'
Note that here "EmployeeCode" is given as a string value. The response to the above request will look like the below:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [EmployeeCode]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse [EmployeeCode]",
    "caused_by": {
      "type": "number_format_exception",
      "reason": "For input string: \"onethreesixtwonine\""
    }
  },
  "status": 400
}
In the above response, we can see the error "mapper_parsing_exception" on the field "EmployeeCode". This indicates that the field expected here was of another type, not a string. In such cases, re-index the document with the appropriate type.

Elasticsearch Indexed Script Issue

I am using Elasticsearch 2.3.3. I am trying to set up a template query using the following Mustache script:
{
  "from": "{{from}}",
  "size": "{{size}}",
  "query": {
    "multi_match": {
      "query": "{{query}}",
      "type": "most_fields",
      "fields": [
        "meta.court^1.5",
        "meta.judge^1.5",
        "meta.suit_no^4",
        "meta.party1^1.5",
        "meta.party2^1.5",
        "meta.subject^3",
        "content"
      ]
    }
  }
}
I have successfully indexed this script as follows:
POST /_search/template/myscript
{
  "script": {
    "from": "{{from}}",
    "size": "{{size}}",
    "query": {
      "multi_match": {
        "query": "{{query}}",
        "type": "most_fields",
        "fields": [ "meta.court^1.5", "meta.judge^1.5", "meta.suit_no^4", "meta.party1^1.5", "meta.party2^1.5", "meta.subject^3", "content" ]
      }
    }
  }
}
However, when I try to render the template, for example with:
GET _render/template
{
  "id": "myscript",
  "params": {
    "from": "0",
    "size": "10",
    "query": "walter"
  }
}
I get the following error:
{
  "error": {
    "root_cause": [
      {
        "type": "json_parse_exception",
        "reason": "Unexpected character ('=' (code 61)): was expecting a colon to separate field name and value\n at [Source: [B@39f8927e; line: 1, column: 8]"
      }
    ],
    "type": "json_parse_exception",
    "reason": "Unexpected character ('=' (code 61)): was expecting a colon to separate field name and value\n at [Source: [B@39f8927e; line: 1, column: 8]"
  },
  "status": 500
}
The funny thing is I can successfully execute the script if it is stored as a file in the config/scripts directory of an ES node.
What am I missing here? Any help would be greatly appreciated.
Many thanks