How to skip particular fields from being inserted in Elasticsearch? I don't want fields like "message", "event", "log" - elasticsearch

I am not adding fields like "message", "event", or "log" to the record myself. These fields are somehow auto-generated by Logstash while inserting records from a CSV file, and I would like to keep them out of the index.
A record in the index looks as follows:
"_index": "jmeter2",
"_id": "dsfdsfdsf",
"_score": 1,
"_source": {
"Samples": "1083",
"Received KB/sec": "178.9",
"99th pct": "1350",
"log": {
"file": {
"path": "/Users/abc/Downloads/opt/jenkins/workspace/agg_report2.csv"
}
},
"host": {
"name": "dfdsfdsffs"
},
"#timestamp": "2022-11-22T07:15:29.052181Z",
"95th pct": "659",
"Min": "112",
"Max": "3829",
"#version": "1",
"Throughput": "7.2",
"Label": "ACTIVITY_DETAIL",
"90th pct": "338",
"Build_number": "abcd1111",
"Error %": "0.00%",
"Median": "207",
"message": "ACTIVITY_DETAIL,1083,270,207,338,659,1350,112,3829,0.00%,7.2,178.9,251.61",
"event": {
"original": "ACTIVITY_DETAIL,1083,270,207,338,659,1350,112,3829,0.00%,7.2,178.9,251.61"
},
"Average Response Time": "270",
"Stddev": "251.61"
}
}

You can add a remove_field setting to your csv filter:
filter {
  csv {
    remove_field => [ "message", "event", "log" ]
  }
}
https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-remove_field
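For context, here is a minimal end-to-end pipeline sketch showing where that option sits; the file path, column handling, and index name below are placeholders for illustration, not your actual config:

input {
  file {
    path => "/path/to/agg_report2.csv"   # hypothetical path
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    autodetect_column_names => true      # read column names from the first CSV line
    # drop the fields Logstash adds automatically
    remove_field => [ "message", "event", "log" ]
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "jmeter2"
  }
}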

Related

How to convert an Elasticsearch index field of string type to JSON?

I have an OpenSearch index with a string field message, defined as below:
{"name":"message.message","type":"string","esTypes":["text"],"count":0,"scripted":false,"searchable":true,"aggregatable":false,"readFromDocValues":false}
Sample data:
"_source" : {
"message" : {
"message" : "user: AB, from: home, to: /app1"
}
}
I would like to convert the message field into JSON so that I can access the values message.user, message.from and message.to individually.
How do I go about it?
You can use the JSON processor.
POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "convert json to object",
    "processors": [
      {
        "json": {
          "field": "foo",
          "target_field": "json_target"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "{\"name\":\"message.message\",\"type\":\"string\",\"esTypes\":[\"text\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false}\r\n"
      }
    }
  ]
}
Response:
{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_id": "id",
        "_version": "-3",
        "_source": {
          "foo": """{"name":"message.message","type":"string","esTypes":["text"],"count":0,"scripted":false,"searchable":true,"aggregatable":false,"readFromDocValues":false}
""",
          "json_target": {
            "esTypes": [
              "text"
            ],
            "readFromDocValues": false,
            "name": "message.message",
            "count": 0,
            "aggregatable": false,
            "type": "string",
            "scripted": false,
            "searchable": true
          }
        },
        "_ingest": {
          "timestamp": "2022-11-09T19:38:01.16232Z"
        }
      }
    }
  ]
}
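To apply this outside of _simulate, the same processor can be stored as a named pipeline and referenced at index time. A sketch of that, with a made-up pipeline and index name, assuming the string in message.message is valid JSON (note the sample value above, "user: AB, from: home, to: /app1", is not JSON, so it would have to be emitted as JSON upstream first):

PUT _ingest/pipeline/message-to-object
{
  "description": "parse the message.message string into an object",
  "processors": [
    {
      "json": {
        "field": "message.message",
        "target_field": "message_parsed"
      }
    }
  ]
}

POST my-index/_doc?pipeline=message-to-object
{
  "message": {
    "message": "{\"user\":\"AB\",\"from\":\"home\",\"to\":\"/app1\"}"
  }
}

After this, the values are addressable as message_parsed.user, message_parsed.from and message_parsed.to.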

Elasticsearch wildcard query with space failing in 7.11

My data is indexed in Elasticsearch 7.11. This is the mapping I got when I directly added documents to my index:
{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}
I haven't added the keyword part, and I have no idea where it came from.
I am running a wildcard query on this field, but I am unable to get results for keywords with spaces:
{
  "query": {
    "bool": {
      "should": [
        { "wildcard": { "name": "*hello world*" } }
      ]
    }
  }
}
I have seen many answers related to not_analyzed, and I have tried updating {"index": "true"} in the mapping, but with no luck. How do I make the wildcard search work in this version of Elasticsearch?
I tried adding the wildcard field:
PUT http://localhost:9001/indexname/_mapping
{
  "properties": {
    "name": {
      "type": "wildcard"
    }
  }
}
and got the following response:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "mapper [name] cannot be changed from type [text] to [wildcard]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "mapper [name] cannot be changed from type [text] to [wildcard]"
  },
  "status": 400
}
Here is a sample document that should match:
{
  "_index": "accelerators",
  "_type": "_doc",
  "_id": "602ec047a70f7f30bcf75dec",
  "_score": 1.0,
  "_source": {
    "acc_id": "602ec047a70f7f30bcf75dec",
    "name": "hello world example",
    "type": "Accelerator",
    "description": "khdkhfk ldsjl klsdkl",
    "teamMembers": [
      {
        "userId": "karthik.r@gmail.com",
        "name": "Karthik Ganesh R",
        "shortName": "KR",
        "isOwner": true
      },
      {
        "userId": "anand.sajan@gmail.com",
        "name": "Anand Sajan",
        "shortName": "AS",
        "isOwner": false
      }
    ],
    "sectorObj": [
      {
        "item_id": 14,
        "item_text": "Cross-sector"
      }
    ],
    "geographyObj": [
      {
        "item_id": 4,
        "item_text": "Global"
      }
    ],
    "technologyObj": [
      {
        "item_id": 1,
        "item_text": "Artificial Intelligence"
      }
    ],
    "themeColor": 1,
    "mainImage": "assets/images/Graphics/Asset 35.svg",
    "features": [
      {
        "name": "Ideation",
        "icon": "Asset 1007.svg"
      },
      {
        "name": "Innovation",
        "icon": "Asset 1044.svg"
      },
      {
        "name": "Strategy",
        "icon": "Asset 1129.svg"
      },
      {
        "name": "Intuitive",
        "icon": "Asset 964.svg"
      }
    ],
    "logo": {
      "actualFileName": "",
      "fileExtension": "",
      "fileName": "",
      "fileSize": 0,
      "fileUrl": ""
    },
    "customLogo": {
      "logoColor": "#B9241C",
      "logoText": "EC",
      "logoTextColor": "#F6F6FA"
    },
    "collaborators": [
      {
        "userId": "muhammed.arif@gmail.com",
        "name": "muhammed Arif P T",
        "shortName": "MA"
      },
      {
        "userId": "anand.sajan@gmail.com",
        "name": "Anand Sajan",
        "shortName": "AS"
      }
    ],
    "created_date": "2021-02-18T19:30:15.238000Z",
    "modified_date": "2021-03-11T11:45:49.583000Z"
  }
}
You cannot modify a field mapping once created. However, you can create another sub-field of type wildcard, like this:
PUT http://localhost:9001/indexname/_mapping
{
  "properties": {
    "name": {
      "type": "text",
      "fields": {
        "wildcard": {
          "type": "wildcard"
        },
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}
When the mapping is updated, you need to reindex your data so that the new field gets indexed, like this:
POST http://localhost:9001/indexname/_update_by_query
And then when this finishes, you'll be able to query on this new field like this:
{
  "query": {
    "bool": {
      "should": [
        {
          "wildcard": {
            "name.wildcard": "*hello world*"
          }
        }
      ]
    }
  }
}
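One more note: wildcard queries are case-sensitive by default. If I remember correctly, term-level queries accept a case_insensitive flag from 7.10 onward, so on 7.11 a sketch like this should also work:

{
  "query": {
    "wildcard": {
      "name.wildcard": {
        "value": "*hello world*",
        "case_insensitive": true
      }
    }
  }
}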

Logstash: extract and move nested fields into a new parent field

If my log prints the latitude and longitude of a given point, how can I capture this information so that it is processed as geospatial data in Elasticsearch?
Below I show an example of a document in Elasticsearch corresponding to a log line:
{
  "_index": "memo-logstash-2018.05",
  "_type": "doc",
  "_id": "DDCARGMBfvaBflicTW4-",
  "_version": 1,
  "_score": null,
  "_source": {
    "type": "elktest",
    "message": "LON: 12.5, LAT: 42",
    "@timestamp": "2018-05-09T10:44:09.046Z",
    "host": "f6f9fd66cd6c",
    "path": "/usr/share/logstash/logs/docker-elk-master.log",
    "@version": "1"
  },
  "fields": {
    "@timestamp": [
      "2018-05-09T10:44:09.046Z"
    ]
  },
  "highlight": {
    "type": [
      "@kibana-highlighted-field@elktest@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1525862649046
  ]
}
You can first separate LON and LAT into their own fields as follows:
grok {
  match => { "message" => "LON: %{NUMBER:LON}, LAT: %{NUMBER:LAT}" }
}
Once they are separated, you can use the mutate filter to create a parent field around them, like this:
filter {
  mutate {
    rename => { "LON" => "[location][LON]" }
    rename => { "LAT" => "[location][LAT]" }
  }
}
Let me know if this helps.
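For Elasticsearch to actually treat location as geospatial, the index mapping also has to declare it as a geo_point, and a geo_point object expects lowercase lat/lon keys. A sketch of that variant follows; the mapping uses the 6.x single-type style matching the sample document, with the index name taken from the sample and everything else illustrative:

filter {
  grok {
    # :float makes grok emit numbers instead of strings
    match => { "message" => "LON: %{NUMBER:LON:float}, LAT: %{NUMBER:LAT:float}" }
  }
  mutate {
    rename => { "LON" => "[location][lon]" }
    rename => { "LAT" => "[location][lat]" }
  }
}

and, before indexing, the mapping:

PUT memo-logstash-2018.05
{
  "mappings": {
    "doc": {
      "properties": {
        "location": { "type": "geo_point" }
      }
    }
  }
}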

How to configure an Elasticsearch regexp query

I am trying to configure an Elasticsearch request. I use the query DSL and want to find documents containing the word "swagger" in the "message" field.
Here is one of the documents the query should match:
{
  "_index": "apiconnect508",
  "_type": "audit",
  "_id": "AWF1us1T4ztincEzswAr",
  "_score": 1,
  "_source": {
    "consumerOrgId": null,
    "headers": {
      "http_accept": "application/json",
      "content_type": "application/json",
      "request_path": "/apim-5a7c34e0e4b02e66c60edbb2-2018.02/auditevent",
      "http_version": "HTTP/1.1",
      "http_connection": "keep-alive",
      "request_method": "POST",
      "http_host": "localhost:9700",
      "request_uri": "/apim-5a7c34e0e4b02e66c60edbb2-2018.02/auditevent",
      "content_length": "533",
      "http_user_agent": "Wink Client v1.1.1"
    },
    "nlsMessage": {
      "resource": "messages",
      "replacements": [
        "test",
        "1.0.0",
        "ext_mafashagov@rencredit.ru"
      ],
      "key": "swagger.import.notification"
    },
    "notificationType": "EVENT",
    "eventType": "AUDIT",
    "source": null,
    "envId": null,
    "message": "API test version 1.0.0 was created from a Swagger document by ext_mafashagov@rencredit.ru.",
    "userId": "ext_mafashagov@rencredit.ru",
    "orgId": "5a7c34e0e4b02e66c60edbb2",
    "assetType": "api",
    "tags": [
      "_geoip_lookup_failure"
    ],
    "gateway_geoip": {},
    "datetime": "2018-02-08T14:04:32.731Z",
    "@timestamp": "2018-02-08T14:04:32.747Z",
    "assetId": "5a7c58f0e4b02e66c60edc53",
    "@version": "1",
    "host": "127.0.0.1",
    "id": "5a7c58f0e4b02e66c60edc55",
    "client_geoip": {}
  }
}
I am trying to find this JSON with:
POST myAddress/_search
The following query works without the "regexp" clause. How should I configure the regexp part of my query?
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": { "gte": "now-100d" }
              }
            },
            {
              "term": {
                "_type": "audit"
              }
            },
            {
              "regexp": {
                "message": "*wagger*"
              }
            }
          ]
        }
      }
    }
  },
  "sort": {
    "TraceDateTime": {
      "order": "desc",
      "ignore_unmapped": "true"
    }
  }
}
If the message field is analyzed, this simple match query should work (a match query does not interpret wildcards, and the analyzer already splits the text into terms, so plain swagger is enough):
"match": {
  "message": "swagger"
}
However, if it is not analyzed, these two queries should work for you instead. Both are case-sensitive, so consider lowercasing the field if you wish to keep it not analyzed.
"wildcard":{
"message":"*swagger*"
}
or, since a regexp query must match the entire field value when it is not analyzed:
"regexp": {
  "message": ".*swagger.*"
}
Be careful as wildcard and regexp queries degrade performance.
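Putting it together with the original filtered query (pre-5.0 syntax, as in the question), a sketch of the full request; the TraceDateTime sort is kept from the question even though that field does not appear in the sample document:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "range": { "@timestamp": { "gte": "now-100d" } } },
            { "term": { "_type": "audit" } },
            { "wildcard": { "message": "*swagger*" } }
          ]
        }
      }
    }
  },
  "sort": {
    "TraceDateTime": {
      "order": "desc",
      "ignore_unmapped": "true"
    }
  }
}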

How to search exact text in a nested document in Elasticsearch

I have an index like this:
"_index": "test",
"_type": "products",
"_id": "URpYIFBAQRiPPu1BFOZiQg",
"_score": null,
"_source": {
"currency": null,
"colors": [],
"api": 1,
"sku": 9999227900050002,
"category_path": [
{
"id": "cat00000",
"name": "B1"
},
{
"id": "abcat0400000",
"name": "Cameras & Camcorders"
},
{
"id": "abcat0401000",
"name": "Digital Cameras"
},
{
"id": "abcat0401005",
"name": "Digital SLR Cameras"
},
{
"id": "pcmcat180400050006",
"name": "DSLR Package Deals"
}
],
"price": 1034.99,
"status": 1,
"description": null,
}
I want to search for only the exact text "Camcorders" in the category_path field.
I tried a match query, but it returns all the products that have "Camcorders" as part of the text. Can someone help me solve this?
Thanks
To search in the nested field, use a term query like the following (the value is lowercased because the field is analyzed; for your case it would be camcorders):
{
  "query": {
    "term": {
      "category_path.name": {
        "value": "b1"
      }
    }
  }
}
Hope it helps!
You could also add one more field, raw_name, indexed as not_analyzed, and match against it.
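A sketch of that second idea, using the pre-5.x not_analyzed syntax the answer's wording implies; here a raw sub-field is added on the existing name instead of a separate raw_name field, and the sub-field name is illustrative:

PUT test/_mapping/products
{
  "properties": {
    "category_path": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

After reindexing, an exact match becomes:

{
  "query": {
    "term": {
      "category_path.name.raw": "Camcorders"
    }
  }
}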
