In my case, NiFi receives data from a syslog firewall and, after transformation, sends JSON to Elasticsearch. This is my first contact with Elasticsearch.
{
"LogChain" : "Corp01 input",
"src_ip" : "162.142.125.228",
"src_port" : "61802",
"dst_ip" : "177.16.1.13",
"dst_port" : "6580",
"timestamp_utc" : 1646226066899
}
Elasticsearch automatically created an index with the following field types:
{
"mt-firewall" : {
"mappings" : {
"properties" : {
"LogChain" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"dst_ip" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"dst_port" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"src_ip" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"src_port" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"timestamp_utc" : {
"type" : "long"
}
}
}
}
}
How can I change the field types in Elasticsearch?
"src_ip": type "ip"
"dst_ip": type "ip"
"timestamp_utc": type "date"
You can change or configure field types using a mapping in Elasticsearch. Some of the ways to do that are given below:
1. Explicit Index Mapping
Here, you define the index mapping yourself, with every required field and its specific type, before indexing any document into Elasticsearch.
PUT /my-index-000001
{
"mappings": {
"properties": {
"src_ip": { "type": "ip" },
"dst_ip": { "type": "ip" },
"timestamp_utc": { "type": "date" }
}
}
}
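Before creating the index, it can help to check that the sample document from the question actually fits the target types. The following is only a sketch: the helper name `check_firewall_doc` is made up for illustration, and it assumes `timestamp_utc` holds epoch milliseconds, which the default `date` format (`strict_date_optional_time||epoch_millis`) accepts.

```python
import ipaddress
from datetime import datetime, timezone

# Explicit mapping body from the answer above (sent with PUT /my-index-000001).
MAPPING = {
    "mappings": {
        "properties": {
            "src_ip": {"type": "ip"},
            "dst_ip": {"type": "ip"},
            "timestamp_utc": {"type": "date"},
        }
    }
}


def check_firewall_doc(doc):
    """Hypothetical helper: parse the fields the way the new types expect them."""
    # ip_address raises ValueError for anything that is not a valid IPv4/IPv6 address.
    src = ipaddress.ip_address(doc["src_ip"])
    dst = ipaddress.ip_address(doc["dst_ip"])
    # Assumption: timestamp_utc is epoch milliseconds. divmod avoids float rounding.
    seconds, millis = divmod(doc["timestamp_utc"], 1000)
    ts = datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=millis * 1000
    )
    return src, dst, ts


# Sample document from the question.
sample = {
    "LogChain": "Corp01 input",
    "src_ip": "162.142.125.228",
    "src_port": "61802",
    "dst_ip": "177.16.1.13",
    "dst_port": "6580",
    "timestamp_utc": 1646226066899,
}

src, dst, ts = check_firewall_doc(sample)
print(src, dst, ts.isoformat())
```

If a value fails this check (for example an empty string in `src_ip`), Elasticsearch would most likely reject the document at index time as well, so it is worth catching in NiFi first.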
2. Dynamic Template:
Here, you provide a dynamic template while creating the index, and Elasticsearch maps each new field to a specific data type based on a condition. For example, if the field name ends with ip, the field is mapped as the ip type.
PUT /my-index-000001
{
"mappings": {
"dynamic_templates": [
{
"strings_as_ip": {
"match_mapping_type": "string",
"match": "*ip",
"runtime": {
"type": "ip"
}
}
}
]
}
}
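To get a feel for which field names the `match` pattern above would select, here is a small Python sketch. It approximates the simple wildcard matching with `fnmatch.fnmatchcase` from the standard library, which is an assumption of this example (fnmatch also understands `?` and `[...]`, which go beyond a plain `*` wildcard). The helper name `matching_fields` is hypothetical.

```python
from fnmatch import fnmatchcase


def matching_fields(pattern, field_names):
    """Return the field names that a wildcard pattern such as "*ip" selects."""
    return [name for name in field_names if fnmatchcase(name, pattern)]


# Field names from the firewall document in the question.
fields = ["LogChain", "src_ip", "src_port", "dst_ip", "dst_port", "timestamp_utc"]

print(matching_fields("*ip", fields))

# "*ip" is broad: any string field whose name merely ends in "ip" is selected too.
print(matching_fields("*ip", ["zip", "membership", "src_ip"]))

# "*_ip" is a tighter pattern for "field name ends with _ip".
print(matching_fields("*_ip", ["zip", "membership", "src_ip"]))
```

Whichever pattern is used, the template only applies to fields detected as strings (`match_mapping_type`), so a numeric field would not be affected.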
Update 1:
Changing the mapping of an existing index in place is not recommended, because it leaves the data inconsistent.
You can follow the steps below instead:
Use the Reindex API to copy the data to a temporary index.
Delete your original index.
Define the index again with the desired mapping, using one of the methods above.
Use the Reindex API to copy the data from the temporary index back to the original index (the newly created index with the mapping).
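The four steps can be written down as an ordered list of REST requests. This is a sketch that only builds and prints the requests, it does not send them. The index name `mt-firewall` comes from the question, while the temporary index name `mt-firewall-tmp` and the helper `reindex_plan` are assumptions made for this example.

```python
import json


def reindex_plan(index, temp_index, mappings):
    """Return (method, path, body) tuples for the copy/delete/recreate/copy-back steps."""
    copy_out = {"source": {"index": index}, "dest": {"index": temp_index}}
    copy_back = {"source": {"index": temp_index}, "dest": {"index": index}}
    return [
        ("POST", "/_reindex", copy_out),               # 1. copy data to the temp index
        ("DELETE", f"/{index}", None),                 # 2. delete the original index
        ("PUT", f"/{index}", {"mappings": mappings}),  # 3. recreate it with the mapping
        ("POST", "/_reindex", copy_back),              # 4. copy the data back
    ]


mappings = {
    "properties": {
        "src_ip": {"type": "ip"},
        "dst_ip": {"type": "ip"},
        "timestamp_utc": {"type": "date"},
    }
}

plan = reindex_plan("mt-firewall", "mt-firewall-tmp", mappings)
for method, path, body in plan:
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```

It is worth checking the document count of the temporary index before running the delete step, since that step cannot be undone.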
Related
Kibana Version : 7.4.2
My Index: cls-docker-logs*
Existing mapping in index cls-docker-logs-*
GET cls-docker-logs-*/_mapping
"stack_trace" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
Updated the mapping to:
"stack_trace" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 2560000
}
}
}
I created a scripted field in which I basically want to read the value and perform some string manipulation:
return doc['stack_trace.keyword'].value;
I don't see any value after I run this.
What am I doing wrong here?
I'm using Elastic Cloud hosted in Azure and use NEST as the client. I have a part of an auto-generated mapping that I need to change from
"Bonus" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
to
"Bonus" : {
"properties" : {
"Amount" : {
"properties" : {
"Value" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"PayrollSyncDateTime" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
When I tried to do this, I got an illegal_argument_exception error with the message "can't merge a non object mapping [activityData.Bonus] with an object mapping". How can I correct the auto-generated mapping?
I have data in Hive in the following format:
user_ids name city owner_ids
[1, 324, 456] some_name some_city [4567, 12345678]
I want to be able to search with user_ids = 324 or owner_ids = 12345678 as the filter criteria and get the above document back as the response (exact match on ids).
Currently I am using a dynamic template for the mapping, which maps the user_ids field to long, and I am unable to get any results. What type should I force for the user_ids and owner_ids fields to get this response?
Mapping configuration
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1
},
"mappings": {
"doc": {
"dynamic_templates": [
{
"strings_as_keywords": {
"match_mapping_type": "string",
"mapping": {
"type": "keyword"
}
}
}
]
}
}
}
Result mapping
{
"user_search" : {
"mappings" : {
"doc" : {
"properties" : {
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"city" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"ds" : {
"type" : "date"
},
"user_ids" : {
"type" : "long"
},
"owner_ids" : {
"type" : "long"
}
}
}
}
}
}
I'm wondering how Elasticsearch tokenizes keywords.
Example:
So I'm using a search box for searching keywords in comments.
When I search for "Zelle", only comments in Spanish show up.
But if I search for "Zell", all comments containing "Zelle" show up, with "Zell" highlighted.
Can anyone please tell me why, when I search for some keywords, only comments in specific languages show up?
Edit1:
The mapping is like this:
{
"comments" : {
"mappings" : {
"ios" : {
"properties" : {
"content" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"country" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"date" : {
"type" : "date"
},
"language" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"product_id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"product_version" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"rating" : {
"type" : "long"
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"user_language" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
and it does not contain any information about the tokenizer.
How can I find out which tokenizer Elasticsearch uses for searching?
I recommend reading the Mapping chapter of the official book; it will help you a lot.
To answer your question, we need to know the mapping of your documents, specifically the mapping of the field you search in.
By the look of it, you are not using the default analyzer (called "standard"), because with it "Zell" would not match "Zelle".
In Elasticsearch, an analyzer tokenizes your content the way you configure it. By the look of it, some analyzer is set up in your mapping, because "Zelle" and "Zell" are matching.
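One way to see which tokens a field really produces is the `_analyze` API, which accepts a `field` and a `text`. The sketch below builds such a request for the `content` field of the `comments` index from the question, and adds a very rough local stand-in for the standard analyzer (lowercase, split on non-word characters). That stand-in is an assumption made purely for illustration and is not how Elasticsearch implements it; the helper names and the sample sentence are hypothetical.

```python
import json
import re


def analyze_request(index, field, text):
    """Build a request that asks Elasticsearch which tokens a field's analyzer emits."""
    return ("GET", f"/{index}/_analyze", {"field": field, "text": text})


def rough_standard_tokens(text):
    """Rough approximation of the standard analyzer: lowercase and split into words."""
    return re.findall(r"\w+", text.lower())


method, path, body = analyze_request("comments", "content", "Zelle")
print(method, path)
print(json.dumps(body))

# Hypothetical comment text, used only to illustrate whole-token matching.
tokens = rough_standard_tokens("I paid with Zelle yesterday")
print(tokens)

# A match query compares whole tokens, so "zell" is not found among them.
print("zelle" in tokens, "zell" in tokens)
```

If the tokens returned by the real `_analyze` call contain only whole words like this, a partial match on "Zell" would have to come from the query side (for example a wildcard or prefix query) rather than from the analyzer.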
I've been trying to find a way to do this for a couple of days now. I've looked through the 'bool', 'constant_score', and 'filtered' queries, none of which seem to produce the result I want.
One that HAS come close is the 'ids' query (it does exactly what I described in the title of this question). The one problem is that the key I'm trying to search on is not the '_id' value of the Elasticsearch index. Instead it is 'posterId' in the document below:
"_index": "activity",
"_type": "activity",
"_id": "<unique string id>",
"_score": null,
"_source": {
...
misc keys
...
"posterId": "<QUERY BASED ON THIS VALUE>",
"time": 20171007173623
}
Query that returns based on the _id value:
ids : {
type : "activity",
values : ["<unique string id>", ...]
}
as seen here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html
How I want my query to work:
posterId : {
type : "activity",
values : [<list of posterIds>]
}
Returning all documents whose posterId is contained in "<list of posterIds>"
< Edit > I'm trying to do this in one query as opposed to looping through each member of my list of posterIds, because I also need to sort based on the time key and be able to page the query.
So, does anyone know of a built-in query that does this, or a workaround?
Side note: if you feel like you're about to downvote this please just comment why, I'm about to be banned and I've read through all the guidelines and I feel like I'm following them but my questions rarely perform well. :( It would be much appreciated
Edit:
{
"activity" : {
"aliases" : { },
"mappings" : {
"activity" : {
"properties" : {
"-Kvp7f3epvW_dXSONzKj" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"actionId" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"actionType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"activityType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"attachedId" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"attachedType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"cardType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"noteTitleDict" : {
"properties" : {
"noun" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"subject" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"verb" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"posterId" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"segueType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"time" : {
"type" : "long"
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1507678305995",
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "<id>",
"version" : {
"created" : "5010199"
},
"provided_name" : "activity"
}
}
}
}
I think what you are looking for is a Terms Query:
{
"query": {
"constant_score" : {
"filter" : {
"terms" : { "user" : ["kimchy", "elasticsearch"]}
}
}
}
}
This finds documents which contain the exact term kimchy or elasticsearch in the inverted index of the user field. You can read more about this here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html
In your case you need to replace
user with posterId.keyword
kimchy and elasticsearch with all your posterIds
Keep in mind that a terms query is case-sensitive and the keyword field does not lowercase its input, which means the value is indexed in exactly the case it was received.
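Since the question also asks for sorting on the time key and paging, here is a sketch of how the full search body could be assembled in Python. The helper name `poster_query`, the page size, and the example ids are made up for illustration; `terms`, `constant_score`, `sort`, `from`, and `size` are standard parts of the search request body.

```python
import json


def poster_query(poster_ids, page=0, page_size=20):
    """Build a search body: exact match on posterId.keyword, newest first, paged."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    # Exact, case-sensitive match against the keyword sub-field.
                    "terms": {"posterId.keyword": list(poster_ids)}
                }
            }
        },
        "sort": [{"time": {"order": "desc"}}],
        "from": page * page_size,  # offset of the first hit on this page
        "size": page_size,
    }


# Hypothetical poster ids; the second page (page=1) of 20 hits each.
body = poster_query(["poster-a", "poster-b"], page=1, page_size=20)
print(json.dumps(body, indent=2))
```

The body would be sent to the activity index's `_search` endpoint. Because the filter runs in one request, sorting and paging apply across all the listed posterIds at once instead of per id.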