Use existing field as id in elasticsearch - elasticsearch

Just started using elasticSearch today. I was wondering if it would be possible to set in some kind of global parameter to use a certain field within a document as the ID always?
My JSON documents will always have it's own unique ID
{
"Record ID": "a06b0000004SWbdAAG",
"System Modstamp": "01/31/2013T07:46:02.000Z",
"body": "Test Body"
}
Here I would like to use Record ID as the ID field.
Regards

You want to use the path setting, see the docs here:
http://www.elasticsearch.org/guide/reference/mapping/id-field/
specifically something like this should work in your mapping:
{
"your_mapping" : {
"_id" : {
"path" : "Record ID"
}
}
}
I've never tried having variable names split up though. You might want to camelcase or underscore them if you run into wierdness.

Related

Kibana Regex check if a field Value contains another field value

I'm trying to search for documents in which a description field contains the value of a name field (from another document). I tried to do a Regex query as following :
GET inventory-full-index/_search
{
"query": {
"regexp": {
"description.description_data.value.keyword": ".*doc['name.keyword'].*"
}
}
}
It returns me interesting documents, that fit my need. The problem is that i created a document that contains "python3" in the description, and I made sure there was a document named "python3" as well. This query doesn't return this document, so i obviously missed something.
Any idea how to fix this ?

Fluent Bit set index mapping

I am trying to set all mapped fields to string ie if a json message comes with following:
{
"logDate": "2012-04-23T18:25:43.511Z",
"logId": 123131,
"message": {
"username": "pera",
"password": "pera123"
}
}
I need to log every value as string ie. logId should be logged as "logId": "123131".
Is there a way to tell fluent bit what index mapping to use of maybe there is another setting that changes dynamic type to string?
Maybe can try adding an index template.
https://www.elastic.co/guide/en/elasticsearch/reference/current/index-templates.html

How can I let ES support mixed type of a field?

I am saving logs to Elasticsearch for analysis but I found there are mixed types of a particular field which causing error when indexing the document.
For example, I may save below log to the index where uuid is an object.
POST /index-000001/_doc
{
"uuid": {"S": "001"}
}
but from another event, the log would be:
POST /index-000001/_doc
{
"uuid": "001"
}
the second POST will fail because the type of uuid is not an object. so I get this error: object mapping for [uuid] tried to parse field [uuid] as object, but found a concrete value
I wonder what the best solution for that? I can't change the log because they are from different application. The first log is from the data of dynamodb while the second one is the data from application. How can I save both types of logs into ES?
If I disable dynamic mapping, I will have to specify all fields in the index mapping. For any new fields, I am not able to search them. so I do need dynamic mapping.
There will be many cases like that. so I am looking for a solution which can cover all conflict fields.
It's perfectly possible using ingest pipelines which are run before the indexing process.
The following would be a solution for your particular use case, albeit somewhat onerous:
create a pipeline
PUT _ingest/pipeline/uuid_normalize
{
"description" : "Makes sure uuid is a hash map",
"processors" : [
{
"script": {
"source": """
if (ctx.uuid != null && !(ctx.uuid instanceof java.util.HashMap)) {
ctx.uuid = ['S': ctx.uuid]; // hash map init
}
"""
}
}
]
}
run the pipeline when ingesting a new doc
POST /index-000001/_doc
{
"uuid": {"S": "001"}
}
POST /index-000001/_doc?pipeline=uuid_normalize <------
{
"uuid": "001"
}
You could now extend this to be as generic as you like but it is assumed that you know what you expect as input in each and every doc. In other words, unlike dynamic templates, you need to know what you want to safeguard against.
You can read more about painless script operators here.
You just cannot.
You should either normalize all your field in a way or another.
Or use 2 separate field.
I can suggest to use a field like this :
"uuid": {"key": "S", "value": "001"}
and skip the key when not necessary.
But you will have to preprocess your value before ingestion.

Changing data in every document

I am working on an application that has messages and I want to store all the messages. But my problem is the message has a from first name and last name which could change. So if for example my JSON was
{
"subject": "Hello!",
"message": "Hello there",
"from": {
"user_id": 1,
"firstname": "George",
"lastname": "Lastgeorge"
}
}
The user could potentially change their last name or even first name. Which would require basically looping over every record in elasticsearch and updating everyone with the user_id.
Is there a better way to go about doing this?
I feel you should use parent mapping.
Keep the user info as parent with userID as key.
/index/userinfo/userID
{
"name" : "George",
"last" : "Lastgeorge"
}
Next , you need to maintain each chat as a child document and map the parent to the userindo type.
This way , whenever you want to make some change to the user information , simply make the change in userInfo type.
With this feature intact , you can search your logs based on user information , or search users based on chat records.
Link - http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/parent-child.html

Using a combined field as id mapping in ElasticSearch

From this question I can see that it is possible Use existing field as id in elasticsearch
My question is, if can do similar thing but concatenating fields.
{
"RecordID": "a06b0000004SWbdAAG",
"SystemModstamp": "01/31/2013T07:46:02.000Z",
"body": "Test Body"
}
And then do something like
{
"your_mapping" : {
"_id" : {
"path" : "RecordID" + "body"
}
}
}
So the id is automatically formed from concatenating those fields.
No you can't, you can only make the _id point to a field that's within the document, using the dot notation as well if needed (e.g. level1,level2.id).
I'd suggest to have a field that contains the whole id in your documents, or even better to take the id out and provide it in the url, as configuring a path causes the document to be parsed when not needed.

Resources