No fixed type in indexing elasticsearch - elasticsearch

I am still learning elasticsearch. I wanted to know, if there is a way where the type of the value of a particular key is not fixed, can we still index it?
for example firstName can be "xyz" and it can also be an object in another document of the same type, and there is a huge combination of such fields which can all have string or object as the value, so it is not like I can isolate the string one and the object one in different indexes.
Thanks

Elasticsearch doesn't support this.
Elasticsearch does have features to "auto detect" what the type for a field should be. However, the first time it sees a field, it will make its guess, and then every subsequent record that has that field will have to match.
In your case, if a record where firstName was a string was indexed first, then all records where firstName is an object will throw an error when you try to index them in Elasticsearch. If an object was indexed first, all the records where firstName is a string would fail.
Elasticsearch is designed to help you get started quickly, but ultimately there's no shortcut and you will have to:
Design a schema that tells Elasticsearch good settings to use for each field
Do the work in your code that imports records into Elasticsearch to make decisions about how you want to import them

Related

elasticsearch unknown setting index.include_type_name

I'm in really weird situations, I need to create indexes in elasticsearch that contain typeless fields. I have a rails application that sends any data per second to my elasticsearch. about my architecture, I have to say I use elastic-stack on docker in ubuntu server and use socket to send data's to elk and all of them are the latest version.
In my rails application user could choose datatype for each field but the issues happen when the user want to change the datatype of one field right after it's created, logstash return this error
error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [field] of type [long] in document with id '5e760cac-cafc-4fd0-9e45-1c650967ccd4'. Preview of field's value: '2022-01-18T08:06:30'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"For input string: \"2022-01-18T08:06:30\
I found deadly queue letter plugins to save wrong input in my server after that I think if I could index documents without any type the problem is solved so I start to googling and found Removal of mapping types in elasticsearch documents and I follow instructions which describe in tutorials I get the following error:
unknown setting [index.include_type_name] please check that any required plugins are installed, or check the breaking changes documentation for removed settings
even I put "include_type_name" in the request to send to the elastic noting change I have the latest version of elastic.
I think maybe it's helpful to edit the default elasticsearch template but noting the change. could you please help me with what should I do?
As already mentioned in the comments, Elasticsearch does not support changing the data type of a field without a reindex or creating a new index.
For example, if a field is mapped as a numeric field like integer and the user wants to index a string value in this field, elasticsearch will return an mapping error.
You would need to change the mapping of the index and reindex it or create a entirely new index using the new mapping.
None of this is done automatically by elastic, you would need to deal with this in your application, you could catch the error and implement some logic to create a new index with the new mapping, but this also could lead to other problems as having too many indices in the cluster and query errors when the range of the query include index with the same field with different mappings.
One feature that Elasticsearch has that could help you in some way is the runtime fields, with runtime fields you can query a field that has a specific mapping using a different mapping.
For example, if you have a field that has date values, but was wrongly mapped as a keyword or text field, you could use a runtime field to query it as it was a date field.
But again, this will need that you implement a logic to build those runtime fields and also can lead to other problems, not all the data types are available to runtime fields and runtime fields can impact in the performance.
Another feature that could help you is to use of multi-fields, this, I think, is the closest you got of having a field with multiple data types.
Using multi-fields you could have a field named date with the date type and also a field named date.keyword with the keyword type, you also could have a field name code with the keyword type and a field name code.int with the integer type, you would also need to use the ignore_malformed setting in the mapping so elastic does not reject the entire document in case of mapping errors, just the field with the wrong mapping.
Just keep in mind that when use multi-fields, you will have a different field for each mapping, for example date is a field, date.keyword is another field, this will increase the storage usage.
But again, none of this is done automatically, it needs logic in your application, elasticsearch does not allows you to change the mapping of a field, if your application needs this, you will need to implement something in the application that can work with that limitations of elasticsearch.

How to make Country-State-City like search in elastic Search

I want to make dependent search like when user type country and select country then on next dropdown/text search result would be from that particular Countries state, after selecting state on next text search would only based on that selected state. can anyone help to achieve this kind thing via elastic search.
i am new to elastic Search and i had basic idea of it, but didn't get idea how to do this kind of stuff where i need to search from child and feel data like map
Fist, it is important to understand how Elasticsearch store its data. You can find this kind of info here: https://www.elastic.co/guide/en/elasticsearch/reference/master/documents-indices.html
So, basically what you need is build a query with two must terms.
One for the object type (Country, State, etc).
Other for the name ("Los Angeles", "Massachussets", etc). If you want a autocomplete feature you could add a wildcard query in your list. https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html
Obs: When you store your State object do not forget to store the Country name together. Since Elasticsearch is non relational you have to have Country name indexed in the State document.
Hope that it helps

Lucene: Filter query by doc ID

I want to have in the search response only documents with specified doc id. In stackoverflow I found this question (Lucene filter with docIds) but as far as I understand there is created the additional field in the document and then doing search by this field. Is there another way to deal with it?
Lucene's docids are intended only to be internal keys. You should not be using them as search keys, or storing them for later use. Those ids are subject to change without warning. They will be changed when updating or reindexing documents, and can change at other times, such as segment merges, as well.
If you want your documents to have a unique identifier, you should generate that key separate from the docId, and index it as a field in your document.

Can I index nested documents on ElasticSearch without mapping?

I have a document whose structure changes often, how can I index nested documents inside it without changing the mapping on ElasticSearch?
You can index documents in Elasticsearch without providing a mapping yes.
However, Elasticsearch makes a decision about the type of a field when the first document contains a value for that field. If you add document 1 and it has a field called item_code, and in document 1 item_code is a string, Elasticsearch will set the type of field "item_code" to be string. If document 2 has an integer value in item_code Elasticsearch will have already set the type as string.
Basically, field type is index dependant, not document dependant.
This is mainly because of Apache Lucene and the way it handles this information.
If you're having a case where some data structure changes, while other doesn't, you could use an object type, http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-object-type.html.
You can even go as far as to use "enabled": false on it, which makes elasticsearch just store the data. You couldn't search it anymore, but maybe you actually don't even want that?

What indexes are created when indexing a document in elasticsearch

If I create a first document of it's type, or put a mapping, is an index created for each field?
Obviously if i set "index" to "analyzed" or "not analyzed" the field is indexed.
Is there a way to store a field so it can be retrieved but never searched by? I imagine this will save a lot of space? If I set this to "no" will this save space?
Will I still be able to search by this, just take more time, or will this be totally unsearchable?
Is there a way to make a field indexed after some documents are inserted and I change my mind?
For example, I might have a mapping:
{
"book":{"properties":{
"title":{"type":"string", "index":"not_analyzed"},
"shelf":{"type":"long","index":"no"}
}}}
so I want to be able to search by title, but also retrieve the shelf the book is on
index:no will indeed not create an index for that field, so that saves some space. Once you've done that you can't search for that particular field anymore.
Perhaps also useful in this context is to know aboutthe _source field, which is returned by default and includes all fields you've stored. http://www.elasticsearch.org/guide/reference/mapping/source-field/
As to your second question:
you can't change your mind halfway. When you want to index a particular field later on you have to reindex the documents.
That's why you may want to reconsider setting index:no, etc. In fact a good strategy to begin is to don't define a schema for fields at all, unless you're 100% sure you need a non-default analyzer for a particular field for instance. Otherwise ES will use generally usable defaults.

Resources