Store only in Elastic AppSearch / Enterprise Search - elasticsearch

I am trying to apply Enterprise Search to our e-commerce web app.
We have some fields which don't need to be indexed or searched, only stored.
The schema documentation allows only four types for a field:
text
number
date
geolocation
Before I try to work around indexing by falsely setting fields to geolocation or date, I wanted to ask whether there are any other options to only store, but not index, data in App Search?

Documents indexed in App Search are visible in a dedicated index in Elasticsearch.
As such, you can modify its mapping and reindex it, or update all docs by query.
The only thing I cannot tell you is what side effects such changes incur at the App Search level.
Best, Dan
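A minimal sketch of what such a mapping change could look like, assuming you create the target index yourself and reindex into it. `index: false` and `doc_values: false` are standard Elasticsearch mapping parameters; the field names (`internal_sku`, `supplier_notes`) and the helper `store_only_mapping` are hypothetical. Whether App Search tolerates the modified mapping is exactly the open question mentioned in the answer.

```python
import json


def store_only_mapping(field_names):
    """Build a mapping fragment whose fields stay in _source but are not searchable."""
    properties = {}
    for name in field_names:
        properties[name] = {
            "type": "keyword",
            "index": False,       # no inverted index: the field cannot be queried
            "doc_values": False,  # no columnar data: no sorting or aggregations
        }
    return {"properties": properties}


# Hypothetical field names, for illustration only.
mapping = store_only_mapping(["internal_sku", "supplier_notes"])
print(json.dumps(mapping, indent=2))
```

The values are still returned with each hit because they remain part of `_source`; they just cannot be used in queries, sorting, or aggregations.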

Related

ElasticSearch as primary DB for document library

My task is a full-text search system for a really large number of documents. Right now I have the documents as RTF files plus their metadata, so all of this will be indexed in Elasticsearch. These documents are immutable (they can only be deleted) and I don't really expect many new documents per day. So is it a good idea to use Elasticsearch as the primary DB in this case?
Maybe I'll store the RTF file separately, but I really don't see the point of storing all this data somewhere else.
This question was solved here. So it's a good case for elasticsearch as the primary DB
Elasticsearch is better known as a distributed full-text search engine than as a database.
If you preserve the document _source, it can be used as a database, since almost any time you apply document changes or mapping changes you need to re-index the documents in the index (roughly a table in the relational world). There is no way to update parts of the Lucene inverted index in place; you need to re-index the whole document.
Elasticsearch's index recovery mechanism is one of the best: if you lose a node, the replicas that were lost with it are automatically re-created on some of the other available nodes, so you don't need to do any manual operations.
If you do regular backups and have no requirement for the data to be available 24/7, it is completely acceptable to hold the data and the full-text index in Elasticsearch as you would in a database.
But if you need a highly available combination, I would recommend keeping the documents in, for example, MongoDB (known as one of the best distributed document stores) and using Elasticsearch only for its original purpose as a full-text search engine.

How about including JSON doc version? Is it possible for elastic search, to include different versions of JSON docs, to save and to search?

We are using ElasticSearch to save and manage information on complex transactions. We might need to add more information to every transaction in the near future.
How about including a JSON doc version?
Is it possible for ElasticSearch to store and search different versions of JSON docs?
How does this affect performance on ElasticSearch?
It's completely possible. By default Elasticsearch uses dynamic mapping to index every new document, such as your JSON documents. For each indexed field in your documents, Elasticsearch builds an inverted index, and search queries are executed against it. So regardless of how the fields vary between documents, as long as you know which field you want to query, throughput and performance should not be affected.
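One way to make the version explicit is to tag every document with a version field and filter on it at query time. This is only a sketch: the field name `schema_version`, the sample transaction fields, and both helper functions are assumptions for illustration, not something Elasticsearch requires.

```python
def with_version(doc, version):
    """Return a copy of the transaction document tagged with its schema version."""
    tagged = dict(doc)
    tagged["schema_version"] = version
    return tagged


def query_for_version(version, field, value):
    """Build a bool query: match on one field, restricted to one schema version."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: value}}],
                # Filter clauses do not contribute to scoring.
                "filter": [{"term": {"schema_version": version}}],
            }
        }
    }


# Two versions of the same kind of document; v2 carries an extra field.
v1 = with_version({"amount": 120, "currency": "EUR"}, 1)
v2 = with_version({"amount": 120, "currency": "EUR", "channel": "web"}, 2)
body = query_for_version(2, "channel", "web")
```

Both documents can live in the same index; with dynamic mapping the extra `channel` field is added to the mapping the first time a version 2 document is indexed.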

Elastic Enterprise Search - Is it a best practice to index data of two different json schema in a single index

Hi I'm trying out Elastic Enterprise Search with Elasticsearch. I have a couple of questions on data indexing.
While reading the Elasticsearch documentation, I saw that there is a limit on the number of fields an Elasticsearch index can have. Since Elastic Enterprise Search is built on Elasticsearch, I assume the same limit applies here. In that case, let's say I have multiple document types with various fields, for example Person.json and Dog.json, which have different properties. When indexing, I use one search engine in Elastic Enterprise Search for both Person and Dog, so that when I query through the Elastic Enterprise Search API I get results that are Person or Dog documents depending on the search term.
Is this the way to go, or should I specify a separate search engine for each schema type?
I am assuming that your person.json and dog.json contain different fields, as your heading suggests. Whether to create a separate index for these entities or keep them in a single index depends on the use cases in your application. You will not find Elasticsearch marking one approach as better than the other; the documentation mainly explains the pros and cons in a particular context (relevance, performance, management, etc.).
Please refer to this SO answer of mine, where I discussed the pros and cons of both approaches, and to the discussion in chat for more context on why the OP chose an approach based on his use case after learning those pros and cons.

Elastic search per user access control to document

I'm using ElasticSearch 7.1.1 as a full-text search engine. At the beginning all the documents are accessible to every user. I want to give users the possibility to edit documents. The modified version of the document will be accessible only to the editor and everyone else will only be able to see the default document.
To do this I will add two arrays to every document:
An array of users excluded from seeing the doc
An array with the only user that can see this doc
Every time someone edits a document I will:
Add the user that made the edit to the excluded users list
Create a document containing the edit, available only to that user.
This way in the index I'll have three types of documents:
Documents accessible to everyone
Documents accessible to everyone except some users
Documents accessible only to a specific user
I use ElasticSearch not only to fetch documents but also to calculate live aggregations (e.g. sums of some field), so at query time I have to be able to fetch user-specific documents.
I don't expect a lot of edits, less than 1% of the total documents.
Is there a smarter, and less query intensive, way to obtain the same results?
You could implement document-level security.
With that you can define roles that restrict read access to the documents that match a query (e.g. you could use the id of the document).
So instead of updating the documents each time via your proposed array solution, you would update the role and grant the roles to the particular users. This would of course require that every user has an Elasticsearch user.
As far as I know, this feature is the only way to fulfill your requirements that Elasticsearch brings to the table "out of the box".
I hope I could help you.
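A rough sketch of what a per-user role body could look like. The `indices` entries with `names`, `privileges` and `query` follow the Elasticsearch role API (`PUT /_security/role/<name>`); everything else is an assumption for illustration: the `owner` field on edited copies, the index name `docs`, and the helper `user_role_body`.

```python
import json


def user_role_body(index_name, username, replaced_ids):
    """Role body: shared docs plus the user's own edits, minus replaced originals."""
    visible = {
        "bool": {
            "should": [
                # Edited copies that belong to this user (hypothetical "owner" field).
                {"term": {"owner": username}},
                # Default documents: they carry no owner at all.
                {"bool": {"must_not": [{"exists": {"field": "owner"}}]}},
            ],
            "minimum_should_match": 1,
            # Originals this user has replaced with a personal edit.
            "must_not": [{"ids": {"values": list(replaced_ids)}}],
        }
    }
    return {
        "indices": [
            {
                "names": [index_name],
                "privileges": ["read"],
                # The role query is passed here as a JSON string.
                "query": json.dumps(visible),
            }
        ]
    }


role = user_role_body("docs", "alice", ["42"])
print(json.dumps(role, indent=2))
```

With this layout an edit only touches the new private copy and the editor's role; the original document and the other users' roles stay unchanged.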

Can we migrate non stored Index data in SOLR to Elastic search?

We are currently using SOLR for full-text search and are planning to move from SOLR to ElasticSearch. While looking into this, I read somewhere that there are plugins which migrate data from SOLR to ElasticSearch, but that they cannot migrate records which are not stored in SOLR. Is there a plugin that can migrate non-stored index data from SOLR to ElasticSearch? If so, please let me know.
Currently I am using the SOLR-to-ES plugin, but it won't migrate the non-stored index data.
Thanks
If the field is not stored, then you don't have the original value. If it is indexed, what is in the index is the value after it has gone through the analysis chain, so it is probably different from the original one (stopwords removed, probably lowercased, maybe stemmed, and so on).
There are a couple of cases in which you may still have the original content even though the field is not stored:
Indexed field: if it has been analyzed with just the keyword tokenizer, then the indexed value is the original value.
docValues field: if the field has docValues=true, then the original value is also kept. This feature was introduced later, so your index might not be using it.
The issue is that the common plugins might not take advantage of those cases where stored=true is not strictly necessary. You need to check them.
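The cases above can be turned into a quick check over your schema before choosing a migration tool. This is a sketch only: the dictionaries are a simplified, hypothetical stand-in for what you would read out of your Solr schema, and the check assumes a keyword-tokenized field has no additional filters (such as lowercasing) that would alter the value.

```python
def original_value_source(field):
    """Say where a field's original value could be read from, or None if it is lost."""
    if field.get("stored"):
        return "stored"
    if field.get("docValues"):
        return "docValues"
    if field.get("indexed") and field.get("tokenizer") == "keyword":
        return "indexed terms"
    return None


# Hypothetical field definitions, for illustration only.
fields = {
    "title": {"stored": True, "indexed": True, "tokenizer": "standard"},
    "sku": {"stored": False, "indexed": True, "tokenizer": "keyword"},
    "price": {"stored": False, "docValues": True},
    "body": {"stored": False, "indexed": True, "tokenizer": "standard"},
}

report = {name: original_value_source(spec) for name, spec in fields.items()}
for name, source in report.items():
    print(name, "->", source or "not recoverable")
```

Fields reported as not recoverable would have to be re-fed from the original data source rather than migrated from the index.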
