How does indexing work for a dictionary in Elasticsearch? - elasticsearch

I have an elastic index containing a dictionary in each document.
Docs:
{
"name" : "name1",
"paymentDict":
{
"card1": { "CardType": "Credit", "CardName": "Axis"},
"card2": { "CardType": "Debit", "CardName": "Axis"}
}
}
Dictionary Type: Dictionary<int,object>
I am expecting a large volume of writes on this index and want to test its performance, but I couldn't find anything in the Elastic docs that explicitly explains how dictionaries are indexed. I need help with the following questions:
How does indexing work for the dictionary?
Will this indexing be the same as for List<object>?

That would be an object in Elasticsearch - https://www.elastic.co/guide/en/elasticsearch/reference/7.14/object.html
You could also keep this simple and index one document per card, which flattens things out.
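To make the two options above concrete, here is a minimal Python sketch. It assumes the usual behaviour of object fields, where inner keys end up as dotted field paths, and it uses a hypothetical `cardKey` field name for the one-document-per-card layout. It only models the document shapes and does not talk to a cluster.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted field paths, roughly the way
    inner object fields are addressed (e.g. paymentDict.card1.CardType)."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def one_doc_per_card(doc):
    """Alternative layout: one small document per card.
    `cardKey` is a hypothetical field name chosen for this sketch."""
    return [
        {"name": doc["name"], "cardKey": key, **card}
        for key, card in doc["paymentDict"].items()
    ]


doc = {
    "name": "name1",
    "paymentDict": {
        "card1": {"CardType": "Credit", "CardName": "Axis"},
        "card2": {"CardType": "Debit", "CardName": "Axis"},
    },
}

fields = flatten(doc)          # one field path per dictionary key
cards = one_doc_per_card(doc)  # two flat documents with a fixed set of fields
```

Note that in the first layout every distinct dictionary key becomes its own field path, so a dictionary with many different keys keeps adding fields to the mapping, while the second layout keeps the set of fields fixed.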

Related

Joining two indexes in Elastic Search like a table join

I am relatively new to Elasticsearch. I have an index called post which contains documents like this:
{
"id": 1,
"link": "https:www.instagram.com/p/XXXXX/",
"profile_id": 11,
"like_count": 100,
"comment_count": 12
}
I have another index called profile which contains documents like this:
{
"id": 11,
"username": "superman",
"name": "Superman",
"followers": 12312
}
As you can see, all profile data is in the index called profile and all post data is in the index called post. The "profile_id" in a post document refers to the "id" in a profile document.
When I query the post index and filter post documents, is there any way to have the matching profile data returned along with each post document, based on its "profile_id"? Or can I somehow fetch both sets of data with a multi-index search?
Thank you in advance; any help will be appreciated.
For the sake of performance, Elasticsearch encourages you to denormalize your data and model your documents according to the responses you want to get from your queries. However, in your case, I would suggest defining the post-profile relation with a join datatype and running your searches with the parent-join queries (both are described in the Elastic documentation). Note that a join field relates documents within a single index, so posts and profiles would have to be stored in the same index.
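As an illustration of the denormalization mentioned above (not of the join datatype), here is a minimal Python sketch that copies the profile data into each post before it is indexed. The `denormalize` helper and the nested `profile` field are assumptions made for this example.

```python
def denormalize(post, profiles_by_id):
    """Return a copy of the post with its profile embedded under a
    hypothetical "profile" field, so a single query returns both."""
    enriched = dict(post)
    profile = profiles_by_id.get(post["profile_id"])
    if profile is not None:
        # Drop the profile's own id; the post already carries profile_id.
        enriched["profile"] = {k: v for k, v in profile.items() if k != "id"}
    return enriched


profiles_by_id = {
    11: {"id": 11, "username": "superman", "name": "Superman", "followers": 12312},
}
post = {
    "id": 1,
    "link": "https:www.instagram.com/p/XXXXX/",
    "profile_id": 11,
    "like_count": 100,
    "comment_count": 12,
}

enriched = denormalize(post, profiles_by_id)
```

The trade-off is that a change to a profile (for example its follower count) has to be written to every post that embeds it.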

How are keyword and numeric data types stored in Elasticsearch? Are they stored in the inverted index?

PUT sana/_mapping/learn
{ "properties": { "name": { "type": "text" }, "age": { "type": "integer" } } }
POST sana/learn
{ "name": "rosy", "age": 23 }
Quoting the Elasticsearch doc:
Most fields are indexed by default, which makes them searchable. The
inverted index allows queries to look up the search term in unique
sorted list of terms, and from that immediately have access to the
list of documents that contain the term.
Keyword fields are indexed in the inverted index so that they are searchable. Numeric fields are indexed and searchable as well, although since Elasticsearch 5.0 they are indexed in a BKD tree rather than in the inverted index. If you want, you can disable indexing for a field by setting index to false in your index mapping. In addition, doc_values is enabled by default on keyword and numeric fields to support sorting, aggregations, etc., but it is not available on analyzed string (text) fields.
I hope this answers your question; let me know if you have any doubts.
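The quoted description of the inverted index can be sketched as a toy model in Python. This is only an illustration of the idea (a sorted list of unique terms, each pointing to the documents that contain it), not how Lucene stores it on disk; the second and third documents are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(docs, field):
    """Map each lowercased, whitespace-separated term of `field`
    to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, doc in docs.items():
        for term in str(doc[field]).lower().split():
            postings[term].add(doc_id)
    # Keep terms in sorted order, as in the quoted description.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


docs = {
    1: {"name": "rosy", "age": 23},
    2: {"name": "rosy jones", "age": 31},  # made-up document
    3: {"name": "sana", "age": 23},        # made-up document
}

name_index = build_inverted_index(docs, "name")
matching_ids = name_index.get("rosy", [])  # lookup is a single dictionary access
```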

How does Elasticsearch find document content by doc ID?

There are many articles about the inverted index and posting lists in Elasticsearch, but I did not find any article that explains how Elasticsearch finds document content by doc ID.
Could anyone explain this to me?
Thanks.
Ragav is correct. However, I do have a bit to add that may help you work with document IDs.
When you index documents that don't have an ID, an ID is generated for you by Elasticsearch. That field name is "_id".
If you know the ID value of the document you wish to find, you can simply run a query like this:
GET my_index/_search
{
"query": {
"terms": {
"_id": [ "1", "2" ]
}
}
}
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-id-field.html
The above query would return documents that have _id equal to 1 OR 2.
As Ragav said in his answer, if you created documents in the way described with ID 1 or 2, that sample query, which I pulled from the Elasticsearch documentation, would return them.
Hope this helps.
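If you build that request from application code, the body is just a nested dictionary. A minimal Python sketch, assuming a hypothetical helper name `ids_lookup_query` and that the body is then sent to my_index/_search with whatever HTTP client you use:

```python
import json


def ids_lookup_query(doc_ids):
    """Build the same terms-on-_id search body as the query above.
    IDs are sent as strings, matching the sample query."""
    return {"query": {"terms": {"_id": [str(doc_id) for doc_id in doc_ids]}}}


body = ids_lookup_query([1, 2])
print(json.dumps(body, indent=2))
```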
Elasticsearch is built on top of Lucene.
When you index a new document into Elasticsearch, it indexes _index, _type and _id as part of the document, along with the actual content (_source).
So, when you try to get a document using the get API _index/_type/_id, it is basically converted into a query that searches for the doc matching the _index, _type and _id.
This is how Elasticsearch is able to return you the document.

elastic search: get exact match term results

I have an Elasticsearch index with documents that have a field "backend_name" with values like google, google_staging, google_stg1, etc.
I want only those documents that have "backend_name" = google.
I am trying a term query like this:
{ "query": { "term": { "backend_name": "google" } } }
But it also returns documents that have "backend_name" set to google_staging or google_stg1. I want only the documents with "backend_name" = google.
One way to resolve this is to put google_staging, google_stg1, etc. in a must_not list, but I would like a better way. Suggestions?
It is probably because of the mapping you are using.
Take a look at the Elasticsearch documentation for the term query.
Try changing the field's mapping type to keyword so that it matches only on an exact match.
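Why the mapping matters can be sketched with a toy model in Python. It assumes the field is currently analyzed by something that lowercases and splits on non-letter characters; your actual analyzer may tokenize differently, so treat this as an illustration of the idea rather than of your index. A keyword field, by contrast, indexes the whole value as a single term.

```python
import re


def analyze_text(value):
    """Toy analyzer (an assumption): lowercase, split on non-letters."""
    return re.findall(r"[a-z]+", value.lower())


def analyze_keyword(value):
    """A keyword field keeps the whole value as one term."""
    return [value]


def term_matches(indexed_terms, query_term):
    """A term query matches when the exact term is among the indexed terms."""
    return query_term in indexed_terms


names = ["google", "google_staging", "google_stg1"]

text_hits = [n for n in names if term_matches(analyze_text(n), "google")]
keyword_hits = [n for n in names if term_matches(analyze_keyword(n), "google")]
```

In this model the analyzed field matches all three values because each one produces the term "google", while the keyword field matches only the exact value.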

ElasticSearch results aren't relevant

In ElasticSearch, I've created two documents with one field, "CategoryMajor"
In doc1, I set CategoryMajor to "Restaurants"
In doc2, I set CategoryMajor to "Restaurants Restaurants Restaurants Restaurants Restaurants"
If I perform a search for CategoryMajor:Restaurants, doc1 shows up as MORE RELEVANT than doc2. This is not typical Lucene behavior, which gives more relevance the more times a term shows up. doc2 should be MORE RELEVANT than doc1.
How do I fix this?
You can add &explain=true to your GET query to see that the score of doc2 is lowered by the "fieldNorm" factor. This is caused by the default Lucene similarity formula, which lowers the score for longer fields. Please read this document about the default Lucene similarity formula:
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/Similarity.html
To disable this behaviour, set "omit_norms" to true for the CategoryMajor field in your index mapping by sending a PUT request to:
http://localhost:9200/index/type/_mapping
with the request body:
{
  "type": {
    "properties": {
      "CategoryMajor": {
        "type": "string",
        "omit_norms": "true"
      }
    }
  }
}
I'm not certain, but it may be necessary to delete your index, create it again, put the above mapping, and then reindex your documents. Reindexing after changing the mapping is necessary for sure :).
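The effect of the norm can be sketched in Python using the two factors from the classic Lucene similarity that differ between the documents: tf = sqrt(termFreq) and lengthNorm = 1/sqrt(numTerms). This is a simplified sketch: idf, boosts and query normalization are left out because they are the same for both documents, and the stored norm is additionally rounded to a single byte, which is not modelled here and may be why doc2 ends up slightly below doc1 rather than tied.

```python
import math


def classic_score(term_freq, field_length, use_norms=True):
    """Simplified score: tf * lengthNorm, with the norm optionally omitted."""
    tf = math.sqrt(term_freq)
    norm = 1.0 / math.sqrt(field_length) if use_norms else 1.0
    return tf * norm


# doc1: "Restaurants" once in a 1-term field; doc2: five times in a 5-term field.
doc1_with_norms = classic_score(1, 1)
doc2_with_norms = classic_score(5, 5)
doc1_without_norms = classic_score(1, 1, use_norms=False)
doc2_without_norms = classic_score(5, 5, use_norms=False)
```

With norms, the longer field cancels out the higher term frequency; with norms omitted, only the term frequency remains and doc2 scores higher.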
