Fetch a particular number of documents satisfying multiple conditions - Elasticsearch

I have an Elasticsearch index containing information about fruits, as shown below:
GET fruits/fruits_data/_search
[{ id: 1,
name: apple},
{ id: 2,
name: mango},
{ id: 3,
name: apple},
{ id: 4,
name: banana},
{ id: 5,
name: apple},
{ id: 6,
name: mango},
{ id: 7,
name: pineapple},
{ id: 8,
name: jackfruit}]
Now I need to fetch 7 fruits according to the priority below:
{"apple": 3, "banana": 3, "mango": 2, "guava": 2, "pineapple": 1, "jackfruit": 1}
Here the key is the fruit to be fetched and the value is the maximum number of documents to fetch for that fruit.
This means I need to fetch at most 3 apple, 3 banana and 1 mango documents, and I can ignore the remaining entries in the priority hash once I have the required number of fruits. However, my ES index contains only 1 banana, so I need to fetch at most 3 apple, 1 banana, 2 mango and 1 pineapple (since guava is not present in the index, it has to be ignored).
Is there a way to fetch fruits like this in ES with a single query? I don't want to use multiple queries.
Thanks

It is not possible to fetch these results directly. Try using aggregations in Elasticsearch; you can refer to the link below:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html
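As a rough illustration of the aggregation route, the sketch below builds one search body with a `filter` bucket per fruit and a `top_hits` sub-aggregation capped at that fruit's quota, then trims the buckets to 7 documents on the client in priority order. It assumes `name` is mapped so that a `term` query matches it exactly (for example as a keyword field); `build_request` and `pick` are hypothetical helper names, and the final trimming happens outside Elasticsearch.

```python
from typing import Dict, List

# Priority map from the question: fruit name -> maximum documents to take.
# Dict insertion order (Python 3.7+) doubles as the priority order.
PRIORITY = {"apple": 3, "banana": 3, "mango": 2,
            "guava": 2, "pineapple": 1, "jackfruit": 1}


def build_request(priority: Dict[str, int]) -> dict:
    """Build one search body with a per-fruit bucket capped at its quota."""
    return {
        "size": 0,  # only the aggregation results are needed
        "aggs": {
            fruit: {
                # Assumption: `name` can be matched exactly with a term query.
                "filter": {"term": {"name": fruit}},
                "aggs": {"docs": {"top_hits": {"size": limit}}},
            }
            for fruit, limit in priority.items()
        },
    }


def pick(aggregations: dict, priority: Dict[str, int], total: int) -> List[dict]:
    """Walk fruits in priority order; stop once `total` documents are collected."""
    picked: List[dict] = []
    for fruit in priority:
        bucket = aggregations.get(fruit, {})
        hits = bucket.get("docs", {}).get("hits", {}).get("hits", [])
        for hit in hits:
            if len(picked) == total:
                return picked
            picked.append(hit["_source"])
    return picked
```

The body from `build_request(PRIORITY)` would be sent to the `_search` endpoint, and `pick(response["aggregations"], PRIORITY, 7)` applied to the reply. Fruits missing from the index (guava here) simply yield empty buckets and are skipped.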

Related

Adding fuzziness to an ElasticSearch prefix query

I have two documents:
{id: 1, name: "james"}
{id: 2, name: "james kennedy"}
I am using the match_bool_prefix API for autocomplete, and I would like to be able to match the document with id: 1 even if I incorrectly spell james.
Query: jamis
Desired output: finding document with id: 1.
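One possible direction, sketched under assumptions: in `match_bool_prefix` the final term is turned into a prefix query and fuzziness is not applied to it, so a single misspelled word such as jamis gets no fuzzy matching. A common workaround is to combine the prefix query with a fuzzy `match` query in a `bool`/`should`. The helper name `autocomplete_query` is hypothetical.

```python
def autocomplete_query(text: str, field: str = "name") -> dict:
    """Combine prefix matching with a typo-tolerant match on the same field."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Prefix behaviour for autocomplete (last term is a prefix).
                    {"match_bool_prefix": {field: {"query": text}}},
                    # Fuzzy fallback: "AUTO" allows one edit for a 5-letter term,
                    # which covers jamis -> james.
                    {"match": {field: {"query": text, "fuzziness": "AUTO"}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }
```

With the two sample documents, `autocomplete_query("jamis")` would be expected to match both, since each contains the token james.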

Filtering products index in elasticsearch by user

I have an index of products. They have regular fields such as id, name, brand etc. Querying this index is working great, however I want to limit the products which are returned for specific users.
Say I have 5 products whose IDs go from 1 to 5:
Id: 1, name: “Product One”, brand: “Fake Brand”
Id: 2, name: “Product Two”, brand: “Fake Brand”
Id: 3, name: “Product Three”, brand: “Fake Brand”
Id: 4, name: “Product Four”, brand: “Fake Brand”
Id: 5, name: “Product Five”, brand: “Fake Brand”
If there’s no filter, and I search for brand: “Fake Brand”, I get 5 results.
But I want to add this functionality: Say I have two users. User 1 is only able to “see” product IDs 1, 2, and 5. And User 2 is only able to “see” a different subset, say, product IDs 1, 2, 4 and 5.
So if user 1 searches for brand: “Fake Brand”, he only gets back products with IDs 1, 2, and 5, whereas if user 2 searches for brand: “Fake Brand”, he only gets back products with IDs 1, 2, 4 and 5.
Is there a way to add a “user id” to this products query and then store somewhere else what products a user is able to see?
In SQL I would probably have a different table storing what products each user can see and then just do a join. But using ES I think I either have to have two separate indexes or to use nested or has_child/has_parent queries but I’m not entirely sure how to implement it.
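A sketch of the "separate table" idea using a terms lookup, which lets a `terms` filter read its values from a field of another document. It assumes a second index (hypothetically named `user_products`) holding one document per user, with the document id equal to the user id and a `product_ids` array; those names are illustrative, not part of the question.

```python
def products_for_user(user_id: int, brand: str) -> dict:
    """Search by brand, restricted to the product ids stored for this user."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"brand": brand}}],
                "filter": [
                    {
                        "terms": {
                            # Terms lookup: values come from
                            # user_products/<user_id>.product_ids (assumed layout).
                            "id": {
                                "index": "user_products",
                                "id": str(user_id),
                                "path": "product_ids",
                            }
                        }
                    }
                ],
            }
        }
    }
```

For user 1 the lookup document would contain `product_ids: [1, 2, 5]`, so a search for "Fake Brand" would be limited to those three products.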

Search in Elasticsearch objects from non blocked user

In Elasticsearch we have 3 collections: users, products and blocked_users.
users[
{id: 1, first_name: “John”, last_name:“Snow”, …}
{id: 2, first_name: “Sarah”, last_name:“Connor”, …}
{id: 3, first_name: “Arnold”, last_name:“Schwarzenegger”, …}
]
products[
{id: 1, user_id: 3, title:“Apple”}
]
blocked_users[
{user_id: 2, blocked_user_id: 3} // this means the user with id 2 blocked the user with id 3
]
I want to search products by title, but I don't want to show products of blocked users.
So, when the user with id 1 searches, he will get a response with 1 product.
When the user with id 2 searches, 0 products are expected, because the product's user_id is 3 and user 2 has blocked that user.
What should the query look like?
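A minimal sketch of one way to phrase this, assuming the ids blocked by the searching user have already been collected (here from the `blocked_users` records in memory). The product search then excludes those owners with a `must_not` clause. `blocked_ids_for` and `search_body` are hypothetical helper names.

```python
from typing import Dict, List


def blocked_ids_for(user_id: int, blocked_users: List[Dict[str, int]]) -> List[int]:
    """Collect the ids of users that `user_id` has blocked."""
    return [b["blocked_user_id"] for b in blocked_users if b["user_id"] == user_id]


def search_body(title: str, blocked_ids: List[int]) -> dict:
    """Match products by title, excluding products owned by blocked users."""
    bool_query: dict = {"must": [{"match": {"title": title}}]}
    if blocked_ids:  # leave the clause out when nothing is blocked
        bool_query["must_not"] = [{"terms": {"user_id": blocked_ids}}]
    return {"query": {"bool": bool_query}}
```

With the data above, user 2 gets `must_not` on `user_id: [3]` and so sees no products, while user 1 gets no exclusion and sees the one product.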

How to delete RethinkDB documents that cannot be joined?

I have two tables, Person and Property. Each one has its own id. In addition, properties have a human owner, so they have a field person_id which "points" to the id in the Person table.
This is an example of three people who own some things. For some reason, the person with id=3 was deleted. However, he/she still owns the properties with id in [4,5,6].
Person (1k documents)
=====================
{id: 1, name: John, age: 25}
{id: 2, name: Peter, age: 28, pet: cat}
{id: 4, name: Alice}
...
Property (120k documents)
=========================
{id: 1, person_id: 1, name: house}
{id: 2, person_id: 1, name: car, color: blue}
{id: 3, person_id: 2, name: phone}
{id: 4, person_id: 3, name: house}
{id: 5, person_id: 3, name: watch, size: big}
{id: 6, person_id: 3, name: table, material: wood}
...
The question is "How to delete the properties documents that no longer have an existing person they belong to?", i.e. in this case "How to delete properties with id in [4,5,6]?"
person_id is a secondary index.
My thought was to somehow extract the ids of the properties that don't match any person and then delete them. However, I have no idea how to achieve that.
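The idea described here (find the properties whose person_id matches no person, then delete them) is an anti-join. The sketch below shows only that logic on in-memory copies of the sample rows; it does not call RethinkDB, and `orphaned_property_ids` is a hypothetical helper name. In the database the same membership test would be applied per property against the Person table.

```python
from typing import Dict, List

persons: List[Dict] = [
    {"id": 1, "name": "John", "age": 25},
    {"id": 2, "name": "Peter", "age": 28, "pet": "cat"},
    {"id": 4, "name": "Alice"},
]

properties: List[Dict] = [
    {"id": 1, "person_id": 1, "name": "house"},
    {"id": 2, "person_id": 1, "name": "car", "color": "blue"},
    {"id": 3, "person_id": 2, "name": "phone"},
    {"id": 4, "person_id": 3, "name": "house"},
    {"id": 5, "person_id": 3, "name": "watch", "size": "big"},
    {"id": 6, "person_id": 3, "name": "table", "material": "wood"},
]


def orphaned_property_ids(persons: List[Dict], properties: List[Dict]) -> List[int]:
    """Return ids of properties whose person_id has no matching person."""
    person_ids = {p["id"] for p in persons}  # set gives O(1) membership checks
    return [prop["id"] for prop in properties if prop["person_id"] not in person_ids]


print(orphaned_property_ids(persons, properties))  # [4, 5, 6]
```

With roughly 1k people and 120k properties, building the set of person ids once keeps the check to a single pass over the properties.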

CouchDB - hierarchical comments with ranking. Hacker News style

I'm trying to implement a basic way of displaying comments in the way that Hacker News provides, using CouchDB. Not only ordered hierarchically, but also, each level of the tree should be ordered by a "points" variable.
The idea is that I want a view to return the comments in the order I expect, rather than making many Ajax calls, for example, to retrieve them and rearrange them so they look correctly ordered.
This is what I got so far:
Each document is a "comment".
Each comment has a property path which is an ordered list containing all its parents.
So for example, imagine I have 4 comments (with _id 1, 2, 3 and 4). Comment 2 is a child of 1, comment 3 is a child of 2, and comment 4 is also a child of 1. This is what the data would look like:
{ _id: 1, path: ["1"] },
{ _id: 2, path: ["1", "2"] },
{ _id: 3, path: ["1", "2", "3"] },
{ _id: 4, path: ["1", "4"] }
This works quite well for the hierarchy. A simple view will already return things ordered the way I want it.
The issue comes when I want to order each "level" of the tree independently. So for example documents 2 and 4 belong to the same branch, but are ordered, on that level, by their ID. Instead I want them ordered based on a "points" variable that I want to add to the path - but can't seem to understand where I could be adding this variable for it to work the way I want it.
Is there a way to do this? Consider that the "points" variable will change in time.
Because each level needs to be sorted recursively by score, Couch needs to know the score of each parent to make this work the way you want it to.
Take your example with the following scores (1: 10, 2: 10, 3: 10, 4: 20).
In this case you'd want the ordering to come out like the following:
.1
.1.4
.1.2
.1.2.3
Your document needs a scores array like this:
{ _id: 1, path: [1], scores: [10] },
{ _id: 2, path: [1, 2], scores: [10,10] },
{ _id: 3, path: [1, 2, 3], scores: [10,10,10] },
{ _id: 4, path: [1, 4], scores: [10,20] }
Then you'll use the following sort key in your view.
emit([doc.scores, doc.path], doc)
The path gets used as a tiebreaker because there will be cases where sibling comments have the exact same score. Without the tiebreaker, their descendants could lose their grouping (by chain of ancestry).
Note: This approach will return scores from low to high, whereas you probably want scores high to low and the path/tiebreaker low to high. A workaround is to populate the scores array with the inverse (1/score) of each score, assuming scores are positive, like this:
{ _id: 1, path: [1], scores: [0.1] },
{ _id: 2, path: [1, 2], scores: [0.1,0.1] },
{ _id: 3, path: [1, 2, 3], scores: [0.1,0.1,0.1] },
{ _id: 4, path: [1, 4], scores: [0.1,0.05] }
and then request the view in its default ascending order. Passing descending=true would reverse the whole key order and list children before their parents.
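To check what order a given key produces, the sketch below imitates the view key `[scores, path]` in plain Python. It is only a stand-in for CouchDB: Python compares lists element by element and sorts a shorter prefix first, which matches how CouchDB collates arrays of numbers. It compares raw scores against reciprocal scores (1/score), both read in ascending order; `view_key` and `view_order` are hypothetical names.

```python
from typing import Dict, List, Tuple

# Data from the example: ancestor chain per comment and current points.
PATHS: Dict[int, List[int]] = {1: [1], 2: [1, 2], 3: [1, 2, 3], 4: [1, 4]}
POINTS: Dict[int, int] = {1: 10, 2: 10, 3: 10, 4: 20}


def view_key(comment_id: int, invert: bool) -> Tuple[List[float], List[int]]:
    """Mimic emit([doc.scores, doc.path], doc): ancestors' scores, then path."""
    path = PATHS[comment_id]
    scores = [1 / POINTS[p] if invert else POINTS[p] for p in path]
    return (scores, path)


def view_order(invert: bool) -> List[int]:
    """Return comment ids in ascending key order, as a view would list them."""
    return sorted(PATHS, key=lambda cid: view_key(cid, invert))


print(view_order(invert=False))  # raw scores:        [1, 2, 3, 4]
print(view_order(invert=True))   # reciprocal scores: [1, 4, 2, 3]
```

The second line is the ordering asked for (.1, .1.4, .1.2, .1.2.3): comment 4, with 20 points, comes before comment 2 and its child. Because the scores of every ancestor are stored in each descendant, a change in a comment's points means rewriting the scores array of all its descendants.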
For anyone interested, there is a mailing-list thread on this question with several variants of solutions:
http://mail-archives.apache.org/mod_mbox/couchdb-dev/201205.mbox/thread -> theme "Hierarchical comments Hacker News style" 16/05/2012
