Ignoring duplicates within elastic search - elasticsearch

I have many records where the msg is 'a'. Some of these records have the same type.
I'm trying to write a query that counts the number of records with msg 'a', but doesn't count duplicates.
Example:
1: msg = 'a', type = 'b'
2: msg = 'a', type = 'b'
3: msg = 'a', type = 'c'
This should return a count of two because the first and second records have the same type and are only counted once.
Here is my query so far.
body: {
  query: {
    bool: {
      must: [
        {
          range: {
            "#timestamp" => { from: 'now-1d', to: 'now' }
          }
        },
        { match: { msg: 'a' }}
      ]
    }
  }
}
Any help is appreciated!

Try using aggregations; they'll count it for you :)
Read here:
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-aggregations-bucket-terms-aggregation.html
And try something like this:
body: {
  query: {
    bool: {
      must: [
        {
          range: {
            "#timestamp" => { from: 'now-1d', to: 'now' }
          }
        },
        { match: { msg: 'a' }}
      ]
    }
  },
  aggs: {
    "type": {
      "terms": {
        "field": "type"
      }
    }
  }
}
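To read the de-duplicated count off that response, you can count the buckets the terms aggregation returns. A minimal sketch in JavaScript, assuming a Node.js Elasticsearch client instance named client and an index called 'my-index' (both assumptions, not from the question), and that type is aggregatable (keyword / not_analyzed):

// Sketch: run the query above and count distinct `type` values among the matches.
// Run inside an async function; older 7.x clients wrap the response in { body }.
const { body } = await client.search({
  index: 'my-index',
  body: {
    size: 0, // we only need the aggregation, not the hits themselves
    query: {
      bool: {
        must: [
          { range: { '#timestamp': { from: 'now-1d', to: 'now' } } },
          { match: { msg: 'a' } }
        ]
      }
    },
    aggs: {
      type: { terms: { field: 'type' } }
    }
  }
});

// Each bucket is one distinct type; the number of buckets is the de-duplicated count.
console.log(body.aggregations.type.buckets.length); // 2 for the example records

Note that terms only returns the top size buckets (10 by default); if there can be many distinct types, a cardinality aggregation on type gives an (approximate) distinct count directly.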

Related

How to reformat result in graphql?

I have a query like this:
query GetProductsBySlug {
  categories(where: { slug: "pianos" }) {
    products(first: 2) {
      id,
      slug,
      name,
      price,
      images(first: 1) {
        url
      }
    }
  }
}
Results from this query:
{
  "data": {
    "categories": [
      {
        "products": [
          { ... }
        ]
      }
    ]
  }
}
I want to remove "data": ... and "categories": ... and get only this result:
{
  "products": [
    { ... }
  ]
}
How can I do it? I want the result to include only { "products": ... }, without the data and categories wrappers.
You can use destructuring assignment (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment) to pull out just the categories:
const { data: { categories } } = queryResult;
and then map over the categories array:
categories.map(category => category.products)
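Putting the two steps together, a minimal sketch, assuming the parsed response lives in a variable called queryResult (the name is an assumption) and that you want the products of every returned category merged into one array:

// Sketch: reshape { data: { categories: [ { products: [...] } ] } } into { products: [...] }.
// `queryResult` is an assumed variable name holding the parsed GraphQL response.
const { data: { categories } } = queryResult;

const result = {
  // flatMap merges the products arrays of all returned categories into one flat array
  products: categories.flatMap(category => category.products)
};

console.log(result); // { products: [ { id, slug, name, price, images }, ... ] }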

Pass each matched record value to filter in Elasticsearch

For my geo_distance query I'm using a constant value for distance. I need to make it dynamic, so I want to pass each matched record's radius value as the distance.
Here's the code:
let searchRadius = '12KM'
query: {
  bool: {
    must: {
      match: {
        companyName: {
          query: req.text
        }
      }
    },
    filter: {
      geo_distance: {
        distance: searchRadius, // here I want to pass doc['radius']
        location: {
          lat: parseFloat(req.lat),
          lon: parseFloat(req.lon)
        }
      }
    }
  }
}
For each record, I have a different radius value. I want to pass doc['radius'] instead of the constant searchRadius value.
I could run two queries and then iterate over the values, but that's not optimal. Can anyone suggest how I can pass each record's value to the geo_distance filter?
I resolved it based on this answer. Here's the code:
query: {
  bool: {
    must: [
      {
        match: {
          companyName: {
            query: req.text
          }
        }
      },
      {
        script: {
          script: {
            params: {
              lat: parseFloat(req.lat),
              lon: parseFloat(req.lon)
            },
            source: "doc['location'].arcDistance(params.lat, params.lon) / 1000 < doc['searchRadius'].value",
            lang: "painless"
          }
        }
      }
    ]
  }
},
It uses a script query; for more details see:
https://www.elastic.co/guide/en/elasticsearch/reference/6.1/query-dsl-script-query.html
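Since the script only decides whether a document matches, it can also go in the filter context so it doesn't affect scoring. A minimal sketch, assuming a Node.js Elasticsearch client instance named client and an index called 'companies' (both assumptions), and that searchRadius is stored in kilometres (arcDistance returns metres, hence the division by 1000):

// Sketch: the same script-based radius check, but as a non-scoring filter clause.
// `client` and the index name 'companies' are assumptions, not from the question.
const { body } = await client.search({ // inside an async function; 8.x clients return the body directly
  index: 'companies',
  body: {
    query: {
      bool: {
        must: [
          { match: { companyName: { query: req.text } } }
        ],
        filter: [
          {
            script: {
              script: {
                lang: 'painless',
                // arcDistance() returns metres; divide by 1000 to compare with a radius stored in km
                source: "doc['location'].arcDistance(params.lat, params.lon) / 1000 < doc['searchRadius'].value",
                params: {
                  lat: parseFloat(req.lat),
                  lon: parseFloat(req.lon)
                }
              }
            }
          }
        ]
      }
    }
  }
});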

Hasura: How to filter by user specified ranges, where data contains ranges

In the database, there are shoes of sizes 41.0, or 41.5-42, 42-43, etc. I have them saved as:
41.0 -> shoes_min=41.0; shoes_max=41.0
41.5-42 -> shoes_min=41.5; shoes_max=42.0
42-43 -> shoes_min=42.0; shoes_max=43.0
Below is my proposed filter. On the frontend there would just be two input values, (41, 42), indicating the range the user is interested in. It should match all three examples above, but in my case it is not matching the entry with shoes_min=42; shoes_max=43:
query MyQuery {
  models(
    where: {
      shoes_min: {_gte: "41"}
      _and: {
        shoes_min: {_lte: "42"}
      },
      _or: {
        shoes_max: {_gte: "41"}
        _and: {
          shoes_max: {_lte: "42"}
        },
      },
    },
    order_by: {shoes_max: asc}
  ) {
    shoes_min
    shoes_max
    name
    url
  }
}
Here is the correct request body:
query MyQuery {
  models(
    where: {
      _or: [
        {_and: [{shoes_min: {_gte: 41}}, {shoes_min: {_lte: 42}}]},
        {_and: [{shoes_max: {_gte: 41}}, {shoes_max: {_lte: 42}}]}
      ]
    },
    order_by: [{shoes_min: asc}, {shoes_max: asc}]
  ) {
    shoes_min
    shoes_max
    name
  }
}
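If the bounds come from user input on the frontend, it may be cleaner to pass them as GraphQL variables instead of splicing them into the query string. A rough sketch, assuming a Hasura endpoint at https://your-hasura-instance/v1/graphql (a placeholder URL) and that the columns are Postgres numeric (adjust the variable types to match your schema):

// Sketch: send the corrected query with the user's (min, max) inputs as GraphQL variables.
// The endpoint URL and the `numeric` variable type are assumptions about your setup.
const query = `
  query GetModels($min: numeric!, $max: numeric!) {
    models(
      where: {
        _or: [
          {_and: [{shoes_min: {_gte: $min}}, {shoes_min: {_lte: $max}}]},
          {_and: [{shoes_max: {_gte: $min}}, {shoes_max: {_lte: $max}}]}
        ]
      },
      order_by: [{shoes_min: asc}, {shoes_max: asc}]
    ) {
      shoes_min
      shoes_max
      name
    }
  }
`;

const response = await fetch('https://your-hasura-instance/v1/graphql', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' }, // add your Hasura auth header if required
  body: JSON.stringify({ query, variables: { min: 41, max: 42 } })
});

const { data } = await response.json();
console.log(data.models);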

Elasticsearch custom sorting / adding filter clauses scores

I have this simple documents set:
{
  id: 1,
  book_ids: [2, 3],
  collection_ids: ['a', 'b']
},
{
  id: 2,
  book_ids: [1, 2]
}
If I run this filter query, it will match both documents:
{
  bool: {
    filter: [
      {
        bool: {
          should: [
            {
              bool: {
                must_not: {
                  exists: {
                    field: 'book_ids'
                  }
                }
              }
            },
            {
              bool: {
                filter: {
                  term: {
                    book_ids: 2
                  }
                }
              }
            }
          ]
        }
      },
      {
        bool: {
          should: [
            {
              bool: {
                must_not: {
                  exists: {
                    field: 'collection_ids'
                  }
                }
              }
            },
            {
              bool: {
                filter: {
                  term: {
                    collection_ids: 'a'
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
The thing is I want to sort these documents, and I would like the first one (id: 1) to be returned first because it matched both the book_ids value and the collection_ids values provided.
A simple sort clause like this one is not working:
[
'book_ids',
'collection_ids'
]
because it will return document 2 first, due to the first value of its book_ids array.
Edit: this is a simplified example of the problem I am facing, which has N such clauses in the should clause. Moreover, there is an order between the clauses, as I tried to reflect with the sort snippet: results matching the first clause (book_ids) should appear before results matching the second clause (collection_ids). I am really looking for some kind of SQL sort operation where I would only take into account the matching value of the field array. A viable option might be to assign decreasing constant_score values to each term clause, according to the expected sort order, and have ES sum these sub-scores to compute the final score. But I cannot figure out how to do it, or whether it is even possible.
Bonus question:
is there any way for ElasticSearch to return some kind of new document with only the matching values? Here is what I would expect as a response to the above filter query:
{
  id: 1,
  book_ids: [2],
  collection_ids: ['a']
},
{
  id: 2,
  book_ids: [2]
}
I think you're right about the constant score idea. I think you can do it like this:
{
  query: {
    bool: {
      must: [
        {
          bool: {
            should: [
              {
                bool: {
                  must_not: {
                    exists: {
                      field: 'book_ids'
                    }
                  }
                }
              },
              {
                constant_score: {
                  filter: {
                    term: {
                      book_ids: 2
                    }
                  },
                  boost: 100
                }
              }
            ]
          }
        },
        {
          bool: {
            should: [
              {
                bool: {
                  must_not: {
                    exists: {
                      field: 'collection_ids'
                    }
                  }
                }
              },
              {
                constant_score: {
                  filter: {
                    term: {
                      collection_ids: 'a'
                    }
                  },
                  boost: 50
                }
              }
            ]
          }
        }
      ]
    }
  }
}
I think the only thing you were missing with constant_score was that the top-level query needs to be must, not filter. (There's no scoring for filters; all the scores are 0.)
An alternative would be to put the filter inside a function_score query (but leave it as a filter), and then compute the score as you want (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html)
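For example, something along these lines (a rough sketch, not tested; the weights 100 and 50 mirror the boosts above):

{
  query: {
    function_score: {
      query: {
        bool: {
          filter: [
            // the two original bool/should filter clauses go here, unchanged
          ]
        }
      },
      functions: [
        { filter: { term: { book_ids: 2 } }, weight: 100 },
        { filter: { term: { collection_ids: 'a' } }, weight: 50 }
      ],
      score_mode: 'sum',    // add up the weights of all matching functions
      boost_mode: 'replace' // use only the function score; ignore the wrapped query's score
    }
  }
}

With score_mode: 'sum' every matching clause contributes its weight, and boost_mode: 'replace' ignores the score of the wrapped query, so the result order is driven only by which clauses matched.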
As to the bonus question, it's possible if you use a script field to filter and add a new field like you want (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html), but it's not possible in a straightforward way. It's probably easier and makes more sense to do that filtering after you receive the result, unless you have very long lists in your values.

Partial matching not working in this query

Why does the following only match exact, and not partial?
body: {
  query: {
    filtered: {
      filter: {
        bool: {
          should: [
            { query: { match: { "name": "*"+searchterm+"*" }}}
          ]
        }
      }
    }
  }
}
"*"+searchterm+"*" should match any words that contains searchterm. ie,
item1
item2
0item
But it only matches words that are exactly searchterm, i.e., only item. Why is this?
If the name field uses the default analyzer, then the asterisk wildcard characters are dropped during the analysis phase. Hence you always get results where name is exactly searchterm. You need to use a wildcard query to match any document where the value of the name field contains searchterm.
query: {
  filtered: {
    filter: {
      bool: {
        should: [
          {
            query: {
              wildcard: {
                "name": "*" + searchterm + "*"
              }
            }
          }
        ]
      }
    }
  }
}
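Note that the filtered query only exists on older Elasticsearch versions; it was removed in 5.0. On newer versions the equivalent is a bool query with a filter clause, roughly like this (a sketch following the same structure as above):

body: {
  query: {
    bool: {
      filter: {
        bool: {
          should: [
            // wildcard runs against the analyzed terms of `name`; if the mapping has a
            // `name.keyword` sub-field, use that to match against the whole original value
            { wildcard: { "name": "*" + searchterm + "*" } }
          ]
        }
      }
    }
  }
}

Also be aware that leading wildcards are expensive, since every term in the field has to be scanned.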
