I have the following document:
{
  id: 222,
  email: "user#user.com",
  experiences: [
    {
      id: 3,
      position: "Programmer",
      description: "Programming things",
      init_date: "1990-01-01",
      end_date: "1999-05-11"
    },
    {
      id: 4,
      position: "Full Stack Developer",
      description: "Programming things",
      init_date: "1999-01-01",
      end_date: "2008-05-11"
    },
    {
      id: 7,
      position: "Gardener",
      description: "Taking care of flowers",
      init_date: "2009-01-01",
      end_date: "2015-05-11"
    }
  ]
}
I would like to apply the following filter: keyword: programming, years of experience: > 3.
The years of experience should be the sum over the experiences that match the keyword.
Is it possible to do this in a single query?
Add an extra field for the experience at indexing time instead of calculating it in the query. It will be faster, and also easier to query.
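As a rough illustration of that advice, the sketch below precomputes a duration for each experience before the document is indexed. The field name `years` and the helpers `yearsBetween`, `withExperienceYears`, and `matchingYears` are assumptions made for this example, not part of the original mapping. Because the requested sum depends on the keyword, the sketch stores the years per experience rather than a single total, and the keyword check is a plain substring match that only stands in for whatever the search engine would do.

```javascript
// Approximate length of a year in milliseconds (365.25 days).
const MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000;

// Duration between two ISO dates, in (fractional) years.
function yearsBetween(initDate, endDate) {
  return (new Date(endDate) - new Date(initDate)) / MS_PER_YEAR;
}

// Hypothetical pre-indexing step: add a `years` field to every experience.
function withExperienceYears(doc) {
  const experiences = doc.experiences.map((exp) => ({
    ...exp,
    years: Number(yearsBetween(exp.init_date, exp.end_date).toFixed(2)),
  }));
  return { ...doc, experiences };
}

// Client-side stand-in for the filter: sum `years` over the experiences
// whose position or description contains the keyword (case-insensitive).
function matchingYears(doc, keyword) {
  const needle = keyword.toLowerCase();
  return doc.experiences
    .filter((exp) =>
      `${exp.position} ${exp.description}`.toLowerCase().includes(needle))
    .reduce((sum, exp) => sum + exp.years, 0);
}
```

With the precomputed field in place, the "more than 3 years" condition becomes a comparison against a stored number instead of date arithmetic at query time.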
We have the following document structure:
[{
  score: 10,
  list: [ // nested type
    {
      id: 3,
      value: 10
    },
    {
      id: 4,
      value: 20
    },
    {
      id: 5,
      value: 15
    }
  ]
},
{
  score: 1,
  list: [
    {
      id: 3,
      value: 4
    },
    {
      id: 4,
      value: 3
    },
    {
      id: 5,
      value: 2
    }
  ]
}...
We're trying to run a matrix_stats aggregation on the field "score" and the nested field "value", considering only specific ids such as 4.
We couldn't find a way to run matrix_stats on both a nested field (list.value) and a non-nested field (score).
For example, in the first doc the score is 10 and the value of the nested list entry matching id = 4 is 20, and likewise for the other docs.
Is there any way to get the value through scripts, runtime mappings, etc.?
Or is there any way to access a parent field inside a nested aggregation for matrix_stats?
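To make the intended pairing concrete, here is a minimal client-side sketch. It does not answer whether matrix_stats can do this inside Elasticsearch; it only assumes the documents have already been fetched into the application. The helpers `scoreValuePairs` and `sampleCovariance` are hypothetical names for this example.

```javascript
// Pair each document's top-level score with the nested value whose id matches.
// Documents without a matching nested entry are skipped.
function scoreValuePairs(docs, id) {
  const pairs = [];
  for (const doc of docs) {
    const entry = doc.list.find((item) => item.id === id);
    if (entry !== undefined) {
      pairs.push([doc.score, entry.value]);
    }
  }
  return pairs;
}

// Sample covariance of the (score, value) pairs; needs at least two pairs.
function sampleCovariance(pairs) {
  const n = pairs.length;
  if (n < 2) return NaN;
  const meanX = pairs.reduce((s, [x]) => s + x, 0) / n;
  const meanY = pairs.reduce((s, [, y]) => s + y, 0) / n;
  const sum = pairs.reduce((s, [x, y]) => s + (x - meanX) * (y - meanY), 0);
  return sum / (n - 1);
}
```

For the two sample documents and id 4, the pairs are (10, 20) and (1, 3), which is the input a statistic over "score" and "value" would need.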
I have a Node.js app that serves SQLite database data using Sequelize via GraphQL.
The data types are as follows:
Student
  id: ID!
  name: String!
  class: Class
Class
  id: ID!
  title: String!
  floor: Int!
I want to get an array of classes, with each class containing an array of its students, like:
[
  {
    "id": 1,
    "title": "first",
    "students": [
      { "id": 1, "name": "John Doe" },
      { "id": 2, "name": "Mary Smith" }
    ]
  }
]
Is there any way to do this even though the link points the other way (from student to class)?
If your database isn't too large yet, you could always run a one-time script to link it the other way as well, with:
Class {
  id: ID!
  title: String!
  floor: Int!
  students: [Student]!
}
Otherwise, you would have to do a manual query for all students that are taking a certain class and join that data into your query for classes.
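The manual join could look roughly like the sketch below. It works on plain arrays, so it is independent of Sequelize and GraphQL; the foreign-key name `classId` and the helper `attachStudents` are assumptions for this example, since the original schema does not show how the student row references its class.

```javascript
// Group students by their class and attach each group to the matching class.
// `classId` is a hypothetical foreign-key column on the student rows.
function attachStudents(classes, students) {
  const byClass = new Map();
  for (const student of students) {
    if (student.classId == null) continue; // student without a class
    if (!byClass.has(student.classId)) {
      byClass.set(student.classId, []);
    }
    byClass.get(student.classId).push({ id: student.id, name: student.name });
  }
  return classes.map((cls) => ({
    id: cls.id,
    title: cls.title,
    students: byClass.get(cls.id) || [],
  }));
}
```

A resolver for the classes query could call something like this after loading both tables, which keeps it to two queries instead of one query per class.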
Indexed documents look like this:
{
  id: 1,
  title: 'Blah',
  ...
  platform: {id: 84, url: 'http://facebook.com', name: 'Facebook'}
  ...
}
What I want is to count documents and output stats by platform.
For counting, I can use a terms aggregation with platform.id as the field to count:
aggs: {
  platforms: {
    terms: {field: 'platform.id'}
  }
}
This way I receive the stats as multiple buckets that look like {key: 8, doc_count: 162511}, as expected.
Now, can I somehow also add platform.name and platform.url to those buckets (for pretty output of the stats)? The best I've come up with looks like:
aggs: {
  platforms: {
    terms: {field: 'platform.id'},
    aggs: {
      name: {terms: {field: 'platform.name'}},
      url: {terms: {field: 'platform.url'}}
    }
  }
}
This does work, and returns a pretty complicated structure in each bucket:
{
  key: 7,
  doc_count: 528568,
  url: {
    doc_count_error_upper_bound: 0,
    sum_other_doc_count: 0,
    buckets: [{key: "http://facebook.com", doc_count: 528568}]
  },
  name: {
    doc_count_error_upper_bound: 0,
    sum_other_doc_count: 0,
    buckets: [{key: "Facebook", doc_count: 528568}]
  }
},
Of course, the name and url of the platform can be extracted from this structure (like bucket.url.buckets.first.key), but is there a cleaner and simpler way to do this?
It seems the best way to express the intent is a top hits aggregation: "from each aggregated group select only one document", and then extract the platform from it:
aggs: {
  platforms: {
    terms: {field: 'platform.id'},
    aggs: {
      platform: {top_hits: {size: 1, _source: {include: ['platform']}}}
    }
  }
}
This way, each bucket will look like:
{
  "key": 7,
  "doc_count": 529939,
  "platform": {
    "hits": {
      "hits": [{
        "_source": {
          "platform": {"id": 7, "name": "Facebook", "url": "http://facebook.com"}
        }
      }]
    }
  }
}
Which is kind of too deep (as usual with ES), but clean: bucket.platform.hits.hits.first._source.platform
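On the client side, that deep path can be flattened once per bucket. The following is a minimal sketch that assumes the bucket shape shown above; the helper name `platformStats` and the output field `count` are made up for this example.

```javascript
// Turn top_hits-based buckets into flat rows: platform fields plus the count.
function platformStats(buckets) {
  return buckets.map((bucket) => {
    // One hit per bucket is expected because the aggregation uses size: 1.
    const platform = bucket.platform.hits.hits[0]._source.platform;
    return { ...platform, count: bucket.doc_count };
  });
}
```

Each resulting row then carries id, name, url, and count together, which is usually what the output code wants.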
If you don't necessarily need to get the value of platform.id, you could get away with a single aggregation instead using a script that concatenates the two fields name and url:
aggs: {
  platforms: {
    terms: {script: 'doc["platform.name"].value + "," + doc["platform.url"].value'}
  }
}
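With the script-based approach, each bucket key is a single "name,url" string that has to be split again on the client. A small sketch, assuming the platform name never contains a comma (the URL may); `splitPlatformKey` is a hypothetical helper name.

```javascript
// Split a "name,url" bucket key at the first comma.
// Assumes the name itself contains no comma; the URL part may contain some.
function splitPlatformKey(key) {
  const idx = key.indexOf(",");
  if (idx === -1) {
    return { name: key, url: "" };
  }
  return { name: key.slice(0, idx), url: key.slice(idx + 1) };
}
```

If names can contain commas, a separator that cannot occur in either field would be a safer choice in the script.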
I'm having some difficulty with Elasticsearch.
Here is what I want to do:
Let's say a document in my index looks like this:
{
  transacId: "qwerty",
  amount: 150,
  userId: "asdf",
  client: "mobile",
  goal: "purchase"
}
I want to build different types of statistics on this data, and Elasticsearch does that really fast. The problem is that in my system a user can add a new field to a transaction on demand. Let's say we have another document in the same index:
{
  transacId: "qrerty",
  amount: 200,
  userId: "asdf",
  client: "mobile",
  goal: "purchase",
  token_1: "game"
}
So now I want to group by token_1:
{
  query: {
    match: {userId: "asdf"}
  },
  aggs: {
    token_1: {
      terms: {field: "token_1"},
      aggs: {sumAmt: {sum: {field: "amount"}}}
    }
  }
}
The problem here is that it will aggregate only documents that have the field token_1. I know there is a missing aggregation, and I can do something like this:
{
  query: {
    match: {userId: "asdf"}
  },
  aggs: {
    token_1: {
      missing: {field: "token_1"},
      aggs: {sumAmt: {sum: {field: "amount"}}}
    }
  }
}
But in this case it will aggregate only documents without the field token_1. What I want is to aggregate both types of documents in one query. I tried this, but it didn't work for me either:
{
  query: {
    match: {userId: "asdf"}
  },
  aggs: {
    token_1: {
      missing: {field: "token_1"},
      aggs: {sumAmt: {sum: {field: "amount"}}}
    },
    aggs: {
      token_1: {
        missing: {field: "token_1"},
        aggs: {sumAmt: {sum: {field: "amount"}}}
      }
    }
  }
}
I think there may be something like an OR operator in aggregations, but I couldn't find anything. Please help.
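One possible shape for this, sketched under assumptions: put the terms aggregation and the missing aggregation side by side under a single aggs object, each with its own name, and merge the two results on the client. The aggregation names `with_token` and `without_token` and both helper functions are hypothetical, and the merge assumes the usual response layout (a `buckets` array for terms, a single `doc_count` for missing).

```javascript
// Build two sibling aggregations: one for documents that have the field,
// one for documents that lack it. Both carry the same sum sub-aggregation.
function buildTokenAggs(field, amountField) {
  const sumAmt = { sumAmt: { sum: { field: amountField } } };
  return {
    with_token: { terms: { field }, aggs: sumAmt },
    without_token: { missing: { field }, aggs: sumAmt },
  };
}

// Merge the two aggregation results into one flat list of rows.
// Documents without the field are reported under `missingKey`.
function mergeTokenBuckets(aggregations, missingKey = null) {
  const rows = aggregations.with_token.buckets.map((bucket) => ({
    key: bucket.key,
    doc_count: bucket.doc_count,
    sumAmt: bucket.sumAmt.value,
  }));
  const missing = aggregations.without_token;
  if (missing.doc_count > 0) {
    rows.push({
      key: missingKey,
      doc_count: missing.doc_count,
      sumAmt: missing.sumAmt.value,
    });
  }
  return rows;
}
```

The request body would then be `{query: {...}, aggs: buildTokenAggs("token_1", "amount")}`, so both groups come back from one request.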
I'm working on a Node.js + MongoDB project using Mongoose. Now I have come across a question I don't know the answer to.
I am using the aggregation framework to get grouped results. The grouping is done on a date field with the time part excluded, like "2013 02 06". The code looks like this:
MyModel.aggregate([
  {$match: {$and: [{created_date: {$gte: start_date}}, {created_date: {$lte: end_date}}]}},
  {$group: {
    _id: {
      year: {$year: "$created_at"},
      month: {$month: "$created_at"},
      day: {$dayOfMonth: "$created_at"}
    },
    count: {$sum: 1}
  }},
  {$project: {
    date: {
      year: "$_id.year",
      month: "$_id.month",
      day: "$_id.day"
    },
    count: 1,
    _id: 0
  }}
], callback);
The grouped results are perfect, except that they are not sorted. Here is an example of output:
[
  {
    count: 1,
    date: {
      year: 2013,
      month: 2,
      day: 7
    }
  },
  {
    count: 1906,
    date: {
      year: 2013,
      month: 2,
      day: 4
    }
  },
  {
    count: 1580,
    date: {
      year: 2013,
      month: 2,
      day: 5
    }
  },
  {
    count: 640,
    date: {
      year: 2013,
      month: 2,
      day: 6
    }
  }
]
I know that sorting is done by adding {$sort: val}. But I'm not sure what val should be so that the results are sorted by date, since my grouping key is an object of 3 values that make up the date. Does anyone know how this could be accomplished?
EDIT
I tried this and it worked :)
{$sort: {"date.year":1, "date.month":1, "date.day":1}}
It appears that this question has a very simple answer :) You just need to sort by multiple nested columns like this:
{$sort: {"date.year":1, "date.month":1, "date.day":1}}
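To see what that sort specification does to the sample output, the same ordering can be reproduced in plain JavaScript. This is only an illustration of the ascending year, month, day ordering, not MongoDB itself; `sortByNestedDate` is a hypothetical helper name.

```javascript
// Sort rows ascending by date.year, then date.month, then date.day,
// mirroring {"date.year": 1, "date.month": 1, "date.day": 1}.
function sortByNestedDate(rows) {
  // Copy first so the input array is left untouched.
  return [...rows].sort((a, b) =>
    a.date.year - b.date.year ||
    a.date.month - b.date.month ||
    a.date.day - b.date.day
  );
}
```

Later keys only matter when all earlier keys are equal, which is why the order of the keys in the sort specification matters.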
I got stuck on the same problem, thanks for your answer.
But I found out that you can get the same result with less code:
MyModel.aggregate([
  {$match: {$and: [{created_date: {$gte: start_date}}, {created_date: {$lte: end_date}}]}},
  {$group: {
    _id: {
      year: {$year: "$created_at"},
      month: {$month: "$created_at"},
      day: {$dayOfMonth: "$created_at"}
    },
    count: {$sum: 1}
  }},
  {$project: {
    date: "$_id", // so this is the shorter way
    count: 1,
    _id: 0
  }},
  {$sort: {"date": 1}} // and this will sort based on your date
], callback);
This works if you are sorting only by date. If you had other columns to sort on, you would need to expand _id.