Is there any way to apply group and pagination using createQuery? - rethinkdb

A query like this:
http://localhost:3030/dflowzdata?$skip=0&$group=uuid&$limit=2
and the dflowzdata service contains data like:
[
  {
    "uuid": 123456,
    "id": 1
  },
  {
    "uuid": 123456,
    "id": 2
  },
  {
    "uuid": 7890,
    "id": 3
  },
  {
    "uuid": 123456,
    "id": 4
  },
  {
    "uuid": 4567,
    "id": 5
  }
]
My before find hook looks like:
if (hook.params.query.$group !== undefined) {
  const value = hook.params.query.$group
  delete hook.params.query.$group
  const query = hook.service.createQuery(hook.params.query)
  hook.params.rethinkdb = query.group(value)
}
It gives the correct result, but without pagination: I need only two records but it gives me all of them.
The result is:
{
  "total": [
    { "group": "123456", "reduction": 3 },
    { "group": "7890", "reduction": 1 },
    { "group": "4567", "reduction": 3 }
  ],
  "data": [
    {
      "group": "123456",
      "reduction": [
        { "uuid": "123456", "id": 1 },
        { "uuid": "123456", "id": 2 },
        { "uuid": "123456", "id": 4 }
      ]
    },
    {
      "group": "7890",
      "reduction": [ { "uuid": "7890", "id": 3 } ]
    },
    {
      "group": "4567",
      "reduction": [ { "uuid": "4567", "id": 5 } ]
    }
  ],
  "limit": 2,
  "skip": 0
}
Can anyone help me get the correct records using $limit?

According to the documentation on data types, ReQL commands called on GROUPED_DATA operate on each group individually. For more details, read the group documentation. So limit won't apply to the result of group as a whole, only within each group.
The page for group says: to operate on all the groups rather than operating on each group [...], you can use ungroup to turn a grouped stream or grouped data into an array of objects representing the groups.
Hence, use ungroup to apply functions to group's result:
r.db('db').table('table')
.group('uuid')
.ungroup()
.limit(2)
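Applied to the before find hook from the question, a minimal sketch might look like the following. It assumes feathers-rethinkdb's createQuery returns a ReQL sequence (as the question's own hook implies) and strips $skip/$limit from the query first so the adapter does not also apply them per row:

// before find hook: group, ungroup, then paginate across the groups
module.exports = function (hook) {
  const params = hook.params;
  if (params.query.$group !== undefined) {
    const field = params.query.$group;
    const skip = parseInt(params.query.$skip, 10) || 0;
    const limit = parseInt(params.query.$limit, 10) || 10;
    delete params.query.$group;
    delete params.query.$skip;   // paginate the groups instead of the rows
    delete params.query.$limit;
    const query = hook.service.createQuery(params.query);
    // ungroup() turns GROUPED_DATA into an array of
    // { group, reduction } objects, so slice() paginates across groups
    params.rethinkdb = query
      .group(field)
      .ungroup()
      .slice(skip, skip + limit);
  }
};

Note that the total in the response will then count groups rather than rows, which may or may not be what the client expects.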

Related

Filter with complex key does not work (using startkey and endkey)

I created a view with a map function:
function(doc) {
  if (doc.market == "m_warehouse") {
    emit([doc.logTime, doc.dbName, doc.tableName], 1);
  }
}
I want to filter the data with multiple keys:
_design/select_data/_view/new-view/?limit=10&skip=0&include_docs=false&reduce=false&descending=true&startkey=["2018-06-19T09:16:47,527","stage"]&endkey=["2018-06-19T09:16:43,717","stage"]
but I still got:
{
  "total_rows": 248133,
  "offset": 248129,
  "rows": [
    {
      "id": "01CGBPYVXVD88FPDVR3NP50VJW",
      "key": [
        "2018-06-19T09:16:47,527",
        "ods",
        "o_ad_dsp_pvlog_realtime"
      ],
      "value": 1
    },
    {
      "id": "01CGBQ6JMEBR8KBMB8T7Q7CZY3",
      "key": [
        "2018-06-19T09:16:44,824",
        "stage",
        "s_ad_ztc_realpv_base_indirect"
      ],
      "value": 1
    },
    {
      "id": "01CGBQ4BKT8S2VDMT2RGH1FQ71",
      "key": [
        "2018-06-19T09:16:44,707",
        "stage",
        "s_ad_ztc_realpv_base_indirect"
      ],
      "value": 1
    },
    {
      "id": "01CGBQ18CBHQX3F28649YH66B9",
      "key": [
        "2018-06-19T09:16:43,717",
        "stage",
        "s_ad_ztc_realpv_base_indirect"
      ],
      "value": 1
    }
  ]
}
the key "ods" should not in the results.
What did I do wrong?
Your query is not a multi-key query; it's a start key and an end key.
If you want results for one dbName in a specific time range, you need to change the emit to [doc.dbName, doc.logTime, doc.tableName],
and then query with startkey=["stage","2018-06-19T09:16:43,717"]&endkey=["stage","2018-06-19T09:16:47,527"].
(By the way, are you sure your timestamps are in the right order? In your example the start key timestamp is larger than the end key timestamp.)
As you have chosen a full date/time stamp, down to millisecond precision, as the first level of your compound key, there are unlikely to be any repeating values in that first level. If you indexed just the date, say, as the first key, your data would be grouped by date, dbName and tableName in a more predictable way, e.g.:
["2018-06-19","ods","o_ad_dsp_pvlog_realtime"]
["2018-06-19","stage","s_ad_ztc_realpv_base_indirect"]
["2018-06-19",stage","s_ad_ztc_realpv_base_indirect"
["2018-06-19","stage","s_ad_ztc_realpv_base_indirect"
With this key structure, the hierarchical grouping of keys works in your favour i.e. all the data from "2018-06-19" is together in the index, with all the data matching ["2018-06-19","stage"] adjacent to each other.
If you need to get to millisecond precision, you could index the data as follows:
function(doc) {
  if (doc.market == "m_warehouse") {
    emit([doc.dbName, doc.logTime], 1);
  }
}
This would create an index organised by dbName, with a secondary sort on time. You can then extract the data for a specified dbName between two timestamps.
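For example, a request against that index (reusing the original design document and view name) could look like:
_design/select_data/_view/new-view/?reduce=false&startkey=["stage","2018-06-19T09:16:43,717"]&endkey=["stage","2018-06-19T09:16:47,527"]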

Count Unique Objects

My index looks like this:
"_source": {
"ProductName": "Random Product Name",
"Views": {
"Washington": [
{ "4nce5bbszjfppltvc": "2018-04-07T18:25:16.160Z" },
{ "4nce5bba8jfpowm4i": "2018-04-07T18:05:39.714Z" },
{ "4nce5bbszjfppltvc": "2018-04-07T18:36:23.928Z" },
]
}
}
I am trying to count the number of unique objects in Views.Washington.
In this case the result would be 2, since two objects have the same key name (the first and third objects in the array).
Obviously, my first thought was to use aggregations, but I am not sure how to use them with nested objects like these.
Can this be done with normal aggregations?
Will I need to use a script?
Yes, this can be done with aggregations: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-nested-aggregation.html
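Dynamic keys such as "4nce5bbszjfppltvc" cannot be targeted by an aggregation directly, so a sketch like the following assumes the views are remodeled as nested objects with explicit fields, e.g. { "viewerId": "4nce5bbszjfppltvc", "viewedAt": "2018-04-07T18:25:16.160Z" }, that Views.Washington is mapped as nested, and that viewerId is a keyword field (the index name and the field names are assumptions):

POST products/_search
{
  "size": 0,
  "aggs": {
    "washington": {
      "nested": { "path": "Views.Washington" },
      "aggs": {
        "unique_viewers": {
          "cardinality": { "field": "Views.Washington.viewerId" }
        }
      }
    }
  }
}

For the sample document, unique_viewers.value would be 2.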

Using elastic search to build flow/funnel results based on unique identifiers

I want to be able to return a set of counts of individual documents from a single index based on a previous set of results, and am wondering if there is a way to do it without running a separate query for each.
So, given a data set like this (simplified version of my ES documents):
{
  "name": "visit",
  "sessionId": "session1"
},
{
  "name": "visit",
  "sessionId": "session2"
},
{
  "name": "visit",
  "sessionId": "session3"
},
{
  "name": "click",
  "sessionId": "session1"
},
{
  "name": "click",
  "sessionId": "session3"
}
What I would like to do is search for name: visit and get a count of all those. That part is easy. But I would also like to count the name: click docs whose sessionId appears in the name: visit result set, and return that count along with the name: visit count.
Is there an easy way to do this? I have looked at the aggregation APIs but none seem to quite fit my needs. There is also a parent/child relationship, but it doesn't apply to my situation since both kinds of documents I want to count individually are of the same type.
Expected result would be something like this:
{
  "count": {
    // total number of visit events since this is my start point
    "visit": 3,
    // the number of click results whose sessionId matches
    // one of the previous search's sessionId values
    "click": 2
  }
}
At first glance, you need to do this in two queries:
the first aggregation query to retrieve the sessionIds and
a second aggregation query filtered with those sessionIds to find the count of clicks.
I don't think it's a big deal to run those two queries, but that depends on how much data you have and how many sessionIds you want to retrieve at once.
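A minimal sketch of the two queries, assuming name and sessionId are indexed as keyword fields and that the index name (events) is a placeholder:

POST events/_search
{
  "size": 0,
  "query": { "term": { "name": "visit" } },
  "aggs": {
    "session_ids": {
      "terms": { "field": "sessionId", "size": 10000 }
    }
  }
}

hits.total of this response is the visit count (3 for the sample data), and the session_ids buckets hold the sessionIds. Feed those into the second query:

POST events/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "name": "click" } },
        { "terms": { "sessionId": ["session1", "session2", "session3"] } }
      ]
    }
  }
}

hits.total of the second response is the click count (2 for the sample data).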

How to GROUP BY using Spring Data

Hi, I'm using Spring Data in my project and I'm trying to group by two fields. Here's the request:
@Query("SELECT obj FROM Agence obj GROUP BY obj.secteur.nomSecteur, obj.nomAgence")
Iterable<Agence> getSecteurAgenceByPc();
but it doesn't work for me. What I want is this result:
- Safi
  - CTM
    CZC1448YZN
    2UA13817KT
- Rabat
  - CTM
    CZC1349G1B
    2UA0490SVR
  - Agdal
    G3M4NOJ
- Essaouira
  - CTM
    CZC1221B85
  - Gare Routiere Municipale
    CZC145YL3
What I get is:
{
  "status": 0,
  "data": [
    {
      "secteur": "Safi",
      "agence": "CTM"
    },
    {
      "secteur": "Safi",
      "agence": "Dep"
    },
    {
      "secteur": "Rabat",
      "agence": "Agdal"
    },
    {
      "secteur": "Rabat",
      "agence": "CTM"
    },
    {
      "secteur": "Essaouira",
      "agence": "CTM"
    },
    {
      "secteur": "Essaouira",
      "agence": "Gare Routiere Municipale"
    }
  ]
}
What you want is not possible with JPQL.
What does Group By do?
It combines all rows that are identical in the columns in the GROUP BY clause into one row. Since it combines multiple rows into one, data in other columns can only be present in some combined fashion; for example, you can include MIN/MAX or AVG values, but never the original values.
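For example, a JPQL query of that shape can only return aggregated values per group, never the individual rows (a sketch, not the tree the question asks for):

@Query("SELECT obj.secteur.nomSecteur, obj.nomAgence, COUNT(obj) FROM Agence obj GROUP BY obj.secteur.nomSecteur, obj.nomAgence")
Iterable<Object[]> countBySecteurAndAgence();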
Also, the result will always be a table, never a tree.
Also note: there is no duplicated data. Every combination of secteur and agence appears exactly once.
If you want a tree structure, you have to write some Java code for that, as sketched below.
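A minimal sketch of that Java code, assuming the repository (here called repository) returns the flat rows without the GROUP BY, and that Agence exposes getSecteur().getNomSecteur(), getNomAgence() and a getPc() accessor for the machine name (getPc is an assumed name):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Collect the repository's Iterable into a List
List<Agence> agences = new ArrayList<>();
repository.findAll().forEach(agences::add);

// Build the secteur -> agence -> PCs tree from the flat rows
Map<String, Map<String, List<String>>> tree = agences.stream()
    .collect(Collectors.groupingBy(
        a -> a.getSecteur().getNomSecteur(),
        Collectors.groupingBy(
            Agence::getNomAgence,
            Collectors.mapping(Agence::getPc, Collectors.toList()))));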

How to define document ordering based on filter parameter

Hi Elasticsearch experts.
I have a problem which might be related to the fact that I am indexing relational DB data.
My scenario is the following:
I have two entities: documents and meetings.
They are independent, although it is possible to assign documents to meetings in a given order.
We are using a join table for this in the DB:
meetings(id,name,date)
document(id,title,author)
meeting_document(doc_id,meeting_id,order)
In Elasticsearch I am indexing the document ids as a NESTED property of the meeting.
Meeting example:
{
  "id": 25,
  "name": "test",
  "documents": [22, 12, 24, 55]
}
I will fetch the meeting first; after that I would like to send a request for the documents, filtering on document.id and asking Elasticsearch to return the list in the same order as the list of ids I passed to the filter.
What is the best way to implement this ?
Thanks
Nice question,
I've spent some time on this and come up with a solution. It might be a tricky one, but it works.
Let's have a look at my query.
I've used script_score for sorting by a user-defined list.
POST index/type/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "script_score": {
            "script": "ar.size()-ar.indexOf(doc['docid'].value)",
            "params": {
              "ar": [
                "1",
                "2",
                "4",
                "3"
              ]
            }
          }
        }
      ]
    }
  },
  "filter": {
    "terms": {
      "docid": [
        "1",
        "2",
        "4",
        "3"
      ]
    }
  }
}
The thing you have to take care of is to send the same values in the filter and in params, like in the above query. The script scores each document as ar.size() - ar.indexOf(id), so documents earlier in the list get higher scores and sort first.
This returns hits with doc ids 1, 2, 4, 3.
You have to change the field name inside the script and in the filter to match your mapping, and you can use a term query inside the query object.
I've tested the code. Hope this helps!
Thanks
