I have the following documents stored:
{
  "date": 1437429603126,
  "id": "7c578fe6-5eeb-466c-a79a-628784fd0d16",
  "quote": {
    "c": "+2.45",
    "c_fix": "2.45",
    "ccol": "chg",
    "cp": "1.89",
    "cp_fix": "1.89",
    "div": "0.52",
    "e": "NASDAQ",
    "ec": "+0.58",
    "ec_fix": "0.58",
    "eccol": "chg",
    "ecp": "0.44",
    "ecp_fix": "0.44",
    "el": "132.65",
    "el_cur": "132.65",
    "el_fix": "132.65",
    "elt": "Jul 20, 5:59PM EDT",
    "id": "22144",
    "l": "132.07",
    "l_cur": "132.07",
    "l_fix": "132.07",
    "lt": "Jul 20, 4:09PM EDT",
    "lt_dts": "2015-07-20T16:09:40Z",
    "ltt": "4:09PM EDT",
    "pcls_fix": "129.62",
    "s": "2",
    "t": "AAPL",
    "yld": "1.57"
  }
}
And I'm looking to run a query that selects the fields quote.t, quote.l, quote.c and quote.cp where quote.t is AAPL, ordered by date. The piece that is missing is grouping multiple documents that fall on the same day. The logic I need is: for each day, take the single latest document where quote.t = AAPL. So there should only be one document returned per day, and that document should have the greatest date within its day.
Here is what I have so far; it is still missing the grouping of multiple documents within a single day:
r.db('macd').table('daily_closes').filter({
  'quote': {
    't': 'AAPL'
  }
}).orderBy('date').pluck('date', {
  'quote': ['t', 'l', 'c', 'cp']
})
Also, I have secondary indexes; how can I use those in the query?
You need to group by day, but you store the date as epoch time in milliseconds, so you need a way to turn that into a day value first. We can then group by that value, sort each group's reduction array in descending order, and get the first element of that array with nth.
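For example, r.epochTime expects seconds, so dividing the millisecond timestamp by 1000 and calling date() truncates it to midnight UTC of that day (a quick sketch using the sample document's timestamp):

r.epochTime(1437429603126 / 1000).date()
// => Mon Jul 20 2015 00:00:00 GMT+00:00

Putting that together with the rest of the query: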
r.table('daily_closes').filter({
  'quote': {
    't': 'AAPL'
  }
})
.orderBy('date')
.pluck('date', {
  'quote': ['t', 'l', 'c', 'cp']
})
.group(r.epochTime(r.row('date').div(1000)).date())
.orderBy(r.desc('date'))
.nth(0)
You may get something like this:
{
  "group": Mon Jul 20 2015 00:00:00 GMT+00:00,
  "reduction": {
    "_date": Mon Jul 20 2015 00:00:00 GMT+00:00,
    "date": 1437429603126,
    "quote": {
      "c": "+2.45",
      "cp": "1.89",
      "l": "132.07",
      "t": "AAPL"
    }
  }
}
So let's reduce the noise by ungrouping. Without ungroup you are operating on the sub-stream of each group; after ungroup, each group becomes a single document. We also only care about the data inside reduction, because that holds the single document we selected for each day. Here is the final query:
r.table('daily_closes').filter({
  'quote': {
    't': 'AAPL'
  }
})
.orderBy('date')
.pluck('date', {
  'quote': ['t', 'l', 'c', 'cp']
})
.group(r.epochTime(r.row('date').div(1000)).date())
.orderBy(r.desc('date'))
.nth(0)
.ungroup()
.getField('reduction')
Now, let's use an index.
First, filter is slow and is limited to 100k documents, and orderBy without an index is slow too. Let's switch to getAll with an index. But we cannot use an indexed orderBy after a getAll, so we will use this trick:
Create a compound index on both values, then query it with between:
r.table('daily_closes').indexCreate('quote_date', [r.row('quote')('t'),r.row('date')])
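A practical note (not in the original answer): indexCreate returns immediately while the index builds in the background, so before the first query you may want to block until it is ready:

r.table('daily_closes').indexWait('quote_date')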
Now, we use between:
r.table('daily_closes')
  .between(['AAPL', r.minval], ['AAPL', r.maxval], {index: 'quote_date'})
  .pluck('date', {
    'quote': ['t', 'l', 'c', 'cp']
  })
  .group(r.epochTime(r.row('date').div(1000)).date())
  .orderBy(r.desc('date'))
  .nth(0)
  .ungroup()
  .getField('reduction')
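To actually execute this from Node.js, here is a minimal sketch assuming the official JavaScript driver; the host, port and db values are placeholders, and query stands for the indexed query above:

var r = require('rethinkdb');

r.connect({host: 'localhost', port: 28015, db: 'macd'}, function (err, conn) {
  if (err) throw err;
  // `query` is the between/group/ungroup query shown above
  query.run(conn, function (err, result) {
    if (err) throw err;
    console.log(result); // one document per day, each the latest of its day
    conn.close();
  });
});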
I hope this helps.
Related
I'm using GraphQL with .NET Core. I have a query like the one below. As I'm new to GraphQL.NET, I can't understand how to group each individual key as an array.
{
  readingQuery {
    readingsDBFilter(buildingId: 30, objectId: 1, datafieldId: 1, startTime: "02-05-2018 12-00-00-AM", endTime: "30-05-2018 11-59-00-PM") {
      value,
      unix
    }
  }
}
I have output like this:
{
  "data": {
    "readingQuery": {
      "readingsDBFilter": [
        {
          "value": 0.66,
          "unix": 1525254180000
        },
        {
          "value": 0.68,
          "unix": 1525254240000
        }
      ]
    }
  }
}
But is it possible to return a result like this from the query?
{
  "data": {
    "readingQuery": {
      "readingsDBFilter": [
        {
          "value": [0.66, 0.68],
          "unix": [1525254180000, 1525254240000]
        }
      ]
    }
  }
}
Looks like you need to group values from different records.
I guess you have two options here:
1) group it at the SQL level (maybe better to create a dedicated view), or
2) do it at runtime, in code. From my point of view that is the worse option: any grouping in code is much slower than the same operation at the db level.
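That said, if you do go the runtime route, the reshaping itself is small. A minimal client-side sketch in JavaScript, where readings stands for the readingsDBFilter array from the response above:

// `readings` stands for the readingsDBFilter array from the response above.
var readings = [
  {value: 0.66, unix: 1525254180000},
  {value: 0.68, unix: 1525254240000}
];

// Collect each field across all records into one array-valued record.
var grouped = [{
  value: readings.map(function (r) { return r.value; }),
  unix: readings.map(function (r) { return r.unix; })
}];
// => [{value: [0.66, 0.68], unix: [1525254180000, 1525254240000]}]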
I'm storing room objects in an index like this:
{
  "name": "room1",
  "availability": "10",
  "reservations": [
    {
      "start_date": "2019-09-12",
      "end_date": "2019-09-15"
    },
    {
      "start_date": "2019-09-17",
      "end_date": "2019-09-19"
    }
  ]
}
Given a new startDate and endDate, how can I match all rooms where room.availability is greater than the number of reservations that overlap with these dates?
Have you tried using a range query and a script query to only return the documents that match your predicate?
elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-query.html
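As a rough starting point, a script query comparing availability against the number of stored reservations could look like the sketch below. This assumes availability is mapped as an integer (not the string shown above); the date-overlap check against your startDate/endDate would still need to be added to the script, and with a plain object mapping the reservations array is flattened, so a nested mapping may be needed to keep each start_date paired with its end_date:

{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "lang": "painless",
            "source": "doc['availability'].value > doc['reservations.start_date'].size()"
          }
        }
      }
    }
  }
}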
I'm working on a Google Sheets integration project where I'd like to add formatted text to cells (bold, italic). This needs to apply to only part of the cell (e.g. only some of the text in the cell is bold). I can see that this can be done through the CellData object, documented in the Sheets API here:
CellData
But I can't work out how to get an instance of these objects. I'm using the sheets service to successfully get SpreadSheet, Sheet and ValueRange objects, but I can't work out how to get through to the cell data objects themselves to use these methods.
You want to retrieve the formats when part of a cell's value has several formats.
You want to put a value with several formats into a cell.
I understand your question as above. If my understanding is correct, how about these samples?
1. Retrieve value
When part of a cell's value has several formats, the script for retrieving the value with its formats is as follows.
Sample script:
This sample script retrieves the value from the cell "A1" of "Sheet1".
spreadsheet_id = '### spreadsheet ID ###'
ranges = ['Sheet1!A1']
fields = 'sheets(data(rowData(values(textFormatRuns,userEnteredValue))))'
response = service.get_spreadsheet(spreadsheet_id, ranges: ranges, fields: fields)
Result:
{
  "sheets": [
    {
      "data": [
        {
          "rowData": [
            {
              "values": [
                {
                  "userEnteredValue": {
                    "stringValue": "abcdefg"
                  },
                  "textFormatRuns": [
                    {
                      "format": {}
                    },
                    {
                      "format": {
                        "fontSize": 24,
                        "foregroundColor": {
                          "red": 1
                        },
                        "bold": true
                      },
                      "startIndex": 2
                    },
                    {
                      "format": {},
                      "startIndex": 5
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
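In this result, the three textFormatRuns partition the string "abcdefg": characters 0-1 ("ab") keep the default format, characters 2-4 ("cde") use font size 24, red and bold, and characters from index 5 onward ("fg") revert to the default. Each run starts at its startIndex (0 when omitted) and extends to the next run's startIndex.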
2. Put value
When putting a value with several formats into a cell, the script is as follows.
Sample script:
This sample script puts the value into the cell "B1" of "Sheet1". As a sample, update_cells is used for this situation.
spreadsheet_id = '### spreadsheet ID ###'
requests = {requests: [
  update_cells: {
    fields: 'userEnteredValue,textFormatRuns',
    range: {sheet_id: 0, start_row_index: 0, end_row_index: 1, start_column_index: 1, end_column_index: 2},
    rows: [{values: [{
      user_entered_value: {string_value: 'abcdefg'},
      text_format_runs: [
        {format: {}},
        {format: {font_size: 24, foreground_color: {red: 1}, bold: true}, start_index: 2},
        {format: {}, start_index: 5}
      ]
    }]}]
  }
]}
response = service.batch_update_spreadsheet(spreadsheet_id, requests, {})
About sheet_id: 0: if you want to use a different sheet, please modify this value.
Result: the value "abcdefg" is written to cell "B1" with the mixed formats described above.
Note:
These sample scripts suppose that your environment can use the Sheets API.
These are simple samples, so please modify them for your situation.
References:
spreadsheets.get
spreadsheets.batchUpdate
textFormatRuns
updateCells
I created a view with a map function:
function(doc) {
if (doc.market == "m_warehouse") {
emit([doc.logTime,doc.dbName,doc.tableName], 1);
}
}
I want to filter the data with multiple keys:
_design/select_data/_view/new-view/?limit=10&skip=0&include_docs=false&reduce=false&descending=true&startkey=["2018-06-19T09:16:47,527","stage"]&endkey=["2018-06-19T09:16:43,717","stage"]
but I still got:
{
  "total_rows": 248133,
  "offset": 248129,
  "rows": [
    {
      "id": "01CGBPYVXVD88FPDVR3NP50VJW",
      "key": ["2018-06-19T09:16:47,527", "ods", "o_ad_dsp_pvlog_realtime"],
      "value": 1
    },
    {
      "id": "01CGBQ6JMEBR8KBMB8T7Q7CZY3",
      "key": ["2018-06-19T09:16:44,824", "stage", "s_ad_ztc_realpv_base_indirect"],
      "value": 1
    },
    {
      "id": "01CGBQ4BKT8S2VDMT2RGH1FQ71",
      "key": ["2018-06-19T09:16:44,707", "stage", "s_ad_ztc_realpv_base_indirect"],
      "value": 1
    },
    {
      "id": "01CGBQ18CBHQX3F28649YH66B9",
      "key": ["2018-06-19T09:16:43,717", "stage", "s_ad_ztc_realpv_base_indirect"],
      "value": 1
    }
  ]
}
the key "ods" should not in the results.
What did I do wrong?
Your query is not multi-key; it is a startkey/endkey range query.
If you want results for one dbName within a given time range, you need to change the emit to [doc.dbName, doc.logTime, doc.tableName].
Then you query with startkey=["stage","2018-06-19T09:16:43,717"]&endkey=["stage","2018-06-19T09:16:47,527"].
(By the way, are you sure your timestamps are in the right order? In your example the startkey timestamp is later than the endkey timestamp, which only makes sense together with descending=true.)
As you have chosen a full date/time stamp as the first level of your key, down to millisecond precision, there are unlikely to be any repeating values in the first level of your compound key. If you indexed just the date, say, as the first key, your data would be grouped by date, dbName and table name in a more predictable way, e.g.:
["2018-06-19","ods","o_ad_dsp_pvlog_realtime"]
["2018-06-19","stage","s_ad_ztc_realpv_base_indirect"]
["2018-06-19","stage","s_ad_ztc_realpv_base_indirect"]
["2018-06-19","stage","s_ad_ztc_realpv_base_indirect"]
With this key structure, the hierarchical grouping of keys works in your favour, i.e. all the data from "2018-06-19" is together in the index, with all the data matching ["2018-06-19","stage"] adjacent to each other.
If you need to get to millisecond precision, you could index the data as follows:
function(doc) {
if (doc.market == "m_warehouse") {
emit([doc.dbName,doc.logTime], 1);
}
}
This would create an index organised by dbName, but with a secondary sort on time. You can then extract the data for a specified dbName between two timestamps, for example with the query shown below.
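Assuming the same design document and view names as in the question, such a query would look like:

_design/select_data/_view/new-view/?reduce=false&startkey=["stage","2018-06-19T09:16:43,717"]&endkey=["stage","2018-06-19T09:16:47,527"]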
Say I've got a dynamic array A of values [x,y,z].
I want to return all results for which property P has a value that exists in A.
I could write some recursive filter that concatenates 'or's for each value in A, but it's extremely clunky.
Any other out-of-the-box way to do this?
You can use the filter command in conjunction with the reduce and contains commands to accomplish this.
Example
Let's say you have the following documents:
{
  "id": "41e352d0-f543-4731-b427-6e16a2f6fb92",
  "property": [1, 2, 3]
}, {
  "id": "a4030671-7ad9-4ab9-a21f-f77cba9bfb2a",
  "property": [5, 6, 7]
}, {
  "id": "b0694948-1fd7-4293-9e11-9e5c3327933e",
  "property": [2, 3, 4]
}, {
  "id": "4993b81b-912d-4bf7-b7e8-e46c7c825793",
  "property": ["b", "c"]
}, {
  "id": "ce441f1e-c7e9-4a7f-9654-7b91579029be",
  "property": ["a", "b", "c"]
}
From this sequence, you want to get all documents that have either "a" or 1 in their property attribute. You can write a query that builds a chained contains statement using reduce.
r.table('30510212')
  // Filter documents
  .filter(function (row) {
    // Array of values you want to filter for
    return r.expr([1, 'a'])
      // Insert `false` as the first value in the array
      // in order to make it the first value in the reduce's left
      .insertAt(0, false)
      // Chain up the `contains` statements
      .reduce(function (left, right) {
        return left.or(row('property').contains(right));
      });
  })
Update: Better way to do it
Actually, you can use two contains calls to execute the same query. This is shorter and probably a bit easier to understand.
r.table('30510212')
  .filter(function (row) {
    return row('property').contains(function (property) {
      return r.expr([1, 'a']).contains(property);
    });
  })
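If you prefer set semantics, an equivalent sketch using ReQL's setIntersection (note it treats both arrays as sets, so duplicates are ignored):

r.table('30510212')
  .filter(function (row) {
    // Keep rows whose `property` array shares at least one element with [1, 'a']
    return row('property').setIntersection([1, 'a']).isEmpty().not();
  })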