ClickHouse visitParamExtract - extract the second value by key when keys are duplicated

visitParamExtractString extracts a value by key. If the key is not unique, the function just returns the first match. What are the possible ways to get the second or third match?
From the docs:
Fields are searched for on any nesting level, indiscriminately. If there are multiple matching fields, the first occurrence is used
SELECT time, visitParamExtractString(message, 'interval') as interval
FROM stage.logs dl
WHERE time >= '2022-09-29 23:00:00'
order by time
The raw message looks like:
{"body":[{"value":3,"interval":"SECOND","intervalNum":10},{"value":205015,"interval":"DAY","intervalNum":1}],"code":200,"endpoint":"/api/v3/rateLimit/order"}
In this case I want to get "interval":"DAY" instead of "interval":"SECOND", so the result should look like:
time interval
2022-10-30 20:02:01.333 DAY

The visitParam* functions are simple and very fast; because of that, this is not possible with them.
You have to use the JSONExtract* functions instead: https://clickhouse.com/docs/en/sql-reference/functions/json-functions/#jsonextractjson-indices_or_keys-return_type
SELECT tupleElement(JSONExtract('{"body":[{"value":3,"interval":"SECOND","intervalNum":10},{"value":205015,"interval":"DAY","intervalNum":1}],"code":200,"endpoint":"/api/v3/rateLimit/order"}', 'body', 'Array(Tuple(value Int64, interval String, intervalNum Int64))'), 'interval') AS j
┌─j────────────────┐
│ ['SECOND','DAY'] │
└──────────────────┘
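A sketch of how this can be applied to the original query (table and column names taken from the question): the JSONExtract* functions accept a mix of keys and 1-based indices, so you can address the second element of body directly, or use a negative index for the last one.
SELECT
    time,
    -- 'body', 2 points at the second element of the body array; use -1 for the last element
    JSONExtractString(message, 'body', 2, 'interval') AS interval
FROM stage.logs
WHERE time >= '2022-09-29 23:00:00'
ORDER BY time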

Related

Return last n entries from a column

I was wondering how to return the last n entries from a column in Google Sheets.
I have seen how to return the last entry, but say I wanted to return the last 5 entries. What formula is best to use?
You can return the last n entries with the following logic:
=query(Query_formula,"Select * offset "&countif_formula)
You also need to know what your countif_formula is. To filter to the last n rows you can use this formula:
=countif(B2:B,E2)
This is an example based on the following link's use case:
Filter rows in Google Sheet

Formula to sort by column that contains times and text, place text at the end, in Google Sheets

My Google Sheets Select statement selects rows from a master sheet of results and then sorts them by time (least amount of time first) ascending.
Some of these results will be entered as DNS and I want them to appear at the end of the list, but they appear at the top.
Here is my statement:
Select A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,V,W where not D contains '/' and O is not Null order by P Asc
Column P contains time in HH:mm:ss formatted as duration. If one of the riders is a DNS, they appear at the top of the sort. Unless I sort descending - which is not desired.
With DNS, you probably mean "did not start". The query() function will only accept one data type in a column, and because most of the values in column P are time values, the "DNS" values will return as null. query() sorts null values first.
Try this to sort your data the way you describe:
=sort( { Data!A2:R, Data!Q2:R, Data!V2:W }, Data!P2:P, true )
Then use filter() or query() to remove rows where column D contains a /.

Extract more rows from a left join with Laravel

I have to extract the rows where the created_at falls within the current week. Unfortunately, only one row is extracted instead of the many I expected. Why?
Query:
$scadenze = DB::table('processi')
->leftJoin('scadenze', 'processi.id', '=', 'scadenze.processo_id')
->where('responsabile',$utente->id)
->whereNotIn('scadenze.stato', [4,5])
->whereBetween('scadenze.termine_stimato', [Carbon::now()->startOfWeek(), Carbon::now()->endOfWeek()])
->avg('tempistica');
This query extracts just one row, but in reality many more rows should be extracted.
That is because ->avg('tempistica') returns the average value computed over all the rows matched by the query, i.e. it returns just one value.
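Under the hood the builder runs a single aggregate query, roughly the SQL sketched below (a sketch only; exact identifier quoting and parameter bindings will differ):
SELECT AVG(tempistica) AS aggregate
FROM processi
LEFT JOIN scadenze ON processi.id = scadenze.processo_id
WHERE responsabile = ?                                 -- $utente->id
  AND scadenze.stato NOT IN (4, 5)
  AND scadenze.termine_stimato BETWEEN ? AND ?         -- start and end of the current week
An aggregate like this collapses all matched rows into one value; to get the rows themselves, end the chain with ->get() instead, or use ->sum('tempistica') if you want the total.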
Solution:
I was wrong to use avg instead of the sum function. The rows were extracted correctly, but instead of being added up (by tempistica) an average was computed. Thank you all for your help.

How to retrieve the last 100 documents with a MongoDB/Moped query?

I am using the Ruby Mongoid gem and trying to create a query to retrieve the last 100 documents from a collection. Rather than using Mongoid, I would like to create the query using the underlying driver (Moped). The Moped documentation only mentions how to retrieve the first 100 records:
session[:my_collection].find.limit(100)
How can I retrieve the last 100?
I have found a solution, but you will need to sort the collection in descending order. If you have an id or date field you would do:
Method: .sort({fieldName: 1 or -1})
A value of 1 sorts ascending (oldest to newest) and -1 sorts descending (newest to oldest). This reverses the order of your collection's entries.
session[:my_collection].find().sort({id:-1}) or
session[:my_collection].find().sort({date:-1})
If your collection contains the default id field (_id), that identifier has a timestamp embedded, so you can use
session[:my_collection].find().sort({_id:-1})
Combined with .limit() from your example, the complete query will be:
session[:my_collection].find().sort({id:-1}).limit(100);
Technically that query isn't finding the first 100; it's essentially finding 100 arbitrary documents, because you haven't specified an order. If you want the first 100, you have to sort them explicitly:
session[:my_collection].find.sort(:some_field => 1).limit(100)
and to reverse the order to find the last 100 with respect to :some_field:
session[:my_collection].find.sort(:some_field => -1).limit(100)
# -----------------------------------------------^^
Of course you have to decide what :some_field is going to be so that "first" and "last" make sense for you.
If you want them sorted by :some_field but want to peel off the last 100 then you could reverse them in Ruby:
session[:my_collection].find
.sort(:some_field => -1)
.limit(100)
.reverse
or you could use count to find out how many there are and then use skip to offset into the results:
total = session[:my_collection].find.count
session[:my_collection].find
.sort(:some_field => 1)
.skip(total - 100)
You'd have to check that total >= 100 and adjust the skip argument if it isn't, of course. I suspect that the first solution would be faster, but you should benchmark it with your data to see what reality says.

Zend lucene - search within range

I have the following code to create the Zend Lucene index
$doc->addField(Zend_Search_Lucene_Field::UnStored('keywords', $job->getKeywords()));
$doc->addField(Zend_Search_Lucene_Field::UnStored('title', $job->getTitle()));
$doc->addField(Zend_Search_Lucene_Field::UnStored('region', $job->getRegion()));
$doc->addField(Zend_Search_Lucene_Field::Keyword('minSalary', $minSalary));
$doc->addField(Zend_Search_Lucene_Field::Keyword('maxSalary', $maxSalary));
$doc->addField(Zend_Search_Lucene_Field::UnStored('type', $job->getType()));
and my search query is
$query = 'minSalary:[0 TO 20000]';
Here I am trying to get all jobs whose minSalary is equal to or less than 20000. But the results I get include jobs with the following minSalary values:
110000
100000
20000
10000
Can anyone advise on this?
I suggest using strings instead of numeric values: convert all numeric values (e.g. 1000) into strings of the same length (e.g. 0001000) during the indexing process. Lucene compares range boundaries lexicographically as strings, which is why values like 110000 and 100000 fall inside [0 TO 20000]. So, if you want to search for a minSalary from 0 to 20000, your query string has to look like this:
$query = "minSalary:[0000000 TO 0020000]";
