Returning MAX() and MIN() in one query


On my RethinkDB 1.16.2-1 on Linux, I have a "products" table that has a "categories" array and a "models" array like this:
{
"name": "ABC Cable Series" ,
"categories": [
"Analog Audio>Instrument>Cables" ,
"Analog Audio>Microphone Cables"
] ,
"models": [
{
"modelCode": "ABC-1" ,
"ssp": 11.95 , ...
} ,
{
"modelCode": "ABC-2" ,
"ssp": 15.95 , ...
}
]
} , ...
I need to get both the minimum and maximum price (ssp) range of models in products that contain the given product category. I can currently get the maximum price like this:
r.db("store").table("products").filter(function(prod) {
return prod("categories").contains(
function(cat){return cat.match("^Analog Audio>")
})
}).concatMap(function(doc) {
return doc("models")("ssp")
}).max()
Other than running 2 queries, is there a more efficient way to get both MAX and MIN values in one query?

Presuming you want an object with both values, you can do the following:
r.db('test').table('products').filter(function(prod) {
return prod("categories").contains(
function(cat){return cat.match("^Analog Audio>")
})
}).concatMap(function(doc) {
return doc("models")("ssp")
})
.coerceTo('array') // Convert Stream to Array
.do(function (rows) { // Pass array to `.do`
return { // Return Object
max: rows.max(),
min: rows.min()
}
})

You can also use reduce (http://rethinkdb.com/api/javascript/reduce/) to compute both values without converting all data to an array first:
r.db("store").table("products").filter(function(prod) {
return prod("categories").contains(
function(cat){return cat.match("^Analog Audio>")
})
}).map(function(doc) {
return {
min: doc("models")("ssp").min(),
max: doc("models")("ssp").max()
}
}).reduce(function (le, ri) {
return {
min: r.expr([le("min"), ri("min")]).min(),
max: r.expr([le("max"), ri("max")]).max()
}
})
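The map/reduce shape used above can be tried without a database. The following is only a plain JavaScript sketch of the same idea, not RethinkDB code: the `products` array, the second product, and the helper names `perProductRange` and `mergeRanges` are made up for illustration.

```javascript
// Hypothetical in-memory stand-in for the "products" table.
const products = [
  { name: "ABC Cable Series", models: [{ modelCode: "ABC-1", ssp: 11.95 }, { modelCode: "ABC-2", ssp: 15.95 }] },
  { name: "XYZ Cable Series", models: [{ modelCode: "XYZ-1", ssp: 7.5 }, { modelCode: "XYZ-2", ssp: 22.0 }] }
];

// Map step: one { min, max } pair per product.
function perProductRange(product) {
  const prices = product.models.map((model) => model.ssp);
  return { min: Math.min(...prices), max: Math.max(...prices) };
}

// Reduce step: merge two partial ranges into one.
function mergeRanges(left, right) {
  return {
    min: Math.min(left.min, right.min),
    max: Math.max(left.max, right.max)
  };
}

const priceRange = products.map(perProductRange).reduce(mergeRanges);
console.log(priceRange); // { min: 7.5, max: 22 }
```

Because each partial result has the same shape as the final one, the merge step can combine them in any order, which is what lets a reduce work through a stream without holding all the prices at once.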

Related

Elasticsearch random_score pushes documents towards the end of results

Here's the logic I am trying to accomplish:
I am using Elasticsearch to display top selling Products and randomly inserting newly created products in the results using function_score query DSL.
The issue is that I am using the random_score function for newly created products. The query does insert new products up to page 2 or 3, but all the remaining newly created products are pushed towards the end of the search results.
Here's the logic written for function_score:
function_score: {
query: query,
functions: [
{
filter: [
{ terms: { product_type: 'sponsored' } },
{ range: { live_at: { gte: 'CURRENT_DATE - 1.MONTH' } } }
],
random_score: {
seed: Time.current.to_i / (60 * 10), # new seed every 10 minutes
field: '_seq_no'
},
weight: 0.975
},
{
filter: { range: { live_at: { lt: 'CURRENT_DATE - 1.MONTH' } } },
linear: {
weighted_sales_rate: {
decay: 0.9,
origin: 0.5520974289580515,
scale: 0.5520974289580515
}
},
weight: 1
}
],
score_mode: 'sum',
boost_mode: 'replace'
}
And then I am sorting based on {"_score" => { "order" => "desc" } }
Let's say 100 sponsored products were created in the last month. The above Elasticsearch query displays 8-10 of them at random (3 to 4 per page) as I scroll through the first 2 or 3 pages, but the other 90-92 products are displayed in the last few pages of the results. This is because the score calculated by random_score for those 90-92 products is lower than the score calculated by the linear decay function.
Please suggest how I can modify this query so that I continue to see newly created products as I navigate through the pages, instead of having them pushed towards the end of the results.
[UPDATE]
I tried adding gauss decay function to this query (so that I can somehow modify the score of the products appearing towards the end of result) like below:
{
filter: [
{ terms: { product_type: 'sponsored' } },
{ range: { live_at: { gte: 'CURRENT_DATE - 1.MONTH' } } },
{ range: { "_score" => { lt: 0.9 } } }
],
gauss: {
views_per_age_and_sales: {
origin: 1563.77,
scale: 1563.77,
decay: 0.95
}
},
weight: 0.95
}
But this too is not working.
Links I have referred to:
https://intellipaat.com/community/12391/how-to-get-3-random-search-results-in-elasticserch-query
Query to get random n items from top 100 items in Elastic Search
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-function-score-query.html

I am not sure if this is the best solution, but I was able to accomplish this by wrapping the original query in a script_score query and adding a new Elasticsearch indexed attribute called sort_by_views_per_year. Here's how the solution looks:
Link I referred to: https://github.com/elastic/elasticsearch/issues/7783
attribute(:sort_by_views_per_year) do
object.live_age&.positive? ? object.views_per_year.to_f / object.live_age : 0.0
end
Then while querying ElasticSearch:
def search
#...preparation of query...#
query = original_query(query)
query = rearrange_low_scoring_docs(query)
sort = apply_sort opts[:sort]
Product.search(query: query, sort: sort)
end
I have not changed anything in original_query (i.e. it still applies random_score to products created within the last month and the linear decay function to older ones).
def rearrange_low_scoring_docs query
{
function_score: {
query: query,
functions: [
{
script_score: {
script: "if (_score.doubleValue() < 0.9) {return 0.9;} else {return _score;}"
}
}
],
#score_mode: 'sum',
boost_mode: 'replace'
}
}
end
Then finally my sorting looks like this:
def apply_sort
[
{ '_score' => { 'order' => 'desc' } },
{ 'sort_by_views_per_year' => { 'order' => 'desc' } }
]
end
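To illustrate why raising low scores to a floor helps, here is a small plain JavaScript sketch of the ranking effect. It does not call Elasticsearch; the `hits` array, its ids, and its values are invented for illustration, and only the 0.9 threshold and the two sort keys come from the code above.

```javascript
// Hypothetical hits; "score" stands for what original_query produced.
const hits = [
  { id: "top-seller", score: 0.97, sort_by_views_per_year: 10 },
  { id: "new-a", score: 0.12, sort_by_views_per_year: 50 },
  { id: "old-b", score: 0.4, sort_by_views_per_year: 80 },
  { id: "new-c", score: 0.93, sort_by_views_per_year: 5 }
];

// Same rule as the script_score: anything below 0.9 is raised to 0.9.
const clampScore = (score) => (score < 0.9 ? 0.9 : score);

const ranked = hits
  .map((hit) => ({ ...hit, score: clampScore(hit.score) }))
  // Sort by score descending, then by sort_by_views_per_year descending.
  .sort((a, b) => b.score - a.score || b.sort_by_views_per_year - a.sort_by_views_per_year);

console.log(ranked.map((hit) => hit.id));
// [ 'top-seller', 'new-c', 'old-b', 'new-a' ]
```

Every document that scored below 0.9 ends up tied at 0.9, so the secondary sort key decides the order within that group rather than the low random or decay score.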
It would be very helpful if the Elasticsearch random_score function supported attributes such as max_doc_to_include and min_score, so that I could use it like this:
{
filter: [
{ terms: { product_type: 'sponsored' } },
{ range: { live_at: { gte: 'CURRENT_DATE - 1.MONTH' } } }
],
random_score: {
seed: 123456, # new seed every 10 minutes
field: '_seq_no',
max_doc_to_include: 10,
min_score: 0.9
},
weight: 0.975
},

PouchDB quick search: Search within for loop and return results AFTER for loop finish

I have the following document structure:
{
"_id": "car_1234",
"_rev": "1-9464f5d70547c255a423ff8dae653db1",
"Tags": [
"Audi",
"A4",
"black"
],
"Car Brand": "Audi",
"Model": "A4",
"Color": "black",
"CarDealerID": "5"
}
The Tags field stores the information of the document in a list. This structure needs to stay like this. The user can search for cars in an HTML text input field, where a , separates one car from the next. Let's take the following example:
black Audi, pink Audi A4
Here the user wants to find a black Audi or a pink Audi A4. My approach of querying through the database is by splitting the entered words to the following structure [["black", "Audi"],["pink", "Audi", "A4"]] and to search inside the Tags field of each document in the db if all the words in a subarray (e.g. "black" and "Audi") are existent and to return the CarDealerID.
///Before this I return the word list as described
}).then(function (wordList) {
results = [];
for (var i = 0; i < userWords.length; i++) {
//Check if the object is a single word or an array of words
if (wordList[i].constructor === Array) {
//Recreate the words in the array as one string
wordString = ""
wordList[i].forEach(function (part) {
wordString += part + " "
})
wordString = wordString.trim()
//Search for the car
car_db.search({
query: wordString,
fields: ["Tags"],
include_docs: true
}).then(function(result) {
result.rows.forEach(function (row) {
results.push(row.doc.CarDealerID)
})
})
} else {
car_db.search({
query: userWords[i],
fields: ["Tags"],
include_docs: true
}).then(function(result) {
result.rows.forEach(function (row) {
results.push(row.doc.CarDealerID)
})
})
}
}
return results
}).then(function(results) {
console.log(results)
}).catch(function (err) {
console.log(err)
});
My Problem
My problem is now that the results are returned before the for loop finishes. This is probably because it is an async procedure and the result should wait to be returned until this async is finished. But I don't know how to achieve that. I hope someone can help me out.

Thanks to Nolan Lawson's Blog (Rookie Mistake #2) I could figure it out. Instead of the for loop, I use
return Promise.all(wordList.map(function (i) {
results = [];
//
//Same Code as before
//
//Return results inside the map function
return results;
}));
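A fuller sketch of the Promise.all pattern may make the fix clearer. The key point is that each map callback returns the promise of its search, so Promise.all can wait for every search to finish. This is a minimal sketch under stated assumptions: `fakeSearch` is a hypothetical stand-in for `car_db.search` (it takes an array of words and resolves with `{ rows: [{ doc }] }`), and the `cars` data and `findDealers` name are invented for illustration.

```javascript
// Hypothetical data and search function standing in for car_db.search().
const cars = [
  { CarDealerID: "5", Tags: ["Audi", "A4", "black"] },
  { CarDealerID: "7", Tags: ["Audi", "A4", "pink"] },
  { CarDealerID: "9", Tags: ["BMW", "black"] }
];

function fakeSearch(words) {
  const rows = cars
    .filter((car) => words.every((word) => car.Tags.includes(word)))
    .map((car) => ({ doc: car }));
  return Promise.resolve({ rows });
}

function findDealers(wordList) {
  // Each callback returns a promise; Promise.all resolves once all have resolved.
  return Promise.all(
    wordList.map((words) =>
      fakeSearch(words).then((result) => result.rows.map((row) => row.doc.CarDealerID))
    )
  ).then((perQuery) => perQuery.flat()); // merge the per-query arrays into one
}

findDealers([["black", "Audi"], ["pink", "Audi", "A4"]]).then((ids) => {
  console.log(ids); // [ '5', '7' ]
});
```

Nothing is pushed into a shared `results` variable here; each search resolves with its own array and the arrays are merged only after all of them are available.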

How to find the maximum of all arrays in an array of arrays?

I'm trying to learn D3 by book and examples. One example I'm working through is a simple (multi) line chart located here http://bl.ocks.org/mbostock/3884955#index.html .
I can follow along for the most part but I can't make sense of this:
y.domain([
d3.min(cities, function(c) { return d3.min(c.values, function(v) { return v.temperature; }); }),
d3.max(cities, function(c) { return d3.max(c.values, function(v) { return v.temperature; }); })
]);
When I was trying to write the code on my own, using the example as a cheat sheet, I came up with this
y.domain([0, d3.max(data, function(d) { return d.temperature; })]);
because I wanted the y range to span from 0 to the max of all temperatures.
I believe I have two questions here:
1) is the nested mins and maxs because it's looking at the max of each array within the array?
2) am I correct thinking that 'cities' is the entire array and values is the array of temperatures within 'cities'?
Apologies if this question isn't very focused. I believe I want to figure out how to find the maximum of an array of arrays.

Is the nested mins and maxs because it's looking at the max of each array within the array?
Yes, you are right. The cities JSON is an array in which each element holds another array under the key values. The idea here is to find the minimum temperature across those nested arrays:
d3.min(cities, function(c) { return d3.min(c.values, function(v) { return v.temperature; }); }),
am I correct thinking that 'cities' is the entire array and values is the array of temperatures within 'cities'?
Yes, you are correct again. Paste the JSON below into a JSON formatter and the structure will be easier to see:
cities = [
{
"name":"New York",
"values":[
{
"date":"2011-09-30T18:30:00.000Z",
"temperature":63.4
},
{
"date":"2011-10-01T18:30:00.000Z",
"temperature":58
},
{
"date":"2011-10-02T18:30:00.000Z",
"temperature":53.3
},
{
"date":"2011-10-03T18:30:00.000Z",
"temperature":55.7
},
{
"date":"2011-10-04T18:30:00.000Z",
"temperature":64.2
}
]
},
{
"name":"San Francisco",
"values":[
{
"date":"2011-09-30T18:30:00.000Z",
"temperature":62.7
},
{
"date":"2011-10-01T18:30:00.000Z",
"temperature":59.9
},
{
"date":"2011-10-02T18:30:00.000Z",
"temperature":59.1
},
{
"date":"2011-10-03T18:30:00.000Z",
"temperature":58.8
}
]
},
{
"name":"Austin",
"values":[
{
"date":"2011-09-30T18:30:00.000Z",
"temperature":72.2
},
{
"date":"2011-10-01T18:30:00.000Z",
"temperature":67.7
},
{
"date":"2011-10-02T18:30:00.000Z",
"temperature":69.4
}
]
}
]
Hope this helps!
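The "maximum of the per-array maximums" idea can also be written without D3. The sketch below uses plain JavaScript and a shortened copy of the cities data above; the variable names `perCityMax` and `overallMax` are only illustrative.

```javascript
// Shortened copy of the data above: two cities, two readings each.
const cities = [
  { name: "New York", values: [{ temperature: 63.4 }, { temperature: 58 }] },
  { name: "Austin", values: [{ temperature: 72.2 }, { temperature: 67.7 }] }
];

// Inner step: the maximum temperature within each city's values array.
const perCityMax = cities.map((city) =>
  Math.max(...city.values.map((value) => value.temperature))
);

// Outer step: the maximum of those per-city maximums.
const overallMax = Math.max(...perCityMax);

console.log(perCityMax); // [ 63.4, 72.2 ]
console.log(overallMax); // 72.2
```

The nested d3.max call in the example follows the same two steps: the inner accessor reduces each city to one number, and the outer call takes the largest of those numbers.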

Rethinkdb insert query results into a table

I'm trying to insert the results of a query from one table into another table. However, when I attempt to run the query I am receiving an error.
{
"deleted": 0 ,
"errors": 1 ,
"first_error": "Expected type OBJECT but found ARRAY." ,
"inserted": 0 ,
"replaced": 0 ,
"skipped": 0 ,
"unchanged": 0
}
Here is the insert and query:
r.db('test').table('destination').insert(
r.db('test').table('source').map(function(doc) {
var result = doc('result');
return result('section_list').concatMap(function(section) {
return section('section_content').map(function(item) {
return {
"code": item("code"),
"name": item("name"),
"foo": result("foo"),
"bar": result("bar"),
"baz": section("baz"),
"average": item("average"),
"lowerBound": item("from"),
"upperBound": item("to")
};
});
});
})
);
Is there a special syntax for this, or do I have to retrieve the results and then run a separate insert?

The problem is that your inner query is returning a stream of arrays. You can't insert arrays into a table (only objects), so the query fails. If you change the outermost map into a concatMap it should work.
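The difference between the two calls can be seen with ordinary JavaScript arrays, where `flatMap` plays a role similar to concatMap. This is only an analogy in plain JavaScript, not RethinkDB code, and the `source` data and `toRows` helper are made up for illustration.

```javascript
// Hypothetical documents, each with nested sections and items.
const source = [
  { sections: [{ items: ["a", "b"] }, { items: ["c"] }] },
  { sections: [{ items: ["d"] }] }
];

// Builds the array of row objects for one document.
const toRows = (doc) =>
  doc.sections.flatMap((section) => section.items.map((item) => ({ code: item })));

const nested = source.map(toRows);   // one array of rows per document
const flat = source.flatMap(toRows); // a single list of row objects

console.log(nested.length, Array.isArray(nested[0])); // 2 true
console.log(flat.length, Array.isArray(flat[0]));     // 4 false
```

With `map`, each element of the result is itself an array, which is the shape that an insert expecting objects rejects; with `flatMap`, every element is a plain object.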

The problem here was that the result was a sequence of arrays of objects, i.e.
[ [ { a:1, b:2 }, { a:1, b:2 } ], [ { a:2, b:3 } ] ]
Therefore, I had to change the outer map call to a concatMap call. The query then becomes:
r.db('test').table('destination').insert(
r.db('test').table('source').concatMap(function(doc) {
var result = doc('result');
return result('section_list').concatMap(function(section) {
return section('section_content').map(function(item) {
return {
"code": item("code"),
"name": item("name"),
"foo": result("foo"),
"bar": result("bar"),
"baz": section("baz"),
"average": item("average"),
"lowerBound": item("from"),
"upperBound": item("to")
};
});
});
})
);
Thanks goes to #AtnNn on the #rethinkdb freenode for pointing me in the right direction.

How to access column name dynamically in Kendo Grid template

I need to access the column name dynamically in Kendo Grid template.
Code:
$("#grid").kendoGrid({
dataSource: [
{ Quantity: 2 , Amount: 650},
{ Quantity: 0, Amount: 0 },
{ Quantity: 1, Amount: 500 },
{ Quantity: 4, Amount: 1047 }
],
sortable: true,
columns: [
{
field: "Quantity",
template: function (dataItem) {
if (dataItem.Quantity == '0') {
return "--";
} else {
return dataItem.Quantity;
}
}
},
{
field: "Amount",
template: function (dataItem) {
if (dataItem.Amount == '0') {
return "--";
} else {
return dataItem.Amount;
}
}
}
]
});
Inside "columns -> template", I need to access the column through a variable instead of hardcoding it. How can I do that? In real life the dataSource will have dynamic columns, and I will construct the columns array inside a for loop. Please help.
Please access this JSBIN: http://jsbin.com/egoneWe/1/edit

From what I understand, you build the columns array using something like:
var Definition = [
{ field: "Quantity" },
{ field: "Amount" }
];
var columns = [];
$.each(Definition, function (idx, item) {
columns.push({
field : item.field,
template: function (dataItem) {
...;
}
})
});
$("#grid").kendoGrid({
dataSource: data,
sortable : true,
columns : columns
});
Right? And the problem is that you want to use the same template function for several (all) columns instead of having to rewrite many.
If so, what you can do is:
var Definition = [
{ field: "Quantity" },
{ field: "Amount" }
];
var columns = [];
$.each(Definition, function (idx, item) {
columns.push({
field : item.field,
template: function (dataItem) {
return commonTemplateFunction(dataItem, item.field);
}
})
});
What I use in the columns array (columns definition for the Grid) is a function that receives two arguments: the dataItem for the row and the field's name being edited.
Then, I define the template function as:
function commonTemplateFunction(dataItem, field) {
if (dataItem[field] == '0') {
return "--";
} else {
return dataItem[field];
}
}
And your modified code is here : http://jsbin.com/egoneWe/3/edit
So, even though the template itself cannot guess the column name, I can do the trick by passing the name in when the columns array is built.
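The same idea can be checked without a Grid by calling the generated templates directly. The following is a small plain JavaScript sketch; `makeTemplate` and the `row` object are hypothetical names used only for illustration, and no Kendo API is involved.

```javascript
// Returns a template function that remembers which field it belongs to.
function makeTemplate(field) {
  return function (dataItem) {
    // Loose comparison, as in the original template, so 0 and "0" both match.
    return dataItem[field] == "0" ? "--" : dataItem[field];
  };
}

// Build the column definitions from a list of field names.
const columns = ["Quantity", "Amount"].map((field) => ({
  field: field,
  template: makeTemplate(field)
}));

// Call each template by hand with one sample row.
const row = { Quantity: 0, Amount: 650 };
console.log(columns.map((column) => column.template(row))); // [ '--', 650 ]
```

Each template closes over its own `field` value, so the column name is available inside the function without being hardcoded.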
