Bar Chart on Dimension-1 and Stacked by Dimension-2 - dc.js

Summary
I want to display a bar chart whose dimension is days and is stacked by a different category (i.e. x-axis = days and stack = category-1). I can do this "manually" in that I can write if-then's to zero or display the quantity, but I'm wondering if there's a systematic way to do this.
JSFiddle https://jsfiddle.net/wostoj/rum53tn2/
Details
I have data with dates, quantities, and other classifiers. For the purpose of this question I can simplify it to this:
data = [
{day: 1, cat: 'a', quantity: 25},
{day: 1, cat: 'b', quantity: 15},
{day: 1, cat: 'b', quantity: 10},
{day: 2, cat: 'a', quantity: 90},
{day: 2, cat: 'a', quantity: 45},
{day: 2, cat: 'b', quantity: 15},
]
I can set up a bar chart, by day, that shows total units and I can manually add the stacks for 'a' and 'b' as follows.
var dayDim = xf.dimension(_ => _.day);
var bar = dc.barChart("#chart");
bar
.dimension(dayDim)
.group(dayDim.group().reduceSum(
_ => _.cat === 'a' ? _.quantity : 0
))
.stack(dayDim.group().reduceSum(
_ => _.cat === 'b' ? _.quantity : 0
));
This is easy when my data has only 2 categories, but I'm wondering how I'd scale it to 10 or an unknown number of categories. I imagine the pseudo-code for what I'm trying to do is something like
dc.barChart("#chart")
.dimension(xf.dimension(_ => _.day))
.stackDim(xf.dimension(_ => _.cat))
.stackGroup(xf.dimension(_ => _.cat).group().reduceSum(_ => _.quantity));

I mentioned this in my answer to your other question, but why not expand on it a little bit here.
In the dc.js FAQ there is a standard pattern for custom reductions to reduce more than one value at once.
Say that you have a field named type which determines which type of value is in the row, and the value is in a field named value (in your case these are cat and quantity). Then
var group = dimension.group().reduce(
function(p, v) { // add
p[v.type] = (p[v.type] || 0) + v.value;
return p;
},
function(p, v) { // remove
p[v.type] -= v.value;
return p;
},
function() { // initial
return {};
});
will reduce all the rows for each bin to an object where the keys are the types and the values are the sum of values with that type.
The way this works is that when crossfilter encounters a new key, it first uses the "initial" function to produce a new value. Here that value is an empty object.
Then for each row it encounters which falls into the bin labelled with that key, it calls the "add" function. p is the previous value of the bin, and v is the current row. Since we started with a blank object, we have to make sure we initialize each value; (p[v.type] || 0) will make sure that we start from 0 instead of undefined, because undefined + 1 is NaN and we hate NaNs.
We don't have to be as careful in the "remove" function, because the only way a row will be removed from a bin is if it was once added to it, so there must be a number in p[v.type].
Now that each bin contains an object with all the reduced values, the stack mixin has helpful extra parameters for .group() and .stack() which allow us to specify the name of the group/stack, and the accessor.
For example, if we want to pull items a and b from the objects for our stacks, we can use:
.group(group, 'a', kv => kv.value.a)
.stack(group, 'b', kv => kv.value.b)
It's not as convenient as it could be, but you can use these techniques to add stacks to a chart programmatically (see source).
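To make the shape of the reduced bins concrete, here is a minimal sketch in plain JavaScript using the question's cat and quantity fields. It does not load crossfilter or dc.js: reduceByDay is a hypothetical stand-in for dimension.group().reduce(add, remove, initial).all(), and the `|| 0` in each accessor is an addition of mine to cover bins that never saw a given category.

```javascript
// Sample rows from the question.
const data = [
  { day: 1, cat: 'a', quantity: 25 },
  { day: 1, cat: 'b', quantity: 15 },
  { day: 1, cat: 'b', quantity: 10 },
  { day: 2, cat: 'a', quantity: 90 },
  { day: 2, cat: 'a', quantity: 45 },
  { day: 2, cat: 'b', quantity: 15 },
];

// The three reducer callbacks from the answer, keyed on cat/quantity.
const add = (p, v) => { p[v.cat] = (p[v.cat] || 0) + v.quantity; return p; };
const remove = (p, v) => { p[v.cat] -= v.quantity; return p; };
const initial = () => ({});

// Hypothetical stand-in for crossfilter's grouping: bins rows by day,
// creating each bin with `initial` and folding rows in with `add`.
function reduceByDay(rows) {
  const bins = new Map();
  for (const row of rows) {
    if (!bins.has(row.day)) bins.set(row.day, initial());
    add(bins.get(row.day), row);
  }
  return [...bins].map(([key, value]) => ({ key, value }));
}

const bins = reduceByDay(data);

// One named accessor per category found in the data.
const categories = [...new Set(data.map(d => d.cat))].sort();
const stacks = categories.map(cat => ({
  name: cat,
  accessor: kv => kv.value[cat] || 0, // 0 when the bin has no rows of this category
}));

console.log(JSON.stringify(bins));
console.log(stacks.map(s => s.name));
```

With a real chart, the first entry of stacks would be passed to .group(group, name, accessor) and each remaining entry to .stack(group, name, accessor), which is how the list of categories can stay unknown until the data is loaded.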

Related

Google Sheets: Sort Two Columns

In a sheet I have two columns that I'd like to sort on:
ColA: Archived - True/False checkbox
ColB: SortOrder - number, a way I have at the moment to group things
Cols C-G: various details
I have a function that does this from a macro, as I'll add a button to the sheet later to run this:
function SortServices() {
var ss = SpreadsheetApp.getActive();
ss.getRange('A1:B').activate();
ss.getActiveRange().offset(1, 0, ss.getActiveRange().getNumRows() - 1).sort([{column: 1, ascending: true}, {column: 2, ascending: true}]);
};
I'm having a problem, as there are 1000 rows but only ~300 rows currently have data. Of these, about 200 are unchecked and 100 checked (as Archived). So this sorts the entire colA, then colB, which doesn't correspond to the data cols.
Struggling to explain this, imgs might be better. Here's an example, originally:
After applying the Sort:
The issue must be the rows without data, but ColA has checkboxes, which by default are unchecked. Is there a simple way to get around this? (I don't know much about Sheets)
[Edit] Sorry, should have added what I would like... the rows with no data at the end, same as the first img. When a row is checked as Archived, using Sort would then move it down into the lower section together with other Archived items:
Solution
Divide the data in two sets and sort in the desired order those that meet the requirements (have data in column B).
Code
function myFunction() {
var ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet()
var lr = ss.getLastRow()
// original values
var range = ss.getRange(1, 1, lr, 3).getValues()
// cells without data in colB
var filtered = range.filter(function (value, index, arr) {
return value[1] == "";
});
// cells with data in colB
var original = range.filter(x => !filtered.includes(x));
// write
ss.getRange(1, 1, original.length, 3).setValues(original).sort([{ column: 1, ascending: true }, { column: 2, ascending: true }]);
ss.getRange(1 + original.length, 1, filtered.length, 3).setValues(filtered)
}
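The split-then-sort idea can be tried outside of Sheets on a plain array. This is only a sketch under assumptions: the rows are made-up [archived, sortOrder, detail] triples, sortWithBlanksLast is a hypothetical helper name, and an empty string in the second column stands for a row without data.

```javascript
// Rows are [archived, sortOrder, detail]. The blank rows mimic unused sheet
// rows whose checkbox is unchecked and whose other cells are empty.
const rows = [
  [false, 3, "C"],
  [true, 1, "X"],
  [false, "", ""],
  [false, 1, "A"],
  [true, 2, "Y"],
  [false, "", ""],
];

// Hypothetical helper: sort only the rows that have a sortOrder,
// then append the blank rows unchanged.
function sortWithBlanksLast(values) {
  const hasData = row => row[1] !== "";
  const withData = values.filter(hasData);
  const blanks = values.filter(row => !hasData(row));
  // Unchecked (false) before checked (true), then ascending sortOrder.
  withData.sort((a, b) => (Number(a[0]) - Number(b[0])) || (a[1] - b[1]));
  return [...withData, ...blanks];
}

const sorted = sortWithBlanksLast(rows);
console.log(sorted.map(row => row[2]));
```

Because filter returns new arrays, the original rows array keeps its order and only the returned copy is sorted.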
Thought there would be an easier answer, but this works OK:
Inserted a new colA with formula to create a single sortable column:
=if( isblank(D2), 20000, if(B2=true, 10000+C2, C2 ) )
Then used this script:
function SortServices() {
var ss = SpreadsheetApp.getActive();
var sheet = SpreadsheetApp.getActive().getActiveSheet();
sheet.getRange(1, 1, sheet.getMaxRows(), sheet.getMaxColumns()).activate();
ss.getActiveRange().offset(1, 0, ss.getActiveRange().getNumRows() - 1).sort({column: 1, ascending: true});
};
Probably add this to a custom menu.
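For reference, the helper-column formula boils down to the following key function, sketched here in plain JavaScript. The function name and sample rows are hypothetical, and like the formula it assumes every SortOrder value is below 10000.

```javascript
// Mirrors =if( isblank(D2), 20000, if(B2=true, 10000+C2, C2 ) ):
// rows without data sort last, archived rows sort after active ones.
function sortKey(archived, sortOrder, detail) {
  if (detail === "") return 20000;
  return archived ? 10000 + sortOrder : sortOrder;
}

// Made-up rows shaped as [archived, sortOrder, detail].
const services = [
  [true, 1, "X"],
  [false, "", ""],
  [false, 3, "C"],
  [false, 1, "A"],
];

// Sort a copy of the rows on the single computed key.
const byKey = [...services].sort(
  (a, b) => sortKey(...a) - sortKey(...b)
);

console.log(byKey.map(row => row[2]));
```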

How can I mutate a list in elixir which I am iterating using Enum.map ? or need opinion on using nested recursion

I have two lists in elixir. One list (list1) has values which get consumed in another list (list2). So I need to iterate over list2 and update values in list1 as well as list2.
list1 = [
%{
reg_no: 10,
to_assign: 100,
allocated: 50
},
%{
reg_no: 11,
to_assign: 100,
allocated: 30
},
%{
reg_no: 12,
to_assign: 100,
allocated: 20
}
]
list2 = [
%{
student: student1,
quantity: 60,
reg_nos: [reg_no_10, reg_no_11]
},
%{
student: student2,
quantity: 40,
reg_nos: [reg_no_11, reg_no_12]
},
%{
student: student3,
quantity: 30,
reg_nos: nil
}
]
I need to assign values from list1 to quantity field of list2 till quantity is fulfilled. e.g. student1 quantity is 60 which will need reg_no 10 and reg_no 11.
With Enum.map I cannot pass the updated list1 to the 2nd iteration of list2 and assign reg_nos: [reg_no_11, reg_no_12] to student2.
So, my question is how can I send updated list1 to 2nd iteration in list2?
I am using recursion to get quantity correct for each element in list2. But again, should I use recursion only to send updated list1 in list2? With this approach, there will be 2 nested recursions. Is that a good approach?
If I understand your question correctly, you want to change values in a given list x, based on a list of values in another list y.
What you describe is not possible by mutating the list in place, because data in Elixir is immutable, but you can use a reduce operation where x is the state, the so-called "accumulator".
Below is an example where I have a ledger with bank accounts, and a list with transactions. If I want to update the ledger based on the transactions I need to reduce over the transactions and update the ledger per transaction, and pass the updated ledger on to the next transaction. This is the problem you are seeing as well.
As you can see in the example, in contrast to map you have a second parameter in the user-defined function (ledger). This is the "state" you build up while traversing the list of transactions. Each time you process a transaction you have a chance to return a modified version of the state. This state is then used to process the second transaction, which in turn can change it as well.
The final result of a reduce call is the accumulator. In this case, the updated ledger.
def example do
# A ledger, where we assume the id's are unique!
ledger = [%{:id => 1, :amount => 100}, %{:id => 2, :amount => 50}]
transactions = [{:transaction, 1, 2, 10}]
transactions
|> Enum.reduce(ledger, fn transaction, ledger ->
{:transaction, from, to, amount} = transaction
# Update the ledger.
Enum.map(ledger, fn entry ->
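cond do
entry.id == from -> %{entry | amount: entry.amount - amount}
entry.id == to -> %{entry | amount: entry.amount + amount}
# Entries not involved in the transaction pass through unchanged;
# without this clause cond raises for them.
true -> entry
end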
end)
end)
end
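For readers who think more easily in JavaScript, the same accumulator-passing idea can be sketched as below. This is an illustration rather than a translation of the answer's code: the third ledger entry, the second transaction, and the applyTransaction name are my own additions to show that each step receives the ledger produced by the previous one.

```javascript
// A ledger with unique ids, plus transactions to apply in order.
const ledger = [
  { id: 1, amount: 100 },
  { id: 2, amount: 50 },
  { id: 3, amount: 5 },
];
const transactions = [
  { from: 1, to: 2, amount: 10 },
  { from: 2, to: 3, amount: 20 },
];

// Returns a new ledger; the entries passed in are never modified.
function applyTransaction(entries, tx) {
  return entries.map(entry => {
    if (entry.id === tx.from) return { ...entry, amount: entry.amount - tx.amount };
    if (entry.id === tx.to) return { ...entry, amount: entry.amount + tx.amount };
    return entry; // not involved in this transaction
  });
}

// reduce threads the updated ledger (the accumulator) through every transaction.
const updated = transactions.reduce(applyTransaction, ledger);

console.log(updated);
```

The second transaction sees entry 2 with the 10 it received from the first one, which is exactly the "updated list passed to the next iteration" the question asks for.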

Rxjs GroupBy, Reduce in order to Pivot on ID

I'm looking for a bit of help understanding this example taken from the rxjs docs.
Observable.of<Obj>({id: 1, name: 'aze1'},
{id: 2, name: 'sf2'},
{id: 2, name: 'dg2'},
{id: 1, name: 'erg1'},
{id: 1, name: 'df1'},
{id: 2, name: 'sfqfb2'},
{id: 3, name: 'qfs1'},
{id: 2, name: 'qsgqsfg2'}
)
.groupBy(p => p.id, p => p.name)
.flatMap( (group$) => group$.reduce((acc, cur) => [...acc, cur], ["" + group$.key]))
.map(arr => ({'id': parseInt(arr[0]), 'values': arr.slice(1)}))
.subscribe(p => console.log(p));
So the aim here is to group all the items by id and produce an object with a single ID and a values property which includes all the emitted names with matching IDs.
The second parameter to the groupBy operator selects the value emitted for each item, effectively filtering the emitted object's properties down to the name. I suppose the same thing could be achieved by mapping the observable beforehand. Is it possible to pass more than one value to the return value parameter?
The line I am finding very confusing is this one:
.flatMap( (group$) => group$.reduce((acc, cur) => [...acc, cur], ["" + group$.key]))
I get that we now have three grouped observables (for the 3 ids) that are effectively arrays of emitted objects. With each grouped observable the aim of this code is to reduce it to an array, where the first entry in the array is the key and subsequent entries in the array are the names.
But why is the reduce function initialized with ["" + group$.key], rather than just [group$.key]?
And why is this three dot notation [...acc, cur] used when returning the reduced array on each iteration?
But why is the reduce function initialized with ["" + group$.key], rather than just [group$.key]?
The clue to answer this question is in the .map() function a bit further down in the code.
.map(arr => ({'id': parseInt(arr[0]), 'values': arr.slice(1)}))
                    ^^^^^^^^
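Note the use of parseInt. The example is TypeScript (see the Observable.of<Obj> type parameter), where parseInt expects a string. Here group$.key is a number, so without the "" + in the flatMap this simply wouldn't compile, since you'd be passing a number type to a function that expects a string. Remove the parseInt and just use arr[0] and you can remove "" + as well.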
And why is this three dot notation [...acc, cur] used when returning
the reduced array on each iteration?
The spread syntax here is used to add to the array without mutating it. But what does it do? [...acc, cur] creates a new array, copies every element of acc into it, and puts cur at the end, so acc itself is left untouched. Here is a nice blog post about object mutation in general.
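Both points can be checked without RxJS. The sketch below uses plain arrays: the Map-based grouping is a hypothetical stand-in for groupBy(p => p.id, p => p.name), and Array.prototype.reduce stands in for the observable reduce, with the same seed and the same [...acc, cur] step as in the question.

```javascript
const items = [
  { id: 1, name: 'aze1' },
  { id: 2, name: 'sf2' },
  { id: 2, name: 'dg2' },
  { id: 1, name: 'erg1' },
  { id: 1, name: 'df1' },
  { id: 2, name: 'sfqfb2' },
  { id: 3, name: 'qfs1' },
  { id: 2, name: 'qsgqsfg2' },
];

// Stand-in for groupBy: collect the names for each id.
const groups = new Map();
for (const { id, name } of items) {
  if (!groups.has(id)) groups.set(id, []);
  groups.get(id).push(name);
}

const result = [...groups].map(([key, names]) => {
  // Seed with the key as a string, then append each name to a fresh copy.
  const arr = names.reduce((acc, cur) => [...acc, cur], ["" + key]);
  return { id: parseInt(arr[0], 10), values: arr.slice(1) };
});

// Spread copies: the source array is untouched.
const seed = ["1"];
const next = [...seed, "aze1"];

console.log(JSON.stringify(result));
console.log(seed.length, next.length);
```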

Trouble with Linq in group by query

Maybe someone knows how to achieve this kind of query in linq (or lambda).
I have this set in a list
Filter: My input will be code 100 and 101, I need to get the "values", in this example = 1, 2.
Problem: If you input 100 and 101, you'll get 3 results, because 100 appears in both group 1 and group 2. I just need the pair that matches within the same group. (And you don't have the group as an input parameter.)
How can I solve this if the group fully exists?
thanks!
Starting with a simple representation in code of what you have in a picture:
var list = new[]
{
new{code = 100, value = 1, group = 1},
new{code = 101, value = 2, group = 1},
new{code = 100, value = 3, group = 2},
new{code = 103, value = 4, group = 2},
};
var inp = new[]{100, 103};
Then we can do:
list
.GroupBy(el => el.group) // Group by the "group" field.
.Where(grp => !inp.Except(grp.Select(el => el.code)).Any()) // Exclude groups that don't contain all input values
.Single() // Obtain the only such group (with a check that there is only one)
.Select(el => el.value); // Obtain the "value" fields.
If you could perhaps have inputs that were a subset of the "code" fields of some groups, you could also check that you match all of the group completely by excluding groups which have a different size:
list
.GroupBy(el => el.group)
.Where(grp =>
grp.Count() == inp.Count()
&& !inp.Except(grp.Select(el => el.code)).Any())
.Single()
.Select(el => el.value);
There are other variations that match other possible interpretations of your question (e.g. I'm assuming there can be only one matching group, but that wasn't clear).
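The group-then-filter logic of the first query can also be sketched in JavaScript, which makes it easy to try different inputs. This is an illustration only: valuesForCodes is a hypothetical name, and throwing when the number of matching groups is not exactly one is my way of imitating what Single() does.

```javascript
const list = [
  { code: 100, value: 1, group: 1 },
  { code: 101, value: 2, group: 1 },
  { code: 100, value: 3, group: 2 },
  { code: 103, value: 4, group: 2 },
];

function valuesForCodes(rows, codes) {
  // Group rows by their "group" field.
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.group)) groups.set(row.group, []);
    groups.get(row.group).push(row);
  }
  // Keep only groups that contain every requested code.
  const matches = [...groups.values()].filter(members =>
    codes.every(code => members.some(m => m.code === code))
  );
  // Imitate Single(): exactly one group must match.
  if (matches.length !== 1) {
    throw new Error(`expected exactly one matching group, found ${matches.length}`);
  }
  return matches[0].map(m => m.value);
}

console.log(valuesForCodes(list, [100, 103]));
console.log(valuesForCodes(list, [100, 101]));
```

Passing only [100] throws, because both groups contain that code; that is the ambiguous case the size check in the second query is meant to narrow down.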

Dimple Specific Axis Variable Value

My Data is nested in the hopes of adding in a value slider to dynamically change the year. One entry looks like:
key: "dot1"
values: Array[3]
[0]: Object
x: 3
y: 2
[1]: Object
x: 2
y: 5
[2]: Object
x: 3
y: 5
key: "dot2"
etc...
Is there a way to access the values to graph? Something like:
var chart = new dimple.chart(svg, rawData);
chart.setBounds(90, 35, 480, 325)
var myyAxis = chart.addMeasureAxis("y", "values[0][y]");
var myxAxis = chart.addMeasureAxis("x", "values[0][x]");
chart.addSeries(["key"], dimple.plot.bubble);
chart.draw(1000);
You can't add a computed string as a measure property, so the easiest way I know to accomplish this is to format the data to flatten out your internal arrays, so instead of
key: "dot1"
values: Array[3]
[0]: Object
x: 3
y: 2
[1]: Object
x: 2
y: 5
[2]: Object
x: 3
y: 5
you end up with
[
{ 'key' : 'dot1', 'x' : 3, 'y' : 2 },
{ 'key' : 'dot1', 'x' : 2, 'y' : 5 },
{ 'key' : 'dot1', 'x' : 3, 'y' : 5 },
]
I got a basic example of this working here : http://jsbin.com/rijuvoneyu/1/edit?js,output
It uses underscore.js to map each row in the dataset, map each value array inside (so for every 1 row in the dataset you end up with 3), then flatten the entire structure out. Now you can just use 'x' and 'y' as the properties for the measure axes.
You'll notice I added a unique identifier for each new data row. Without that, if I only did
var series = chart.addSeries(["key"], dimple.plot.bubble);
dimple was adding all the values sharing the same 'key' together, so only plotting two bubbles. I'm not sure if there is a better way to not have that happen without the unique id, but it works well enough.
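The flattening step can be sketched without underscore.js, using the built-in Array.prototype.flatMap instead. The id field name and its key-index format are assumptions of mine for illustration, as is the second entry in the sample data; the linked example may build its identifier differently.

```javascript
// Nested input shaped like the question's data.
const nested = [
  { key: 'dot1', values: [{ x: 3, y: 2 }, { x: 2, y: 5 }, { x: 3, y: 5 }] },
  { key: 'dot2', values: [{ x: 1, y: 4 }] }, // hypothetical second entry
];

// One flat row per inner value, tagged with its parent key and a unique id
// so that rows sharing a key are not summed into a single bubble.
const flat = nested.flatMap(entry =>
  entry.values.map((point, index) => ({
    id: `${entry.key}-${index}`,
    key: entry.key,
    x: point.x,
    y: point.y,
  }))
);

console.log(flat);
```

The flat array can then be handed to the chart in place of rawData, with 'x' and 'y' as the measure axis fields.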
