How do I add items with a score above x to goodItems for precision metric in Lenskit 3.0? - lenskit

I'd like to add the precision metric and use only items with a rating of at least 4.0 as 'goodItems'.
In Lenskit 2 this could be done with:
metric precision {
listSize 10
candidates ItemSelectors.addNRandom(ItemSelectors.testItems(), 100)
exclude ItemSelectors.trainingItems()
goodItems ItemSelectors.testRatingMatches(Matchers.greaterThanOrEqualTo(4.0d))
}
Now I'm trying to do the same in Lenskit 3 with Gradle, but obviously
metric('pr') {
goodItems 'ItemSelectors.testRatingMatches(Matchers.greaterThanOrEqualTo(4.0d))'
}
doesn't work, since there is no ItemSelectors class in Lenskit 3.0.
How can I link the goodItems with the appropriate items and discard low-rated items in order to achieve a correct precision value?

As suggested by Mr. Ekstrand, you can select the good items by adding the following line to the Gradle build file:
goodItems 'user.testHistory.findAll({ it instanceof org.lenskit.data.ratings.Rating && it.value >= 4 })*.itemId'
However, the script result comes back as an Object, and the ItemSelector class casts it directly to a Set. That cast fails because the returned object is actually an ArrayList. If I'm correct, the list has to be converted to a set rather than cast. I did this by copying the ItemSelector class and replacing:
Set<Long> set = (Set<Long>) script.run();
by:
Set<Long> set = new HashSet<Long>((ArrayList<Long>)script.run());
This returns the correct items from my test set, those rated 4.0 or above.

This goodItems expression should work, since toSet() converts the list to a Set:
user.testHistory.findAll({ it instanceof org.lenskit.data.ratings.Rating && it.value >= 4 })*.itemId.toSet()

Related

Dynamics crm + Plugin code to store sum formula across a entity collection

I have the below requirement to be implemented in a plugin code on an Entity say 'Entity A'-
Below is the data in 'Entity A'
Record 1 with field values
Price = 100
Quantity = 4
Record 2 with field values
Price = 200
Quantity = 2
I need to do 2 things
Add the values of the fields and update it in a new record
Store the Addition Formula in a different config entity
Example shown below -
Record 3
Price
Price Value = 300
Formula Value = 100 + 200
Quantity
Quantity Value = 6
Formula Value = 4 + 2
Entity A has a button named "Perform Addition" and once clicked this will trigger the plugin code.
Below is the code that I have tried.
AttributeList is the list of fields I need to sum. All fields are decimal.
Entity EntityA = new EntityA();
EntityA.Id = new Guid("Guid String");
var sourceEntityDataList = service.RetrieveMultiple(new FetchExpression(fetchXml)).Entities;
foreach (var value in AttributeList)
{
EntityA[value]= sourceEntityDataList.Sum(e => e.Contains(value) ? e.GetAttributeValue<Decimal>(value) : 0);
}
service.Update(EntityA);
Is there a way to store the formula using LINQ, without looping?
If not, how can I achieve this?
Any help would be appreciated.
Here are some thoughts:
It's interesting that you're calculating values from multiple records and populating the result onto a sibling record rather than a parent record. This is different than a typical "rollup" calculation.
Dynamics uses the SQL sequential GUID generator to generate its ids. If you're generating GUIDs outside of Dynamics, you might want to look into leveraging the same logic.
Here's an example of how you might refactor your code with LINQ:
var target = new Entity("entitya", new Guid("guid"));
var entities = service.RetrieveMultiple(new FetchExpression(fetchXml)).Entities.ToList();
attributes.ForEach(a => target[a] = entities.Sum(e => e.GetAttributeValue<Decimal>(a)));
service.Update(target);
The GetAttributeValue<Decimal>() method defaults to 0, so we can skip the Contains call.
As far as storing the formula on a config entity goes, if you're looking for the capability to store and use any formula, you'll need a full expression parser, along the lines of this calculator example.
Whether you'll be able to do the Reflection required in a sandboxed plugin is another question.
If, however, you have a few set formulas, you can code them all into the plugin and determine which to use at runtime based on the entities' properties and/or config data.
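For the simple case in the question, where the "formula" is just the summed values joined by plus signs, the value and the formula text can be built from the same list of field values. The sketch below is in JavaScript rather than plugin C#, and the names records, attributes, and summarize are hypothetical. In C# the same shape would map onto LINQ's Select and Sum plus string.Join.

```javascript
// Sample data taken from the question: two records of 'Entity A'.
const records = [
  { price: 100, quantity: 4 },
  { price: 200, quantity: 2 }
];

// Hypothetical list of fields to sum.
const attributes = ["price", "quantity"];

// Collect one field's values once, then derive both the total and the
// formula text from that same list. Missing values count as 0.
function summarize(rows, attribute) {
  const values = rows.map(r => (r[attribute] !== undefined ? r[attribute] : 0));
  return {
    value: values.reduce((total, v) => total + v, 0),
    formula: values.join(" + ")
  };
}

// One { value, formula } pair per attribute.
const result = Object.fromEntries(
  attributes.map(a => [a, summarize(records, a)])
);

console.log(result.price);    // value 300, formula "100 + 200"
console.log(result.quantity); // value 6, formula "4 + 2"
```

The iteration is still there, hidden inside map and reduce. What the approach avoids is a second pass just to rebuild the formula text.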

How to get dynamic field count in dc.js numberDisplay?

I'm currently trying to figure out how to display a count of unique records using dc.js and D3.js.
The data set looks like this:
id,name,artists,genre,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,duration_ms,time_signature
6DCZcSspjsKoFjzjrWoCd,God's Plan,Drake,Hip-Hop/Rap,0.754,0.449,7,-9.211,1,0.109,0.0332,8.29E-05,0.552,0.357,77.169,198973,4
3ee8Jmje8o58CHK66QrVC,SAD!,XXXTENTACION,Hip-Hop/Rap,0.74,0.613,8,-4.88,1,0.145,0.258,0.00372,0.123,0.473,75.023,166606,4
There are 100 records in the data set, and I would expect the count to display 70 for the count of unique artists.
var ndx = crossfilter(spotifyData);
totalArtists(ndx);
....
function totalArtists(ndx) {
// Select the artists
var totalArtistsND = dc.numberDisplay("#unique-artists");
// Count them
var dim = ndx.dimension(dc.pluck("artists"));
var uniqueArtist = dim.groupAll();
totalArtistsND.group(uniqueArtist).valueAccessor(x => x);
totalArtistsND.render();
}
I am only getting 100 as a result when I should be getting 70.
Thanks a million, any help would be appreciated
You are on the right track - a groupAll object is usually the right kind of object to use with dc.numberDisplay.
However, dimension.groupAll doesn't use the dimension's key function. Like any groupAll, it looks at all the records and returns one value; the only difference between dimension.groupAll() and crossfilter.groupAll() is that the former does not observe the dimension's filters while the latter observes all filters.
If you were going to use dimension.groupAll, you'd have to write reduce functions that watch the rows as they are added and removed, and keep a count of how many unique artists they have seen. That sounds tedious and possibly buggy.
Instead, we can write a "fake groupAll", an object whose .value() method returns a value dynamically computed according to the current filters.
The ordinary group object already has a unique count: the number of bins. So we can create a fake groupAll which wraps an ordinary group and returns the length of the array returned by group.all():
function unique_count_groupall(group) {
return {
value: function() {
return group.all().filter(kv => kv.value).length;
}
};
}
Note that we also have to filter out any bins of value zero before counting.
Use the fake groupAll like this:
var uniqueArtist = unique_count_groupall(dim.group());
Demo fiddle.
I just added this to the FAQ.
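To see why the zero-value bins have to be filtered out, here is a small self-contained sketch. It does not use crossfilter: makeStubGroup is a hypothetical stand-in that mimics one behavior of a crossfilter group, namely that a bin emptied by a filter stays in all() with a value of 0. The third sample row is made up.

```javascript
// The fake groupAll from the answer: counts bins with a non-zero value.
function unique_count_groupall(group) {
  return {
    value: function () {
      return group.all().filter(kv => kv.value).length;
    }
  };
}

// Hypothetical stand-in for a crossfilter group. It recounts the rows that
// pass the current filter but keeps every key, so emptied bins report 0.
function makeStubGroup(rows, keyFn) {
  const keys = [...new Set(rows.map(keyFn))];
  let predicate = () => true;
  return {
    filter(fn) { predicate = fn || (() => true); },
    all() {
      return keys.map(key => ({
        key,
        value: rows.filter(r => keyFn(r) === key && predicate(r)).length
      }));
    }
  };
}

const rows = [
  { name: "God's Plan", artists: "Drake" },
  { name: "Track X", artists: "Drake" },        // made-up row
  { name: "SAD!", artists: "XXXTENTACION" }
];

const group = makeStubGroup(rows, r => r.artists);
const uniqueArtists = unique_count_groupall(group);

const allCount = uniqueArtists.value();          // both artists have rows
group.filter(r => r.artists === "Drake");
const filteredCount = uniqueArtists.value();     // only Drake's bin is non-zero
const binsWhenFiltered = group.all().length;     // the emptied bin is still listed

console.log(allCount, filteredCount, binsWhenFiltered);
```

Without the filter on kv.value, the count after filtering would be the number of bins rather than the number of artists still present.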

Spring data mongo - Get sum of array of object

I have the following document:
{
pmv: {
budgets: [
{
amount: 10
},
{
amount: 20
}
]
}
}
and I need to sum the amount field from every object in budgets. But it's also possible that the budget object doesn't exist so I need to check that.
How could I do this? I've seen many questions, but they use projections. I just need an integer, which in this case would be 30.
How can I do it?
Thanks.
EDIT 1 FOR PUNIT
This is the code I tried, but it's giving me an empty array:
AggregationOperation filter = match(Criteria.where("pmv.budgets").exists(true).not().size(0));
AggregationOperation unwind = unwind("pmv.budgets");
AggregationOperation sum = group().sum("budgets").as("amount");
Aggregation aggregation = newAggregation(filter, unwind, sum);
mongoTemplate.aggregate(aggregation,"Iniciativas",String.class);
AggregationResults<String> aggregationa = mongoTemplate.aggregate(aggregation,"Iniciativas",String.class);
List<String> results = aggregationa.getMappedResults();
You can do this with aggregation pipeline
db.COLLECTION_NAME.aggregate([
{$match:{"pmv.budgets":{$exists:true,$not:{$size:0}}}},
{$unwind:"$pmv.budgets"},
{$group:{_id:null,amount:{$sum:"$pmv.budgets.amount"}}}
]);
This pipeline contains three stages:
$match keeps only the documents that have a non-empty budgets array
$unwind opens the array and creates one document for each array element. For example, if a document's budgets array has 3 elements, it produces 3 documents, each with budgets set to one of those elements. You can read more about it here
$group adds up the amount field of every unwound budgets element using the $sum operator
You can read more about aggregation pipeline here
EDIT: as per comments, adding code for java as well.
AggregationOperation filter = match(Criteria.where("pmv.budgets").exists(true).not().size(0));
AggregationOperation unwind = unwind("pmv.budgets");
AggregationOperation sum = group().sum("pmv.budgets.amount").as("amount");
Aggregation aggregation = newAggregation(filter, unwind, sum);
mongoTemplate.aggregate(aggregation,COLLECTION_NAME,Output.class);
You can write this more compactly, but I spelled out each step so that it is easier to follow.
I hope this answers your question.
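As a way to check the expected result, the filter, unwind, and sum steps can be mirrored in plain JavaScript over in-memory documents. This is only a sketch of what the aggregation computes, not MongoDB or Spring code. The second and third documents are hypothetical, added to cover the missing and empty cases.

```javascript
const docs = [
  { pmv: { budgets: [{ amount: 10 }, { amount: 20 }] } }, // document from the question
  { pmv: {} },                                            // hypothetical: no budgets field
  { pmv: { budgets: [] } }                                // hypothetical: empty budgets array
];

// Filter step: keep documents whose budgets array exists and is not empty.
const matched = docs.filter(
  d => d.pmv && Array.isArray(d.pmv.budgets) && d.pmv.budgets.length > 0
);

// Unwind step: one entry per element of each budgets array.
const unwound = matched.flatMap(d => d.pmv.budgets);

// Sum step: add up the amount field of every unwound element.
const amount = unwound.reduce((total, budget) => total + budget.amount, 0);

console.log(amount); // 30
```

Note that the sum reads the amount field of each element. Summing the budgets element itself, which is an object, would not give a number.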

Azure Search Scoring Profile Magnitude by Downloads

I am new to Azure Search, so I just want to run this by you before I try to implement it. We have a search set up on items, and we want to score/rank the results based on the initial score and on how many times the item has been used/downloaded. We want the most-downloaded items to appear at the top of the result list.
We have a separate field in the search index that contains the used/download count (itemCount).
I know I have to set up a Magnitude profile, but I am not sure what to use for the range, as itemCount can be anywhere from 0 to N. Do I just set the range end to some large number, e.g. 100,000,000, or is there a best practice?
var functionRankByDownload = new MagnitudeFunction()
{
Boost = 1000,
BoostingRangeStart = 0,
BoostingRangeEnd = 100000000,
ConstantBoostBeyondRange = true,
FieldName = "itemCount",
Interpolation = InterpolationTypes.Linear
};
scoringProfile1.Functions = new List() { functionRankByDownload };
I found the score calculation is as follows:
((initialScore * boost * itemCount) - min) / (max-min)
So it seems like it should work ok having a large value for the max but again just wanting to know the best practice.
Thanks!
That seems reasonable. The BoostingRangeEnd can be any reasonable upper bound for your range, depending on the scenario. Since you are using ConstantBoostBeyondRange, values above the end of the range will still be boosted appropriately.
You might also want to experiment with the boost value for a large range like this and see if a bigger boost value is more helpful for your scenario.
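One way to reason about the range before experimenting is to look at how a linear mapping spreads typical download counts across it. The sketch below is plain arithmetic and an assumption, not Azure Search's actual scoring code: normalizedPosition is a hypothetical helper that maps a value linearly onto 0..1 and clamps values beyond the range end to 1, loosely mirroring ConstantBoostBeyondRange = true.

```javascript
// Hypothetical helper: linear position of a value inside [start, end],
// clamped to the 0..1 interval. Not an Azure Search API.
function normalizedPosition(value, start, end) {
  const position = (value - start) / (end - start);
  return Math.min(1, Math.max(0, position));
}

// With the range end from the question, typical counts sit almost at zero.
const wideLow = normalizedPosition(50, 0, 100000000);
const wideHigh = normalizedPosition(5000, 0, 100000000);

// With a tighter, hypothetical range end, the same counts are well separated.
const tightLow = normalizedPosition(50, 0, 10000);
const tightHigh = normalizedPosition(5000, 0, 10000);

// Values beyond the range end are clamped to the top of the range.
const beyond = normalizedPosition(200000000, 0, 100000000);

console.log(wideLow, wideHigh, tightLow, tightHigh, beyond);
```

If most items have counts far below the range end, they occupy a tiny slice of the range, which is one reason to experiment with either a larger boost or a range end closer to the counts you actually see.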

Kendo UI Grid sum aggregate on a column with an optional property

I'm using the Aggregates feature of a Kendo UI Grid Widget to display the sum of a grouped column. The column in question is bound to an optional field / property so when the data set includes a data item where property is not present (i.e. undefined), the sum ends up being NaN as you would expect.
The Kendo DataSource calculates these aggregates "when the data source populates with data" but does not include a feature to allow custom aggregates that would allow me to implement a version of sum that substitutes 0 for undefined values.
I can see where the sum aggregate function is defined in kendo.data.js but I would prefer not to change that if possible.
I have an idea how to solve this by writing a function to query the $("#myGridId").data("kendoGrid").dataSource but I'd like to know if there is a better option.
After comparing the latest code in kendo.data.js to my project version (2013.1.319) I see that they have changed the implementation of the sum aggregate function to handle this case by only performing the addition if the value is a number. Problem solved if I can get the project updated to the latest version of Kendo UI.
The code snippet below is at line 1369 of version 2014.1.416.
sum: function(accumulator, item, accessor) {
var value = accessor.get(item);
if (!isNumber(accumulator)) {
accumulator = value;
} else if (isNumber(value)) {
accumulator += value;
}
return accumulator;
}
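To see the effect of the guarded version, the sketch below runs the quoted sum over data with a missing property and compares it with an unguarded sum. The isNumber helper, the accessor object, and unguardedSum are assumptions written for this illustration. They are not copied from the Kendo UI source, and unguardedSum is not the 2013 implementation.

```javascript
// Assumed helper: true only for real numbers, false for undefined and NaN.
function isNumber(value) {
  return typeof value === "number" && !isNaN(value);
}

// The sum aggregate as quoted from version 2014.1.416.
function sum(accumulator, item, accessor) {
  var value = accessor.get(item);
  if (!isNumber(accumulator)) {
    accumulator = value;
  } else if (isNumber(value)) {
    accumulator += value;
  }
  return accumulator;
}

// Unguarded sum for contrast: adds every value, including undefined.
function unguardedSum(accumulator, item, accessor) {
  var value = accessor.get(item);
  return accumulator === undefined ? value : accumulator + value;
}

// Hypothetical accessor and data: the second item lacks the optional field.
const accessor = { get: item => item.amount };
const items = [{ amount: 5 }, {}, { amount: 7 }];

const guarded = items.reduce((acc, item) => sum(acc, item, accessor), undefined);
const unguarded = items.reduce((acc, item) => unguardedSum(acc, item, accessor), undefined);

console.log(guarded);   // 12
console.log(unguarded); // NaN
```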
