Querying counts from large datasets using eloquent - laravel

I have the following relationships:
A Job has many Roles.
public function roles()
{
return $this->hasMany(Role::class);
}
A Role has many Shifts and Assignments through shifts.
public function shifts()
{
return $this->hasMany(Shift::class);
}
public function assignments()
{
return $this->hasManyThrough(Assignment::class, Shift::class);
}
A Shift has many Assignments.
public function assignments()
{
return $this->hasMany(Assignment::class);
}
I need to get a count of all assignments with a certain status, let's say "Approved". These counts are causing my application to go extremely slowly. Here is how I have been doing it:
foreach ($job->roles as $role){
foreach ($role->shifts as $shift) {
$pendingCount = $shift->assignments()->whereStatus("Pending")->count();
$bookedCount = $shift->assignments()->whereIn('status', ["Booked"])->count();
}
}
I am certain that there must be a better, faster way. Some of these queries take upwards of 30 seconds. There are hundreds of thousands of Assignments, which I know is affecting performance. How can I speed up these queries?

You're running into the N+1 query problem here several times over. Eager load the nested assignments through the job's roles, then read the relation as a property. Your where() and whereIn() calls then run on the already-loaded collection instead of on the query builder, which is why my example uses where('status', "Pending") rather than whereStatus("Pending"), since a collection does not resolve that dynamic constraint:
$job = Job::with('roles.assignments')->find($jobId);
foreach ($job->roles as $role) {
$pendingCount = $role->assignments->where('status', "Pending")->count();
$bookedCount = $role->assignments->whereIn('status', ["Booked"])->count();
}
This should be a lot quicker for you.
UPDATE
You could even take that one step further and map the result and store the results in a property on the role:
$job->roles->map(function($role) {
$role->pending_count = $role->assignments->where('status', "Pending")->count();
$role->booked_count = $role->assignments->whereIn('status', ["Booked"])->count();
return $role;
});
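If the assignments are already loaded into memory, the per-status counts can also be tallied in a single pass instead of filtering the collection once per status. A minimal sketch in plain PHP, assuming each assignment is an array with a `status` key; the `$assignments` sample rows are hypothetical and only stand in for an eager-loaded relation:

```php
<?php
// Hypothetical sample rows standing in for an eager-loaded assignments relation.
$assignments = [
    ['id' => 1, 'status' => 'Pending'],
    ['id' => 2, 'status' => 'Booked'],
    ['id' => 3, 'status' => 'Pending'],
    ['id' => 4, 'status' => 'Approved'],
];

// Tally every status in one pass over the rows.
$countsByStatus = array_count_values(array_column($assignments, 'status'));

// Statuses with no rows are absent from the tally, so default them to 0.
$pendingCount  = $countsByStatus['Pending'] ?? 0;
$bookedCount   = $countsByStatus['Booked'] ?? 0;
$declinedCount = $countsByStatus['Declined'] ?? 0;

echo "Pending: $pendingCount, Booked: $bookedCount\n";
```

With hundreds of thousands of assignments, loading every row may itself be expensive, so a grouped count done in the database is likely the better fit. This sketch only shows the tallying idea.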

Related

Refactoring multiple same queries with different value

Is there a better way to refactor this code? It's basically the same line with different values. I thought of doing a for loop but maybe there's another way.
$date = $request->date;
$underConsideration = Application::whereDate('created_at',$date)->where('status','Under consideration')->count();
$phoneScreened = Application::whereDate('created_at',$date)->where('status','Phone screened')->count();
$interviewed = Application::whereDate('created_at',$date)->where('status','Interviewed')->count();
$offerExtended = Application::whereDate('created_at',$date)->where('status','Offer extended')->count();
You should create a separate method for this.
// Assumes Illuminate\Http\Request is imported so Laravel can inject it.
public function myMethod(Request $request)
{
$under_consideration = $this->fetchApplicationData($request, 'Under consideration');
$phone_screened = $this->fetchApplicationData($request, 'Phone screened');
$interviewed = $this->fetchApplicationData($request, 'Interviewed');
$offer_extended = $this->fetchApplicationData($request, 'Offer extended');
}
private function fetchApplicationData($request, $status)
{
return Application::
whereDate('created_at', $request->date)
->where('status', $status)
->count();
}
That is much cleaner code.
However, I would advise putting the statuses:
Under consideration, Phone screened, Interviewed, Offer extended
into a separate table in the database and saving only the status_id on your main table.
One major advantage of this is speed. For example, suppose your client wants every record with a status of Interviewed within a certain date span. Comparing integers in the database is generally faster than comparing strings.
You could create a method to handle the select operations. Something like:
public function yourExistingMethod(Request $request) {
$date = $request->date;
// getData() is an instance method, so it must be called through $this.
$underConsideration = $this->getData($date, 'Under consideration');
$phoneScreened = $this->getData($date, 'Phone screened');
$interviewed = $this->getData($date, 'Interviewed');
$offerExtended = $this->getData($date, 'Offer extended');
}
public function getData($date, $status) {
return Application::whereDate('created_at', $date)->where('status', $status)->count();
}
This would at the very least improve the maintainability of your code, and in my opinion improves reusability and readability.
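The for-loop idea from the question can be sketched by driving the counts from a list of statuses. This is plain PHP under stated assumptions: `countApplications()` and the `$applications` sample rows are hypothetical stand-ins for the `Application::whereDate(...)->where(...)->count()` query, not part of the original code.

```php
<?php
// Hypothetical stand-in for the Eloquent count query:
// counts the rows that match both a date and a status.
function countApplications(array $applications, string $date, string $status): int
{
    $matches = array_filter($applications, function (array $row) use ($date, $status) {
        return $row['created_at'] === $date && $row['status'] === $status;
    });

    return count($matches);
}

// Hypothetical sample data.
$applications = [
    ['created_at' => '2020-01-15', 'status' => 'Interviewed'],
    ['created_at' => '2020-01-15', 'status' => 'Phone screened'],
    ['created_at' => '2020-01-15', 'status' => 'Interviewed'],
    ['created_at' => '2020-01-16', 'status' => 'Interviewed'],
];

$statuses = ['Under consideration', 'Phone screened', 'Interviewed', 'Offer extended'];

// One loop replaces the four near-identical lines; results are keyed by status.
$counts = [];
foreach ($statuses as $status) {
    $counts[$status] = countApplications($applications, '2020-01-15', $status);
}

print_r($counts);
```

Keying the results by status keeps the list of statuses in one place, so adding a fifth status would mean changing only the `$statuses` array.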

How to eager load (one to many) average and sort

I need to get the average rating from 'Reviews', which is one-to-many with 'Recipes', and be able to sort by it. Here is what I have tried so far, without success.
// IN RECIPES
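// Named reviews() because the rest of the code calls $this->reviews() and withCount('reviews').
public function reviews() {
return $this->hasMany(Review::class);
}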
public function avgRating() {
return $this->reviews()
->selectRaw('avg(rating) as rating, recipe_id')
->groupBy('recipe_id');
}
public function avgRating() { // ALWAYS RETURNS EAGER LOAD ERROR AS IT'S A FLOAT, SO NOT USED
return $this->reviews->avg('rating');
}
// IN CONTROLLER
$query = Recipe::select();
$query->withCount('reviews'); // GIVES ME COUNT OF REVIEWS WHICH I NEED
// PROVIDES avgRating: {recipeId, avgRating},
// WHICH I DON'T KNOW HOW TO SORT, AND WOULD PREFER AS avgRating: avgRating;
$query->with('avgRating');
// PROVIDES reviews: [{recipe_id, rating}]
// WHICH ISN'T AVERAGE SO NOT EASILY SORTABLE
$query->with('reviews:recipe_id,rating');
$query->paginate(8);
This would be my preferred output (simplified from paginate)
{
recipes: {
otherdata: x,
avgRating: y,
reviews_count: z,
}
}
I don't necessarily want to have to append avgRating to this recipe all the time. Would it be possible to have a dynamic append?
EDIT: I'm not sure whether a fix was ever posted on Stack Overflow, so here is the solution I eventually came up with. It might not be the most efficient one, but it works. If anyone can improve it, I'd be happy to update this.
$query->withCount([
'reviews as reviewCount',
'reviews as rating' => function($query) {
$query->select(\DB::raw('coalesce(avg(rating),0)'));
}
]);
$query->orderBy('rating', 'desc');

Parent, Child, Child of Child in Laravel

I have three models, Province, City and Job.
Province has the following:
public function cities() {
return $this->hasMany('City');
}
City has the following:
public function province() {
return $this->belongsTo('Province', 'province_id');
}
public function jobs() {
return $this->hasMany('Job');
}
Job has the following:
public function city() {
return $this->belongsTo('City', 'city_id');
}
I am trying to get the total number of jobs in each province and the following doesn't work. Would appreciate if someone could point out what I am doing wrong?
$province->cities->jobs->count()
Thanks!
Eloquent has a ready-to-use solution for this; there is no need to load everything and add up counts on the collections.
// Province
public function jobs()
{
return $this->hasManyThrough('Job', 'City');
}
then just:
$province->jobs()->count();
It runs a simple SELECT count(*) without loading redundant collections.
Additionally, if you need to eager load that count on the collection of provinces, then use this:
public function jobsCount()
{
return $this->jobs()->selectRaw('count(*) as aggregate')->groupBy('province_id');
}
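public function getJobsCountAttribute()
{
if ( ! array_key_exists('jobsCount', $this->relations)) $this->load('jobsCount');
// A province with no jobs has no aggregate row, so fall back to 0.
$related = $this->getRelation('jobsCount')->first();
return $related ? $related->aggregate : 0;
}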
With this you can easily get the count for multiple provinces at once (with just 2 queries executed):
$provinces = Province::with('jobsCount')->get();
foreach ($provinces as $province)
{
$province->jobsCount;
}
This is because cities is a Collection, so you have to loop through each city, get its job count, and add the counts up.
In the controller, like so:
$job_count = 0;
// Capture $job_count by reference so the closure updates the outer variable.
$province->cities->each(function ($city) use (&$job_count) {
$job_count += $city->jobs->count();
});
$job_count will then equal the total number of jobs across all of the province's cities.
Please note: be sure to eager load your relations to reduce the number of queries made against your database.
$province = Province::with('cities', 'cities.jobs')...
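One detail of the loop-and-sum approach is that a PHP closure copies the variables listed in `use` unless they are captured by reference. A minimal sketch with plain arrays, where the `$cities` data is hypothetical and only stands in for `$province->cities` with their jobs:

```php
<?php
// Hypothetical data: each city holds a list of job titles.
$cities = [
    ['name' => 'A', 'jobs' => ['cook', 'driver']],
    ['name' => 'B', 'jobs' => ['nurse']],
    ['name' => 'C', 'jobs' => []],
];

// Captured by value: the closure adds to its own copy, so the outer total stays 0.
$byValue = 0;
array_walk($cities, function (array $city) use ($byValue) {
    $byValue += count($city['jobs']);
});

// Captured by reference: the closure updates the outer variable.
$byReference = 0;
array_walk($cities, function (array $city) use (&$byReference) {
    $byReference += count($city['jobs']);
});

echo "by value: $byValue, by reference: $byReference\n";
```

The same rule applies to a closure passed to a collection's each() method: without the `&`, the total outside the closure is never updated.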

Testing that an array is ordered randomly

I am testing my code with PHPUnit. My code has several ordering methods: by name, age, count and random. Below are the implementation and the test for sorting by count. These are pretty trivial.
class Cloud {
//...
public function sort($by_property) {
usort($this->tags, array($this, "cb_sort_by_{$by_property}"));
return $this;
}
private function cb_sort_by_name($a, $b) {
$al = strtolower($a->get_name());
$bl = strtolower($b->get_name());
if ($al == $bl) {
return 0;
}
return ($al > $bl) ? +1 : -1;
}
/**
* Sort Callback. High to low
*/
private function cb_sort_by_count($a, $b) {
$ac = $a->get_count();
$bc = $b->get_count();
if ($ac == $bc) {
return 0;
}
return ($ac < $bc) ? +1 : -1;
}
}
Tested with:
/**
* Sort by count. Highest count first.
*/
public function testSortByCount() {
//Jane->count: 200, Blackbeard->count: 100
//jane and blackbeard are mocked "Tags".
$this->tags = array($this->jane, $this->blackbeard);
$expected_order = array("jane", "blackbeard");
$given_order = array();
$this->object->sort("count");
foreach($this->object->get_tags() as $tag) {
$given_order[] = $tag->get_name();
}
$this->assertSame($expected_order, $given_order);
}
But now, I want to add "random ordering"
/**
* Sort random.
*/
public function testSortRandom() {
//what to test? That "shuffle" got called? That the resulting array
// has "any" ordering?
}
The implementation could be anything from calling shuffle($this->tags) to a usort callback that returns 0,-1 or +1 randomly. Performance is an issue, but testability is more important.
How to test that the array got ordered randomly? AFAIK it is very hard to stub global methods like shuffle.
Assuming you are using shuffle, your method should look like this:
public function sortRandom() {
// shuffle() reorders the array in place and returns a bool, so return the array itself.
shuffle($this->tags);
return $this->tags;
}
You don't need to test whether the keys are shuffled, only that an array is still returned.
public function testSortRandom() {
$this->assertTrue(is_array($this->object->sortRandom()));
}
You should test your code, not PHP core code.
This is actually not really possible in any meaningful sense. If you had a list with just a few items in it, it would be entirely possible for a random sort to look as if it were sorted by some given field (and as it happens, the odds of matching the order produced by another sort are pretty high if you don't have too many elements).
Unit testing a sort operation seems a bit daft to me, though, if the operation doesn't actually manipulate the data in any way. It feels like unit testing for its own sake rather than measuring that something works as intended.
I decided to implement this with a global-wrapper:
class GlobalWrapper {
public function shuffle(&$array) {
shuffle($array);
}
}
In the sort, I call shuffle through that wrapper:
public function sort($by_property) {
if ($by_property == "random") {
$this->global_wrapper()->shuffle($this->tags);
}
//...
}
Then, in the tests I can mock that GlobalWrapper and provide stubs for global functions that are of interest. In this case, all I am interested in, is that the method gets called, not what it outputs[1].
public function testSortRandomUsesShuffle() {
$global = $this->getMock("GlobalWrapper", array("shuffle"));
$global->expects($this->once())
->method("shuffle");
$this->object->set_global_wrapper($global);
$this->object->sort("random");
}
[1] In reality I have Unit Tests for this wrapper too, testing the parameters and the fact it does a call-by-ref. Also, this wrapper was already implemented (and called DrupalWrapper) to allow me to stub certain global functions provided by a third party (Drupal). This implementation allows me to pass in the wrapper using a set_drupal() and fetch it using drupal(). In above examples, I have called these set_global_wrapper() and global_wrapper().
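Even when the resulting order cannot be asserted, one property that holds for any shuffle is that it only rearranges the elements. A rough sketch of that check in plain PHP, outside PHPUnit; `isPermutationOf()` is a hypothetical helper and the tag names are sample data, neither is part of the code above:

```php
<?php
// Hypothetical helper: true when both arrays hold the same values, in any order.
function isPermutationOf(array $a, array $b): bool
{
    // sort() works on local copies and reindexes them, so === compares values only.
    sort($a);
    sort($b);

    return $a === $b;
}

$tags = ['jane', 'blackbeard', 'anne', 'calico'];
$shuffled = $tags;
shuffle($shuffled);

// The order is unpredictable, but no tag may be lost, added, or duplicated.
var_dump(isPermutationOf($tags, $shuffled));

// A list with a duplicated and a missing tag is not a permutation.
var_dump(isPermutationOf($tags, ['jane', 'jane', 'anne', 'calico']));
```

In a PHPUnit test the same check could be wrapped in assertTrue(). It says nothing about how random the order is, only that the sort did not corrupt the list.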

Linq2SQL "Local sequence cannot be used in LINQ to SQL" error

I have a piece of code which combines an in-memory list with some data held in a database. This works just fine in my unit tests (using a mocked Linq2SqlRepository which uses List).
public IRepository<OrderItem> orderItems { get; set; }
private List<OrderHeld> _releasedOrders = null;
private List<OrderHeld> releasedOrders
{
get
{
if (_releasedOrders == null)
{
_releasedOrders = new List<OrderHeld>();
}
return _releasedOrders;
}
}
.....
public int GetReleasedCount(OrderItem orderItem)
{
int? total =
(
from item in orderItems.All
join releasedOrder in releasedOrders
on item.OrderID equals releasedOrder.OrderID
where item.ProductID == orderItem.ProductID
select new
{
item.Quantity,
}
).Sum(x => (int?)x.Quantity);
return total.HasValue ? total.Value : 0;
}
I am getting an error I don't really understand when I run it against a database.
Exception information:
Exception type: System.NotSupportedException
Exception message: Local sequence cannot be used in LINQ to SQL
implementation of query operators
except the Contains() operator.
What am I doing wrong?
I'm guessing it's to do with the fact that orderItems is on the database and releasedItems is in memory.
EDIT
I have changed my code based on the answers given (thanks all)
public int GetReleasedCount(OrderItem orderItem)
{
var releasedOrderIDs = releasedOrders.Select(x => x.OrderID);
int? total =
(
from item in orderItems.All
where releasedOrderIDs.Contains(item.OrderID)
&& item.ProductID == orderItem.ProductID
select new
{
item.Quantity,
}
).Sum(x => (int?)x.Quantity);
return total.HasValue ? total.Value : 0;
}
"I'm guessing it's to do with the fact that orderItems is on the database and releasedItems is in memory."
You are correct: LINQ to SQL cannot join a database table to an in-memory List.
Take a look at this link:
http://flatlinerdoa.spaces.live.com/Blog/cns!17124D03A9A052B0!455.entry
He suggests using the Contains() method but you'll have to play around with it to see if it will work for your needs.
It looks like you need to formulate the db query first, because it can't create the correct SQL representation of the expression tree for objects that are in memory. It might be down to the join, so is it possible to get a value from the in-memory query that can be used as a simple primitive? For example using Contains() as the error suggests.
Your unit tests work because you're comparing an in-memory list to an in-memory list.
To compare an in-memory list to the database, you will either need to use memoryVariable.Contains(...) or make the db call first and return a list, so you can compare in-memory list to in-memory list as before. The second option would return too much data, so you're forced down the Contains() route.
public int GetReleasedCount(OrderItem orderItem)
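{
// Project the in-memory list to plain IDs so Contains() compares like with like.
var releasedOrderIDs = releasedOrders.Select(x => x.OrderID).ToList();
int? total =
(
from item in orderItems.All
where item.ProductID == orderItem.ProductID
&& releasedOrderIDs.Contains(item.OrderID)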
select new
{
item.Quantity,
}
).Sum(x => (int?)x.Quantity);
return total.HasValue ? total.Value : 0;
}

Resources