I have a table with 3 columns: id, sub_id, name. It is a pretty big table and there are some duplicates.
What is the best way to detect the duplicates so that I can remove them?
I tried this, but it returns everything (I guess the ids make every row unique):
$collection = \App\MyModel::all();
$colUnique = $collection->unique(['name', 'sub_id']);
$dupes = $collection->diff($colUnique);
I want to get the models that have the same name and sub_id.
id | sub_id | name
1 | 2 | John
2 | 2 | John <- duplicate
3 | 2 | Robin <- unique
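My best bet would be the query builder (DB::table).
Step 1: Fetch the ids to keep, one per (sub_id, name) group.
$idsToKeep = DB::table('TABLE_NAME')
    // MIN(id) keeps the oldest row of each group
    ->selectRaw('MIN(id) AS id')
    ->groupBy('sub_id', 'name')
    // pluck() on Laravel 5.2+; older versions call it lists()
    ->pluck('id');
Step 2: Delete the duplicate records.
$noOfDeletedRecords = DB::table('TABLE_NAME')
    ->whereNotIn('id', $idsToKeep)
    ->delete();
Benefits:
1. Only 2 queries
2. Better performance than loading everything into a collection.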
You can utilize the Collection groupBy method.
$collection = \App\MyModel::all();
$collection
    // Group models by sub_id and name
    ->groupBy(function ($item) { return $item->sub_id.'_'.$item->name; })
    // Keep only groups with more than one model (the duplicates)
    ->filter(function ($group) { return $group->count() > 1; })
    // Process each group of duplicates
    ->each(function ($group) {
        $group
            // Sort by id (so the first item will be the original)
            ->sortBy('id')
            // Skip the first (original) item; splice(1) returns the rest
            ->splice(1)
            // Remove the duplicated models from the DB
            ->each(function ($model) {
                $model->delete();
            });
    });
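To see the grouping idea without Laravel, here is a minimal sketch in plain PHP over the three sample rows from the question. The helper name findDuplicateIds is made up for illustration, and json_encode is just one assumed way to build a composite key that cannot collide.

```php
<?php
// Sample rows mirroring the table in the question.
$rows = [
    ['id' => 1, 'sub_id' => 2, 'name' => 'John'],
    ['id' => 2, 'sub_id' => 2, 'name' => 'John'],
    ['id' => 3, 'sub_id' => 2, 'name' => 'Robin'],
];

// Hypothetical helper: returns the ids of every row after the first
// one seen for each (sub_id, name) pair.
function findDuplicateIds(array $rows): array
{
    // Process rows in ascending id order so the lowest id is kept.
    usort($rows, fn ($a, $b) => $a['id'] <=> $b['id']);

    $seen = [];
    $duplicates = [];
    foreach ($rows as $row) {
        // Encoding the pair gives an unambiguous composite key.
        $key = json_encode([$row['sub_id'], $row['name']]);
        if (isset($seen[$key])) {
            $duplicates[] = $row['id'];
        } else {
            $seen[$key] = true;
        }
    }
    return $duplicates;
}

print_r(findDuplicateIds($rows)); // the ids that would be deleted
```

For the sample data only id 2 is reported, which matches the row marked as a duplicate in the question.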
I have a query that I have built, and I am trying to understand how I can achieve the same thing in one single query. I am fairly new to Laravel and still learning. Could someone help me understand how I can achieve what I am after?
$activePlayerRoster = array();
$pickupGames = DB::table('pickup_games')
->where('pickupDate', '>=', Carbon::now()->subDays(30)->format('m/d/Y'))
->orderBy('pickupDate', 'ASC')
->get();
foreach ($pickupGames as $games) {
foreach(DB::table('pickup_results')
->where('pickupRecordLocatorID', $games->recordLocatorID)
->get() as $activePlayers) {
$activePlayerRoster[] = $activePlayers->playerID;
$unique = array_unique($activePlayerRoster);
}
}
$activePlayerList = array();
foreach($unique as $playerID) {
$playerinfo = DB::table('players')
->select('player_name')
->where('player_id', $playerID)
->first();
$activePlayerList[] = $playerinfo;
}
return $activePlayerList;
pickup_games
checkSumID | pickupDate | startTime | endTime | gameDuration | winningTeam | recordLocatorID | pickupID
1546329808471 | 01/01/2019 | 08:03 am | 08:53 am | 50 Minute | 2 | f47ac0fc775cb5793-0a8a0-ad4789d4 | 216
pickup_results
id | checkSumID | playerID | team | gameResult | pickOrder | pickupRecordLocatorID
1 | 1535074728532 | 425336395712954388 | 1 | Loss | 0 | be3532dbb7fee8bde-2213c-5c5ce710
First, you should try to write the SQL query, and then convert it to Laravel's database code.
If performance is not critical for you, then it could be done in one query like this:
SELECT DISTINCT players.player_name FROM pickup_results
LEFT JOIN players ON players.player_id = pickup_results.playerID
WHERE EXISTS (
SELECT 1 FROM pickup_games
WHERE pickupDate >= DATE_FORMAT(SUBDATE(NOW(), INTERVAL 30 DAY), '%m/%d/%Y')
AND pickup_results.pickupRecordLocatorID = recordLocatorID
)
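Here I'm assuming you know what you're doing with this date comparison, because it looks weird to me: pickupDate is compared as an 'm/d/Y' string, and strings in that format do not sort chronologically.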
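To illustrate why that comparison looks suspicious, here is a small plain-PHP sketch. The two dates are made up; the point is only that 'm/d/Y' strings are compared character by character, so the month is looked at before the year.

```php
<?php
// Two dates written in the m/d/Y format used by pickupDate.
$newYearsEve2018 = '12/31/2018';
$midJanuary2019  = '01/15/2019';

// String comparison goes character by character, so the month decides.
// strcmp() returns a positive number when the first string sorts later.
$stringSaysLater = strcmp($newYearsEve2018, $midJanuary2019) > 0;

// Parsing into DateTime objects compares the real points in time.
// The leading "!" resets the time fields so only the date matters.
$a = DateTime::createFromFormat('!m/d/Y', $newYearsEve2018);
$b = DateTime::createFromFormat('!m/d/Y', $midJanuary2019);
$reallyLater = $a > $b;

var_dump($stringSaysLater); // bool(true): "12/..." sorts after "01/..."
var_dump($reallyLater);     // bool(false): 2018 is before 2019
```

If the column can be changed, storing a real DATE (or a 'Y-m-d' string) would make the >= comparison behave chronologically.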
Now, let's convert it to Laravel's code:
DB::table('pickup_results')
->select('players.player_name')->distinct()
->leftJoin('players', 'players.player_id', '=', 'pickup_results.playerID')
->whereExists(function ($query) {
$query->select(DB::raw(1))
->from('pickup_games')
->where('pickupDate', '>=', Carbon::now()->subDays(30)->format('m/d/Y'))
->whereRaw('pickup_results.pickupRecordLocatorID = recordLocatorID');
})
->get();
Basically, I would reduce the query to its SQL variant to get directly at its core.
The essence of the query is
select `x` FROM foo WHERE id IN (
select distinct bar.id from bar join baz on bar.id = baz.id);
This can be expressed with the query builder as:
$thirtyDaysAgo = Carbon::now()->subDays(30)->format('m/d/Y');
// Distinct ids of players who appear in a game from the last 30 days.
// playerID lives in pickup_results, so it is selected from there.
// No orderBy here: ordering inside an IN subquery has no effect.
$playerIds = DB::table('pickup_games')
    ->join(
        'pickup_results',
        'pickup_results.pickupRecordLocatorID',
        '=',
        'pickup_games.recordLocatorID')
    ->where('pickupDate', '>=', $thirtyDaysAgo)
    ->select('pickup_results.playerID')
    ->distinct();
// Passing the builder to whereIn turns it into a subquery.
$activePlayers = DB::table('players')
    ->select('player_name')
    ->whereIn('player_id', $playerIds);
//>>>$activePlayers->toSql();
// should give SQL along these lines:
//select "player_name" from "players" where "player_id" in (
// select distinct "pickup_results"."playerID" from "pickup_games"
// inner join "pickup_results"
// on "pickup_results"."pickupRecordLocatorID" = "pickup_games"."recordLocatorID"
// where "pickupDate" >= ?
//)
From the resulting query, it may be better to refactor the join as a relationship between the Eloquent models for pickup_games and pickup_results. This will help to further simplify $playerIds.
$pidArray contains product IDs, some of which can be the same, e.g. 34 34 56 77 99 34. As is, it appears the whereIn method does not return results for a product ID it has already found in $pidArray, even if it has a different index.
$productDataForOrder = Product::whereIn('id', $pidArray)->get(['id','price']);
$totalAmount = $productDataForOrder->sum('price');
$productDataForOrder now contains product data, but only for the unique product IDs in $pidArray. So when the sum function is run, the sum is wrong, as it does not take into account the price for multiple instances of the same product ID.
The following code also does not return an object for every repeated product ID in the array. So if $pidArray contains three identical product IDs, the query will only return a collection with one object, instead of three.
$query = Product::select();
foreach ($pidArray as $id)
{
$query->orWhere('id', '=', $id);
}
$productDataForOrder = $query->get(['id','price']);
$totalAmount = $productDataForOrder->sum('price');
You're not going to be able to get duplicate data the way that you're trying. SQL is returning the rows that match your where clause. It is not going to return duplicate rows just because your where clause has duplicate ids.
It may help to think of it this way:
select * from products where id in (1, 1)
is the same as
select * from products where (id = 1) or (id = 1)
There is only one record in the table that satisfies the condition, so that is all you're going to get.
You're going to have to do some extra processing in PHP to get your price. You can do something like:
// First, get the prices. Then, loop over the ids and total up the
// prices for each id.
// lists (renamed to pluck in Laravel 5.2) returns a Collection of key => value pairs.
// First parameter (price) is the value.
// Second parameter (id) is the key.
$prices = Product::whereIn('id', $pidArray)->lists('price', 'id');
// I used array_walk, but you could use a plain foreach instead.
// Or, if $pidArray is actually a Collection, you could use
// $pidArray->each(function ...)
$total = 0;
array_walk($pidArray, function($value) use (&$total, $prices) {
$total += $prices->get($value, 0);
});
echo $total;
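The same idea can be tried without a database. This is only a sketch: the $prices map stands in for whatever the query returns, and the prices themselves are made up. It uses array_count_values to turn the repeated IDs into quantities instead of walking the array one ID at a time.

```php
<?php
// IDs as they appear in the order, duplicates included (from the question).
$pidArray = [34, 34, 56, 77, 99, 34];

// Stand-in for the id => price map the query returns (one entry per id).
// The prices are invented for illustration.
$prices = [34 => 10.0, 56 => 5.5, 77 => 20.0, 99 => 1.25];

// array_count_values() yields id => number of occurrences.
$total = 0.0;
foreach (array_count_values($pidArray) as $id => $quantity) {
    // IDs missing from the price map contribute nothing.
    $total += ($prices[$id] ?? 0) * $quantity;
}

echo $total, PHP_EOL;
```

With these made-up prices, product 34 is counted three times, which is exactly what the plain whereIn sum misses.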
The whereIn method only limits the results to the values in the given array. From the docs:
The whereIn method verifies that a given column's value is contained within the given array
I'd make a query variable and loop through the array, adding to the query variable in each pass. Something like this:
$query = Product::select();
foreach ($pidArray as $id)
{
$query->where('id', '=', $id);
}
$query->get(['id','price']);
Here is code that would work for your use case, expanding on @patricus's answer.
You first fetch an array with id as the key and price as the value from the products table:
$prices = Product::whereIn('id', $pidArray)->lists('price', 'id');
// Iterate over $pidArray itself (not [$pidArray]) so every occurrence is added
$totalPrice = collect($pidArray)->reduce(function ($result, $id) use ($prices) {
    return $result + $prices[$id];
}, 0);
I have this table:
Friends
id | friendid
10 | 15
12 | 10
10 | 13
and I want to put all values (from either the id or the friendid column) in one variable. Is it possible?
I have this code:
$x = 10;
$ids = DB::table('friends')->where('id', $x)->orWhere('friendid', $x)->lists('friendid');
but this line of code only returns values from the 'friendid' column, and what I want is to store values from both the 'friendid' and 'id' columns.
You can pass a second argument to lists, which will set the keys of the resulting array:
$query = DB::table('friends')->where('id', $x)->orWhere('friendid', $x);
$results = $query->lists('friendid', 'id');
If you want it all as values of a single array (which also avoids rows with the same id overwriting each other as keys), use this:
$query = DB::table('friends')->where('id', $x)->orWhere('friendid', $x);
$ids = [];
foreach ($query->select('id', 'friendid')->get() as $record) {
$ids[] = $record->id;
$ids[] = $record->friendid;
}
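As a rough illustration on the sample rows, the flattening can also be done with plain array functions. Removing repeats and $x itself is an assumption about what is wanted here (a list of friends); drop that step if every value should be kept.

```php
<?php
// Rows as they appear in the sample friends table.
$rows = [
    ['id' => 10, 'friendid' => 15],
    ['id' => 12, 'friendid' => 10],
    ['id' => 10, 'friendid' => 13],
];
$x = 10; // the user whose friends we want

// Merge both columns into one flat list of values.
$all = array_merge(array_column($rows, 'id'), array_column($rows, 'friendid'));

// Drop repeats and $x itself, then reindex from 0.
$friendIds = array_values(array_diff(array_unique($all), [$x]));
sort($friendIds);

print_r($friendIds);
```

For the table above this leaves 12, 13 and 15, the other side of each row that mentions user 10.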
I am creating a system of newsfeed, and as you can easily guess, it is beyond my skills.
Please be kind to put me on the right track or provide something I can go on with.
I have several hundred events (model name is Event1, table 'events')
I also have a pivot table in which users can assign any event's importance (values 0,1,2,3)
The relevant columns of the pivot table user_attitudes (Model Userattitude) are
id, item_type, item_id, importance, attitude, creator_id
Example records (id - item_type - item_id - importance - creator_id):
456 - event - 678 - 2 - 4
457 - event - 690 - 3 - 15
458 - event - 690 - 1 - 4
459 - participant - 45 - 1 - 4
Plain English: Total aggregated importance of the event #690 is '4', while the event #678 is '2'.
Therefore in my ranking the event #690 should be listed as first.
Just to see the bigger pic: the user #4 also rated participant # 45 as importance = 1.
The table services many models - the above example include two - just to give a better image of what I have.
WHAT I NEED:
I wish to print a ranking of top 5 events (and later other models). I wish to be able to use two methods of calculating the total score:
by counting the actual value (0,1,2,3)
by counting any value above 0 as 1 point.
I want to generate views which filter events by these criteria:
at least one user set the importance to '0' (I need it to flag an event as untrustworthy)
events which have not been rated yet
the reverse of the above: events which are rated by at least one user
events listed by the number of users who assigned any importance to them
This is easy, but still I have no idea how to make it happen. The same filters as in #2 above, but related to a particular user's decisions:
list 5 or 10 events (random or newest) which have not yet been rated by the user
maybe something like this would be an answer:
$q->where('creator_id', '=', Auth::user()->id);
Relevant code:
As I don't really grasp the merged relations, I might fail to show everything needed to provide help - ask for more code in comments.
Models:
Event1 (table 'events'):
public function importances()
{
return $this->morphMany('Userattitude', 'item');
}
public function user_importance($user)
{
return $this->morphMany('Userattitude', 'item')->where('creator_id', ($user ? $user->id : NULL))->first();
}
User: (table 'users' - standard user table)
public function importances()
{
return $this->hasMany('Userattitude', 'creator_id');
}
In model Userattitude (different from User, table name 'user_attitudes')
public function events()
{
return $this->morphTo('item')->where('item_type', 'event');
}
public function event()
{
return $this->belongsTo ('Event1', 'item_id');
}
PROBLEMS IN REPLY TO @lucas's answer:
PROBLEM 1.
The table name 'items' keeps me confused, as in my project 'items' are the events (model Event1), the participants (model Entity) and other objects.
Can we stick to my naming until I get hold of the knowledge you are providing?
It also contains a column named attitude, which is used for blacklisting particular items.
For instance, an item of type 'entity' (a possible participant of multiple events) can be rated by a user in two ways:
- by the importance set by a user (we are doing this now; the available values are 0, 1, 2, 3)
- by the attitude of a user toward it (possible values: -1, 0, 1)
Such a solution allows me to compute the karma of each item. For instance, -1 x 3 = -3 (worst possible karma value), while 1 x 2 = 2 (medium positive karma).
In consequence, I am unable to use queries with the users method. It is still too confusing to me, sorry. We diverted too far from my original mental image.
Consider this query:
$events = Event1::has('users', '<', 1)->get();
If in Event1 I declare
public function users()
{
return $this->morphToMany('User', 'item', null, null, 'creator_id');
}
Note: User is the standard users table, where username, password and email are stored
I get this error:
[2014-12-28 05:02:48] production.ERROR: FATAL DATABASE ERROR: 500 = SQLSTATE[42S02]: Base table or view not found: 1146 Table 'niepoz_niepozwalam.items' doesn't exist (SQL: select * from `Events` where (select count(*) from `users` inner join `items` on `users`.`id` = `items`.`creator_id` where `items`.`item_id` = `Events`.`id` and `items`.`item_type` = Event1) >= 1) [] []
if I change the method definition to
public function users()
{
return $this->morphToMany('Userattitude', 'item', null, null, 'creator_id');
}
Note: Userattitude is model (table name is 'user_attitudes') where i store user judgments. This table contains columns 'importance' and 'attitude'.
I get the same error.
If I change the method to
public function users()
{
return $this->morphToMany('User', 'Userattitudes', null, null, 'creator_id');
}
I get this:
[2014-12-28 05:08:28] production.ERROR: FATAL DATABASE ERROR: 500 = SQLSTATE[42S22]: Column not found: 1054 Unknown column 'user_attitudes.Userattitudes_id' in 'where clause' (SQL: select * from Events where (select count(*) from users inner join user_attitudes on users.id = user_attitudes.creator_id where user_attitudes.Userattitudes_id = Events.id and user_attitudes.Userattitudes_type = Event1) >= 1) [] []
Possible solution:
the 'user_attitudes' table alias with name 'items'.
I could create a view with the required name.
I did it, but now the query produces no results.
PROBLEM 2
Should I rename creator_id to user_id, or keep both columns and store duplicated information in them? The creator_id follows conventions and I use it to create records... how do I resolve this dilemma?
PROBLEM 3.
As far as I understand, if I want to get a USER-RELATED list of top-5 events,
I need to add another line to the code, which narrows the search scope to records created by the particular logged-in user:
$q->where('creator_id', '=', Auth::user()->id);
The code would look like this:
All with importance 0
$events = Event1::whereHas('users', function($q){
$q->where('importance', 0);
$q->where('creator_id', '=', Auth::user()->id);
})->get();
right?
PROBLEM 5:
Ok, I am now able to run a query like this:
$rank_entities = Entity::leftJoin('user_attitudes', function($q){
$q->on('entity_id', '=', 'entities.id');
$q->where('item_type', '=', 'entity');
})
->selectRaw('entities.*, SUM(user_attitudes.importance) AS importance')
->groupBy('entities.id')
->orderBy('importance', 'desc')
->take(6)
->get();
and in the foreach loop I can display the total importance count with this code:
{{$e->importance or '-'}}
But how could I also display the result of an alternative query: the SUM of values from another column, named attitude?
In other words, in my @foreach loop I need to display both $e->importance and a computed SUM(user_attitudes.attitude) AS karma, which for now can be obtained with this separate query:
$rank_entities = Entity::leftJoin('user_attitudes', function($q){
    $q->on('entity_id', '=', 'entities.id');
    $q->where('item_type', '=', 'entity');
})
->selectRaw('entities.*, SUM(user_attitudes.attitude) AS karma')
->groupBy('entities.id')
->orderBy('karma', 'desc')
->take(5)
->get();
My solution would be to create some extra columns in the 'entities' table:
- karma_negative
- karma_positive
to store/update total amount of votes each time someone is voting.
First, let's talk about the setup. I wasn't entirely sure how, and if, yours works, but I created this on my testing instance and it worked, so I recommend you change yours accordingly:
Database
events
That's a simple one (and you probably already have it like this)
id (primary key)
name (or something like that)
etc
users
I'm not sure if in your example that is Userattitude but I don't think so...
id (primary key)
email (?)
etc
items
This is the important one. The pivot table. The name can be different but to keep it simple and follow conventions it should be the plural of the polymorphic relation (in your case item => items)
id (actually not even necessary, but I left it in there)
item_type
item_id
importance
creator_id (consider changing that to user_id. This would simplify the relationship declaration)
Models
I think you have to read the docs again. You had several weird relations declared. Here's how I did it:
Event1
By default Laravel uses the classname (get_class($object)) as value for the ..._type column in the database. To change that you need to define $morphClass in your models.
class Event1 extends Eloquent {
protected $table = 'events';
protected $morphClass = 'event';
public function users()
{
return $this->morphToMany('User', 'item', null, null, 'creator_id');
}
}
User
class User extends Eloquent implements UserInterface, RemindableInterface {
// ... default laravel stuff ...
public function events(){
return $this->morphedByMany('Event1', 'item', null, null, 'creator_id');
}
}
Queries
Alright, now we can get started. First, one additional piece of information: I used Eloquent relations whenever possible. In all the queries where a join() is made, using relations would be slower, because certain things (like counting or calculating the maximum) would have to be done in PHP after the query. And MySQL does a pretty good job (also performance-wise) at those things.
Top 5 by total value
$events = Event1::leftJoin('items', function($q){
$q->on('item_id', '=', 'events.id');
$q->where('item_type', '=', 'event');
})
->selectRaw('events.*, SUM(items.importance) AS importance')
->groupBy('events.id')
->orderBy('importance', 'desc')
->take(5)
->get();
Top 5 by number of votes over 0
$events = Event1::leftJoin('items', function($q){
$q->on('item_id', '=', 'events.id');
$q->where('item_type', '=', 'event');
$q->where('importance', '>', 0);
})
->selectRaw('events.*, COUNT(items.id) AS importance')
->groupBy('events.id')
->orderBy('importance', 'desc')
->take(5)
->get();
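To make the difference between the two scoring methods concrete, here is a plain-PHP sketch over the sample records from the question (events #678 and #690). The helper name rankItems is hypothetical; in the real application the database does this work through SUM and COUNT as in the queries above.

```php
<?php
// Sample votes for events, taken from the example records in the question.
$votes = [
    ['item_id' => 678, 'importance' => 2],
    ['item_id' => 690, 'importance' => 3],
    ['item_id' => 690, 'importance' => 1],
];

// Hypothetical helper: totals per item, highest score first.
// $binary = false sums raw values; true counts each value above 0 as 1 point.
function rankItems(array $votes, bool $binary): array
{
    $scores = [];
    foreach ($votes as $vote) {
        $points = $binary ? (int) ($vote['importance'] > 0) : $vote['importance'];
        $scores[$vote['item_id']] = ($scores[$vote['item_id']] ?? 0) + $points;
    }
    arsort($scores); // sort by score, descending, keeping item ids as keys
    return $scores;
}

print_r(rankItems($votes, false)); // raw totals
print_r(rankItems($votes, true));  // one point per non-zero vote
```

Either way event #690 ranks first for this data: 4 against 2 by total value, and 2 against 1 by number of non-zero votes.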
All with importance 0
$events = Event1::whereHas('users', function($q){
$q->where('importance', 0);
})->get();
All without any votes
$events = Event1::has('users', '<', 1)->get();
All with 1+ votes
$events = Event1::has('users')->get();
All ordered by number of votes
$events = Event1::leftJoin('items', function($q){
$q->on('item_id', '=', 'events.id');
$q->where('item_type', '=', 'event');
})
->selectRaw('events.*, COUNT(items.id) AS count')
->groupBy('events.id')
->orderBy('count', 'desc')
->get();
Newest 5 without votes
If you are using Eloquent's created_at timestamp:
$events = Event1::has('users', '<', 1)->latest()->take(5)->get();
If you're not (order by greatest id):
$events = Event1::has('users', '<', 1)->latest('id')->take(5)->get();
Random 5 without votes
$events = Event1::has('users', '<', 1)->orderByRaw('RAND()')->take(5)->get();
I did not add any explanations to the queries on purpose. If you want to know more about something specific or need help, please write a comment.
PROBLEM 4: SOLVED! (credit to @lukasgeiter)
If you wish to display a ranking of items and limit the results to a particular tag defined in a pivot table, this is the solution:
The model is Event1 (table name = 'events').
For example, the tag would be war: defined in table
eventtags
Event natures are defined as
id = '1' is name = 'wars'
id = '2' is name = 'conflicts'
pivot table, which assigns multiple tags:
event_eventtags they are defined as id = '4'
Example records for event_eventtags:
id - event_id - eventtag_id
1 - 45 - 1
2 - 45 - 2
Plain English: the Event1 #45 is tagged as war(#1) and conflict(#2)
Now, in order to print a list of 10 wars, you should define your query in this way:
// Table and column names follow the events / event_eventtags schema described above
$events = Event1::join('event_eventtags', function($q){
    $q->on('event_eventtags.event_id', '=', 'events.id');
    // eventtag_id 1 = 'wars'
    $q->where('event_eventtags.eventtag_id', '=', 1);
})->leftJoin('user_attitudes', function($q){
    $q->on('user_attitudes.item_id', '=', 'events.id');
    $q->where('user_attitudes.item_type', '=', 'event');
})
->selectRaw('events.*, SUM(user_attitudes.importance) AS importance')
->groupBy('events.id')
->orderBy('importance', 'desc')
->take(10)
->get();
The user_attitudes join is part of the voting system described in the original question. You can remove it and sort events by another method.