What is the fastest way to seed 1m records? - laravel

What would be the fastest way to seed 1M records in Laravel while using as little memory as possible?
for( $i=0; $i<1000; $i++){
User::factory()->count(1000)->make();
}
Or should I use the chunk method, or build the records every time the loop reaches 1000 and then empty the variable? Or is there another way to do this faster?

Eloquent models add significant overhead. Laravel factories are a nice pattern for getting some data into your database, but you will notice a slowdown with thousands of models, since each model is first constructed from its PHP class (and if you persist the factory models using ->create() instead of ->make(), a series of creating/created/saving/saved events is fired as well). So the most performant approach is to skip the factory code and use simple DB statements instead, like so:
$data = [['username' => 'foo', ...], ['username' => 'bar', ...]];
Illuminate\Support\Facades\DB::table('users')->insert($data);
At a certain $data size you could run into the SQL prepared statement placeholder limit, which means the query has become too big for a single statement. In that case, you can chunk the insert like this:
// Use `collect()` to create a Collection object (glorified array)
// which can chunk/paginate the data:
collect($data)->chunk(100)->each(function ($chunk) {
    // insert() expects a plain array, so unwrap the chunk Collection first
    DB::table('users')->insert($chunk->all());
});
That being said, the Laravel factories are not bad to use and they are very handy in testing.
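As a rough sketch of the chunked approach for the full 1M rows, the generator below builds one chunk at a time so the whole data set never sits in memory. The names userRows and seedInChunks and the column values are hypothetical. The insert is passed in as a callable so the sketch runs without a database. The chunk size is an assumption to tune against your driver's placeholder limit (MySQL, for example, allows at most 65,535 placeholders per prepared statement, so rows per insert multiplied by columns per row has to stay below that).

```php
<?php

/**
 * Yield rows in chunks so only one chunk is held in memory at a time.
 * (Hypothetical helper; the columns are placeholders for illustration.)
 */
function userRows(int $total, int $chunkSize): Generator
{
    $chunk = [];
    for ($i = 1; $i <= $total; $i++) {
        $chunk[] = [
            'username' => "user{$i}",
            'email'    => "user{$i}@example.com",
        ];
        if (count($chunk) === $chunkSize) {
            yield $chunk;
            $chunk = []; // start a fresh chunk; the old one can be freed
        }
    }
    if ($chunk !== []) {
        yield $chunk; // final partial chunk
    }
}

/**
 * Feed every chunk to $insert and return the number of rows sent.
 */
function seedInChunks(int $total, int $chunkSize, callable $insert): int
{
    $sent = 0;
    foreach (userRows($total, $chunkSize) as $chunk) {
        $insert($chunk);
        $sent += count($chunk);
    }
    return $sent;
}
```

Inside a seeder this might be called as seedInChunks(1000000, 1000, function (array $chunk) { DB::table('users')->insert($chunk); });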

Related

Laravel lazy collection for huge data

I am querying a large data set from the table and then iterating over it in a loop to create a JSON file.
$user = App\User::all();
foreach($user as $val){
// logic goes here for creating the json file
}
The problem I am facing is that iterating through the loop consumes a lot of memory and I get the error 'Allowed memory size exhausted'. The CPU usage of the server also becomes very high.
My question is how I should use Laravel lazy collections to get rid of this issue. I have gone through the official docs but couldn't find the way.
Just replace the all method with the cursor one, which loads one model at a time instead of hydrating the whole result set up front.
$user = App\User::cursor();
foreach($user as $val){
// logic goes here for creating the json file
}
For more information about the methods you can chain, refer to the official documentation.
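To show how a cursor keeps memory flat while the file is built, here is a minimal sketch that writes a JSON array one element at a time instead of encoding everything at once. writeJsonStream and fakeUsers are hypothetical helpers, and the plain generator only stands in for App\User::cursor(); in the application you would pass the cursor itself.

```php
<?php

/**
 * Write any iterable to $path as a JSON array, one item at a time,
 * so the full result set is never held in memory.
 * (Hypothetical helper, not part of Laravel.)
 */
function writeJsonStream(iterable $rows, string $path): int
{
    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path} for writing");
    }

    $count = 0;
    fwrite($handle, '[');
    foreach ($rows as $row) {
        // Separate items with commas; no comma before the first one.
        fwrite($handle, ($count > 0 ? ',' : '') . json_encode($row));
        $count++;
    }
    fwrite($handle, ']');
    fclose($handle);

    return $count;
}

// Stand-in for App\User::cursor(): yields one row at a time.
function fakeUsers(int $total): Generator
{
    for ($i = 1; $i <= $total; $i++) {
        yield ['id' => $i, 'name' => "user{$i}"];
    }
}
```

Because only the current row is encoded and written, peak memory depends on the size of one row rather than on the size of the table.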

Efficient way to get data from database in foreach

I am loading data from Excel. In a foreach loop I check for each record whether it already exists in the database:
$recordExists = $this->checkIfExists($record);
function checkIfExists($record) {
$foundRecord = $this->repository->newQuery()
->where(..., $record[...])
->where(..., $record[...])
...
->get();
}
When the Excel file contains up to 1000 values, which is a relatively small amount of data, the code runs for around 2 minutes. I am guessing this is a very inefficient way to do it.
I was thinking of passing the array of loaded data to the checkIfExists method, but then I could not query on the data.
What would be a good way to proceed?
You can use a Laravel queue if you want to do a lot of work without making the client wait. The code runs in the background, so the client is not blocked by the process; just show a message that the import has been queued.
You can check the official documentation at the URL below:
https://laravel.com/docs/5.8/queues
If you pass all the data from the database to the function (so there are no more queries to the database), you can use Laravel collection functions to filter.
One of them is where => https://laravel.com/docs/5.8/collections#method-where
function checkIfExists($record, Collection $fetchedDataFromDatabase) {
// Laravel collections 'where' function
$foundRecord = $fetchedDataFromDatabase
->where(..., $record[...])
->where(..., $record[...]);
}
Other helpful functions:
filter
contains
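One way to avoid a query per row, sketched here under some assumptions: fetch the candidate rows once, index them by a composite key, and test each Excel record against that index. The column names code and country and the helpers recordKey and buildIndex are hypothetical, since the original conditions are elided, and plain arrays stand in for the database rows.

```php
<?php

/** Build a string key from the columns that identify a record. */
function recordKey(array $record, array $columns): string
{
    $parts = [];
    foreach ($columns as $column) {
        $parts[] = (string) ($record[$column] ?? '');
    }
    // json_encode keeps ["a,b", "c"] distinct from ["a", "b,c"].
    return json_encode($parts);
}

/** Index existing rows by key so each membership check is a hash lookup. */
function buildIndex(iterable $rows, array $columns): array
{
    $index = [];
    foreach ($rows as $row) {
        $index[recordKey($row, $columns)] = true;
    }
    return $index;
}

$columns  = ['code', 'country'];   // hypothetical identifying columns
$existing = [                      // stand-in for the result of one query
    ['code' => 'A1', 'country' => 'DE'],
    ['code' => 'B2', 'country' => 'FR'],
];
$index = buildIndex($existing, $columns);

$fromExcel = [
    ['code' => 'A1', 'country' => 'DE'],
    ['code' => 'C3', 'country' => 'DE'],
];
$missing = [];
foreach ($fromExcel as $record) {
    if (!isset($index[recordKey($record, $columns)])) {
        $missing[] = $record; // not in the database yet
    }
}
```

Calling where on a Collection scans the whole collection on every call, so for larger tables a keyed index like this is usually cheaper than filtering once per record.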

How to do string functions on a db table column?

I am trying to do a string replace on the entries of a column in a DB table. This is what I have so far:
$misa = DB::table('mis')->pluck('name');
for($i=0;;$i++)
{
$misa[$i] = substr_replace("$misa[$i]","",-3);
}
The error I am getting is "Undefined offset:443".
P.S. I am not a full-fledged programmer. Only trying to develop a few simple programs for my business. Thank You.
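Since it's a collection, use the transform() collection method to modify it in place and avoid this kind of error. You can also use the str_before() helper to transform each string (from Laravel 6 onward the global helper is no longer bundled with the framework; Str::before() from Illuminate\Support\Str does the same):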
$misa = DB::table('mis')->pluck('name');
$misa->transform(function($i) {
return str_before($i, ':ut');
});
There are a few ways to make this query prettier and faster. The beauty of Laravel is that we have Eloquent and the query builder for readable queries, and Collections to manage the data in a user-friendly way. So, first let's clean up the query. You can use a DB::raw select and do all of the string replacing in the query itself, like so:
$misa = DB::table('mis')->select(DB::raw("REPLACE(name, ':ut', '') as name"))->pluck('name');
Now we have a collection containing only the name column, with ':ut' replaced by an empty string, all within the MySQL query itself.
That's it. No further PHP manipulation is required, which makes this process much faster (noticeable on large data sets).
Cheers!
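For reference, the "Undefined offset" error in the question comes from the for loop having no end condition, so it keeps running past the last index. A minimal plain-PHP sketch of the same trimming with a bounded iteration follows; the sample names are made up, and it assumes every value ends in a three-character suffix such as ':ut'.

```php
<?php

// Sample values standing in for DB::table('mis')->pluck('name')->all().
$names = ['alpha:ut', 'beta:ut', 'gamma:ut'];

// array_map visits exactly the elements that exist, so no offset is overrun.
$trimmed = array_map(
    function (string $name): string {
        return substr($name, 0, -3); // drop the last three characters
    },
    $names
);
```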

If I eager load associated child records, then that means future WHERE retrievals won't dig through database again?

Just trying to understand... if at the start of some method I eager load a record and its associated children like this:
@object = Object.includes(:children).where(email: "test@example.com").first
Then does that mean that if I later have to look through that object's children, this will not generate more database queries?
I.e.,
@found_child = @object.children.where(type_of_child: "this type").first
Unfortunately not - using ActiveRecord::Relation methods such as where will query the database again.
You could however filter the data without any further queries, using the standard Array / Enumerable methods:
@object.children.detect { |child| child.type_of_child == "this type" }
It will generate another database query in your case.
Eager loading is used to avoid N+1 queries. This is done by loading all associated objects up front. But this doesn't work when you want to filter that list with where later on; Rails will then build a new query and run that one.
That said: in your example the includes actually makes your code slower, because it loads the associated objects but cannot use them.
I would change your example to:
@object = Object.find_by(email: "test@example.com")
@found_child = @object.children.find_by(type_of_child: "this type")

How to retrieve a customised list of records according to Repository pattern?

I want to move the business logic out of controller actions. I have read a lot about the repository pattern in Laravel, with tons of examples.
However they're usually pretty straightforward - we have a class that uses some repository to fetch a list of all possible records, the data is returned to the controller and passed to the view.
Now what if our list isn't all the possible records? What if it depends on many things. For example:
we display the list as "pages" so we might need X records for Y-th page
we might need to filter the list or even apply multiple filters (status, author, date from - to etc)
the user can change the sorting of the data (for example by clicking the table column titles)
we might need some data from other data sources (joined tables) or it might even be used for sorting (so lazy loading won't work)
Should I write a special method with all these cases in mind? Something like this:
public function getForDisplay(
$with = array(),
$filters = array(),
$count = 20,
$page = 0,
$orderBy = 'date',
$orderDir = 'DESC'
)
{
//all the code goes here
return $result;
}
And then call it like this from my controller:
$orders = $this->orders->getForDisplay(
array('customer', 'address', 'seller'),
Input::get('filters', array()),
20,
Input::get('page', 0),
Input::get('sort', 'date'),
Input::get('direction', 'DESC')
);
This looks wrong already and we didn't even get to the repositories yet.
What are the best/correct practices for solving situations like this? I'm pretty sure there has to be a way to achieve the desired results without adding all the possible combinations as method arguments.
Use the repository pattern only for business model updates, and you'll end up with very specific query methods (the domain usually doesn't need many queries, and they are pretty straightforward). For UI/reporting queries, you can use a simple DAO/Service/ORM/Query Handler that takes some input and returns the desired data (at least part of the view model).
Since you're already using an ORM, you can use it directly. Note that you can also use the ORM for domain updates, but inside a repository's implementation, i.e., the app only sees the repository interface. We care about separation at the business layer; for UI querying you can skip the unneeded abstraction.
Btw, because we're talking about design, everything is subjective and thus, there's no single best/optimum way of doing things.
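One common way to avoid the long argument list is to bundle the listing options into a small value object that the query handler consumes. The sketch below is only an illustration, not the answer's own design: ListCriteria and fromInput are hypothetical names, and the defaults mirror the ones used in the question.

```php
<?php

/** Value object carrying paging, sorting, and filter options. */
final class ListCriteria
{
    /** @var string[] relations to load */
    public $with = [];
    /** @var array filter name => value */
    public $filters = [];
    public $perPage = 20;
    public $page = 0;
    public $orderBy = 'date';
    public $orderDir = 'DESC';

    /** Build criteria from request input, falling back to the defaults. */
    public static function fromInput(array $input, array $with = []): self
    {
        $criteria = new self();
        $criteria->with    = $with;
        $criteria->filters = (array) ($input['filters'] ?? []);
        $criteria->page    = max(0, (int) ($input['page'] ?? 0));
        $criteria->orderBy = (string) ($input['sort'] ?? 'date');

        // Accept only the two valid sort directions.
        $direction = strtoupper((string) ($input['direction'] ?? 'DESC'));
        $criteria->orderDir = $direction === 'ASC' ? 'ASC' : 'DESC';

        return $criteria;
    }

    /** Row offset for the current page. */
    public function offset(): int
    {
        return $this->page * $this->perPage;
    }
}
```

The controller call then shrinks to something like $this->orders->getForDisplay(ListCriteria::fromInput(Input::all(), array('customer', 'address', 'seller'))), and new options can be added without changing the method signature.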
