I currently parse a CSV file to insert data into a database, but because the file has 20,000 rows, it takes a very long time. Is there a way to insert more rows at once using Laravel migrations?
This is what I am doing at the moment:
foreach ($towns as $town) {
    DB::table('town')->insert(
        array(
            // data goes here
        )
    );
}
I think maybe my question is a bit vague. I want to know what the format is to mass insert multiple items using one query, and whether this will actually make a difference in speed.
You can mass insert by filling an array with your data:
foreach ($towns as $town) {
    $array[] = array( /* ... your data goes here ... */ );
}
and then run the insert just once:
DB::table('town')->insert($array);
But I really don't know how much faster that will be. You can also disable the query log:
DB::disableQueryLog();
It uses less memory and is usually faster.
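If a single statement with 20,000 rows turns out to be too large for your database (placeholder or packet limits), a hedged middle ground is to split the array into batches; the batch size of 500 below is an arbitrary choice:

DB::disableQueryLog();

$array = array();
foreach ($towns as $town) {
    $array[] = array(
        // data goes here
    );
}

// Insert in batches so no single query grows too large.
foreach (array_chunk($array, 500) as $batch) {
    DB::table('town')->insert($batch);
}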
I'm using EF Core but I'm not really an expert with it, especially when it comes to details like querying tables in a performant manner...
So what I'm trying to do is simply get the max value of one column from a table with filtered data.
What I have so far is this:
protected override void ReadExistingDBEntry()
{
    using Model.ResultContext db = new();

    // Filter the table data down to the rows relevant to us.
    // The whole table may contain 0 rows or millions of them.
    IQueryable<Measurement> dbMeasuringsExisting = db.Measurements
        .Where(meas => meas.MeasuringInstanceGuid == Globals.MeasProgInstance.Guid
            && meas.MachineId == DBMatchingItem.Id);

    if (dbMeasuringsExisting.Any())
    {
        // The max value we're interested in. dbMeasuringsExisting could still contain millions of rows.
        iMaxMessID = dbMeasuringsExisting.Max(meas => meas.MessID);
    }
}
The equivalent SQL to what I want would be something like this.
select max(MessID)
from Measurement
where MeasuringInstanceGuid = Globals.MeasProgInstance.Guid
and MachineId = DBMatchingItem.Id;
While the above code works (it returns the correct value), I think it has a performance issue as the database table gets larger, because the max filtering is done on the client side after all rows have been transferred, or am I wrong here?
How to do it better? I want the database server to filter my data. Of course I don't want any SQL script ;-)
This can be addressed by typing the return as nullable, so that you do not get an exception when no rows match, and then applying a default value for the int. Alternatively, you can just assign it to a nullable int. Note the assumption here of an integer return type for the ID; the same principle would apply to a Guid as well.
int MaxMessID = dbMeasuringsExisting.Max(p => (int?)p.MessID) ?? 0;
There is no need for the Any() check, as it causes an additional round trip to the database, which is not desirable in this case. Because dbMeasuringsExisting is an IQueryable, the Max call is translated into SQL and evaluated by the database server, so only the single maximum value is transferred to the client.
I need to insert multiple records into the database. Currently I am inserting them in a loop, which causes a timeout when the record count is large. Is there any way to avoid the loop?
$consignments = Consignment::select('id')->where('customer_id',$invoice->customer_id)->doesntHave('invoice_charges')->get();
foreach ($consignments as $consignment) {
    InvoiceCharge::create(['invoice_id' => $invoice->id, 'object_id' => $consignment->id, 'model' => 'Consignment']);
}
Consignment has a hasOne relation in the model:
public function invoice_charges()
{
    return $this->hasOne('App\Models\Admin\InvoiceCharge', 'object_id')->where('model', 'Consignment');
}
How about this:
$consignments = Consignment::select('id')->where('customer_id',$invoice->customer_id)->doesntHave('invoice_charges')->get();
foreach ($consignments as $consignment) {
    $consignment_data[] = ['invoice_id' => $invoice->id, 'object_id' => $consignment->id, 'model' => 'Consignment'];
}
InvoiceCharge::insert($consignment_data);
This way you insert with one query rather than in a loop. Just make sure the $consignment_data array is initialized before the loop and is not empty before calling insert().
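A slightly more compact sketch of the same idea, using the collection directly. Note that insert() bypasses Eloquent, so created_at/updated_at are only filled if you add them yourself; the timestamp columns below are an assumption about your invoice_charges table:

$now = \Carbon\Carbon::now();

$consignment_data = $consignments->map(function ($consignment) use ($invoice, $now) {
    return [
        'invoice_id' => $invoice->id,
        'object_id'  => $consignment->id,
        'model'      => 'Consignment',
        'created_at' => $now, // assumed timestamp columns; drop these if the table has none
        'updated_at' => $now,
    ];
})->all();

if (!empty($consignment_data)) {
    InvoiceCharge::insert($consignment_data);
}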
If you want to save time but can afford more memory, you can use cursor().
Cursor: it uses PHP generators to fetch your query items one by one. 1) It takes less time. 2) It uses more memory.
$consignments = Consignment::select('id')->where('customer_id',$invoice->customer_id)->doesntHave('invoice_charges')->cursor();
foreach ($consignments as $consignment) {
    InvoiceCharge::create(['invoice_id' => $invoice->id, 'object_id' => $consignment->id, 'model' => 'Consignment']);
}
I have a table called rentals; within each row are columns state, city, zipcode, which all hold ids to another table with that info. There are about 3,400 rentals. I am pulling each column to display the states, cities and zipcodes distinctly, and I need to show how many rentals are in each one. I am doing this now via AJAX: the person starts typing what they want to see and it autocompletes with the count, but it's slow because of the way I'm doing it.
$rentals_count = Rentals::where('published',1)->get();
foreach ($states as $state) {
    echo $state."-".$rentals_count->where('state', $state->id)->count();
}
Above is roughly what I'm doing, with pieces removed because they are not related to this question. Is there a better way to do this? It lags a bit, so the autocomplete seems broken to a new user.
Have you considered eager loading your Eloquent query? Eager loading is used to reduce query operations. When querying, you may specify which relationships should be eager loaded using the with method:
$rental_counts = Rentals::where('published',1)->with('your_relation')->get();
You can read more about that in the Laravel documentation.
$rentals = Rentals::wherePublished(true)->withCount('state')->get();
When you loop through $rentals, the result will be in $rental->state_count
Set up a 'state' relation on Rentals, then call it like this:
$rentals_count = Rentals::where('published',1)->with('state')->get()->groupBy('state');
$rentals_count->map(function ($v, $k) {
    echo $v[0]->state->name .' - '. $v->count();
});
Meanwhile, in the Rentals model:
public function state(){
    // 'state' is the foreign key on the rentals table; the primary key on your states table has to be id.
    return $this->belongsTo(State::class, 'state');
}
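If the goal is just the per-state counts for the autocomplete, a hedged alternative is to let the database do the grouping in one query instead of counting collections in PHP. This sketch assumes the rentals table has a state column holding the state id and that each State model has a name attribute:

$counts = Rentals::where('published', 1)
    ->selectRaw('state, count(*) as rentals_count')
    ->groupBy('state')
    ->pluck('rentals_count', 'state');

// $counts is keyed by state id, so each lookup is a simple array access.
foreach ($states as $state) {
    echo $state->name . ' - ' . ($counts[$state->id] ?? 0);
}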
Is there any way to update a sequence and know the primary keys of the updated documents?
table.filter({some:"value"}).update({something:"else"})
Then know the primary keys of the records that were updated without needing a second query?
It's currently not possible to return multiple values with {returnVals: true}; see for example https://github.com/rethinkdb/rethinkdb/issues/1382
There is, however, a way to trick the system with forEach:
r.db('test').table('test').filter({some: "value"}).forEach(function(doc) {
    return r.db('test').table('test').get(doc('id')).update({something: "else"}, {returnVals: true}).do(function(result) {
        return {generated_keys: [result("new_val")]}
    })
})("generated_keys")
While it works, it's really really hack-ish. Hopefully with array limits, returnVals will soon be available for range writes.
I have a file with over 30,000 records and another with 41,000. Is there a best practice for seeding these using Laravel 4's db:seed command? A way to make the inserts swifter.
Thanks for the help.
Don't be afraid, a 40K-row table is kind of a small one. I have a 1 million row table and seeding went smoothly; I just had to add this before doing it:
DB::disableQueryLog();
Before disabling it, Laravel exhausted my PHP memory limit, no matter how much I gave it.
I read data from .txt files using fgets(), building the array programmatically and executing:
DB::table($table)->insert($row);
One by one, which may be particularly slow.
My database server is PostgreSQL and the inserts took around 1.5 hours to complete, maybe because I was using a VM with low memory. I will run a benchmark one of these days on a better machine.
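A hedged tweak to the same approach: instead of calling insert() once per line, accumulate the parsed rows and flush them in batches. The batch size of 500 and the $parseLine callable below are placeholders for whatever your file actually needs:

DB::disableQueryLog();

$batch = [];
$handle = fopen($path, 'r');

while (($line = fgets($handle)) !== false) {
    $batch[] = $parseLine($line); // placeholder: turn the raw line into a column => value array

    if (count($batch) >= 500) {
        DB::table($table)->insert($batch);
        $batch = [];
    }
}

// Insert whatever is left over.
if (!empty($batch)) {
    DB::table($table)->insert($batch);
}

fclose($handle);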
2018 Update
I ran into the same issue and, after 2 days of headache, I could finally write a script to seed 42K entries in less than 30 seconds!
You ask how?
1st Method
This method assumes that you have a database with some entries in it (in my case, 42K entries) and you want to import them into another database. Export your database as a CSV file with header names, put the file into the public folder of your project, and then you can parse the file and insert all the entries one by one into the new database via a seeder.
So your seeder will look something like this:
<?php

use Illuminate\Database\Seeder;

class {TableName}TableSeeder extends Seeder
{
    /**
     * Run the database seeds.
     *
     * @return void
     */
    public function run()
    {
        $row = 1;
        if (($handle = fopen(base_path("public/name_of_your_csv_import.csv"), "r")) !== false) {
            while (($data = fgetcsv($handle, 0, ",")) !== false) {
                if ($row === 1) {
                    $row++;
                    continue; // skip the header row
                }
                $row++;
                $dbData = [
                    'col1' => $data[0],
                    'col2' => $data[1],
                    'col3' => $data[2],
                    // ...and so on for however many columns you have
                ];
                $colNames = array_keys($dbData);
                $placeholders = implode(',', array_fill(0, count($dbData), '?'));
                $createQuery = 'INSERT INTO locations ('.implode(',', $colNames).') VALUES ('.$placeholders.')';
                DB::statement($createQuery, array_values($dbData));
                $this->command->info($row);
            }
            fclose($handle);
        }
    }
}
Simple and Easy :)
2nd Method
In case you can modify your PHP settings and allocate a bigger memory limit to a particular script, this method will work as well.
Basically, you need to focus on three major steps:
Allocate more memory to the script
Turn off the query logger
Divide your data into chunks of 1000, then iterate through them and use insert() to write one chunk at a time
So if I combine all of the above-mentioned steps in a seeder, it will look something like this:
<?php

use Illuminate\Database\Seeder;

class {TableName}TableSeeder extends Seeder
{
    /**
     * Run the database seeds.
     *
     * @return void
     */
    public function run()
    {
        ini_set('memory_limit', '512M'); // allocate more memory to the script
        DB::disableQueryLog(); // disable the query log

        // create chunks
        $data = [
            [
                ['col1' => 1, 'col2' => 1, 'col3' => 1, 'col4' => 1, 'col5' => 1],
                ['col1' => 1, 'col2' => 1, 'col3' => 1, 'col4' => 1, 'col5' => 1],
                // ...and so on until 1000 entries
            ],
            [
                ['col1' => 1, 'col2' => 1, 'col3' => 1, 'col4' => 1, 'col5' => 1],
                ['col1' => 1, 'col2' => 1, 'col3' => 1, 'col4' => 1, 'col5' => 1],
                // ...and so on until 1000 entries
            ],
            // ...and so on for however many entries you have (I had 42000)
        ];

        // iterate and insert
        foreach ($data as $key => $d) {
            DB::table('locations')->insert($d);
            $this->command->info($key); // shows where the iterator is in the command line; best feeling in the world to watch it rise :D
        }
    }
}
and VOILA you are good to go :)
I hope it helps
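A hedged variation on the second method: rather than writing the 1,000-entry chunks by hand, let array_chunk() build them from a flat array of rows (for example one parsed from the CSV of the first method); the chunk size of 1000 is just the value used above:

ini_set('memory_limit', '512M');
DB::disableQueryLog();

// $rows is assumed to be a flat array of column => value arrays,
// e.g. built while parsing the CSV file.
foreach (array_chunk($rows, 1000) as $key => $chunk) {
    DB::table('locations')->insert($chunk);
    $this->command->info($key);
}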
I was migrating from a different database and I had to use raw SQL (loaded from an external file) with bulk insert statements (I exported the structure via Navicat, which has an option to break up your insert statements every 250KiB). E.g.:
$sqlStatements = array(
    "INSERT INTO `users` (`name`, `email`)
    VALUES
    ('John Doe','john.doe@gmail.com'),.....
    ('Jane Doe','jane.doe@gmail.com')",
    "INSERT INTO `users` (`name`, `email`)
    VALUES
    ('John Doe2','john.doe2@gmail.com'),.....
    ('Jane Doe2','jane.doe2@gmail.com')"
);
I then looped through the insert statements and executed each using
DB::statement($sql).
I couldn't get insert() to work one row at a time. I'm sure there are better alternatives, but this at least worked while letting me keep it within Laravel's migrations/seeding.
I had the same problem today. Disabling the query log wasn't enough. It looks like an event also gets fired.
DB::disableQueryLog();
// DO INSERTS
// Reset events to free up memory.
DB::setEventDispatcher(new Illuminate\Events\Dispatcher());