Laravel Eloquent query on 2 million rows takes a long time

I have to pull data from a table that has 2 million rows. The Eloquent query looks like this:
$imagesData = Images::whereIn('file_id', $fileIds)
    ->with('image.user')
    ->with('file')
    ->orderBy('created_at', 'DESC')
    ->simplePaginate(12);
The $fileIds array used in whereIn can contain hundreds or even thousands of file ids.
The above query works fine on a small table. But on the production site, where the Images table has over 2 million rows, it takes over 15 seconds to get a reply. I use Laravel for the API only.
I have read through other discussions on this topic. I changed paginate() to simplePaginate(). Some suggest that a DB:: query with whereRaw might work better than whereIn. Some say it might be due to how PDO in PHP processes whereIn, and some recommend using Images::whereIn, which I already use.
I use MariaDB with InnoDB as the DB engine, and the data is loaded into RAM. SQL performs well for all other queries; only the ones that have to gather data from a huge table like this take time.
How can I optimise the above Laravel query to bring the response time down to a couple of seconds, if possible, when the table has millions of rows?

You need indexing, which segments your data by certain columns. You are filtering on file_id and sorting on created_at, so the following composite index will help performance.
$table->index(['file_id', 'created_at']);
Indexing will increase insert time and can give queries unexpected execution plans. If you run SQL EXPLAIN on the query before and after adding the index, you can verify that it actually helps.
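As a sketch, you can pull the execution plan from a Tinker session or a throwaway route. This assumes the model's table is named images and uses placeholder ids in place of the real $fileIds binding:
use Illuminate\Support\Facades\DB;

// Sketch: compare the plan before and after the index migration runs.
// Placeholder ids; in practice bind the real $fileIds array.
$plan = DB::select(
    'EXPLAIN SELECT * FROM images WHERE file_id IN (?, ?, ?) ORDER BY created_at DESC LIMIT 12',
    [101, 102, 103]
);

dump($plan);
Once the composite index exists, the key column of the plan should show it being used and the rows estimate should drop sharply.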

Here is an update on steps taken to speed up the page load.
The real culprit for such a slow response was not the above query alone. After that query fetches the data, the PHP code iterates over the results and runs a sub-query inside the loop to check something. That sub-query searched on the filename column, which is a string and not indexed, so each iteration crawled through 1.5 million rows inside a foreach loop and the controller's response time ballooned. Once I removed this sub-query, the loading time dropped dramatically.
Secondly, I added the index on file_id and created_at as suggested by #mrhn and #ceejayoz above. I created a migration like this:
Schema::table('images', function (Blueprint $table) {
    // Passing an array creates a single composite index covering both columns.
    $table->index(['file_id', 'created_at']);
});
I optimised the PHP script further. I removed all queries that searched by fileName and changed them to fetch results by id instead. Doing this made a huge difference throughout the app and also reduced CPU load on the server during peak hours.
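For illustration only (the model and column names are assumed from the description above), the difference is between filtering on an unindexed string column and looking up by primary key:
// Before (assumption: filename is an unindexed string column) - forces a full scan on every loop iteration.
$image = Images::where('filename', $fileName)->first();

// After - a primary-key lookup that uses the clustered index.
$image = Images::find($imageId);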
Lastly, I optimised the server by performing the following steps:
Completed any yum updates
Updated LiteSpeed
Tweaked the Apache profile, which included enabling some caching modules and installing EA-PHP 7.3
Updated cPanel
Tuned MySQL to allow the server to utilise more of its resources
Once I completed all of the above steps, I found a huge difference in the loading speed.
Thanks to each and every one of you who commented on this. Your comments helped me perform all of the above steps, and the result was fruitful. Thanks heaps.

Related

What are the best options for data loading in bigger systems? (Laravel)

As my question says, I would like to know what my best choice is for loading data in bigger systems in Laravel.
At the moment I use Laravel Eloquent to pull data from the database, and in my views I use the DataTables JS library. This is effective for smaller systems or websites; however, in bigger and more complex systems, loading that data can take a very long time (sometimes more than 15 seconds).
I have found some solutions:
Eager loading relations, which avoids a separate relation query for every row (see the sketch after this list)
Using the DB query builder instead of Eloquent
Using Laravel pagination instead of DataTables pagination
Loading data into DataTables from an AJAX source
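A minimal sketch of the eager-loading point, using a hypothetical Post model with an author relation:
// N+1 pattern: one query for the posts, then one extra query per row for the author.
$posts = Post::latest()->paginate(50);
foreach ($posts as $post) {
    echo $post->author->name;
}

// Eager loading: the authors are fetched in a single additional query, regardless of row count.
$posts = Post::with('author')->latest()->paginate(50);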
However, pagination has some problems, especially because DataTables offers its own ordering/searching of the data.
My question is: do you have any advice on how to load data in the most effective way, so that it is as fast as possible and the code stays as clean as possible? What do you do when you need to load a ginormous amount of data?
One of the best ways to optimize your queries is by using MySQL indexes. As per this documentation:
Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. This is much faster than reading every row sequentially.
The simplest way to create an index is the following:
CREATE INDEX index_name
ON table_name (column1, column2, ...);
If you want the Laravel way of creating an index, you can do it with the index() method, as per the official documentation:
$table->index('column');
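For example, the composite equivalent of the raw SQL above can be declared in a migration like this (table and column names are placeholders):
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

Schema::table('table_name', function (Blueprint $table) {
    // One composite index covering both columns, matching CREATE INDEX ... (column1, column2).
    $table->index(['column1', 'column2']);
});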

Laravel mongodb very slow fetching records using wherein query

I have an around 6 GB collection with more than 7 lakh (700,000) records.
I am trying the query below:
Location::whereIn('location_id', $id)->get();
but it's taking more than 15 seconds to fetch data for 200 records. I cannot use pagination; I need it in one go.
I am using Laravel's jenssegers/mongodb package.
Searching the large collection is what's taking the time, not the relatively small response.
Look into adding indexes.
Another option would be archiving old data so the data being queried is smaller.
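As a sketch of the indexing suggestion, assuming the collection is named locations and your version of the jenssegers package supports its schema builder (the shell equivalent would be db.locations.createIndex({ location_id: 1 })):
use Illuminate\Support\Facades\Schema;
use Jenssegers\Mongodb\Schema\Blueprint;

// Index the field used in whereIn() so MongoDB can avoid a full collection scan.
Schema::connection('mongodb')->table('locations', function (Blueprint $collection) {
    $collection->index('location_id');
});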

faster large data exports in laravel to avoid timeouts

I have to generate a report from the database with thousands of records. The report is generated on a monthly basis, and at times the user might want a report spanning around 3 months. As per the current records, a single month's data set can already reach around 5,000 rows.
I am currently using vue-excel, which makes an API call to the Laravel API; the API returns the resource, which is then exported by vue-excel. The resource does not only return the model data; there are related data sets I also need to fetch.
For smaller data sets this works fine, i.e. when I am fetching around 3,000 records, but for anything larger than this the server times out.
I have also tried Laravel Excel with the query concern; I actually timed them, and both take the same amount of time because Laravel Excel was also mapping to get me the relations.
So basically, my question is: is there some better way to do this so as to get this data faster and avoid the timeouts?
Just put this at the start of the function:
ini_set('max_execution_time', 84000); // 84000 is in seconds
This will override the built-in maximum script runtime for that request.
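In context that looks something like the sketch below; the controller and method names are made up for illustration. Note that a front-end web server or proxy can still enforce its own timeout independently of PHP's limit.
namespace App\Http\Controllers;

use Illuminate\Http\Request;

class ReportExportController extends Controller
{
    public function export(Request $request)
    {
        // Raise the PHP execution limit for this request only, so the long-running
        // export does not hit the default max_execution_time.
        ini_set('max_execution_time', 84000); // value in seconds

        // ... build and return the export as before ...
    }
}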

Querying from azure db using linq takes a lot of time

I am using Azure db and my table has over 120000 records.
With paging of 10 records per page I am fetching data into an IQueryable, but fetching just 10 records is taking around 2 minutes. The query has no joins and only 2 filters. Using Azure Search, I can get all the records within 3 seconds.
Please suggest how I can speed up my LINQ query, as Azure Search is costly.
Based on the query in the comment to your question, it looks like you are reading the entire DB table into the memory of your app. The data is potentially transferred from one side of the data centre to the other, which causes the performance issue.
unitofwork.Repository<EntityName>().GetQueriable().ToList().Skip(1).Take(10);
Without seeing the rest of the code, I'm just guessing your LINQ query should be something like this:
unitofwork.Repository<EntityName>().GetQueriable().Skip(1).Take(10).ToList();
Skip and Take should be executed on the DB server, while .ToList() at the end materializes the entities.

Joomla getItems default Pagination

Can anyone tell me if the getItems() function in the model automatically adds the globally set LIMIT before it actions the query (from getListQuery())? Joomla is really struggling, seemingly trying to cache the entire result set (over 1 million records here!).
After looking in /libraries/legacy/model/list.php and /libraries/legacy/model/legacy.php, it appears that getItems() does add LIMIT to setQuery using $this->getState('list.limit') before it sends the results to the cache, but if this is the case, why is Joomla struggling so much?
So what's going on? How come phpMyAdmin can return the limited results within a second and Joomla just times out?
Many thanks!
If you have one million records, you'll most definitely want to do as Riccardo is suggesting, override and optimize the model.
JModelList runs the query twice, once for the pagination numbers and then for the display query itself. You'll want to carefully inherit from JModelList to avoid the pagination query.
Also, the articles query is notorious for its joins. You can definitely lose some of that slowdown (I doubt you are using the contacts link, for example).
If all articles are visible to the public, you can remove the ACL check - that's pretty costly.
There is no DBA, from the West or the East, who can explain why all of those GROUP BYs are needed, either.
Losing those things will help considerably. In fact, building your query from scratch might be best.
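A rough sketch of that approach (the component and column list are invented; the real query depends on your schema):
// Sketch only: a list model that builds a lean query instead of the stock
// articles query with its many joins.
class MycomponentModelItems extends JModelList
{
    protected function getListQuery()
    {
        $db    = $this->getDbo();
        $query = $db->getQuery(true);

        // Select only the columns the view needs; no ACL, contact or tag joins.
        $query->select($db->quoteName(array('a.id', 'a.title', 'a.created')))
            ->from($db->quoteName('#__content', 'a'))
            ->order($db->quoteName('a.created') . ' DESC');

        // getItems() will still apply the list.limit / list.start pagination state.
        return $query;
    }
}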
It does add the pagination automatically.
Its struggling is most likely due to a large dataset (i.e. 1,000+ items returned in the collection) and many lookup fields: the content modules, for example, join as many as 10 tables to get author names etc.
This can be a real killer; I had queries running for over one second on a dedicated server with only 3,000 content items. One tag-cloud component we found could take as long as 45 seconds to return a keyword list. If this is the situation (a lot of records and many joins), your only way out is to narrow the filters in the options further and see if you can get faster results (for example, limiting to articles from the last 3 months can reduce the time needed dramatically).
But if this is not sufficient or not viable, you're left with writing a new optimized query in a new model, which ultimately will bring the biggest performance gain of any optimization. In writing the query, consider leveraging database-specific optimizations, i.e. adding indexes or full-text indexes, and only use joins if you really need them.
Also keep in mind that the number of joins must never grow with the number of fields, translations and so on.
A constant query is easy for the db engine to optimize and cache, whilst a dynamic query will never be as efficient.
