Complex Primefaces datatable slow loading on client side - performance

I have a lazy datatable with 40 complex editable columns (most of them contain autocomplete, calendar and selectOneMenu components) plus sorting and filtering... and it's taking far too long to load.
I'm using pagination, and I noticed that with the default page size of 10 rows the load time in the browser is pretty decent. But if I choose 50 rows per page, changing from one page to the next takes around 13 seconds in the best case (yes, I know, it's 40 complex columns and 50 rows...).
At first I thought the cost was on the server side, because of the complex query that runs every time you apply a filter or change page, but I measured it and the time spent querying the DB is basically negligible compared with the seconds the page takes to load. What's more, I profiled the page in Chrome and almost 60% of the time is spent on 'Scripting' according to the Performance panel (I don't fully understand the whole process of loading the page in the browser, but I'll trust it on that).
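For context, the server-side work measured above is the lazy load callback. A minimal sketch, assuming the older PrimeFaces LazyDataModel API (5.x/6.x signature) and hypothetical MyRow / MyRowDao types standing in for the real entity and data access layer:

```java
import java.util.List;
import java.util.Map;

import org.primefaces.model.LazyDataModel;
import org.primefaces.model.SortOrder;

/** Hypothetical row entity standing in for the 40-column editable row. */
class MyRow { /* ids, dates, codes, ... */ }

/** Hypothetical DAO: one count query and one page query per page change. */
interface MyRowDao {
    int countMatching(Map<String, Object> filters);
    List<MyRow> findPage(int first, int pageSize, String sortField,
                         SortOrder sortOrder, Map<String, Object> filters);
}

public class MyLazyModel extends LazyDataModel<MyRow> {

    private final MyRowDao dao;

    public MyLazyModel(MyRowDao dao) {
        this.dao = dao;
    }

    @Override
    public List<MyRow> load(int first, int pageSize, String sortField,
                            SortOrder sortOrder, Map<String, Object> filters) {
        // This is the server-side part that was measured and found cheap:
        // fetch one page of rows and tell the paginator the total row count.
        setRowCount(dao.countMatching(filters));
        return dao.findPage(first, pageSize, sortField, sortOrder, filters);
    }
}
```

If this load() is fast, the remaining time is spent on the client rendering and scripting the 40 editor components per row, which matches the Chrome profile described above.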
So, my question is...
Am I missing something crucial about performance here? Is there anything I can still do in the code to improve the loading time? My client's requirements led me to this solution and I cannot sacrifice any functionality... so the components in every column, the number of columns and the number of rows per page are non-negotiable. I hope you can give me some hint, since I'm getting serious complaints about the response times and don't really know what to do.
Thanks in advance.

Related

Google Sheets ArrayFormula with Filter and/or SumIf

I'm working on creating a faster, more efficient report within Google Sheets (we're limited to that at the moment, but senior management is looking for alternatives in the distant future). The first version used QUERY and IMPORTRANGE across roughly 3,000 cells. This, of course, caused monumental lag and frequently caused major issues, particularly when looking at different date ranges.
The second version used IMPORTRANGE to pull data into about 30 different tabs within the workbook, then used FILTER to get the data from those tabs. It is significantly faster, but still somewhat sluggish, taking up to 5 minutes to update with new date ranges.
For the third version, the one I'm working on, I'd like to use ArrayFormula so that instead of 3,000 cells with a formula there will be about 60 cells with formulas, which I think should speed things up tremendously.
I've looked at several options, but none seemed to quite fit the need that I'm looking to fill. I even experimented with several to see if they could be tweaked to fit. No dice.
Here's a sample spreadsheet where I've entered a couple of days worth of test data for some sample colleagues:
https://docs.google.com/spreadsheets/d/18rF3ZnvmNKZ9finPaR3a2rjsVBoDF4XREyRE4gazsfc/edit?usp=sharing
I'm matching the colleague and date with the date in the Daily column to get their metrics using a FILTER formula. I'm at a loss as to how to do that with an ArrayFormula so that I only need a single formula for each column. Any help would be appreciated.
As I'm still relatively new to the more complex uses of spreadsheets, please explain how any formulae you provide work.

Better measure than "average" to highlight outlying values?

For web app debugging purposes, I am currently showing a single number that represents the average JSON request time.
We do a large number of JSON requests, and most of them succeed quickly. The ones that occasionally take a few seconds to complete are the real problem. Showing the average request time is nearly useless, as the average is almost always excellent.
What kind of calculation can I do to generate a quality-of-service type of number that will go up when long request times start to become more prevalent?
Something like "max request time time in the last 5 minutes" is one way, but that doesn't give any information about how many times it happened or really how bad the problem is.
Thank you.
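For illustration, a rolling percentile over a time window is one common quality-of-service style number (this is a suggestion, not something from the question). A minimal Java sketch with a hypothetical class, a 5-minute window and the 95th percentile:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Rolling 95th percentile of request durations over the last 5 minutes. */
public class RequestTimeWindow {
    private static final long WINDOW_MILLIS = 5 * 60 * 1000L;

    // Each entry is {timestampMillis, durationMillis}.
    private final Deque<long[]> samples = new ArrayDeque<>();

    public synchronized void record(long durationMillis) {
        long now = System.currentTimeMillis();
        samples.addLast(new long[] { now, durationMillis });
        evictOld(now);
    }

    /** 95th percentile of durations in the current window, or 0 if empty. */
    public synchronized long p95() {
        evictOld(System.currentTimeMillis());
        if (samples.isEmpty()) {
            return 0L;
        }
        List<Long> durations = new ArrayList<>();
        for (long[] sample : samples) {
            durations.add(sample[1]);
        }
        Collections.sort(durations);
        int idx = (int) Math.ceil(0.95 * durations.size()) - 1;
        return durations.get(Math.max(idx, 0));
    }

    private void evictOld(long now) {
        while (!samples.isEmpty() && now - samples.peekFirst()[0] > WINDOW_MILLIS) {
            samples.removeFirst();
        }
    }
}
```

Unlike the average, a p95 (or p99) climbs as soon as slow requests become more than a small fraction of traffic, and unlike a plain max it reflects how widespread they are.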

Approach to handle data that keeps changing

I am working on a website, and need to present something like 'x% of users who viewed this page bought this product.'
Setting aside the discussion of business value, I want to know what an acceptable approach to computing x% would be.
I currently have two approaches. Both require storing the number of users who viewed the page and the number of users who bought the product.
One approach is to calculate the figure on the fly. The pro is that it presents accurate data; the con is that the user's wait time increases because of the calculation.
The other approach is that every time a user views the page or buys the product, the x% value is recalculated and persisted to the database. The pro is that users get the info quickly; the cons are a lot of extra calculations, and the data may not be as accurate.
Assuming we expect hundreds of page views per hour, which is the better approach? Or would a third approach work better?
Thanks!
I think your best bet is to find a balance between calculating the exact value on every user visit and showing a figure that is still accurate enough.
You could log every user visit and every purchase in a database, then on every 100th visit or so, perform the calculation. Log that result in your database as well, and have your site pull the figure from there rather than calculating it on every visit.
And depending on how accurate you need to be, and how performance heavy the operation is, you can adjust your interval for calculating the value.
So, in all, each user visit increments a value in the database. On the back end, that value is checked to see whether it has crossed another interval (so if your interval is 100, you check whether value % 100 == 0). You then have an operation that only runs on 1/100th of visits, and the displayed figure is still accurate to within the hour (given your estimate of hundreds of views per hour).
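A minimal sketch of that interval-based recalculation (hypothetical class; in practice the counters and the cached percentage would live in the database rather than in memory):

```java
import java.util.concurrent.atomic.AtomicLong;

/** Recalculates the "x% of viewers bought" figure only every Nth view. */
public class ConversionRateTracker {
    private static final int RECALC_INTERVAL = 100; // tune to taste

    // Stand-ins for the persisted view/purchase counts.
    private final AtomicLong viewCount = new AtomicLong();
    private final AtomicLong purchaseCount = new AtomicLong();

    // Stand-in for the persisted, cached x% value the page displays.
    private volatile double cachedRatePercent = 0.0;

    public void recordView() {
        long views = viewCount.incrementAndGet();
        if (views % RECALC_INTERVAL == 0) {
            // Only 1 in every RECALC_INTERVAL visits pays for the calculation.
            cachedRatePercent = 100.0 * purchaseCount.get() / views;
        }
    }

    public void recordPurchase() {
        purchaseCount.incrementAndGet();
    }

    /** What the page actually displays. */
    public double conversionRatePercent() {
        return cachedRatePercent;
    }
}
```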
Having said this, I agree with Jim Garrison's comment about premature optimization. I don't think the operation will have a noticeable impact on your site's performance, and if you want to be as accurate as possible, you can run the calculation every time a user visits the site or purchases an item.

MongoDB count queries have poor performance

Counting the number of records in a collection that match a query, even on an indexed field, takes too much time. For example, say a collection consists of 10,000 records and there is an index on its creationDate field. Getting the last ten records from the collection is faster than counting the number of records created in the last day. The count query takes more than 5 seconds, sometimes even up to 70 seconds, to return a result. Any idea how to solve this, or what the best way to approach it is?
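For reference, the kind of count described here, written against the legacy MongoDB Java driver of that era (the database and collection names are hypothetical; creationDate is from the question):

```java
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;

public class CountExample {
    public static void main(String[] args) throws Exception {
        MongoClient client = new MongoClient("localhost", 27017);
        DB db = client.getDB("mydb");
        DBCollection records = db.getCollection("records");

        // Records created in the last 24 hours.
        long startOfWindow = System.currentTimeMillis() - 24L * 60 * 60 * 1000;
        BasicDBObject query = new BasicDBObject("creationDate",
                new BasicDBObject("$gte", new java.util.Date(startOfWindow)));

        // Uses the index on creationDate, but on MongoDB 2.2 the count
        // itself can still be slow, as discussed below.
        long createdLastDay = records.count(query);
        System.out.println("Created in the last day: " + createdLastDay);

        client.close();
    }
}
```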
By the way, we also use Morphia, and we noticed that getting the count through Morphia is even slower, so for count queries we convert the Morphia query into a plain Java driver query. Has anyone encountered a similar situation? Why is Morphia even slower to respond? Does this happen only for count queries, or is it slow in general compared with using the Java driver directly?
Help, suggestions or workarounds would be really appreciated; our application relies heavily on count queries, and the slowness of the system is a real problem for us right now.
Thanks in advance.
While this might not be the final answer, let's get started and evolve this further:
Your indexes should always fit into RAM, otherwise you will get really bad performance.
To evaluate how much RAM is used, you can either use 10gen's MMS or check with various tools. For a description, plus possible reasons for low (resident) memory usage, see http://www.kchodorow.com/blog/2012/05/10/thursday-5-diagnosing-high-readahead/. Or you simply haven't accessed enough data yet, in which case you can use MongoDB's touch (but I doubt that, since you're already having performance issues).
Besides adding RAM and making sure you use all the available RAM, you could also drop unused indexes or use compound indexes where possible.
Wait for a fix.
As Asya Kamsky commented, count performance is really bad on MongoDB 2.2. The only workaround we've found is to avoid counts as much as possible.
(There are other things that are inexplicably slow in MongoDB, such as aggregation queries; most of those have associated JIRA issues and are being worked on or scheduled.)
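To make "avoid them as much as possible" concrete: one standard pattern (my illustration, not part of the original answer) is to maintain a pre-aggregated counter document, bumped with $inc on every insert, so reads become a cheap _id lookup instead of a count. A sketch with the legacy Java driver, using hypothetical names:

```java
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class PreaggregatedCounter {
    public static void main(String[] args) throws Exception {
        MongoClient client = new MongoClient("localhost", 27017);
        DB db = client.getDB("mydb");
        DBCollection counters = db.getCollection("counters");

        String today = new java.text.SimpleDateFormat("yyyy-MM-dd")
                .format(new java.util.Date());

        // On every insert into the main collection, also bump the day's counter.
        // The upsert flag creates the counter document the first time.
        counters.update(
                new BasicDBObject("_id", "created:" + today),
                new BasicDBObject("$inc", new BasicDBObject("count", 1)),
                true,   // upsert
                false); // multi

        // Reading the count is now a cheap _id lookup instead of a count() scan.
        DBObject doc = counters.findOne(new BasicDBObject("_id", "created:" + today));
        System.out.println(doc);

        client.close();
    }
}
```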

Why is pagination so resource-expensive?

It's one of those things that seems to have an odd curve where the more I think about it, the more it makes sense. To a certain extent, of course. And then it doesn't make sense to me at all.
Care to enlighten me?
Because in most cases you've got to sort your results first. For example, when you search on Google, you can view only up to 100 pages of results. They don't bother sorting by page rank beyond 1000 websites for a given keyword (or combination of keywords).
Pagination is fast. Sorting is slow.
Lubos is right: the problem is not the fact that you are paging (which takes a HUGE amount of data off the wire), but that you need to figure out what actually goes on each page.
The fact that you need to page implies there is a lot of data. A lot of data takes a long time to sort :)
This is a really vague question. We'd need a concrete example to get a better idea of the problem.
This question seems pretty well covered, but I'll add a little something MySQL specific as it catches out a lot of people:
Avoid using SQL_CALC_FOUND_ROWS. Unless the dataset is trivial, counting the matches and retrieving x of them in two separate queries is going to be a lot quicker. (If it is trivial, you'll barely notice a difference either way.)
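A minimal sketch of that two-query approach over JDBC and MySQL (table, column and connection details are hypothetical):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PaginationExample {
    public static void main(String[] args) throws Exception {
        int page = 3;
        int pageSize = 50;

        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/mydb", "user", "password")) {

            // Query 1: total number of matches, for the page count.
            long total;
            try (PreparedStatement count = conn.prepareStatement(
                         "SELECT COUNT(*) FROM articles WHERE published = 1");
                 ResultSet rs = count.executeQuery()) {
                rs.next();
                total = rs.getLong(1);
            }

            // Query 2: only the rows for the requested page.
            try (PreparedStatement rows = conn.prepareStatement(
                    "SELECT id, title FROM articles WHERE published = 1 "
                    + "ORDER BY created_at DESC LIMIT ? OFFSET ?")) {
                rows.setInt(1, pageSize);
                rows.setInt(2, (page - 1) * pageSize);
                try (ResultSet rs = rows.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("id") + " " + rs.getString("title"));
                    }
                }
            }

            System.out.println("Total pages: " + (total + pageSize - 1) / pageSize);
        }
    }
}
```

With a suitable index on the filter and sort columns, both queries stay cheap, whereas SQL_CALC_FOUND_ROWS forces MySQL to materialize the full matching set just to count it.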
I thought you meant pagination of the printed page - that's where I cut my teeth. I was going to launch into a great monologue about collecting all the content for the page, positioning (a vast number of rules here; constraint engines are quite helpful) and justification... but apparently you were talking about the process of organizing information on web pages.
For that, I'd guess database hits. Disk access is slow. Once you've got it in memory, sorting is cheap.
Of course sorting an arbitrary query takes some time, but if you're having problems with the same paginated query being used regularly, there's either something wrong with the database setup (improper indexing or none at all, too little memory, etc.; I'm not a DB admin) or you're doing pagination seriously wrong:
Terribly wrong: e.g. running select * from hugetable where somecondition; into an array, getting the page count from the array length, picking the relevant indexes and discarding the array - then repeating this for each page... That's what I call seriously wrong.
The better solution is two queries: one that gets just the count, and another that gets the results using limit and offset. (Some proprietary, non-standard SQL server might offer a single-query option, I don't know.)
The bad solution might actually work quite okay on small tables (in fact it's not unthinkable that it's faster on very small tables, because the overhead of making two queries is bigger than that of getting all rows in one query; I'm not saying it is so...), but as soon as the database begins to grow the problems become obvious.
