We have a grid spanning 126 rows and 11 columns. The grid is editable with roughly a thousand textboxes ( I understand bad design, but seriously the client is adamant).
So on these text boxes we call jQuery custom function to calculate sum and multiplication across the length and breadth of the grid.
The custom method is applied to two or three rows in groups to give subtotals and totals.
Because of the huge amount of generated script the page has slowed down drastically.Drastically, means when I enter any number in textbox it takes atleast 2 seconds to respond back and populate the results in designated textboxes.
We are using .live() method as the grid is inside Updatepanel.
Any help in optimizing the horrible performance is much needed and will be Highly appreciated.
First, everything below is a guess. Have you profiled the page to see where the bottlenecks lie? Is there a public URL where we can see this in action?
A minor improvement can probably be had by switching to .delegate() or .on() for attaching events. Attach the event as close as posible to the grid. I doubt that will help a lot though.
Basically it sounds like you're trying to implement a spreadsheet, so I would advise the same techniques a spreadsheet uses. Use a dependency graph to determine what really needs to be recalculated when a cell changes. Store intermediate results of things that don't change very often. Rather than attempting to recalculate everything at once, use a setTimeout to calculate a few rows of the grid at a time.
Related
I have a lazy datatable with 40 complex editabe columns (most of them have autocomplete, calendar and selectOneMenu components) plus sorting and filtering... and it's taking too much time to load.
I'm using pagination and I noticed that when I select 10 rows by default, the time to load on the browser seems pretty decent. But if I choose 50 rows, changing from one page to the other takes around 13 seconds in the best cases (yes I know, it's 40 complex columns and 50 rows... ).
At first I thought the time complexity was on the server side, because of the complex query it executes each time you apply a filter or change page, but I measured it and the time it takes to query the DB is basically irrelevant compared with the amount of seconds it takes to load.Even more, I checked the perfomance on Chrome and almost 60% of the time is dedicated to 'Scripting' according to the performance tester (I don't really understand all the process of loading the page onto the browser, but I'll trust them about that).
So, my question is...
Am I missing something crucial about performance here? Is there something I can still do from the code in order to improve the loading time? My client's requirements led me into this solution and I cannot sacrifice ant functionalities... So the components in every column, the number of columns and the number of rows per page are non-negotiable. I hope you can give me some hint about it since I'm having serious claims about the response times and don't really know what to do.
Thanks in advance.
I recently worked on a project in google sheets. While everything is working, changes in the page take a while to process and load. There are a few parts of the project (described below). Is there a way to find out what is causing the biggest load on the project, so that I can work on that area?
Alternatively, if someone has experience with the following types of functions, what do you think is causing the biggest slowdown:
I have a query to lookup and match values. Would this be faster as a vlookup with sort in an arrayformula?
=IFERROR(QUERY(Record!A:C,"Select A where B = '"&B7&"' order by A desc limit 1 label A ''"),"")
I have random number generation through an arrayformula:
=ARRAYFORMULA(IF(ROW(B6:B)=6,"RANDOM",IF(ISBLANK(B6:B),"",RANDBETWEEN(0+0*ROW(B6:B),COUNTA(B6:B)))))
This fills in a cell with a random number if the one next to it has a value. I use this for random sampling in another query later.
I have some conditional formatting based on whether the cell has something in it.
I have some data validation based on a range of cells.
Note: Since my main question is about performance, I didn't think having an example file would be beneficial. It would take me a little to make one so if Ii should, let me know. Also, since other questions deal with scripting performance (like this one and this one) I feel like my question is different.
I suspect the RANDBETWEEN formula is your biggest culprit - basically every time the spreadsheet changes in any way whatsoever, even if you dont actually edit, the numbers all change, so inside an arrayformula, depending on how many rows you have, its always recalculating the rand for every single row
I'm trying to get around to fully understand how (and if) d3 observes changes to underlying data.
I've got this example working so far:
http://jsfiddle.net/AyKarsi/x95dk3uu/1/
Ideally I would like to completely remove the necessity to call the
draw()
function and let d3 handle the changes to the data.
In a simple example, like the above it doesn't really matter what to update, but if there is a lot of data redrawing the whole graph doesn't feel right.. :)
So how can I improve the above example and make it more efficient?
d3 is not going to observe your dataSet. You still need to call draw().
It feels right and it works fast because it reuses existing nodes every time you call draw(). Moreover you can use key function to bound elements to data, then even when order of the items change it will still reuse nodes with particular data.
Read more about data here https://github.com/mbostock/d3/wiki/Selections#data and key functions here http://bost.ocks.org/mike/constancy/#key-functions
Key functions can be useful for improving performance independent of
transitions. For example, if you filter a large table, you can use a
key function to reduce the number of DOM modifications: reorder DOM
elements in the update selection rather than regenerating them
I was recently asked this in an interview (Software Engineer) and didn't really know how to go about answering the question.
The question was focused on both the algorithm of the spreadsheet and how it would interact with the browser. I was a bit confused on what data structure would be optimal to handle the cells and their values. I guess any form of hash table would work with cells being the unique key and the value being the object in the cell? And then when something gets updated, you'd just update that entry in your table. The interviewer hinted at a graph but I was unsure of how a graph would be useful for a spreadsheet.
Other things I considered were:
Spreadsheet in a browser = auto-save. At any update, send all the data back to the server
Cells that are related to each other, i.e. C1 = C2+C3, C5 = C1-C4. If the value of C2 changes, both C1 and C5 change.
Usage of design patterns? Does one stand out over another for this particular situation?
Any tips on how to tackle this problem? Aside from the algorithm of the spreadsheet itself, what else could the interviewer have wanted? Does the fact that its in a browser as compared to a separate application add any difficulties?
Thanks!
For an interview this is a good question. If this was asked as an actual task in your job, then there would be a simple answer of use a third party component, there are a few good commercial ones.
While we can't say for sure what your interviewer wanted, for me this is a good question precisely because it is so open ended and has so many correct possible answers.
You can talk about the UI and how to implement the kind of dynamic grid you need for a spreadsheet and all the functionality of the cells and rows and columns and selection of cells and ranges and editing of values and formulas. You probably could talk for a while on the UI implications alone.
Alternatively you can go the data route, talk about data structures to hold a spreadsheet, talk exactly about links between cells for formulas, talk about how to detect and deal with circular references, talk about how in a browser you have less control over memory and for very large spreadsheets you could run into problems earlier. You can talk about what is available in JavaScript vs a native language and how this impacts the data structures and calculations. Also along with data, a big important issue with spreadsheets is numerical accuracy and floating point number calculations. Floating point numbers are made to be fast but are not necessarily accurate in extreme levels of precision and this leads to a lot of confusing questions. I believe very recently Excel switched to their own representation of a fixed decimal number as it's now viable to due spreadsheet level calculations without using the built-in floating point calculations. You can also talk about data structures and calculation and how they affect performance. In a browser you don't have threads (yet) so you can't run all the calculations in the background. If you have 100,000 rows with complex calculations and change one value that cascades across everything, you can get a warning about a slow script. You need to break up the calculation.
Finally you can run form the user experience angle. How is the experience in a browser different from a native application? What are the advantages and what cool things can you do in a browser that may be difficult in a desktop application? What things are far more complicated or even totally impossible (example, associate your spreadsheet app with a file type so a user can double-click a file and open it in your online spreadsheet app, although I may be wrong about that still being unsupported).
Good question, lots of right answers, very open ended.
On the other hand, you could also have had a bad interviewer that is specifically looking for the answer they want and in that case you're pretty much out of luck unless you're telepathic.
You can say hopelessly too much about this. I'd probably start with:
If most of the cells are filled, use a simply 2D array to store it.
Otherwise use a hash table of location to cell
Or perhaps something like a kd-tree, which should allow for more efficient "get everything in the displayed area" queries.
By graph, your interviewer probably meant have each cell be a vertex and each reference to another cell be a directed edge. This would allow you to do checks for circular references fairly easily, and allow for efficiently updating of all cells that need to change.
"In a browser" (presumably meaning "over a network" - actually "in a browser" doesn't mean all that much by itself - one can write a program that runs in a browser but only runs locally) is significant - you probably need to consider:
What are you storing locally (everything or just the subset of cells that are current visible)
How are you sending updates to the server (are you sending every change or keeping a collection of changed cells and only sending updates on save, or are you not storing changes separately and just sending the whole grid across during save)
Auto-save should probably be considered as well
Will you have an "undo", will this only be local, if not, how will you handle this on the server and how will you send through the updates
Is only this one user allowed to work with it at a time (or do you have to cater for multi-user, which brings dealing with conflicts, among other things, to the table)
Looking at the CSS cursor property just begs for one to create
a spreadsheet web application.
HTML table or CSS grid? HTML tables are purpose built for tabular
data.
Resizing cell height and width is achievable with offsetX and
offsetY.
Storing the data is trivial. It can be Mongo, mySQL, Firebase,
...whatever. On blur, send update.
Javascrip/ECMA is more than capable of delivering all the Excel built-in
functions. Did I mention web workers?
Need to increment letters as in column ID's? I got you covered.
Most importantly, don't do it. Why? Because it's already been done.
Find a need and work that project.
I am using Bing Maps with Ajax and I have about 80,000 locations to drop pushpins into. The purpose of the feature is to allow a user to search for restaurants in Louisiana and click the pushpin to see the health inspection information.
Obviously it doesn't do much good to have 80,000 pins on the map at one time, but I am struggling to find the best solution to this problem. Another problem is that the distance between these locations is very small (All 80,000 are in Louisiana). I know I could use clustering to keep from cluttering the map, but it seems like that would still cause performance problems.
What I am currently trying to do is to simply not show any pins until a certain zoom level and then only show the pins within the current view. The way I am currently attempting to do that is by using the viewchangeend event to find the zoom level and the boundaries of the map and then querying the database (through a web service) for any points in that range.
It feels like I am going about this the wrong way. Is there a better way to manage this large amount of data? Would it be better to try to load all points initially and then have the data on hand without having to hit my web service every time the map moves. If so, how would I go about it?
I haven't been able to find answers to my questions, which usually means that I am asking the wrong questions. If anyone could help me figure out the right question it would be greatly appreciated.
Well, I've implemented a slightly different approach to this. It was just a fun exercise, but I'm displaying all my data (about 140.000 points) in Bing Maps using the HTML5 canvas.
I previously load all the data to the client. Then, I've optimized the drawing process so much that I've attached it to the "Viewchange" event (which fires all the time during the view change process).
I've blogged about this. You can check it here.
My example does not have interaction on it but could be easily done (should be a nice topic for a blog post). You would have thus to handle the events manually and search for the corresponding points yourself or, if the amount of points to draw and/or the zoom level was below some threshold, show regular pushpins.
Anyway, another option, if you're not restricted to Bing Maps, is to use the likes of Leaflet. It allows you to create a Canvas Layer which is a tile-based layer but rendered in client-side using HTML5 canvas. It opens a new range of possibilities. Check for example this map in GisCloud.
Yet another option, although more suitable to static data, is using a technique called UTFGrid. The lads that developed it can certainly explain it better than me, but it scales for as many points as you want with a fenomenal performance. It consists on having a tile layer with your info, and an accompanying json file with something like an "ascii-art" file describing the features on the tiles. Then, using a library called wax it provides complete mouse-over, mouse-click events on it, without any performance impact whatsoever.
I've also blogged about it.
I think clustering would be your best bet if you can get away with using it. You say that you tried using clustering but it still caused performance problems? I went to test it out with 80000 data points at the V7 Interactive SDK and it seems to perform fine. Test it out yourself by going to the link and change the line in the Load module - clustering tab:
TestDataGenerator.GenerateData(100,dataCallback);
to
TestDataGenerator.GenerateData(80000,dataCallback);
then hit the Run button. The performance seems acceptable to me with that many data points.