Large datasets (millions of records) need special programming to maintain speed in DBGrids.
Are there any ready-made DBGrid components for Delphi that handle this automatically?
EDIT For example: some databases can fetch the first X records (say 100). When I scroll to the bottom, I want the grid to auto-fetch the next 100; conversely, when I reach the beginning, I want it to fetch the previous 100. I know I can program this myself, but surely that feature can be pushed down into a DBGrid control so the grid does the buffering. It would save quite a bit of programming - you would simply set the "buffer size", so to speak.
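For illustration, a minimal sketch of the manual approach I would like to avoid, assuming an ADO dataset and a backend that supports LIMIT/OFFSET; PageSize and FOffset are illustrative names, not part of any component:

    // Refill the query with the next window of rows; the DBGrid
    // attached through a TDataSource repaints automatically.
    procedure TForm1.FetchWindow(Offset: Integer);
    begin
      ADOQuery1.Close;
      ADOQuery1.SQL.Text :=
        'SELECT * FROM Orders ORDER BY OrderID ' +
        'LIMIT ' + IntToStr(PageSize) + ' OFFSET ' + IntToStr(Offset);
      ADOQuery1.Open;
      FOffset := Offset;  // remember where the window starts
    end;

Scrolling past either end of the window would then call FetchWindow(FOffset + PageSize) or FetchWindow(FOffset - PageSize).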
You might want to take a look at the wonderful (free, open source, dual-licensed as MPL 1.1 and GPL, and thus usable in closed-source apps) Virtual TreeView and its user-supplied descendants (scroll down the page to find those.)
Edit to reflect the question's edit: Virtual TreeView not only allows you to handle millions of nodes without keeping them in memory, but that is in fact the preferred way of using it. You supply the data through event callbacks when it's needed, and you can tell the tree to cache that data (or not.)
Oh, and of course it also has a grid / report mode where it can function as a table (just set the GridExtensions property to True.)
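A minimal sketch of that virtual paradigm, using TVirtualStringTree event and property names from recent Virtual TreeView versions (LookupCell is a hypothetical helper that reads one cell from your own buffer or database):

    procedure TForm1.FormCreate(Sender: TObject);
    begin
      // Declare a million rows; nodes are created lazily and no
      // record data is loaded up front.
      VST.RootNodeCount := 1000000;
    end;

    procedure TForm1.VSTGetText(Sender: TBaseVirtualTree; Node: PVirtualNode;
      Column: TColumnIndex; TextType: TVSTTextType; var CellText: string);
    begin
      // Called only for cells that are actually visible.
      CellText := LookupCell(Node.Index, Column);
    end;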
I would have a look at the Developer Express QuantumGrid Suite. (#birger: you were just a tick faster ;-) ) So that I'm not just duplicating the answer, some elaboration:
The DevExpress grid uses a data controller that has several modes to control the data bound to the grid. One of these is exactly what you are looking for:
Grid Mode

When using Grid Mode, only a fixed number of dataset records is loaded into memory. Because only a limited set of records are retrieved from the dataset, automatic sorting, filtering and summary calculations are disabled in Grid Mode (must be controlled manually instead). By default, this mode is disabled and the ExpressDataController loads all records in a dataset.
It does have some drawbacks, which seem pretty obvious: you cannot summarize, sort, or filter if you do not have all records at hand.
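Enabling it is essentially a one-liner; a sketch against the ExpressQuantumGrid DB table view (property names may vary slightly between versions, so check your docs):

    // Keep only a window of records in memory.
    cxGrid1DBTableView1.DataController.DataModeController.GridMode := True;
    // Optionally tune the number of buffered records - the "buffer
    // size" the question asks about.
    cxGrid1DBTableView1.DataController.DataModeController.GridModeBufferCount := 100;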
NextGrid is a light, fast and nice-looking grid for Delphi:
http://www.bergsoft.net/component/next-grid/features.htm
HANDLING A LARGE AMOUNT OF CELLS WITHOUT LOSING SPEED

NextGrid can handle a very large amount of cells without losing speed. The speed of adding, modifying and deleting data doesn't depend on the amount of cells. In the NextGrid demo you can see how fast NextGrid works with 100,000 rows and 10 columns = 1,000,000 cells.
I think the DevExpress QuantumGrid supports this very well.
Sorry, I just saw your comment to Neftalí.
If you want to bring 100 records at a time and then fetch the next 100, this is the job of the database access components. Look at the Devart components: they offer direct access components for the most-used databases, and they have the feature you are asking about and more:
http://www.devart.com/products-vcl.html
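For example, a sketch with UniDAC's TUniQuery (FetchRows is the documented block size; the FetchAll switch is provider-specific, so check the docs for your database):

    // Fetch rows from the server in blocks as the user scrolls,
    // instead of pulling the whole result set at once.
    UniQuery1.FetchRows := 100;  // rows per network round trip
    UniQuery1.SpecificOptions.Values['FetchAll'] := 'False';  // provider-specific
    UniQuery1.SQL.Text := 'SELECT * FROM Orders ORDER BY OrderID';
    UniQuery1.Open;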
Related
Are there any solutions out there for sorting with data virtualization? The use case is a large set of transactions sorted in any of several ways. Editing a transaction puts it out of order; saving it may move the transaction to a different page. By itself that is not so bad; however, (a) it has to work with a validation system and (b) other entries can be added or edited prior to the save, resulting in an increasingly disordered list.
One solution that I think wouldn't work very well would be to re-sort each dirty page (e.g. right after the save), because that would significantly increase the number of entries notified to the list view, resulting in flicker.
I don't quite follow your question. Maybe you want to sort the collection behind your ItemsControl; there are third-party controls that can achieve this.
For example, the DataGrid in Syncfusion:
The SfDataGrid control for Universal Windows Platform is used to display collections of data in rows and columns. It includes editing and data-shaping features (sorting, grouping, filtering, etc.) that allow end users to easily manage the data.
What I did in the end was to only use a single virtual page. With just one page you can more easily manage the relocation of edited entries, since you won't have to move them between pages or deal with them falling in between pages. New entities can be kept at the bottom of the collection until they are saved. Having only one page also improves the chances of detecting concurrency errors.
Note this question was not primarily about re-sorting; it was more about maintaining a sort order during edits.
The DataGrid suggestion does not suit the situation - I need free-form templates rather than a grid.
Validation, concurrency, and the need for batch edits just make it a bit harder.
I am working in Blue Prism Robotic Process Automation and trying to load an Excel sheet with more than 100k records (it might go upwards of 300k in some cases).
I am trying to load it into Blue Prism's internal work queue, but I get an error as quoted below:
'Load Data Into Queue' ERROR: Internal : Exception of type 'System.OutOfMemoryException' was thrown.
Is there a way to avoid this problem, perhaps by freeing up more memory?
I plan to process records one by one from the queue and put them into new Excel sheets by category. Loading all that data into a collection and looping over it may be memory-consuming, so I am trying to find a more efficient way.
I welcome any and all help/tips.
Thanks!
Basic Solution:
Break up the number of Excel rows you are pulling into your Collection data item at any one time. The thresholds for this will depend on your resource's system memory and architecture, as well as the structure and size of the data in the Excel Worksheet. I've been able to move 50k 10-column rows from Excel to a Collection and then into the Blue Prism queue very quickly.
You can set this up by specifying the Excel Worksheet range to pull into the Collection data item, and then shifting that range each time the Collection has been successfully added to the queue.
After each successful addition to the queue, and/or before you shift the range, and/or at a predefined count limit, you can then run a Clean Up or Garbage Collection action to free up memory.
You can do all of this with the provided Excel VBO and an additional Clean Up object.
Keep in mind:
Even breaking it up, looping over a Collection this large to amend the data will be extremely expensive and slow. The most efficient way to make changes to the data will be at the Excel Workbook level or when it is already in the Blue Prism queue.
Best Bet: esqew's alternative solution is the most elegant and probably your best bet.
Jarrick hit it on the nose in that Work Queue items should provide the bot with information on what they are to be working on and a Control Room feedback space, but not the actual work data to be implemented/manipulated.
In this case you would want to just use the item's Worksheet row number and/or some unique identifier from a single Worksheet column as the queue item data, so that the bot can provide Control Room feedback on the status of the item. If this information is predictable enough in format, there should be no need to move any data from the Excel Worksheet to a Collection and then into a Work Queue; rather, simply build the queue based on that predictability.
Conversely, you can also have the bot build the queue "as it happens": once it grabs a single row of data from the Excel Worksheet to work, it can also add a queue item with the row number of that data. This will then enable Control Room feedback and tracking. However, this would in almost every case be bad practice, as it would not prevent a row from being worked multiple times unless the bot checked the queue first, at which point you've negated the speed gains you were looking to achieve by cutting out the initial queue building in the first place. It would also be impossible to scale the process for multiple bots to work the Excel Worksheet data efficiently.
This is a common issue for RPA, especially when working with large Excel files. As far as I know, there are no 100% solutions, only methods that reduce the symptoms. I have run into this problem several times, and these are the ways I would try to handle it:
Set stage logging to Disabled or Errors Only.
Don't log parameters on action stages (especially ones that work with the Excel files).
Run the garbage collection process.
See if it is possible to avoid reading Excel files into BP collections and instead use OLEDB to query the file.
See if it is possible to increase the RAM on the machines.
If they're using the 32-bit version of the app, then it doesn't really matter how much memory you feed it; Blue Prism will cap out at 2 GB.
This may be because of the BP Server, as memory is shared between processes and the work queue. A better option is to use two bots and multiple queues to avoid the memory error.
If you're using Excel documents or CSV files, you can use the OLEDB object to connect to and query against them as if they were a database. You can use SQL syntax to limit the number of rows that are returned at a time and paginate through them until you've reached the end of the document.
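The Excel OLEDB provider has no LIMIT/OFFSET, but it lets you address a block of rows directly in the FROM clause, so you can walk the sheet window by window. A sketch of the connection string and query, shown here in Delphi/ADO terms, though the same strings work from any OLEDB client; the file path, sheet name and column span are illustrative:

    ADOConnection1.ConnectionString :=
      'Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\book.xlsx;' +
      'Extended Properties="Excel 12.0 Xml;HDR=NO"';  // columns come back as F1..Fn
    ADOQuery1.Connection := ADOConnection1;
    ADOQuery1.SQL.Text := 'SELECT * FROM [Sheet1$A2:J101]';  // rows 2..101
    ADOQuery1.Open;  // next window: [Sheet1$A102:J201], and so on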
For starters, you are making incorrect use of the Work Queue in Blue Prism. The Work Queue should not be used to store this type and amount of data. (Please read the BP documentation on Work Queues thoroughly.)
Solving the issue at hand (the misuse) requires two changes:
Only store references in your Item Data which point to the Excel file containing the data.
If you're consulting this much data many times, perhaps convert the file into a CSV and write a VBO that queries the data directly in the CSV.
The first change is not just a recommendation, but as your project progresses and IT Architecture and InfoSec comes into play, it will be mandatory.
As for the CSV VBO, take a look at C#; it will make your life a lot easier than loading all this data into BP (time-consuming, unreliable, ...).
I am using GWT 2.4. There are times when I have to show a huge number of records, for example 50,000, on my screen in a Grid or FlexTable. It takes very long to load that screen, around 30 minutes or so; ultimately the screen hangs, or at times IE displays a warning that the script might take too long and the application will stop responding, asking whether you wish to continue.
Is there any solution to improve GWT performance?
Don't bring all the data at once; bring it in pages, as the comments here suggest.
However, paging may not be trivial: it might be that during paging your DB is filled with more entries, and if you're sorting the results, the new entries might ruin your sort order (for example, when trying to fetch page #2, some entries that should have been on the first page are inserted).
You may decide to create some sort of "cursor" for paging purposes that reflects the state of your database at the point you created it, so that entries added while you traverse between pages are ignored.
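A common stateless approximation of such a cursor is keyset paging: remember the last sort key the client received and filter on it, so pages you have already served cannot shift. The idea is language-independent; sketched here in the Delphi/SQL style of the rest of this page, with made-up table and column names:

    // LastSeenId is the key of the last row sent to the client.
    Query.SQL.Text :=
      'SELECT id, created_at, details FROM transactions ' +
      'WHERE id > ' + IntToStr(LastSeenId) + ' ' +
      'ORDER BY id LIMIT 100';
    Query.Open;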
Another option you may consider as part of paging is providing only a condensed version of each record - i.e., only the most important details - and letting the user double-click to see the full details for the record. This can also give you some performance improvement within each page.
I have a question regarding jqGrid's capabilities. We want to turn paging off and load all the data on the client side without implementing virtual scrolling. How many records can jqGrid realistically handle? We have been trying to load 50,000 records * 20 columns and it seems to blow up (note: virtual scrolling and paging both turned off).
I don't think that's a good idea. No user is able to look through 1 million cells, so you would be sending 99.9% (or more) unneeded data to the client. What you really need is an implementation of subject-oriented filtering of the data.
The performance of the grid will mostly depend on the JavaScript engine of the web browser you use. Nobody can give you a general recommendation covering both IE6 and Chrome 19, because their JavaScript performance is very different.
In any case, I am sure that you have to implement paging, sorting and filtering on the server side. I think it's really required in the case of 1 million cells of data.
We are working on a .NET desktop application. The GUI has a number of tabs and panels: a graph tab, an images tab, a result grid tab, etc. The task is to fetch about 50,000 records and be able to view them in different ways, e.g. graph two columns against each other, go through the results in the grid, and view images for the records.
The application is developed, but its performance is very bad. We are targeting at least 50,000 records, but its response already gets weird at about 5,000 records.
Facts:
1- Queries are complex and include a number of joins - on average 10 to 12 table joins, and sometimes a subquery as a joined table. They take about 8 to 10 seconds to return results.
What can be done to achieve performance at this level?
- Indexes are used properly.
Can using SSIS (SQL Server Integration Services) help in this context?
2- Graphs support a much smaller amount of data and start getting exhausted at about 4,000 records. What can be done to improve the graphs' performance?
Paging can't be used when graphs are involved.
Please post your schema and a sample query so that it can be improved upon.
As for the graph performance, here's some general performance tips (assuming WinForms):
Does the graph object have a .BeginUpdate() or .BeginDataUpdate() and a corresponding .EndUpdate() / .EndDataUpdate() method? If it has those, then you should be using them. The same applies to the GridControl as well.
Are you adding the items to the graph/grid (or their data source) one by one, or are you calling .AddRange() or setting the data source / data bindings? If you're adding the items one by one, the control will often rebuild its internal list over and over. This was a common problem with the .NET 1.1 ListView control: the items underneath were stored in an array, and each .Add(..) call recreated the array, so adding items very quickly became O(n^2).
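The batching pattern looks like this, sketched with the VCL TListView since this page is otherwise Delphi-flavoured (the WinForms controls expose equivalent BeginUpdate/EndUpdate and AddRange members):

    // Rows: an array of string prepared beforehand; i: Integer.
    ListView1.Items.BeginUpdate;  // suspend repainting per item
    try
      for i := 0 to High(Rows) do
        ListView1.Items.Add.Caption := Rows[i];  // no repaint here
    finally
      ListView1.Items.EndUpdate;  // one repaint at the end
    end;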
What graphing and grid controls are you using?