What is the capacity of a BluePrism Internal Work Queue?

What is the capacity of a BluePrism Internal Work Queue? - memory-management

I am working in BluePrism Robotics Process Automation and trying to load an excel sheet with more than 100k records (It might go upwards of 300k in some cases).
I am trying to load internal work queue of BluePrism, but I get an error as quoted below:
'Load Data Into Queue' ERROR: Internal : Exception of type 'System.OutOfMemoryException' was thrown.
Is there a way to avoid this problem, in the way where I can free up more memory?
I plan to process records one by one from queue, and put them into new excel sheets categorically. Loading all that data in a collection and looping over it may be memory consuming, so I am trying to find out a more efficient way.
I welcome any and all help/tips.
Thanks!

Basic Solution:
Break up the number of Excel rows you are pulling into your Collection data item at any one time. The thresholds for this will depend on your resource system memory and architecture, as well as structure and size of the data in the Excel Worksheet. I've been able to quickly move 50k 10-column-rows from Excel to a Collection and then into the Blue Prism queue very quickly.
You can set this up by specifying the Excel Worksheet range to pull into the Collection data item, and then shift that range each time the Collection has been successfully added to the queue.
After each successful addition to the queue and/or before you shift the range and/or at a predefined count limit you can then run a Clean Up or Garbage Collection action to free up memory.
You can do all of this with the provided Excel VBO and an additional Clean Up object.
Keep in mind:
Even breaking it up, looping over a Collection this large to amend the data will be extremely expensive and slow. The most efficient way to make changes to the data will be at the Excel Workbook level or when it is already in the Blue Prism queue.
Best Bet: esqew's alternative solution is the most elegant and probably your best bet.
Jarrick hit it on the nose in that Work Queue items should provide the bot with information on what they are to be working on and a Control Room feedback space, but not the actual work data to be implemented/manipulated.
In this case you would want to just use the items Worksheet row number and/or some unique identifier from a single Worksheet column as the queue item data so that the bot can provide Control Room feedback on the status of the item. If this information is predictable enough in format there should be no need to move any data from the Excel Worksheet to a Collection and then into a Work Queue, but rather simply build the queue based on that data predictability.
Conversely you can also have the bot build the queue "as it happens", in that once it grabs the single row data from the Excel Worksheet to work it, can as well add a queue item with the row number of the data. This will then enable Control Room feedback and tracking. However, this would, in almost every case, be a bad practice as it would not prevent a row from being worked multiple times unless the bot checked the queue first, at which point you've negated the speed gains you were looking to achieve in cutting out the initial queue building in the first place. It would also be impossible to scale the process for multiple bots to work the Excel Worksheet data efficiently.

This is a common issue for RPA, especially if working with large excel files. As far as I know, there are no 100% solutions, but only methods reduce the symptoms. I have run into this problem several times and these are the ways I would try to handle them:
Disable or Errors only for stage logging.
Don`t log parameters on action stages (especially ones that work with the excel files)
Run Garbage collection process
See if it is possible to avoid reading excel files into BP collections and use OLEDB to query the file
See if it is possible to increase the Ram memory on the machines
If they’re using the 32-bit version of the app, then it doesn’t really matter how much memory you feed it, Blue Prism will cap out at 2 GB.

This is may be because of BP Server as the memory is shared between Processes and Work queue.Better option is to use two bots and multiple queues to avoid Memory Error.

If you're using Excel documents or CSV files, you can use the OLEDB object to connect and query against it as if it were a database. You can use the SQL syntax to limit the amount of rows that are returned at a time and paginate through them until you've reached the end of the document.

For starters, you are making incorrect use of the Work Queue in Blue Prism. The Work Queue should not be used to store this type and amount of data. (please read the BP documentation on Work Queues thoroughly).
Solving the issue at hand, being the misuse requires 2 changes:
Only store references in your Item Data which point to the Excel file containing the data.
If you're consulting this much data many times, perhaps convert the file into a CSV, write a VBO that queries the data directly in the CSV.
The first change is not just a recommendation, but as your project progresses and IT Architecture and InfoSec comes into play, it will be mandatory.
As for the CSV VBO, take a look at C#, it will make your life a lot easier than loading all this data into BP (time consuming, unreliable, ...).

Related

Solutions for Recording Google Protocol Buffer Messages

My project is using protobufs for our data types. We need to be able to record our data so it can be played back later. Our use case is to recreate the event or to reprocess the same data but with new algorithms and check for improvements.
As data is flowing through our system it is all protobufs. These are easily serilized to a byte array which could be recorded to files or maybe as blobs in a database. Playback would simply mean reading the byte array and converting back to a protobuf, then sending it off into our software again.
Are there any existing technologies used for recording protobufs?
Even though the initial use case is very simple, eventually the solution will get more complex. It will probably need to:
Farm out recording to multiple hosts to keep up with the input data rate
Allow querying to find out how much data exists during a specific time period
Play back only those data records where some field has a specific value
Save the data for long term storage, e.g. never delete a record but instead move it to a tape backup
I think the above is best accomplished using a database which stores some subset of meta data along with the protobuf byte array itself. Before I go reinventing the wheel, I would like opinions on anything that exists already that might do this job.

How can I limit memory usage when generating a CSV from a large resultset?

I have a web application in Spring that has a functional requirement for generating a CSV/Excel spreadsheet from a result set coming from a large Oracle database. The expected rows are in the 300,000 - 1,000,000 range. Time to process is not as large of an issue as keeping the application stable -- and right now, very large result sets cause it to run out of memory and crash.
In a normal situation like this, I would use pagination and have the UI display a limited number of results at a time. However, in this case I need to be able to produce the entire set in a single file, no matter how big it might be, for offline use.
I have isolated the issue to the ParameterizedRowMapper being used to convert the result set into objects, which is where I'm stuck.
What techniques might I be able to use to get this operation under control? Is pagination still an option?

A simple answer:
Use a JDBC recordset (or something similar, with an appropriate array/fetch size) and write the data back a LOB, either temporary or back into the database.
Another choice:
Use PL/SQL in the database to write a file using UTL_FILE for your recordset in CSV format. As the file will be on the database server, not on the client, Use UTL_SMTP or JavaMail using Java Stored Procedures to mail the file. After all, I'd be surprised if someone was going to watch the hourglass turn over repeatedly waiting for a 1 million row recordset to be generated.

Instead of loading an entire file in memory you can process each row individually and use output stream to send the output directly to the web browser. E.g. in servlets API, you can get the output stream from ServletResponse.getOutputStream() and then simply write result CSV lines to that stream.

I would push back on those requirements- they sound pretty artificial.
What happens if your application fails, or the power goes out before the user looks at that data?
From your comment above, sounds like you know the answer- you need filesystem or oracle access, in order to do your job.
You are being asked to generate some data- something that is not repeatable by sql?
If it were repeatable, you would just send pages of data back to the user at a time.
Since this report, I'm guessing, has something to do with the current state of your data, you need to store that result somewhere, if you can't stream it out to the user. I'd write a stored procedure in oracle- it's much faster not to send data back and forth across the network. If you have special tools or its just easier, sounds like there's nothing wrong with doing it on the java side instead.
Can you schedule this report to run once a week?

Have you considered the performance of an Excel spreadsheet with 1,000,000 rows?

Performance of bcp/BULK INSERT vs. Table-Valued Parameters

I'm about to have to rewrite some rather old code using SQL Server's BULK INSERT command because the schema has changed, and it occurred to me that maybe I should think about switching to a stored procedure with a TVP instead, but I'm wondering what effect it might have on performance.
Some background information that might help explain why I'm asking this question:
The data actually comes in via a web service. The web service writes a text file to a shared folder on the database server which in turn performs a BULK INSERT. This process was originally implemented on SQL Server 2000, and at the time there was really no alternative other than chucking a few hundred INSERT statements at the server, which actually was the original process and was a performance disaster.
The data is bulk inserted into a permanent staging table and then merged into a much larger table (after which it is deleted from the staging table).
The amount of data to insert is "large", but not "huge" - usually a few hundred rows, maybe 5-10k rows tops in rare instances. Therefore my gut feeling is that BULK INSERT being a non-logged operation won't make that big a difference (but of course I'm not sure, hence the question).
The insertion is actually part of a much larger pipelined batch process and needs to happen many times in succession; therefore performance is critical.
The reasons I would like to replace the BULK INSERT with a TVP are:
Writing the text file over NetBIOS is probably already costing some time, and it's pretty gruesome from an architectural perspective.
I believe that the staging table can (and should) be eliminated. The main reason it's there is that the inserted data needs to be used for a couple of other updates at the same time of insertion, and it's far costlier to attempt the update from the massive production table than it is to use an almost-empty staging table. With a TVP, the parameter basically is the staging table, I can do anything I want with it before/after the main insert.
I could pretty much do away with dupe-checking, cleanup code, and all of the overhead associated with bulk inserts.
No need to worry about lock contention on the staging table or tempdb if the server gets a few of these transactions at once (we try to avoid it, but it happens).
I'm obviously going to profile this before putting anything into production, but I thought it might be a good idea to ask around first before I spend all that time, see if anybody has any stern warnings to issue about using TVPs for this purpose.
So - for anyone who's cozy enough with SQL Server 2008 to have tried or at least investigated this, what's the verdict? For inserts of, let's say, a few hundred to a few thousand rows, happening on a fairly frequent basis, do TVPs cut the mustard? Is there a significant difference in performance compared to bulk inserts?
Update: Now with 92% fewer question marks!
(AKA: Test Results)
The end result is now in production after what feels like a 36-stage deployment process. Both solutions were extensively tested:
Ripping out the shared-folder code and using the SqlBulkCopy class directly;
Switching to a Stored Procedure with TVPs.
Just so readers can get an idea of what exactly was tested, to allay any doubts as to the reliability of this data, here is a more detailed explanation of what this import process actually does:
Start with a temporal data sequence that is ordinarily about 20-50 data points (although it can sometimes be up a few hundred);
Do a whole bunch of crazy processing on it that's mostly independent of the database. This process is parallelized, so about 8-10 of the sequences in (1) are being processed at the same time. Each parallel process generates 3 additional sequences.
Take all 3 sequences and the original sequence and combine them into a batch.
Combine the batches from all 8-10 now-finished processing tasks into one big super-batch.
Import it using either the BULK INSERT strategy (see next step), or TVP strategy (skip to step 8).
Use the SqlBulkCopy class to dump the entire super-batch into 4 permanent staging tables.
Run a Stored Procedure that (a) performs a bunch of aggregation steps on 2 of the tables, including several JOIN conditions, and then (b) performs a MERGE on 6 production tables using both the aggregated and non-aggregated data. (Finished)
OR
Generate 4 DataTable objects containing the data to be merged; 3 of them contain CLR types which unfortunately aren't properly supported by ADO.NET TVPs, so they have to be shoved in as string representations, which hurts performance a bit.
Feed the TVPs to a Stored Procedure, which does essentially the same processing as (7), but directly with the received tables. (Finished)
The results were reasonably close, but the TVP approach ultimately performed better on average, even when the data exceeded 1000 rows by a small amount.
Note that this import process is run many thousands of times in succession, so it was very easy to get an average time simply by counting how many hours (yes, hours) it took to finish all of the merges.
Originally, an average merge took almost exactly 8 seconds to complete (under normal load). Removing the NetBIOS kludge and switching to SqlBulkCopy reduced the time to almost exactly 7 seconds. Switching to TVPs further reduced the time to 5.2 seconds per batch. That's a 35% improvement in throughput for a process whose running time is measured in hours - so not bad at all. It's also a ~25% improvement over SqlBulkCopy.
I am actually fairly confident that the true improvement was significantly more than this. During testing it became apparent that the final merge was no longer the critical path; instead, the Web Service that was doing all of the data processing was starting to buckle under the number of requests coming in. Neither the CPU nor the database I/O were really maxed out, and there was no significant locking activity. In some cases we were seeing a gap of a few idle seconds between successive merges. There was a slight gap, but much smaller (half a second or so) when using SqlBulkCopy. But I suppose that will become a tale for another day.
Conclusion: Table-Valued Parameters really do perform better than BULK INSERT operations for complex import+transform processes operating on mid-sized data sets.
I'd like to add one other point, just to assuage any apprehension on part of the folks who are pro-staging-tables. In a way, this entire service is one giant staging process. Every step of the process is heavily audited, so we don't need a staging table to determine why some particular merge failed (although in practice it almost never happens). All we have to do is set a debug flag in the service and it will break to the debugger or dump its data to a file instead of the database.
In other words, we already have more than enough insight into the process and don't need the safety of a staging table; the only reason we had the staging table in the first place was to avoid thrashing on all of the INSERT and UPDATE statements that we would have had to use otherwise. In the original process, the staging data only lived in the staging table for fractions of a second anyway, so it added no value in maintenance/maintainability terms.
Also note that we have not replaced every single BULK INSERT operation with TVPs. Several operations that deal with larger amounts of data and/or don't need to do anything special with the data other than throw it at the DB still use SqlBulkCopy. I am not suggesting that TVPs are a performance panacea, only that they succeeded over SqlBulkCopy in this specific instance involving several transforms between the initial staging and the final merge.
So there you have it. Point goes to TToni for finding the most relevant link, but I appreciate the other responses as well. Thanks again!

I don't really have experience with TVP yet, however there is an nice performance comparison chart vs. BULK INSERT in MSDN here.
They say that BULK INSERT has higher startup cost, but is faster thereafter. In a remote client scenario they draw the line at around 1000 rows (for "simple" server logic). Judging from their description I would say you should be fine with using TVP's. The performance hit - if any - is probably negligible and the architectural benefits seem very good.
Edit: On a side note you can avoid the server-local file and still use bulk copy by using the SqlBulkCopy object. Just populate a DataTable, and feed it into the "WriteToServer"-Method of an SqlBulkCopy instance. Easy to use, and very fast.

The chart mentioned with regards to the link provided in #TToni's answer needs to be taken in context. I am not sure how much actual research went into those recommendations (also note that the chart seems to only be available in the 2008 and 2008 R2 versions of that documentation).
On the other hand there is this whitepaper from the SQL Server Customer Advisory Team: Maximizing Throughput with TVP
I have been using TVPs since 2009 and have found, at least in my experience, that for anything other than simple insert into a destination table with no additional logic needs (which is rarely ever the case), then TVPs are typically the better option.
I tend to avoid staging tables as data validation should be done at the app layer. By using TVPs, that is easily accommodated and the TVP Table Variable in the stored procedure is, by its very nature, a localized staging table (hence no conflict with other processes running at the same time like you get when using a real table for staging).
Regarding the testing done in the Question, I think it could be shown to be even faster than what was originally found:
You should not be using a DataTable, unless your application has use for it outside of sending the values to the TVP. Using the IEnumerable<SqlDataRecord> interface is faster and uses less memory as you are not duplicating the collection in memory only to send it to the DB. I have this documented in the following places:
How can I insert 10 million records in the shortest time possible? (lots of extra info and links here as well)
Pass Dictionary<string,int> to Stored Procedure T-SQL
Streaming Data Into SQL Server 2008 From an Application (on SQLServerCentral.com ; free registration required)
TVPs are Table Variables and as such do not maintain statistics. Meaning, they report only having 1 row to the Query Optimizer. So, in your proc, either:
Use statement-level recompile on any queries using the TVP for anything other than a simple SELECT: OPTION (RECOMPILE)
Create a local temporary table (i.e. single #) and copy the contents of the TVP into the temp table

I think I'd still stick with a bulk insert approach. You may find that tempdb still gets hit using a TVP with a reasonable number of rows. This is my gut feeling, I can't say I've tested the performance of using TVP (I am interested in hearing others input too though)
You don't mention if you use .NET, but the approach that I've taken to optimise previous solutions was to do a bulk load of data using the SqlBulkCopy class - you don't need to write the data to a file first before loading, just give the SqlBulkCopy class (e.g.) a DataTable - that's the fastest way to insert data into the DB. 5-10K rows isn't much, I've used this for up to 750K rows. I suspect that in general, with a few hundred rows it wouldn't make a vast difference using a TVP. But scaling up would be limited IMHO.
Perhaps the new MERGE functionality in SQL 2008 would benefit you?
Also, if your existing staging table is a single table that is used for each instance of this process and you're worried about contention etc, have you considered creating a new "temporary" but physical staging table each time, then dropping it when it's finished with?
Note you can optimize the loading into this staging table, by populating it without any indexes. Then once populated, add any required indexes on at that point (FILLFACTOR=100 for optimal read performance, as at this point it will not be updated).

Staging tables are good! Really I wouldn't want to do it any other way. Why? Because data imports can change unexpectedly (And often in ways you can't foresee, like the time the columns were still called first name and last name but had the first name data in the last name column, for instance, to pick an example not at random.) Easy to research the problem with a staging table so you can see exactly what data was in the columns the import handled. Harder to find I think when you use an in memory table. I know a lot of people who do imports for a living as I do and all of them recommend using staging tables. I suspect there is a reason for this.
Further fixing a small schema change to a working process is easier and less time consuming than redesigning the process. If it is working and no one is willing to pay for hours to change it, then only fix what needs to be fixed due to the schema change. By changing the whole process, you introduce far more potential new bugs than by making a small change to an existing, tested working process.
And just how are you going to do away with all the data cleanup tasks? You may be doing them differently, but they still need to be done. Again, changing the process the way you describe is very risky.
Personally it sounds to me like you are just offended by using older techniques rather than getting the chance to play with new toys. You seem to have no real basis for wanting to change other than bulk insert is so 2000.

Client-server synchronization pattern / algorithm?

I have a feeling that there must be client-server synchronization patterns out there. But i totally failed to google up one.
Situation is quite simple - server is the central node, that multiple clients connect to and manipulate same data. Data can be split in atoms, in case of conflict, whatever is on server, has priority (to avoid getting user into conflict solving). Partial synchronization is preferred due to potentially large amounts of data.
Are there any patterns / good practices for such situation, or if you don't know of any - what would be your approach?
Below is how i now think to solve it:
Parallel to data, a modification journal will be held, having all transactions timestamped.
When client connects, it receives all changes since last check, in consolidated form (server goes through lists and removes additions that are followed by deletions, merges updates for each atom, etc.).
Et voila, we are up to date.
Alternative would be keeping modification date for each record, and instead of performing data deletes, just mark them as deleted.
Any thoughts?

You should look at how distributed change management works. Look at SVN, CVS and other repositories that manage deltas work.
You have several use cases.
Synchronize changes. Your change-log (or delta history) approach looks good for this. Clients send their deltas to the server; server consolidates and distributes the deltas to the clients. This is the typical case. Databases call this "transaction replication".
Client has lost synchronization. Either through a backup/restore or because of a bug. In this case, the client needs to get the current state from the server without going through the deltas. This is a copy from master to detail, deltas and performance be damned. It's a one-time thing; the client is broken; don't try to optimize this, just implement a reliable copy.
Client is suspicious. In this case, you need to compare client against server to determine if the client is up-to-date and needs any deltas.
You should follow the database (and SVN) design pattern of sequentially numbering every change. That way a client can make a trivial request ("What revision should I have?") before attempting to synchronize. And even then, the query ("All deltas since 2149") is delightfully simple for the client and server to process.

As part of the team, I did quite a lot of projects which involved data syncing, so I should be competent to answer this question.
Data syncing is quite a broad concept and there are way too much to discuss. It covers a range of different approaches with their upsides and downsides. Here is one of the possible classifications based on two perspectives: Synchronous / Asynchronous, Client/Server / Peer-to-Peer. Syncing implementation is severely dependent on these factors, data model complexity, amount of data transferred and stored, and other requirements. So in each particular case the choice should be in favor of the simplest implementation meeting the app requirements.
Based on a review of existing off-the-shelf solutions, we can delineate several major classes of syncing, different in granularity of objects subject to synchronization:
Syncing of a whole document or database is used in cloud-based applications, such as Dropbox, Google Drive or Yandex.Disk. When the user edits and saves a file, the new file version is uploaded to the cloud completely, overwriting the earlier copy. In case of a conflict, both file versions are saved so that the user can choose which version is more relevant.
Syncing of key-value pairs can be used in apps with a simple data structure, where the variables are considered to be atomic, i.e. not divided into logical components. This option is similar to syncing of whole documents, as both the value and the document can be overwritten completely. However, from a user perspective a document is a complex object composed of many parts, but a key-value pair is but a short string or a number. Therefore, in this case we can use a more simple strategy of conflict resolution, considering the value more relevant, if it has been the last to change.
Syncing of data structured as a tree or a graph is used in more sophisticated applications where the amount of data is large enough to send the database in its entirety at every update. In this case, conflicts have to be resolved at the level of individual objects, fields or relationships. We are primarily focused on this option.
So, we grabbed our knowledge into this article which I think might be very useful to everyone interested in the topic => Data Syncing in Core Data Based iOS apps (http://blog.denivip.ru/index.php/2014/04/data-syncing-in-core-data-based-ios-apps/?lang=en)

What you really need is Operational Transform (OT). This can even cater for the conflicts in many cases.
This is still an active area of research, but there are implementations of various OT algorithms around. I've been involved in such research for a number of years now, so let me know if this route interests you and I'll be happy to put you on to relevant resources.

The question is not crystal clear, but I'd look into optimistic locking if I were you.
It can be implemented with a sequence number that the server returns for each record. When a client tries to save the record back, it will include the sequence number it received from the server. If the sequence number matches what's in the database at the time when the update is received, the update is allowed and the sequence number is incremented. If the sequence numbers don't match, the update is disallowed.

I built a system like this for an app about 8 years ago, and I can share a couple ways it has evolved as the app usage has grown.
I started by logging every change (insert, update or delete) from any device into a "history" table. So if, for example, someone changes their phone number in the "contact" table, the system will edit the contact.phone field, and also add a history record with action=update, table=contact, field=phone, record=[contact ID], value=[new phone number]. Then whenever a device syncs, it downloads the history items since the last sync and applies them to its local database. This sounds like the "transaction replication" pattern described above.
One issue is keeping IDs unique when items could be created on different devices. I didn't know about UUIDs when I started this, so I used auto-incrementing IDs and wrote some convoluted code that runs on the central server to check new IDs uploaded from devices, change them to a unique ID if there's a conflict, and tell the source device to change the ID in its local database. Just changing the IDs of new records wasn't that bad, but if I create, for example, a new item in the contact table, then create a new related item in the event table, now I have foreign keys that I also need to check and update.
Eventually I learned that UUIDs could avoid this, but by then my database was getting pretty large and I was afraid a full UUID implementation would create a performance issue. So instead of using full UUIDs, I started using randomly generated, 8 character alphanumeric keys as IDs, and I left my existing code in place to handle conflicts. Somewhere between my current 8-character keys and the 36 characters of a UUID there must be a sweet spot that would eliminate conflicts without unnecessary bloat, but since I already have the conflict resolution code, it hasn't been a priority to experiment with that.
The next problem was that the history table was about 10 times larger than the entire rest of the database. This makes storage expensive, and any maintenance on the history table can be painful. Keeping that entire table allows users to roll back any previous change, but that started to feel like overkill. So I added a routine to the sync process where if the history item that a device last downloaded no longer exists in the history table, the server doesn't give it the recent history items, but instead gives it a file containing all the data for that account. Then I added a cronjob to delete history items older than 90 days. This means users can still roll back changes less than 90 days old, and if they sync at least once every 90 days, the updates will be incremental as before. But if they wait longer than 90 days, the app will replace the entire database.
That change reduced the size of the history table by almost 90%, so now maintaining the history table only makes the database twice as large instead of ten times as large. Another benefit of this system is that syncing could still work without the history table if needed -- like if I needed to do some maintenance that took it offline temporarily. Or I could offer different rollback time periods for accounts at different price points. And if there are more than 90 days of changes to download, the complete file is usually more efficient than the incremental format.
If I were starting over today, I'd skip the ID conflict checking and just aim for a key length that's sufficient to eliminate conflicts, with some kind of error checking just in case. (It looks like YouTube uses 11-character random IDs.) The history table and the combination of incremental downloads for recent updates or a full download when needed has been working well.

For delta (change) sync, you can use pubsub pattern to publish changes back to all subscribed clients, services like pusher can do this.
For database mirror, some web frameworks use a local mini database to sync server side database to local in browser database, partial synchronization is supported. Check meteror.

This page clearly describes mosts scenarios of data synchronization with patterns and example code: Data Synchronization: Patterns, Tools, & Techniques
It is the most comprehensive source I found, considering whole of delta syncs, strategies on how to handle deletions and server-to-client and client-to-server sync. It is a very good starting point, worth a look.

Is there a DBGrid component that can handle large datasets fast?

Large datasets, millions of records, need special programming to maintain speed in DBGrids.
I want to know if there are any ready-made components for Delphi (DBGrids) that do this automatically?
EDIT For Example: Some databases have features such as fetch 1st X records (eg 100 records). When I reach the bottom with scrolling, I want to auto fetch the next 100. Conversely when I reach the beginning, I want to fetch the previous 100. I know I can program this, but it sure is possible to propagate that feature to a DBGrid control where the DBGrid does the buffering. It will save quite a bit of programming - you simply have to set the "buffer size" so to speak.

You might want to take a look at the wonderful (free, open source, dual licensed as MPL 1.1 and GPL thus usable in closed source apps) Virtual TreeView and its user-supplied descendants (scroll down the page to find those.)
Edit to reflect the question's edit: Virtual TreeView not only allows you to handle millions of nodes without keeping them in memory, but that is in fact the preferred way of using it. You supply the data through event callbacks when it's needed, and you can tell the tree to cache that data (or not.)
Oh, and of course it also has a grid / report mode where it can function as a table (just set the GridExtensions property to True.)

I would have a look at Developer Express QuantumGrid Suite. (#birger: you just were a tick faster ;-) ) So I'm not just duplicating the answer, some elaboration:
The DevExpress Grid uses a data controller that has several modes to controll the data bound to the grid. One of these is exactly what you are looking for:
Grid Mode
When using Grid Mode, only a fixed
number of dataset records is loaded
into memory. Because only a limited
set of records are retrieved from the
dataset, automatic sorting, filtering
and summary calculations are disabled
in Grid Mode (must be controlled
manually instead). By default, this
mode is disabled and the
ExpressDataController loads all
records in a dataset.
It does have some drawbacks, which seem pretty obvious: you cannot make a summary, sort, or filter if you do not have all records at hand.

NextGrid is light, fast and nice looking grid for Delphi
http://www.bergsoft.net/component/next-grid/features.htm
HANDLING LARGE AMOUNT OF CELLS WITHOUT LOOSING SPEED
NextGrid can handle very large amount
of cells without losing speed. Speed
of adding, modifying and deleting data
doesn't depend of the amount of cells.
In NextGrid demo you can see how fast
NextGrid work with 100,000 rows and 10
columns = 1,000,000 cells

I think the DevExpress Quantumgrid supports this very good.

sorry, I just saw your comment to Neftalí
if you would to bring 100 record per time, and then fetch the next 100, this work related to database access components, look at devart components, they are offer direct access components to most used database, and they have the feature you are asking about and more:
http://www.devart.com/products-vcl.html

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio