Cassandra: one big column vs. multiple small columns for read performance?

I'm getting around 1000 distinct events per second on a 4-node cluster. After each event I need to increment some counters. Is it better to have a normal column family with only one column, in which all the counters are stored as a comma-separated string (example: "1,3,5,6,0,2"), or to create a counter column family with multiple columns? I read in some documentation that a counter column family can read and write at consistency level ONE, which is fast for reading. I don't care much about write performance.

I think this depends on how you are receiving events and latency requirements.
If you are receiving them from multiple sources concurrently and need to write data as soon as possible, counters seem to be the better approach. With one big column, you would need to serialize all writes to any counter, and you would have to read the current value before every write. This could also unnecessarily complicate your application code. If performance is a problem, you could try enabling the row cache for your counter column family. I have never tried to cache a counter column family, but I don't see any docs saying it is not supported. You can try it and check the JMX stats to see if it's working.
If you are receiving events single threaded and can do something like read data for 1000 events and then write once to cassandra while keeping the current counter values in memory, then a single column might be fine. But you need to realize that if you happen to just need to read a few counter values at a time, you'll be fetching a lot of unnecessary data on every read. Unless you do some tests that show that one column performs significantly better I would favor counters.
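To make the trade-off concrete, here is a minimal sketch in plain Python. It does not talk to Cassandra; the dict and the function names `increment_packed` and `increment_columns` are hypothetical stand-ins. It only illustrates that the packed string forces a full read, parse, and rewrite for every increment, while separate counters can be touched one at a time.

```python
def increment_packed(packed: str, index: int, delta: int = 1) -> str:
    """Increment one counter inside a comma-separated string such as "1,3,5,6,0,2"."""
    values = [int(v) for v in packed.split(",")]  # the whole value is read and parsed
    values[index] += delta
    return ",".join(str(v) for v in values)       # the whole value is written back


def increment_columns(row: dict, name: str, delta: int = 1) -> None:
    """Increment a single named counter; the other counters are not touched."""
    row[name] = row.get(name, 0) + delta


# Single-column approach: every update is a read-modify-write of all counters.
packed = increment_packed("1,3,5,6,0,2", 2)

# Multi-column approach: the dict stands in for one row of a counter column family.
row = {"clicks": 5, "views": 40}
increment_columns(row, "clicks")
```

With concurrent writers, two read-modify-write cycles on the packed string can overwrite each other unless the application serializes them, which is the complication mentioned above.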


extremely high SSD write rate with multiple concurrent writers

I'm using QuestDB as a backend for storing collected data, using the same script for different data sources.
My problem is the extremely high disk (SSD) usage. Over 4 days it has written 335 MB per second.
What am I doing wrong?
I am inserting data using the ILP interface.
I don't know how much data you are ingesting, so I'm not sure whether 335 MB per second is a lot or not. But since you are surprised by it, I am going to assume your throughput is lower than that. It might be that your data is arriving out of order, especially if you are ingesting from multiple data sources.
QuestDB keeps the data per table always in incremental order by designated timestamp. If data arrives out of order, the whole partition needs to be rewritten. This might lead to write amplification where you see your data is being rewritten very often.
Until literally a few days ago, to fine tune this you would need to change the default config, but since version 6.6.1, this is dynamically adjusted.
You may want to try version 6.6.1. Alternatively, if data from different sources is arriving out of order (relative to each other), you might want to create separate tables for the different sources, so that data is always in order within each table.
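The effect of splitting by source can be illustrated with a small sketch. This is not QuestDB code; the arrival list and the helper `count_out_of_order` are made up for illustration. Each source sends rows in timestamp order, but the interleaved stream is out of order when everything lands in one table.

```python
from collections import defaultdict


def count_out_of_order(timestamps):
    """Count rows whose timestamp is older than the newest one already seen."""
    newest = None
    late = 0
    for ts in timestamps:
        if newest is not None and ts < newest:
            late += 1
        else:
            newest = ts
    return late


# (source, timestamp) pairs in arrival order; each source is in order on its own.
arrivals = [("a", 10), ("b", 7), ("a", 11), ("b", 8), ("c", 5), ("a", 12), ("c", 6)]

# One shared table: rows from slower sources arrive behind the newest timestamp.
single_table = count_out_of_order(ts for _, ts in arrivals)

# One table per source: every table only ever sees increasing timestamps.
per_source = defaultdict(list)
for source, ts in arrivals:
    per_source[source].append(ts)
separate_tables = sum(count_out_of_order(rows) for rows in per_source.values())
```

In this made-up stream, four of the seven rows are late in the shared table and none are late once each source has its own table.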
I have been experimenting a lot and it seems that you're absolutely right. I was ingesting 14 different clients into a single table. After splitting this into 14 tables, one for each client, the problem disappeared.
Another advantage is that I need one symbol fewer, as I no longer have to distinguish the rows.
By the way - thank you and your team for this marvellous tool you gave us! It makes my work so much easier!!

VSAM Search VS COBOL search/loop

I have a file that could contain about 3 million records. Certain records of this file will need to be updated multiple times throughout the run of the program. If I need to pull specific records from this file, which of the following is more efficient:
Indexed VSAM search
Indexed flat file with a COBOL search all
Buffering all of the data into working storage and writing a loop to handle the search
Obviously, if you can buffer all of the data into memory (and if the host system can support a working set of pages big enough for all of it to actually remain in RAM without paging), then this would probably be the fastest possible approach.
But be very careful to consider "hidden disk I/O" caused by the virtual-memory paging subsystem! If the requested "in-memory" data is, in fact, not in memory, a page fault will occur and your process will stop in its tracks until the page has been retrieved. (And if "page stealing" occurs, well, you're in trouble.) Your "in-memory" strategy just turned into a possibly very inefficient(!) disk-based one. If keys are distributed randomly, then your process has a gigantic working set that it is accessing randomly. Unless all of that memory is actually in memory, and will stay there, you're in trouble.
If you are making updates to a large file, consider sorting the updates-delta file before processing it, so that all occurrences of the same key will be adjacent. You can now write your COBOL program to take advantage of this (and, of course, to abend if an out-of-sequence record is ever detected!). If the key in "this" record is identical to the key of the "previous" one, then you do not need to re-read the record. (And, you do not actually need to write the old record, until the key does change.) As the indexed-file access method is presented with the succession of keys, each key is likely to be "close to" the one previously-requested, such that some of the necessary index-tree pages will already be in-memory. Obviously, you will need to benchmark this, but the amount of time spent sorting the file can be far less than the amount of time spent in index-lookups. (Which actually can be considerable.)
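The sorted-delta idea can be sketched as follows. This is Python rather than COBOL, a dict stands in for the indexed file, and the function name `apply_sorted_updates` is hypothetical. The point is only that each distinct key is read once and written once, and that an out-of-sequence key stops the run.

```python
from itertools import groupby
from operator import itemgetter


def apply_sorted_updates(store, sorted_updates):
    """Apply (key, delta) updates that are already sorted by key.

    `store` is a dict standing in for the indexed file.
    Returns the number of reads and writes performed.
    """
    reads = writes = 0
    previous_key = None
    for key, group in groupby(sorted_updates, key=itemgetter(0)):
        if previous_key is not None and key < previous_key:
            # The equivalent of an abend on an out-of-sequence record.
            raise ValueError(f"out-of-sequence key: {key!r} after {previous_key!r}")
        record = store[key]          # one read per distinct key
        reads += 1
        for _, delta in group:       # all updates for this key are adjacent
            record += delta
        store[key] = record          # one write, only when the key changes
        writes += 1
        previous_key = key
    return reads, writes


store = {"A": 0, "B": 10, "C": 100}
updates = [("B", 1), ("A", 5), ("B", 2), ("A", 1), ("B", 3)]
reads, writes = apply_sorted_updates(store, sorted(updates, key=itemgetter(0)))
```

Here five updates cost two reads and two writes instead of five of each; whether the sort pays for itself on real data still needs the benchmark mentioned above.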
Mike's answer raises the important issue of "hidden I/O" (how much it matters depends on the machine, the configuration, and the amount of data).
If you are likely to update many records, the option Mike suggests is the most useful one.
If you are likely to update only a few records (I'd guess below 2%), another approach can be considerably faster (needs a benchmark!):
read every key via indexed VSAM search
store the changed record in memory (a big OCCURS table) without an actual REWRITE; if you only change some values and the record is quite big, store only the values that can change plus the key
before doing a VSAM search, look in your OCCURS table to see whether you have already read the key; if so, take the values from there, otherwise read the record
at program end, go through your OCCURS table and REWRITE all records (if you have the complete record a REWRITE is enough; otherwise you need a READ first to get the complete record)
Performance is often: "know your data and possible program flow, then try the best 2-3 approaches, benchmark and decide".
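The steps above amount to a write-back cache. A minimal sketch, again in Python with a dict standing in for the VSAM file; the class name `WriteBackCache` and its counters are hypothetical and exist only to make the saved I/O visible:

```python
class WriteBackCache:
    """Keep touched records in memory and write them back once at the end."""

    def __init__(self, store):
        self.store = store    # dict standing in for the indexed file
        self.cache = {}       # plays the role of the OCCURS table
        self.reads = 0
        self.rewrites = 0

    def get(self, key):
        """Return the cached record, reading from the store only on first use."""
        if key not in self.cache:
            self.cache[key] = dict(self.store[key])  # copy, so the store is untouched
            self.reads += 1
        return self.cache[key]

    def flush(self):
        """At program end: rewrite every cached record exactly once."""
        for key, record in self.cache.items():
            self.store[key] = record
            self.rewrites += 1
        self.cache.clear()


store = {1: {"qty": 0}, 2: {"qty": 5}, 3: {"qty": 9}}
cache = WriteBackCache(store)
for key in [1, 2, 1, 1]:
    cache.get(key)["qty"] += 1   # every record fetched here is also changed
cache.flush()
```

Four updates to two distinct keys cost two reads and two rewrites. A real program would also have to bound the table size, which this sketch ignores.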

can you have a large number of sharded counters i.e. large number of entity kinds?

My question is about sharded counters and whether you can have too many. Note that the example below is made up.
Say you want to keep a hit count of different pages on your site. To prevent datastore contention, you decide to shard the hit counter for each page. Now the number of pages grows, hence the number of sharded counters grows.
Assuming you are following the typical sharded examples, each sharded counter has its own kind, allowing a query to be built that retrieves all entities belonging to a kind, i.e. all entities belonging to that particular sharded counter.
My questions are:
Will a large number of counters (not shards per counter) affect performance, as there will be so many entity kinds?
Is this the best practise? I mean it looks ugly in the datastore viewer when you have loads of entity kinds as each kind is a sharded counter for a page on your site.
If the above is not good, what would be a better solution?
If you followed what you call the "typical shard counter" examples, you can see that there's only one counter type, but you can create different string keys to count different things.
That way you have only one ShardCounter type in your db, but many-many instances with different string keys.
We have a system similar to what you've described. Using only one type of counter, we count more than a hundred event types, summing up to around a million hits a day. So it's safe to assume that it's pretty scalable ;)
EDIT added counter code examples from Google's documentation:
In the last example you will find a counter that has a SHARD_KEY_TEMPLATE variable at the top of the code. This last example allows having different counters with the same shard class.
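The idea of one shard kind with many string keys can be sketched without the datastore. In the sketch below a plain dict stands in for all entities of a single kind; the template string, `NUM_SHARDS`, and the function names are assumptions for illustration and are not copied from Google's code.

```python
import random

SHARD_KEY_TEMPLATE = "shard-{}-{:d}"  # assumed format: counter name plus shard index
NUM_SHARDS = 20                       # hypothetical shard count per counter

shards = {}  # in-memory stand-in for every entity of the one shard kind


def increment(name):
    """Add one hit to a randomly chosen shard of the named counter."""
    index = random.randint(0, NUM_SHARDS - 1)
    key = SHARD_KEY_TEMPLATE.format(name, index)
    shards[key] = shards.get(key, 0) + 1


def get_count(name):
    """Sum all shards that belong to the named counter."""
    return sum(
        shards.get(SHARD_KEY_TEMPLATE.format(name, i), 0)
        for i in range(NUM_SHARDS)
    )


for _ in range(100):
    increment("/home")
for _ in range(7):
    increment("/about")
```

Adding a new page only adds new keys of the same kind, so the number of kinds stays at one no matter how many counters exist.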

Parse, replacing large (several thousands) number of records

I've got a class in parse with 1-4k records per user. This needs to be replaced from time to time (actually these are records representing multiple timetables).
The problem I'm facing is that deleting and inserting these records takes a ton of requests. Is there maybe a method to delete and insert a bunch of records that counts as one request? Maybe it's possible from Cloud Code?
I tried compacting all this data into one record, but then I hit the size limit for records (128 KB). Using any sub-format (like a db or file inside a record) would be really tedious, because the app is targeting nearly all platforms supported by Parse.
For clarification, the problem isn't the limit on saveAll/destroyAll. My problem is facing the req/s limit (or rather, as docs state req/min).
Also, I just checked that requests from Cloud Code also seem to count towards that limit.
Well, another possible solution would be to redesign my datasets and use Array columns or something similar, but I'd rather avoid that if possible.
I think you could try Parse.Object.saveAll which batch processes the save() function.
I would use a saveAll/DestroyAll (or DeleteAll?) and anything -All that parse provides in its SDK.
You'd still reach a 1000 objects limit, but to counter that you can loop using the .skip property of a request.
Set a limit of 1000 and skip of 0, do the query, then increase the skip value by the previous limit, and so on. And you'd have 2 or 3 requests of a size of 1000 each time. You stop the loop when your results count is smaller than your limit. If it's not, then you query again and set the skip to the limit x loopcount.
Now you say you're facing size issues, maybe you can reduce that query limit to, say, 400, and your loop would just run for longer until your number of results is smaller than your limit (and then you can stop querying/limiting/skipping/looping or anything in -ing).
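The limit/skip loop described above can be sketched like this. It is plain Python, not the Parse SDK: `query_page` is a hypothetical callable that returns at most `limit` records starting at offset `skip`, and `fake_query` is a stand-in used only to exercise the loop.

```python
def fetch_all(query_page, limit=1000):
    """Page through results with limit/skip until a short page is returned."""
    results = []
    skip = 0
    while True:
        page = query_page(limit=limit, skip=skip)
        results.extend(page)
        if len(page) < limit:   # fewer results than the limit: nothing left
            break
        skip += limit           # next page starts where this one ended
    return results


# Stand-in data source with 2500 records, recording each skip value requested.
records = list(range(2500))
calls = []


def fake_query(limit, skip):
    calls.append(skip)
    return records[skip:skip + limit]


fetched = fetch_all(fake_query)
```

With 2500 records and a limit of 1000 the loop issues three queries (skip 0, 1000, 2000). Lowering the limit to 400, as suggested above, only changes how many iterations are needed.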
Okay, so this isn't an answer to my question, but it's a solution to my problem, so I'm posting it.
My problem was storing and then replacing a large amount of small records which add up to significant size (up to 500KB JSON [~1.5MB XML] in my current plans).
So I've chosen a middle path - I implemented sort of vertical partitions.
What I have is a master User record which holds an array of pointers to another class (called Entries). Entries have only 2 fields: the ID of the school record, and data, which is of type Array.
I decided to split "partitions" every 1000 records, which is about 60-70 KB per record, but by my calculations it may go up to about 100 KB.
I also made the field names in the JSON 1 letter long, because every letter across 1000 records is about 1 or 2 KB, depending on encoding.
Actually that approach made PHP code like twice as fast and there is a lot less usage on network and remote database (1000 times less inserts/destroys basically).
So, that is my solution, if anybody has any other ideas, please post it as answer here, cause probably I'm not the only one with such problem and that certainly isn't the only solution.
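The partitioning described above can be sketched in a few lines. This is an illustration in Python, not the PHP code from the post; the one-letter field names `d` and `s` and the sample entries are made up.

```python
import json


def partition(entries, size=1000):
    """Split a list of small records into chunks of at most `size` items."""
    return [entries[i:i + size] for i in range(0, len(entries), size)]


# 3500 small timetable-like records with one-letter field names.
entries = [{"d": day, "s": "math"} for day in range(3500)]
parts = partition(entries)

# Each chunk would be stored as the Array field of one Entries record,
# so 3500 inserts become 4.
sizes = [len(p) for p in parts]
largest_bytes = max(len(json.dumps(p, separators=(",", ":"))) for p in parts)
```

Checking `largest_bytes` against the 128 KB record limit before saving is a cheap way to confirm that the chosen chunk size still fits.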

Cassandra lookup query is quite slow after deleting large bundle of data

Currently, I have a Cassandra column family with a large number of rows, say more than 100,000. Now I'd like to remove all data in this column family, and a problem has come up:
After all the data is removed, I execute a lookup query on this column family, and Cassandra takes tens of seconds to return an empty result. The time cost increases linearly with the size of the original data.
It is caused by the tombstone feature while deleting data from the cassandra database. The lookup speed won't recover to normal until the next GC is fired. See Cassandra Distributed Deletes.
Because such query operations are frequently used in my system, I cannot bear the huge latency up to a few seconds.
Would you please give me a solution to this problem?
This sounds like a very bad way to use a database: populate it, empty it, repeat. One way you can solve your problem is by using a different CF name each time: when you empty the data and start repopulating it, create a new column family, use that, and just drop the other column family. However, this is hacky.
I'd suggest using compaction (gets rid of all the tombstones it can detect) to solve your problem, it is CPU intensive but it's better than waiting for tens of seconds for queries to respond. You can make the task less intensive on your machine by providing the specific ks & cf you want to compact:
./nodetool compact <ks_name> <cf_name>
Ritchard's point is a good one, gc_grace_seconds is set to 10 days by default so you will probably have to tweak this to allow for compaction to get rid of tombstones.
If your column family is frequently modified (read, then update, then read the update again...), you should use the leveled compaction strategy.
To have deleted columns removed more quickly, lower the gc_grace_seconds property of your column family.
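As a rough worked example of why gc_grace_seconds matters: a tombstone only becomes eligible for removal by compaction once it is older than that setting. The sketch below is simplified arithmetic in Python, not Cassandra code; the function name `purgeable` is hypothetical, and real compaction has further conditions that are ignored here.

```python
GC_GRACE_SECONDS_DEFAULT = 10 * 24 * 60 * 60  # 10 days, the default mentioned above


def purgeable(deleted_at, now, gc_grace_seconds=GC_GRACE_SECONDS_DEFAULT):
    """Return True if a tombstone written at `deleted_at` is old enough to be dropped."""
    return now - deleted_at > gc_grace_seconds


now = 1_000_000
deleted_at = now - 3600  # the data was deleted one hour ago

with_default = purgeable(deleted_at, now)                        # still inside 10 days
with_ten_minutes = purgeable(deleted_at, now, gc_grace_seconds=600)
```

With the default, a compaction run one hour after the delete keeps the tombstones, so queries still have to scan past them. With a 10-minute setting the same compaction could drop them, at the cost of a shorter window in which a node that was down can still learn about the delete.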
