Is there a unique queue in Ruby? - ruby

A unique queue would only allow the queueing of a value once. Subsequent queuing would not do anything.
If it is not clear what the uniqueness criterion is, a key could be supplied to define it.
Is there such a data structure in Ruby?
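For illustration only, here is a minimal sketch of what such a structure could look like, built from an Array and a Set. The class name UniqueQueue and its methods are hypothetical, not an existing Ruby class. It assumes one reading of the question: a value (or its key) is accepted once and never again, even after it has been dequeued. The optional block plays the role of the key mentioned above.

```ruby
require "set"

# Hypothetical FIFO queue that silently ignores values whose key was queued before.
class UniqueQueue
  # An optional block extracts the uniqueness key; by default the value is its own key.
  def initialize(&key_fn)
    @key_fn = key_fn || ->(value) { value }
    @seen = Set.new   # keys that have ever been queued
    @items = []       # pending values in FIFO order
  end

  # Returns true if the value was queued, false if its key was already seen.
  def push(value)
    key = @key_fn.call(value)
    return false unless @seen.add?(key) # Set#add? returns nil for an existing element
    @items.push(value)
    true
  end

  # Removes and returns the oldest value, or nil when the queue is empty.
  def pop
    @items.shift
  end

  def size
    @items.size
  end

  def empty?
    @items.empty?
  end
end
```

If a value should become queueable again once it has been popped, pop would also need to delete the corresponding key from the set. The sketch is not thread-safe.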

Related

Splitting work across microservice instances with dynamic partitions

I have a table in DB for "messages to be sent". Message ordering for same destination is important.
For instance:
create table outbox (
  id serial primary key,
  destination varchar,
  payload varchar
);
Currently I have a thread which does select * from outbox order by id, and because IDs are ordered, I can group the rows by destination and send.
Now I want to make it a separate microservice, but I am not sure how to handle this if I have to scale it.
That's going to scale as far as the DB can scale (because you're putting the burden of synchronizing on the DB).
Note that your IDs are globally synchronized, when you really only need the synchronization per destination. You can get pretty far by sharding the outbox table by destination: create N outbox tables and consistently map a given destination to a given outbox table. You can grow the number of outbox tables as needed, as long as doing so doesn't result in a change in which outbox table an already-existing destination uses (this can be satisfied by having a table tracking which outbox table to use for a given destination: entries can be added to that table by hash modulo number of outbox tables). At the limit, as you scale this out, you might end up with 1 outbox table per destination.
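The sticky mapping described above can be sketched as follows. This is only an illustration: OutboxRouter, table_for, grow and the outbox_N table names are hypothetical, and the in-memory Hash stands in for the tracking table, which in practice would live in the database. Zlib.crc32 is used because it gives the same value in every process, unlike String#hash.

```ruby
require "zlib"

# Hypothetical router that pins each destination to one outbox table.
class OutboxRouter
  def initialize(table_count)
    @table_count = table_count
    @assignments = {} # stand-in for the destination -> outbox table tracking table
  end

  # Adds outbox tables; destinations that already have an assignment keep it.
  def grow(new_count)
    raise ArgumentError, "table count can only grow" if new_count < @table_count
    @table_count = new_count
  end

  # Returns the outbox table for a destination, assigning one by
  # hash modulo the current table count the first time it is seen.
  def table_for(destination)
    @assignments[destination] ||= "outbox_#{Zlib.crc32(destination) % @table_count}"
  end
end
```

Because the assignment is recorded on first use, growing the number of tables only affects destinations that have not been seen yet, so per-destination ordering is preserved.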
The above is implementable in anything.
That said, the requirement that message sends be ordered per destination is strikingly similar to the actor model of computation (e.g. most, if not all, actor implementations provide a guarantee that if actor A sends messages X, Y, Z to actor B, actor B will receive those messages in the order they were sent). Many actor model frameworks (e.g. Akka on the JVM, Akka.Net, Orleans, Lagom, Cloudstate, Ray(?)) support a notion of sharding actors across a cluster and using event-sourced persistence, which will manage a lot of the aspects of what I outlined above for you. So it might be worth investigating that approach rather than trying to implement all that yourself.

State store partitioned iterator?

I have a Kafka-streams transformer which functions like a windower: it accumulates state into a state store in transform() and then forwards it in an output topic during punctuate(), with the state store topic partition key the same as the input topic.
During punctuate(), I would like each StreamThread to only iterate its own partition of the state store to minimize the amount of data to be read from the backing kafka topic. But the only iterator I can get is through
org.apache.kafka.streams.state.ReadOnlyKeyValueStore<K,V>.all()
which iterates through the whole state store.
Is there any way to "assign partitions" of a state store and make punctuate() iterate only on the assigned partitions?
I guess ReadOnlyKeyValueStore<K,V>.all() does what you want. Note that the overall state is sharded into multiple stores, with one shard/store per partition. all() does not iterate through "other shards". "all" means "everything local", i.e., everything from the shard of a single partition.

DocumentDB unique concurrent insert?

I have a horizontally scaled, event-sourced application that runs using an Azure Service Bus Topic and a Service Bus Queue. Some events for building up my domain model's state are received through the topic by all my servers, while the ones on the queue (which are received a lot more often and do not mutate domain model state) are distributed among the servers in order to spread the load.
Now, every time one of my servers receives an event through the queue or topic, it stores it in a DocumentDB which it uses as event store.
Now here's the problem. How can I be sure that the same document is not inserted twice? Let's say 3 servers receive the same event. They all try to store it. How can I make it fail for 2 of the servers in the case they decide to do it all at the same time? Is there any form of unique constraint I can set in DocumentDB or some kind of transaction scope to prevent the document from being inserted twice?
The id property for each document has a uniqueness constraint. You can use this constraint to ensure that duplicate documents are not written to a collection.

How to keep order of records in database

I'm developing an app where records appear in certain order. Users are allowed to reorder records as they wish, and I need to store that.
I have an order number for each record, but when they reorder records, that affects all records that go after that record, which could be quite an expensive database operation.
Is there a clever way of storing record's order number, so that it doesn't affect many of the other records?
I have written a web application with similar requirements at a high level. I added two fields to a document which contained metadata about the user-sortable list:
SortOrderVersion: integer
SortOrder: array of _id for documents
The SortOrder simply contained an ordered array of each document's _id. It was that list that was manipulated by the client. The second field, SortOrderVersion, was used to optimistically protect against simultaneous changes by multiple clients. If the version being sent matched what was stored via findAndModify, then the update was allowed, and the number was incremented to prevent further changes by other clients. (And as a bonus, the changes were pushed to the other clients via a web socket connection).
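The version check described above can be sketched roughly like this. It is an in-memory stand-in, not the original application's code: SortOrderStore and its methods are hypothetical, and the Mutex plays the role that the atomic findAndModify call plays in the database.

```ruby
# Hypothetical in-memory stand-in for the metadata document holding
# SortOrder (array of _id values) and SortOrderVersion (integer).
class SortOrderStore
  attr_reader :sort_order, :version

  def initialize(ids)
    @sort_order = ids.dup
    @version = 0
    @lock = Mutex.new # stands in for the atomicity of findAndModify
  end

  # Applies new_order only if the caller saw the current version.
  # Returns true on success, false if another client updated first.
  def update(expected_version, new_order)
    @lock.synchronize do
      return false unless expected_version == @version
      @sort_order = new_order.dup
      @version += 1
      true
    end
  end

  # Orders documents by the position of their _id in the stored list;
  # documents missing from the list are placed at the end.
  def sort(documents)
    position = @sort_order.each_with_index.to_h
    documents.sort_by { |doc| position.fetch(doc[:_id], @sort_order.size) }
  end
end
```

A client whose update returns false would typically reload the current order and version before retrying.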
When doing it this way, the server would do the sorting based on the list before returning it to the client as it was cached, and didn't change frequently. I could have pushed the busy work of sorting to the client, I just didn't think it was necessary.
I had considered storing the documents as subdocuments in a sorted array within a single document, but in my case there were too many opportunities where multiple users would be editing the details of the subdocuments, which complicated updates and reordering significantly.
While I didn't need it for this web application, by storing the sort order independently, I could have extended the application to provide sorting easily on a per user basis.

Checking uniqueness on real time application

In a real-time messaging application, I want to check whether each incoming message is unique. For this purpose, I am planning to insert a hash of the incoming message as a unique key in the DB and check whether I get a unique key exception (ORA-00001 in Oracle).
Is this an efficient way, or is there a better way to consider for this case?
For those who want to know, the program is written in Java and we use Oracle as the database.
If you're trying to get around the performance problem of uniqueness tests on very large strings, then this is a decent way of achieving it, yes.
You might need a way to deal with hash collisions, though, as the presence of a unique key would prevent different messages having the same hash from loading. One way would be to check for existing matching hashes and do a comparison test against the full text of the message. It would keep your index size down as you'd index on the hash, not the message text, but it would not be completely foolproof, as two identical messages could be loaded by different sessions if the timing was exactly right (or wrong, depending on your perspective).
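The "look up by hash, then compare the full text" idea can be sketched as below. This is an in-memory illustration in Ruby rather than Java and Oracle: MessageDeduplicator and accept are hypothetical names, the Hash stands in for a non-unique index on the hash column, and SHA-256 is an assumed choice of hash. The Mutex serializes check-and-insert within one process, which is exactly the guarantee that separate database sessions would not get for free.

```ruby
require "digest"

# Hypothetical deduplicator: indexes messages by hash, compares full text on a hash match.
class MessageDeduplicator
  def initialize
    # digest -> list of distinct messages sharing that digest (handles collisions)
    @by_hash = Hash.new { |hash, key| hash[key] = [] }
    @lock = Mutex.new
  end

  # Returns true if the message is new and was stored, false if it is a duplicate.
  def accept(message)
    digest = Digest::SHA256.hexdigest(message)
    @lock.synchronize do
      bucket = @by_hash[digest]
      return false if bucket.include?(message) # full-text comparison on hash match
      bucket << message
      true
    end
  end
end
```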
