Several questions about Multi-Paxos

I have several questions about Multi-Paxos.
Will each instance have its own proposal number, accepted ballot, and accepted value? Or do all instances share the same proposal number, so that after one instance finishes the next one starts?
If all instances share the same proposal number, consider the following situation: server A sends a proposal, and an acceptor returns an accepted instanceId that might be greater or less than the proposal's instanceId. What should the proposer do then? Use that instanceId and its value for the accept phase, then increase its own instanceId, wait for the next round, and re-propose its own value? If so, when is the previously accepted value removed? If it is never removed, the acceptor will return that instanceId and value again, which looks like a loop.

Multi-Paxos has a vague description, so two people may build two different systems based on it; in the context of one system the answer is "no," and in the context of another it's "yes."
Approach #1 - "No"
Mindset: Paxos is a two-phase protocol for building write-once registers. Multi-Paxos is a technique for building a log on top of them.
One of the possible ways to build a log is:
Create an array of completely independent write-once registers and initialize the first one with an initial value.
To append a new record we should:
A) Guess an index (X) of a vacant register and try to write a dummy record there (if it's already used, pick a register with a higher index and retry).
B) Write dummy records to every register with an index smaller than X until we find a register already holding a non-dummy record.
C) Calculate a new record based on it (e.g., a record may carry an ordinal, and we can use it to calculate the ordinal of the new record; since some registers are filled with dummy records, the ordinals aren't equal to the indexes) and write it to the X+1 register. In case of a conflict, restart the procedure from step A).
To read the log we write dummy values starting from the first register, and on each conflict we increment the index and retry; the first write that succeeds indicates that the log's end is reached.
Of course, there is a lot of overhead in this approach, so please treat it just as a top-level overview of what Multi-Paxos is.
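To make the procedure above concrete, here is a toy, single-process sketch in Python. WriteOnceRegister stands in for a real Paxos-backed register, and every name here (DUMMY, Log, make_record) is invented for illustration; a real implementation would run each write_once as a full two-phase Paxos round against a quorum of acceptors.

DUMMY = object()   # unique sentinel record marking a register that was skipped

class WriteOnceRegister:
    # Stand-in for a register decided by a single-decree Paxos round.
    def __init__(self):
        self._decided = None

    def write_once(self, value):
        # Try to decide `value`; returns (decided_value, we_won).
        if self._decided is None:
            self._decided = value
            return value, True
        return self._decided, False

class Log:
    def __init__(self, initial_record):
        self._regs = []
        self._reg(0).write_once(initial_record)   # register 0 never holds a dummy

    def _reg(self, i):
        while len(self._regs) <= i:
            self._regs.append(WriteOnceRegister())
        return self._regs[i]

    def append(self, make_record):
        # make_record(previous_record) -> new record
        while True:
            # Step A: guess an index X of a vacant register and claim it with a
            # dummy record; if the register turns out to be used, go higher.
            x = len(self._regs)
            while not self._reg(x).write_once(DUMMY)[1]:
                x += 1
            # Step B: dummy-fill lower registers until we hit a non-dummy record;
            # register 0 holds the initial record, so this always terminates.
            i, last = x - 1, None
            while last is None:
                decided, _ = self._reg(i).write_once(DUMMY)
                if decided is not DUMMY:
                    last = decided                 # the latest real record
                i -= 1
            # Step C: derive the new record and try to place it at X + 1.
            new_record = make_record(last)
            if self._reg(x + 1).write_once(new_record)[1]:
                return new_record
            # Conflict: someone else won register X + 1 first; restart at step A.

    def read(self):
        # Write dummies from index 0 upward; the first write that succeeds on an
        # empty register marks the end of the log.
        records, i = [], 0
        while True:
            decided, won = self._reg(i).write_once(DUMMY)
            if won:
                return records
            if decided is not DUMMY:
                records.append(decided)
            i += 1

log = Log({"ordinal": 0})
log.append(lambda prev: {"ordinal": prev["ordinal"] + 1})
print(log.read())   # [{'ordinal': 0}, {'ordinal': 1}]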
The log is a powerful concept, and we can use it as a recipe for building distributed state machines - just think of each record as an update command. Unfortunately, in some cases there is also a lot of overhead. For example, if you want to build key/value storage and you care only about the current value, then you don't need the history and probably need to implement garbage collection to remove past versions from the log to optimize storage costs.
Approach #2 - "Yes"
Mindset: rewritable register as a heavily optimized version of Multi-Paxos.
If you start with the described approach applied to building key/value storage and then iterate to get rid of the overhead, e.g., by doing garbage collection on the fly, then eventually you may come up with the idea of making the write-once register rewritable.
In that case, each instance uses the same ballot numbers just because all the instances are collapsed into one rewritable instance.
I described this approach in the How Paxos Works post and implemented it in the Gryadka project in 500 lines of JavaScript. Also, the idea behind it was independently checked with TLA+ by Greg Rogers and Tobias Schottdorf.
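For intuition, this is roughly what the acceptor side of such a rewritable register looks like (a simplified sketch in the spirit of Gryadka/CASPaxos, not its actual code; the names and return values are made up):

class RewritableRegisterAcceptor:
    # One ballot sequence guards the single register; there is no per-instance state.
    def __init__(self):
        self.promise = 0           # highest ballot promised so far
        self.accepted_ballot = 0   # ballot of the last accepted value
        self.value = None          # last accepted value

    def prepare(self, ballot):
        # Phase 1: promise to ignore smaller ballots and report the current value.
        if ballot <= self.promise:
            return ("conflict", self.promise)
        self.promise = ballot
        return ("promise", self.accepted_ballot, self.value)

    def accept(self, ballot, value):
        # Phase 2: accept the value unless a higher ballot was promised meanwhile.
        if ballot < self.promise:
            return ("conflict", self.promise)
        self.promise = self.accepted_ballot = ballot
        self.value = value
        return ("ok",)

A proposer reads the current value from a quorum in phase 1, applies its change function to it, and writes the result in phase 2, which is how the register gets "rewritten" without an explicit log.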


LCI data format in BW2

So I want to import my own LCI database to Brightway2, and my process has 3 valuable products.
I found this example with co-products: https://github.com/massimopizzol/B4B/blob/main/02.2_Simple_LCA_co_products.py
The example shows more or less how it works, but I would like to use allocation for my process and not substitution. Should I just change the type to "allocation" instead of "substitution", or does bw2 not support allocation? Also, if we have 3 valuable products, is the first one part of the main activity with type="production" while the other 2 get type="substitution"? And for the other 2, do we create 2 separate activities that are essentially single-exchange activities whose exchange type is "production", like in the example?
Besides, just to make sure: if one of the inputs has type="technosphere", we need to create another activity where we show the process behind it. Raw products have type="biosphere", and their amounts are negative, in contrast to emissions.
I set the other valuable products' type to "substitution" and for each of them I created a new activity where the type was "production". Overall it worked, but the obtained LCA score wasn't correct, so I don't know whether I made a conceptual mistake.
Thank you in advance for all your help and time!
So, Brightway currently does not have a model where you can enter a multifunctional process and get the software to do allocation for you. You will need to do the allocation yourself :) Here is a notebook I wrote up that shows a simple allocation procedure.
P.S. In the future please post to only one of the beginners mailing list or SO, so that everyone doesn't get notified twice.
Changing "substitution" by "allocation" will not work. If you want to use allocation / partition, I would create the activities with the exchanges already allocated.
The meaning of the "substitution" exchange as well as the sign conventions for biosphere flows is explained in the documentation here.
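To illustrate the "create the activities with the exchanges already allocated" route, here is a rough sketch of how that could look in bw2, splitting a process with two valuable products using made-up economic allocation factors; the database name, activity codes, amounts, and especially the biosphere flow key are placeholders you would replace with your own data:

import brightway2 as bw

db_name = "my_lci"
factors = {"product_A": 0.7, "product_B": 0.3}   # placeholder allocation factors

data = {
    # A plain upstream input so the technosphere exchanges below resolve.
    (db_name, "some_input"): {
        "name": "some upstream input",
        "unit": "kilogram",
        "exchanges": [
            {"input": (db_name, "some_input"), "amount": 1.0, "type": "production"},
        ],
    },
}

for product, f in factors.items():
    code = f"process_{product}"
    data[(db_name, code)] = {
        "name": f"my process, allocated to {product}",
        "unit": "kilogram",
        "exchanges": [
            # Each allocated activity has exactly one production exchange...
            {"input": (db_name, code), "amount": 1.0, "type": "production"},
            # ...and carries its share of the original inputs and emissions.
            {"input": (db_name, "some_input"), "amount": f * 2.5,
             "type": "technosphere"},
            # Replace with a real flow key from your biosphere database.
            {"input": ("biosphere3", "replace-with-a-real-flow-code"),
             "amount": f * 0.1, "type": "biosphere"},
        ],
    }

bw.Database(db_name).write(data)

The LCA score of "my process, allocated to product_A" then carries 70% of the burdens, with no "substitution" exchanges involved.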

Making sure you don't add the same person data twice using Event Sourcing

I am wondering how you make sure you are not adding the same person twice in your EventStore?
Let's say that in your application you add person data, but you want to make sure that the same person name and birthday is not added twice in different streams.
Do you ask your ReadModels, or do you do it within your EventStore?
I am wondering how you make sure you are not adding the same person twice in your EventStore?
The generalized form of the problem that you are trying to solve is set validation.
Step #1 is to push back really hard on the requirement to ensure that the data is always unique - if it doesn't have to be unique always, then you can use a detect and correct approach. See Memories, Guesses, and Apologies by Pat Helland. Roughly translated, you do the best you can with the information you have, and back up if it turns out you have to revert an error.
If a uniqueness violation would expose you to unacceptable risk (for instance, getting sued into bankruptcy because the duplication violated government-mandated privacy requirements), then you have to do the work.
To validate set uniqueness you need to lock the entire set; this lock could be pessimistic or optimistic in implementation. That's relatively straightforward when the entire set is stored in one place (which is to say, under a single lock), but something of a nightmare when the set is distributed (i.e., across multiple databases).
If your set is an aggregate (meaning that the members of the set are being treated as a single whole for purposes of update), then the mechanics of DDD are straightforward. Load the set into memory from the "repository", make changes to the set, persist the changes.
This design is fine with event sourcing where each aggregate has a single stream -- you guard against races by locking "the" stream.
Most people don't want this design, because the members of the set are big, and for most data you need only a tiny slice of that data, so loading/storing the entire set in working memory is wasteful.
So what they do instead is move the responsibility for maintaining the uniqueness property from the domain model to the storage. RDBMS solutions are really good at sets. You define the constraint that maintains the property, and the database ensures that no writes which violate the constraint are permitted.
If your event store is a relational database, you can do the same thing -- the event stream and the table maintaining your set invariant are updated together within the same transaction.
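A bare-bones illustration of that idea using sqlite3 (the table layout and event shape are made up): the appended event and the row that carries the uniqueness invariant are written in the same transaction, so a duplicate registration is rejected atomically.

import sqlite3, json

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        stream_id TEXT, version INTEGER, type TEXT, payload TEXT,
        PRIMARY KEY (stream_id, version));
    CREATE TABLE person_identity (        -- table that carries the set invariant
        name TEXT, birthday TEXT,
        PRIMARY KEY (name, birthday));    -- uniqueness enforced by the database
""")

def register_person(stream_id, name, birthday):
    try:
        with conn:  # one transaction: both inserts commit, or neither does
            conn.execute("INSERT INTO person_identity VALUES (?, ?)",
                         (name, birthday))
            conn.execute("INSERT INTO events VALUES (?, 1, 'PersonRegistered', ?)",
                         (stream_id, json.dumps({"name": name, "birthday": birthday})))
        return True
    except sqlite3.IntegrityError:
        return False   # duplicate (name, birthday): the whole transaction rolls back

print(register_person("person-1", "Bob", "1990-01-01"))  # True
print(register_person("person-2", "Bob", "1990-01-01"))  # False, rejected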
If your event store isn't a relational database? Well, again, you have to look at money -- if the risk is high enough, then you have to discard plumbing that doesn't let you solve the problem with plumbing that does.
In some cases, there is another approach: encoding the information that needs to be unique into the stream identifier. The stream comes to represent "All users named Bob", and then your domain model can make sure that the Bob stream contains at most one active user at a time.
Then you start needing to think about whether the name Bob is stable, and which trade-offs you are willing to make when an unstable name changes.
People's names are a particularly miserable problem, because none of the things we believe about names are true. So you get all of the usual problems with uniqueness, dialed up to eleven.
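A tiny sketch of the stream-identifier idea, with a hypothetical event-store client; append and expected_version are invented names, not a real API:

# The stream identity is derived from the attribute that must be unique, so two
# concurrent registrations race on the *same* stream and the store's optimistic
# concurrency check rejects one of them.

def identity_stream(name, birthday):
    # Naive normalization; real names are much messier than this.
    return f"person-identity-{name.strip().lower()}-{birthday}"

def register(event_store, name, birthday, person_id):
    stream = identity_stream(name, birthday)
    event = {"type": "PersonClaimedIdentity", "person_id": person_id}
    # expected_version=-1 means "this stream must not exist yet"; a second writer
    # with the same name/birthday hits a version conflict.
    event_store.append(stream, [event], expected_version=-1)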
If you are going to validate this kind of thing then it should be done in the aggregate itself IMO, and you'd have to use read models for that, like you say. But then you end up with infrastructure code/dependencies being sent into your aggregates/passed into your methods.
In this case I'd suggest creating a read model of Person.Id, Person.Name, Person.Birthday and then instead of creating a Person directly, create some service which uses the read model table to look up whether or not a row exists and either give you that aggregate back or create a new one and give that back. Then you won't need to validate at all, so long as all Person-creation is done via this service.
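A rough sketch of that service, assuming a read model keyed on (name, birthday); the read-model and repository interfaces are invented for illustration:

class PersonCreationService:
    def __init__(self, read_model, repository):
        self.read_model = read_model   # exposes find_by_identity(name, birthday) -> id or None
        self.repository = repository   # loads/creates Person aggregates

    def get_or_create(self, name, birthday):
        existing_id = self.read_model.find_by_identity(name, birthday)
        if existing_id is not None:
            return self.repository.load(existing_id)      # reuse the existing aggregate
        return self.repository.create(name, birthday)     # emits a PersonCreated event

Note that the read model is typically eventually consistent, so a race can still slip a duplicate through; combined with the detect-and-correct stance mentioned above, that is often acceptable.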

Choosing Redis datatypes for advanced data manipulation in a simple torrent tracker service

I need your advice on Redis datatypes for my project. The project is a torrent tracker (Ruby, simple Sinatra-based) with a pure in-memory data store for current information about peers. I feel like this is what Redis is made for, but I'm stuck choosing the proper data types. For now I'm leaning toward the following setup:
Use a list for seeders. What I actually need is more like a ring buffer, so I can get a sequential range of seeders (with a given size and start position) and save the new start position for the next time.
Use a sorted set for leechers. The score for each leecher is downloaded/(downloaded+left), so I can also extract a range for any specific case.
All string values in the set and list are string (bencoded) representations of peer data.
What the setup above lacks:
I need to store an offset for seeders, so data access needs synchronization.
There is no obvious way to find a specific seeder in a list. A set would help here, but then I wouldn't be able to extract a range of items at once.
(General problem) Set/list members need a TTL (in case a client shuts down without announcing first). A possible option is to make each peer an ordinary key/value (string or hash), give it a TTL, subscribe to its expiration, and delete it from the corresponding list or set.
What could you suggest? Any practical advice?
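For reference, the setup described above (a sorted set for leechers scored by completion, one plain key per peer with a TTL, and expiration notifications used for cleanup) could be sketched with redis-py roughly like this; the key names, TTL, and database index 0 are assumptions:

import redis

r = redis.Redis()

def announce_leecher(info_hash, peer_id, peer_blob, downloaded, left, ttl=1800):
    score = downloaded / (downloaded + left) if (downloaded + left) else 1.0
    r.zadd(f"leechers:{info_hash}", {peer_id: score})       # rank by completion ratio
    r.setex(f"peer:{info_hash}:{peer_id}", ttl, peer_blob)  # bencoded peer data, expires

def leecher_range(info_hash, start, count):
    ids = r.zrange(f"leechers:{info_hash}", start, start + count - 1)
    return [r.get(f"peer:{info_hash}:{pid.decode()}") for pid in ids]

def run_expiry_listener():
    # Evict peers whose per-peer key expired (the client went away silently).
    r.config_set("notify-keyspace-events", "Ex")   # enable expired-key events
    p = r.pubsub()
    p.psubscribe("__keyevent@0__:expired")
    for msg in p.listen():
        if msg["type"] != "pmessage":
            continue
        key = msg["data"].decode()                 # e.g. "peer:<info_hash>:<peer_id>"
        if key.startswith("peer:"):
            _, info_hash, peer_id = key.split(":", 2)
            r.zrem(f"leechers:{info_hash}", peer_id)

Keeping only peer ids in the list/sorted set and the bencoded blob under per-peer keys also makes it cheap to look up a specific peer directly by its key.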

Global mapping of one subscript dimension to another database

I have a vendor defined database (about 140GB total) on Caché 2007. It uses the old style MUMPS programming environment and accesses globals directly in a hierarchical style. There is one global that accounts for about 75% of the total database size. The first subscript in this table is an artificial integer account number. The next 2-3 subscripts are constant subrecord identifiers that break up blocks of fields and denote repeating sub record kinds.
One of these repeating subrecords (record type 30) is for notes on an account. Because of the way the system is used, this dimension accounts for a very large portion of the global's total space; I'd estimate it to be at least 50%. Because of the way Caché stores data physically in the database, a scan of this global ends up loading all or most of these notes as a side effect even though they aren't relevant to most operations. It has the effect of greatly increasing the cost of IO operations on the global, especially when you only want one tiny detail from a bunch of accounts.
Example subscript references for this global:
^ACCT(3461,10,1)="SOME^DATA"
^ACCT(3461,10,2)="MORE^DATA"
...
^ACCT(3461,30,1)="NOTE1 blah blah"
^ACCT(3461,30,2)="NOTE2 blah blah"
...
^ACCT(3461,30,100)="NOTE100 blah blah"
I can't change the design of the database. It's controlled by an outside vendor and there is a large amount of MUMPS style hardcoded references in the database. I'm thinking that a big reason that batch operations are so slow on the system are due to the high cost of these mostly irrelevant notes coming along for the IO ride whenever account data is accessed. Scanning this whole global (i.e. when there is no useful application maintained index) takes at least 8 hours.
One thought I had is to shift the note data from being stored along side other details in the global to a separate database file by using the global mapping facility described in the Guide to Using Caché Globals and Guide to System Administration. If I could map all the subscript 30s to a separate database file in the same Caché database, most data operations (the ones that don't even care about notes) wouldn't be bringing those in to memory along with the details they do care about.
In the global structure guide (1st link), this looks plausible, as they show a particular 2nd-level subscript being mapped separately from the 1st subscript. What they don't show in any of the examples is the syntax to make that happen. In the "Add a new global mapping" screen in the Caché Management Portal, I should be able to do something like
Global name: ACCT
Subscripts to be mapped: (BEGIN:END)(30)
But whatever variations I try in the syntax, I always get ERROR #657: Invalid subscript in reference 1 subscript #1.
StackExchange note: This question would possibly be better suited to dba.stackexchange.com but there are apparently zero Intersystems questions there and I don't think it would get any attention.
Unfortunately, while it's possible to map 2nd level subscripts of a particular node, it's not possible to map 2nd level subscripts of all nodes.
There is an experienced performance team at the WRC; have you tried contacting them?

How would you implement a Workflow system?

I need to implement a Workflow system.
For example, to export some data, I need to:
Use an XSLT processor to transform an XML file
Use the resulting transformation to convert into an arbitrary data structure
Use the resulting (file or data) and generate an archive
Move the archive into a given folder.
I started by creating two types of classes: Workflow, which is responsible for adding new Step objects and running them.
Each Step implements a StepInterface.
My main concern is that each of my steps depends on the previous one (except the first), and I'm wondering what would be the best way to handle this.
I thought of looping over the steps and providing each step with the result of the previous one (if any), but I'm not really happy with that.
Another idea would be to allow a "previous" Step to be set on a Step, like:
$s = new Step();
$s->setPreviousStep($previousStep);
But I lose the utility of a Workflow class.
Any ideas or advice?
By the way, I'm also concerned about the success or failure of the whole workflow, meaning that if any step fails I need to roll back or clean up the data produced by previous steps.
I implemented a similar workflow engine last year (closed source though, so no code that I can share). Here are a few ideas based on that experience:
StepInterface - can do what you're doing right now - abstract a single step.
Additionally, provide a rollback capability, but I think a step should know when it fails and clean up before proceeding further. An abstract step can handle this for you (template method).
You might want to consider branching based on the StepResult - so you could have a StepMatcher that takes a StepResult object and a conditional; its sub-steps are executed only if the conditional returns true.
You could also do a StepException to handle exceptional flows if a step errors out. Ideally, this is something that you can define either at a workflow level (do this if any step fails) and/or at a step level.
I took the approach that a step returns a well-defined structure (StepResult) that's available to the next step. If there's bulky data (say a large file, etc.), then a URI/locator for the resource is passed in the StepResult.
Your workflow is going to need a context to work with - in the example you quote, this would be the name of the file, the location of the archive and so on - so think of a WorkflowContext.
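Pulling those pieces together, a condensed sketch (in Python for brevity; StepResult and WorkflowContext follow the names used above, everything else is invented):

from abc import ABC, abstractmethod

class WorkflowContext:
    # Shared bag of values: file names, archive location, and so on.
    def __init__(self, **values):
        self.values = dict(values)

class StepResult:
    def __init__(self, success, data=None, error=None):
        self.success, self.data, self.error = success, data, error

class Step(ABC):
    @abstractmethod
    def run(self, context, previous_result):
        """Do the work and return a StepResult; bulky outputs go into the
        context (or the result) as URIs/paths rather than raw data."""

    def rollback(self, context):
        """Undo this step's side effects; default is a no-op."""

class Workflow:
    def __init__(self, steps):
        self.steps = list(steps)

    def run(self, context):
        done, result = [], None
        for step in self.steps:
            result = step.run(context, result)
            if not result.success:
                for completed in reversed(done):   # clean up in reverse order
                    completed.rollback(context)
                return result
            done.append(step)
        return result

A concrete step for the export example would read the XML path from the context, run the XSLT, and put the output path back into the context for the next step.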
Additional thoughts
You might want to consider the following too - if this is something that you're planning to implement as a large scale service/server:
Steps could be in libraries that were dynamically loaded
Workflow definition in an XML/JSON file - again, dynamically reloaded when edited.
Remote invocation and callback - submit a job to a remote service with a callback API. When the remote service calls back, the workflow execution is picked up at the subsequent step in the flow.
Parallel execution where possible etc.
Stateless design
Rolling back can be fit into this structure easily, as each Step will implement its own rollback() method, which the workflow can call (in reverse order preferably) if any of the steps fail.
As for the main question, it really depends on how sophisticated you want to get. On a basic level, you can define a StepResult interface, which is returned by each step and passed on to the next one. The obvious problem with this approach is that each step should "know" which implementation of StepResult to expect. For small systems this may be acceptable; for larger systems you'd probably need some kind of configurable mapping framework that can be told how to convert the result of the previous step into the input of the next one. So Workflow will call Step, Step returns StepResult, Workflow then calls StepResultConverter (which is your configurable mapping thingy), StepResultConverter returns a StepInput, Workflow then calls the next Step with the StepInput, and so on.
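Sketched on top of the same Step/StepResult shape, the converter chain could look roughly like this (StepResultConverter and the key mapping are invented placeholders; here a step's run takes a plain dict of inputs):

class StepResultConverter:
    # Maps the previous step's StepResult into the next step's input dict,
    # e.g. {"archive_path": "input_file"} renames that one field.
    def __init__(self, key_mapping):
        self.key_mapping = key_mapping

    def convert(self, step_result):
        data = step_result.data or {}
        return {new: data[old] for old, new in self.key_mapping.items()}

class ConvertingWorkflow:
    def __init__(self, steps_with_converters):
        self.steps = steps_with_converters            # list of (step, converter or None)

    def run(self, context):
        step_input, result = {}, None
        for step, converter in self.steps:
            result = step.run(context, step_input)
            if not result.success:
                return result
            step_input = converter.convert(result) if converter else (result.data or {})
        return result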
I've had great success implementing workflows using a finite state machine. It can be as simple or as complicated as you like, with multiple workflows linking to each other. Generally an FSM can be implemented as a simple table; the current state of a given object is tracked by keeping a journal of the transitions on the object in a history table and simply retrieving the last entry. So a transition would be of the form:
nextState = TransLookup(currState, Event, [Condition])
If you are implementing a front end you can use this transition information to construct a list of the events available to a given object in its current state.
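A minimal version of that table-driven approach, with the journal giving the current state and a helper that lists the events a front end could offer (the state and event names are placeholders):

# Transition table: (current_state, event) -> next_state.
TRANSITIONS = {
    ("draft",     "submit"): "exporting",
    ("exporting", "done"):   "archived",
    ("exporting", "error"):  "failed",
    ("failed",    "retry"):  "exporting",
}

journal = {}   # object_id -> list of (event, next_state); the last entry is the current state

def current_state(object_id, initial="draft"):
    history = journal.get(object_id)
    return history[-1][1] if history else initial

def fire(object_id, event):
    state = current_state(object_id)
    next_state = TRANSITIONS.get((state, event))
    if next_state is None:
        raise ValueError(f"event {event!r} not allowed in state {state!r}")
    journal.setdefault(object_id, []).append((event, next_state))
    return next_state

def available_events(object_id):
    # What a UI could offer for this object in its current state.
    state = current_state(object_id)
    return [ev for (st, ev) in TRANSITIONS if st == state]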
