I found a good answer about idempotency here (https://softwareengineering.stackexchange.com/questions/320143/should-an-idempotent-service-always-return-the-same).
But what really is the definition of an "identical request"?
Can two different API calls (different request ID / correlation ID) with the exact same body content be considered the identical request?
My understanding is: it does not matter when and how (via API call or event messaging) the two requests are made; as long as they have the same effect on the application state, they are the same request. Is this correct?
As long as the application state stays the same after the requests, yes. How you implement idempotency depends on how you want to deal with these types of scenarios.
Scenario 1
increment(x)
append(xs, x)
I highly discourage such APIs; they must have a unique correlation ID or some sort of admission control to work properly.
idempotency token: 521F14B6-4A72-467F-8CD0-6654C69F4629
increment(x);
append(xs, x)
where xs was read at revision 1022;
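As an aside, here is a minimal sketch of what that revision check could look like (the class and method names are made up for illustration, not from any particular framework):

```python
# Sketch of "append(xs, x) where xs was read at revision 1022":
# the write carries the revision the client read, and is rejected
# if the list has moved on, so a retry can never double-append.

class ConflictError(Exception):
    pass

class VersionedList:
    def __init__(self):
        self.items = []
        self.revision = 0

    def read(self):
        # The client remembers the revision alongside the data.
        return list(self.items), self.revision

    def append(self, item, expected_revision):
        if expected_revision != self.revision:
            raise ConflictError(f"list is now at revision {self.revision}")
        self.items.append(item)
        self.revision += 1

xs = VersionedList()
_, rev = xs.read()
xs.append("x", rev)      # succeeds, revision becomes 1
# xs.append("x", rev)    # a blind retry with the stale revision raises ConflictError
```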
Scenario 2
Put(x=1)
Delete(x)
Put(x=1)
How do you determine if the two Puts are the same? Again, either idempotency tokens or admission control or both.
A key issue with idempotency tokens is what to do with them. Do they have a TTL? Do you store them in a database?
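To make those questions concrete, here is a hypothetical in-memory token store with a TTL; a real system would typically keep the tokens (and the stored responses) in a database or cache instead:

```python
import time

TTL_SECONDS = 24 * 60 * 60  # made-up policy: replay window of one day

class IdempotencyStore:
    def __init__(self):
        self._seen = {}  # token -> (stored response, expiry timestamp)

    def execute(self, token, operation):
        now = time.time()
        # Evict expired tokens: after the TTL, a replay is treated as new.
        self._seen = {t: v for t, v in self._seen.items() if v[1] > now}
        if token in self._seen:
            # Duplicate request: return the original response without
            # running the side effect again.
            return self._seen[token][0]
        response = operation()
        self._seen[token] = (response, now + TTL_SECONDS)
        return response

store = IdempotencyStore()
counter = {"x": 0}

def increment():
    counter["x"] += 1
    return counter["x"]

token = "521F14B6-4A72-467F-8CD0-6654C69F4629"
store.execute(token, increment)  # runs the increment, returns 1
store.execute(token, increment)  # replay: returns the stored 1, x stays 1
```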
Related
I have gone through various REST API documentation. However, I don't have a clear understanding of it. Can someone help me understand it in layman's terms? How is it different from an API being stateful? Why are all the APIs we develop today RESTful APIs?
The terms stateless/stateful are common terms and mean the same thing for a REST API as for anything else.
Stateless means that when you make a call to some entity, the response (output) depends on your input only. That is, for any number of calls with the same input you will ALWAYS get the same response. Example: a request of "what does 2 + 2 equal?" will always get you the answer 4, no matter how many times you ask and regardless of whether there were other queries in between your requests.
Stateful means that the output depends on your input and some internal state. Example: "Please add 2 to the current number that you hold. What is the new number?" (assuming the internal state, the current number, starts at 0). After the first query you will get the answer 2, but after the same query a second time you will get the answer 4, and so forth.
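The same contrast in code, as a toy illustration:

```python
# Stateless: the output depends only on the input.
def add(a, b):
    return a + b

add(2, 2)  # always 4, no matter how many times you ask

# Stateful: the output also depends on what the server remembers.
class Accumulator:
    def __init__(self):
        self.current = 0  # internal state

    def add(self, n):
        self.current += n
        return self.current

acc = Accumulator()
acc.add(2)  # 2 after the first call
acc.add(2)  # 4 after the second call with the same input
```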
Each request from client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server. -- Fielding, 2000
Note that this is a constraint on semantics: all of the information that you need to understand the meaning of the request is in the request itself.
Counter-example: the LIST command in FTP, where a null argument for pathname "implies the user's current working or default directory", is not stateless -- the current working directory is session state that the server needs to remember from previous requests.
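A toy version of that contrast (all names here are illustrative):

```python
# Stateful, in the spirit of FTP's LIST with no argument: the server
# must remember the working directory from earlier requests.
session = {"cwd": "/home/alice"}

def list_stateful():
    return f"listing of {session['cwd']}"  # meaning depends on stored context

# Stateless: the request itself names the directory, so any server
# replica can answer it without remembering anything about the client.
def list_stateless(path):
    return f"listing of {path}"

list_stateless("/home/alice")
```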
I'm trying to build a mental model of the role of off-chain workers in Substrate. The bigger picture seems to be that they move logic inside the Substrate node that would otherwise be done by oracles triggered by predefined transactions. There are two use cases I was thinking of specifically:
1: Validating file formats: an incoming transaction proposes a file accessible via URL or IPFS hash, and its format needs to be validated. An off-chain worker fetches the file, asserts the format (size, encoding, content, whatever) and, if correct, submits another transaction saying it's valid.
2: Key generation: let's assume there is a separate service distributed with the Substrate node, which manages keys for each instance. Node A runs a key-sharing algorithm (like Shamir's secret sharing) via this external service between participants A, B and C, then makes a transaction creating a group (A, B, C) on-chain. This transaction triggers all nodes in this group to run off-chain workers, which call into their local key store to verify that they hold the key. They can all mark it on-chain afterwards.
As far as I understand, off-chain workers are triggered in every node after block execution. In the former use case, this would result in lots of transactions validating just one file, and nothing guarantees the correctness of these. What is a good way of reaching consensus on the validity of the file? Is it also possible without economic incentives like staking? Those would be problematic with tokens having no value in the network, e.g. in enterprise settings. Is this even the right use case for off-chain workers? The second example should not suffer from this issue, since we just need all parties to verify that they hold the key.
Where does the thought process above go wrong, and why?
As far as I understand, off-chain workers are triggered in every node after block execution.
Yes and no. There is a CLI flag for it, and at the time of this writing it says:
--offchain-worker <ENABLED>
    Should execute offchain workers on every block.
    By default it's only enabled for nodes that are authoring new blocks.
    [default: WhenValidating] [possible values: Always, Never, WhenValidating]
In the former use case, this would result in lots of transactions validating just one file, and nothing guarantees the correctness of these.
I think it is the responsibility of the receiving function (a.k.a. the Call) to handle and incentivise this. For example, there could be a reward opportunity for validating an address. But if it has already been submitted in another transaction, you will get slashed (or, even if not, you still pay some transaction fee for nothing). In such cases, you can assume that not all participants will submit a transaction. They will only do it when there is a chance of gain, which should be expressed by your potential reward/slash scheme.
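As a back-of-the-envelope sketch (plain Python, not actual Substrate/FRAME code), the first-submission-wins scheme described above might look like this; all names and numbers are made up:

```python
REWARD, FEE, SLASH = 10, 1, 5

balances = {"alice": 100, "bob": 100}
settled = set()  # claims already validated on-chain

def submit_validation(who, claim_id, slash_duplicates=False):
    balances[who] -= FEE  # every transaction pays a fee
    if claim_id in settled:
        if slash_duplicates:
            balances[who] -= SLASH
        return "duplicate"  # paid the fee for nothing
    settled.add(claim_id)
    balances[who] += REWARD
    return "rewarded"

submit_validation("alice", "file-123")  # first submission: rewarded
submit_validation("bob", "file-123")    # duplicate: fee wasted
```

With a scheme like this, rational participants stop submitting once a claim is likely settled, which caps the "lots of transactions validating just one file" problem.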
Is this even the right use case for off-chain workers?
I am no expert here, but I think at least the validation example is a good example. It is just a matter of finding a good incentive + anti-spam slashing.
I am less familiar with the second example, so no comments on that.
For instance, if we (as a client app) retrieve a Patient with an array of contacts and now send the FHIR server a PATCH request to modify some info for one of the contacts, the only way we saw to indicate it is by using the position, e.g. Patient.contact[1].gender. That's only one example.
I think that approach (using the array position) is not safe, because services are not stateful and, besides, the server does not always return the same order for the same array (it makes no sense to assume we are receiving the contact list ordered), so the server could change the wrong contact (an inconvenience in this case, but a far more dangerous/unsafe situation when we use clinical resources).
Am I wrong? Is there another, safer approach to using PATCH that doesn't penalize performance?
For a JSON Patch, you could use a "test" operation if you have a value within the array that can be relied upon. The patch operation as a whole is required to fail if the test fails: http://jsonpatch.com/#test
For XML Patch, I believe you may be able to do something similar with selectors? https://www.rfc-editor.org/rfc/rfc5261#section-4.1 - again, it depends on what you're trying to update.
I also agree with others that you should only attempt to patch if the version matches. There are very few updates that should be made to clinical data in a version-blind manner.
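A sketch of that "test" guard using the python-json-patch library (the Patient structure here is simplified for illustration, not a real FHIR resource):

```python
import jsonpatch  # the python-json-patch package

# Trimmed-down Patient-like structure.
patient = {
    "contact": [
        {"name": "Andrew", "gender": "male"},
        {"name": "Bob", "gender": "male"},
        {"name": "Dukhan", "gender": "male"},
    ]
}

# Guard the positional update with a "test" op: the patch as a whole
# fails unless contact[2] is still the contact we think it is.
patch = [
    {"op": "test", "path": "/contact/2/name", "value": "Dukhan"},
    {"op": "replace", "path": "/contact/2/gender", "value": "female"},
]

try:
    updated = jsonpatch.apply_patch(patient, patch)
except jsonpatch.JsonPatchTestFailed:
    # Another client inserted a contact at index 2 in the meantime;
    # nothing was modified, so we can re-fetch and retry.
    updated = None
```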
Servers are supposed to retain order. Not all servers will, but servers that don't probably won't be able to support PATCH. If you wish, feel free to submit a change request and we can highlight that in the specification.
Thanks so much for your clarification. Sure, we will request a change, at least in the documentation, highlighting this requirement (the server has to maintain the order).
But what exactly do you mean by "order"? For instance, suppose appclient1 retrieved the Patient with 3 contacts (Andrew, Bob, Dukhan) and sends a patch for [2] (Dukhan), but in the meantime some other system (appclient2) has added a new contact (Carl). Now the list (on the server side) is Andrew (0), Bob (1), Carl (2), Dukhan (3). So when the PATCH request for Dukhan arrives at the server from the original appclient1, position [2] is no longer Dukhan but Carl. So we are still in the same unsafe situation.
Is there a RESTful way to determine whether a POST (or any other non-idempotent verb) will succeed? This would seem to be useful in cases where you essentially need to make multiple non-idempotent requests against different services, any of which might fail. It would be nice if these requests could be done in a "transaction" (i.e. with support for rollback), but since this is impossible, an alternative is to check whether each of the requests will succeed before actually performing them.
For example, suppose I'm building an ecommerce system that allows people to buy t-shirts with custom text printed on them, and this system requires integrating with two different services: a t-shirt printing service and a payment service. Each of these has a RESTful API, and either might fail (e.g. the printing company might refuse to print certain words on a t-shirt, and the bank might complain if the credit card has expired). Is there any way to speculatively perform these two requests, so my system only proceeds with them if both requests appear valid?
If not, can this problem be solved in a different way? Creating a resource via a POST with status = pending, and changing this to status = complete if all requests succeed? (DELETE is trickier...)
HTTP defines the 202 status code for exactly your scenario:
202 Accepted
The request has been accepted for processing, but the processing has not been completed. The request might or might not eventually be acted upon, as it might be disallowed when processing actually takes place. There is no facility for re-sending a status code from an asynchronous operation such as this.
The 202 response is intentionally non-committal. Its purpose is to allow a server to accept a request for some other process (perhaps a batch-oriented process that is only run once per day) without requiring that the user agent's connection to the server persist until the process is completed. The entity returned with this response SHOULD include an indication of the request's current status and either a pointer to a status monitor or some estimate of when the user can expect the request to be fulfilled.
Source: HTTP 1.1 Status Code Definition
This is similar to 201 Created, except that you are indicating that the request has not been completed and the entity has not yet been created. Your response would contain a URL to the resource representing the "order request", so clients can check the status of the order through this URL.
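For instance, a minimal sketch of that pattern (Flask is used purely for illustration; all route and field names are invented):

```python
from uuid import uuid4
from flask import Flask, jsonify

app = Flask(__name__)
orders = {}  # order_id -> "pending" | "complete" | "rejected"

@app.post("/orders")
def create_order():
    order_id = str(uuid4())
    orders[order_id] = "pending"
    # ...enqueue the real work (printing + payment) for later...
    # 202: accepted for processing but not completed; Location points
    # at a status resource the client can poll.
    return jsonify(status="pending"), 202, {"Location": f"/orders/{order_id}"}

@app.get("/orders/<order_id>")
def order_status(order_id):
    if order_id not in orders:
        return jsonify(error="unknown order"), 404
    return jsonify(status=orders[order_id])
```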
To answer your question more directly: There is no way to "test" whether a request will succeed before you make it, because you're asking for clairvoyance.
It's not possible to foresee the range of technical problems that could occur when you attempt to make a request in the future. The network may be unavailable, the server may not be able to access its database or external systems it depends on for functioning, there may be a power-cut and the server is offline, a stray neutrino could wander into your memory and bump a 0 to a 1 causing a catastrophic kernel fault.
In order to consume a remote service you need to account for possible failures of any request in isolation of any other processes.
For your specific problem, if the services have no transactional safety, you can't bake any in there and you have to deal with this in a more real-world way. A few options off the top of my head:
Get the T-Shirt company to give you a "test" mechanism, so you can see whether they'll process any given order without actually placing it. It could be that placing an order with them is a two-phase operation, where you construct the order in the first phase (at which time they validate its creation) and then you subsequently ask the order to be processed (after you have taken payment successfully).
Take the credit-card payment first and move your order into a "paid" state. Then attempt to fulfil the order with the T-Shirt service as an asynchronous process. If fulfilment fails and you can identify that the customer tried to get something printed the company is not prepared to produce, you will have to contact them to change their order or produce a refund.
Most organizations will adopt the second approach, due to its technical simplicity and reduced risk to the business. It also has the benefit of being able to cope with the T-Shirt service not being available; the asynchronous process simply waits until the service is available and completes the order at that time.
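A sketch of that second approach as a background job; every service call here is a stub, the shape of the control flow is the point:

```python
import time

class PrinterUnavailable(Exception): pass
class PrinterRejected(Exception): pass

def charge_card(order): ...        # stub: raises on payment failure

def submit_to_printer(order): ...  # stub: raises one of the exceptions above

def process(order):
    charge_card(order)
    order["status"] = "paid"
    while True:
        try:
            submit_to_printer(order)
            order["status"] = "complete"
            return
        except PrinterUnavailable:
            time.sleep(60)  # service down: wait and retry later
        except PrinterRejected:
            # e.g. the company refuses to print the requested text:
            # flag the order for manual follow-up or a refund.
            order["status"] = "refund_due"
            return
```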
Exactly. That can be done as you suggest in your last sentence. The idea would be to decouple the resource creation (which will always work, barring network failures), representing an "ongoing request", from the "order acceptance", which can be decided later. As the POST returns a "Location" header, you can then retrieve the "status" of your request at any moment.
At some point it may become either accepted or rejected. This may be instantaneous or it may take some time, so you have to design your service with these constraints in mind (i.e. allowing the client to check whether the order is accepted, or running some kind of hourly/daily service that collects accepted requests).
A team member has run into an issue with an old in-house system where a user double-clicking on a link on a web page can cause two requests to be sent from the browser resulting in two database inserts of the same record in a race condition; the last one to run fails with a primary key violation. Several solutions and hacks have been proposed and discussed:
Use JavaScript on the web page to mitigate the second click by disabling the link on the first click. This is a quick and easy way to reduce the occurrences of the problem, but not to eliminate it entirely.
Wrap the request execution on the server side in a transaction. This has been deemed too expensive an operation due to server load and lock levels on the table in question.
Catch the primary key exception thrown by the failed insert, identify it as such, and eat it. This has the disadvantages of (a) vendor lock-in, having to know the nuances of the database-specific exceptions, and (b) potentially not logging/dealing with legitimate database failures.
An extension of #3: attempt to update the record if the insert fails, and check the result of the update to ensure it affected exactly one record.
Are there other options that haven't been considered? Are there pros and cons of the options presented that were overlooked? Which is the lesser of all evils?
Put a unique identifier on the page in a hidden field. Only accept one response with a given unique identifier.
It sounds like you might be misusing a GET request to modify server state (although this is not necessarily the case). While it may not be appropriate for your situation, it should be stated that you should consider converting the link into a form POST.
You need to implement the Synchronizer Token pattern.
How it works: a value (the token) is generated on the server for each request. This same token must then be included in your form submission. On receipt of the request, the server-side token and the client's token are compared; if they match, you may continue to add your record. The server-side token is then regenerated, so subsequent requests containing the old token will fail.
There's a more thorough explanation about half-way down this page.
I'm not sure what technology you're using, but Struts provides framework level support for this pattern. See example here
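If you're not on Struts, the pattern is small enough to hand-roll; a framework-agnostic sketch, with session handling simplified to a dict:

```python
import secrets

session = {}  # per-user server-side session

def render_form():
    # A fresh token is generated when the form is served and embedded
    # in a hidden field.
    token = secrets.token_hex(16)
    session["form_token"] = token
    return f'<input type="hidden" name="token" value="{token}">'

def handle_submit(submitted_token):
    # pop() consumes the token, so the second click's request no longer
    # matches and is rejected before it reaches the database.
    expected = session.pop("form_token", None)
    if expected is None or submitted_token != expected:
        return "duplicate or stale submission, ignored"
    insert_record()
    return "record created"

def insert_record():
    pass  # the actual database insert goes here
```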
It seems you already replied to your own question there; #1 seems to be the only viable option.
Otherwise, you should really do all three: data integrity should be handled at the database level, but extra checks in the code (such as the explicit transaction) that avoid round trips to the database can be good for performance.
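If the database is where the duplicate gets caught, one way to avoid parsing vendor-specific primary-key exceptions (the objection to option #3) is to let the insert itself be a no-op on conflict. A sketch with SQLite; most databases have an equivalent (e.g. INSERT ... ON CONFLICT DO NOTHING in PostgreSQL), and the table and column names are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, body TEXT)")

def insert_once(record_id, body):
    cur = conn.execute(
        "INSERT OR IGNORE INTO records (id, body) VALUES (?, ?)",
        (record_id, body),
    )
    # rowcount is 1 for the first request, 0 for the duplicate.
    return cur.rowcount == 1

insert_once("abc", "payload")  # True: record created
insert_once("abc", "payload")  # False: duplicate silently ignored
```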
Re: "You need to implement the Synchronizer Token pattern" -- this is for JavaScript/HTML, not Java.