Is the concept of forking (source code) a relevant term in Hyperledger Fabric?

Hard or soft forking of blockchain code (such as Bitcoin's and Ethereum's) has caused significant changes (and headaches) for stakeholders of those blockchains.
My question is: are hard or soft forking, and the process of voting for a fork, relevant concepts in Hyperledger projects? Does Hyperledger Fabric allow nodes to create a fork and run it?

Obviously, if you change the source code you can do whatever you want, hard or soft fork. But since Fabric is a permissioned blockchain, you have to coordinate with the members of the channel before you do such a thing; otherwise you'll end up with a state fork.

Related

Microservices architecture versioning on periodic releases

I'm trying to wrap my head around the best practices for managing versioning in a microservices-based architecture with periodic releases.
Currently our system is decomposed into multiple different repositories:
Frontend
Backend
Database
API gateway
Docker-compose-env
Each of these components must be developed, built, tested, containerized and deployed independently. But the release cycles are synchronized and periodic. The Docker-compose-env project contains the environment definition to start all compatible service versions for development and integration testing purposes.
Current versioning strategy is as follows:
Each commit to the master branch is tagged with a semantic version and pushed to the Docker registry (semantic tags are used to track dependencies during the development cycle)
Each merge commit to the persistent release branch is tagged with a release tag and pushed to the Docker registry (release tags are used to synchronize project versions for the quarterly release)
master is the trunk, and a periodic release build is initiated by a PR from master to release.
I'm skeptical that this is the best way to manage versions in a microservices-based architecture with periodic releases. Any feedback or tips are appreciated.
Each of these components must be developed, built, tested, containerized and deployed independently. But the release cycles are synchronized and periodic.
There is a contradiction here. Microservices mostly solve an organizational problem - the main point is that teams should be able to work independently as much as possible.
Synchronization between teams is what makes them slow. This can happen in different ways, e.g. waiting for another version to be deployed in a shared test environment, using the same shared database schema, or making releases at the same time.
I'm skeptical that this is the best way to manage versions in a microservices-based architecture with periodic releases.
Try to avoid "synchronized releases"; instead make sure not to break any contracts between the services (e.g. no breaking API changes). Try to release more often: you want to work in small batches to reduce the risk of deployments and changes. Try not to pile up a bunch of changes; deploy continuously - Continuous Delivery.
Release cycles are synchronized
I think the fact that you need to do a synchronized release of all services at the same time could be an indicator that the coupling between your services is higher than it should be, and the way you are managing it can probably be improved.
The question is: how can you organize development teams working on different micro-services so that when they introduce changes, they do not break each other's micro-services?
Versioning and managing changes
There are two aspects which are important for this to work:
Versioning and how you implement and work with versioning.
Team Communication. Communication between teams when introducing breaking changes.
What do I mean by this?
First about versioning. Your micro-services are communicating with each other.
Regardless of whether the communication is sync or async, using REST (or SOAP, gRPC, or other) or messaging (queues), they need to rely on some Contracts. Those Contracts will be API Contracts (in terms of Java/C# classes/interfaces). They need to be stable, as they can be used by other micro-services.
Suggestion: I would suggest versioning the micro-service independently from the versioning of its Contracts.
Example:
Micro-service order-micro-service could be at latest version v1.0.0 and Contracts order-micro-service-contracts at version v1.0.0 as well.
Micro-service customer-micro-service could be at latest version v3.0.0 but Contracts customer-micro-service-contracts could be at version v2.2.1.
Micro-service product-micro-service could be at latest version v3.0.0 and Contracts product-micro-service-contracts could be at version v4.0.0.
As you can see from the example above, the version of the micro-service and the version of its exposed Contracts can be the same, but they can also differ. The reason is simply that you can make changes to the micro-service (some internal business logic change) without changing the Contracts, and you can also change the Contracts without changing the micro-service logic. Usually changes happen to both at the same time: you update some API business logic and adjust the exposed Contract accordingly. But sometimes a MAJOR change in the micro-service logic is not necessarily a breaking or MAJOR change to the Contracts. As you can see, this gives you great flexibility. The benefit is not only flexibility but also the fact that micro-service-A will only depend on micro-service-B-contracts and not on micro-service-B itself. This is just a suggestion; you can also use one version for a micro-service and its exposed Contracts.
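As a hedged illustration of this separation, here is what independently versioned Contracts could look like as their own Go module (all module paths, version numbers, and type names here are made up for the example):

```go
// go.mod of the consuming service: it depends only on the contracts
// module of the other service, never on that service itself.
// (Go modules require a /v2 suffix in the path for major version >= 2.)
//
//   module example.com/order-micro-service
//
//   require example.com/customer-micro-service-contracts/v2 v2.2.1

// The contracts module itself contains only stable wire-level types:
package contracts

// CustomerModel is the shape other services program against. The
// internals of customer-micro-service can change freely (and bump the
// service's own version) without touching this type.
type CustomerModel struct {
	ID    string `json:"id"`
	Name  string `json:"name"`
	Email string `json:"email"`
}
```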
Now about team communication. By this I mean an organization where multiple teams work on different areas of the system, and each team is responsible for one or more micro-services from a particular Domain.
If you are using Semantic Versioning (MAJOR.MINOR.PATCH, for example v1.3.5), then you can do it in the following way.
There are a couple of things which are important to consider:
Contracts PATCH/MINOR change
A change which is a PATCH or MINOR version upgrade should not be a breaking change and should always be backwards compatible for the consumers who use those Contracts. This means you should ensure that upgrading from version 1.3.0 to 1.4.0 is not a problem for a consumer, regardless of whether it upgraded to 1.4.0 or stayed on 1.3.0 a little longer. Ideally all consumers should update to the latest version, but even if they don't for some period, they will not be broken by the change. Examples of changes warranting this kind of upgrade: adding a new Contract model, adding new non-mandatory fields to an existing model, or increasing the accepted string length of a field from 20 to 50.
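For illustration, a minimal sketch of such a backwards-compatible MINOR change on a Go contract model (OrderModel and its fields are hypothetical):

```go
package contracts

// v1.3.0 of the contract looked like this:
//
//   type OrderModel struct {
//       ID     string `json:"id"`
//       Amount int64  `json:"amount"`
//   }

// v1.4.0 adds a new, non-mandatory field. JSON decoders that only
// know v1.3.0 ignore the unknown "discount" key, and a missing value
// simply leaves the pointer nil, so consumers on either version keep
// working: a MINOR, non-breaking upgrade.
type OrderModel struct {
	ID       string `json:"id"`
	Amount   int64  `json:"amount"`
	Discount *int64 `json:"discount,omitempty"` // new optional field
}
```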
Contracts MAJOR or breaking change
This is usually a big change which can also be a breaking change. If it is a breaking change, then you need some team process in place. The teams who use those Contracts need to be notified upfront that the change will happen, and even when releasing the new version, a bridging period of a couple of weeks (or sprints) should be ensured during which both versions of the Contracts (old and new) work. This gives the affected teams/micro-services enough time to upgrade and adjust their services. After that, the backwards-compatibility Contracts/code can be deprecated. Sometimes the solution for a breaking Contract change is introducing a completely new version of that model (class) rather than making a hard change to the same model. For example, you could have a CustomerModel class, introduce CustomerModelV2, and remove the old CustomerModel class after some period. This is a common situation when you have a Contract model for an Event (a message on a queue) like CustomerCreated: you can have CustomerCreated and CustomerCreatedV2, publish both messages for a particular period until the consumers adapt, and then deprecate the old event (stop publishing it and remove the Contract model). The details depend on your particular business logic or case.
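A minimal sketch of that bridging period in Go, assuming a hypothetical Publisher interface over whatever queue or broker is in use (none of these names come from a real library):

```go
package events

// Old contract, kept alive during the bridging period.
type CustomerCreated struct {
	ID   string `json:"id"`
	Name string `json:"name"` // full name in one field
}

// New contract with the breaking shape change.
type CustomerCreatedV2 struct {
	ID        string `json:"id"`
	FirstName string `json:"firstName"`
	LastName  string `json:"lastName"`
}

// Publisher abstracts the underlying queue/broker.
type Publisher interface {
	Publish(topic string, event interface{}) error
}

func OnCustomerCreated(p Publisher, id, first, last string) error {
	// Publish the old shape for consumers that have not migrated yet.
	if err := p.Publish("customer.created",
		CustomerCreated{ID: id, Name: first + " " + last}); err != nil {
		return err
	}
	// Publish the new shape; once every consumer has upgraded, stop
	// emitting the old event and delete CustomerCreated.
	return p.Publish("customer.created.v2",
		CustomerCreatedV2{ID: id, FirstName: first, LastName: last})
}
```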
Micro-service changes
Regardless of whether the change is a bug fix, a small change, or a big change in the service, if your service versioning is separate from the Contracts, it should not affect the other micro-services, at least not from the contract-management perspective. Versioning updates that touch only the micro-service give you the possibility to deploy it independently.
Independent and separate deployments of micro-services
If you apply the advice above, you will come closer to the situation where you can deploy micro-services independently, without synchronized periods where all services have to be deployed at once.
One of the biggest advantages of using micro-services is being able to deploy them independently of other parts of the system, so if you have the chance to do that, you should go for it.
Each of these components must be developed, built, tested, containerized and deployed independently. But the release cycles are synchronized and periodic.
Since you already develop, build, and test independently, you could also release independently.
I know that all these suggested changes are not only technical but also organizational: team communication, team setup, and so on. But when working on a big system using micro-services, it is usually a compromise between those two worlds, and about finding the best process and solution for your organization and business.

Using blockchain for continuous integration

I'm working on a thesis whose goal is a blockchain-based approach to continuous integration. Reading this paper, I noticed strongly diverging understandings and implementations of continuous integration, particularly regarding the points 'Definition of build failure and success' and 'Fault handling'. My idea is to use blockchain and smart contracts as a voting mechanism between a pull request by a developer and the build results of the CI servers. The image below is what I thought of.
When a developer sends a pull request, a transaction is proposed to the blockchain network, which sends the build candidate to the CI servers. The CI servers post their outcomes as votes back to the blockchain network, and depending on the consensus mechanism (majority or absolute vote), the build is accepted and the branches are merged.
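To make the voting idea concrete, here is a minimal sketch of the tallying logic such a smart contract could implement, independent of any particular blockchain platform (all names are hypothetical, and the surrounding ledger machinery is omitted):

```go
package civote

// Vote is one CI server's verdict on a build candidate.
type Vote struct {
	CIServer string // identity of the voting CI server
	BuildID  string
	Passed   bool
}

// MajorityAccepts returns true when more than half of the expected
// CI servers reported a successful build for buildID.
func MajorityAccepts(votes []Vote, buildID string, expectedServers int) bool {
	passed := 0
	for _, v := range votes {
		if v.BuildID == buildID && v.Passed {
			passed++
		}
	}
	return passed*2 > expectedServers
}
```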
However, I'm new to the world of blockchain, and it's unclear to me how to apply consensus mechanisms like PoW or PoS and smart contracts. Maybe some of you could help me answer a few questions or give advice on how to design it.
How do I approach PoW or PoS here? What exactly is being mined here? Does a transaction require a cost? Do I even need those things or can I simply use blockchain as a data structure?
The obvious question with blockchain is why I would even use it in this case. From my own DevOps experience, I see across all projects a lack of standards and rules, and problems with role-based access control. The hoped-for benefits include improved transparency of the system mechanics and traceability throughout the project.
This being a bachelor thesis, I'm not trying to revolutionize computer science; my goal is to learn more about CI and whether blockchain can do any good. The topic and design of the system above are approved by my supervisor.
Apart from cryptocurrencies, the main cases of using blockchain platforms are:
creation of a trusted operating environment among participants who do not trust each other;
building a system of interaction without a point of failure - technical or organizational, when the elimination or malicious intent of any node does not lead to the inoperability of the system as a whole;
construction of a cheap distributed database.
In your case, using blockchain in the sense of a decentralized platform is imprecise and most likely ineffective. One of the main problems in building a blockchain network is determining who will own the nodes and why (what their incentives are).
In your case, it is difficult to think of rational reasons why each developer (or at least a group) would want to own the nodes instead of using some centralized service. Moreover, you have a guaranteed "centralization point" - a CI server. It might be worth considering an infrastructure with multiple independent CI servers. But even here I still do not see the benefits of using a decentralized solution.
Technically, the use of a blockchain platform for voting in your case does not present any apparent difficulty for implementing almost any voting algorithm - you can use a private Ethereum, Quorum, or Hyperledger Fabric.
Regarding your question (1), PoW and PoS are technical mechanisms for determining the blocks of the main chain on public blockchain platforms. Your smart contracts will work on top of this protocol and, most likely, implement consensus algorithms from the BFT group (PBFT, IBFT, and so on).

Hyperledger Fabric - Can I call an external system during the validation or endorsement phase?

We have a use case where the transaction validation logic is quite complex and requires data from different sources in order to validate a transaction.
Query: Can we call an external REST service to validate certain data from Hyperledger Fabric, using its pluggable validation feature?
While making an external API call from a Hyperledger Fabric smart contract is technically possible, it is a risky idea for several reasons:
1) chaincode must be deterministic, and the problem with 'enriching' a transaction using an external API is that the API must return the same result anywhere in the business network, which may very well be running globally, so you need to trust that the answers will all be the same within a time window that is quite a bit wider than a few ms
2) running just one endorser in development and production gets you around that problem, but weakens consensus a bit, and makes it essentially impossible to prove determinism for any given transaction
3) designing for such a weakened system is not a good idea, since inevitably someone will realize that the endorsement policy should be stronger, and you go right back to the issues in point 1
One way around this issue is to use a distributed external API with versioned data (you might need to write an oracle to provide this facility on top of an API that does not version its data), such that all endorsers store the external data's current version in the asset in world state as well. This ensures that the data read is identical and accounts for propagation delays in the oracle network. The presence of the API data version in the final asset data in world state (more accurately, in the read/write set for the transaction) ensures that different versions of the data in different regions of the oracle (e.g. due to propagation delays) will fail any multi-endorsement policy. Of course, a client designed for such an environment is free to resubmit a transaction for endorsement to get consensus.
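A hedged sketch of this pattern as Go chaincode: the version of the oracle data that an endorser read is written into the asset, so endorsers that saw different versions produce conflicting write sets and the endorsement policy fails. The asset shape and function are made up; PutState is the standard Fabric shim call.

```go
package oracleasset

import (
	"encoding/json"

	"github.com/hyperledger/fabric-chaincode-go/shim"
)

// Asset is a hypothetical world-state record that carries the version
// of the external (oracle) data used to produce it.
type Asset struct {
	ID            string `json:"id"`
	Value         string `json:"value"`
	OracleVersion string `json:"oracleVersion"` // version of the external data read
}

func writeAsset(stub shim.ChaincodeStubInterface, id, value, oracleVersion string) error {
	bytes, err := json.Marshal(Asset{ID: id, Value: value, OracleVersion: oracleVersion})
	if err != nil {
		return err
	}
	// The oracle version becomes part of the transaction's write set:
	// endorsers that read a different version produce a different
	// write for this key, so a multi-endorsement policy will fail.
	return stub.PutState(id, bytes)
}
```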

What is happening when you deploy a BNA file to Hyperledger Composer?

After I've put together my business network definition, what actually happens on the peers after I deploy that package? I'm especially interested in how a Hyperledger peer can interpret JavaScript, since that doesn't appear to be a supported language for chaincode.
The Composer chain code is written in Go. It uses the Duktape JavaScript interpreter to execute the user (and system) JS code within a Go process.
The Composer chain code maps the public JS API to the underlying Fabric Go API calls.
From a Fabric perspective this is just a "normal" piece of Go chain code, albeit quite a complex one!
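As a rough illustration of the general technique (not Composer's actual source), this is how a Go process can embed the Duktape interpreter via the community go-duktape binding; treat the exact API as an assumption:

```go
package main

import (
	"fmt"

	duktape "gopkg.in/olebedev/go-duktape.v3"
)

func main() {
	ctx := duktape.New()
	defer ctx.DestroyHeap()

	// Evaluate user-supplied JS inside the Go process, analogous to
	// how the Composer chain code runs transaction functions.
	if err := ctx.PevalString(`1 + 2`); err != nil {
		panic(err)
	}
	fmt.Println(ctx.GetNumber(-1)) // prints 3
}
```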
When you "deploy" a business network using the Composer CLI, you are actually doing 2 things:
deploying the Composer chain code (Go) and starting it
deploying the bytes of the business network archive and storing them in world state, so that they are available to the interpreter when you submit transactions (a rough sketch of this step follows below)
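Conceptually, step 2 amounts to something like the following chaincode sketch (the key name and function are assumptions, not the real Composer source; PutState is the standard Fabric shim call):

```go
package bnastore

import "github.com/hyperledger/fabric-chaincode-go/shim"

// Hypothetical well-known key under which the business network
// archive bytes are kept in world state.
const businessNetworkKey = "businessNetworkArchive"

// storeBusinessNetwork persists the .bna bytes so the embedded JS
// interpreter can load the model and logic on later transactions.
func storeBusinessNetwork(stub shim.ChaincodeStubInterface, bnaBytes []byte) error {
	return stub.PutState(businessNetworkKey, bnaBytes)
}
```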
In the future we would like to replace the use of Duktape with native Node.js execution. Thanks to Fabric's modular architecture (and its use of Docker containers and gRPC), this should be possible.

Marathon vs Aurora and their purposes

Both Marathon and Aurora are built on Mesos and are supposedly engineered for running long-running services. My questions are:
What are their differences? I have struggled in finding any good explanations regarding their key differences
Do these frameworks run anything that runs on Linux? For Marathon they state that it can run anything that "is executable in a shell" but this is sort of vague :)
Thanks!
Disclaimer: I am the VP of Apache Aurora, and have been the tech lead of the Aurora team at Twitter for ~5 years. My likely-biased opinions are my own and do not necessarily represent those of Twitter or the ASF.
Do these frameworks run anything that runs on Linux? For Marathon they state that it can run anything that "is executable in a shell" but this is sort of vague :)
Essentially, yes. Ultimately these systems are sophisticated machinery to execute shell code somewhere in a cluster :-)
What are their differences? I have struggled in finding any good explanations regarding their key differences
Aurora and Marathon do indeed offer similar feature sets, both being classified as "service schedulers". In other words, you hand us instructions for how to run your application servers, and we do our best to keep them up.
I'll offer some differences in broad strokes. When it comes to shortcomings mentioned in each, I think it's safe to say that the communities are aware and intend to fix them.
Ease of use
Aurora is not easy to install. It will likely feel like you are trailblazing while setting it up. It exposes a thrift API, which means you'll need a thrift client to interact with it programmatically (a REST-like API is coming, but is vaporware at the moment), or use our command line client. Aurora has a DSL for configuration which can be daunting, but allows you to easily share templates and common patterns as you use the system more.
Marathon, on the other hand, helps you to run 'Hello World' as quickly as possible. It has great docs to do this in many environments and there's little overhead to get going. It has a REST API, making it easier to adapt to custom tools. It uses JSON for configuration, which is easy to start with but more prone to cargo culting.
Targeted use cases
Aurora has always been designed to handle a large engineering organization. The clusters at Twitter have tens of thousands of machines and hundreds of engineers using them. It is critical to Twitter's business. As a result, we take our requirements of scale, stability, and security very seriously. We make sure to only condone features that we believe are trustworthy at scale in production (for example, we have our Docker support labeled as beta because of known issues with Docker itself and the Mesos-Docker integration). We also have features like preemption that make our clusters suitable for mixing business-critical services with prototypes and experiments.
I can't make any claims for or against Marathon's scalability. On the feature front, Marathon has built out features quickly, but this can feel bleeding-edge in practice (Docker support is a good example). This is not always due to Marathon itself, but also to layers down the stack. Marathon does not provide preemption.
Ownership
To some, ownership and governance of a project is important. I feel that in practice it does not define the openness of a project, but for some people/companies the legal fine print can be a deal-breaker.
Marathon is owned by a company (Mesosphere)
To some, this is beneficial; to others it is not. It means that you can pay for support and features. It also means that there is something to be sold, and the project direction is ultimately decided by Mesosphere's interests.
Aurora is owned by the Apache Software Foundation
This means it is subject to the governance model of the ASF, driven by the community. Aurora does not have paying customers, and there is not currently a software shop that you can pay for development.
tl;dr If you are just getting your feet wet with running services on Mesos, I would suggest Marathon as your first port of call. It will be easier for you to get running and poke around the ecosystem. If you are forming the 'private cloud strategy' for a company, I suggest seriously considering Aurora, as it is proven and specifically designed for that.
So I've been evaluating both and this is my summary.
Aurora
[+] also handles recurring jobs
[+] finer grained, extensive file-based configuration
[+] has namespaces so multiple environments can co-exist
[-] read-only UI, no official API
[~] file-based configuration and CLI-based execution bring overhead (which can be justified by the more extensive feature set)
Marathon
[+] very easy to setup and use
[+] UI that provides control and extensive API (even with features missing from UI at the moment)
[+] event bus to listen in on api calls
[-] handles only long-running jobs
[-] does not have separate deployment-run-cleanup steps; if necessary, these need to be combined in a script or one-liner
Even though Aurora has better capabilities, I prefer Marathon due to Aurora's complexity/overhead and its lack of a UI (for control) and API.
I have more experience with Marathon.
Ideological:
Marathon is a relatively tested product that is used in production at Airbnb. Aurora is an early Apache project (so YMMV).
Both are open source and active. Feel free to contribute pull requests or file issues!
Technical:
Marathon doesn't schedule batch tasks or cron jobs
Marathon has a friendly UI and better health indicators (in 0.8.x)
In regards to your second question, you can run any command or docker container, and Mesos will do the resource isolation for you. If you have 50% CentOS nodes and 50% Ubuntu nodes and you run a task that executes apt-get, the task will have a 50% chance of failure. Mesos and Marathon have no awareness of the actual machines.
Disclaimer: I don't have hands-on experience with Aurora, only with Marathon.
ad Q1: In a nutshell, Apache Aurora is capable of doing what Marathon + Chronos can provide, that is, scheduling both long-running services and recurring (batch) jobs; see also the Aurora user guide.
ad Q2: Yes, anything. Currently based on cgroups and Docker but hey, you can roll your own.
