To separate the docs into a different repository or not? - python-sphinx

I wonder how people handle the repository for their modules' documentation (I mostly work with Python/Sphinx/Git).
Facing two choices:
1. Include the doc sources in the module repository.
This has the advantage of keeping everything for the project in one place, but the inconvenience of bringing non-vital data along with the code (for instance, images for the docs).
2. Make a separate repository for the documentation.
This has the advantage of giving the docs and the module code separate life cycles: when fixing a documentation typo, for instance, you would like to avoid cutting a new version of the code. The drawback is maintaining two repositories, possibly with parallel branching.
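For what it's worth, the in-repo layout (option 1) typically looks something like the structure `sphinx-quickstart` generates next to the package; all names here are illustrative defaults, not a prescription:

```
myproject/               # one repository for code and docs
├── myproject/           # the module's source code
│   └── __init__.py
├── docs/                # Sphinx sources live alongside the code
│   ├── conf.py
│   ├── index.rst
│   └── _static/         # images and other doc-only assets
├── setup.py
└── README
```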
Are there clear guidelines somewhere? Am I missing some important points?

Related

Multiple microservices in one repository

I have a question about microservices and repositories. We are a small team (5 people) and we are creating a new project with microservices. We expect between 10 and 15 microservice applications in the project.
We are thinking about one repository for all microservices, with a structure like this:
.
├── app1
├── app2
├── app3
├── script.sh
└── script.bat
What do you think about this design? Can you recommend something better? We think a repository per app would be overkill for such a small project in one team. You can imagine our applications as Spring Boot or Angular SPA applications. Thanks in advance.
In general you can have all your microservices in one repository, but I think as the code for each of them grows it can become difficult to manage.
Here are some things that you might want to consider before deciding to put all your micro-services in one repository:
Developer discipline:
Be careful with coupling of code. Since the code for all your microservices is in one repository, there is no real physical boundary between them, so developers can simply use code from other microservices, e.g. by adding a reference. Keeping all microservices in one repository therefore requires some discipline and rules so that developers do not cross boundaries and misuse them.
Temptation to create and misuse shared code:
This is not a bad thing if you do it in a proper, structured way, but it leaves a lot of room for doing it the wrong way. If people just start depending on the same shared jar or similar, that can lead to a lot of problems. Shared code should be isolated, packaged, and ideally versioned with support for backwards compatibility; that way, when the library is updated, each microservice still has working code against the previous version. This is doable in the same repository, but as with the first point, it requires planning and management.
Git considerations:
Managing a lot of pull requests and branches in one repository can be challenging and can lead to "I am blocked by someone else" situations. Also, as more people work on the project and commit to your source branch, you will have to rebase and/or merge the source branch into your development or feature branch much more often, even when you do not need the changes from other services. Email notifications configured for the repository can be very annoying, since you will receive emails about things that are not in your microservice's code; you will need to create filters/rules in your email client to avoid the emails you are not interested in.
Growth beyond your initial 10-15 microservices:
Can the number grow? If not, all is fine. But if it does, at some point you might consider splitting each microservice into a dedicated repository. Doing this at a later stage of the project can be challenging and will require some work; in the worst case you will find couplings that people introduced over time, which you will have to resolve at that point.
CI pipelines considerations:
If you use something like Jenkins to build, test and/or deploy your code, you could encounter some small configuration difficulties, such as the integration between Jenkins and GitHub. You would need to configure a pipeline that builds/tests only a specific part of the code (one microservice) when someone creates a merge/pull request against that microservice. I have never done such a thing myself, but I expect you can script and automate it; it is doable, but will require some work to achieve.
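The path-based filtering described above can be sketched in a few lines. Here is a minimal Go version (the directory layout and function name are hypothetical, not from any CI product) that maps the output of `git diff --name-only` to the top-level service directories that need rebuilding:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// affectedServices maps changed file paths (e.g. the output of
// `git diff --name-only origin/master...HEAD`) to the top-level
// service directories that contain them.
func affectedServices(changed []string) []string {
	seen := map[string]bool{}
	for _, path := range changed {
		parts := strings.SplitN(path, "/", 2)
		if len(parts) == 2 { // files at the repo root are ignored here
			seen[parts[0]] = true
		}
	}
	services := make([]string, 0, len(seen))
	for s := range seen {
		services = append(services, s)
	}
	sort.Strings(services)
	return services
}

func main() {
	changed := []string{"app1/src/Main.java", "app1/pom.xml", "app3/Dockerfile"}
	fmt.Println(affectedServices(changed)) // [app1 app3]
}
```

A real pipeline would also decide what to do with root-level changes (e.g. `script.sh`), which arguably should trigger builds of all services.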
Conclusion
All or most of these points can be resolved with some extra management and configuration, but it is worth knowing up front what additional effort you could encounter. There are surely other points to take into consideration as well, but my general advice would be to use a separate repository for each microservice if you can (private-repository pricing and similar constraints permitting). This is a decision to be made project by project.

Keeping Go microservices in different repositories on Github

I'm working on a micro-services project. For this, I'd like to have one Go package per service, all included in a parent package for the project. It looks like this:
.
└── github.com
    └── username
        └── project
            ├── service1
            └── service2
I think this structure complies with the Go conventions on package names and import paths. A consequence is that all my microservices end up in the same repository on GitHub, since the repository sits at depth 3 in the URL. I think this may become an issue as the codebase grows: it adds complexity to the CI/CD pipeline (a change to one service would trigger a build of all the others), and the code to clone becomes unnecessarily large.
Is there a way to avoid this conflict between the Go conventions and the way GitHub works? Or is this a problem that has to be solved in the CI/CD work?
What you're talking about is popularly called a "monorepo" these days. While I personally like having all my projects in their own independent repositories (including microservices and everything else), there are a number of proponents of keeping all of a company's code in a single repository. Interestingly, both Google and Facebook use monorepos, though it must be said they have built a lot of fancy tooling to make that work for them.
One important thing to note is that your repository is a separate thing from your architecture. There is not necessarily any correlation between them. You can have microservices all in a single repo and you can have a monolith divided up into several repos; the repository is only a tool to store and document your code base, nothing more.
While researching the topic, here are some of the advantages and disadvantages taken from a number of articles across the web:
Monorepo Advantages
Ease of sharing modules between projects (even in microservices, there are often cross-cutting concerns)
One single place to see and know what code exists - especially useful in large companies with lots of code
Simplifies automated and manual code review processes
Simplifies documentation rather than pulling from multiple, disconnected repos
Monorepo Disadvantages
A massive codebase can be challenging/slow to clone and work with locally
Without very clear, strict guidelines it can be easy to cause tight coupling between products
Requires (slightly) more complex CI/CD tooling to partial-release
Depending on repository platform, very large codebases can affect performance
And here's a good discussion on the pros and cons of monorepos, and here's one specifically related to switching TO a monorepo with microservices architecture. Here's one more with lots of links both pro- and against-monorepo.
Like so many other things in programming and especially in SOA, the right solution for you depends on a number of factors that only you can determine. The main takeaway is that big and small companies have been successful with both options and many in between, so choose carefully but don't worry too much about it.
Versioning of sub-packages that a Go project depends on can be tracked by git tagging. So with Go modules you are encouraged to move sub-packages into their own git repositories.
If the majority of your solution will be written in go, I'd suggest leveraging go modules.
This blog post explains how to manage Go-modules' go.mod with regard to package dependencies and their associated version git tags:
https://blog.golang.org/migrating-to-go-modules
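Following that approach, each service becomes its own module, which sidesteps the depth-3 problem entirely: the import path no longer has to mirror a single GitHub repository. A hypothetical go.mod for one extracted service might look like this (module paths and the version tag are made up for illustration):

```
module github.com/username/project-service1

go 1.13

// Shared code extracted into its own repository and versioned
// with a git tag, e.g. v1.2.0.
require github.com/username/project-shared v1.2.0
```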
There is a good way to work with separate repositories and Go microservices: Go plugins. In short:
Build a base image that implements shared functionality.
Have that base image look for a Go plugin somewhere in the container
In derived images, compile the functionality as a Go plugin and place it where the base image can find it.
For Go/gRPC, I have put a base image that does this here.

Organization of protobuf files in a microservice architecture

In my company, we have a system organized as microservices, with a dedicated git repository per service. We would like to introduce gRPC, and we were wondering how to share protobuf files and build libs for our various languages. Based on some examples we collected, we decided in the end to go with a single repository containing all our protobuf files; it seems to be the most common way of doing it, and the easiest to maintain and use.
I would like to know if you have some examples on your side.
Do you have counter-examples of companies doing the exact opposite, i.e. hosting protobuf files in a distributed way?
We have a distinct repo for proto files (called schema) and multiple repos, one per microservice. Also, we never store generated code: server and client files are generated from scratch by protoc during every build on CI.
This approach works and fits our needs well, but there are two potential pitfalls:
Inconsistency between the schema and microservice repositories. Commits to two different git repos are not atomic, so at the time of a schema update there is always a short period during which the schema is updated while a microservice's repo is not yet.
If you use Go, there is a potential problem with the move to Go modules introduced in Go 1.11. We have not done comprehensive research on it yet.
Each of our microservices has its own API (one or several protobuf files). For each API we have a separate repository. We also have a CI job which builds the proto classes into a jar (and for other languages too) and publishes it to our central repository. Then you just add a dependency on the API you need.
For example, for microservice A we also have a repository a-api (containing only proto files), which a CI job builds into a jar (and into other languages) published as com.api.a-service.<version>.
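As a sketch, such an api-only repository might contain little more than a versioned proto package; every name below is hypothetical, not taken from the answer above:

```
// a-api/a_service.proto
syntax = "proto3";

// Putting a version in the package name is a common convention
// that makes backwards-incompatible changes explicit.
package company.a.v1;

option java_package = "com.company.api.a.v1";

service AService {
  rpc Ping (PingRequest) returns (PingReply);
}

message PingRequest {}
message PingReply {}
```

The per-language artifacts (jar, Go package, etc.) are then produced by running protoc over this file in CI, never committed.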

Are rugged Repository instances threadsafe?

My question boils down to the title: are rugged Repository instances (intended to be) threadsafe?
While I'm at it, I may be able to settle a question I've been having longer: is access to a git repository using rugged (intended to be) threadsafe when using different Repository instances?
Context
I'm using Rugged to access a git repository that stores documents for multiple users, who can access the repo via a shared web frontend. So far I have created a new Repository instance for each access, as that performs well enough and seems to be safe (I can't find guarantees in the docs or determine obvious safety from the way libgit2 is used, but no tests have found problems, and I'm assuming libgit2 itself is safe).
However, I ran into an issue that limits the number of Repository instances you can open near-simultaneously, which causes problems for some scripts that reuse code creating a Repository instance for each git access. An easy solution would be to share Repository instances between all users. However, that will cause problems if Repository instances are not thread safe. Do I need to guard all of these shared instances with a Mutex, or can I do without because rugged/libgit2 already solves this problem for me?
Yes, libgit2 (and thus rugged as well) should be threadsafe, as long as you don't use the same repository instance (or any other object created from libgit2) across different threads.
But as indicated by the second part of your question, you actually want to use the same Repository instance across different threads. Here, the answer is: it depends. Most, but not all, of the functions provided by libgit2 should be threadsafe, but I can't give you a definitive list. See https://github.com/libgit2/libgit2/issues/2491 for a bit more information.

Organize Project Solution w/ Interfaces for Repository

VS 2010 / C#
Trying to organize a solution and looking for options for naming the project that will host the interfaces for the repository.
I have:
MyProject.Domain
MyProject.WebUI
MyProject.Repositories
MyProject.Interfaces??
So far "Interfaces" is the best name I've come up with, but I don't like it. Any ideas/suggestions?
It is not too uncommon to see repository interfaces placed in the same assembly as the domain objects, themselves. This is what Jeffrey Palermo discusses in his series on The Onion Architecture. Personally, I do the same.
As for the reasoning behind it, I believe it is completely logical to define what the repository does in relation to the domain objects. In my opinion, consideration of the repository's behavior carries as much weight as the domain itself. Assume you have one team or developer who works on the domain model and defines the repository interfaces after working with the domain expert. It is their role to transfer the knowledge about how the domain relates to the repositories, but not necessarily the repositories themselves.
In doing so, when this assembly is handed off to anyone else on the team, the UoK (unit of knowledge, my own term) is confined to the assembly. People writing the implementations of the repositories then code against the knowledge captured in the assembly. Since this UoK does not change based on how the repository is implemented, the data-access implementation logically goes into another assembly.
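The layering described above (interface next to the domain model, implementation in an outer layer) is language-agnostic. Since the question's solution is C#, here is the same idea sketched in Go instead; every type and name below is hypothetical:

```go
package main

import "fmt"

// --- domain layer (the MyProject.Domain analogue) ---

// Customer is a domain object.
type Customer struct {
	ID   int
	Name string
}

// CustomerRepository is declared alongside the domain objects it serves,
// so the "unit of knowledge" ships with the domain, not the data access.
type CustomerRepository interface {
	FindByID(id int) (Customer, bool)
}

// --- outer layer (the MyProject.Repositories analogue) ---

// inMemoryCustomerRepository is one possible implementation; a SQL-backed
// one could replace it without touching the domain layer.
type inMemoryCustomerRepository struct {
	data map[int]Customer
}

func (r inMemoryCustomerRepository) FindByID(id int) (Customer, bool) {
	c, ok := r.data[id]
	return c, ok
}

func main() {
	var repo CustomerRepository = inMemoryCustomerRepository{
		data: map[int]Customer{1: {ID: 1, Name: "Ada"}},
	}
	c, ok := repo.FindByID(1)
	fmt.Println(c.Name, ok) // Ada true
}
```

The point of the sketch is only the placement: consumers depend on the interface in the domain layer, never on a concrete repository assembly.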
