Is the latest version of docker from Amazon on ec2 broken? - amazon-ec2

As of last night, all our new Docker deployments started failing because the latest version of Docker (docker-1.3.2-1.0.amzn1.x86_64) in the Amazon repo fails to start up.
Steps to reproduce are:
# Launch an instance with the default Amazon AMI
yum install docker-1.3.2-1.0.amzn1.x86_64
service docker restart
# Get the following error in /var/log/docker
2014/11/26 05:14:16 docker daemon: 1.3.2 c78088f/1.3.2; execdriver: native; graphdriver:
[8f6d7cfb] +job serveapi(unix:///var/run/docker.sock)
[info] Listening for HTTP on unix (/var/run/docker.sock)
docker: relocation error: docker: symbol dm_task_get_info_with_deferred_remove,
version Base not defined in file libdevmapper.so.1.02 with link time reference
If I downgrade back to docker-1.3.1-1.0.amzn1.x86_64 everything seems to be fine.
Is the AWS package actually broken, or is it just our setup?
Is there a work around other than downgrading?

Yes, it is broken for me too.
Downgrading has been the only solution so far.
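For reference, the downgrade itself is just the following (assuming the 1.3.1 package is still available in the Amazon repo):
yum downgrade docker-1.3.1-1.0.amzn1.x86_64
service docker restart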

I got the same error on a CentOS VM provisioned at my workplace; a yum update resolved it.
I suspect a broken build went out and has since been fixed.
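Once the rebuilt package lands in the repo, something along these lines should pick it up (a sketch; the package name is the one from the question above):
yum clean metadata
yum update docker
service docker restart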

Related

DataHub installation on Minikube failing: "no matches for kind "PodDisruptionBudget" in version "policy/v1beta1"" on elasticsearch setup

I'm following the deployment guide for DataHub with Kubernetes from the documentation: https://datahubproject.io/docs/deploy/kubernetes
Setting up the local cluster with Minikube, I started by following the prerequisites section of the guide.
At first I tried to change some of the default values to try it locally (I had already installed it successfully on Google Kubernetes Engine, so I was trying different setups).
But on the first step of the installation I received the error:
Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest: resource mapping not found for name: "elasticsearch-master-pdb" namespace: "" from "": no matches for kind "PodDisruptionBudget" in version "policy/v1beta1"
ensure CRDs are installed first
The steps I followed after installing Minikube were the exact steps presented on the page:
helm repo add datahub https://helm.datahubproject.io/
helm install prerequisites datahub/datahub-prerequisites
The error happens on step 2.
At first I switched back to the default configuration to rule out a mistake in my new values, but the error remained.
I expected that after following the exact default steps the installation would succeed locally, just as it did on GKE.
I got help from the DataHub Slack community and figured out a way to fix this error.
It was simply a Kubernetes version mismatch; I was able to fix it by forcing Minikube to start with Kubernetes 1.19.0:
minikube start --kubernetes-version=v1.19.0
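As a quick sanity check before installing, you can confirm which PodDisruptionBudget API versions the cluster actually serves (assuming kubectl is pointed at the Minikube cluster); on Kubernetes 1.25+ only policy/v1 is listed, which is why a policy/v1beta1 manifest fails there:
kubectl api-versions | grep policy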

Error syncing pod on starting Beam - Dataflow pipeline from docker

We consistently get an error when starting our Beam Go SDK pipeline (driver program) from a Docker image, even though the same program works when started locally or from a VM instance. We are using the Dataflow runner for our pipeline and Kubernetes to deploy.
LOCAL SETUP:
We have the GOOGLE_APPLICATION_CREDENTIALS variable set with a service account for our GCP cluster. When running the job locally, it gets submitted to Dataflow and completes successfully.
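For context, the local launch looks roughly like this (project, region, and bucket names are placeholders, not our real values):
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
go run main.go --runner=dataflow --project=my-gcp-project --region=us-central1 --staging_location=gs://my-bucket/staging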
DOCKER SETUP:
The build image used is FROM golang:1.14-alpine. When we package the same program with a Dockerfile and try to run it, it fails with the error:
User program exited: fork/exec /bin/worker: no such file or directory
On checking Stackdriver logs for more details, we see this:
Error syncing pod 00014c7112b5049966a4242e323b7850 ("dataflow-go-job-1-1611314272307727-
01220317-27at-harness-jv3l_default(00014c7112b5049966a4242e323b7850)"),
skipping: failed to "StartContainer" for "sdk" with CrashLoopBackOff:
"back-off 2m40s restarting failed container=sdk pod=dataflow-go-job-1-
1611314272307727-01220317-27at-harness-jv3l_default(00014c7112b5049966a4242e323b7850)"
We found a reference to this error in the Dataflow common errors doc, but it is too generic to figure out what's failing. After multiple retries, we were able to rule out any permission / access related issues with the pods. Not sure what else could be the problem here.
After multiple attempts, we decided to start the job manually from a new Debian 10 based VM instance, and it worked. This brought to our attention that we were using an Alpine-based golang image in Docker, which may not have all the dependencies required to start the job.
On the golang Docker Hub page we found golang:1.14-buster, where buster is the codename for Debian 10. Using that image for the Docker build solved the issue. Self-answering here to help anyone else facing the same problem.
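A minimal sketch of the change, assuming the worker binary is built inside the image (paths and names are illustrative, not our actual build):
# before: FROM golang:1.14-alpine
FROM golang:1.14-buster
WORKDIR /app
COPY . .
RUN go build -o /bin/worker .
ENTRYPOINT ["/bin/worker"]
The usual failure mode with Alpine is a binary dynamically linked against glibc, which Alpine's musl libc cannot load; the loader then surfaces that as "no such file or directory".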

Kyma restart issue in local

I have installed Kyma version 1.13.0 on Windows. It works fine as long as I don't restart my machine or Minikube, but when I restart Minikube by following the steps provided in the link below, Kyma stops working.
https://kyma-project.io/docs/latest/root/kyma#installation-install-kyma-locally-stop-and-restart-kyma-without-reinstalling
I need to reinstall Kyma to make it work again.
Any help would be appreciated.
This sounds similar to what I get on my Windows machine.
This is the error that I get after restarting minikube.
stderr:
error execution phase addon/coredns: unable to patch the CoreDNS deployment: Timeout: request did not complete within requested timeout 30s
To see the stack trace of this error execute with --v=5 or higher
If you get the same error, it has been reported as a bug:
https://github.com/kyma-project/cli/issues/455
My workaround is to get Kyma working by issuing the provision command twice, so give it a try.
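For reference, with the 1.x Kyma CLI that looked roughly like this (command names as I recall them; verify against your installed CLI version):
kyma provision minikube
kyma provision minikube
kyma install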

Windows Docker API issue for using GitLab-runner

While setting up a Windows CI pipeline with GitLab, I ran into the numerous known issues with the Windows gitlab-runner Docker executor, which uses an old API version (1.18) that Docker no longer accepts.
The issue results in the following error messages when GitLab CI tries to connect to the runner:
Running with gitlab-runner 11.2.0 (35e8515d) on Windows VS2017 x64 0825d1d7
Using Docker executor with image buildtools2017 ...
ERROR: Preparation failed: Error response from daemon: client version 1.18 is too old. Minimum supported API version is 1.24, please upgrade your client to a newer version (executor_docker.go:1148:0s)
The buildtools2017 Docker image referred to is Microsoft's "official" one.
The image seems to be working and valid for the current (experimental) Docker version I'm using (18.06.1-ce-win74) as well as for the stable version.
The issue is described throughout the GitLab wiki. Andrew Leech (?) went so far as to fork and modify the runner so that it would connect properly, and kindly provided his scripts and comments in a blog post. This seems to give some results:
C:\gitlab-runner>gitlab-runner.exe -v
Version: 10.8.0~beta.551.g67a6ccc7
Git revision: 67a6ccc7
Git branch: windows-container-executor
GO version: go1.9.4
Built: 2018-07-30T08:57:44+00:00
OS/Arch: windows/amd64
The GitLab wiki states that they're waiting until a more stable solution can be released. At this point it has been over a year of broken Windows Docker runners.
Andrew's blog post, with a link to his gitlab-runner.exe, actually describes a different workaround using the PowerShell runner, which then starts a Docker instance. All the token info is exposed, I'm not sure how to set it up, and it also seems to rely on an external image with older build tools.
It seems the Docker runner now connects, but if I understand correctly, the gitlab-runner Docker executor does not agree on the 'build directory' that is used. The first GitLab CI script line in my repo is just an echo command, so the error is not about the CI script content, but I'm not sure what it IS about. If anyone with Docker fu knows what is going on, that would really help me.
Using Docker executor with image buildtools2017 ...
ERROR: Preparation failed: build directory needs to be absolute and non-root path
Cheers,
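For what it's worth, the build directory the error complains about comes from the runner's config.toml; a sketch with placeholder values (not verified against the forked runner build):
[[runners]]
  name = "windows-docker"
  executor = "docker"
  builds_dir = "C:\\builds"
  [runners.docker]
    image = "buildtools2017"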

Cassandra Detected unreadable sstables (data, not caches)

ERROR [main] 2017-08-04 13:24:21,949 CassandraDaemon.java:638 - Detected unreadable sstables /opt/cassandra/data/some_key_space/ep_lc_events-adc44160dbe611e6953689bcd3ed73aa/mc-547-big-Summary.db, and many others...
This happened after I upgraded Cassandra to version 3 and, after a while, downgraded it back to version 2.
When I run this command: sudo service cassandra status
I get this message:
could not access pidfile for Cassandra
In /var/log/cassandra/system.log I see the log lines I quoted at the beginning.
PS: note that all of this is happening on an Amazon EC2 instance.
Well, I just upgraded back to version 3, used cassandra-unloader to export all the data, then downgraded back to version 2 and used cassandra-loader to import it all. If you were lucky enough to have backups and snapshots, this would not be an obstacle for you.
PS: Afterwards, I had to run nodetool resetlocalschema to reset the local schema and resynchronize.
PPS: Here you can find how to do that:
https://github.com/brianmhess/cassandra-loader
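For the record, the export/import was along these lines (keyspace, table, and column names are placeholders, and the flag names are as I remember them from the project's README; double-check against the repo linked above):
cassandra-unloader -f data.csv -host 127.0.0.1 -schema "my_keyspace.my_table(col1,col2,col3)"
cassandra-loader -f data.csv -host 127.0.0.1 -schema "my_keyspace.my_table(col1,col2,col3)"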
I also got the same error, but in my case it was due to switching between Cassandra 4.0.0 and 3.11 and back again while using Docker.
Update the version to the right one for the sstables, or delete the data volume:
docker-compose logs cassandra
docker volume ls
docker ps
docker-compose down
docker volume rm testapp_cassandra
docker volume ls
docker-compose up
