etcd error when trying to start service: "Rejected send message"

I am using Ubuntu 14.04 and I'm configuring etcd for use with Calico, but the service does not work.
This is my etcd.conf file:
# vim:set ft=upstart ts=2 et:
description "etcd"
author "etcd maintainers"
start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]
respawn
setuid etcd
env ETCD_DATA_DIR=/var/lib/etcd
export ETCD_DATA_DIR
exec /usr/bin/etcd --name="uno" \
--advertise-client-urls="http://172.16.8.241:2379,http://172.16.8.241:4001" \
--listen-client-urls="http://0.0.0.0:2379,http://0.0.0.0:4001" \
--listen-peer-urls "http://0.0.0.0:2380" \
--initial-advertise-peer-urls "http://172.16.8.241:2380" \
--initial-cluster-token $(uuidgen) \
--initial-cluster "node1=http://172.16.8.241:2380" \
--initial-cluster-state "new"
When I try to start:
ikerlan@uno:~$ service etcd start
start: Rejected send message, 1 matched rules; type="method_call", sender=":1.128" (uid=1000 pid=7374 comm="start etcd ") interface="com.ubuntu.Upstart0_6.Job" member="Start" error name="(unset)" requested_reply="0" destination="com.ubuntu.Upstart" (uid=0 pid=1 comm="/sbin/init")
What could be the problem?

Try running it with sudo:
sudo service etcd start
Then, if you get an error like:
start: Job failed to start
re-run it after adding the etcd user:
sudo adduser etcd
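As a side note, for a service account a non-interactive system user is usually preferable to a regular one. A sketch, assuming Debian/Ubuntu's adduser:
# create etcd as a system user with its own group and the data dir as home
sudo adduser --system --group --home /var/lib/etcd etcd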
Update:
If the etcd instance can't start, check the following two things:
1. Your etcd start command must be correct. In your case the etcd command can't run; you will get an error message like:
etcd: couldn't find local name "uno" in the initial cluster configuration
So change the line in /etc/init/etcd.conf to:
--initial-cluster "uno=http://172.16.8.241:2380" \
where your original config was:
--initial-cluster "node1=http://172.16.8.241:2380" \
2. The etcd user must have permission to write to /var/lib/etcd.
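For example, a sketch assuming the data directory from the config above:
# create the data directory if needed and hand it to the etcd user
sudo mkdir -p /var/lib/etcd
sudo chown -R etcd:etcd /var/lib/etcd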

The etcd flags --name and --initial-cluster must match:
....
--name="keukenhof" \
--initial-cluster="keukenhof=http://localhost:2380"
....
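Once the names line up, a quick sanity check (a sketch, assuming the client port from the config above) is to restart the service and query etcd's version endpoint:
sudo service etcd restart
curl -s http://127.0.0.1:2379/version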

Related

Add timestamp to name of created VM's instance in bash script

I deploy some VM instances in my cloud infrastructure with a bash script:
#!/bin/bash
instance_name="vm"
# create instance
yc compute instance create \
--name $instance_name \
--hostname reddit-app \
--memory=2 \
...
I need to add a timestamp to the instance name in the format vm-DD-MM_YYYY-H-M-S.
For debugging I tried setting instance_name=$(date +%d-%m-%Y_%H-%M-%S), but got the error:
ERROR: rpc error: code = InvalidArgument desc = Request validation error: Name: invalid resource name
Any help would be appreciated.
The Yandex Cloud documentation says:
"The name may contain lowercase Latin letters, numbers, and hyphens. The first character must be a letter. The last character can't be a hyphen. The maximum length of the name is 63 characters".
I changed my script following the recommendations and it works now (my debug value both started with a digit and contained an underscore, which violates those rules):
#!/bin/bash
instance_name="vm-$(date +%d-%m-%Y-%H-%M-%S)"
# create instance
yc compute instance create \
--name $instance_name \
--hostname reddit-app \
--memory=2 \
...
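If you want the script to fail fast on a bad name, a small guard based on the rules quoted above might look like this (a sketch; the regex encodes lowercase letters, digits and hyphens, a leading letter, no trailing hyphen, and a 63-character limit):
# validate the generated name before calling yc
if [[ ! "$instance_name" =~ ^[a-z]([a-z0-9-]{0,61}[a-z0-9])?$ ]]; then
  echo "invalid instance name: $instance_name" >&2
  exit 1
fi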

GitLab job Job succeeded but not finished (create / delete Azure AKS)

I am using a runner to create an AKS cluster on the fly and also delete a previous one.
Unfortunately these jobs take a while, and I have more often than not experienced that the job stops suddenly (in the range of 5+ minutes after an az aks delete or az aks create call).
This happens in GitLab, and after several retries it usually works once.
After some googling I found that before_script and after_script might have an impact, but even after removing them there was no difference.
Are there any runner rules or anything in particular that might need to be changed?
It would be more understandable if it stopped with a timeout error, but it reports the job as succeeded even though it did not even finish running through all the lines. Below is the staging segment causing the issue:
create-kubernetes-az:
stage: create-kubernetes-az
image: microsoft/azure-cli:latest
# when: manual
script:
# REQUIRE CREATED SERVICE PRINCIPAL
- az login --service-principal -u ${AZ_PRINC_USER} -p ${AZ_PRINC_PASSWORD} --tenant ${AZ_PRINC_TENANT}
# Create Resource Group
- az group create --name ${AZ_RESOURCE_GROUP} --location ${AZ_RESOURCE_LOCATION}
# ERROR HAPPENS HERE # Delete Kubernetes Cluster // SOMETIMES STOPS AFTER THIS
- az aks delete --resource-group ${AZ_RESOURCE_GROUP} --name ${AZ_AKS_TEST_CLUSTER} --yes
#// OR HERE # Create Kubernetes Cluster // SOMETIMES STOPS AFTER THIS
- az aks create --name ${AZ_AKS_TEST_CLUSTER} --resource-group ${AZ_RESOURCE_GROUP} --node-count ${AZ_AKS_TEST_NODECOUNT} --service-principal ${AZ_PRINC_USER} --client-secret ${AZ_PRINC_PASSWORD} --generate-ssh-keys
# Get kubectl
- az aks install-cli
# Get Login Credentials
- az aks get-credentials --name ${AZ_AKS_TEST_CLUSTER} --resource-group ${AZ_RESOURCE_GROUP}
# Install Helm and Tiller on Azure Cloud Shell
- curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > get_helm.sh
- chmod 700 get_helm.sh
- ./get_helm.sh
- helm init
- kubectl create serviceaccount --namespace kube-system tiller
- kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
- kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
# Create a namespace for your ingress resources
- kubectl create namespace ingress-basic
# Wait 1 minute
- sleep 60
# Use Helm to deploy an NGINX ingress controller
- helm install stable/nginx-ingress --namespace ingress-basic --set controller.replicaCount=2 --set controller.nodeSelector."beta\.kubernetes\.io/os"=linux --set defaultBackend.nodeSelector."beta\.kubernetes\.io/os"=linux
# Test by get public IP
- kubectl get service
- kubectl get service -l app=nginx-ingress --namespace ingress-basic
#- while [ "$(kubectl get service -l app=nginx-ingress --namespace ingress-basic | grep pending)" == "pending" ]; do echo "Updating"; sleep 1 ; done && echo "Finished"
- while [ "$(kubectl get service -l app=nginx-ingress --namespace ingress-basic -o jsonpath='{.items[*].status.loadBalancer.ingress[*].ip}')" == "" ]; do echo "Updating"; sleep 10 ; done && echo "Finished"
# Add Ingress Ext IP / Alternative
- KUBip=$(kubectl get service -l app=nginx-ingress --namespace ingress-basic -o jsonpath='{.items[*].status.loadBalancer.ingress[*].ip}')
- echo $KUBip
# Add DNS name - TODO - GitLab env variables are not working
- DNSNAME="bl-test"
# Get the resource-id of the public ip
- PUBLICIPID=$(az network public-ip list --query "[?ipAddress!=null]|[?contains(ipAddress, '$KUBip')].[id]" --output tsv)
- echo $PUBLICIPID
- az network public-ip update --ids $PUBLICIPID --dns-name $DNSNAME
#Install CertManager Console
# Install the CustomResourceDefinition resources separately
- kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml
# Create the namespace for cert-manager
- kubectl create namespace cert-manager
# Label the cert-manager namespace to disable resource validation
- kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
# Add the Jetstack Helm repository
- helm repo add jetstack https://charts.jetstack.io
# Update your local Helm chart repository cache
- helm repo update
# Install the cert-manager Helm chart
- helm install --name cert-manager --namespace cert-manager --version v0.8.0 jetstack/cert-manager
# Run Command issuer.yaml
- sed 's/_AZ_AKS_ISSUER_NAME_/'"${AZ_AKS_ISSUER_NAME}"'/g; s/_BL_DEV_E_MAIL_/'"${BL_DEV_E_MAIL}"'/g' infrastructure/kubernetes/cluster-issuer.yaml > cluster-issuer.yaml;
- kubectl apply -f cluster-issuer.yaml
# Run Command ingress.yaml
- sed 's/_BL_AZ_HOST_/'"beautylivery-test.${AZ_RESOURCE_LOCATION}.${AZ_AKS_HOST}"'/g; s/_AZ_AKS_ISSUER_NAME_/'"${AZ_AKS_ISSUER_NAME}"'/g' infrastructure/kubernetes/ingress.yaml > ingress.yaml;
- kubectl apply -f ingress.yaml
And the result:
Running with gitlab-runner 12.3.0 (a8a019e0)
on runner-gitlab-runner-676b494b6b-b5q6h gzi97H3Q
Using Kubernetes namespace: gitlab-managed-apps
Using Kubernetes executor with image microsoft/azure-cli:latest ...
Waiting for pod gitlab-managed-apps/runner-gzi97h3q-project-14628452-concurrent-0l8wsx to be running, status is Pending
Waiting for pod gitlab-managed-apps/runner-gzi97h3q-project-14628452-concurrent-0l8wsx to be running, status is Pending
Running on runner-gzi97h3q-project-14628452-concurrent-0l8wsx via runner-gitlab-runner-676b494b6b-b5q6h...
Fetching changes with git depth set to 50...
Initialized empty Git repository in /builds/****/*******/.git/
Created fresh repository.
From https://gitlab.com/****/********
* [new branch] Setup-Kubernetes -> origin/Setup-Kubernetes
Checking out d2ca489b as Setup-Kubernetes...
Skipping Git submodules setup
$ function create_secret() { # collapsed multi-line command
$ echo "current time $(TZ=Europe/Berlin date +"%F %T")"
current time 2019-10-06 09:00:50
$ az login --service-principal -u ${AZ_PRINC_USER} -p ${AZ_PRINC_PASSWORD} --tenant ${AZ_PRINC_TENANT}
[
{
"cloudName": "AzureCloud",
"id": "******",
"isDefault": true,
"name": "Nutzungsbasierte Bezahlung",
"state": "Enabled",
"tenantId": "*******",
"user": {
"name": "http://*****",
"type": "servicePrincipal"
}
}
]
$ az group create --name ${AZ_RESOURCE_GROUP} --location ${AZ_RESOURCE_LOCATION}
{
"id": "/subscriptions/*********/resourceGroups/*****",
"location": "francecentral",
"managedBy": null,
"name": "******",
"properties": {
"provisioningState": "Succeeded"
},
"tags": null,
"type": "Microsoft.Resources/resourceGroups"
}
$ az aks delete --resource-group ${AZ_RESOURCE_GROUP} --name ${AZ_AKS_TEST_CLUSTER} --yes
Running after script...
$ echo "current time $(TZ=Europe/Berlin date +"%F %T")"
current time 2019-10-06 09:05:55
Job succeeded
Is there a way to have it run completely? And successfully, in the best case?
UPDATE: The idea: I am trying to automate the process of setting up a complete Kubernetes cluster with SSL and DNS management, having everything set up fast and ready for different use cases and different environments in the future. I also want to learn how to do things better :)
NEW UPDATE:
Added a solution.
I added a small workaround, as I was expecting it to require execution every once in a while.
It seems the az aks wait command did the trick for me for now, and the previous command requires --no-wait in order to continue.
# Delete Kubernetes Cluster
- az aks delete --resource-group ${AZ_RESOURCE_GROUP} --name ${AZ_AKS_TEST_CLUSTER} --no-wait --yes
- az aks wait --deleted -g ${AZ_RESOURCE_GROUP} -n ${AZ_AKS_TEST_CLUSTER} --updated --interval 60 --timeout 1800
# Create Kubernetes Cluster
- az aks create --name ${AZ_AKS_TEST_CLUSTER} --resource-group ${AZ_RESOURCE_GROUP} --node-count ${AZ_AKS_TEST_NODECOUNT} --service-principal ${AZ_PRINC_USER} --client-secret ${AZ_PRINC_PASSWORD} --generate-ssh-keys --no-wait
- az aks wait --created -g ${AZ_RESOURCE_GROUP} -n ${AZ_AKS_TEST_CLUSTER} --updated --interval 60 --timeout 1800
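For reference, the same wait can be approximated with a plain polling loop that also keeps log output flowing, so the runner never looks idle. A sketch, assuming az aks show reports provisioningState (it does on current CLI versions):
# Poll until the cluster reports Succeeded, printing a heartbeat line
- |
  until [ "$(az aks show -g "${AZ_RESOURCE_GROUP}" -n "${AZ_AKS_TEST_CLUSTER}" --query provisioningState -o tsv)" = "Succeeded" ]; do
    echo "cluster not ready yet..."
    sleep 60
  done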

fail2ban "command not found" when executing banaction

One of the actions for fail2ban is configured to run a ruby script; however, fail2ban fails when trying to execute the ruby script with a "Command not found" error. I don't understand this error because I'm providing the full path to the ruby script and it has execution permissions:
Here's my fail2ban action:
[root:a17924e746f0:~]# cat /etc/fail2ban/action.d/404.conf
# Fail2Ban action configuration file for Subzero/Core
[Definition]
actionstart =
actionstop =
actioncheck =
actionban = /root/ban_modify.rb ban <ip>
actionunban = /root/ban_modify.rb unban <ip>
Here are the contents to the /root/ban_modify.rb script:
#!/usr/bin/env ruby
# Add or remove "<ip> deny" entries in the blacklist consumed by Apache
command = ARGV[0]
ip_address = ARGV[1]
blacklist = File.open("/root/blacklist.txt").read.split("\n")
if command == "unban"
  if blacklist.include? "#{ip_address} deny"
    blacklist.delete "#{ip_address} deny"
  end
elsif command == "ban"
  blacklist << "#{ip_address} deny"
end
File.open("/root/blacklist.txt", "w") { |f| f.write(blacklist.join("\n")) }
Very simple. This blacklist.txt file is used by Apache to permanently ban individuals from the web server when a fail2ban condition is met.
However, when I issue the following command: sudo /usr/bin/fail2ban-client set 404 unbanip <my ip>
I get the following error:
2019-08-19 20:56:43,508 fail2ban.utils [16176]: Level 39 7ff7395873f0 -- exec: ban_modify.rb ban <myip>
2019-08-19 20:56:43,509 fail2ban.utils [16176]: ERROR 7ff7395873f0 -- stderr: '/bin/sh: 1: ban_modify.rb: not found'
2019-08-19 20:56:43,509 fail2ban.utils [16176]: ERROR 7ff7395873f0 -- returned 127
2019-08-19 20:56:43,509 fail2ban.utils [16176]: INFO HINT on 127: "Command not found". Make sure that all commands in 'ban_modify.rb ban <myip>' are in the PATH of fail2ban-server process (grep -a PATH= /proc/`pidof -x fail2ban-server`/environ). You may want to start "fail2ban-server -f" separately, initiate it with "fail2ban-client reload" in another shell session and observe if additional informative error messages appear in the terminals.
2019-08-19 20:56:43,509 fail2ban.actions [16176]: ERROR Failed to execute ban jail '404' action '404' info 'ActionInfo({'ip': '<myip>', 'family': 'inet4', 'ip-rev': '<myip>.', 'ip-host': '<myip>', 'fid': '<myip>', 'failures': 1, 'time': 1566266203.3465006, 'matches': '', 'restored': 0, 'F-*': {'matches': [], 'failures': 1}, 'ipmatches': '', 'ipjailmatches': '', 'ipfailures': 1, 'ipjailfailures': 1})': Error banning <myip>
I'm not sure why this error is happening if the actionban is pointing to the full path of a ruby script.
I even tried changing the contents of /root/ban_modify.rb to simply puts "Hello World". I also tried changing the banaction to iptables-allports, and that still failed. It seems like banaction simply doesn't work.
You can enable fail2ban's debug mode and check the fail2ban log for more details:
# change fail2ban log level
sudo nano /etc/fail2ban/fail2ban.conf
loglevel = DEBUG
# restart fail2ban
sudo systemctl restart fail2ban
# check logs
tail -f /var/log/fail2ban.log
You can restart fail2ban and check it again:
sudo systemctl restart fail2ban
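Note that the log line exec: ban_modify.rb ban <myip> shows the command without the /root/ prefix, so it is also worth confirming what environment the running server actually sees and whether the script works at all by hand. A sketch, reusing the diagnostic hint from the log itself (192.0.2.1 is just a documentation IP):
# PATH as seen by the fail2ban-server process (hint from the log above)
grep -a PATH= /proc/$(pidof -x fail2ban-server)/environ
# run the action script directly to rule out problems in the script itself
sudo /root/ban_modify.rb ban 192.0.2.1
cat /root/blacklist.txt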

Send an HTTPS request to TLS1.0-only server in Alpine linux

I'm writing a simple web crawler inside a Docker Alpine image. However, I cannot send HTTPS requests to servers that only support TLS 1.0. How can I configure Alpine Linux to allow obsolete TLS versions?
I tried adding MinProtocol to /etc/ssl/openssl.cnf with no luck.
Example Dockerfile:
FROM node:12.0-alpine
RUN printf "[system_default_sect]\nMinProtocol = TLSv1.0\nCipherString = DEFAULT@SECLEVEL=1" >> /etc/ssl/openssl.cnf
CMD ["/usr/bin/wget", "https://www.restauracesalanda.cz/"]
When I build and run this container, I get
Connecting to www.restauracesalanda.cz (93.185.102.124:443)
ssl_client: www.restauracesalanda.cz: handshake failed: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol
wget: error getting response: Connection reset by peer
I can reproduce your issue using the built-in busybox wget. However, using the "regular" wget works:
root@a:~# docker run --rm -it node:12.0-alpine /bin/ash
/ # wget -q https://www.restauracesalanda.cz/; echo $?
ssl_client: www.restauracesalanda.cz: handshake failed: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol
wget: error getting response: Connection reset by peer
1
/ # apk add wget
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
(1/1) Installing wget (1.20.3-r0)
Executing busybox-1.29.3-r10.trigger
OK: 7 MiB in 17 packages
/ # wget -q https://www.restauracesalanda.cz/; echo $?
0
/ #
I'm not sure, but maybe you should post an issue at https://bugs.alpinelinux.org
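In the meantime, a workaround sketch based on the session above is to install GNU wget in the image and leave everything else unchanged:
FROM node:12.0-alpine
# replace the busybox wget/ssl_client pair with GNU wget, which
# negotiated the connection successfully in the session above
RUN apk add --no-cache wget
CMD ["/usr/bin/wget", "https://www.restauracesalanda.cz/"]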
Putting this magic one-liner into my Dockerfile solved my issue and I was able to use TLS 1.0:
RUN sed -i 's/MinProtocol = TLSv1.2/MinProtocol = TLSv1/' /etc/ssl/openssl.cnf \
 && sed -i 's/CipherString = DEFAULT@SECLEVEL=2/CipherString = DEFAULT@SECLEVEL=1/' /etc/ssl/openssl.cnf
Credit goes to this dude: http://blog.travisgosselin.com/tls-1-0-1-1-docker-container-support/

knife vsphere requests root password - is unattended execution possible?

Is there any way to run knife vsphere unattended? I have a deploy shell script which I am using to help me:
cat deploy-production-20-vm.sh
#!/bin/bash
##############################################
# These are machine dependent variables (need to change)
##############################################
HOST_NAME=$1
IP_ADDRESS="$2/24"
CHEF_BOOTSTRAP_IP_ADDRESS="$2"
RUNLIST=\"$3\"
CHEF_HOST="$HOST_NAME.my.lan"
##############################################
# These are pseudo-environment independent variables (could change)
##############################################
DATASTORE="dcesxds04"
##############################################
# These are environment dependent variables (should not change per env)
##############################################
TEMPLATE="\"CentOS\""
NETWORK="\"VM Network\""
CLUSTER="ProdCluster01" #knife-vsphere calls this a resource pool
GATEWAY="10.7.20.1"
DNS="\"10.7.20.11,10.8.20.11,10.6.20.11\""
##############################################
# the magic
##############################################
VM_CLONE_CMD="knife vsphere vm clone $HOST_NAME \
--template $TEMPLATE \
--cips $IP_ADDRESS \
--vsdc MarkleyDC \
--datastore $DATASTORE \
--cvlan $NETWORK \
--resource-pool $CLUSTER \
--cgw $GATEWAY \
--cdnsips $DNS \
--start true \
--bootstrap true \
--fqdn $CHEF_BOOTSTRAP_IP_ADDRESS \
--chost $HOST_NAME \
--cdomain my.lan \
--run-list=$RUNLIST"
echo $VM_CLONE_CMD
eval $VM_CLONE_CMD
Which echoes (as a single line):
knife vsphere vm clone dcbsmtest --template "CentOS" --cips 10.7.20.84/24
--vsdc MarkleyDC --datastore dcesxds04 --cvlan "VM Network"
--resource-pool ProdCluster01 --cgw 10.7.20.1
--cdnsips "10.7.20.11,10.8.20.11,10.6.20.11" --start true
--bootstrap true --fqdn 10.7.20.84 --chost dcbsmtest --cdomain my.lan
--run-list="role[my-env-prod-server]"
When it runs it outputs:
Cloning template CentOS Template to new VM dcbsmtest
Finished creating virtual machine dcbsmtest
Powered on virtual machine dcbsmtest
Waiting for sshd...done
Doing old-style registration with the validation key at /home/me/chef-repo/.chef/our-validator.pem...
Delete your validation key in order to use your user credentials instead
Connecting to 10.7.20.84
root@10.7.20.84's password:
If I step away from my desk and it prompts for a password, then sometimes it times out, the connection is lost, and chef doesn't bootstrap. Also, I would like to be able to automate all of this to be elastic based on system needs, which won't work with attended execution.
The idea I am going to run with, unless provided a better solution, is to have a default password in the template and pass it on the command line to knife, then have chef change the password once the build is complete, minimizing the exposure of a hard-coded password in the bash script controlling knife...
Update: I wanted to add that this is working like a charm. Ideally we could have changed the CentOS template we were deploying, but that wasn't possible here, so this is a fine alternative (as we changed the root password after deploy anyhow).
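For reference, the shape of that change might look like this (a sketch: --ssh-user and --ssh-password are standard knife bootstrap options, but check that your knife-vsphere version forwards them, and TEMPLATE_ROOT_PWD is a hypothetical variable holding the template's default root password):
# append bootstrap credentials to the clone command built above
VM_CLONE_CMD="$VM_CLONE_CMD \
  --ssh-user root \
  --ssh-password $TEMPLATE_ROOT_PWD"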
