PCS resource ipaddr2 failed to start with exitreason='[findif] failed' - pacemaker

I need to set up a VIP with pcs in a 2-node CentOS 7 cluster. The resource is defined like this:
pcs resource create MyVip ocf:heartbeat:IPaddr2 ip=10.215.208.164/24 cidr_netmask=24 nic=ens32 op monitor interval=3s
This same config is working well in all other deployments. I just can't understand what the error means:
Failed Actions:
* MyVip_start_0 on node02 'not configured' (6): call=6, status=complete, exitreason='[findif] failed',
last-rc-change='Fri Dec 28 20:47:26 2018', queued=0ms, exec=58ms
This is the interface that seemingly cannot be found:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:50:56:92:e2:f9 brd ff:ff:ff:ff:ff:ff
inet 10.215.208.173/24 brd 10.215.208.255 scope global noprefixroute ens32
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fe92:e2f9/64 scope link
valid_lft forever preferred_lft forever

If you are getting
vip_start_0 on serv1.XXX.com 'unknown error' (1): call=6, status=complete, exitreason='[findif] failed',
last-rc-change='Sat Sep 19 16:16:19 2020', queued=1ms, exec=159ms
Check whether you have a NIC set up for the resource:
pcs config
And check in the output whether the nic attribute is defined:
Cluster Name: VIP
Corosync Nodes:
serv1.centos7g.com serv2.XXX.com
Pacemaker Nodes:
serv1.centos7g.com serv2.XXX.com
Resources:
Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
Attributes: cidr_netmask=24 ip=192.168.119.200 nic=YOUR_NIC_HERE
You can update the nic for an existing resource. This worked for me (CentOS 7.2):
pcs resource update RESOURCE_NAME nic=NIC_NAME
pcs resource cleanup
# check if IP address was created on your NIC interface
ip a s
pcs status
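Applied to the resource from the original question, that sequence would look roughly like this (resource name and interface taken from the question; adjust to your own):
pcs resource update MyVip nic=ens32
pcs resource cleanup MyVip
# the VIP should now show up as a secondary address on the interface
ip a s ens32
pcs status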

pcs resource create MyVip ocf:heartbeat:IPaddr2 ip=10.215.208.164/24 cidr_netmask=24 nic=ens32 op monitor interval=3s
The ip parameter must not carry the CIDR mask; the mask belongs only in cidr_netmask.
The correct definition would be:
pcs resource create MyVip ocf:heartbeat:IPaddr2 ip=10.215.208.164 cidr_netmask=24 nic=ens32 op monitor interval=3s
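To check that the parameters parse cleanly before handing them back to the cluster, the agent can also be run by hand; a minimal sketch, assuming the resource-agents package installs under the usual /usr/lib/ocf path on CentOS 7 (exit code 0 means the parameters validate; 6 is OCF_ERR_CONFIGURED, the same 'not configured' seen in the failed action):
export OCF_ROOT=/usr/lib/ocf
OCF_RESKEY_ip=10.215.208.164 OCF_RESKEY_cidr_netmask=24 OCF_RESKEY_nic=ens32 \
  /usr/lib/ocf/resource.d/heartbeat/IPaddr2 validate-all
echo $?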

Got this error message with the command:
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=1.2.3.4 cidr_netmask=32 op monitor interval=30s
I guess the findif script tries to find an interface whose network matches the given IP. I had no such interface, so specifying an IP from one of my interfaces' subnets solved the problem:
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.243.123 cidr_netmask=32 op monitor interval=30s
Specifying the interface manually also solves the problem:
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=1.2.3.4 cidr_netmask=32 nic=lo op monitor interval=30s
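Whether findif will succeed can be guessed in advance by asking the routing table how the address would be reached; a quick check with the two addresses above (output shapes are indicative only):
ip route get 192.168.243.123   # resolved via a directly connected interface, so findif can pick one
ip route get 1.2.3.4           # only matches the default route ('via <gateway>'), so nic= has to be given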

Related

Kolla Ansible: OpenStack Instances Unable to Access Internet or Each Other

I am a newbie to OpenStack (deployed using kolla-ansible) and have created two instances, both Ubuntu 20.04 VMs. I am able to ping and ssh to them from the host machine (192.168.211.133) and vice versa. However, the instances are unable to access the internet. The virtual router is also unable to access the internet.
The configuration of one of the machines is below:
root@kypo-virtual-machine:/etc/apt/sources.list.d# ip netns ls
qrouter-caca1d42-86b4-42a2-b591-ec7a90437029 (id: 1)
qdhcp-0ec41857-9420-4322-9fef-e332c034e98e (id: 0)
root@kypo-virtual-machine:/etc/apt/sources.list.d# ip netns e qrouter-caca1d42-86b4-42a2-b591-ec7a90437029 route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 192.168.211.1 0.0.0.0 UG 0 0 0 qg-f31a26b7-25
192.168.64.0 0.0.0.0 255.255.192.0 U 0 0 0 qr-e5c8842c-c2
192.168.211.0 0.0.0.0 255.255.255.0 U 0 0 0 qg-f31a26b7-25
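(For what it's worth, connectivity from the router's point of view can be tested directly inside its namespace, using the qrouter ID listed above:)
ip netns exec qrouter-caca1d42-86b4-42a2-b591-ec7a90437029 ping -c 3 8.8.8.8
ip netns exec qrouter-caca1d42-86b4-42a2-b591-ec7a90437029 ip route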
Netplan of instance shows:
# This file is generated from information provided by the datasource. Changes
# to it will not persist across an instance reboot. To disable cloud-init's
# network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
  version: 2
  ethernets:
    ens3:
      dhcp4: true
      match:
        macaddress: fa:16:3e:a7:9d:70
      mtu: 1450
      set-name: ens3
And the IP scheme is:
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc
fq_codel state UP group default qlen 1000
link/ether fa:16:3e:a7:9d:70 brd ff:ff:ff:ff:ff:ff
inet 192.168.65.39/18 brd 192.168.127.255 scope global dynamic ens3
valid_lft 85719sec preferred_lft 85719sec
inet6 fe80::f816:3eff:fea7:9d70/64 scope link
valid_lft forever preferred_lft forever
From Horizon
IP Addresses
kypo-base-net
192.168.65.39, 192.168.211.250
Security Groups
kypo-base-proxy-sg
ALLOW IPv6 to ::/0
ALLOW IPv4 icmp from 0.0.0.0/0
ALLOW IPv4 22/tcp from 0.0.0.0/0
ALLOW IPv4 udp from b9904736-6d8a
ALLOW IPv4 tcp from b9904736-6d8a
ALLOW IPv4 tcp from 73ca626b-7cfb
ALLOW IPv4 udp from 73ca626b-7cfb
ALLOW IPv4 to 0.0.0.0/0
I was able to resolve the issue by pinpointing that the gateway used by the virtual router (192.168.211.1) was different from the one used by my host VM (192.168.211.2).
kypo@kypo-virtual-machine:/etc/kolla$ ip route show
default via 192.168.211.2 dev ens33 proto dhcp src 192.168.211.133 metric 100
I modified the gateway:
openstack subnet set --gateway 192.168.211.2 public-subnet
And now my instances are able to access the internet.
The main reason for this configuration issue was that, while creating the subnet, I used auto for the --gateway option, and obviously it did not pick the correct gateway.
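A quick way to compare the two gateways before changing anything (subnet name as used in the command above) is:
# gateway that Neutron hands out to the router/instances
openstack subnet show public-subnet -f value -c gateway_ip
# gateway the host itself uses
ip route show default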

How to activate CAN bus support in Yocto/BeagleBone Black?

I am trying to enable CAN bus support in Yocto on a BeagleBone Black.
I did the kernel config with bitbake -c menuconfig virtual/kernel and added the following drivers to the kernel:
Raw CAN Protocol
Broadcast Manager CAN Protocol
CAN Gateway/Router
Platform CAN drivers with Netlink support
CAN bit-timing calculation
TI High End CAN Controller
And I added IMAGE_INSTALL_append = " can-utils iproute2" to local.conf.
When my Yocto image boots up, the serial console shows:
[ 1.239593] can: controller area network core (rev 20170425 abi 9)
[ 1.246828] NET: Registered protocol family 29
[ 1.251438] can: raw protocol (rev 20170425)
[ 1.255758] can: broadcast manager protocol (rev 20170425 t)
[ 1.261517] can: netlink gateway (rev 20190810) max_hops=1
So I think the kernel has the CAN drivers and SocketCAN.
But there is no CAN device:
root@beaglebone:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 78:a5:04:b4:18:cf brd ff:ff:ff:ff:ff:ff
inet 192.168.100.19/24 brd 192.168.100.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 240b:251:520:5b00:7aa5:4ff:feb4:18cf/64 scope global dynamic mngtmpaddr
valid_lft 2591946sec preferred_lft 604746sec
inet6 fe80::7aa5:4ff:feb4:18cf/64 scope link
valid_lft forever preferred_lft forever
3: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/sit 0.0.0.0 brd 0.0.0.0
Could you tell me how I can find the CAN device in ip a?
BR, Soramame
The AM335x has a Bosch C_CAN/D_CAN controller, not the TI High End CAN Controller.
So I changed the kernel config via bitbake -c menuconfig virtual/kernel.
I also modified the device tree and rebuilt the kernel.
Then I could find can0 and can1.
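(If I remember the Kconfig names correctly, the Bosch controller corresponds to CONFIG_CAN_C_CAN and CONFIG_CAN_C_CAN_PLATFORM.) Once can0 shows up, it still has to be configured and brought up before can-utils will see traffic; a minimal sketch, assuming a 500 kbit/s bus:
ip link set can0 type can bitrate 500000
ip link set can0 up
candump can0                 # listen for frames
cansend can0 123#DEADBEEF    # send a test frame from another shell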

Oracle Cloud and Docker

I am trying to run my Docker container on an Oracle Cloud instance.
In the past (dedicated server with public IP), I used to run this command to bind my container: docker run -d -p 80:80 image
But now, it doesn't work anymore.
I checked my network interfaces, and I am getting confused because I cannot see my public IP. How can I fix this issue?
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:00:17:00:8e:77 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.4/24 brd 10.0.0.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::17ff:fe00:8e77/64 scope link
valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:cc:94:7a:d9 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:ccff:fe94:7ad9/64 scope link
valid_lft forever preferred_lft forever
I can't give you a complete answer without knowing exactly what your network setup is.
However, all information can be found here.
To summarize, for instances in Oracle Cloud Infrastructure to be accessible from the outside there is a set of prerequisites:
Create a VCN and a public subnet
Create an Internet Gateway in your VCN
Add that internet Gateway to the subnet route table
Create an instance (only the private IP will be visible inside of your instance; in your case, it is 10.0.0.4).
Assign a public IP to your instance (in reality OCI links the public IP to the private one and not to the instance itself).
If you already have a public subnet you should have seen an "assign public IP" checkbox while creating the instance.
Please feel free to add more details about your setup.
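Once those pieces are in place, each side can be checked separately (port 80 as in your docker run command; PUBLIC_IP is a placeholder for whatever address the OCI console shows for the instance):
# on the instance: confirm the container is actually listening
sudo ss -tlnp | grep ':80'
# from outside: hit the assigned public IP
curl -v http://PUBLIC_IP/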

Pacemaker Cluster with two network interfaces on each node

I am trying to create a cluster between 2 nodes with 2 network interfaces each. The idea is that the cluster fails over to the other node when either of the active node's 2 interfaces goes down (or both, logically). The problem is that the cluster only fails over if the eth1 interface of the active node goes down. If the eth0 interface of the active node goes down, the cluster never fails over.
This is the network configuration of the nodes:
node1:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.3 netmask 255.255.255.248 broadcast 192.168.0.7
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.26.34.2 netmask 255.255.255.248 broadcast 172.26.34.7
node2:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.4 netmask 255.255.255.248 broadcast 192.168.0.7
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.26.34.3 netmask 255.255.255.248 broadcast 172.26.34.7
These are the commands I use to create the cluster between the nodes and assign the resources:
pcs cluster auth node1 node2 -u hacluster -p 1234 --debug --force
pcs cluster setup --name HAFirewall node1 node2 --force
pcs cluster start --all
pcs resource create VirtualIP_eth0 ocf:heartbeat:IPaddr2 ip=192.168.0.1 cidr_netmask=29 nic=eth0 op monitor interval=30s --group InterfacesHA
pcs resource create VirtualIP_eth1 ocf:heartbeat:IPaddr2 ip=172.26.34.1 cidr_netmask=29 nic=eth1 op monitor interval=30s --group InterfacesHA
pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore
pcs resource enable InterfacesHA
This is the configuration of the corosync.conf file:
totem {
version: 2
secauth: off
cluster_name: HAFirewall
transport: udpu
}
nodelist {
node {
ring0_addr: node1
nodeid: 1
}
node {
ring0_addr: node2
nodeid: 2
}
}
quorum {
provider: corosync_votequorum
two_node: 1
}
logging {
to_logfile: yes
logfile: /var/log/corosync/corosync.log
to_syslog: yes
}
This is the output of the pcs status command:
Cluster name: HAFirewall
Stack: corosync
Current DC: node1 (version 1.1.16-94ff4df) - partition WITHOUT quorum
Last updated: Tue Oct 27 19:01:35 2020
Last change: Tue Oct 27 18:22:27 2020 by hacluster via crmd on node2
2 nodes configured
2 resources configured
Online: [ node1 ]
OFFLINE: [ node2 ]
Full list of resources:
Resource Group: InterfacesHA
VirtualIP_eth0 (ocf::heartbeat:IPaddr2): Started node1
VirtualIP_eth1 (ocf::heartbeat:IPaddr2): Started node1
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
This is the output of the crm configure show command:
node 1: node1
node 2: node2
primitive VirtualIP_eth0 IPaddr2 \
params ip=192.168.0.1 cidr_netmask=29 \
op start interval=0s timeout=20s \
op stop interval=0s timeout=20s \
op monitor interval=30s
primitive VirtualIP_eth1 IPaddr2 \
params ip=172.26.34.1 cidr_netmask=29 \
op start interval=0s timeout=20s \
op stop interval=0s timeout=20s \
op monitor interval=30s
group InterfacesHA VirtualIP_eth0 VirtualIP_eth1
location cli-prefer-InterfacesHA InterfacesHA role=Started inf: node1
property cib-bootstrap-options: \
stonith-enabled=false \
no-quorum-policy=ignore \
have-watchdog=false \
dc-version=1.1.16-94ff4df \
cluster-infrastructure=corosync \
cluster-name=HAFirewall
And these are the interfaces of node1 when it is active and has the virtual IPs up:
eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether ac:1f:6b:90:a5:58 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.3/29 brd 192.168.0.7 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.0.1/29 brd 192.168.0.7 scope global secondary eth0
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:fe90:a558/64 scope link
valid_lft forever preferred_lft forever
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether ac:1f:6b:90:a5:59 brd ff:ff:ff:ff:ff:ff
inet 172.26.34.2/29 brd 172.26.34.7 scope global eth1
valid_lft forever preferred_lft forever
inet 172.26.34.1/29 brd 172.26.34.7 scope global secondary eth1
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:fe90:a559/64 scope link
valid_lft forever preferred_lft forever
Any idea why the cluster works perfectly when the eth1 interface is down and does not work when the eth0 interface is down?
Greetings and thanks.
I believe you need to specify both interfaces in corosync.conf:
interface {
ringnumber: 0
bindnetaddr: 192.168.0.4
...
interface {
ringnumber: 1
bindnetaddr: 172.26.34.3
...
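With the udpu transport used in the question's corosync.conf, the usual equivalent is a second ring address per node in the nodelist plus an rrp_mode setting; a sketch under that assumption, with the addresses taken from the question:
totem {
  version: 2
  cluster_name: HAFirewall
  transport: udpu
  rrp_mode: passive
}
nodelist {
  node {
    ring0_addr: 192.168.0.3
    ring1_addr: 172.26.34.2
    nodeid: 1
  }
  node {
    ring0_addr: 192.168.0.4
    ring1_addr: 172.26.34.3
    nodeid: 2
  }
}
After restarting corosync on both nodes, corosync-cfgtool -s should list the status of both rings on each node.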

Of VirtualBox, Chef-dk and Ruby

I've been trying to set up a home "lab" so I can further increase my fluency in Chef. While doing this, I've found an area of frustration I'm looking both to understand (likely a VBox cause) and remedy.
The goal is to use my Arch desktop (which hosts VBox) as the workstation for Chef-dk (installed)
Create two Ubuntu VMs (set up and configured):
- chefsvr (hosts the Chef server)
- chefnode (the node to apply Chef recipes on and manage)
Having spent a while trying to get this all up and running, I noticed that bootstrapping the node fails. The command is:
knife bootstrap chefnode --ssh-user mtompkins --sudo --identity-file ~/.ssh/id_rsa --node-name chefnode --run-list 'recipe[learn_chef_httpd]'
The error is:
/opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/net-ssh-4.1.0/lib/net/ssh/transport/session.rb:90:in `rescue in initialize': Net::SSH::ConnectionTimeout (Net::SSH::ConnectionTimeout)
from /opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/net-ssh-4.1.0/lib/net/ssh/transport/session.rb:57:in `initialize'
from /opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/net-ssh-4.1.0/lib/net/ssh.rb:233:in `new'
from /opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/net-ssh-4.1.0/lib/net/ssh.rb:233:in `start'
from /opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/net-ssh-multi-1.2.1/lib/net/ssh/multi/server.rb:186:in `new_session'
from /opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/net-ssh-multi-1.2.1/lib/net/ssh/multi/session.rb:488:in `next_session'
from /opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/net-ssh-multi-1.2.1/lib/net/ssh/multi/server.rb:138:in `session'
from /opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/net-ssh-multi-1.2.1/lib/net/ssh/multi/session_actions.rb:36:in `block (2 levels) in sessions'
from /opt/chefdk/embedded/lib/ruby/gems/2.4.0/gems/logging-2.2.2/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
In trying to debug the above, I've added a 2nd NIC to the VMs so that the first NIC is now a Host-Only Adapter and the 2nd a Bridged Adapter, as I want to use my internal DNS server. Traffic seems to pass freely and SSH works all around outside of Chef.
No joy after adding the 2nd adapters.
My next effort was to spin up a 3rd VM and try that as a management node. Instead of Arch I used Ubuntu because I wanted to be as close to the Chef "How-to" as possible. After spinning up and configuring this workstation, everything works as expected.
Any thoughts on this are greatly appreciated. I'd love to use all the tools on my Arch and not be working totally in VMs.
My guess is that there's some networking adjustment I need to make with VirtualBox, but so far I've been unable to identify any.
Many thanks.
Current Versions (although many others tried historically):
VBox 5.2.2r119230
Chef Development Kit Version: 2.4.19
chef-client version: 13.6.4
berks version: 6.3.1
kitchen version: 1.19.2
inspec version: 1.46.2
Additional Info:
Specifics:
Physical IP of Host: 192.168.1.98/24
Guest Bridge Adapter Network: 192.168.1.0/24
Guest Host-Only Adapter Network: 192.168.56.0/24
The Chef nodes have an address in each.
Example:
ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:6a:49:6a brd ff:ff:ff:ff:ff:ff
inet 192.168.56.102/24 brd 192.168.56.255 scope global dynamic enp0s3
valid_lft 1035sec preferred_lft 1035sec
inet6 fe80::a00:27ff:fe6a:496a/64 scope link
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:25:13:43 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.174/24 brd 192.168.1.255 scope global dynamic enp0s8
valid_lft 73642sec preferred_lft 73642sec
inet6 fe80::a00:27ff:fe25:1343/64 scope link
valid_lft forever preferred_lft forever
Add'l #2
Tried bypassing DNS with static entries in the hosts file so traffic would route on the Host-Only subnet. Traffic flows correctly, bypassing the bridge, but there is no improvement when trying to bootstrap the node.
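For what it's worth, the override can be sanity-checked from the Arch workstation with the same parameters the bootstrap uses (hostname, user and key as in the knife command above; the resolved address should be whatever was put in the hosts file):
getent hosts chefnode                                         # which address does the name resolve to?
ssh -i ~/.ssh/id_rsa -o ConnectTimeout=10 mtompkins@chefnode 'echo ssh ok'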
