Multicast propagation with daisy chained multi NIC card PCs - zeromq

Here is the topology under discussion:
NIC0 NIC0 NIC0
| | |
+-----+-----+ +------+-----+ +------+------+
---NIC1 NIC2---------NIC1 NIC2-----NIC1 NIC2---- . .
+----PC1----+ +----PC2-----+ +-----PC3-----+
I have a stack of PC boxes, each with multiple (3) NICs: one to interface with the outside world and the other two to daisy-chain the boxes for communication among themselves.
Q(1) Can someone suggest how I can control multicast traffic on the daisy-chained systems without letting it mix with traffic coming in on the NIC0 cards (see the diagram above)?
I am running a Linux kernel on each box. I can bind a multicast address to a particular interface, but my guess is that this alone would not prevent the traffic from mixing if the NIC0 traffic uses the same multicast IP, so is an ACL the answer?
Q(2) My application needs a subscribe-notify setup, and that's why I need multicast. There are options such as 0MQ, which also supports reliable multicast (PGM). Will that protect me here somehow?

I don't know why you would daisy-chain these computers. Before going any further, you should connect these machines with a switch.
Really, though, I don't understand what your question is...
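For what it's worth, the per-interface isolation described in Q(1) is usually handled on Linux with routing and netfilter rules rather than by the messaging library. A minimal sketch, assuming (hypothetically) that NIC0 is eth0, one daisy-chain NIC is eth1, and the internal group range is 239.1.1.0/24; the interface names and addresses are placeholders:
# Send the internal multicast group out the daisy-chain interface only
ip route add 239.1.1.0/24 dev eth1
# Drop packets for that group range arriving on the outside-world NIC
iptables -A INPUT -i eth0 -d 239.1.1.0/24 -j DROP
iptables -A FORWARD -i eth0 -d 239.1.1.0/24 -j DROP
Whether this is enough depends on the senders: an application can still override the route by binding its socket to a specific interface.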

Related

Cut-through behaviour for the INET WirelessHost

The INET WirelessHost inherits from the StandardHost, which has store-and-forward as its default forwarding behaviour. Is there a way to change that behaviour to cut-through? I did not find any fitting parameters in either the StandardHost or the WirelessHost module.
TL;DR: No, that is not possible and makes no sense.
Cut-through is a layer 2 (link layer) device feature, i.e. switches can support it. It is impossible to do this at layer 3 (network layer), as IP packets can be fragmented and defragmented and the IP header itself can change during routing. So at most the question should be: can a wireless access point support cut-through? BUT:
Cut-through implies that the interface can receive and send simultaneously. That's an almost impossible feat for a wireless transceiver over a radio medium (unless the transmission is highly directional, like Starlink laser links, but in that case the links could just as well be considered wired channels).

Is tcpdump 100% reliable on outgoing connection?

I'm working on a server.
It does a health check to another server, essentially a simple TCP connection open.
Basically, my tcpdump says that the packet (the health-check TCP SYN packet) is going out of my interface.
But the firewall doesn't see anything.
I'm not sure whether the packet is leaving the server at all, or whether the problem is on the switch.
Is there a way to be sure about this?
Captured traffic == source of truth
It's possible for tcpdump to have false negatives (i.e. packets are sent but tcpdump doesn't record them). This can happen when the hardware (CPU, RAM, disk) is maxed out or when tcpdump's buffer size (-B) is too small. Likewise, it's possible your firewall isn't picking the packet up where it should.
It's highly unlikely for tcpdump to report a false positive. Tcpdump copies bytes from your network interface [0] and summarizes them in a text line (depending on your output options). If local firewall rules (e.g. from iptables) block the outgoing traffic, tcpdump won't see it, because the packet is dropped before it ever reaches the interface. If tcpdump reports a packet, you can be sure it transited that interface.
[0]: If you're curious how tcpdump works at a lower level, use strace.
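As an illustration, here is a capture narrowed to the health-check SYN (the interface name, destination address, and port are placeholders); the "packets dropped by kernel" counter tcpdump prints on exit tells you whether the capture itself missed anything:
# A larger capture buffer (-B, in KiB) reduces the risk of the kernel dropping packets before tcpdump reads them
tcpdump -i eth0 -B 4096 -nn 'tcp[tcpflags] & tcp-syn != 0 and dst host 198.51.100.10 and dst port 443'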
Flow-based troubleshooting
Flow-based troubleshooting can be required to figure out where packets get dropped in a network. For your network of server:A <-> B:switch:C <-> D:Firewall, we know that A sends it and D does not receive it. Thus you should check ports B and C to determine where the packet loss occurs. It's also possible that D reports a false negative. You can test both of these things by plugging this server directly into a different firewall that can take packet captures/monitor traffic.
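One extra check on the sending host itself, one level below tcpdump, is the NIC's own transmit counters. A quick sketch, assuming the interface is eth0 and the driver exposes statistics with conventional names:
# Snapshot TX counters, send one health check, then compare
ethtool -S eth0 | grep -i tx_packets
ip -s link show eth0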

How to do a TRUE rescan of PCIe bus

I have an FPGA (like most of the people asking this question) that gets configured after my Linux kernel does the initial PCIe bus scan and enumeration. As you can guess, the FPGA implements a PCIe endpoint.
I would like to have the PCIe core re-enumerate the ENTIRE PCIe bus so that my FPGA will then show up and I can load my driver module. I would also like the ability to SWAP the FPGA load out for a different configuration. By this I mean I would like to be able to:
Boot Linux
Configure FPGA
Enumerate PCIe endpoint and load module
Remove PCIe endpoint
Re-configure FPGA
Re-enumerate PCIe endpoint
All without rebooting Linux
Here are solutions that have been proposed elsewhere but do not solve the problem.
echo 1 > /sys/bus/pci/rescan
This seems to work only sometimes, and it does not work if I want to hot-swap the FPGA load after it has first been enumerated.
Can the hotplug/power management facilities of PCIe be used to make this work? If so, are there any good resources on how to use the hotplug system with PCIe? (LDD does not quite cover it thoroughly enough.)
Re-enumerating the PCIe bus/tree via echo 1 > /sys/bus/pci/rescan is the correct solution. We are using it in the same way you described.
We use echo 1 > $pcidevice/remove to disconnect the driver from the device and to detach the device from the tree. The driver (xillybus) is not unloaded, just disconnected.
A better solution is to rescan only the node your FPGA is attached to. This reduces the overall impact on the system.
This technique is used in the RC3E FPGA cloud system.
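For reference, the remove/reconfigure/rescan cycle described above looks roughly like this; the endpoint and root-port addresses below are placeholders for wherever the FPGA actually sits in the tree:
# Hypothetical addresses: endpoint at 0000:01:00.0 behind root port 0000:00:01.0
echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
# ... reprogram the FPGA here ...
# Rescan only the parent port rather than the whole tree
echo 1 > /sys/bus/pci/devices/0000:00:01.0/rescan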
This really depends on exactly what has changed on the FPGA. The problem lies in how PCIe enumeration and address assignment are done, particularly in how the PCIe switches are configured. The allocation MUST be done in one shot as a depth-first search. After this is complete, it is not possible to insert additional bus numbers or address space without changing all of the subsequent allocations, which would require reloading all of the corresponding device drivers. Basically, once the bus is enumerated and addresses are assigned, you can't change the overall allocations without re-enumerating the entire bus, which requires a reboot. Preallocating resources on a specific PCIe port can alleviate this problem, and is required for PCIe hot-plugging.
If the PCIe BAR configuration has not changed, then usually doing a remove/hot reset/rescan is sufficient and no reboots are required.
If the BAR configuration has changed, then it's a different story. If the new BARs are smaller, there should be no problem. But if the new BARs are larger, or there are more of them, and there isn't enough address space allocated to the switch port the device is attached to, then those BARs cannot be allocated address space and the device will fail to enumerate. In this case a reboot is required so that resources can be reassigned. Don't forget that there are also 32-bit BARs and 64-bit BARs, and these are assigned from two different pools of address space, so changing BAR types can also require a reboot to re-enumerate.
If you're going from no device to a device (i.e. blank FPGA to configured FPGA), then bus numbers may need to be reassigned, which requires a reboot.
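A quick way to check whether a new FPGA image will fit in the existing allocation is to compare the endpoint's BARs against the windows reserved on its upstream bridge. A sketch using lspci, with hypothetical addresses:
# BARs currently assigned to the endpoint
lspci -vvs 01:00.0 | grep -i region
# Address windows reserved on the upstream bridge/root port
lspci -vvs 00:01.0 | grep -i 'behind bridge'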
From The Doctor
Here is how to reset the Vegas, similar to a reset in Windows. It matches the cards by their vendor ID (1002):
# Emit an "echo 1 > .../rescan" line for every AMD (1002) device, skipping the .1 audio function
lspci -Dn | awk '$3 ~ /^1002:/ && $1 !~ /\.1$/ {print $1}' | xargs -I{} find /sys/devices -path '*/{}/rescan' | sed 's|.*|echo 1 > "&"|'
Put the output of that into your /etc/rc.local to reset your Vegas after boot, similar to the devcon restart script on Windows.
echo 1 > "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/rescan"
echo 1 > "/sys/devices/pci0000:00/0000:00:1c.5/0000:03:00.0/rescan"
echo 1 > "/sys/devices/pci0000:00/0000:00:1d.0/0000:06:00.0/rescan"
echo 1 > "/sys/devices/pci0000:00/0000:00:1d.1/0000:07:00.0/rescan"

Iptables: Matching packets leaving a bridged interface

Apologies if you've already seen this over on serverfault, but it's been on there for several days now, and I've had absolutely no traction...
I'm building a firewall configuration tool based on iptables, and trying to get a "bump in the wire" scenario working.
Given a setup with eth0 and eth1 in a bridge br0 and a third interface eth2:
  |        |        |
 eth0     eth1     eth2
  |        |        |
  +==br0===+        |
       |            |
  +--- linux node ---+
In this scenario, let's say I want TCP port 80 traffic to be dropped if it is going to the network attached to eth0, but allowed through to eth1.
I am therefore trying to reliably match packets that go out over the specific interface eth0.
If I add the following iptables rule in the filter table:
-A FORWARD -o br0 -m physdev --physdev-out eth0 -j LOG
Given a packet that originates from eth1 (the other half of the bridge), then the rule matches just fine, logging:
... IN=br0 OUT=br0 PHYSIN=eth2 PHYSOUT=eth1 ...
However, if the packet originates from eth2, then the rule no longer matches.
It appears that the routing algorithm can't determine which of the bridged interfaces to choose, so the packet is sent out over both interfaces in the bridge.
If I add another more promiscuous log rule, then I get the following log output for that packet:
... IN=eth2 OUT=br0 ...
My guess is that in the first case the routing algorithm can just choose the other interface on the bridge, since the packet shouldn't go out the way it came in. In the second case it hasn't chosen a specific interface, and you then get no physdev information at all!
However, if the bridge has learned the destination MAC address (as shown by brctl showmacs br0), then it can determine the correct interface and you get physdev information again.
(There is also a third case, where the bridge comprises three or more interfaces: this seems to apply there too, since the bridge still can't establish a single interface to send the packet on just by excluding the source interface.)
So, the question is: how can I reliably match packets that go out over eth0, regardless of where they came from?
Given the example I gave at the start, it is not enough to just match packets that will be routed out over multiple interfaces, one of which is eth0 (though that would be useful in other scenarios). I want to be able to treat the traffic for eth0 and eth1 differently, allowing the traffic to eth1 but not to eth0.
Reasons for the observed behaviour
The reason that iptables doesn't get the physical bridge information when the packet arrives from a non-bridged interface is that the packet has never been near the bridging mechanism, even though at this point we know we are sending it out on the bridge.
In the case where the packet did arrive over a bridge port, but it is an N>2 bridge, the problem is that the iptables physdev extension only provides for there being one value for "out", so it simply doesn't tell us anything if there is more than one candidate.
Solution
Use ebtables instead of iptables. The ebtables OUTPUT chain will know which physical bridged interface it is sending packets out on.
In the scenario above, where you want to filter packets that are leaving via a specific bridged interface (eth0), regardless of how it arrived into the system, add an ebtables rule along the following lines:
-A OUTPUT -o eth0 -j <target>
In a more complex scenario, where you want to filter packets arriving from a specific non-bridged interface and leaving via a bridged interface, it gets harder. Say we want to drop all traffic from eth2 (non-bridged) going to eth0 (bridged as part of br0); we need to add this rule to iptables:
-A FORWARD -i eth2 -o br0 -j MARK --set-mark 1234
This will mark any packet that comes from eth2 and is bound for the bridge. Then we add this rule to ebtables:
-A OUTPUT -o eth0 --mark 1234 -j DROP
This will DROP any packet marked by iptables (as being from eth2) that egresses via the specific bridge port eth0.
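Putting the two halves together for the original port-80 requirement in the eth2-to-bridge case (drop HTTP toward the eth0 segment, leave eth1 alone), the full commands would look something like this; the mark value is arbitrary:
# Mark HTTP traffic coming from the routed interface and heading for the bridge
iptables -A FORWARD -i eth2 -o br0 -p tcp --dport 80 -j MARK --set-mark 1234
# Drop it only if it leaves via bridge port eth0; traffic out eth1 is untouched
ebtables -A OUTPUT -o eth0 --mark 1234 -j DROP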
Acknowledgements
Thanks goes out to Pascal Hambourg over at the netfilter iptables mailing list for his help in coming up with this solution.

Obtaining MACs on a Layer 3 port via SNMP?

I'm working on a script to map servers that are connected into our switches and routers. I have it working to map layer two ports, using the algorithm listed at http://www.cisco.com/en/US/tech/tk648/tk362/technologies_tech_note09186a00801c9199.shtml to pull out the MAC addresses.
Layer 3 ports are another matter. These are ports that don't show up in the 'sh vlan' command on a router/layer 3 switch. Ideally, I'd like to use the MAC addresses present on these ports, underlying the layer 3 connection, as that's a bit more 'permanent' than the IP address; these do show up in the MAC address table on the device. However, the fact that these ports don't have an associated VLAN, and that MAC retrieval via SNMP is VLAN-indexed, makes it quite difficult.
I've been banging my head against this for about a week, but nothing I try or find seems to let me get the non-VLAN MAC addresses. Is it possible to map the layer 3 ports this way, or will I need to fall back to layer 3 (IP address) mapping?
If you are connected to the device at layer 2, you could just ping the layer 3 address to trigger an ARP lookup and then look in the ARP cache for the MAC... This would work for any layer 3 port, even logical ports like the layer 3 version of port-channels.
This is probably the easiest way.
If you want to be 100% in the realm of SNMP:
To get the interface table for that device, walk the OID below. It will return the list of all interfaces on that device. This should work on any device (even a server) running an SNMP agent:
.1.3.6.1.2.1.2.2.1.2
This will give you a list of interface numbers (the last digit of each returned OID) and the interface descriptions. It works for SVIs and physical interfaces; I'm not sure about logical types other than SVIs.
Then, for each interface, to get its MAC (where x is the index value from the interface table):
.1.3.6.1.2.1.2.2.1.6.x
This gives you the MAC. (Leading zeros can be truncated on some devices.)
However, you will need at least one layer 3 address on each device to do the snmpwalk and get.
If you just want all the MACs, then walk this OID:
.1.3.6.1.2.1.2.2.1.6
I use this approach to do something similar on a large network.
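As a concrete illustration with the net-snmp command-line tools (the target address, community string, and interface index below are placeholders):
# List all interfaces (ifDescr); the index is the last number of each returned OID
snmpwalk -v2c -c public 192.0.2.1 .1.3.6.1.2.1.2.2.1.2
# Fetch the MAC (ifPhysAddress) of the interface with index 10101
snmpget -v2c -c public 192.0.2.1 .1.3.6.1.2.1.2.2.1.6.10101
# Or grab every MAC in one walk
snmpwalk -v2c -c public 192.0.2.1 .1.3.6.1.2.1.2.2.1.6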
