Disconnecting and reconnecting nvme - amazon-ec2

Is there capacity within amazon/centos/linux to switch the ordering round of nitro disks?
I have an ami which consistently has devices in the incorrect order, by this I mean nvme1n1 and nvme2n1 should be switched round. If I run nvme id-ctrl -v /dev/nvme1n1 | grep sn I get a different serial number back following a reboot. I know they're "wrong" as the serial numbers are not reflective of their capacity... Hope that makes sense (I appreciate it's a bit confusing). This only ever occurs on servers with two or more disks; upon a reboot the disks are "correct"
My question is, is there a method of forcing the nvme device to disconnect and reconnect (in the hope that the mapping works as expected in the correct order).
Thanks guys

Amazon Linux version 2017.09.01 and later contains scripts and a udev rule that automatically maps NVMe devices to /dev/xvd?. It is very briefly mentioned in the documentation, but there is not much information there.
You can obtain a copy by launching the Amazon Linux AMI, but there are also other places on the web where they have been posted. For example, I found this gist.

Very simple in the end:
echo 1 > /sys/bus/pci/devices/$(readlink -f /sys/class/nvme/nvme1 | awk -F "/" '{print $5}')/remove
echo 1 > /sys/bus/pci/devices/$(readlink -f /sys/class/nvme/nvme2 | awk -F "/" '{print $5}')/remove
echo 1 > /sys/bus/pci/rescan

Related

Unable to download data using Aspera

I am trying to download data from the European Nucleotide Archive (ENA) using Aspera CLI however my downloads are getting stalled. I have downloaded several files earlier using the same tool but this is happening since last one month. I usually use the following command:
ascp -QT -P33001 -k 1 -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp#fasp.sra.ebi.ac.uk:/vol1/fastq/ERR192/009/ERR1924229/ERR1924229.fastq.gz .
From a post on Beta Science, I learnt that this might be due to not limiting the download speed and hence tried usng the -l argument but was of no help.
ascp -QT -l 300m -P33001 -k 1 -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp#fasp.sra.ebi.ac.uk:/vol1/fastq/ERR192/009/ERR1924229/ERR1924229.fastq.gz .
Your command works.
you might be overdriving your local network ?
how much bandwidth do you have ?
here "-l 300m" sets a target rate to 300 Mbps, if you have less than 30, this can cause such problems.
try to reduce the target rate to what you actually have.
(using wired ? Wifi ?)

How to make cpuset.cpu_exclusive function of cpuset work correctly

I'm trying to use the kernel's cpuset to isolate my process. To obtain this, I follow the instructions(2.1 Basic Usage) from kernel doc cpusets, however, it didn't work in my environment.
I have tried in both my centos7 server and my ubuntu16.04 work pc, but neither did work.
centos kernel version:
[root#node ~]# uname -r
3.10.0-327.el7.x86_64
ubuntu kernel version:
4.15.0-46-generic
What I have tried is as follows.
root#Latitude:/sys/fs/cgroup/cpuset# pwd
/sys/fs/cgroup/cpuset
root#Latitude:/sys/fs/cgroup/cpuset# cat cpuset.cpus
0-3
root#Latitude:/sys/fs/cgroup/cpuset# cat cpuset.mems
0
root#Latitude:/sys/fs/cgroup/cpuset# cat cpuset.cpu_exclusive
1
root#Latitude:/sys/fs/cgroup/cpuset# cat cpuset.mem_exclusive
1
root#Latitude:/sys/fs/cgroup/cpuset# find . -name cpuset.cpu_excl
usive | xargs cat
0
0
0
0
0
1
root#Latitude:/sys/fs/cgroup/cpuset# mkdir my_cpuset
root#Latitude:/sys/fs/cgroup/cpuset# echo 1 > my_cpuset/cpuset.cpus
root#Latitude:/sys/fs/cgroup/cpuset# echo 0 > my_cpuset/cpuset.mems
root#Latitude:/sys/fs/cgroup/cpuset# echo 1 > my_cpuset/cpuset.cpu_exclusive
bash: echo: write error: Invalid argument
root#Latitude:/sys/fs/cgroup/cpuset#
It just printed the error bash: echo: write error: Invalid argument.
Google it, however, I can't get the correct answers.
As I pasted above, before my operation, I confirmed that the cpuset root path have enabled the cpu_exclusive function and all the cpus are not been excluded by other sub-cpuset.
By using ps -o pid,psr,comm -p $PID, I can confirm that the cpus can be assigned to some process if I don't care cpu_exclusive. But I have also proved that if cpu_exclusive is not set, the same cpus can also be assigned to another processes.
I don't know if it is because some pre-setting are missed.
What I expected is "using cpuset to obtain exclusive use of cpus". Can anyboy give any clues?
Thanks very much.
i believe it is a mis-understanding of cpu_exclusive flag, as i did. Here is the doc https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt, quoting:
If a cpuset is cpu or mem exclusive, no other cpuset, other than
a direct ancestor or descendant, may share any of the same CPUs or
Memory Nodes.
so one possible reason you have bash: echo: write error: Invalid argument, is that you have some other cgroup cpuset enabled, and it conflicts with your operations of echo 1 > my_cpuset/cpuset.cpu_exclusive
please run find . -name cpuset.cpus | xargs cat to list all your cgroup's target cpus.
assume you have 12 cpus, if you want to set cpu_exclusive of my_cpuset, you need to carefully modify all the other cgroups to use cpus, eg. 0-7, then set cpus of my_cpuset to be 8-11. After all these cpus configurations , you can set cpu_exclusive to be 1.
But still, other process can still use cpu 8-11. Only the tasks that belongs to the other cgroups will not use cpu 8-11
for me, i had some docker container running, which prevents me from setting my cpuset cpu_exclusive
with kernel doc, i do not think it is possible to use cpus exclusively by cgroup itself. One approach (i know this approach is running on production) is that we isolate cpus, and manage the cpu affinity/cpuset by ourselves

Getting log entry "disk online" from system log

When a disk inserted to my cluster, i wanna know that.
So i need to listen /var/adm/messages and when i catch !NEW! "online" line i must write it to a different log file.
When disk goes online I get this kind of log entries:
Dec 8 10:10:46 SMNODE01 genunix: [ID 408114 kern.info] /scsi_vhci/disk#g5000c50095f92a8f (sd69) online
Tail works without -F option. But i need -F option :/
tail messages | grep 408114 | grep '/scsi_vhci/disk#'| egrep -wi --color 'online'
I have 3 uniform words for grep.
1- The id "408114" is unique for online status.
2- /scsi_vhci/disk#
3- online
P.S: Sorry for my english :)
For grep AND use .*:
$ grep 408114.*/scsi_vhci/disk#.*online test
Dec 8 10:10:46 SMNODE01 genunix: [ID 408114 kern.info] /scsi_vhci/disk#g5000c50095f92a8f (sd69) online
Next time don't edit the question completely but ask another question.

Nmap - RTTVAR has grown to over 2.3 seconds, decreasing to 2.0

I have a script that I'm using to build a config for icinga2. The network is large, multiple /13's large. When I run the script I keep getting the RTTVAR has grown to over 2.3 seconds, decreasing to 2.0 error. I've tried raising my gc_thresh and breaking up the subnets. I've dived through the little info from google and can't seem to find a fix. If anyone has any ideas, I'd really appreciate it. I'm on Ubuntu 16.04
My script:
# Find devices and create IP list
i=72
while [ $i -lt 255 ]
do
echo "$(date) - Scanning xx.$i.0.0/16" >> files/scan.log
nmap -sn --host-timeout 5 xx.$i.0.0/16 -oG - | awk '/Up$/{print $2}' >> files/ip-list
let i=i+1
done
My /etc/sysctl.conf
# Force gc to clean-up quickly
net.ipv4.neigh.default.gc_interval = 3600
# Set ARP cache entry timeout
net.ipv4.neigh.default.gc_stale_time = 3600
# Setup DNS threshold for arp
net.ipv4.neigh.default.gc_thresh3 = 8192
net.ipv4.neigh.default.gc_thresh2 = 4096
net.ipv4.neigh.default.gc_thresh1 = 2048
Edit: added host-timeout 5 removed -n
I can suggest you tu use ping scan. If you want an "overall sight" of your network you can use
nmap -sP -n
It decreases the time a little bit comparing to nmap -sn , you can check it with small examples.
As I said in a comment. Use --host-timeout and --max-retries and that will improve your performance.

What could prevent frequently switching default source ip of a machine with several interfaces

The goal was to frequently change default outgoing source ip on a machine with multiple interfaces and live ips.
I used ip route replace default as per its documentation and let a script run in loop for some interval. It changes source ip fine for a while but then all internet access to the machine is lost. It has to be remotely rebooted from a web interface to have any thing working
Is there any thing that could possibly prevent this from working stably. I have tried this on more than one servers?
Following is a minimum example
# extract all currently active source ips except loopback
IPs="$(ifconfig | grep 'inet addr:'| grep -v '127.0.0.1' | cut -d: -f2 |
awk '{ print $1}')"
read -a ip_arr <<<$IPs
# extract all currently active mac / ethernet addresses
Int="$(ifconfig | grep 'eth'| grep -v 'lo' | awk '{print $1}')"
read -a eth_arr <<<$Int
ip_len=${#ip_arr[#]}
eth_len=${#eth_arr[#]}
i=0;
e=0;
while(true); do
#ip route replace 0.0.0.0 dev eth0:1 src 192.168.1.18
route_cmd="ip route replace 0.0.0.0 dev ${eth_arr[e]} src ${ip_arr[i]}"
echo $route_cmd
eval $route_cmd
sleep 300
(i++)
(e++)
if [ $i -eq $ip_len ]; then
i=0;
e=0;
echo "all ips exhausted - starting from first again"
# break;
fi
done
I wanted to comment, but as I'm not having enough points, it won't let me.
Consider:
Does varying the delay time before its run again change the number of iterations before it fails?
Exporting the ifconfig & routes every time you change it, to see if something is meaningfully different over time. Maybe some basic tests to it (ping, nslookup, etc) Basically, find what is exactly going wrong with it. Also exporting the commands you send to a logfile (text file per change?) to see changes in them to see if some is different after x iterations.
What connectivity is lost? Incoming? Outgoing? Specific applications?
You say you use/do this on other servers without problems?
Are the IP's: Static (/etc/network/interfaces), bootp/DHCP, semi-static (bootp/DHCP server serving, based on MAC address), and if served by bootp/DHCP, what is the lease duration?
On the last remark:
bootp/dhcp will give IP's for x duration. say its 60 minutes. After half that time it will "check" with the bootp/dhcp server if it can keep the IP, and extend the lease to 60 minutes again, this can mean a small reconfig on the ifconfig (maybe even at the same time of your script?).
hth

Resources