bash Assign Result Query to Variable - bash

I have an unknown number of volume groups, whose names I don't know in advance, each with an unknown number of disks assigned to it.
Example:
pvs -o pv_name,vg_name
PV VG
/dev/vdd appvg01
/dev/vdb appvg01
/dev/vdf3 vg00
/dev/vdh testvg
vgs --noheadings | awk '{print $1}'| while read line ; do echo $line;vgs --noheadings -o pv_name $line; done
appvg01
/dev/vdd
/dev/vdb
testvg
/dev/vdh
vg00
/dev/vdf3
At the final stage I'd like to mirror each volume onto a new disk that I'll add manually:
for i in `/sbin/lvs| /bin/awk '{if ($2 ~ /appvg01/) print $1}'`; do
/sbin/lvconvert -b -m0 appvg01/$i /dev/vde
done
But I don't know which volume group name I should correlate with, as it might be any other name.
What is the best approach for this structure?
Thanks

The correct data structure to store this kind of information in bash is associative arrays:
declare -A pvs
{
read # skip the header
while read -r pv vg; do
pvs[$pv]=$vg
done
} < <(pvs -o pv_name,vg_name)
Thereafter, you can iterate and do lookups:
for pv in "${!pvs[@]}"; do
vg="${pvs[$pv]}"
echo "vg $vg is backed by pv $pv"
done
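Since the goal in the question is per volume group rather than per disk, it may be more convenient to key the array by VG and collect its PVs. The following is a minimal sketch, assuming bash 4 or later. The function name group_pvs_by_vg is made up for illustration, and the sample input is copied from the question so that the snippet runs without LVM installed:

```shell
#!/usr/bin/env bash
# Sketch: group PVs by VG from `pvs -o pv_name,vg_name` style input on stdin.
group_pvs_by_vg() {
    local -A vg_pvs=()
    local pv vg
    read -r _                              # skip the header line
    while read -r pv vg; do
        [[ -n $vg ]] || continue           # PV not assigned to any VG
        vg_pvs[$vg]+="${vg_pvs[$vg]:+ }$pv"   # append, space-separated
    done
    for vg in "${!vg_pvs[@]}"; do
        printf '%s: %s\n' "$vg" "${vg_pvs[$vg]}"
    done | sort
}

# Sample input copied from the question.
sample='PV VG
/dev/vdd appvg01
/dev/vdb appvg01
/dev/vdf3 vg00
/dev/vdh testvg'
grouped=$(printf '%s\n' "$sample" | group_pvs_by_vg)
printf '%s\n' "$grouped"
```

On a real host the input would presumably come from pvs -o pv_name,vg_name piped into the function, after which each VG name can drive the lvconvert loop from the question.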

Related

Add device name and size to shell selection script

I am writing a shell script to install Arch Linux. Currently I have a very basic disk selection, but it does not give enough information. How can I add extra information, such as the disk name and size, while keeping the $DISK variable the same (i.e. /dev/nvme0n1)?
select_disk () {
count=0
for device in `lsblk -d | tail -n+2 | cut -d" " -f1`; do
count=$((count+1))
dev[$count]=$device
printf '%s: %s\n' "$count" "$device"
done
read -rp "Select disk (numbers 1-$count): " selection
DISK="/dev/${dev[$selection]}"
echo "$DISK" > /tmp/disk
echo "Installing on $DISK."
}
You could use the columns format lsblk offers using the -o flag. Also, you can avoid the tail (which I believe is to remove the header) by using the lsblk flag -n which does exactly that. Something like this:
select_disk () {
count=0
# by default the 'for' loop splits on any whitespace; redefine IFS
# (locally, so the caller is not affected) to split on newlines only
local IFS=$'\n'
for device_info in `lsblk -d -n -o NAME,TYPE,SIZE`; do
count=$((count+1))
device_name=$(echo $device_info | cut -d" " -f1)
dev[$count]=$device_name
printf '%s: %s\n' "$count" "$device_info"
done
read -rp "Select disk (numbers 1-$count): " selection
DISK="/dev/${dev[$selection]}"
echo "$DISK" > /tmp/disk
echo "Installing on $DISK."
}
Only NAME, TYPE and SIZE are shown here, but you can add any columns you want (separated by commas). See lsblk -h (the help) for all possible options.
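An alternative that avoids touching IFS and calling cut is to let read split each line into fields. This is only a sketch: build_menu is a hypothetical helper name, the two sample lines are invented, and the selection is hard-coded where the real script would prompt with read -rp:

```shell
#!/usr/bin/env bash
# Sketch: build a numbered menu from `lsblk -d -n -o NAME,TYPE,SIZE` style lines.
dev=()
build_menu() {
    local name type size
    dev=()
    while read -r name type size; do
        dev+=("$name")                     # keep only the bare name for $DISK
        printf '%d: %s (%s, %s)\n' "${#dev[@]}" "$name" "$type" "$size"
    done
}

# Invented sample lines; on a real system the input would be
#   build_menu < <(lsblk -d -n -o NAME,TYPE,SIZE)
build_menu <<'EOF'
sda     disk 500G
nvme0n1 disk 1T
EOF

selection=2                                # would normally come from `read -rp`
DISK="/dev/${dev[selection-1]}"            # array is 0-based, menu is 1-based
echo "Installing on $DISK."
```

The menu shows the extra columns, while $DISK is still built from the name alone.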
I'm not sure what you mean by 'disk name', but...
Let's assume that, in your for ... do ... done loop, $device is your mapped device name (e.g. sda).
You can get the following information:
1) The device label (if one exists):
lsblk "/dev/${device}" -dn -o LABEL
2) The device vendor:
lsblk "/dev/${device}" -dn -o VENDOR
3) The device size (in human-readable format):
lsblk "/dev/${device}" -dn -o SIZE
and several other pieces of information, such as:
4) The device type:
lsblk "/dev/${device}" -dn -o TYPE
5) The hardware block sizes:
lsblk "/dev/${device}" -dn -o PHY-SEC,LOG-SEC
6) The Linux owner and group assigned to this object:
lsblk "/dev/${device}" -dn -o OWNER,GROUP
Or any of the columns in the list below:
NAME KNAME MAJ:MIN FSTYPE MOUNTPOINT LABEL UUID PARTTYPE PARTLABEL PARTUUID PARTFLAGS RA RO RM HOTPLUG MODEL SERIAL SIZE STATE OWNER GROUP MODE ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE TYPE DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO WSAME WWN RAND PKNAME HCTL TRAN SUBSYSTEMS REV VENDOR ZONED
Hope it helps.

How to loop through a list and pass as variable in bash / awk

Updated question:
I have a config.file in which I define a few variables that are ultimately called in a different script.
$cat config.file
#1 Accession number ref
ref=L41223.2
#2 Accession number SRA
SRA=SRA7361534
#3 Path to SRA
path_SRA='/Volumes/5TB/sra/'
#4 Path to ref
path_ref='/Volumes/5TB/results/species1/'
Variable #3 (the path to SRA) is constant and never changes. For the other variables ($ref, $SRA and $path_ref), I would like to read them one by one from different fields of an input.file:
$cat input.file
species1 L41223.2 SRA7361534
species2 D45023.5 SRA9473231
species3 L42823.6 SRA0918881
...
All these variables are called several times in a script.sh:
#!/bin/bash
# Path to the configuration file
. /Users/Main/config.file
# Use NCBI's e-utilities to download reference files
esearch -db nucleotide -query $ref | efetch -format fasta > $path_ref$ref.fasta
# Using NCBI's sratoolkit to download SRA file
prefetch $SRA
cd $path_SRA
mv *.sra $path_ref
# Decompress the SRA file
cd $path_ref; if fastq-dump --split-3 $SRA.sra ; then
echo "SRA file successfully decompressed. Deleting the SRA file now..."
rm $SRA.sra
else
echo "Could not decompress SRA file"
fi
# Use bwa to align DNA reads to the reference sequence
cd $path_ref;
bwa index -p INDEX $ref.fasta
bwa aln -t $core INDEX *_1.fastq > 1.sai
bwa aln -t $core INDEX *_2.fastq > 2.sai
bwa sampe INDEX 1.sai 2.sai *_1.fastq *_2.fastq | samtools view -hq 5 > $SRA.Q5.sam
# Use samtools for conversion
samtools view -bT $ref.fasta $SRA.Q5.sam > $SRA.Q5.bam
samtools sort $SRA.Q5.bam -o $SRA.sorted
# use bedtools for coverage
bedtools genomecov -d -ibam $SRA.sorted.bam > $SRA.gencov.txt
# use awk for extraction
awk '$2 ~ /81|161|97|145/ {print $0}' $SRA.Q5.sam > $SRA.OTW.sam
samtools view -bT $ref.fasta $SRA.OTW.sam > $SRA.OTW.bam
samtools sort $SRA.OTW.bam -o $SRA.OTW.sorted.bam
# Extract FLAG, POS, CIGAR and TLEN for outward-oriented reads
awk '$2 ~ /81|161|97|145/ {print $2, $4, $6, $9}' $SRA.Q5.sam > $SRA.OTW.txt
# Get per-base coverage for outward-oriented reads
bedtools genomecov -d -ibam $SRA.OTW.sorted.bam > $SRA.OTW.gencoverage.txt
# Simplify the output by averaging read coverage over 50 bp window; prints the average count value and last genomic position
awk '{sum+=$3; count++} FNR % 50 == 0 {print $2, (sum/count); count=sum = ""}' $SRA.OTW.gencoverage.txt > $SRA.OTW.50sum.txt
#### End of the script
What I would like to do is "read" from the input.file into the config.file. The first field (species1...) would be used as input for $path_ref, field 2 (L41223.2...) would be used as input for $ref and third field (SRA7361534...) would be used as input for $SRA variable. Once the first round (basically the first line) has been done, the script.sh would run again and read fields 1,2 and 3 from the line 2 and so on. Basically, a loop, but somewhat more complicated than the answer below because different variables are called at different places in the script.
This works fine for one variable, however I couldn't implement it with three different variables called throughout the script:
while read -r c1 c2 c3; do
bwa index -p INDEX ${c2}.fasta
# place rest of your script here
done < input.file
Many thanks in advance.
In script.sh, after the line . /Users/Main/config.file, add these lines:
number_of_inputs=$(wc -l < input.file)
for (( i=1 ; i <= number_of_inputs ; i++ )); do
# extract the fields from line $i: field 1 is the species, field 2 the ref, field 3 the SRA
species=$(awk "NR==$i{print \$1}" input.file)
ref=$(awk "NR==$i{print \$2}" input.file)
SRA=$(awk "NR==$i{print \$3}" input.file)
# assumes the results directory layout shown in config.file
path_ref="/Volumes/5TB/results/$species/"
Then add a done at the end of the file, so that the whole script loops over the lines of input.file, setting the variables from each line in turn. Because script.sh changes directory with cd, refer to input.file by an absolute path.
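The same loop can also be written with a single while read, which splits all three fields in one pass instead of running awk three times per line. The sketch below makes some assumptions: path_results is a made-up variable holding the base directory from the question's path_ref, and process_line is a stand-in for the body of script.sh that only prints the variables:

```shell
#!/usr/bin/env bash
# Sketch: read species, ref and SRA from each line in one pass.
path_results='/Volumes/5TB/results/'       # assumed base directory (from the question)

process_line() {
    # Stand-in for the body of script.sh; here it only reports the variables.
    printf 'ref=%s SRA=%s path_ref=%s\n' "$ref" "$SRA" "$path_ref"
}

run_all() {
    local species
    while read -r species ref SRA; do
        [[ -n $species ]] || continue      # skip blank lines
        path_ref="${path_results}${species}/"
        process_line
    done
}

# Sample lines copied from input.file in the question.
sample='species1 L41223.2 SRA7361534
species2 D45023.5 SRA9473231'
out=$(printf '%s\n' "$sample" | run_all)
printf '%s\n' "$out"
```

In the real script the call would be run_all < /path/to/input.file. The file is opened once, before the loop starts, so later cd calls in the body do not affect it.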

How to use parameter in conf file for disk usage?

I have a disk script like this:
#!/bin/bash
filesys=(
/
)
[ -f "$(pwd)/filesys.conf" ] && filesys=($(<$(pwd)/filesys.conf))
date=$(date +"%d\/%m\/%Y")
df -P "${filesys[@]}" |
sed -ne 's/^.* \([0-9]\+\)% \(.*\)$/'$date', \2, \1%/p' > disk.log
I have a filesys.conf that lists the filesystems to check:
/
/run
And this is output (disk.log):
23/05/2016, /, 78%
23/05/2016, /run, 0%
Question:
I need filesys.conf because the server's filesystems change all the time, and a conf file is easy for me to maintain. But I also need to add a usage threshold to filesys.conf, like this:
/,90
/run,99
If usage of / is greater than 90, or usage of /run is greater than 99, write it to the log file.
How can I do this?
Let's create a bash associative array from the file filesys.conf, of the form assoc[file system]=threshold. This way, we can loop over it and compare against the output of df -P.
Given the file:
$ cat filesys.conf
/,90
/dev,25
We store it in the array with (source: Bash reading from a file to an associative array):
declare -A assoc
while IFS=, read -r -a array;
do
assoc[${array[0]}]=${array[1]}
done < filesys.conf
So now the values are:
$ for k in "${!assoc[@]}"; do echo "${k} --> ${assoc[$k]}"; done
/dev --> 25
/ --> 90
Now it is just a matter of processing the output of df with, for example, awk:
mydate=$(date "+%d/%m/%Y")
for k in "${!assoc[@]}"
do
df -P "${k}" | awk 'NR==2 && $(NF-1)+0 > limit {print date, fs, $(NF-1)}' OFS=, fs="$k" date="$mydate" limit="${assoc[$k]}"
done >> disk.log
This pipes df to awk. There, we check the second line (the first one is the header, which apparently cannot be suppressed; man df does not mention an option for it) and, on that line, the value of the penultimate column. If it is bigger than the given threshold, we print the desired output.
Note the trick of $(NF-1)+0 to cast the Use% column to int: the format 13% gets converted into 13.
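If calling df once per file system is a concern, the thresholds can instead be loaded inside awk and compared against a single df -P listing. This is a sketch under assumptions, not a drop-in replacement: check_usage is a hypothetical function name, the df lines are invented sample data, and it assumes a non-empty conf file and mount points without spaces or commas:

```shell
#!/usr/bin/env bash
# Sketch: one awk pass over df -P style output, thresholds loaded from a conf file.
check_usage() {   # usage: check_usage CONF_FILE DATE < df_output
    awk -v date="$2" -v OFS=', ' '
        # first file (the conf): remember the threshold for each mount point
        NR == FNR { split($0, kv, ","); limit[kv[1]] = kv[2]; next }
        # df output: skip the header, then compare Capacity against the threshold
        FNR > 1 && ($NF in limit) && $(NF-1) + 0 > limit[$NF] + 0 {
            print date, $NF, $(NF-1)
        }
    ' "$1" -
}

conf=$(mktemp)
printf '%s\n' '/,90' '/run,99' > "$conf"
# Invented df -P style sample.
df_sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda1 1000 950 50 95% /
tmpfs 1000 10 990 1% /run'
report=$(printf '%s\n' "$df_sample" | check_usage "$conf" '23/05/2016')
rm -f "$conf"
printf '%s\n' "$report"
```

With real data the call would look like df -P | check_usage filesys.conf "$(date +%d/%m/%Y)" >> disk.log.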

Bash error: Integer expression expected

In the sections below, you'll see the shell script I am trying to run on a UNIX machine, along with a transcript.
When I run this program, it gives the expected output but it also gives an error shown in the transcript. What could be the problem and how can I fix it?
First, the script:
#!/usr/bin/bash
while read A B C D E F
do
E=`echo $E | cut -f 1 -d "%"`
if test $# -eq 2
then
I=`echo $2`
else
I=90
fi
if test $E -ge $I
then
echo $F
fi
done
And the transcript of running it:
$ df -k | ./filter.sh -c 50
./filter.sh: line 12: test: capacity: integer expression expected
/etc/svc/volatile
/var/run
/home/ug
/home/pg
/home/staff/t
/packages/turnin
$ _
Before the line that says:
if test $E -ge $I
temporarily place the line:
echo "[$E]"
and you'll find something very much non-numeric, and that's because the output of df -k looks like this:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdb1 954316620 212723892 693109608 24% /
udev 10240 0 10240 0% /dev
: :
The offending line there is the first, which will have its fifth field Use% turned into Use (or, on your system, capacity, as the error message shows), which is definitely not an integer.
A quick fix may be to change your usage to something like:
df -k | sed -n '2,$p' | ./filter.sh -c 50
or:
df -k | tail -n+2 | ./filter.sh -c 50
Either of those extra filters (sed or tail) will print only from line 2 onwards.
If you're open to not needing a special script at all, you could probably just get away with something like:
df -k | awk -vlimit=40 '$5+0>=limit&&NR>1{print $5" "$6}'
The way it works is to only operate on lines where both:
the fifth field, converted to a number, is at least equal to the limit passed in with -v; and
the record number (line) is two or greater.
Then it simply outputs the relevant information for those matching lines.
This particular example outputs the file system and usage (as a percentage like 42%) but, if you just want the file system as per your script, just change the print to output $6 on its own: {print $6}.
Alternatively, if you do want the percentage but without the %, you can use the same method I used in the conditional: {print $5+0" "$6}.
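If you would rather keep a script, the header can also be skipped inside the loop by testing that the capacity field is numeric before comparing it. The sketch below simplifies the argument handling (the limit is the first argument rather than the value after -c), and the function name filter is chosen just for the example:

```shell
#!/usr/bin/env bash
# Sketch: print mount points at or above a limit, ignoring non-numeric lines.
filter() {   # usage: filter [LIMIT] < df_output
    local limit=${1:-90} fs blocks used avail cap mount
    while read -r fs blocks used avail cap mount; do
        cap=${cap%\%}                        # strip the trailing % without cut
        [[ $cap =~ ^[0-9]+$ ]] || continue   # header (Use, capacity, ...) is skipped
        (( cap >= limit )) && printf '%s\n' "$mount"
    done
    return 0
}

# Sample df -k output copied from the answer above.
df_sample='Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdb1 954316620 212723892 693109608 24% /
udev 10240 0 10240 0% /dev'
hits=$(printf '%s\n' "$df_sample" | filter 20)
printf '%s\n' "$hits"
```

Because the numeric check happens per line, the same loop works whatever the header's fifth word is.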

bash awk file compare

I have a config file:
[LogicalUnit1] UnitInquiry "NFSN00Y5IP51ZL" LUN0 /mnt/extent0 64MB
[LogicalUnit2] UnitInquiry "NFSN00N49CQL28" LUN0 /mnt/extent1 64MB
[LogicalUnit3] UnitInquiry "NFSNBRGQOCXK" LUN0 /mnt/extent4 10MB
[LogicalUnit4] UnitInquiry "NFSNE7IXADFJ" LUN0 /mnt/extent5 25MB
which is read by a bash script. Using awk, I parse the file and get the variables:
awk '/UnitInquiry/ {print $1, $3, $5, $6}' $ctld_config | while read a b c d ; do
if [ -f $a ]
then
ctladm create -b block -o file=$c -S $b -d $a
ctladm devlist -v > $lun_config
else
truncate -s $d $c ; ctladm create -b block -o file=$c -S $b -d $a
fi
done
This initializes the LUNs properly on boot. However, if I add a LUN, it recreates them all again. How can I compare what is running with what is configured, and only initialize the ones that are not already live? There is a command to list the devices:
ctladm devlist -v
LUN Backend Size (Blocks) BS Serial Number Device ID
0 block 131072 512 "NFSN00Y5IP51ZL [LogicalUnit1]
lun_type=0
num_threads=14
file=/mnt/extent0
1 block 131072 512 "NFSN00N49CQL28 [LogicalUnit2]
lun_type=0
num_threads=14
file=/mnt/extent1
2 block 20480 512 "NFSNBRGQOCXK" [LogicalUnit3]
lun_type=0
num_threads=14
file=/mnt/extent4
3 block 51200 512 "NFSNE7IXADFJ" [LogicalUnit4]
lun_type=0
num_threads=14
file=/mnt/extent5
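Why not add the following at the top of the loop body, before the if:
ctladm devlist -v | grep -qF -- "$a" && continue
This will:
1) run the command that shows the currently active devices;
2) check whether the LogicalUnit name you want to register is already listed and, if so,
3) skip the rest of the loop body.
The -F matters here: without it, grep would treat [LogicalUnit1] as a bracket expression matching any one of those characters rather than as a literal string.
If $a (the logical unit name) is not unique enough, you can also grep for another, more unique identifier, e.g. the serial number.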
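To avoid running ctladm once per configured unit, the device list could be saved once and each unit checked against that snapshot. The following sketch assumes the device ID appears in the listing exactly as it does in the config. missing_units is a hypothetical helper, LogicalUnit5 is an invented entry, and temporary files stand in for the real config and for the output of ctladm devlist -v:

```shell
#!/usr/bin/env bash
# Sketch: list configured units that are absent from a saved devlist snapshot.
missing_units() {   # usage: missing_units CONFIG_FILE DEVLIST_FILE
    local unit rest
    while read -r unit rest; do
        [[ $rest == UnitInquiry* ]] || continue          # only UnitInquiry lines
        grep -qF -- "$unit" "$2" || printf '%s\n' "$unit"  # literal match, not a regex
    done < "$1"
}

config=$(mktemp)
devlist=$(mktemp)
cat > "$config" <<'EOF'
[LogicalUnit1] UnitInquiry "NFSN00Y5IP51ZL" LUN0 /mnt/extent0 64MB
[LogicalUnit2] UnitInquiry "NFSN00N49CQL28" LUN0 /mnt/extent1 64MB
[LogicalUnit5] UnitInquiry "NFSNHYPOTHETICAL" LUN0 /mnt/extent9 10MB
EOF
cat > "$devlist" <<'EOF'
0 block 131072 512 "NFSN00Y5IP51ZL" [LogicalUnit1]
1 block 131072 512 "NFSN00N49CQL28" [LogicalUnit2]
EOF
missing=$(missing_units "$config" "$devlist")
rm -f "$config" "$devlist"
printf '%s\n' "$missing"
```

Only the units printed by the function would then need the truncate and ctladm create steps.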
