Offset of a specific listed partition - bash

Building off a question found here:
How to get the offset of a partition with a bash script? (in regard to using awk, bash, and parted for a GPT partition)
Being pretty new to scripting languages, I am not sure if and how to build off the existing request.
I am looking to grab a specific partition listed by the parted command. Specifically, I need the start sector of the ntfs partition for setting an offset in mount within my bash script.
root@workstation:/mnt/ewf2# parted ewf1 unit B p
Model: (file)
Disk /mnt/ewf2/ewf1: 256060514304B
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start        End            Size           File system  Name                          Flags
 1      1048576B     525336575B     524288000B     fat32        EFI system partition          boot
 2      525336576B   659554303B     134217728B                  Microsoft reserved partition  msftres
 3      659554304B   256060162047B  255400607744B  ntfs         Basic data partition          msftdata

Using grep with PCRE (\K discards everything matched so far, so only the start field of the line containing ntfs is printed):
parted ewf1 unit B p | grep -Po "^\s+[^ ]+\s+\K[^ ]+(?=\s.*ntfs)"
Output:
659554304B

awk is your friend for this task:
$ parted ewf1 unit B p |awk '$5=="ntfs"{print $2}'
When the 5th column equals ntfs, print the second one.

This will print the second field of the last line:
parted ewf1 unit B p | awk 'END { print $2 }' # prints 659554304B
or you can search for a line that matches ntfs
parted ewf1 unit B p | awk '/ntfs/ { print $2 }' # prints 659554304B
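Whichever extraction you use, wiring it into mount might look like this minimal sketch (the mount point /mnt/ntfs and the read-only loop options are assumptions, and the trailing B is stripped because mount's offset= expects a plain byte count):
start=$(parted ewf1 unit B p | awk '$5 == "ntfs" {sub(/B$/, "", $2); print $2}')
mount -o ro,loop,offset="$start" ewf1 /mnt/ntfs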

Related

Processing of the data from a big number of input files

My AWK script processes each log file from the folder "${results}", looking for a pattern (a number occurring on the first line of a ranking table) and then printing it on one line together with the filename of the log:
awk '$1=="1"{sub(/.*\//,"",FILENAME); sub(/\.log/,"",FILENAME); printf("%s: %s\n", FILENAME, $2)}' "${results}"/*_rep"${i}".log
Here is the format of each log file, from which the number
-9.14
should be taken
AutoDock Vina v1.2.3
#################################################################
# If you used AutoDock Vina in your work, please cite:          #
#                                                               #
# J. Eberhardt, D. Santos-Martins, A. F. Tillack, and S. Forli  #
# AutoDock Vina 1.2.0: New Docking Methods, Expanded Force      #
# Field, and Python Bindings, J. Chem. Inf. Model. (2021)       #
# DOI 10.1021/acs.jcim.1c00203                                  #
#                                                               #
# O. Trott, A. J. Olson,                                        #
# AutoDock Vina: improving the speed and accuracy of docking    #
# with a new scoring function, efficient optimization and       #
# multithreading, J. Comp. Chem. (2010)                         #
# DOI 10.1002/jcc.21334                                         #
#                                                               #
# Please see https://github.com/ccsb-scripps/AutoDock-Vina for  #
# more information.                                             #
#################################################################
Scoring function : vina
Rigid receptor: /home/gleb/Desktop/dolce_vita/temp/nsp5holoHIE.pdbqt
Ligand: /home/gleb/Desktop/dolce_vita/temp/active2322.pdbqt
Grid center: X 11.106 Y 0.659 Z 18.363
Grid size : X 18 Y 18 Z 18
Grid space : 0.375
Exhaustiveness: 48
CPU: 48
Verbosity: 1
Computing Vina grid ... done.
Performing docking (random seed: -1717804037) ...
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1       -9.14           0          0
   2      -9.109       2.002       2.79
   3      -9.006       1.772      2.315
   4      -8.925           2      2.744
   5      -8.882       3.592      8.189
   6      -8.803       1.564      2.092
   7      -8.507       4.014      7.308
   8       -8.36       2.489      8.193
   9      -8.356       2.529      8.104
  10       -8.33       1.408      3.841
It works OK for a moderate number of input log files (tested for up to 50k logs), but it does not work for a big number of input logs (e.g. 130k), producing the following error:
./dolche_finito.sh: line 124: /usr/bin/awk: Argument list too long
How could I adapt the AWK script to be able to process any number of input logs?
If you get a /usr/bin/awk: Argument list too long then you'll have to control the number of "files" that you supply to awk; the standard way to do that efficiently is:
results=. # ???
i=00001 # ???
output= # ???
find "$results" -type f -name "*_rep$i.log" -exec awk '
FNR == 1 {
filename = FILENAME
sub(/.*\//,"",filename)
sub(/\.[^.]*$/,"",filename)
}
$1 == 1 { printf "%s: %s\n", filename, $2 }
' {} + |
LC_ALL=C sort -t':' -k2,2g > "$results"/ranking_"$output"_rep"$i".csv
edit: appended the rest of the chain as asked in comment
note: you might need to specify other predicates to the find command if you don't want it to search the sub-folders of $results recursively. Also note that -exec ... {} + batches the file names into as many awk invocations as needed to stay under the system limit; that is safe here because the script prints per file and keeps no cross-file state.
Note that your error message:
./dolche_finito.sh: line 124: /usr/bin/awk: Argument list too long
is from your shell interpreting line 124 in your shell script, not from awk - you just happen to be calling awk at that line but it could be any other tool and you'd get the same error. Google ARG_MAX for more information on it.
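For reference, you can inspect the limit on your own system; on Linux this prints the maximum number of bytes allowed for the argument list plus environment of a new process:
getconf ARG_MAX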
Assuming printf is a builtin on your system:
printf '%s\0' "${results}"/*_rep"${i}".log |
xargs -0 awk '...'
or if you need awk to process all input files in one call for some reason and your file names don't contain newlines:
printf '%s\n' "${results}"/*_rep"${i}".log |
awk '
    NR==FNR {
        ARGV[ARGC++] = $0
        next
    }
    ...
' -
If you're using GNU awk or some other awk that can process NUL characters as the RS and your input file names might contain newlines then you could do:
printf '%s\0' "${results}"/*_rep"${i}".log |
awk '
    NR==FNR {
        ARGV[ARGC++] = $0
        next
    }
    ...
' RS='\0' - RS='\n'
The trailing RS='\0' - RS='\n' works because awk evaluates var=value arguments positionally: RS is NUL while the file list is read from stdin (-), then restored to newline before the queued files are processed.
When using GNU AWK you can alter ARGC and ARGV to instruct GNU AWK to read additional files. Consider the following simple example: let the content of filelist.txt be
file1.txt
file2.txt
file3.txt
and let the content of these files be, respectively, uno, dos and tres. Then
awk 'FNR==NR{ARGV[NR+1]=$0;ARGC+=1;next}{print FILENAME,$0}' filelist.txt
gives output
file1.txt uno
file2.txt dos
file3.txt tres
Explanation: when reading the first file, i.e. where the number of the row within the file (FNR) equals the global row number (NR), I add the line to ARGV under the key row-number-plus-one (since ARGV[1] is already filelist.txt) and increase ARGC by 1, then I instruct GNU AWK to go to the next line so no other action is undertaken. For the other files, I print the filename followed by the whole line.
(tested in GNU Awk 5.0.1)
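Tying this back to the original problem, a minimal sketch of the same ARGV technique (assuming the file names contain no newlines): build the list with find first, then let awk queue every listed log:
find "$results" -type f -name "*_rep$i.log" > filelist.txt
awk '
    FNR == NR { ARGV[ARGC++] = $0; next }   # first file: queue each listed log
    FNR == 1  {                             # per log: derive the bare file name
        filename = FILENAME
        sub(/.*\//, "", filename)
        sub(/\.log$/, "", filename)
    }
    $1 == "1" { printf "%s: %s\n", filename, $2 }
' filelist.txt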

Parse output of command and store in variables

I need to parse the output of the mmls command and store multiple values in variables using a BASH script.
Specifically, I need to store the sector size (512 in the example below) and the start values (0, 0, 63, 224910, 240975 in the example below). Since the second set of values represents partitions, the number of values captured could vary.
mmls /mnt/E01Mnt/RAW/ewf1
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors
      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000000062   0000000063   Unallocated
002:  000:000   0000000063   0000224909   0000224847   NTFS / exFAT (0x07)
003:  000:001   0000224910   0000240974   0000016065   DOS FAT12 (0x01)
004:  -------   0000240975   0000250878   0000009904   Unallocated
Here's a start:
$ awk '/^Units/{print $4+0} /^[0-9]/{print $3+0}' file
512
0
0
63
224910
240975
Try to solve the rest yourself and then let us know if you have questions.
Explanation: file is a file containing your sample input. You can replace awk '{script}' file with command | awk '{script}' if your input is coming from the output of some command rather than being stored in a file.
^ is the universal regexp metacharacter for start of string, while /.../ in awk means "find this regexp". So the above is looking for lines that start with the text shown (i.e. Units or digits) and then printing the 4th or 3rd space-separated field after adding zero to it to remove any trailing non-digits or leading zeros. man awk.
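The +0 coercion is worth seeing in isolation: awk converts a string to a number by taking its leading numeric prefix and dropping the rest, so for example
echo "512-byte 0000000063" | awk '{print $1+0, $2+0}'    # prints: 512 63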
You need a bit of awk to start with.
values=( $(mmls /mnt/E01Mnt/RAW/ewf1 | awk '
    /^Units are in/ { match($4, /^[[:digit:]]+/, ss); print ss[0] }
    NR>6            { print $4 }'
) )
Now you have a values array which contains both the sector size (first element) and the start values (subsequent elements). We can do some array manipulation to separate the individual elements.
secsize=${values[0]}   # size of sector
declare -a sv          # sv for start values
for ((i=1; i<${#values[@]}; i++))
do
    sv+=( ${values[i]} )
done
echo "${sv[@]}"        # print start values
unset values           # You don't need values anymore.
Note: Requires GNU awk.
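As an aside, here is a minimal alternative sketch (assuming bash 4+ for mapfile): it avoids the word-splitting pitfalls of the unquoted values=( $(...) ) assignment, and slices off the start values in one step:
mapfile -t values < <(mmls /mnt/E01Mnt/RAW/ewf1 | awk '
    /^Units are in/ { match($4, /^[[:digit:]]+/, ss); print ss[0] }
    NR>6            { print $4 }')
secsize=${values[0]}       # sector size is the first element
sv=( "${values[@]:1}" )    # start values: everything after the first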

bash script to read table line by line

Sample Input: (tab separated values in table format)
Vserver   Volume       Aggregate    State      Type Size       Available  Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
vs1       vol1         aggr1        online     RW   2GB        1.9GB      5%
vs1       vol1_dr      aggr0_dp     online     DP   200GB      160.0GB    20%
vs1       vol2         aggr0        online     RW   150GB      110.3GB    26%
vs1       vol2_dr      aggr0_dp     online     DP   150GB      110.3GB    26%
vs1       vol3         aggr1        online     RW   150GB      120.0GB    20%
I have a task to find the volumes under an aggregate which has breached a threshold, so that they can be moved to a different aggregate.
I need help to read the above table line by line, capture the volumes associated with a specific aggregate name (which will be passed as an argument) and add the size of each volume to a variable (say, total). The next lines should be read as long as total remains less than or equal to the size that should be moved (again, passed as an argument).
Expected output if <aggr1> and <152GB> are passed as arguments
vol1 aggr1 2GB
vol3 aggr1 150GB
You want to read the file line by line, so you can use awk. You pass arguments with the syntax -v aggr=<aggr>, entered on the command line:
awk -f script.awk -v aggr=aggr1 -v total=152 tabfile
here is an awk script:
BEGIN {
    if ((aggr == "") || (total == 0.)) {
        print "no <aggr> or no <total> arg\n"
        print "usage: awk -f script.awk -v aggr=<aggr> -v total=<total> <file_data>"
        exit 1
    }
    sum = 0
}
$0 ~ aggr {
    scurrent = $6; sub("GB", "", scurrent)
    sum += scurrent
    if (sum <= total) print $2 "\t" $3 "\t" $6
    else exit 0
}
The BEGIN block is interpreted once, at the beginning! Here you initialize the sum variable and check the presence of the mandatory arguments; if they are missing, their value is null.
The script then reads the file line by line, processing only the lines that contain the aggr argument.
Each column is referred to with $ followed by its number; the volume size is in column $6.
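One caveat worth noting: $0 ~ aggr matches any line that merely contains the argument, so aggr1 would also match a hypothetical aggr10. A minimal sketch that compares the aggregate column exactly instead (same logic otherwise):
awk -v aggr=aggr1 -v total=152 '
    $3 == aggr {
        s = $6; sub(/GB$/, "", s)        # strip the unit before adding
        if ((sum += s) > total) exit     # stop once the budget is exceeded
        print $2 "\t" $3 "\t" $6
    }' tabfile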

Wondering how to merge these two bash commands in a single efficient one

I have a log file that contains some lines I need to grab:
Jul 2 06:42:00 myhostname error proc[12345]: 01310001:3: event code xxxx Slow transactions attack detected - account id: (20), number of dropped slow transactions: (3)
Jul 2 06:51:00 myhostname error proc[12345]: 01310001:3: event code xxxx Slow transactions attack detected - account id: (20), number of dropped slow transactions: (2)
Account id (xx) gives me the name of an object that I am able to gather through a mysql query.
The following command (for sure not optimized at all, but working) gives me the number of matching lines per account id:
grep "Slow transactions" logfile| awk '{print $18}' | awk -F '[^0-9]+' '{OFS=" ";for(i=1; i<=NF; i++) if ($i != "") print($i)}' | sort | uniq -c
14 20
The output (14 20) means the account id 20 was observed 14 times (14 lines in the logfile).
Then I also have the number of dropped slow transactions: (2) part.
This gives the real number of dropped transactions that were logged. In other words, a single log entry can mean 1 or more dropped transactions.
I do have a small command to count the number of dropped transactions:
grep "Slow transactions" logfile | awk '{print $24}' | sed 's/(//g' | sed 's/)//g' | awk '{s+=$1} END {print s}'
73
That means 73 transactions were dropped.
These two commands work, but when it comes to merging them I am stuck. I really don't see how to combine them; I am pretty sure awk can do it (and probably in a better way than I did), but I would appreciate it if any expert from the community could give me some guidance.
update
Since the above was too easy for some of our awk experts on SO, I introduce an optional feature :)
As previously mentioned, I can convert an account ID into a name by issuing a mysql query. So the idea is now to include the ID => name conversion in the awk command.
The mySQL query looks like this (XX being the account ID):
mysql -Bs -u root -p$(perl -MF5::GenUtils -e "print get_mysql_password.qq{\n}") -e "SELECT name FROM myTABLE where account_id= 'XX'"
I found the post below, which deals with passing command output into awk, but I am facing syntax errors...
How can I pass variables from awk to a shell command?
This uses parentheses as your field separator, so it's easier to grab the account number and the number of slow connections.
awk -F '[()]' '
    /Slow transactions/ {
        acct[$2]++
        dropped[$2] += $4
    }
    END {
        PROCINFO["sorted_in"] = "@ind_num_asc"  # https://www.gnu.org/software/gawk/manual/html_node/Controlling-Scanning.html
        for (acctnum in acct)
            print acctnum, acct[acctnum], dropped[acctnum]
    }
' logfile
Given your sample input, this outputs
20 2 5
Requires GNU awk for the "sorted_in" method of controlling the order of array traversal by index.
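For the optional ID => name conversion, one hedged approach is to keep the lookup outside awk and post-process its output in a shell loop; a minimal sketch reusing the mysql invocation from the question, with $pass standing in for the password retrieval shown earlier:
awk -F '[()]' '
    /Slow transactions/ { acct[$2]++; dropped[$2] += $4 }
    END { for (a in acct) print a, acct[a], dropped[a] }
' logfile |
while read -r id count drops; do
    name=$(mysql -Bs -u root -p"$pass" -e "SELECT name FROM myTABLE WHERE account_id = '$id'")
    printf '%s %s %s\n' "$name" "$count" "$drops"
done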

bash Assign Result Query to Variable

The situation: I have an unknown number of volume groups, with unknown names, and an unknown number of disks assigned to them.
Example :
pvs -o pv_name,vg_name
PV          VG
/dev/vdd    appvg01
/dev/vdb    appvg01
/dev/vdf3   vg00
/dev/vdh    testvg
vgs --noheadings | awk '{print $1}'| while read line ; do echo $line;vgs --noheadings -o pv_name $line; done
appvg01
/dev/vdd
/dev/vdb
testvg
/dev/vdh
vg00
/dev/vdf3
At the final stage, I'd like to mirror each volume with a new disk that I'll add manually:
for i in `/sbin/lvs | /bin/awk '{if ($2 ~ /appvg01/) print $1}'`; do
    /sbin/lvconvert -b -m0 appvg01/$i /dev/vde
done
But I don't know which volume group name I should correlate with, as it might be any other name.
What is the best approach for this structure?
Thanks
The correct data structure to store this kind of information in bash is associative arrays:
declare -A pvs
{
    read                       # skip the header
    while read -r pv vg; do
        pvs[$pv]=$vg
    done
} < <(pvs -o pv_name,vg_name)
Thereafter, you can iterate and do lookups:
for pv in "${!pvs[@]}"; do
    vg="${pvs[$pv]}"
    echo "vg $vg is backed by pv $pv"
done
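If you need the lookup in the other direction (all PVs belonging to a given VG, which is what the lvconvert loop ultimately wants), a minimal sketch that builds the reverse index from the same pvs output:
declare -A vg_pvs
while read -r pv vg; do
    vg_pvs[$vg]+="$pv "        # append each PV to its VG's space-separated list
done < <(pvs --noheadings -o pv_name,vg_name)

for vg in "${!vg_pvs[@]}"; do
    echo "vg $vg: ${vg_pvs[$vg]}"
done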
