Grab value from text & assign it to a variable - shell

I am working on a script that reads battery information from a system file.
I simply need to grab the total battery capacity (3000) from the line MAX_IBAT(mA): 3000; and put it into a variable.
This is the content of the file I am reading from:
charging_source: NONE;
charging_enabled: 0;
overload: 0;
Percentage(%): 50;
Percentage_raw(%): 50;
gs_cable_impedance: 0
gs_R_cable_impedance: 0
gs_aicl_result: 0
batt_cycle_first_use: 2017/01/01/12:00:06
batt_cycle_level_raw: 26157;
batt_cycle_overheat(s): 0;
htc_extension: 0x0;
usb_overheat_state: 0;
USB_PWR_TEMP(degree): 304;
ISEN_VALUE_ADC: 228;
ISEN_VALUE: 0;
SOC(%): 27;
VBAT(mV): 3707;
IBAT(mA): 383;
IUSB(mA): 0;
MAX_IBAT(mA): 3000;
MAX_IUSB(mA): 0;
AICL_RESULT: 0
VBUS(uV): 0;
BATT_TEMP: 320;
HEALTH: 1;
BATT_PRESENT(bool): 1;
CHARGE_TYPE: 1;
CHARGE_DONE: 0;
USB_PRESENT: 0;
USB_ONLINE: 0;
CHARGER_TEMP: -1;
CHARGER_TEMP_MAX: 803;
CC_uAh: 889648;
USB_CMD_IL_REG: 0x00;
USBIN_CURRENT_LIMIT_CFG: 0x14;
USBIN_AICL_OPTIONS_CFG: 0xc4;
FAST_CHARGE_CURRENT_CFG: 0x78;
FG_BCL_LMH_STS1: 0x00;
What I have tried:
awk '/^ +MAX_IBAT(mA): && $NF!=0{print $NF} Input_file

This uses ": " and ";" as field separators:
max_ibat=$(awk -F ': |;' '$1=="MAX_IBAT(mA)" {print $2}' file)
echo "$max_ibat"
Output:
3000

grep solution:
-o print only the matched part, not the whole line
-P Perl-compatible regex (PCRE) mode
(?<=MAX_IBAT\(mA\):\s) lookbehind assertion so that only digits preceded by the string 'MAX_IBAT(mA): ' are printed
command:
max_ibat=$(grep -oP '(?<=MAX_IBAT\(mA\):\s)\d+' Input_file)
echo "$max_ibat"
Output:
3000
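If your grep does not support -P (PCRE support is a GNU grep feature), a similar result can be had with plain ERE in two passes; this is only a sketch, reusing the same Input_file name as above:
max_ibat=$(grep '^MAX_IBAT(mA):' Input_file | grep -oE '[0-9]+')
echo "$max_ibat"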
sed solution:
-n silent mode -> do not print by default
/^MAX_IBAT(mA):/ process only lines that start with MAX_IBAT(mA):
s/[^0-9]//g delete every character that is not a digit, then print the result with the p flag.
command:
max_ibat=$(sed -n '/^MAX_IBAT(mA):/s/[^0-9]//gp' Input_file)
echo "$max_ibat"
Output:
3000
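
A pure-bash alternative (no external tools) is also possible; the following is only a sketch, again assuming the data is in Input_file:
#!/bin/bash
# Sketch: scan the file line by line and strip everything around the number.
while IFS= read -r line; do
    case $line in
        "MAX_IBAT(mA):"*)
            max_ibat=${line#*: }    # drop the "MAX_IBAT(mA): " prefix
            max_ibat=${max_ibat%;}  # drop the trailing semicolon
            break
            ;;
    esac
done < Input_file
echo "$max_ibat"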

Related

bash read from CSV multiple columns with hash key

I tried to read a CSV file vertically, as follows, in order to insert the data into a graphite/carbon DB.
"No.","time","00:00:00","00:00:01","00:00:02","00:00:03","00:00:04","00:00:05","00:00:06","00:00:07","00:00:08","00:00:09","00:00:0A"
"1","2021/09/12 02:16",235,610,345,997,446,130,129,94,555,274,4
"2","2021/09/12 02:17",364,210,371,341,294,87,179,106,425,262,3
"3","2021/09/12 02:18",297,343,860,216,275,81,73,113,566,274,3
"4","2021/09/12 02:19",305,243,448,262,387,64,63,119,633,249,3
"5","2021/09/12 02:20",276,151,164,263,315,86,92,175,591,291,1
"6","2021/09/12 02:21",264,343,287,542,312,83,72,122,630,273,4
"7","2021/09/12 02:22",373,157,266,446,246,90,173,90,442,273,2
"8","2021/09/12 02:23",265,112,241,307,329,64,71,82,515,260,3
"9","2021/09/12 02:24",285,247,240,372,176,92,67,83,609,620,1
"10","2021/09/12 02:25",289,964,277,476,356,84,74,104,560,294,1
"11","2021/09/12 02:26",279,747,227,573,569,82,77,99,589,229,5
"12","2021/09/12 02:27",338,370,315,439,653,85,165,346,367,281,2
"13","2021/09/12 02:28",269,135,372,262,307,73,86,93,512,283,4
"14","2021/09/12 02:29",281,207,688,322,233,75,69,85,663,276,2
...
I wish to generate commands for each column header 00:00:XX, taking into account the time in column $2 and the value recorded at that time:
echo "perf.$type.$serial.$object.00:00:00.TOTAL_IOPS" "235" "epoch time (2021/09/12 02:16)" | nc "localhost" "2004"
echo "perf.$type.$serial.$object.00:00:00.TOTAL_IOPS" "364" "epoch time (2021/09/12 02:17)" | nc "localhost" "2004"
...
echo "perf.$type.$serial.$object.00:00:01.TOTAL_IOPS" "610" "epoch time (2021/09/12 02:16)" | nc "localhost" "2004"
echo "perf.$type.$serial.$object.00:00:01.TOTAL_IOPS" "210" "epoch time (2021/09/12 02:17)" | nc "localhost" "2004"
.. etc..
I don't know where to start; I tried with awk without success.
Trial1: awk -F "," 'BEGIN{FS=","}NR==1{for(i=1;i<=NF;i++) header[i]=$i}{for(i=1;i<=NF;i++) { print header[i] } }' file.csv
Trial2: awk '{time=$2; for(i=3;i<=NF;i++){time=time" "$i}; print time}' file.csv
Many thanks for any help.
In plain bash:
#!/bin/bash
{
    IFS=',' read -ra header
    header=("${header[@]//\"}")
    nf=${#header[@]}
    row_nr=0
    while IFS=',' read -ra flds; do
        datetime[row_nr++]=$(date -d "${flds[1]//\"}" '+%s')
        for ((i = 2; i < nf; ++i)); do
            col[i]+=" ${flds[i]}"
        done
    done
} < file

for ((i = 2; i < nf; ++i)); do
    v=(${col[i]})
    for ((j = 0; j < row_nr; ++j)); do
        printf 'echo "perf.$type.$serial.$object.%s.TOTAL_IOPS" "%s" "epoch time (%s)" | nc "localhost" "2004"\n' \
            "${header[i]}" "${v[j]}" "${datetime[j]}"
    done
done
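Note that the printf format string is single-quoted, so $type, $serial and $object are emitted literally; they only have to exist in the shell that eventually runs the generated commands. A hypothetical usage (gen_cmds.sh is just an example name for the script above, which reads from the file named "file"):
export type=storage serial=SN001 object=vol1   # example values
bash gen_cmds.sh | bash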
Would you please try the following:
awk -F, '
NR==1 { # process the header line
for (i = 3; i <= NF; i++) {
gsub(/"/, "", $i) # remove double quotes
tt[i-2] = $i # assign time array
}
next
}
{ # process the body
gsub(/"/, "", $0)
dt[NR - 1] = $2 # assign datetime array
for (i = 3; i <= NF; i++) {
key[NR-1, i-2] = $i # assign key values
}
}
END {
for (i = 1; i <= NF - 2; i++) {
for (j = 1; j <= NR - 1; j++) {
printf "echo \"perf.$type.$serial.$object.%s.TOTAL_IOPS\" \"%d\" \"epoch time (%s)\" | nc \"localhost\" \"2004\"\n", tt[i], key[j, i], dt[j]
}
}
}
' file.csv
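
If GNU awk (gawk) is available, the datetime can also be converted to epoch seconds inside awk with mktime(); the helper below is an untested sketch that assumes the "YYYY/MM/DD HH:MM" layout shown in the sample data:
# gawk-only sketch: turn "2021/09/12 02:16" into epoch seconds
function to_epoch(d,    s) {
    s = d
    gsub(/\//, " ", s)       # "2021 09 12 02:16"
    sub(/:/, " ", s)         # "2021 09 12 02 16"
    return mktime(s " 00")   # mktime() expects "YYYY MM DD HH MM SS"
}
With that helper in place, dt[NR - 1] = to_epoch($2) would store epoch seconds instead of the raw string.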

How to get n random "paragraphs" (groups of ordered lines) from a file

I have a file (originally compressed) with a known structure: every 4 lines form an ordered group, and the first line of each group starts with the character "#". I want to randomly select n groups (half of them) in the most efficient way (preferably in bash or another Unix tool).
My suggestion in python is:
path = "origin.txt.gz"
unzipped_path = "origin_unzipped.txt"
new_path = "/home/labs/amit/diklag/subset.txt"
subprocess.getoutput("""gunzip -c %s > %s """ % (path, unzipped_path))
with open(unzipped_path) as f:
lines = f.readlines()
subset_size = round((len(lines)/4) * 0.5)
l = random.sample(list(range(0, len(lines), 4)),subset_size)
selected_lines = [line for i in l for line in list(range(i,i+4))]
new_lines = [lines[i] for i in selected_lines]
with open(new_path,'w+') as f2:
f2.writelines(new_lines)
Can you help me find another (and faster) way to do it? Right now it takes ~10 seconds to run this code.
The following scripts might be helpful. They are, however, untested as we do not have an example file:
attempt 1 (awk and shuf) :
#!/usr/bin/env bash
count=30
path="origin.txt.gz"
new_path="subset.txt"
nrec=$(gunzip -c "$path" | awk '/^#/{c++} END{print c}')
awk '(NR==FNR){a[$1]=1;next}
!/^#/{next}
((++c) in a) { print; for(i=1;i<=3;i++) { getline; print } }' \
<(shuf -i 1-$nrec -n $count) <(gunzip -c $path) > $new_path
attempt 2 (sed and shuf) :
#!/usr/bin/env bash
count=30
path="origin.txt.gz"
new_path="subset.txt"
gunzip -c $path | sed ':a;N;$!ba;s/\n/__END_LINE__/g;s/__END_LINE__#/\n#/g' \
| shuf -n $count | sed 's/__END_LINE__/\n/g' > $new_path
In this example, the sed command first replaces every newline with the string __END_LINE__ and then restores a newline in front of every '#', so each group of 4 lines becomes a single line. The shuf command then picks $count random samples out of that list. Afterwards the string __END_LINE__ is replaced by \n again.
attempt 3 (awk) :
Create a file called subset.awk containing :
# Uniform(m) :: returns a random integer such that
# 1 <= Uniform(m) <= m
function Uniform(m) { return 1+int(m * rand()) }
# KnuthShuffle(m) :: creates a random permutation of the range [1,m]
function KnuthShuffle(m,    i,j,k) {
    for (i = 1; i <= m; i++) { permutation[i] = i }
    for (i = m; i >= 2; i--) {
        j = Uniform(i)
        k = permutation[i]
        permutation[i] = permutation[j]
        permutation[j] = k
    }
}
BEGIN { RS = "\n#"; srand() }
{ a[NR] = $0 }
END {
    KnuthShuffle(NR)
    sub("#", "", a[1])
    for (r = 1; r <= count; r++) {
        print "#" a[permutation[r]]
    }
}
And then you can run :
$ gunzip -c <file.gz> | awk -v count=30 -f subset.awk > <output.txt>
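
A related, untested idea: since every record is exactly 4 lines, paste can glue each group onto a single line before shuf picks the samples. The $'\x1e' (ASCII record separator) glue character is an assumption and must not occur in the data:
#!/usr/bin/env bash
count=30
path="origin.txt.gz"
new_path="subset.txt"
# paste - - - - joins each group of 4 lines into one record,
# shuf picks $count random records, tr splits them back into lines.
gunzip -c "$path" | paste -d $'\x1e' - - - - | shuf -n "$count" | tr $'\x1e' '\n' > "$new_path"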

How to merge files line by line in bash

My files look like
file0  file1  file2
a      1      ##
a      1      ##
b      2      ##
b      2      ##
and I want to merge these files line by line, so it should look like
merged file
a
a
1
1
##
##
b
b
2
2
##
##
I mean, take a few lines from each file in turn and merge them into one file.
I tried the bash script below.
touch ini.dat
n=2
linenum=$(wc -l < file0)
iter=$((linenum/n))
for i in $(seq 0 1 $iter)
do
    for j in $(seq 0 1 2)
    do
        awk 'NR > '$(($i*$n))' && NR <= '$((($i+1)*$n))'' file"$j" > tmp
        cat ini.dat tmp > tmpp
        cp tmpp ini.dat
        rm tmpp
    done
done
It works fine, but it takes too much time. Is there a more efficient way?
Limiting Factors
Your script had two flaws which made it slow:
A lot of files were created and copied. In particular, the ... > tmp; cat ini.dat tmp > tmpp; cp tmpp ini.dat sequence could have been written as ... >> ini.dat.
To read the i-th line of a file, the script has to scan that file from the beginning until the i-th line is reached. Done repeatedly for i = 1, 2, 3, ..., n this takes O(n^2). Reading the whole file once into an array (O(n)) and accessing the lines by index (O(1) each) takes only O(n) in total.
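For the first point, a minimal rewrite of the inner loop could look like this (untested sketch, same variables and file names as in the question):
for i in $(seq 0 1 $iter)
do
    for j in $(seq 0 1 2)
    do
        # append directly instead of going through the tmp/tmpp copies
        awk 'NR > '$(($i*$n))' && NR <= '$((($i+1)*$n))'' file"$j" >> ini.dat
    done
done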
Pure Bash Solution
The following bash script does the job a bit faster. linesPerBlock corresponds to the parameter n from your script. The script prints as many blocks as possible. That is:
Once the shortest input file has been printed, the script terminates; remaining lines from longer files will not be printed.
If the shortest input file's number of lines is not divisible by n, the last lines (fewer than n) will be omitted.
#! /bin/bash
files=(file{0..2})
linesPerBlock=2

starts=(0)
maxLines=9223372036854775807 # bash's maximum integer
for i in "${!files[@]}"; do
    lineCount="$(wc -l < "${files[i]}")"
    (( lineCount < maxLines )) && (( maxLines = lineCount ))
    (( starts[i+1] = starts[i] + maxLines ))
    mapfile -t -O "${starts[i]}" -n "$maxLines" lines < "${files[i]}"
done

for (( b = 0; b < maxLines / linesPerBlock; ++b )); do
    for f in "${!files[@]}"; do
        start="${starts[f]}"
        for (( i = 0; i < linesPerBlock; ++i )); do
            echo "${lines[start + b*linesPerBlock + i]}"
        done
    done
done > outputFile
This awk should do the job and will be much quicker than your shell script:
awk 'fn != FILENAME {
fn = FILENAME
n = 1
}
NF {
a[FILENAME,n++] = $0
}
END {
for(i=0; i<(n-1)/2; i++) {
for(j=1; j<ARGC; j++)
printf "%s\n%s\n", a[ARGV[j],i*2+1], a[ARGV[j],i*2+2];
print ""
}
}' file{0..2}
a
a
1
1
##
##
b
b
2
2
##
##
In a single line:
awk 'fn != FILENAME{fn=FILENAME; n=1} NF{a[FILENAME,n++]=$0} END{for(i=0; i<(n-1)/2; i++) { for(j=1; j<ARGC; j++) printf "%s\n%s\n", a[ARGV[j],i*2+1], a[ARGV[j],i*2+2]; print "" } }' file{0..2}
Here is another awk, which does not cache all the contents:
paste file{0..2} | awk -v n=2 '
function pr() {for(j=1;j<=NF;j++)
for(i=0;i<n;i++) print a[i,j]}
{for(j=1;j<=NF;j++) a[c+0,j]=$j; c++}
!(NR%n) {pr(); delete a; c=0}
END {pr()}'
If the number of lines is not divisible by n, the last block will be padded with empty lines.

Adding a loop in awk

I had a problem that was resolved in a previous post, but because I had too many files it was not practical to run awk on every file and then use a second script to get the output I wanted.
Here are some examples of my files:
3
10
23
.
.
.
720
810
980
And the script was used to see where the numbers from the first file fell in this other file:
2 0.004
4 0.003
6 0.034
.
.
.
996 0.01
998 0.02
1000 0.23
After that range was located, the mean of the second-column values in the second file was estimated.
Here are the scripts:
awk -v start=$(head -n 1 file1) -v end=$(tail -n 1 file1) -f script file2
and
BEGIN {
sum = 0;
count = 0;
range_start = -1;
range_end = -1;
}
{
irow = int($1)
ival = $2 + 0.0
if (irow >= start && end >= irow) {
if (range_start == -1) {
range_start = NR;
}
sum = sum + ival;
count++;
}
else if (irow > end) {
if (range_end == -1) {
range_end = NR - 1;
}
}
}
END {
print "start =", range_start, "end =", range_end, "mean =", sum / count
}
How could I make a loop so that the mean is estimated for every file? My desired output would be something like this:
Name_of_file
start = number , end = number , mean = number
Thanks in advance.
.. wrap it in a loop?
for f in <files>; do
echo "$f";
awk -v start=$(head -n 1 "$f") -v end=$(tail -n 1 "$f") -f script file2;
done
Personally I would suggest putting each file name and its result on one line, so that your results read as blocks rather than having file names on separate lines from their results; in that case replace echo "$f" with echo -n "$f " (to omit the newline). A sketch of this combined form follows below.
EDIT: Since I suppose you're new to the syntax, <files> can either be a list of files (file1 file2 file3), a list of files generated by a glob (file*, files/data_*.txt, whatever), or a list of files generated by a command ($(find files/ -name 'data' -type f), etc.).
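For example, the combined one-line-per-file form could look like this (the data_*.txt glob is only a placeholder for your own file list):
for f in data_*.txt; do        # placeholder glob; substitute your own files
    echo -n "$f "              # keep the result on the same line as the name
    awk -v start="$(head -n 1 "$f")" -v end="$(tail -n 1 "$f")" -f script file2
done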

Extracting multiple parts of a string using bash

I have a caret delimited (key=value) input and would like to extract multiple tokens of interest from it.
For example: Given the following input
$ echo -e "1=A00^35=D^150=1^33=1\n1=B000^35=D^150=2^33=2"
1=A00^35=D^150=1^33=1
1=B000^35=D^150=2^33=2
I would like the following output
35=D^150=1^
35=D^150=2^
I have tried the following
$ echo -e "1=A00^35=D^150=1^33=1\n1=B000^35=D^150=2^33=2"|egrep -o "35=[^/^]*\^|150=[^/^]*\^"
35=D^
150=1^
35=D^
150=2^
My problem is that egrep returns each match on a separate line. Is it possible to get one line of output for one line of input? Please note that due to the constraints of the larger script, I cannot simply do a blind replace of all the \n characters in the output.
Thank you for any suggestions. This script is for bash 3.2.25. Any egrep alternatives are welcome. Please note that the tokens of interest (35 and 150) may change, and I am already generating the egrep pattern in the script, hence a one-liner (if possible) would be great.
You have two options. Option 1 is to change the word-splitting characters (IFS) and use set --:
OFS=$IFS
IFS="^ "
set -- 1=A00^35=D^150=1^33=1 # No quotes here!!
IFS="$OFS"
Now you have your values in $1, $2, etc.
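Continuing option 1, the fields of interest can then be stitched back together from the positional parameters; note that this relies on the keys sitting at fixed positions in the line:
# with the example line above, $2 is 35=D and $3 is 150=1
echo "${2}^${3}^"    # prints 35=D^150=1^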
Or you can use an array:
tmp=$(echo "1=A00^35=D^150=1^33=1" | sed -e 's:\([0-9]\+\)=: [\1]=:g' -e 's:\^ : :g')
eval value=($tmp)
echo "35=${value[35]}^150=${value[150]}"
To get rid of the newline, you can just echo it again:
$ echo $(echo "1=A00^35=D^150=1^33=1"|egrep -o "35=[^/^]*\^|150=[^/^]*\^")
35=D^ 150=1^
If that's not satisfactory (I think it may give you one line for the whole input file), you can use awk:
pax> echo '
1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11
' | awk -vLIST=35,150 -F^ ' {
sep = "";
split (LIST, srch, ",");
for (i = 1; i <= NF; i++) {
for (idx in srch) {
split ($i, arr, "=");
if (arr[1] == srch[idx]) {
printf sep "" arr[1] "=" arr[2];
sep = "^";
}
}
}
if (sep != "") {
print sep;
}
}'
35=D^150=1^
35=d^
pax> echo '
1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11
' | awk -vLIST=1,33 -F^ ' {
sep = "";
split (LIST, srch, ",");
for (i = 1; i <= NF; i++) {
for (idx in srch) {
split ($i, arr, "=");
if (arr[1] == srch[idx]) {
printf sep "" arr[1] "=" arr[2];
sep = "^";
}
}
}
if (sep != "") {
print sep;
}
}'
1=A00^33=1^
1=a00^33=11^
This one allows you to use a single awk script and all you need to do is to provide a comma-separated list of keys to print out.
And here's the one-liner version :-)
echo '1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11
' | awk -vLST=1,33 -F^ '{s="";split(LST,k,",");for(i=1;i<=NF;i++){for(j in k){split($i,arr,"=");if(arr[1]==k[j]){printf s""arr[1]"="arr[2];s="^";}}}if(s!=""){print s;}}'
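If the key list is already held in a bash array, the comma-separated value for LIST/LST can be built like this (a sketch; extract.awk is a hypothetical file containing the awk program above):
keys=(35 150)                        # hypothetical list of tokens of interest
LIST=$(IFS=,; echo "${keys[*]}")     # joins the array with commas -> "35,150"
awk -v LIST="$LIST" -F^ -f extract.awk input_file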
Given a file 'in' containing your strings:
$ for i in $(cut -d^ -f2,3 < in);do echo $i^;done
35=D^150=1^
35=D^150=2^
