Mixing data from two csv files with date field [duplicate] - bash

This question already has answers here:
match values in first column of two files and join the matching lines in a new file
(2 answers)
Closed 6 years ago.
I have two csv files.
File1.csv
F1 F2
14:01 22
14:05 23
14:07 34
14:58 98
15:01 22
15:10 24
File2.csv
F1 F2
14:01 22
14:06 21
14:07 34
14:59 08
15:01 22
15:19 20
Is it possible to get something like the output below?
F1 F2 F3
14:01 22 22
14:05 23
14:06 21
14:07 34 34
14:58 98
14:59 08
15:01 22 22
15:10 24
15:19 20
Thank you.

Here is a pure bash solution. As @Inian pointed out, it is not the most efficient, but it uses no external tools.
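#!/bin/bash
# Requires bash 4.2+ for the negative length in ${output:0:-1}.
f1=()
f2=()
# Read each file into an array, one line per element.
while read -r f1l; do
    f1[${#f1[@]}]="$f1l"
done < File1.csv
while read -r f2l; do
    f2[${#f2[@]}]="$f2l"
done < File2.csv
output=$'F1\tF2\n'
# Start at 1 to skip the header; rows are paired by position.
for (( i=1; i<${#f1[@]}; ++i ))
do
    f1c1=${f1[i]%% *}
    f1c2=${f1[i]##* }
    f2c1=${f2[i]%% *}
    f2c2=${f2[i]##* }
    if [[ $f1c1 = "$f2c1" ]]; then
        # Same key in both rows: add the values (10# avoids octal parsing of e.g. 08).
        output+="$f1c1"$'\t'$((10#$f1c2+10#$f2c2))$'\n'
    else
        output+="$f1c1"$'\t'"$f1c2"$'\n'
        output+="$f2c1"$'\t'"$f2c2"$'\n'
    fi
done
# Strip the trailing newline before writing.
echo "${output:0:-1}" > File3.csv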
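The script above pairs rows by position and adds the values when the keys match, which is not quite the three-column layout shown in the question. As a rough sketch of a different technique (a keyed merge in awk, not pure bash), the following joins the two files on the first column and leaves a column empty when a key is missing from one file. The function name merge_by_key is hypothetical, and the sketch assumes whitespace-separated input with one header line and unique HH:MM keys per file.

```shell
# merge_by_key FILE1 FILE2 (hypothetical helper, not from the original answer).
# Assumes: whitespace-separated columns, one header line, unique keys per file.
merge_by_key() {
    local file1=$1 file2=$2
    printf 'F1\tF2\tF3\n'
    awk '
        FNR == 1  { next }                        # skip the header of each file
        NR == FNR { a[$1] = $2; keys[$1]; next }  # rows of the first file
                  { b[$1] = $2; keys[$1] }        # rows of the second file
        END {
            # a[k] or b[k] is empty when the key is missing from that file
            for (k in keys) printf "%s\t%s\t%s\n", k, a[k], b[k]
        }
    ' "$file1" "$file2" | LC_ALL=C sort           # HH:MM keys sort correctly as text
}
```

It could be called as merge_by_key File1.csv File2.csv > File3.csv. The output is tab-separated, so a value that exists only in File2.csv stays in the third column.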

Related

Linux Bash Print largest number in column from monthly rotated log file

I have monthly rotated log files which look like the output below. The files are named transc-2301.log (transc-YYMM). There is a file for each month of the year. I need a simple bash command to find the file of the current month, and display the largest number (max) of column 3. In the example below, the output should be 87
01/02/23 10:45 19 26
01/02/23 11:45 19 45
01/02/23 12:45 19 36
01/02/23 13:45 22 64
01/02/23 14:45 19 72
01/02/23 15:45 19 54
01/02/23 16:45 19 80
01/02/23 17:45 17 36
01/03/23 10:45 18 24
01/03/23 11:45 19 26
01/03/23 12:45 19 48
01/03/23 13:45 20 87
01/03/23 14:45 20 29
01/03/23 15:45 18 26
Since your filenames are sortable, you can easily pick the file of the current month as the last one in the sorted sequence. Then a quick awk returns the result.
for file in transc-*.log; do :; done
awk '($3>m){m=$3}END{print m}' "$file"
Alternatively, you can let awk do the heavy lifting on the filename:
awk 'BEGIN{ARGV[1]=ARGV[ARGC-1];ARGC=2}($3>m){m=$3}END{print m}' transc-*.log
Or, if you don't like the glob-expansion trick:
awk '($3>m){m=$3}END{print m}' "transc-$(date "+%y%m").log"
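Which field number to use depends on how the log is delimited. With the sample exactly as shown here, default whitespace splitting yields four fields (date, time and two numbers), so 87 is the maximum of field 4 and the maximum of field 3 is 22. A minimal sketch that takes the field number as a parameter, assuming a hypothetical helper name max_in_column:

```shell
# max_in_column COLUMN FILE (hypothetical helper).
# COLUMN counts whitespace-separated fields, as awk does by default.
max_in_column() {
    local column=$1 file=$2
    awk -v c="$column" '
        # seed max with the first row, then keep the larger value
        NR == 1 || $c + 0 > max { max = $c + 0 }
        END { if (NR > 0) print max }
    ' "$file"
}

# Usage, following the transc-YYMM.log naming from the question:
# max_in_column 4 "transc-$(date +%y%m).log"
```

Seeding from the first row avoids having to guess a starting value below the smallest number in the column.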
I would use GNU AWK for this task in the following way. Let the content of transc-2301.log be
01/02/23 10:45 19 26
01/02/23 11:45 19 45
01/02/23 12:45 19 36
01/02/23 13:45 22 64
01/02/23 14:45 19 72
01/02/23 15:45 19 54
01/02/23 16:45 19 80
01/02/23 17:45 17 36
01/03/23 10:45 18 24
01/03/23 11:45 19 26
01/03/23 12:45 19 48
01/03/23 13:45 20 87
01/03/23 14:45 20 29
01/03/23 15:45 18 26
then
awk 'BEGIN{m=-1;FS="[[:space:]]{2,}";logname=strftime("transc-%y%m.log")}FILENAME==logname{m=$3>m?$3:m}END{print m}' transc*.log
gives output (as of 18 Jan 2023)
87
Warning: I assume your file uses two or more whitespace characters as the separator; if this does not hold, adjust FS accordingly. Warning: set m to a value lower than the lowest value that might appear in the column of interest. Explanation: I use the strftime function to work out which file should be processed and pass all transc*.log files to awk, but the action is only taken for the selected file. The action is: set m to $3 if it is higher than the current m, otherwise keep m. After processing the files, in END, I print the value of m.
(tested in GNU Awk 5.0.1)
mawk '_<(__ = +$NF) { _=__ } END { print +_ }'
gawk 'END { print +_ } (_=_<(__=+$NF) ?__:_)<_'
87

Issue with including if statement in bash script

I am pretty new to bash scripting. I have my bash script below and I want to include an if statement when month (ij==09) equals 09 then "i" should be from 01 to 30. I tried several ways but did not work.
How can I include an if statement in the code below to achieve this? Any help is appreciated.
Thanks.
#!/bin/bash
for ii in 2007
do
for i in 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 #Day of the Month
do
for ij in 09 10 # Month
do
for j in 0000 0100 0200 0300 0400 0500 0600 0700 0800 0900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900 2000 2100 2200 2300
do
cdo cat DAS_0125_H.A${ii}${ij}${i}.${j}.002_var.nc outfile_${ii}${ij}${i}.nc
done
done
done
done
The smallest change is adding a continue for day 31 in month 9.
You must test "09" as a string (or as 10#09).
(I also changed cdo ... into echo cdo ...)
for ii in 2007
do
for i in 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 #Day of the Month
do
for ij in 09 10 # Month
do
if [[ "${ij}" == "09" ]] && [[ "${i}" == "31" ]]; then continue; fi
for j in 0000 0100 0200 0300 0400 0500 0600 0700 0800 0900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900 2000 2100 2200 2300
do
echo "cdo cat DAS_0125_H.A${ii}${ij}${i}.${j}.002_var.nc outfile_${ii}${ij}${i}.nc"
done
done
done
done
The loops are easier to read if you generate the values with a sequence (brace expansion) instead of listing them. You do not want to use for ((i=1;i<=31;i++)) here, because that drops the leading zeroes.
Also use verbose variable names.
for year in 2007
do
for day in {01..31} # Day of the Month
do
for month in {09,10} # Month
do
if [[ "${month}" == "09" ]] && [[ "${day}" == "31" ]]; then continue; fi
for hour in {00..23}
do
echo cdo cat DAS_0125_H.A${year}${month}${day}.${hour}00.002_var.nc outfile_${year}${month}${day}.nc
done
done
done
done
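If more months are added later, the single test for day 31 in month 09 could be replaced by a lookup of the month length. The following is only a sketch under that assumption: days_in_month is a hypothetical helper, the month is passed as two digits, and seq -w is used to keep the leading zeroes.

```shell
# days_in_month MONTH YEAR (hypothetical helper); MONTH is two digits, 01..12.
days_in_month() {
    local month=$1 year=$2
    case $month in
        04|06|09|11) echo 30 ;;
        02)
            # Gregorian leap-year rule
            if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
                echo 29
            else
                echo 28
            fi ;;
        *) echo 31 ;;
    esac
}

year=2007
for month in 09 10; do
    # seq -w pads every number to the width of the largest one (01 .. 30 or 31)
    for day in $(seq -w 1 "$(days_in_month "$month" "$year")"); do
        echo "outfile_${year}${month}${day}.nc"
    done
done
```

Here the month loop is the outer one, so the day range can depend on the month and no continue is needed.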
When the files already exist, you can consider
ls DAS_0125_H.A2007{09,10}{01..31}.{00..23}00.002_var.nc |
sed -r 's/.*([0-9]{8}).*/cdo cat & outfile_\1.nc/'
Once this shows the commands you want, you can execute them with
source <(ls DAS_0125_H.A2007{09,10}{01..31}.{00..23}00.002_var.nc |
sed -r 's/.*([0-9]{8}).*/cdo cat & outfile_\1.nc/')

bash for loop through all sub directories [duplicate]

This question already has answers here:
Looping through directories in Bash
(3 answers)
How to loop over directories in Linux?
(11 answers)
Looping over directories in Bash
(3 answers)
Closed 4 years ago.
I have one directory with 48 sub-directories, like this:
output$ ll
total 0
drwxr-sr-x+ 1 xxx 576 Apr 27 16:39 ./
drwxrws---+ 1 xxx 254 May 4 15:12 ../
drwxrws---+ 1 xxx 28 Apr 19 16:31 404904/
drwxrws---+ 1 xxx 28 Apr 19 16:31 404905/
drwxrws---+ 1 xxx 28 Apr 19 16:31 405003/
drwxrws---+ 1 xxx 28 Apr 19 16:31 405050/
drwxrws---+ 1 xxx 28 Apr 19 16:31 405077/
...
I want to write a bash for loop to run some common analysis in each of them, like this:
for d in {404904,404905,405503,...};
do
echo $d
done
My question is how to loop over these sub-directories instead of typing them in manually.
for d in */; do
echo "${d%/}"
done
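The trailing slash in */ makes the glob match directories only, and ${d%/} removes that slash again. One caveat: if there are no sub-directories, the loop runs once with the literal pattern. A small sketch that guards against this, using a hypothetical helper name list_subdirs:

```shell
# list_subdirs [PARENT] (hypothetical helper): prints one directory name per line.
list_subdirs() {
    local parent=${1:-.} d
    for d in "$parent"/*/; do
        [ -d "$d" ] || continue    # skips the literal pattern when nothing matches
        d=${d%/}                   # drop the trailing slash
        printf '%s\n' "${d##*/}"   # keep only the last path component
    done
}
```

Called as list_subdirs output, it would print 404904, 404905 and so on, one per line.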

How do i compare the value located in one file with the other file

File Mem.txt
[root@mavenir Sudhakar]# cat MEM.txt | awk '{print $4,$5}'
Output:
CARD_0-1 12
CARD_0-10 13
CARD_0-11 13
CARD_0-12 28
CARD_0-13 2
CARD_0-14 2
CARD_0-2 30
CARD_0-3 13
CARD_0-4 29
CARD_0-9 24
CARD_1-1 13
CARD_1-10 28
CARD_1-11 13
CARD_1-12 28
CARD_1-13 29
CARD_1-14 13
CARD_1-2 30
CARD_1-3 13
CARD_1-4 28
CARD_1-5 10
CARD_1-6 28
CARD_1-9 13
[root@mavenir Sudhakar]# cat cardnum.txt
0-1
0-3
0-11
1-1
1-3
1-5
1-9
1-11
1-13
These are the two files. I need to select the second value from the MEM.txt file if the card number exists in the cardnum.txt file.
The output should look like this:
0-1 12
0-3 13
0-11 13
1-1 13
1-3 13
1-5 10
1-9 13
1-11 13
1-13 29
Here, now this should work.
#!/bin/bash
# Print the second underscore-separated field of each line.
while read -r line; do
    val=$(echo "$line" | awk -F _ '{print $2}')
    echo "$val"
done < MEM.txt
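The loop above strips the CARD_ prefix but does not filter by cardnum.txt. As a sketch of one way to do the lookup, the following loads the card values into an awk array and then prints only the cards listed in the second file, in that file's order. The name select_cards is hypothetical, and the sketch assumes the memory data is already reduced to two columns (CARD_<id> and value), i.e. the output of the awk command shown in the question.

```shell
# select_cards MEM_FILE CARD_FILE (hypothetical helper).
# Assumes MEM_FILE has two columns: "CARD_<id> <value>".
select_cards() {
    local mem_file=$1 card_file=$2
    awk '
        # first file: remember the value for each card name
        NR == FNR { value[$1] = $2; next }
        # second file: print the id and its value when the card is known
        ("CARD_" $1) in value { print $1, value["CARD_" $1] }
    ' "$mem_file" "$card_file"
}

# Usage, with a hypothetical intermediate file:
# awk '{print $4,$5}' MEM.txt > mem_cols.txt
# select_cards mem_cols.txt cardnum.txt
```

Matching on the full card name avoids false hits such as 0-1 matching CARD_0-10.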

Range with leading zero in bash

How to add leading zero to bash range?
For example, I need cycle 01,02,03,..,29,30
How can I implement this using bash?
In recent versions of bash you can do:
echo {01..30}
Output:
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Or if it should be comma separated:
echo {01..30} | tr ' ' ','
Which can also be accomplished with parameter expansion:
a=$(echo {01..30})
echo ${a// /,}
Output:
01,02,03,04,05,06,07,08,09,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
Another trick, using seq, will also work:
seq -w 30
if you check the man page, you will see the -w option is exactly for your requirement:
-w, --equal-width
equalize width by padding with leading zeroes
You can use seq's format option:
seq -f "%02g" 30
A "pure bash" way would be something like this:
echo {0..2}{0..9}
This will give you the following:
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Removing the first 00 and adding the last 30 is not too hard!
This works:
printf " %02d" $(seq 1 30)
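For shells where {01..30} does not pad, or when the bounds are held in variables (brace expansion does not expand variables), a plain loop with printf is another option. This is a minimal sketch; padded_range is a hypothetical helper and it assumes the bounds are passed without leading zeroes.

```shell
# padded_range FIRST LAST (hypothetical helper): prints two-digit numbers, one per line.
# Pass the bounds without leading zeroes (1, not 01).
padded_range() {
    local i=$1 last=$2
    while [ "$i" -le "$last" ]; do
        printf '%02d\n' "$i"   # %02d pads to two digits with a leading zero
        i=$((i + 1))
    done
}
```

It can be used in a loop as for n in $(padded_range 1 30); do ...; done.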

Resources