List of last generated file on each day from 7 days list - shell

I have a list of files in the following format:
Group_2012_01_06_041505.csv
Region_2012_01_06_041508.csv
Region_2012_01_06_070007.csv
XXXX_YYYY_MM_DD_HHMMSS.csv
What is the best way to compile a list of the last generated file for each day, per group, over the last 7 days?
Version that worked on HP-UX
for d in 6 5 4 3 2 1 0
do
    DATES[d]=$(perl -e "use POSIX; print strftime '%Y_%m_%d', localtime(time - 86400 * $d);")
done
for group in `ls *.csv | cut -d_ -f1 | sort -u`
do
    CSV_FILES=$working_dir/*.csv
    if [ ! -f $CSV_FILES ]; then
        break # if no file exists, do not attempt processing
    fi
    for d in "${DATES[@]}"
    do
        file_nm=$(ls ${group}_$d* 2>/dev/null | sort -r | head -1)
        if [ "$file_nm" != "" ]
        then
            : # Process file
        fi
    done
done

You can explicitly iterate over the group/time combinations:
for d in {1..6}
do
    DATES[d]=`gdate +"%Y_%m_%d" -d "$d day ago"`
done
for group in `ls *.csv | cut -d_ -f1 | sort -u`
do
    for d in "${DATES[@]}"
    do
        echo "$group $d: " `ls ${group}_$d* 2>/dev/null | sort -r | head -1`
    done
done
Which outputs the following for your example data set:
Group 2012_01_06: Group_2012_01_06_041505.csv
Group 2012_01_05:
Group 2012_01_04:
Group 2012_01_03:
Group 2012_01_02:
Group 2012_01_01:
Region 2012_01_06: Region_2012_01_06_070007.csv
Region 2012_01_05:
Region 2012_01_04:
Region 2012_01_03:
Region 2012_01_02:
Region 2012_01_01:
XXXX 2012_01_06:
XXXX 2012_01_05:
XXXX 2012_01_04:
XXXX 2012_01_03:
XXXX 2012_01_02:
XXXX 2012_01_01:
Note that Region_2012_01_06_041508.csv is not shown for Region 2012_01_06, as it is older than Region_2012_01_06_070007.csv.
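Because the timestamp is embedded in the file name, plain lexicographic `sort` already orders the files chronologically, so a single `sort | awk` pipeline can keep the newest file per group and day instead of running one `ls` per combination. A minimal sketch, with the question's sample names inlined for illustration:

```shell
# Keep only the lexicographically last (i.e. newest) file per group+date key.
# Assumes the XXXX_YYYY_MM_DD_HHMMSS.csv naming from the question.
out=$(printf '%s\n' \
    Group_2012_01_06_041505.csv \
    Region_2012_01_06_041508.csv \
    Region_2012_01_06_070007.csv |
  sort | awk -F_ '
  {
      key = $1 "_" $2 "_" $3 "_" $4   # group + date
      last[key] = $0                  # later (newer) entries overwrite earlier
  }
  END { for (k in last) print last[k] }' | sort)
printf '%s\n' "$out"
```

The trailing `sort` is there because awk's `for (k in last)` iteration order is unspecified.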

Related

How can I count and display only the words that are repeated more than once using unix commands?

I am trying to count and display only the words that are repeated more than once in a file. The basic idea is:
You are given a file with names and characters like commas, colons, slashes, etc..
Use the cut command to display only the first names in the file (other commands are also allowed).
Count and then display only the names repeated more than once.
I got to the point of counting and displaying all the names. However, I haven't found a way to display and to count only those names repeated more than once.
Here is a section of the file:
user1:x:80:200:Mia,Spurs:/home/user1:/bin/bash
user2:x:80:200:Martha,Dalton:/home/user2:/bin/bash
user3:x:80:200:Lucy,Carlson:/home/user3:/bin/bash
user4:x:80:200:Carl,Bingo:/home/user4:/bin/bash
Here is what I have been able to do:
Daniel#Daniel-MacBook-Pro Files % cut -d ":" -f 5-5 file1 | cut -d "," -f 1-1 | sort -n | uniq -c
1 Mia
3 Martha
1 Lucy
1 Carl
1 Jessi
1 Joke
1 Jim
2 Race
1 Sem
1 Shirly
1 Susan
1 Tim
You can filter out the rows with count 1 with grep.
cut -d ":" -f 5 file1 | cut -d "," -f 1 | sort | uniq -c | grep -v '^ *1 '
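An awk comparison on the count field is an alternative to matching `uniq -c`'s space padding with grep. A minimal sketch with a few passwd-style sample lines inlined (names chosen for illustration):

```shell
# Same pipeline, but let awk keep only rows whose count field exceeds 1;
# {print $1, $2} also normalizes away uniq -c's leading padding.
out=$(printf '%s\n' \
    'user1:x:80:200:Martha,Spurs:/home/user1:/bin/bash' \
    'user2:x:80:200:Martha,Dalton:/home/user2:/bin/bash' \
    'user3:x:80:200:Lucy,Carlson:/home/user3:/bin/bash' |
  cut -d ":" -f 5 | cut -d "," -f 1 | sort | uniq -c | awk '$1 > 1 {print $1, $2}')
echo "$out"
```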

How can I process date strings in bash?

Does anyone have any idea how I could process input like this with bash? I would like to convert absolute time to relative time. My approach works but is VERY messy. Can anyone do better? Is there a cleaner way to do this?
Input:
| 2020-08-01 15:35:47.446 | message 1 |
| 2020-08-01 15:35:48.446 | hi these |
| 2020-08-01 15:31:47.446 | do stuff now! |
Output: Shows the time difference in milliseconds
0 message 1
1000 hi these
60000 do stuff now!
Working (very dirty) approach:
while read line; do
    echo $(echo "$(echo "$line" | cut -d' ' -f3 | cut -d':' -f2 | head -1) * 60000 + $(echo "$line" | cut -d' ' -f3 | cut -d':' -f3 | head -1) * 1000 - $baseval" | bc) $(echo "$line" | cut -d'|' -f3)
done < file.log
It looks like the question asks to convert a series of absolute timestamps to relative timestamps, using baseval as the zero point in time.
It is possible to use the date command (with '+%s' to get seconds since the epoch) to simplify the calculation. If the file has many lines, this solution might not be ideal, as it spawns a date process for each line.
Worth noting that some of the complexity is in parsing the input format - a combination of fixed and delimited columns. The code uses bash's IFS to split each line into components.
#!/bin/bash

# Convert a timestamp into epoch milliseconds, relative to $baseval
function relative_time_ms {
    # Split `date` output into two tokens - epoch seconds + nanoseconds
    local dd=($(date '+%s %N' -d "$1"))
    echo $(( dd[0]*1000 + dd[1]/1000000 - baseval ))
}

baseval=0   # the first line's timestamp becomes the zero point
while IFS='|' read -r x ts msg ; do
    rel_time=$(relative_time_ms "$ts")
    if (( baseval == 0 )); then
        baseval=$rel_time
        rel_time=0
    fi
    echo "$rel_time | $msg"
done < file.log
Output:
0 | message 1
1000 | hi these
-240000 | do stuff now!
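To avoid spawning a `date` process per line, the parsing and the arithmetic can also be done in one awk pass. A sketch that, like the original one-liner, assumes all timestamps fall on the same date, so only the hh:mm:ss.mmm part is used (the sample input is inlined via a here-document):

```shell
# One awk process for the whole file; the first line is the zero point.
out=$(awk -F'|' 'NF >= 3 {
    # $2 is " yyyy-mm-dd hh:mm:ss.mmm "; split on spaces, colons and dots
    split($2, a, /[ :.]+/)              # a: "", date, hh, mm, ss, mmm
    t = ((a[3] * 60 + a[4]) * 60 + a[5]) * 1000 + a[6]
    if (base == "") base = t            # remember the first timestamp
    printf "%d |%s\n", t - base, $3
}' <<'EOF'
| 2020-08-01 15:35:47.446 | message 1 |
| 2020-08-01 15:35:48.446 | hi these |
| 2020-08-01 15:31:47.446 | do stuff now! |
EOF
)
printf '%s\n' "$out"
```

If the log can span midnight, the date fields would have to enter the calculation too (e.g. via gawk's mktime).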

How do I print a different variable dependent on multiple quantity variables (and math)?

Source File Input (.csv):
TOTAL FULL PARTIALORDER FULLORDER DEVICENAME
10 2 123456 456789 OHWIL8499IPM101
8 0 345678 789605 OHCIN8499IPM102
TOTAL= Is the total number of devices for a full and partial order.
FULL = A different order but of same location.
PARTIAL/FULL ORDER = Order numbers. I need to print one after each device name.
Requirement/Goal:
I need to subtract FULL from TOTAL. If there is a difference, I need to use a different order number for the last FULL device names printed.
If there are 10 total and 2 full, I need 8 device names printed (OHWIL8499IPM101 through OHWIL8499IPM108) followed by the partial order number, then OHWIL8499IPM109-110 followed by the full order number. In the input file example above, the second line has no difference, so I only need to print the partial order number.
Desired Output File (.csv):
DEVICENAME ORDERNUMBER SHIPDATE TYPE SYSTEMSKU
OHWIL8499IPM101 123456 ASAP PROJECT1 12345678
OHWIL8499IPM102 123456 ASAP PROJECT1 12345678
OHWIL8499IPM103 123456 ASAP PROJECT1 12345678
OHWIL8499IPM104 123456 ASAP PROJECT1 12345678
OHWIL8499IPM105 123456 ASAP PROJECT1 12345678
OHWIL8499IPM106 123456 ASAP PROJECT1 12345678
OHWIL8499IPM107 123456 ASAP PROJECT1 12345678
OHWIL8499IPM108 123456 ASAP PROJECT1 12345678
OHWIL8499IPM109 456789 ASAP PROJECT1 12345678
OHWIL8499IPM110 456789 ASAP PROJECT1 12345678
Current Script:
#!/bin/bash
currentUser=$(/bin/ls -la /dev/console | /usr/bin/cut -d ' ' -f 4)
LOGFILE="/Users/$currentUser/Desktop/kit.csv"
SRCFILE="/Users/$currentUser/Desktop/input.csv"
orderFull=$(cat "$SRCFILE" | sed 's/,/ /g' | awk '{print $4}')
orderPartial=$(cat "$SRCFILE" | sed 's/,/ /g' | awk '{print $3}')
device_name=$(cat "$SRCFILE" | sed 's/,/ /g' | awk '{print $5}')
quantityNum=$(cat "$SRCFILE" | sed 's/,/ /g' | awk '{print $1}')
quantityFull=$(cat "$SRCFILE" | sed 's/,/ /g' | awk '{print $2}')
shipDate=ASAP
projectType=Project1
systemSku=1235678
number=$(echo "$device_name" | head -c 9 | tail -c 4)
max_sequence_name=$()
max_sequence_num=$(echo $max_sequence_name | rev | cut -c 1-3 | rev)
if [ -z "$max_sequence_name" ];
then
    max_sequence_name=$device_name
    max_sequence_num=100
fi
array_new_sequence_name=()
for i in $(seq 1 $quantityNum);
do
    cnum=$((max_sequence_num + i))
    array_new_sequence_name+=($(echo $device_name$cnum))
done
for sqn in "${array_new_sequence_name[@]}";
do
    echo "$sqn,$orderNumber,$shipDate,$projectType,$systemSku,$number" >> $LOGFILE
done
So far I have the following code that I've been attempting to start with, but I'm unsure if I'm on the right track.
if [ "$quantityFull" = 0 ]; then
    orderNumber=$orderPartial
else
    math=$(expr $quantityNum - $quantityFull)
    #ENTER CODE HERE TO DETERMINE WHICH NAMES TO CHANGE ORDER NUMBER ON
    orderNumber=$orderFull
fi
Once this can be determined where would I place it in the above script? Or is there an easier way of implementing this somehow? If you need additional information please let me know.
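For comparison, the numbering logic described above can be sketched as a single awk pass, assuming the whitespace-separated table shown in the question (use -F, for a real comma-separated file) and the constant SHIPDATE/TYPE/SYSTEMSKU values taken from the desired output:

```shell
# Per input row: print TOTAL device names; the last FULL of them get the
# full-order number, the rest get the partial-order number.
out=$(awk '
NR == 1 { print "DEVICENAME ORDERNUMBER SHIPDATE TYPE SYSTEMSKU"; next }
{
    total = $1; full = $2; partial = $3; fullorder = $4; name = $5
    base  = substr(name, 1, length(name) - 3)     # e.g. OHWIL8499IPM
    start = substr(name, length(name) - 2) + 0    # e.g. 101
    for (i = 0; i < total; i++) {
        order = (i < total - full) ? partial : fullorder
        printf "%s%d %s ASAP PROJECT1 12345678\n", base, start + i, order
    }
}' <<'EOF'
TOTAL FULL PARTIALORDER FULLORDER DEVICENAME
10 2 123456 456789 OHWIL8499IPM101
8 0 345678 789605 OHCIN8499IPM102
EOF
)
printf '%s\n' "$out"
```

The `start` extraction assumes the trailing three characters of DEVICENAME are the sequence number, as in the examples.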

Read content of file and put particular portion of content in separate files using bash

I would like to extract specific portions of a single file and write them to separate files via bash. With the code below I was able to extract the test1 content, but I failed when trying to write every section to its respective file.
Tried code:
reportFile=/report.txt
test1File=/test1.txt
test2File=/test2.txt
test3File=/test3.txt
totalLineNo=`cat ${reportFile} | wc -l`
test1LineNo=`grep -n "Test1 file content:" ${reportFile} | grep -Eo '^[^:]+'`
test2LineNo=`grep -n "Test2 file content:" ${reportFile} | grep -Eo '^[^:]+'`
test3LineNo=`grep -n "Test3 file content:" ${reportFile} | grep -Eo '^[^:]+'`
exactTest1LineNo=`echo $(( ${test1LineNo} - 1 ))`
exactTest2LineNo=`echo $(( ${test2LineNo} -1 ))`
exactTest3LineNo=`echo $(( ${test3LineNo} -1 ))`
test1Content=`cat ${reportFile} | head -n ${exactTest1LineNo}`
test3Content=`cat ${reportFile} | tail -n ${exactTest3LineNo}`
echo -e "${test1Content}\r" >> ${test1File}
echo -e "${test3Content}\r" >> ${test3File}
report.txt:
-------------------------------------
My Report:
Test1 file content:
1
2
3
4
5
6
Test2 file content:
7
8
9
10
Test3 file content:
11
12
13
14
15
Note: Find my report above.
-------------------------------------
test1.txt (expected):
1
2
3
4
5
6
test2.txt (expected):
7
8
9
10
test3.txt (expected):
11
12
13
14
15
With a single awk command:
awk '/^Test[0-9] file content:/ { f=1; fn=tolower($1) ".txt"; next }
     f && NF { print > fn }
     !NF     { f=0 }' report.txt
Viewing results:
$ head test[0-9].txt
==> test1.txt <==
1
2
3
4
5
6
==> test2.txt <==
7
8
9
10
==> test3.txt <==
11
12
13
14
15
If I understand you correctly: you have a long file report.txt and you want to extract short files from it. The name of each file is followed by the string " file content:" in the file report.txt.
This is my solution:
#!/bin/bash
reportFile=report.txt
Files=`grep 'file content' $reportFile | sed 's/ .*$//'`
for F in $Files ; do
f=${F,}.txt # first letter lowercase and append .txt
awk "/$F file content/,/^\$/ {print}" $reportFile |
tail -n +2 | # remove first line with "Test* file content:"
head -n -1 > $f # remove last blank line
done

Bash: store sql query in an array

I am running an SQL query in bash to get file names and their schedule times. The files will then be run at the schedule time associated with them. The query output is below. I need to capture the date/time and the file names, and run the files at their specified times. How do I store both columns in separate arrays?
file_name | schedule_time
--------------------------+---------------------
file 1 | 2016-02-25 07:26:00
file 2 | 2016-02-26 07:37:00
file 1 | 2016-02-27 07:39:00
file 3 | 2016-02-27 12:00:00
file 1 | 2016-02-28 07:25:00
file 2 | 2016-02-29 02:15:00
file 2 | 2016-02-29 08:38:00
file 1 | 2016-02-29 12:00:00
I don't know exactly why you need it, but here is a plain bash solution:
#!/bin/bash
sqldata="file_name | schedule_time
--------------------------+---------------------
file 1 | 2016-02-25 07:26:00
file 2 | 2016-02-26 07:37:00
file 1 | 2016-02-27 07:39:00
file 3 | 2016-02-27 12:00:00
file 1 | 2016-02-28 07:25:00
file 2 | 2016-02-29 02:15:00
file 2 | 2016-02-29 08:38:00
file 1 | 2016-02-29 12:00:00"
sqldata=$(echo "$sqldata" | tail -n +3) # skip the two header lines
oldifs="$IFS"
IFS=$'\r\n'
lines=( $sqldata )
IFS="$oldifs"
files=()
dates=()
idx=0
for i in "${lines[@]}"
do
    files[idx]=$(echo $i | sed -E 's/ +\|.*//')
    dates[idx]=$(echo $i | sed -E 's/.*\| +//')
    idx=$(($idx + 1))
done
echo files:
echo ${files[@]}
echo dates:
echo ${dates[@]}
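A variant that avoids the IFS save/restore dance reads the output line by line and trims each field as it goes; a sketch with the query output inlined via a here-document (the unquoted `echo $f` trick strips the surrounding padding, though it would also expand glob characters - fine for names like these):

```shell
files=()
times=()
while IFS='|' read -r f t; do
    f=$(echo $f)            # trim padding after the file name
    t=$(echo $t)            # trim padding before the schedule time
    files+=("$f")
    times+=("$t")
done < <(tail -n +3 <<'EOF'
file_name | schedule_time
--------------------------+---------------------
file 1 | 2016-02-25 07:26:00
file 2 | 2016-02-26 07:37:00
EOF
)
echo "${files[0]} -> ${times[0]}"
```

In the real script the here-document would be replaced by the SQL client invocation, e.g. `done < <(psql ... | tail -n +3)`.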
