So I'm having difficulty figuring this out.
What I am trying to do is display the most recently entered command.
Let's use this as an example:
MD5=$(cat $DICT | head -1 | tail -1 | md5sum)
This command has just been executed. It is contained inside of a shell script.
After it is executed, the output is checked in an if..then..else.. statement.
If the condition is met, I want it to run the command above again, except with the head value incremented by one each time it is run.
For instance:
MD5=$(cat $DICT | head -1 | tail -1 | md5sum)
if test ! $MD5=$HASH #$HASH is a user defined MD5Hash, it is checking if $MD5 does NOT equal the user's $HASH
then #one liner to display the history, to display the most recent
"MD5=$(cat $DICT | head -1 | tail -1 | md5sum)" #pipe it to remove the column count, then increment the "head -1" to "head -2"
else echo "The hash is the same."
fi #I also need this if..then..else statement to keep running until the "else" condition is met.
Can anyone help, please and thank you. I'm having a brain fart.
I was thinking of using sed or awk to increment, and grep to display the most recent of the commands.
So say:
$ history 3
Would output:
1 MD5=$(cat $DICT | head -1 | tail -1 | md5sum)
2 test ! $MD5=$HASH
3 history 3
-
$ history 3 | grep MD5
Would output:
1 MD5=$(cat $DICT | head -1 | tail -1 | md5sum)
Now I want it to strip the leading 1, add 1 to head's value, rerun that command, and send the result back through the if..then..else test.
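In other words, I think I'm after something like this rough sketch (assuming $DICT and $HASH are already set, and that $HASH is the bare 32-character digest), I just don't know if it's the right way to do it:
n=1
total=$(wc -l < "$DICT")
while [ "$n" -le "$total" ]; do
    MD5=$(head -n "$n" "$DICT" | tail -1 | md5sum | awk '{print $1}')   # hash of line n only
    if [ "$MD5" = "$HASH" ]; then
        echo "The hash is the same."
        break
    fi
    n=$((n + 1))   # bump head's line count and test the next line
done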
UPDATED
If I understood your problem correctly, this could be a solution:
# Setup test environment
DICT=infile
cat >"$DICT" <<XXX
Kraftwerk
King Crimson
Solaris
After Crying
XXX
HASH=$(md5sum <<<"After Crying")
# Process input file and look for match
while read -r line; do
    md5=$(md5sum <<<"$line")
    ((++count))
    [ "$HASH" == "$md5" ] && echo "The hash is the same. ($count)" && break
done <"$DICT"
Output:
The hash is the same. (4)
I improved the script a little bit. It spares one more clone(2) and pipe(2) call by using the md5sum <<<word notation instead of echo word | md5sum.
First it sets up the test environment by creating infile and a HASH. Then it reads each line of the input file, computes its MD5 checksum, and checks whether it matches HASH. If so, it writes a message to stdout and breaks out of the loop.
IMHO the original problem was a little bit over-thought.
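One note worth adding: md5sum prints the digest followed by "  -" when it reads stdin, so the comparison above works because HASH was produced the same way. If $HASH were only the bare 32-character digest, you would compare just the first field instead, e.g.:
md5=$(md5sum <<<"$line" | cut -d' ' -f1)    # keep only the hex digest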
Related
I have a reference file with device names in it, for example WABEL8499IPM101. I'm using this script to take the base name (without the last 3 digits), look it up in the reference file, and see what is already used. If 101 is used it will create a file for me with 102, and 103 if I request 2 total. I'm looking to use an input file so I can run it multiple times. I'm also trying to figure out how to start at 101 when the base name isn't found in the reference file at all.
I would like to loop this using an input file instead of manually entering bash test.sh WABEL8499IPM 2 each time. I would like to be able to build an input file of all the names that need to be compared and then get the output. It would also be nice if, when there isn't a match, it started creating names at WABEL8499IPM101 instead of just WABEL8499IPM1.
Input file example:
ColumnA (BASE NAME) ColumnB (QUANTITY)
WABEL8499IPM 2
Script:
SRCFILE="~/Desktop/deviceinfo.csv"
LOGDIR="~/Desktop/"
LOGFILE="$LOGDIR/DeviceNames.csv"
# base name, such as "WABEL8499IPM"
device_name=$1
# quantity, such as "2"
quantityNum=$2
# the largest in sequence, such as "WABEL8499IPM108"
max_sequence_name=$(cat $SRCFILE | grep -o -e "$device_name[0-9]*" | sort --reverse | head -n 1)
# extract the last 3digit number (such as "108") from max_sequence_name
max_sequence_num=$(echo $max_sequence_name | rev | cut -c 1-3 | rev)
# create new sequence_name
# such as ["WABEL8499IPM109", "WABEL8499IPM110"]
array_new_sequence_name=()
for i in $(seq 1 $quantityNum);
do
cnum=$((max_sequence_num + i))
array_new_sequence_name+=($(echo $device_name$cnum))
done
#CODE FOR CREATING OUTPUT FILE HERE
#for fn in "${array_new_sequence_name[@]}"; do touch "$fn"; done
# write log
for sqn in "${array_new_sequence_name[@]}";
do
echo $sqn >> $LOGFILE
done
Usage:
bash test.sh WABEL8499IPM 2
Result in the log file:
WABEL8499IPM109
WABEL8499IPM110
Just wrap a loop around your code instead of assuming the args come in on the command line.
SRCFILE="~/Desktop/deviceinfo.csv"
LOGDIR="~/Desktop/"
LOGFILE="$LOGDIR/DeviceNames.csv"
while read device_name quantityNum
do max_sequence_name=$( grep -o -e "$device_name[0-9]*" $SRCFILE |
sort --reverse | head -n 1)
max_sequence_num=${max_sequence_name: -3}
array_new_sequence_name=()
for i in $(seq 1 $quantityNum)
do cnum=$((max_sequence_num + i))
array_new_sequence_name+=("$device_name$cnum")
done
for sqn in ${array_new_sequence_name[#]};
do echo $sqn >> $LOGFILE
done
done < input.file
I'd maybe pass the input file as the parameter now.
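For example, a sketch of that change might look like the loop below; it assumes SRCFILE and LOGFILE are set as above, and the $1 argument handling and the 100 fallback (to get the requested start at 101 when no match exists) are my additions, not part of the original script:
INFILE="${1:-input.file}"        # hypothetical: pass the list of base names and quantities as $1
while read -r device_name quantityNum
do  max_sequence_name=$( grep -o -e "$device_name[0-9]*" "$SRCFILE" |
        sort --reverse | head -n 1)
    max_sequence_num=${max_sequence_name: -3}
    max_sequence_num=${max_sequence_num:-100}    # assumed fallback: no match found, so the first created name ends in 101
    for i in $(seq 1 "$quantityNum")
    do  echo "$device_name$((max_sequence_num + i))" >> "$LOGFILE"
    done
done < "$INFILE"
It would then be run as, say, bash test.sh devices.txt, where devices.txt is just an illustrative name for the input file.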
This is a follow up from my other post:
Printing all palindromes from text file
I want to be able to print the number of palindromes that I have found in my text file, similar to a frequency table. It'll show the count of each word followed by the word, in a format like this:
100 did
32 sas
17 madam
My code right now is:
#!/usr/bin/env bash
function search
{
    grep -oiE '[a-z]{3,}' "$1" | sort -n | tr '[:upper:]' '[:lower:]' | while read -r word; do
        [[ $word == $(rev <<< "$word") ]] && echo "$word" | uniq -c
    done
}
search "$1"
In comparison to my last post (Printing all palindromes from text file), I have added sort -n and uniq -c, which to my knowledge should sort the palindromes found in alphabetical order and then print the number of occurrences of each word found.
Just to test the script I have a testing file named testingfile.txt. It contains:
testing words testing words testing words
palindromes
Sas
Sas
Sas
sas
bob
Sas
Sas
Sas Sas madam
midim poop goog tot sas did i want to go to the movies did
otuikkiuto
pop
poop
This file is just so I can test before trying this script on a much larger file, where it'll take much longer.
When I type in the console (note that "palindrome" is the name of my script):
source palindrome testingfile.txt
The output appears like this:
1 bob
1 did
1 did
1 goog
1 madam
1 midim
1 otuikkiuto
1 poop
1 poop
1 pop
1 sas
1 sas
1 sas
1 sas
1 sas
1 sas
1 sas
1 sas
1 sas
1 tot
Is there something I am missing to get the result that I want:
9 sas
2 did
2 poop
1 bob
1 goog
1 madam
1 midim
1 otuikkiuto
1 pop
1 tot
Solutions to this would be greatly appreciated! If a solution needs other commands, an explanation of the reasoning behind them would also be greatly appreciated.
Thank you
You missed two important details:
You need to pass all of the input at once to a single uniq -c to count the words, not one word at a time each to its own uniq.
uniq expects its input to be sorted. The sort you had in the grep pipeline is ineffective, because after the transformation to lowercase the values would need to be sorted again.
You can apply sort | uniq -c to the output of an entire loop,
by piping the loop itself:
grep -oiE '[a-z]{3,}' "$1" | tr '[:upper:]' '[:lower:]' | while read -r word; do
    [[ $word == $(rev <<< "$word") ]] && echo "$word"
done | sort | uniq -c
Finally, to get an output sorted in descending order by count,
you need to further pipe the output to sort -nr.
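Putting it together, the whole pipeline with the descending sort might look like this (the same commands as above, only rearranged):
grep -oiE '[a-z]{3,}' "$1" | tr '[:upper:]' '[:lower:]' | while read -r word; do
    [[ $word == $(rev <<< "$word") ]] && echo "$word"
done | sort | uniq -c | sort -nr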
I have a text file (bigfile.txt) with thousands of rows. I want to make a smaller text file with 1 % of the rows which are randomly chosen. I tried the following
output=$(wc -l bigfile.txt)
ds1=$(0.01*output)
sort -r bigfile.txt|shuf|head -n ds1
It gives the following error:
head: invalid number of lines: ‘ds1’
I don't know what is wrong.
Even after you fix the issues in your bash script, bash cannot do floating-point arithmetic. You need an external tool like Awk, which I would use as:
randomCount=$(awk 'END{print int((NR==0)?0:(NR/100))}' bigfile.txt)
(( randomCount )) && sort -r bigfile.txt | shuf | head -n "$randomCount"
E.g. writing a file with 221 lines using the loop below and trying to get random lines:
tmpfile=$(mktemp /tmp/abc-script.XXXXXX)
for i in {1..221}; do echo $i; done >> "$tmpfile"
randomCount=$(awk 'END{print int((NR==0)?0:(NR/100))}' "$tmpfile")
If I print the count, it returns the integer 2, and using that in the next command,
sort -r "$tmpfile" | shuf | head -n "$randomCount"
86
126
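Putting the two commands together as a small script (the output name smallfile.txt is just a placeholder, not part of the original question):
#!/usr/bin/env bash
# keep roughly 1% of bigfile.txt, chosen at random, in smallfile.txt
randomCount=$(awk 'END{print int((NR==0)?0:(NR/100))}' bigfile.txt)
(( randomCount )) && sort -r bigfile.txt | shuf | head -n "$randomCount" > smallfile.txt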
Roll a die (with rand()) for each line of the file and get a number between 0 and 1. Print the line if the die shows less than 0.01:
awk 'rand()<0.01' bigFile
Quick test - generate 100,000,000 lines and count how many get through:
seq 1 100000000 | awk 'rand()<0.01' | wc -l
999308
Pretty close to 1%.
If you want the order random as well as the selection, you can pass this through shuf afterwards:
seq 1 100000000 | awk 'rand()<0.01' | shuf
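One caveat, depending on your awk: many implementations use a fixed seed unless srand() is called, so every run would select the same lines. If that matters, seed it explicitly:
seq 1 100000000 | awk 'BEGIN{srand()} rand()<0.01' | shuf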
On the subject of efficiency which came up in the comments, this solution takes 24s on my iMac with 100,000,000 lines:
time { seq 1 100000000 | awk 'rand()<0.01' > /dev/null; }
real 0m23.738s
user 0m31.787s
sys 0m0.490s
The only other solution that works here, heavily based on OP's original code, takes 13 minutes 19s.
An application is continually writing to a log. Each line forms a new entry, and the log is in CSV format. Example:
123123123,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
444444222,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
563434535,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
234234334,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
234234534,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
546456456,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
567567567,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
234232342,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
I need to poll the log and extract the data in chunks, appending the data to another log file called newLog.csv.
I need to ensure that:
I don't copy data already moved over to the new file,
if there are not 200 lines of data, it captures the nearest number of lines available, without getting duplicates.
Can I change this tail statement to meet the above?
tail -n 200 $REMOTE_HOME/data/log.csv >> $SCRIPT_DIR/$project/newLog.csv
Provided the first field in each line is some sort of time code (unixtime?), you could do:
1. Check the time of the last written line in the new log:
LAST_LINE=$(tail -n 1 /PATH/new_log | awk -F',' '{print $1}')
2. Check the time code of the first of the 200 lines you want to write:
FIRST_LINE=$(tail -n 200 /PATH/old_log | head -n 1 | awk -F',' '{print $1}')
3. If the last line in the new log is older than that first line, write the 200 lines:
if [ "$LAST_LINE" -lt "$FIRST_LINE" ]
then tail -n 200 /PATH/old_log >> /PATH/new_log; fi
Now you have to put it in a loop, so it still works when e.g. only 3 of the lines are actually new. Basically you do the same as before, you just shrink the window of last lines until its first line is newer than the last line already written.
LAST_LINE=$(tail -n 1 /PATH/new_log | awk -F',' '{print $1}')
COUNT=200
while [ "$COUNT" -gt 0 ]; do
    FIRST_LINE=$(tail -n "$COUNT" /PATH/old_log | head -n 1 | awk -F',' '{print $1}')
    if [ "$LAST_LINE" -lt "$FIRST_LINE" ]
    then tail -n "$COUNT" /PATH/old_log >> /PATH/new_log; break; fi
    COUNT=$((COUNT - 1))
done
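If you also need the polling itself, one way (the paths and the 60-second interval here are assumptions) is to wrap the whole check in a timed loop:
OLD=/PATH/old_log            # assumed paths, as in the snippets above
NEW=/PATH/new_log
while true; do
    LAST_LINE=$(tail -n 1 "$NEW" | awk -F',' '{print $1}')
    COUNT=200
    while [ "$COUNT" -gt 0 ]; do
        FIRST_LINE=$(tail -n "$COUNT" "$OLD" | head -n 1 | awk -F',' '{print $1}')
        if [ "${LAST_LINE:-0}" -lt "${FIRST_LINE:-0}" ]; then
            tail -n "$COUNT" "$OLD" >> "$NEW"
            break
        fi
        COUNT=$((COUNT - 1))
    done
    sleep 60                 # assumed polling interval
done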
I'm trying to take the first number from each file.dat, which has the form:
5.01 1 56.413481000 -0.00063400 0.00095770
5.01 2 61.193808800 0.00102170 0.00078280
5.01 3 65.974136600 -0.00108170 0.00102620
5.01 4 70.754464300 0.00082490 0.00103630
and then use this number (5.01) as the title of a .png file.
I use a bash script and I know the command line=$(head -n 1 $f), as found in a question here, but this gives me the whole first line of the file $f.
In this case the spaces in the line are kept as well, and the .png file title becomes:
plot 5.01 1 56.413481000 -0.00063400 0.00095770.png
Is there some way to take only 5.01 and get a trimmed title for the plot?
Thanks to all.
I'd probably just do it with perl:
VAL=$( echo "$line" | perl -pe 's/^[^\d]+//g;s/[^\d\.].*$//' )
Something like that anyway.
Should remove:
anything that isn't a digit from the start of the line,
everything from the first character that isn't a digit or a "." through to the end of the line.
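For instance, run against one of the sample lines from the question, it should leave just the leading number:
line=" 5.01 2 61.193808800 0.00102170 0.00078280"
VAL=$( echo "$line" | perl -pe 's/^[^\d]+//g;s/[^\d\.].*$//' )
echo "$VAL"    # prints 5.01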
Or with grep:
grep -o "[0-9]*\.[0-9]*" file.dat | head -1
Edit:
Testing without the head -1 on a one-line input:
echo " 5.01 2 61.193808800 0.00102170 0.00078280" | grep -o "[0-9]*\.[0-9]*"
5.01
61.193808800
0.00102170
0.00078280
Using head -1 will return the first match on the first line.
When you know the match will be on the first line, we can also ignore files with an incorrect first line (and avoid grepping through complete files):
Make a two-headed monster:
head -1 file.dat | grep -o "[0-9]*\.[0-9]*" | head -1
To extract the first field, assuming they are tab separated:
val=$(head -n 1 $f | cut -f 1)
or, if they are space separated instead:
val=$(head -n 1 $f | cut -f 1 -d ' ')
OR you can avoid calling any extra processes and keep all data manipulation in the bash shell with
while read -r realNum restOfLine; do
    break
done < "$f"
echo "$realNum"
This grabs the first "word" and puts the remaining into "restOfLine".
The break ensures that you only read the first line of the file.
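If the end goal is one .png named after that number for each .dat file, the same read trick works without an explicit while loop; the plot file naming below is purely hypothetical:
for f in *.dat; do
    read -r realNum _ < "$f"                # first word of the first line
    mv "${f%.dat}.png" "${realNum}.png"     # assumption: the plot was saved next to the data file
done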
IHTH