Get the first real number from a series of files


I am trying to take the first number from each file.dat, where each file has the form:
5.01 1 56.413481000 -0.00063400 0.00095770
5.01 2 61.193808800 0.00102170 0.00078280
5.01 3 65.974136600 -0.00108170 0.00102620
5.01 4 70.754464300 0.00082490 0.00103630
and then use this number (5.01) as the title of a .png file.
I use a bash script and I know the command line=$(head -n 1 $f), found in a question here, but this gives me the whole first line of the file $f.
The spaces in the line are kept as well, so the .png file title becomes:
plot 5.01 1 56.413481000 -0.00063400 0.00095770.png
Is there a way to take only 5.01 and get a trimmed title for the plot?
Thanks to all.

I'd probably just do it with perl:
VAL=$( echo "$line" | perl -pe 's/^[^\d]+//g;s/[^\d\.].*$//' )
Something like that anyway.
It should remove:
anything that isn't a digit from the start of the line;
everything from the first character that is neither a digit nor a . to the end of the line.

Or with grep:
grep -o "[0-9]*\.[0-9]*" file.dat | head -1
Edit:
Testing without the head -1 for a oneline input:
echo " 5.01 2 61.193808800 0.00102170 0.00078280" | grep -o "[0-9]*\.[0-9]*"
5.01
61.193808800
0.00102170
0.00078280
Using head -1 will return the first match on the first line.
If you know the match will be on the first line, you can read only that line instead of grepping through the complete file (this also skips files whose first line has no match).
Make a two-headed monster:
head -1 file.dat | grep -o "[0-9]*\.[0-9]*" | head -1

To extract the first field, assuming they are tab separated:
val=$(head -n 1 $f | cut -f 1)
or, if they are space separated instead:
val=$(head -n 1 $f | cut -f 1 -d ' ')

Or you can avoid calling any extra processes and keep all data manipulation in the bash shell with
while read -r realNum restOfLine; do
break
done < "$f"
echo "$realNum"
This grabs the first "word" and puts the remaining into "restOfLine".
The break ensures that you only read the first line of the file.
IHTH
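Putting the read approach into the loop over files might look like the sketch below. The scratch directory, the two sample files and the plot_ prefix are made up for the demo; only the first-field extraction comes from the answers above.

```shell
#!/bin/bash
# Hypothetical demo data: two .dat files in a scratch directory.
workdir=$(mktemp -d)
printf '5.01 1 56.413481000 -0.00063400 0.00095770\n5.01 2 61.193808800 0.00102170 0.00078280\n' > "$workdir/a.dat"
printf '7.25 1 12.000000000 0.00010000 0.00020000\n' > "$workdir/b.dat"

titles=()
for f in "$workdir"/*.dat; do
    # read consumes only the first line and splits it on whitespace:
    # the first field lands in first_num, everything else in the throwaway _
    read -r first_num _ < "$f"
    titles+=("plot_${first_num}.png")
done

printf '%s\n' "${titles[@]}"
rm -r "$workdir"
```

Because read already stops at the end of the first line, no loop or break is needed when only one line is wanted.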

Related

Loop Script from Input File

I have a reference file with device names in them. For example WABEL8499IPM101. I'm using this script to set the base name (without the last 3 digits) to look at the reference file and see what is already used. If 101 is used it will create a file for me with 102, 103 if I request 2 total. I'm looking to use an input file to run it multiple times. I'm also trying to figure out how to start at 101 if there isn't a name found when searching the reference file
I would like to loop this using an input file instead of manually entering bash test.sh WABEL8499IPM 2 each time. I would like to be able to build an input file of all the names that need compared and then output. It would also be nice that if there isn't a match that it starts creating names at WABEL8499IPM101 instead of just WABEL8499IPM1.
Input file example:
ColumnA (BASE NAME) ColumnB (QUANTITY)
WABEL8499IPM 2
Script:
SRCFILE="$HOME/Desktop/deviceinfo.csv"
LOGDIR="$HOME/Desktop"
LOGFILE="$LOGDIR/DeviceNames.csv"
# base name, such as "WABEL8499IPM"
device_name=$1
# quantity, such as "2"
quantityNum=$2
# the largest in sequence, such as "WABEL8499IPM108"
max_sequence_name=$(cat $SRCFILE | grep -o -e "$device_name[0-9]*" | sort --reverse | head -n 1)
# extract the last 3digit number (such as "108") from max_sequence_name
max_sequence_num=$(echo $max_sequence_name | rev | cut -c 1-3 | rev)
# create new sequence_name
# such as ["WABEL8499IPM109", "WABEL8499IPM110"]
array_new_sequence_name=()
for i in $(seq 1 $quantityNum);
do
cnum=$((max_sequence_num + i))
array_new_sequence_name+=($(echo $device_name$cnum))
done
#CODE FOR CREATING OUTPUT FILE HERE
#for fn in "${array_new_sequence_name[@]}"; do touch "$fn"; done;
# write log
for sqn in "${array_new_sequence_name[@]}";
do
echo $sqn >> $LOGFILE
done
Usage:
bash test.sh WABEL8499IPM 2
Result in the log file:
WABEL8499IPM109
WABEL8499IPM110

Just wrap a loop around your code instead of assuming the args come in on the command line.
# A quoted ~ is not expanded by the shell, so use $HOME instead
SRCFILE="$HOME/Desktop/deviceinfo.csv"
LOGDIR="$HOME/Desktop"
LOGFILE="$LOGDIR/DeviceNames.csv"
while read -r device_name quantityNum
do max_sequence_name=$( grep -o -e "$device_name[0-9]*" "$SRCFILE" |
sort --reverse | head -n 1)
max_sequence_num=${max_sequence_name: -3}
array_new_sequence_name=()
for i in $(seq 1 "$quantityNum")
do cnum=$((max_sequence_num + i))
array_new_sequence_name+=("$device_name$cnum")
done
for sqn in "${array_new_sequence_name[@]}"
do echo "$sqn" >> "$LOGFILE"
done
done < input.file
I'd maybe pass the input file as the parameter now.
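The question also asks for numbering to start at 101 when the base name is not in the reference file. One possible way to handle that is sketched below; the helper name next_names, the temporary files and the NEWDEV base name are all invented for the demo, and the sketch assumes the suffix is always three digits.

```shell
#!/bin/bash
# Hypothetical helper: print the next qty names for a base name.
# When the base is not found, 100 is used as the "previous" number,
# so the first generated name ends in 101.
next_names() {
    local base=$1 qty=$2 src=$3 max i
    max=$(grep -o -e "${base}[0-9]\{3\}" "$src" | sort -r | head -n 1)
    if [ -z "$max" ]; then
        max=100
    else
        max=$((10#${max: -3}))   # 10# forces base 10, so "008" is not read as octal
    fi
    for ((i = 1; i <= qty; i++)); do
        printf '%s%03d\n' "$base" "$((max + i))"
    done
}

# Demo data (made up): a reference file and an input file.
src=$(mktemp)
input=$(mktemp)
printf 'WABEL8499IPM107\nWABEL8499IPM108\n' > "$src"
printf 'WABEL8499IPM 2\nNEWDEV 2\n' > "$input"

result=$(while read -r base qty; do
    next_names "$base" "$qty" "$src"
done < "$input")

echo "$result"
rm -f "$src" "$input"
```

With this demo data the known base continues at 109 and 110, while the unknown base starts at 101.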

How to properly use the grep command to grab and store integers?

I am currently building a bash script for class, and I am trying to use the grep command to grab the values from a simple calculator program and store them in the variables I assign, but I keep receiving a syntax error message when I try to run the script. Any advice on how to fix it? my script looks like this:
#!/bin/bash
addanwser=$(grep -o "num1 + num2" Lab9 -a 5 2)
echo "addanwser"
subanwser=$(grep -o "num1 - num2" Lab9 -s 10 15)
echo "subanwser"
multianwser=$(grep -o "num1 * num2" Lab9 -m 3 10)
echo "multianwser"
divanwser=$(grep -o "num1 / num2" Lab9 -d 100 4)
echo "divanwser"
modanwser=$(grep -o "num1 % num2" Lab9 -r 300 7)
echo "modawser"

You want to grep the output of a command.
grep searches either a file or standard input, so all of these forms are equivalent:
grep X file # 1. from a file
... things ... | grep X # 2. from stdin
grep X <<< "content" # 3. using here-strings
For this case, you want to use the last one, so that you execute the program and its output feeds grep directly:
grep <something> <<< "$(Lab9 -s 10 15)"
Which is the same as saying:
Lab9 -s 10 15 | grep <something>
So that grep will act on the output of your program. Since I don't know how Lab9 works, let's use a simple example with seq, that returns numbers from 5 to 15:
$ grep 5 <<< "$(seq 5 15)"
5
15

grep is usually used for finding matching lines of a text file. To actually grab a part of the matched line other tools such as awk are used.
Assuming the output looks like "num1 + num2 = 54" (i.e. fields are separated by space), this should do your job:
addanwser=$(Lab9 -a 5 2 | awk '{print $NF}')
echo "$addanwser"
Make sure you don't miss the '$' sign before addanwser when echo'ing it.
$NF selects the last field. You may select nth field using $n.
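Since the real Lab9 program is not shown, the sketch below uses a made-up shell function, lab9_stub, that prints lines in the assumed "num1 + num2 = 7" format. Only the command substitution and the awk call are the point here.

```shell
#!/bin/bash
# Hypothetical stand-in for the Lab9 program; the output format is an assumption.
lab9_stub() {
    case $1 in
        -a) echo "num1 + num2 = $(( $2 + $3 ))" ;;
        -s) echo "num1 - num2 = $(( $2 - $3 ))" ;;
    esac
}

# Run the program, pipe its output to awk, and keep only the last field.
addanswer=$(lab9_stub -a 5 2 | awk '{print $NF}')
subanswer=$(lab9_stub -s 10 15 | awk '{print $NF}')

echo "$addanswer"
echo "$subanswer"
```

Note that the program is executed inside $( ... ); passing its name and flags as extra arguments to grep, as in the question, makes grep treat them as its own options and file names.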

Incrementing a single number in a history line

So I'm having difficulty figuring this out.
What I am trying to do, is display the most recently entered command
Let's use this as an example:
MD5=$(cat $DICT | head -1 | tail -1 | md5sum)
This command has just been executed. It is contained inside of a shell script.
After it is executed, the output is checked in an if..then..else.. statement.
If the condition is met, I want it to run the command above, except I want it incremented by one, every time it is run.
For instance:
MD5=$(cat $DICT | head -1 | tail -1 | md5sum)
if test ! $MD5=$HASH #$HASH is a user defined MD5Hash, it is checking if $MD5 does NOT equal the user's $HASH
then #one liner to display the history, to display the most recent
"MD5=$(cat $DICT | head -1 | tail -1 | md5sum)" #pipe it to remove the column count, then increment the "head -1" to "head -2"
else echo "The hash is the same."
fi #I also need this if..then..else statement to run, until the "else" condition is met.
Can anyone help, please and thank you. I'm having a brain fart.
I was thinking using sed, or awk to increment. grep to display the most recent of the commands,
So say:
$ history 3
Would output:
1 MD5=$(cat $DICT | head -1 | tail -1 | md5sum)
2 test ! $MD5=$HASH
3 history 3
-
$ history 3 | grep MD5
Would output:
1 MD5=$(cat $DICT | head -1 | tail -1 | md5sum)
Now I want it to remove the 1, and add a 1 to head's value, and rerun that command. And send that command back through the if..then..else test.

UPDATED
If I understood your problem well, this can be a solution:
# Setup test environment
DICT=infile
cat >"$DICT" <<XXX
Kraftwerk
King Crimson
Solaris
After Cyring
XXX
HASH=$(md5sum <<<"After Cyring")
# Process input file and look for match
while read line; do
md5=$(md5sum<<<"$line")
((++count))
[ "$HASH" == "$md5" ] && echo "The hash is the same. ($count)" && break
done <$DICT
Output:
The hash is the same. (4)
I improved the script a little bit. It saves one more clone(2) and pipe(2) call by using the md5sum<<<word notation instead of echo word|md5sum.
First it sets up the test environment, creating infile and a HASH. Then it reads each line of the input file, computes its MD5 checksum and checks whether it matches HASH. If so, it writes a message to stdout and breaks out of the loop.
IMHO the original problem was a little bit over-thought.
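If the line number really has to be stepped the way the question describes (head -1, then head -2, and so on), a plain counter does that without touching the shell history. This is only a sketch: the dictionary content is made up, sed -n "${n}p" stands in for the head | tail pair, and md5sum is assumed to be installed.

```shell
#!/bin/bash
# Made-up test data: a three-line dictionary and the hash of its last entry.
DICT=$(mktemp)
printf '%s\n' 'Kraftwerk' 'King Crimson' 'Solaris' > "$DICT"
HASH=$(printf '%s\n' 'Solaris' | md5sum)

total=$(wc -l < "$DICT")
found=0
n=1
while [ "$n" -le "$total" ]; do
    # Hash the n-th line only, then compare with the wanted hash.
    MD5=$(sed -n "${n}p" "$DICT" | md5sum)
    if [ "$MD5" = "$HASH" ]; then
        found=$n
        break
    fi
    n=$((n + 1))   # this is the "increment head's value" step
done

echo "The hash is the same. ($found)"
rm -f "$DICT"
```

This rereads the file for every line, so the while read loop above is the better choice for large dictionaries.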

How do I pick random unique lines from a text file in shell?

I have a text file with an unknown number of lines. I need to grab some of those lines at random, but I don't want there to be any risk of repeats.
I tried this:
jot -r 3 1 `wc -l<input.txt` | while read n; do
awk -v n=$n 'NR==n' input.txt
done
But this is ugly, and doesn't protect against repeats.
I also tried this:
awk -vmax=3 'rand() > 0.5 {print;count++} count>max {exit}' input.txt
But that obviously isn't the right approach either, as I'm not guaranteed even to get max lines.
I'm stuck. How do I do this?

This might work for you:
shuf -n3 file
shuf is one of GNU coreutils.
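A quick way to convince yourself that shuf does not repeat lines is sketched below, assuming GNU shuf is installed; the five-line sample file is made up. Without its -r option, shuf outputs a permutation of the input, so each input line is used at most once.

```shell
#!/bin/bash
# Made-up sample file with five distinct lines.
file=$(mktemp)
printf '%s\n' one two three four five > "$file"

picked=$(shuf -n 3 "$file")
echo "$picked"

# Count how many lines came back and how many of them are distinct.
count=$(printf '%s\n' "$picked" | wc -l)
distinct=$(printf '%s\n' "$picked" | sort -u | wc -l)
echo "lines: $((count)), distinct: $((distinct))"
rm -f "$file"
```

If the input file itself contains identical lines, both copies can still be picked; run it through sort -u first if that matters.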

If you have Python accessible (change the 10 to what you'd like):
python -c 'import random, sys; print("".join(random.sample(sys.stdin.readlines(), 10)).rstrip("\n"))' < input.txt
(This will work in Python 2.x and 3.x.)
Also, (again change the 10 to the appropriate value):
sort -R input.txt | head -10

If jot is on your system, then I guess you're running FreeBSD or OSX rather than Linux, so you probably don't have tools like rl or sort -R available.
No worries. I had to do this a while ago. Try this instead:
$ printf 'one\ntwo\nthree\nfour\nfive\n' > input.txt
$ cat rndlines
#!/bin/sh
# default to 3 lines of output
lines="${1:-3}"
# default to "input.txt" as input file
input="${2:-input.txt}"
# First, put a random number at the beginning of each line.
while read line; do
printf '%8d%s\n' $(jot -r 1 1 99999999) "$line"
done < "$input" |
sort -n | # Next, sort by the random number.
sed 's/^.\{8\}//' | # Last, remove the number from the start of each line.
head -n "$lines" # Show our output
$ ./rndlines 3 input.txt
two
one
five
$ ./rndlines 3 input.txt
four
two
three
$
Here's a 1-line example that also inserts the random number a little more cleanly using awk:
$ printf 'one\ntwo\nthree\nfour\nfive\n' | awk 'BEGIN{srand()} {printf("%8d%s\n", rand()*10000000, $0)}' | sort -n | head -n 3 | cut -c9-
Note that different versions of sed (in FreeBSD and OSX) may require the -E option instead of -r to handle the ERE instead of the BRE dialect in the regular expression if you want to use that explicitly, though everything I've tested works with escaped bounds in BRE. (Ancient versions of sed (HP/UX, etc.) might not support this notation, but you'd only be using those if you already knew how to do this.)

This should do the trick, at least with bash and assuming your environment has the other commands available:
cat chk.c | while read x; do
echo $RANDOM:$x
done | sort -t: -k1 -n | tail -10 | sed 's/^[0-9]*://'
It basically outputs your file, placing a random number at the start of each line.
Then it sorts on that number, grabs the last 10 lines, and removes that number from them.
Hence, it gives you ten random lines from the file, with no repeats.
For example, here's a transcript of it running three times with that chk.c file:
====
pax$ testprog chk.c
} else {
}
newNode->next = NULL;
colm++;
====
pax$ testprog chk.c
}
arg++;
printf (" [%s] n", currNode->value);
free (tempNode->value);
====
pax$ testprog chk.c
char tagBuff[101];
}
return ERR_OTHER;
#define ERR_MEM 1
===
pax$ _

sort -Ru filename | head -5
will ensure no duplicates. Not all implementations of sort have the -R option.

To get N random lines from FILE with Perl:
perl -MList::Util=shuffle -e 'print shuffle <>' FILE | head -N

Here's an answer using ruby if you don't want to install anything else:
cat filename | ruby -e 'puts ARGF.read.split("\n").uniq.shuffle.join("\n")'
for example, given a file (dups.txt) that looks like:
1 2
1 3
2
1 2
3
4
1 3
5
6
6
7
You might get the following output (or some permutation):
cat dups.txt| ruby -e 'puts ARGF.read.split("\n").uniq.shuffle.join("\n")'
4
6
5
1 2
2
3
7
1 3
Further example from the comments:
printf 'test\ntest1\ntest2\n' | ruby -e 'puts ARGF.read.split("\n").uniq.shuffle.join("\n")'
test1
test
test2
Of course if you have a file with repeated lines of test you'll get just one line:
printf 'test\ntest\ntest\n' | ruby -e 'puts ARGF.read.split("\n").uniq.shuffle.join("\n")'
test

What's an easy way to read random line from a file?

What's an easy way to read random line from a file in a shell script?

You can use shuf:
shuf -n 1 $FILE
There is also a utility called rl. In Debian it's in the randomize-lines package that does exactly what you want, though not available in all distros. On its home page it actually recommends the use of shuf instead (which didn't exist when it was created, I believe). shuf is part of the GNU coreutils, rl is not.
rl -c 1 $FILE

Another alternative:
head -$((${RANDOM} % `wc -l < file` + 1)) file | tail -1

sort --random-sort $FILE | head -n 1
(I like the shuf approach above even better though - I didn't even know that existed and I would have never found that tool on my own)

This is simple.
cat file.txt | shuf -n 1
Granted this is just a tad slower than the "shuf -n 1 file.txt" on its own.

perlfaq5: How do I select a random line from a file? Here's a reservoir-sampling algorithm from the Camel Book:
perl -e 'srand; rand($.) < 1 && ($line = $_) while <>; print $line;' file
This has a significant advantage in space over reading the whole file in. You can find a proof of this method in The Art of Computer Programming, Volume 2, Section 3.4.2, by Donald E. Knuth.
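The same reservoir idea can be extended from one line to k lines, which also covers the "several unique lines" case. The sketch below does it in awk rather than perl; the variable names, the 20-line sample file and the use of bash's $RANDOM as a seed are choices made for this example.

```shell
#!/bin/bash
# Made-up sample file: the numbers 1 to 20, one per line.
file=$(mktemp)
seq 1 20 > "$file"
k=3

sample=$(awk -v k="$k" -v seed="$RANDOM" '
    BEGIN { srand(seed) }
    NR <= k { keep[NR] = $0; next }      # the first k lines fill the reservoir
    {
        j = int(rand() * NR) + 1         # uniform integer in 1..NR
        if (j <= k) keep[j] = $0         # replace a slot with probability k/NR
    }
    END {
        n = (NR < k) ? NR : k
        for (i = 1; i <= n; i++) print keep[i]
    }
' "$file")

echo "$sample"
rm -f "$file"
```

Only k lines are held in memory at any time, and each input line can occupy at most one slot, so no line is returned twice.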

using a bash script:
#!/bin/bash
# replace with file to read
FILE=tmp.txt
# count number of lines
NUM=$(wc -l < "${FILE}")
# generate random number in range 1-NUM
X=$(( RANDOM % NUM + 1 ))
# extract X-th line
sed -n "${X}p" "${FILE}"

Single bash line:
sed -n $((1+$RANDOM%`wc -l test.txt | cut -f 1 -d ' '`))p test.txt
Slight problem: duplicate filename.

Here's a simple Python script that will do the job:
import random, sys
lines = open(sys.argv[1]).readlines()
print(lines[random.randrange(len(lines))])
Usage:
python randline.py file_to_get_random_line_from

Another way using 'awk'
awk NR==$((${RANDOM} % `wc -l < file.name` + 1)) file.name

A solution that also works on MacOSX, and should also work on Linux(?):
N=5
awk 'NR==FNR {lineN[$1]; next}(FNR in lineN)' <(jot -r $N 1 $(wc -l < $file)) $file
Where:
N is the number of random lines you want
NR==FNR {lineN[$1]; next}(FNR in lineN) file1 file2
--> save line numbers written in file1 and then print corresponding line in file2
jot -r $N 1 $(wc -l < $file) --> draw N numbers randomly (-r) in range (1, number_of_line_in_file) with jot. The process substitution <() will make it look like a file for the interpreter, so file1 in previous example.
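To see the NR==FNR idiom on its own, the sketch below replaces the jot call with two fixed line numbers, so the result is predictable; the sample file is made up. Bash is assumed because of the process substitution.

```shell
#!/bin/bash
# Made-up sample file with five lines.
file=$(mktemp)
printf '%s\n' one two three four five > "$file"

# First input: the wanted line numbers (2 and 4). Second input: the data file.
# While NR==FNR awk is still reading the first input and only records the numbers;
# afterwards it prints each line of the second input whose number was recorded.
picked=$(awk 'NR==FNR {lineN[$1]; next} (FNR in lineN)' <(printf '2\n4\n') "$file")

echo "$picked"
rm -f "$file"
```

Because the line numbers are stored as array keys, a number that jot -r happens to draw twice selects its line only once, so the output can contain fewer than N lines.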

#!/bin/bash
IFS=$'\n' wordsArray=($(<$1))
numWords=${#wordsArray[@]}
sizeOfNumWords=${#numWords}
while [ True ]
do
for ((i=0; i<$sizeOfNumWords; i++))
do
let ranNumArray[$i]=$(( ( $RANDOM % 10 ) + 1 ))-1
ranNumStr="$ranNumStr${ranNumArray[$i]}"
done
if [ $ranNumStr -lt $numWords ]
then
break
fi
ranNumStr=""
done
noLeadZeroStr=$((10#$ranNumStr))
echo ${wordsArray[$noLeadZeroStr]}

Here is what I discovered, since my Mac OS doesn't support all the easy answers. I used the jot command to generate a number, since the $RANDOM variable solutions seem not to be very random in my tests. When testing my solution I saw a wide variance in the output.
RANDOM1=`jot -r 1 1 235886`
#range of jot ( 1 235886 ) found from earlier wc -w /usr/share/dict/web2
echo $RANDOM1
head -n $RANDOM1 /usr/share/dict/web2 | tail -n 1
The echo of the variable is to get a visual of the generated random number.

Using only vanilla sed and awk, and without using $RANDOM, a simple, space-efficient and reasonably fast "one-liner" for selecting a single line pseudo-randomly from a file named FILENAME is as follows:
sed -n $(awk 'END {srand(); r=rand()*NR; if (r<NR) {sub(/\..*/,"",r); r++;}; print r}' FILENAME)p FILENAME
(This works even if FILENAME is empty, in which case no line is emitted.)
One possible advantage of this approach is that it only calls rand() once.
As pointed out by @AdamKatz in the comments, another possibility would be to call rand() for each line:
awk 'rand() * NR < 1 { line = $0 } END { print line }' FILENAME
(A simple proof of correctness can be given based on induction.)
Caveat about rand()
"In most awk implementations, including gawk, rand() starts generating numbers from the same starting number, or seed, each time you run awk."
-- https://www.gnu.org/software/gawk/manual/html_node/Numeric-Functions.html
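One way around that caveat is to seed the generator explicitly. The sketch below passes bash's $RANDOM in as the seed; the variable name seed and the 50-line sample file are made up for the example.

```shell
#!/bin/bash
# Made-up sample file: the numbers 1 to 50, one per line.
file=$(mktemp)
seq 1 50 > "$file"

# srand(seed) gives each run a different starting point, as long as seed differs.
line=$(awk -v seed="$RANDOM" '
    BEGIN { srand(seed) }
    rand() * NR < 1 { line = $0 }
    END { print line }
' "$file")

echo "$line"
rm -f "$file"
```

Calling srand() with no argument seeds from the time of day instead, which is usually enough but gives identical results for runs started within the same second.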
