I’m having a hard time figuring out why my script creates the files in the “eggs” directory but doesn’t write any text into them. Can anyone help me figure out what I have wrong in my code, or offer suggestions?
I’ve tried both single > and double >> redirection when writing the text to the files, but neither works; the files are just left blank.
Edit:
file=0
RandomEgg=$(( RANDOM % 10 ))
cd eggs
while [ $file -lt 10 ]
do
touch "egg$file"
file=$(( file +1 ))
done
for files in $(ls eggs)
do
if [ $file -eq $RandomEgg ]
then
echo 'Found it!' > egg$file
else
echo 'Not Here!' > egg$file
fi
done
In bash, the script could be reduced into
cd eggs || exit
RandomEgg=$(( RANDOM % 10 ))
for ((i = 0; i < 10; ++i)); do
if ((i == RandomEgg)); then
echo 'Found it!'
else
echo 'Not Here!'
fi > egg$i
done
or,
cd eggs || exit
RandomEgg=$(( RANDOM % 10 ))
msg=('Not Here!' 'Found it!')
for ((i = 0; i < 10; ++i)); do echo "${msg[i == RandomEgg]}" > egg$i; done
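(The arithmetic comparison i == RandomEgg inside the index evaluates to 1 for the matching egg and 0 for every other one, so it picks 'Found it!' or 'Not Here!' from the msg array accordingly.)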
Change the second loop to:
for files in egg*
do
if [ $files = "egg$RandomEgg" ]
then
echo 'Found it!' > $files
else
echo 'Not Here!' > $files
fi
done
You don't need to use ls to list the files; just use a wildcard. You were also listing the wrong directory: the script has already done cd eggs, so ls eggs looks for an eggs subdirectory inside it, while the files were actually created in the current directory.
You need to use $files as the filename, not egg$file, since that's the variable from this for loop.
You must use = to compare strings, not -eq. -eq is for numbers.
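Putting the three fixes together, a corrected sketch of the original two-loop script (keeping the question's variable names) would look roughly like this:
cd eggs || exit
RandomEgg=$(( RANDOM % 10 ))
file=0
while [ "$file" -lt 10 ]
do
    touch "egg$file"
    file=$(( file + 1 ))
done
for files in egg*
do
    if [ "$files" = "egg$RandomEgg" ]
    then
        echo 'Found it!' > "$files"
    else
        echo 'Not Here!' > "$files"
    fi
done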
I am new to shell scripting. We have a requirement to write a shell script for the following: we receive daily files in the format below.
1. /app/dstage/BAL/Activ Bal_pen_20200129.xls
2. /app/dstage/BAL/Activ Bal_pen_20200130.xls
3. /app/dstage/BAL/Activ Bal_pen_20200131.xls
As soon as the files arrive in the directory, I need to move them to another staging directory. I am writing the script below, but I get a "file not found" message even though the file "Activ Bal_pen_20200129.xls" is present in the directory. I can see the problem is the space in the file name. How do I handle the space in the file name? My script is below:
#!/bin/ksh
# Validation the number of arguments
if (( $# == 4 )); then
Pattern=$1
numFilesExpected=$2
stgDir=$3
maxWaitTime=$4
else
print -u2 "Wrong number of arguments (${#}): Usage FileValidation.ksh <Pattern> <numFilesExpected> <stgDir> <maxWaitTime>"
return 8
fi
# While max waiting duration is not obtained find files and set the array containing the file names; closing STDERR in case no files are found
waitTime=0
while (( $waitTime < $maxWaitTime )) ; do
set -A fileList $(ls $Pattern 2<&-)
if (( ${#fileList[@]} != 0 )); then
break
fi
sleep 5
((waitTime++))
done
# Resetting the fileList in case it did not go in the loop
set -A fileList $(ls $Pattern 2<&-)
# Verifying if the pattern returned a file
if (( ${#fileList[@]} == 0 )); then
print -u2 "No file found with pattern: $Pattern"
return 1
#
elif [[ $(basename "$Pattern") = "*" ]]; then
print -u2 "Found a source file pattern with no prefix (only a path with wildcard *): $Pattern"
return 1
# Validation of the number of files expected
elif (( $numFilesExpected != 0 && ${#fileList[@]} != $numFilesExpected )); then
print -n "${#fileList[@]}"
return 2
fi
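One common way around the space problem (this is only a minimal bash sketch, not the original FileValidation.ksh; srcDir and stgDir are placeholder names) is to let the shell glob the files directly instead of parsing the output of ls, and to quote every expansion:
srcDir=/app/dstage/BAL        # placeholder source directory
stgDir=/app/dstage/staging    # placeholder staging directory
shopt -s nullglob                           # expand to an empty list instead of the literal pattern
files=( "$srcDir"/Activ\ Bal_pen_*.xls )    # each match becomes one array element, spaces preserved
if (( ${#files[@]} == 0 )); then
    echo "No file found in $srcDir" >&2
    exit 1
fi
for f in "${files[@]}"; do                  # quote every expansion so the space survives
    mv -- "$f" "$stgDir"/
done
Quoting "$f" when calling mv is what keeps the space in the file name from being treated as a word separator.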
I am trying to create 1000s of large CSVs rapidly. This function generates the CSVs:
function csvGenerator () {
for ((i=1; i<=$NUMCSVS; i++)); do
CSVNAME=$DIRNAME"-"$CSVPREFIX$i$CSVEXT
HEADERARRAY=()
if [[ ! -e $CSVNAME ]]; then #Only create csv file if it not exist
touch $CSVNAME
echo "file: "$CSVNAME "created at $(date)" >> ../status.txt
fi
for ((j=1; j<=$NUMCOLS; j++)); do
if (( j < $NUMCOLS )) ; then
HEADERNAME=$DIRNAME"-csv-"$i"-header-"$j", "
elif (( j == $NUMCOLS )) ; then
HEADERNAME=$DIRNAME"-csv-"$i"-header-"$j
fi
HEADERARRAY+=$HEADERNAME
done
echo $HEADERARRAY > $CSVNAME
for ((k=1; k<=$NUMROWS; k++)); do
ROWARRAY=()
for ((l=1; l<=$NUMCOLS; l++)); do
if (( l < $NUMCOLS )) ; then
ROWVALUE=$DIRNAME"-csv-"$i"-r"$k"c"$l", "
elif (( l == $NUMCOLS )) ; then
ROWVALUE=$DIRNAME"-csv-"$i"-r"$k"c"$l
fi
ROWARRAY+=$ROWVALUE
done
echo $ROWARRAY >> $CSVNAME
done
done
}
The script takes ~3 mins to generate a CSV with 100k rows and 70 cols. What do I need to do to generate these CSVs at the rate of 1 CSV/~10 seconds?
Let me start by saying that bash and "performant" don't usually go together in the same sentence. As other commenters suggested, awk may be a good choice here: it stays close to shell scripting but is far better suited to generating this much text.
I haven't yet had a chance to run your code, but it opens and closes the output file once per row — in this example, 100,000 times. Each time it must seek to the end of the file so that it can append the latest row.
Try pulling the actual generation (everything after for ((j=1; j<=$NUMCOLS; j++)); do) into a new function, like generateCsvContents. In that new function, don't reference $CSVNAME, and remove the redirections on the echo statements. Then, in the original function, call the new function and redirect its output to the filename. Roughly:
function csvGenerator () {
for ((i=1; i<=NUMCSVS; i++)); do
CSVNAME=$DIRNAME"-"$CSVPREFIX$i$CSVEXT
if [[ ! -e $CSVNAME ]]; then #Only create csv file if it not exist
echo "file: $CSVNAME created at $(date)" >> ../status.txt
fi
# This will create $CSVNAME if it doesn't yet exist
generateCsvContents > "$CSVNAME"
done
}
function generateCsvContents() {
HEADERARRAY=()
for ((j=1; j<=NUMCOLS; j++)); do
if (( j < NUMCOLS )) ; then
HEADERNAME=$DIRNAME"-csv-"$i"-header-"$j", "
elif (( j == NUMCOLS )) ; then
HEADERNAME=$DIRNAME"-csv-"$i"-header-"$j
fi
HEADERARRAY+=$HEADERNAME
done
echo $HEADERARRAY
for ((k=1; k<=NUMROWS; k++)); do
ROWARRAY=()
for ((l=1; l<=NUMCOLS; l++)); do
if (( l < NUMCOLS )) ; then
ROWVALUE=$DIRNAME"-csv-"$i"-r"$k"c"$l", "
elif (( l == NUMCOLS )) ; then
ROWVALUE=$DIRNAME"-csv-"$i"-r"$k"c"$l
fi
ROWARRAY+=$ROWVALUE
done
echo "$ROWARRAY"
done
}
"Not this way" is I think the answer.
There are a few problems here.
You're not using your arrays as arrays. When you append to them like strings, you only modify the first element of the array, which is misleading (see the short demo after this list).
The way you're using >> causes the output file to be opened and closed once for every line. That's potentially wasteful.
You're not quoting your variables. In fact, you're quoting the stuff that doesn't need quoting, and not quoting the stuff that does.
Upper case variable names are not recommended, due to the risk of collision with system variables. ref
Bash isn't good at this. Really.
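To illustrate the first point, here is a quick demo of the difference between string append and array append in bash:
arr=()
arr+="foo"           # string append: concatenates onto arr[0]
arr+="bar"
echo "${#arr[@]}"    # prints 1 -- still a single element, containing "foobar"
arr2=()
arr2+=("foo")        # array append: adds a new element
arr2+=("bar")
echo "${#arr2[@]}"   # prints 2
Your HEADERARRAY+=$HEADERNAME and ROWARRAY+=$ROWVALUE lines are doing the first form, so the "arrays" are really just growing strings.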
A cleaned up version of your function might look like this:
csvGenerator2() {
for (( i=1; i<=NUMCSVS; i++ )); do
CSVNAME="$DIRNAME-$CSVPREFIX$i$CSVEXT"
# Only create csv file if it not exist
[[ -e "$CSVNAME" ]] && continue
touch "$CSVNAME"
date "+[%F %T] created: $CSVNAME" | tee -a status.txt >&2
HEADER=""
for (( j=1; j<=NUMCOLS; j++ )); do
printf -v HEADER '%s, %s-csv-%s-header-%s' "$HEADER" "$DIRNAME" "$i" "$j"
done
echo "${HEADER#, }" > "$CSVNAME"
for (( k=1; k<=NUMROWS; k++ )); do
ROW=""
for (( l=1; l<=NUMCOLS; l++ )); do
printf -v ROW '%s, %s-csv-%s-r%sc%s' "$ROW" "$DIRNAME" "$i" "$k" "$l"
done
echo "${ROW#, }"
done >> "$CSVNAME"
done
}
(Note that I haven't switched the variables to lower case because I'm lazy, but it's still a good idea.)
And if you were to make something functionally equivalent in awk:
csvGenerator3() {
awk -v NUMCSVS="$NUMCSVS" -v NUMCOLS="$NUMCOLS" -v NUMROWS="$NUMROWS" -v DIRNAME="$DIRNAME" -v CSVPREFIX="$CSVPREFIX" -v CSVEXT="$CSVEXT" '
BEGIN {
for ( i=1; i<=NUMCSVS; i++) {
out=sprintf("%s-%s%s%s", DIRNAME, CSVPREFIX, i, CSVEXT)
if (!system("test -e " out)) continue
system("date '\''+[%F %T] created: " out "'\'' | tee -a status.txt >&2")
comma=""
for ( j=1; j<=NUMCOLS; j++ ) {
printf "%s%s-csv-%s-header-%s", comma, DIRNAME, i, j > out
comma=", "
}
printf "\n" >> out
for ( k=1; k<=NUMROWS; k++ ) {
comma=""
for ( l=1; l<=NUMCOLS; l++ ) {
printf "%s%s-csv-%s-r%sc%s", comma, DIRNAME, i, k, l >> out
comma=", "
}
printf "\n" >> out
}
}
}
'
}
Note that awk does not suffer from the open/close overhead mentioned earlier for bash; when a file is used for output or as a pipe, awk opens it once and keeps it open until it is explicitly closed or the program exits.
Comparing the two really highlights the choice you need to make:
$ time bash -c '. file; NUMCSVS=1 NUMCOLS=10 NUMROWS=100000 DIRNAME=2 CSVPREFIX=x CSVEXT=.csv csvGenerator2'
[2019-03-29 23:57:26] created: 2-x1.csv
real 0m30.260s
user 0m28.012s
sys 0m1.395s
$ time bash -c '. file; NUMCSVS=1 NUMCOLS=10 NUMROWS=100000 DIRNAME=3 CSVPREFIX=x CSVEXT=.csv csvGenerator3'
[2019-03-29 23:58:23] created: 3-x1.csv
real 0m4.994s
user 0m3.297s
sys 0m1.639s
Note that even my optimized bash version is only a little faster than your original code.
Refactoring your two inner for-loops to loops like this will save time:
for ((j=1; j<$NUMCOLS; ++j)); do
HEADERARRAY+=$DIRNAME"-csv-"$i"-header-"$j", "
done
HEADERARRAY+=$DIRNAME"-csv-"$i"-header-"$NUMCOLS
I have to write a shell script that creates a file containing the name of every text file, from a folder (given as a parameter) and its subfolders, that contains words longer than n characters (n is read from the keyboard).
This is the code I have written so far:
#!/bin/bash
# Verify that the first given parameter is a directory:
if [ ! -d $1 ]
then echo $1 is not a directory\!
exit 1
fi
# Reading n:
echo -n "Give the number n: "
read n
echo "You entered: $n"
# Destination where the file names will be written:
destinatie="destinatie"
# The part that I think is causing the problem:
nr=0;
#while read line;
#do
for fisier in `find $1 -type f`
do
counter=0
for word in $(<$fisier);
do
file=`basename "$fisier"`
length=`expr length $word`
echo "$length"
if [ $length -gt $n ];
then counter=$(($counter+1))
fi
done
if [ $counter -gt $nr ];
then echo "$file" >> $destinatie
fi
done
break
done
exit
The script works, but it does a few extra steps that I don't need. It seems to read some files more than once. Can anyone help me, please?
Does this help?
egrep -lr "\w{$n,}" $1/* >$destinatie
Some explanation:
\w means: a character that words consist of
{$n,} means: number of consecutive characters is at least $n
Option -l lists only the names of the matching files instead of the matched text, and -r scans your directory $1 recursively
Edit:
A slightly more complete version built around the egrep command:
#!/bin/bash
die() { echo "$@" 1>&2 ; exit 1; }
[ -z "$1" ] && die "which directory to scan?"
dir="$1"
[ -d "$dir" ] || die "$dir isn't a directory"
echo -n "Give the number n: "
read n
echo "You entered: $n"
[ $n -le 0 ] && die "the number should be > 0"
destinatie="destinatie"
egrep -lr "\w{$n,}" "$dir"/* | while read f; do basename "$f"; done >$destinatie
Your code has syntax errors, probably leftovers from your commented-out while loop. It would be best to remove the last 3 lines: the extra done causes the error, and break and exit are unnecessary, since there is nothing to break out of and the program terminates at its end anyway.
The program appears to output files multiple times because you just append to $destinatie. You could simply delete that file when you start:
rm "$destinatie"
You echo the numbers to stdout (echo "$length") and the file names to $destinatie (echo "$file" >> $destinatie). I do not know if that is intentional.
I found the problem. It was the directory I was searching in: because I had worked on the files in that directory and modified them, some files were left behind that were not displayed in the file explorer, but the script would still find them. I created another directory, passed it as the parameter, and it works. Thank you for your answers.
I am writing a small .sh program in bash.
The problem is extremely simple: find the prime factors of a number.
What I've done is write a .sh file that checks whether a number is prime or not.
Here is the code for that:
if [ $# -ne 1 ]; then
exit
fi
number=$1
half=$(($number / 2))
for (( i=2;i<$half;i++ ))
do
rem=$(($number % $i))
if [ $rem -eq 0 ]; then
echo "0"
exit
fi
done
echo "1"
And the second .sh file to generate the prime factors:
clear
echo "Enter number : "
read number
half=$(($number / 2))
for(( i=1;i<=$half;i++ ))
do
rem=$(($number % $i))
if [ $rem -eq 0 ]; then
ok=`prime.sh $rem`
if [ "$ok" == "1" ]; then
echi $i
fi
fi
done
This line ,
ok=`prime.sh $rem`
gives the following error :
primefactor.sh: line 10: prime.sh: command not found
So, is it not possible to divide a program into smaller modules and use them from other modules, as in other programming languages?
Some help on how to achieve this would be appreciated.
primefactor.sh: line 10: prime.sh: command not found
...means that prime.sh is not in your PATH, or is not executable. There are a few ways you can remedy this:
First, ensure that the +x bit is set:
chmod +x prime.sh
...then, add it to your PATH:
PATH=$PWD:$PATH
...or invoke it directly:
ok=$(./prime.sh "$rem")
By the way, names ending in .sh are appropriate for POSIX sh libraries, not bash scripts (which typically aren't valid POSIX sh scripts anyhow). You don't run ls.elf; you should run prime, not prime.sh, for the same reasons.
That said, if your goal is just to split your code amongst multiple files, a library might be the right thing. Using subshells (which fork an existing shell instance) is much more efficient than spawning subprocesses (which involve both a fork and an exec).
For instance, you could write prime.bash:
check() {
local number half i rem
number=$1
half=$((number / 2))
for (( i=2; i<half; i++ )); do
rem=$((number % i))
if (( rem == 0 )); then
echo "0"
return
fi
done
echo "1"
}
...and then, in your primefactor script, read in that library and use the function it defined:
source prime.bash # read in the library
clear
echo "Enter number : "
read number
half=$((number / 2))
for(( i=1;i<=half;i++ ))
do
rem=$((number % i))
if (( rem == 0 )); then
ok=$(check "$rem")
if [[ $ok = 1 ]]; then
echo "$i"
fi
fi
done
Call your script like this:
ok=`./prime.sh $rem`
for m in $count
do
`cat $op ${arr[$m]} > $op1`
`rm -f $op`
`touch $op`
`cat $op1 ${arr[$m+1]} > $op`
if [ $m ge $count ]; then
`rm -f $op1`
`touch $op1`
fi
m=$((m+1))
done
I want to loop continuously from the start count 2 to the end count 10 ($count is 10 here). But the above piece of code executes the for loop only once.
Rainy Sunday, plenty of free time, long answer ;)
There are many issues with your script; here are some recommended solutions. Because you used the construction m=$((m+1)), I will assume bash as the shell. (Consider adding the bash tag.)
For the loop itself, there are several possibilities.
count=10
m=2 #start with 2
while (( $m <= $count )) #while m is less or equal to 10
do #do
echo $m #this action
let m++ #increment m (add one to m)
done #end of while
or, if the count is a constant (not a variable), you can write
for m in {2..10} #REMEMBER: this will not work with variables, like {2..$count}
do
echo "$m"
done
another variant - using the seq (man seq) command for counting
for m in $(seq 2 ${count:=10}) # ${count:=10} - defaults the $count to 10 if it is undefined
do
echo $m
done
or C-like for loop
let count=10
for ((m=2; m<=count; m++))
do
echo $m
done
All four loops produce:
2
3
4
5
6
7
8
9
10
So we have a correct loop now; let's add your specific actions.
The:
rm -f $op
touch $op
can be replaced by one command
echo -n > $op #echo nothing and write the "nothing" into the file
This is faster, because echo is a bash builtin (it doesn't start two external commands).
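As a small aside (plain bash behavior, not something from the original script): a redirection with no command at all also truncates the file, so either of these would do the same job:
: > $op    # the ':' no-op builtin plus a truncating redirection
> $op      # or just the redirection on its own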
So your actions could look like
cat $op ${arr[$m]} > $op1
echo -n > $op
cat $op1 ${arr[$m+1]} > $op
In this case the echo is actually unnecessary, because the second cat writes its output to $op anyway (truncating the file to zero length before writing), so the result is identical to the above:
cat $op ${arr[$m]} > $op1
cat $op1 ${arr[$m+1]} > $op
Those two cat commands can be shortened to one, using the >> (append to file) redirection:
cat ${arr[$m]} ${arr[m+1]} >> $op
The whole script could look like the next
#making a testing environment
for f in $(seq 12) #create 12 files opdata-N
do
arr[$f]="opdata-$f" #store the filenames in the array "arr"
echo "data-$f" > ${arr[$f]} #each file contains one line "data-N"
done
#echo ${arr[@]}
#setting the $op and $op1 filenames
#consider choosing more descriptive variable names
op="file_op"
#op1="file_op1" #not needed
#add some initial (old) value to $op
echo "initial value" > $op
#end of creating the testing environment
#the script
count=10
for m in $(seq 2 $count)
do
cat ${arr[$m]} ${arr[m+1]} >> $op
done
at the end, file $op will contain:
initial value
data-2
data-3
data-3
data-4
data-4
data-5
data-5
data-6
data-6
data-7
data-7
data-8
data-8
data-9
data-9
data-10
data-10
data-11
BTW, are you sure about the desired result? Because if you only want to append opdata-2 .. opdata-10 to the end of $op (without the duplicated entries), you can simply write:
cat opdata-{2..10} >> $op #the '>>' appends to the end of the file
or by using your array:
startpos=2
count=10
cat ${arr[@]:$startpos:$count} >> $op
Ufff.. ;)
PS: it is usually good practice to enclose variables in double quotes, like "$filename"; in the examples above I omitted them for better readability.
Any loop needs a "condition to keep looping". When you use a
for m in $count
type of loop, the condition is "if there are more elements in the collection, pick the next one and keep going". This doesn't seem to be what you want. You are looking for the bash equivalent of
for (m = 0; m < 10; m++)
I think. The best way to do this is with exactly that kind of loop (but note the extra pair of parentheses and the semicolon before do):
#!/bin/bash
# Display message 5 times
for ((i = 0 ; i < 5 ; i++)); do
echo "Welcome $i times."
done
see nix craft for original
I think you can extend this to your situation… if I understood your question correctly you need something like this:
for ((m = 2; m <= 10; m++))
do
cat $op ${arr[$m]} > $op1
rm -f $op
touch $op
cat $op1 ${arr[$m+1]} > $op
if [ "$m" -ge "$count" ]; then
rm -f $op1
touch $op1
fi
done
Use a while loop instead.
The for loop is when you have multiple objects to iterate against. You have only one, i.e. $count.
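A minimal sketch of that approach, keeping the question's variable names (arr, op, op1 and count are assumed to be set up beforehand, and the broken ge test is written as an arithmetic comparison):
m=2
while (( m <= count )); do
    cat "$op" "${arr[$m]}" > "$op1"
    cat "$op1" "${arr[$m+1]}" > "$op"
    if (( m >= count )); then
        : > "$op1"    # truncate op1 on the last pass, like the original rm/touch pair
    fi
    (( m++ ))
done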