There is a directory containing folders named with numbers, and I have to find the folder with the largest number in that directory.
This is the script I've written to find that folder:
files='ls path/'
var=0
for file in $files
do
echo $file
tmp=$((file-"0"))
if [ $tmp -gt $var ]
then
var=$tmp
fi
done
echo $var
But it's not working. It gives the error below after invoking the script with the command sudo ./restore2.sh:
ls
path/
./restore2.sh: line 6: path/: syntax error: operand expected (error token is "/")
0
Try this:
#!/bin/bash
files=`ls path/`
var=0
for file in $files
do
echo $file
tmp=$((file-"0"))
if [ $tmp -gt $var ]
then
var=$tmp
fi
done
echo $var
There's a backtick here: `ls path/` instead of single or double quotes.
I've only corrected this statement and it worked. Also notice the #!/bin/bash added at the top of the script; this tells your system to run the script in a bash shell.
You're using single quotes instead of backticks in files='ls path/'. The shell treats it as a literal string instead of evaluating the command.
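As an aside (my addition, not from the original answers): modern scripts generally prefer $(...) to backticks for command substitution, since it nests cleanly:

files=$(ls path/)    # same effect as the backtick version

though, as other answers here note, parsing the output of ls is best avoided altogether.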
Also, for that specific task, you can just do:
ls path/ | awk '{if($1 > largest){largest = $1}} END{print largest}'
To have it a bit simpler.
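If some of the listed names might not be purely numeric, forcing a numeric comparison is a bit safer (my variant on the same idea, untested):

ls path/ | awk '{ if ($1+0 > largest+0) largest = $1 } END { print largest }'

The +0 coerces each value to a number, so awk never falls back to string comparison.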
Use find instead:
find . -maxdepth 1 -type d -regextype "posix-extended" -regex "^.*[[:digit:]]+.*$" | sort -t/ -k2 -n | tail -1
Set the maxdepth to 1 to check for directories within this directory only and no deeper. Set the regular expression type to posix-extended and search for all directories that contain one or more digits. Sort the results numerically on the name part after the leading ./ (hence -t/ -k2 -n; a plain sort -n would treat every ./name line as 0 and fall back to a byte-wise order), then take the largest with tail -1.
Does path/ have any files in it? It looks like it's empty.
You should be getting a completely different complaint...
You don't want the path info in the filename. Rather than strip it with ${file##*/}, just go there and use non-path'd names.
An adaptation using your own logic as its base:
cd /whatever/path/ # go where the files are
var=-1 # initialize comparator
for file in [0-9]* # each entry that starts with a digit
do [[ "$file" =~ [^0-9] ]] && continue # skip any file with nondigit contents
[[ -f "$file" ]] || continue # only process plain files
(( file > var )) && var=$file # remember largest seen
done
echo $var # report largest
If you are sure there will be no negative numbered filenames, this should do it.
If there can be valid negatives, then your initialization needs to be appropriately lower, and the exclusion of nondigits should include the minus sign, as well as the list of files to select.
Note that this doesn't parse ls and doesn't require piping through a sort or spawning any other processes -- it's all handled in the bash interpreter and should be pretty efficient.
If you are sure of your data, and know there aren't any negatives or files named just 0 or non-plain-file entries in the directory that match the [0-9]* pattern, you can simplify it to just
cd /whatever/path/ # go where the files are
for file in [0-9]*; do (( file > var )) && var=$file; done
echo $var # report largest
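One gotcha with this arithmetic comparison (my note, not from the original answer): bash reads a literal leading zero as octal, so a name like 089 would raise an error inside (( )). Forcing base 10 avoids it, still assuming all-digit names:

for file in [0-9]*; do (( 10#$file > var )) && var=$file; done   # 10# forces decimal
echo $var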
As an aside, if you wanted to preserve the "make a list first" logic, you should still NOT use ls. Use an array.
cd /wherever/your/files/are/
files=( [0-9]* )
for file in "${files[#]}"
do : ...
I have a parent directory with 800+ directories, each of which has a unique name. Some of these directories contain a sub-directory called y, in which a file called z (if it exists) can be found.
I need to script a loop that will check each of the 800+ directories for z, and if it's there, append the name of the directory (the directory before y) to a text file. I'm not sure how to do this.
This is what I have:
#!/bin/bash
for d in *; do
if [ -d "y"]; then
for f in *; do
if [ -f "x"]
echo $d >> IDlist.txt
fi
fi
done
Let's assume that any foo/y/z is a file (that is, you do not have directories with such names). If you had a really large number of such files, storing all the paths in a bash variable could lead to memory issues and would argue for another solution, but about 800 paths is not large. So something like this should be OK:
declare -a names=(*/y/z)
printf '%s\n' "${names[#]%%/*}" > IDlist.txt
Explanation: the paths of all z files are first stored in the array names, thanks to the glob pattern */y/z. Then a pattern substitution is applied to each array element to strip the /y/z part: "${names[@]%%/*}". The result is printed, one name per line, by printf '%s\n'.
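One caveat worth adding (my note): if no */y/z path exists at all, the unmatched glob stays literal and names would contain the pattern itself. Enabling nullglob guards against that:

shopt -s nullglob                 # an unmatched glob expands to nothing
declare -a names=(*/y/z)
(( ${#names[@]} )) && printf '%s\n' "${names[@]%%/*}" > IDlist.txt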
If you also had directories named z, or if you had millions of files, find could be used, instead, with a bit of awk to retain only the leading directory name:
find . -mindepth 3 -maxdepth 3 -path './*/y/z' -type f |
awk -F/ '{print $2}' > IDlist.txt
If you prefer sed for the post-processing:
find . -mindepth 3 -maxdepth 3 -path './*/y/z' -type f |
sed 's|^\./\(.*\)/y/z|\1|' > IDlist.txt
These two are probably also more efficient (faster).
Note: your initial attempt could also work, even if using bash loops is far less efficient, but it needs several changes:
#!/bin/bash
for d in *; do
if [ -d "$d/y" ]; then
for f in "$d"/y/*; do
if [ "$f" = "$d/y/z" ]; then
printf '%s\n' "$d" >> IDlist.txt
fi
done
fi
done
As noted by @LéaGris, printf is better than echo because if d is the -e string, for instance, echo "$d" interprets it as an option of the echo command and does not print it.
But a simpler and more efficient version (even if not as efficient as the first proposal or the find-based ones) would be:
#!/bin/bash
for d in *; do
if [ -f "$d/y/z" ]; then
printf '%s\n' "$d"
fi
done > IDlist.txt
As you can see, there is another improvement (also suggested by @LéaGris), which consists of redirecting the output of the entire loop to the IDlist.txt file. This opens and closes the file only once, instead of once per iteration.
This should solve it:
for f in */y/z; do
[ -f "$f" ] && echo ${f%%/*}
done
Note:
If there is a possibility of weird top level directory name like "-e", use printf instead of echo, as in the comment below.
This should do it:
shopt -s nullglob
outfile=IDlist.txt
>$outfile
for found in */y/x
do
[[ -f $found ]] && echo "${found%%/*}" >>$outfile # Drop the /y/x part
done
The nullglob ensures that the loop is skipped if there is no match, and the quotes in the echo ensure that the directory name is output correctly even if it contains two successive spaces.
You can first try to do some filtering using find.
The command below will list all z files recursively within the current directory:
find . -type f -name z
Then let's say one of the output lines was:
./dir001/y/z
Then you can extract the required part in multiple ways: grep, sed, awk, etc.
e.g. with grep:
find . -type f | grep z | grep -E -o "y.*$"
will give
y/z
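If what you actually need is the top-level directory name, as the question asks, the same idea can be turned around (my sketch, building on the output above):

find . -type f -name z | sed 's|^\./\([^/]*\)/.*|\1|'

which would print dir001 for the example path above.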
The first example doesn't check that z is a file, but I think it's worth showing compgen:
#!/bin/bash
compgen -G '*/y/z' | sed 's|/.*||' > IDlist.txt
Doing glob expansion, file check and path splitting with perl only:
perl -E 'foreach $p (glob "*/y/z") {say substr($p, 0, index($p, "/")) if -f $p}' > IDlist.txt
I'm writing a fairly basic shell script that loops through the files in a directory and renames each file by adding a dot (.) to the start of the filename; however, it does not work.
Any insight on what's going wrong?
for file in /tmp/test/*; do
mv $file \\.$file;
done
There are two problems.
You're putting the dot before the whole pathname, not just the filename part.
You're prefixing the filename with \. instead of just a plain dot; there's no need for the \\ in the mv command.
Corrected code:
for file in /tmp/test/*; do
mv "$file" "${file%/*}/.${file##*/}";
done
${file%/*} returns the value of $file with everything starting from the last / removed, which is the directory part of the pathname. ${file##*/} returns the value of $file with everything up to the last / removed, which is the filename part. Then it puts them back together with /. between them, which adds the . prefix that you want to the filename part. See the Bash parameter expansion documentation for details of these operators.
Also, remember to quote variables so you don't get errors when the variable contains whitespace.
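A cautious way to verify the rename before running it for real (my suggestion): prefix the mv with echo so each command is printed instead of executed:

for file in /tmp/test/*; do
    echo mv "$file" "${file%/*}/.${file##*/}"   # prints the mv command; drop echo to execute
done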
This is a simple script that takes a directory argument:
hide_files.sh:
if [ $# -ne 1 ] || [ ! -d "$1" ]; then
echo 'invalid dir arg.'
exit 1
fi
for f in $(ls "$1"); do   # note: parsing ls breaks on names with whitespace
mv -v "$1/$f" "$1/.$f"
done
output:
$ bash hide_files.sh mydir
mydir/a -> mydir/.a
mydir/c -> mydir/.c
I have a list of files at a location ${POWERCENTER_FILE_DIR}.
The files consist of a header row and values.
MART_Apple.csv
MART_SAMSUNG.csv
MART_SONY.csv
MART_BlackBerry.csv
Requirements:
Select only those files which have at least 1 row.
Add a timestamp to the files which have at least 1 row.
For example:
If all the files except MART_BlackBerry.csv have at least one row, then my output file names should be:
MART_Apple_20170811112807.csv
MART_SAMSUNG_20170811112807.csv
MART_SONY_20170811112807.csv
Code tried so far
#!/bin/ksh
infilename=${POWERCENTER_FILE_DIR}MART*.csv
echo File name is ${infilename}
if [ wc -l "$infilename"="0" ];
then
RV=-1
echo "input file name cannot be blank or *"
exit $RV
fi
current_timestamp=`date +%Y%m%d%H%M%S`
filename=`echo $infilename | cut -d"." -f1 `
sftpfilename=`echo $filename`_${current_timestamp}.csv
cp -p ${POWERCENTER_FILE_DIR}$infilename ${POWERCENTER_FILE_DIR}$sftpfilename
RV=$?
if [[ $RV -ne 0 ]];then
echo Adding timestamp to ${POWERCENTER_FILE_DIR}$infilename failed ... Quitting
echo Return Code $RV
exit $RV
fi
Encountering errors like:
line 3: [: -l: binary operator expected
cp: target `MART_Apple_20170811121023.csv' is not a directory
failed ... Quitting
Return Code 1
To be frank, I am not able to comprehend the errors, nor am I sure I am doing this right. I'm a beginner in Unix scripting. Can any experts guide me toward the correct way?
Here's an example using just find, sh, mv, basename, and date:
find ${POWERCENTER_FILE_DIR}MART*.csv ! -empty -execdir sh -c "mv {} \$(basename -s .csv {})_\$(date +%Y%m%d%H%M%S).csv" \;
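A variant that passes the filename as a positional parameter instead of splicing {} into the sh -c string (my adjustment, untested; embedding {} inside shell code can misbehave with unusual filenames):

find "${POWERCENTER_FILE_DIR}" -maxdepth 1 -name 'MART*.csv' ! -empty \
    -execdir sh -c 'mv "$1" "$(basename -s .csv "$1")_$(date +%Y%m%d%H%M%S).csv"' _ {} \;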
I recommend reading Unix Power Tools for more ideas.
When it comes to shell scripting there is rarely a single/one/correct way to accomplish the desired task.
Often times you may need to trade off between readability vs maintainability vs performance vs adhering-to-some-local-coding-standard vs shell-environment-availability (and I'm sure there are a few more trade offs). So, fwiw ...
From your requirement that you're only interested in files with at least 1 row, I read this to also mean that you're only interested in files with size > 0.
One simple ksh script:
#!/bin/ksh
# define list of files
filelist=${POWERCENTER_FILE_DIR}/MART*.csv
# grab current datetime stamp
dtstamp=`date +%Y%m%d%H%M%S`
# for each file in our list ...
for file in ${filelist}
do
# each file should have ${POWERCENTER_FILE_DIR} as a prefix;
# uncomment 'echo' line for debugging purposes to display
# the contents of the ${file} variable:
#echo "file=${file}"
# '-s <file>' => file exists and size is greater than 0
# '! -s <file>' => file doesn't exist or size is equal to 0, eg, file is empty in our case
#
# if the file is empty, skip/continue to next file in loop
if [ ! -s "${file}" ]
then
continue
fi
# otherwise strip off the '.csv'
filebase=${file%.csv}
# copy our current file to a new file containing the datetime stamp;
# keep in mind that each ${file} already contains the contents of the
# ${POWERCENTER_FILE_DIR} variable as a prefix; uncomment 'echo' line
# for debug purposes to see what the cp command looks like:
#echo "cp command=cp ${file} ${filebase}.${dtstamp}.csv"
cp ${file} ${filebase}.${dtstamp}.csv
done
A few good resources for learning ksh:
O'Reilly: Learning the Korn Shell
O'Reilly: Learning the Korn Shell, 2nd Edition (includes the newer ksh93)
at your UNIX/Linux command line: man ksh
A simplified script would be something like
#!/bin/bash
# Note I'm using bash above, can't guarantee (but I hope) it would work in ksh too.
for file in ${POWERCENTER_FILE_DIR}/*.csv # Check Ref [1]
do
if [ "$( wc -l "$file" | grep -Eo '^[[:digit:]]+' )" -ne 0 ] # checking at least one row? Check Ref [2]
then
mv "$file" "${file%.csv}$(date +'%Y%m%d%H%M%S').csv" # Check Ref [3]
fi
done
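The row check can also be done without the grep post-processing (my variant): reading the file on stdin makes wc print just the number, with no filename attached:

for file in ${POWERCENTER_FILE_DIR}/*.csv
do
    if [ "$(wc -l < "$file")" -gt 0 ]   # wc on stdin outputs only the count
    then
        mv "$file" "${file%.csv}_$(date +'%Y%m%d%H%M%S').csv"
    fi
done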
References
File Globbing [1]
Command Substitution [2]
Parameter Substitution [3]
I have a number of files in a folder whose filenames contain alphanumeric values, e.g. 045_gfds.sql, 46kkkk.sql, 47asdf.sql, etc. I want to compare the numbers in these filenames with another number stored in a variable, let's say $x = 45, and find the files whose filename contains a greater number. I am using Cygwin and am currently only able to retrieve the numbers using the egrep command, e.g.:
filename="C:\scripts"
dir $filename | egrep -o [0-9]+
Output is: 045 46 47
I want the output to be the filenames whose numbers are greater than $x = 45, i.e.:
46kkkk.sql
47asdf.sql
I need help with the regular-expression part: comparing the values in the filenames as numbers.
#!/bin/bash
dir="$1"
print_if_greater="45"
for fname in "$dir"/[0-9]*; do
num="${fname##*/}" # isolate filename from path
num="${num%%[^0-9]*}" # extract leading digits from filename
if (( 10#$num > print_if_greater )); then   # force base 10 so leading zeros aren't read as octal
printf '%s\n' "$fname"
fi
done
The above script will go through all files in the given directory whose names start with at least one digit.
The path is stripped from the filename, and the initial digits of the filename are extracted using bash's variable expansion syntax.
If the number that is extracted is greater than $print_if_greater, then the full pathname is displayed on standard output.
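To see the two expansions in isolation, with a hypothetical path (my illustration):

fname='C:/scripts/46kkkk.sql'
num="${fname##*/}"      # strip everything up to the last /  ->  46kkkk.sql
num="${num%%[^0-9]*}"   # drop from the first non-digit on   ->  46
echo "$num"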
This script is invoked with the directory that you'd like to examine:
$ ./thescript.sh 'C:\scripts'
or
$ bash ./thescript.sh 'C:\scripts'
I haven't got access to Cygwin, so I haven't been able to test it with Windows-style paths. If the above doesn't work, try C:/scripts as the path.
You can try this:
DIR="C:\scripts"
MAX=45
for FILE in "$DIR"/*
do
    if [[ "${FILE##*/}" =~ ^([0-9]+) ]]   # match leading digits of the filename, not of the whole path
    then
        NUMBER="${BASH_REMATCH[1]}"
        if [ "$NUMBER" -gt "$MAX" ]
        then
            echo "$FILE"
        fi
    fi
done
Please note I have not tested this code. It is bash-specific and assumes the numbers always appear at the beginning of the filename.
I have an assignment to write a bash program which, if I type in the following:
-bash-4.1$ ./sample.sh path regex keyword
that will result something like that:
path/sample.txt:12
path/sample.txt:34
path/dir/sample1.txt:56
path/dir/sample2.txt:78
The numbers are the line numbers of the search results. I have absolutely no idea how I can achieve this in bash without using find or grep -r. I am allowed to use grep, sed, awk, …
Break the problem into parts.
First, you need to obtain the file names to search in. How can you list the files in a directory and its subdirectories? (Hint: there's a glob pattern for that.)
You need to iterate over the files. What form of loop should this be?
For each file, you need to read each line from the file in turn. There's a builtin for that.
For each line, you need to test whether the line matches the specified regexp. There's a construct for that.
You need to maintain a counter of the number of lines read in a file to be able to print the line number.
Search for globstar in the bash manual.
See https://unix.stackexchange.com/questions/18886/why-is-while-ifs-read-used-so-often-instead-of-ifs-while-read/18936#18936 regarding while read loops.
shopt -s globstar # to enable **/
GLOBIGNORE=.:.. # to match dot files
dir=$1; regex=$2
for file in "$dir"/**/*; do
[[ -f $file ]] || continue
n=1
while IFS= read -r line; do
if [[ $line =~ $regex ]]; then
echo "$file:$n"
fi
((++n))
done <"$file"
done
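Assuming the script is saved as sample.sh and made executable, invoking it as in the assignment might look like this (my example, reusing the question's sample output; the regex is made up):

$ ./sample.sh path 'foo[0-9]*'
path/sample.txt:12
path/sample.txt:34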
It's possible that your teacher didn't intend you to use the globstar feature, which is a relatively recent addition to bash (it appeared in version 4.0). If so, you'll need to write a recursive function to descend into subdirectories.
traverse_directory () {
for x in "$1"/*; do
if [ -d "$x" ]; then
traverse_directory "$x"
elif [ -f "$x" ]; then
grep "$regexp" "$x"
fi
done
}
Putting this into practice:
#!/bin/sh
regexp="$2"
traverse_directory "$1"
Follow-up exercise: the glob pattern * omits files whose names begin with a . (dot files). You can match dot files as well by also looping over .*, i.e. for x in .* *; do …. However, this throws the function into an infinite loop, as it recurses forever into . (and also ..). How can you change the function to work with dot files as well?
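One possible answer to that exercise, as a sketch of mine rather than the official solution: loop over .* too, but skip the . and .. entries by name before recursing:

traverse_directory () {
    for x in "$1"/.* "$1"/*; do
        case "${x##*/}" in .|..) continue ;; esac   # skip . and .. to avoid infinite recursion
        if [ -d "$x" ]; then
            traverse_directory "$x"
        elif [ -f "$x" ]; then
            grep "$regexp" "$x"
        fi
    done
}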
while IFS= read -r
do
    [[ $REPLY =~ foo ]] && echo "$REPLY"
done < file.txt