I'm taking photos with a camera (a Fuji X100S) that doesn't store the file number tag in the EXIF data, though it does include this information in the file name, e.g. DSCF0488.JPG, DSCF0489.JPG, DSCF0490.JPG.
How do I extract this number and set it as the EXIF file number?
To extract the numbers from the files in your example, using native bash regex, you could do something like this:
for i in *.JPG; do
[[ $i =~ ([[:digit:]]+) ]] && echo ${BASH_REMATCH[1]}
done
Running that loop from the directory containing the files in your question would give the output:
0488
0489
0490
So if you have a tool called exif_script that can add this information to your files, you could do something like:
for photo in *.JPG; do
if [[ $photo =~ ([[:digit:]]+) ]]; then
file_number="${BASH_REMATCH[1]}"
exif_script # set number to $file_number
fi
done
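For example, if you use exiftool, the loop could become something like this (a sketch only; whether ImageNumber is the right tag for your workflow is an assumption, so check your tool's documentation before writing tags):
for photo in *.JPG; do
    if [[ $photo =~ ([[:digit:]]+) ]]; then
        file_number="${BASH_REMATCH[1]}"
        # assumption: write the number into the EXIF ImageNumber tag
        exiftool -overwrite_original -ImageNumber="$file_number" "$photo"
    fi
done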
If you don't have a new enough bash to support regular expression matching, you could use sed:
for i in *.JPG; do file_number=$(echo "$i" | sed 's/[^0-9]\{1,\}\([0-9]\{1,\}\)\.JPG/\1/'); done
The value of $file_number will be the same as in the first piece of code but this approach should work on the vast majority of platforms.
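Combined with the same hypothetical exif_script placeholder as above, the portable version of the whole loop could look like this:
for photo in *.JPG; do
    file_number=$(echo "$photo" | sed 's/[^0-9]\{1,\}\([0-9]\{1,\}\)\.JPG/\1/')
    exif_script # set number to $file_number
done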
Related
I have some problems adapting the answers from previous questions, so I hope it is OK to ask for a specific solution.
I have a file with RNA reads in FASTA format, but the end of each read name has been mangled, so I need to correct it.
It is a simple task of padding zeroes into the middle of a string; however, I cannot get it to work, as I also need to identify the length and position of the problem.
My read file header looks like this:
#V350037327L1C001R0010000023/1_U1
and I need to search for "/1_U" and then left-pad the rest of the line with zeroes to a total length of six.
It will look like this:
#V350037327L1C001R0010000023/1_U000001
The final length should be six following "/1_U".
eg: input:
#V350037327L1C001R0010000055/1_U300 = /1_U000300
#V350037327L1C001R0010000122/1_U45000 = /1_U045000
I have tried with awk; however, I cannot get it to check the initial length, so it does not pad the correct number of zeroes.
Thank you in advance, and thank you for your never-ending support in this forum.
Try this:
#! /bin/bash
files=('#V350037327L1C001R0010000023/1_U1'
'#V350037327L1C001R0010000055/1_U300'
'#V350037327L1C001R0010000122/1_U45000')
for file in "${files[#]}"; do
if [[ $file =~ ^(.*U)([0-9]+)$ ]]; then
printf '%s%06d\n' "${BASH_REMATCH[#]:1}"
fi
done
Update: This reads the headers from stdin instead of the hardcoded list above.
#! /bin/bash
while read -r file; do
    if [[ $file =~ ^(.*U)([0-9]+)$ ]]; then
        printf '%s%06d\n' "${BASH_REMATCH[@]:1}"
    fi
done
Update 2: You should really learn the basics of shell programming before you start programming the shell. Typical basics are conditional constructs.
#! /bin/bash
while read -r file; do
    if [[ $file =~ ^(.*U)([0-9]+)$ ]]; then
        printf '%s%06d\n' "${BASH_REMATCH[@]:1}"
    else
        printf '%s\n' "$file"
    fi
done
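Since you mentioned awk, a rough equivalent there could look like this (a sketch only; it assumes every header that needs fixing ends in /1_U followed by digits, and reads.fasta stands in for your input file):
awk '{
    if (match($0, /\/1_U[0-9]+$/)) {
        n = substr($0, RSTART + 4)                        # the digits after "/1_U"
        $0 = substr($0, 1, RSTART + 3) sprintf("%06d", n) # re-pad them to six digits
    }
    print
}' reads.fasta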
I need help. I should write a script which will move all non-ASCII files from one directory to another. I got this code, but I don't know why it is not working.
#!/bin/bash
for file in "/home/osboxes/Parkhom"/*
do
if [ -eq "$( echo "$(file $file)" | grep -nP '[\x80-\xFF]' )" ];
then
if test -e "$1"; then
mv $file $1
fi
fi
done
exit 0
It's not clear which one you are after, but:
• To test if the variable $file contains a non-ASCII character, you can do:
if [[ $file == *[^[:ascii:]]* ]]; then
• To test if the file $file contains a non-ASCII character, you can do:
if grep -qP '[^[:ascii:]]' "$file"; then
So for example your code would look like:
for file in "/some/path"/*; do
if grep -qP '[^[:ascii:]]' "$file"; then
test -d "$1" && mv "$file" "$1"
fi
done
The first problem is that your first if statement has an invalid test clause. The -eq operator of [ needs one argument before it and one after; in your code the argument before it is missing or empty.
The second problem is that I think the echo is redundant.
The third problem is that the file command always has ASCII output but you're checking for binary output, which you'll never see.
Using file is pretty smart for this application, although there are two ways you can go with it: file reports a variety of things, and what you're interested in are data and ASCII, but not every file that doesn't identify as data is ASCII, and not every file that doesn't identify as ASCII is data. You might be better off going with the original idea of using grep, unless you need to support Unicode files. Your grep looks a bit strange to me, so I don't know what your environment is, but I might try this:
#!/bin/bash
for file in "/home/osboxes/Parkhom"/*
do
    if grep -qP '[\x80-\xFF]' "$file"; then
        [ -e "$1" ] && mv "$file" "$1"
    fi
done
The -q option means be quiet, only return a return code, don't show the matches. (It might be -s in your grep.) The return code is tested directly by the if statement (no need to use [ or test). The && in the next line is just a quick way of saying if the left-hand side is true, then execute the right-hand side. You could also form this as an if statement if you find that clearer. [ is a synonym for test. Personally if $1 is a directory and doesn't change, I'd check it once at the beginning of the script instead of on each file, it would be faster.
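As a sketch of that last point (the usage message and treating the first argument as the destination directory are assumptions carried over from the original script):
#!/bin/bash
dest="$1"
# verify the destination once instead of testing it for every file
[ -d "$dest" ] || { echo "usage: $0 destination-directory" >&2; exit 1; }

for file in "/home/osboxes/Parkhom"/*
do
    if grep -qP '[\x80-\xFF]' "$file"; then
        mv "$file" "$dest"
    fi
done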
If you mean you want to know whether something is not a plain-text file, you can use the file command, which returns information about the type of a file.
[[ ! $( file -b "$file" ) =~ (^| )text($| ) ]]
The -b simply tells it not to bother returning the filename.
The returned value will be something like:
ASCII text
HTML document text
POSIX shell script text executable
PNG image data, 21 x 34, 8-bit/color RGBA, non-interlaced
gzip compressed data, from Unix, last modified: Mon Oct 31 14:29:59 2016
The regular expression will check whether the returned file information includes the word "text" that is included for all plain text file types.
You can instead filter for specific file types like "ASCII text" if that is all you need.
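Folded back into the loop from the question, that test could be used like this (a sketch; it moves anything that file does not label as text into the directory passed as the first argument):
for file in "/home/osboxes/Parkhom"/*
do
    if [[ ! $( file -b "$file" ) =~ (^| )text($| ) ]]; then
        [ -d "$1" ] && mv "$file" "$1"
    fi
done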
I have a number of files in a folder whose filenames contain alphanumeric values, e.g. 045_gfds.sql, 46kkkk.sql, 47asdf.sql, etc. I want to compare the numbers in these filenames with another number stored in a variable, let's say $x=45, and find the files whose filename contains a greater number. I am using Cygwin and am currently only able to retrieve the numbers using an egrep command, e.g.:
filename="C:\scripts"
dir $filename | egrep -o [0-9]+
The output is: 045 46 47
I want the output to be the filenames whose number is greater than $x (45), like this:
46kkkk.sql
47asdf.sql
I need help with regular expressions for finding the filenames whose numbers are greater than the given value.
#!/bin/bash
dir="$1"
print_if_greater="45"
for fname in "$dir"/[0-9]*; do
num="${fname##*/}" # isolate filename from path
num="${num%%[^0-9]*}" # extract leading digits from filename
if (( 10#$num > print_if_greater )); then # force base 10 so a leading zero isn't treated as octal
printf '%s\n' "$fname"
fi
done
The above script will go through all files in the given directory whose names start with at least one digit.
The directory path is stripped from the filename, and the leading digits of the filename are extracted, both using bash's parameter expansion syntax.
If the number that is extracted is greater than $print_if_greater, then the full pathname is displayed on standard output.
This script is invoked with the directory that you'd like to examine:
$ ./thescript.sh 'C:\scripts'
or
$ bash ./thescript.sh 'C:\scripts'
I don't have access to Cygwin, so I haven't been able to test it with Windows-style paths. If the above doesn't work, try C:/scripts as the path.
You can try this:
DIR="C:\scripts"
MAX=45
for FILE in "$DIR"/*
do
    NAME="${FILE##*/}" # strip the directory part so the regex anchors on the filename itself
    if [[ "$NAME" =~ ^([0-9]+) ]]; then
        NUMBER="${BASH_REMATCH[1]}"
        if [ "$NUMBER" -gt "$MAX" ]; then
            echo "$FILE"
        fi
    fi
done
Please note I have not tested this code. It is bash-specific, and assumes the numbers are always at the beginning of the filename.
I have a directory with files frame* (schema) and input_frame* (schema), where frame and input_frame are prefixes for two different types of files. If one takes just the characters after the prefixes and compares the two file lists, then the set of files frame* is always a subset of the set input_frame*.
I'd like to remove the files in input_frame* that don't have an equivalent member in frame*. Is there an easy way to do this in bash?
You can use:
for f in input_frame*; do [[ ! -f "${f#input_}" ]] && echo rm "$f"; done
Once you're satisfied with the output, remove the echo.
Something like this (test it first by using echo instead of rm):
for i in input_frame*; do
    if [ ! -e "${i/input_/}" ]; then
        rm "$i"
    fi
done
Since frame is itself a suffix of input_frame, you can accomplish this with a simple use of the # parameter expansion operator.
for f in input_frame*; do
[[ -f "${f#input_}" ]] || rm "$f"
done
For example, if f is input_frame97, then ${f#input_} expands to frame97. Just check if the modified file name exists, and if not, remove the original.
I have an assignment to write a bash script such that if I type the following:
-bash-4.1$ ./sample.sh path regex keyword
it will output something like this:
path/sample.txt:12
path/sample.txt:34
path/dir/sample1.txt:56
path/dir/sample2.txt:78
The numbers are the line numbers of the search results. I have absolutely no idea how I can achieve this in bash without using find or grep -r. I am allowed to use grep, sed, awk, …
Break the problem into parts.
First, you need to obtain the file names to search in. How can you list the files in a directory and its subdirectories? (Hint: there's a glob pattern for that.)
You need to iterate over the files. What form of loop should this be?
For each file, you need to read each line from the file in turn. There's a builtin for that.
For each line, you need to test whether the line matches the specified regexp. There's a construct for that.
You need to maintain a counter of the number of lines read in a file to be able to print the line number.
Search for globstar in the bash manual.
See https://unix.stackexchange.com/questions/18886/why-is-while-ifs-read-used-so-often-instead-of-ifs-while-read/18936#18936 regarding while read loops.
shopt -s globstar # to enable **/
GLOBIGNORE=.:.. # to match dot files
dir=$1; regex=$2
for file in "$dir"/**/*; do
[[ -f $file ]] || continue
n=1
while IFS= read -r line; do
if [[ $line =~ $regex ]]; then
echo "$file:$n"
fi
((++n))
done <"$file"
done
It's possible that your teacher didn't intend you to use the globstar feature, which is a relatively recent addition to bash (appeared in version 4.0). If so, you'll need to write a recursive function to recurse into subdirectories.
traverse_directory () {
for x in "$1"/*; do
if [ -d "$x" ]; then
traverse_directory "$x"
elif [ -f "$x" ]; then
grep "$regexp" "$x"
fi
done
}
Putting this into practice:
#!/bin/sh
regexp="$2"
traverse_directory "$1"
Follow-up exercise: the glob pattern * omits files whose names begin with a . (dot files). You can match dot files as well by also looping over .*, i.e. for x in .* *; do …. However, this throws the function into an infinite loop, as it recurses forever into . (and also ..). How can you change the function to work with dot files as well?
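One possible direction, sketched here rather than given as the definitive answer, is to skip . and .. by name before recursing:
traverse_directory () {
    for x in "$1"/.* "$1"/*; do
        case "${x##*/}" in .|..) continue ;; esac   # never descend into . or ..
        if [ -d "$x" ]; then
            traverse_directory "$x"
        elif [ -f "$x" ]; then
            grep "$regexp" "$x"
        fi
    done
}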
while read -r
do
    [[ $REPLY =~ foo ]] && echo "$REPLY"
done < file.txt