How to get files that do not match a pattern - bash

Here is my script
data_dir="/home/data"
shopt extglob
files=!($data_dir/*08142014*)
echo ${files[@]}
for file in $files[@]
do
#blabla
done
/home/data contains multiple files with different dates in their names, so I should be able to get a list of the files whose names do not contain "08142014".
But I keep getting a syntax error. It seems files is just the literal string "!(/home/data/08202014)", whereas I want a list of file names.
Did I miss anything? Thanks.

You can use:
data_dir="/home/data"
shopt -s extglob
files=($data_dir/!(*08142014*))
for file in "${files[@]}"
do
echo "$file"
done
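To enable extglob you need shopt -s extglob; plain shopt extglob only reports the current setting.
Your array assignment syntax isn't right: the whole glob goes inside files=( ... ), and !( ... ) wraps only the pattern to exclude.
Iterate over the array with "${files[@]}", quoted, so that names containing whitespace stay intact.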
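To see the negated glob in action without touching real data, here is a minimal sketch. It assumes bash with extglob available; the temporary directory and the log_*.txt file names are made up for the demonstration and stand in for /home/data.

```shell
#!/usr/bin/env bash
# extglob must be enabled before the line using !( ... ) is parsed.
shopt -s extglob nullglob

# Hypothetical stand-in for /home/data with three made-up file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/log_08142014.txt" "$demo_dir/log_08152014.txt" "$demo_dir/log_08162014.txt"

# Every entry in demo_dir whose name does NOT contain 08142014.
files=( "$demo_dir"/!(*08142014*) )

printf 'kept %d file(s)\n' "${#files[@]}"
for file in "${files[@]}"; do
    printf '%s\n' "${file##*/}"    # print the name without the directory part
done

rm -rf "$demo_dir"
```

Keeping the result in an array means each file name is a single element, however many spaces it contains.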

You can use ->
files=`ls $data_dir | grep -v 08142014`
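If you would rather not depend on extglob or on parsing the output of ls, the same filtering can be sketched with a plain glob and a [[ ]] pattern test. The temporary directory and file names below are invented for the demonstration.

```shell
#!/usr/bin/env bash
shopt -s nullglob    # an empty directory yields zero loop iterations

# Hypothetical stand-in for /home/data; the names deliberately contain spaces.
data_dir=$(mktemp -d)
touch "$data_dir/a 08142014.csv" "$data_dir/b 08152014.csv"

kept=()
for path in "$data_dir"/*; do
    # Skip any file whose base name contains the unwanted date.
    [[ ${path##*/} == *08142014* ]] && continue
    kept+=( "$path" )
done

printf '%s\n' "${kept[@]##*/}"

rm -rf "$data_dir"
```

Unlike the ls | grep pipeline, this keeps each file name as one array element, so whitespace in names is harmless.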

Related

Assign files in a list when using the bash find command

I'm trying to see if I can assign the output of the find command to a variable. In this case it would be a list, and I would iterate over it one file at a time to evaluate each file.
I've tried this:
#!/bin/bash
PATH=/Users/mike/test
LIST='find $PATH -name *.txt'
for newfiles in @LIST; do
#checksize
echo $newfiles
done
My output is:
@LIST
I'm trying to do in bash the same thing as the glob command in Perl.
var = glob "PATH/*.txt";
Use $(command) to execute command and substitute its output in place of that construct.
list=$(find "$PATH" -name '*.txt')
And to access a variable, put $ before the name, not @ (your Perl experience is showing).
for newfiles in $list; do
echo "$newfiles"
done
However, it's dangerous to parse the output of find like this, because you'll get incorrect results if any of the filenames contain whitespace -- it will be treated as multiple names. It's better to pipe the output:
find "$PATH" -name '*.txt' | while IFS= read -r newfiles; do
echo "$newfiles"
done
Also, notice that you should quote any variables that you don't want to be split into multiple words if they contain whitespace.
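And avoid using all-uppercase variable names; they are conventionally reserved for environment variables. PATH in particular tells the shell where to look for commands, so overwriting it can stop find itself from being found.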
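A line-based read still breaks on file names that contain newlines. One common way around that, sketched here under the assumption that your find supports -print0 (GNU and BSD find do), is to read NUL-delimited names into an array. The temporary directory and file names are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical directory tree with whitespace in both a directory and a file name.
search_dir=$(mktemp -d)
mkdir -p "$search_dir/sub dir"
touch "$search_dir/a.txt" "$search_dir/sub dir/b c.txt" "$search_dir/skip.log"

found=()
# -print0 ends each name with a NUL byte; read -d '' splits on NUL,
# and IFS= keeps leading and trailing whitespace in each name.
while IFS= read -r -d '' path; do
    found+=( "$path" )
done < <(find "$search_dir" -type f -name '*.txt' -print0)

printf 'found %d file(s)\n' "${#found[@]}"

rm -rf "$search_dir"
```

Feeding the loop through process substitution rather than a pipe keeps it in the current shell, so the array is still populated after the loop ends.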
LIST=$(find "$PATH" -name '*.txt')
for newfiles in $LIST; do
Beware that you will have issues if any of the files have whitespace in the names.
Assuming you are using bash 4 or later, don't use find at all here.
shopt -s globstar nullglob
list=( "$path"/**/*.txt )
for newfile in "${list[@]}"; do
echo "$newfile"
done

How do I rename multiple files before the extension in linux?

I want to take a group of files with names like 123456_1_2.mpg and rename them to names like 123456.mpg. How can I do this using terminal commands?
To loop over all the available files you can use a for loop over the file names of the form ??????_?_?.mpg.
To rename the files, you can strip the longest match of a pattern from the end of the string using ${MYVAR%%pattern}, without using any external command.
This said, your code should look like:
#!/bin/bash
shopt -s nullglob # do nothing if no matches found
for file in ??????_?_?.mpg; do
[[ -f $file ]] || continue # skip if not a regular file
new_file="${file%%_*}.mpg" # compose the new file name
echo mv "$file" "$new_file" # remove echo after testing
done
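For reference, here is a small sketch of how the four prefix and suffix removal operators behave on one of the example names. The variable names are only illustrative.

```shell
#!/usr/bin/env bash
file="123456_1_2.mpg"

longest_suffix_removed="${file%%_*}"    # 123456    (removes _1_2.mpg)
shortest_suffix_removed="${file%_*}"    # 123456_1  (removes _2.mpg)
shortest_prefix_removed="${file#*_}"    # 1_2.mpg   (removes 123456_)
longest_prefix_removed="${file##*_}"    # 2.mpg     (removes 123456_1_)

printf '%s\n' "$longest_suffix_removed" "$shortest_suffix_removed" \
              "$shortest_prefix_removed" "$longest_prefix_removed"
```

The doubled operator (%% or ##) is the greedy one, which is why ${file%%_*} cuts at the first underscore rather than the last.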
rename 's/_.*/.mpg/' *mpg
This will replace everything from the first underscore to the end of the name with .mpg, for all files ending in mpg. Note that it assumes the Perl-based rename; the util-linux rename takes different arguments.
We can use grep to strip out everything but the first sequence of numbers. The --interactive flag will ask you if you're sure for each move, so you can make sure it's not doing anything you don't expect.
for file in *.mpg; do
mv --interactive "$file" "$(grep -o '^[0-9]\+' <<< "$file")".mpg
done
The regex ^[0-9]\+ translates to "one or more digits at the very start of the string"; with -o, grep prints only that matching part.
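The same leading-digits match can also be done inside bash, without spawning grep, by using the =~ operator and BASH_REMATCH. This is a sketch that only computes and prints the new name; the mv itself is left out on purpose.

```shell
#!/usr/bin/env bash
file="123456_1_2.mpg"

# ^[0-9]+ is the extended-regex form of grep's ^[0-9]\+ : one or more leading digits.
if [[ $file =~ ^[0-9]+ ]]; then
    new_name="${BASH_REMATCH[0]}.mpg"    # element 0 holds the whole match
    printf '%s -> %s\n' "$file" "$new_name"
fi
```

Files whose names do not start with a digit simply fail the test and are left alone.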

Glob string stored in a variable is not expanded when used in a command

I'm trying to work on a set of files with various extensions, but I'm not that experienced with the inner workings of bash. This is what I'm trying to accomplish (stripped down):
DOCUMENT_SOURCE_FILE_PATTERN="*.{yaml,md}";
pandoc -s -f markdown -o combined.html $DOCUMENT_SOURCE_FILE_PATTERN;
results in
pandoc: *.{yaml,md}: openFile: does not exist (No such file or directory)
Whereas when I do it directly
pandoc -s -f markdown -o combined.html *.{yaml,md};
it works perfectly.
The value of $DOCUMENT_SOURCE_FILE_PATTERN is really generated by command line arguments and not hard coded, otherwise the direct approach in the example above would be good enough already.
As requested, here's a fully self-contained example.
Put the code below into a test.sh script within an empty directory:
#!/bin/bash
# setup
touch 0001.md
touch 0002.md
touch metadata.yaml
# actual functionality under test
DOCUMENT_SOURCE_FILE_PATTERN="yaml,md";
shopt -s nullglob;
DOCUMENT_SOURCE_FILES=( *.{$DOCUMENT_SOURCE_FILE_PATTERN} );
echo "required logic below:";
echo "${DOCUMENT_SOURCE_FILES[@]}";
echo;
echo "working solution with hardcoding:";
DOCUMENT_SOURCE_FILES=( *.{yaml,md} );
echo "${DOCUMENT_SOURCE_FILES[@]}";
# tear down
rm *.{yaml,md};
Don't try to store a glob string in a variable. Use an array and a quoted array expansion. The nullglob option makes a pattern with no matches expand to nothing, so the literal glob string never ends up in the array.
shopt -s nullglob
document_source_file_pattern=( *.{yaml,md} )
and pass the array as
pandoc -s -f markdown -o combined.html "${document_source_file_pattern[@]}"
As an extra safeguard, you could do the following, which runs your pandoc command only if the array is non-empty:
(( "${#document_source_file_pattern[@]}" )) &&
pandoc -s -f markdown -o combined.html "${document_source_file_pattern[@]}"
On this line, you are trying to double-expand (first $DOCUMENT_SOURCE_FILE_PATTERN, then the resulting pattern):
DOCUMENT_SOURCE_FILES=( *.{$DOCUMENT_SOURCE_FILE_PATTERN} );
You can't do that directly.
If you trust that $DOCUMENT_SOURCE_FILE_PATTERN isn't going to contain malicious input, then you can achieve what you want using eval:
eval "DOCUMENT_SOURCE_FILES=( *.{$DOCUMENT_SOURCE_FILE_PATTERN} )"
But instead, you could (should) probably try to do this in a different way, like add the files you want to an array, instead of trying to dynamically create a brace expansion in your code:
# prevent literal globs being added to the array when no files match
shopt -s nullglob
source_files=()
if <whatever condition to add markdown files>; then
source_files+=( *.md )
fi
if <whatever condition to add yaml files>; then
source_files+=( *.yaml )
fi
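When the extensions arrive as a comma-separated string such as yaml,md, one way to build the array without eval is to split the string and glob once per extension. This is a sketch only: the extension_list variable, the temporary directory and the file names are assumptions made for the demonstration.

```shell
#!/usr/bin/env bash
shopt -s nullglob    # extensions with no matching files add nothing

# Hypothetical working directory mirroring the files from the question.
work_dir=$(mktemp -d)
touch "$work_dir/0001.md" "$work_dir/0002.md" "$work_dir/metadata.yaml" "$work_dir/notes.txt"

extension_list="yaml,md"    # assumed to come from a command line argument

# Split on commas into an array; no second round of shell expansion is involved.
IFS=',' read -r -a extensions <<< "$extension_list"

source_files=()
for ext in "${extensions[@]}"; do
    source_files+=( "$work_dir"/*."$ext" )    # the * globs, the quoted extension stays literal
done

printf '%s\n' "${source_files[@]##*/}"

rm -rf "$work_dir"
```

Because the extension is quoted inside the pattern, unexpected characters in the input are matched literally instead of being executed, which is the main risk with the eval approach.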

How can I iterate over the contents of a directory in unix without using a wildcard?

I totally understand what the problem is here.
I have a set of .jpg files whose names are prefixed with 'cat' or 'dog'. I just want to move the 'cat' files into a directory called 'cat', and likewise the 'dog' files into 'dog'.
for f in *.jpg; do
name=`echo "$f"|sed 's/ -.*//'`
firstThreeLetters=`echo "$name"|cut -c 1-3`
dir="path/$firstThreeLetters"
mv "$f" "$dir"
done
I get this message:
mv: cannot stat '*.jpg': No such file or directory
That's fine. But I can't find any way to iterate over these images without using that wildcard.
I don't want to use the wildcard. The only files are prepended with the 'dog' or 'cat'. I don't need to match. All the files are .jpgs.
Can't I just iterate over the contents of the directory without using a wildcard? I know this is a bit of an XY Problem but still I would like to learn this.
*.jpg would yield the literal *.jpg when there are no matching files.
Looks like you need nullglob. With Bash, you can do this:
#!/bin/bash
shopt -s nullglob # makes glob expand to nothing in case there are no matching files
for f in cat*.jpg dog*.jpg; do # pick only cat & dog files
first3=${f:0:3} # grab first 3 characters of filename
[[ -d "$first3" ]] || continue # skip if there is no such dir
mv "$f" "$first3/$f" # move
done
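The effect of nullglob is easy to check in an empty directory. The sketch below, which uses a throwaway temporary directory, expands the same pattern with the option off and then on.

```shell
#!/usr/bin/env bash
# An empty, temporary directory: no .jpg file can match in here.
empty_dir=$(mktemp -d)

shopt -u nullglob
without_nullglob=( "$empty_dir"/*.jpg )    # the unmatched pattern is kept as a literal string

shopt -s nullglob
with_nullglob=( "$empty_dir"/*.jpg )       # the unmatched pattern expands to nothing

printf 'nullglob off: %d word(s)\n' "${#without_nullglob[@]}"
printf 'nullglob on:  %d word(s)\n' "${#with_nullglob[@]}"

rm -rf "$empty_dir"
```

With the option off, a for loop over the pattern runs once with the literal *.jpg, which is what produces the "cannot stat" message from mv.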

Listing files with spaces in name

Problem
In some directory, I have some files with spaces (or maybe some special character) in their filenames.
Trying to list one file per line, I used ls -1 but files with spaces in name are not processed as I expected.
Example
I have these three files:
$ ls -1
My file 1.zip
My file 2.zip
My file 3.zip
and I want to list and do something with them, so I use a loop like this:
for i in `ls -1 My*.zip`; do
# statements
echo $i;
# Do something with each file;
done
But the names get split at the spaces:
My
file
1.zip
My
file
2.zip
My
file
3.zip
Question
How can I solve this? Is there an alternative in shell?
Don't use output from ls, use:
for f in *.zip; do
echo "processing $f"
done
By not using ls, and by quoting properly.
for i in My*.zip
do
echo "$i"
done
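To make the difference concrete, the following sketch counts what each approach produces for the three example files. It recreates them in a temporary directory, which is an assumption made for the demonstration.

```shell
#!/usr/bin/env bash
shopt -s nullglob

# Recreate the three example files in a throwaway directory.
zip_dir=$(mktemp -d)
touch "$zip_dir/My file 1.zip" "$zip_dir/My file 2.zip" "$zip_dir/My file 3.zip"

# Glob expansion: one array element per file, spaces included.
zips=( "$zip_dir"/My*.zip )

# Unquoted command substitution: the output of ls is split on every space and newline.
unquoted_words=0
for word in $(ls -1 "$zip_dir"); do
    unquoted_words=$((unquoted_words + 1))
done

printf 'glob: %d item(s), ls: %d word(s)\n' "${#zips[@]}" "$unquoted_words"

rm -rf "$zip_dir"
```

The glob yields one item per file, while the ls loop sees each space-separated fragment as a separate item.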
shopt -s dotglob
for i in *;
do :; # work on "$i"
done
shopt -u dotglob
By default, files that begin with a dot are special in that * will not match them.
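A quick way to see what dotglob changes is to expand * in a directory that holds one visible and one hidden file. The directory and file names in this sketch are made up.

```shell
#!/usr/bin/env bash
shopt -s nullglob

# Throwaway directory with one visible and one hidden file.
dot_dir=$(mktemp -d)
touch "$dot_dir/visible.txt" "$dot_dir/.hidden"

shopt -u dotglob
default_matches=( "$dot_dir"/* )    # * skips names that begin with a dot

shopt -s dotglob
dotglob_matches=( "$dot_dir"/* )    # * now also matches .hidden (but never . or ..)
shopt -u dotglob                    # restore the default afterwards

printf 'default: %d, dotglob: %d\n' "${#default_matches[@]}" "${#dotglob_matches[@]}"

rm -rf "$dot_dir"
```

Turning the option back off after the loop, as the answer does, avoids surprising later globs in the same script.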
