Assign file names to a variable in shell - bash

I'm trying to write a script that does something a bit more sophisticated than what I'm going to show you, but I know that the problem is in this part.
I want each name of a list of files in a directory to be assigned to a variable (the same variable, one at a time) through a for loop, then do something with it inside the loop. Here is what I mean:
for thing in $(ls $1);
do
file $thing;
done
Edit: let's say this script is called scrypt, and I have a folder named Folder with 3 files inside named A, B, C. When I run
./scrypt Folder
I want it to print the following on the terminal:
A: file
B: file
C: file
With the code I've shown above, I get this:
A: ERROR: cannot open `A' (No such file or directory)
B: ERROR: cannot open `B' (No such file or directory)
C: ERROR: cannot open `C' (No such file or directory)
That is the problem.

One way is to use wildcard expansion instead of ls, e.g.,
for filename in "$1"/*; do
command "$filename"
done
This assumes that $1 is the path to a directory with files in it.
If you want to only operate on plain files, add a check right after do along the lines of:
[ ! -f "$filename" ] && continue

http://mywiki.wooledge.org/ParsingLs
Use globbing instead:
for filename in "$1"/* ; do
<cmd> "$filename"
done
Note the quotes around $filename; they keep filenames containing whitespace from being split into multiple arguments.

It's a bit unclear what you're trying to accomplish, but you can do essentially the same thing with functionality that already exists in find. For example, the following prints the contents of each file found in a folder:
find FolderName -maxdepth 1 -type f -exec cat {} \;
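If what you want is the file-type output from the question rather than file contents, the same pattern works with file in place of cat (a sketch assuming the directory is passed as $1, as in the question's script):
find "$1" -maxdepth 1 -type f -exec file {} \;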

Well, I think what you meant is that the loop should show the filenames in the desired directory, so I would do it like this:
for filename in "$1"/*; do
    echo "file: $filename"
done
That way, if the directory contains 3 files named A, B, and C, the result should be:
file: Folder/A
file: Folder/B
file: Folder/C
(note that the glob keeps the directory prefix from $1).
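If you want the bare names A, B, C exactly as in the question's expected output, you can strip the directory prefix with parameter expansion (${filename##*/} removes everything up to the last slash):
for filename in "$1"/*; do
    echo "file: ${filename##*/}"
done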

Related

Bash Script to copy from external drive to Box folder

Trying to write a bash script to copy a large number of files from an external drive into separate directories based on a subject id.
I've included the script I've written below.
I get the following error:
cat: /Volumes/Seagate: No such file or directory
cat: Backup: No such file or directory
cat: Plus: No such file or directory
cat: Drive/Subject_List.txt: No such file or directory
When I try to copy a single file at a time using the terminal, it copies using the exact command I've put in this script. I'm not sure why it's not recognizing the directory when I try and use it in the script below. Any help is greatly appreciated!
#!/bin/bash
#A bash script to copy the structural and functional files into the HCP_Entropy folder
#subject list
SUBJECT_LIST="/Volumes/Seagate/Backup/Plus/Drive/Subject_List.txt
for j in $(cat ${SUBJECT_LIST}); do
echo ${j}
cp /Volumes/Seagate\ Backup\ Plus\ Drive/HCP_DATA/Structural/{j}/unprocessed/3T/T1w_MPR1/${j}_3T_T1w_MPR1.nii.gz /Users/myname/Box/HCP_Entropy/BEN/${j}/anat
done
The line
$SUBJECT_LIST=/Volumes/Seagate\ Backup\ Plus\ Drive/Subject_List.txt
is bogus.
To assign a value to a variable, you must not add the $ specifier.
A token starting with $ will be expanded, so $SUBJECT_LIST=... will first be expanded to =... (since you haven't assigned anything to the SUBJECT_LIST variable yet, it is empty).
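A quick way to see this (a hypothetical session; the exact error wording varies by shell):
$ unset SUBJECT_LIST
$ $SUBJECT_LIST=/tmp/Subject_List.txt
bash: =/tmp/Subject_List.txt: No such file or directory
The shell expands $SUBJECT_LIST to nothing, then tries to run the leftover =... as a command.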
The proper way would be:
SUBJECT_LIST="/Volumes/Seagate Backup Plus Drive/Subject_List.txt"
(this uses quotes instead of escaping each space, which I find much more readable)
You also need to quote variables in case they contain spaces; otherwise they may be interpreted by the command (cp) as multiple arguments.
for j in $(cat "${SUBJECT_LIST}"); do
# ...
done
And of course, you should check whether the source file actually exists, and make sure the destination directory does too:
indir="/Volumes/Seagate Backup Plus Drive"
SUBJECT_LIST="${indir}/Subject_List.txt"
cat "${SUBJECT_LIST}" | while read j; do
    infile="${indir}/HCP_DATA/Structural/${j}/unprocessed/3T/T1w_MPR1/${j}_3T_T1w_MPR1.nii.gz"
    outdir="/Users/myname/Box/HCP_Entropy/BEN/${j}/anat"
    mkdir -p "${outdir}"
    if [ -e "${infile}" ]; then
        cp -v "${infile}" "${outdir}"
    else
        echo "missing file ${infile}" 1>&2
    fi
done
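A side note (my addition, not part of the original answer): piping cat into while runs the loop in a subshell, and a bare read trims whitespace and interprets backslashes. Redirecting the file into the loop with IFS= read -r is a touch more robust:
while IFS= read -r j; do
    : # same body as above
done < "${SUBJECT_LIST}"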

Wildcard on mv folder destination

I'm writing a small piece of code that checks a specific folder for .mov files over 4GB and writes their names (without the extension) to a log.txt file. I'm then reading the names into a while loop line by line, which triggers some archiving and copying commands.
Consider a file named abcdefg.mov (new) and a corresponding folder somewhere else named abcdefg_20180525 (underscore plus timestamp) that also contains a file named abcdefg.mov (old).
When reading in the filename from the log.txt, I strip the extension to store the variable "abcdefg" ($in1), and I'm using that variable to locate a folder elsewhere whose name begins with that matching string.
My problem is with how the mv command seems to support a wildcard in the "source" string, but not in the "destination" string.
For example, I can write:
mv -f /Volumes/Myshare/SourceVideo/$in1*/$in1.mov /Volumes/Myshare/Archive
However, a wildcard on the destination doesn't work in the same way. For example:
mv -f /Volumes/Myshare/Processed/$in1.mov Volumes/Myshare/SourceVideo/$in1*/$in1.mov
Is there an easy fix here that doesn't involve using another method?
Cheers for any help.
mv accepts a single destination path. Suppose that $in1 is abcdefg, and that $in1* expands to abcdefg_20180525 and abcdefg_20180526. Then the command
mv -f /dir1/$in1.mov /dir2/$in1*/$in1.mov
expands to
mv -f /dir1/abcdefg.mov /dir2/abcdefg_20180525/abcdefg.mov /dir2/abcdefg_20180526/abcdefg.mov
which mv treats as two sources and one destination, roughly equivalent to:
mv -f /dir1/abcdefg.mov /dir2/abcdefg_20180526/abcdefg.mov
mv -f /dir2/abcdefg_20180525/abcdefg.mov /dir2/abcdefg_20180526/abcdefg.mov
Moreover, because the destination file is the same in both cases, the first file will be overwritten by the second.
You should create a precise list and do a precise copy instead of using wildcards.
This is what I would probably do: generate a list of results in a file with FULL path information, then read those results in another function. I could have used arrays, but I wanted to keep it simple. At the bottom of this script is a function call to scan for files with extension mp4 (case insensitive) and write the results to a file in /tmp; the script then reads the results from that file in another function and performs some operation (mv, etc.). Note: if functions are confusing, you can just remove the function names, the { }, and the calls, and it becomes a normal script again. Functions are really handy; learn to love them!
#!/usr/bin/env bash
readonly SIZE_CHECK_LIMIT_MB="10M"
readonly FOLDER="/tmp"
readonly DESTINATION_FOLDER="/tmp/archive"
readonly SAVE_LIST_FILE="/tmp/$(basename "$0")-save-list.txt"
readonly EXT="mp4"
readonly CASE="-iname" # change to -name for exact ext type upper/lower
function find_files_too_large() {
    # truncate the results file, then collect matching files one per line
    > "${SAVE_LIST_FILE}"
    find "${FOLDER}" -maxdepth 1 -type f "${CASE}" "*.${EXT}" -size +${SIZE_CHECK_LIMIT_MB} -print0 | while IFS= read -r -d $'\0' line; do
        echo "FOUND => $line"
        echo "$line" >> "${SAVE_LIST_FILE}"
    done
}
function archive_large_files() {
    local read_file="${SAVE_LIST_FILE}"
    local write_folder="$DESTINATION_FOLDER"
    if [ ! -s "${read_file}" ] || [ ! -f "${read_file}" ]; then
        echo "No work to be done ... "
        return
    fi
    while IFS= read -r line; do
        echo "mv $line $write_folder"; sleep 1
    done < "${read_file}"
}
# MAIN (this is where the script starts). We just call two functions.
find_files_too_large
archive_large_files
It might be easier, I think, to change the filenames to match the folder name initially, so abcdefg.mov would be abcdefg_timestamp.mov. I can always strip the timestamp from the filename easily enough after it's copied to the right location. I was hoping I had a small syntax issue, but I think there is no easy way of doing what I thought I could...
I think you have a basic misunderstanding of how wildcards work here. The mv command doesn't support wildcards at all; the shell expands all wildcards into lists of matching files before they get passed to mv as arguments. Furthermore, mv doesn't know whether the list of arguments it got came from wildcards or not, and the shell doesn't know anything about what the command is going to do with them. For instance, if you run the command grep *, the grep command just gets a list of names of files in the current directory as arguments, and will treat the first of them as a regex pattern ('cause that's what the first argument to grep is) to search the rest of the files for. If you ran mv * (note: don't do this!), it would interpret all but the last filename as sources, and the last one as a destination.
I think there's another source of confusion as well: when the shell expands a string containing a wildcard, it tries to match the entire thing to existing files and/or directories. So when you use Volumes/Myshare/SourceVideo/$in1*/$in1.mov, it looks for an already-existing file in a matching directory; since the file isn't there yet, there's no match. What the shell does in that case is pass the raw (unexpanded) wildcard-containing string to mv as an argument; mv looks for that exact name, doesn't find it, and gives you an error.
(BTW, should there be a "/" at the front of that pattern? I assume so below.)
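(An aside that isn't in the original answer: prefixing the command with echo is a quick way to see exactly which arguments the shell hands to mv after expansion, since the expanded argument list is printed instead of executed.)
echo mv -f /Volumes/Myshare/Processed/$in1.mov /Volumes/Myshare/SourceVideo/$in1*/$in1.mov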
If I understand the situation correctly, you might be able to use this:
mv -f /Volumes/Myshare/Processed/$in1.mov /Volumes/Myshare/SourceVideo/$in1*/
Since the filename isn't supplied in the second string, it doesn't look for existing files by that name, just directories with the right prefix; mv will automatically retain the filename from the source.
However, I'll echo @Sergio's warning about chaos from multiple matches. In this case, it won't overwrite files (well, it might, but for other reasons), but if it gets multiple matching target directories it'll move all but the last one into the last one (along with the file you meant to move). You say you're 100% certain this won't be a problem, but in my experience that means there's at least a 50% chance that something you'd never have thought of will go ahead and make it happen anyway. For instance, is it possible that $in1 could wind up empty, or contain a space, or...?
Speaking of spaces, I'd also recommend double-quoting all variable references. You want the variables inside double-quotes, but the wildcards outside them (or they won't be expanded), like this:
mv -f "/Volumes/Myshare/Processed/$in1.mov" "/Volumes/Myshare/SourceVideo/$in1"*/

Terminal - run 'file' (file type) for the whole directory

I'm a beginner in the terminal and bash language, so please be gentle and answer thoroughly. :)
I'm using Cygwin terminal.
I'm using the file command, which returns the file type, like:
$ file myfile1
myfile1: HTML document, ASCII text
Now, I have a directory called test, and I want to check the type of all files in it.
My endeavors:
I checked in the man page for file (man file), and I could see in the examples that you could type the names of all files after the command and it gives the types of all, like:
$ file myfile{1,2,3}
myfile1: HTML document, ASCII text
myfile2: gzip compressed data
myfile3: HTML document, ASCII text
But my files' names are random, so there's no specific pattern to follow.
I tried using the for loop, which I think is going to be the answer, but this didn't work:
$ for f in ls; do file $f; done
ls: cannot open `ls' (No such file or directory)
$ for f in ./; do file $f; done
./: directory
Any ideas?
Every Unix or Linux shell supports some kind of globbing. In your case, all you need is the * glob. This magic symbol matches all folders and files in the given path, e.g.:
file directory/*
Shell will substitute the glob with all matching files and directories in the given path. The resulting command that will actually get executed might be something like:
file directory/foo directory/bar directory/baz
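Applied to the question's directory (run from the parent of test), that is simply:
file test/*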
You can use a combination of the find and xargs command.
For example:
find /your/directory/ | xargs file
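One caveat worth hedging on: a bare find | xargs pipeline splits on whitespace, so it mishandles filenames containing spaces. With GNU find and xargs you can pass the names null-delimited instead:
find /your/directory/ -type f -print0 | xargs -0 file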
HTH
file directory/*
Is probably the shortest simplest solution to fix your issue, but this is more of an answer as to why your loops weren't working.
for f in ls; do file $f; done
ls: cannot open `ls' (No such file or directory)
This loop says "for f in the literal word 'ls'; do...", so file is run on a file named ls, which doesn't exist in the current directory. If you wanted it to execute the ls command, then you would need to do something like this:
for f in `ls`; do file "$f"; done
But that wouldn't work correctly if any of the filenames contain whitespace. It is safer and more efficient to use the shell's builtin "globbing", like this:
for f in *; do file "$f"; done
For this one there's an easy fix.
for f in ./; do file $f; done
./: directory
Currently, you're asking it to run the file command on the directory "./" itself. Change it to ./*, meaning everything within the current directory (which is the same thing as just *):
for f in ./*; do file "$f"; done
Remember, double quote variables to prevent globbing and word splitting.
https://github.com/koalaman/shellcheck/wiki/SC2086

Recursively search a directory for each file in the directory on IBMi IFS

I'm trying to write two (edit: shell) scripts and am having some difficulty. I'll explain the purpose and then provide the script and current output.
1: get a list of every file name in a directory recursively. Then search the contents of all files in that directory for each file name. Should return the path, filename, and line number of each occurrence of the particular file name.
2: get a list of every file name in a directory recursively. Then search the contents of all files in the directory for each file name. Should return the path and filename of each file which is NOT found in any of the files in the directories.
I ultimately want to use script 2 to find and delete (actually move them to another directory for archiving) unused files in a website. Then I would want to use script 1 to see each occurrence and filter through any duplicate filenames.
I know I can make script 2 move each file as it is running rather than as a second step, but I want to confirm the script functions correctly before I do any of that. I would modify it after I confirm it is functioning correctly.
I'm currently testing this on an IBM i system in strqsh.
My test folder structure is:
scriptTest
---subDir1
------file4.txt
------file5.txt
------file6.txt
---subDir2
------file1.txt
------file7.txt
------file8.txt
------file9.txt
---file1.txt
---file2.txt
---file3.txt
I have text in some of those files which contains existing file names.
This is my current script 1:
#!/bin/bash
files=`find /www/Test/htdocs/DLTest/scriptTest/ ! -type d -exec basename {} \;`
for i in $files
do
grep -rin $i "/www/Test/htdocs/DLTest/scriptTest" >> testReport.txt;
done
Right now it functions correctly except that it doesn't provide the path to the file that had a match. Doesn't grep return the file path by default?
I'm a little further away with script 2:
#!/bin/bash
files=`find /www/Test/htdocs/DLTest/scriptTest/ ! -type d`
for i in $files
do
#split $i on '/' and store into an array
IFS='/' read -a array <<< "$i"
#get last element of the array
echo "${array[-1]}"
#perform a grep similar to script 2 and store it into a variable
filename="grep -rin $i "/www/Test/htdocs/DLTest/scriptTest" >> testReport.txt;"
#Check if the variable has anything in it
if [ $filename = "" ]
#if not then output $i for the full path of the current needle.
then echo $i;
fi
done
I don't know how to split the string $i into an array. I keep getting an error on line 6
001-0059 Syntax error on line 6: token redirection not expected.
I'm planning on trying this on an actual linux distro to see if I get different results.
I appreciate any insight in advance.
Introduction
This isn't really a full solution, as I'm not 100% sure I understand what you're trying to do. However, the following contain pieces of a solution that you may be able to stitch together to do what you want.
Create Test Harness
cd /tmp
mkdir -p scriptTest/subDir{1,2}
touch scriptTest/subDir1/file{4,5,6}.txt
touch scriptTest/subDir2/file{1,8}.txt
touch scriptTest/file{1,2,3}.txt
Finding and Deleting Duplicates
In the most general sense, you could use find's -exec flag or a Bash loop to run grep or another comparison on your files. However, if all you're trying to do is remove duplicates, then you might simply be better off using the fdupes or duff utilities to identify (and optionally remove) files with duplicate contents.
For example, given that all the .txt files in the test corpus are zero-length duplicates, consider the following duff and fdupes examples.
duff
Duff has more options, but won't delete files for you directly. You'll likely need to use a command like duff -e0 * | xargs -0 rm to delete duplicates. To find duplicates using the default comparisons:
$ duff -r scriptTest/
8 files in cluster 1 (0 bytes, digest da39a3ee5e6b4b0d3255bfef95601890afd80709)
scriptTest/file1.txt
scriptTest/file2.txt
scriptTest/file3.txt
scriptTest/subDir1/file4.txt
scriptTest/subDir1/file5.txt
scriptTest/subDir1/file6.txt
scriptTest/subDir2/file1.txt
scriptTest/subDir2/file8.txt
fdupes
This utility offers the ability to delete duplicates directly in various ways. One such way is to invoke fdupes . --delete --noprompt once you're confident that you're ready to proceed. However, to find the list of duplicates:
$ fdupes -R scriptTest/
scriptTest/subDir1/file4.txt
scriptTest/subDir1/file5.txt
scriptTest/subDir1/file6.txt
scriptTest/subDir2/file1.txt
scriptTest/subDir2/file8.txt
scriptTest/file1.txt
scriptTest/file2.txt
scriptTest/file3.txt
Get a List of All Files, Including Non-Duplicates
$ find scriptTest -name \*.txt
scriptTest/file1.txt
scriptTest/file2.txt
scriptTest/file3.txt
scriptTest/subDir1/file4.txt
scriptTest/subDir1/file5.txt
scriptTest/subDir1/file6.txt
scriptTest/subDir2/file1.txt
scriptTest/subDir2/file8.txt
You could then act on each file with find's -exec {} + feature, or simply use a grep that supports the --recursive and --files-with-matches flags to find files with matching content.
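For instance, here is a rough sketch of script 2's "report files never mentioned anywhere" pass, assuming GNU find and grep are available (they may not be in strqsh); the paths are the ones from the question:
find /www/Test/htdocs/DLTest/scriptTest -type f | while IFS= read -r path; do
    name=$(basename "$path")
    # -q: exit 0 on the first match; -F: treat the name as a fixed string, not a regex
    if ! grep -rqF "$name" /www/Test/htdocs/DLTest/scriptTest; then
        echo "unused: $path"
    fi
done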
Passing Find Results to a Bash Loop as an Array
Alternatively, if you know for sure that you won't have spaces in the file names, you can also use a Bash array to store the files in a variable you can iterate over in a Bash for-loop. For example:
files=( $(find scriptTest -name \*.txt) )
for file in "${files[@]}"; do
    : # do something with each "$file"
done
Looping like this is often slower, but may provide you with the additional flexibility you need if you're doing something complicated. YMMV.
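If spaces in file names are a possibility after all, a null-delimited read loop (bash plus GNU find; offered as a hedged alternative, not part of the original answer) sidesteps the word-splitting problem:
while IFS= read -r -d '' file; do
    : # do something with each "$file"
done < <(find scriptTest -name \*.txt -print0)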

bash script to append word to filenames

I'm trying to append the word "dicom" to the front of many filenames in a set of folders. The folders all begin with "s" (referred to by "s*" in the script below), and each contains many files (specified by "*" below); I'd like all of these files to be changed using this bash script. I tried to run this:
for file in /Volumes/USB_AIB/DICOMFunCurrentBatch/MOVr1unzip/s*/*
do
mv $file dicom${file%%}
done
but got thousands of lines of the following error (once for each file within each folder--this is just an example of one of them):
mv: rename /Volumes/USB_AIB/DICOMFunCurrentBatch/MOVr1unzip/s307_1/29217684 to dicom/Volumes/USB_AIB/DICOMFunCurrentBatch/MOVr1unzip/s307_1/29217684: No such file or directory
Any ideas on how to fix it?
I don't think you have a valid path in dicom/Volumes/USB_AIB/DICOMFunCurrentBatch/MOVr1unzip/s307_1/; why do you add dicom at the beginning?
Maybe you want to append dicom to the end of $file?
mv "$file" "${file}_dicom"
or something like that.
The variable expansion ${file%%} is strange because it does nothing:
${parameter%%word} : remove the longest matching suffix pattern.
Here word is empty, so nothing is removed and you get $file back unchanged.
To move a file into a directory, the destination path must exist; to create it:
mkdir -p "$(dirname "${newfilename}")"
Maybe this is what you are trying to do:
for file in /Volumes/USB_AIB/DICOMFunCurrentBatch/MOVr1unzip/s*/*
do
mv "$file" "${file%/*}/dicom${file##*/}"
done
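For reference: ${file%/*} keeps everything before the last slash (the directory part) and ${file##*/} keeps everything after it (the basename), so each file is renamed inside its own folder. A quick check with a path from the error message:
file=/Volumes/USB_AIB/DICOMFunCurrentBatch/MOVr1unzip/s307_1/29217684
echo "${file%/*}/dicom${file##*/}"
# prints /Volumes/USB_AIB/DICOMFunCurrentBatch/MOVr1unzip/s307_1/dicom29217684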
