How to output filenames in shell?

I've written the following piece of code to print the filenames of a directory. The directory is dir1 and the filenames are L1, L2, L3, ..., L512
#!/bin/bash
TOP=`pwd`
for file in "$TOP/dir1"/*; do
echo "$file"
done
exit
But instead of printing just the filenames (L1, L2, ...), it outputs the whole path for each file. How can I change it to print only the filenames?

Use the basename command:
for file in "$TOP/dir1"/*; do
basename "$file"
done
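Alternatively, parameter expansion can strip the directory part without running a separate basename process for every file (a small sketch of the same loop):
for file in "$TOP/dir1"/*; do
echo "${file##*/}"
done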

How to delay the `redirection operator` `>` of BASH

First I create 3 files:
$ touch alpha bravo carlos
Then I want to save the list to a file:
$ ls > info.txt
However, info.txt always shows up inside its own listing:
$ cat info.txt
alpha
bravo
carlos
info.txt
It looks like the redirection operator creates my info.txt first.
My question is: how can I save the list of files before info.txt is created?
The main question is about the redirection operator: why does it act first, and how can I delay it so the command completes first? Please use the example above in your answer.
When you redirect a command's output to a file, the shell opens a file handle to the destination file, then runs the command in a child process whose standard output is connected to this file handle. There is no way to change this order, but you can redirect to a file in a different directory if you don't want the ls output to include the new file.
ls >/tmp/info.txt
mv /tmp/info.txt ./
In a production script, you should make sure that the file name is unique and unpredictable.
t=$(mktemp -t lstemp.XXXXXXXXXX) || exit  # create a unique temporary file, or bail out
trap 'rm -f "$t"' INT HUP                 # clean up if interrupted
ls >"$t"
mv "$t" ./info.txt
Alternatively, capture the output into a variable, and then write that variable to a file.
files=$(ls)
echo "$files" >info.txt
As an aside, probably don't use ls in scripts. If you want a list of files in the current directory
printf '%s\n' *
does that.
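Note, too, that in a simple command Bash expands the glob before it opens the redirection target, so on the first run the following will not list info.txt in its own output (on later runs the file already exists and will be matched):
printf '%s\n' * > info.txt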
One simple approach is to save your command output to a variable, like this:
ls_output="$(ls)"
and then write the value of that variable to the file, using any of these commands:
printf '%s\n' "$ls_output" > info.txt
cat <<< "$ls_output" > info.txt
echo "$ls_output" > info.txt
Some caveats with this approach:
Bash variables can't contain null bytes. If the output of the command includes a null byte, that byte and everything after it will be discarded.
In the specific case of ls, though, this shouldn't be an issue, because the output of ls should never contain a null byte.
$(...) removes trailing newlines. The above compensates for this by adding a newline while creating info.txt, but if the command output ends with multiple newlines, the above will effectively collapse them into a single newline.
In the specific case of ls, this could happen if a filename ends with a newline — very unusual, and unlikely to be intentional, but nonetheless possible.
Since the above adds a newline while creating info.txt, it will put a newline there even if the command output doesn't end with a newline.
In the specific case of ls, this shouldn't be an issue, because the output of ls should always end with a newline.
If you want to avoid the above issues, another approach is to save your command output to a temporary file in a different directory, and then move it to the right place; for example:
tmpfile="$(mktemp)"
ls > "$tmpfile"
mv -- "$tmpfile" info.txt
... which obviously has different caveats (e.g., it requires write access to a different directory), but should work on most systems.
One way to do what you want is to exclude the info.txt file from the ls output.
If you can rename the list file to .info.txt then it's as simple as:
ls >.info.txt
ls doesn't list files whose names start with . by default.
If you can't rename the list file but you've got GNU ls then you can use:
ls --ignore=info.txt >info.txt
Failing that, you can use:
ls | grep -v '^info\.txt$' >info.txt
All of the above options have the advantage that you can safely run them after the list file has been created.
Another general approach is to capture the output of ls with one command and save it to the list file with a second command. As others have pointed out, temporary files and shell variables are two specific ways to capture the output. Another way, if you've got the moreutils package installed, is to use the sponge utility:
ls | sponge info.txt
Finally, note that you may not be able to reliably extract the list of files from info.txt if it contains plain ls output. See ParsingLs - Greg's Wiki for more information.

rename all files of a specific type in a directory

I am trying to use bash to rename all .txt files in a directory that match a specific pattern. My two attempts below removed the files from the directory and threw an error, respectively. Thank you :)
Input:
16-0000_File-A_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import.txt
16-0002_File-B_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import.txt
Desired output:
16-0000_File-A_multianno.txt
16-0002_File-B_multianno.txt
Bash attempt 1 (this removes the files from the directory):
for f in /home/cmccabe/Desktop/test/vcf/overall/annovar/*_classify.txt ; do
# Grab file prefix.
p=${f%%_*_}
bname=`basename $f`
pref=${bname%%.txt}
mv "$f" ${p}_multianno.txt
done
Bash attempt 2 (fails with: Substitution replacement not terminated at (eval 1) line 1.):
for f in /home/cmccabe/Desktop/test/vcf/overall/annovar/*_classify.txt ; do
# Grab file prefix.
p=${f%%_*_}
bname=`basename $f`
pref=${bname%%.txt}
rename -n 's/^$f/' *${p}_multianno.txt
done
You don't need a loop. rename alone can do this:
rename -n 's/(.*?_[^_]+).*/${1}_multianno.txt/g' /home/cmccabe/Desktop/test/vcf/overall/annovar/*_classify.txt
The meaning of the regular expression is roughly:
capture everything from the start until the 2nd _,
match the rest,
and replace it all with the captured prefix plus _multianno.txt.
With the -n flag, this command will print what it would do without actually doing it.
When the output looks good, remove the -n and rerun.
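If the Perl rename isn't available, a plain-Bash loop can do the same job. A minimal sketch, assuming the filename layout from the question (the prefix is everything up to the second underscore):
for f in /home/cmccabe/Desktop/test/vcf/overall/annovar/*_classify.txt; do
b=${f##*/}            # basename
rest=${b#*_}          # drop the first underscore-separated field
mv -n -- "$f" "${f%/*}/${b%%_*}_${rest%%_*}_multianno.txt"
done
Here mv -n refuses to overwrite an existing target, which is a useful safety net while testing.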

Terminal - run 'file' (file type) for the whole directory

I'm a beginner in the terminal and bash language, so please be gentle and answer thoroughly. :)
I'm using Cygwin terminal.
I'm using the file command, which returns the file type, like:
$ file myfile1
myfile1: HTML document, ASCII text
Now, I have a directory called test, and I want to check the type of all files in it.
My endeavors:
I checked the man page for file (man file), and saw in the examples that you can list multiple filenames after the command and it prints the type of each, like:
$ file myfile{1,2,3}
myfile1: HTML document, ASCII text
myfile2: gzip compressed data
myfile3: HTML document, ASCII text
But my files' names are random, so there's no specific pattern to follow.
I tried using the for loop, which I think is going to be the answer, but this didn't work:
$ for f in ls; do file $f; done
ls: cannot open `ls' (No such file or directory)
$ for f in ./; do file $f; done
./: directory
Any ideas?
Every Unix or Linux shell supports some kind of globbing. In your case, all you need is the * glob. This magic symbol represents all folders and files at the given path.
e.g., file directory/*
The shell will substitute the glob with all matching files and directories at the given path. The resulting command that actually gets executed might be something like:
file directory/foo directory/bar directory/baz
You can use a combination of the find and xargs commands.
For example:
find /your/directory/ | xargs file
HTH
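Note that the plain pipe breaks if a filename contains spaces or quotes, because xargs splits its input on whitespace. If your find and xargs support it (GNU and modern BSD versions do), the null-delimited form is safer:
find /your/directory/ -print0 | xargs -0 file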
file directory/*
is probably the shortest and simplest solution to your issue, but this is more of an answer as to why your loops weren't working.
for f in ls; do file $f; done
ls: cannot open `ls' (No such file or directory)
For this loop, the shell reads it as "for f in the directory or file 'ls'; do...". If you wanted it to execute the ls command, you would need to do something like this:
for f in `ls`; do file "$f"; done
But that wouldn't work correctly if any of the filenames contain whitespace. It is safer and more efficient to use the shell's built-in globbing, like this:
for f in *; do file "$f"; done
For this one there's an easy fix.
for f in ./; do file $f; done
./: directory
Currently, you're asking it to run the file command on the directory "./" itself.
Change it to ./*, meaning everything within the current directory (which is the same thing as just *):
for f in ./*; do file "$f"; done
Remember, double quote variables to prevent globbing and word splitting.
https://github.com/koalaman/shellcheck/wiki/SC2086

Listing files with spaces in name

Problem
In some directory, I have some files with spaces (or maybe other special characters) in their filenames.
Trying to list one file per line, I used ls -1, but files with spaces in their names are not processed as I expected.
Example
I have these three files:
$ ls -1
My file 1.zip
My file 2.zip
My file 3.zip
and I want to list and do something with them, so I use a loop like this:
for i in `ls -1 My*.zip`; do
# statements
echo $i;
# Do something with each file;
done
But the loop splits the names at the spaces:
My
file
1.zip
My
file
2.zip
My
file
3.zip
Question
How can I solve this? Is there some alternative in the shell?
Don't use the output of ls; use:
for f in *.zip; do
echo "processing $f"
done
By not using ls, and by quoting properly.
for i in My*.zip
do
echo "$i"
done
shopt -s dotglob
for i in *;
do :; # work on "$i"
done
shopt -u dotglob
By default, files whose names begin with a dot are special in that * will not match them; shopt -s dotglob makes the glob match them too, and shopt -u turns it back off.
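If you need the matching names again for later steps, collecting them into an array via the glob is just as safe with spaces (a small sketch):
files=( My*.zip )
for f in "${files[@]}"; do
echo "processing $f"
done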

Bash for loop not working over large dataset in OSX

I have a directory with a large number of sub-directories, some of which have several zip files in them. I'm trying to write a bash script that will go through the directories looking for the name "Archive-foo", enter the sub-directory, and if it contains zip files, unzip them and then trash the zip files.
The script I wrote works on my test directories (5 sub-directories), but when I tried to use it on the main archive directory (1200+ sub-directories) it failed to do anything.
Is there a max number of items a for loop can cycle through?
Here's my code:
#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
NUMBER=0
for i in $( ls )
do
#echo "$i"" is in the Top Level"
NUMBER=$[NUMBER+1]
if ($(test -d "$i"))
then
#echo "$i"" is a Directory"
if [[ "$i" == *Archive* ]]
then
#echo "$i"" has Archive in the name"
cd "$i"
unzip -n "*".zip
mv *.zip ~/.Trash
#else
#echo "$i"" does not have Archive in the name"
fi
#else
#echo "$i"" is NOT a Directory skipping"
fi
done
echo "$NUMBER of items"
IFS=$SAVEIFS
There's a limit on the size of command lines, and for i in $( ls ) may be exceeding it.
Try this syntax instead:
ls | while read i;
do
...
done
The only problem with this is that the pipeline runs the while loop in a subshell, so assignments to NUMBER won't persist into the original shell process. You can have the loop print a line whenever it processes an item, and pipe the whole loop to wc -l to count the lines.
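For example, a sketch of that counting idea, reusing the names from the original script:
NUMBER=$(ls | while read -r i; do
# ... process "$i" as before (send its output to stderr or a file so it isn't counted) ...
echo "$i"
done | wc -l)
echo "$NUMBER of items"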
Barmer's answer hits the issue on the nose. Using for file in $(...) as a loop header is not a very good idea:
It is slower: The shell executes what is in $(..) first, then runs the for loop. It can't start the for until $(...) finishes.
It can overrun the command line buffer: The shell executes $(...) and then puts the result on the command line. The command line buffer may be about 32 kilobytes, maybe more now, but if you have 10,000 files and each name averages 20 characters, you end up with over 200KB on the command line.
For loops are terrible at handling bad file names: if file names contain whitespace, each word is treated as a separate file.
A much better construct is:
find . ... -print0 | while IFS= read -r -d '' file
do
...
done
This can execute the while read loop while the find is executing, making it faster.
This can't overrun the command line buffer.
Most importantly, this construct handles almost any type of file name. The find will print each file name terminated by a NUL character, a character that cannot appear in a file name. The -d '' tells the read command that the NUL character is the delimiter between file names, and IFS= with -r keeps read from trimming whitespace or mangling backslashes. This handles spaces, tabs, and even newlines in file names.
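As a sketch, applying this construct to the original task (the *Archive* name test and the ~/.Trash destination are taken from the question; adjust as needed):
find . -type d -name '*Archive*' -print0 | while IFS= read -r -d '' dir
do
( cd "$dir" && unzip -n '*.zip' && mv *.zip ~/.Trash )
done
Running each iteration's cd inside ( ... ) keeps the loop from getting stranded in a sub-directory.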
The find is also very flexible. You can limit the list to only files, files in a particular age range, etc. The most common ones needed to replace for loops are:
$ find . -depth 1
acts much like ls -a (note: -depth 1 is BSD/macOS find syntax; GNU find spells it -mindepth 1 -maxdepth 1).
$ find . \! -name ".*" -prune -a -depth 1
Acts just like ls, and will skip over file names that begin with a dot.
