bash: recursive listing of all files problem

Run a recursive listing of all the files in /var/log and redirect standard output to a file called lsout.txt in your home directory. Complete this question WITHOUT leaving your home directory.
Answer: ls -R /var/log/ > /home/bqiu/lsout.txt
I reckon the above bash command is not correct, because what it stores is:
$ ls -R /var/log
/var/log:
empty.txt setup.log setup.log.full tmp
/var/log/tmp:
fake.txt subfolder
/var/log/tmp/subfolder:
Does that mean the problem is resolved?
I reckon NOT, because the output contains more "stuff" than only files.
If the purpose is to locate all files underneath the /var/log directory recursively, then I would hope to get an answer like this:
/var/log/empty.txt
/var/log/setup.log
/var/log/setup.log.full
/var/log/tmp/fake.txt
Then someone can parse the content of the output for later use, like this:
$ perl -wnle 'print "$. :" , $_;' logfiles
1 :/var/log/empty.txt
2 :/var/log/setup.log
3 :/var/log/setup.log.full
4 :/var/log/tmp/fake.txt
This is what I've got so far:
$ ls -1R
.:
cal.sh
cokemachine.sh
dir
sort
test.sh
./dir:
afile.txt
file
subdir
./dir/subdir:
$ ls -R | sed s/^.*://g
cal.sh
cokemachine.sh
dir
sort
test.sh
afile.txt
file
subdir
But this still leaves all the directory and sub-directory names (dir and subdir), plus a couple of empty lines.
How could I get the correct result without using Perl or awk? Preferably using only basic bash commands, since Perl and awk are out of the assessment's scope.
Edit: I focused on my own $HOME folder just to restrict the files listed; my home directory has little content in it.
Edit 2: Sorry about the inappropriate wording of the initial question. I have fixed it, and hopefully everyone can see the problem now.

Try -
find /var/log > ~/lsout.txt
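Note that plain find prints directories as well as files. If you want only regular files, matching the desired output shown in the question, find's standard -type f test restricts the listing to them:
find /var/log -type f > ~/lsout.txt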

If you were given no restrictions in terms of which commands can or cannot be used, ls -R /var/log >~/lsout.txt or find /var/log -print >"$HOME/lsout.txt" or any similar combination will work just fine.
However, if the point of the assignment is to write a 100% sh-based implementation, without using ls -R, find, etc., then you should be producing something along the lines of:
#!/bin/sh
# Helper method which recursively lists the contents of a given directory
# Usage: recurse_ls target_directory
recurse_ls()
{
    TARGET_DIR="$1"
    # list contents of $TARGET_DIR
    ...
    # - recursive call to list contents of sub-directories
    recurse_ls ...
    ...
}
# MAIN
# Usage: script.sh target_directory
# - check that parameters to script.sh are correct
...
# - list the contents of target_dir and its subdirectories
recurse_ls "$1"
Useful links:
variable expansion and parameter substitution
file type test operations
globbing (wildcard expansion)
quoting to account for blanks in variable values (including filenames)
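For what it's worth, here is one minimal sketch of how the blanks in that skeleton might be filled in, using only globbing and file-type tests; printing the full path of each regular file is an assumption based on the desired output in the question above:
#!/bin/sh
# Sketch: recursively print the path of every regular file under a directory.
recurse_ls()
{
    TARGET_DIR="$1"
    for entry in "$TARGET_DIR"/*; do
        [ -e "$entry" ] || continue      # empty directory: the glob matched nothing
        if [ -d "$entry" ]; then
            recurse_ls "$entry"          # recursive call for sub-directories
        elif [ -f "$entry" ]; then
            printf '%s\n' "$entry"       # regular file: print its path
        fi
    done
}
# MAIN
[ -d "$1" ] || { echo "Usage: $0 target_directory" >&2; exit 1; }
recurse_ls "${1%/}"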

I'd guess that the answer they want is:
ls -R /var/log/ > /home/bqiu/lsout.txt
ie. the original answer you said was wrong.
Except you may want to write it as:
ls -R /var/log/ > ~/lsout.txt
That way it outputs to the home directory of whoever is logged in, rather than just user "bqiu".
When it says "Run a recursive listing of all the files":
to me, ls stands for listing and the -R option stands for recursive,
so the wording of the question suggests using ls -R to produce the listing.
But it depends upon what format they want the listing in.


How to delay `redirection operator` of BASH `>`

First I create 3 files:
$ touch alpha bravo carlos
Then I want to save the list to a file:
$ ls > info.txt
However, I always got my info.txt inside:
$ cat info.txt
alpha
bravo
carlos
info.txt
It looks like the redirection operator creates my info.txt first.
In this case, my question is: how can I save my list of files before info.txt is created?
The main question is about the redirection operator. Why does it act first, and how can I delay it so that my command completes first? Please use the example above to answer.
When you redirect a command's output to a file, the shell opens a file handle to the destination file, then runs the command in a child process whose standard output is connected to this file handle. There is no way to change this order, but you can redirect to a file in a different directory if you don't want the ls output to include the new file.
ls >/tmp/info.txt
mv /tmp/info.txt ./
In a production script, you should make sure that the file name is unique and unpredictable.
t=$(mktemp -t lstemp.XXXXXXXXXX) || exit
trap 'rm -f "$t"' INT HUP
ls >"$t"
mv "$t" ./info.txt
Alternatively, capture the output into a variable, and then write that variable to a file.
files=$(ls)
echo "$files" >info.txt
As an aside, probably don't use ls in scripts. If you want a list of files in the current directory
printf '%s\n' *
does that.
One simple approach is to save your command output to a variable, like this:
ls_output="$(ls)"
and then write the value of that variable to the file, using any of these commands:
printf '%s\n' "$ls_output" > info.txt
cat <<< "$ls_output" > info.txt
echo "$ls_output" > info.txt
Some caveats with this approach:
Bash variables can't contain null bytes. If the output of the command includes a null byte, that byte and everything after it will be discarded.
In the specific case of ls, though, this shouldn't be an issue, because the output of ls should never contain a null byte.
$(...) removes trailing newlines. The above compensates for this by adding a newline while creating info.txt, but if the command output ends with multiple newlines, then the above will effectively collapse them into a single newline.
In the specific case of ls, this could happen if a filename ends with a newline — very unusual, and unlikely to be intentional, but nonetheless possible.
Since the above adds a newline while creating info.txt, it will put a newline there even if the command output doesn't end with a newline.
In the specific case of ls, this shouldn't be an issue, because the output of ls should always end with a newline.
If you want to avoid the above issues, another approach is to save your command output to a temporary file in a different directory, and then move it to the right place; for example:
tmpfile="$(mktemp)"
ls > "$tmpfile"
mv -- "$tmpfile" info.txt
. . . which obviously has different caveats (e.g., it requires access to write to a different directory), but should work on most systems.
One way to do what you want is to exclude the info.txt file from the ls output.
If you can rename the list file to .info.txt then it's as simple as:
ls >.info.txt
ls doesn't list files whose names start with . by default.
If you can't rename the list file but you've got GNU ls then you can use:
ls --ignore=info.txt >info.txt
Failing that, you can use:
ls | grep -v '^info\.txt$' >info.txt
All of the above options have the advantage that you can safely run them after the list file has been created.
Another general approach is to capture the output of ls with one command and save it to the list file with a second command. As others have pointed out, temporary files and shell variables are two specific ways to capture the output. Another way, if you've got the moreutils package installed, is to use the sponge utility:
ls | sponge info.txt
Finally, note that you may not be able to reliably extract the list of files from info.txt if it contains plain ls output. See ParsingLs - Greg's Wiki for more information.
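If you do need a machine-readable list, one way to sidestep parsing ls entirely is to collect the names with a glob into a bash array first; the glob in the first command is expanded before the second command's redirection creates info.txt, so the listing stays clean. A minimal sketch (still ambiguous if a filename itself contains a newline):
files=( * )                            # glob expands before info.txt exists
printf '%s\n' "${files[@]}" > info.txt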

Terminal - run 'file' (file type) for the whole directory

I'm a beginner in the terminal and bash language, so please be gentle and answer thoroughly. :)
I'm using Cygwin terminal.
I'm using the file command, which returns the file type, like:
$ file myfile1
myfile1: HTML document, ASCII text
Now, I have a directory called test, and I want to check the type of all files in it.
My endeavors:
I checked in the man page for file (man file), and I could see in the examples that you could type the names of all files after the command and it gives the types of all, like:
$ file myfile{1,2,3}
myfile1: HTML document, ASCII text
myfile2: gzip compressed data
myfile3: HTML document, ASCII text
But my files' names are random, so there's no specific pattern to follow.
I tried using the for loop, which I think is going to be the answer, but this didn't work:
$ for f in ls; do file $f; done
ls: cannot open `ls' (No such file or directory)
$ for f in ./; do file $f; done
./: directory
Any ideas?
Every Unix or Linux shell supports some kind of globbing. In your case, all you need is the * glob. This magic symbol represents all folders and files in the given path.
e.g., file directory/*
Shell will substitute the glob with all matching files and directories in the given path. The resulting command that will actually get executed might be something like:
file directory/foo directory/bar directory/baz
You can use a combination of the find and xargs commands.
For example:
find /your/directory/ | xargs file
HTH
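One caveat with the pipeline above: xargs splits its input on whitespace, so file names containing spaces or newlines will break it. If that's a concern, find's standard -exec ... + form avoids the word-splitting entirely:
find /your/directory/ -type f -exec file {} +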
file directory/*
is probably the shortest and simplest solution to your issue, but this is more of an answer as to why your loops weren't working.
for f in ls; do file $f; done
ls: cannot open `ls' (No such file or directory)
For this loop, the shell reads it as "for f in the file or directory named 'ls'; do ...". If you wanted it to execute the ls command, then you would need to do something like this:
for f in `ls`; do file "$f"; done
But that wouldn't work correctly if any of the filenames contain whitespace. It is safer and more efficient to use the shell's builtin "globbing" like this
for f in *; do file "$f"; done
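Note that * skips hidden files (dotfiles) by default. If you want those included as well, bash's dotglob option does it; this is an optional extra, not something the original loop needs:
shopt -s dotglob
for f in *; do file "$f"; done
shopt -u dotglob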
For this one there's an easy fix.
for f in ./; do file $f; done
./: directory
Currently, you're asking it to run the file command on the directory "./".
Change it to ./* , meaning everything within the current directory (which is the same thing as just *):
for f in ./*; do file "$f"; done
Remember, double quote variables to prevent globbing and word splitting.
https://github.com/koalaman/shellcheck/wiki/SC2086

Recursively search a directory for each file in the directory on IBMi IFS

I'm trying to write two (edit: shell) scripts and am having some difficulty. I'll explain the purpose and then provide the script and current output.
1: get a list of every file name in a directory recursively. Then search the contents of all files in that directory for each file name. Should return the path, filename, and line number of each occurrence of the particular file name.
2: get a list of every file name in a directory recursively. Then search the contents of all files in the directory for each file name. Should return the path and filename of each file which is NOT found in any of the files in the directories.
I ultimately want to use script 2 to find and delete (actually move them to another directory for archiving) unused files in a website. Then I would want to use script 1 to see each occurrence and filter through any duplicate filenames.
I know I can make script 2 move each file as it is running rather than as a second step, but I want to confirm the script functions correctly before I do any of that. I would modify it after I confirm it is functioning correctly.
I'm currently testing this on an IBMi system in strqsh.
My test folder structure is:
scriptTest
---subDir1
------file4.txt
------file5.txt
------file6.txt
---subDir2
------file1.txt
------file7.txt
------file8.txt
------file9.txt
---file1.txt
---file2.txt
---file3.txt
I have text in some of those files which contains existing file names.
This is my current script 1:
#!/bin/bash
files=`find /www/Test/htdocs/DLTest/scriptTest/ ! -type d -exec basename {} \;`
for i in $files
do
grep -rin $i "/www/Test/htdocs/DLTest/scriptTest" >> testReport.txt;
done
Right now it functions correctly, with the exception that it doesn't provide the path to the file that had a match. Doesn't grep return the file path by default?
I'm a little further away with script 2:
#!/bin/bash
files=`find /www/Test/htdocs/DLTest/scriptTest/ ! -type d`
for i in $files
do
#split $i on '/' and store into an array
IFS='/' read -a array <<< "$i"
#get last element of the array
echo "${array[-1]}"
#perform a grep similar to script 2 and store it into a variable
filename="grep -rin $i "/www/Test/htdocs/DLTest/scriptTest" >> testReport.txt;"
#Check if the variable has anything in it
if [ $filename = "" ]
#if not then output $i for the full path of the current needle.
then echo $i;
fi
done
I don't know how to split the string $i into an array. I keep getting an error on line 6
001-0059 Syntax error on line 6: token redirection not expected.
I'm planning on trying this on an actual linux distro to see if I get different results.
I appreciate any insight in advance.
Introduction
This isn't really a full solution, as I'm not 100% sure I understand what you're trying to do. However, the following contain pieces of a solution that you may be able to stitch together to do what you want.
Create Test Harness
cd /tmp
mkdir -p scriptTest/subDir{1,2}
touch scriptTest/subDir1/file{4,5,6}.txt
touch scriptTest/subDir2/file{1,8,8}.txt
touch scriptTest/file{1,2,3}.txt
Finding and Deleting Duplicates
In the most general sense, you could use find's -exec flag or a Bash loop to run grep or other comparisons on your files. However, if all you're trying to do is remove duplicates, then you might simply be better off using the fdupes or duff utilities to identify (and optionally remove) files with duplicate contents.
For example, given that all the .txt files in the test corpus are zero-length duplicates, consider the following duff and fdupes examples.
duff
Duff has more options, but won't delete files for you directly. You'll likely need to use a command like duff -e0 * | xargs -0 rm to delete duplicates. To find duplicates using the default comparisons:
$ duff -r scriptTest/
8 files in cluster 1 (0 bytes, digest da39a3ee5e6b4b0d3255bfef95601890afd80709)
scriptTest/file1.txt
scriptTest/file2.txt
scriptTest/file3.txt
scriptTest/subDir1/file4.txt
scriptTest/subDir1/file5.txt
scriptTest/subDir1/file6.txt
scriptTest/subDir2/file1.txt
scriptTest/subDir2/file8.txt
fdupes
This utility offers the ability to delete duplicates directly in various ways. One such way is to invoke fdupes . --delete --noprompt once you're confident that you're ready to proceed. However, to find the list of duplicates:
$ fdupes -R scriptTest/
scriptTest/subDir1/file4.txt
scriptTest/subDir1/file5.txt
scriptTest/subDir1/file6.txt
scriptTest/subDir2/file1.txt
scriptTest/subDir2/file8.txt
scriptTest/file1.txt
scriptTest/file2.txt
scriptTest/file3.txt
Get a List of All Files, Including Non-Duplicates
$ find scriptTest -name \*.txt
scriptTest/file1.txt
scriptTest/file2.txt
scriptTest/file3.txt
scriptTest/subDir1/file4.txt
scriptTest/subDir1/file5.txt
scriptTest/subDir1/file6.txt
scriptTest/subDir2/file1.txt
scriptTest/subDir2/file8.txt
You could then act on each file with the find's -exec {} + feature, or simply use a grep that supports the --recursive --files-with-matches flags to find files with matching content.
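For example, with GNU grep (the search string file1.txt here is just an illustration):
grep --recursive --files-with-matches 'file1.txt' scriptTest/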
Passing Find Results to a Bash Loop as an Array
Alternatively, if you know for sure that you won't have spaces in the file names, you can also use a Bash array to store the files into a variable you can iterate over in a Bash for-loop. For example:
files=( $(find scriptTest -name \*.txt) )
for file in "${files[@]}"; do
    : # do something with each "$file"
done
Looping like this is often slower, but may provide you with the additional flexibility you need if you're doing something complicated. YMMV.
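If file names might contain spaces after all, a common alternative is to have find emit NUL-delimited names and read them back in a while loop; a sketch:
while IFS= read -r -d '' file; do
    : # do something with each "$file"
done < <(find scriptTest -name '*.txt' -print0)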

How to name output file according to a command line argument in a bash script?

These lines work when copy-pasted to the shell but don't work in a script:
ls -l file1 > /path/`echo !#:2`.txt
ls -l file2 > /path/`echo !#:2`.txt
 
ls -l file1 > /path/$(echo !#:2).txt
ls -l file2 > /path/$(echo !#:2).txt
What's the syntax for doing this in a bash script?
If possible, I would like to know how to do this for one file and for all files with the same extension in a folder.
A non-interactive shell has history expansion disabled.
Add the following two lines to your script to enable it:
set -o history
set -o histexpand
(UPDATE: I misunderstood the original question as referring to arguments to the script, not arguments to the current command within the script; this is a rewritten answer.)
As @choroba said, history is disabled by default in scripts, because it's not really the right way to do things like this in a script.
The preferred way to do things like this in a script is to store the item in question (in this case the filename) in a variable, then refer to it multiple times in the command:
fname=file1
ls -l "$fname" > "/path/$fname.txt"
Note that you should almost always put variable references inside double-quotes (as I did above) to avoid trouble if they contain spaces or other shell metacharacters. If you want to do this for multiple files, use a for loop:
for fname in *; do # this will repeat for each file (or directory) in the current directory
ls -l "$fname" > "/path/$fname.txt"
done
If you want to operate on files someplace other than the current directory, things are a little more complicated. You can use /inputpath/*, but it'll include the path along with each filename (e.g. it'd run the loop with "/inputpath/file1", "/inputpath/file2", etc), and if you use that directly in the output redirect you'll get something like > /path/inputpath/file1.txt (i.e. the two different paths will get appended together), probably not what you want. In this case, you can use the basename command to strip off the unwanted path for output purposes:
for fpath in /inputpath/*; do
ls -l "$fpath" > "/path/$(basename "$fpath").txt"
done
If you want a list of files with a particular extension, just use *.foo or /inputpath/*.foo as appropriate. However, in this case you'll wind up with the output going to files named e.g. "file1.foo.txt"; if you don't want stacked extensions, basename has an option to trim that as well:
for fpath in /inputpath/*.foo; do
ls -l "$fpath" > "/path/$(basename "$fpath" .foo).txt"
done
Finally, it might be neater (depending how complex the actual operation is, and whether it occurs multiple times in the script) to wrap this in a function, then use that:
doStuffWithFile() {
ls -l "$1" > "/path/$(basename "$1" "$2").txt"
}
for fpath in /inputpath/*.foo; do
doStuffWithFile "$fpath" ".foo"
done
doStuffWithFile /otherpath/otherfile.bar .bar

Assign file names to a variable in shell

I'm trying to write a script that does something a bit more sophisticated than what I'm going to show you, but I know that the problem is in this part.
I want each name from a list of files in a directory to be assigned to a variable (the same variable, one at a time) through a for loop, and then do something with it inside the loop. See what I mean:
for thing in $(ls $1);
do
file $thing;
done
Edit: let's say this script is called scrypt, and I have a folder named Folder with 3 files inside named A, B, C. When I run:
./scrypt Folder
I want it to show me the following on the terminal:
A: file
B: file
C: file
With the code I've shown above, I get this:
A: ERROR: cannot open `A' (No such file or directory)
B: ERROR: cannot open `B' (No such file or directory)
C: ERROR: cannot open `C' (No such file or directory)
That is the problem.
One way is to use wildcard expansion instead of ls, e.g.,
for filename in "$1"/*; do
command "$filename"
done
This assumes that $1 is the path to a directory with files in it.
If you want to only operate on plain files, add a check right after do along the lines of:
[ ! -f "$filename" ] && continue
http://mywiki.wooledge.org/ParsingLs
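Putting those two pieces together with the file command from the question, the loop might look like this:
for filename in "$1"/*; do
    [ ! -f "$filename" ] && continue   # skip anything that isn't a plain file
    file "$filename"
done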
Use globbing instead:
for filename in "$1"/* ; do
<cmd> "$filename"
done
Note the quotes around $filename
It's a bit unclear what you are trying to accomplish, but you can do essentially the same thing with functionality that already exists in find. For example, the following prints the contents of each file found in a folder:
find FolderName -maxdepth 1 -type f -exec cat {} \;
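Adapted to the question's goal of running file rather than cat (Folder being the directory name from the question), that might look like:
find Folder -maxdepth 1 -type f -exec file {} \;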
Well, I think what you meant is that the loop should show the filenames in the desired directory, so I would do it like this:
for filename in "$1"/*; do
echo "file: $filename"
done
That way, if the directory passed in is Folder and it contains 3 files named A, B, and C, the result should be:
file: Folder/A
file: Folder/B
file: Folder/C
