I'm learning bash, so this is probably a simple question. I'd like to understand what happens in this case; there's no real use for this script. Let DTEST be a directory that contains some random files.
for filename in " `ls DTEST/*` " ; do
touch "$filename".txt
done
So the command substitution happens, and the ls output should be file1 file2 file3. Then, because the command substitution is double-quoted, the first line of the for loop should become for filename in "file1 file2 file3". I think it should create only one file, named file1 file2 file3.txt.
But I've seen that it tries to create a file named something like file1'$'\n''file2'$'\n''file3.txt.
I don't understand where the '$'\n'' comes from. I read in the bash manual that within double quotes a backslash followed by certain characters retains its special meaning, but why is \n generated here?
ls DTEST/* outputs each file on a separate line, so the double-quoted substitution gives you a single string with embedded newlines. The '$'\n'' you saw is just bash's ANSI-C quoting, which it uses to display those newline characters when printing the file name. Also, the output contains the directory name (i.e. DTEST/file1 etc.).
Of course you would never use ls in this way in a real script. Instead you would just use something like
for filename in DTEST/*
do
...
done
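Applied to the original task, the glob version might look like the sketch below. The DTEST directory and sample files (one name with a space, made up for the demo) are created here just so it runs standalone:

```shell
# Sketch: use a glob instead of ls. Each match arrives as one word,
# so names containing spaces (or even newlines) are handled correctly.
mkdir -p DTEST
touch DTEST/file1 DTEST/file2 "DTEST/file 3"

for filename in DTEST/*; do
    touch "$filename".txt
done

ls DTEST
```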
I am using the bash shell and want to execute a command that takes filenames as arguments; say the cat command. I need to provide the arguments sorted by modification time (oldest first) and unfortunately the filenames can contain spaces and a few other difficult characters such as "-", "[", "]". The files to be provided as arguments are all the *.txt files in my directory. I cannot find the right syntax. Here are my efforts.
Of course, cat *.txt fails; it does not give the desired order of the arguments.
cat `ls -rt *.txt`
The `ls -rt *.txt` gives the desired order, but now the blanks in the filenames cause confusion; they are seen as filename separators by the cat command.
cat `ls -brt *.txt`
I tried -b to escape non-graphic characters, but the blanks are still seen as filename separators by cat.
cat `ls -Qrt *.txt`
I tried -Q to put entry names in double quotes.
cat `ls -rt --quoting-style=escape *.txt`
I tried this and other variants of the quoting style.
Nothing that I've tried works. Either the blanks are treated as filename separators by cat, or the entire list of filenames is treated as one (invalid) argument.
Please advise!
Using --quoting-style is a good start. The trick is in parsing the quoted file names. Backticks are simply not up to the job. We're going to have to be super explicit about parsing the escape sequences.
First, we need to pick a quoting style. Let's see how the various algorithms handle a crazy file name like "foo 'bar'\tbaz\nquux". That's a file name containing actual single and double quotes, plus a space, tab, and newline to boot. If you're wondering: yes, these are all legal, albeit unusual.
$ for style in literal shell shell-always shell-escape shell-escape-always c c-maybe escape locale clocale; do printf '%-20s <%s>\n' "$style" "$(ls --quoting-style="$style" '"foo '\''bar'\'''$'\t''baz '$'\n''quux"')"; done
literal <"foo 'bar' baz
quux">
shell <'"foo '\''bar'\'' baz
quux"'>
shell-always <'"foo '\''bar'\'' baz
quux"'>
shell-escape <'"foo '\''bar'\'''$'\t''baz '$'\n''quux"'>
shell-escape-always <'"foo '\''bar'\'''$'\t''baz '$'\n''quux"'>
c <"\"foo 'bar'\tbaz \nquux\"">
c-maybe <"\"foo 'bar'\tbaz \nquux\"">
escape <"foo\ 'bar'\tbaz\ \nquux">
locale <‘"foo 'bar'\tbaz \nquux"’>
clocale <‘"foo 'bar'\tbaz \nquux"’>
The ones that actually span two lines are no good, so literal, shell, and shell-always are out. Smart quotes aren't helpful, so locale and clocale are out. Here's what's left:
shell-escape <'"foo '\''bar'\'''$'\t''baz '$'\n''quux"'>
shell-escape-always <'"foo '\''bar'\'''$'\t''baz '$'\n''quux"'>
c <"\"foo 'bar'\tbaz \nquux\"">
c-maybe <"\"foo 'bar'\tbaz \nquux\"">
escape <"foo\ 'bar'\tbaz\ \nquux">
Which of these can we work with? Well, we're in a shell script. Let's use shell-escape.
There will be one file name per line. We can use a while read loop to read a line at a time. We'll also need IFS= and -r to disable any special character handling. A standard line processing loop looks like this:
while IFS= read -r line; do ... done < file
That "file" at the end is supposed to be a file name, but we don't want to read from a file, we want to read from the ls command. Let's use <(...) process substitution to swap in a command where a file name is expected.
while IFS= read -r line; do
# process each line
done < <(ls -rt --quoting-style=shell-escape *.txt)
Now we need to convert each line with all the quoted characters into a usable file name. We can use eval to have the shell interpret all the escape sequences. (I almost always warn against using eval but this is a rare situation where it's okay.)
while IFS= read -r line; do
eval "file=$line"
done < <(ls -rt --quoting-style=shell-escape *.txt)
If you wanted to work one file at a time we'd be done. But you want to pass all the file names at once to another command. To get to the finish line, the last step is to build an array with all the file names.
files=()
while IFS= read -r line; do
eval "files+=($line)"
done < <(ls -rt --quoting-style=shell-escape *.txt)
cat "${files[@]}"
There we go. It's not pretty. It's not elegant. But it's safe.
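For completeness, another way to avoid parsing ls output entirely is to have find emit NUL-terminated records and load them into an array. This is a sketch, assuming GNU find, GNU coreutils (sort -z, cut -z), and bash 4.4+ for mapfile -d ''; the temp files and dates are made up for the demo:

```shell
# Sketch: sort *.txt by mtime (oldest first) without parsing ls.
# find prints "mtime<TAB>path" NUL-terminated; sort -zn orders by
# mtime; cut -z strips the timestamp; mapfile loads the array.
dir=$(mktemp -d)
printf 'old\n' > "$dir/a b.txt"; touch -d '2000-01-01' "$dir/a b.txt"
printf 'new\n' > "$dir/c.txt";   touch -d '2020-01-01' "$dir/c.txt"

mapfile -d '' files < <(
    find "$dir" -maxdepth 1 -name '*.txt' -printf '%T@\t%p\0' |
    sort -zn | cut -z -f2-
)
cat "${files[@]}"   # oldest first
```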
Does this do what you want?
for i in $(ls -rt *.txt); do echo "FILE: $i"; cat "$i"; done
I have a folder which may contain several files. Among those files I have files like these:
test.xml
test.jar
test.jarGENERATED
dev.project.jar
...
and many other files. To get only the "dev.project.jar" I have executed:
ls | grep ^{{dev}}.*.jar$
This displays the file with its properties. However, I only want the file name (just the file name string).
How do I fix it?
ls and grep are both unnecessary here. The shell will show you any file name matches for a wildcard:
echo dev.*.jar
(ls dev.*.jar without options will do something similar; if you see anything more than the filename, perhaps you have defined alias ls='ls -l' or something like that?)
The argument to grep should be a regular expression; what you specified would match {{dev}} and not dev, though in the absence of quoting, your shell might have expanded the braces. The proper regex would be grep '^dev\..*\.jar$', where the single quotes protect the regex from any shell expansions, . matches any single character, and * repeats it zero or more times. To match a literal dot, we backslash-escape it.
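A quick sanity check of that regex, feeding the file names from the question to grep via printf rather than ls:

```shell
# The anchored, quoted regex matches only the intended name.
printf '%s\n' test.xml test.jar test.jarGENERATED dev.project.jar |
    grep '^dev\..*\.jar$'
# prints: dev.project.jar
```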
Just printing a file name is rarely very useful; often, you actually want something like
for file in ./dev.*.jar; do
echo "$file"
: probably do more things with "$file"
done
though if that's all you want, maybe prefer printf over echo, which also lets you avoid the loop:
printf '%s\n' dev.*.jar
I tried to do something tricky today with bash scripting, which made me question my knowledge of bash scripting.
I have the following script called get_ftypes.sh, where the first input argument is a file containing file globs:
for ftype in `cat $1`
do
echo "this is ftype $ftype"
done
For example, the script would be called like this get_ftypes.sh file_types, and file_types would contain something like this:
*.txt
*.sh
I would expect the echo to print each line in the file, which in this example would be *.txt, *.sh, etc. But instead it expands the glob, *, and echoes the actual file names rather than the glob pattern I expected.
Any reason for this behavior? I cannot figure out why. Thank you.
On the line for ftype in `cat $1`, the shell performs both word splitting and pathname expansion. If you don't want that, use a while loop:
while read -r ftype
do
echo "this is ftype $ftype"
done <"$1"
This loop reads one line at a time from the file $1 and, while leading and trailing whitespace are removed from each line, no expansions are performed.
(If you want to keep the leading and trailing whitespace, use while IFS= read -r ftype).
Typically, for loops are useful when you are looping over items that are already shell-defined variables, like for x in "$@". If you are reading something in from an external command or file, you typically want a while read loop.
Alternative not using shell
When processing files line-by-line, the goal can often be accomplished more efficiently using sed or awk. As an example using awk, the above loop simplifies to:
$ awk '{print "this is ftype " $0}' filetypes
this is ftype *.txt
this is ftype *.sh
echo $(cat foo)
will take the content of foo, split it into words, and perform globbing on each word (i.e. treat the content of foo as parameters) before interpolating the result into the current command line.
echo "$(cat foo)"
will produce the content of foo as a single argument; it does not treat it as parameters and will not glob (but you will get only one pass through the loop).
You want to read foo one line at a time; use while read -r ftype for that.
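A small demonstration of the difference, in a temporary directory with known files (the names a.txt and b.txt are made up for the example):

```shell
# Unquoted $(cat foo): word splitting, then globbing on each word.
# Quoted "$(cat foo)": one argument, newlines intact.
dir=$(mktemp -d); cd "$dir"
touch a.txt b.txt
printf '*.txt\n*.sh\n' > foo

unquoted=$(echo $(cat foo))     # "*.txt" globs to a.txt b.txt; "*.sh" has no match
quoted=$(echo "$(cat foo)")     # the two lines, verbatim

printf '<%s>\n' "$unquoted" "$quoted"
```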
For the ls command, the option -1 is supposed to make ls print its output in a single column (one file per line). But if I put it in a script, it just shows every file jammed onto one line, separated by spaces.
Script:
#!/bin/ksh
text=`ls -1`
echo $text
Folder contents:
test
|--bob
|--coolDir
|--file
|--notThisDirectoryAgain
|--script.sh
|--spaces are cool
|--thatFile
Script Output:
bob coolDir file notThisDirectoryAgain script.sh spaces are cool thatFile
Output if I run ls -1 in the terminal (not in a script)
bob
coolDir
file
notThisDirectoryAgain
script.sh
spaces are cool
thatFile
it just shows every file jammed on one line, separated with spaces.
You have to consider what is actually happening.
When you do
text=`ls -1`
that runs the program ls and captures its output, so the variable text contains:
file1
file2
file3
etc.
with the names separated by newlines. When you later expand $text without quotes, the shell splits the expansion on whitespace, which by default includes space, tab, and newline, so each filename becomes a separate token.
These tokens are then passed to echo as separate parameters. echo is unaware that the newlines were ever there.
As I'm sure you know, all echo does is to write each parameter to stdout, with a space between each one.
This is why the suggestion given by @user5228826 works. Change IFS if you don't want a newline to separate tokens.
However, all you really had to do is to enclose the variable in quotes, so that it didn't get split:
echo "$text"
By the way, using `backticks` is deprecated and poor practice because it can be difficult to read, particularly when nested. If you run ksh -n on your script it will report this to you (assuming you are not using an ancient version). Use:
text=$(ls -1)
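To see the two behaviors side by side, here is a throwaway demo in a temp directory (file names made up; one contains a space):

```shell
# Quoted vs unquoted expansion of a variable holding ls output.
dir=$(mktemp -d); cd "$dir"
touch a b "c d"
text=$(ls -1)
echo $text      # unquoted: split on whitespace, everything on one line
echo "$text"    # quoted: newlines preserved, three lines
```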
Having said all that, this is a terrible way to get a list of files. UNIX shells do globbing, this is an unnecessary use of the ls external program. Try:
text=(*) # Get all the files in current directory into an array
oldIFS="$IFS" # Save the current value of IFS
IFS=$'\n' # Set IFS to a newline
echo "${text[*]}" # Join the elements of the array by newlines, and display
IFS="$oldIFS" # Reset IFS to previous value
That's because you're capturing the ls output into a variable and then expanding it unquoted. Bash behaves the same way.
I'm quite new to Bash so this might be something trivial, but I'm just not getting it. I'm trying to escape the spaces inside filenames. Have a look. Note that this is a 'working example' - I get that interleaving files with blank pages might be accomplished easier, but I'm here about the space.
#! /bin/sh
first=true
i=combined.pdf
o=combined2.pdf
for f in test/*.pdf
do
if $first; then
first=false
ifile=\"$f\"
else
ifile=$i\ \"$f\"
fi
pdftk $ifile blank.pdf cat output $o
t=$i
i=$o
o=$t
break
done
Say I have a file called my file.pdf (with a space). I want the ifile variable to contain the string combined.pdf "my file.pdf", such that pdftk is able to use it as two file arguments - the first one being combined.pdf, and the second being my file.pdf.
I've tried various ways of escaping (with or without first escaping the quotes themselves, etc.), but it keeps splitting my and file.pdf when executing pdftk.
EDIT: To clarify: I'm trying to pass multiple file names (as multiple arguments) in one variable to the pdftk command. I would like it to recognise the difference between two file names, but not tear one file name apart at the spaces.
Putting multiple arguments into a single variable doesn't make sense. Instead, put them into an array:
args=(combined.pdf "my file.pdf");
Notice that "my file.pdf" is quoted to preserve whitespace.
You can use the array like this:
pdftk "${args[@]}" ...
This will pass two separate arguments to pdftk. The quotes in "${args[@]}" are required because they tell the shell to treat each array element as a separate "word" (i.e. do not split array elements, even if they contain whitespace).
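To see it without pdftk installed, printf can stand in for the real command (a demo only; the file names need not exist):

```shell
# Each array element becomes exactly one argument.
args=(combined.pdf "my file.pdf")
printf 'arg: <%s>\n' "${args[@]}"
# prints:
# arg: <combined.pdf>
# arg: <my file.pdf>
```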
As a side note, if you use bashisms like arrays, change your shebang to
#!/bin/bash
Try:
find test/*.pdf | xargs -I % pdftk % cat output all.pdf
As I said in my comments on other answers, xargs is an efficient way to do this.
EDIT: I did not see that you needed a blank page, but I suppose you could pipe the find above to some command that puts the blank page in between (similar to a list-to-string join). I prefer this way as it's more FP-like.