How to use reg expression as a variable - ksh

I'm using ksh and I have a directory full of .csv and .CSV files. I want to list all csv files, both capital and lowercase endings. So I type:
ls *@(CSV|csv)
and that lists all the files.
But if I set the regular expression as a variable like so:
REGEXP="*#(CSV|csv)"
ls $REGEXP
I get the error
ls: cannot access *@(CSV|csv): No such file or directory
Can anyone explain what is the difference between these two commands and how to fix it so that I can use the variable in place of writing out the regex?

It's not really regex but globbing as you indicate in the [glob] tag.
With globbing use this:
$ ls *.[Cc][Ss][Vv]
a.csv a.CSV b.csv b.CSV
$ GLOB=*.[Cc][Ss][Vv]
$ ls $GLOB
a.csv a.CSV b.csv b.CSV
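The same idea carries over into a script. A minimal sketch in ksh (filenames as in the example above); the bracket classes are plain glob characters, so they stay active when the unquoted variable is expanded:
GLOB='*.[Cc][Ss][Vv]'
for f in $GLOB; do          # leave $GLOB unquoted so the shell expands the pattern
    printf '%s\n' "$f"
done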

Related

how to escape parenthesis in ls commands

I cannot manage to filter files with parentheses using ls (on bash):
$ ls -1
a_échéancier(1).pdf
a_échéancier(2).pdf
a_échéancier(3).pdf
a_échéancier(4).pdf
a_échéancier(5).pdf
a_échéancier(6).pdf
a_échéancier.pdf
$
A try here:
$ ls "*).pdf"
ls: cannot access '*).pdf': No such file or directory
$
$ ls '*\).pdf'
ls: cannot access '*\).pdf': No such file or directory
$
You are escaping too many characters; the only character that needs to be escaped is ):
ls *\).pdf
though you can quote everything else except the *:
ls *").pdf"
The shell itself is what expands the glob before ls even runs; ls just gets an explicit list of filenames. Quoting the * makes ls try to list the single file named *).pdf, not every file in the current directory that matches the pattern.
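A quick way to see the difference (output abbreviated, assuming the files listed above):
$ echo *\).pdf              # the shell expands the glob before the command runs
a_échéancier(1).pdf a_échéancier(2).pdf ... a_échéancier(6).pdf
$ echo '*).pdf'             # quoted: no expansion, the command sees the literal string
*).pdf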
ls -1 | grep ').pdf' also does the job: grep filters the listing with a regular expression, whereas giving ls a quoted pattern just makes it look for that exact filename.

Print list of files in a directory to a text file (but not the text file itself) from terminal

I would like to print all the filenames of every file in a directory to a .txt file.
Let's assume that I had a directory with 3 files:
file1.txt
file2.txt
file3.txt
and I tried using ls > output.txt.
The thing is that when I open output.txt I find this list:
file1.txt
file2.txt
file3.txt
output.txt
Is there a way to avoid printing the name of the file where I'm redirecting the output? Or better is there a command able to print all the filenames of files in a directory except one?
printf '%s\n' * > output.txt
Note that this assumes that there's no preexisting output.txt file -
if so, delete it first.
printf '%s\n' * uses globbing (filename expansion) to robustly print the names of all files and subdirectories located in the current directory, line by line.
Globbing happens before output.txt is created via output redirection > output.txt (which still happens before the command is executed, which explains your problem), so its name is not included in the output.
Globbing also avoids the use of ls, whose use in scripting is generally discouraged.
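For instance, in a directory that contains only the three files from the question, the glob is already expanded by the time output.txt is created:
$ printf '%s\n' * > output.txt
$ cat output.txt
file1.txt
file2.txt
file3.txt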
In general, it is not a good idea to parse the output of ls, especially in production-quality scripts that need to keep working reliably over time. See this page to find out why: Don't parse ls output
In your example, output.txt is part of the output of ls > output.txt because the shell sets up the redirection (creating output.txt) before running ls.
The simplest way to get the right behavior for your case would be:
ls file*txt > output.txt # as long as you are looking for files named that way
or, store the output in a hidden file (or in a normal file in some other directory) and then move it to the final place:
ls > .output.txt && mv .output.txt output.txt
A more generic solution would be using grep -v:
ls | grep -vFx output.txt > output.txt
Or, you can use an array:
files=( "$(ls)" )
printf '%s\n' "${files[@]}" > output.txt
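A variant in the same spirit that avoids ls entirely (a sketch, assuming a bash/ksh-style array):
files=( * )                                 # glob into an array, one element per name
printf '%s\n' "${files[@]}" > output.txt    # the glob expanded before output.txt was created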
ls has an ignore option, and we can also use the find command.
Using ls with ignore option
ls -I "output.txt" > output.txt
ls --ignore "output.txt" > output.txt
-I and --ignore are the same option; as the man page says, it tells ls to "not list implied entries matching shell PATTERN".
Using find
find \! -name "output.txt" > output.txt
The -name option in find matches files/directories whose names match the pattern.
! -name excludes those whose names match the pattern.
find \! -name "output.txt" -printf '%P\n' > output.txt
%P strips the path and gives only names.
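If only the current directory should be listed, find can also be told not to recurse; a sketch (assuming GNU find, where -mindepth/-maxdepth are extensions):
find . -mindepth 1 -maxdepth 1 \! -name "output.txt" -printf '%P\n' > output.txt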
The safest way, without assuming anything about the file names, is to use bash arrays (in memory) or a temporary file. A temporary file does not need memory, so it may be even safer. Something like:
#!/bin/bash
tmp=$(tempfile)
ls > "$tmp"
mv "$tmp" output.txt
Using the ls and awk commands you can get the desired output:
ls -ltr | awk '/txt/ && $9 != "output.txt" {print $9}' > output.txt
This prints only the filenames (the ninth field of the long listing), skipping output.txt itself in case ls picks it up.
My way would be like:
ls *.txt > output.txt
Note that the shell always expands globs before running the command. In your specific case, the glob expansion goes like:
# "ls *.txt > output.txt" will be expanded as
ls file1.txt file2.txt file3.txt > output.txt
The reason you get "output.txt" in your final output file with ls > output.txt is that the shell sets up the redirection before it runs ls: it creates (or truncates) output.txt first and only then executes the command, so by the time ls reads the directory the file already exists and gets listed.
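You can confirm that it is the shell, not ls, that creates the file: a redirection with no command at all still creates it (demo.txt is just an illustrative name, and the three example files are assumed):
$ > demo.txt      # no command runs, yet the shell creates demo.txt
$ ls
demo.txt  file1.txt  file2.txt  file3.txt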

Need to concatenate a string to each line of ls command output in unix

I am a beginner in shell scripting. Below is my requirement in the UNIX Korn shell.
Example:
When we list files using the ls command and redirect to a file, the file names are stored as below.
$ ls FILE*>FLIST.TXT
$ cat FLIST.TXT
FILE1
FILE2
FILE3
But I need output as below with a prefixed constant string STR,:
$ cat FLIST.TXT
STR,FILE1
STR,FILE2
STR,FILE3
Please let me know what the ls command should be to achieve this output.
You can't use ls alone to prepend data to each file name; ls exists to list files.
You will need to use other tools alongside ls.
You can append to the front of each line using the sed command:
cat FLIST.TXT | sed 's/^/STR,/'
This will send the changes to stdout.
If you'd like to change the actual file, run sed in place:
sed -i -e 's/^/STR,/' FLIST.TXT
To do the append before writing to the file, pipe ls into sed:
ls FILE* | sed 's/^/STR,/' > FLIST.TXT
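With the three example files from the question, that produces exactly the desired FLIST.TXT:
$ ls FILE* | sed 's/^/STR,/' > FLIST.TXT
$ cat FLIST.TXT
STR,FILE1
STR,FILE2
STR,FILE3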
The following should work:
ls FILE* | xargs -i echo "STR,{}" > FLIST.TXT
It takes each of the file names produced by ls and adds the "STR," prefix to it before it is written to FLIST.TXT.
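If you would rather not involve ls at all, the shell's own globbing gives the same result; a sketch, assuming the FILE* names contain no newlines:
for f in FILE*; do
    printf 'STR,%s\n' "$f"   # prefix each name before it is written out
done > FLIST.TXT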

Listing files in date order with spaces in filenames

I am starting with a file containing a list of hundreds of files (full paths) in a random order. I would like to list the details of the ten latest files in that list. This is my naive attempt:
$ ls -las -t `cat list-of-files.txt` | head -10
That works as long as none of the files have spaces in their names, but fails if they do, because those names are split at the spaces and treated as separate files. A file called "hello world" gives me:
ls: hello: No such file or directory
ls: world: No such file or directory
I have tried quoting the files in the original list-of-files file, but the command substitution still splits the names at the spaces, treating the quotes as part of the filenames:
$ ls -las -t `awk '{print "\"" $0 "\""}' list-of-files.txt` | head -10
ls: "hello: No such file or directory
ls: world": No such file or directory
The only way I can think of doing this, is to ls each file individually (using xargs perhaps) and create an intermediate file with the file listings and the date in a sortable order as the first field in each line, then sort that intermediate file. However, that feels a bit cumbersome and inefficient (hundreds of ls commands rather than one or two). But that may be the only way to do it?
Is there any way to pass "ls" a list of files to process, where those files could contain spaces - it seems like it should be simple, but I'm stumped.
Instead of "one or more blank characters", you can force bash to use another field separator:
OIFS=$IFS
IFS=$'\n'
ls -las -t $(cat list-of-files.txt) | head -10
IFS=$OIFS
However, I don't think this code would be more efficient than doing a loop; in addition, that won't work if the number of files in list-of-files.txt exceeds the max number of arguments.
Try this:
xargs -d '\n' -a list-of-files.txt ls -last | head -n 10
I'm not sure whether this will work, but did you try escaping spaces with \? Using sed or something. sed "s/ /\\\\ /g" list-of-files.txt, for example.
This worked for me:
xargs -d\\n ls -last < list-of-files.txt | head -10

Concatenating multiple text files into a single file in Bash

What is the quickest and most pragmatic way to combine all *.txt files in a directory into one large text file?
Currently I'm using windows with cygwin so I have access to BASH.
Windows shell command would be nice too but I doubt there is one.
This appends the output to all.txt
cat *.txt >> all.txt
This overwrites all.txt
cat *.txt > all.txt
Just remember, for all the solutions given so far, the shell decides the order in which the files are concatenated. For Bash, IIRC, that's alphabetical order. If the order is important, you should either name the files appropriately (01file.txt, 02file.txt, etc...) or specify each file in the order you want it concatenated.
$ cat file1 file2 file3 file4 file5 file6 > out.txt
The Windows shell command type can do this:
type *.txt > outputfile.txt
The type command also writes the file names to stderr, which is not captured by the > redirect operator (but will show up on the console).
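If those names on the console are unwanted, stderr can be discarded as well (cmd.exe syntax; a sketch):
type *.txt > outputfile.txt 2> nul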
You can use Windows shell copy to concatenate files.
C:\> copy *.txt outputfile
From the help:
To append files, specify a single file for destination, but multiple files for source (using wildcards or file1+file2+file3 format).
Be careful, because none of these methods work with a large number of files. Personally, I used this line:
for i in $(ls | grep ".txt");do cat $i >> output.txt;done
EDIT: As someone said in the comments, you can replace $(ls | grep ".txt") with $(ls *.txt)
EDIT: thanks to @gnourf_gnourf's expertise, the use of a glob is the correct way to iterate over files in a directory. Consequently, blasphemous expressions like $(ls | grep ".txt") must be replaced by *.txt (see the article here).
Good Solution
for i in *.txt; do cat "$i" >> output.txt; done
How about this approach?
find . -type f -name '*.txt' -exec cat {} + >> output.txt
The most pragmatic way with the shell is the cat command. Other ways include:
awk '1' *.txt > all.txt
perl -ne 'print;' *.txt > all.txt
type [source folder]\*.[File extension] > [destination folder]\[file name].[File extension]
For Example:
type C:\*.txt > C:\1\all.txt
That will take all the txt files in the C:\ folder and save them in the C:\1 folder under the name all.txt
Or
type [source folder]\* > [destination folder]\[file name].[File extension]
For Example:
type C:\* > C:\1\all.txt
That will take all the files that are present in the folder and put their content in C:\1\all.txt
You can do it like this:
cat [directory_path]/**/*.[h,m] > test.txt
If you use brace expansion ({h,m}) to list the extensions instead, the pattern expands into two separate globs, so all files of one extension are concatenated before all files of the other; that is the sequencing problem the bracket class avoids.
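For the plain *.txt case of the question, a minimal recursive sketch, assuming bash with globstar available:
shopt -s globstar           # let ** match subdirectories recursively (bash 4+)
cat **/*.txt > all.txt      # delete any existing all.txt first so it is not read as input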
The most upvoted answers can fail if the file list is too long.
A more robust solution is to use fd:
fd -e txt -d 1 -X awk 1 > combined.txt
-d 1 limits the search to the current directory. If you omit this option then it will recursively find all .txt files from the current directory.
-X (otherwise known as --exec-batch) executes a command (awk 1 in this case) for all the search results at once.
Note that fd is not a "standard" Unix program, so you will likely need to install it.
If you run into the problem where it cats all.txt into itself,
you can check whether all.txt already exists and remove it first,
like this:
[ -e all.txt ] && rm all.txt
All of that is nasty....
ls | grep '\.txt$' | while read -r file; do cat "$file" >> ./output.txt; done
easy stuff.
