Add string to each member of variable in Bash

I have the following command to gather all files in a folder and concatenate them, but what is held in the variable is only the file names and not the directory. How can I add 'colid-data/' to each of the files for cat to use?
cat $(ls -t colid-data) > catfiles.txt

List the filenames, not the directory.
cat $(ls -t colid-data/*) > catfiles.txt
Note that this will not work if any of the filenames contain whitespace. See Why not parse ls? for better alternatives.

If you want to concatenate them in date order, consider using zsh:
cat colid-data/*(.om) >catfiles.txt
That would concatenate all regular files only, in order of most recently modified first.
From bash, you could do this with
zsh -c 'cat colid-data/*(.om)' >catfiles.txt
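If zsh is not available, a bash-only sketch along the following lines might do. The function name sort_by_mtime, the array name sorted, and the demo file names are made up for illustration. Note that -f follows symlinks, unlike the zsh qualifier, and that -nt only distinguishes files whose timestamps actually differ.

```shell
#!/usr/bin/env bash
# sort_by_mtime FILE...: fill the global array "sorted" with the regular files
# among the arguments, most recently modified first (simple insertion sort).
sort_by_mtime() {
    sorted=()
    local f i
    for f in "$@"; do
        [[ -f $f ]] || continue          # skip directories and other non-regular files
        i=${#sorted[@]}
        # Shift older entries one slot to the right until f's position is found.
        while (( i > 0 )) && [[ $f -nt ${sorted[i-1]} ]]; do
            sorted[i]=${sorted[i-1]}
            i=$((i - 1))
        done
        sorted[i]=$f
    done
}

# Demo on a scratch directory (hypothetical names, explicit timestamps).
demo_dir=$(mktemp -d)
printf 'old\n' > "$demo_dir/a old.txt"
printf 'new\n' > "$demo_dir/b new.txt"
touch -t 202001010000 "$demo_dir/a old.txt"
touch -t 202101010000 "$demo_dir/b new.txt"
mkdir "$demo_dir/subdir"

sort_by_mtime "$demo_dir"/*
combined=$(cat -- "${sorted[@]}")        # newest file's content comes first
printf '%s\n' "$combined"
rm -rf "$demo_dir"
```

For the question's case this would be sort_by_mtime colid-data/* followed by cat -- "${sorted[@]}" >catfiles.txt, after checking that the array is not empty (cat with no arguments reads standard input).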
If the ordering of the files is not important (and if there's only regular files in the directory, no subdirectories), just use
cat colid-data/* >catfiles.txt
All of these variations would work with filenames containing spaces, tabs and newlines, since the list of pathnames returned by a filename globbing pattern is not split into further words (which the result of an unquoted command substitution is).
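A quick way to see the difference for yourself is to count the words each approach produces. This is only a sketch: the scratch directory, the file names, and the helper count_args are made up, and the default IFS is assumed.

```shell
# Scratch directory with one file name that contains a space.
demo_dir=$(mktemp -d)
printf 'one\n' > "$demo_dir/first file.txt"
printf 'two\n' > "$demo_dir/second.txt"

# Helper: print how many arguments it received.
count_args() { echo "$#"; }

# A glob yields exactly one word per file.
glob_count=$(count_args "$demo_dir"/*)

# An unquoted command substitution is split on whitespace,
# so "first file.txt" becomes two separate words.
subst_count=$(count_args $(ls "$demo_dir"))

echo "glob: $glob_count words, command substitution: $subst_count words"
rm -rf "$demo_dir"
```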

Related

Shell: Display a folder's file contents (ls), search for a file name and display it

I have a folder which may contain several files. Among those files I have files like these:
test.xml
test.jar
test.jarGENERATED
dev.project.jar
...
and many other files. To get only the "dev.project.jar" I have executed:
ls | grep ^{{dev}}.*.jar$
This displays the file with its properties for me. However, I only want the file name (only the file name string)
How to rectify it??
ls and grep are both unnecessary here. The shell will show you any file name matches for a wildcard:
echo dev.*.jar
(ls dev.*.jar without options will do much the same; if you see anything more than the filename, perhaps you have defined alias ls='ls -l' or something like that?)
The argument to grep should be a regular expression; what you specified would match {{dev}} and not dev, though in the absence of quoting, your shell might have expanded the braces. The proper regex would be grep '^dev\..*\.jar$' where the single quotes protect the regex from any shell expansions, and . matches any character, and * repeats that character as many times as possible. To match a literal dot, we backslash-escape it.
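To see what the escaped dots buy you, here is a small sketch that runs both patterns over a list of names. The first four names are from the question; devXprojectXjar is made up to show a false positive.

```shell
# Candidate names, one per line.
names='test.xml
test.jar
test.jarGENERATED
dev.project.jar
devXprojectXjar'

# Unescaped dots match any character, so devXprojectXjar slips through.
loose=$(printf '%s\n' "$names" | grep -c '^dev.*.jar$')

# Escaped dots only match a literal ".", so only dev.project.jar remains.
strict=$(printf '%s\n' "$names" | grep -c '^dev\..*\.jar$')

echo "loose pattern: $loose matches, strict pattern: $strict match"
```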
Just printing a file name is rarely very useful; often, you actually want something like
for file in ./dev.*.jar; do
    echo "$file"
    : probably do more things with "$file"
done
though if that's all you want, maybe prefer printf over echo, which also lets you avoid the loop:
printf '%s\n' dev.*.jar
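One thing to watch with the loop: if nothing matches, bash by default leaves the pattern as a literal string, and the loop body runs once with it. A minimal sketch of a guard follows, using a scratch directory; the prod.*.jar pattern is made up to show the no-match case.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/dev.project.jar" "$demo_dir/test.jar"

# Pattern with a match: the guard lets the real file through.
matches=0
for file in "$demo_dir"/dev.*.jar; do
    [ -e "$file" ] || continue     # skip the literal pattern if nothing matched
    matches=$((matches + 1))
done

# Pattern without a match: the loop sees the unexpanded pattern and skips it.
misses=0
for file in "$demo_dir"/prod.*.jar; do
    [ -e "$file" ] || continue
    misses=$((misses + 1))
done

echo "dev.*.jar: $matches file(s), prod.*.jar: $misses file(s)"
rm -rf "$demo_dir"
```

In bash, shopt -s nullglob is another option; it makes an unmatched pattern expand to nothing.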

How do I only list specific files with 'ls' in bash?

I was wondering how I can list files with ls in bash that will only list a specific subset of files?
For example, I have a folder with 10000 files, some of which are named:
temp_cc1_covmat and temp_cc1_slurm, where the number ranges from 1 to 1000.
So how would I list only, say, temp_cc400_slurm through temp_cc499_slurm?
I want to do this because I would like to queue only the files ending in slurm on a supercomputer. I could do sbatch *_slurm, but there are also a lot of other files in the folder that end with _slurm.
You can use this Brace Expansion in bash:
temp_cc{400..499}_slurm
To list these file use:
echo temp_cc{400..499}_slurm
or:
printf "%s\n" temp_cc{400..499}_slurm
or even ls:
ls temp_cc{400..499}_slurm
Using the ? wildcard:
$ ls temp_cc4??_slurm
man 7 glob:
Wildcard matching
A string is a wildcard pattern if it contains one of the characters
'?', '*' or '['. Globbing is the operation that expands a wildcard
pattern into the list of pathnames matching the pattern. Matching is
defined by:
A '?' (not between brackets) matches any single character.
The "argument list too long" error can also occur when using ?. I tested with ls test????? and it worked, but with ls test[12]????? I got the error. (Yes, you could also use ls temp_cc4[0-9][0-9]_slurm.)
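The two approaches differ in one way worth knowing: brace expansion generates every name in the range whether or not the file exists, while a glob only expands to files that are actually there. A small sketch in a scratch directory (the helper count_args and the choice of files are made up):

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir"/temp_cc399_slurm "$demo_dir"/temp_cc400_slurm \
      "$demo_dir"/temp_cc450_slurm "$demo_dir"/temp_cc450_covmat

# Helper: print how many arguments it received.
count_args() { echo "$#"; }

# Brace expansion: always 100 words, existing or not.
brace_count=$(count_args "$demo_dir"/temp_cc{400..499}_slurm)

# Glob: only the files that exist and match.
glob_count=$(count_args "$demo_dir"/temp_cc4[0-9][0-9]_slurm)

echo "brace expansion: $brace_count names, glob: $glob_count files"
rm -rf "$demo_dir"
```

So with ls or sbatch, the brace form will complain about every missing number in the range, and the glob form will not.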

Expand part of the path in bash script

I am trying to list all files located in specific sub-directories of a directory in my bash script. Here is what I tried.
root_dir="/home/shaf/data/"
sub_dirs_prefixes=('ab' 'bp' 'cd' 'cn' 'll' 'mr' 'mb' 'nb' 'nh' 'nw' 'oh' 'st' 'wh')
ls "$root_dir"{$(IFS=,; echo "${sub_dirs_prefixes[*]}")}"rrc/"
Please note that I do not want to expand value stored in $root_dir as it may contain spaces but I do want to expand sub-path contained in {} which is a comma delimited string of contents of $sub_dirs_prefixes. I stored sub-directories prefixes in an array variable, $sub_dirs_prefixes , because I have to use them later on for something else.
I am getting following error:
ls: cannot access /home/shaf/data/{ab,bp,cd,cn,ll,mr,mb,nb,nh,nw,oh,st,wh}rrc/
If I copy the path in error message and run ls from command line, it does display contents of listed sub-directories.
Brace expansion is performed before parameter expansion and command substitution, so braces produced by a substitution are never expanded. You can instead use command substitution to generate an extended glob pattern, which is expanded afterwards.
shopt -s extglob
ls "$root_dir"/$(IFS="|"; echo "@(${sub_dirs_prefixes[*]})rrc")
By the time parameter and command substitutions have completed, the shell sees this just before performing pathname expansion:
ls "/home/shaf/data/"/@(ab|bp|cd|cn|ll|mr|mb|nb|nh|nw|oh|st|wh)rrc
The @(...) pattern matches exactly one of the enclosed prefixes.
It gets a little trickier if the components of the directory names contain characters that need to be quoted, since we aren't quoting the command substitution.
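One way to sidestep the quoting concern altogether is to build the list of directories in an array, so every path stays a single, fully quoted word. This is a sketch rather than a drop-in answer: the scratch root with a space in its name, the shortened prefix list, and the array name target_dirs are all made up for the demo.

```shell
#!/usr/bin/env bash
# Scratch layout standing in for /home/shaf/data/ (note the space in the path).
base=$(mktemp -d)
root_dir="$base/my data/"
sub_dirs_prefixes=('ab' 'bp' 'cd')

for prefix in "${sub_dirs_prefixes[@]}"; do
    mkdir -p "${root_dir}${prefix}rrc"
    touch "${root_dir}${prefix}rrc/file_$prefix"
done
mkdir -p "${root_dir}zzrrc"              # not in the prefix list, must be ignored

# Build one quoted path per prefix; no brace or glob expansion is involved.
target_dirs=()
for prefix in "${sub_dirs_prefixes[@]}"; do
    target_dirs+=( "${root_dir}${prefix}rrc/" )
done

ls "${target_dirs[@]}"
rm -rf "$base"
```

Unlike the pattern-based version, this passes every listed directory to ls whether or not it exists, so missing ones produce an error message.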

How to list files whose names are a substring of files to ignore

Hello, I'm new to bash and I'd like to know how I can list files whose names end with one or two digits.
e.g.
hello1
hello2
hello3
hello11
are the files i want to list in a directory, but that directory also includes files I don't want to list such as:
hello2-super
hello3_nice
hello1-the_best1
You can do
ls *[0-9]
to list all the files that end in a digit, or replace the [0-9] with other groups if you have different matches in the future you want.
If you want to get all the "hello{digits}" type files but exclude "hello1-the_best1" you can use extended globs in bash, or you could use grep. With extended globs:
shopt -s extglob
ls hello[0-9]?([0-9])
which enables extended globs, then matches hello followed by one or two digits.
With grep you could do
ls | grep -E '^hello[0-9]{1,2}$'
which does the same, but requires a pipeline and a second process.
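If you would rather not depend on extglob, two ordinary patterns cover the same cases: one for a single digit and one for two digits. A minimal sketch in a scratch directory, using the file names from the question (the array name wanted is made up):

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
( cd "$demo_dir" && touch hello1 hello2 hello3 hello11 \
                          hello2-super hello3_nice hello1-the_best1 )

# hello + one digit, then hello + two digits; nothing may follow the digits.
wanted=()
for f in "$demo_dir"/hello[0-9] "$demo_dir"/hello[0-9][0-9]; do
    [ -e "$f" ] || continue        # an unmatched pattern stays literal; skip it
    wanted+=( "$f" )
done

printf '%s\n' "${wanted[@]##*/}"
rm -rf "$demo_dir"
```

The results come out grouped by pattern (one-digit names first) rather than in a single sorted list.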

bash for loop on directories with symbols

I'm trying to create a for loop over folders whose names contain spaces, commas and parentheses. For example:
Italy - Rimini (Feb 09, 2013)
First it scans a parent folder /albums for sub-folders that look like the example above. Then it executes a curl action on the files in those sub-folders. It works fine if the sub-folders do not contain spaces, commas or other symbols.
for dir in `ls /albums`;
do
    for file in /albums/$dir/*
    do
        curl http://upload.com/up.php -F uploadfile[]=@"$file" > out.txt
        php process.php
    done
    php match.php
done
But if there are such symbols, it seems the curl bit gets stuck: it can't find the $file (probably because $dir is incorrect).
I could replace all the symbols in the sub-dirs or remove them or rename the folders to 001, 002 and it works flawlessly. But before resorting to that I'd like to know if it can be solved using bash tricks while keeping the sub-folder name intact?
Familiarize yourself with the concept of word splitting of your shell. Then realize that using ls to get a list of files with spaces is asking for trouble. Instead, use shell globbing and then quote expansions:
cd /albums
for dir in *; do
    for file in /albums/"$dir"/*; do
        echo x"$dir"x"$file"x
    done
    php match.php
done
For problems with spaces in filenames, you can change IFS to
IFS='
'
which tells the shell that only line breaks are field separators. By default, IFS is set to space, tab and newline.
Put this before the loop begins, and it will work with filenames that contain spaces (though not with names that contain newlines).
And of course, put quotes around your variable expansions.
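For comparison, here is a sketch of the glob-and-quote approach that needs no IFS change at all. The scratch directory stands in for /albums and the photo names are made up; the curl call is only shown as a comment so that the example does not touch the network.

```shell
# Scratch stand-in for /albums with an awkward folder name.
albums=$(mktemp -d)
mkdir "$albums/Italy - Rimini (Feb 09, 2013)"
touch "$albums/Italy - Rimini (Feb 09, 2013)/photo 1.jpg" \
      "$albums/Italy - Rimini (Feb 09, 2013)/photo 2.jpg"

file_count=0
last_file=
for dir in "$albums"/*/; do            # the trailing slash matches directories only
    for file in "$dir"*; do
        [ -f "$file" ] || continue     # skip anything that is not a regular file
        file_count=$((file_count + 1))
        last_file=$file
        # A real script would upload here, for example:
        # curl http://upload.com/up.php -F "uploadfile[]=@$file" > out.txt
    done
done

echo "found $file_count file(s), last one: ${last_file##*/}"
rm -rf "$albums"
```

Quoting the whole -F argument also keeps the shell from treating uploadfile[] as a glob pattern.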
