I have a bunch of sequentially named files in this format: imageXXX.jpg. So it would be like image001.jpg and onward. I just want to keep the number part and get rid of the leading 0's, so that file would instead be named 1.jpg. How could I achieve this using Bash?
Pure Bash:
shopt -s extglob
for f in image*.jpg; do mv "$f" "${f/#image*(0)}"; done
Additional code could check for name collisions or handle other error conditions. You could use mv -i to prompt before files are overwritten.
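For instance, a minimal sketch with a collision check, using the same extglob pattern (note that a name like image000.jpg would collapse to .jpg, so you may also want to guard against that edge case):
shopt -s extglob
for f in image*.jpg; do
  new="${f/#image*(0)}"
  if [ -e "$new" ]; then
    echo "skipping $f: $new already exists" >&2   # avoid clobbering
  else
    mv "$f" "$new"
  fi
done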
On Linux the venerable Perl utility rename is friendly:
$ rename 's/^image0+//' image*.jpg
You should be aware that stripping leading zeros will ruin the sort order; that is, *.jpg will then order like:
1.jpg
10.jpg
11.jpg
...
2.jpg
20.jpg
If you want to keep the order, use
$ rename 's/^image//' image*.jpg
instead.
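As an aside, if you do strip the zeros, GNU ls can still list the results in numeric order with version sort (assuming GNU coreutils; sort -V does the same for arbitrary input):
$ ls -v *.jpg
1.jpg
2.jpg
10.jpg
11.jpg
20.jpg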
Added in response to system identification:
You can likely script it in bash alone, but it would be non-trivial, and the failure cases really need to be handled correctly. Yes, getting Perl onto a system is an extra step, but it is easy, and that wheel has already been invented:
Fedora Core 8 Perl RPM: http://rpm.pbone.net/index.php3/stat/4/idpl/5152898/dir/fedora_8/com/perl-5.8.8-30.n0i.51.fc8.i386.rpm.html
CPAN rename: http://metacpan.org/pod/File::Rename
Added in response to silent failure:
rename, like chmod, will complain if you give it malformed arguments, but both are silent when what you request has no effect. For example:
$ ls -l junk
-rw-r--r-- 1 msw msw 0 2010-09-24 01:59 junk
$ chmod 688 junk
chmod: invalid mode: '688'
$ chmod 644 junk # already mode 644; nothing happened, no error
$ rename 's/bob/alice/' ju*k
# there was no 'bob' in 'junk' to substitute, no change, no error
$ ls -l junk
-rw-r--r-- 1 msw msw 0 2010-09-24 01:59 junk
$ rename 's/un/ac/' j*k # but there is an 'un' in 'junk', change it
$ ls -l j*k
-rw-r--r-- 1 msw msw 0 2010-09-24 01:59 jack
You can make rename less silent:
$ rename --verbose 's/ac/er/' j*k
jack renamed as jerk
$ rename --verbose 's/ac/er/' j*k # nothing to rename
$
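If you are unsure what a substitution will do, File::Rename's rename also supports a dry run via -n (a.k.a. --nono), which prints the intended renames without performing any of them:
$ rename -n 's/^image0+//' image*.jpg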
Related
I was trying to copy some files with numbering using for i in {};do cp ***;done, but I encountered an error.
$ for i in {0..2};do cp ./CTCC_noSDS_0$i_hg19.bwt2pairs.validPairs 3_dataset_pool/;done
cp: cannot stat ‘./CTCC_noSDS_0.bwt2pairs.validPairs’: No such file or directory
cp: cannot stat ‘./CTCC_noSDS_0.bwt2pairs.validPairs’: No such file or directory
cp: cannot stat ‘./CTCC_noSDS_0.bwt2pairs.validPairs’: No such file or directory
The file names look like this:
-rw-r--r-- 1 jiangxu lc_lc 456M Nov 12 20:22 CTCC_noSDS_00_hg19.bwt2pairs.validPairs
-rw-r--r-- 1 jiangxu lc_lc 466M Nov 12 20:23 CTCC_noSDS_01_hg19.bwt2pairs.validPairs
-rw-r--r-- 1 jiangxu lc_lc 473M Nov 12 20:23 CTCC_noSDS_02_hg19.bwt2pairs.validPairs
I can cp the files one by one manually but cannot use the for loop. It seems that the system just ignores the $i for no reason, so could anyone tell me what the problem with the command is?
Using a similar method to the OP's
You can do something like this (call this script b.bash):
#!/bin/bash
DST_DIR=./mydir/
SRC_DIR=./
for i in {1..5}; do
    echo "[*] Trying to find files with number $i"
    if [ "$i" -lt 10 ]; then
        potential_file="$SRC_DIR/CTCC_noSDS_0${i}_hg19.bwt2pairs.validPairs"
    else
        potential_file="$SRC_DIR/CTCC_noSDS_${i}_hg19.bwt2pairs.validPairs"
    fi
    if [ -f "$potential_file" ]; then
        echo "[!] Moving $potential_file"
        cp "$potential_file" "$DST_DIR"
    fi
done
Let us say we have the following in the current directory:
$ ls -1
b.bash
CTCC_noSDS_00_hg19.bwt2pairs.validPairs
CTCC_noSDS_01_hg19.bwt2pairs.validPairs
CTCC_noSDS_02_hg19.bwt2pairs.validPairs
CTCC_noSDS_12_hg19.bwt2pairs.validPairs
mydir
And let us say we want to copy these files to mydir. If we run the above script, we see this output:
$ ./b.bash
[*] Trying to find files with number 1
[!] Moving .//CTCC_noSDS_01_hg19.bwt2pairs.validPairs
[*] Trying to find files with number 2
[!] Moving .//CTCC_noSDS_02_hg19.bwt2pairs.validPairs
[*] Trying to find files with number 3
[*] Trying to find files with number 4
[*] Trying to find files with number 5
Then looking in mydir we see the files:
$ ls -1 mydir/
CTCC_noSDS_01_hg19.bwt2pairs.validPairs
CTCC_noSDS_02_hg19.bwt2pairs.validPairs
Note that the above for loop only goes to 5; you can modify that as you see fit. Also note the if statement in the for loop and what it is for.
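As an aside, the zero-padding if/else can be collapsed with printf -v, which formats the number directly into a variable (a minimal sketch of the same loop, assuming bash):
for i in {1..5}; do
    printf -v num '%02d' "$i"    # zero-pad to two digits
    f="./CTCC_noSDS_${num}_hg19.bwt2pairs.validPairs"
    [ -f "$f" ] && cp "$f" ./mydir/
done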
Using find : )
You can instead use the find command like so:
find . -type f -iname 'CTCC_noSDS_*_hg19.bwt2pairs.validPairs' -exec cp {} mydir/ \;
Here is a stricter find command using a regex (with the dots escaped so they match literally):
find . -regextype sed -regex '.*CTCC_noSDS_[0-9]\+_hg19\.bwt2pairs\.validPairs' -exec cp {} ./mydir/ \;
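If you have GNU cp, you can also batch the copies into fewer cp invocations with -t (an aside; -t names the target directory so {} can come last, which -exec ... + requires):
find . -type f -iname 'CTCC_noSDS_*_hg19.bwt2pairs.validPairs' -exec cp -t mydir/ {} +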
The shell did not ignore the variable; you are misusing variable expansion. Valid characters for variable names include underscores (and one of those sits right next to $i). So what the shell actually sees is a variable called i_hg19, which is undefined. Thus, the resulting filename is nonexistent.
The solution is to wrap $i between curly braces like this:
cp ./CTCC_noSDS_0${i}_hg19.bwt2pairs.validPairs
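With the braces (and quotes) added, the original loop works as intended:
for i in {0..2}; do cp "./CTCC_noSDS_0${i}_hg19.bwt2pairs.validPairs" 3_dataset_pool/; done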
I am trying to get a directory to replace an existing folder, but I can't get it done with mv. I believe there's a way and I just don't know it (yet), even after consulting the man page and searching the web.
If /path/to/ contains only directory, the following command moves /path/to/directory (which vanishes) to /path/to/folder:
mv /path/to/directory /path/to/folder
It is basically a rename, which is what I am trying to achieve. But if /path/to/folder already exists, the same command moves /path/to/directory to /path/to/folder/directory instead. I do not want to use the cp command, to avoid the I/O.
Instead of using cp to actually copy the data in each file, use ln to make "copies" of the pointers to the file.
ln /path/to/directory/* /path/to/folder && rm -rf /path/to/directory
Note this is slightly more atomic than using cp; each individual file appears in /path/to/folder in a single step (i.e., there is no chance that /path/to/folder/foo.txt is ever partially copied), but there is still a small window where some, but not all, files from /path/to/directory have been linked into folder. Also, the rm -rf is not atomic, but assuming no one is interested in directory, that's not an issue. (Although, as files from /path/to/directory are unlinked, you can see the link counts of files under /path/to/folder change from 2 to 1. It's unlikely that anyone will care about that.)
What you think of as a file is really just a file system entry to an otherwise anonymous file managed by the file system. Consider a simple example.
$ mkdir d
$ cd d
$ echo hello > file.txt
$ cp file.txt file_copy.txt
$ ln file.txt file_link.txt
$ ls -li
total 24
12890456377 -rw-r--r-- 2 chepner staff 6 Mar 3 12:46 file.txt
12890456378 -rw-r--r-- 1 chepner staff 6 Mar 3 12:47 file_copy.txt
12890456377 -rw-r--r-- 2 chepner staff 6 Mar 3 12:46 file_link.txt
The -i option adds each entry's inode number (the first column) to the output; an inode can be thought of as a unique identifier for a file. In this output, you can see that file_copy.txt is an entirely new file, with a different inode than file.txt. file_link.txt has the exact same inode, meaning that file.txt and file_link.txt are simply two different names for the same thing. The number just before the owner is the link count; file.txt and file_link.txt both refer to a file with a link count of 2.
When you use rm, you are just removing a link to a file, not the file itself. A file is not removed until the link count is reduced to 0. To demonstrate, we'll remove file.txt and file_copy.txt.
$ rm file.txt file_copy.txt
$ ls -li
total 8
12890456377 -rw-r--r-- 1 chepner staff 6 Mar 3 12:46 file_link.txt
As you can see, the only link to file_copy.txt is gone, so inode 12890456378 no longer appears in the output. (Whether or not the data is really gone is a matter of file-system implementation.) file_link.txt, though, still refers to the same file as before, but now with a link count of 1, because file.txt was removed.
Links to a file do not have to appear in the same directory; they can appear in any directory on the same file system, which is the only caveat to using this trick. ln will, IIRC, give you an error if you try to create a hard link to a file on another file system.
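A quick way to check whether two paths live on the same file system, assuming GNU stat (%d prints each path's device number; matching numbers mean hard links will work):
$ stat -c %d /path/to/directory /path/to/folder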
This is a simplified example to hopefully illustrate my problem.
I have a script that takes a parameter to be used as a wildcard. Sometimes this wildcard contains whitespace. I need to be able to use the wildcard for globbing, but word splitting is causing it to fail.
For example, consider the following example files:
$ ls -l "/home/me/dir with whitespace"
total 0
-rw-r--r-- 1 me Domain Users 0 Jun 25 16:58 file_a.txt
-rw-r--r-- 1 me Domain Users 0 Jun 25 16:58 file_b.txt
My script - simplified to use a hard coded pattern variable - looks like this:
#!/bin/bash
# Here this is hard coded, but normally it would be passed via parameter
# For example: pattern="${1}"
# The whitespace and wildcard can appear anywhere in the pattern
pattern="/home/me/dir with whitespace/file_*.txt"
# First attempt: without quoting
ls -l ${pattern}
# Result: word splitting AND globbing
# ls: cannot access /home/me/dir: No such file or directory
# ls: cannot access with: No such file or directory
# ls: cannot access whitespace/file_*.txt: No such file or directory
####################
# Second attempt: with quoting
ls -l "${pattern}"
# Result: no word splitting, no globbing
# ls: cannot access /home/me/dir with whitespace/file_*.txt: No such file or directory
Is there a way to enable globbing, but disable word splitting?
Do I have any options except manually escaping whitespace in my pattern?
Keep the glob outside the quotes so it can expand:
pattern="/home/me/dir with whitespace/file_"
ls -l "${pattern}"*
EDIT: Based on the edited question and comments, you can use find:
find . -path "./$pattern" -print0 | xargs -0 ls -l
I finally got it!
The trick is setting the internal field separator (IFS) to the empty string. This prevents word splitting on unquoted variables until IFS is restored to its old value or unset.
Example:
$ pattern="/home/me/dir with whitespace/file_*.txt"
$ ls -l $pattern
ls: cannot access /home/me/dir: No such file or directory
ls: cannot access with: No such file or directory
ls: cannot access whitespace/file_*.txt: No such file or directory
$ IFS=""
$ ls -l $pattern
-rw-r--r-- 1 me Domain Users 0 Jun 26 09:14 /home/me/dir with whitespace/file_a.txt
-rw-r--r-- 1 me Domain Users 0 Jun 26 09:14 /home/me/dir with whitespace/file_b.txt
$ unset IFS
$ ls -l $pattern
ls: cannot access /home/me/dir: No such file or directory
ls: cannot access with: No such file or directory
ls: cannot access whitespace/file_*.txt: No such file or directory
I found out the hard way that you cannot set IFS just for the one command. For example, this doesn't work:
$ IFS="" ls -l $pattern
This is because $pattern has already undergone word splitting before the temporary IFS assignment takes effect.
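In a script, a cleaner variant is to capture the glob into an array while IFS is empty and then restore it, so the rest of the script keeps normal word splitting (a minimal sketch, assuming bash):
pattern="/home/me/dir with whitespace/file_*.txt"
old_ifs="${IFS-$' \t\n'}"   # fall back to the default if IFS was unset
IFS=""
files=( $pattern )          # glob expands; empty IFS prevents word splitting
IFS="$old_ifs"
ls -l "${files[@]}"         # consider shopt -s nullglob for the no-match case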
I have a directory that contains sub-directories and other files and would like to update the date/timestamps recursively with the date/timestamp of another file/directory.
I'm aware that:
touch -r file directory
changes the date/timestamp of the file or directory to that of another, but touches nothing within it. There's also the find version:
find . -exec touch -mt 201309300223.25 {} +
which would work fine if I could specify the actual file/directory and use another's date/timestamp. Is there a simple way to do this? Even better, is there a way to avoid changing/updating timestamps when doing a cp?
even better, is there a way to avoid changing/updating timestamps when doing a 'cp'?
Yes, use cp with the -p option:
-p
    same as --preserve=mode,ownership,timestamps
--preserve
    preserve the specified attributes (default: mode,ownership,timestamps); if possible, additional attributes: context, links, xattr, all
Example
$ ls -ltr
-rwxrwxr-x 1 me me 368 Apr 24 10:50 old_file
$ cp old_file not_maintains <----- does not preserve time
$ cp -p old_file do_maintains <----- does preserve time
$ ls -ltr
total 28
-rwxrwxr-x 1 me me 368 Apr 24 10:50 old_file
-rwxrwxr-x 1 me me 368 Apr 24 10:50 do_maintains <----- does preserve time
-rwxrwxr-x 1 me me 368 Sep 30 11:33 not_maintains <----- does not preserve time
To recursively touch the files in a directory based on the corresponding file under another path, you can try something like the following:
find /your/path/ -exec touch -r $(echo {} | sed "s#/your/path#/your/original/path#g") {} \;
It is not working for me, but I guess it is a matter of try/test a little bit more.
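The likely culprit is that the command substitution runs once, before find does, so touch -r never sees each file's counterpart. Deferring the path rewrite to a small sh -c script should work (a sketch, assuming the two trees mirror each other; the prefix is stripped with parameter expansion):
find /your/path/ -exec sh -c 'touch -r "/your/original/path/${1#/your/path/}" "$1"' _ {} \;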
In addition to 'cp -p', you can (re)create an old timestamp using 'touch -t'. See the man page of 'touch' for more details.
touch -t 200510071138 old_file.dat
I wonder how to list the content of a tar file only down to some level?
I understand tar tvf mytar.tar will list all files, but sometimes I would like to only see directories down to some level.
Similarly, for the command ls, how do I control the level of subdirectories that will be displayed? By default, it will only show the direct subdirectories, but not go further.
For depth 1:
tar --exclude="*/*" -tf file.tar
For depth 2:
tar --exclude="*/*/*" -tf file.tar
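The pattern generalizes: each extra /* in the --exclude pushes the cutoff one level deeper. A tiny helper along those lines (a sketch; tar_depth is a made-up name, assuming bash and GNU tar):
tar_depth() {
    local depth=$1 file=$2 pat='*' i
    for ((i = 0; i < depth; i++)); do
        pat="$pat/*"              # one more /* per level to keep
    done
    tar --exclude="$pat" -tf "$file"
}
tar_depth 2 file.tar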
tar tvf scripts.tar | awk -F/ '{if (NF<4) print }'
drwx------ glens/glens 0 2010-03-17 10:44 scripts/
-rwxr--r-- glens/www-data 1051 2009-07-27 10:42 scripts/my2cnf.pl
-rwxr--r-- glens/www-data 359 2009-08-14 00:01 scripts/pastebin.sh
-rwxr--r-- glens/www-data 566 2009-07-27 10:42 scripts/critic.pl
-rwxr-xr-x glens/glens 981 2009-12-16 09:39 scripts/wiki_sys.pl
-rwxr-xr-x glens/glens 3072 2009-07-28 10:25 scripts/blacklist_update.pl
-rwxr--r-- glens/www-data 18418 2009-07-27 10:42 scripts/sysinfo.pl
Make sure to note that the number is 3 plus however many levels you want, because of the / in the username/group field. If you just do
tar tf scripts.tar | awk -F/ '{if (NF<3) print }'
scripts/
scripts/my2cnf.pl
scripts/pastebin.sh
scripts/critic.pl
scripts/wiki_sys.pl
scripts/blacklist_update.pl
scripts/sysinfo.pl
it's only two more than the levels, since the plain listing has no username/group field.
You could probably pipe the output of ls -R to this awk script, and have the same effect.
Another option is archivemount. You mount the archive and cd into it; then you can do anything with it just as with any other filesystem.
$ archivemount /path/to/files.tgz /path/to/mnt/folder
It seems faster than the tar method.
It would be nice if we could tell the find command to look inside a tar file, but I doubt that is possible.
A quick and ugly (and not foolproof) way would be to limit the number of directory separators, for example:
$ tar tvf myfile.tar | grep -E '^[^/]*(/[^/]*){1,2}$'
The 2 says to allow not more than 2 slashes (in my case, one is already generated by the user/group separator), and hence to display files at depth at most one. You might want to try different numbers in place of the 2.
I agree with leonbloy's answer - there's no way to do this straightforwardly within the tarball itself.
Regarding the second part of your question, ls does not have a max depth option. You can recurse everything with ls -R, but that's often not very useful.
However, you can do this with both find and tree. For example, to list files and directories down to two levels, you can do
find -maxdepth 2
or
tree -L 2
tree also has a -d option, which recursively lists directories, but not files, which I find much more useful than -L, in general.
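The two options combine as you would expect, e.g. directories only, two levels deep:
tree -d -L 2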
I was able to show only the directory names at a particular depth using grep:
for depth 3:
tar -tf mytar.tar | grep -Ex '([^/]+/){3}'
or for depth $DEPTH (note the double quotes, so the variable expands, and the / inside the group):
tar -tf mytar.tar | grep -Ex "([^/]+/){$DEPTH}"
You can speed that up by combining grep with --exclude from sacapeao's accepted answer.
for depth 3:
tar --exclude '*/*/*/*/*' -tf mytar.tar | grep -Ex '([^/]+/){3}'