I have around 7,000 .txt files that have been spat out by a program where the naming convention clearly broke. The only saving grace is that they follow the following structure: id, date, time.
m031060209104704.txt --> id:m031 date:060209 time:104704.txt
Sample of other filenames (again same thing):
115-060202105710.txt --> id:115- date:060202 time: 105710.txt
x138051203125338.txt etc...
9756060201194530.txt etc..
I want to rename all 7,000 files in this directory to look like the following:
m031060209104704.txt --> 090206_104704_m031.txt
i.e. date_time_id (each separated by underscores or hyphens, I don't mind). I need the date format to be switched from yymmdd to ddmmyy as shown directly above though!
I'm not clear on what's overkill here: a full program script or a bash command (macOS). Again, I don't mind; any and all help is appreciated.
Try something like:
#!/bin/bash
# directory to store renamed files
newdir="./renamed"
mkdir -p "$newdir"
for file in *.txt; do
if [[ $file =~ ^(....)([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{6})\.txt$ ]]; then
# extract parameters
id=${BASH_REMATCH[1]}
yy=${BASH_REMATCH[2]}
mm=${BASH_REMATCH[3]}
dd=${BASH_REMATCH[4]}
time=${BASH_REMATCH[5]}
# then rearrange them to new name
newname=${dd}${mm}${yy}_${time}_${id}.txt
# move to new directory
mv "$file" "$newdir/$newname"
fi
done
Bash substring expansion (string indexes) makes it easy and efficient to rework the filenames as you intend. You should also validate that you are only operating on input filenames of 20 characters. That can be accomplished as follows:
#!/bin/bash
for i in *.txt; do
## validate a 20 character filename
(( ${#i} == 20 )) || { printf "invalid length '%s'\n" "$i"; continue; }
echo "mv $i ${i:8:2}${i:6:2}${i:4:2}_${i:10:6}_${i:0:4}.txt" ## output rename
mv "$i" "${i:8:2}${i:6:2}${i:4:2}_${i:10:6}_${i:0:4}.txt" ## actual rename
done
Example Directory
$ ls -l
total 0
-rw-r--r-- 1 david david 0 Dec 21 19:16 115-060202105710.txt
-rw-r--r-- 1 david david 0 Dec 21 19:16 9756060201194530.txt
-rw-r--r-- 1 david david 0 Dec 21 19:15 m031060209104704.txt
-rw-r--r-- 1 david david 0 Dec 21 19:16 x138051203125338.txt
Example Use/Output
$ cd thedir
$ bash ../script.sh
mv 115-060202105710.txt 020206_105710_115-.txt
mv 9756060201194530.txt 010206_194530_9756.txt
mv m031060209104704.txt 090206_104704_m031.txt
mv x138051203125338.txt 031205_125338_x138.txt
$ ls -l
total 0
-rw-r--r-- 1 david david 0 Dec 21 19:42 010206_194530_9756.txt
-rw-r--r-- 1 david david 0 Dec 21 19:42 020206_105710_115-.txt
-rw-r--r-- 1 david david 0 Dec 21 19:42 031205_125338_x138.txt
-rw-r--r-- 1 david david 0 Dec 21 19:42 090206_104704_m031.txt
Look things over and let me know if you have any further questions.
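The substring offsets used above can also be wrapped in a small function, so the mapping can be checked on sample names before any file is touched. This is only a sketch: the function name new_name is made up for illustration, and it assumes the 20-character layout described in the question (4-character id, yymmdd, hhmmss, .txt).

```bash
#!/bin/bash
# new_name is a hypothetical helper, not part of the original answer.
# It prints the rearranged name for one 20-character input name,
# or returns 1 if the name does not have the expected length.
new_name() {
    local f=$1
    (( ${#f} == 20 )) || return 1
    # id = chars 0-3, yy = 4-5, mm = 6-7, dd = 8-9, time = 10-15
    printf '%s%s%s_%s_%s.txt\n' "${f:8:2}" "${f:6:2}" "${f:4:2}" "${f:10:6}" "${f:0:4}"
}

# Dry run over sample names from the question: prints old -> new only.
for f in m031060209104704.txt 115-060202105710.txt x138051203125338.txt; do
    printf '%s -> %s\n' "$f" "$(new_name "$f")"
done
```

Nothing is renamed here; once the printed pairs look right, the same function could feed the mv call in the loop.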
Related
I got a bunch of mp3 files with random names and numbers like:
01_fileabc.mp3
01.filecdc.mp3
fileabc.mp3
929-audio.mp3
For sorting purposes, I need to add a sequential number in front of the file name like:
001_01_fileabc.mp3
002_01.filecdc.mp3
003_fileabc.mp3
004_929-audio.mp3
I checked some of the solutions I found here. One of the first solutions worked kind of but replaced the filename instead of adding to it.
num=0; for i in *; do mv "$i" "$(printf '%04d' $num).${i#*.}"; ((num++)); done
How can I modify this command to add to the filename instead?
I am sorry, but whatever I try I can't find a solution myself here.
Just replace ${i#*.} (which stands for "remove from $i, starting at the left, everything up to and including the first dot") with $i, which is the original name of the file (I'd probably use $filename, $oldfile, or at least $f instead of $i as the variable's name).
You can also replace the . before it with _, otherwise the files will be named
0001.01_fileabc.mp3
etc.
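Put together, the adjusted loop might look like the sketch below. The function name add_seq_prefix and its directory argument are assumptions made for illustration; the counter starts at 1 and is padded to three digits to match the names asked for in the question.

```bash
#!/bin/bash
# add_seq_prefix is a hypothetical wrapper around the one-liner from the question.
# It prefixes every .mp3 file in the given directory with a 3-digit counter.
add_seq_prefix() {
    local dir=${1:-.} num=1 f base
    for f in "$dir"/*.mp3; do
        [ -e "$f" ] || continue          # glob matched nothing
        base=${f##*/}                    # file name without the directory
        mv -- "$f" "$dir/$(printf '%03d' "$num")_$base"
        num=$((num + 1))
    done
}
```

Called as add_seq_prefix /path/to/mp3s, it numbers the files in the shell's glob order. Running it a second time would add a second prefix, so it is meant to be run once.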
UPDATE: As RobC commented on this answer, whitespace or newline characters in filenames can cause problems, because the original code (below) stores the output of the ls command in a bash array. The code can be improved in this way:
#!/bin/bash
i=0
for file in *.mp3; do
i=$((i+1))
mv "$file" "$(printf "%03d_%s" "$i" "$file")"
done
ORIGINAL ANSWER: You can try this code in a bash script. Remember to make it executable with
$ chmod +x script.sh
#!/bin/bash
contents_dir=($(ls *.mp3))
for file in ${!contents_dir[*]}; do
new=$(awk -v i="$file" -v cd="${contents_dir[$file]}" 'BEGIN {printf("%03d_%s", i+1, cd)}')
mv ${contents_dir[$file]} $new
done
It will add a consecutive, zero-padded three-digit number, as you wanted, to all mp3 files found in the directory where the script is executed.
You could try this …
$ ls -l
total 4
-rw-r--r-- 1 plankton None 5 Nov 4 13:35 01.filecdc.mp3
-rw-r--r-- 1 plankton None 5 Nov 4 13:35 01_fileabc.mp3
-rw-r--r-- 1 plankton None 5 Nov 4 13:35 929-audio.mp3
-rw-r--r-- 1 plankton None 5 Nov 4 13:35 fileabc.mp3
$ t=0
$ for i in *mp3
> do
> # Use the seq command to get a formatted zero filled string.
> prefix=$(seq -f "%04g" $t $t)
>
> # Move $i to new file name.
> mv "$i" "${prefix}_${i}"
>
> # Increment our counter, t.
> t=$(expr $t + 1)
> done
$ ls -l
total 4
-rw-r--r-- 1 plankton None 5 Nov 4 13:35 0000_01.filecdc.mp3
-rw-r--r-- 1 plankton None 5 Nov 4 13:35 0001_01_fileabc.mp3
-rw-r--r-- 1 plankton None 5 Nov 4 13:35 0002_929-audio.mp3
-rw-r--r-- 1 plankton None 5 Nov 4 13:35 0003_fileabc.mp3
I have over 100000 files.
for example, I mentioned 3 files below
bcbb79d8-1d4a-4fbb-b16c-4df86839773e.htseq.counts.gz
bcdc68db-c874-4097-9c46-b06e331caaf5.htseq.counts.gz
bd4b6975-90d9-43f8-aadc-344d04644822.htseq.counts.gz
I have a text file named key.txt with the following information.
File Name ID
bcbb79d8-1d4a-4fbb-b16c-4df86839773e.htseq.counts.gz TCCC-06-0210
bcdc68db-c874-4097-9c46-b06e331caaf5.htseq.counts.gz TCHA-27-2519
bd4b6975-90d9-43f8-aadc-344d04644822.htseq.counts.gz TCHU-76-4929
I want to take only those files whose names are in the key, move them to a new folder, and rename them to the ID.
I guess a little more of a write up rather than a comment would be helpful. The approach to take is to read the filename (fname) and ID (id) from each line in key.txt and then validate that fname is a file and does exist, and then move the file in "$fname" to whatever "/path/to/move/to/$id" you need.
For example:
#!/bin/bash
## read each line into variables fname and id (handle non-POSIX eof)
while read -r fname id || [ -n "$fname" ]; do
## test that "$fname" is a file, and if so, move to destination
[ -f "$fname" ] && mv "$fname" "/path/to/move/to/$id"
done < key.txt
(note: a POSIX-compliant text file ends with a final '\n' after the last line. Some editors do not enforce it, and a missing final newline will cause read to skip the last line of data unless you check that "$fname" was filled with data (is non-empty); that is the [ -n "$fname" ] added to the end of the while read -r ...)
You are feeding the loop with a redirection of key.txt. Each iteration of the while loop will read a new line from key.txt into the variables fname and id (word-splitting on the default Internal Field Separator (IFS)). After the read and separation into fname and id, you simply verify $fname holds a valid filename (in the current working directory) and then mv the file where you want it.
You should execute the script in the directory containing the files, or prepend the relative or absolute path of the directory where they are located to "$fname".
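The effect of the || [ -n "$fname" ] guard can be seen with input that lacks the final newline. The sketch below is illustrative only; count_pairs is a made-up name, and it just counts how many fname/id pairs each form of the loop sees.

```bash
#!/bin/bash
# count_pairs (hypothetical): counts the lines read from stdin.
# With the argument "guard" it also accepts a last line without a trailing newline.
count_pairs() {
    local n=0 fname id
    if [ "$1" = guard ]; then
        while read -r fname id || [ -n "$fname" ]; do n=$((n + 1)); done
    else
        while read -r fname id; do n=$((n + 1)); done
    fi
    echo "$n"
}

# Two records, the second one without a terminating newline.
printf 'a.gz ID-1\nb.gz ID-2' | count_pairs guard    # counts both records
printf 'a.gz ID-1\nb.gz ID-2' | count_pairs          # misses the last record
```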
Example
Here is a short example that may help clear things up:
The move_rename.sh script:
$ cat move_rename.sh
#!/bin/bash
## read each line into variables fname and id (handle non-POSIX eof)
while read -r fname id || [ -n "$fname" ]; do
## test that "$fname" is a file, and if so, move to destination
[ -f "$fname" ] && mv "$fname" "dest/$id.txt"
done < key.txt
The key.txt file:
$ cat key.txt
File Name ID
bcbb79d8-1d4a-4fbb-b16c-4df86839773e.htseq.counts.gz TCCC-06-0210
bcdc68db-c874-4097-9c46-b06e331caaf5.htseq.counts.gz TCHA-27-2519
bd4b6975-90d9-43f8-aadc-344d04644822.htseq.counts.gz TCHU-76-4929
File locations before script execution. dest is the directory to move to. (The first listing uses ls -1 with the digit one, not ls -l with a lowercase L; the second listing, ls -al, does use a lowercase L.)
$ ls -1
dest
bcbb79d8-1d4a-4fbb-b16c-4df86839773e.htseq.counts.gz
bcdc68db-c874-4097-9c46-b06e331caaf5.htseq.counts.gz
bd4b6975-90d9-43f8-aadc-344d04644822.htseq.counts.gz
key.txt
move_rename.sh
$ ls -al dest
total 16
drwxr-xr-x 2 david david 4096 Jan 17 20:05 .
drwxr-xr-x 16 david david 12288 Jan 17 20:05 ..
Execute the script
$ bash move_rename.sh
Working directory contents after execution
$ ls -1
dest
key.txt
move_rename.sh
Contents of dest after execution.
$ ls -al dest
total 8
drwxr-xr-x 2 david david 4096 Jan 17 20:00 .
drwxr-xr-x 3 david david 4096 Jan 17 20:00 ..
-rw-r--r-- 1 david david 0 Jan 17 19:59 TCCC-06-0210.txt
-rw-r--r-- 1 david david 0 Jan 17 19:59 TCHA-27-2519.txt
-rw-r--r-- 1 david david 0 Jan 17 19:59 TCHU-76-4929.txt
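For a larger key file, a variant along the following lines may be useful. It is a sketch rather than the script shown above: the function name move_by_key and its three arguments are assumptions for illustration. It creates the destination if needed, refuses to overwrite an existing target, and skips the header row simply because no file of that name exists.

```bash
#!/bin/bash
# move_by_key (hypothetical name): move_by_key KEYFILE SRCDIR DESTDIR
# Moves each file listed in KEYFILE from SRCDIR to DESTDIR, renamed to its ID.
move_by_key() {
    local key=$1 src=$2 dest=$3 fname id
    mkdir -p "$dest" || return 1
    while read -r fname id || [ -n "$fname" ]; do
        # Skip the header row, blank lines, and names with no matching file.
        [ -n "$id" ] && [ -f "$src/$fname" ] || continue
        # -n: do not overwrite a target that already exists
        mv -n -- "$src/$fname" "$dest/$id"
    done < "$key"
}
```

A call could look like move_by_key key.txt . dest, which mirrors the example above except that the new names carry no .txt suffix.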
I have folders with multiple files with names like 2024_CULT_IMAGE_2012_03.shp and CULT_IMAGE_2017_03.shp. How can I test if the string begins with digits? I know how to rename them once I have tested them. I have used a regex to test if a file name contains digits, but I am unsure how to test the beginning of the string.
if [[ $file =~ [0-9] ]];
then
do something
The output I expect is CULT_IMAGE.shp.
I would use rename utility
rename 's/^\d+_//' *.shp
To remove every digit with sed you can just do sed 's/[0-9]//g'
So you should be able to adapt it quickly to remove only the first digits
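Anchoring the pattern with ^ restricts the match to the start of the name. As a sketch (strip_leading_digits is a hypothetical helper name), the same anchored pattern works both in a bash [[ =~ ]] test and in sed, as long as the regex in the bash test is left unquoted:

```bash
#!/bin/bash
# strip_leading_digits (hypothetical): prints the name without a leading
# run of digits and the underscore after it; other names pass through unchanged.
strip_leading_digits() {
    local name=$1
    if [[ $name =~ ^[0-9]+_(.*)$ ]]; then    # regex must stay unquoted
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        printf '%s\n' "$name"
    fi
}

strip_leading_digits 2024_CULT_IMAGE_2012_03.shp   # leading 2024_ removed
strip_leading_digits CULT_IMAGE_2017_03.shp        # printed unchanged

# The sed equivalent, anchored the same way:
echo 2024_CULT_IMAGE_2012_03.shp | sed 's/^[0-9][0-9]*_//'
```

This only strips the leading number; removing the trailing _2012_03 as well would need a further substitution.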
Before
ls -lrt
total 4
-rw-rw-r-- 1 user super 173 May 20 09:58 main
-rw-rw-r-- 1 user super 0 May 20 10:13 CULT_IMAGE_2017_1111132.shp
-rw-rw-r-- 1 user super 0 May 20 10:13 2024_CULT_IMAGE_2012_03.shp
# Execution: rename files whose names start with digits.
./main
After Execution
ls -lrt
total 4
-rw-rw-r-- 1 user super 173 May 20 09:58 main
-rw-rw-r-- 1 user super 0 May 20 10:13 CULT_IMAGE_2017_1111132.shp
-rw-rw-r-- 1 user super 0 May 20 10:13 CULT_IMAGE.shp
File
cat main
for file in *.shp
do
# the regex must be unquoted; a quoted pattern is matched as a literal string
if [[ "$file" =~ ^[0-9] ]];then
newName=$(echo "$file" | sed -e 's/[0-9]//g' -e 's/^_//' -e 's/__\.shp/.shp/')
mv "$file" "$newName"
fi
done
I'm trying to figure out a way to use a bash script that will:
- Search a directory
- Identify a file by creation date
- Rename that file
I'm 'hoping' to use a: 'if, then' script to accomplish this. Anybody have any idea how to do this???
Example:
if 'search in /current/path' for 'file created 12-25-2000'
then mv /current/path/FILENAME /other/path/FILENAME-1
Here's what I have right now but it doesn't work:
#!/bin/bash
touch --date "2000-12-24" /tmp/start
touch --date "2000-12-26" /tmp/end
if ls -l /current/path -type f -newer /tmp/start -not -newer /tmp/end
then mv /current/path/FILENAME-1 /other/path/FILENAME
#Reapplying original FILENAME
echo "Back to boring..."
fi
There are a number of ways to do this. You were on the right track using touch and temp files to set the boundaries for the search. You are probably better off using find as opposed to ls. The following script takes the target directory and the start time as arguments to search for files in the target dir that are between that start time and end time (default = start time + 1 day). If any files fall within the time(s) given, the files are added to an array. You can then use the filenames in the array to move as you desire.
Since the script makes use of find, you can tailor the selection by find to meet almost any need. The move criteria will have to be provided in the script as needed. Here is a quick example:
#!/bin/bash
## validate required input and provide usage information
[ -n "$1" -a -n "$2" ] || {
printf "\nerror: insufficient input. Usage: %s path start_date [end_date (end=start+1d)]\n\n" "${0//\//}"
printf " NOTE: date STRING -- any allowable formatt accepted by coreutils 'date -d=STRING'\n\n"
printf " examples -- 11/01/2014 or \"11/01/2014 10:25:30\" or 20141101\n\n"
exit 1
}
## store input path and start time
path="$1"
starttm=$2
## set start and end date (end defaults to starttm + 1 day)
startd="$(date -d "$starttm" +%s)"
endd="$(date -d "${3:-$(date -d "$starttm + 1 day")}" +%s)"
## set temp dir
[ -d /tmp ] && tmpdir="/tmp" || tmpdir="$PWD"
## tempfiles start & end (to create with compare dates)
tfs="${tmpdir}/ttm_${startd}"
tfe="${tmpdir}/ttm_${endd}"
## create temp file with start and end dates
touch -t $(date -d "@${startd}" +%Y%m%d%H%M.%S) $tfs
touch -t $(date -d "@${endd}" +%Y%m%d%H%M.%S) $tfe
## fill array using find to select files between tempfile dates
file_array=( $(find "$path" -maxdepth 1 -type f -newer $tfs ! -newer $tfe) )
## cleanup - remove temp files
rm $tfs $tfe || printf "warning: failed to remove tempfiles '%s' or '%s'\n" "$tfs" "$tfe"
## rename files at will
if [ "${#file_array[@]}" -gt 0 ]; then
for i in "${file_array[@]}"; do
fdir="${i%/*}"
ffn="${i##*/}"
printf " moving %-32s -> %s\n" "$i" "${fdir}/newname_${ffn}"
done
fi
exit 0
usage (run without arguments):
$ ./find_range_snip.sh
error: insufficient input. Usage: .find_range_snip.sh path start_date [end_date (end=start+1d)]
NOTE: date STRING -- any allowable formatt accepted by coreutils 'date -d=STRING'
examples -- 11/01/2014 or "11/01/2014 10:25:30" or 20141101
test directory:
ls -l ~/tmp
total 387224
drwxr-xr-x 3 david david 4096 Nov 3 15:51 asm
drwxr-xr-x 16 david dcr 4096 Jul 13 2010 fluxbox
drwxr-xr-x 3 david david 4096 Oct 13 13:42 log
-rw-r--r-- 1 david david 159557 Jul 16 04:15 acpidumpfile.bin
-rw-r--r-- 1 david david 1429 Jul 13 10:57 blderror.txt
-rw-r--r-- 1 david david 7663 Aug 21 05:39 fc-list-fonts-sorted-no-path.txt
-rw-r--r-- 1 david users 60 Jul 13 02:20 homelnk
-rw-r--r-- 1 david david 870 Sep 6 03:32 junk.c
-rw-r--r-- 1 david david 32323 Aug 1 21:53 knemo-no-essid.jpg
-rw-r--r-- 1 david david 14082 Sep 19 18:29 rlfwiki.tbl.desc
-rw-r--r-- 1 david david 2211 Jul 29 02:23 scrnbrightness.sh
-rw-r--r-- 1 david david 7456152 Sep 19 13:22 tcpd.tar.xz
-rw-r--r-- 1 david david 3371941 Sep 19 22:08 tcpdump-capt
-rw-r--r-- 1 david david 589676 Sep 19 14:49 tcpdump.new.1000
-rw-r--r-- 1 david david 0 Oct 26 02:38 test
-rw-r--r-- 1 david david 595 Jul 23 21:25 tmpkernel315.txt
-rwxr-xr-x 1 david david 12694 Oct 13 17:44 tstvi
-rw-r--r-- 1 david david 620 Oct 13 17:47 tstvi.c
-rw-r--r-- 1 david david 3599 Jul 16 04:29 xrandr-output.txt
output:
$ ./find_range_snip.sh ~/tmp 09/19/2014
moving /home/david/tmp/tcpd.tar.xz -> /home/david/tmp/newname_tcpd.tar.xz
moving /home/david/tmp/rlfwiki.tbl.desc -> /home/david/tmp/newname_rlfwiki.tbl.desc
moving /home/david/tmp/tcpdump-capt -> /home/david/tmp/newname_tcpdump-capt
moving /home/david/tmp/tcpdump.new.1000 -> /home/david/tmp/newname_tcpdump.new.1000
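One caveat with filling the array from an unquoted $(find ...) is that names containing spaces are split into several elements. A sketch of an alternative follows, assuming a find that supports -newermt and -print0 (GNU and BSD find both do). files_between is a made-up name, and the dates are passed straight to find, so no temp files are needed.

```bash
#!/bin/bash
# files_between (hypothetical): files_between DIR START END
# Fills the global array file_array with regular files in DIR whose
# modification time is later than START and not later than END.
files_between() {
    local f
    file_array=()
    while IFS= read -r -d '' f; do
        file_array+=("$f")
    done < <(find "$1" -maxdepth 1 -type f -newermt "$2" ! -newermt "$3" -print0)
}

# Example call (path and dates are placeholders):
# files_between ~/tmp "2014-09-19" "2014-09-20"
# printf '%s\n' "${file_array[@]}"
```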
To find all files in /tmp and its subdirectories with a modify date of 2000-12-25 and to move them to /other/path/, try:
find /tmp -daystart -type f -newermt "2000-12-25" ! -newermt "2000-12-26" -exec sh -c 'mv "$1" "/other/path/$(basename "$1")-1"' sh {} \;
Explanation
/tmp
Start looking in the /tmp directory and its subdirectories. This can be replaced with any other path, or list of paths, that you like.
-daystart
Optional. It only changes how relative tests such as -mtime and -atime measure time (from the beginning of today rather than from 24 hours ago). It has no effect on -newermt, so it can be omitted here.
-newermt "2000-12-25" ! -newermt "2000-12-26"
Select only files with a modify time later than 2000-12-25 00:00 but not later than 2000-12-26 00:00. This selects only files from 2000-12-25.
-type f
Optional. Select only regular files. Don't try to move directories.
-exec sh -c 'mv "$1" "/other/path/$(basename "$1")-1"' sh {} \;
When a file is found, execute this move command on it. find substitutes the name of the file, including the path elements needed to find it, for {}, and the small sh script receives it as $1. $(basename "$1") gets the name of the file without the path. The command substitution has to sit inside the single-quoted script: written directly as "/other/path/$(basename {})-1", it would be expanded by the calling shell once, before find runs, and would simply yield the literal {}.
It may seem a little odd at first, but find requires that -exec commands such as this end with a semicolon. To keep the shell from eating the semicolon, it has to be escaped. Hence the \; at the end of the command.
Debugging
If things are not working as expected, try leaving off the -exec clause:
find /tmp -daystart -type f -newermt "2000-12-25" ! -newermt "2000-12-26"
Instead of moving files, this will just print their names. If the correct names print, then the problem is with the -exec clause. If the wanted file names are missing, try leaving off more clauses. For example, this should list all files in /tmp and its subdirectories:
find /tmp
This should list everything in /tmp that was last modified no later than the start of 2000-12-25:
find /tmp -daystart ! -newermt "2000-12-25"
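Another way to check the selection before moving anything is a dry run that prints the mv commands instead of executing them. The sketch below assumes a find with -newermt and -print0; move_by_mdate, its arguments, and the DRY_RUN variable are names invented for illustration.

```bash
#!/bin/bash
# move_by_mdate (hypothetical): move_by_mdate SRC DEST START END
# Moves regular files under SRC modified after START and not after END
# into DEST, appending "-1" to each name. With DRY_RUN=1 it only prints.
move_by_mdate() {
    local src=$1 dest=$2 start=$3 end=$4 f
    find "$src" -type f -newermt "$start" ! -newermt "$end" -print0 |
    while IFS= read -r -d '' f; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'mv %s %s\n' "$f" "$dest/${f##*/}-1"
        else
            mv -- "$f" "$dest/${f##*/}-1"
        fi
    done
}
```

For example, DRY_RUN=1 move_by_mdate /current/path /other/path 2000-12-25 2000-12-26 lists what would be moved; dropping DRY_RUN=1 performs the moves.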
Running "ls -lrt" on my terminal I get a large list that looks something like this:
-rw-r--r-- 1 pratik staff 1849089 Jun 23 12:24 cam13-vid.webm
-rw-r--r-- 1 pratik staff 1850653 Jun 23 12:24 cam12-vid.webm
-rw-r--r-- 1 pratik staff 1839110 Jun 23 12:24 cam11-vid.webm
-rw-r--r-- 1 pratik staff 1848520 Jun 23 12:24 cam10-vid.webm
-rw-r--r-- 1 pratik staff 1839122 Jun 23 12:24 cam1-vid.webm
I have only shown part of it above as a sample.
I would like to rename all the files to have a number one less than current.
For example,
mv cam1-vid.webm cam0-vid.webm
mv cam2-vid.webm cam1-vid.webm
.....
....
mv cam200-vid.webm cam199-vid.webm
How can this be done using a os x / linux bash script (perhaps using sed) ?
You can do this with plain bash:
for i in {1..200}
do
mv "cam${i}-vid.webm" "cam$((i-1))-vid.webm"
done
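The brace-expansion loop assumes every number from 1 to 200 exists. If the numbering has gaps, a sketch like the following could work instead: it reads the numbers from the file names, sorts them in ascending order so that each target name has already been vacated, and refuses to overwrite. shift_cams_down is a hypothetical name, and the camN-vid.webm pattern is taken from the question.

```bash
#!/bin/bash
# shift_cams_down (hypothetical): shift_cams_down [DIR]
# Renames camN-vid.webm to cam(N-1)-vid.webm for every N > 0 found in DIR.
shift_cams_down() {
    local dir=${1:-.} f n
    for f in "$dir"/cam*-vid.webm; do
        [ -e "$f" ] || continue
        n=${f##*/cam}                 # strip directory and 'cam' prefix
        n=${n%-vid.webm}              # strip '-vid.webm' suffix
        case $n in ''|*[!0-9]*) continue ;; esac   # keep plain numbers only
        printf '%s\n' "$n"
    done | sort -n | while read -r n; do
        [ "$n" -gt 0 ] || continue
        # -n: never overwrite an existing target
        mv -n -- "$dir/cam${n}-vid.webm" "$dir/cam$((10#$n - 1))-vid.webm"
    done
}
```

Because sort only starts printing after the first loop has finished, the list of numbers is complete before any file is renamed.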
I would use find, split up the file names, to find the number, subtract one, and rename:
find . -name "cam*-vid.webm" -print0 | while IFS= read -r -d '' old_name
do
number=${old_name##*/cam} #Filter left to remove the path and 'cam' prefix
number=${number%-vid.webm} #Filter right to remove '-vid.webm' suffix
number=$((number - 1))
new_name="${old_name%/*}/cam${number}-vid.webm"
echo "mv \"$old_name\" \"$new_name\""
done | tee results
This will merely print out the commands (that is why I have echo). I'm piping it into a file named results. Once this command completes, look at results and make sure it does everything it should. Whenever there's an operation like this, there can be a nasty surprise. For example, if I rename cam02-vid.webm to cam01-vid.webm before I rename cam01-vid.webm, I am going to overwrite cam01-vid.webm.
Maybe a safer way is to explicitly give the file numbers I need:
for number in {0..199}
do
old_number=$((number + 1))
echo mv "\"cam${old_number}-vid.webm\" \"cam${number}-vid.webm\""
done | tee results
Useful hint: If the result file looks good, you can actually just run it as a shell script:
$ bash results
Another possibility is to test to make sure the old file exists:
for number in {0..199}
do
old_number=$((number + 1))
if [ -f "cam${old_number}-vid.webm" ]
then
echo mv "\"cam${old_number}-vid.webm\" \"cam${number}-vid.webm\""
else
echo "ERROR: Can't find a file called 'cam${old_number}-vid.webm'"
fi
done | tee results
A perl solution.
First it traverses all input files (@ARGV) and filters those that are plain files and not links (grep), extracts the number (map), and sorts numerically in ascending order to avoid overwriting (sort). It then builds a new name by decrementing the number and renames the original:
perl -e '
for (
sort { $a->[0] <=> $b->[0] }
map { m/(\d+)/; [$1, $_ ] }
grep { -f $_ && ! -l $_ }
@ARGV
) {
$n = --$_-> [0];
($newname = $_->[1]) =~ s/\A(?i)(cam)\d+(.*)\z/$1$n$2/;
print "Executing command ===> rename $_->[1], $newname\n";
rename $_->[1], $newname;
}' *
Assuming initial content of the directory as:
cam1-vid.webm
cam13-vid.webm
cam12-vid.webm
cam11-vid.webm
cam10-vid.webm
cam2-vid.webm
After running the command yields:
cam0-vid.webm
cam10-vid.webm
cam11-vid.webm
cam12-vid.webm
cam1-vid.webm
cam9-vid.webm