Rename a certain portion of file paths in the current directory recursively - bash

Let's assume I have the following directory tree:
.
|-- foo
`-- foodir
    |-- bardir
    |   |-- bar
    |   `-- foo
    |-- foo -> bardir/foo
    `-- foodir
        |-- bar
        `-- foo

3 directories, 6 files
How can I rename all foo to buz, including symlinks, like this:
.
|-- buz
`-- buzdir
    |-- bardir
    |   |-- bar
    |   `-- buz
    |-- buz -> bardir/buz
    `-- buzdir
        |-- bar
        `-- buz

3 directories, 6 files
I thought it would be relatively easy at first glance, but it turned out to be unexpectedly tough.
First, I tried to mv all the files using git ls-files:
$ for file in $(git ls-files '*foo*'); do mv "$file" "${file//foo/buz}"; done
This gave me a bunch of errors saying that I had to create the new directories first:
mv: cannot move 'foodir/bardir/bar' to 'buzdir/bardir/bar': No such file or directory
mv: cannot move 'foodir/bardir/foo' to 'buzdir/bardir/buz': No such file or directory
mv: cannot move 'foodir/foo' to 'buzdir/buz': No such file or directory
mv: cannot move 'foodir/foodir/bar' to 'buzdir/buzdir/bar': No such file or directory
mv: cannot move 'foodir/foodir/foo' to 'buzdir/buzdir/buz': No such file or directory
I didn't want to deal with cleaning up empty directories after the move, so I tried find -exec, expecting it to handle the renaming while finding files based on their names:
$ find . -path .git -prune -o -name '*foo*' -exec bash -c 'mv "$0" "${0//foo/buz}"' "{}" \;
But find still tried to visit files under their already-renamed paths:
find: ./foodir: No such file or directory
My final solution was to look up only the first matching file or directory for every single mv command.
#!/bin/bash
# Rename file paths recursively
while :; do
    # Find just the first remaining match, then rename it
    path=$(find . -path ./.git -prune -o -name '*foo*' -print -quit)
    if [ -z "$path" ]; then
        break
    fi
    if ! mv "$path" "${path/foo/buz}"; then
        break
    fi
done
# Change symlink targets as well
find . -path ./.git -prune -o -type l -exec bash -c '
    target=$(readlink "$0")
    if [ "$target" != "${target//foo/buz}" ]; then
        # Rewrite the link in place; -n keeps a dir symlink from being followed
        ln -sfn "${target//foo/buz}" "$0"
    fi
' "{}" \;
This is kind of lame, but it works as I expected. So my questions are:
Can I assume find always outputs directories before their subdirectories/files?
Is there any chance of avoiding multiple find invocations?
Thank you.
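To answer the first question: yes, by default find reports a directory before its contents (pre-order traversal), which is exactly why renaming a directory invalidated the paths find had yet to visit. As for avoiding multiple passes, here is a minimal single-pass sketch of the rename step, assuming GNU find: -depth makes find list a directory's contents before the directory itself, and -execdir runs the command inside each match's parent directory, so only the last path component is rewritten. Note that -prune is ineffective once -depth is in effect, hence the ! -path filter:

find . -depth ! -path './.git*' -name '*foo*' \
    -execdir bash -c 'mv -- "$1" "${1//foo/buz}"' _ {} \;

The symlink targets would still need the separate fix-up pass, since mv renames the link file itself but leaves its target string untouched.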

Related

Create symbolic links with cp preserving parent structure

I have the following folder structure:
.
`-- top_level/
    |-- sub-01_ses-01/
    |   `-- filtered_data.tar.gz*
    |-- sub-01_ses-02/
    |   `-- filtered_data.tar.gz*
    |-- sub-02_ses-01/
    |   `-- filtered_data.tar.gz*
    |-- sub-02_ses-02/
    |   `-- filtered_data.tar.gz*
I wanted to create symbolic links to these files preserving the parent structure (since they all have the same filenames).
Here's what I tried:
find -name "filtered_data.tar.gz" \
-exec cp -s --parents --no-clobber -t /home/data/filtered {} \;
Now, I notice that cp does create the parent structure, but the symbolic links fail and I get the following notice:
cp: '/home/data/filtered/./sub-01_ses-01/filtered_data.tar.gz': can make relative symbolic links only in current directory
I'd like to understand why this is happening, and what the cp warning is trying to tell me. Also, any pointers on how to fix the issue would be greatly appreciated.
Found the solution here: symlink-copying a directory hierarchy
The path of the file given to cp must be absolute, not ./something: with -s, cp creates a symlink whose target is literally the source path you passed, and a relative target would resolve relative to the link's own directory rather than your current directory, so cp refuses. This should work for you:
find $(pwd) -name "filtered_data.tar.gz" \
-exec cp -s --parents --no-clobber -t /home/data/filtered {} \;
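To see why the relative form fails, here is a minimal reproduction (hypothetical paths): cp -s stores the source argument verbatim as the link target, so a relative source is only valid when the link is created in the current directory.

mkdir -p demo/sub dest && touch demo/sub/a.txt
cp -s demo/sub/a.txt dest/            # fails: can make relative symbolic links only in current directory
cp -s "$(pwd)/demo/sub/a.txt" dest/   # works: an absolute target is valid anywhere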
Per your comment about what you're really trying to do, here's a Python script that does it. You should be able to tweak it.
#!/usr/bin/env python3
import os

target_filename = 'filtered_data.tar.gz'
top_src_dir = '.'
top_dest_dir = 'dest'

# Walk the source directory recursively looking for
# target_filename
for parent, dirs, files in os.walk(top_src_dir):
    # debugging
    # print(parent, dirs, files)

    # Skip this directory if target_filename not found
    if target_filename not in files:
        continue

    # Strip off all path parts except the immediate parent
    local_parent = os.path.split(parent)[-1]

    # Compute the full, relative path to the symlink
    dest_file = os.path.join(top_dest_dir, local_parent, target_filename)

    # debugging
    # print('{} {}'.format(dest_file, os.path.exists(dest_file)))

    # Nothing to do if it already exists
    if os.path.exists(dest_file):
        print('{} already exists'.format(dest_file))
        continue

    # Make sure the destination path exists
    dest_dir = os.path.dirname(dest_file)
    os.makedirs(dest_dir, exist_ok=True)

    # Translate the relative path to target_filename
    # to be relative based on the new destination dir
    src_file = os.path.join(parent, target_filename)
    src_file = os.path.relpath(src_file, start=dest_dir)
    os.symlink(src_file, dest_file)
    print('{} --> {}'.format(dest_file, src_file))

grep for two patterns independently (in different lines)

I have some directories with the following structure:
DAY1/            # Files under this directory should have DAY1 in the name.
|-- Date
|   |-- dir1     # Something wrong here, there are files with DAY2 and files with DAY1.
|   |-- dir2
|   |-- dir3
|   |-- dir4
DAY2/            # Files under this directory should all have DAY2 in the name.
|-- Date
|   |-- dir1
|   |-- dir2     # Something wrong here, there are files with DAY2 and files with DAY1.
|   |-- dir3
|   |-- dir4
In each dir there are hundreds of thousands of files with names containing DAY, for example 0.0000.DAY1.01927492. Files with DAY1 in the name should only appear under the parent directory DAY1.
Something went wrong when copying files around, so that I now have mixed files with DAY1 and DAY2 in some of the dir directories.
I wrote a script to find folders that contain mixed files, so I can then look at them more closely. My script is the following:
for directory in */; do
    if ls "$directory" | grep -q DAY2; then
        if ls "$directory" | grep -q DAY1; then
            echo "mixed files in $directory"
        fi
    fi
done
The problem here is that I'm going through all the files twice, which doesn't make sense given that I'd only have to look through them once.
What would be a more efficient way to achieve what I want?
If I understand you correctly, you need to recursively find the files under the DAY1 directory that have DAY2 in their names, and similarly, under the DAY2 directory, the files that have DAY1 in their names.
If so, for DAY1 directory:
find DAY1/ -type f -name '*DAY2*'
this will get you the files under DAY1 directory that have DAY2 in their names. Similarly for DAY2 directory:
find DAY2/ -type f -name '*DAY1*'
Both are recursive operations.
To get the directory names only:
find DAY1/ -type f -name '*DAY2*' -exec dirname {} +
Note that $PWD will be shown as a dot (.).
To get uniqueness, pass the output to sort -u:
find DAY1/ -type f -name '*DAY2*' -exec dirname {} + | sort -u
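If you want both directions in one traversal, a single find over both trees should also work (a sketch along the same lines; note that * in -path patterns also matches slashes):

find DAY1 DAY2 -type f \( -path 'DAY1/*DAY2*' -o -path 'DAY2/*DAY1*' \) \
    -exec dirname {} + | sort -u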
Given that the difference between going through the files once and going through them twice is just a factor of two, changing to an approach that goes through them only once might actually not be a win, since the new approach might easily take twice as long per file.
So you'll definitely want to experiment; it's not necessarily something that you can confidently reason about.
However, I will say that in addition to going through the files twice, the ls version also sorts the files, which probably has a more-than-linear cost (unless it's doing some kind of bucket-sort). Eliminating that, by writing ls --sort=none instead of just ls, will actually improve your algorithmic complexity, and is almost certain to give a tangible improvement.
But FWIW, here's a version that only goes through the files once, that you can try:
for directory in */; do
    find "$directory" -maxdepth 1 \( -name '*DAY1*' -or -name '*DAY2*' \) -print0 \
      | { saw_day1=
          saw_day2=
          while IFS= read -r -d '' subdirectory; do
              if [[ "$subdirectory" == *DAY1* ]]; then
                  saw_day1=1
              fi
              if [[ "$subdirectory" == *DAY2* ]]; then
                  saw_day2=1
              fi
              if [[ "$saw_day1" ]] && [[ "$saw_day2" ]]; then
                  echo "mixed files in $directory"
                  break
              fi
          done
        }
done

Creating empty files in new location

I have a directory full of files. The tree looks something like this:
|-- test1a
|   |-- test1b
|   |   |-- foo.txt
|   |   |-- bar.txt
|-- test2a
|   |-- test2b
Where the directory names match the regular expression test[1-9][ab].
Using find in bash, I'm trying to create blank files in test2b with the same filenames and extensions as those in test1b.
So far, I've tried the following:
find test1a/test1b -type f -exec touch test2a/test2b {} \;
This, however, does not work. I don't have much experience with bash, so I'm not sure where to go from here. Where am I going wrong?
I was able to solve this problem using the following:
$ cd test2a/test2b
$ find ../../test1a/test1b -type f -exec sh -c 'touch $(basename {})' \;
I believe the problem resulted from {} giving the full path rather than just the filename. touch was then being pointed at a file that already existed, so it left it alone and created nothing.
Here is a second approach:
find test1a/test1b -type f -execdir echo touch test2a/test2b/{} \; > adhoc.sh
sh adhoc.sh
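A single-command variant of the same idea (a sketch, assuming the test1a/test2a layout above) strips the directory part in the shell instead of calling basename:

find test1a/test1b -type f -exec sh -c '
    for f; do touch "test2a/test2b/${f##*/}"; done
' sh {} +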

Script to find all .h files and put them in a specified folder with the same structure?

I need a bash script like
headers ~/headers-folder ~/output-folder
so that it recursively finds all .h files in ~/headers-folder and puts them all in ~/output-folder with the folder hierarchy maintained?
Thanks!
find /path/to/find -name "*.h" -type f | xargs -I {} cp --parents {} /path/to/destination
Check this out.
rsync is great for that too:
rsync --include '*.h' --filter 'hide,! */' -avm headers-folder/ output-folder/
This will copy all the *.h files, and create only the necessary directories.
Example:
mkdir -p headers-folder/{subdir,empty}
touch headers-folder/foo.h
touch headers-folder/subdir/foo.h
tree headers-folder
# headers-folder/
# |-- empty
# |-- foo.h
# `-- subdir
#     `-- foo.h
rsync --include '*.h' --filter 'hide,! */' -avm headers-folder/ output-folder/
tree output-folder
# output-folder/
# |-- foo.h
# `-- subdir
#     `-- foo.h

Copy nested folders contents to one folder recursively (terminal)

I have a WordPress uploads folder that is structured using subfolders for months.
wolfr2:uploads wolfr$ tree .
.
|-- 2007
|   |-- 08
|   |   |-- beautifulkatamari.jpg
|   |   |-- beautifulkatamari.thumbnail.jpg
|   |   |-- beetle.jpg
|   |   |-- beetle.thumbnail.jpg
How do I use the terminal to copy all the images recursively into another folder? I can't seem to wildcard folders the way you can wildcard filenames (e.g. *.jpg or *). (I'm on Mac OS X.)
cp -R ./*.jpg .
?
This will copy all *.jpg files from the current folder to a new folder and preserve the directory structure.
tar cpf - `find . -name "*.jpg"` | (cd <newfolder> && tar xpf -)
To copy without preserving the directory structure:
cp `find . -name "*.jpg"` <newfolder>
Off the top of my head:
find . -type f -name \*.jpg -exec cp \{\} $TARGETFOLDER \;
If that doesn't work, comment and I'll try again, but find is definitely the way to go.
None of the above commands worked for me as such on macOS 10.15. Here's the one that worked:
find . -name "*.jpg" -exec cp {} [target folder path] \;
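A glob-only alternative for flattening (a sketch; it needs bash 4+ with globstar enabled, which the stock /bin/bash 3.2 on macOS lacks):

shopt -s globstar
cp -- **/*.jpg /path/to/newfolder/

As with the cp answers above, files with the same name will silently overwrite one another in the destination.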
