Linux copy directory structure and create symlinks to existing files with different extension - bash

I have a large directory structure similar to the following:
/home/user/abc/src1
    /file_a.xxx
    /file_b.xxx
/home/user/abc/src2
    /file_a.xxx
    /file_b.xxx
It contains multiple srcX folders and many files; most of the files have a .xxx extension, and these are the ones I am interested in.
I would like to create an identical directory structure in, say, /tmp. This part I have been able to accomplish via rsync:
rsync -av -f"+ */" -f"- *" /home/user/abc/ /tmp/xyz/
The next step is what I can't figure out. I need the directory structure in /tmp/xyz to have symlinks to all the files in /home/user/abc with a different file extension (.zzz). The directory structure would look as follows:
/tmp/xyz/src1
    /file_a.zzz -> /home/user/abc/src1/file_a.xxx
    /file_b.zzz -> /home/user/abc/src1/file_b.xxx
/tmp/xyz/src2
    /file_a.zzz -> /home/user/abc/src2/file_a.xxx
    /file_b.zzz -> /home/user/abc/src2/file_b.xxx
I understand that I could just copy the data and do a batch rename. That is not an acceptable solution.
How do I recursively create symlinks for all the .xxx files in /home/user/abc and link them into /tmp/xyz with a .zzz extension?
The find + exec approach seems like what I want, but I can't put two and two together on this one.
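For reference, here is a minimal sketch of the find + -exec route hinted at above. It hard-codes the paths from the question and assumes the /tmp/xyz directory tree already exists (from the rsync step); a small shell computes the mirrored link path for each file:
find /home/user/abc -type f -name '*.xxx' -exec sh -c '
    for src in "$@"; do
        rel=${src#/home/user/abc/}               # e.g. src1/file_a.xxx
        ln -s "$src" "/tmp/xyz/${rel%.xxx}.zzz"  # e.g. /tmp/xyz/src1/file_a.zzz
    done
' sh {} +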

This could work, run once per src directory. Note that the basename call has to happen inside a shell started by xargs; otherwise $(basename '{}' .xxx) is expanded before xargs ever sees a file name:
cd /tmp/xyz/src1
find /home/user/abc/src1/ -type f -name '*.xxx' -print0 |
    xargs -r0 -I '{}' sh -c 'ln -s "$1" "$(basename "$1" .xxx).zzz"' _ '{}'

Navigate to /tmp/xyz/ then run the following script:
#!/usr/bin/env bash
# First make the src* folders in the present directory:
mkdir -p $(find /home/user/abc -type d -name "src*" | rev | cut -d"/" -f1 | rev)
# Then make the symbolic links (NUL-delimited, so names with spaces are safe):
while IFS= read -r -d '' file; do
    ln -s "${file}" "$(echo "${file}" | rev | cut -d/ -f-2 | rev | sed 's/\.xxx$/.zzz/')"
done < <(find /home/user/abc/src* -type f -name '*.xxx' -print0)

Thanks for the input, all. Based upon the ideas I saw, I was able to come up with a script that fits my needs.
#!/bin/bash

GLOBAL_SRC_DIR="/home/user/abc"
GLOBAL_DEST_DIR="/tmp/xyz"

create_symlinks ()
{
    local SRC_DIR="${1}"
    local DEST_DIR="${2}"

    # Read in our files, using the null terminator as the delimiter
    while IFS= read -r -d $'\0' file; do
        # If the file ends with .xxx or .yyy
        if [[ ${file} =~ \.(xxx|yyy)$ ]] ; then
            basePath="$(dirname "${file}")"
            fileName="$(basename "${file}")"
            completeSourcePath="${basePath}/${fileName}"
            #echo "${completeSourcePath}"

            # Strip off the leading source prefix
            partialDestPath=$(echo "${basePath}" | sed -r "s|^${SRC_DIR}||")
            fullDestPath="${DEST_DIR}/${partialDestPath}"

            # Rename .xxx to .zzz; .yyy files keep their name and are just linked
            cppFileName=$(echo "${fileName}" | sed -r "s|\.xxx$|.zzz|")
            completeDestinationPath="${fullDestPath}/${cppFileName}"

            ln -s "${completeSourcePath}" "${completeDestinationPath}"
        fi
    done < <(find "${SRC_DIR}" -type f -print0)
}

main ()
{
    create_symlinks "${GLOBAL_SRC_DIR}" "${GLOBAL_DEST_DIR}"
}

main
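To sanity-check the result, something like the following (paths as in the script above) lists every symlink that was created together with its target:
find /tmp/xyz -type l -exec ls -l {} +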

Related

Bash script to list all directories in a path with escape characters and cd into each to perform an action

I am trying to write a script to list all directories in a path, go to each directory, and rename the one .srt file in it to the name of the .mp4 file (less extension) in the same subdirectory.
Each subdirectory has one .mp4 and one .srt file.
In return I get this: $ ./change.sh ./change.sh: line 2: Lawless: command not found, where LawLess (1111) is the name of the first directory in the list of directories.
The code is:
#!/bin/bash
yourPathEscaped=$(`find $1 -mindepth 1 -maxdepth 1 -type d | cut -d . -f2 | cut -d \/ -f2 | sed 's/ /\\ /g' | sed 's/,/\\, /g' | sed -e s/\'/\\\\\'/g`)
for dirm in $yourPathEscaped
do
    cd $dirm
    echo $dirm
    for f in `ls *.mp4 | grep -Po '.*(?=\.)'`; do echo $f; mv *.srt $f.srt; ls ; done
    for f in `ls *.mkv | grep -Po '.*(?=\.)'`; do echo $f; mv *.srt $f.srt; ls ; done
    cd ..
done
On the assumption that there is only one .srt file per directory, this might do what you wanted, using a while + read loop and process substitution:
#!/usr/bin/env bash

while IFS= read -rd '' files; do
    directory=${files%/*}
    (
        cd "$directory" || exit
        shopt -s nullglob
        for f in *.mp4 *.mkv; do
            printf '%s\n' "$f"
            file_name=${f%.*}
            echo mv -v -- *.srt "${file_name}.srt"
        done
    )
done < <(
    find "$1" -mindepth 1 -maxdepth 1 -type f \( -name '*.mp4' -o -name '*.mkv' \) -print0
)
It can probably be done without the cd, but that is what I came up with, based on your script.
Remove the echo if you're satisfied with the output, so mv can actually rename/move the files.
How can I read a file or stream line-by-line or field-by-field?
How can I find and safely handle file names containing newlines, spaces or both?
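Both FAQ entries boil down to the NUL-delimited find/read pattern; a minimal sketch, with a placeholder path:
# find emits NUL-terminated names; read -d '' consumes them,
# so spaces and even newlines in file names are handled safely.
while IFS= read -r -d '' f; do
    printf 'found: %s\n' "$f"
done < <(find /some/path -type f -print0)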

Shell Script: How to copy files with specific string from big corpus

I have a small bug and don't know how to solve it. I want to copy files that contain a specific string from a big folder with many files. For this I use grep, ack or (in this example) ag. When I'm inside the folder it matches without problem, but when I try to do it with a loop over the files in the following script, it doesn't loop over the matches. Here is my script:
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while read -d $'\0' file; do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done
SEARCH_QUERY holds the string I want to find inside the files, INPUT_DIR is the folder where the files are located, and OUTPUT_DIR is the folder the found files should be copied to. Is there something wrong with the while loop?
EDIT:
Thanks for the suggestions! I took this one, because it also looks for files in subfolders and saves a list of all the files.
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" > "output_list.txt"
while read file
do
echo "${file##*/}"
cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < "output_list.txt"
Better to implement it like below with a find command:
find "${INPUT_DIR}" -name "*.*" | xargs grep -l "${SEARCH_QUERY}" > /tmp/file_list.txt
while read file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
or another option:
grep -l "${SEARCH_QUERY}" "${INPUT_DIR}/*.*" > /tmp/file_list.txt
while read file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
If you do not mind doing it in just one line, then:
grep -lr 'ONE\|TWO\|THREE' | xargs -I xxx -P 0 cp xxx dist/
Guide:
-l just print the file name and nothing else
-r search recursively through the CWD and all sub-directories
'ONE\|TWO\|THREE' match any of these words: 'ONE' or 'TWO' or 'THREE'
| pipe the output of grep to xargs
-I xxx each file name is made available as xxx; it is just a placeholder
-P 0 run all the commands (= cp) in parallel (= as fast as possible)
cp copy each file xxx to the dist directory
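If the matched file names may contain spaces, a NUL-delimited variant of the same pipeline should be safer; this assumes GNU grep and xargs, which provide -Z and -0:
# -Z ends each printed file name with a NUL byte; xargs -0 splits on NUL
grep -lrZ 'ONE\|TWO\|THREE' . | xargs -0 -I xxx -P 0 cp -- xxx dist/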
If I understand the behavior of ag correctly, then you have to either
adjust the read delimiter to '\n', or
use ag -0 -l to force delimiting by '\0',
to solve the problem in your loop.
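A sketch of that second option, keeping the basename-only destination from the edit above (since ${file} still contains the input path):
# ag -0 emits NUL-separated file names; read -d '' consumes them
ag -0 -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while IFS= read -r -d '' file; do
    echo "${file##*/}"
    cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done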
Alternatively, you can use the following script, which is based on find instead of ag.
while IFS= read -r -d '' file; do
    echo "$file"
    cp "$file" "$OUTPUT_DIR/$file"
done < <(find "$INPUT_DIR" -name "*$SEARCH_QUERY*" -print0)

find files and delete by filename parameter

I have a folder with lots of images. In this folder are subfolders containing high-resolution images. Images can be .png, .jpg or .gif.
Some images are duplicates, called a.jpg and a.hi.jpg, or a.b.c.gif and a.b.c.hi.gif. The base names are always different; there will never be an a.gif, a.jpg and a.png at the same time, so I guess I do not have to take care of the extension.
These are the same images at different resolutions.
Now I want to write a script to delete all lower-resolution images. But there are files that do not have a high-resolution counterpart, like b.png. So I want to delete a file only if there is a high-resolution image too.
I guess I have to do something like this, but can't figure out how exactly:
find . -type f -name "*" if {FILENAME%hi*} =2 --delete smallest else keep file
Could anyone help? Thanks
Something like the following could do the job:
#!/bin/bash

while IFS= read -r -d '' hi
do
    d=$(dirname "$hi")
    b=$(basename "$hi")
    low="${b//.hi./}"
    [[ -e "$d/$low" ]] && echo rm -- "$d/$low"   # dry run - if satisfied, remove the echo
done < <(find /some/path -type f -name \*.hi.\* -print0)
How it works:
find all files with .hi. in their names (not only images; you can extend the find to be more restrictive)
for each found image:
get the directory it is in
and get the name of the file (without the directory)
in the name, remove all occurrences of the string .hi. (i.e. build the "lowres" name)
check for the existence of the lowres image
delete it if it exists.
You can use bash's extended glob features for this, which you first enable with
shopt -s extglob
and using the pattern
!(pattern-list)
Matches anything except one of the given patterns.
Now to store the files not containing the string hi
shopt -s extglob
fileList=()
fileList+=( !(*hi*).jpg )
fileList+=( !(*hi*).gif )
fileList+=( !(*hi*).png )
You can print the array once to see if it lists all the files you need:
printf "%s\n" "${fileList[@]}"
and to delete those files, do
for eachfile in "${fileList[@]}"; do
    rm -v -- "$eachfile"
done
Or, as Benjamin.W suggested in the comments below, do
rm -v -- "${fileList[@]}"
Now I want to write a script to delete all lower-resolution images
This script could be used for that:
find /path/to/dir -type f \( -iname '*.hi.png' -or -iname '*.hi.gif' -or -iname '*.hi.jpg' \) | while read -r F; do LOWRES="$(echo "$F" | rev | cut -c7- | rev)$(echo "$F" | rev | cut -c 1-3 | rev)"; if [ -f "$LOWRES" ]; then echo rm -fv -- "$LOWRES"; fi; done
You can run it first to see which files would be removed. If you're OK with the results, then remove the echo before the rm command.
Here is the non-one-liner version, as a script:
#!/bin/sh
find /path/to/dir -type f \( -iname '*.hi.png' -or -iname '*.hi.gif' -or -iname '*.hi.jpg' \) |
while read -r F; do
    NAME="$(echo "$F" | rev | cut -c7- | rev)"
    EXTENSION="$(echo "$F" | rev | cut -c 1-3 | rev)"
    LOWRES="$NAME$EXTENSION"
    if [ -f "$LOWRES" ]; then
        echo rm -fv -- "$LOWRES"
    fi
done

How can I recursively replace file and directory names using Terminal?

Using the Terminal on macOS, I want to recursively replace a word that appears in both directory names and file names. For instance, I have an Angular app and the module name is article; all of the file names and directory names contain the word article. I've already done a find and replace to change articles to apples in the code. Now I want to do the same with the file structure, so both the file names and the directories share the same convention.
Just for information, I've already tried to use the newest Yeoman generator to create new files, but there seems to be an issue with it. The alternative is to duplicate a directory and rename all of the files, which is quite time consuming.
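For reference, a minimal sketch of just the rename step, assuming the article-to-apple example from the question; -depth makes find visit children before their parent directories, so renaming a directory does not invalidate paths that are still queued:
find . -depth -name '*article*' -print0 | while IFS= read -r -d '' path; do
    base=${path##*/}        # name within its parent directory
    dir=${path%/*}
    mv -- "$path" "$dir/${base//article/apple}"
done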
Got it to work with the following script:
var=$1
if [ -n "$var" ]; then
    CRUDNAME=$1
    CRUDNAMEUPPERCASE=`echo ${CRUDNAME:0:1} | tr '[a-z]' '[A-Z]'`${CRUDNAME:1}
    FOLDERNAME=$CRUDNAME's'

    # Create new folder
    cp -R modules/articles modules/$FOLDERNAME

    # Do the find/replace in all the files
    find modules/$FOLDERNAME -type f -print0 | xargs -0 sed -i -e 's/Article/'$CRUDNAMEUPPERCASE'/g'
    find modules/$FOLDERNAME -type f -print0 | xargs -0 sed -i -e 's/article/'$CRUDNAME'/g'

    # Delete useless files due to sed
    rm modules/$FOLDERNAME/**/*-e
    rm modules/$FOLDERNAME/**/**/*-e
    rm modules/$FOLDERNAME/**/**/**/*-e

    # Rename all the files
    for file in modules/$FOLDERNAME/**/*article* ; do mv $file ${file//article/$CRUDNAME} ; done
    for file in modules/$FOLDERNAME/**/**/*article* ; do mv $file ${file//article/$CRUDNAME} ; done
    for file in modules/$FOLDERNAME/**/**/**/*article* ; do mv $file ${file//article/$CRUDNAME} ; done
else
    echo "Usage: sh rename-module.sh [crud-name]"
fi
Apparently I'm not the only one to encounter this issue:
https://github.com/meanjs/generator-meanjs/issues/79

How do I remove a specific extension from files recursively using a bash script

I'm trying to find a bash script that will recursively look for files with a .bx extension and remove this extension. The filenames are in no particular format (some are hidden files with a "." prefix, some have spaces in the name, etc.), and not all files have this extension.
I'm not sure how to find each file with the .bx extension (in and below my cwd) and strip the extension off. Thanks for the help!
find . -name '*.bx' -type f | while IFS= read -r NAME ; do mv "${NAME}" "${NAME%.bx}" ; done
find -name "*.bx" -print0 | xargs -0 rename 's/\.bx$//'
Bash 4+
shopt -s globstar
shopt -s nullglob
shopt -s dotglob
for file in **/*.bx
do
    mv "$file" "${file%.bx}"
done
Assuming you are in the folder where you want to do this:
find . -name "*.bx" -print0 | xargs -0 rename .bx ""
for blah in *.bx ; do mv "${blah}" "${blah%%.bx}" ; done
Here is another version, which does the following:
Finds files based on the $old_ext variable (right now set to .bx) in and below the cwd
Replaces those files' extension with nothing (or with something new, depending on the $new_ext variable, currently set to .xyz)
The script uses dirname and basename to find the file path and file name respectively.
#!/bin/bash
old_ext=".bx"
new_ext=".xyz"

# NUL-delimited find/read loop so names with spaces are handled safely
while IFS= read -r -d '' file
do
    file_name=$(basename "$file" "$old_ext")
    file_path=$(dirname "$file")
    new_file=${file_path}/${file_name}${new_ext}
    #echo "$file --> $new_file"
    mv "$file" "$new_file"
done < <(find ./ -name "*${old_ext}" -print0)
Extra: How to remove any extension from filenames
find -maxdepth 1 -type f | sed 's/.\///g' | grep -E '[.]' | while read -r file; do mv "$file" "${file%.*}"; done
will cut starting from last dot, i.e. pet.cat.dog ---> pet.cat
find -maxdepth 1 -type f | sed 's/.\///g' | grep -E '[.]' | while read -r file; do mv "$file" "${file%%.*}"; done
will cut starting from first dot, i.e. pet.cat.dog ---> pet
"-maxdepth 1" limits operation to current directory, "-type f" is used to select files only. Sed & grep combination is used to pick only filenames with dot. Number of percent signs in "mv" command will define actual cut point.
