Copy file into directories, and change text inside file to match index of directory - bash

I have the following files in a directory: text.txt and text2.txt
My goal is to:
1) copy these two files into non-existing directories m06/, m07/...m20/.
2) Then, in the file text.txt, in the line containing the string mlist 06 (all the files will contain such a string), I wish to change the "06" to match the index of the directory name (for example, in m13, that line in the text.txt file would be mlist 13).
For goal 1), I got the following script, which works successfully:
#!/bin/bash
mkdir $(printf "m%02i " $(seq 6 20))
find . -maxdepth 1 -type d -name 'm[0-9][0-9]' -print0 | xargs -0 -I {} cp text.txt {}
find . -maxdepth 1 -type d -name 'm[0-9][0-9]' -print0 | xargs -0 -I {} cp text2.txt {}
For goal 2), I wish to implement a command similar to
sed -i -e '/mlist/s/06/index/' ./*/text.txt
where index would correspond to the name of the directory (i.e. index = 13 in the m13/ directory).
How can I make the sed command replace 06 with the correct "index" corresponding to the name of the directory?

This would probably be easier to manage if you used loop syntax instead of one-liners:
#!/bin/sh
for i in $(seq 6 20); do
    # Add a leading 0 and generate the directory name
    i_z=$(printf "%02d" "$i")
    dir="m${i_z}"
    # Create dir
    mkdir -p "$dir"
    # Copy base files into dir
    cp text.txt text2.txt "$dir"
    # Edit the index in the files to match the dir index
    sed -i -e "s/mlist.*/mlist $i_z/g" \
        "${dir}/text.txt" "${dir}/text2.txt"
done
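If you would rather change only the two-digit index and leave the rest of the mlist line intact (closer to the sed in the question), you could swap the sed line inside the loop for a more targeted substitution; a minimal sketch, assuming the index is the first two-digit run on that line:
sed -i -e "/mlist/s/[0-9][0-9]/${i_z}/" \
    "${dir}/text.txt" "${dir}/text2.txt"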

Related

rename recursively adding parentheses if folder name ends with 4 digits in bash

I have been trying to recursively rename folders whose names end in four digits.
For example, I have a folder name like this:
this is the name 2004
and I'm trying to rename it to:
this is the name (2004)
I've tried to split the prefix and digit parts of the name; however, I cannot get mv to rename these folders.
Here is the code I've tried so far:
#!/bin/bash
F=$(find . -name '*[0-9]' -type d)
for i in "$F";
do
    R2=$(echo "$i" | awk '{print $NF}')
    R1=$(echo "$i" | sed 's/.\{4\}$//')
    R3=$(echo "$R2" | sed -r "s/(^[0-9]+$)/(\1)/g")
    mv "$i" "$R1 $R3"
    # Even tried:
    mv "\"$i"\" "\"$R2 $R3"\"
done
Can anyone review this and/or suggest some guidance to allow mv to find the initial folder and its destination?
The following command:
find -name '*[0-9][0-9][0-9][0-9]' -type d -exec bash -c 'for dir; do mv "$dir" "${dir%[0-9][0-9][0-9][0-9]}(${dir#${dir%[0-9][0-9][0-9][0-9]}})"; done' - {} + -prune
should work.
double quotes around the variable expansions protect names containing spaces
${dir%[0-9][0-9][0-9][0-9]} removes the trailing 4-digit suffix
${dir#${dir%[0-9][0-9][0-9][0-9]}} removes everything before that suffix (i.e. keeps only the digits)
in -exec bash -c '..' - {} +, the lone - fills the first argument after the -c command string, which is taken as $0 (see man bash, under -c)
-prune at the end prevents find from descending into a matched directory's subtree (suppose 2004/2004: both would be collected, the parent renamed to (2004) first, and then mv 2004/2004 "2004/(2004)" would fail because the old path no longer exists)
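To see what the two parameter expansions do, here is a quick sketch with a hypothetical directory name:
dir='this is the name 2004'
echo "${dir%[0-9][0-9][0-9][0-9]}"    # -> 'this is the name ' (4-digit suffix removed)
echo "${dir#${dir%[0-9][0-9][0-9][0-9]}}"    # -> '2004' (everything before the suffix removed)
mv "$dir" "${dir%[0-9][0-9][0-9][0-9]}(${dir#${dir%[0-9][0-9][0-9][0-9]}})"    # -> 'this is the name (2004)'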
I find Bash annoying when it comes to finding and renaming files, because of all the escaping one needs to do. This is a cleaner Ruby solution:
#!/usr/bin/ruby
require 'fileutils'

# collect paths whose final component ends in " <4 digits>"
dirs = Dir.glob('./**/*').select { |x| x =~ / [0-9]{4}$/ }
# rename deepest paths first so parent renames don't invalidate child paths
dirs.sort.reverse.each do |dir|
  new_name = dir.gsub(/(.*)( )([0-9]{4})/, '\1\2(\3)')
  FileUtils.mv dir, new_name
end
When $F holds more than one directory, the for loop will treat it as one long entry containing newlines (try echo "F=[$F]").
Also use -depth: you might have topdir 2004/subdir 2004, so the subdirectory must be renamed first.
When the directories don't have newlines, you can try
#!/bin/bash
while IFS= read -r orgdir; do
    mv "${orgdir}" "$(sed -r "s/([0-9]+$)/(\1)/g" <<< "${orgdir}")"
done < <(find . -depth -name '*[0-9]' -type d)

Find and replace string and print file directory on change

I am using find and sed to replace a string in multiple files. Here is my script:
find ./ -type f -name "*.html" -maxdepth 1 -exec sed -i '' "s/${REPLACE_STRING}/${STRING}/g" {} \; -print
The -print always prints the file, no matter whether something was changed or not. What I would like is to see which files were changed. Ideally I would like the output to be something like this (as the files are changing):
/path/to/file was changed
- REPLACE STRING line 9 was changed
- REPLACE STRING line 12 was changed
- REPLACE STRING line 26 was changed
/path/to/file2 was changed
- REPLACE STRING line 1 was changed
- REPLACE STRING line 6 was changed
- REPLACE STRING line 36 was changed
Is there any way of doing something like this?
Cool idea. I think -print is a dead end for the reason you mention, so it needs to be done in the -exec. I think sed is also a dead end, due to the challenge of printing to STDOUT as well as modifying the file. So a natural extension is to wrap some Perl around it.
What if this was your exec statement:
perl -p -i -e '$i=1 if not defined($i); print STDOUT "$ARGV, line $i: $_" if s/REPLACE_STRING/STRING/; $i++' {} \;
-p wraps the Perl statements in a standard while(<>) loop so the file is processed line by line just like sed.
-i does in-place replacement, just like sed.
-e means execute the following Perl statements.
if not defined is a sneaky way of initialising a line count variable, even though it's executed for every line.
STDOUT tells print to output to the console instead of the file.
$ARGV is the current filename, when reading from <>.
$_ is the line being processed.
if means the print only gets executed if a match is found.
For an input file text.txt containing:
line 1
token 2
line 3
token 4
line 5
The statement perl -p -i -e '$i=1 if not defined($i); print STDOUT "$ARGV, line $i: $_" if s/token/sub/; $i++' text.txt gives me:
text.txt, line 2: sub 2
text.txt, line 4: sub 4
Leaving text.txt containing:
line 1
sub 2
line 3
sub 4
line 5
So you don't get your introductory "file was changed" line, but for a one-liner I think it's a pretty good compromise.
Operating on a couple of files it looks like this:
find ./ -type f -name "*.txt" -maxdepth 1 -exec perl -p -i -e '$i=1 if not defined($i); print STDOUT "$ARGV, line $i: $_" if s/token/sub/; $i++' {} \;
.//text1.txt, line 2: sub 2
.//text1.txt, line 4: sub 4
.//text2.txt, line 1: sub 1
.//text2.txt, line 3: sub 3
.//text2.txt, line 5: sub 5
You could chain -exec actions and take advantage of the exit status. For example:
find . \
    -maxdepth 1 \
    -type f \
    -name '*.html' \
    -exec grep -Hn "$REPLACE_STRING" {} \; \
    -exec sed -i '' "s/${REPLACE_STRING}/${STRING}/g" {} \;
This prints, for each matching file, the path, the line number and the line:
./file1.html:9:contents of line 9
./file1.html:12:contents of line 12
./file1.html:26:contents of line 26
./file2.html:1:contents of line 1
./file2.html:6:contents of line 6
./file2.html:36:contents of line 36
For files without a match, nothing else happens; for files with a match, the sed command will be called.
If you wanted output closer to what you have in your question, you could add a few actions:
find . \
    -maxdepth 1 \
    -type f \
    -name '*.html' \
    -exec grep -q "$REPLACE_STRING" {} \; \
    -printf '%p was changed\n' \
    -exec grep -n "$REPLACE_STRING" {} \; \
    -exec sed -i '' "s/${REPLACE_STRING}/${STRING}/g" {} \; \
    | sed -E "s/^([[:digit:]]+):.*/ - $REPLACE_STRING line \1 was changed/"
This now first checks, silently with grep -q, whether the file contains the string; then prints the filename (-printf); then all the matching lines with line numbers (grep -n); then does the substitution with sed; and finally reshapes the output slightly with another sed.
Since you're using sed -i '', I assume you're on macOS; I'm not sure if the stock find on there supports the printf option.
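If it doesn't, a portable substitute is to replace the -printf action with one more -exec that calls the external printf utility:
-exec printf '%s was changed\n' {} \; \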
By now, we're pretty close to running a complex-ish script on each file that matches, so we might as well do that directly:
shopt -s nullglob
for f in ./*.html; do
    if grep -q "$REPLACE_STRING" "$f"; then
        printf '%s\n' "$f was changed"
        grep -n "$REPLACE_STRING" "$f" \
            | sed -E "s/^([[:digit:]]+):.*/ - $REPLACE_STRING line \1 was changed/"
        sed -i '' "s/${REPLACE_STRING}/${STRING}/g" "$f"
    fi
done
Replace your find+sed command:
find ./ -type f -name "*.html" -maxdepth 1 -exec sed -i '' "s/${REPLACE_STRING}/${STRING}/g" {} \; -print
with this GNU awk command (needs gawk for inplace editing):
gawk -i inplace -v old="$REPLACE_STRING" -v new="$STRING" '
    FNR==1 { hdr = FILENAME " was changed\n" }
    gsub(old,new) { printf "%s - %s line %d was changed\n", hdr, old, FNR | "cat>&2"; hdr = "" }
1' *.html
You could also make this much more robust with awk than with sed if necessary, since awk can work with literal strings while sed can't.
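For example, here is a minimal sketch of a literal (non-regex) replacement using index() and substr() instead of gsub(), so that regex metacharacters in $REPLACE_STRING and & in $STRING are not special (note that -v still interprets backslash escapes):
gawk -i inplace -v old="$REPLACE_STRING" -v new="$STRING" '
{
    out = ""; rest = $0; changed = 0
    while ((pos = index(rest, old)) > 0) {       # next literal occurrence
        out = out substr(rest, 1, pos - 1) new   # copy prefix, append replacement
        rest = substr(rest, pos + length(old))   # continue after the match
        changed = 1
    }
    $0 = out rest
    if (changed) printf "%s line %d was changed\n", FILENAME, FNR | "cat>&2"
}
1' *.html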
Alright, always defer to Ed's awk script for efficiency. But continuing with the sed-plus-helper approach, using a preliminary call to grep to determine whether your file contains the word to replace, you could use a short helper script taking your ${REPLACE_STRING}, ${STRING} and filename as the first three positional parameters, as follows:
Helper Script named helper.sh
#!/bin/sh
# require all three positional parameters
test -z "$1" && exit
test -z "$2" && exit
test -z "$3" && exit

findw="$1"
replw="$2"
fname="$3"

grep -q "$findw" "$fname" || exit
echo "$(readlink -f "$fname") was changed"
grep -n "$findw" "$fname" | {
    while IFS= read -r line; do
        printf -- " - REPLACE STRING line %d was changed\n" "${line%%:*}"
    done
}
sed -i "s/$findw/$replw/g" "$fname"
Then your call to find could be, e.g.:
find . -type f -name "f*" -exec ./helper.sh "dog" "cat" '{}' \;
Example Use/Output
Starting with a couple of files named f containing:
$ cat f
my
dog
dog
has
fleas
In a file structure containing the script in the present directory with a subdirectory d1 and multiple copies of f, e.g.
$ tree .
.
├── d1
│   └── f
├── f
└── helper.sh
Running the script results in the following:
$ find . -type f -name "f*" -exec ./helper.sh "dog" "cat" '{}' \;
/tmp/tmp-david/f was changed
- REPLACE STRING line 2 was changed
- REPLACE STRING line 3 was changed
/tmp/tmp-david/d1/f was changed
- REPLACE STRING line 2 was changed
- REPLACE STRING line 3 was changed
and the contents of f are changed accordingly:
$ cat f
my
cat
cat
has
fleas
If there is no search term found in any of the files located by find, the modification times on those files are left unchanged.
Now with all that in mind, if you have gawk available, follow Ed's advice, but -- you can do it with sed and a helper :)
Perl is easy to install for free. Define your own strings in the bash shell and test here (the variables must be exported, since single quotes keep the shell from expanding them, so the one-liner reads them from Perl's %ENV):
export STRING=
export REPLACE=
perl -e 'foreach (`find . -maxdepth 1 -type f -iname "*.html"`){ chomp; open my $ih, "<", $_ or die "Error $!"; print "Processing: $_\n"; my $any; while (<$ih>) { my $s = $_; my $t = s/$ENV{REPLACE}/$ENV{STRING}/; print "$s --> $_" if $t; $any ||= $t } print "Nothing replaced\n" unless $any }'
To truly edit the files, use Perl's -i and -p options instead, e.g. perl -i -pe 's/$ENV{REPLACE}/$ENV{STRING}/g' ./*.html

Rename files to unique names and move them into a single destination directory

I have 100s of directories with the same filename of content.html along with other files.
I am trying to copy all these content.html files under 1 directory, but since they have the same name, they overwrite each other.
So how can I rename and move all of these under 1 directory?
Eg:
./0BD3D9D2-F8B1-4472-95C2-13319650A45C:
card.png content.html note.xhtml quickLook.png snippet.txt
./0EA34DB4-CD56-42BE-91DA-F631E44FB6E0:
card.png content.html note.xhtml quickLook.png related snippet.txt
./1A33F29E-3938-4C2F-BA99-6B98FD045742:
card.png content.html note.xhtml quickLook.png snippet.txt
Commands I tried:
Rename content.html to content:
find . -type f | grep content.html | while read f; do mv $f ${f/.html/}; done
Append a number to the filename "content" to make it unique:
find . -type f | grep content | while read f; do i=1; echo mv $f $f$i.html; i=i+1; done
MacBook-Pro$ find . -type f | grep content | while read f; do i=1; echo mv $f $f$i.html; i=i+1; done
mv ./0BD3D9D2-F8B1-4472-95C2-13319650A45C/content ./0BD3D9D2-F8B1-4472-95C2-13319650A45C/content1.html
mv ./0EA34DB4-CD56-42BE-91DA-F631E44FB6E0/content ./0EA34DB4-CD56-42BE-91DA-F631E44FB6E0/content1.html
mv ./1A33F29E-3938-4C2F-BA99-6B98FD045742/content ./1A33F29E-3938-4C2F-BA99-6B98FD045742/content1.html
Once the above step is successful, I should be able to do this to achieve my desired output:
find . -type f | grep content | while read f; do mv $f ../; done
However, I am sure I can do this in a 1-step command, and also my step 2 is not working (i is not incrementing).
Any idea why step 2 is not working?
Bash script:
#!/bin/bash
find . -type f -name content.html | while IFS= read -r f; do
    name=$(basename "$f")
    ((++i))
    mv "$f" "for_content/${name%.*}$i.html"
done
replace for_content with your destination folder name
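Note that mv will not create the destination folder for you, so make sure it exists before running the script:
mkdir -p for_content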
Suppose that in your base directory you create a folder named final for storing
content.html files; then do something like below:
find . -path ./final -prune -o -name "content.html" -print0 |
    while read -r -d '' name
    do
        # mktemp with the -u option only generates a random name (a dry run); it does not create the file
        mv "$name" "./final/content$(mktemp -u XXXX).html"
    done
At the end you'll get all the content.html files under the ./final folder, in the format contentXXXX.html where XXXX is four random characters.
Note: -path ./final -prune -o in find prevents it from descending into our results folder.
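Since -u only generates a name without reserving it, there is a small race window if two names collide. With GNU mktemp you can let it actually create the file and then overwrite it (a sketch, assuming GNU coreutils):
dest=$(mktemp --suffix=.html ./final/content-XXXX)  # atomically creates a unique empty file
mv "$name" "$dest"                                  # replace it with the real content.html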
The inodes of the files should be unique, so you could use the following:
find "$(pwd)" -name "content.html" -printf "%f %i %p\n" | awk '{ system("mv "$3" <directorytomoveto>"$2$1) }'
I'd use something like this:
find . -type f -name 'test' | awk 'BEGIN{ cnt=0 }{ printf "mv %s ./output-dir/content_%03d.txt\n", $0, cnt++ }' | bash;
You can replace ./output-dir/ with your destination directory
Example:
[root@sl7-o2 test]# ls -R
.:
1 2 3 output-dir
./1:
test
./2:
test
./3:
test
./output-dir:
[root@sl7-o2 test]# find . -type f -name 'test' | awk 'BEGIN{ cnt=0 }{ printf "mv %s ./output-dir/content_%03d.txt\n", $0, cnt++ }' | bash;
[root@sl7-o2 test]# ls ./output-dir/
content_000.txt content_001.txt content_002.txt
You can use shopt -s globstar to grab all content.html files recursively and then use a loop to rename them:
#!/bin/bash
shopt -s globstar                # enable recursive ** globbing
counter=0
dest_dir=/path/to/destination
for f in **/content.html; do     # pick up all content.html files
    [[ -f "$f" ]] || continue    # skip if not a regular file
    mv "$f" "$dest_dir/content_$((++counter)).html"
done

delete directories not listed in a file containing directory names

I have a file containing the list of directory names I want to keep. Say it is file1 and its contents are names of directories, like
dir1
dir2
dir3
My directory (the actual directories), on the other hand, has directories like
dir1
dir2
dir3
dir4
dirs
What I want to do is delete dir4, dirs and any other directories whose names don't exist in file1 from My directory. file1 has one directory name per line. There might be subdirectories or files under dir4 and dirs, which need recursive deletion.
I can use xargs to delete the directories in the list within My directory:
xargs -a file1 rm -r
But instead of removing them, I want to keep them and remove the others which are not in file1. I could do
xargs -a file1 mv -t /home/user1/store/
and then delete the remaining directories in My directory, but I am wondering if there is a better way?
Thanks.
find . -maxdepth 1 -type d -path "./*" -exec sh -c \
'for f; do f=${f#./}; grep -qw "$f" file1 || rm -rf "$f"; done' sh {} +
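Here the inner shell strips the leading ./ from each directory name with ${f#./}, and grep -qw checks whether that name appears as a word in file1; any directory not listed there is removed with rm -rf.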
Anish has a great one-liner answer for you. If you wanted something verbose that can help you in the future with data manipulation or such, here's a verbose version:
#!/bin/bash

# send this function the directory name
# it compares that name with all entries in
# file1. If the entry is found, 0 is returned
# That means...do not delete the directory
#
# Otherwise, 1 is returned
# That means...delete the directory
isSafe()
{
    # accept the directory name parameter
    DIR=$1
    echo "Received $DIR"

    # assume that the directory will not be found in the file list
    IS_SAFE=1 # false

    # read the file line by line
    while read -r line; do
        echo "Comparing $DIR and $line."
        if [ "$DIR" = "$line" ]; then
            IS_SAFE=0 # true
            echo "$DIR is safe"
            break
        fi
    done < file1

    return $IS_SAFE
}

# find all directories under the current directory
# and loop through them
for i in $(find * -type d); do
    # send each directory name to the function and
    # capture the result with $?
    isSafe "$i"
    SAFETY=$?

    # decide whether to delete the directory or not
    if [ "$SAFETY" -eq 1 ]; then
        echo "$i will be deleted"
        # uncomment below
        # rm -rf "$i"
    else
        echo "$i will NOT be deleted"
    fi
    echo "-----"
done
You can exclude your directories using grep:
find . -mindepth 1 -maxdepth 1 -type d -printf '%P\n' | grep -f file1 -Fx -v | xargs rm -r
-printf '%P\n' is used in order to remove the leading './' from directory names.
From man find, description of -printf format:
%P     File's name with the name of the starting-point under which it was found removed.
grep parameters:
-f FILE   Obtain patterns from FILE, one per line.
-F     Interpret PATTERNS as fixed strings, not regular expressions.
-x     Select only those matches that exactly match the whole line. For a regular expression pattern, this is like parenthesizing the pattern and then surrounding it with ^ and $.
-v     Invert the sense of matching, to select non-matching lines.
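With the sample layout from the question (file1 listing dir1, dir2 and dir3), the part of the pipeline before xargs prints exactly the names to delete:
$ find . -mindepth 1 -maxdepth 1 -type d -printf '%P\n' | grep -f file1 -Fx -v
dir4
dirs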

Listing all the directories that contain a specific file and store them in an array

How do I list all the directories that contain a specific file and store them in an array using a shell script? I tried the following code but it gave me this error: ls: **/myFile.txt: No such file or directory. myFile.txt can be any file.
code:
folderArray = ($(ls **/myFile.txt | tr -d myFile.txt))
echo folderArray
for folder in ${folderArray[#]}
do
echo "myFile.txt is present in $folder"
done
You can use this command to list all the directories that contain myFile.txt:
find . -type f -name 'myFile.txt' -print0 | xargs -0 -I {} dirname {}
And to store them in an array:
arr=( $(find . -type f -name 'myFile.txt' -print0 | xargs -0 -I {} dirname {}) )
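Note that the command substitution above word-splits on whitespace, so directory names containing spaces will break the array. A whitespace-safe sketch (assuming bash 4.4+ for mapfile -d and GNU find for -printf) reads NUL-delimited directory names directly:
mapfile -d '' arr < <(find . -type f -name 'myFile.txt' -printf '%h\0' | sort -zu)
Here %h prints the directory part of each match, and sort -zu removes duplicates.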
Use the power of zsh:
folderArray=($(echo **/myFile.txt(N^/:h)))
The flags inside () at the end are so-called glob qualifiers (plus a modifier), used in filename generation.
N: sets the NULL_GLOB option
^/: matches only files, not directories
:h: strips the filename from each result; it works like dirname
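The array can then be used exactly as in the question, e.g.:
for folder in "${folderArray[@]}"; do
    echo "myFile.txt is present in $folder"
done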
