How to optimize sed search and replace in shell script


I have to run sed up to three times because the word startpilot can occur more than once per line. I think this script could be written much better; perhaps someone can help me optimize it.
grep -rl "startpilot" ./* -R | xargs sed -i '' "s/startpilot/${PWD##*/}/"
#! /bin/bash
DIR="$1"
if [ $# -ne 1 ]
then
echo "Usage: $0 {new extension key}"
exit 1
fi
if [ -d "$DIR" ]
then
echo "Directory/extension '$DIR' exists already!"
else
git clone https://github.com/misterboe/startpilot.git $DIR --depth=1
echo "$DIR created."
cd $DIR && rm -rf .git && grep -rl "startpilot" ./* -R | xargs sed -i '' "s/startpilot/${PWD##*/}/" && grep -rl "startpilot" ./* -R | xargs sed -i '' "s/startpilot/${PWD##*/}/" && grep -rl "startpilot" ./* -R | xargs sed -i '' "s/startpilot/${PWD##*/}/" && grep -rl "Startpilot" ./* -R | xargs sed -i '' "s/Startpilot/${PWD##*/}/"
cd Resources/Public/ && bower install
echo "Your extension is now in $DIR."
fi

If you want to replace all occurrences of "startpilot" with ${PWD##*/}, add the g (global) flag at the end of the sed substitution:
sed -i '' "s/startpilot/${PWD##*/}/g"
Without g, sed replaces only the first occurrence on each line; with it, every occurrence is replaced, so a single pass over the files is enough.
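A minimal sketch of the effect of the g flag, run on standard input rather than in place so nothing on disk is touched. The function name rename_key and the replacement myext are hypothetical and not part of the original script; the sketch also assumes the replacement contains no / or & characters, which sed would treat specially.

```shell
# Hypothetical helper: replace every "startpilot" on stdin with the name in $1.
# Assumes $1 contains no "/" or "&" (both are special in a sed replacement).
rename_key() {
  sed "s/startpilot/$1/g"
}

# Three occurrences on one line are all replaced in a single pass.
printf 'startpilot/startpilot.php uses startpilot\n' | rename_key myext
```

The sed -i '' form used in the question is the BSD/macOS spelling; GNU sed takes -i without the separate empty argument.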

Related

moving files to their respective folders using bash scripting

I have files in this format:
2022-03-5344-REQUEST.jpg
2022-03-5344-IMAGE.jpg
2022-03-5344-00imgtest.jpg
2022-03-5344-anotherone.JPG
2022-03-5343-kdijffj.JPG
2022-03-5343-zslkjfs.jpg
2022-03-5343-myimage-2010.jpg
2022-03-5343-anotherone.png
2022-03-5342-ebee5654.jpeg
2022-03-5342-dec.jpg
2022-03-5341-att.jpg
2022-03-5341-timephoto_december.jpeg
....
about 13k images like these.
I want to create folders like:
2022-03-5344/
2022-03-5343/
2022-03-5342/
2022-03-5341/
....
I started manually moving them like:
mkdir name
mv name-* name/
But of course I'm not gonna repeat this process for 13k files.
So I want to do this with a bash script. Since I am new to bash and working in a production environment, I want to play it safe, but my attempt doesn't give me the results I want. This is what I have so far:
#!/bin/bash
name = $1
mkdir "$name"
mv "${name}-*" $name/
All I can do with it is run ./move.sh name for every folder; I don't know how to automate this using loops.

With bash and a regex. I assume that the files are all in the current directory.
for name in *; do
if [[ "$name" =~ (^....-..-....)- ]]; then
dir="${BASH_REMATCH[1]}"; # dir contains 2022-03-5344, e.g.
echo mkdir -p "$dir" || exit 1;
echo mv -v "$name" "$dir";
fi;
done
If the output looks okay, remove both echo commands.
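The same grouping can also be sketched with plain parameter expansion instead of a regex. This is only an illustration under the assumption that every file name starts with three dash-separated fields, as in the listing in the question; the function name group_by_prefix is hypothetical.

```shell
# Hypothetical helper: move NAME-NAME-NAME-rest files in directory $1
# into a folder named after their first three dash-separated fields.
group_by_prefix() (
  cd "${1:-.}" || exit 1
  for name in *-*-*-*; do
    [ -f "$name" ] || continue        # skip directories and an unmatched glob
    rest=${name#*-*-*-}               # e.g. IMAGE.jpg or myimage-2010.jpg
    dir=${name%"-$rest"}              # e.g. 2022-03-5344
    mkdir -p "$dir" && mv -- "$name" "$dir/"
  done
)
```

Running it on a copy of the directory first, for example group_by_prefix /path/to/copy, is a cautious way to check the result before touching production files.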

Try this
xargs -i sh -c 'mkdir -p {}; mv {}-* {}' < <(ls *-*-*-*|awk -F- -vOFS=- '{print $1,$2,$3}'|uniq)
Or:
find . -maxdepth 1 -type f -name "*-*-*-*" | \
awk -F- -vOFS=- '{print $1,$2,$3}' | \
sort -u | \
xargs -i sh -c 'mkdir -p {}; mv {}-* {}'
Or find with regex:
find . -maxdepth 1 -type f -regextype posix-extended -regex ".*/[0-9]{4}-[0-9]{2}-[0-9]{4}.*"

You could use awk
$ cat awk.script
/^[[:digit:]-]/ && ! a[$1]++ {
dir=$1
} /^[[:digit:]-]/ {
system("sudo mkdir " dir )
system("sudo mv " $0" "dir"/"$0)
}
To run the script:
$ awk -F"-([0-9]+)?[[:alpha:]]+.*" -f awk.script <(ls)
You will see some errors such as:
mkdir: cannot create directory ‘2022-03-5341’: File exists
These appear once the directory has been created by an earlier file; you can safely ignore them, because the directory now exists.
Each directory will then contain the relevant files:
$ ls 2022-03-5344
2022-03-5344-00imgtest.jpg 2022-03-5344-IMAGE.jpg 2022-03-5344-REQUEST.jpg 2022-03-5344-anotherone.JPG

Escaping commands in ssh connection via for loop

#!/bin/bash
for x in ontwikkelkaart
do
echo "***";
echo ${x};
ssh ${x}@localhost "
find ~/public_html/wp-content/themes/ -type f -name "*.webp" | awk '{ gsub(".webp$", "") ; print $0 }' | xargs -i sh -c 'if [ ! -f "{}" ]; then echo {}.webp; fi' \;
"
done
I have the above script, which connects to a server via SSH, checks whether there are webp files with no jpg/png source file, and echoes rm "filename".
The command:
find ~/public_html/wp-content/themes/ -type f -name "*.webp" | awk '{ gsub(".webp$", "") ; print $0 }' | xargs -i sh -c 'if [ ! -f "{}" ]; then echo {}.webp; fi' \;
It works when I run it on the command line of the server (via SSH), but when I try to run it in the for loop, it does not work because of the double quotes.
Can someone (try to) explain why the above code does not work?

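Try using a heredoc. In the original script, the inner double quotes end the outer double-quoted string, and the local shell expands $0 before ssh runs. A heredoc with a quoted delimiter sends the command to the remote shell unchanged:
ssh -q -T ${x}@localhost 2> /dev/null <<'EOF'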
find ~/public_html/wp-content/themes/ -type f -name "*.webp" | awk '{ gsub(".webp$", "") ; print $0 }' | xargs -i sh -c 'if [ ! -f "{}" ]; then echo {}.webp; fi' \;
EOF
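The effect of the quoted delimiter can be tried without a server. In the sketch below a local sh -s stands in for the ssh connection; that substitution, the function names and the sample file name photo.webp are assumptions made for illustration only.

```shell
# Stand-in for "ssh user@host": a local shell that reads commands from stdin.
run_remote() {
  sh -s
}

# Because the delimiter is quoted ('EOF'), the calling shell expands nothing:
# the double quotes, the single quotes and $0 all reach the other shell intact.
strip_webp_remotely() {
  run_remote <<'EOF'
printf '%s\n' "photo.webp" | awk '{ sub(/\.webp$/, ""); print $0 }'
EOF
}

strip_webp_remotely
```

With an unquoted delimiter (<<EOF), $0 would be replaced by the calling shell before the text is sent, which is the same kind of breakage as in the double-quoted version.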

Find all files that contain both string1 and string2

The following script finds and prints the names of all files that contain either string1 or string2.
However, I could not figure out how to change this code so that it prints only those files that contain both string1 and string2. Kindly suggest the required change.
number=0
for file in `find -name "*.txt"`
do
if [ "`grep "string2\|string1" $file`" != "" ] // change has to be done here
then
echo "`basename $file`"
number=$((number + 1))
fi
done
echo "$number"

Using grep and cut:
grep -H string1 input | grep -E '[^:]*:.*string2' | cut -d: -f1
You can use this with the find command:
find -name '*.txt' -exec grep -H string1 {} \; | grep -E '[^:]*:.*string2'
And if the patterns are not necessarily on the same line:
find -name '*.txt' -exec grep -l string1 {} \; | \
xargs -n 1 -I{} grep -l string2 {}

This solution can handle files with spaces in their names:
number=0
oldIFS=$IFS
IFS=$'\n'
for file in `find -name "*.txt"`
do
if grep -l "string1" "$file" >/dev/null; then
if grep -l "string2" "$file" >/dev/null; then
basename "$file"
number=$((number + 1))
fi
fi
done
echo $number
IFS=$oldIFS
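Another way to handle unusual file names is to let find run both checks itself, so no file list is ever split by the shell. This is a sketch, not a drop-in replacement for the script above: the function name files_with_both is hypothetical, and the two strings are matched as fixed strings rather than regular expressions.

```shell
# Hypothetical helper: print .txt files under $1 that contain both $2 and $3.
# Each -exec acts as a test; -print is reached only if both greps succeed.
files_with_both() {
  find "$1" -type f -name '*.txt' \
    -exec grep -qF -e "$2" {} \; \
    -exec grep -qF -e "$3" {} \; \
    -print
}
```

A count like the one in the question can then be taken with files_with_both . string1 string2 | wc -l, assuming no file name contains a newline.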

How do I list newest directory and add as variable to bash script to process files recursively

ls -t1 | head -n1
Works perfectly to list the latest directory, but I want to add that directory name to my script so I can process the files within using the following script:
#!/bin/bash
ls | while read -r FILE
do
mv -v "$FILE" `echo $FILE | tr ' ' '_' `
done
ls | while read -r FILE
do
mv -v "$FILE" `echo $FILE | tr '\*.JPEG' '\*.jpg' `
done
mogrify -resize 750 *.jpg
wait
jpegoptim *.jpg --max=70 --strip-all
exit
I also want to process the files recursively, there might be at most one level of sub directories.
Basically keep the bash script at the root of the directory and process all latest directories and sub directories files.
OK I modified the script to this:
#!/bin/bash
DIR=ls -t1 | head -n1
ls $DIR | while read -r FILE
do
mv -v "$FILE" `echo $FILE | tr ' ' '_' `
done
ls $DIR | while read -r FILE
do
mv -v "$FILE" `echo $FILE | tr '\*.JPEG' '\*.jpg' `
done
mogrify -resize 750 $DIR/*.jpg
wait
jpegoptim $DIR/*.jpg --max=70 --strip-all
exit
But it does not seem to recognise the $DIR variable.

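This bash script renames and converts the jpg files in the newest directory under the current directory, as well as the files in the first level of subdirectories below it. Note that the assignment in the question, DIR=ls -t1 | head -n1, does not capture the command's output; command substitution (backticks or $(...)) is needed, as in the FIRST_DIR assignment here.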
#!/bin/bash
FIRST_DIR=`ls -t1F | grep / | head -n1`
DIR="./${FIRST_DIR}"
ls -t1F $DIR | while read -r FILE
do
if [ "$FILE" ]
then
if [[ $FILE = */ ]]
then
echo "here ${DIR}${FILE}."
DEEP_DIR="${DIR}${FILE}"
ls -t1 $DEEP_DIR | while read -r FILE2
do
if [ "$FILE2" ]
then
if [[ $FILE2 != */ ]]
then
RENAME=`echo ${FILE2//\*/} | tr ' ' '_' `
mv -v "${DEEP_DIR}${FILE2//\*/}" "${DEEP_DIR}${RENAME}"
FILE2=$RENAME
RENAME2=`echo ${FILE2//\*/} | sed 's/\.JPEG$/.jpg/' `
mv -v "${DEEP_DIR}${FILE2//\*/}" "${DEEP_DIR}${RENAME2}"
FILE2=$RENAME2
fi
fi
done
mogrify -resize 750 "$DEEP_DIR"*.jpg
wait
jpegoptim "$DEEP_DIR"*.jpg --max=70 --strip-all
fi
if [[ $FILE != */ ]]
then
RENAME=`echo ${FILE//\*/} | tr ' ' '_' `
mv -v "${DIR}${FILE//\*/}" "${DIR}${RENAME}"
FILE=$RENAME
RENAME2=`echo ${FILE//\*/} | sed 's/\.JPEG$/.jpg/' `
mv -v "${DIR}${FILE//\*/}" "${DIR}${RENAME2}"
FILE=$RENAME2
fi
fi
done
if [ "$FIRST_DIR" ]
then
mogrify -resize 750 "$DIR"*.jpg
wait
jpegoptim "$DIR"*.jpg --max=70 --strip-all
fi
Here are a couple of good links about bash programming:
http://tldp.org/LDP/abs/html/comparison-ops.html
http://tldp.org/LDP/abs/html/string-manipulation.html
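As a side note on capturing the newest directory in a variable, the lookup can be sketched without parsing ls output by comparing modification times with the -nt test. The function name newest_dir is hypothetical, and -nt is assumed to be available (it is in bash and in most sh implementations, although POSIX does not require it).

```shell
# Hypothetical helper: print the most recently modified subdirectory of $1.
newest_dir() (
  newest=
  for d in "${1:-.}"/*/; do
    [ -d "$d" ] || continue                      # also skips an unmatched glob
    if [ -z "$newest" ] || [ "$d" -nt "$newest" ]; then
      newest=$d
    fi
  done
  printf '%s\n' "${newest%/}"                    # drop the trailing slash
)
```

The result is captured with command substitution, for example DIR=$(newest_dir .), which is the step the modified script in the question was missing.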

bash scripting challenge

I need to write a bash script that will iterate through the contents of a directory (including subdirectories) and perform the following replacements:
replace 'foo' in any file names with 'bar'
replace 'foo' in the contents of any files with 'bar'
So far all I've got is
find . -name '*' -exec {} \;
:-)

With RH rename:
find . \( -type f -exec sed -i s/foo/bar/g {} \; , -name \*foo\* -exec rename foo bar {} \; \)

find "$@" -depth -exec sed -i -e s/foo/bar/g {} \; , -name '*foo*' -print0 |
while IFS= read -r -d '' file; do
base=$(basename "$file")
mv "$file" "$(dirname "$file")/${base//foo/bar}"
done

UPDATED: 1632 EST
Now handles whitespace, but 'while read item' never terminates. Better, but still not right. Will keep working on this.
aj@mmdev0:~/foo_to_bar$ cat script.sh
#!/bin/bash
dirty=true
while ${dirty}
do
find ./ -name "*" |sed -s 's/ /\ /g'|while read item
do
if [[ ${item} == "./script.sh" ]]
then
continue
fi
echo "working on: ${item}"
if [[ ${item} == *foo* ]]
then
rename 's/foo/bar/' "${item}"
dirty=true
break
fi
if [[ ! -d ${item} ]]
then
cat "${item}" |sed -e 's/foo/bar/g' > "${item}".sed; mv "${item}".sed "${item}"
fi
dirty=false
done
done

#!/bin/bash
function RecurseDirs
{
oldIFS=$IFS
IFS=$'\n'
for f in *
do
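if [[ -f "${f}" ]]; then
newf="${f//foo/bar}"
# Write to a temporary file first so a file whose name is unchanged is not truncated
sed -e 's/foo/bar/g' < "${f}" > "${newf}.tmp" && mv "${newf}.tmp" "${newf}"
# Remove the original once it has been written under its new name
[[ "${newf}" != "${f}" ]] && rm "${f}"
fi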
if [[ -d "${f}" && "${f}" != '.' && "${f}" != '..' && ! -L "${f}" ]]; then
cd "${f}"
RecurseDirs .
cd ..
fi
done
IFS=$oldIFS
}
RecurseDirs .

bash 4.0
#!/bin/bash
shopt -s globstar
path="/path"
cd $path
for file in **
do
if [ -d "$file" ] && [[ "$file" == *foo* ]];then
echo mv "$file" "${file//foo/bar}"
elif [ -f "$file" ];then
while read -r line
do
case "$line" in
*foo*) line="${line//foo/bar}";;
esac
echo "$line"
done < "$file" > temp
echo mv temp "$file"
fi
done
Remove the echo in front of the two mv commands to commit the changes.

for f in `tree -fi | grep foo`; do sed -i -e 's/foo/bar/g' $f ; done

Yet another find-exec solution:
find . -type f -exec bash -c '
path="{}";
dirName="${path%/*}";
baseName="${path##*/}";
nbaseName="${baseName/foo/bar}";
#nbaseName="${baseName//foo/bar}";
# cf. http://www.bash-hackers.org/wiki/doku.php?id=howto:edit-ed
ed -s "${path}" <<< $'H\ng/foo/s/foo/bar/g\nwq';
#sed -i "" -e 's/foo/bar/g' "${path}"; # alternative for large files
exec mv -iv "{}" "${dirName}/${nbaseName}"
' \;

correction to find-exec approach by gregb (adding quotes):
# compare
bash -c '
echo $'a\nb\nc'
'
bash -c '
echo $'"'a\nb\nc'"'
'
# therefore we need
find . -type f -exec bash -c '
...
ed -s "${path}" <<< $'"'H\ng/foo/s/foo/bar/g\nwq'"';
...
' \;
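For comparison, the two requirements can be sketched as two separate find passes: one that rewrites file contents and one that renames entries depth-first, so children are renamed before their parent directories. This is a sketch under assumptions, not a tested replacement for the answers above: the function name replace_foo_with_bar is hypothetical, -mindepth is assumed to be supported (GNU and BSD find), file names are assumed not to contain newlines, and rewriting through a temporary file does not preserve the original file's permissions.

```shell
# Hypothetical helper: replace foo with bar in contents and names under $1.
replace_foo_with_bar() (
  root=${1:-.}
  # Pass 1: rewrite contents via a temporary file (avoids sed -i differences).
  find "$root" -type f -exec sh -c '
    for f do
      if grep -q foo "$f"; then
        sed "s/foo/bar/g" "$f" > "$f.tmp.$$" && mv "$f.tmp.$$" "$f"
      fi
    done' sh {} +
  # Pass 2: rename depth-first; -mindepth 1 leaves the starting directory alone.
  find "$root" -mindepth 1 -depth -name "*foo*" -exec sh -c '
    for p do
      dir=${p%/*}
      base=${p##*/}
      new=$(printf "%s" "$base" | sed "s/foo/bar/g")
      mv "$p" "$dir/$new"
    done' sh {} +
)
```

Trying it on a scratch copy first, for example replace_foo_with_bar /tmp/copy, is advisable because both passes modify the tree in place.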
