Minimising domains to specific folders - bash

I'm looking to trim these URLs. I've been looking at cut -d'/' -f1,2,3,... but I'm not sure that's the best method.
https://www.example.com/images/fdgdfg4wretrgretg/cad0be71.jpg
https://www.example.com/images/fdgdfg4w/cad0be71/32234/234234.jpg
https://www.example.com/images//77777/dfgdfgdfg/cad0be71.jpg
Convert into:
https://www.example.com/images/fdgdfg4wretrgretg/
https://www.example.com/images/fdgdfg4w/cad0be71/
https://www.example.com/images//77777/

Why not use bash directly?
declare regexp='^(https?://[^/]+/[^/]+)(/.*)?$'
if [[ "$url" =~ $regexp ]]; then
url="${BASH_REMATCH[1]}"
fi
Since bash supports regular expressions, this should work for most cases.
And in a loop:
declare regexp='^(https?://[^/]+/[^/]+)(/.*)?$'
while read -r url; do
if [[ "$url" =~ $regexp ]]; then
url="${BASH_REMATCH[1]}"
echo "url: $url"
else
echo "error: mismatch $url"
fi
done < domains.txt
Or use sed directly:
sed -E -e 's#^(https?://[^/]+/[^/]+)(/.*)?$#\1#g' domains.txt
And if you need to do something more in bash (consider reading the link in Ed Morton's comment):
while read -r url; do
echo "url: $url"
done < <(sed -E -e 's#^(https?://[^/]+/[^/]+)(/.*)?$#\1#g' domains.txt)
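To check what the pattern above actually keeps, it can be wrapped in a small function and run on one of the sample URLs. This is only a sketch: trim_url is a made-up name, and the regex is the one from the answer. Note that capture group 1 stops after the first folder, which is shorter than the output the question asks for.

```bash
#!/usr/bin/env bash
# trim_url is a hypothetical helper name, not part of the original answer.
# Prints the scheme, host and first path segment of the URL given as $1.
trim_url() {
    local regexp='^(https?://[^/]+/[^/]+)(/.*)?$'
    if [[ $1 =~ $regexp ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1    # not an http(s) URL with at least one path segment
    fi
}

trim_url 'https://www.example.com/images/fdgdfg4wretrgretg/cad0be71.jpg'
# prints: https://www.example.com/images
```

Keeping one more folder level would mean adding another /[^/]+ inside the first capture group.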

Related

Convert folder and file names to camel case

I have a list of folders and files whose names contain spaces. How can I change the names into camel case?
for oldname in *
do
newname=`echo $oldname | sed -e 's/ /_/g'`
if [ "$newname" = "$oldname" ]
then
continue
fi
if [ -e "$newname" ]
then
echo Skipping "$oldname", because "$newname" exists
else
mv "$oldname" "$newname"
fi
done
I have found this but it changes the spaces into underscores.
Try this Shellcheck-clean Bash code:
#! /bin/bash -p
lowers=abcdefghijklmnopqrstuvwxyz
uppers=ABCDEFGHIJKLMNOPQRSTUVWXYZ
for oldname in *; do
[[ $oldname == *[[:space:]]* ]] || continue
read -r -d '' -a parts <<<"$oldname"
newname=''
for p in "${parts[@]}"; do
char1=${p:0:1}
if [[ $lowers == *"$char1"* ]]; then
tmp=${lowers%"$char1"*}
uchar1=${uppers:${#tmp}:1}
newname+=${uchar1}${p:1}
else
newname+=$p
fi
done
if [[ -e $newname ]]; then
printf "Skipping '%s', because '%s' exists\\n" "$oldname" "$newname" >&2
else
echo mv -v -- "$oldname" "$newname"
fi
done
The code is intended to work with (the now ancient) Bash 3 because my understanding is that that is still the current version of the standard Bash on macOS. The code for uppercasing the first letter of filename parts is much more complicated than it would be with later versions of Bash (which have built-in mechanisms for case conversion). See How to convert a string to lower case in Bash? for information about changing case in various ways in various versions of Bash.
The code just prints the mv command that would be run. Remove the echo to make it actually do the mv.
See the accepted, and excellent, answer to Why is printf better than echo? for an explanation of why I replaced echo with printf for the "Skipping" message.
For comparison, this is Bash 4+ code:
#! /bin/bash -p
for oldname in *; do
[[ $oldname == *[[:space:]]* ]] || continue
read -r -d '' -a parts <<<"$oldname"
newname=''
for p in "${parts[@]}"; do
newname+=${p^}
done
if [[ -e $newname ]]; then
printf "Skipping '%s', because '%s' exists\\n" "$oldname" "$newname" >&2
else
echo mv -v -- "$oldname" "$newname"
fi
done
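The ${p^} expansion used in the Bash 4+ version can be tried in isolation before renaming anything. The sketch below assumes Bash 4 or later and a single-line name; to_camel is a hypothetical function name, not something from the answer.

```bash
#!/usr/bin/env bash
# to_camel is a hypothetical helper: it splits its argument on whitespace,
# uppercases the first character of each part with ${p^} (Bash 4+),
# and prints the parts joined together.
to_camel() {
    local -a parts=()
    local p out=''
    read -r -a parts <<<"$1"
    for p in "${parts[@]}"; do
        out+=${p^}
    done
    printf '%s\n' "$out"
}

to_camel 'my holiday photos.txt'
# prints: MyHolidayPhotos.txt
```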
You can use sed's regular-expression support to handle upper and lower case conversion according to your current locale (LC_ALL; check with the locale command). Note that \u, \l and \L in the replacement are GNU sed extensions.
If the "words" in your filenames are separated by spaces and are all in lower case, you can use a simple shell script like this:
#!/bin/sh
while read -r FILENAME ; do
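# Uppercase the first character and every character that follows a run of spaces, dropping the spaces (GNU sed)
NEWNAME="`echo \"${FILENAME}\" | sed 's/\(^\| \+\)\([^ ]\)/\u\2/g'`"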
if [ ! "${NEWNAME}" ] ; then
NEWNAME="${FILENAME}";
fi
if [ "${FILENAME}" = "${NEWNAME}" ]; then
printf "No change : %s\\n" "${FILENAME}" >&2;
else
if [ -e "${NEWNAME}" ] ; then
printf "Already changed : %s => %s\\n" "${FILENAME}" "${NEWNAME}" >&2;
else
echo "mv \"${FILENAME}\" \"${NEWNAME}\"";
fi
fi
done
Remove the echo on echo "mv \"${FILENAME}\" \"${NEWNAME}\""; to do the mv.
Note that it should work fine with accented letters or any Unicode letter that has both lower and upper case forms.
The script reads the list of files to process from stdin, so to use it as is, you can do something like the following:
find . -type 'f' | theScript.sh
For a whole tree of files.
Folders have to be processed separately: list them and sort them in descending order.
ls -1 | theScript.sh
For files in the current folder.
If your filenames may be fully or partly in upper case and you want to force them entirely to camel case, replace the line that sets NEWNAME with this one, which lowercases the whole name first (\L) and then uppercases the word starts:
NEWNAME="`echo \"${FILENAME}\" | sed 's/\(.*\)/\L\1/;s/\(^\| \+\)\([^ ]\)/\u\2/g'`"
If you have rename installed, a one-liner is enough to replace the spaces (note that this one produces underscores rather than camel case):
rename 's/ /_/g' *

How does process substitution work with while loops?

I'm reading/editing a bash git integration script.
This snippet is supposed to print ${SYMBOL_GIT_PUSH} or ${SYMBOL_GIT_PULL} alongside how many commits I am behind and/or ahead by.
local marks
while IFS= read -r line; do
if [[ $line =~ ^## ]]; then
[[ $line =~ ahead\ ([0-9]+) ]] && marks+=" ${BASH_REMATCH[1]}${SYMBOL_GIT_PUSH}"
[[ $line =~ behind\ ([0-9]+) ]] && marks+=" ${BASH_REMATCH[1]}${SYMBOL_GIT_PULL}"
else
marks="${SYMBOL_GIT_MODIFIED}${marks}"
break
fi
done < <(git status --porcelain --branch 2>/dev/null)
printf '%s' "$marks"
Example:
4↑ 10↓
It is working, but I am trying to understand it.
Why is IFS set there, and how does it work with process substitution?
I've heard process substitution isn't defined in sh. Is there a way to do this the /bin/sh way, or at least more efficiently?
I was provided with a link that should explain what IFS does.
I rearranged things and managed to remove the process substitution:
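The effect of IFS= on read can be seen with a line that has leading and trailing spaces, which matters here because git status --porcelain lines can start with a space. A minimal sketch; read_default and read_raw are hypothetical names used only for this comparison.

```bash
#!/usr/bin/env bash
# With the default IFS, read strips leading and trailing whitespace.
read_default() { local line; read -r line; printf '[%s]\n' "$line"; }

# With IFS set to empty for the read command only, the line is kept verbatim.
read_raw() { local line; IFS= read -r line; printf '[%s]\n' "$line"; }

printf '  M file.txt  \n' | read_default   # prints: [M file.txt]
printf '  M file.txt  \n' | read_raw       # prints: [  M file.txt  ]
```

Because IFS= is written as a prefix to read, it applies to that one command and does not change IFS for the rest of the script.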
local marks
git status --porcelain --branch 2>/dev/null |
while IFS= read -r line; do
if [[ $line =~ ^## ]]; then
[[ $line =~ ahead\ ([0-9]+) ]] && marks+=" ${BASH_REMATCH[1]}${SYMBOL_GIT_PUSH}"
[[ $line =~ behind\ ([0-9]+) ]] && marks+=" ${BASH_REMATCH[1]}${SYMBOL_GIT_PULL}"
else
marks="${SYMBOL_GIT_MODIFIED}${marks}"
break
fi
done
printf '%s\n' "$marks"
But now, the value of $marks isn't saved and it prints nothing.
I was provided with another link that explains why.
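The short version of why: each part of a pipeline runs in a subshell, so assignments made inside the piped while loop are lost when it ends, whereas a loop fed by process substitution runs in the current shell. The sketch below shows the difference under bash defaults (lastpipe not enabled); count_pipe and count_procsub are hypothetical names.

```bash
#!/usr/bin/env bash
# Loop on the right-hand side of a pipe: n is incremented in a subshell,
# so the function's own n is still 0 afterwards.
count_pipe() {
    local n=0 line
    printf 'a\nb\n' | while IFS= read -r line; do n=$((n + 1)); done
    echo "$n"
}

# Loop fed by process substitution: it runs in the current shell,
# so the increments are visible after the loop.
count_procsub() {
    local n=0 line
    while IFS= read -r line; do n=$((n + 1)); done < <(printf 'a\nb\n')
    echo "$n"
}

count_pipe      # prints: 0
count_procsub   # prints: 2
```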
I will return and update with what I've found.
I used the command grouping workaround and wrapped the loop and the print statement inside curly braces.
Also, I made the /bin/sh version almost functional (the exception: showing how many commits I'm ahead or behind by. Not hard; I'm sure I'll do something with awk or cut).
I took advantage of the fact that grep returns non-zero when nothing matches:
git status --porcelain --branch 2>/dev/null | {
SYMBOL_GIT_PUSH='↑'
SYMBOL_GIT_PULL='↓'
while IFS= read -r line
do
if echo "$line" | egrep -q '^##'
then
echo "$line" | egrep -q 'ahead' && marks="$marks $SYMBOL_GIT_PUSH"
echo "$line" | egrep -q 'behind' && marks="$marks $SYMBOL_GIT_PULL"
else
marks="*$marks"
break
fi
done
printf ' %s' "$marks"
}
This was a fun learning experience! Thanks to everyone who helped. When I find the 100% solution I'll update this.
Here's the bashism-less git info function.
__git() {
git_eng="env LANG=C git"
ref="$($git_eng symbolic-ref --short HEAD 2>/dev/null)"
[ -n "$ref" ] && ref="$SYMBOL_GIT_BRANCH$ref" || ref="$($git_eng describe --tags --always 2>/dev/null)"
[ -n "$ref" ] || return;
git status --porcelain --branch 2>/dev/null | {
SYMBOL_GIT_PUSH='↑'
SYMBOL_GIT_PULL='↓'
while IFS= read -r line
do
if echo "$line" | grep -E -q '^##'
then
echo "$line" | grep -E -q 'ahead' &&
marks="$marks $SYMBOL_GIT_PUSH$(echo "$line" | sed 's/.*\[ahead //g' | sed 's/\].*//g')"
echo "$line" | grep -E -q 'behind' &&
marks="$marks $SYMBOL_GIT_PULL$(echo "$line" | sed 's/.*\[behind //g' | sed 's/\].*//g')"
else
marks="$SYMBOL_GIT_MODIFIED$marks"
break
fi
done
printf ' %s%s' "$ref" "$marks"
}
}
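sed searches for [ahead and deletes it along with everything before it, then pipes the result into another sed, which deletes ] and everything after it. This leaves only the number when the line reports just one of ahead or behind. For a line containing [ahead 4, behind 10], the first pipeline leaves 4, behind 10 and the second finds no [behind to strip.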
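When a branch is both ahead and behind, git reports the two counts in one bracket, e.g. [ahead 4, behind 10], so matching the keyword and the digits after it is more robust than stripping around the brackets. A possible sketch that stays within POSIX sh and sed; extract_count is a hypothetical helper and the sample status line is made up for illustration.

```bash
#!/bin/sh
# extract_count is a hypothetical helper.
# $1 = keyword (ahead or behind), $2 = a "## ..." line from git status.
# Prints the number that follows the keyword, or nothing if it is absent.
extract_count() {
    printf '%s\n' "$2" | sed -n "s/.*$1 \([0-9][0-9]*\).*/\1/p"
}

line='## main...origin/main [ahead 4, behind 10]'   # made-up sample line
extract_count ahead "$line"    # prints: 4
extract_count behind "$line"   # prints: 10
```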

Bash - How can I execute a variable

I am reading a file with lines like:
folder=abc
name=xyz
For some lines I would like to set a variable, e.g. name=xyz, corresponding to the line I have read.
Cutting it down, with name=xyz and folder=abc, I have tried:
while read -r line; do
$line
echo $name
done < /etc/testfile.conf
This gives an error message ./test: line 4: folder=abc: command not found etc.
I have tried "$line" and $($line) and it is the same. Is it possible to do what I want?
I have succeeded by doing:
while read -r line; do
if [[ "$line" == 'folder'* ]]; then
folder="$(echo "$line" | cut -d'=' -f 2)"
fi
if [[ "$line" == 'name'* ]]; then
name="$(echo "$line" | cut -d'=' -f 2)"
fi
done < /etc/testfile.conf
but this seems messy.
For your sample, declare is the safest option:
while read -r line; do
declare "$line"
done < /etc/testfile.conf
$ echo "$folder"
abc
$ echo "$name"
xyz
The direct approach is to use eval.
A different approach is to use source (or .):
$ echo "$line"
folder=abc
$ . <(echo "$line")
$ echo "$folder"
abc
But the better answer is probably to tackle the problem in a different way.
You can clean up your approach a bit without resorting to eval.
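# Read into "key" rather than "name": read clears its variables at end of file,
# which would otherwise wipe the name that was just assigned.
while IFS="=" read -r key value; do
case $key in
folder) folder=$value ;;
name) name=$value ;;
esac
done < /etc/testfile.conf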
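If the set of keys is not known in advance, the same loop can be generalised by checking that each key looks like a variable name and assigning it with printf -v, which avoids both eval and sourcing the file. This is only a sketch under those assumptions: load_conf is a hypothetical name, and the demo writes its own temporary file instead of touching /etc/testfile.conf.

```bash
#!/usr/bin/env bash
# load_conf is a hypothetical helper: it reads key=value lines from the file
# given as $1 and sets a shell variable for every key that is a valid identifier.
load_conf() {
    local key value
    while IFS='=' read -r key value; do
        # Skip blank lines, comments and anything that is not a plain identifier.
        [[ $key =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue
        printf -v "$key" '%s' "$value"
    done < "$1"
}

demo_conf=$(mktemp)
printf 'folder=abc\nname=xyz\n' > "$demo_conf"
load_conf "$demo_conf"
echo "$folder $name"   # prints: abc xyz
rm -f "$demo_conf"
```

Unlike sourcing, nothing in the file is executed; a line such as name=$(rm -rf ~) would simply be stored as a literal string.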
Why not just source the file?
$ . infile ; echo "$name"
xyz

base64 decode while ignoring brackets

I'm trying to decode a file, which is mostly encoded with base64. What I want to do is to decode the following, while still maintaining the [_*_].
example.txt
wq9cXyjjg4QpXy/Crwo=
[_NOTBASE64ED_]
aGkgdGhlcmUK
[_CONSTANT_]
SGVsbG8gV29ybGQhCg==
Sometimes it'll be in this form
aGkgdGhlcmUK[_CONSTANT_]SGVsbG8gV29ybGQhCg==
Desired output
¯\_(ツ)_/¯
[_NOTBASE64ED_]
hi there
[_CONSTANT_]
Hello World!
hi there[_CONSTANT_]Hello World!
Error output
¯\_(ツ)_/¯
4��!:�#�H\�B�8ԓ��[��ܛBbase64: invalid input
What I've tried
base64 -di example.txt
base64 -d example.txt
base64 --wrap=0 -d -i example.txt
I tried to individually base64 the [_*_] using grep -o. Then find and
replacing them through a weird arrangement with arrays, but I couldn't
get it to work.
base64ing it all, then decoding. Results in double base64ed rows.
The file is significantly downsized!
Encoded using base64 --wrap=0, while loop, and if/else statement.
The [_*_] still need to be there after being decoded.
I am sure someone has a more clever solution than this. But try this
#! /bin/bash
MYTMP1=""
function printInlineB64()
{
local lines=($(echo $1 | sed -e 's/\[/\n[/g' -e 's/\]/]\n/g'))
OUTPUT=""
for line in "${lines[@]}"; do
MYTMP1=$(base64 -d <<< "$line" 2>/dev/null)
if [ "$?" != "0" ]; then
OUTPUT="${OUTPUT}${line}"
else
OUTPUT="${OUTPUT}${MYTMP1}"
fi;
done
echo "$OUTPUT"
}
MYTMP2=""
function printB64Line()
{
local line=$1
# not fully base64 line
if [[ ! "$line" =~ ^[A-Za-z0-9+/=]+$ ]]; then
printInlineB64 "$line"
return
fi;
# likely base64 line
MYTMP2=$(base64 -d <<< "$line" 2>/dev/null)
if [ "$?" != "0" ]; then
echo "$line"
else
echo "$MYTMP2"
fi;
}
FILE=$1
if [ -z "$FILE" ]; then
echo "Please give a file name in argument"
exit 1;
fi;
while IFS= read -r line; do
printB64Line "$line"
done < "${FILE}"
and here is output
$ cat example.txt && echo "==========================" && ./base64.sh example.txt
wq9cXyjjg4QpXy/Crwo=
[_NOTBASE64ED_]
aGkgdGhlcmUK
[_CONSTANT_]
SGVsbG8gV29ybGQhCg==
==========================
¯\_(ツ)_/¯
[_NOTBASE64ED_]
hi there
[_CONSTANT_]
Hello World!
$ cat example2.txt && echo "==========================" && ./base64.sh example2.txt
aGkgdGhlcmUK[_CONSTANT_]SGVsbG8gV29ybGQhCg==
==========================
hi there[_CONSTANT_]Hello World!
You need a loop that reads each line and tests whether it's base64 or non-base64, and processes it appropriately.
while read -r line
do
case "$line" in
\[*\]) echo "$line" ;;
*) base64 -d <<< "$line" ;;
esac
done < example.txt
I would suggest using a language other than sh, but here is a solution using cut. It handles the case where there is more than one [_constant_] in a line.
#!/bin/bash
function decode() {
local data=""
local line=$1
while [[ -n $line ]]; do
data=$data$(echo $line | cut -d[ -f1 | base64 -d)
const=$(echo $line | cut -d[ -sf2- | cut -d] -sf1)
[[ -n $const ]] && data=$data[$const]
line=$(echo $line | cut -d] -sf2-)
done
echo "$data"
}
while read -r line; do
decode "$line"
done < example.txt
If Perl is an option, you can say something like:
perl -MMIME::Base64 -lpe '$_ = join("", grep {/^\[/ || chomp($_ = decode_base64($_)), 1} split(/(?=\[)|(?<=\])/))' example.txt
The code below is equivalent to the above, broken down into steps for the purpose of explanation:
#!/bin/bash
perl -MMIME::Base64 -lpe '
@ary = split(/(?=\[)|(?<=\])/, $_);
foreach (@ary) {
if (! /^\[/) {
chomp($_ = decode_base64($_));
}
}
$_ = join("", @ary);
' example.txt
The -MMIME::Base64 option loads the base64 codec module.
The -lpe options make Perl behave like AWK, looping over the input lines and handling newlines implicitly.
The regular expression (?=\[)|(?<=\]) matches the boundary between a base64 block and a block to be preserved, which is surrounded by [...].
The split function divides the line into blocks at those boundaries and stores them in an array.
The loop then walks the array and decodes each base64-encoded entry.
Finally, the blocks are joined back into a single line to print.
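The same split, decode and join steps can also be sketched in plain bash with a regex loop. This assumes GNU base64 (the -d flag) and that markers never contain nested brackets; decode_mixed_line is a hypothetical function name.

```bash
#!/usr/bin/env bash
# decode_mixed_line is a hypothetical helper: it decodes the base64 chunks of
# one line while copying every [...] marker through unchanged.
decode_mixed_line() {
    local rest=$1 out='' chunk
    # Group 1: text before the next marker, group 2: the marker, group 3: the rest.
    local re='^([^][]*)(\[[^][]*\])(.*)$'
    while [[ $rest =~ $re ]]; do
        chunk=${BASH_REMATCH[1]}
        out+=${chunk:+$(base64 -d <<<"$chunk")}
        out+=${BASH_REMATCH[2]}
        rest=${BASH_REMATCH[3]}
    done
    # Whatever follows the last marker (or the whole line if there was none).
    out+=${rest:+$(base64 -d <<<"$rest")}
    printf '%s\n' "$out"
}

decode_mixed_line 'aGkgdGhlcmUK[_CONSTANT_]SGVsbG8gV29ybGQhCg=='
# prints: hi there[_CONSTANT_]Hello World!
```

Command substitution strips the trailing newline of each decoded chunk, which is what keeps the inline form on a single line.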

find multiple patterns in multiple files bash

I'm trying to find multiple patterns (I have a file of them) in multiple different files spread across a lot of subdirectories.
I'm trying to use exit codes to avoid outputting every pattern that is found (because I only need the ones which are NOT found), but exit codes don't work the way I understand them.
while read pattern; do
grep -q -n -r $pattern ./dir/
if [ $? -eq 0 ]; then
: #echo $pattern ' exists'
else
echo $pattern " doesn't exist"
fi
done <strings.tmp
You can use this in bash:
while read -r pattern; do
grep -F -q -r "$pattern" ./dir/ || echo $pattern " doesn't exist"
done < strings.tmp
Use read -r so that backslashes in the patterns are read literally
Quote "$pattern" to prevent word splitting and globbing
There is no need for -n since you're using the -q (quiet) flag
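Put together, the points above could look like the following function. It is a sketch, not the answer's exact code: report_missing is a hypothetical name, empty lines are skipped because an empty fixed-string pattern matches every line, and the pattern file is assumed to live outside the searched directory.

```bash
#!/usr/bin/env bash
# report_missing is a hypothetical helper.
# $1 = file with one fixed-string pattern per line, $2 = directory to search.
# Prints a line for every pattern that occurs nowhere under the directory.
report_missing() {
    local pattern
    while IFS= read -r pattern; do
        [[ -n $pattern ]] || continue
        # -F: fixed string, -q: exit status only, -r: recurse, --: end of options
        grep -F -q -r -- "$pattern" "$2" || printf "%s doesn't exist\n" "$pattern"
    done < "$1"
}

# Usage with the names from the question:
# report_missing strings.tmp ./dir/
```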
@anubhava's solution should work. If it doesn't for some reason, try the following:
while read -r pattern; do
lines=`grep -r "$pattern" ./dir/ | wc -l`
if [ $lines -eq 0 ]; then
echo $pattern " doesn't exist"
else
echo $pattern "exists"
fi
done < strings.tmp
