Using sed to remove prefix with slash from string - bash

I'm extracting a Jira issue from a string using sed. I want to get rid of prefixes too.
My possible prefixes are:
FEATURE_PREFIX="feature/"
BUGFIX_PREFIX="bugfix/"
I've tried three different ways to use sed despite the slash in the prefix but nothing seems to work.
First try:
export UNFIXED_ID=$(echo ${CI_MERGE_REQUEST_TITLE} | sed -e "s~^$BUGFIX_PREFIX~/" | sed -e "s~^$FEATURE_PREFIX~/")
export MERGE_REQUEST_JIRA_ID=$(echo ${UNFIXED_ID} | sed -r "s/^([A-Za-z][A-Za-z0-9]+-[0-9]+).*/\1/")
echo ${MERGE_REQUEST_JIRA_ID}
gives the error sed: unmatched '~'
Second try:
export UNFIXED_ID=$(echo ${CI_MERGE_REQUEST_TITLE} | sed -e "s~^$BUGFIX_PREFIX/~" | sed -e "s~^$FEATURE_PREFIX/~")
export MERGE_REQUEST_JIRA_ID=$(echo ${UNFIXED_ID} | sed -r "s/^([A-Za-z][A-Za-z0-9]+-[0-9]+).*/\1/")
echo ${MERGE_REQUEST_JIRA_ID}
gives the error sed: unmatched '~'
Third try:
export UNFIXED_ID=$(echo ${CI_MERGE_REQUEST_TITLE} | sed -e "s~^$BUGFIX_PREFIX~" | sed -e "s~^$FEATURE_PREFIX~")
export MERGE_REQUEST_JIRA_ID=$(echo ${UNFIXED_ID} | sed -r "s/^([A-Za-z][A-Za-z0-9]+-[0-9]+).*/\1/")
echo ${MERGE_REQUEST_JIRA_ID}
gives the error sed: unmatched '~'
As per this question Sed error : bad option in substitution expression I thought it was just a matter of replacing the / by ~
What am I failing to do here with the delimiter?

I apologise if I have misunderstood the question, but if the two variables are on their own lines in a file, then you can probably just search for the line and print it:
sed -n '/FEATURE_PREFIX/p' file | \
cut -d'=' -f2 | \
head -c -3 | \
sed s'#$#\"#'
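On the delimiter error itself: the s command always needs three delimiters (s~pattern~replacement~), even when the replacement is empty, and each attempt in the question supplies only two, which is what sed reports as unmatched '~'. A minimal sketch of the corrected pipeline follows. The merge request title is a made-up value for illustration, and -E is used in place of -r on the assumption that your sed accepts it.

```shell
#!/bin/bash
FEATURE_PREFIX="feature/"
BUGFIX_PREFIX="bugfix/"
# Hypothetical title, only for this example.
CI_MERGE_REQUEST_TITLE="feature/ABC-123-add-login"

# s~pattern~replacement~ : the empty replacement still needs its closing "~".
UNFIXED_ID=$(printf '%s\n' "$CI_MERGE_REQUEST_TITLE" |
  sed -e "s~^$BUGFIX_PREFIX~~" -e "s~^$FEATURE_PREFIX~~")

# Keep only the leading issue key (letters/digits, a dash, digits).
MERGE_REQUEST_JIRA_ID=$(printf '%s\n' "$UNFIXED_ID" |
  sed -E 's/^([A-Za-z][A-Za-z0-9]+-[0-9]+).*/\1/')

echo "$MERGE_REQUEST_JIRA_ID"
```

If the prefixes are fixed strings, the shell's own ${CI_MERGE_REQUEST_TITLE#"$FEATURE_PREFIX"} expansion may be enough and avoids the delimiter question altogether.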

Related

How do I delete lines from my bash history matching a specific pattern?

I can get a list of the line numbers matching a specific pattern such as containing the word "function".
history | grep function | sed -e 's/^\(.\{5\}\).*/\1/' | sed 's/^ *//g'
If I pass that to history -d, it says bad pattern. I don't know whether that is because it is a list, or because the values are strings rather than numbers.
history -d (history | grep function | sed -e 's/^\(.\{5\}\).*/\1/' | sed 's/^ *//g')
Quick answer:
while read n; do history -d $n; done < <(history | tac | awk '/function/{print $1}')
Explanation:
The history command accepts only a single offset when using the -d flag. On top of that, deleting an entry renumbers all the commands after it. For this reason we reverse the output of history using tac and process the lines from last to first. The short awk line just replaces the grep and sed commands to pick up the history offset.
We do not use a full pipeline as this creates subshells and history -d $n would not work properly. This is nicely explained in: Why can't I delete multiple entries from bash history with this loop
Note: If you want to push this to your history file ($HISTFILE), you have to use history -w
Warning: When you have multiline commands in your history the story becomes very complicated and strongly depends on various options that have been set. See [U&L] When is a multiline history entry (aka lithist) in bash possible? for the nasty bits.
You can delete one history entry or a range of entries, but not a list. Your matches are likely to be spread out, so the range option is out.
The multiple sed commands to extract the history offsets can be simplified into one:
sed -E 's/^ *([0-9]*).*$/\1/'
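To see what that expression extracts, here is a small sketch run against two made-up lines in the shape that history prints. The sample text and variable names are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical history output: right-aligned offset, then the command.
sample='  741  source foo.sh
 1077  function bar'

# Strip leading spaces, capture the digits, drop the rest of the line.
offsets=$(printf '%s\n' "$sample" | sed -E 's/^ *([0-9]*).*$/\1/')

printf '%s\n' "$offsets"
```

Each input line is reduced to its offset, one per line, which is the form that history -d needs when it is fed one number at a time.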
One problem with history is that it can have multiline entries, like:
741 source <(history | \
grep function | \
sed -E 's/^ *([0-9]*).*$/\1/' | \
sort -rn | \
xargs -n1 echo history -d)
If your grep matches on function above, your sed will not be able to extract the history offset number, because the matching line is a continuation line without an offset. One way around this is to join continuation lines onto the line that carries the offset, so that every entry sits on a single line. There is probably a simpler way, but this works:
awk '/^ {0,4}[0-9]+/ {
printf("\n%s",$0);
}
!/^ {0,4}[0-9]+/{
printf(" %s",$0);
}
END{
printf("\n")
}'
We can then produce a number of history -d commands with xargs. xargs can't run the built-in history directly, so I've just used it to produce input to the built-in source using process substitution:
source <(history | \
awk '/^ {0,4}[0-9]+/ {
printf("\n%s",$0);
}
!/^ {0,4}[0-9]+/{
printf(" %s",$0);
}
END{
printf("\n")
}' | \
grep function | \
sed -E 's/^ *([0-9]*).*$/\1/' | \
sort -rn | \
xargs -n1 echo history -d)
@kvantour gives nice alternatives to grep + sed + sort -rn. Using those, my above blob could be simplified into:
source <(history | \
awk '/^ {0,4}[0-9]+/ {
printf("\n%s",$0);
}
!/^ {0,4}[0-9]+/{
printf(" %s",$0);
}
END{
printf("\n")
}' | \
awk '/function/ {print "history -d",$1}' | \
tac)
You need to store the pattern in a variable and then pass it to history.
$ history | grep function | sed -e 's/^\(.\{5\}\).*/\1/' | sed 's/^ *//g'
1077
$ var=$( history | grep function | sed -e 's/^\(.\{5\}\).*/\1/' | sed 's/^ *//g')
$ history -d $var
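However, as you can have many occurrences of the pattern, I would use a loop. The offsets are sorted in descending order so that each deletion does not renumber the entries still to be deleted:
$ var=$( history | grep function | sed -e 's/^\(.\{5\}\).*/\1/' | sed 's/^ *//g' | sort -rn)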
$ for i in $var
> do
> history -d $i
> history -w
> done
If the line you want to delete has already been written to your $HISTFILE (which typically happens when you end a session by default), you will need to write back to $HISTFILE, or the line will reappear when you open a new session.
After the deletion you need to reload .bashrc by executing
$ cd
$ source .bashrc
However, there are cases where the lines won't be deleted: if PROMPT_COMMAND is set to history -a, each command is appended to the history file as soon as it runs, rather than on exit as in the default configuration.

output of sed gives strange result when using capture groups

I'm doing the following command in a bash:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -rn 's#^URL: \^/tags/([^/]+)/#\1#p'
I think this should output only the matching lines and the content of the capture group, so I'm expecting 0.0.0 as the result. But I'm getting 0.0.0abcd.
Why does the output contain the parts on both the left and the right side of the /? What am I doing wrong?
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' |
sed -rn 's#^URL: \^/tags/([^/]+)/#\1#p'
echo outputs two lines:
UNUSED
URL: ^/tags/0.0.0/abcd
The regular expression given to sed does not match the first line, so this line is not printed. The regular expression matches the second line, so URL: ^/tags/0.0.0/ is replaced with 0.0.0; only the matched part of the line is replaced, so abcd is passed unchanged.
To obtain the desired output you must also match abcd, for example with
sed -rn 's#^URL: \^/tags/([^/]+)/.*#\1#p'
where the .* eats all characters to the end of the line.
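The difference can be checked side by side. This is a sketch using the input from the question, with printf standing in for echo -e and -E standing in for -r (an assumption that your sed accepts -E); the variable names are made up.

```shell
#!/bin/bash
input='UNUSED
URL: ^/tags/0.0.0/abcd'

# Without .*: only "URL: ^/tags/0.0.0/" is replaced, so "abcd" survives.
partial=$(printf '%s\n' "$input" | sed -nE 's#^URL: \^/tags/([^/]+)/#\1#p')

# With .*: the match runs to the end of the line, so only the group remains.
whole=$(printf '%s\n' "$input" | sed -nE 's#^URL: \^/tags/([^/]+)/.*#\1#p')

printf 'partial: %s\nwhole:   %s\n' "$partial" "$whole"
```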
You can use awk:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | awk -F/ 'index($0, "^/tags/"){print $3}'
0.0.0
This awk command uses / as the field delimiter and prints the 3rd field when the input line contains the text ^/tags/.
Alternatively, you can use gnu grep:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | grep -oP '^URL: \^/tags/\K([^/]+)'
0.0.0
Or this sed:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -nE 's~^URL: \^/tags/([^/]+).*~\1~p'
0.0.0
This sed command also produces your desired output:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -E '/URL/!d;s#.*/(.*)/[^/]*#\1#'

replacing spaces and brackets in a string + sed + is there a better way than this?

I'm trying to replace the spaces with underscores and remove the brackets in the string: this is just a (test)
I do the following:
echo "this is just a (test)" | sed -e 's/ /_/g' | sed -e 's/(//g' | sed -e 's/)//g'
And this gives me:
this_is_just_a_test
Is there a better or shorter way of writing it in sed?
You can achieve the same thing using tr:
echo "this is just a (test)" | tr \ _ | tr -d \(\)
The first tr replaces spaces with underscores and the second one deletes all parentheses.
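If you would rather stay with sed, the three calls can be folded into a single invocation: several -e expressions run in order on each line, and a bracket expression handles both parentheses at once. A minimal sketch (the variable name is made up):

```shell
#!/bin/bash
# One sed process: spaces become underscores, then ( and ) are removed.
result=$(echo "this is just a (test)" | sed -e 's/ /_/g' -e 's/[()]//g')

echo "$result"
```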

Bash script builds correct $cmd but fails to execute complex stream

This short script scrapes some log files daily to create a simple extract. It works from the command line, and when I echo the $cmd and copy/paste it, it also works. But it breaks when I try to execute it from the script itself.
I know this is a nightmare of patterns that I could probably improve, but am I missing something simple to just execute this correctly?
#!/bin/bash
priorday=$(date --date yesterday +"%Y-%m-%d")
outputfile="/home/CCHCS/da14/$priorday""_PROD_message_processing_times.txt"
cmd="grep 'Processed inbound' /home/rules/care/logs/RootLog* | cut -f5,6,12,16,18 -d\" \" | grep '^"$priorday"' | sed 's/\,/\./' | sed 's/ /\t/g' | sed -r 's/([0-9]+\-[0-9]+\-[0-9]+)\t/\1 /' | sed 's/ / /g' | sort >$outputfile"
printf "command to execute:\n"
echo $cmd
printf "\n"
$cmd
output:
./make_log_extract.sh command to execute: grep 'Processed inbound' /home/rules/care/logs/RootLog.log /home/rules/care/logs/RootLog.log.1
/home/rules/care/logs/RootLog.log.10
/home/rules/care/logs/RootLog.log.11
/home/rules/care/logs/RootLog.log.12
/home/rules/care/logs/RootLog.log.2
/home/rules/care/logs/RootLog.log.3
/home/rules/care/logs/RootLog.log.4
/home/rules/care/logs/RootLog.log.5
/home/rules/care/logs/RootLog.log.6
/home/rules/care/logs/RootLog.log.7
/home/rules/care/logs/RootLog.log.8
/home/rules/care/logs/RootLog.log.9 | cut -f5,6,12,16,18 -d" " | grep
'^2014-01-30' | sed 's/\,/./' | sed 's/ /\t/g' | sed -r
's/([0-9]+-[0-9]+-[0-9]+)\t/\1 /' | sed 's/ / /g' | sort
/home/CCHCS/da14/2014-01-30_PROD_message_processing_times.txt
grep: 5,6,12,16,18: No such file or directory
As grebneke comments, do not store the command and then execute it.
What you can do instead is run the pipeline directly and have bash print it before executing it, as described in: Bash: Print each command before executing?
priorday=$(date --date yesterday +"%Y-%m-%d")
outputfile="/home/CCHCS/da14/$priorday""_PROD_message_processing_times.txt"
set -o xtrace # <-- set printing mode "on"
grep 'Processed inbound' /home/rules/care/logs/RootLog* | cut -f5,6,12,16,18 -d" " | grep "^$priorday" | sed 's/\,/\./' | sed 's/ /\t/g' | sed -r 's/([0-9]+\-[0-9]+\-[0-9]+)\t/\1 /' | sed 's/ / /g' | sort >"$outputfile"
set +o xtrace # <-- revert to normal
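The reason the stored command fails: when $cmd is expanded, the shell only splits the string into words. It does not parse it again, so |, the quotes, and > all reach the first command as literal arguments. That is consistent with the error in the question, where grep apparently received -f5,6,12,16,18 as one of its own options. A small sketch of the effect, and of a function as one way to keep a reusable pipeline (the names here are made up for illustration):

```shell
#!/bin/bash
# A pipeline stored in a string is NOT re-parsed when expanded.
cmd="echo hello | tr a-z A-Z"
from_string=$($cmd)        # echo receives "|", "tr", "a-z", "A-Z" as plain words
printf '%s\n' "$from_string"

# A function keeps the pipeline as real shell syntax.
run_pipeline() {
  echo hello | tr a-z A-Z
}
from_function=$(run_pipeline)
printf '%s\n' "$from_function"
```

The first line printed is the whole string echoed back, pipe included; only the function version actually runs tr.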

Using sed to replace a string with the contents of a variable, even if it's an escape character

I'm using
sed -e "s/\*DIVIDER\*/$DIVIDER/g" to replace *DIVIDER* with a user-specified string, which is stored in $DIVIDER. The problem is that I want them to be able to specify escape characters as their divider, like \n or \t. When I try this, I just end up with the letter n or t, or so on.
Does anyone have any ideas on how to do this? It will be greatly appreciated!
EDIT: Here's the meat of the script, I must be missing something.
curl --silent "$URL" > tweets.txt
if [[ `cat tweets.txt` == *\<error\>* ]]; then
grep -E '(error>)' tweets.txt | \
sed -e 's/<error>//' -e 's/<\/error>//' |
sed -e 's/<[^>]*>//g' |
head $headarg | sed G | fmt
else
echo $REPLACE | awk '{gsub(".", "\\\\&");print}'
grep -E '(description>)' tweets.txt | \
sed -n '2,$p' | \
sed -e 's/<description>//' -e 's/<\/description>//' |
sed -e 's/<[^>]*>//g' |
sed -e 's/\&amp\;/\&/g' |
sed -e 's/\&lt\;/\</g' |
sed -e 's/\&gt\;/\>/g' |
sed -e 's/\&quot\;/\"/g' |
sed -e 's/\&....\;/\?/g' |
sed -e 's/\&.....\;/\?/g' |
sed -e 's/^ *//g' |
sed -e :a -e '$!N;s/\n/\*DIVIDER\*/;ta' | # Replace newlines with *divider*.
sed -e "s/\*DIVIDER\*/${DIVIDER//\\/\\\\}/g" | # Replace *DIVIDER* with the actual divider.
head $headarg | sed G
fi
The long list of sed lines are replacing characters from an XML source, and the last two are the ones that are supposed to replace the newlines with the specified character. I know it seems redundant to replace a newline with another newline, but it was the easiest way I could come up with to let them pick their own divider. The divider replacement works great with normal characters.
You can use bash to escape the backslash like this:
sed -e "s/\*DIVIDER\*/${DIVIDER//\\/\\\\}/g"
The syntax is ${name/pattern/string}. If pattern begins with /, every occurrence of pattern in name is replaced by string. Otherwise only the first occurrence is replaced.
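A quick way to see what that expansion does is to print the result before handing it to sed. This is only a sketch with a made-up divider value. Note that doubling the backslash makes sed insert a literal backslash followed by t, which may or may not be the behaviour you are after.

```shell
#!/bin/bash
DIVIDER='\t'                      # two characters: backslash and t

# Double every backslash so sed sees "\\" (a literal backslash).
escaped=${DIVIDER//\\/\\\\}
printf '%s\n' "$escaped"

# The replacement is now a literal backslash followed by "t".
result=$(echo 'a*DIVIDER*b' | sed -e "s/\*DIVIDER\*/${escaped}/g")
printf '%s\n' "$result"
```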
Maybe:
case "$DIVIDER" in
(*\\*) DIVIDER=$(echo "$DIVIDER" | sed 's/\\/\\\\/g');;
esac
I played with this script:
for DIVIDER in 'xx\n' 'xxx\\ddd' "xxx"
do
echo "In: <<$DIVIDER>>"
case "$DIVIDER" in (*\\*) DIVIDER=$(echo "$DIVIDER" | sed 's/\\/\\\\/g');;
esac
echo "Out: <<$DIVIDER>>"
done
Run with 'ksh' or 'bash' (but not 'sh') on MacOS X:
In: <<xx\n>>
Out: <<xx\\n>>
In: <<xxx\\ddd>>
Out: <<xxx\\\\ddd>>
In: <<xxx>>
Out: <<xxx>>
It seems to be a simple substitution:
$ d='\n'
$ echo "a*DIVIDER*b" | sed "s/\*DIVIDER\*/$d/"
a
b
Maybe I don't understand what you're trying to accomplish.
Then maybe this step could take the place of the last two of yours:
sed -n ":a;$ {s/\n/$DIVIDER/g;p;b};N;ba"
Note the space after the dollar sign. It prevents the shell from interpreting "${s..." as a variable name.
And as ghostdog74 suggested, you have way too many calls to sed. You may be able to change a lot of the pipe characters to backslashes (line continuation) and delete "sed" from all but the first one (leave the "-e" everywhere). (untested)
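As a rough illustration of collapsing several sed calls into one, here is a sketch that decodes a few of the entities from the question in a single process. The input string is made up, and &amp; is handled last so that already-decoded text is not decoded a second time.

```shell
#!/bin/bash
# Hypothetical input containing some XML entities.
input='&lt;b&gt; Tom &amp; Jerry &quot;hi&quot;'

# One sed, several -e expressions, applied in order to each line.
# In the replacement, \& is a literal ampersand (a bare & would mean "the match").
decoded=$(printf '%s\n' "$input" |
  sed -e 's/&lt;/</g' \
      -e 's/&gt;/>/g' \
      -e 's/&quot;/"/g' \
      -e 's/&amp;/\&/g')

printf '%s\n' "$decoded"
```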
You just need to escape the escape char.
\n will match \n
\ will match \
\\ will match \
Using FreeBSD sed (e.g. on Mac OS X) you have to preprocess the $DIVIDER user input:
d='\n'
d='\t'
NL=$'\\\n'
TAB=$'\\\t'
d="${d/\\n/${NL}}"
d="${d/\\t/${TAB}}"
echo "a*DIVIDER*b" | sed -E -e "s/\*DIVIDER\*/${d}/"

Resources