I have a question about the "sed" statement in bash (macOS)

I'm just starting to play around with (bash/zsh) shell scripting in macOS. I'm experiencing an issue with the "sed" statement:
I have set the following variable
var1=MACBOOKPRO
when executing the following terminal command:
echo $var1 | sed -- 's/IMAC/IM/g; s/IMACPRO/IMP/g; s/MACMINI/MM/g; s/MACMINIPRO/MMP/g; s/MACBOOK/MB/g; s/MACBOOKAIR/MBA/g; s/MACBOOKPRO/MBP/g; s/MACPRO/MP/g; s/POWERMAC/PM/g'
The following output is produced: MBPRO (which is not what I want)
But when using the following command:
echo $var1 | sed -- 's/MACBOOKPRO/MBP/g; s/IMAC/IM/g; s/IMACPRO/IMP/g; s/MACMINI/MM/g; s/MACMINIPRO/MMP/g; s/MACPRO/MP/g; s/MACBOOKAIR/MBA/g; s/MACBOOK/MB/g; s/POWERMAC/PM/g'
I get a different (but correct) output: MBP
Does anyone have a clue why this is happening?
Thanks in advance for your reply
Charly

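Because sed performs the replacements in the order you specify them. Your variable matches both /MACBOOK/ and /MACBOOKPRO/. When the shorter pattern is earlier in the list, it replaces the MACBOOK prefix with MB, so by the time sed reaches s/MACBOOKPRO/MBP/g the text is already MBPRO and that pattern no longer matches.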
When you have overlapping patterns like this, you should put the longer ones first.
Another option is to change your patterns so they're anchored to match the entire string, using ^ and $:
echo $var1 | sed -- 's/^IMAC$/IM/g; s/^IMACPRO$/IMP/g; s/^MACMINI$/MM/g; s/^MACMINIPRO$/MMP/g; s/^MACBOOK$/MB/g; s/^MACBOOKAIR$/MBA/g; s/^MACBOOKPRO$/MBP/g; s/^MACPRO$/MP/g; s/^POWERMAC$/PM/g'

s/MACBOOK/MB/g; turns MACBOOKPRO into MBPRO.
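The order effect can be reproduced with just the two overlapping patterns. This is a minimal sketch; the variable names short_first, long_first and anchored are made up for illustration and are not part of the original question.

```shell
#!/usr/bin/env bash
var1=MACBOOKPRO

# Shorter pattern first: MACBOOK is rewritten before MACBOOKPRO is tried.
short_first=$(echo "$var1" | sed 's/MACBOOK/MB/g; s/MACBOOKPRO/MBP/g')

# Longer pattern first: the full name is consumed before the prefix rule runs.
long_first=$(echo "$var1" | sed 's/MACBOOKPRO/MBP/g; s/MACBOOK/MB/g')

# Anchored patterns: order no longer matters, only whole-line matches count.
anchored=$(echo "$var1" | sed 's/^MACBOOK$/MB/; s/^MACBOOKPRO$/MBP/')

echo "$short_first"   # MBPRO
echo "$long_first"    # MBP
echo "$anchored"      # MBP
```

Note that the anchored form only helps when the variable holds exactly one model name and nothing else.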

Related

How to remove everything from path after a word using sed in bash/ shell script?

The word is repeated in the path, and I only want to remove what follows one specific occurrence of it.
Let's say my path is /opt/xyz/config/config.xml
The result I want after using sed is: /opt/xyz/config/
How can this be obtained?
I am sick of using sed 's/config.*//', which matches from the first "config" onward and so removes both config words,
so the result looks like
/opt/xyz/
I have tried using this in multiple ways
> sed 's/config.*//'
Maybe with something like this?
sed 's/[^/]*$//'
But if the filepath is in a shell variable then you might as well use:
mydir=${myfilepath%/*}/
Another potential solution, depending on your use case, is the dirname utility (an external command, not a bash function), which returns the directory without a trailing slash, e.g.
dirname /opt/xyz/config/config.xml
/opt/xyz/config
Using BASH parameter expansion:
p="/opt/xyz/config/config.xml"; echo "${p%/*}"
/opt/xyz/config
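The three approaches above can be compared side by side. This is a small sketch using the path from the question; the variable names with_sed, with_expansion and with_dirname are hypothetical. The trailing slash is added by hand for the last two, since neither ${p%/*} nor dirname keeps it.

```shell
#!/usr/bin/env bash
p="/opt/xyz/config/config.xml"

# sed: delete the final run of non-slash characters, keeping the trailing slash.
with_sed=$(printf '%s\n' "$p" | sed 's/[^/]*$//')

# Parameter expansion: strip the shortest suffix matching "/*", then re-add "/".
with_expansion="${p%/*}/"

# dirname: external utility that drops the last path component.
with_dirname="$(dirname "$p")/"

echo "$with_sed"         # /opt/xyz/config/
echo "$with_expansion"   # /opt/xyz/config/
echo "$with_dirname"     # /opt/xyz/config/
```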

Is it possible to resolve SC2001 ("See if you can use ${variable//search/replace} instead") while using a position variable?

I'm looking for a one liner to replace any character in a variable string at a variable position with a variable substitute. I came up with this working solution:
echo "$string" | sed "s/./${replacement}/${position}"
An example usage:
string=aaaaa
replacement=b
position=3
echo "$string" | sed "s/./${replacement}/${position}"
aabaa
Unfortunately, when I run shellcheck with a script which contains my current solution it tells me:
SC2001: See if you can use ${variable//search/replace} instead.
I'd like to use parameter expansion like it's suggesting instead of piping to sed, but I'm unclear as to the proper formatting when using a position variable. The official documentation doesn't seem to discuss positioning within strings at all.
Is this possible?
Bash doesn't have a general-case replacement for all sed facilities (the shellcheck wiki page for warning SC2001 acknowledges as much), but in some specific scenarios -- including the case posed -- parameter expansions can be combined to achieve the desired effect:
string=aaaaa
replacement=b
position=3
echo "${string:0:$(( position - 1 ))}${replacement}${string:position}"
Here, we're splitting the value up into substrings: ${string:0:$(( position - 1 ))} is the text preceding the content to be replaced, and ${string:position} is the text following that point.
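If this is needed in more than one place, the expansion could be wrapped in a small helper. The function name replace_at and its argument order are assumptions made for this sketch, and it relies on bash substring expansion, so it will not work in a plain POSIX sh. Positions are 1-based, as in the sed version.

```shell
#!/usr/bin/env bash

# replace_at STRING POSITION REPLACEMENT (hypothetical helper)
# Replaces the single character at 1-based POSITION with REPLACEMENT.
replace_at() {
    local string=$1 position=$2 replacement=$3
    # Text before the position, then the replacement, then the text after it.
    printf '%s\n' "${string:0:position-1}${replacement}${string:position}"
}

replace_at aaaaa 3 b    # aabaa
replace_at hello 1 J    # Jello
```

The sketch does not validate that the position lies within the string.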

Running sed ON a variable in bash script

Apologies for a seemingly inane question. But I have spent the whole day trying to figure it out and it drives me up the walls. I'm trying to write a seemingly simple bash script that would take a list of files in the directory from ls, replace part of the file names using sed, get unique names from the list and pass them onto some command. Like so:
inputs=`ls *.ext`
echo $inputs
test1_R1.ext test1_R2.ext test2_R1.ext test2_R2.ext
Now I would like to put it through sed to replace 1.ext and 2.ext with * to get test1_R* etc. Then I'd like to remove resulting duplicates by running sort -u to arrive to the following $outputs variable:
echo $outputs
test1_R* test2_R*
And pass this onto a command, like so
cat $outputs
I can do something like this in a command line:
ls *.ext | sed s/..ext/\*/g | sort -u
But if I try to assign the above to a variable in the script, it just returns the output from ls. I have tried several ways to do it: including the whole pipe in the script; running each command separately, assigning the result to a variable, and passing that variable to the next command; and writing the outputs to files, then passing each file to the next command. So far none of this has achieved what I was aiming for. I think my problem (apart from general cluelessness around bash scripting) is my inability to run sed on a variable within a script. There seems to be a lot of advice around on how to pass variables to the pattern or replacement string in sed, but all of it seems to take files as input. I understand that this might not be the proper way of doing it anyway, so I would really appreciate it if someone could suggest an elegant way to achieve what I'm trying to do.
Many thanks!
Update 2/06/2014
Hi Barmar, thanks for your answer. Can't say it solved the problem, but it helped pin-pointing it. Seems like the problem is in me using the asterisk. I have to say, I'm very puzzled. The actual file names I've got are:
test1_R1.fastq.gz test1_R2.fastq.gz test2_R1.fastq.gz test2_R2.fastq.gz
If I use the code you suggested, which seems to me the right way to do it:
ins=$(ls *.fastq.gz | sed 's/..fastq.gz/\*/g' | sort -u)
Sed doesn't seem to do anything and I'm getting the output of ls:
test1_R1.fastq.gz test1_R2.fastq.gz test2_R1.fastq.gz test2_R2.fastq.gz
Now if I replace that backslash with anything else, sed works, but it also returns whatever character I put in front of (or after) the asterisk:
ins=$(ls *.fastq.gz | sed 's/..fastq.gz/"*/g' | sort -u)
test1_R"* test2_R"*
That's odd enough, but surely I can just put an "R" in front of the asterisk and then replace R in the search pattern string, right? Wrong! Whichever way I do that ('s/R..fastq.gz/R*/g', 's/...fastq.gz/R*/g' or 's/[A-Z]..fastq.gz/R*/g'), I'm back to the original names! And even if I end up with something like test1_RR* test2_RR* and try to run it through sed again to replace "_R" with "_" or "RR" with "R", I have no luck and I'm back to the original names. And yet I can replace the rest of the file name without any problem, just not in a way that gets me the test1_R* I need.
I have a feeling I should be escaping that * in some very clever way, but nothing I've tried seems to work. Thanks again for your help!
This is how you capture the result of the whole pipeline in a variable:
var=$(ls *.ext | sed s/..ext/\*/g | sort -u)
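The symptom described in the update (sed "doing nothing") is consistent with the shell re-expanding the asterisks when the variable is printed unquoted: test1_R* is itself a glob that matches the original files. The sketch below assumes that is the cause; it creates throwaway files in a temporary directory, and the names tmp, orig_dir, unquoted and quoted are made up for the demonstration.

```shell
#!/usr/bin/env bash
orig_dir=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch test1_R1.ext test1_R2.ext test2_R1.ext test2_R2.ext

# Quoting the sed script keeps the shell away from the asterisk.
outputs=$(ls *.ext | sed 's/.\.ext$/*/' | sort -u)

# Unquoted: the shell expands test1_R* and test2_R* back into file names.
unquoted=$(echo $outputs)
# Quoted: the variable's real contents are shown.
quoted=$(echo "$outputs")

echo "$unquoted"   # test1_R1.ext test1_R2.ext test2_R1.ext test2_R2.ext
echo "$quoted"     # test1_R* and test2_R* on separate lines

cd "$orig_dir" && rm -rf "$tmp"
```

So the variable does hold test1_R* and test2_R*, and a command such as cat $outputs relies on exactly that unquoted expansion to reach the files.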

Problem running commands in shell (bash) with arguments as variables

(Sorry for the confusion. Previous $ sign occurred when I tried to simplify the actual problem. Thanks for correcting the question)
I wanted to split a directory name on underscores (ex: dir_to_split="my_test_dir") like this:
my_dir=($dir_to_split)
var=$(echo $my_dir | awk -F"_" '{print $1,$2,$3}')
set -- $var
splited_1=$1
splited_2=$2
splited_3=$3
Now, using these splited_x variables causes errors, e.g.
myprograme $splited_1 $splited_2 $splited_3
Can anyone please help me with this ? Thank you....
(Rewritten after updated question.)
What kind of errors do you get? I find it useful to add set -x to the top of my shell scripts when debugging, this lets the shell print all commands it executes so you can pinpoint the line where problems begin.
Are you sure that $dir_to_split is actually set? Does it contain spaces or tabs? Does it contain two underscores? I don't see any other problems right now.
There are in-shell methods of splitting a variable such as:
dir="my_test_dir"
OIFS="$IFS"
IFS="_"
set -- $dir
IFS="$OIFS"
See also this SO question.
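In bash specifically, the same split can be done without saving and restoring IFS by hand, using read with a temporary IFS. This is a sketch that assumes bash (arrays and here-strings are not available in plain sh); the array name parts is hypothetical.

```shell
#!/usr/bin/env bash
dir_to_split="my_test_dir"

# IFS=_ applies only to this read command; -a stores the fields in an array.
IFS=_ read -r -a parts <<< "$dir_to_split"

echo "${#parts[@]}"   # 3
echo "${parts[0]}"    # my
echo "${parts[1]}"    # test
echo "${parts[2]}"    # dir
```

When passing the pieces on to a program, quoting each one ("${parts[0]}" and so on) avoids further word splitting.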

strip version from package name using Bash

I'm trying to strip the version out of a package name using only Bash. I have one solution, but I don't think it's the best one available, so I'd like to know if there's a better way to do it. By better I mean cleaner and easier to understand.
Suppose I have the string "my-program-1.0" and I want only "my-program". My current solution is:
#!/bin/bash
PROGRAM_FULL="my-program-1.0"
INDEX_OF_LAST_CHARACTER=`awk '{print match($0, "[A-Za-z0-9]-[0-9]")}' <<< "$PROGRAM_FULL"`
PROGRAM_NAME=`cut -c -$INDEX_OF_LAST_CHARACTER <<< "$PROGRAM_FULL"`
actually, the "package name" syntax is an RPM file name, if it matters.
thanks!
Pretty well-suited to sed:
# Using your matching criterion (first hyphen with a number after it)
PROGRAM_NAME=$(echo "$PROGRAM_FULL" | sed 's/-[0-9].*//')
# Using a stronger match
PROGRAM_NAME=$(echo "$PROGRAM_FULL" | sed 's/-[0-9]\+\(\.[0-9]\+\)*$//')
The second match ensures that the version number is a sequence of numbers separated by dots (e.g. X, X.X, X.X.X, ...).
Edit: So there are comments all over based on the fact that the notion of version number isn't very well-defined. You'll have to write a regex for the input you expect. Hopefully you won't have anything as awful as "program-name-1.2.3-a". Absent any additional request from the OP though, I think all the answers here are good enough.
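One portability caveat on the stronger match: \+ is a GNU extension to basic regular expressions and is not specified by POSIX, so it may not behave the same in other sed implementations. A sketch of the same idea using extended regular expressions via sed -E follows; the function name strip_version is made up for illustration.

```shell
#!/usr/bin/env bash

# strip_version NAME (hypothetical helper)
# Removes a trailing "-N", "-N.N", "-N.N.N", ... from NAME, if present.
strip_version() {
    printf '%s\n' "$1" | sed -E 's/-[0-9]+(\.[0-9]+)*$//'
}

strip_version "my-program-1.0"     # my-program
strip_version "alsa-lib-1.0.17"    # alsa-lib
strip_version "my-program"         # my-program (no version, left unchanged)
```

Because the pattern is anchored at the end of the string, a full RPM file name such as alsa-lib-1.0.17-1.el5.i386.rpm is left untouched by this sketch; it would need a regex written for that format.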
Bash:
program_full="my-program-1.0"
program_name=${program_full%-*} # remove the last hyphen and everything after
Produces "my-program"
Or
program_full="alsa-lib-1.0.17-1.el5.i386.rpm"
program_name=${program_full%%-[0-9]*} # remove the first hyphen followed by a digit and everything after
Produces "alsa-lib"
How about:
$ echo my-program-1.0 | perl -pne 's/-[0-9]+(\.[0-9]+)+$//'
my-program

Resources