Removing substring out of string using sed - bash

I am trying to remove a substring from a variable using sed like this:
PRINT_THIS="`echo "$fullpath" | sed 's/${rootpath}//' -`"
where
fullpath="/media/some path/dir/helloworld/src"
rootpath="/media/some path/dir"
I want to echo just the rest of fullpath, like this (I am using this on a whole bunch of directories, so I need to store it in a variable and do it automatically):
echo "helloworld/src"
using variable it would be
echo "Directory: $PRINT_THIS"
The problem is that I cannot get sed to remove the substring. What am I doing wrong? Thanks

You don't need sed for that, bash alone is enough:
$ fullpath="/media/some path/dir/helloworld/src"
$ rootpath="/media/some path/dir"
$ echo ${fullpath#${rootpath}}
/helloworld/src
$ echo ${fullpath#${rootpath}/}
helloworld/src
$ rootpath=unrelated
$ echo ${fullpath#${rootpath}/}
/media/some path/dir/helloworld/src
Check out the String manipulation documentation.
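As a minimal sketch, the same expansion can be wrapped in a small function. The name strip_root is hypothetical and not part of the answer above. Quoting the inner expansion makes bash match the root literally instead of treating it as a glob pattern, and the path is printed unchanged when the root is not a prefix.

```shell
# Requires bash (uses "local").
# strip_root FULLPATH ROOTPATH   (hypothetical helper name)
# Prints FULLPATH with a leading "ROOTPATH/" removed; prints FULLPATH
# unchanged if it does not start with that prefix.
strip_root() {
    local fullpath=$1 rootpath=$2
    # The inner quotes force a literal match, so characters such as
    # * or ? in the root are not interpreted as glob wildcards.
    printf '%s\n' "${fullpath#"$rootpath"/}"
}

strip_root "/media/some path/dir/helloworld/src" "/media/some path/dir"
strip_root "/media/some path/dir/helloworld/src" "unrelated"
```

The first call prints helloworld/src; the second prints the full path untouched, matching the behaviour shown in the transcript above.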

To use a variable in sed, write the command like this:
sed "s#$variable##g" FILE
Two things to note:
I use double quotes (the shell doesn't expand variables inside single quotes)
I use a different separator that doesn't conflict with the slashes in your paths
Ex:
$ rootpath="/media/some path/dir"
$ fullpath="/media/some path/dir/helloworld/src"
$ echo "$fullpath"
/media/some path/dir/helloworld/src
$ echo "$fullpath" | sed "s#$rootpath##"
/helloworld/src
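One caveat with interpolating a variable into a sed expression is that sed treats its contents as a regular expression, so a dot in the path matches any character. The sketch below escapes the variable first; the function name strip_prefix_sed is hypothetical, and the escaping assumes basic regular expressions with # as the delimiter.

```shell
# Requires bash (uses "local").
# strip_prefix_sed FULLPATH ROOTPATH   (hypothetical helper name)
strip_prefix_sed() {
    local fullpath=$1 rootpath=$2 escaped
    # Backslash-escape the BRE metacharacters ] [ \ . * ^ $ and the
    # # delimiter so the root is matched literally.
    escaped=$(printf '%s\n' "$rootpath" | sed 's/[][\.*^$#]/\\&/g')
    # Anchor with ^ so only a leading occurrence is removed.
    printf '%s\n' "$fullpath" | sed "s#^$escaped##"
}

strip_prefix_sed "/media/some path/dir/helloworld/src" "/media/some path/dir"
```

This prints /helloworld/src, the same result as the plain sed command above, while leaving paths alone that only match when the dot is read as a wildcard.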

Related

Extracting a substring until and including a matching word using bash tools

I have file names like these:
func/sub-01_task-biommtloc_run-01_bold_space-T1w_preproc.nii.gz
func/sub-01_task-pfobloc_run-01_bold_space-T1w_preproc.nii.gz
func/sub-01_task-rest_run-01_bold_space-T1w_preproc.nii.gz
and from each file name I want to extract the part until and including the word bold so that in the end I have:
func/sub-01_task-biommtloc_run-01_bold
func/sub-01_task-pfobloc_run-01_bold
func/sub-01_task-rest_run-01_bold
Any ideas how to do that?
The easiest thing to do is to remove bold and everything after it, then append bold again. Obviously, this only works if the terminating string is fixed, as it is in this case.
$ f=func/sub-01_task-biommtloc_run-01_bold_space-T1w_preproc.nii.gz
$ echo "${f%%bold*}"
func/sub-01_task-biommtloc_run-01_
$ echo "${f%%bold*}bold"
func/sub-01_task-biommtloc_run-01_bold
Is something like this what you want?
echo func/sub-01_task-biommtloc_run-01_bold_space-T1w_preproc.nii.gz | sed -e 's#bold_.*$#bold#'
Hope this helps
This is (needlessly) clever: remove the prefix ending with "bold"
and then do some substring index arithmetic based on the length of the suffix that's left over:
$ file=func/sub-01_task-biommtloc_run-01_bold_space-T1w_preproc.nii.gz
$ tmp=${file#*bold}
$ keep=${file:0:${#file}-${#tmp}}
$ echo "$keep"
func/sub-01_task-biommtloc_run-01_bold
If $file does not contain "bold", then $keep will be empty: we can give it the value of $file if it is empty:
$ file=foobar
$ tmp=${file#*bold}
$ keep=${file:0:${#file}-${#tmp}}
$ : ${keep:=$file}
$ echo "$keep"
foobar
But seriously, do what chepner suggests.
Using Perl:
> echo "func/sub-01_task-biommtloc_run-01_bold_space-T1w_preproc.nii.gz" | perl -e 'while (<>) { $_=~s/(.*bold)(.*)/$1/g; print } '
func/sub-01_task-biommtloc_run-01_bold
>
This is similar to glenn's solution, but a bit "less clever" in that it doesn't use substrings, just nested substitutions:
$ while IFS= read -r fname; do echo "${fname%"${fname#*bold}"}"; done < infile
func/sub-01_task-biommtloc_run-01_bold
func/sub-01_task-pfobloc_run-01_bold
func/sub-01_task-rest_run-01_bold
The substitution "${fname%"${fname#*bold}"}" says:
Remove "${fname#*bold}" from the end of each filename, where
"${fname#*bold}" is everything up to and including bold removed from the front of the filename
Example for the first filename with explicit intermediate steps:
$ fname=func/sub-01_task-biommtloc_run-01_bold_space-T1w_preproc.nii.gz
$ echo "${fname#*bold}"
_space-T1w_preproc.nii.gz
$ echo "${fname%"${fname#*bold}"}"
func/sub-01_task-biommtloc_run-01_bold
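The nested expansion yields an empty string when the word is missing, because the inner expansion then returns the whole name and the outer one strips all of it. A minimal sketch of a guarded version follows; the function name keep_through is hypothetical.

```shell
# Requires bash (uses "local").
# keep_through STRING WORD   (hypothetical helper name)
# Prints STRING up to and including the first occurrence of WORD,
# or STRING unchanged if WORD does not occur.
keep_through() {
    local s=$1 word=$2 rest
    case $s in
        *"$word"*)
            rest=${s#*"$word"}            # text after the first WORD
            printf '%s\n' "${s%"$rest"}"  # strip that text from the end
            ;;
        *)
            printf '%s\n' "$s"
            ;;
    esac
}

keep_through "func/sub-01_task-rest_run-01_bold_space-T1w_preproc.nii.gz" bold
keep_through "foobar" bold
```

The first call prints func/sub-01_task-rest_run-01_bold and the second prints foobar.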
Another option is a single pattern substitution:
f=func/sub-01_task-biommtloc_run-01_bold_space-T1w_preproc.nii.gz
echo "${f//bold*/bold}"
I would recommend using sed for this task. First take all of your input filenames and stick them in a file, call it namelist.txt in the current directory. The following will work, as long as your sed supports extended regular expressions (which most will, particularly GNU sed). Note that the flag for extended regular expressions may differ a bit between platforms, check your sed manual page. On my Linux, it is -r.
bash -c "sed -r 's/(sub-01_task-.{1,10}_run-01_bold).+/\\1/' namelist.txt"

append given variable in stream of cpp files using awk/sed by ignoring backslash

I have a variable in shell script which is as follows:
var=file1.cpp file2.cpp file3.cpp file4.cpp
I want to append this "Folder/SubFolder/ " to every .cpp file. I use the following sed command that helps partially:
echo $var | sed 's/^/'"$i"'\//g;s/\s/ '"$i"'\//g'
where $i --> "Folder" and sed adds extra "/" to it
This sed is able to append only "Folder/" to the files.... I am unable to append "Folder/SubFolder" to every file.
How can I modify the sed command to add the "Folder/SubFolder/" path to the files? Can I modify it somehow to ignore the slash "/" in the $i variable (i.e. the "/" in Folder/SubFolder)?
You can use a different delimiter for sed, such as #
You can also combine the two s commands into a single one
Example
$ echo $var | sed -r 's#(^| )#\1folder/subfolder/#g'
folder/subfolder/file1.cpp folder/subfolder/file2.cpp folder/subfolder/file3.cpp folder/subfolder/file4.cpp
OR
If you want to use variables inside sed
$ re="folder/subfolder/"
$ echo $var | sed -r "s#(^| )#\1$re#g"
folder/subfolder/file1.cpp folder/subfolder/file2.cpp folder/subfolder/file3.cpp folder/subfolder/file4.cpp
You can also try this in a script:
var="file1.cpp file2.cpp file3.cpp file4.cpp"
i="Folder/SubFolder/"
echo "$var"|sed "s#[^ ][[:alnum:]]*[^ ]\.#$i&#g"
output
Folder/SubFolder/file1.cpp Folder/SubFolder/file2.cpp Folder/SubFolder/file3.cpp Folder/SubFolder/file4.cpp
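If the file names can be held in a bash array instead of a space-separated string, the prefix can be added without sed, so slashes in the prefix need no special handling. This is a sketch of an alternative technique, not what the answers above use; the function name add_prefix is hypothetical.

```shell
# Requires bash (arrays and "local").
# add_prefix PREFIX FILE...   (hypothetical helper name)
# Prints every FILE with PREFIX prepended, separated by spaces.
add_prefix() {
    local prefix=$1
    shift
    local files=("$@")
    # A pattern starting with # is anchored at the start of each element;
    # with nothing after the #, the prefix is inserted in front.
    local prefixed=("${files[@]/#/$prefix}")
    echo "${prefixed[*]}"
}

add_prefix "Folder/SubFolder/" file1.cpp file2.cpp file3.cpp file4.cpp
```

This prints the same line as the sed-based script above.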

String substitute in Shell script

I need to strip a substring in my shell script. I am trying the following:
fileName="Test_VSS_TT.csv.old"
Here I want to remove the string ".csv.old", and my attempt is
test=${fileName%.*}
but I am getting a bad substitution error.
You are looking for test=${fileName%%.*}
See the parameter expansion documentation for bash and for zsh.
%.* removes the shortest suffix matching .* (everything from the last dot), whereas %%.* removes the longest one (everything from the first dot)
[edit]
If sed is available, you could try something like this: echo "filename.txt.bin" | sed "s/\..*//g" which yields filename
Here you go,
$ echo $f
Test_VSS_TT.csv.old
$ test=${f%%.*}
$ echo $test
Test_VSS_TT
%% does a longest match, so the pattern matches from the first dot up to the end of the string, and the matched characters are then removed.
If your intention is to extract file name without extension, then how about this?
$ echo ${fileName}
Test_VSS_TT.csv.old
$ test=`echo ${fileName} |cut -d '.' -f1`
$ echo $test
Test_VSS_TT
echo "Test_VSS_TT.csv.old"| awk -F"." '{print $1}'
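For comparison, here is a short sketch of the four prefix and suffix removal forms applied to the file name from the question. These are standard POSIX parameter expansions; only the comments are added for illustration.

```shell
fileName="Test_VSS_TT.csv.old"

echo "${fileName%.*}"    # shortest suffix matching .* removed: Test_VSS_TT.csv
echo "${fileName%%.*}"   # longest suffix matching .* removed:  Test_VSS_TT
echo "${fileName#*.}"    # shortest prefix matching *. removed: csv.old
echo "${fileName##*.}"   # longest prefix matching *. removed:  old
```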

How to read output of sed into a variable

I have variable which has value "abcd.txt".
I want to store everything before the ".txt" in a second variable, replacing the ".txt" with ".log"
I have no problem echoing the desired value:
a="abcd.txt"
echo $a | sed 's/.txt/.log/'
But how do I get the value "abcd.log" into the second variable?
You can use command substitution as:
new_filename=$(echo "$a" | sed 's/\.txt$/.log/')
or the less recommended backtick way:
new_filename=`echo "$a" | sed 's/\.txt$/.log/'`
You can use backticks to assign the output of a command to a variable:
logfile=`echo "$a" | sed 's/\.txt$/.log/'`
That's assuming you're using Bash.
Alternatively, for this particular problem Bash has pattern matching constructs itself:
stem=${textfile%%.txt}
logfile=${stem}.log
or
logfile=${textfile/%.txt/.log}
The % in the last example anchors the match to the end of the string, so only a trailing .txt is replaced.
The simplest way is
logfile="${a/.txt/.log}"
If the filename in $a may contain more than one occurrence of .txt, use the following instead. It is safer because it only changes a trailing .txt:
logfile="${a%.txt}.log"
if you have Bash/ksh
$ var="abcd.txt"
$ echo ${var%.txt}.log
abcd.log
$ variable=${var%.txt}.log
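The variants above differ when the name does not end in .txt: ${var%.txt}.log always appends .log, while the anchored substitution leaves such names unchanged. A minimal sketch of the anchored form follows; the function name to_log is hypothetical and the code assumes bash.

```shell
# Requires bash (pattern substitution and "local").
# to_log NAME   (hypothetical helper name)
# Replaces a trailing .txt with .log; other names are printed unchanged.
to_log() {
    local name=$1
    # A pattern starting with % must match at the end of the string.
    printf '%s\n' "${name/%.txt/.log}"
}

to_log "abcd.txt"        # abcd.log
to_log "a.txt.txt"       # a.txt.log
to_log "notes.txt.bak"   # notes.txt.bak
```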

How can I extract part of a string via a shell script?

The string is setup like so:
href=&quot;PART I WANT TO EXTRACT&quot;&gt;[link]
use awk
$ echo "href=&quot;PART I WANT TO EXTRACT&quot;&gt;[link]" | awk -F"&quot;" '{print $2}'
PART I WANT TO EXTRACT
Or using shell itself
$ a="href=&quot;PART I WANT TO EXTRACT&quot;&gt;[link]"
$ a=${a#*&quot;}
$ echo "${a%%&*}"
PART I WANT TO EXTRACT
Here's another way in Bash:
$ string="href=&quot;PART I WANT TO EXTRACT&quot;&gt;[link]"
$ entity="&quot;"
$ string=${string#*${entity}*}
$ string=${string%*${entity}*}
$ echo $string
PART I WANT TO EXTRACT
This illustrates two features: removing a matching prefix/suffix pattern, and using a variable to hold the pattern (you could use a literal instead).
expr "$string" : 'href=&quot;\(.*\)&quot;&gt;\[link\]'
grep -o "PART I WANT TO EXTRACT" foo
Edit: "PART I WANT TO EXTRACT" can be a regex, e.g.:
grep -o 'http://[a-z/.]*' foo
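The prefix and suffix removal approach can be generalised to any pair of delimiters. The sketch below assumes bash; the function name between is hypothetical, and the delimiters are passed as arguments so the same code handles both a literal double quote and a multi-character marker such as an HTML entity.

```shell
# Requires bash (uses "local").
# between STRING START END   (hypothetical helper name)
# Prints the text between the first START and the next END in STRING.
between() {
    local s=$1 start=$2 end=$3
    s=${s#*"$start"}               # drop everything through the first START
    printf '%s\n' "${s%%"$end"*}"  # drop everything from the first END on
}

between 'href="PART I WANT TO EXTRACT">[link]' '"' '"'
between 'href=&quot;PART I WANT TO EXTRACT&quot;&gt;[link]' '&quot;' '&quot;'
```

Both calls print PART I WANT TO EXTRACT. Quoting the delimiter inside the expansion keeps it from being read as a glob pattern.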

Resources