Delete everything before last / in a file path - bash

I have many file paths in a file that look like so:
/home/rtz11/files/testfiles/547/prob547455_01
I want to use a bash script that will print all the filenames to the screen, basically whatever comes after the last /. I don't want to assume that it would always be the same length because it might not be.
Would there be a way to delete everything before the last /? Maybe a sed command?

Using sed for this is vast overkill -- bash has extensive string manipulation built in, and using this built-in support is far more efficient when operating on only a single line.
s=/home/rtz11/files/testfiles/547/prob547455_01
basename="${s##*/}"
echo "$basename"
This will remove everything from the beginning of the string greedily matching */. See the bash-hackers wiki entry for parameter expansion.
If you only want to remove everything prior to the last /, but not including it (a literal reading of your question, but also a generally less useful operation), you might instead want if [[ $s = */* ]]; then echo "/${s##*/}"; else echo "$s"; fi.
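Since the question mentions many paths stored in a file, the same expansion can be applied line by line. The following is a minimal sketch; the function name print_basenames is made up for illustration, and it assumes one path per line. For very large files, the sed or awk one-liners are likely to be faster than a shell loop.

```shell
# print_basenames FILE: print the part after the last "/" of every line.
# The function name is hypothetical; FILE holds one path per line.
print_basenames() {
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r path || [ -n "$path" ]; do
        printf '%s\n' "${path##*/}"
    done < "$1"
}
```

Calling print_basenames input-file prints one filename per input line.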

awk '{print $NF}' FS=/ input-file
The 'print $NF' directs awk to print the last field of each line, and assigning FS=/ makes forward slash the field delimiter. In sed, you could do:
sed 's#.*/##' input-file
which simply deletes everything up to and including the last /.

Meandering, but simply because I can remember the syntax, I use:
cat file | rev | cut -d/ -f1 | rev
Many ways to skin a 'cat'. Ouch.

One more way:
Use the basename executable (command):
basename /path/with/many/slashes/and/file.extension
> file.extension
basename /path/with/many/slashes/and/file.extension .extension
OR
basename -s .extension /path/with/many/slashes/and/file.extension
> file
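One difference between basename and the ${s##*/} expansion shows up when a path ends in a slash. The snippet below is a small sketch using a made-up path; the variable names are only for illustration.

```shell
s=/path/with/many/slashes/and/dir/

# Parameter expansion removes everything through the final "/", leaving nothing.
expansion_result=${s##*/}

# basename ignores trailing slashes before taking the last component.
basename_result=$(basename "$s")

printf 'expansion: [%s]\nbasename:  [%s]\n' "$expansion_result" "$basename_result"
```

Here the expansion yields an empty string while basename yields dir, so the choice matters if the input may contain directory paths with trailing slashes.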

Related

Bash get text between 5th and 6th underscore in a variable

I have a variable called $folder_name which contains the string
Release_2019_Config_V6_Standalone_PJ6678_Test
which is the name of a folder.
I'm trying to extract PJ6678 from the folder name.
I know the folder name will put the user id (the text I need) between the 5th and 6th underscore, I don't know what text/symbols will be present after the 6th underscore.
I'm using a Bash script. I'd really appreciate it if someone could help with this functionality, as I'm completely lost trying to use sed (after reading for hours, I'm assuming this is the correct tool for the job).
Here is a Bash only solution:
#!/bin/bash
INPUT="Release_2019_Config_V6_Standalone_PJ6678_Test"
IFS='_' read -ra IN <<< "$INPUT"
echo ${IN[5]}
Or use cut:
cut -d '_' -f 6 <<< "Release_2019_Config_V6_Standalone_PJ6678_Test"
Or use awk:
awk -F "_" '{ print $6 }' <<< "Release_2019_Config_V6_Standalone_PJ6678_Test"
If you want a pure-bash solution, you can tokenize the folder name and pick the element at index 5 (the sixth field, since bash arrays are zero-based):
IFS=_ read -ra token <<< "$folder_name"
id=${token[5]}
This avoids the dependency on, and the performance hit of, launching an additional program per folder name.
Try this command:
echo "$a" | awk -F'_' '{print $6}'
Here, _ is the delimiter and $a is a variable that holds the value.
For completeness, here's a pure-shell solution that doesn't rely on bash extensions like arrays.
$ folder_name=Release_2019_Config_V6_Standalone_PJ6678_Test
$ tmp=${folder_name#*_*_*_*_*_} # Because we know how many _ to strip
$ echo ${tmp%_*}
PJ6678
Because the # operator strips the shortest prefix, this won't allow * to match any _ itself; if it did, we could shorten the prefix by making the underscore match one of the literal _ in the pattern instead.
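The fixed pattern above hard-codes the number of underscores. As a sketch of the same idea with the field number as a parameter, the hypothetical helper nth_field below strips one shortest "*_" prefix per iteration. It uses only POSIX sh features, so its variables are global rather than local.

```shell
# nth_field STRING N: print the Nth "_"-separated field of STRING (1-based).
# nth_field is a hypothetical helper name, not a standard command.
nth_field() {
    rest=$1
    n=$2
    while [ "$n" -gt 1 ]; do
        case $rest in
            *_*) rest=${rest#*_} ;;   # drop the shortest prefix ending in "_"
            *) return 1 ;;            # fewer than N fields
        esac
        n=$((n - 1))
    done
    # Keep everything before the next "_" (or the whole remainder).
    printf '%s\n' "${rest%%_*}"
}
```

With the folder name from the question, nth_field "$folder_name" 6 prints PJ6678.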

bash scripting: Can I get sed to output the original line, then a space, then the modified line?

I'm new to Unix in all its forms, so please go easy on me!
I have a bash script that will pipe an ls command with arbitrary filenames into sed, which will use an arbitrary replacement pattern on the files, and then this will be piped into awk for some processing. The catch is, awk needs to know both the original file name and the new one.
I've managed everything except getting the original file names into awk. For instance, let's say my files are test.* and my replacement pattern is 's:es:ar:', which would change every occurrence of "test" to "tart". For testing purposes I'm just using awk to print what it's receiving:
ls "$@" | sed "$pattern" | awk '{printf "0: %s\n1: %s\n2: %s\n", $0,$1,$2}'
where test.* is in $@ and the pattern is stored in $pattern.
Clearly, this doesn't get me to where I want to be. The output is obviously
0: tart.c
1: tart.c
2:
If I could get sed to output "test.c tart.c", then I'd have two parameters for awk. I've played around with the pattern to no avail, even hardcoding "test.c" into the replacement. But of course that just gave me amateur results like "ttest.c art.c". Is it possible for sed to remember the input, then work it into the beginning of the output? Do I even have the right ideas? Thanks in advance!
Two ways to change the first t to a b in the duplicated field.
Duplicate (& replays the matched part), change first word and swap (remember 2 strings with a space in between):
echo test.c | sed -r 's/.*/& &/;s/t/b/;s/([^ ]*) (.*)/\2 \1/'
or with more magic (copy the original value to the hold buffer, make the change, swap the two, append the changed value from the buffer as a second line, and replace the newline between them with a space)
echo test.c | sed 'h;s/t/b/;x;G;s/\n/ /'
Use Perl instead of sed:
echo test.c | perl -lne 'print "$_ ", s/es/ar/r'
-l removes the newline from input and adds it after each print. The /r modifier to the substitution returns the modified string instead of changing the variable (Perl 5.14+ needed).
Old answer, not working for s/t/b/2 or s/.*/replaced/2:
You can duplicate the contents of the line with s/.*/& &/, then just tell sed that it should only apply the second substitution (this works at least in GNU sed):
echo test.c | sed 's/.*/& &/; s/es/ar/2'
$ echo 'foo' | awk '{old=$0; gsub(/o/,"e"); print old, $0}'
foo fee
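The hold-space approach can be wrapped so that it takes the arbitrary pattern from the question as an argument. This is a sketch only: with_original is a made-up name, and it assumes the sed script passed in just rewrites the pattern space. File names containing spaces would still confuse awk's default field splitting downstream.

```shell
# with_original SED_SCRIPT: read lines on stdin and print "original modified".
# with_original is a hypothetical name; SED_SCRIPT is any sed command list
# that rewrites the pattern space, such as 's:es:ar:'.
with_original() {
    # h: save the original; $1: modify; x: swap so the original comes first;
    # G: append the modified copy on a new line; s: join the two with a space.
    sed -e 'h' -e "$1" -e 'x;G;s/\n/ /'
}
```

For example, echo test.c | with_original 's:es:ar:' prints test.c tart.c, which awk then sees as $1 and $2.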

sed bash substitution only if variable has a value

I'm trying to find a way using variables and sed to do a specific text substitution using a changing input file, but only if there is a value given to replace the existing string with. No value= do nothing (rather than remove the existing string).
Example
Substitute.csv contains 5 lines, the third of which is empty:
this-has-text
this-has-text

this-has-text
this-has-text
and file.txt has one sentence:
"When trying this I want to be sure that text-this-has is left alone."
If I run the following command in a shell script
Text='text-this-has'
Change=`sed -n '3p' substitute.csv`
grep -rl $Text /home/username/file.txt | xargs sed -i "s|$Text|$Change|"
I end up with
"When trying this I want to be sure that is left alone."
But I'd like it to remain as
"When trying this I want to be sure that text-this-has is left alone."
Any way to tell sed "If I give you nothing new, do nothing"?
I apologize for the overthinking, bad habit. Essentially what I'd like to accomplish is if line 3 of the csv file has a value - replace $Text with $Change inline. If the line is empty, leave $Text as $Text.
Text='text-this-has'
Change=$(sed -n '3p' substitute.csv)
# Only run the substitution when line 3 of the CSV is non-empty.
if [[ -n $Change ]]; then
    grep -rl -- "$Text" /home/username/file.txt | xargs sed -i "s|$Text|$Change|"
fi
Just keep it simple and use awk:
awk -v t="$Text" -v c="$Change" 'c!=""{sub(t,c)} {print}' file
If you need inplace editing just use GNU awk with -i inplace.
Given your clarified requirement, this is probably what you actually want:
awk -v t="$Text" 'NR==FNR{if (NR==3) c=$0; next} c!=""{sub(t,c)} {print}' Substitute.csv file.txt
Testing whether $Change has a value before launching into the grep and sed is undoubtedly the most efficient bash solution, although I'm a bit skeptical about the duplication of grep and sed; it saves a temporary file in the case of files which don't contain the target string, but at the cost of an extra scan up to the match in the case of files which do contain it.
If you're looking for typing efficiency, though, the following might be interesting:
find . -name '*.txt' -exec sed -i "s|$Text|${Change:-&}|" {} \;
This will recursively find all files whose names end with the extension .txt and execute the sed command on each one. ${Change:-&} means "the value of $Change if it is set and non-empty, and otherwise an &"; & in the replacement of a sed s command means "the matched text", so s|foo|&| replaces the first occurrence of foo on each line with itself. That's an expensive no-op, but if your time matters more than your CPU time, it might be worth it.
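The ${Change:-&} trick can be tried on a single line before pointing it at real files. The sketch below wraps it in a hypothetical filter named replace_or_keep. It assumes the search text contains no "|" and that the replacement contains no "|", "&", or backslashes, and note that the search text is treated as a regular expression.

```shell
# replace_or_keep TEXT CHANGE: filter stdin, replacing the first TEXT on each
# line with CHANGE, or leaving the line untouched when CHANGE is empty.
# The function name is hypothetical. TEXT must not contain "|", and CHANGE
# must not contain "|", "&", or backslashes.
replace_or_keep() {
    # An empty or unset $2 expands to "&", i.e. "put back the matched text".
    sed "s|$1|${2:-&}|"
}
```

Piping the sentence from the question through replace_or_keep text-this-has "" leaves it unchanged, while a non-empty second argument is substituted in.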

How can I strip first X characters from string using sed?

I am writing shell script for embedded Linux in a small industrial box. I have a variable containing the text pid: 1234 and I want to strip first X characters from the line, so only 1234 stays. I have more variables I need to "clean", so I need to cut away X first characters and ${string:5} doesn't work for some reason in my system.
The only thing the box seems to have is sed.
I am trying to make the following to work:
result=$(echo "$pid" | sed 's/^.\{4\}//g')
Any ideas?
The following should work:
var="pid: 1234"
var=${var:5}
Are you sure bash is the shell executing your script?
Even the POSIX-compliant
var=${var#?????}
would be preferable to using an external process, although this requires you to hard-code the 5 in the form of a fixed-length pattern.
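If the count is not known in advance, the pattern of question marks can be built in a loop. This is a sketch for shells without ${var:N}; strip_first is a made-up helper name, and its variables are global because plain POSIX sh has no local.

```shell
# strip_first STRING COUNT: print STRING without its first COUNT characters.
# strip_first is a hypothetical helper; it uses only POSIX sh features.
strip_first() {
    str=$1
    count=$2
    pattern=
    # Build a glob of COUNT "?" characters, each matching one character.
    while [ "$count" -gt 0 ]; do
        pattern="$pattern?"
        count=$((count - 1))
    done
    # $pattern is left unquoted here so that it is treated as a glob.
    printf '%s\n' "${str#$pattern}"
}
```

For example, strip_first "pid: 1234" 5 prints 1234. If the string is shorter than the count, the pattern does not match and the string is printed unchanged.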
Here's a concise method to cut the first X characters using cut(1). This example removes the first 4 characters by cutting a substring starting with 5th character.
echo "$pid" | cut -c 5-
Use the -r option ("use extended regular expressions in the script") to sed in order to use the {n} syntax:
$ echo 'pid: 1234'| sed -r 's/^.{5}//'
1234
Cut first two characters from string:
$ string="1234567890"; echo "${string:2}"
34567890
Pipe it through awk '{print substr($0,42)}' where 42 is one more than the number of characters to drop. For example:
$ echo abcde| awk '{print substr($0,2)}'
bcde
Chances are, you'll have cut as well. If so:
[me@home]$ echo "pid: 1234" | cut -d" " -f2
1234
Well, there have been solutions here with sed, awk, cut and using bash syntax. I just want to throw in another POSIX conform variant:
$ echo "pid: 1234" | tail -c +6
1234
-c makes tail count in bytes. By default the number counts back from the end of the input data, but if it starts with a + sign it is an offset from the beginning, so +6 starts the output at the sixth byte.
Another way, using cut instead of sed.
result=`echo $pid | cut -c 5-`
I found the answer in pure sed supplied by this question (admittedly, posted after this question was posted). This does exactly what you asked, solely in sed:
result=`echo "$pid" | sed '/./ { s/pid:\ //g; }'`
The dot in the sed address '/./' stands for whatever you want to match. Your question is exactly what I was attempting to do, except in my case I wanted to match a specific line in a file and then uncomment it. In my case it was:
# Uncomment a line (edit the file in-place):
sed -i '/#\ COMMENTED_LINE_TO_MATCH/ { s/#\ //g; }' /path/to/target/file
The -i after sed is to edit the file in place (remove this switch if you want to test your matching expression prior to editing the file).
(I posted this because I wanted to do this entirely with sed, as this question asked, and none of the previous answers solved that problem.)
Rather than removing n characters from the start, perhaps you could just extract the digits directly. Like so...
$ echo "pid: 1234" | grep -Po "\d+"
This may be a more robust solution, and seems more intuitive.
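The -P option is a GNU grep extension, so it may not exist on a minimal embedded system. As a sketch of the same "just take the digits" idea without any external program, a POSIX parameter expansion can drop everything up to the last non-digit character. It assumes the number sits at the end of the string.

```shell
var="pid: 1234"
# Remove the longest prefix that ends in a non-digit character,
# which leaves only the trailing run of digits.
digits=${var##*[!0-9]}
printf '%s\n' "$digits"
```

This prints 1234. If the string ended in a non-digit, the result would be empty.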
This will do the job too:
echo "$pid"|awk '{print $2}'

String Manipulation in Bash

I am a newbie in Bash and I am doing some string manipulation.
I have the following file among other files in my directory:
jdk-6u20-solaris-i586.sh
I am doing the following to get jdk-6u20 in my script:
myvar=`ls -la | awk '{print $9}' | egrep "i586" | cut -c1-8`
echo $myvar
but now I want to convert jdk-6u20 to jdk1.6.0_20. I can't seem to figure out how to do it.
It must be as generic as possible. For example if I had jdk-6u25, I should be able to convert it at the same way to jdk1.6.0_25 so on and so forth
Any suggestions?
Depending on exactly how generic you want it, and how standard your inputs will be, you can probably use AWK to do everything. By using FS="regexp" to specify field separators, you can break down the original string by whatever tokens make the most sense, and put them back together in whatever order using printf.
For example, assuming both dashes and the letter 'u' are only used to separate fields:
myvar="jdk-6u20-solaris-i586.sh"
echo $myvar | awk 'BEGIN {FS="[-u]"}; {printf "%s1.%s.0_%s",$1,$2,$3}'
Flavour according to taste.
Using only Bash:
for file in jdk*i586*
do
file="${file%*-solaris*}"
file="${file/-/1.}"
file="${file/u/.0_}"
do_something_with "$file"
done
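The same conversion can also be written by splitting out the major version and update number explicitly, which makes the assumed name layout visible. This is a sketch: jdk_version_name is a made-up function name, and it assumes names of the form jdk-<major>u<update>-solaris-... as in the question.

```shell
# jdk_version_name FILE: turn "jdk-6u20-solaris-i586.sh" into "jdk1.6.0_20".
# jdk_version_name is a hypothetical helper; it assumes the file name looks
# like jdk-<major>u<update>-solaris-<arch>.sh.
jdk_version_name() {
    name=${1%%-solaris*}    # jdk-6u20
    ver=${name#jdk-}        # 6u20
    major=${ver%%u*}        # 6
    update=${ver#*u}        # 20
    printf 'jdk1.%s.0_%s\n' "$major" "$update"
}
```

For example, jdk_version_name jdk-6u25-solaris-i586.sh prints jdk1.6.0_25.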
I think that sed is the command for you.
You can try this snippet:
for fname in *; do
newname=`echo "$fname" | sed 's,^jdk-\([0-9]\)u\([0-9][0-9]*\)-.*$,jdk1.\1.0_\2,'`
if [ "$fname" != "$newname" ]; then
echo "old $fname, new $newname"
fi
done
awk '{if(match($9,"i586")){gsub("jdk-6u20","jdk1.6.0_20");print $9;}}'
The if(match()) supersedes the egrep bit if you want to use it. You could use substr($9,1,8) instead of cut as well.
garph0 has a good idea with sed; you could do
myvar=`ls jdk*i586.sh | sed 's/jdk-\([0-9]\)u\([0-9]\+\).\+$/jdk1.\1.0_\2/'`
Your needing the awk in there is an artifact of the -l switch on ls. For pattern substitution on lines of text, sed is the long-time champion:
ls | sed -n '/^jdk/s/jdk-\([0-9][0-9]*\)u\([0-9][0-9]*\)$/jdk1.\1.0_\2/p'
This was written in "old-school" sed which should have greater portability across platforms. The expression says:
-n: don't print lines unless explicitly told to
on lines beginning with 'jdk' do:
on a line that contains only "jdk-IntegerAuIntegerB"
change it to "jdk1.IntegerA.0_IntegerB"
and print it
Your sample becomes even simpler as:
myvar=`echo *solaris-i586.sh | sed 's/-solaris-i586\.sh//'`
