shell sed get file path - bash

I have a file path.
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz
I want to get this part
/T11073_RICekkR/Fq/AS34_59329
as $location.
How can I use sed to get those?

When working with file paths, I find awk easier, but you can make up your own mind. Here's what I'd do:
location=$(echo "$path" | awk -F "/" '{ print "", $6, $7, $8 }' OFS="/")
If you're trying to match on a pattern, then sed would be a good option. But you haven't mentioned any specifications.

Use cut or awk (as suggested).
But to do it with sed you do something like this:
locationpath=/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz
location=$(echo "$locationpath" | sed 's%\(/[^/]*\)\{4\}\(/[^/]*/[^/]*/[^/]*\).*%\2%')
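The cut variant mentioned above could look like the following. This is a minimal sketch, assuming the wanted part is always fields 6 through 8 when the path is split on /. cut does not print the leading slash, so it is added back by hand. The variable names are the ones from the sed example.

```shell
locationpath=/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz
# Field 1 is the empty string before the leading slash, so the
# directories of interest are fields 6, 7 and 8.
location="/$(printf '%s\n' "$locationpath" | cut -d'/' -f6-8)"
echo "$location"
```

If the depth of the path can vary, counting fields from the left will not work and a pattern-based approach is the safer choice.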

Related

Strip only domain name out of input url string

Did a bit of searching already but cannot seem to find an elegant way of doing this. I'd like to be able to search through a list like the one below and end up with a plain text output file containing only the domain name, with no http:// and nothing after the /
So a list like this:
http://7wind.ru/file/Behind+the+dune/
http://aldersgatencsc.org/open.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=mz34ligqc4&utm_content=bgi71kl5oy
http://amunow.org/test.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=dhxg1r4l76&utm_content=tr71txtklp
I want to end up with plain text output file like this.
7wind.ru
aldersgatencsc.org
amunow.org
Given:
$ echo "$txt"
http://7wind.ru/file/Behind+the+dune/
http://aldersgatencsc.org/open.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=mz34ligqc4&utm_content=bgi71kl5oy
http://amunow.org/test.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=dhxg1r4l76&utm_content=tr71txtklp
You can use cut:
$ echo "$txt" | cut -d'/' -f3
7wind.ru
aldersgatencsc.org
amunow.org
Or, if your content is in a file:
$ cut -d'/' -f3 file
7wind.ru
aldersgatencsc.org
amunow.org
Then redirect that to the file you want:
$ cut -d'/' -f3 file >new_file
awk -F \/ '{ print $3 }' outputfile > newfile
Print the 3rd field delimited by /
$ sed -r 's#.*//([^/]*)/.*#\1#' Input_file
7wind.ru
aldersgatencsc.org
amunow.org
Try the following awk solutions.
Solution 1:
awk '{sub(/.*\/\//,"");sub(/\/.*/,"");print}' Input_file
Solution 2:
awk '{match($0,/\/.[^/]*/);print substr($0,RSTART+2,RLENGTH-2)}' Input_file
This works by stripping the protocol and :// first, then anything after and including the next slash.
sed "s|.*://||; s|/.*||" url-list.txt
Add -i to change the file directly.
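The two substitutions can be checked on a small sample before pointing them at a real file. This is only a sketch: the https://example.com line is a hypothetical addition, included to show that a URL with no path still comes out right because the second substitution simply finds nothing to remove.

```shell
# Sample input; the last line is a made-up URL without a path.
urls='http://7wind.ru/file/Behind+the+dune/
http://amunow.org/test.php?utm_source=5r2ke0ow6k
https://example.com'
# First strip everything up to and including "://",
# then strip the first remaining slash and whatever follows it.
domains=$(printf '%s\n' "$urls" | sed 's|.*://||; s|/.*||')
printf '%s\n' "$domains"
```

Note that -i takes its argument differently in GNU sed and BSD sed, so trying the command without -i first is a reasonable habit.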
Try this regexp:
((http|https):\/\/)?([a-zA-Z0-9.-]+)(\/)?
Take the first match, third group.
But it may accept invalid URLs too, so be careful!

Get only part of file using sed or awk

I have a file which contains text as follows:
Directory /home/user/ "test_user"
bunch of code
another bunch of code
How can I get from this file only the /home/user/ part?
I've managed to use awk -F '"' 'NR==1{print $1}' file.txt to get rid of the rest of the file and I'm getting output like this:
Directory /home/user/
How can I change this command to get only /home/user/ part? I'd like to make it as simple as possible. Unfortunately, I can't modify this file to add/change the content.
This should be the fastest, which is noticeable if your file is large:
awk '{print $2; exit}' file
It prints the second field of the first line and stops without processing the rest of the file.
With awk it should be:
awk 'NR==1{print $2}' file.txt
Setting the field delimiter to " was wrong, since it splits the line into these fields:
$1 = 'Directory /home/user/'
$2 = 'test_user'
$3 = '' (empty)
The default field separator, which behaves like [[:space:]]+, splits like this:
$1 = 'Directory'
$2 = '/home/user/'
$3 = '"test_user"'
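The two ways of splitting described above can be compared directly. This sketch feeds the first line of the file to awk as a string, so no file is needed. The variable names are hypothetical.

```shell
line='Directory /home/user/ "test_user"'
# With -F '"' the line is cut at the double quotes, so $1 still
# contains the word Directory and a trailing space.
with_quote=$(printf '%s\n' "$line" | awk -F '"' '{ print $1 }')
# With the default separator the line is cut at runs of blanks,
# so $2 is the path on its own.
with_default=$(printf '%s\n' "$line" | awk '{ print $2 }')
printf '[%s]\n[%s]\n' "$with_quote" "$with_default"
```

The brackets in the output make the trailing space left behind by the first variant visible.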
As an alternative, you can use head and cut:
$ head -n 1 file | cut -d' ' -f2
Not sure why you are using the -F" as that changes the delimiter. If you remove that, then $2 will get you what you want.
awk 'NR==1{print $2}' file.txt
You can also use awk to execute the print when the line contains /home/user instead of counting records:
awk '/\/home\/user\//{print $2}' file.txt
In this case, if the line were buried in the file, or if you had multiple instances, you would get the name for every occurrence wherever it was.
Adding some grep
grep Directory file.txt|awk '{print $2}'

Using grep to pull a series of random numbers from a known line

I have a simple scalar file producing strings like...
bpred_2lev.ras_rate.PP 0.9413 # RAS prediction rate (i.e., RAS hits/used RAS)
Once I use grep to find this line in the output.txt, is there a way I can directly grab the "0.9413" portion? I am attempting to make a CSV file and just need whatever value is generated.
Thanks in advance.
There are several ways to combine finding and extracting into a single command:
awk (POSIX-compliant)
awk '$1 == "bpred_2lev.ras_rate.PP" { print $2 }' file
sed (GNU sed or BSD/OSX sed)
sed -En 's/^bpred_2lev\.ras_rate\.PP +([^ ]+).*$/\1/p' file
GNU grep
grep -Po '^bpred_2lev\.ras_rate\.PP +\K[^ ]+' file
You can use awk like this:
grep <your_search_criteria> output.txt | awk '{ print $2 }'
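Since the goal is a CSV file, the lookup and the formatting can be done in one awk call. This is a sketch under assumptions: the first sample line is made up to stand in for the rest of output.txt, and the name,value layout of the row is a guess at what the CSV should contain.

```shell
# Hypothetical sample standing in for output.txt; only the second
# line comes from the question.
sample='sim_cycle 1000 # total cycles
bpred_2lev.ras_rate.PP 0.9413 # RAS prediction rate (i.e., RAS hits/used RAS)'
# Match on the exact statistic name and print "name,value".
csv_row=$(printf '%s\n' "$sample" |
  awk -v OFS=',' '$1 == "bpred_2lev.ras_rate.PP" { print $1, $2 }')
echo "$csv_row"
```

Comparing $1 with a string instead of using a regular expression avoids having to escape the dots in the statistic name.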

bash invalid delimiter in cut command

tab="`\echo '\t'`"
grep "^.*${tab}.*${tab}.*${tab}.*${tab}.*${tab}" $file |
grep -vi ssm_id |
cut -f 1,5,6 -d "${tab}" > $rmloadfile
I am getting error as
-cut: invalid delimiter
the above code is part of my bash script.
Ignoring the actual problem, you really want to use awk here instead of this combination of grep and cut:
awk -F'\t' -v OFS='\t' 'NF>=6 && tolower($0) !~ /ssm_id/ { print $1, $5, $6 }' "$file" > "$rmloadfile"
The echo command doesn't interpret backslash escaped characters by default. It has to be enabled using the -e switch.
If you use:
tab="$(echo -e '\t')"
it works.
But I'd rather recommend using the approach proposed by @devnull in the comments, or refer to the linked question.
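A way to build the tab that does not depend on how echo treats backslashes is printf. The sketch below uses a made-up six-field line to show that cut then accepts the delimiter. The names line and picked are hypothetical.

```shell
# printf interprets \t regardless of shell settings; in bash,
# tab=$'\t' would give the same result.
tab=$(printf '\t')
# Hypothetical input line with six tab-separated fields.
line=$(printf 'a\tb\tc\td\te\tf')
picked=$(printf '%s\n' "$line" | cut -f 1,5,6 -d "$tab")
printf '%s\n' "$picked"
```

Tab is already the default delimiter of cut, so in this particular case the -d option could also be dropped altogether.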

command-line with variable, doesn't work

I have a script that uses awk with parameters, but I can't find the right syntax to make it functional. Maybe you could help me?
(under OS X, Terminal, zsh, command line)
I have a variable that holds the path name (it's the result of an awk command):
path_dir="/picture/dir/"
After, I ask this:
awk -F"/" '{print $2}' $path_dir
But it doesn't work. I get:
can't open file $path_dir
My goal is to do this:
dir_name=awk -F "/" '{print $2}' $path_dir
Then, to use
$dir_name
But first, awk can't read my $path_dir
Any idea ?
Thank you
Try this:
dir_name=$(awk -F'/' '{print $2}' <<<"$path_dir")
This should set dir_name to picture.
Give this a try (awk treats a bare argument as a file name, so the value has to be piped in):
dir_name=`echo "$path_dir" | awk -F "/" '{print $2}'`
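For a single path component, shell parameter expansion is an alternative that avoids awk entirely. This is a different technique from the answers above, shown only as a sketch. It assumes the path starts with a slash, and the helper variable trimmed is hypothetical. The awk line is kept as a cross-check.

```shell
path_dir="/picture/dir/"
# Remove the leading slash, then remove everything from the next
# slash to the end of the string.
trimmed=${path_dir#/}
dir_name=${trimmed%%/*}
# Cross-check with awk, feeding the string on standard input so it
# is not mistaken for a file name.
dir_name_awk=$(printf '%s\n' "$path_dir" | awk -F'/' '{ print $2 }')
echo "$dir_name $dir_name_awk"
```

Both expansions are standard shell syntax, so this should behave the same in zsh and bash.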
