How to filter all the paths from the urls using "sed" or "grep"

I am trying to strip the file names from a list of URLs and keep only the directory paths.
echo -e "http://sub.domain.tld/secured/database_connect.php\nhttp://sub.domain.tld/section/files/image.jpg\nhttp://sub.domain.tld/.git/audio-files/top-secret/audio.mp3" | grep -Ei "(http|https)://[^/\"]+" | sort -u
http://sub.domain.tld
But I want the result to look like this:
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
Is there any way to do this with sed or grep?

Using grep
$ echo ... | grep -o '.*/'
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/

with grep
If your grep has the -o option:
... | grep -Eio 'https?://.*/'
If there could be multiple URLs per line:
... | grep -Eio 'https?://[^[:space:]]+/'
with sed
If the input is always precisely one URL per line and nothing else, you can just delete the filename part:
... | sed 's/[^/]*$//'
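As a minimal sketch, here is the sed approach applied to the sample URLs from the question. The function name strip_filename is hypothetical, and the sketch assumes one URL per line.

```shell
#!/usr/bin/env bash
# strip_filename: hypothetical helper. Reads one URL per line on stdin
# and deletes everything after the last slash on each line.
strip_filename() {
    sed 's/[^/]*$//'
}

printf '%s\n' \
    'http://sub.domain.tld/secured/database_connect.php' \
    'http://sub.domain.tld/section/files/image.jpg' \
    'http://sub.domain.tld/.git/audio-files/top-secret/audio.mp3' |
    strip_filename
```

Note that a bare URL such as http://sub.domain.tld has no slash after the host, so this substitution would cut it down to http://. Filter such lines out first if they can occur in the input.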

You could use awk's match function, which works in any version of awk. The command below pipes the echo output to awk; match finds everything up to the last / on the line, and substr prints that match. Subtracting 1 from RLENGTH drops the trailing /; use RLENGTH on its own to keep it, as in the desired output.
your_echo_command | awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH-1)}'
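To make the effect of RLENGTH-1 concrete, here is a small sketch that runs both variants on one of the question's URLs. The two function names are hypothetical.

```shell
#!/usr/bin/env bash
# dir_with_slash / dir_without_slash: hypothetical helpers that read
# lines on stdin and print the part up to the last slash.
dir_with_slash() {
    # RLENGTH covers the whole match, including the final "/".
    awk 'match($0, /.*\//) { print substr($0, RSTART, RLENGTH) }'
}

dir_without_slash() {
    # RLENGTH - 1 stops one character early, dropping the final "/".
    awk 'match($0, /.*\//) { print substr($0, RSTART, RLENGTH - 1) }'
}

url='http://sub.domain.tld/secured/database_connect.php'
printf '%s\n' "$url" | dir_with_slash      # http://sub.domain.tld/secured/
printf '%s\n' "$url" | dir_without_slash   # http://sub.domain.tld/secured
```

Lines that contain no slash at all are skipped by both variants, because match returns 0 for them.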

GNU Awk
$ echo ... | awk 'match($0,/.*\//,a){print a[0]}'
$ echo ... | awk '{print gensub(/(.*\/).*/,"\\1",1)}'
$ echo ... | awk 'sub(/[^/]*$/,"")'
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
xargs
$ echo ... | xargs -I{} sh -c 'echo "$(dirname "$1")/"' _ {}
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
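If spawning a process per URL is a concern, the same result can probably be had with shell parameter expansion alone. This is a sketch under the assumption that every input line contains a path after the host; the function name parent_dirs is hypothetical.

```shell
#!/usr/bin/env bash
# parent_dirs: hypothetical helper. Reads one URL per line on stdin and
# prints each with its last path component removed, keeping the trailing "/".
parent_dirs() {
    while IFS= read -r url; do
        # ${url%/*} removes the shortest suffix starting at the last "/".
        printf '%s/\n' "${url%/*}"
    done
}

printf '%s\n' \
    'http://sub.domain.tld/secured/database_connect.php' \
    'http://sub.domain.tld/section/files/image.jpg' |
    parent_dirs
```

This uses only POSIX shell features, so it does not depend on GNU extensions to xargs, awk or grep.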

Related

how to search for an occurrence of string in shell ignoring spaces

I have the parameter below, and I need to check whether it is present in a file.
PARAMS='SQLNET.INBOUND_CONNECT_TIMEOUT>=45'
How can I check for the occurrence of this parameter in a file, so that the count is 1 if it is present? Note that spaces before and after the '>=' are allowed.
I have the code below:
PARAM_COUNT=`cat file_name | tr -d "[:blank:]" |awk '$1 ~ /^[^;#]/' | grep -i ${PARAM} | wc -l`
Please suggest what modification is necessary. Thanks.

The awk step is not needed because grep can apply the regular expression itself.
The wc -l step is not needed because grep -c can count the matching lines.
cat input | tr -d '[:blank:]' | grep -c "^PARAMS='SQLNET.INBOUND_CONNECT_TIMEOUT>=45'"
Please note the ^ to indicate that the text should be found at the start of the line.
Or:
grep -c "^[[:blank:]]*PARAMS='SQLNET.INBOUND_CONNECT_TIMEOUT[[:blank:]]*>=[[:blank:]]*45'" input
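The following sketch wraps the same idea in a function and tries it on a throwaway file. It assumes, unlike the commands above, that the file holds the bare setting (SQLNET.INBOUND_CONNECT_TIMEOUT>=45) rather than the PARAMS='...' assignment; the function name count_param and the sample file contents are hypothetical.

```shell
#!/usr/bin/env bash
# count_param: hypothetical helper. Prints the number of lines in file "$1"
# that consist of the setting, with optional blanks around ">=".
# The leading ^ keeps commented-out lines (# or ;) from matching,
# and \. matches a literal dot only.
count_param() {
    grep -ci '^[[:blank:]]*SQLNET\.INBOUND_CONNECT_TIMEOUT[[:blank:]]*>=[[:blank:]]*45[[:blank:]]*$' "$1"
}

tmp=$(mktemp)
printf '%s\n' \
    'SQLNET.INBOUND_CONNECT_TIMEOUT >= 45' \
    '# SQLNET.INBOUND_CONNECT_TIMEOUT>=45' \
    'SQLNET.EXPIRE_TIME=10' > "$tmp"
count_param "$tmp"    # only the first line matches
rm -f "$tmp"
```

When nothing matches, grep -c still prints 0 but exits with status 1, which matters if the script runs under set -e.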

Echo command containing both double and single quotes

I have this command
cp $(ldd MyApp.out | awk '{print $3}' | sed -E '/^$/d') lib/
and at some point, I want to echo it into a file but a naive approach echo command_above doesn't work.
If I put the command into single quotes, then $3 expands to whitespace.
Is it possible to print that command char-by-char as it is after echo command without any expansion and substitution?

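The common approach is a here-document: the << operator reads input up to a delimiter line. Quoting the delimiter ('EOF') turns off all expansion and substitution in the body: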
# "cat" just prints what it reads
cat << 'EOF' > output_file
cp $(ldd MyApp.out | awk '{print $3}' | sed -E '/^$/d') lib/
EOF
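As a quick check that nothing gets expanded, this sketch writes the command to a temporary file and prints it back. The function name save_command is hypothetical.

```shell
#!/usr/bin/env bash
# save_command: hypothetical helper. Writes the command text, unexpanded,
# to the file named by "$1". The quoted 'EOF' delimiter disables
# parameter expansion and command substitution inside the body.
save_command() {
    cat << 'EOF' > "$1"
cp $(ldd MyApp.out | awk '{print $3}' | sed -E '/^$/d') lib/
EOF
}

out=$(mktemp)
save_command "$out"
cat "$out"    # shows the cp command character for character
rm -f "$out"
```

With an unquoted delimiter (<< EOF) the shell would run the $(...) substitution and expand $3 while writing the file, which is exactly what the question wants to avoid.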

Use xargs to pass the list of file names to cp (-d is a GNU xargs option):
ldd MyApp.out | awk '$3!=""{print $3}' | xargs -d'\n' -I{} cp {} lib/

For debugging and logging purposes you can use set -x or set -v:
set -v # dump commands below
cp $(ldd MyApp.out | awk '{print $3}' | sed -E '/^$/d') lib/
set +v # stop dumping

Count of matching word, pattern or value from unix korn shell scripting is returning just 1 as count

I'm trying to count how many times a pattern occurs in a variable, but the result is always 1. Here is what I'm trying to do:
x="HELLO|THIS|IS|TEST"
echo $x | grep -c "|"
Expected result: 3
Actual Result: 1
Do you know why it is returning 1 instead of 3?
Thanks.

grep -c counts matching lines, not matches within a line.
You can use awk to get a count:
x="HELLO|THIS|IS|TEST"
echo "$x" | awk -F '|' '{print NF-1}'
3
Alternatively you can use tr and wc:
echo "$x" | tr -dc '|' | wc -c
3

$ echo "$x" | grep -o '|' | grep -c .
3
grep -c does not count the number of matches. It counts the number of lines that match. By using grep -o, we put the matches on separate lines.
This approach works just as well with multiple lines:
$ cat file
hello|this|is
a|test
$ grep -o '|' file | grep -c .
3

The grep manual says:
grep, egrep, fgrep - print lines matching a pattern
and for the -c flag:
instead print a count of matching lines for each input file
and there is just one line that matches.

You don't need grep for this.
pipe_only=${x//[^|]} # remove everything except | from the value of x
echo "${#pipe_only}" # output the length of pipe_only
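Wrapped in a function, the same expansion might look like the sketch below. It assumes bash (or another shell that supports the ${var//pattern/} form); the function name count_pipes is hypothetical.

```shell
#!/usr/bin/env bash
# count_pipes: hypothetical helper. Prints how many "|" characters "$1" contains.
count_pipes() {
    # Delete every character that is not "|", then take the length of what is left.
    local only=${1//[^|]/}
    printf '%s\n' "${#only}"
}

count_pipes 'HELLO|THIS|IS|TEST'    # 3
count_pipes 'no separators here'    # 0
```

Because no external command runs, this also avoids the differences in wc output formatting between systems.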

Try this:
$ x="HELLO|THIS|IS|TEST"; echo -n "$x" | sed 's/[^|]//g' | wc -c
3
With perl, using only one pipe:
echo "$x" |
perl -lne 'print scalar(() = /\|/g)'

Add symbol every 2 bytes

I have a string 20000024ff3dbf50 that I would like to convert to 20:00:00:24:ff:3d:bf:50. I've tried with sed:
echo 20000024ff3dbf50 | sed 's/\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)/\1:\2:\3:\4:\5:\6:\7:\8/'
but it's a little ugly.

Two substitutions:
echo "20000024ff3dbf50" | sed 's/../&:/g;s/.$//'
Results:
20:00:00:24:ff:3d:bf:50
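A small variation, sketched here as a function, removes only a trailing colon instead of whatever the last character happens to be. That makes a difference for input with an odd number of characters, where s/.$// would delete a real character. The function name add_colons is hypothetical.

```shell
#!/usr/bin/env bash
# add_colons: hypothetical helper. Inserts ":" after every two characters
# of "$1", then strips the colon left dangling at the end, if any.
add_colons() {
    printf '%s\n' "$1" | sed 's/../&:/g; s/:$//'
}

add_colons 20000024ff3dbf50    # 20:00:00:24:ff:3d:bf:50
add_colons abc                 # ab:c
```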

echo 20000024ff3dbf50 | grep -o .. | paste -d ':' -s -
grep -o .. splits the input into two characters per line;
paste -s then joins those lines serially, using ':' as the delimiter.

You could also use GNU awk auto-splitting for this:
echo 20000024ff3dbf50 | awk '$1=$1' FPAT=.. OFS=:
Output:
20:00:00:24:ff:3d:bf:50

Use each line of piped output as parameter for script

I have an application (myapp) that gives me a multiline output
result:
abc|myparam1|def
ghi|myparam2|jkl
mno|myparam3|pqr
stu|myparam4|vwx
With grep and sed I can get my parameters as below:
myapp | grep '|' | sed -e 's/^[^|]*|//' | sed -e 's/|.*//'
But then I want to pass each of these myparamX values as a parameter to a script, executed once per parameter:
myscript.sh myparam1
myscript.sh myparam2
etc.
Any help greatly appreciated

Please see xargs. For example:
myapp | grep '|' | sed -e 's/^[^|]*|//' | sed -e 's/|.*//' | xargs -n 1 myscript.sh

Maybe this can help:
myapp | awk -F"|" '{ print $2 }' | while read -r line; do /path/to/script/ "$line"; done
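The loop can also do the field splitting itself by setting IFS for read, which removes the need for awk. The sketch below uses hypothetical stand-ins (myapp_sample for myapp, handle_param for myscript.sh) so that it runs on its own.

```shell
#!/usr/bin/env bash
# myapp_sample and handle_param are hypothetical stand-ins for the real
# myapp and myscript.sh, so the sketch can run without them.
myapp_sample() {
    printf '%s\n' 'result:' 'abc|myparam1|def' 'ghi|myparam2|jkl'
}

handle_param() {
    printf 'called with: %s\n' "$1"
}

# for_each_param: runs the given command once per input line, appending the
# second |-separated field as the last argument. Lines without a second
# field (such as "result:") are skipped.
for_each_param() {
    while IFS='|' read -r first param rest; do
        if [ -n "$param" ]; then
            # </dev/null keeps the command from consuming the loop's input.
            "$@" "$param" < /dev/null
        fi
    done
}

myapp_sample | for_each_param handle_param
```

Compared with xargs -n 1, this keeps parameters containing spaces or quotes intact, since each one is passed as a single quoted argument.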

I like the xargs -n 1 solution from Dark Falcon, and while read is a classical tool for such kind of things, but just for completeness:
myapp | awk -F'|' '{print "myscript.sh", $2}' | bash
As a side note, to extract the second field you could also use cut:
myapp | cut -d'|' -f 2 # cut numbers fields from 1, so -f 2 is the second field
