Sed output a value between two matching strings in a url - bash

I have multiple urls as input
https://drive.google.com/a/domain.com/file/d/1OR9QLGsxiLrJIz3JAdbQRACd-G9ZfL3O/view?usp=drivesdk
https://drive.google.com/a/domain.com/file/d/1sEWMFqGW9p2qT-8VIoBesPlVJ4xvOzXD/view?usp=drivesdk
How can I create a sed command that returns only the file ID?
Desired output:
1OR9QLGsxiLrJIz3JAdbQRACd-G9ZfL3O
1sEWMFqGW9p2qT-8VIoBesPlVJ4xvOzXD
Looks like I need to start between /d/ and stop at /view but I'm not quite sure how to do that.
I've tried: sed -e 's/d\(.*\)\/view/\1/'

I was able to do this with cut -d '/' -f 8
also awk -F/ '{print $8}' file worked, thanks!

Your command was almost right:
# Wrong
sed -e 's/d\(.*\)\/view/\1/'
# better, removing unmatched stuff including the / after the d
sed -e 's/.*d\/\(.*\)\/view.*/\1/'
# better: using # for making the command easier to read
sed -e 's#.*d/\(.*\)/view.*#\1#'
# Alternative: using grep and cut when you don't know which field /d/ is
some_stream | grep -Eo '/d/.*/view' | cut -d/ -f3
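As a minimal sketch of the same idea, the pattern can be anchored on the full /d/ marker and limited to non-slash characters, so a stray "d" elsewhere in the URL cannot confuse it. The helper name extract_id is hypothetical, and the sketch assumes every URL has the /d/<id>/view shape shown in the question:

```shell
# Hypothetical helper: print whatever sits between "/d/" and "/view".
# -n plus the p flag means lines that do not match print nothing.
extract_id() {
  sed -n 's#.*/d/\([^/]*\)/view.*#\1#p'
}

# The two URLs from the question, fed through the helper.
ids=$(printf '%s\n' \
  'https://drive.google.com/a/domain.com/file/d/1OR9QLGsxiLrJIz3JAdbQRACd-G9ZfL3O/view?usp=drivesdk' \
  'https://drive.google.com/a/domain.com/file/d/1sEWMFqGW9p2qT-8VIoBesPlVJ4xvOzXD/view?usp=drivesdk' \
  | extract_id)
printf '%s\n' "$ids"
```

Unlike the field-number approaches, this does not depend on how many path components precede /d/.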

Related

grep return the string in between words

I am trying to use grep to filter out the RDS snapshot identifier from the rds describe-db-snapshots command output below:
"arn:aws:rds:ap-southeast-1:123456789:snapshot:rds:apple-pie-2018-05-06-17-12",
"rds:apple-pie-2018-05-06-17-12",
How can I return exactly the following output?
rds:apple-pie-2018-05-06-17-12
I tried using
grep -Eo ",rds:"
but it did not work.
The following awk may also help:
awk 'match($0,/^"rds[^"]*/){print substr($0,RSTART+1,RLENGTH-1)}' Input_file
Your grep -Eo ",rds:" is failing for different reasons:
You did not add a " in the string to match
Between the comma and rds you need to match the character.
You are trying to match the comma that can be on the previous line
Your sample input is 2 lines (with a newline in between), perhaps the real input is without the newline.
You want to match until the next double quote.
You can support both input-styles (with/without newline) with
grep -Eo '(,|^)"rds:[^"]*' rdsfile |cut -d'"' -f2
You can do this in one command with
sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p' rdsfile
EDIT: Manipulating stdout instead of the file works with similar commands:
yourcommand | grep -Eo '(,|^)"rds:[^"]*' |cut -d'"' -f2
# or
yourcommand | sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p'
You can also test the original commands with yourcommand > rdsfile.
You might notice that rdsfile is missing data that you have seen on the screen; in that case, add 2>&1:
yourcommand 2>&1 | grep -Eo '(,|^)"rds:[^"]*' |cut -d'"' -f2
# or
yourcommand 2>&1 | sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p'
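A small sketch to check the sed variant against both input styles. It uses -E, which GNU and BSD sed both accept, in place of -r; the helper name extract_snapshot_id is hypothetical, and the sample lines are assumed to have no leading whitespace, as in the question:

```shell
# Hypothetical helper: print the quoted value that starts with "rds:".
extract_snapshot_id() {
  sed -En 's/.*(,|^)"(rds:[^"]*).*/\2/p'
}

# Input style 1: one quoted string per line.
two_lines=$(printf '%s\n' \
  '"arn:aws:rds:ap-southeast-1:123456789:snapshot:rds:apple-pie-2018-05-06-17-12",' \
  '"rds:apple-pie-2018-05-06-17-12",' | extract_snapshot_id)

# Input style 2: both quoted strings on a single line.
one_line=$(printf '%s\n' \
  '"arn:aws:rds:ap-southeast-1:123456789:snapshot:rds:apple-pie-2018-05-06-17-12","rds:apple-pie-2018-05-06-17-12",' \
  | extract_snapshot_id)

printf '%s\n' "$two_lines" "$one_line"
```

The ARN line is skipped because its rds: is preceded by a colon, not by a double quote.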

Print lines with sed using line number from grep

I'm trying to pipe line numbers from grep to sed.
First I was extracting the start and end line of what I want to print with sed:
grep -n "Start" file1 | cut -d: -f 1 | head -n 1
grep -n "End" file1 | cut -d: -f 1 | head -n 1
Now I need to use these numbers to print everything from Start to End by line. E.g.
sed -ne '1,30w output1' file1
I'm not sure how this can be done as piping the line numbers to sed will be seen as "input" right?
Example:
Start
some text
some more text
End
Start
some text
some more text
End
As there's more than one Start and End, I cut off the rest of the line numbers from grep. Am I supposed to combine grep and sed, or is this not possible?
You can do it without grep
sed -n '/Start/,/End/w output1' file1
should work.
It looks like you want to print from the first occurrence of Start to the first subsequent occurrence of End, inclusive. That'd just be:
awk '/Start/{found=1} found{print; if (/End/) exit}' file
This might work for you (GNU sed):
sed -ne '/Start/,/End/w outputfile' -e '/End/q' file
This writes the lines from the first Start to the first End to outputfile and then quits, which also removes the need for grep.
If you must use grep then perhaps:
sed -n "$(grep -n "Start" file | cut -d: -f 1 | head -n 1),$(grep -n "End" file | cut -d: -f 1 | head -n 1)"'p' file
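To see the "first block only" behaviour in isolation, here is a sketch that prints to stdout instead of writing a file. The sample data is made up, and it assumes no End line appears before the first Start (otherwise sed would quit early):

```shell
# Build a throwaway sample file with two Start/End blocks.
tmp=$(mktemp)
printf '%s\n' 'header' 'Start' 'some text' 'some more text' 'End' \
  'Start' 'second block' 'End' > "$tmp"

# Print the Start..End range, then quit at the first End so the
# second block is never reached.
first_block=$(sed -n '/Start/,/End/p; /End/q' "$tmp")
printf '%s\n' "$first_block"

rm -f "$tmp"
```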

How to compose custom command-line argument from file lines?

I know about the xargs utility, which allows me to convert lines into multiple arguments, like this:
echo -e "a\nb\nc\n" | xargs
Results in:
a b c
But I want to get:
a:b:c
The character : is used for an example. I want to be able to insert any separator between lines to get a single argument. How can I do it?
If you have a file with multiple lines that you want to turn into a single argument, replacing the newlines with a single character, the paste command is what you need:
$ echo -en "a\nb\nc\n" | paste -s -d ":"
a:b:c
Then, your command becomes:
your_command "$(paste -s -d ":" your_file)"
EDIT:
If you want to insert more than a single character as a separator, you could use sed before paste:
your_command "$(sed -e '2,$s/^/<your_separator>/' your_file | paste -s -d "")"
Or use a single, more complicated sed:
your_command "$(sed -n -e '1h;2,$H;${x;s/\n/<your_separator>/gp}' your_file)"
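For a multi-character separator, an awk-based helper is one alternative that avoids escaping the separator for sed. This is only a sketch: join_lines is a hypothetical name, and because awk interprets backslash escapes in -v assignments, a separator containing backslashes would need extra care:

```shell
# Hypothetical helper: join stdin lines with the separator given as $1.
join_lines() {
  awk -v sep="$1" 'NR > 1 { printf "%s", sep } { printf "%s", $0 } END { print "" }'
}

# Multi-character separator via the helper.
joined=$(printf 'a\nb\nc\n' | join_lines ' -- ')

# Single-character separator via paste; "-" names stdin explicitly,
# which some paste implementations require.
colon=$(printf 'a\nb\nc\n' | paste -s -d ':' -)

printf '%s\n' "$joined" "$colon"
```

Either result can then be passed as one argument, for example your_command "$joined".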
The example you gave is not working for me. You would need:
echo -e "a\nb\nc\n" | xargs
to get a b c.
Coming back to your need, you could do this:
echo "a b c" | awk -v OFS=":" '{print $1, $2, $3}'
This changes the separator from a space to : or whatever you want it to be.
You can also use sed:
echo "a b c" | sed -e 's/ /:/g'
which outputs a:b:c.
After all this data processing, you can pipe the result to xargs to run the command you want.
Hope it helps.
You can join the lines using xargs and then replace the spaces (' ') using sed.
echo -e "a\nb\nc"|xargs| sed -e 's/ /:/g'
will result in
a:b:c
Obviously, you can use this output as an argument for another command by piping it to another xargs:
echo -e "a\nb\nc"|xargs| sed -e 's/ /:/g'|xargs

How do I convert multi public key into a single line?

I'm trying to turn a txt file containing a generated key into one line. Example:
<----- key start ----->
lkdjasdjskdjaskdjasdkj
skdhfjlkdfjlkdsfjsdlfk
kldshfjlsdhjfksdhfksdj
jdhsfkjsdhfksdjfhskdfh
jhdfkjsdhfkjsdhfkjsdhf
<----- key stop ----->
I want it to look like:
lkdjasdjskdjaskdjasdkjskdhfjlkdfjlkdsfjsdlfkkldshfjlsdhjfksdhfksdjjdhsfkjsdhfksdjfhskdfhjhdfkjsdhfkjsdhfkjsdhf
Notice I also want the lines <----- key start -----> and <----- key stop -----> removed. How can I do this? Would this be done with sed?
tr -d '\n' < key.txt
Found on http://linux.dsplabs.com.au/rmnl-remove-new-line-characters-tr-awk-perl-sed-c-cpp-bash-python-xargs-ghc-ghci-haskell-sam-ssam-p65/
To convert multi line output to a single space separated line, use
tr '\n' ' ' < key.txt
I know this does not answer the detailed question. But it is one possible answer to the title. I needed this answer and my google search found this question.
tail -n +2 key.txt | head -n -1 | tr -d '\n'
Tail to remove the first line, head to remove the last line and tr to remove newlines.
If you're looking for everything you asked for in one sed, I have this...
sed -n '1h;2,$H;${g;s/\n//g;s/<----- key \(start\|stop\) ----->//g;p}' key.txt
But it's not exactly easily readable :) If you don't mind piping a couple of commands, you could use the piped grep, tr, sed, etc. suggestions in the rest of the answers you got.
An easy way would be to use cat file.txt | tr -d '\n'
grep '^[^<]' test.txt | tr -d '\n'
This might work for you (GNU sed):
sed -r '/key start/{:a;N;/key stop/!ba;s/^[^\n]*\n(.*)\n.*/\1/;s/\n//g}' file
Gather up lines between key start and key stop. Then remove the first and last lines and delete any newlines.
In vim, :%s/\n// joins all lines (:%s/^M//, with ^M typed as Ctrl-V Ctrl-M, only strips carriage returns).
I use this all the time to generate comma separated lists from lines. For sed or awk, check out the many solutions at this link:
http://www.unix.com/shell-programming-scripting/35107-remove-line-break.html
Example:
paste -s -d',' tmpfile | sed 's/,/, /g'
grep -v -e "key start" -e "key stop" /PATH_TO/key | tr -d '\n'
awk '/ key (start|stop) / {next} {printf("%s", $0)} END {print ""}' filename
Every other answer mentioned here converts the key to a single line, but the result that we get is not a valid key and hence I was running into problems.
If you also have the same issue, please try this (replace key.txt with your file name):
awk -v ORS='\\n' '1' key.txt
Credit: https://gist.github.com/bafxyz/de4c94c0912f59969bd27b47069eeac0
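A short illustration of what that command produces: every real newline is replaced by the two literal characters backslash and n, so the line breaks survive inside a single-line string. The input lines AAA and BBB are placeholders, and the guess that the consumer expects escaped newlines (for example inside a quoted config value) is an assumption:

```shell
# awk unescapes the -v value once, so ORS ends up as a literal
# backslash followed by "n" rather than a real newline.
escaped=$(printf 'AAA\nBBB\n' | awk -v ORS='\\n' '1')
printf '%s\n' "$escaped"
```

Note that the header and footer lines are kept by this approach; only the line breaks are rewritten.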
You may use ed (see man 1 ed) to join lines as well:
str='
aaaaa
<----- key start ----->
lkdjasdjskdjaskdjasdkj
skdhfjlkdfjlkdsfjsdlfk
kldshfjlsdhjfksdhfksdj
jdhsfkjsdhfksdjfhskdfh
jhdfkjsdhfkjsdhfkjsdhf
<----- key stop ----->
bbbbb
'
# for in-place file editing use "ed -s file" and replace ",p" with "w"
# cf. http://wiki.bash-hackers.org/howto/edit-ed
cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' | ed -s <(echo "$str")
H
/<----- key start ----->/+1,/<----- key stop ----->/-1j
/<----- key start ----->/d
/<----- key stop ----->/d
,p
q
EOF
# print the joined lines to stdout only
cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' | ed -s <(echo "$str")
H
/<----- key start ----->/+1,/<----- key stop ----->/-1jp
q
EOF
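For comparison, a sed and tr sketch that handles the same kind of input, where unrelated lines surround the key. The shortened sample file is made up for illustration, and the marker lines are assumed to contain the text "key start" and "key stop":

```shell
# Sample file with unrelated lines before and after the key block.
key_file=$(mktemp)
cat > "$key_file" <<'EOF'
aaaaa
<----- key start ----->
lkdjasdjskdjaskdjasdkj
skdhfjlkdfjlkdsfjsdlfk
<----- key stop ----->
bbbbb
EOF

# Select the marker-to-marker range, drop the marker lines themselves,
# print what is left, then strip the newlines.
one_line=$(sed -n '/key start/,/key stop/{/key start/d;/key stop/d;p;}' "$key_file" | tr -d '\n')
printf '%s\n' "$one_line"

rm -f "$key_file"
```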

Add Tab Separator to Grep

I am new to grep and awk, and I would like to create tab separated values in the "frequency.txt" file output (this script looks at a large corpus and then outputs each individual word and how many times it is used in the corpus - I modified it for the Khmer language). I've looked around ( grep a tab in UNIX ), but I can't seem to find an example that makes sense to me for this bash script (I'm too much of a newbie).
I am using this bash script in cygwin:
#!/bin/bash
# Create a tally of all the words in the corpus.
#
echo Creating tally of word frequencies...
#
sed -e 's/[a-zA-Z]//g' -e 's/​/ /g' -e 's/\t/ /g' \
-e 's/[«|»|:|;|.|,|(|)|-|?|។|”|“]//g' -e 's/[0-9]//g' \
-e 's/ /\n/g' -e 's/០//g' -e 's/១//g' -e 's/២//g' \
-e 's/៣//g' -e 's/៤//g' -e 's/៥//g' -e 's/៦//g' \
-e 's/៧//g' -e 's/៨//g' -e 's/៩//g' dictionary.txt | \
tr [:upper:] [:lower:] | \
sort | \
uniq -c | \
sort -rn > frequency.txt
grep -Fwf dictionary.txt frequency.txt | awk '{print $2 "," $1}'
Awk is printing with a comma, but that is only on-screen. How can I place a tab (a comma would work as well), between the frequency and the term?
Here's a small part of the dictionary.txt file (Khmer does not use spaces, but in this corpus there is a non-breaking space between each word which is converted to a space using sed and regular expressions):
ព្រះ​វិញ្ញាណ​នឹង​ប្រពន្ធ​ថ្មោង​ថ្មី​ពោល​ថា
អញ្ជើញ​មក ហើយ​អ្នក​ណា​ដែល​ឮ​ក៏​ថា
អញ្ជើញ​មក​ដែរ អ្នក​ណា​ដែល​ស្រេក
នោះ​មាន​តែ​មក ហើយ​អ្នក​ណា​ដែល​ចង់​បាន
មាន​តែ​យក​ទឹក​ជីវិត​នោះ​ចុះ
ឥត​ចេញ​ថ្លៃ​ទេ។
Here is an example output of frequency.txt as it is now (frequency and then term):
25605 នឹង 25043 ជា 22004 បាន 20515 នោះ
I want the output frequency.txt to look like this (where TAB is an actual tab character):
25605TABនឹង 25043TABជា 22004TABបាន 20515TABនោះ
Thanks for your help!
You should be able to replace the whole lengthy sed command with this:
tr -d '[a-zA-Z][0-9]«»:;.,()-?។”“|០១២៣៤៥៦៧៨៩'
tr '\t' ' '
Comments:
's/​/ /g' - the first two slashes mean re-use the previous match which was [a-z][A-Z] and replace them with spaces, but they were deleted so this is a no-op
's/[«|»|:|;|.|,|(|)|-|?|។|”|“]//g' - the pipe characters don't delimit alternatives inside square brackets, they are literal (and more than one is redundant), the equivalent would be 's/[«»:;.,()-?។”“|]//g' (leaving one pipe in case you really want to delete them)
's/ /\n/g' - earlier, you replaced tabs with spaces, now you're replacing the spaces with newlines
You should be able to have the tabs you want by inserting this in your pipeline right after the uniq:
sed 's/^ *\([0-9]\+\) /\1\t/'
If you want the AWK command to output a tab:
awk 'BEGIN{OFS="\t"} {print $2, $1}'
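A self-contained sketch of the tallying tail of the pipeline with tab-separated output, count first and term second as in the desired frequency.txt. The input words are placeholders, and terms are assumed to contain no whitespace:

```shell
# Count duplicate lines, sort by frequency, then let awk rebuild each
# line with a tab between the count and the term.
tally=$(printf '%s\n' b a b | sort | uniq -c | sort -rn \
  | awk -v OFS='\t' '{ print $1, $2 }')
printf '%s\n' "$tally"
```

Because awk splits on runs of whitespace, the leading padding that uniq -c adds is discarded automatically.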
What about writing the awk output to a file with ">"?
The following script should get you where you need to go. The pipe to tee will let you see output on the screen while at the same time writing the output to ./outfile
#!/bin/sh
sed ':a;N;s/[a-zA-Z0-9។០១២៣៤៥៦៧៨៩\n«»:;.,()?”“-]//g;ta' < dictionary.txt | \
gawk '{$0=toupper($0);for(i=1;i<=NF;i++)a[$i]++}
END{for(item in a)printf "%s\t%d ", item, a[item]}' | \
tee ./outfile
