Excluding '#' comments from a sed selection

I'm trying to read a config value from a YAML file, but one line contains the same key, commented out. That is:
...
#database_name: prod
database_name: demo
database_user: root
database_password: password
...
I'm getting all values with this sed/awk command:
DATABASE_NAME=$(sed -n '/database_name/p' "$CONFIG_PATH" | awk -F' ' '{print $2}');
Now, if I do that, I get the right values for the user and password, but I get the name twice.
The question is:
How do I exclude '#' comments from my sed selection?

You might as well use awk for the whole operation:
DATABASE_NAME=$(awk -F' ' '$1!~/^#/ && /database_name/{print $2}' "$CONFIG_PATH")
This will exclude all lines that start with # (comments).
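
A minimal, self-contained sketch of the same idea, assuming a sample config written to a temporary file. The helper name get_config_value and the file contents are made up for illustration. It compares the whole first field against the key, so a commented-out line, or a key that merely contains the name, is not picked up.

```shell
# Hypothetical helper: print the value for a key from a "key: value" file.
# $1 = key, $2 = file. Lines whose first field is not exactly "key:" are skipped,
# which also skips "#key: ..." and "  # key: ..." comment lines.
get_config_value() {
    awk -v key="$1" '$1 == (key ":") { print $2; exit }' "$2"
}

# Sample config (assumed contents, modelled on the question).
config=$(mktemp)
cat > "$config" <<'EOF'
#database_name: prod
  # database_name: staging
database_name: demo
database_user: root
database_password: password
EOF

DATABASE_NAME=$(get_config_value database_name "$config")
DATABASE_USER=$(get_config_value database_user "$config")
echo "$DATABASE_NAME"   # prints: demo
rm -f "$config"
```

The exit after the first hit means only one value is ever returned, even if the key appears twice uncommented.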

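If there is always a character before the d, use /[^#]database_name/p.
If not, you can use /\(^\|[^#]\)database_name/p. Note that the \| alternation is an extension to basic regular expressions and is not available in every sed, and that both patterns only reject a # placed directly before the key.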

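Braces for grouping commands are standard sed, but some implementations expect a semicolon or newline before the closing brace; the one-line form below works in GNU sed.
sed -n '/database_name/ {/^[[:blank:]]*#/!p}' "$CONFIG_PATH"
For lines matching "database_name", print the line unless it begins with optional blanks followed by a hash.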
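
As a sketch, the same filter can be written so that it does not depend on how a particular sed parses the closing brace: each part of the group is passed with its own -e, which sed joins with newlines. The sample file below is an assumption for illustration.

```shell
# Sample config (assumed contents).
config=$(mktemp)
printf '%s\n' '#database_name: prod' '   #database_name: old' \
    'database_name: demo' 'database_user: root' > "$config"

# For lines containing "database_name", print those that do not start
# with optional blanks followed by '#'.
result=$(sed -n -e '/database_name/{' -e '/^[[:blank:]]*#/!p' -e '}' "$config")
echo "$result"   # prints: database_name: demo
rm -f "$config"
```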

If lines in the file may start with blank spaces, strip the leading blanks first:
sed 's/^[[:blank:]]*//' file.txt | awk '/^database/{print}'

I ended up using @etan-reisner's solution.
Here is another solution to my particular problem that I found along the way:
DATABASE_NAME=$(grep -v '^[[:space:]]*#' "$CONFIG_PATH" | sed -n '/database_name/p' | awk -F' ' '{print $2}')
The grep -v step filters out every line that starts with optional whitespace followed by a hash.

Related

Trimming a textfile

I want to trim a text file by deleting all lines from line n to the end of the file. I tried to use sed for that. The sed command for n=26 should look like this:
sed -i '26,$d' /path/to/textfile
I don't know n beforehand, but I know that the line contains a unique text. So I tried it this way:
myvar=`grep -n 'unique text' /path/to/textfile | awk -F":" '{print $1 }'`
sed -i "${myvar}"',$d' /path/to/textfile
That works and deletes all the wanted lines, but it throws this error message:
sed: -e expression # 1, character 1: unknown command: »,«
So I tried changing my command to:
myvar=`grep -n 'unique text' /path/to/textfile | awk -F":" '{print $1 }'`
sed -i "${myvar},$d" /path/to/textfile
With that I get the same error message, and it doesn't delete the lines.
I tried some variations with ' and " and with how the variable is inserted, but it never works as wanted. Does anyone know what I am doing wrong?
I would also appreciate other methods for trimming the text file, as long as I can use them in a bash script.

You can replace the fixed line number with a regular expression matching the line to start at.
sed -i '/unique text/,$d' /path/to/textfile
You can also use ed to edit the file, rather than rely on the non-standard -i option of sed.
printf '/unique text/,$d\nwq\n' | ed /path/to/textfile
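
The attempts in the question most likely fail because of quoting and an empty variable rather than sed itself. Inside double quotes the shell expands $d as a (normally unset) variable, so sed receives the address without a command. An error at character 1 about "," suggests that myvar was empty when sed ran, for example on a second run after the matching line had already been deleted. The following is a sketch under those assumptions, using a throwaway file and made-up variable names:

```shell
# Throwaway sample file (assumed contents).
file=$(mktemp)
printf '%s\n' 'keep 1' 'keep 2' 'unique text here' 'drop 1' 'drop 2' > "$file"

# Line number of the first match; empty if there is none.
n=$(grep -n 'unique text' "$file" | head -n 1 | cut -d: -f1)

if [ -n "$n" ]; then
    # "$n" is expanded by the shell; ',$d' is single-quoted so sed sees a literal $.
    trimmed=$(sed "$n"',$d' "$file")
    echo "$trimmed"
else
    echo "pattern not found, file left unchanged" >&2
fi
rm -f "$file"
```

Here the result is captured in a variable to keep the sketch portable; to change the file itself, redirect the output to a temporary file and move it over the original, or use sed -i where it is available.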

grep serial numbers not starting with specific prefix

I have this file (serials.txt) containing serial numbers:
S/N:175-1915011190
S/N:244-1920023447
S/N:335-1920101144
S/N:244-1920101149
Using grep or a similar tool, I want to select all serials NOT starting with '244'.
I'm able to select all the '244' with grep -Eo '244-[0-9]*' serials.txt but I want the opposite.
Something like grep -Eo '(^244)-[0-9]*' serials.txt
The output should be (without S/N:)
175-1915011190
335-1920101144

The following awk command may help:
awk '!/S\/N:244/' Input_file
EDIT: The command above prints the complete line. If you only want the serial number itself, use:
awk -F':' '!/S\/N:244/{print $2}' Input_file
EDIT2: Adding a sed solution as well:
sed -n '/:244/d;s/.*://;p' Input_file

The -v option on grep would be helpful here, and then cut to remove the leading cruft:
grep -v ':244-' serials.txt | cut -c5-

Here you go, without S/N:
grep -v ':244' serials.txt | cut -d':' -f2
grep -v inverts the match for :244, and cut then prints field 2 using : as the delimiter.

awk -F':' '$2!~/^244/{print $2}' file
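
For comparison, a single sed call can do both steps. This is only a sketch, assuming every line has the S/N: prefix shown in the question; it anchors the pattern at the start of the line so that a 244 elsewhere in the serial is not matched.

```shell
# Sample data from the question.
serials=$(mktemp)
printf '%s\n' 'S/N:175-1915011190' 'S/N:244-1920023447' \
    'S/N:335-1920101144' 'S/N:244-1920101149' > "$serials"

# Delete lines whose serial starts with 244-, then strip the "S/N:" prefix.
selected=$(sed -e '/^S\/N:244-/d' -e 's/^S\/N://' "$serials")
echo "$selected"
rm -f "$serials"
```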

Strip only domain name out of input url string

I did a bit of searching already but cannot seem to find an elegant way of doing this. I'd like to process a list like the one below and end up with a plain-text output file containing only the domain name, with no http:// and nothing after the /.
So a list like this:
http://7wind.ru/file/Behind+the+dune/
http://aldersgatencsc.org/open.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=mz34ligqc4&utm_content=bgi71kl5oy
http://amunow.org/test.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=dhxg1r4l76&utm_content=tr71txtklp
I want to end up with plain text output file like this.
7wind.ru
aldersgatencsc.org
amunow.org

Given:
$ echo "$txt"
http://7wind.ru/file/Behind+the+dune/
http://aldersgatencsc.org/open.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=mz34ligqc4&utm_content=bgi71kl5oy
http://amunow.org/test.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=dhxg1r4l76&utm_content=tr71txtklp
You can use cut:
$ echo "$txt" | cut -d'/' -f3
7wind.ru
aldersgatencsc.org
amunow.org
Or, if your content is in a file:
$ cut -d'/' -f3 file
7wind.ru
aldersgatencsc.org
amunow.org
Then redirect that to the file you want:
$ cut -d'/' -f3 file >new_file

awk -F \/ '{ print $3 }' outputfile > newfile
Print the 3rd field delimited by /

$ sed -r 's#.*//([^/]*)/.*#\1#' Input_file
7wind.ru
aldersgatencsc.org
amunow.org

Try the following awk commands.
Solution 1:
awk '{sub(/.*\/\//,"");sub(/\/.*/,"");print}' Input_file
Solution 2:
awk '{match($0,/\/.[^/]*/);print substr($0,RSTART+2,RLENGTH-2)}' Input_file

This works by stripping the protocol and :// first, then anything after and including the next slash.
sed "s|.*://||; s|/.*||" url-list.txt
Add -i to change the file directly.
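
The same two-step idea (drop the scheme, then drop everything from the next slash) can also be sketched with plain shell parameter expansion, without calling sed. The helper name url_host and the shortened sample URLs are assumptions for illustration; the sketch does not strip ports or user information.

```shell
# Hypothetical helper: print the host part of a URL.
url_host() {
    host=${1#*://}      # drop the scheme and "://", if present
    host=${host%%/*}    # drop the first "/" and everything after it
    printf '%s\n' "$host"
}

# Sample list (assumed contents, shortened from the question).
urls=$(mktemp)
cat > "$urls" <<'EOF'
http://7wind.ru/file/Behind+the+dune/
http://aldersgatencsc.org/open.php?utm_source=5r2ke0ow6k
https://amunow.org
EOF

hosts=$(while IFS= read -r url; do url_host "$url"; done < "$urls")
echo "$hosts"
rm -f "$urls"
```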

Try this regexp:
((http|https):\/\/)?([a-zA-Z0-9.-]+)(\/)?
Take the third group of the first match.
Be careful, though: it may also match invalid URLs.

Reading numbers from a text line in bash shell

I'm trying to write a bash shell script, that opens a certain file CATALOG.dat, containing the following lines, made of both characters and numbers:
event_0133_pk.gz
event_0291_pk.gz
event_0298_pk.gz
event_0356_pk.gz
event_0501_pk.gz
What I want to do is print the numbers (only the numbers) into a new file NUMBERS.dat, using something like > ./NUMBERS.dat, to get:
0133
0291
0298
0356
0501
My problem is: how do I extract the numbers from the text lines? Is there something to make the script read just the number as a variable, like event_0%d_pk.gz in C/C++?

A grep solution:
grep -oP '[0-9]+' CATALOG.dat >NUMBERS.dat
A sed solution:
sed 's/[^0-9]//g' CATALOG.dat >NUMBERS.dat
And an awk solution:
awk -F"[^0-9]+" '{print $2}' CATALOG.dat >NUMBERS.dat

There are many ways that you can achieve your result. One way would be to use awk:
awk -F_ '{print $2}' CATALOG.dat > NUMBERS.dat
This sets the field separator to an underscore, then prints the second field which contains the numbers.

Awk:
awk 'gsub(/[^[:digit:]]/,"")' infile
Bash:
while IFS= read -r line; do echo "${line//[!0-9]/}"; done < infile
tr:
tr -cd '[:digit:]\n' <infile

You can use grep to extract the number part:
grep -oP '(?<=_)\d+(?=_)' CATALOG.dat
This gives the output:
0133
0291
0298
0356
0501
Or, more simply:
grep -oP '\d+' CATALOG.dat

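You don't need Perl mode in grep for this; a basic regular expression (BRE) is enough. Note that \+ is a GNU extension to BRE; the portable spelling is [[:digit:]][[:digit:]]*.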
grep -o '[[:digit:]]\+' CATALOG.dat > NUMBERS.dat
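
On the question of reading the number "like event_0%d_pk.gz in C": the closest shell equivalent is probably to strip the fixed prefix and suffix with parameter expansion. A minimal sketch, assuming every line has exactly the form event_NNNN_pk.gz and using a temporary file in place of CATALOG.dat:

```shell
# Stand-in for CATALOG.dat (assumed contents).
catalog=$(mktemp)
printf '%s\n' event_0133_pk.gz event_0291_pk.gz event_0298_pk.gz > "$catalog"

numbers=$(
    while IFS= read -r name; do
        num=${name#event_}     # strip the fixed prefix
        num=${num%_pk.gz}      # strip the fixed suffix
        printf '%s\n' "$num"   # leading zeros are kept because num is a string
    done < "$catalog"
)
echo "$numbers"
rm -f "$catalog"
```

In a real script the loop output would be redirected to NUMBERS.dat instead of being captured in a variable.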

Display all fields except the last

I have a file as shown below:
1.2.3.4.ask
sanma.nam.sam
c.d.b.test
I want to remove the last field from each line. The delimiter is . and the number of fields is not constant.
Can anybody help me with an awk or sed solution? I can't use perl here.

Both the sed and the awk solutions below work independently of the number of fields.
Using sed:
$ sed -r 's/(.*)\..*/\1/' file
1.2.3.4
sanma.nam
c.d.b
Note: -r is the flag for extended regular expressions in GNU sed; other versions use -E, so check with man sed. If your version of sed has no such flag, just escape the parentheses:
sed 's/\(.*\)\..*/\1/' file
1.2.3.4
sanma.nam
c.d.b
The sed solution does a greedy match up to the last . and captures everything before it, then replaces the whole line with only the captured part (n-1 fields). Use the -i option if you want the changes written back to the file.
Using awk:
$ awk 'BEGIN{FS=OFS="."}{NF--; print}' file
1.2.3.4
sanma.nam
c.d.b
The awk solution simply prints n-1 fields. To store the changes back to the file, use redirection:
$ awk 'BEGIN{FS=OFS="."}{NF--; print}' file > tmp && mv tmp file

Reverse, cut, reverse back.
rev file | cut -d. -f2- | rev >newfile
Or, replace from last dot to end with nothing:
sed 's/\.[^.]*$//' file >newfile
The regex [^.] matches one character which is not dot (or newline). You need to exclude the dot because the repetition operator * is "greedy"; it will select the leftmost, longest possible match.
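
If the lines are already being processed in a shell loop, the same "remove from the last dot to the end" step can be sketched with parameter expansion instead of sed. The helper name strip_last_field and the temporary file are assumptions for illustration.

```shell
# Hypothetical helper: drop the last dot-separated field of its argument.
strip_last_field() {
    # ${1%.*} removes the shortest suffix matching ".*", i.e. the last dot
    # and everything after it; a line without a dot is left unchanged.
    printf '%s\n' "${1%.*}"
}

# Sample data from the question.
input=$(mktemp)
printf '%s\n' '1.2.3.4.ask' 'sanma.nam.sam' 'c.d.b.test' > "$input"

stripped=$(while IFS= read -r line; do strip_last_field "$line"; done < "$input")
echo "$stripped"
rm -f "$input"
```

For large files the sed or awk versions will be faster, since the loop handles one line per iteration in the shell.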

Use cut on the reversed string:
rev yourFile | cut -d "." -f 2- | rev

If you want to keep the trailing "." use the following:
awk '{gsub(/[^.]*$/,"");print}' your_file
