grep to ignore the first character - shell

If I have a bunch of lines that have :
//mainHtml = "https://
mainHtml = "https:
//https:www.google.com
public String ydmlHtml = "https://remando
aasdfa dsfadsf a asdfasd fasd fsdafsdaf
Now I want to grep only those lines which have "https:" in them, but they should NOT start with "//"
So far I have :
cat $javaFile | grep -e '\^\/ *https:*\'
where $javaFile is the file I want to look for the words.
My output is a blank.
Please help :)

You can use character class to negate the start of lines. We use -E option to use ERE or Extended Regular Expression.
grep -E '^[^/]{2}.*https' file
With your sample data:
$ cat file
//mainHtml = "https://
mainHtml = "https:
//https:www.google.com
public String ydmlHtml = "https://remando
aasdfa dsfadsf a asdfasd fasd fsdafsdaf
$ grep -E '^[^/]{2}.*https' file
mainHtml = "https:
public String ydmlHtml = "https://remando
You may also choose to write it without the -E option by saying:
grep '^[^/][^/].*https' file

In two steps:
grep -v '^//' | grep 'https:'
grep -v '^//' removes the lines starting with //
grep 'https:' gets the lines containing http:

Related

Bash regex: get value in conf file preceded by string with dot

I have to get my db credentials from this configuration file:
# Database settings
Aisse.LocalHost=localhost
Aisse.LocalDataBase=mydb
Aisse.LocalPort=5432
Aisse.LocalUser=myuser
Aisse.LocalPasswd=mypwd
# My other app settings
Aisse.NumDir=../../data/Num
Aisse.NumMobil=3000
# Log settings
#Aisse.Trace_AppliTpv=blabla1.tra
#Aisse.Trace_AppliCmp=blabla2.tra
#Aisse.Trace_AppliClt=blabla3.tra
#Aisse.Trace_LocalDataBase=blabla4.tra
In particular, I want to get the value mydb from line
Aisse.LocalDataBase=mydb
So far, I have developed this
mydbname=$(echo "$my_conf_file.conf" | grep "LocalDataBase=" | sed "s/LocalDataBase=//g" )
that returns
mydb #Aisse.Trace_blabla4.tra
that would be ok if it did not return also the comment string.
Then I have also tryed
mydbname=$(echo "$my_conf_file.conf" | grep "Aisse.LocalDataBase=" | sed "s/LocalDataBase=//g" )
that retruns void string.
How can I get only the value that is preceded by the string "Aisse.LocalDataBase=" ?
Using sed
$ mydbname=$(sed -n 's/Aisse\.LocalDataBase=//p' input_file)
$ echo $mydbname
mydb
I'm afraid you're being incomplete:
You mention you want the line, containing "LocalDataBase", but you don't want the line in comment, let's start with that:
A line which contains "LocalDataBase":
grep "LocalDataBase" conf.conf.txt
A line which contains "LocalDataBase" but who does not start with a hash:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#"
??? grep -v "^ *#"
That means: don't show (-v) the lines, containing:
^ : the start of the line
* : a possible list of space characters
# : a hash character
Once you have your line, you need to work with it:
You only need the part behind the equality sign, so let's use that sign as a delimiter and show the second column:
cut -d '=' -f 2
All together:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#" | cut -d '=' -f 2
Are we there yet?
No, because it's possible that somebody has put some comment behind your entry, something like:
LocalDataBase=mydb #some information
In order to prevent that, you need to cut that comment too, which you can do in a similar way as before: this time you use the hash character as a delimiter and you show the first column:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#" | cut -d '=' -f 2 | cut -d '#' -f 1
Have fun.
You may use this sed:
mydbname=$(sed -n 's/^[^#][^=]*LocalDataBase=//p' file)
echo "$mydbname"
mydb
RegEx Details:
^: Start
[^#]: Matches any character other than #
[^=]*: Matches 0 or more of any character that is not =
LocalDataBase=: Matches text LocalDataBase=
You can use
mydbname=$(sed -n 's/^Aisse\.LocalDataBase=\(.*\)/\1/p' file)
If there can be leading whitespace you can add [[:blank:]]* after ^:
mydbname=$(sed -n 's/^[[:blank:]]*Aisse\.LocalDataBase=\(.*\)/\1/p' file)
See this online demo:
#!/bin/bash
s='# Database settings
Aisse.LocalHost=localhost
Aisse.LocalDataBase=mydb
Aisse.LocalPort=5432
Aisse.LocalUser=myuser
Aisse.LocalPasswd=mypwd
# My other app settings
Aisse.NumDir=../../data/Num
Aisse.NumMobil=3000
# Log settings
#Aisse.Trace_AppliTpv=blabla1.tra
#Aisse.Trace_AppliCmp=blabla2.tra
#Aisse.Trace_AppliClt=blabla3.tra
#Aisse.Trace_LocalDataBase=blabla4.tra'
sed -n 's/^Aisse\.LocalDataBase=\(.*\)/\1/p' <<< "$s"
Output:
mydb
Details:
-n - suppresses default line output in sed
^[[:blank:]]*Aisse\.LocalDataBase=\(.*\) - a regex that matches the start of a string (^), then zero or more whiespaces ([[:blank:]]*), then a Aisse.LocalDataBase= string, then captures the rest of the line into Group 1
\1 - replaces the whole match with the value of Group 1
p - prints the result of the successful substitution.

Grep and awk use

i try one day but dont fixed. I dont know this method.
content query --uri content://com.android.contacts/contacts | grep "+9053158888" | awk -F'[,,= ]' '{cmd="content delete --uri content://com.android.contacts/contacts/"$(NF-3);system(cmd)}'
but not finding
My string
Row: 9991 last_time_contacted=0, phonetic_name=NULL, custom_ringtone=NULL, contact_status_ts=NULL, pinned=0, photo_id=NULL, photo_file_id=NULL, contact_status_res_package=NULL, contact_chat_capability=NULL, contact_status_icon=NULL, display_name_alt=+90532555688, sort_key_alt=+90532555688, in_visible_group=1, starred=0, contact_status_label=NULL, phonebook_label=#, is_user_profile=0, has_phone_number=1, display_name_source=40, phonetic_name_style=0, send_to_voicemail=0, lookup=0r10070-24121C1814241820221C1A14.3789r10071-24121C1814241820221C1A14.0r10072-24121C1814241820221C1A14.0r10073-24121C1814241820221C1A14.0r10074-24121C1814241820221C1A14.0r10075-24121C1814241820221C1A14.0r10078-24121C1814241820221C1A14.0r10082-24121C1814241820221C1A14.0r10083-24121C1814241820221C1A14.0r10084-24121C1814241820221C1A14.0r10085-24121C1814241820221C1A14.0r10086-24121C1814241820221C1A14.0r10087-24121C1814241820221C1A14.0r10092-24121C1814241820221C1A14.0r10094-24121C1814241820221C1A14.0r10097-24121C1814241820221C1A14, phonebook_label_alt=#, contact_last_updated_timestamp=1612984348874, photo_uri=NULL, phonebook_bucket=213, contact_status=NULL, display_name=+90532555688, sort_key=+90532555688, photo_thumb_uri=NULL, contact_presence=NULL, in_default_directory=1, times_contacted=0, _id=10097, name_raw_contact_id=10070, phonebook_bucket_alt=213
i need string " _id=10097 "
You may use this grep to find word _id followed by a = and 1+ digits:
... | grep -Eo '\b_id=[0-9]+'
_id=10097
To get all occurrences of if try following, written and tested with shown samples in GNU grep. Where str is your shell variable have your shown sample input in it.
echo "$str" | grep -oP ', \K_id=\d+'
OR try with awk:
echo "$str" |
awk 'match($0,/, _id=[0-9]+/){print substr($0,RSTART+2,RLENGTH-2)}'
Above will output as:
_id=10097

Extract values from a property file using bash

I have a variable which contains key/values separated by space:
echo $PROPERTY
server_geo=BOS db.jdbc_url=jdbc\:mysql\://mysql-test.com\:3306/db02 db.name=db02 db.hostname=/mysql-test.com datasource.class.xa=com.mysql.jdbc.jdbc2.optional.MysqlXADataSource server_uid=BOS_mysql57 hibernate33.dialect=org.hibernate.dialect.MySQL5InnoDBDialect hibernate.connection.username=db02 server_labels=mysql57,mysql5,mysql db.jdbc_class=com.mysql.jdbc.Driver db.schema=db02 hibernate.connection.driver_class=com.mysql.jdbc.Driver uuid=a19ua19 db.primary_label=mysql57 db.port=3306 server_label_primary=mysql57 hibernate.dialect=org.hibernate.dialect.MySQL5InnoDBDialect
I'd need to extract the values of the single keys, for example db.jdbc_url.
Using one code snippet I've found:
echo $PROPERTY | sed -e 's/ db.jdbc_url=\(\S*\).*/\1/g'
but that returns also other properties found before my key.
Any help how to fix it ?
Thanks
If db.name always follow db.jdbc_url, then use grep lookaround,
$ echo "${PROPERTY}" | grep -oP '(?<=db.jdbc_url=).*(?=db.name)'
jdbc\:mysql\://mysql-test.com\:3306/db02
or add the VAR to an array,
$ myarr=($(echo $PROPERTY))
$ echo "${myarr[1]}" | grep -oP '(?<=db.jdbc_url=).*(?=$)'
jdbc\:mysql\://mysql-test.com\:3306/db02
This is caused because you are using the substitute command (sed s/.../.../), so any text before your regex is kept as is. Using .* before db\.jdbc_url along with the begin (^) / end ($) of string marks makes you match the whole content of the variable.
In order to be totaly safe, your regex should be :
sed -e 's/^.*db\.jdbc_url=\(\S*\).*$/\1/g'
You can use grep for this, like so:
echo $PROPERTY | grep -oE "db.jdbc_url=\S+" | cut -d'=' -f2
The regex is very close to the one you used with sed.
The -o option is used to print the matched parts of the matching line.
Edit: if you want only the value, cut on the '='
Edit 2: egrep say it is deprecated, so use grep -oE instead, same result. Just to cover all bases :-)

Count lines following a pattern from file

For example I have a file test.json that contains a series of line containing:
header{...}
body{...}
other text (as others)
empty lines
I wanted to run a script that returns the following
Counted started on : test.json
- headers : 4
- body : 5
- <others>
Counted finished : <time elapsed>
What I got so far is this.
count_file() {
echo "Counted started on : $1"
#TODO loop
cat $1 | grep header | wc -l
cat $1 | grep body | wc -l
#others
echo "Counted finished : " #TODO timeElapsed
}
Edit:
Edit question and added code snippet
Perl on Command Line
perl -E '$match=$ARGV[1];open(Input, "<", $ARGV[0]);while(<Input>){ ++$n if /$match/g } say $match," ",$n;' your-file your-pattern
For me
perl -E '$match=$ARGV[1];open(Input, "<", $ARGV[0]);while(<Input>){ ++$n if /$match/g } say $match," ",$n;' parsing_command_line.pl my
It counts how many number of pattern my are, in my script parsing_command_line.pl
output
my 3
For you
perl -E '$match=$ARGV[1];open(Input, "<", $ARGV[0]);while(<Input>){ ++$n if /$match/g } say $match," ",$n;' test.json headers
NOTE
Write all code in one line on your command prompt
First argument is your file
Second is your pattern
This is not a complete code since you have to enter all your pattern one-by-one
You can capture the result of a commande in a variable, like:
result=`cat $1 | grep header | wc -l`
and then print the result:
echo "# headers : $b"
` is the eval operator that let replace the whole expression by the output of the command inside.

Grep (Bash) error

I have a file like this called new.samples.dat
-4.5000000000E-01 8.0000000000E+00 -1.3000000000E-01
5.0000000000E-02 8.0000000000E+00 3.4000000000E-01
...
I have to search all this numbers of this file in another file called Remaining.Simulations.dat and copy them in another file. I did like this
for sample_index in $(seq 1 100)
do
sample=$(awk 'NR=='$sample_index'' new.samples.dat)
grep "$sample" Remaining.Simulations.dat >> Previous.Training.dat
done
It works almost fine but it does not copy all the $sample into Previous.Training.dat even if I am sure that these are in Remaining.Simulations.dat
This errors appear
grep: invalid option -- '.'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
Do you have any idea how to solve it?Thank you
It's because you're trying to grep for something like -4.5 and grep is treating that as an option rather than a search string. If you use -- to indicate there are no more options, this should work okay:
pax> echo -4.5000000000E-01 | grep -4.5000000000E-01
grep: invalid option -- '.'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
pax> echo -4.5000000000E-01 | grep -- -4.5000000000E-01
-4.5000000000E-01
In addition, if you pass the string 7.2 to grep, it will match any line containing 7 followed by any character followed by 2 since:
Regular expressions treat . as a special character; and
Without start and end markers, 7.2 will also match 47.2, 7.25 and so on.
With awk you can try something like:
awk '
NR==FNR {
for (i=1;i<=NF;i++) {
numbers[$i]++
}
next
}
{
for (number in numbers)
if (index ($0,number) > 0) {
print $0
}
}' new.samples.dat Remaining.Simulations.dat > anotherfile

Resources