I have to get my db credentials from this configuration file:
# Database settings
# My other app settings
# Log settings
In particular, I want to get the value mydb from line
So far, I have developed this
mydbname=$(echo "$my_conf_file.conf" | grep "LocalDataBase=" | sed "s/LocalDataBase=//g" )
that returns
mydb #Aisse.Trace_blabla4.tra
which would be OK if it did not also return the comment string.
Then I also tried
mydbname=$(echo "$my_conf_file.conf" | grep "Aisse.LocalDataBase=" | sed "s/LocalDataBase=//g" )
which returns an empty string.
How can I get only the value that is preceded by the string "Aisse.LocalDataBase=" ?
Using sed
$ mydbname=$(sed -n 's/Aisse\.LocalDataBase=//p' input_file)
$ echo $mydbname
I'm afraid your question is incomplete.
You mention that you want the line containing "LocalDataBase", but not the one that is commented out, so let's start with that:
A line which contains "LocalDataBase":
grep "LocalDataBase" conf.conf.txt
A line which contains "LocalDataBase" but that does not start with a hash:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#"
What does grep -v "^ *#" do?
It means: don't show (-v) the lines that match:
^ : the start of the line
(space)* : zero or more space characters
# : a hash character
Once you have your line, you need to work with it:
You only need the part after the equals sign, so let's use that sign as a delimiter and show the second column:
cut -d '=' -f 2
All together:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#" | cut -d '=' -f 2
Are we there yet?
No, because somebody may have put a comment after your entry, something like:
LocalDataBase=mydb #some information
To handle that, you need to cut off the comment too, which you can do in the same way as before: this time use the hash character as the delimiter and show the first column:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#" | cut -d '=' -f 2 | cut -d '#' -f 1
Have fun.
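The steps above can be tried end to end with a small throwaway file. This is only a sketch: the sample config content and the temporary file are made up for illustration, and the final sed call is an extra step (not part of the pipeline above) that trims the blank left in front of the removed comment.

```shell
# Hypothetical sample config; key names mirror the question, values are invented.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Database settings
#Aisse.LocalDataBase=olddb
Aisse.LocalDataBase=mydb #some information
# Log settings
Aisse.Trace=blabla4.tra
EOF

# Keep lines mentioning the key, drop commented-out lines, take the text
# after '=', then cut off any trailing comment.
mydbname=$(grep "LocalDataBase" "$conf" | grep -v "^ *#" | cut -d '=' -f 2 | cut -d '#' -f 1)

# The value still ends with the space that preceded '#'; strip trailing blanks.
mydbname=$(printf '%s\n' "$mydbname" | sed 's/[[:blank:]]*$//')

echo "[$mydbname]"
rm -f "$conf"
```

The brackets in the echo are only there to make any leftover whitespace visible.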
You may use this sed:
mydbname=$(sed -n 's/^[^#][^=]*LocalDataBase=//p' file)
echo "$mydbname"
RegEx Details:
^: Start
[^#]: Matches any character other than #
[^=]*: Matches 0 or more of any character that is not =
LocalDataBase=: Matches text LocalDataBase=
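A quick way to see what this pattern accepts and rejects is to run it on a couple of made-up lines. The sketch below uses hypothetical sample data. It also shows one edge case worth knowing: a comment that is indented still matches, because [^#] only checks the very first character. The "strict" variant is one possible tightening, not part of the answer above.

```shell
conf=$(mktemp)

# Hypothetical data: one commented-out entry, one live entry.
printf '%s\n' '#Aisse.LocalDataBase=olddb' 'Aisse.LocalDataBase=mydb' > "$conf"
mydbname=$(sed -n 's/^[^#][^=]*LocalDataBase=//p' "$conf")
echo "$mydbname"

# Same data, but the comment is indented: the first character is now a
# space, so [^#] matches it and the commented-out value is printed too.
printf '%s\n' '  #Aisse.LocalDataBase=olddb' 'Aisse.LocalDataBase=mydb' > "$conf"
loose=$(sed -n 's/^[^#][^=]*LocalDataBase=//p' "$conf")

# Possible tightening: skip any line whose first non-blank character is '#'.
strict=$(sed -n '/^[[:blank:]]*#/!s/^[^=]*LocalDataBase=//p' "$conf")

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
rm -f "$conf"
```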
You can use
mydbname=$(sed -n 's/^Aisse\.LocalDataBase=\(.*\)/\1/p' file)
If there can be leading whitespace you can add [[:blank:]]* after ^:
mydbname=$(sed -n 's/^[[:blank:]]*Aisse\.LocalDataBase=\(.*\)/\1/p' file)
See this online demo:
s='# Database settings
# My other app settings
# Log settings
sed -n 's/^Aisse\.LocalDataBase=\(.*\)/\1/p' <<< "$s"
-n - suppresses default line output in sed
^[[:blank:]]*Aisse\.LocalDataBase=\(.*\) - a regex that matches the start of the line (^), then zero or more spaces or tabs ([[:blank:]]*), then the string Aisse.LocalDataBase=, then captures the rest of the line into Group 1
\1 - replaces the whole match with the value of Group 1
p - prints the result of the successful substitution.
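If the entry can be followed by a trailing comment, as in the output shown in the question, one possible variation is to capture only up to the first blank or #. This is a sketch under the assumption that the value itself contains neither character; the sample text in s is made up for illustration.

```shell
# Hypothetical config text; only the key name is taken from the question.
s='# Database settings
Aisse.LocalDataBase=mydb #Aisse.Trace_blabla4.tra
# Log settings'

# Group 1 stops at the first blank or '#'; the trailing .* consumes the
# rest of the line so that it is dropped by the substitution.
mydbname=$(printf '%s\n' "$s" |
    sed -n 's/^[[:blank:]]*Aisse\.LocalDataBase=\([^#[:blank:]]*\).*/\1/p')

echo "$mydbname"
```

Values that legitimately contain spaces would need a different rule, for example stripping only from the first " #" onwards.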
I have a text file, file.txt, with several thousand lines. It contains a lot of junk lines which I am not interested in, so I first match the lines I am interested in and use the cut command to pull out their fields. Each entry I am interested in is listed twice in the text file: once in a "definition" section and once in a "value" section. I want to retrieve the first value from the "definition" section, and then, for each entry found there, find its corresponding "value" section entry.
The first entry starts with ' gl_ ', while the 2nd entry would look like ' "gl_ ', starting with a '"'.
This is the code I have so far for looping through the text document, which then retrieves the values I am interested in and appends them to a .csv file:
while read -r line
if [[ $line == gl_* ]] ; then (param=$(cut -d'\' -f 1 $line) | def=$(cut -d'\' -f 2 $line) | type=$(cut -d'\' -f 4 $line) | prompt=$(cut -d'\' -f 8 $line))
while read -r glline
if [[ $glline == '"'$param* ]] ; then val=$(cut -d'\' -f 3 $glline) |
"$project";"$param";"$val";"$def";"$type";"$prompt" >> /filepath/file.csv
done < file.txt
done < file.txt
This seems to throw some syntax errors related to unexpected tokens near the first 'done' statement.
Example of text that needs to be parsed, and paired:
gl_one\User Defined\1\String\1\\1\Some Text
gl_two\User Defined\1\String\1\\1\Some Text also
gl_three\User Defined\1\Time\1\\1\Datetime now
So effectively, the while loop reads each line until it hits the first line that starts with 'gl_', and then stores that value (e.g. gl_one) in the variable 'param'.
It then starts the nested while loop, which looks for the line that starts with a ' " ' in front of the gl_ and matches the 'param' value. In other words, the script should pair the lines gl_one and "gl_one, gl_two and "gl_two, gl_three and "gl_three.
The text file is large, and these are settings that have been defined this way. I need to collect the values for each gl_ parameter, to save them together in a .csv file with their corresponding "gl_ values.
Wanted regex output stored in variables would be something like this:
first while loop:
$param = gl_one, $def = User Defined, $type = String, $prompt = Some Text
second while loop:
$val = Value1
Then it stores these variables to the file.csv, with semi-colon separators.
Currently, I get an error at the first 'done' statement, which seems to indicate an issue with the quotation marks. Apart from this, I am looking for general ideas and comments on the script. For example, I am not entirely sure whether I am matching the quoted "gl_ lines correctly, or whether the semi-colons are added correctly as .csv separators.
Edit: Overall, the script runs now, but it is extremely slow because of the inner while loop. Is there a faster way to match the two lines together and add them to the .csv file?
Any ideas and comments?
This will generate a file containing the data you want:
cat file.txt | grep gl_ | sed -E "s/\"//" | sort | sed '$!N;s/\n/\\/' | awk -F'\' '{print $1"; "$5"; "$7"; "$NF}' > /filepath/file.csv
It uses grep to extract all lines containing 'gl_'
then sed to remove the leading '"' from the lines that contain one [I have assumed there are no further '"' in the line]
The lines are sorted
sed joins each pair of lines, replacing the newline between them with a backslash
awk then prints the required columns according to your requirements
Output routed to the file.
LANG=C sort -t\\ -sd -k1,1 <file.txt |\
sed '
/^gl_/{ # if definition
N; # append next line to buffer
s/\n"gl_[^\\]*//; # if value, strip first column
t; # and start next loop
D; # otherwise, delete the line
}
' |\
awk -F\\ -v p="$project" -v OFS=\; '{print p,$1,$10,$2,$4,$8 }' \
sort the lines so that gl_... appears immediately before "gl_... (LANG=C sets the locale, which fixes LC_CTYPE); this assumes each definition appears before its value
sed to help ensure matching definition and value (may still fail if duplicate/missing value), and tidy for awk
awk to pull out relevant fields
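As an alternative to sorting, the pairing can also be done in a single awk pass that remembers each definition in an array and looks it up when the matching value line arrives. This is a different technique from the two answers above, sketched under loud assumptions: the layout of the "gl_ value lines is not shown in the question, so the sample below simply puts the value in the third backslash-separated field (as the asker's cut -f 3 suggests), every definition is assumed to precede its value, and the project name is made up.

```shell
# Hypothetical sample input; only the gl_ definition lines follow the question.
input=$(mktemp)
cat > "$input" <<'EOF'
junk line
gl_one\User Defined\1\String\1\\1\Some Text
gl_two\User Defined\1\Time\1\\1\Datetime now
more junk
"gl_one\x\Value1
"gl_two\x\Value2
EOF
project=demo   # hypothetical project name

result=$(awk -F'\\' -v p="$project" -v OFS=';' '
    # Definition line: remember def, type and prompt, keyed by parameter name.
    /^gl_/  { defs[$1] = $2 OFS $4 OFS $8; next }
    # Value line: strip the leading double quote and look the name up.
    /^"gl_/ {
        name = substr($1, 2)
        if (name in defs) print p, name, $3, defs[name]
    }
' "$input")

echo "$result"
rm -f "$input"
```

Each line is read only once, so the cost grows linearly with the file size instead of quadratically as with the nested while loops.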
I'm trying to write a script that looks at a part of a line, does a sort -u or something to look for unique occurrences, and then displays the output, sorted by the ORIGINAL ordering of the lines. In other words, only the FIRST occurrence of that part of the line would show up.
I managed to do it using cut, but my output just displays the cut portion of the data. How could I do it so that it gets the entire line?
Here's what I've got so far:
cut -d, -f6 infile.txt | cut -c4-11 | grep -n . | sort -t: -k2,2 -u | sort -t: -k1n,1 | cut -d: -f2-
I know the data doesn't have an extra : or a , in a place that would break this script. But this only outputs the data that was unique. How can I get the entire line? I would prefer to stay away from perl, but awk is okay (though I don't know it very well).
If the input file is this (note, the ABCDEFGH is not real, I just put it there to illustrate what I mean):
My program outputs:
I want to see:
Yes, awk is your best bet. Here's a mysterious example:
awk -F, '!seen[substr($6,4,8)]++' infile.txt
-F, : set the field separator to ,
substr($6,4,8) : up to 8 characters starting at the fourth character of the sixth field
seen[...]++ : seen is an associative array (dictionary). Increment the value associated with ..., and return the old value
!seen[...]++ : if there was no old value, perform the action
There is no action, only a condition, so the default action is performed if the test succeeds. The default action is to print the line. So the line will be printed if the relevant characters of the sixth field haven't yet been seen.
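To make the one-liner less mysterious, here it is next to a spelled-out version that does the same thing with an explicit test. The input lines are invented for illustration and only the sixth field matters.

```shell
# Hypothetical input: characters 4-11 of field 6 hold the key (a date here).
input=$(mktemp)
cat > "$input" <<'EOF'
A,f2,f3,f4,f5,xxx20130718yy,f7
B,f2,f3,f4,f5,xxx20130714yy,f7
C,f2,f3,f4,f5,xxx20130718yy,f7
D,f2,f3,f4,f5,xxx20130719yy,f7
EOF

# Compact form: print a line only the first time its key is seen.
short=$(awk -F, '!seen[substr($6,4,8)]++' "$input")

# Spelled-out form with the same effect.
long=$(awk -F, '{
    key = substr($6, 4, 8)        # 8 characters starting at position 4
    if (!(key in seen)) {         # first occurrence of this key
        seen[key] = 1
        print                     # print the whole original line
    }
}' "$input")

echo "$short"
rm -f "$input"
```

Line C is dropped because its key, 20130718, was already seen on line A. The remaining lines keep their original order.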
$ awk -F, '!seen[substr($6,4,8)]++' <<EOF
> A....,....,...........,.....,....,...20130718......,.........,...........,......
> B....,....,...........,.....,....,...20130714......,.........,...........,......
> C....,....,...........,.....,....,...20130718......,.........,...........,......
> D....,....,...........,.....,....,...20130719......,.........,...........,......
> E....,....,...........,.....,....,...20130713......,.........,...........,......
> F....,....,...........,.....,....,...20130714......,.........,...........,......
> G....,....,...........,.....,....,...20130630......,.........,...........,......
> H....,....,...........,.....,....,...20130718......,.........,...........,......