How to delete the lines above a given string in a file in Unix shell

I have a file as below -
Password:
Msg 2401, Level 11, State 2:
Server 'test':
Character set conversion is not available between client character set 'utf8' and server character set 'iso_1'.
No conversions will be done.
|Extraction_Date|Agent_Cde_1|Agent_Cde_2|Agent_Cde_3|Agent_Cde_4|Agent_Name
|20140902 |0010 | NULL| NULL| NULL|NULL
I want to delete all the lines that appear before the column names. The number of lines before the column names can vary every time. Is there a way to check for the 'Extraction_Date' string and delete all the lines above it using Unix commands?

This prints every line from the Extraction_Date header line onward:
awk '/^\|Extraction_Date/ {f=1} f' file
|Extraction_Date|Agent_Cde_1|Agent_Cde_2|Agent_Cde_3|Agent_Cde_4|Agent_Name
|20140902 |0010 | NULL| NULL| NULL|NULL
Or this may be ok:
awk '/^\|/' file

Try the grep command, which keeps every line that contains a pipe character:
grep -F '|' file

Using a sed address range:
sed -n '/^|Extraction_Date/,$p' file
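As a quick check of the address-range approach, the sketch below recreates a shortened version of the sample in a temporary file and keeps everything from the header line onward. The function name strip_preamble and the shortened sample are made up for illustration; they are not part of the question.

```shell
#!/bin/sh
# Hypothetical helper: print the header line and everything after it.
strip_preamble() {
    sed -n '/^|Extraction_Date/,$p' "$1"
}

# Recreate a shortened sample in a temporary file (path chosen by mktemp).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Password:
Msg 2401, Level 11, State 2:
No conversions will be done.
|Extraction_Date|Agent_Cde_1|Agent_Cde_2
|20140902 |0010 | NULL
EOF

result=$(strip_preamble "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

To replace the original file, write the result to a temporary file first and then move it over the original; redirecting straight back into the same file would truncate it before sed reads it.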

Related

Bash regex: get value in conf file preceded by string with dot

I have to get my db credentials from this configuration file:
# Database settings
Aisse.LocalHost=localhost
Aisse.LocalDataBase=mydb
Aisse.LocalPort=5432
Aisse.LocalUser=myuser
Aisse.LocalPasswd=mypwd
# My other app settings
Aisse.NumDir=../../data/Num
Aisse.NumMobil=3000
# Log settings
#Aisse.Trace_AppliTpv=blabla1.tra
#Aisse.Trace_AppliCmp=blabla2.tra
#Aisse.Trace_AppliClt=blabla3.tra
#Aisse.Trace_LocalDataBase=blabla4.tra
In particular, I want to get the value mydb from line
Aisse.LocalDataBase=mydb
So far, I have developed this
mydbname=$(echo "$my_conf_file.conf" | grep "LocalDataBase=" | sed "s/LocalDataBase=//g" )
that returns
mydb #Aisse.Trace_blabla4.tra
that would be ok if it did not return also the comment string.
Then I also tried
mydbname=$(echo "$my_conf_file.conf" | grep "Aisse.LocalDataBase=" | sed "s/LocalDataBase=//g" )
that returns an empty string.
How can I get only the value that is preceded by the string "Aisse.LocalDataBase=" ?
Using sed
$ mydbname=$(sed -n 's/Aisse\.LocalDataBase=//p' input_file)
$ echo $mydbname
mydb
I'm afraid you're being incomplete:
You mention you want the line, containing "LocalDataBase", but you don't want the line in comment, let's start with that:
A line which contains "LocalDataBase":
grep "LocalDataBase" conf.conf.txt
A line which contains "LocalDataBase" but which does not start with a hash:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#"
What does grep -v "^ *#" do?
It means: don't show (-v) the lines that match:
^ : the start of the line
* : a possible list of space characters
# : a hash character
Once you have your line, you need to work with it:
You only need the part behind the equality sign, so let's use that sign as a delimiter and show the second column:
cut -d '=' -f 2
All together:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#" | cut -d '=' -f 2
Are we there yet?
No, because it's possible that somebody has put some comment behind your entry, something like:
LocalDataBase=mydb #some information
In order to prevent that, you need to cut that comment too, which you can do in a similar way as before: this time you use the hash character as a delimiter and you show the first column:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#" | cut -d '=' -f 2 | cut -d '#' -f 1
Have fun.
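One detail of the cut-based pipeline: when an inline comment follows the value, cutting at the hash leaves the spaces that preceded it. A minimal sketch that adds a trailing-whitespace trim, using a made-up three-line sample held in a shell variable (the variable names are assumptions for illustration):

```shell
#!/bin/sh
# Made-up sample: one real entry with an inline comment, one commented-out entry.
conf='# Database settings
Aisse.LocalDataBase=mydb   # some information
#Aisse.Trace_LocalDataBase=blabla4.tra'

mydbname=$(printf '%s\n' "$conf" |
    grep 'LocalDataBase=' |       # lines mentioning the key
    grep -v '^ *#' |              # drop commented-out lines
    cut -d '=' -f 2 |             # keep what follows the equals sign
    cut -d '#' -f 1 |             # drop an inline comment, if any
    sed 's/[[:space:]]*$//')      # trim the spaces left before the comment

printf '[%s]\n' "$mydbname"
```

The brackets in the final printf only make any leftover whitespace visible.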
You may use this sed:
mydbname=$(sed -n 's/^[^#][^=]*LocalDataBase=//p' file)
echo "$mydbname"
mydb
RegEx Details:
^: Start
[^#]: Matches any character other than #
[^=]*: Matches 0 or more of any character that is not =
LocalDataBase=: Matches text LocalDataBase=
You can use
mydbname=$(sed -n 's/^Aisse\.LocalDataBase=\(.*\)/\1/p' file)
If there can be leading whitespace you can add [[:blank:]]* after ^:
mydbname=$(sed -n 's/^[[:blank:]]*Aisse\.LocalDataBase=\(.*\)/\1/p' file)
See this online demo:
#!/bin/bash
s='# Database settings
Aisse.LocalHost=localhost
Aisse.LocalDataBase=mydb
Aisse.LocalPort=5432
Aisse.LocalUser=myuser
Aisse.LocalPasswd=mypwd
# My other app settings
Aisse.NumDir=../../data/Num
Aisse.NumMobil=3000
# Log settings
#Aisse.Trace_AppliTpv=blabla1.tra
#Aisse.Trace_AppliCmp=blabla2.tra
#Aisse.Trace_AppliClt=blabla3.tra
#Aisse.Trace_LocalDataBase=blabla4.tra'
sed -n 's/^Aisse\.LocalDataBase=\(.*\)/\1/p' <<< "$s"
Output:
mydb
Details:
-n - suppresses default line output in sed
^[[:blank:]]*Aisse\.LocalDataBase=\(.*\) - a regex that matches the start of the line (^), then zero or more blank characters ([[:blank:]]*), then the Aisse.LocalDataBase= string, then captures the rest of the line into Group 1
\1 - replaces the whole match with the value of Group 1
p - prints the result of the successful substitution.
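If several keys need to be read from the same file, the anchored sed command can be wrapped in a small function. This is only a sketch: the name get_conf_value is hypothetical, and it assumes keys consist of letters, digits, underscores and dots, so only dots are escaped before the key is used in the regex.

```shell
#!/bin/sh
# Hypothetical helper: print the value of an exact key from a key=value file.
get_conf_value() {
    key=$1 file=$2
    # Escape dots so that "." in the key matches a literal dot.
    escaped=$(printf '%s' "$key" | sed 's/\./\\./g')
    # Anchor at line start (allowing leading blanks) so commented lines are skipped.
    sed -n "s/^[[:blank:]]*${escaped}=\(.*\)/\1/p" "$file"
}

# Shortened copy of the configuration, written to a temporary file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Database settings
Aisse.LocalHost=localhost
  Aisse.LocalDataBase=mydb
Aisse.LocalPort=5432
#Aisse.Trace_LocalDataBase=blabla4.tra
EOF

mydbname=$(get_conf_value Aisse.LocalDataBase "$conf")
myport=$(get_conf_value Aisse.LocalPort "$conf")
printf '%s %s\n' "$mydbname" "$myport"
rm -f "$conf"
```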

Remove duplicates from the same line in a file

How do I remove below duplicates from the same line in a file? I need the duplicates removed including semicolon.
For example from the below output of a file I need only "dg01.server.wmq.host=jms1001-01-ri5.ri5.dc2.responsys.com" similarly other lines of the file.
dg01.server.wmq.host=jms1001-01-ri5.ri5.dc2.responsys.com;jms1001-02-ri5.ri5.dc2.responsys.com dg02.server.wmq.host=jms1002-01-ri5.ri5.dc2.responsys.com;jms1002-02-ri5.ri5.dc2.responsys.com dg03.server.wmq.host=jms1003-01-ri5.ri5.dc2.responsys.com;jms1003-02-ri5.ri5.dc2.responsys.com dg04.server.wmq.host=jms1004-01-ri5.ri5.dc2.responsys.com;jms1004-02-ri5.ri5.dc2.responsys.com dg05.server.wmq.host=jms1005-01-ri5.ri5.dc2.responsys.com;jms1005-02-ri5.ri5.dc2.responsys.com dg06.server.wmq.host=jms1006-01-ri5.ri5.dc2.responsys.com;jms1006-02-ri5.ri5.dc2.responsys.com dg07.server.wmq.host=jms1007-01-ri5.ri5.dc2.responsys.com;jms1007-02-ri5.ri5.dc2.responsys.com dg08.server.wmq.host=jms1008-01-ri5.ri5.dc2.responsys.com;jms1008-02-ri5.ri5.dc2.responsys.com dg09.server.wmq.host=jms1009-01-ri5.ri5.dc2.responsys.com;jms1009-02-ri5.ri5.dc2.responsys.com dg10.server.wmq.host=jms1010-01-ri5.ri5.dc2.responsys.com;jms1010-02-ri5.ri5.dc2.responsys.com dg11.server.wmq.host=jms1011-01-ri5.ri5.dc2.responsys.com;jms1011-02-ri5.ri5.dc2.responsys.com dg12.server.wmq.host=jms1012-01-ri5.ri5.dc2.responsys.com;jms1012-02-ri5.ri5.dc2.responsys.com dg13.server.wmq.host=jms1013-01-ri5.ri5.dc2.responsys.com;jms1013-02-ri5.ri5.dc2.responsys.com dg14.server.wmq.host=jms1014-01-ri5.ri5.dc2.responsys.com;jms1014-02-ri5.ri5.dc2.responsys.com dg15.server.wmq.host=jms1015-01-ri5.ri5.dc2.responsys.com;jms1015-02-ri5.ri5.dc2.responsys.com dg16.server.wmq.host=jms1001-01-ri5.ri5.dc2.responsys.com;jms1001-02-ri5.ri5.dc2.responsys.com dg17.server.wmq.host=jms1002-01-ri5.ri5.dc2.responsys.com;jms1002-02-ri5.ri5.dc2.responsys.com dg18.server.wmq.host=jms1003-01-ri5.ri5.dc2.responsys.com;jms1003-02-ri5.ri5.dc2.responsys.com dg19.server.wmq.host=jms1004-01-ri5.ri5.dc2.responsys.com;jms1004-02-ri5.ri5.dc2.responsys.com dg20.server.wmq.host=jms1005-01-ri5.ri5.dc2.responsys.com;jms1005-02-ri5.ri5.dc2.responsys.com dg21.server.wmq.host=jms1006-01-ri5.ri5.dc2.responsys.com;jms1006-02-ri5.ri5.dc2.responsys.com 
dg22.server.wmq.host=jms1007-01-ri5.ri5.dc2.responsys.com;jms1007-02-ri5.ri5.dc2.responsys.com dg23.server.wmq.host=jms1008-01-ri5.ri5.dc2.responsys.com;jms1008-02-ri5.ri5.dc2.responsys.com dg24.server.wmq.host=jms1009-01-ri5.ri5.dc2.responsys.com;jms1009-02-ri5.ri5.dc2.responsys.com dg25.server.wmq.host=jms1010-01-ri5.ri5.dc2.responsys.com;jms1010-02-ri5.ri5.dc2.responsys.com dg26.server.wmq.host=jms1011-01-ri5.ri5.dc2.responsys.com;jms1011-02-ri5.ri5.dc2.responsys.com dg27.server.wmq.host=jms1012-01-ri5.ri5.dc2.responsys.com;jms1012-02-ri5.ri5.dc2.responsys.com dg28.server.wmq.host=jms1013-01-ri5.ri5.dc2.responsys.com;jms1013-02-ri5.ri5.dc2.responsys.com dg29.server.wmq.host=jms1014-01-ri5.ri5.dc2.responsys.com;jms1014-02-ri5.ri5.dc2.responsys.com dg30.server.wmq.host=jms1015-01-ri5.ri5.dc2.responsys.com;jms1015-02-ri5.ri5.dc2.responsys.com dg31.server.wmq.host=jms1001-01-ri5.ri5.dc2.responsys.com;jms1001-02-ri5.ri5.dc2.responsys.com dg32.server.wmq.host=jms1002-01-ri5.ri5.dc2.responsys.com;jms1002-02-ri5.ri5.dc2.responsys.com dg33.server.wmq.host=jms1003-01-ri5.ri5.dc2.responsys.com;jms1003-02-ri5.ri5.dc2.responsys.com dg34.server.wmq.host=jms1004-01-ri5.ri5.dc2.responsys.com;jms1004-02-ri5.ri5.dc2.responsys.com dg35.server.wmq.host=jms1009-01-ri5.ri5.dc2.responsys.com;jms1009-02-ri5.ri5.dc2.responsys.com dg36.server.wmq.host=jms1010-01-ri5.ri5.dc2.responsys.com;jms1010-02-ri5.ri5.dc2.responsys.com dg37.server.wmq.host=jms1011-01-ri5.ri5.dc2.responsys.com;jms1011-02-ri5.ri5.dc2.responsys.com dg38.server.wmq.host=jms1012-01-ri5.ri5.dc2.responsys.com;jms1012-02-ri5.ri5.dc2.responsys.com dg39.server.wmq.host=jms1007-01-ri5.ri5.dc2.responsys.com;jms1007-02-ri5.ri5.dc2.responsys.com dg40.server.wmq.host=jms1008-01-ri5.ri5.dc2.responsys.com;jms1008-02-ri5.ri5.dc2.responsys.com
Assuming dg01.server.wmq.host=jms1001-01-ri5.ri5.dc2.responsys.com;jms1001-02-ri5.ri5.dc2.responsys.com is a line in your input file and you're only interested in the dg01.server.wmq.host=jms1001-01-ri5.ri5.dc2.responsys.com part (up to, but not including, the semicolon), you can obtain the desired output by running:
awk -F ';' '{print $1}' inputfile
Another way to obtain the same output, as pointed out by @Shawn, would be:
cut -d ';' -f1 inputfile
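Both commands assume one entry per line. If several entries really do share a line separated by spaces, as the sample in the question appears to show, one possible sketch is to split on spaces first and then cut at the semicolon. The two-entry sample line below is shortened from the question, and the space-separated layout is an assumption.

```shell
#!/bin/sh
# Assumed layout: several key=host1;host2 entries on one line, separated by spaces.
line='dg01.server.wmq.host=jms1001-01-ri5.ri5.dc2.responsys.com;jms1001-02-ri5.ri5.dc2.responsys.com dg02.server.wmq.host=jms1002-01-ri5.ri5.dc2.responsys.com;jms1002-02-ri5.ri5.dc2.responsys.com'

# Put each entry on its own line, then keep only the part before the semicolon.
result=$(printf '%s\n' "$line" | tr -s ' ' '\n' | cut -d ';' -f 1)
printf '%s\n' "$result"
```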

Command line: retrieving specific column from CSV file

I have a CSV file called articles.csv with headers as follows:
article_id, article_title, article_shares, article_date.
The first row of data in the file is found with $ cat articles.csv | sed "1 d", which returns: "895", "Trump, Clinton, America. Who will win, who will lose?", "100", "01/05/2016".
I want to return the fourth column of data (the date of the article) so I use the following code:
$ cat articles.csv | sed "1 d" | cut -d , -f 4
However I don't get the date, I get America. Who will win. How do I get the output of the fourth column, regardless of the fact that some columns have commas in them?
A quick and dirty solution:
... | awk -F'",' '{print $4}'
A slow but clean solution:
... | ruby -ne $'require "csv"; print CSV.parse($_)[0][3]'
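A variant of the quick approach that also strips the leftover quote is sketched below. It assumes every field is wrapped in double quotes and that no field contains the sequence quote, comma, quote; it is not a general CSV parser. The separator regex tolerates the optional space after each comma, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sample taken from the question: an unquoted header and one fully quoted record.
csv='article_id, article_title, article_shares, article_date
"895", "Trump, Clinton, America. Who will win, who will lose?", "100", "01/05/2016"'

# Split on quote-comma-(spaces)-quote, skip the header, drop the closing quote.
dates=$(printf '%s\n' "$csv" |
    awk -F '", *"' 'NR > 1 { sub(/"$/, "", $4); print $4 }')
printf '%s\n' "$dates"
```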
Note: CSV format should not have spaces between fields, so change your record to:
"895","Trump, Clinton, America. Who will win, who will lose?","100","01/05/2016"

Fetch values from particular file and display swapped values in terminal

I have a file named input.txt which contains students data in StudentName|Class|SchoolName format.
Shriii|Fourth|ADCET
Chaitraliii|Fourth|ADCET
Shubhangi|Fourth|ADCET
Prathamesh|Third|RIT
I want to display these values in reverse order for a particular college. Example:
ADCET|Fourth|Shriii
ADCET|Fourth|Chaitraliii
I used grep 'ADCET$' input.txt which gives output
Shriii|Fourth|ADCET
Chaitraliii|Fourth|ADCET
But I want the fields in reverse order. I also tried grep 'ADCET$' input.txt | sort -r but did not get the required output.
You may use either of the following sed or awk solutions:
grep 'ADCET$' input.txt | sed 's/^\([^|]*\)\(|.*|\)\([^|]*\)$/\3\2\1/'
grep 'ADCET$' input.txt | awk 'BEGIN {OFS=FS="|"} {temp=$NF;$NF=$1;$1=temp;}1'
See the online demo
awk details
BEGIN {OFS=FS="|"} - the field separator is set to | and the same char will be used for output
{temp=$NF;$NF=$1;$1=temp;}1:
temp=$NF; - the last field value is assigned to a temp variable
$NF=$1; - the last field is set to Field 1 value
$1=temp; - the value of Field 1 is set to temp
1 - makes the awk write the output.
sed details
^ - start of the line
\([^|]*\) - Capturing group 1: any 0+ chars other than |
\(|.*|\) - Capturing group 2: |, then any 0+ chars and then a |
\([^|]*\) - Capturing group 3: any 0+ chars other than |
$ - end of line.
The \3\2\1 in the replacement are placeholders for the values captured into Groups 3, 2 and 1, written back in that order.
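Since the records always have three fields here, the filter and the reordering can also be done in a single awk call without grep. This is a sketch under that three-field assumption, with the sample data from the question held in a shell variable (the variable names are made up).

```shell
#!/bin/sh
students='Shriii|Fourth|ADCET
Chaitraliii|Fourth|ADCET
Shubhangi|Fourth|ADCET
Prathamesh|Third|RIT'

# Select rows whose third field is exactly ADCET and print the fields reversed.
swapped=$(printf '%s\n' "$students" |
    awk -F '|' -v OFS='|' '$3 == "ADCET" { print $3, $2, $1 }')
printf '%s\n' "$swapped"
```

Comparing the field with == avoids matching a school whose name merely ends in ADCET.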

Allow only specific characters, else transfer null, in Unix

The characters allowed in the 2nd column are 0 to 9, A to Z, and the symbols "+" and "-". If the 2nd column contains only allowed characters, the complete record should be transferred; otherwise the 2nd column should be transferred as null.
Input
- 1|89+
- 2|-AB
- 3|XY*
- 4|PR%
Output
- 1|89+
- 2|-AB
- 3|<null>
- 4|<null>
grep -E '^[a-zA-Z0-9\+\-\|]+$' file > file1
But the above command discards the complete record when there is no match. I need all records: if the 2nd column matches, it should be transferred as is; otherwise null should be transferred in its place.
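Use sed to blank the second column when it contains a disallowed character: the pattern matches the pipe, then zero or more characters from the class of digits, letters, plus or minus, then one character not in that class, then everything up to the end of the line, and replaces all of it with a pipe only. The pipe is left unescaped because \| means alternation in GNU sed's basic regular expressions.
sed 's/|[0-9a-zA-Z+-]*[^0-9a-zA-Z+-].*$/|/' file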
Using awk and character classes where supported:
$ awk 'BEGIN{FS=OFS="|"}$2~/[^[:alnum:]+-]/{$2=""}1' file
1|89+
2|-AB
3|
4|
Where not supported (such as mawk) use:
$ awk 'BEGIN{FS=OFS="|"}$2~/[^A-Za-z0-9+-]/{$2=""}1' file
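The same rule can be written as a positive test, blanking the second field unless it consists entirely of allowed characters. A sketch using explicit ranges so it does not depend on POSIX character class support; the sample records are the ones from the question without the list markers, and the variable names are made up.

```shell
#!/bin/sh
records='1|89+
2|-AB
3|XY*
4|PR%'

# Blank field 2 unless every character in it is a letter, digit, plus or minus.
cleaned=$(printf '%s\n' "$records" |
    awk 'BEGIN { FS = OFS = "|" } $2 !~ /^[A-Za-z0-9+-]+$/ { $2 = "" } 1')
printf '%s\n' "$cleaned"
```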
