Shell Script to prepare unformatted data - bash

I have a text file, TEST.txt, which contains the following unformatted data:
0411 14:30:00 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Executing cron report Freigabe 14:30 for cron job Freigabe 14:30 for TRE_ClientServiceGroup#TEST.fs, Businesspartner#TEST.fs
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigabe 14:30 NOT sent to TRE_ClientServiceGroup#TEST.fs, Businesspartner#TEST.fs since all reports were empty and empty reports should not be send
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [itraderdbint] has been added to datasource map
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [otc_sv2599] has been added to datasource map
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [qlp_devp] has been added to datasource map
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Tue Apr 13 08:00:00 CEST 2021 for Logs Compliance MAR Crossingprüfung/Frontrunning DI-FR
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Tue Apr 13 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik DI-FR
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Mon Apr 12 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik MO
Now I want to create a shell script that converts this unformatted data into the format below and writes it to a file, for example PreparedFile.txt. I want to separate the fields with a pipe character. The first part is the date, which I want as one complete string. The second part always starts with INF[ and ends with ]; alternatively, we can take the whole part without spaces that starts with INF[. That would be my second field. The third field is the rest of the line. I also want to add a header so it is clear what each field means:
DATE_FORMAT|ROW_EXECUTE|ROW_VALUE
0411 14:30:00|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Executing cron report Freigabe 14:30 for cron job Freigabe 14:30 for TRE_ClientServiceGroup#TEST.fs, Businesspartner#TEST.fs
0411 14:30:02|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Freigaben had no results
0411 14:30:02|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Freigabe 14:30 NOT sent to TRE_ClientServiceGroup#TEST.fs, Businesspartner#TEST.fs since all reports were empty and empty reports should not be send
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [itraderdbint] has been added to datasource map
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [otc_sv2599] has been added to datasource map
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [qlp_devp] has been added to datasource map
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Tue Apr 13 08:00:00 CEST 2021 for Logs Compliance MAR Crossingprüfung/Frontrunning DI-FR
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Tue Apr 13 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik DI-FR
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Mon Apr 12 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik MO
I am very new to shell scripting and don't know whether this is possible with a shell script.

@Symonds
This responds to your comment asking for a header line and further explanation.
To add the header, use echo to create PreparedFile.txt first, then use the >> operator to append to the file. You can copy the complete code to a file named Script.sh and run it with bash Script.sh:
#!/bin/bash
echo "DATE_FORMAT|ROW_EXECUTE|ROW_VALUE" > PreparedFile.txt
cat TEST.txt | sed 's/ /|/2' | sed 's/] /]|/1' >> PreparedFile.txt
As for the explanation you asked for: you can chain commands using the pipe symbol |. The sed command substitutes matches of a regular expression you specify with a replacement. In the first sed command after cat, I use s/ /|/2, which replaces the second occurrence of a space on each line with |. The second sed command, s/] /]|/1, replaces the first occurrence of "] " with "]|". You can read more about the s command in the sed documentation.
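To see the two substitutions at work on a single line, here is a minimal sketch. The function name to_pipe_delimited is made up for illustration; it runs both expressions in one sed call with -e, which should behave the same as the two chained sed commands above.

```shell
#!/bin/bash
# to_pipe_delimited: hypothetical helper name, not part of the original answer.
# Reads log lines on stdin and writes pipe-delimited lines on stdout.
to_pipe_delimited() {
    # 1st expression: replace the 2nd space on the line (end of the timestamp).
    # 2nd expression: replace the first "] " (end of the INF[...] field).
    sed -e 's/ /|/2' -e 's/] /]|/'
}

# One sample line taken from TEST.txt in the question.
sample='0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [qlp_devp] has been added to datasource map'
printf '%s\n' "$sample" | to_pipe_delimited
```

Note that the later "[qlp_devp] " in the message is left alone, because only the first "] " on each line is replaced.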

You can use the shell script below and see if it helps. It uses the sed command and a pipeline to replace first the second occurrence of a space and then the space after the first closing square bracket.
cat TEST.txt | sed 's/ /|/2' | sed 's/] /]|/1' > PreparedFile.txt

Related

formatting email output in bash

I'm writing a script that sends a user an email when their AWS access keys are too old.
Right now the output looks like this:
Hello tdunphy, \n Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days. \n Regards, \n Cloud Ops
I want the message to look like this:
Hello tdunphy,
Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days.
Regards,
Cloud Ops
This is the line that sets up the body of the message:
MAIL_TXT1="Hello $user_name, \\n Your access key: $user_access_key1 was created on $date1 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days. \\n Regards, \\n Cloud Ops"
Why are the newlines not working in this example? How can I get this to work?
You can use:
template=$'Hello tdunphy, \n Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days. \n Regards, \n Cloud Ops'
Then either echo works:
$ echo "$template"
Hello tdunphy,
Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days.
Regards,
Cloud Ops
Or printf:
$ printf "%s" "$template"
Hello tdunphy,
Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days.
Regards,
Cloud Ops
With printf you can assemble the template fields more easily as well:
$ t2=$'Fields:\n\tf1=%s\n\tf2=%s'
$ printf "$t2" 101 102
Fields:
	f1=101
	f2=102
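Applied to the original question, the same printf idea can build the mail body from the script's variables. This is only a sketch: the variable names come from the question, the values are the sample values shown there, and build_mail_body is a made-up helper name. In the question's double-quoted string, \\n ends up as a literal backslash followed by n, which is why no line breaks appeared; here printf interprets \n in its format string instead.

```shell
#!/bin/bash
# build_mail_body: hypothetical helper; arguments are user name, access key, creation date.
build_mail_body() {
    # printf expands \n in the format string into real newlines.
    printf 'Hello %s,\nYour access key: %s was created on %s and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days.\nRegards,\nCloud Ops\n' "$1" "$2" "$3"
}

# Sample values copied from the question.
user_name='tdunphy'
user_access_key1='AKIAICDOHVTMEAB6RM5Q'
date1='Wed Feb 7 22:55:51 EST 2018'

MAIL_TXT1=$(build_mail_body "$user_name" "$user_access_key1" "$date1")
printf '%s\n' "$MAIL_TXT1"
```

Quote "$MAIL_TXT1" wherever it is used later; unquoted, the shell would collapse the newlines back into spaces.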

Shell Script to read from log file and update to Oracle DB table

I need to read a Splunk log file for certain parameters and, once those parameters are found, use that data to update a table in an Oracle 11g database.
For example:
The Splunk log file is named app.log.
The input parameters in the log file look like this:
[timestamp] amount=100,name=xyz,time=19 May 2018 13:45 PM
The expected behaviour of the shell script: the amount should be read into a variable, which would then hold 100. That value should then be stored in an Oracle DB table.
I may have to use an awk script for this, but I have no idea how to start, as I am new to shell scripting.
tail -f|egrep -wi 'amount' /apps/JBoss/log/app.log
Commands like this don't seem to work.
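You can easily capture such values using a Perl regex. The line does not start with amount= (the timestamp comes first), so match it anywhere on the line and print only the captured digits:
amt=$(perl -ne 'print "$1\n" if /\bamount=(\d+)/' /apps/JBoss/log/app.log)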
If you want to use pure shell commands,
amt=$(grep amount app.log| cut -f1 -d',' | cut -f2 -d '=')
You may use this variable in the insert query from sqlplus
sqlplus -s USER/PWD<<SQL
INSERT INTO yourtable(column_name) VALUES(${amt});
commit;
exit
SQL
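Because the value is pasted straight into the SQL text, it may be worth checking that it really is a plain number first. The following is a sketch only: build_insert_sql is a made-up helper name, and yourtable and column_name are the placeholders from the answer above. It only prints the statements and does not call sqlplus itself.

```shell
#!/bin/bash
# build_insert_sql: hypothetical helper. Prints the SQL for one amount,
# or fails if the value is empty or contains anything other than digits.
build_insert_sql() {
    amt=$1
    case $amt in
        ''|*[!0-9]*)
            # Also rejects several values separated by newlines.
            echo "refusing non-numeric amount: '$amt'" >&2
            return 1
            ;;
    esac
    printf 'INSERT INTO yourtable(column_name) VALUES(%s);\ncommit;\nexit\n' "$amt"
}

build_insert_sql 100
```

Assuming sqlplus reads its commands from standard input as in the here-document above, the output could then be piped into it, for example build_insert_sql "$amt" | sqlplus -s USER/PWD.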
The DB Connect app may do the job for you. See http://docs.splunk.com/Documentation/DBX/3.1.3/DeployDBX/Createandmanagedatabaseoutputs.
For an input file (app.log) like:
[timestamp] amount=100,name=xyz,time=19 May 2018 13:45 PM
[timestamp] amount=150,name=xyz,time=19 May 2018 13:45 PM
[timestamp] amount=200,name=xyz,time=19 May 2018 13:45 PM
you could use grep's -P flag (PCRE):
arr=($(grep -oP "(?<=amount=)\d+" app.log))
This stores the amount values in an array arr. Output:
echo "${arr[@]}"
100 150 200
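grep -P is a GNU extension and may not be available everywhere. As a rough alternative, the sketch below uses plain sed and a while read loop instead of an array. The function name extract_amounts is made up for illustration, and the sample lines are the ones shown above.

```shell
#!/bin/bash
# extract_amounts: hypothetical helper. Reads log lines on stdin and
# prints the digits that follow "amount=", one value per line.
extract_amounts() {
    sed -n 's/.*amount=\([0-9][0-9]*\).*/\1/p'
}

# Sample input copied from the answer above.
sample_log='[timestamp] amount=100,name=xyz,time=19 May 2018 13:45 PM
[timestamp] amount=150,name=xyz,time=19 May 2018 13:45 PM
[timestamp] amount=200,name=xyz,time=19 May 2018 13:45 PM'

# Process each extracted value in turn.
printf '%s\n' "$sample_log" | extract_amounts | while IFS= read -r amt; do
    printf 'amount read: %s\n' "$amt"
done
```

Lines without an amount= field are skipped, because sed -n only prints lines where the substitution succeeded.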

Parse line for specific date format?

Writing a script using bash. I am trying to look through lines in a file for a specific date format:
date +"%a %b %d %T %Z %Y"
For example, if the line were
/foo/bar/foobar this 12 is 411 arbitrary stuff in the line Wed Jun 10 10:10:10 PST 2017
I would want to obtain Wed Jun 10 10:10:10 PST 2017.
Any way to search for specific date formats?
I'm not sure whether you'll agree with this approach, but if this is for some quick, non-recurring work, I wouldn't look for a perfect solution that handles every scenario.
To start with, you can use the following very generic pattern to match the part you want.
cat file | sed -n 's/.*\(... ... .. ..:..:.. ... ....\).*/\1/p'
You can then tighten the pattern to restrict the matches as needed.
For example:
cat file | sed -n 's/.*\([A-Za-z]\{3\} [A-Za-z]\{3\} [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9] [A-Z]\{3\} [0-9]\{4\}\).*/\1/p'
Note that this is still not perfect and can match invalid content (for example, a day of 39). If that is not good enough, you can keep refining the pattern.
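One possible refinement, sketched here with grep -oE rather than sed, spells out the day and month names. It assumes the dates were produced in an English locale and that the -o option is available (it is in GNU and BSD grep, but it is not required by POSIX). The function name extract_date is made up for illustration.

```shell
#!/bin/bash
# extract_date: hypothetical helper. Reads lines on stdin and prints only the
# part that looks like the output of: date +"%a %b %d %T %Z %Y"
extract_date() {
    grep -oE '(Mon|Tue|Wed|Thu|Fri|Sat|Sun) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9] [A-Z]{2,5} [0-9]{4}'
}

# Sample line from the question.
line='/foo/bar/foobar this 12 is 411 arbitrary stuff in the line Wed Jun 10 10:10:10 PST 2017'
printf '%s\n' "$line" | extract_date
```

This still accepts impossible values such as day 39 or hour 29, so it narrows the match without validating the date.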

How to find the date using internet (ie ntp) from bash?

How can I get the date and time from the internet using bash, without installing anything extra?
I am basically looking for an equivalent of bash $ date, but using an NTP (or any other way) to get the correct date and time from the internet. All the methods I find (such as ntpd) are meant to correct the system time, which is not my purpose.
date has a lot of options for formatting, but I'm assuming that you just want the date and time:
ntpdate -q time.google.com | sed -n 's/ ntpdate.*//p'
(or any other time server)
If you have ntpd installed and configured, you can use the NTP query command ntpq -crv, which returns:
associd=0 status=04ff leap_none, sync_uhf_radio, 15 events, stale_leapsecond_values,
version="ntpd 4.2.6p5@1.2349-o Mon Feb 6 07:22:46 UTC 2017 (1)",
processor="x86_64", system="Linux/4.10.13-1.el6.elrepo.x86_64", leap=00,
stratum=1, precision=-23, rootdelay=0.000, rootdisp=1.000, refid=PPS,
reftime=dd2c9f10.f25911ee Wed, Aug 2 2017 19:57:20.946,
clock=dd2c9f11.f4251b0a Wed, Aug 2 2017 19:57:21.953, peer=6516, tc=4,
mintc=3, offset=-0.005, frequency=-17.045, sys_jitter=0.110,
clk_jitter=0.007, clk_wander=0.003, tai=37, leapsec=201701010000,
expire=201706010000
You want the line starting with clock, which gives the time and date. If you only want the date stamp rather than everything else, it is best to parse it out with awk or a similar tool.
You do not need to be root to run the command. It won't set anything; it just queries your local server (presuming you're running ntpd) and presents the details.
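As a sketch of that parsing step, the function below pulls the date and time out of the clock= line with sed. The name ntp_clock_time is made up, and the pattern assumes the line looks like the sample output above (a hex timestamp followed by the readable date); other ntpd versions may lay it out differently.

```shell
#!/bin/bash
# ntp_clock_time: hypothetical helper. Reads "ntpq -crv" style output on stdin
# and prints the human-readable part of the line that starts with "clock=".
ntp_clock_time() {
    # Skip the hex timestamp, keep "Weekday, date time", drop the rest of the line.
    sed -n 's/^clock=[^ ]* \([^,]*, [^,]*\),.*/\1/p'
}

# Two lines copied from the sample output above, so ntpq is not needed to try this.
sample='reftime=dd2c9f10.f25911ee Wed, Aug 2 2017 19:57:20.946,
clock=dd2c9f11.f4251b0a Wed, Aug 2 2017 19:57:21.953, peer=6516, tc=4,'
printf '%s\n' "$sample" | ntp_clock_time
```

On a machine that runs ntpd, the live equivalent would be ntpq -crv | ntp_clock_time.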

Parsing entry name from a log

Writing bash parsing scripts is my own personal nightmare, so here I am.
The server log format is below:
197 INFO Thu Mar 27 10:10:32 2014
seq_1_1..JobControl (DSWaitForJob): Waiting for job job_1_1_1 to finish
198 INFO Thu Mar 27 10:10:36 2014
seq_1_1..JobControl (DSWaitForJob): Job job_1_1_1 has finished, status = 3 (Aborted)
199 WARNING Thu Mar 27 10:10:36 2014
seq_1_1..JobControl (#job_1_1_1): Job job_1_1_1 did not finish OK, status = 'Aborted'
From here I need to parse out the string which follows the format:
Job job_name has finished, status = 3 (Aborted)
So from the output above I should get: job_1_1_1
What would the script for that look like if I get this server log as a certain command output?
Thanks xx
Using grep -P:
grep -oP '\w+(?= has finished, status = 3)' file
job_1_1_1
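Where grep -P is not available, a sed substitution can do roughly the same thing. This sketch reads the log from standard input, so it can sit at the end of a pipeline after whatever command produces the log. The function name finished_job_names is made up for illustration.

```shell
#!/bin/bash
# finished_job_names: hypothetical helper. Reads the server log on stdin and
# prints the job name from lines of the form
#   ... Job <name> has finished, status = 3 ...
finished_job_names() {
    sed -n 's/.*Job \([^ ]*\) has finished, status = 3.*/\1/p'
}

# Sample log lines copied from the question.
sample='197 INFO Thu Mar 27 10:10:32 2014
seq_1_1..JobControl (DSWaitForJob): Waiting for job job_1_1_1 to finish
198 INFO Thu Mar 27 10:10:36 2014
seq_1_1..JobControl (DSWaitForJob): Job job_1_1_1 has finished, status = 3 (Aborted)'
printf '%s\n' "$sample" | finished_job_names
```

Like the grep version, it matches only status = 3 (and would also match any status that starts with the digit 3).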

Resources