Using sed to extract a substring in curly brackets - bash

I've currently got a string as below:
integration#{Wed Nov 19 14:17:32 2014} branch: thebranch
This is contained in a file, and I parse the string. However, I want only the value between the curly brackets: {Wed Nov 19 14:17:32 2014}
I have zero experience with sed, and to be honest I find it a little cryptic.
So far I've managed to use the following command, however the output is still the entire string.
What am I doing wrong?
sed -e 's/[^/{]*"\([^/}]*\).*/\1/'

To get the value between { and }:
$ sed 's/^[^{]*{\([^{}]*\)}.*/\1/' file
Wed Nov 19 14:17:32 2014

This is very simple to do with awk, without a complicated regex.
awk -F"{|}" '{print $2}' file
Wed Nov 19 14:17:32 2014
It sets the field separator to { or }, then your data will be in the second field.
FS could also be set like this:
awk -F"[{}]" '{print $2}' file
To see all fields:
awk -F"{|}" '{print "field#1="$1"\nfield#2="$2"\nfield#3="$3}' file
field#1=integration#
field#2=Wed Nov 19 14:17:32 2014
field#3= branch: thebranch

This might work
sed -e 's/[^{]*{\([^}]*\)}.*/\1/'
Test
$ echo "integration#{Wed Nov 19 14:17:32 2014} branch: thebranch" | sed -e 's/[^{]*{\([^}]*\)}.*/\1/'
Wed Nov 19 14:17:32 2014
Regex
[^{]* matches anything other than {, that is, integration#
{ matches the opening {
\([^}]*\) is capture group 1: anything other than }, that is, Wed Nov 19 14:17:32 2014
} matches the closing }
.* matches the rest of the line

More simply, the command below also gets the data. Because .* is greedy, this relies on the line containing only one pair of braces:
echo "integration#{Wed Nov 19 14:17:32 2014} branch: thebranch" | sed 's/.*{\(.*\)}.*/\1/'
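If the line is already held in a shell variable, the same extraction can probably be done without sed or awk, using bash parameter expansion. The following is only a sketch; the function name extract_braced is made up for illustration and is not part of any answer above.

```shell
#!/bin/bash
# Hypothetical helper: print the text between the first "{" and the
# first "}" that follows it. Returns 1 if no such pair exists.
extract_braced() {
    local s=$1 open='{' close='}'
    # Require a "{" followed somewhere later by a "}".
    [[ $s == *"$open"*"$close"* ]] || return 1
    s=${s#*"$open"}      # drop everything up to and including the first {
    s=${s%%"$close"*}    # drop the first remaining } and everything after it
    printf '%s\n' "$s"
}

extract_braced 'integration#{Wed Nov 19 14:17:32 2014} branch: thebranch'
```

Unlike the greedy sed variant, this takes the first brace pair when a line contains several.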

Related

How can I change command and option about 'date' command in bash?

I want to convert this bash command to shell script.
BASH
Input:
date --date="Wed Aug 25 22:37:44 +0900 2021" +"%s"
Output:
1629898664
SHELL
tmp.sh:
function time(a, b, c, d, e) { return date --date="a b c d +0900 e" +"%s" }
{print time($1, $2, $3, $4, $5}
timeline:
Wed Aug 25 22:37:44 2021
Command:
awk -f tmp.sh timeline
Output:
awk: tmp.sh:1: function cvtTime(w) { return date --date="Thu May 14 23:40:52 +0900 2020" +"%s" }
awk: tmp.sh:1: ^ syntax error
What if the timeline file has multiple lines? Like:
Wed Aug 25 22:37:44 2021 JACK
Wed Aug 26 22:37:44 2021 EMILY
Wed Aug 27 22:37:44 2021 SAM
I tried:
#!/bin/bash
while read -r line; do
date --date="${1} ${2} ${3} ${4} +0900 ${5}" +"%s"
done
Want:
1629898664 JACK
1629985064 EMILY
1630071464 SAM
But it doesn't work :(
It seems that you want a shell script that is invoked with five command line parameters:
1. A weekday (in a three-letter format)
2. A month (in a three-letter format)
3. Day-of-month
4. A time expression (HH:MM:SS)
5. A year (four digits)
(Note that 1. is redundant, it is implied by 2., 3., and 5.)
Hence a somewhat minimal shell script would look something like this:
#!/bin/bash
date --date="${1} ${2} ${3} ${4} +0900 ${5}" +"%s"
Of course, this can be greatly improved, e.g., by adding sanity checks for the passed parameters.
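As a rough illustration of such sanity checks, the sketch below wraps the same date call in a function that verifies the argument count and the shape of the time and year first. The name to_epoch and the specific checks are assumptions made for this example; GNU date is still required for --date.

```shell
#!/bin/bash
# Hypothetical wrapper: to_epoch WEEKDAY MONTH DAY HH:MM:SS YEAR
# Prints seconds since the epoch, interpreting the time as +0900.
to_epoch() {
    local time_re='^[0-9]{2}:[0-9]{2}:[0-9]{2}$'
    local year_re='^[0-9]{4}$'
    if (( $# != 5 )); then
        echo "usage: to_epoch WEEKDAY MONTH DAY HH:MM:SS YEAR" >&2
        return 1
    fi
    if [[ ! $4 =~ $time_re || ! $5 =~ $year_re ]]; then
        echo "to_epoch: unexpected time or year: $4 $5" >&2
        return 1
    fi
    # GNU date still validates the weekday, month, and day itself.
    date --date="$1 $2 $3 $4 +0900 $5" +%s
}
```

Called as to_epoch Wed Aug 25 22:37:44 2021, this should print the same value as the plain date command in the question.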
In case you want to store the date information in a file so that you can pass a single filename parameter to the script instead (allowing for multiple such lines), the following variation will do:
#!/bin/bash
while read -r -a i; do
echo "$(date --date="${i[0]} ${i[1]} ${i[2]} ${i[3]} +0900 ${i[4]}" +"%s") ${i[5]}"
done < "${1}"
Note, however, that this version expects an additional name parameter after the date information in each line.
In any event, no need for awk here.

Parsing java logs for multiline entries using bash

I have loads of java logs on a Linux machine and I'm trying to find a grep expression or something else (perl, awk) that gives me the entire log entry on a match somewhere in its body. Logstash looks like it could do the job, but something with onboard tools would be way better.
An example should help best. Here is a sample log with 5 different entries:
25 Aug 2016 14:00:46,435 DEBUG [User][IP][rsc] An error occurred
java.Exception: Foo1
at xyz
25 Aug 2016 14:00:46,436 Foo2 [User][IP][rsc] Some error occured
25 Aug 2016 14:00:46,436 DEBUG [User][IP][rsc] Somethin occured Foo3
25 Aug 2016 14:18:18,224 XYZ [User][IP][rsc] Some problems
More: bla1
More: bla2
USER.bla.bla: Blala::123 - 456
More: Could not open something
at 567
at 890
Caused by: Foo4: Could not open another thing
at 123
at 456
... 127 more
Caused by: gaga
at a1a2a3
at b3b3b3
... 146 more
25 Aug 2016 14:18:20,118 SSO [User][IP][rsc] Process: error -
Could not Foo5
<here is a blank line>
When I search for "Foo1", I need:
25 Aug 2016 14:00:46,435 DEBUG [User][IP][rsc] An error occurred
java.Exception: Foo1
at xyz
When I search for "Foo2":
25 Aug 2016 14:00:46,436 Foo2 [User][IP][rsc] Some error occured
For "Foo3":
25 Aug 2016 14:00:46,436 DEBUG [User][IP][rsc] Somethin occured Foo3
For "Foo4":
25 Aug 2016 14:18:18,224 XYZ [User][IP][rsc] Some problems
More: bla1
More: bla2
USER.bla.bla: Blala::123 - 456
More: Could not open something
at 567
at 890
Caused by: Foo4: Could not open another thing
at 123
at 456
... 127 more
Caused by: gaga
at a1a2a3
at b3b3b3
... 146 more
And finally for "Foo5":
25 Aug 2016 14:18:20,118 SSO [User][IP][rsc] Process: error -
Could not Foo5
When I search for "Foo", everything should be returned.
Is something like this possible? Maybe even as a one liner?
I would like to use it in a Webmin Custom Commands module where I supply the expression via variable.
The only basic idea I have at the moment is search for the expression and use the "[" as pattern to identify where a new entry begins.
Thanks in advance for anybody who has an idea!
A sed solution, good for environments where awk is not allowed. The same sed command is shown in one-liner and multiline forms (-r is the GNU sed option for extended regular expressions):
pat=$1
# oneliner form
#sed -nr '/^[0-9]{2} [a-zA-Z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} /!{H; $!b}; x; /'"$pat"'/p; ${g; /^[0-9]{2} [a-zA-Z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} /!q; /'"$pat"'/p }'
# multiline form
sed -nr '
/^[0-9]{2} [a-zA-Z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} /!{H; $!b}
x
/'"$pat"'/p
${
g
/^[0-9]{2} [a-zA-Z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} /!q
/'"$pat"'/p
}'
It uses the timestamp at the beginning of a line as the record start, accumulates non-timestamp lines (the record body) in the hold space, swaps hold space and pattern space at each record start, and prints the record if the pattern matches.
There is a special case for a record that starts on the last line: it has to be fetched back from the hold space and tested separately for a pattern match.
Shell quoting is needed to splice the bash variable pat into the sed command.
I set awk RS to the timestamp pattern for multiline records (this needs GNU awk, because RT is a gawk extension):
pat=$1
awk -vpat="$pat" '
BEGIN{
RS="[0-9]{2} [a-zA-Z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} "
}
$0 ~ pat {printf("%s%s", prt, $0)}
{prt=RT}
'
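Where neither GNU sed nor GNU awk is available, the same record-accumulation idea can probably be expressed in plain bash. This is a minimal sketch rather than a tuned solution: the function name print_matching_entries is hypothetical, the search expression is treated as an extended regex, and the timestamp pattern is copied from the answers above.

```shell
#!/bin/bash
# Hypothetical helper: read a log on stdin and print every entry
# (timestamp line plus its continuation lines) matching the regex in $1.
print_matching_entries() {
    local pat=$1 entry='' line
    # Timestamp shape such as "25 Aug 2016 14:00:46,435 "
    local ts='^[0-9]{2} [a-zA-Z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} '
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $ts ]]; then
            # A new entry starts: emit the previous one if it matched.
            if [[ -n $entry && $entry =~ $pat ]]; then
                printf '%s' "$entry"
            fi
            entry=''
        fi
        entry+=$line$'\n'
    done
    # Emit the final entry, which no later timestamp line would flush.
    if [[ -n $entry && $entry =~ $pat ]]; then
        printf '%s' "$entry"
    fi
}
```

It would be called as print_matching_entries 'Foo4' < app.log, where app.log stands in for the real log file. Expect it to be noticeably slower than sed or awk on large logs.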

I want to convert 18-Aug-2015 date format to '2015-08-18' using shell script

I want to convert 18-Aug-2015 date format to '2015-08-18' using shell script
Try this formatting (with GNU date, the input string is passed via -d):
$ date -d "18-Aug-2015" +"%Y-%m-%d"
http://www.cyberciti.biz/faq/linux-unix-formatting-dates-for-display/
The -d option is GNU specific.
Here, you don't need to do any date calculation; just rewrite the string, which already contains all the information (assuming $Prev_date holds a value like 18-Aug-2015):
a=$(printf '%s\n' "$Prev_date" | awk -F- '{
printf "%04d-%02d-%02d\n", $3, \
(index("JanFebMarAprMayJunJulAugSepOctNovDec",$2)+2)/3,$1}')
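The month lookup in that awk command works because every month name is three characters long, so a name's position in the concatenated string determines its number. A sketch of the same arithmetic in plain bash follows; the function name to_iso_date is hypothetical. Bash measures the 0-based length of the text before the match, so the formula becomes offset / 3 + 1 rather than awk's (index + 2) / 3.

```shell
#!/bin/bash
# Hypothetical helper: to_iso_date DD-Mon-YYYY  ->  YYYY-MM-DD
to_iso_date() {
    local names=JanFebMarAprMayJunJulAugSepOctNovDec
    local day mon year before
    IFS=- read -r day mon year <<< "$1"
    # Text preceding the first occurrence of the month name, e.g.
    # "JanFebMarAprMayJunJul" (21 characters) for Aug.
    before=${names%%"$mon"*}
    # Reject names that are not found or not aligned on a 3-letter boundary.
    if [[ ${#mon} -ne 3 || $before == "$names" ]] || (( ${#before} % 3 != 0 )); then
        echo "to_iso_date: unknown month: $mon" >&2
        return 1
    fi
    # 10# forces base 10 so a day such as 08 is not read as octal.
    printf '%s-%02d-%02d\n' "$year" "$(( ${#before} / 3 + 1 ))" "$(( 10#$day ))"
}

to_iso_date 18-Aug-2015
```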
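Without awk, assuming your initial date is in $mydate:
IFS=- read -r day mon year <<< "$mydate" # split on "-" without changing IFS globally
months=(Zer Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)
z=1
# stop at 12 so an unknown month name cannot loop forever
while [[ $z -le 12 && ${months[$z]} != "$mon" ]]; do z=$((z+1)); done
printf "%s-%02d-%s\n" "$year" "$z" "$day"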

Bash script assistance with renaming file using existing parts of filename

I'm looking for help with a bash script to do some renaming of files for me. I don't know much about bash scripting, and what I have read is overwhelming. It's a lot to know/understand for the limited applications I will probably have.
In Dropbox, my media files are named something like:
Photo Jul 04, 5 49 44 PM.jpg
Video Jun 22, 11 21 00 AM.mov
I'd like them to be renamed in the following format: 2015-07-04 1749.ext
Some difficulties:
The script has to determine if AM or PM to put in the correct 24-hour format
The year is not specified; it is safe to assume the current year
The date, minute and second have a leading zero, but the hour does not; therefore the position after the hour is not absolute
Any assistance would be appreciated. FWIW, I'm running MacOS.
Mac OSX
This uses awk to reformat the date string:
for f in *.*
do
new=$(echo "$f" | awk -F'[ .]' '
BEGIN {
split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",month)
for (i in month) {
nums[month[i]]=i
}
}
$(NF-1)=="PM" && $4<12 {$4+=12;}
$(NF-1)=="AM" && $4==12 {$4=0;}
{printf "%s 2015-%02i-%02i %02i%02i.%s",$1,nums[$2],$3,$4,$5,$8;}
')
mv "$f" "$new"
done
After the above was run, the files are now named:
$ ls -1 *.*
Photo 2015-07-04 1749.jpg
Video 2015-06-22 1121.mov
The above was tested on GNU awk but I don't believe that I have used any GNU-specific features.
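The AM/PM step is the easiest part to get wrong, because 12 AM maps to hour 0 while 12 PM stays at 12. A minimal sketch of that conversion in plain bash is shown here; the function name to_24h is hypothetical, and it should also run on the older bash shipped with macOS.

```shell
#!/bin/bash
# Hypothetical helper: to_24h HOUR AM|PM  ->  two-digit 24-hour value
to_24h() {
    local hour=$(( 10#$1 )) ampm=$2   # 10# avoids octal parsing of 08 and 09
    if [[ $ampm == AM ]]; then
        # 12 AM is midnight.
        if (( hour == 12 )); then hour=0; fi
    elif [[ $ampm == PM ]]; then
        # 12 PM is noon and stays 12; 1 PM to 11 PM shift by 12.
        if (( hour < 12 )); then hour=$(( hour + 12 )); fi
    else
        return 1
    fi
    printf '%02d\n' "$hour"
}

to_24h 5 PM
```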
GNU/Linux
GNU date has a handy feature for interpreting human-style date strings:
for f in *.*
do
prefix=${f%% *}
ext=${f##*.}
datestr=$(date -d "$(echo "$f" | sed 's/[^ ]* //; s/[.].*//; s/ /:/3; s/ /:/3; s/,//')" '+%F %H%M')
mv "$f" "$prefix $datestr.$ext"
done
Here is an example of the script in operation:
$ ls -1 *.*
Photo Jul 04, 5 49 44 PM.jpg
Video Jun 22, 11 21 00 AM.mov
$ bash script
$ ls -1 *.*
Photo 2015-07-04 1749.jpg
Video 2015-06-22 1121.mov
While this is not a simple parse and reformat of a date, it isn't that difficult. The bash string tools of parameter expansion/substring removal are all you need to parse the pieces of the filename into a form that date can read (see String Manipulation). date -d (GNU date) is then used to generate a new date string, in the format wanted for the filename, from the contents of the original filename.
Note: the following presumes the Dropbox filenames are in the format you have specified. (It doesn't care what the prefix or the extension is, as long as the rest matches that format.) Here is an example of isolating the pieces of the filename needed to generate a date in the format specified.
Further, all spaces have been removed from the filename. While you originally showed a space between the day and hours, I will not provide an example of poor practice by inserting a space in a filename. As such, the spaces have been replaced with '_' and '-':
#!/bin/bash
# Photo Jul 04, 5 49 44 PM.jpg
# Video Jun 22, 11 21 00 AM.mov
# fn="Photo Jul 04, 5 49 44 PM.jpg"
fn="Video Jun 22, 11 21 00 AM.mov"
ext=${fn##*.} # determine extension
prefix=${fn%% *} # determine prefix (Photo or Video)
datestr=${fn%.${ext}} # remove extension from filename
datestr=${datestr#${prefix} } # remove prefix from datestr
day=${datestr%%,*} # isolate Month and date in day
ampm=${datestr##* } # isolate AM/PM in ampm
datestr=${datestr% ${ampm}} # remove ampm from datestr
timestr=${datestr##*, } # isolate time in timestr
timestr=$(tr ' ' ':' <<<"$timestr") # translate spaces to ':' using herestring
cmb="$day $timestr $ampm" # create combined date/proper format (AM/PM included so PM times convert correctly)
## create date/time string for filename
datetm=$(date -d "$cmb" '+%Y%m%d-%H%M')
newfn="${prefix}_${datetm}.${ext}"
## example moving of file to new name
# (assumes you handle the path correctly)
printf "mv '%s' %s\n" "$fn" "$newfn"
# mv "$fn" "$newfn" # uncomment to actually use
exit 0
Example/Output
$ bash dateinfname.sh
mv 'Video Jun 22, 11 21 00 AM.mov' Video_20150622-1121.mov

Cutting part of line/string in shell scripting

I have the following line:
Jan 13, 2014 1:01:31 AM
I want to remove the seconds part of the line. The result should be:
Jan 13, 2014 1:01 AM
How can this be done?
Use parameter expansion:
t='Jan 13, 2014 1:01:31 AM'
ampm=${t: -2} # last two characters
echo "${t%:*} $ampm" # remove everything after the last :
Using sed:
s='Jan 13, 2014 1:01:31 AM'
sed 's/:[0-9]*\( [AP]M\)/\1/' <<< "$s"
Jan 13, 2014 1:01 AM
You can give this a try:
sed 's/:[^:]* / /'
with your example:
$ echo "Jan 13, 2014 1:01:31 AM" | sed 's/:[^:]* / /'
Jan 13, 2014 1:01 AM
Another way, if your date command is GNU date, which supports the -d option (%-l prints the 12-hour value without the padding space that %l adds):
$ str="Jan 13, 2014 1:01:31 AM"
$ date -d "$str" +"%b %d, %Y %-l:%M %p"
Jan 13, 2014 1:01 AM

Resources