How to extract text by date range? - bash

I am trying to extract text from a file according to a given date range. The date range will be decided by the user, but here I am just using a fixed range.
The file content after using grep is as follows:
ronnie#ronnie:~$ zgrep added new.txt
Jul 02 21:03 : update: added Linkin Park/Living Things(2012)/02 - Linkin Park - In My Remains.mp3
Jul 02 21:03 : update: added Linkin Park/Living Things(2012)/03 - Linkin Park - Burn It Down.mp3
Jul 07 10:33 : update: added Linkin Park/Living Things(2012)/04 - Linkin Park - Lies Greed Misery.mp3
Jul 09 07:54 : update: added Linkin Park/Living Things(2012)/04 - Linkin Park - Lies Greed Misery.mp3
Now, let's suppose I want to extract the text between the dates Jul 07 and Jul 09. I used the command below for that:
zgrep added new.txt | sed '/"Jul 09"/,/"Jul 07"/p'
which gave me the following output
Jul 02 21:03 : update: added Linkin Park/Living Things(2012)/02 - Linkin Park - In My Remains.mp3
Jul 02 21:03 : update: added Linkin Park/Living Things(2012)/03 - Linkin Park - Burn It Down.mp3
Jul 07 10:33 : update: added Linkin Park/Living Things(2012)/04 - Linkin Park - Lies Greed Misery.mp3
Jul 09 07:54 : update: added Linkin Park/Living Things(2012)/04 - Linkin Park - Lies Greed Misery.mp3
So, as you can see, it didn't consider the range I gave to sed.
My question is: what is the correct way to extract text according to a date range?

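For ordered input,
sed -n '/^Jul 07/,/^Jul 09/p' inputFile
is sufficient (or pipe another command's output into the same sed expression and drop the file argument). Note that the range ends at the first line matching Jul 09.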

You're pretty close. What you want is this:
zgrep added new.txt | sed -n -e '/Jul 07/,/Jul 09/p'
The changes were:
added -n, which means lines won't be printed unless you explicitly ask for them with p
added -e, just for clarity
removed the double quotes around the strings. They are not needed because the expression is already enclosed in single quotes, and because no double quotes appear in your file, the quoted patterns never matched.
put the patterns in start,end order (Jul 07 first), since sed starts printing at the first pattern and stops at the second
Note that this, like your version, will only work if the lines are always ordered by date/time in the first place.
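Since the range is meant to come from the user, another option is to compare the dates themselves instead of depending on a line that matches each endpoint exactly. The following is only a minimal sketch: it assumes every line starts with "Mon DD" as in the sample output and that all entries fall within the same year. The name filter_by_date is a hypothetical helper, not part of the original commands.

```shell
#!/bin/bash
# Hypothetical helper: print stdin lines whose leading "Mon DD" lies within
# [start, end], inclusive. Assumes a single year and English month names.
filter_by_date() {
    awk -v start="$1" -v end="$2" '
        # Turn ("Jul", "07") into a sortable number such as 707.
        function key(mon, day,    pos) {
            pos = index("JanFebMarAprMayJunJulAugSepOctNovDec", mon)
            if (length(mon) != 3 || pos == 0 || (pos - 1) % 3) return -1
            return ((pos + 2) / 3) * 100 + day
        }
        BEGIN {
            split(start, s, " "); lo = key(s[1], s[2])
            split(end, e, " ");   hi = key(e[1], e[2])
        }
        { k = key($1, $2) }
        k > 0 && k >= lo && k <= hi
    '
}
```

It could be used as zgrep added new.txt | filter_by_date "Jul 07" "Jul 09". Unlike the sed range, this keeps every line dated Jul 09 and still works when no line carries exactly the start or end date.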

Related

Parse line for specific date format?

Writing a script using bash. I am trying to look through lines in a file for a specific date format:
date +"%a %b %d %T %Z %Y"
For example, if the line were
/foo/bar/foobar this 12 is 411 arbitrary stuff in the line Wed Jun 10 10:10:10 PST 2017
I would want to obtain Wed Jun 10 10:10:10 PST 2017.
Any way to search for specific date formats?
I'm not sure whether you'll agree with this approach, but if this is for some quick, non-recurring work, I wouldn't look for a perfect solution that handles every scenario.
To start with, you can use the following overly generic pattern to match the part you want.
cat file | sed -n 's/.*\(... ... .. ..:..:.. ... ....\).*/\1/p'
Then you can tighten this further, restricting the matches as needed.
E.g.
cat file | sed -n 's/.*\([A-Za-z]\{3\} [A-Za-z]\{3\} [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9] [A-Z]\{3\} [0-9]\{4\}\).*/\1/p'
Note that this still is not perfect and can match invalid contents. If you find it still not good enough, you can further fine tune the pattern to the point you want.
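If only the matching part is needed, grep can print it directly. This is a sketch under a few assumptions: the -o option is available (it is in GNU and BSD grep, but it is not required by POSIX), the time zone abbreviation is three to five capital letters, and extract_dates is just an illustrative name.

```shell
#!/bin/bash
# Hypothetical helper: print every substring that looks like the output of
#   date +"%a %b %d %T %Z %Y"
# Reads the files given as arguments, or stdin when there are none.
extract_dates() {
    grep -oE '[A-Z][a-z]{2} [A-Z][a-z]{2} [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9] [A-Z]{3,5} [0-9]{4}' "$@"
}
```

Like the sed version, this checks only the shape of the text, so it would still accept an impossible value such as day 39 or hour 29.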

Convert .CSV file to .txt file using bash

I have a CSV with lots of lines delimited by comma. I have to convert it to text.
66012523,39,Feb 02 2015 05:19AM,
66012523,39,Feb 02 2015 09:53AM,
66012523,39,Feb 02 2015 01:38PM,
I used the command cp source.csv destination.csv and also cat source.csv > destination.txt, but the output does not keep the same format, with each record on its own line. It just gets appended together. The output looks like
66012523,39,Feb 02 2015 05:19AM,66012523,39,Feb 02 2015 09:53AM,66012523,39,Feb 02 2015 01:38PM
How do I make each record appear on a new line? Please help.
I hypothesise that the first block in your question is what you WANT, and what you actually HAVE is
66012523,39,Feb 02 2015 05:19AM,66012523,39,Feb 02 2015 09:53AM,66012523,39,Feb 02 2015 01:38PM
So what you want to do, I hypothesise, is split this up into separate lines.
Am I right?
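This is a bit rough-and-ready but it works. It relies on each group ending in "M," and there being no other "M,"s in the text, which there aren't, but I wouldn't call it robust. The & keeps the matched "M," and the newline is added after it (\n in the replacement is understood by GNU sed).
sed 's/M,/&\n/g' source.csv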
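An alternative that does not depend on the "M," text is to regroup the data by field count. This is a sketch that swaps in a different technique from the sed command above, and it assumes every record has exactly three comma-separated fields with no commas inside a field; split_records is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: read one long comma-separated line on stdin and
# print it back as records of three fields each, with a trailing comma.
split_records() {
    tr ',' '\n' |        # one field per line
        sed '/^$/d' |    # drop the empty field left by a trailing comma
        paste -d, - - - |  # glue every three lines back together
        sed 's/$/,/'     # restore the trailing comma of each record
}
```

It could be run as split_records < source.csv > destination.txt.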

SHELL SCRIPT: Save egrep results into a Variable

Hi, I am trying to save my egrep results into a variable and loop over them with a foreach.
However, I keep getting the following error with each of the following attempts:
#!/bin/sh
RESULT1=$(egrep 'Begin|End' $SYNCLOG)
RESULT2=egrep 'Begin|End' $SYNCLOG
RESULT3="egrep 'Begin|End' $SYNCLOG"
Error
./test.sh: syntax error at line 24: `RESULT=$' unexpected
I am trying to get my egrep results to be saved into the variable.
The egrep will return the following results
File 2:Begin - Date :Fri Jan 10 22:44:47 SGT 2014
File 2:End - Date :Fri Jan 10 22:47:06 SGT 2014
File 3:Begin - Date : Tue Jan 11 22:32:54 SGT 2014
File 3:End - Date : Tue Jan 11 22:34:43 SGT 2014
File 4:Begin - Date : Wed Jan 12 22:46:15 SGT 2014
File 4:End - Date : Wed Jan 12 22:48:23 SGT 2014
File 5:Begin - Date : Thu Jan 13 22:30:31 SGT 2014
File 5:End - Date : Thu Jan 13 22:32:51 SGT 2014
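The problem is this sh shebang:
#!/bin/sh
combined with the use of $(...). Although $(...) is standard POSIX syntax, a legacy Bourne shell installed as /bin/sh (as on older Solaris systems) does not understand it, whereas bash does.
To fix it, you can change the shebang to use bash instead:
#!/bin/bash
Or else use the backtick form of command substitution, which that /bin/sh accepts:
RESULT1=`egrep 'Begin|End' $SYNCLOG`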
It seems you have backticks somewhere on line 24. Paste your whole script. The shell script excerpt above, i.e.
RESULT1=$(egrep 'Begin|End' $SYNCLOG)
should work.
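For the "foreach" part of the question, a while read loop is a common way to walk over the saved lines one at a time. The sketch below assumes bash (it uses local and a here-string), and the function name list_sync_events and the "event:" prefix are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: save the matching lines of a log file in a variable,
# then process them one line at a time.
list_sync_events() {
    local log=$1 results line
    results=$(grep -E 'Begin|End' "$log")
    # Nothing matched: avoid looping once over an empty string.
    [ -n "$results" ] || return 0
    while IFS= read -r line; do
        printf 'event: %s\n' "$line"
    done <<< "$results"
}
```

Called as list_sync_events "$SYNCLOG", it prints each Begin/End line with the prefix. Quoting "$log" and "$results" keeps paths and lines containing spaces intact.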

Use cat to combine mp3 files based on filename

I have a large number of downloaded radio programs that consist of 4 mp3 files each. The files are named like so:
Show Name - Nov 28 2011 - Hour 1.mp3
Show Name - Nov 28 2011 - Hour 2.mp3
Show Name - Nov 28 2011 - Hour 3.mp3
Show Name - Nov 28 2011 - Hour 4.mp3
Show Name - Nov 29 2011 - Hour 1.mp3
Show Name - Nov 29 2011 - Hour 2.mp3
Show Name - Nov 29 2011 - Hour 3.mp3
Show Name - Nov 29 2011 - Hour 4.mp3
Show Name - Nov 30 2011 - Hour 1.mp3 and so on...
I have used the cat command to join the files with great success by moving the four files of the same date into a folder and using the wildcard:
cat *.mp3 > example.mp3
The files are all the same bitrate, sampling rate, etc.
What I would like to do is run a script that will look at the file name and combine hours 1-4 of each date and name the file accordingly. Just the show name, the date and drop the 'Hour 1'.
I looked around and found a number of scripts that can be used to move files around based on their names but I'm not adept enough at bash scripting to be able to understand the methods used and adapt them to my needs.
I'm using Ubuntu 14.04.
Many thanks in advance
You can use a bash for loop to find each distinct date name and then construct the expected mp3 names from that.
Because your file names contain spaces and the loop iterates over unquoted command output, you'll also have to change the Internal Field Separator so that the shell does not split on spaces for the duration of the script.
SAVEIFS=$IFS
IFS=$'\n\b'
for mdy in `/bin/ls *mp3 | cut -d' ' -f'4,5,6' | sort -u`; do
    cat *${mdy}*.mp3 > "showName_${mdy}_full.mp3"
done
IFS=$SAVEIFS
This won't alert you if some hours are missing for some particular date. It'll just join together whatever's there for that date.
Note: The comment pointing out that cat probably won't work for these files is spot on. The resulting file will probably be corrupted. You probably want to use something like mencoder or ffmpeg instead. (Check out this thread.)
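The same grouping can be done without parsing ls output or changing IFS, by deriving each date's prefix from its "Hour 1" file. This is a sketch only: it assumes the exact " - Hour N.mp3" naming shown in the question, join_hours is a hypothetical name, and it still uses cat, so the caveat above about possibly corrupted output applies unchanged.

```shell
#!/bin/bash
# Hypothetical helper: for every "<prefix> - Hour 1.mp3" in a directory,
# concatenate hours 1-4 of that prefix into "<prefix>.mp3".
join_hours() {
    local dir=$1 first prefix
    for first in "$dir"/*" - Hour 1.mp3"; do
        [ -e "$first" ] || continue          # the glob matched nothing
        prefix=${first%" - Hour 1.mp3"}      # e.g. ".../Show Name - Nov 28 2011"
        # The quoted part is literal; only [1-4] is a glob, expanded in order.
        cat "$prefix - Hour "[1-4].mp3 > "$prefix.mp3"
    done
}
```

Running join_hours . in the download folder would write files such as "Show Name - Nov 28 2011.mp3". Because the output name does not end in " - Hour N.mp3", a second run does not feed the joined file back into itself.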

How can I use the Chronic natural language date/time parser to parse "12:00" as 12:00 PM?

I am using the Ruby gem "Chronic" to parse four-digit strings as DateTime objects. I am using time in military format (i.e. "0800"), which, according to the documentation, seems to be a valid format.
In most cases, Chronic parses time in this format correctly - however it always parses a four digit string beginning with "12" as 00:XX AM of the next day, never as 12:XX PM of the current day.
For example:
>> Chronic.parse("1234")
=> Thu Sep 17 00:34:00 -0600 2009
I see that if I put a colon between the hours and minutes I get the desired output:
>> Chronic.parse("12:34")
=> Wed Sep 16 12:34:00 -0600 2009
However, I want to pass the value without a colon, like this:
>> Chronic.parse("1234")
=> Wed Sep 16 12:34:00 -0600 2009
What string do I have to pass to the parser in order for Chronic to interpret "1234" as 12:34 PM of the current day?
I'm not certain, but it looks like it might be a bug. My guess is you're ending up in this corner of the code:
http://github.com/mojombo/chronic/commit/c7d9591acf5179345cbc916bd509c48acee8e744
