So, everything works fine in the code, except for one tiny little thing.
This part:
if [ "$LIMITHOURS" -gt "0" -a "$LIMITHOURS" -lt "24" ]; then
SDATE=$( echo "01/jan/2003:11:00:06 +0100"| sed 's/[/]/ /g' |sed 's/:/ /')
EDATE=$(date --date "$SDATE - $x seconds" +"%d%m%Y%H%M%S")
#echo "$SDATE"
#echo "$EDATE"
while read LINE; do
CDATE=$( awk '{print $4}'| sed 's/[[]//' | sed 's/[/]//g' |sed 's/://g' )
DATE=$(date --date "$CDATE" +"%d%m%Y%H%M%S")
#echo "$CDATE"
done < "$FILENAME"
When I try to run the script, I get the error message "date: Argument list too long", and I know that the problem is in the while loop, with:
DATE=$(date --date "$CDATE" +"%d%m%Y%H%M%S")
Does anyone know a solution for this? I want the date format in ddmmYYYYHHMMSS, e.g. 23102002120022.
You can find rest of the script here:

This code:
while read LINE; do
CDATE=$( awk '{print $4}'| sed 's/[[]//' | sed 's/[/]//g' |sed 's/://g' )
DATE=$(date --date "$CDATE" +"%d%m%Y%H%M%S")
#echo "$CDATE"
done < "$FILENAME"
will read one line from $FILENAME into the variable LINE, but awk never sees that line: it reads its own standard input, which is the rest of the file, and prints the fourth field of every remaining line. The resulting CDATE value therefore holds many dates at once and is probably too large to fit on a single command line, which is why date complains. You probably wanted
echo "$LINE" | awk '{print $4}' | ...
A simpler way to strip the undesirable characters from LINE, however, is
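One way to do that, as a minimal sketch, is bash parameter expansion on the fourth field. The sample log line and the variable name stamp are assumptions made for illustration, and the last step assumes GNU date, as used in the question.

```shell
#!/usr/bin/env bash
# Hypothetical log line; the timestamp is assumed to be the fourth field.
LINE='127.0.0.1 - - [23/Oct/2002:12:00:22 +0100] "GET / HTTP/1.0" 200 512'

read -r _ _ _ stamp _ <<< "$LINE"   # stamp is now [23/Oct/2002:12:00:22
stamp=${stamp#\[}                   # drop the leading bracket
stamp=${stamp//\// }                # turn every slash into a space
stamp=${stamp/:/ }                  # turn only the first colon into a space
echo "$stamp"                       # 23 Oct 2002 12:00:22

# GNU date can parse the cleaned string; other date implementations may not.
date --date "$stamp" +"%d%m%Y%H%M%S"
```

Inside the original loop, the read and the three expansions would replace the awk and sed pipeline, so no external process reads from the file behind the loop's back.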


Using sed to extract the middle of a line/filename

I have multiple files named:
I want to use sed to print out:
I want to use the "printed" words in a command like this (prokka is a tool for genome annotation):
prokka $file --outdir `echo $file | sed s/\.fasta//` --genus `echo $file | sed s/_.*\.fasta//` --species `echo $file | sed <something here>` --strain `echo $file | sed <something here>`
I would appreciate the help. I am very new to all of this, and as you see above, I only know how to print out Genus.
Below I have some additional questions (no need to answer these if it only complicates things further). This is one of my attempts to print species, and the questions are the following:
sed s/.*_//1 | sed s/_.*\.fasta//
I know the second command isn't correct. I assume it needs to start from the second _, but I don't know how to do that, since the continuation (that is .fasta) is unique.
When used alone, sed s/.*_//1 returns strain.fasta. How to make it not skip the first _?
Combining commands (either as you see above, or with ;) doesn't seem to work for me.
You can split the file name with read, using _ and . as field separators:
IFS='_.' read -r genus species strain _ <<< "$file"
Then you can use the variables in the command (the output directory is the file name without its extension):
prokka "$file" --outdir "${file%.*}" --genus "$genus" --species "$species" --strain "$strain"
See this online demo:
file=Genus_species_strain.fasta # example name
IFS='_.' read -r genus species strain _ <<< "$file"
echo "${file%.*}" # outdir
echo "$genus"
echo "$species"
echo "$strain"
One-liners without setting multiple variables
Using sed capture groups:
One-liner
$(echo "$file" | sed "s/^\([^_]*\)_\([^_]*\)_\([^.]*\)\..*/prokka & --outdir \1_\2_\3 --genus \1 --species \2 --strain \3/")
Using Bash string manipulation:
One-liner
prokka "$file" --outdir "${file%.*}" --genus "${file%%_*}" --species "$(s=${file#*_}; echo "${s%%_*}")" --strain "$(s=${file#*_}; s=${s#*_}; echo "${s%.*}")"
Awk one-liner
$(echo "$file" | awk -F'[_.]' '{print "prokka " $0 " --outdir " $1 "_" $2 "_" $3 " --genus " $1 " --species " $2 " --strain " $3}')
You can now use the commands above within a loop, or with xargs, with the file variable pointing to each file name.
Each one builds a prokka command and executes it directly.
I hope it works for you. Accept the answer if you find it more efficient.
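As a rough sketch of the loop mentioned above, the splitting can be wrapped in a small function that only prints the command it would run. The function name build_prokka_cmd and the two sample file names are hypothetical, and the Genus_species_strain.fasta naming scheme is assumed.

```shell
#!/usr/bin/env bash
# Print the prokka command for one file named Genus_species_strain.fasta.
build_prokka_cmd() {
    local file=$1 genus species strain
    # Split on both _ and . ; the trailing _ swallows the extension.
    IFS='_.' read -r genus species strain _ <<< "$file"
    printf 'prokka %s --outdir %s --genus %s --species %s --strain %s\n' \
        "$file" "${file%.*}" "$genus" "$species" "$strain"
}

# Dry run over hypothetical file names; in practice this would be *.fasta.
for f in Escherichia_coli_K12.fasta Bacillus_subtilis_168.fasta; do
    build_prokka_cmd "$f"
done
```

Printing first makes it easy to check the arguments before letting the loop call prokka itself.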
Using sed
$ file=path_to_file
$ sed "s/\(\([^_]*\)_\([^_]*\)_\([^.]*\)\).*/prokka $file --outdir \1 --genus \2 --species \3 --strain \4/e" <(echo *.fasta)
Output of command executed
prokka path_to_file --outdir Genus_species_strain --genus Genus --species species --strain strain

Extracting a substring from a variable using bash script

I have a bash variable with value something like this:
There are no spaces within value. This value can be very long or very short. Here pairs such as 65:3.0 exist. I know the value of a number from the first part of pair, say 65. I want to extract the number 3.0 or pair 65:3.0. I am not aware of the position (offset) of 65.
I will be grateful for a bash-script that can do such extraction. Thanks.
Probably awk is the most straight-forward approach:
awk -F: -v RS=',' '$1==65{print $2}' <<< "$var"
Or to get the pair:
awk -F: -v RS=',' '$1==65' <<< "$var"
Here's a pure Bash solution:
while read -r -d, i; do
[[ $i = 65:* ]] || continue
echo "$i"
done <<< "$var,"
You may use break after echo "$i" if there's only one 65:... in var, or if you only want the first one.
To get the value 3.0: echo "${i#*:}".
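The loop above can be sketched as a reusable function that prints every value stored under a given key. The function name values_for and the sample string are assumptions for illustration.

```shell
#!/usr/bin/env bash
# values_for KEY LIST: print each value whose key equals KEY, one per line.
values_for() {
    local key=$1 list=$2 pair
    # The appended comma makes read see the last pair as a complete record.
    while IFS= read -r -d, pair; do
        [[ $pair = "$key":* ]] && printf '%s\n' "${pair#*:}"
    done <<< "$list,"
    return 0
}

values_for 65 '1:2.0,65:3.0,7:1.1,65:4.5'
```

Quoting "$key" in the comparison keeps it literal, so only the :* part acts as a pattern.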
Another (pure Bash) approach, without parsing the string explicitly. I'm assuming you're only looking for the first 65 in the string, and that it is present in the string:
tmpvar=,$var; tmpvar=${tmpvar#*,65:}; value=${tmpvar%%,*}
echo "$value"
This will be very slow for long strings!
Same as above, but will output all the values corresponding to 65 (or none if there are none):
tmpvar=,$var
while [[ $tmpvar = *,65:* ]]; do
tmpvar=${tmpvar#*,65:}; echo "${tmpvar%%,*}"
done
Again, this will be slow for long strings!
The fastest I can obtain in pure Bash is my original answer (and it's fine with 10000 fields):
IFS=, read -ra ary <<< "$var"
for i in "${ary[@]}"; do
[[ $i = 65:* ]] || continue
echo "$i"
done
In fact, no, the fastest I can obtain in pure Bash is with this regex:
[[ ,$var, =~ ,65:([^,]+), ]] && echo "${BASH_REMATCH[1]}"
Test of this vs awk,
where the 65:3.0 is at the end:
printf -v var '%s:3.0,' {100..11000}
time awk -F: -v RS=',' '$1==65{print $2}' <<< "$var"
shows 0m0.020s (rough average) whereas:
time { [[ ,$var, =~ ,65:([^,]+), ]] && echo "${BASH_REMATCH[1]}"; }
shows 0m0.008s (rough average too).
where the 65:3.0 is not at the end:
printf -v var '%s:3.0,' {1..10000}
time awk -F: -v RS=',' '$1==65{print $2}' <<< "$var"
shows 0m0.020s (rough average) and with early exit:
time awk -F: -v RS=',' '$1==65{print $2;exit}' <<< "$var"
shows 0m0.010s (rough average) whereas:
time { [[ ,$var, =~ ,65:([^,]+), ]] && echo "${BASH_REMATCH[1]}"; }
shows 0m0.002s (rough average).
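For reuse, the regex test can be wrapped in a function. This is only a sketch: the function name lookup is hypothetical, and the key is assumed to be a plain number with no regex metacharacters.

```shell
#!/usr/bin/env bash
# lookup KEY LIST: print the value paired with KEY in a comma-separated
# list of key:value pairs; return 1 if KEY is absent.
lookup() {
    local key=$1 list=$2
    local re=",${key}:([^,]+),"
    # Surrounding the list with commas lets the first and last pairs match too.
    [[ ,$list, =~ $re ]] || return 1
    printf '%s\n' "${BASH_REMATCH[1]}"
}

var='12:1.5,65:3.0,165:9.9'
lookup 65 "$var"     # prints 3.0
```

Because the pattern requires a comma before the key, 165:9.9 is not mistaken for key 65.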
With grep:
grep -o '\b65\b[^,]*' <<<"$var"
grep -oP '\b65\b:\K[^,]*' <<<"$var"
\K discards everything matched up to that point from the reported match, so only the text after 65: is printed. It requires grep's Perl-compatible mode (-P).
Here is a GNU awk solution:
awk -vRS="(^|,)65:" -F, 'NR>1{print $1}' <<< "$var"
echo $var | tr , '\n' | awk '/65/'
tr , '\n' turns each comma into a newline
awk '/65/' picks every line containing 65 (this also matches a value or a longer key that contains 65, such as 165:1.0)
echo $var | tr , '\n' | awk -F: '$1 == 65 {print $2}'
-F: use : as separator
$1 == 65 pick line with 65 as first field
{ print $2} print second field
Using sed
sed -e 's/^.*,\(65:[0-9.]*\),.*$/\1/' <<<",$var,"
There are two different ways to protect against 65:3.0 being the first or last pair in the line. Above, commas are added around the variable so that the pair is always surrounded by commas. Below, the GNU extension \? is used to specify zero or one occurrence.
sed -e 's/^.*,\?\(65:[0-9.]*\),\?.*$/\1/' <<<$var
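Both handle 65:3.0 regardless of where it appears in the string. Note that the second form can be fooled by a key that merely ends in 65, such as 165:, because the comma before 65: is optional.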
Try grep -E (egrep) like below:
echo "$myvar" | grep -Eo '\b65:[0-9]+\.[0-9]+'

Weird bash results using cut

I am trying to run this command:
./smstocurl SLASH2.911325850268888.911325850268896
smstocurl script:
model=$(echo \&model=$1 | cut -d'.' -f 1)
echo $model
imea1=$(echo \&simImea1=$1 | cut -d'.' -f 2)
echo $imea1
imea2=$(echo \&simImea2=$1 | cut -d'.' -f 3)
echo $imea2
echo $model$imea1$imea2
Result Received
Result Expected
What am I missing here ?
You are cutting on the dot (.). In the first case, the field you ask for (-f 1) is the first one, which contains the &model= prefix, so it is printed.
However, in the other cases you get the 2nd and 3rd fields (-f 2, -f 3), so the &simImea1= and &simImea2= prefixes, which sit in the first field, get cut off.
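A small sketch of the effect, using the argument from the question. The variable name arg is an assumption; the second half shows the most direct repair, which is to cut first and add the prefix afterwards.

```shell
#!/usr/bin/env bash
arg='SLASH2.911325850268888.911325850268896'

# The prefix sits in field 1, so asking for field 2 drops it.
echo "&simImea1=$arg" | cut -d'.' -f 2     # 911325850268888

# Cutting first and adding the prefix afterwards keeps it.
imea1="&simImea1=$(echo "$arg" | cut -d'.' -f 2)"
echo "$imea1"                              # &simImea1=911325850268888
```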
Instead, I would use something like this:
while IFS="." read -r model imea1 imea2; do
printf "&model=%s&simImea1=%s&simImea2=%s\n" "$model" "$imea1" "$imea2"
done <<< "$1"
Note the use of printf and variables to have more control over what we are writing. Using a lot of escapes, as in your echo commands, can be risky.
while IFS="." read -r model imea1 imea2; do printf "&model=%s&simImea1=%s&simImea2=%s\n" $model $imea1 $imea2
done <<< "SLASH2.911325850268888.911325850268896"
Alternatively, this sed does it:
sed -r 's/^([^.]*)\.([^.]*)\.([^.]*)$/\&model=\1\&simImea1=\2\&simImea2=\3/' <<< "$1"
by catching each block of words separated by dots and printing back.
You can also do it this way:
./program SLASH2.911325850268888.911325850268896
String=`echo $1 | sed "s/\./\&simImea1=/"`
String=`echo $String | sed "s/\./\&simImea2=/"`
echo "&model=$String"
awk way
awk -F. '{print "&model="$1"&simImea1="$2"&simImea2="$3}' <<< "SLASH2.911325850268888.911325850268896"
awk -F. '$0="&model="$1"&simImea1="$2"&simImea2="$3' <<< "SLASH2.911325850268888.911325850268896"

Parsing date and time format - Bash

I have date and time format like this(yearmonthday):
20141105 11:30:00
I need to assign the year, month, day, hour and minute values to variables.
I can do it for year, day and hour like this:
year=$(awk '{print $1}' log.log | sed 's/^\(....\).*/\1/')
day=$(awk '{print $1}' log.log | sed 's/^.*\(..\).*/\1/')
hour=$(awk '{print $2}' log.log | sed 's/^\(..\).*/\1/')
How can I do this for month and minute?
And I need that for every line of my log file:
20141105 11:30:00 /bla/text.1
20141105 11:35:00 /bla/text.2
20141105 11:40:00 /bla/text.3
I'm trying to read this log file line by line and do this:
mkdir -p "/bla/backup/$year/$month/$day/$hour/$minute"
mv $file "/bla/backup/$year/$month/$day/$hour/$minute"
Here is my non-working code:
while read line
do
file=$(awk '{print $3}')
if [ -f "$file" ]; then
printf -v path "%s/%s/%s/%s/%s" $year $month $day $hour $minute
mkdir -p "/bla/backup/$path"
mv $file "/bla/backup/$path"
fi
done < $LOG
You don't need to call out to awk or date at all; use bash's substring operations:
d="20141105 11:30:00"
yr=${d:0:4} mo=${d:4:2} dy=${d:6:2} hr=${d:9:2} mi=${d:12:2}
printf -v dir "/bla/%s/%s/%s/%s/%s/" $yr $mo $dy $hr $mi
echo "$dir"
Or directly, without all the variables:
printf -v dir "/bla/%s/%s/%s/%s/%s/" ${d:0:4} ${d:4:2} ${d:6:2} ${d:9:2} ${d:12:2}
Given your log file:
while read -r date time file; do
d="$date $time"
printf -v dir "/bla/%s/%s/%s/%s/%s/" ${d:0:4} ${d:4:2} ${d:6:2} ${d:9:2} ${d:12:2}
mkdir -p "$dir"
mv "$file" "$dir"
done < filename
or, making a big assumption that there are no whitespace or globbing characters in your filenames:
sed -r 's#(....)(..)(..) (..):(..):.. (.*)#mkdir -p /bla/\1/\2/\3/\4/\5 \&\& mv \6 /bla/\1/\2/\3/\4/\5#' filename | sh
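The substring approach can also be sketched as a function that computes the target directory, with a dry run that only prints what would be done. The function name backup_dir_for is hypothetical, and the /bla/backup prefix is taken from the question.

```shell
#!/usr/bin/env bash
# backup_dir_for DATE TIME: build the directory for a YYYYMMDD date and
# an HH:MM:SS time using substring expansion only.
backup_dir_for() {
    local date=$1 time=$2
    printf '/bla/backup/%s/%s/%s/%s/%s' \
        "${date:0:4}" "${date:4:2}" "${date:6:2}" "${time:0:2}" "${time:3:2}"
}

# Dry run over sample log lines: print the commands instead of running them.
while read -r date time file; do
    dir=$(backup_dir_for "$date" "$time")
    echo "mkdir -p $dir && mv $file $dir"
done <<'EOF'
20141105 11:30:00 /bla/text.1
20141105 11:35:00 /bla/text.2
EOF
```

Once the printed commands look right, the echo line can be swapped for the real mkdir -p and mv calls with quoted arguments.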
The date command can also do this:
year=$(date +'%Y' -d'20141105 11:30:00')
day=$(date +'%d' -d'20141105 11:30:00')
month=$(date +'%m' -d'20141105 11:30:00')
minutes=$(date +'%M' -d'20141105 11:30:00')
echo "$year---$day---$month---$minutes"
You can do it with awk alone:
month=$(awk '{print substr($1,5,2)}' log.log)
year=$(awk '{print substr($1,1,4)}' log.log)
minute=$(awk '{print substr($2,4,2)}' log.log)
I guess you are processing a log file in which each line starts with the date string. You may have already written a loop to handle each line; in that loop, you could do:
d="$(awk '{print $1,$2}' <<<"$line")"
year=$(date -d"$d" +%Y)
month=$(date -d"$d" +%m)
day=$(date -d"$d" +%d)
min=$(date -d"$d" +%M)
Don't repeat yourself.
d='20141105 11:30:00'
IFS=' ' read -r year month day min < <(date -d"$d" '+%Y %m %d %M')
echo "year: $year"
echo "month: $month"
echo "day: $day"
echo "min: $min"
The trick is to ask date to output the fields you want, separated by a character (here a space), to put this character in IFS, and to ask read to do the splitting for you. This way, you execute date only once and spawn only one subshell.
If the date comes from the first line of the file log.log, here's how you can assign it to the variable d:
IFS= read -r d < log.log
eval "$(
echo '20141105 11:30:00' \
| sed 'G;s/\(....\)\(..\)\(..\) \(..\):\(..\):\(..\) *\(.\)/Year=\1\7Month=\2\7Day=\3\7Hour=\4\7Min=\5\7Sec=\6/'
)"
This builds a string of assignments and passes it to eval. You could easily adapt it to also validate the content by replacing each dot with a more specific pattern, such as [0-5][0-9] for minutes and seconds.
This is the POSIX version, so use --posix with GNU sed.
I wrote a function that I usually cut and paste into my script files:
function getdate()
{
local a
a=(`date "+%Y %m %d %H %M %S"`)
year=${a[0]} month=${a[1]} day=${a[2]} hour=${a[3]} minute=${a[4]} sec=${a[5]}
}
Call getdate in the script file, on a line of its own, and then use the variables:
echo "year=$year,month=$month,day=$day,hour=$hour,minute=$minute,second=$sec"
Of course, you can modify what I provided or use answer [6] above.
The function takes no arguments.

shell $ character overwrites the previous variable

I'm trying to write a shell script but I'm stuck at one point.
I have a program that creates data daily and puts it in a directory like this: home/meee/data/2013/07/22/mydata
My problem is I'm trying to change directory using date. Here is my script:
x=$(date -u -v-2H "+%Y-%m-%d")
echo $x
year=$(echo $x | cut -d"-" -f1)
month=$(echo $x | cut -d"-" -f2)
day=$(echo $x | cut -d"-" -f3)
echo $year
echo $month
echo $day
echo $d1
There is no problem with year, month and day; they work. But the output of echo $d1 is /07me/sensor/data/2013. Similarly, when I write echo $year$day it gives 2312 (the characters of day overwrite the first two characters of the year).
I tried many other syntaxes, such as using " instead of ', or no quotes at all, removing { and so on. But nothing changed.
In short, when I write two variables ($var1 $var2) on the same line, the second $ behaves as if it goes to the beginning of the line and starts overwriting the first variable.
I've been searching for this, but I couldn't find anything related, and there are a lot of solutions on Stack Overflow that solve problems using $var1$var2.
What am I doing wrong, and how can I solve it?
I'm working on FreeBSD 9.0-RELEASE amd64 and using sh
Any help will be appreciated.
Somehow, your commands are introducing carriage returns to your variables, which affect the output when the variable is not the last thing echoed. You can confirm this by passing the value through hexdump or od:
printf "%s" "$x" | hexdump -C # Look for 0d in the output.
printf "%s" "$year" | hexdump -C # Look for 0d in the output.
printf "%s" "$month" | hexdump -C # Look for 0d in the output.
printf "%s" "$day" | hexdump -C # Look for 0d in the output.
I don't think this will fix the problem, but you can get the year, month, and day without forking so many external programs:
IFS=- read year month day <<EOF
$(date -u -v-2H "+%Y-%m-%d")
EOF
or more simply
read year month day <<EOF
$(date -u -v-2H "+%Y %m %d")
EOF
You’re likely using MinGW or some other not-quite-Unix environment for Windows, which introduces carriage return (CR, \r) characters at the end of each line (from the Unix point of view).
So change this to either:
x=$(date -u -v-2H "+%Y-%m-%d" | sed $'s/\r$//')
echo $x
year=$(echo $x | cut -d"-" -f1 | sed $'s/\r$//')
month=$(echo $x | cut -d"-" -f2 | sed $'s/\r$//')
day=$(echo $x | cut -d"-" -f3 | sed $'s/\r$//')
Or, even better:
x=$(date -u -v-2H "+%Y %m %d ")
echo $x
set -- $x
year=$1 month=$2 day=$3
Note the extra space after %d, which ensures that the CR becomes $4 instead of being attached to the day.
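The overwriting effect and a fix can be reproduced without the real date output. In this sketch the two values are simulated (an assumption standing in for what date and cut returned), and tr -d '\r' is used as a plain alternative to the sed calls above.

```shell
#!/usr/bin/env bash
# Simulated values carrying a trailing carriage return.
year=$'2013\r'
day=$'22\r'

# On a typical terminal the CR moves the cursor back to column 0,
# so the day is drawn over the start of the year.
echo "$year$day"

# od -c makes the hidden character visible as \r.
printf '%s' "$year" | od -c

# Strip every CR before using the values.
year=$(printf '%s' "$year" | tr -d '\r')
day=$(printf '%s' "$day" | tr -d '\r')
echo "$year$day"    # 201322
```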
