I need to rearrange the characters of a string in mmddyyyy format into yyyymmdd. The string is obtained from a file name (abc_def_08032011.txt).
string=$(ls abc_def_08032011.txt | awk '{print substr($0,9,8)}')
For example:
Current string: 08032011 (This may not necessarily be the current date)
Desired string: 20110803
I tried the split function, but it won't work since the string does not have any delimiter.
Any ideas/suggestions greatly appreciated.
echo 08032011 | sed 's/\(....\)\(....\)/\2\1/'
or
echo 08032011 | perl -pe 's/(....)(....)/$2$1/'
Why not use awk all the way:
echo abc_def_08032011.txt | awk '{print substr($0,13,4) substr($0,9,4)}'
or sed all the way, avoiding one awk:
echo abc_def_08032011.txt | sed 's/^........\(....\)\(....\).*$/\2\1/'
or using ksh substitution all the way to avoid spawning an awk/sed process:
s=abc_def_08032011.txt
s1="${s#????????}"
s2="${s1%.*}"
echo "${s2#????}${s2%????}"
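The same parameter-expansion idea can be wrapped in a small function. This is only a sketch: the name mdy_to_ymd is made up for illustration, and it assumes the input is exactly eight digits (anything else is rejected).

```shell
# Hypothetical helper: reorder an mmddyyyy string to yyyymmdd.
mdy_to_ymd() {
    d=$1
    # Accept exactly eight digits; reject everything else.
    case $d in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    # ${d#????} drops the first four characters (mmdd), leaving yyyy.
    # ${d%????} drops the last four characters (yyyy), leaving mmdd.
    printf '%s%s\n' "${d#????}" "${d%????}"
}

mdy_to_ymd 08032011    # prints 20110803
```

No external process is spawned, so this should behave the same in ksh, bash and dash.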
I am trying to use awk to get the name of a file given the absolute path to the file.
For example, when given the input path /home/parent/child/filename I would like to get filename
I have tried:
awk -F "/" '{print $5}' input
which works perfectly.
However, I am hard coding $5 which would be incorrect if my input has the following structure:
/home/parent/child1/child2/filename
So a generic solution requires always taking the last field (which will be the filename).
Is there a simple way to do this with the awk substr function?
Use the fact that awk splits the lines in fields based on a field separator, that you can define. Hence, defining the field separator to / you can say:
awk -F "/" '{print $NF}' input
as NF refers to the number of fields of the current record, printing $NF means printing the last one.
So given a file like this:
/home/parent/child1/child2/child3/filename
/home/parent/child1/child2/filename
/home/parent/child1/filename
This would be the output:
$ awk -F"/" '{print $NF}' file
filename
filename
filename
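To see what NF and $NF hold for each record, a small sketch like the following may help. The function name describe_paths is hypothetical. Note that the leading / produces an empty first field, so the counts are one higher than the number of path components.

```shell
# describe_paths is a made-up helper name, not part of awk.
describe_paths() {
    # NF is the field count, $NF the last field, $(NF-1) the one before it.
    awk -F/ '{ printf "%d fields, last=%s, parent=%s\n", NF, $NF, $(NF-1) }'
}

printf '%s\n' /home/parent/child1/child2/filename /home/parent/child1/filename |
    describe_paths
# 6 fields, last=filename, parent=child2
# 5 fields, last=filename, parent=child1
```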
In this case it is better to use basename instead of awk:
$ basename /home/parent/child1/child2/filename
filename
If you're open to a Perl solution, here is one similar to fedorqui's awk solution:
perl -F/ -lane 'print $F[-1]' input
-F/ specifies / as the field separator
$F[-1] is the last element in the @F autosplit array
Another option is to use bash parameter substitution.
$ foo="/home/parent/child/filename"
$ echo ${foo##*/}
filename
$ foo="/home/parent/child/child2/filename"
$ echo ${foo##*/}
filename
Like 5 years late, I know, thanks for all the proposals, I used to do this the following way:
$ echo /home/parent/child1/child2/filename | rev | cut -d '/' -f1 | rev
filename
Glad to notice there are better ways.
This should be a comment on the basename answer, but I don't have enough points.
If you do not use double quotes, basename will not work with a path that contains a space character:
$ basename /home/foo/bar foo/bar.png
bar
It works with quotes " ":
$ basename "/home/foo/bar foo/bar.png"
bar.png
file example
$ cat a
/home/parent/child 1/child 2/child 3/filename1
/home/parent/child 1/child2/filename2
/home/parent/child1/filename3
$ while read b ; do basename "$b" ; done < a
filename1
filename2
filename3
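A slightly more defensive version of that loop, as a sketch: IFS= and -r keep read from trimming leading or trailing whitespace and from interpreting backslashes in the path. The function name basenames is only for illustration.

```shell
# Hypothetical helper: print the basename of every path read from stdin.
basenames() {
    # IFS= preserves leading/trailing blanks; -r keeps backslashes literal.
    while IFS= read -r p; do
        basename "$p"
    done
}

printf '%s\n' '/home/parent/child 1/child 2/filename1' '/home/parent/child1/filename3' |
    basenames
# filename1
# filename3
```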
I know I'm like 3 years late on this but....
you should consider parameter expansion, it's built-in and faster.
if your input is in a var, let's say, $var1, just do ${var1##*/}. Look below
$ var1='/home/parent/child1/filename'
$ echo ${var1##*/}
filename
$ var1='/home/parent/child1/child2/filename'
$ echo ${var1##*/}
filename
$ var1='/home/parent/child1/child2/child3/filename'
$ echo ${var1##*/}
filename
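One thing to keep in mind, shown as a sketch below: ${var##*/} is purely textual, so a path with a trailing slash gives an empty result, whereas basename strips the trailing slash first. The helper names last_component and dir_part are made up; only the expansions themselves are standard.

```shell
# Hypothetical helper names; only ${var##*/} and ${var%/*} are standard.
last_component() { printf '%s\n' "${1##*/}"; }   # remove longest prefix ending in /
dir_part()       { printf '%s\n' "${1%/*}"; }    # remove shortest suffix starting at /

last_component /home/parent/child1/filename    # filename
dir_part       /home/parent/child1/filename    # /home/parent/child1
last_component /home/parent/child1/            # (empty line)
basename       /home/parent/child1/            # child1
```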
you can skip all of that complex regex :
echo '/home/parent/child1/child2/filename' |
mawk '$!_=$-_=$NF' FS='[/]'
filename
2nd to last :
mawk '$!--NF=$NF' FS='/'
child2
3rd last field :
echo '/home/parent/child1/child2/filename' |
mawk '$!--NF=$--NF' FS='[/]'
child1
4th-last :
mawk '$!--NF=$(--NF-!-FS)' FS='/'
echo '/home/parent/child000/child00/child0/child1/child2/filename' |
child0
echo '/home/parent/child1/child2/filename'
parent
major caveat :
- `gawk/nawk` has a slight discrepancy with `mawk` regarding
- how it tracks multiple,
- and potentially conflicting, decrements to `NF`,
- so other than the 1st solution regarding last field,
- the rest for now, are only applicable to `mawk-1/2`
just realized it's much much cleaner this way in mawk/gawk/nawk :
echo '/home/parent/child1/child2/filename' | …
'
awk ++NF FS='.+/' OFS= # updated such that
# root "/" still gets printed
'
filename
You can also use:
sed -n 's/.*\/\([^\/]\{1,\}\)$/\1/p'
or
sed -n 's/.*\/\([^\/]*\)$/\1/p'
I'm trying to replace floating-point numbers like 1.2e+3 with their integer value 1200. For this I use sed in the following way:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/$(echo \1*10^\2|bc -l)/"
but the pattern parts \1 and \2 don't get evaluated in the echo.
Is there a way to solve this problem with sed?
Thanks in advance
Within the double quotes, \1 and \2 are interpreted as literal 1 and 2.
You need to put additional backslashes to escape them. In addition, $(command substitution) in
sed replacement seems not to work when combined with back references.
If you are using GNU sed, you can instead say something like:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/echo \"\\1*10^\\2\"|bc -l/;e"
which yields:
12000.0
If you want to chop off the decimal point, you'll know what to do ;-).
If you are happy with awk, a command like this can do the work:
echo 1.2e+4|awk '{printf "%d",$0}'
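For the quoted input from the question, the quotes have to go before awk can treat the value as a number. A minimal sketch, assuming one value per line; the function name sci_to_int is invented here, and %d may not be suitable for values beyond the integer range of your awk.

```shell
# Hypothetical helper: strip double quotes, then print the value as an integer.
sci_to_int() {
    # gsub removes every " from the line; $0 + 0 forces numeric conversion.
    awk '{ gsub(/"/, ""); printf "%d\n", $0 + 0 }'
}

echo '"1.2e+04"' | sci_to_int    # prints 12000
```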
It is perhaps better to use perl (or other typed language) to manage the variable types:
echo '"1.2e+04"' | perl -lane 'my $a=$_;$a=~ s/"//g;print sprintf("%.10g",$a);print $a;'
In any case, your sed expression is incorrect, it should be:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/$(echo \1*10^\3 + \2*10^$(echo \3 - 1 | bc -l)|bc -l)/"
The best way to solve the problem properly is to use an advanced combination of @tshiono's and @Romeo's solutions:
sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
This makes it possible to convert all such floats in arbitrary contexts.
for example:
echo '"1.2e+04"' | sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
outputs
"12000"
and
echo 'abc"1.2e+04"def' | sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
outputs
abc"12000"def
This is the input .csv file
"item1","10/11/2017 2:10pm",1,2, ...
"item2","10/12/2017 3:10pm",3,4, ...
.
.
.
Now, I want to convert the second column (date) to this specific format
date -d '10/12/2017 2:10pm' +'%Y/%m/%d %H:%M:%S', so that "10/12/2017 2:10pm" converts to "2017/10/12 14:10:00"
Expecting output file
"item1","2017/10/11 14:10:00",1,2, ...
"item2","2017/10/12 15:10:00",3,4, ...
.
.
.
I know it can be done by using bash or python, but I want to do it in one-line command. Any ideas? Is there a way to pass date result to sed?
One-liner awk approach.
awk -F',' '{gsub(/"/,"",$2); cmd="date -d\""$2"\" +\\\"%Y/%m/%d\\ %T\\\"";
cmd |getline $2; close(cmd) }1' OFS=, infile #>>outfile
"item1","2017/10/11 14:10:00",1,2, ...
"item2","2017/10/12 15:10:00",3,4, ...
This prints the changed lines to your terminal. Redirect the output to a file if you need to keep it, or use FILENAME to write the output back to the input file itself (note that this overwrites infile while awk is still reading it, so it can lose data on larger files):
awk -F',' '{gsub(/"/,"",$2); cmd="date -d\""$2"\" +\\\"%Y/%m/%d\\ %T\\\"";
cmd |getline $2; close(cmd); print >FILENAME }' OFS=, infile
Or use a GNU awk version that supports the -i inplace option for in-place editing; see 'awk' save modifications in place.
You can do it in one line, but that begs the question -- "How long of a line do you want?" Since you have it labeled 'shell' and not bash, etc., you are a bit limited in your string handling. POSIX shell provides enough to do what you want, but it isn't the speediest remedy. You are either going to end up with an awk or sed solution that calls date or a shell solution that calls awk or sed to parse old date from the original file and feeds the result to date to get your new date. You will have to work out which provides the most efficient remedy.
As far as the one-liner goes, you can do something similar to the following while staying within plain sh syntax (note that date -d, sed -i and expr substr/length are GNU extensions rather than POSIX). It uses awk to get the 2nd field from the file and pipes the result to a while loop. The loop uses expr length "$field" to get the length and passes that value minus 2 to expr substr "$field" "2" to chop the double quotes from both ends of the original date, olddt. Then date -d "$olddt" +'%Y/%m/%d %H:%M:%S' produces newdt, and finally sed -i "s;$olddt;$newdt;" performs the substitution in place. Your one-liner (shown with line continuations for readability):
$ awk -F, '{print $2}' timefile.txt |
while read -r field; do
olddt="$(expr substr "$field" "2" "$(($(expr length "$field") - 2))")";
newdt=$(date -d "$olddt" +'%Y/%m/%d %H:%M:%S');
sed -i "s;$olddt;$newdt;" timefile.txt; done
Example Input File
$ cat timefile.txt
"item1","10/11/2017 2:10pm",1,2, ...
"item2","10/12/2017 3:10pm",3,4, ...
Resulting File
$ cat timefile.txt
"item1","2017/10/11 14:10:00",1,2, ...
"item2","2017/10/12 15:10:00",3,4, ...
There are probably faster ways to do it, but this is a reasonable length one-liner (relatively speaking).
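One possibly faster variant skips date entirely and rearranges the timestamp inside awk. This is a different technique from the answers above, and only a sketch: it assumes every timestamp has exactly the form M/D/YYYY h:mmam or h:mmpm, that it is always the second comma-separated field, and that no earlier field contains a comma. The function name convert_dates is made up.

```shell
# Hypothetical helper: rewrite field 2 from "M/D/YYYY h:mm(am|pm)" to
# "YYYY/MM/DD HH:MM:00" without calling date.
convert_dates() {
    awk -F, -v OFS=, '{
        ts = $2
        gsub(/"/, "", ts)                  # strip the surrounding quotes
        split(ts, p, "[/ :]")              # p[1]=mm p[2]=dd p[3]=yyyy p[4]=h p[5]=MMam|MMpm
        h = p[4] % 12                      # 12am and 12pm both become 0 here
        if (p[5] ~ /pm$/) h += 12          # afternoon hours get +12
        $2 = sprintf("\"%04d/%02d/%02d %02d:%02d:00\"", p[3], p[1], p[2], h, substr(p[5], 1, 2))
        print
    }'
}

printf '%s\n' '"item1","10/11/2017 2:10pm",1,2' | convert_dates
# "item1","2017/10/11 14:10:00",1,2
```

Unlike date -d, this does no validation, so malformed timestamps produce garbage rather than an error.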
Revised less ugly sed method:
sed 's/^.*,"\|",.*//g;h;s#.*#date "+%Y/%m/%d %T" -d "&"#e;H;g;s#\n\|$#,#g;s/^/s,/' input.csv | sed -f - input.csv
Spread out, (it works the same):
sed 's/^.*,"\|",.*//g
h;
s#.*#date "+%Y/%m/%d %T" -d "&"#e;
H;
g;
s#\n\|$#,#g;
s/^/s,/' input.csv | sed -f - input.csv
Output:
"item1","2017/10/11 14:10:00",1,2, ...
"item2","2017/10/12 15:10:00",3,4, ...
How it works:
The first sed block uses the evaluate command to run date, the output of which is used to generate some new sed substitute commands. To show the new s commands, temporarily replace the shell script | pipe with a # comment:
s,10/11/2017 2:10pm,2017/10/11 14:10:00,
s,10/12/2017 3:10pm,2017/10/12 15:10:00,
These are piped to the second sed.
I have a string like X1.7_RC02.20170811110948 and I need to increase only the number between RC and the next point, example:
Original string:
X1.7_RC02.20170811110948
Incremented string:
X1.7_RC03.20170811110948
How can I increase this value by 1 (or more)?
with GNU awk for the 3rd arg to match():
$ awk 'match($0,/(.*RC)([^.]+)(.*)/,a){$0=sprintf("%s%02d%s",a[1],a[2]+1,a[3])} 1' file
X1.7_RC03.20170811110948
With GNU sed
sed -r 's/(.*)(RC0?)([1-9]+)(\..*)/echo "\1\2$((\3+1))\4"/e' <<<X1.7_RC02.20170811110948
If your data looks the same as the example shown, you could also try the following awk. The substitution matches RC[0-9]+ so that values containing digits above 2 (RC03, RC19, ...) are replaced completely.
awk '{val=$0;gsub(/.*RC|\..*/,"",val);val=sprintf("%02d",++val);sub(/RC[0-9]+/,"RC"val);print}' Input_file
Or, if you have the string in a variable, you could print its value and run the above command like this:
echo "$var" | awk '{val=$0;gsub(/.*RC|\..*/,"",val);val=sprintf("%02d",++val);sub(/RC[0-9]+/,"RC"val);print}'
awk solution:
Initial string:
s="X1.7_RC02.20170811110948"
awk 'BEGIN{ FS=OFS="_RC"}{ n=substr($2,1,2); print $1,sprintf("%02.f",n+1) substr($2,3)}' <<< $s
The output:
X1.7_RC03.20170811110948
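If the string is already in a shell variable, the increment can also be done with parameter expansion alone. A sketch under the assumption that the string contains RC followed by digits and then a dot; the function name bump_rc is invented. Leading zeros are stripped before the arithmetic so that values like 09 are not read as invalid octal.

```shell
# Hypothetical helper: increment the number between "RC" and the next ".".
bump_rc() {
    s=$1
    head=${s%%RC*}RC               # everything up to and including RC
    rest=${s#*RC}                  # e.g. 02.20170811110948
    num=${rest%%.*}                # e.g. 02
    tail=${rest#"$num"}            # e.g. .20170811110948
    n=${num#"${num%%[!0]*}"}       # drop leading zeros (09 -> 9, 00 -> empty)
    printf '%s%02d%s\n' "$head" "$(( ${n:-0} + 1 ))" "$tail"
}

bump_rc X1.7_RC02.20170811110948    # prints X1.7_RC03.20170811110948
```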
I want to extract a certain part of a string, if it exists. I'm interested in the XML filename, i.e. I want what's between an "_" and ".xml".
This is ok, it prints "555"
MYSTRING=`echo "/sdd/ee/publ/xmlfile_555.xml" | sed 's/^.*_\([0-9]*\).xml/\1/'`
echo "STRING = $MYSTRING"
This is not ok because it returns the whole string. In this case I don't want any result.
It prints "/sdd/ee/publ/xmlfile.xml"
MYSTRING=`echo "/sdd/ee/publ/xmlfile.xml" | sed 's/^.*_\([0-9]*\).xml/\1/'`
echo "STRING = $MYSTRING"
Any ideas how to get an "empty" result in the second case.
thanks!
You just need to tell sed to keep its mouth shut if it doesn't find a match. The -n option is used for that.
MYSTRING=`echo "/sdd/ee/publ/xmlfile_555.xml" | sed -n 's/^.*_\([0-9]*\)\.xml/\1/p'`
I only made two changes to what you had: the aforementioned -n option to sed, and the p flag that comes after the s/// command, which tells sed to print the output only if the substitution was successfully done.
EDIT: I've also escaped the final . as suggested in the comments.
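Wrapped in a function, the -n / p combination might look like the sketch below. The name xml_id is made up, and two details differ from the command above and are my own assumptions: the pattern is anchored with $ and requires at least one digit ([0-9][0-9]*), so a name like file_.xml also yields nothing.

```shell
# Hypothetical helper: print the digits between the last "_" and ".xml",
# or nothing at all when the name does not match.
xml_id() {
    # -n suppresses automatic printing; the p flag prints only on a successful substitution.
    printf '%s\n' "$1" | sed -n 's/^.*_\([0-9][0-9]*\)\.xml$/\1/p'
}

xml_id /sdd/ee/publ/xmlfile_555.xml    # prints 555
xml_id /sdd/ee/publ/xmlfile.xml        # prints nothing
```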
Try this?
basename /sdd/ee/publ/xmlfile_555.xml | awk -F_ '{print $2}'
The output is 555.xml
With the other one.
basename /sdd/ee/publ/xmlfile.xml | awk -F_ '{print $2}'
The output is an empty string.
$ path=/sdd/ee/publ/xmlfile_555.xml
$ echo ${path##*/}
xmlfile_555.xml
$ path=${path##*/}
$ echo ${path%.xml}
xmlfile_555
$ path=${path%.xml}
$ echo ${path##*_}
555
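Note that for xmlfile.xml the last expansion leaves xmlfile rather than an empty string, because ${path##*_} changes nothing when there is no underscore. A sketch of one way to guard against that with a case pattern; the function name xml_suffix is hypothetical.

```shell
# Hypothetical helper: print what sits between the last "_" and ".xml",
# or an empty line when the file name has no such part.
xml_suffix() {
    name=${1##*/}                      # strip the directories
    case $name in
        *_*.xml)
            name=${name%.xml}          # drop the extension
            printf '%s\n' "${name##*_}"
            ;;
        *)
            printf '\n'                # no underscore or no .xml: empty result
            ;;
    esac
}

xml_suffix /sdd/ee/publ/xmlfile_555.xml    # prints 555
xml_suffix /sdd/ee/publ/xmlfile.xml        # prints an empty line
```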