In VIM, for every letter of the English alphabet, I want to insert a line in the following format:
fragment {upper(letter)}: '{lower(letter)}' | '{upper(letter)}';
So, for example, for the letter a, it would be:
fragment A: 'a' | 'A';
Writing 26 lines like this is tedious, and you shouldn't repeat yourself. How can I do that?
In vim:
for i in range(65,90) " ASCII codes of A-Z (range() includes both ends)
  let c = nr2char(i) " Character
  echo "fragment " . c . ": '" . tolower(c) . "' | '" . c . "';"
endfor
Or as a one-liner:
:for i in range(65,90) | let c = nr2char(i) | echo "fragment " . c . ": '" . tolower(c) . "' | '" . c . "';" | endfor
fragment A: 'a' | 'A';
fragment B: 'b' | 'B';
fragment C: 'c' | 'C';
fragment X: 'x' | 'X';
fragment Y: 'y' | 'Y';
fragment Z: 'z' | 'Z';
Use :redir @a before running the loop to copy that output to register a, and :redir END afterwards to stop capturing.
Here's one way.
First, I'm gonna create the text in bash with a single command, then I'll tell VIM to insert the output of that command into the file.
I need to iterate through the letters of the English alphabet and, for every letter, echo one line in the specified format. So at first, let's just echo each letter on its own line (using a for loop):
❯ alphabets="abcdefghijklmnopqrstuvwxyz"
❯ for ((i=0; i<${#alphabets}; i++)); do echo "${alphabets:$i:1}"; done
The way this works is:
${#alphabets} is equal to the length of the variable alphabets.
${alphabets:$i:1} extracts the letter at position i from the variable alphabets (zero-based).
Now we need to convert these letters to upper case. Here's one way we can achieve this:
❯ echo "a" | tr a-z A-Z
Now if we apply this to the for loop we had, we get this:
❯ for ((i=0; i<${#alphabets}; i++)); do echo "${alphabets:$i:1}" | tr a-z A-Z; done
From here, it's quite easy to produce the text we wanted:
❯ for ((i=0; i<${#alphabets}; i++)); do c="${alphabets:$i:1}"; cap=$(echo "${c}" | tr a-z A-Z); echo "fragment ${cap}: '${c}' | '${cap}';"; done
fragment A: 'a' | 'A';
fragment B: 'b' | 'B';
fragment Z: 'z' | 'Z';
Now that we generated the text, we can simply use :r !command to insert the text into vim:
:r !alphabets="abcdefghijklmnopqrstuvwxyz"; for ((i=0; i<${\#alphabets}; i++)); do c="${alphabets:$i:1}"; cap=$(echo "${c}" | tr a-z A-Z); echo "fragment ${cap}: '${c}' | '${cap}';"; done
Note that # is a special character on Vim's command line (it expands to the alternate file name) and should be escaped using \.
Here's another one-liner that does the same thing, and I believe is more intuitive:
for c in {a..z}; do u=$(echo ${c} | tr a-z A-Z); echo "fragment ${u}: '${c}' | '${u}';"; done
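For comparison, here is a minimal sketch of the same generator that does not start a tr process for every letter. It assumes bash 4 or later for the ${c^^} upper-casing expansion, and gen_fragments is only an illustrative name, not something taken from the answers above.

```shell
# Sketch: print one grammar line per letter using only bash builtins.
# Assumption: bash >= 4, because ${c^^} (upper-case expansion) is not in older versions.
gen_fragments() {
    local c
    for c in {a..z}; do
        # ${c^^} is the upper-case form of $c
        printf "fragment %s: '%s' | '%s';\n" "${c^^}" "$c" "${c^^}"
    done
}

gen_fragments
```

Saved to a file (say gen.sh, a hypothetical name), it can be pulled into the buffer with :r !bash gen.sh, which also avoids having to escape # on Vim's command line.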
In my directory, I have multiple NIfTI files (e.g., WIP944_mp2rage-0.75iso_TR5.nii) from my MRI scanner, each accompanied by a text file (e.g., WIP944_mp2rage-0.75iso_TR5_info.txt) containing information on the acquisition parameters (e.g., "Series description: WIP944_mp2rage-0.75iso_TR5_INV1_PHS_ND"). Based on these parameters (e.g., INV1_PHS_ND), I need to rename the NIfTI file, whose current name is stored in $niftibase. I used grep to extract the parameters. When I echo the variables individually, I get what I want, but when I try to concatenate them into one filename, the variables are mixed together instead of being delimited by dots.
I tried multiple forms of sed to cut away potentially invisible characters and identified the source of the problem: the "INV1_PHS_ND" part of the series description, which is the $struct component, gives me trouble, potentially because the number of fields extracted for it varies. Sometimes it is 3 (as in INV1_PHS_ND), but it can also be 2 (INV1_ND). When I introduce this variable into the filename, everything goes haywire.
for infofile in ${PWD}/*.txt; do
    # General characteristics of subjects (i.e., date of session, group number, and subject number)
    reco=$(grep -A0 "Series description:" ${infofile} | cut -d ' ' -f 3 | cut -d '_' -f 1)
    date=$(grep -A0 "Series date:" ${infofile} | cut -c 16-21)
    group=$(grep -A0 "Subject:" ${infofile} | cut -d '^' -f 2 | cut -d '_' -f 1 )
    number=$(grep -A0 "Subject:" ${infofile} | cut -d '^' -f 2 | cut -d '_' -f 2)
    ScanNr=$(grep -A0 "Series number:" ${infofile} | cut -d ' ' -f 3)
    # Change name if reco has structural prefix
    if [[ $reco = *WIP944* ]]; then
        struct=$(grep -A0 "Series description: WIP944" ${infofile} | cut -d '_' -f 4,5,6)
        niftibase=$(basename $infofile _info.txt).nii
        #echo ${subStudy}.struct.${date}.${group}.${protocol}.${paradigm}.nii
        echo ${subStudy}.struct.${struct}.${date}.${group}.${protocol}${number}.${paradigm}.n${ScanNr}.nii
        #mv ${niftibase} ${subStudy}.struct.${struct}.${date}.${group}.${protocol}${number}.${paradigm}.n${ScanNr}.nii
    fi
done
This gives me output like this:
for all 7 WIP944 files. However, it needs to be in the direction of this:
H1.struct.INV2_PHS_ND.190523.Pilot.Noc001.Heat47.n11.nii, where H1, Noc, and Heat47 are loaded in from a setup file.
EDIT: I tried to use awk in the following way:
reco=$(awk 'FNR==8 {print;exit}' $infofile | cut -d ' ' -f 3 | cut -d '_' -f 1)
date=$(awk 'FNR==2 {print;exit}' $infofile | cut -c 15-21)
group=$(awk 'FNR==6 {print;exit}' $infofile | cut -d '^' -f 2 | cut -d '_' -f 1 )
number=$(awk 'FNR==6 {print;exit}' $infofile | cut -d '^' -f 2 | cut -d '_' -f 2)
ScanNr=$(awk 'FNR==14 {print;exit}' $infofile | cut -d ' ' -f 3)
which again gave me the correct output when echoing the variables individually, but not when I tried to combine them: .niit47.n11022_PHS_ND.
I used echo "$struct" | tr -dc '[:print:]' | od -c to see if there were hidden characters due to line endings, which resulted in:
0000000 I N V 2 _ P H S _ N D
EDIT: This is what the text file looks like:
Series UID:
Study date: 20190523
Study time: 153529.718000
Series date: 20190523
Series time: 160111.750000
Subject: MDC-0153,pilot_003^pilot_003
Subject birth date: 19970226
Series description: WIP944_mp2rage-0.75iso_TR5_INV1_PHS_ND
Manufacturer: SIEMENS
Model name: Investigational_Device_7T
Software version: syngo MR B17
Study id: 1
Series number: 5
Repetition time (ms): 5000
Echo time[1] (ms): 2.51
Inversion time (ms): 900
Flip angle: 7
Number of averages: 1
Slice thickness (mm): 0.75
Slice spacing (mm):
Image columns: 320
Image rows: 320
Phase encoding direction: ROW
Voxel size x (mm): 0.75
Voxel size y (mm): 0.75
Number of volumes: 1
Number of slices: 240
Number of files: 240
Number of frames: 0
Slice duration (ms) : 0
Orientation: sag
PixelBandwidth: 248
I have one of these for each nifti file. subStudy is hardcoded in a setup file, which is loaded in prior to running the for loop. When I echo this, it shows the correct value. I need to change the names of multiple files with a specific prefix, which are stored in $reco.
As confirmed in comments, the input files have DOS carriage returns, which are basically invalid in Unix files. Also, you should pay attention to proper quoting.
As a general overhaul, I would recommend replacing the chain of grep and cut calls with a single Awk script, which is both simpler and more idiomatic.
for infofile in ./*.txt; do  # no need to use ${PWD}
    # Pre-filter with a simple grep; skip files that are not WIP944 scans
    grep -q '^Series description: [^ _]*WIP944' "$infofile" || continue
    # Still here? Means we want to rename
    suffix="$(awk -F : '
      BEGIN { n = split("Series description:Series date:Subject:Series number", k, /:/)
              for (i = 1; i <= n; i++) f[k[i]] = 1 }   # set of wanted keys
      { sub(/\r$/, "") }   # get rid of pesky DOS carriage return
      $1 in f { x[$1] = substr($0, length($1) + 3) }   # value after ": "
      END {
        n = split(x["Series description"], t, /_/); reco = t[1]
        struct = t[4]; for (i = 5; i <= n && i <= 6; i++) struct = struct "_" t[i]
        date = substr(x["Series date"], 3)   # 20190523 becomes 190523
        split(x["Subject"], t, /\^/); split(t[2], tt, /_/); group = tt[1]; number = tt[2]
        ScanNr = x["Series number"]
        ### FIXME: protocol and paradigm are still undefined
        print struct "." date "." group "." protocol number "." paradigm ".n" ScanNr
      }' "$infofile")"
    echo mv "${infofile%_info.txt}.nii" "$subStudy.struct.$suffix.nii"
done
This probably still requires some tweaking (at least "protocol" and "paradigm" are still undefined). Once it seems to print the correct values, you can remove the echo before mv and have it actually rename files for you.
(It is probably still better to test on a copy of your real data files first!)
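To make the carriage-return diagnosis concrete, here is a small sketch under stated assumptions: strip_cr is a hypothetical helper name, and the sample value only imitates what grep and cut would return from a file with DOS line endings.

```shell
# Sketch: remove carriage returns before a value is used in a file name.
strip_cr() {
    # Print the argument with every carriage return deleted.
    printf '%s' "$1" | tr -d '\r'
}

# Hypothetical value as extracted from a CRLF file: note the trailing \r.
struct=$(printf 'INV2_PHS_ND\r')

# Without the cleanup, the terminal jumps back to column 0 at the \r and
# whatever follows overwrites the start of the line.
printf '%s\n' "H1.struct.$(strip_cr "$struct").190523.nii"
```

In bash, ${struct%$'\r'} removes a single trailing carriage return without an external process. Piping the raw value straight into od -c (without filtering it through tr first) shows the \r explicitly.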
It doesn't seem that simple. At least for me.
I need to have a variable in printf text. It's something like:
FOO="User data"
+++++++++++++++++++ $FOO +++++++++++++++++++++
Would output
+++++++++++++++++++ User data +++++++++++++++++++++
FOO="Fooooooo barrrr"
+++++++++++++++++++ $FOO +++++++++++++++++++++
Should output
++++++++++++++++ Fooooooo barrrr ++++++++++++++++++
FOO="Foooooooooooooooooooo barrrrr"
+++++++++++++++++++ $FOO +++++++++++++++++++++
Should be
+++++++++ Foooooooooooooooooooo barrrrr +++++++++++
As you can see, I need a variable to be in the middle of a line of length n, surrounded by + marks. How can I achieve that using printf and other default-available commands?
(Debian 8)
declare -i x1 x2 x3 width
foo="User data"
width=50 # total width
x2=${#foo}+2 # length of $foo and 2 whitespaces
x1=(width-x2)/2 # length of first part
x3=$width-x1-x2 # length of last part
for ((i=1;i<=$x1;i++)); do echo -n "+"; done
echo -n " $foo "
for ((i=1;i<=$x3;i++)); do echo -n "+"; done
+++++++++++++++++++ User data ++++++++++++++++++++
With foo="":
+++++++++++++++ ++++++++++++++++
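#!/usr/bin/env bash
# text, linelen and char are not defined in the snippet as posted;
# the values below are assumptions taken from the question.
text=${1:-User data}
linelen=50
char="+"
len=${#text}
fillerlen=$(( (linelen - len - 2) / 2 ))
filler=$(printf "$char%.0s" $(seq 1 "$fillerlen"))
echo "$filler $text $filler"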
In the format string for printf, you can specify the minimum field width of a string with %${p}s, where $p is the width. You can take advantage of that by printing an empty string (which gets padded to $p spaces) and then translating the spaces into "+":
$ p=10
$ printf "%${p}s\n" | tr ' ' +
This function takes the length of your line and the string you want to put in its centre, then prints it padded with plus signs:
pad () {
    len=$1      # total length of the line
    string=$2   # text to centre
    # ${#string} expands to the length of $string
    n_pad=$(( (len - ${#string} - 2) / 2 ))
    printf "%${n_pad}s" | tr ' ' +
    printf ' %s ' "$string"
    printf "%${n_pad}s\n" | tr ' ' +
}
Works like this:
$ pad 50 Test
++++++++++++++++++++++ Test ++++++++++++++++++++++
$ pad 50 "A longer string to be padded"
++++++++++ A longer string to be padded ++++++++++
Notice how you have to quote strings consisting of more than one word, or only the first one will be used.
If the space left over for padding is odd, the padding will be rounded down, so the line comes out one character shorter than requested, but it will always be symmetrical.
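If the total width has to be exact, one option is to give the leftover character to the right-hand side. The following is a minimal sketch of that idea using the same printf and tr technique; center and its optional third argument are assumptions for illustration, not part of the answer above.

```shell
# Sketch: centre text in a line of exactly $1 characters.
# $1 = total width, $2 = text, $3 = fill character (optional, defaults to +)
center() {
    local width=$1 text=$2 fill=${3:-+}
    local inner=$(( ${#text} + 2 ))        # text plus one space on each side
    local left=$(( (width - inner) / 2 ))  # rounded down
    if (( left < 0 )); then left=0; fi     # text longer than the line
    local right=$(( width - inner - left ))  # takes the odd character, if any
    if (( right < 0 )); then right=0; fi
    local lpad rpad
    lpad=$(printf "%${left}s" '' | tr ' ' "$fill")
    rpad=$(printf "%${right}s" '' | tr ' ' "$fill")
    printf '%s %s %s\n' "$lpad" "$text" "$rpad"
}

center 50 "User data"
center 12 abc
```

The second call prints three fill characters on the left and four on the right, so the line is exactly 12 characters wide.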
Try this :
n=50; # You can change the value of n as you please.
var="fooo baar";
size=${#var} # length of $var
n=$(( n - size ))
n=$(( n / 2 ))
s=$(printf "%-${n}s" "*")
echo "${s// /*} $var ${s// /*}" # white-spaces included here.
I have a large .xml file like this:
c1="a1" c2="b1" c3="cccc1"
c1="aa2" c2="bbbb2" c3="cc2"
c1="aaaaaa3" c2="bb3" c3="cc3"
I need the result like the following:
a1 b1 cccc1
aa2 bbbb2 cc2
aaaaaa3 bb3 cc3
How can I get the columns in BASH?
I have the following method in PL/SQL, but it's very inconvenient:
TRIM(BOTH '"' FROM REGEXP_SUBSTR(C1, '"[^"]+"', 1, 1)) c1,
TRIM(BOTH '"' FROM REGEXP_SUBSTR(C1, '"[^"]+"', 1, 2)) c2,
TRIM(BOTH '"' FROM REGEXP_SUBSTR(C1, '"[^"]+"', 1, 3)) c3
Use cut:
cut -d'"' -f2,4,6 --output-delimiter=" " test.txt
Or you can use sed if the number of columns is not known:
sed 's/[a-z][a-z0-9]\+="\([^"]\+\)"/\1/g' < test.txt
[a-z][a-z0-9]\+ - matches a string starting with a lowercase letter followed by one or more lowercase letters or digits
"\([^"]\+\)" - captures any string inside the quotes
\1 - represents the captured string that in this case is used to replace the entire match
A perl approach (based on the awk answer by @A-Ray)
perl -F'"' -ane 'print join(" ",@F[ map { 2 * $_ + 1} (0 .. $#F) ]),"\n";' < test.txt
-F'"' set input separator to "
-a turn autosplit on - this results in @F being filled with the content of the fields in the input
-n iterate through all lines but don't print them by default
-e execute code following
map { 2 * $_ + 1} (0 .. $#F) generates a list of indexes (1,3,5 ...)
@F[map { 2 * $_ + 1} (0 .. $#F)] takes a slice from the array, selecting only odd fields
join - joins the slice with spaces
NOTE: I would not use this approach without a good reason, the first two are easier.
Some benchmarking (on a Raspberry Pi, with a 60000 lines input file and output thrown away to /dev/null)
cut - 0m0.135s no surprise there
sed - 0m5.864s
perl - 0m8.218s - I guess regenerating the index list every line isn't that fast (with a hard coded slice list it goes to half, but that would defeat the purpose)
the read based solution - 0m52.027s
You can also look at the built-in substring replacement/removal bash offers. Either in a short script or One-Liner:
while read -r line; do
new=${line//c[0-9]=/} ## remove 'cX=', where X is '0-9'
new=${new//\"/} ## remove all '"' (double-quotes)
echo "$new"
done <"$1"
exit 0
$ cat dat/stuff.xml
c1="a1" c2="b1" c3="cccc1"
c1="aa2" c2="bbbb2" c3="cc2"
c1="aaaaaa3" c2="bb3" c3="cc3"
$ bash dat/stuff.xml
a1 b1 cccc1
aa2 bbbb2 cc2
aaaaaa3 bb3 cc3
As a One-Liner
while read -r line; do new=${line//c[0-9]=/}; new=${new//\"/}; echo "$new"; done <dat/stuff.xml
awk -F '"' '{ for(i=2; i<=NF; i+=2) { printf "%s ", $i } print "" }'
-F '"' makes Awk treat quotation marks (") as field delimiters. For example, Awk will split the line...
c1="a1" c2="b1" c3="cccc1"
...into fields numbered as...
1: 'c1='
2: 'a1'
3: ' c2='
4: 'b1'
5: ' c3='
6: 'cccc1'
7: ''
for(i=2; i<=NF; i+=2) { printf "%s ", $i } starts at field 2, prints the value of the field, skips a field, and continues. In this case, fields 2, 4, and 6 will be printed.
print outputs a string followed by a newline. printf also outputs a string, but doesn't append a newline. Therefore...
printf "%s ", $i
...outputs the value of field $i followed by a space. (Passing the field as an argument rather than as the format string keeps any % in the data from being interpreted.)
print ""
...simply outputs a newline.
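The field-splitting idea above can be wrapped in a small shell function. This is a sketch under assumptions: attr_values is a made-up name, and building the line in a variable is just one way to avoid the trailing space.

```shell
# Sketch: print the quoted values of each input line, separated by single spaces.
# Reads the files given as arguments, or standard input if there are none.
attr_values() {
    awk -F '"' '{
        out = ""
        # With " as the separator, the even-numbered fields are the quoted values.
        for (i = 2; i <= NF; i += 2)
            out = out (i > 2 ? " " : "") $i
        print out
    }' "$@"
}

printf '%s\n' 'c1="a1" c2="b1" c3="cccc1"' 'c1="aa2" c2="bbbb2" c3="cc2"' | attr_values
```

Because the separator is only added between values, each output line ends right after the last value.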
I have tried several different search terms but have not found exactly what I want. I am sure there is already an answer for this, so please point me to it if so.
I would like to understand how to increment a letter code given a standard number convention in a bash script.
Starting with AAAA=0, or with leading zeros AAAA=000000 (26x26x26x26), I would like to increment the value by one each time, so aaab=000001, aaac=000002, aaba=000026, aaca=000052, etc.
Thanks Art!
I guess this is what you want
echo {a..z}{a..z}{a..z}{a..z} | tr ' ' '\n' | nl
The output will be very long (26^4 = 456,976 lines), so perhaps test with this first:
echo {a..z}{a..z} | tr ' ' '\n' | nl
If you don't need the line numbers, remove the last pipe and nl.
If you need the output in xxxx=nnnnnn format, you can use awk
echo {a..z}{a..z}{a..z}{a..z} | tr ' ' '\n' | awk '{printf "%s=%06d\n", $0, NR-1}'
If you are aiming for speed and simplicity:
i=0
for text in {a..z}{a..z}{a..z}{a..z}; do
    printf '%06d %5.5s\n' "$i" "$text"
    (( i++ ))
done
Aiming at having a function that converts any number to the character string:
We must understand that what you are describing is a number written in base 26, using the character a as 0, b as 1, c as 2, etc.
Thus, aaaa means 0000, aaab means 0001, aaac means 0002, ..., aaaz means 0025,
and aaba means 0026, aaca means 0052.
bc could do the base conversion directly (as numbers):
$ echo 'obase=26; 199'|bc
07 17
Counting from zero (a=0, b=1, c=2, d=3, e=4, f=5, g=6, h=7), the letter at index 7 is h,
and the letter at index 17 is r.
If we set the variable list to: list=$(printf '%s' {a..z}) or list=abcdefghijklmnopqrstuvwxyz
We could get each letter from the number with: ${list:7:1} and ${list:17:1}
$ echo "${list:7:1} and ${list:17:1}"
h and r
$ printf '%s' "${list:7:1}" "${list:17:1}" # Using printf:
All together inside a script:
list=$(printf '%s' {a..z})
getletters() {
    local numbers number
    numbers="$(echo "obase=26; $1" | bc)"
    for number in $numbers; do
        # 10# forces base 10 so that "07" is not read as octal
        printf '%s' "${list:10#$number:1}"
    done
}
count=2
limit=$(( 26**count - 1 ))
for (( i=0; i<=limit; i++ )); do
    printf '%06d %-5.5s\n' "$i" "$(getletters "$i")"
done
Please change count from 2 to 4 to get the whole list. Be aware that such a list has almost half a million lines (the limit is 456,975) and will take some time.
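The base-26 reading described above can also be done with shell arithmetic alone. The following is a minimal sketch, assuming bash; to_letters and its optional width argument are hypothetical names, and unlike the bc version it always pads to a fixed number of letters.

```shell
# Sketch: convert a non-negative integer to a fixed-width base-26 letter code.
# $1 = number, $2 = number of letters (optional, defaults to 4)
to_letters() {
    local n=$1 width=${2:-4}
    local letters=abcdefghijklmnopqrstuvwxyz out= i
    for (( i = 0; i < width; i++ )); do
        # The remainder selects the next letter, starting from the rightmost one.
        out=${letters:$(( n % 26 )):1}$out
        n=$(( n / 26 ))
    done
    printf '%s\n' "$out"
}

for i in 0 1 2 26 52; do
    printf '%s=%06d\n' "$(to_letters "$i")" "$i"
done
```

Calling the function once per number costs a subshell each time, so for the full list the brace expansion shown earlier is likely to be faster; the function is more useful for converting individual values.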
With perl, you can ++ a string to increment the letter:
for (my ($n,$s) = (0,"aaaa"); $n < 200; $n++, $s++) {
    printf "%s=%0*d\n", $s, length($s), $n;
}
How can I add spaces between every character or symbol within a UTF-8 document? E.g. 123hello! becomes 1 2 3 h e l l o !.
I have BASH and gedit, if any of those can do that.
I don't care if it sometimes leaves extra spaces in places (e.g. 2 or 3 spaces in a single place is no problem).
Shortest sed version
sed 's/./& /g'
$ echo '123hello!' | sed 's/./& /g'
1 2 3 h e l l o !
Obligatory awk version
awk '$1=$1' FS= OFS=" "
$ echo '123hello!' | awk '$1=$1' FS= OFS=" "
1 2 3 h e l l o !
sed(1) can do this:
$ sed -e 's/\(.\)/\1 /g' < /etc/passwd
r o o t : x : 0 : 0 : r o o t : / r o o t : / b i n / b a s h
d a e m o n : x : 1 : 1 : d a e m o n : / u s r / s b i n : / b i n / s h
It works well on e.g. UTF-8 encoded Japanese content:
$ file japanese
japanese: UTF-8 Unicode text
$ sed -e 's/\(.\)/\1 /g' < japanese
E X I F 中 の 画 像 回 転 情 報 対 応 に よ り 、 一 部 画 像 ( 特 に 『
sed is ok but this is pure bash
for ((i=0; i<${#string}; i++)); do
    string_new+="${string:$i:1} "
done
echo "$string_new"
Since you have bash, I will assume that you have access to sed. The following command line will do what you wish.
$ sed -e 's:\(.\):\1 :g' < input.txt > output.txt
I like these solutions because they do not have a trailing space like the rest
GNU awk:
echo 123hello! | awk NF=NF FS=
GNU awk:
echo 123hello! | awk NF=NF FPAT=.
POSIX awk:
echo 123hello! | awk '{while(a=substr($0,++b,1))printf b-1?FS a:a}'
This might work for you:
echo '1 23h ello ! ' | sed 's/\s*/ /g;s/^\s*\(.*\S\)\s*$/\1/;l'
1 2 3 h e l l o !$
1 2 3 h e l l o !
In retrospect a far better solution:
sed 's/\B/ /g' file
Inserts a space at every position that is not a word boundary, i.e. between two adjacent word characters (or two adjacent non-word characters).
echo ${string} | sed -r 's/(.{1})/\1 /g'
Pure POSIX Shell version:
addspace() {
    __addspace_str="$1"
    while [ -n "${__addspace_str#?}" ]; do
        printf '%c ' "$__addspace_str"
        __addspace_str="${__addspace_str#?}"
    done
    printf '%c' "$__addspace_str"
}
Or if you need to put it in a variable:
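addspace_var() {
    addspace_result=""
    __addspace_str="$1"
    while [ -n "${__addspace_str#?}" ]; do
        addspace_result="$addspace_result${__addspace_str%${__addspace_str#?}} "
        __addspace_str="${__addspace_str#?}"
    done
    addspace_result="$addspace_result$__addspace_str"
}
addspace_var abc
echo "$addspace_result"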
Tested with dash, ksh, zsh, bash (+ bash --posix), and busybox ash.
${x#?}
This parameter expansion removes the first character of x. ${x#...} in general removes a prefix given by a pattern, and ? matches any single character.
printf '%c ' "$str"
The %c format specifier prints only the first character of its string argument, so the full format string '%c ' prints the first character of the string followed by a space. Note that an empty string would cause issues here, but the loop condition has already ruled that out, so it's fine. To print the first character safely in any situation we can use '%.1s', but I like living dangerously.
${x%${x#?}}
This is an alternate way to get the first character of the string. We already know that ${x#?} is all but the first character. Well, ${x%...} removes ... from the end of x, so ${x%${x#?}} removes all but the first character from the end of x, leaving only the first one.
POSIX doesn't define local, so to avoid variable conflicts it's safer to use unique names such as __addspace_str that are unlikely to clobber each other. I am starting to experiment with using M4 to generate unique names without having to mangle my code every time, but it's probably overkill for people who don't use shell as much as I do.
[ -n "${str#?}" ]
Why not just [ -n "$str" ]? It's to avoid the dreaded trailing space, which is also why there is one more statement at the bottom, outside the loop. The loop runs until the string is one character long, then we finish outside of it so we can append this last character without adding a space.
When should I use this?
This is good for small inputs in long-running loops, since it avoids the overhead of calling an external process, but for larger inputs it starts lagging behind fast, especially the var version. (I blame the ${x%${x#?}} trick.)
Benchmark Commands
# addspace
time dash -c ". ./; for x in $(seq -s ' ' 1 10000); do addspace \"$input\" >/dev/null; done"
# addspace_var
time dash -c ". ./; for x in $(seq -s ' ' 1 10000); do addspace_var \"$input\" >/dev/null; done"
# sed for comparison
time dash -c ". ./; for x in $(seq -s ' ' 1 10000); do echo \"$input\" | sed 's/./& /g' >/dev/null; done"
Input Length = 3
addspace addspace_var sed
real 0m0,106s 0m0,106s 0m10,651s
user 0m0,077s 0m0,075s 0m9,349s
sys 0m0,029s 0m0,031s 0m3,030s
Input Length = 200
addspace addspace_var sed
real 0m6,050s 0m47,115s 0m11,049s
user 0m5,557s 0m46,919s 0m9,727s
sys 0m0,488s 0m0,068s 0m3,085s
Input Length = 1000
addspace addspace_var sed
real 0m55,989s TBD 0m11,534s
user 0m53,560s TBD 0m10,214s
sys 0m2,428s TBD 0m2,975s
(Yeah, I was waiting a bit for that last var one.)
In situations like this you can simply check the length of the input and call the appropriate function for maximum performance.
addspace() {
    if [ ${#1} -lt 100 ]; then
        addspace_builtins "$1"
    else
        addspace_proccess "$1"
    fi
}
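Put together, the length-based dispatch could look like the sketch below. The names addspace_loop, addspace_sed and addspace_auto are illustrative assumptions, and the threshold of 100 is simply the value used above, not a tuned one.

```shell
# Sketch: pick the builtin loop for short strings and sed for long ones.

# Builtin-only version: no external process, cheap for short input.
addspace_loop() {
    __as_str=$1
    while [ -n "${__as_str#?}" ]; do
        printf '%.1s ' "$__as_str"      # first character plus a space
        __as_str=${__as_str#?}          # drop the first character
    done
    printf '%.1s\n' "$__as_str"         # last character, no trailing space
}

# External-process version: one sed call, better for long input.
addspace_sed() {
    printf '%s\n' "$1" | sed 's/./& /g; s/ $//'
}

addspace_auto() {
    if [ "${#1}" -lt 100 ]; then
        addspace_loop "$1"
    else
        addspace_sed "$1"
    fi
}

addspace_auto '123hello!'
```

Both branches strip the trailing space, so callers get the same output shape whichever implementation runs.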