Read a file by bytes in BASH - bash

I need to read the first byte of a file I specify, then the second byte, the third, and so on. How can I do that in Bash?
P.S. I need the hex value of these bytes.

Full rewrite: September 2019!
A lot shorter and simpler than previous versions! (Somewhat faster, but not by much.)
Yes, bash can read and write binary:
Syntax:
LANG=C IFS= read -r -d '' -n 1 foo
will populate $foo with one binary byte. Unfortunately, since bash strings cannot hold null bytes (\0), bytes have to be read one at a time.
As for the value of the byte read, I had missed this in man bash (have a look at the 2016 post at the bottom of this answer):
printf [-v var] format [arguments]
...
Arguments to non-string format specifiers are treated as C constants,
except that ..., and if the leading character is a single or double
quote, the value is the ASCII value of the following character.
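In other words, printf alone can turn a character into its code. A quick check (plain bash, nothing specific to the functions below):
printf '%d\n' "'A"    # 65
printf '%x\n' "'A"    # 41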
So:
read8() {
local _r8_var=${1:-OUTBIN} _r8_car LANG=C IFS=
read -r -d '' -n 1 _r8_car
printf -v $_r8_var %d "'$_r8_car"
}
This will populate the submitted variable name (default: $OUTBIN) with the decimal ASCII value of the first byte read from stdin.
read16() {
local _r16_var=${1:-OUTBIN} _r16_lb _r16_hb
read8 _r16_lb &&
read8 _r16_hb
printf -v $_r16_var %d $(( _r16_hb<<8 | _r16_lb ))
}
This will populate the submitted variable name (default: $OUTBIN) with the decimal value of the first 16-bit word read from stdin...
Of course, to switch endianness, you just swap the two reads:
read8 _r16_hb &&
read8 _r16_lb
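For instance, a quick sanity check (assuming read8 and read16 above are already sourced):
read16 word < <(printf '\x01\x02')
echo $word    # 513, i.e. 0x0201: low byte first (little-endian)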
And so on:
# Usage:
# read[8|16|32|64] [varname] < binaryStdInput
read8() { local _r8_var=${1:-OUTBIN} _r8_car LANG=C IFS=
read -r -d '' -n 1 _r8_car
printf -v $_r8_var %d "'$_r8_car" ;}
read16() { local _r16_var=${1:-OUTBIN} _r16_lb _r16_hb
read8 _r16_lb && read8 _r16_hb
printf -v $_r16_var %d $(( _r16_hb<<8 | _r16_lb )) ;}
read32() { local _r32_var=${1:-OUTBIN} _r32_lw _r32_hw
read16 _r32_lw && read16 _r32_hw
printf -v $_r32_var %d $(( _r32_hw<<16| _r32_lw )) ;}
read64() { local _r64_var=${1:-OUTBIN} _r64_ll _r64_hl
read32 _r64_ll && read32 _r64_hl
printf -v $_r64_var %d $(( _r64_hl<<32| _r64_ll )) ;}
So you could source this; then, if your /dev/sda is GPT-partitioned:
read totsize < <(blockdev --getsz /dev/sda)
read64 gptbackup < <(dd if=/dev/sda bs=8 skip=68 count=1 2>/dev/null)
echo $((totsize-gptbackup))
1
The answer should be 1 (the primary GPT header is at sector 1, one sector is 512 bytes, and the backup-GPT-location field is at offset 32 within that header; with bs=8 that is 512/8 = 64 plus 32/8 = 4, so 68 blocks to skip... see GUID Partition Table on Wikipedia).
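Another quick check, this time against a gzip stream rather than a disk (a sketch: the first three gzip header bytes are 1f 8b 08, and the flags byte is normally 00 when compressing stdin, so the little-endian 32-bit word should be 0x00088b1f):
read32 magic < <(seq 1 10 | gzip)
printf '0x%08x\n' $magic    # expect 0x00088b1f (the flags byte may differ)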
Quick small write function...
write () {
local i=$((${2:-64}/8)) o= v r
r=$((i-1))
for ((;i--;)) {
printf -vv '\%03o' $(( ($1>>8*(0${3+-1}?i:r-i))&255 ))
o+=$v
}
printf "$o"
}
This function defaults to 64 bits, little-endian.
Usage: write <integer> [bits:64|32|16|8] [switch to big-endian]
With two parameters, the second must be one of 8, 16, 32 or 64: the bit length of the generated output.
With any dummy third parameter (even an empty string), the function switches to big-endian.
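A quick look at the byte order it produces (assuming the write function above is sourced):
write 0x1234 16    | od -An -t x1    # -> 34 12  (little-endian, the default)
write 0x1234 16 be | od -An -t x1    # -> 12 34  (big-endian)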
read64 foo < <(write -12345);echo $foo
-12345
...
First post, 2015...
Upgraded to add a bash-specific version (with bashisms).
With the newer printf built-in, you can do a lot without having to fork ($(...)), which makes your script a lot faster.
First, let's see (using seq and sed) how to parse hd output:
echo ;sed <(seq -f %02g 0 $(( COLUMNS-1 )) ) -ne '
/0$/{s/^\(.*\)0$/\o0337\o033[A\1\o03380/;H;};
/[1-9]$/{s/^.*\(.\)/\1/;H};
${x;s/\n//g;p}';hd < <(echo Hello good world!)
0         1         2         3         4         5         6         7
012345678901234567890123456789012345678901234567890123456789012345678901234567
00000000  48 65 6c 6c 6f 20 67 6f  6f 64 20 77 6f 72 6c 64  |Hello good world|
00000010  21 0a                                              |!.|
00000012
The hexadecimal part begins at column 10 and ends at column 56, with bytes spaced 3 characters apart and one extra space at column 34.
So parsing this could be done with:
while read line ;do
for x in ${line:10:48};do
printf -v x \\%o 0x$x
printf $x
done
done < <( ls -l --color | hd )
Old original post
Edit 2: for hexadecimal, you could use hd
echo Hello world | hd
00000000  48 65 6c 6c 6f 20 77 6f  72 6c 64 0a              |Hello world.|
or od
echo Hello world | od -t x1 -t c
0000000 48 65 6c 6c 6f 20 77 6f 72 6c 64 0a
H e l l o w o r l d \n
In short:
while IFS= read -r -n1 car;do [ "$car" ] && echo -n "$car" || echo ; done
Try it:
while IFS= read -rn1 c;do [ "$c" ]&&echo -n "$c"||echo;done < <(ls -l --color)
Explanation:
while IFS= read -rn1 car # unset the InputFieldSeparator so every char is read
do [ "$car" ] && # Test: is there ``something''?
echo -n "$car" || # then echo it
echo # else it was an end-of-line, so print one
done
Edit: the question was edited: hex values are needed!
od -An -t x1 | while read line;do for char in $line;do echo $char;done ;done
Demo:
od -An -t x1 < <(ls -l --color ) | # Translate binary to 1 byte hex
while read line;do # Read line of HEX pairs
for char in $line;do # For each pair
printf "\x$char" # Print translate HEX to binary
done
done
Demo 2: We have both hex and binary
od -An -t x1 < <(ls -l --color ) | # Translate binary to 1 byte hex
while read line;do # Read line of HEX pairs
for char in $line;do # For each pair
bin="$(printf "\x$char")" # translate HEX to binary
dec=$(printf "%d" 0x$char) # translate to decimal
[ $dec -lt 32 ] || # if character is not printable
( [ $dec -gt 128 ] && # change bin to a single dot.
[ $dec -lt 160 ] ) && bin="."
str="$str$bin"
echo -n $char \ # Print HEX value and a space
((i++)) # count printed values
if [ $i -gt 15 ] ;then
i=0
echo " - $str"
str=""
fi
done
done
New post, September 2016:
This could be useful in very specific cases (I've used it to manually copy GPT partitions between two disks, at a low level, without having /usr mounted...).
Yes, bash can read binary!
... but only one byte at a time... (Because char(0) cannot be read directly, the only way to detect it is to look at end-of-file: if no character was read but end-of-file has not been reached, then the character read was a char(0).)
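That detection idiom, in isolation (a small sketch; read succeeds when it stops at the NUL delimiter, but leaves the variable empty):
if LANG=C IFS= read -r -d '' -n 1 char; then
    [ -z "$char" ] && echo "read a NUL byte" || echo "read: $char"
else
    echo "end of file"
fi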
This is more a proof of concept than a really useful tool: it is a pure-bash version of hd (hexdump).
It uses recent bashisms and requires bash v4.3 or higher.
#!/bin/bash
printf -v ascii \\%o {32..126}
printf -v ascii "$ascii"
printf -v cntrl %-20sE abtnvfr
values=()
todisplay=
address=0
printf -v fmt8 %8s
fmt8=${fmt8// / %02x}
while LANG=C IFS= read -r -d '' -n 1 char; do
    if [ "$char" ]; then
        printf -v char "%q" "$char"
        ((${#char}==1)) && todisplay+=$char || todisplay+=.
        case ${#char} in
            1|2 ) char=${ascii%$char*};values+=($((${#char}+32)));;
            7   ) char=${char#*\'\\};values+=($((8#${char%\'})));;
            5   ) char=${char#*\'\\};char=${cntrl%${char%\'}*};
                  values+=($((${#char}+7)));;
            *   ) echo >&2 ERROR: $char;;
        esac
    else
        values+=(0)
    fi
    if [ ${#values[@]} -gt 15 ]; then
        printf "%08x $fmt8 $fmt8 |%s|\n" $address ${values[@]} "$todisplay"
        ((address+=16))
        values=() todisplay=
    fi
done
if [ "$values" ]; then
    ((${#values[@]}>8)) && fmt="$fmt8 ${fmt8:0:(${#values[@]}%8)*5}" ||
        fmt="${fmt8:0:${#values[@]}*5}"
    printf "%08x $fmt%$((
        50-${#values[@]}*3-(${#values[@]}>8?1:0)
    ))s |%s|\n" $address ${values[@]} '' "$todisplay"
fi
printf "%08x (%d chars read.)\n" $((address+${#values[@]})){,}
You could try/use this, but don't try to compare performances!
time hd < <(seq 1 10000|gzip)|wc
1415 25480 111711
real 0m0.020s
user 0m0.008s
sys 0m0.000s
time ./hex.sh < <(seq 1 10000|gzip)|wc
1415 25452 111669
real 0m2.636s
user 0m2.496s
sys 0m0.048s
Same job: 20 ms for hd vs roughly 2.6 s for my bash script.
... but if you just want to read 4 bytes from a file header, or a sector address on a hard drive, this can do the job...

Did you try xxd? It gives a hex dump directly, as you want.
For your case, the command would be:
xxd -c 1 /path/to/input_file | while read offset hex char; do
#Do something with $hex
done
Note: extract the character from $hex rather than using the character column that read captured, because read will not capture whitespace properly.
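For instance, a minimal sketch of that loop printing only the hex values (the input path is the placeholder from above; rebuilding the byte from $hex, as suggested, is shown commented out):
xxd -c 1 /path/to/input_file | while read -r offset hex char; do
    printf '%s\n' "$hex"          # the hex value of this byte
    # byte=$(printf "\x$hex")     # rebuild the byte itself if you need it
done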

Using read, a single char can be read at a time as follows:
read -n 1 c
echo $c
Try this:
#!/bin/bash
# data file
INPUT=/path/to/input.txt
# while loop
while IFS= read -r -n1 char
do
# display one character at a time
echo "$char"
done < "$INPUT"
From this link
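Since the question asks for hex, a minimal variation of that loop (a sketch: the %02x format plus the leading-quote printf trick are the only additions; in a multibyte locale this prints code points rather than raw bytes):
while IFS= read -r -n1 char
do
    if [ -n "$char" ]; then
        printf '%02x\n' "'$char"   # leading quote: printf uses the character's code
    else
        printf '0a\n'              # read returns an empty char for each newline
    fi
done < "$INPUT"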
Second method:
Using awk, loop through the line char by char:
awk '{for(i=1;i<=length;i++) print substr($0, i, 1)}' /home/cscape/Desktop/table2.sql
Third way:
$ fold -1 /home/cscape/Desktop/table.sql | awk '{print $0}'
EDIT: To print each char as a hex number:
Suppose I have a file named file:
$ cat file
123A3445F
I have written an awk script (named x.awk) that reads char by char from the file and prints each one in hex:
$ cat x.awk
#!/bin/awk -f
BEGIN { _ord_init() }
function _ord_init( low, high, i, t)
{
low = sprintf("%c", 7) # BEL is ascii 7
if (low == "\a") { # regular ascii
low = 0
high = 127
} else if (sprintf("%c", 128 + 7) == "\a") {
# ascii, mark parity
low = 128
high = 255
} else { # ebcdic(!)
low = 0
high = 255
}
for (i = low; i <= high; i++) {
t = sprintf("%c", i)
_ord_[t] = i
}
}
function ord(str, c)
{
# only first character is of interest
c = substr(str, 1, 1)
return _ord_[c]
}
function chr(c)
{
# force c to be numeric by adding 0
return sprintf("%c", c + 0)
}
{ x=$0; printf("%s , %x\n",$0, ord(x) )}
To write this script I used the awk documentation.
Now you can use this awk script for your task as follows:
$ fold -1 /home/cscape/Desktop/file | awk -f x.awk
1 , 31
2 , 32
3 , 33
A , 41
3 , 33
4 , 34
4 , 34
5 , 35
F , 46
NOTE: A has the value 41 in hex. To print in decimal, change %x to %d in the last line of x.awk.
Give it a try!

Yet another solution, using head, tail and printf:
for a in $( seq $( cat file.txt | wc -c ) ) ; do cat file.txt | head -c$a | tail -c1 | xargs -0 -I{} printf '%s %0X\n' {} "'{}" ; done
More readable:
#!/bin/bash
function usage() {
echo "Need file with size > 0"
exit 1
}
test -s "$1" || usage
for a in $( seq $( cat $1 | wc -c ) )
do
cat $1 | head -c$a | tail -c1 | \
xargs -0 -I{} printf '%c %#02x\n' {} "'{}"
done

Use read with the -n option.
while read -n 1 ch; do
echo $ch
done < moemoe.txt

I have a suggestion, but I would like feedback from everybody, and mainly some personal advice from user syntaxerror.
I don't know much about bash, but I thought it might be better to store the output of "cat $1" in a variable... though the problem is that the echo command also adds a small overhead, right?
test -s "$1" || (echo "Need a file with size greater than 0!"; exit 1)
a=0
rfile=$(cat $1)
max=$(echo $rfile | wc -c)
while [[ $((++a)) -lt $max ]]; do
echo $rfile | head -c$a | tail -c1 | \
xargs -0 -I{} printf '%c %#02x\n' {} "'{}"
done
In my opinion it should perform better, but I haven't benchmarked it.

Although I would rather have expanded Perleone's own post (as it was his basic concept!), my edit was rejected after all, and I was kindly advised that this should be posted as a separate answer. Fair enough, so I will do that.
Considerations in short for the improvements on Perleone's original script:
seq would be total overkill here. A simple while loop with a used as a (likewise simple) counter variable will do the job just fine, and much more quickly too.
The max value, $(cat $1 | wc -c), must be assigned to a variable; otherwise it would be recalculated on every iteration and make this alternate script run even slower than the one it was derived from.
There's no need to waste a function on a simple usage-info line. However, it is necessary to know about the (mandatory) curly braces around the two commands: without the { }, the exit 1 command would be executed in either case, and the script interpreter would never reach the loop. (Last note: ( ) would work too, but not in the same way! Parentheses spawn a subshell, whilst curly braces execute the commands inside them in the current shell; see the illustration after this list.)
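To illustrate that last point (a sketch; the file name is just a placeholder):
test -s missing_file || ( echo "need a non-empty file"; exit 1 )   # exit 1 only leaves the subshell, the script carries on
test -s missing_file || { echo "need a non-empty file"; exit 1; }  # exit 1 terminates the script here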
#!/bin/bash
test -s "$1" || { echo "Need a file with size greater than 0!"; exit 1; }
a=0
max=$(cat $1 | wc -c)
while [[ $((++a)) -lt $max ]]; do
cat $1 | head -c$a | tail -c1 | \
xargs -0 -I{} printf '%c %#02x\n' {} "'{}"
done


Take the first 16 characters and convert them into a hex string

I have a UUID, 3abbea88-c77d-11eb-b8bc-0242ac130003, and I want to take the first 16 characters of this string and get the hexadecimal string of those 16 characters using a shell script.
I tried:
code=$(echo -n ${${ID##*:}:0:16} | od -A n -t x1)
HEX_ID=$(echo ${code//[[:blank:]]/})
Any better way?
Expected Output : 33616262656138382d633737642d3131
Using od you can simply limit the number of read characters using the -N option:
HEX_ID=$(od -A n -t x1 -N 16 <<< ${ID##*:} | tr -dc '[:xdigit:]')
Edit: tr is used to suppress non-hexadecimal characters, namely whitespaces and potential newlines.
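A quick check with the sample UUID from the question (a sketch; ${ID##*:} is kept from the answer above even though this UUID contains no colon):
ID='3abbea88-c77d-11eb-b8bc-0242ac130003'
HEX_ID=$(od -A n -t x1 -N 16 <<< "${ID##*:}" | tr -dc '[:xdigit:]')
echo "$HEX_ID"    # 33616262656138382d633737642d3131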
Perl to the rescue!
perl -le 'print unpack "H32", shift' 3abbea88-c77d-11eb-b8bc-0242ac130003
-l adds newlines to print
unpack takes a string and expands it into a list of values based on a template. H32 means "take the bytes and represent them as 32 hex digits (i.e. the first 16 characters), high nibble first".
shift reads the first command line argument.
Or, using xxd and head:
echo 3abbea88-c77d-11eb-b8bc-0242ac130003 | xxd -p | head -c32
That's certainly a useless echo.
Probably avoid uppercase for your private variables.
uuid='3abbea88-c77d-11eb-b8bc-0242ac130003'
tmp=${uuid//-/}
hex_id=$(od -A n -t x1 <<<${tmp:0:13})
hex_id=${hex_id//[[:blank:]]/}
hex_id=${hex_id%0a}
The here-string unattractively supplies a trailing newline to od, which we then have to trim off.
Bash-only:
while read -r -N 1 c # read input string 1 char at a time
do [[ "$c" == " " ]] || # skip embedded spaces
printf "%02X" "$( # output the hexidecimal value of
printf "%d" \'$c # the ASCII decimal ordinal of $c
)"
done <<< "${text##*:}" # ignoring the leading trash to the :
echo # newline-terminate the output
All in one line:
while read -rn1 c;do [[ "$c" == " " ]]||printf %02X $(printf "%d" \'$c);done<<<"${text##*:}";echo
This is not the fastest approach...
hexdump does it all:
hexdump -n 16 -ve '1/1 "%.2x"'
-n 16 means only process the first 16 bytes
-e '1/1 "%.2x"' means display each byte using given printf format
-v means display normally (without this, it replaces dupe sections with * 🤷)
echo '3abbea88-c77d-11eb-b8bc-0242ac130003' | hexdump -n 16 -ve '1/1 "%.2x"'
output:
33616262656138382d633737642d3131

Hex to decimal conversion in bash without using gawk

Input:
cat test1.out
12 , maze|style=0x48570006, column area #=0x7, location=0x80000d
13 , maze|style=0x48570005, column area #=0x7, location=0x80aa0d
....
...
..
.
Output needed:
12 , maze|style=0x48570006, column area #=0x7, location=8388621 <<<8388621 is decimal of 0x80000d
....
I want to convert just the last column to decimal.
I cannot use gawk, as it is not available on all of our company machines.
I tried using awk --non-decimal-data but it didn't work either.
I'm wondering whether printf alone can flip the last word from hex to decimal.
Any other ideas that you can suggest?
There's no need for awk or any other external command here: bash's native math operations handle hexadecimal values correctly in an arithmetic context (this is why echo $((0xff)) emits 255).
#!/usr/bin/env bash
# ^^^^- must be really bash, not /bin/sh
location_re='location=(0x[[:xdigit:]]+)([[:space:]]|$)'
while read -r line; do
if [[ $line =~ $location_re ]]; then
hex=${BASH_REMATCH[1]}
dec=$(( $hex ))
printf '%s\n' "${line/location=$hex/location=$dec}"
else
printf '%s\n' "$line"
fi
done
You can see this running at https://ideone.com/uN7qNY
In case the strtonum() function is not available, how about:
#!/bin/bash
awk -F'location=0x' '
function hex2dec(str,
i, x, c, tab) {
for (i = 0; i <= 15; i++) {
tab[substr("0123456789ABCDEF", i + 1, 1)] = i;
}
x = 0
for (i = 1; i <= length(str); i++) {
c = toupper(substr(str, i, 1))
x = x * 16 + tab[c]
}
return x
}
{
print $1 "location=" hex2dec($2)
}
' test1.out
where hex2dec() is a homemade substitute for strtonum().
Wait, can't you just use printf in other awks? It won't work with gawk but it does with other awks, right? For example with mawk:
$ mawk 'BEGIN{FS=OFS="="}{$NF=sprintf("%d", $NF);print}' file
12 , maze|style=0x48570006, column area #=0x7, location=8388621
13 , maze|style=0x48570005, column area #=0x7, location=8432141
I tested with mawk, awk-20070501, awk-20121220 and Busybox awk.
Discarded after edit but left for comments' sake:
Using rev and cut to extract around the last = and printf for hex2dec conversion:
$ while IFS='' read -r line || [[ -n "$line" ]]
do
printf "%s=%d\n" "$(echo "$line" | rev | cut -d = -f 2- | rev)" \
$(echo "$line" | rev | cut -d = -f 1 | rev)
done < file
Output:
12 , maze|style=0x48570006, column area #=0x7, location=8388621
13 , maze|style=0x48570005, column area #=0x7, location=8432141
If you have Perl installed, not having Gawk is rather inconsequential.
perl -pe 's/location=\K0x([0-9a-f]+)/ hex($1) /e' file
This might work for you (GNU sed and Bash):
sed 's/\(.*location=\)\(0x[0-9a-f]\+\)/echo "\1$((\2))"/Ie' file
Use pattern matching and back references to split each line and then evaluate an echo command.
Alternative:
sed 's/\(.*location=\)\(0x[0-9a-f]\+\)/echo "\1$((\2))"/I' file | sh
BASH_REMATCH array info:
http://molk.ch/tips/gnu/bash/rematch.html
The essential principle:
[[ string =~ regexp ]]
[[ "abcdef" =~ (b)(.)(d)e ]]
If the 'string' matches 'regexp',
.. the matched part of the string is stored in the BASH_REMATCH array.
# Now:
# BASH_REMATCH[0]=bcde # the whole match
# BASH_REMATCH[1]=b # the 1st captured group
# BASH_REMATCH[2]=c # as ...
# BASH_REMATCH[3]=d
Enjoy!
Bash's native math operations handle hexadecimal values correctly in any arithmetic context.
Example:
echo $(( 0xff))
255
printf '%d' 0xf0
240

ORD and CHR a file in Bash

I built ord and chr functions and they work just fine.
But if I take a file that contains \n, for example:
hello
CHECK THIS HIT
YES
when I ord everything I don't get any newline values. Why is that? I'm writing this in Bash.
Here is the code that I am using:
function ord {
ordr="`printf "%d\n" \'$1`"
}
TEXT="`cat $1`"
for (( i=0; i<${#TEXT}; i++ ))
do
ord "${TEXT:$i:1}"
echo "$ordr"
done
Your ord function is really weird. Maybe it would be better to write it as:
function ord {
printf -v ordr "%d" "'$1"
}
Then you would use it as:
TEXT=$(cat "$1")
for (( i=0; i<${#TEXT}; i++ )); do
ord "${TEXT:$i:1}"
printf '%s\n' "$ordr"
done
This still leaves two problems: you won't be able to have null bytes and you won't see trailing newlines. For example (I called your script banana and chmod +x banana):
$ ./banana <(printf 'a\0b\n')
97
98
Two problems show here: the null byte is removed from Bash in the TEXT=$(cat "$1") part, as a Bash variable can't contain null bytes. Moreover, this step also trims trailing newlines.
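Both effects are easy to demonstrate on their own (a sketch; recent bash versions also print a warning about the ignored null byte):
TEXT=$(printf 'a\0b\n\n')
printf '%s' "$TEXT" | od -c    # only "a" and "b" remain: the NUL and the trailing newlines are gone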
A more robust approach would be to use read:
while IFS= read -r -n 1 -d '' char; do
ord "$char"
printf '%s\n' "$ordr"
done < "$1"
With this modification:
$ ./banana <(printf 'a\0b\n')
97
0
98
10
Note that this script depends on your locale. With my locale (LANG="en_US.UTF-8"):
$ ./banana <(printf 'a\0â„‚\n')
97
0
8450
10
whereas:
$ LANG= ./banana <(printf 'a\0â„‚\n')
97
0
226
132
130
10
That's to show you that Bash doesn't read bytes, but characters. So depending on how you want Bash to treat your data, set LANG accordingly.
If your script only does that, it's much simpler to not use an ord function at all:
#!/bin/bash
while IFS= read -r -n 1 -d '' char; do
printf '%d\n' "'$char"
done < "$1"
It's that simple!

Reading a file in a shell script and selecting a section of the line

This is probably pretty basic; I want to read in an occurrence file.
Then the program should find all occurrences of "CallTilEdb" in the file Hendelse.logg:
CallTilEdb 8
CallCustomer 9
CallTilEdb 4
CustomerChk 10
CustomerChk 15
CallTilEdb 16
and sum up the right column. In this case that's 8 + 4 + 16, so the output I want is 28.
I'm not sure how to do this, and this is as far as I have gotten with vistid.sh:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
while read -r line
do
if [ "$occurance" = $(cut -f1 line) ] #line 10
then
sumTime+=$(cut -f2 line)
fi
done < "$filename"
so the execution in terminal would be
vistid.sh CallTilEdb
but the error I get now is:
/home/user/bin/vistid.sh: line 10: [: unary operator expected
You have a nice approach, but maybe you could use awk to do the same thing... much faster!
$ awk -v par="CallTilEdb" '$1==par {sum+=$2} END {print sum+0}' hendelse.logg
28
It may look a bit weird if you haven't used awk so far, but here is what it does:
-v par="CallTilEdb" provide an argument to awk, so that we can use par as a variable in the script. You could also do -v par="$1" if you want to use a variable provided to the script as parameter.
$1==par {sum+=$2} this means: if the first field is the same as the content of the variable par, then add the second column's value into the counter sum.
END {print sum+0} this means: once you are done from processing the file, print the content of sum. The +0 makes awk print 0 in case sum was not set... that is, if nothing was found.
In case you really want to make it with bash, you can use read with two parameters, so that you don't have to make use of cut to handle the values, together with some arithmetic operations to sum the values:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
while read -r name value # read both values with -r for safety
do
if [ "$occurance" == "$name" ]; then # string comparison
((sumTime+=$value)) # sum
fi
done < "$filename"
echo "sum: $sumTime"
So that it works like this:
$ ./vistid.sh CallTilEdb
sum: 28
$ ./vistid.sh CustomerChk
sum: 25
First of all, you need to change the way you call cut:
$( echo $line | cut -f1 )
In line 10 you are missing the evaluation:
if [ "$occurance" = $( echo $line | cut -f1 ) ]
You can then sum by doing:
sumTime=$[ $sumTime + $( echo $line | cut -f2 ) ]
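The $[ ... ] form is obsolete, by the way; in current bash the same sum would normally be written with arithmetic expansion:
sumTime=$(( sumTime + $(echo "$line" | cut -f2) ))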
But you can also use a different approach and put the line values into an array; the final script will look like:
#!/bin/bash
declare -t filename=prova
declare -t occurance="$1"
declare -i sumTime=0
while read -a line
do
if [ "$occurance" = ${line[0]} ]
then
sumTime=$[ $sumTime + ${line[1]} ]
fi
done < "$filename"
echo $sumTime
For reference,
id="CallTilEdb"
file="Hendelse.logg"
sum=$(echo "0 $(sed -n "s/^$id[^0-9]*\([0-9]*\)/\1 +/p" < "$file") p" | dc)
echo SUM: $sum
prints
SUM: 28
the sed extracts the numbers from the lines containing the given id, such as CallTilEdb,
and prints them in the format number +
the echo prepares a string such as 0 8 + 4 + 16 + p, which is the calculation in RPN format
the dc does the calculation
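So what effectively reaches dc for the sample data is:
echo '0 8 + 4 + 16 + p' | dc    # -> 28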
another variant:
sum=$(sed -n "s/^$id[^0-9]*\([0-9]*\)/\1/p" < "$file" | paste -sd+ - | bc)
#or
sum=$(grep -oP "^$id\D*\K\d+" < "$file" | paste -sd+ - | bc)
the sed (or the grep) extracts and prints only the numbers
the paste makes a string like number + number + number (-d+ sets the delimiter)
the bc does the calculation
Or perl:
sum=$(perl -slanE '$s+=$F[1] if /^$id/}{say $s' -- -id="$id" "$file")
sum=$(ID="CallTilEdb" perl -lanE '$s+=$F[1] if /^$ENV{ID}/}{say $s' "$file")
Awk translation to script:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
sumTime=$(awk -v entry="$occurance" '
    $1==entry{time+=$NF+0}
    END{print time+0}' "$filename")
echo "$sumTime"

How can I align the columns of tables in Bash?

I want to format text as a table. I tried echoing with a '\t' separator, but it was misaligned.
Desired output:
a very long string..........     112232432      anotherfield
a smaller string                 123124343      anotherfield
Use the column command:
column -t -s' ' filename
printf is great, but people forget about it.
$ for num in 1 10 100 1000 10000 100000 1000000; do printf "%10s %s\n" $num "foobar"; done
         1 foobar
        10 foobar
       100 foobar
      1000 foobar
     10000 foobar
    100000 foobar
   1000000 foobar
$ for((i=0;i<array_size;i++));
do
printf "%10s %10d %10s\n" "${stringarray[$i]}" "${numberarray[$i]}" "${anotherfieldarray[$i]}"
done
Notice I used %10s for strings. %s is the important part. It tells it to use a string. The 10 in the middle says how many columns it is to be. %d is for numerics (digits).
See man 1 printf for more info.
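For the exact layout asked for in the question, left-aligned fields work too (a small sketch; the widths are just picked to fit the sample data):
printf '%-30s %-12s %s\n' 'a very long string..........' 112232432 anotherfield
printf '%-30s %-12s %s\n' 'a smaller string'             123124343 anotherfield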
function printTable()
{
    local -r delimiter="${1}"
    local -r data="$(removeEmptyLines "${2}")"

    if [[ "${delimiter}" != '' && "$(isEmptyString "${data}")" = 'false' ]]
    then
        local -r numberOfLines="$(wc -l <<< "${data}")"

        if [[ "${numberOfLines}" -gt '0' ]]
        then
            local table=''
            local i=1

            for ((i = 1; i <= "${numberOfLines}"; i = i + 1))
            do
                local line=''
                line="$(sed "${i}q;d" <<< "${data}")"

                local numberOfColumns='0'
                numberOfColumns="$(awk -F "${delimiter}" '{print NF}' <<< "${line}")"

                # Add Line Delimiter
                if [[ "${i}" -eq '1' ]]
                then
                    table="${table}$(printf '%s#+' "$(repeatString '#+' "${numberOfColumns}")")"
                fi

                # Add Header Or Body
                table="${table}\n"
                local j=1

                for ((j = 1; j <= "${numberOfColumns}"; j = j + 1))
                do
                    table="${table}$(printf '#| %s' "$(cut -d "${delimiter}" -f "${j}" <<< "${line}")")"
                done

                table="${table}#|\n"

                # Add Line Delimiter
                if [[ "${i}" -eq '1' ]] || [[ "${numberOfLines}" -gt '1' && "${i}" -eq "${numberOfLines}" ]]
                then
                    table="${table}$(printf '%s#+' "$(repeatString '#+' "${numberOfColumns}")")"
                fi
            done

            if [[ "$(isEmptyString "${table}")" = 'false' ]]
            then
                echo -e "${table}" | column -s '#' -t | awk '/^\+/{gsub(" ", "-", $0)}1'
            fi
        fi
    fi
}

function removeEmptyLines()
{
    local -r content="${1}"
    echo -e "${content}" | sed '/^\s*$/d'
}

function repeatString()
{
    local -r string="${1}"
    local -r numberToRepeat="${2}"

    if [[ "${string}" != '' && "${numberToRepeat}" =~ ^[1-9][0-9]*$ ]]
    then
        local -r result="$(printf "%${numberToRepeat}s")"
        echo -e "${result// /${string}}"
    fi
}

function isEmptyString()
{
    local -r string="${1}"

    if [[ "$(trimString "${string}")" = '' ]]
    then
        echo 'true' && return 0
    fi

    echo 'false' && return 1
}

function trimString()
{
    local -r string="${1}"
    sed 's,^[[:blank:]]*,,' <<< "${string}" | sed 's,[[:blank:]]*$,,'
}
SAMPLE RUNS
$ cat data-1.txt
HEADER 1,HEADER 2,HEADER 3
$ printTable ',' "$(cat data-1.txt)"
+-----------+-----------+-----------+
| HEADER 1  | HEADER 2  | HEADER 3  |
+-----------+-----------+-----------+
$ cat data-2.txt
HEADER 1,HEADER 2,HEADER 3
data 1,data 2,data 3
$ printTable ',' "$(cat data-2.txt)"
+-----------+-----------+-----------+
| HEADER 1  | HEADER 2  | HEADER 3  |
+-----------+-----------+-----------+
| data 1    | data 2    | data 3    |
+-----------+-----------+-----------+
$ cat data-3.txt
HEADER 1,HEADER 2,HEADER 3
data 1,data 2,data 3
data 4,data 5,data 6
$ printTable ',' "$(cat data-3.txt)"
+-----------+-----------+-----------+
| HEADER 1  | HEADER 2  | HEADER 3  |
+-----------+-----------+-----------+
| data 1    | data 2    | data 3    |
| data 4    | data 5    | data 6    |
+-----------+-----------+-----------+
$ cat data-4.txt
HEADER
data
$ printTable ',' "$(cat data-4.txt)"
+---------+
| HEADER  |
+---------+
| data    |
+---------+
$ cat data-5.txt
HEADER
data 1
data 2
$ printTable ',' "$(cat data-5.txt)"
+---------+
| HEADER  |
+---------+
| data 1  |
| data 2  |
+---------+
REF LIB at: https://github.com/gdbtek/linux-cookbooks/blob/master/libraries/util.bash
To have the exact same output as you need, you need to format the file like this:
a very long string..........\t 112232432\t anotherfield\n
a smaller string\t 123124343\t anotherfield\n
And then using:
$ column -t -s $'\t' FILE
a very long string..........  112232432  anotherfield
a smaller string              123124343  anotherfield
It's easier than you might think.
If you are working with a separated-by-semicolon file and header too:
$ (head -n1 file.csv && sort file.csv | grep -v <header>) | column -s";" -t
If you are working with an array (using tab as separator):
for((i=0;i<array_size;i++));
do
echo stringarray[$i] $'\t' numberarray[$i] $'\t' anotherfieldarray[$i] >> tmp_file.csv
done;
cat file.csv | column -t
awk solution that deals with stdin
Since column is not POSIX, maybe this is a more portable alternative:
mycolumn() (
file="${1:--}"
if [ "$file" = - ]; then
file="$(mktemp)"
cat > "${file}"
fi
awk '
FNR == 1 { if (NR == FNR) next }
NR == FNR {
for (i = 1; i <= NF; i++) {
l = length($i)
if (w[i] < l)
w[i] = l
}
next
}
{
for (i = 1; i <= NF; i++)
printf "%*s", w[i] + (i > 1 ? 1 : 0), $i
print ""
}
' "$file" "$file"
if [ "$1" = - ]; then
rm "$file"
fi
)
Test:
printf '12 1234 1
12345678 1 123
1234 123456 123456
' > file
Test commands:
mycolumn file
mycolumn <file
mycolumn - <file
Output for all:
      12   1234      1
12345678      1    123
    1234 123456 123456
See also:
Using awk to align columns in text file?
AWK: go through the file twice, doing different tasks
I am not sure where you were running this, but the code you posted would not produce the output you gave, at least not in the Bash version that I'm familiar with.
Try this instead:
stringarray=('test' 'some thing' 'very long long long string' 'blah')
numberarray=(1 22 7777 8888888888)
anotherfieldarray=('other' 'mixed' 456 'data')
array_size=4
for((i=0;i<array_size;i++))
do
echo ${stringarray[$i]} $'\x1d' ${numberarray[$i]} $'\x1d' ${anotherfieldarray[$i]}
done | column -t -s$'\x1d'
Note that I'm using the group separator character (0x1D) instead of tab, because if you are getting these arrays from a file, they might contain tabs.
Just in case someone wants to do that in PHP, I posted a gist on GitHub:
https://gist.github.com/redestructa/2a7691e7f3ae69ec5161220c99e2d1b3
Simply call:
$output = $tablePrinter->printLinesIntoArray($items, ['title', 'chilProp2']);
You may need to adapt the code if you are using a PHP version older than 7.2.
After that, call echo or writeLine depending on your environment.
The below code has been tested and does exactly what is requested in the original question.
Parameters:
%30s: column of 30 chars, text right-aligned.
%10d: integer notation; %10s would also work.
stringarray[0]="a very long string.........."
# 28Char (max length for this column)
numberarray[0]=1122324333
# 10digits (max length for this column)
anotherfield[0]="anotherfield"
# 12Char (max length for this column)
stringarray[1]="a smaller string....."
numberarray[1]=123124343
anotherfield[1]="anotherfield"
printf "%30s %10d %13s" "${stringarray[0]}" ${numberarray[0]} "${anotherfield[0]}"
printf "\n"
printf "%30s %10d %13s" "${stringarray[1]}" ${numberarray[1]} "${anotherfield[1]}"
# a var string with spaces has to be quoted
printf "\n Next line will fail \n"
printf "%30s %10d %13s" ${stringarray[0]} ${numberarray[0]} "${anotherfield[0]}"
  a very long string.......... 1122324333  anotherfield
         a smaller string.....  123124343  anotherfield
column -t skips empty fields when a line starts with a delimiter character or when there are two or more consecutive delimiter characters:
$ printf %s\\n a,b,c a,,c ,b,c|column -s, -t
a  b  c
a  c
b  c
Therefore I use this awk function instead (it requires gawk because it uses arrays of arrays):
$ tab(){ awk '{if(NF>m)m=NF;for(i=1;i<=NF;i++){a[NR][i]=$i;l=length($i);if(l>b[i])b[i]=l}}END{for(h in a){for(i=1;i<=m;i++)printf("%-"(b[i]+n)"s",a[h][i]);print""}}' n="${2-1}" "${1+FS=$1}"|sed 's/ *$//';}
$ printf %s\\n a,b,c a,,c ,b,c|tab ,
a b c
a   c
  b c
If your data doesn't contain the equals sign ("=") anywhere in it, you can use that as a shell-friendly delimiter for column without having to escape anything.
By modifying FS to be either a tab ("\t"), optionally surrounded by spaces or tabs, or a contiguous run of 2 or more spaces, this also allows the input data to contain single spaces within each field:
echo "${inputdata2}" |
mawk NF=NF OFS== FS=' + |[ \t]*\t[ \t]*' |
column -s= -t
a very long string..........  112232432  anotherfield
a smaller string              123124343  anotherfield
If the data does contain the equals sign, use a combination separator that is nearly impossible to find in typical data:
gawk -e NF=NF OFS='\301\372\5' FS=' + |[ \t]*\t[ \t]*' |
LC_ALL=C column -s$'\301\372\5' -t
a very long string..........  112232432  anotherfield
a smaller string              123124343  anotherfield
And if your data only has 2 columns, and you have a ballpark sense of how wide the first field is, you can use this \r trick for nice on-screen formatting (but the gaps won't become runs of spaces if you need to send the output down a pipe):
# each \t is 8 spaces at the console terminal
mawk NF=2 FS=' + |[ \t]*\t[ \t]*' OFS='\r\t\t\t\t'
a very long string..........    112232432
a smaller string                123124343
