I have a tab-delimited text file which I send to column to "pretty print" a table.
Original file:
1<TAB>blablablabla<TAB>aaaa bbb ccc
2<TAB>blabla<TAB>xxxxxx
34<TAB>okokokok<TAB>zzz yyy
Using column -s$'\t' -t <original file>, I get
1   blablablabla  aaaa bbb ccc
2   blabla        xxxxxx
34  okokokok      zzz yyy
as desired. Now I want to add colors to the columns. I tried to add the escape codes around each tab-delimited field in the original file. column successfully prints in color, but the columns are no longer aligned. Instead, it just prints the TAB separators verbatim.
The question is: how can I get the columns aligned, but also with unique colors?
I've thought of two ways to achieve this:
Adjust the column parameters to make the alignment work with color codes
Redirect the output of column to another file, and do a search+replace on the first two whitespace-delimited fields (the first two columns are guaranteed to not contain spaces; the third column most likely will contain spaces, but no TAB characters)
Problem is, I'm not sure how to do either of those two...
For reference, here is what I'm passing to column:
Note that the fields are indeed separated by TAB characters. I've confirmed this with od.
edit:
There doesn't seem to be an issue with the colorization. I already have the file shown above with the color codes working. The issue is column won't align once I send it input with escape codes. I am thinking of passing the fields without color codes to column, then copying the exact number of spaces column output between each field, and using that in a pretty print scheme.
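The second idea (align first, colorize afterwards) can be sketched roughly as follows. The function name color_fields and the particular colors are placeholders rather than anything from the question, and the sketch relies on the stated guarantee that the first two columns never contain spaces.

```bash
#!/bin/bash
# Hypothetical helper (name and colors are illustrative, not from the question).
# Reads already-aligned text on stdin and colors field 1, field 2 and the rest.
# Assumes fields 1 and 2 contain no spaces, as the question guarantees.
color_fields() {
    sed -E $'s/^([^ ]+ +)([^ ]+ +)(.*)$/\033[1;32m\\1\033[1;33m\\2\033[1;34m\\3\033[0m/'
}

# Intended use: align the plain data first, then colorize:
#   column -s$'\t' -t file.txt | color_fields
# Demo with text that is already aligned:
printf '1   blablablabla  aaaa bbb ccc\n34  okokokok      zzz yyy\n' | color_fields
```

Because the escape codes are added only after column has computed its widths, the alignment is untouched; the trailing spaces of each field simply end up inside that field's color.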
I wrote a bash version of column (similar to the one from util-linux) which works with color codes:
#!/bin/bash
which sed >> /dev/null || exit 1
version=1.0b
editor="Norman Geist"
last="04 Jul 2016"
# NOTE: Brilliant pipeable tool to format input text into a table by
# NOTE: a configurable separation string, similar to column
# NOTE: from util-linux, but we are smart enough to ignore
# NOTE: ANSI escape codes in our column width computation
# NOTE: means we handle colors properly ;-)
# BUG : none
addspace=1
seperator=$(echo -e " ")
columnW=()
columnT=()
while getopts "s:hp:v" opt; do
case $opt in
s ) seperator=$OPTARG;;
p ) addspace=$OPTARG;;
v ) echo "Version $version last edited by $editor ($last)"; exit 0;;
h ) echo "column2 [-s seperator] [-p padding] [-v]"; exit 0;;
* ) echo "Unknown command-line switch \"$opt\""; exit 1;;
esac
done
shift $(($OPTIND-1))
if [ ${#seperator} -lt 1 ]; then
echo "Error) Please enter a valid separation string!"
exit 1
fi
if [ ${#addspace} -lt 1 ]; then
echo "Error) Please enter the number of additional padding spaces!"
exit 1
fi
#args: string
function trimANSI()
{
TRIM=$1
TRIM=$(sed 's/\x1b\[[0-9;]*m//g' <<< $TRIM); #trim color codes
TRIM=$(sed 's/\x1b(B//g' <<< $TRIM); #trim sgr0 directive
echo $TRIM
}
#args: len
function pad()
{
for ((i=0; i<$1; i++))
do
echo -n " "
done
}
#read and measure cols
while read ROW
do
while IFS=$seperator read -ra COLS; do
ITEMC=0
for ITEM in "${COLS[@]}"; do
SITEM=$(trimANSI "$ITEM"); #quotes matter O_o
[ ${#columnW[$ITEMC]} -gt 0 ] || columnW[$ITEMC]=0
[ ${columnW[$ITEMC]} -lt ${#SITEM} ] && columnW[$ITEMC]=${#SITEM}
((ITEMC++))
done
columnT[${#columnT[@]}]="$ROW"
done <<< "$ROW"
done
#print formatted output
for ROW in "${columnT[@]}"
do
while IFS=$seperator read -ra COLS; do
ITEMC=0
for ITEM in "${COLS[@]}"; do
WIDTH=$(( ${columnW[$ITEMC]} + $addspace ))
SITEM=$(trimANSI "$ITEM"); #quotes matter O_o
PAD=$(($WIDTH-${#SITEM}))
if [ $ITEMC -ne 0 ]; then
pad $PAD
fi
echo -n "$ITEM"
if [ $ITEMC -eq 0 ]; then
pad $PAD
fi
((ITEMC++))
done
done <<< "$ROW"
echo ""
done
Example usage:
bold=$(tput bold)
normal=$(tput sgr0)
green=$(tput setaf 2)
column2 -s § << END
${bold}First Name§Last Name§City${normal}
${green}John§Wick${normal}§New York
${green}Max§Pattern${normal}§Denver
END
Output example:
I would use awk for the colorization (sed can be used as well):
awk '{printf "\033[1;32m%s\t\033[00m\033[1;33m%s\t\033[00m\033[1;34m%s\033[00m\n", $1, $2, $3;}' a.txt
and pipe it to column for the alignment:
... | column -s$'\t' -t
Output:
A solution using printf to format the output as well:
while IFS=$'\t' read -r c1 c2 c3; do
    tput setaf 1; printf '%-10s' "$c1"
    tput setaf 2; printf '%-30s' "$c2"
    tput setaf 3; printf '%-30s' "$c3"
    tput sgr0; echo
done < file
In my case, I wanted to selectively colorise values in a column depending on its value. Let's say I want okokokok to be green and blabla to be red.
I can do it this way (the idea is to colorise the column values after columnisation):
GREEN_SED='\\033[0;32m'
RED_SED='\\033[0;31m'
NC_SED='\\033[0m' # No Color
column -s$'\t' -t <original file> | echo -e "$(sed -e "s/okokokok/${GREEN_SED}okokokok${NC_SED}/g" -e "s/blabla/${RED_SED}blabla${NC_SED}/g")"
Alternatively, with a variable:
DATA=$(column -s$'\t' -t <original file>)
GREEN_SED='\\033[0;32m'
RED_SED='\\033[0;31m'
NC_SED='\\033[0m' # No Color
echo -e "$(sed -e "s/okokokok/${GREEN_SED}okokokok${NC_SED}/g" -e "s/blabla/${RED_SED}blabla${NC_SED}/g" <<< "$DATA")"
Note the additional backslash in the color definitions. It is there so that sed does not interpret the original backslash.
This is a result:
2021 Updated BASH Answer
TL;DR
I really liked @NORMAN GEIST's answer, but it was way too slow for what I needed... So I coded my own version of his script, this time written in Perl (stdin looping and formatting) + Bash (only for presentation/help).
You can find the full code here with an explanation on how to use it.
It includes:
A Bash column-like command interface (same parameters like -t, -s, -o)
Exhaustive help with column_ansi --help or column_ansi -h
Option to horizontally center.
The actual "core" code can be broken down to only the Perl part.
Background and differences
I needed to format a very long awk-generated colored output (more than 300 lines) into a nice table.
I first thought of using column, but I discovered that it doesn't take ANSI escape sequences into account, so the output came out misaligned.
After searching a bit on Google I found @NORMAN GEIST's interesting answer on SO, which dynamically calculates the width of every column in the output after removing the ANSI escape sequences and THEN builds the table.
It was all good, but it was taking way too long to load (as someone pointed out in the comments)...
So I tried to convert @NORMAN GEIST's column2 from bash to perl, and my god, what a change!
After trying out this version in my production script, the time used to display data dropped from 30s to <1s!!
Enjoy!
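The technique both versions rely on (measure each cell after stripping the escape sequences, then pad using the visible width) can be sketched in a few lines of awk. This is not the linked Perl code. The function name column_ansi_sketch, the two-space gutter, and the restriction to SGR color sequences plus the ESC ( B that some tput sgr0 implementations emit are all assumptions of this sketch.

```bash
#!/bin/bash
# Hypothetical sketch, not the linked column_ansi script.
# Usage: some_command | column_ansi_sketch [separator]   (default separator: TAB)
column_ansi_sketch() {
    local sep=${1:-$'\t'}
    awk -F "$sep" '
        # Visible length of a cell: drop color codes before measuring.
        function vislen(s) {
            gsub(/\033\[[0-9;]*m/, "", s)   # SGR sequences such as ESC[1;32m
            gsub(/\033\(B/, "", s)          # charset reset emitted by some "tput sgr0"
            return length(s)
        }
        # First pass: remember every cell and track the widest visible cell per column.
        {
            nf[NR] = NF
            for (i = 1; i <= NF; i++) {
                cell[NR, i] = $i
                w = vislen($i)
                if (w > width[i]) width[i] = w
            }
        }
        # Second pass: print cells followed by padding based on visible width.
        END {
            for (r = 1; r <= NR; r++) {
                line = ""
                for (i = 1; i <= nf[r]; i++) {
                    line = line cell[r, i]
                    if (i < nf[r]) {
                        pad = width[i] - vislen(cell[r, i]) + 2
                        line = line sprintf("%" pad "s", "")
                    }
                }
                print line
            }
        }'
}
```

The whole input is held in memory, as with column -t, but there is only one awk process instead of one sed call per cell, which is where the pure-bash version spends most of its time. Note that some awk implementations count bytes rather than characters in length(), so non-ASCII cells may still be padded incorrectly.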
Related
Consider this small example:
printf "Loading data..."; \
sleep 5; \
echo -e "\rThis is my cool data point."
This prints, of course, Loading data..., and after 5 seconds that is overwritten by This is my cool data point.
But what if the text printed after the \r is shorter than the first line?
printf "Loading data..."; \
sleep 5; \
echo -e "\rNo data."
...brings No data.data... after the waiting time.
Do I have to keep track of the longest possible line and print "\rNo data. " or is there any "magic character" that fills the line until its end in a normal terminal?
You can delete to end of line with tput el. So you can do:
eol=$(tput el)
printf "Loading data..."
sleep 5
printf "\rNo data.${eol}\n"
It's not really a "magic character" so much as a "magic sequence", and the actual values that are used may vary with the terminal. tput will (should) do the right thing and give you a reasonably portable method. Attempting to determine precisely which sequence to use is a futile effort.
Another solution would be:
#!/bin/bash
msg1='Loading data...'
msg2='No data.'
printf '%s' "$msg1"
sleep 5
printf '\r%*s\n' $(( -(${#msg1} > ${#msg2} ? ${#msg1} : ${#msg2}) )) "$msg2"
This appends trailing spaces to the line if the length of msg1 is greater than that of msg2.
I have various variables and what I want to do is to print them all with the same width (character wise). To achieve that, I first need to discover which is the longest string and add one to it,
and then print the shorter ones with that width, padding with spaces.
well ideally i want the output to be like
IFROGZ FREE REIN 2 "00:00:00:00:B5:C8"
Mi Phone "A4:50:46:AC:32:59"
realme Watch "D8:CA:8E:CD:5D:7C"
where, if a device is connected, 2 of the 4 spaces on the left become asterisks (*), but this question is mostly about the right padding/formatting
printf %15s $something ' ' $isuppose, doesn't work which would be the ideal solution, but how do I find the width to put there?
A very important thing is it needs to be POSIX compliant scripting.
Here's where I'm at, but the code at this point is very redundant because I've tried brute-forcing the solution.
inc=$#
inc=$((inc-1))
demon=$(eval printf \"\$$inc\");
inc=$((inc-2))
tellar=$(eval printf \"\$$inc\");
demon=${demon:1:-1}$tella$tellar
inc=$((inc-1))
while (( $inc >= 2 )); do
aussie=$(eval printf \"\$$inc\");
inc=$((inc-2))
tellar=$(eval printf \"\$$inc\");
inc=$((inc+2))
demon=$(printf "%s»%"$smoll"s%s" $demon ${aussie:1:-1} " " $tellar);
inc=$((inc-3))
done
demon=$(echo $demon | sed -E "s/»/`space=${#demon}; while (( smoll > i++ )); do ( printf " " ); done; unset space;`\n/g")
Here's the current input and output although the input is "wrong".
Given a set of variables, the maximum width can be easily calculated with:
setwidth(){
    width=0
    for str in "$@"; do
        [ $width -lt ${#str} ] && width=${#str}
    done
}
We store the result in a global variable width for later use in a printf format string.
Example of use:
var1="123"
var2=" 2345"
var3="123456 89"
testprint(){
    setwidth "$@"
    echo right-justified:
    printf "\055 %${width}s |\n" "$@"
    echo
    echo left-justified:
    printf "\055 %-${width}s |\n" "$@"
}
testprint "$var1" "$var2" "$var3"
giving:
right-justified:
-       123 |
-      2345 |
- 123456 89 |

left-justified:
- 123       |
-  2345     |
- 123456 89 |
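Applied to the name/address pairs from the question, the same measuring idea might look like the sketch below. The function name print_pairs and the convention that arguments alternate between name and address are assumptions made for illustration; only POSIX shell features are used.

```sh
#!/bin/sh
# Hypothetical helper: arguments alternate name, address, name, address, ...
print_pairs() {
    width=0
    n=0
    # First pass: find the longest name (odd-numbered arguments only).
    for arg in "$@"; do
        n=$((n + 1))
        if [ $((n % 2)) -eq 1 ] && [ "${#arg}" -gt "$width" ]; then
            width=${#arg}
        fi
    done
    width=$((width + 1))   # longest name plus one space
    # Second pass: printf reuses the format for every name/address pair.
    printf "%-${width}s\"%s\"\n" "$@"
}

print_pairs "IFROGZ FREE REIN 2" "00:00:00:00:B5:C8" \
            "Mi Phone" "A4:50:46:AC:32:59" \
            "realme Watch" "D8:CA:8E:CD:5D:7C"
```

printf repeats its format string until all arguments are consumed, so no explicit output loop is needed.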
For a set number of system logs that are pulled and exported, I need to indicate the script's progress by printing the character "#". This should eventually create a progress bar with a width of 60, something like the one below: ############################################# Additionally, the characters need to build from left to right as the script progresses.
The Question/Problem that this code was based off of goes as follows: "Use a separate invocation of wevtutil el to get the count of the number of logs and scale this to,say, a width of 60."
SYSNAM=$(hostname)
LOGDIR=${1:-/tmp/${SYSNAM}_logs}
i=0
LOGCOUNT=$( wevtutil el | wc -l )
x=$(( LOGCOUNT/60 ))
wevtutil el | while read ALOG
do
ALOG="${ALOG%$'\r'}"
printf "${ALOG}:\r"
SAFNAM="${ALOG// /_}"
SAFNAM="${SAFNAM//\//-}"
wevtutil epl "$ALOG" "${SYSNAM}_${SAFNAM}.evtx"
done
I've attempted methods such as echo -ne "#" and printf "#%0.s"; however, the "#" characters get printed with each instance of the log file name being retrieved, and the pattern is printed vertically rather than horizontally.
LOGCOUNT=$( wevtutil el | wc -l )
x=$(( LOGCOUNT/60 ))
echo -ne "["
for i in {1..60}
do
if [[ $(( x*i )) != $LOGCOUNT ]]
then
echo -ne "#"
#printf '#%0.s'
fi
done
echo "]"
printf "\n"
echo "Transfer Complete."
echo "Total Log Files Transferred: $LOGCOUNT"
I tried previously integrating this code into the first block but no luck. But something tells me that I don't need to establish a whole new loop, I keep thinking that the first block of code only needs a few lines of modification. Anyhow sorry for the lengthy explanation, please let me know if anything additional is needed for assistance--Thank you.
For the sake of this answer I'm going to assume the desired output is a 2-liner that looks something like:
$ statbar
file: /bin/cygdbusmenu-qt5-2.dll
[######## ]
The following may not work for everyone as it comes down to individual terminal attributes and how they can(not) be manipulated by tput (ie, ymmv) ...
For my sample script I'm going to loop through the contents of /bin, printing the name of each file as I process it, while updating the status bar with a new '#' after each 20 files:
there are 719 files under my /bin so there should be 35 #'s in my status bar (I add an extra # at the end once processing has completed)
we'll use a few tput commands to handle cursor/line movement, plus erasing previous output from a line
for printing the status bar I've pre-calculated the number of #'s and then use 2 variables ... $barspace for spaces, $barhash for #'s; for each 20 files I strip a space off $barspace and add a single # to $barhash; by (re)printing these 2x variables every 20x files I get the appearance of a moving status bar
Putting this all together:
$ cat statbar
clear # make sure we have plenty of room to display our status bar;
# if we're at the bottom of the console/window and we cause the
# windows to 'scroll up' then 'tput sc/rc' will not work
tput sc # save pointer/reference to current terminal line
erase=$(tput el) # save control code for 'erase (rest of) line'
# init some variables; get a count of the number of files so we can pre-calculate the total length of our status bar
modcount=20
filecount=$(find /bin -type f | wc -l)
# generate a string of filecount/20+1 spaces (35+1 for my particular /bin)
barspace=
for ((i=1; i<=(filecount/modcount+1); i++))
do
barspace="${barspace} "
done
barhash= # start with no #'s for this variable
filecount=0 # we'll re-use this variable to keep track of # of files processed so need to reset
while read -r filename
do
filecount=$((filecount+1))
tput rc # return cursor to previously saved terminal line (tput sc)
# print filename (1st line of output); if shorter than previous filename we need to erase rest of line
printf 'file: %s%s\n' "${filename}" "${erase}"
# print our status bar (2nd line of output) on the first and every ${modcount} pass through loop;
if [ ${filecount} -eq 1 ]
then
printf "[${barhash}${barspace}]\n"
elif [[ $((filecount % ${modcount} )) -eq 0 ]]
then
# for every ${modcount}th file we ...
barspace=${barspace:1:100000} # strip a space from barspace
barhash="${barhash}#" # add a '#' to barhash
printf "[${barhash}${barspace}]\n" # print our new status bar
fi
done < <(find /bin -type f | sort -V)
# finish up the status bar (should only be 1 space left to 'convert' to a '#')
tput rc
printf "file: -- DONE --\n"
if [ ${#barspace} -gt 0 ]
then
barspace=${barspace:1:100000}
barhash="${barhash}#"
fi
printf "[${barhash}${barspace}]\n"
NOTE: While testing I had to periodically reset my terminal in order for the tput commands to function properly, eg:
$ reset
$ statbar
I couldn't get the above to work on any of the (internet) fiddle sites (basically having problems getting tput to work with the web-based 'terminals').
Here's a gif displaying the behavior ...
NOTES:
the script does print every filename to stdout but since this script isn't actually doing anything with the files in question a) the printfs occur quite rapidly and b) the video/gif only captures a (relatively) few fleeting images ("Duh, Mark!" ?)
the last printf "file: -- DONE --\n" was added after I created the gif, and I'm being lazy by not generating and uploading a new gif
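If a single-line bar scaled to a fixed width (60 in the question) is enough, a smaller variant of the same idea is to recompute the number of #'s from the processed/total ratio and redraw the line with \r. This is only a sketch: the function name render_bar and the demo loop are made up for illustration, and the real per-item work is left as a comment.

```bash
#!/bin/bash
# Hypothetical helper: print "[###   ]" for done_count out of total, scaled to width.
render_bar() {
    local done_count=$1 total=$2 width=${3:-60}
    # Integer scaling; an empty job list is shown as a full bar.
    local filled=$(( total > 0 ? done_count * width / total : width ))
    local bar
    printf -v bar '%*s' "$filled" ''   # a string of $filled spaces ...
    bar=${bar// /#}                    # ... turned into #'s
    printf '[%-*s]' "$width" "$bar"    # left-justify inside the fixed width
}

total=30
for ((i = 1; i <= total; i++)); do
    # the real work for item $i (e.g. exporting one log) would go here
    printf '\r%s' "$(render_bar "$i" "$total")"
done
printf '\n'
```

Since the bar always has the same length, each redraw fully overwrites the previous one and no erase sequence is needed.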
I have data in a csv file. I wrote a script that cats this file and uses column -s, -t to tabularize it into neat columns:
Foosballs  Barbells  Bazketballs
22         39        14
86         94        37
17         44        28
However, I'd like to display the header row in bold. I can do that by writing the format codes directly to the file.
bold=$(tput bold)
reset=$(tput sgr0)
echo "${bold}Foosballs,Barbells,Bazketballs${reset}" > /path/to/file
This works fine with cat; the format codes are displayed correctly when I cat the file. But they screw up column -t: any colored/bolded row is no longer aligned with the other rows.
Foosballs Barbells Bazketballs
22 39 14
86 94 37
17 44 28
Is there some way to get column -t to ignore color codes when lining up data into columns? (Or is there a better way to display csv data in columns?)
UPDATE:
Applying column first and the format codes second will work, as some answers point out. But in many cases I want to apply different formats/colors to individual values in the row, not to the entire row. Here's a simple example:
echo "${underline}foo${reset} ${underline}bar${reset}"
In general, I might want to use arbitrary formatting logic that's difficult or impossible to apply post-hoc (i.e., after I've already printed the line and called column -t on it). Formatting after tabularizing (as in Charles Duffy's answer) is a great start but may not always work for me (at least, conveniently).
I could always write a utility to do this format-code-transparent tabularization myself, but then I'd have to bring that with me wherever I work. I don't want to have to know column widths in advance; I need something like column -t I can throw on the end of a pipe with arbitrary delimited data. Basically, I need a clever one-liner or a third-party util that's readily available via Homebrew or other package managers.
To sum up: For the bounty, I'm looking for a simple, (reasonably) portable method to tabularize previously-formatted data.
One mechanism to enforce alignment and inject color codes is to use printf:
printf '%s%-20s %-20s %-20s%s\n' "$bold" "Foosballs" "Barbells" "Bazketballs" "$reset"
Note that we're using %s placeholders for the color codes, and strings like %-20s (20 characters, left-aligned) for the other fields. This does mean that your code needs to be responsible for knowing the desired length for each column.
If you don't want to do that, you can postprocess:
generate_data() {
    echo "Foosballs,Barbells,Bazketballs"
    echo 22,39,14
    echo 86,94,28
    echo 17,44,28
}
bold=$(tput bold)
reset=$(tput sgr0)
generate_data | column -s, -t | {
    IFS= read -r header                     # read first line
    printf '%s\n' "${bold}$header${reset}"  # write first line in bold
    cat                                     # pass rest of stream through unmodified
}
Or, to color just one column:
generate_data() { printf '%s\n' "Foosballs,Barbells,Bazketballs" 22,39,14 86,94,28 17,44,28; }
color_column() {
    gawk -v column_nr="$1" -v color_start="$2" -v color_end="$3" '
        BEGIN { FPAT = "([[:space:]]*[^[:space:]]+)"; }
        { $column_nr = color_start $column_nr color_end; print $0 }
    '
}
generate_data | column -s, -t | color_column 2 "$(tput bold)" "$(tput sgr0)"
For this test file:
$ cat file
Foosballs,Barbells,Bazketballs
22,39,14
86,94,28
The simple way:
d='\e[0m' #default env
r='\e[31m' #red color
printf "$r"; column -s, -t file; printf "$d"
More complicated with different color for each column:
s=',' #delimiter
d='\\e[0m' #default env
r='\\e[31m' #red color
g='\\e[32m' #green color
b='\\e[34m' #blue color
echo -e "$(
awk -F $s \
-v s="$s" \
-v d="$d" \
-v r="$r" \
-v g="$g" \
-v b="$b" \
'{ print r $1 s g $2 s b $3 d }' file | column -s$s -t
)"
And to make header bold, just add this code \e[1m to the echo command like this:
...
B='\e[1m'
echo -e "$B$(
awk -F "$s" \
...
)"
How can I convert one date format to another format in a shell script?
Example:
the old format is
MM-DD-YY HH:MM
but I want to convert it into
YYYYMMDD.HHMM
Like "20${D:6:2}${D:0:2}${D:3:2}.${D:9:2}${D:12:2}00", if the old date is in the $D variable.
Take advantage of the shell's word splitting and the positional parameters:
date="12-31-11 23:59"
IFS=" -:"
set -- $date
echo "20$3$1$2.$4$5" #=> 20111231.2359
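Since the snippet above changes IFS and the positional parameters for the rest of the script, it may be worth wrapping it in a function so that both stay local. A minimal sketch follows. The function name convert_date is made up, and like the original it assumes all years fall in 2000-2099.

```bash
#!/bin/bash
# Hypothetical wrapper around the word-splitting trick.
# Input:  "MM-DD-YY HH:MM"     Output: YYYYMMDD.HHMM
convert_date() {
    local IFS=' -:'   # split on space, dash and colon, only inside this function
    set -- $1         # intentionally unquoted: splits into MM DD YY HH MM
    printf '20%s%s%s.%s%s\n' "$3" "$1" "$2" "$4" "$5"
}

convert_date "12-31-11 23:59"   # prints 20111231.2359
```

Inside a function, set -- only replaces the function's own positional parameters, so the caller's arguments and IFS are left untouched.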
myDate="21-12-11 23:59"
#fmt is DD-MM-YY HH:MM
outDate="20${myDate:6:2}${myDate:3:2}${myDate:0:2}.${myDate:9:2}${myDate:12:2}00"
case "${outDate}" in
    2[0-9][0-9][0-9][0-1][0-9][0-3][0-9].[0-2][0-9][0-5][0-9][0-5][0-9] )
        : nothing_date_in_correct_format
        ;;
    * ) echo bad format for ${outDate} >&2
        ;;
esac
Note that if you have a large file to process, then the above is an expensive(ish) process. For filebased data I would recommend something like
cat infile
....|....|21-12-11 23:59|22-12-11 00:01| ...|
awk '
function reformatDate(inDate) {
if (inDate !~ /[0-3][0-9]-[0-1][0-9]-[0-9][0-9] [0-2][0-9]:[0-5][0-9]/) {
print "bad date format found in inDate= "inDate
return -1
}
# in format assumed to be DD-MM-YY HH:MM(:SS)
return (2000 + substr(inDate,7,2) ) substr(inDate,4,2) substr(inDate, 1,2) \
"." substr(inDate,10,2) substr(inDate,13,2) \
( substr(inDate,16,2) ? substr(inDate,16,2) : "00" )
}
BEGIN {
#add or comment out for each column of data that is a date value to convert
# below is for example, edit as needed.
dateCols[3]=3
dateCols[4]=4
# for awk people, I call this the pragmatic use of associative arrays ;-)
#assuming pipe-delimited data for columns
#....|....|21-12-11 23:59|22-12-11 00:01| ...|
FS=OFS="|"
}
# main loop for each record
{
for (i=1; i<=NF; i++) {
if (i in dateCols) {
#dbg print "i=" i "\t$i=" $i
$i=reformatDate($i)
}
}
print $0
}' infile
output
....|....|20111221.235900|20111222.000100| ...|
I hope this helps.
There is a good answer down already, but you said you wanted an alternative in the comments, so here is my [rather awful in comparison] method:
read sourcedate < <(echo "12-13-99 23:59");
read sourceyear < <(echo $sourcedate | cut -c 7-8);
if [[ $sourceyear < 50 ]]; then
read fullsourceyear < <(echo -n 20; echo $sourceyear);
else
read fullsourceyear < <(echo -n 19; echo $sourceyear);
fi;
read newsourcedate < <(echo -n $fullsourceyear; echo -n "-"; echo -n $sourcedate | cut -c -5);
read newsourcedate < <(echo -n $newsourcedate; echo -n $sourcedate | cut -c 9-14);
read newsourcedate < <(echo -n $newsourcedate; echo :00);
date --date="$newsourcedate" +%Y%m%d.%H%M%S
So, the first line just reads a date in, then we get the two-digit year, then we append it to '20' or '19' based on if it's less than 50 (so this would give you years from 1950 to 2049 - feel free to shift the line). Then we append a hyphen and the month and date. Then we append a space and the time, and lastly we append ':00' as the seconds (again feel free to make your own default). Lastly we use GNU date to read it in (since it's been standardized now) and print it in a different format (which you can edit).
It's a lot longer and uglier than cutting up the string, but having the format in the last line may be worth it. Also you could shorten it significantly with the shorthand you just learned in the first answer.
Good luck.