right text align - bash

I have one problem.
My text should be right-aligned within a specified width. I have managed to cut the output to the desired size, but I have a problem with pushing everything to the right side.
Here is what I've got:
#!/usr/local/bin/bash
length=$1
file=$2
echo $1
echo -e "length = $length \t file = $file "
f=`fold -w$length $file > output`
while read line
do
echo "line is $line"
done < "output"
thanks

Try:
printf "%40.40s\n" "$line"
This will make it right-aligned with width 40. If you want no truncation, drop .40 (thanks Dennis!):
printf "%40s\n" "$line"
For example:
printf "%5.5s\n" abc
printf "%5.5s\n" abcdefghij
printf "%5s\n" abc
printf "%5s\n" abcdefghij
will print:
  abc
abcde
  abc
abcdefghij
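Putting this together with the fold loop from the question, a minimal sketch (using the same $length and $file arguments as the original script) might look like:
#!/usr/local/bin/bash
length=$1
file=$2
fold -w"$length" "$file" | while read -r line
do
    printf "%${length}s\n" "$line"   # right-align each folded chunk to $length columns
done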

Your final step could be (note the double quotes, so the shell expands the width):
sed -e :a -e "s/^.\{1,$((length-1))\}$/ &/;ta"
This keeps prepending a space to any line shorter than $length until it reaches that width.

This is a very old question (2010) but it's the top Google result, so I might as well add to it. Of the existing answers here, one is a guess that doesn't adjust for terminal width, and the other invokes sed, which is unnecessarily costly.
The printf solution is better as it's a bash builtin, so it won't slow things down; and instead of guessing a width, bash gives you $COLUMNS to tell you how wide the terminal window you're dealing with is.
So while you can explicitly align to, say, the 40th column:
printf "%40s\n" "$the_weather"
You can size it for whatever your terminal width is with:
printf "%$COLUMNSs\n" "$the_weather"
(since we're mixing up syntax here, we have used the full form syntax for a bash variable i.e. ${COLUMNS} instead of $COLUMNS, so that bash can identify the variable from the other syntax
In action... now that we've freed up all that sed processing time, we can use it for something else, maybe:
the_weather="$(curl -sm2 'http://wttr.in/Dublin?format=%l:+%c+%f')"
printf "%${COLUMNS}s\n" "${the_weather:-I hope the weather is nice}"


Convert multi-line csv to single line using Linux tools

I have a .csv file that contains double-quoted multi-line fields. I need to convert the multi-line cells to a single line. It doesn't show in the sample data, but I don't know which fields might be multi-line, so any solution will need to check every field. I do know how many columns I'll have. The first line will also need to be skipped. I don't know how much data there will be, so performance isn't a consideration.
I need something that I can run from a bash script on Linux. Preferably using tools such as awk or sed and not actual programming languages.
The data will be processed further with Logstash but it doesn't handle double quoted multi-line fields hence the need to do some pre-processing.
I tried something like this and it kind of works on one row but fails on multiple rows.
sed -e :0 -e '/,.*,.*,.*,.*,/b' -e N -e '1n;N;N;N;s/\n/ /g' -e b0 file.csv
CSV example
First name,Last name,Address,ZIP
John,Doe,"Country
City
Street",12345
The output I want is
First name,Last name,Address,ZIP
John,Doe,Country City Street,12345
Jane,Doe,Country City Street,67890
etc.
etc.
First my apologies for getting here 7 months late...
I came across a problem similar to yours today, with multiple multi-line fields. I was glad to find your question, but in my case there is the added complexity that, since more than one field can be multi-line, quotes might open, close and open again on the same line... Anyway, after a lot of reading and combining answers from different posts, I came up with something like this:
First I count the quotes in a line; to do that, I strip out everything but the quotes and then use wc:
quotes=`echo $line | tr -cd '"' | wc -c` # Counts the quotes
If you think of a single multi-line field, knowing whether there are 1 or 2 quotes is enough. In a more generic scenario like mine, I have to know whether the number of quotes is odd or even to know whether the line completes the record or expects more information.
To check for even or odd you can use the modulo operator (%); in general:
even % 2 = 0
odd % 2 = 1
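For example, in bash arithmetic:
echo $(( 4 % 2 ))   # even -> prints 0
echo $(( 5 % 2 ))   # odd  -> prints 1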
For the first line:
Odd means that the line expects more information on the next line.
Even means the line is complete.
For the subsequent lines, I have to know the status of the previous one. For instance, in your sample text:
First name,Last name,Address,ZIP
John,Doe,"Country
City
Street",12345
You can say line 1 (John,Doe,"Country) has 1 quote (odd), which means the status of the record is incomplete, or open.
When you go to line 2, there are no quotes (even). Nevertheless, this does not mean the record is complete; you have to consider the previous status... So for the lines following the first one it will be:
Odd means that record status toggles (incomplete to complete).
Even means that record status remains as the previous line.
What I did was loop line by line while carrying the status of the last line over to the next one:
incomplete=0
cat file.csv | while read line; do
quotes=`echo $line | tr -cd '"' | wc -c` # Counts the quotes
incomplete=$((($quotes+$incomplete)%2)) # Check if Odd or Even to decide status
if [ $incomplete -eq 1 ]; then
echo -n "$line " >> new.csv # If line is incomplete join with next
else
echo "$line" >> new.csv # If line completes the record finish
fi
done
Once this is executed on a file in your format, it generates a new.csv like this:
First name,Last name,Address,ZIP
John,Doe,"Country City Street",12345
I like one-liners as much as everyone else; I wrote that script just for the sake of clarity. You can - arguably - write it in one line like:
i=0;cat file.csv|while read l;do i=$((($(echo $l|tr -cd '"'|wc -c)+$i)%2));[[ $i = 1 ]] && echo -n "$l " || echo "$l";done >new.csv
I would appreciate it if you could go back to your example and see if this works for your case (which you most likely already solved). Hopefully this can still help someone else down the road...
Recovering the multi-line fields
Every need is different; in my case I wanted the records on one line so I could further process the csv and add some bash-extracted data, but afterwards I wanted to keep the csv as it was. To accomplish that, instead of joining the lines with a space I used a marker - likely unique - that I could then search and replace:
i=0;cat file.csv|while read l;do i=$((($(echo $l|tr -cd '"'|wc -c)+$i)%2));[[ $i = 1 ]] && echo -n "$l ~newline~ " || echo "$l";done >new.csv
The marker is ~newline~; this is totally arbitrary, of course.
Then, after doing my processing, I took the csv text file and replaced the coded newlines with real newlines:
sed -i 's/ ~newline~ /\n/g' new.csv
References:
Ternary operator: https://stackoverflow.com/a/3953666/6316852
Count char occurrences: https://stackoverflow.com/a/41119233/6316852
Other peculiar cases: https://www.linuxquestions.org/questions/programming-9/complex-bash-string-substitution-of-csv-file-with-multiline-data-937179/
TL;DR
Run this:
i=0;cat file.csv|while read l;do i=$((($(echo $l|tr -cd '"'|wc -c)+$i)%2));[[ $i = 1 ]] && echo -n "$l " || echo "$l";done >new.csv
... and collect results in new.csv
I hope it helps!
If Perl is your option, please try the following:
perl -e '
while (<>) {
$str .= $_;
}
while ($str =~ /("(("")|[^"])*")|((^|(?<=,))[^,]*((?=,)|$))/g) {
if (($el = $&) =~ /^".*"$/s) {
$el =~ s/^"//s; $el =~ s/"$//s;
$el =~ s/""/"/g;
$el =~ s/\s+(?!$)/ /g;
}
push(@ary, $el);
}
foreach (@ary) {
print /\n$/ ? "$_" : "$_,";
}' sample.csv
sample.csv:
First name,Last name,Address,ZIP
John,Doe,"Country
City
Street",12345
John,Doe,"Country
City
Street",67890
Result:
First name,Last name,Address,ZIP
John,Doe,Country City Street,12345
John,Doe,Country City Street,67890
This might work for you (GNU sed):
sed ':a;s/[^,]\+/&/4;tb;N;ba;:b;s/\n\+/ /g;s/"//g' file
Test each line to see that it contains the correct number of fields (in the example that was 4). If there are not enough fields, append the next line and repeat the test. Otherwise, replace the newline(s) by spaces and finally remove the "'s.
N.B. This may be fraught with problems such as ,'s between "'s and quoted "'s.
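For instance, a hypothetical row like the one below already has four non-comma chunks on its first physical line (the quoted comma in "Doe, Jr." splits it), so the field-count test would succeed before the quoted Address field is actually closed:
John,"Doe, Jr.","Country
City",12345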
Try cat -v file.csv. If the file was made with Excel, you might be in luck: when the newlines inside a field are a simple \n and the newline at the end of each record is \r\n (which will show up as ^M), parsing is simple.
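To illustrate, on such a file cat -v might show something along these lines (an assumed example, not output from the actual file), with ^M marking the \r at the end of each record:
First name,Last name,Address,ZIP^M
John,Doe,"Country
City
Street",12345^M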
# delete all newlines and replace the ^M with a new newline.
tr -d "\n" < file.csv| tr "\r" "\n"
# Above two steps with one command
tr "\n\r" " \n" < file.csv
When you want a space between the joined lines, use the one-command version; it leaves a leading space on every record after the first, so you need an additional step to strip it:
tr "\n\r" " \n" < file.csv | sed '2,$ s/^ //'
EDIT: @sjaak commented that this didn't work in his case.
When your broken lines also have ^M, you can still be a lucky (wo-)man.
When the broken field is always the first double-quoted field on the line and you have GNU sed 4.2.2, you can join two lines when the first line has exactly one double quote.
sed -rz ':a;s/(\n|^)([^"]*)"([^"]*)\n/\1\2"\3 /;ta' file.csv
Explanation:
-z use NUL instead of \n as the line separator, so the whole file is treated as one block
:a label for repeating the step after a successful replacement
(\n|^) match a newline or the very start of the file
([^"]*) a substring without a "
ta go back to label a and repeat
awk pattern matching works here.
The answer in one line:
awk '/,"/{ORS=" "};/",/{ORS="\n"}{print $0}' YourFile
If you'd like to drop the quotes, you could use:
awk '/,"/{ORS=" "};/",/{ORS="\n"}{print $0}' YourFile | sed 's/"//gw NewFile'
but I prefer to keep them.
To explain the code:
/Pattern/ : find the pattern in the current line.
ORS : the output record separator (what awk prints after each record).
$0 : the whole of the current line.
's/OldPattern/NewPattern/' : substitutes the first OldPattern with NewPattern.
/g : does the previous action for every occurrence of OldPattern.
/w : writes the result to NewFile.
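Applied to the sample CSV from the question, this should produce output along the lines of:
First name,Last name,Address,ZIP
John,Doe,"Country City Street",12345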

getting error while using sed function on hp unix box

I'm trying to retrieve the nth column from the "busfile" file by substituting values into "i".
The code below works fine on Red Hat Linux; when I try it on HP-UX I get the error
"sed: Function {i}{p} cannot be parsed."
Here is my code:
acList=/z/temp/busfile
i=1
temp1=`sed -n "{i}{p}" $acList`
echo $temp1
Update:
Even when I add the $ as suggested in some of the answers, I still have the same problem.
temp1=`sed -n "${i}{p}" $acList`
If you're trying to use the i variable to print a line, you need to precede it with $:
temp1=`sed -n "${i}p" $acList`
as per the following transcript:
pax> i=3
pax> echo 'a
...> b
...> c
...> d
...> e
...> f
...> g' | sed -n "${i}p"
c
In situations like this, I tend to first try the simplest solution then gradually add complexity until it fails.
The first step would be to create a four-line file (called myfile) with the words one through four:
one
two
three
four
then try various commands with it, in ever increasing complexity:
sed -n "p" myfile # Print all lines.
sed -n "3p" myfile # Print hard-coded line.
i=3 ; sed -n "${i}p" myfile # Print line with parameter.
i=3 ; x=`sed -n "${i}p" myfile` ; echo $x # Capture line with parameter.
At some point, it will hopefully "break" and you can then target your investigations in a more concentrated manner.
However, I suspect it's unnecessary here since your purported use of that command to extract a column is incorrect. If you're trying to print a column rather than a line, then awk may be a better tool for the job:
pax> i=5
pax> echo 'pax is a really great guy' | awk -vf=$i '{print $f}'
great
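If a column really is what you're after, applied to the file from the question that would be something along these lines (a hypothetical adaptation of the original script):
acList=/z/temp/busfile
i=1
temp1=`awk -v f="$i" '{print $f}' "$acList"`   # field $i of every line in busfile
echo "$temp1"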
You can use:
acList=/z/temp/busfile
i=1
temp1=`sed -n $i'p' $acList`
echo "$temp1"

Extending terminal colors to the end of line

I have a bash script which generates a motd. The problem is that, depending on some terminal settings which I am not sure about, the color sometimes extends to the end of the line. Other times it doesn't:
e.g. (screenshot where the color extends to the end of the line) vs. (screenshot where it stops at the text)
IIRC one is just the normal gnome-terminal and the other is my tmux term. So my question is: how can I get this to extend to 80 characters (or really, to the terminal width)? Of course I can pad to 80 chars, but that doesn't really solve the problem.
Here is a snip of my code which generates the motd:
TC_RESET="^[[0m"
TC_SKY="^[[0;37;44m"
TC_GRD="^[[0;30;42m"
TC_TEXT="^[[38;5;203m"
echo -n "${TC_SKY}
... lots of printing..."
echo -e "\n Welcome to Mokon's Linux! \n"
echo -n "${TC_GRD}"
nodeinfo # Just prints the info seen below...
echo ${TC_RESET}
How can I programmatically, from bash, change the terminal settings (or whatever it takes) so that the color extends to the end of the line?
Maybe use the escape sequence to clear to EOL.
For some reason (on my macOS terminal!) I only needed to specify this sequence once and then it worked for all the lines, but for completeness I list it for all of them:
TC_RESET=$'\x1B[0m'
TC_SKY=$'\x1B[0;37;44m'
TC_GRD=$'\x1B[0;30;42m'
TC_TEXT=$'\x1B[38;5;203m'
CLREOL=$'\x1B[K'
echo -n "${TC_SKY}${CLREOL}"
echo -e "\n ABC${CLREOL}\n"
echo -e "\n DEFG${CLREOL}\n"
echo -n "${TC_GRD}"
echo -e "\n ABC${CLREOL}\n"
echo -e "\n DEFG${CLREOL}\n"
echo ${TC_RESET}
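If you'd rather not hard-code the escape, the same clear-to-EOL sequence can usually be pulled from terminfo with tput, e.g.:
CLREOL=$(tput el)   # el = clr_eol capability: clear from cursor to end of line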
Padding filter
Unfortunately, you have to pad each line with the exact number of spaces to change the background color of the whole line.
As you're speaking about bash, my solution will use bashisms (it won't work under other shells, or older versions of bash).
The syntax printf -v VAR FORMAT ARGS assigns to the variable VAR the result of sprintf FORMAT ARGS. That's a bashism; under other kinds of shell, you have to replace this line with TC_SPC=$(printf "%${COLUMNS}s" '')
You may try this:
... lots of printing..."
echo -e "\n Welcome to Mokon's Linux! \n"
echo -n "${TC_GRD}"
printf -v TC_SPC "%${COLUMNS}s" ''
nodeinfo |
sed "s/$/$TC_SPC/;s/^\\(.\\{${COLUMNS}\\}\\) */\\1/" # Just prints the info seen below...
echo ${TC_RESET}
Maybe you have to ensure that $COLUMNS is correctly set:
COLUMNS=$(tput cols)
As you can see, only the output of the command filtered by sed is fully colored.
You may
use the same filter many times:
cmd1 | sed '...'
cmd2 | sed '...'
or group your commands to use only one filter:
( cmd1 ; cmd2 ) | sed '...'
But there is an issue if you try to filter output that contains formatting escapes:
(
echo $'\e[33;44;1mYellow text on blue background';
seq 1 6;
echo $'\e[0m'
) | sed "
s/$/$TC_SPC/;
s/^\\(.\\{${COLUMNS}\\}\\) */\\1/"
If the lines you have to pad contain escapes, you have to isolate them:
(
echo $'\e[33;44;1mYellow text on blue background';
seq 1 6;
echo $'\e[0m'
) | sed "
s/\$/$TC_SPC/;
s/^\\(\\(\\o33\\[[0-9;]*[a-zA-Z]\\)*\\)\\([^\o033]\\{${COLUMNS}\\}\\) */\\1\\3/
"
And finally, to be able to pad and terminate very long lines as well:
(
echo $'\e[33;44;1mYellow text on blue background';
seq 1 6;
echo "This is a very very long long looooooooooong line that contain\
more characters than the line could hold...";
echo $'\e[0m';
) | sed "
s/\$/$TC_SPC/;
s/^\\(\\(\\o33\\[[0-9;]*[a-zA-Z]\\)*\\)\\(\\([^\o033]\\{${COLUMNS}\\}\\)*\\) */\\1\\3/"
Note: this only works if the formatting escapes are located at the beginning of the line.
Try with this:
echo -e '\E[33;44m'"yellow text on blue background"; tput sgr0

Bash - extracting a string between two points

For example:
((
extract everything here, ignore the rest
))
I know how to ignore everything within, but I don't know how to do the opposite. Basically, it'll be a file and it needs to extract the data between the two points and then output it to another file. I've tried countless approaches, and all seem to tell me the indentation I'm stating doesn't exist in the file, when it does.
If somebody could point me in the right direction, I'd be grateful.
If your data are "line oriented", i.e. each marker is alone on its own line (as in the example), you can try some of the following:
function getdata() {
cat - <<EOF
before
((
extract everything here, ignore the rest
someother text
))
after
EOF
}
echo "sed - with two seds"
getdata | sed -n '/((/,/))/p' | sed '1d;$d'
echo "Another sed solution"
getdata | sed -n '1,/((/d; /))/,$d;p'
echo "With GNU sed"
getdata | gsed -n '/((/{:a;n;/))/b;p;ba}'
echo "With perl"
getdata | perl -0777 -pe "s/.*\(\(\s*\\n(.*)?\)\).*/\$1/s"
PS: yes, it looks like a dance of crazy toothpicks.
Assuming you want to extract the string inside (( and )):
VAR="abc((def))ghi"
echo "$VAR"
VAR=${VAR##*((}
VAR=${VAR%%))*}
echo "$VAR"
## cuts away the longest match from the beginning; # cuts away the shortest match from the beginning; %% cuts away the longest match at the end; % cuts away the shortest match at the end.
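A quick illustration of the difference, using a hypothetical string with two (( )) pairs:
VAR="abc((def))ghi((jkl))mno"
echo "${VAR#*((}"    # shortest prefix match removed: def))ghi((jkl))mno
echo "${VAR##*((}"   # longest prefix match removed:  jkl))mno
echo "${VAR%))*}"    # shortest suffix match removed: abc((def))ghi((jkl
echo "${VAR%%))*}"   # longest suffix match removed:  abc((def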
The file :
$ cat /tmp/l
((
extract everything here, ignore the rest
someother text
))
The script
$ awk '$1=="((" {p=1;next} $1=="))" {p=o;next} p' /tmp/l
extract everything here, ignore the rest
someother text
sed -n '/^((/,/^))/ { /^((/b; /^))/b; p }'
Brief explanation:
/^((/,/^))/: range addressing (inclusive)
{ /^((/b; /^))/b; p }: sequence of 3 commands
1. skip line with ^((
2. skip line with ^))
3. print
The line skipping is required to make the range selection exclusive.
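For the original use case (reading one file and writing the extracted block to another), it could be invoked along these lines (input.txt and output.txt being placeholder names):
sed -n '/^((/,/^))/ { /^((/b; /^))/b; p }' input.txt > output.txt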

How to left justify text in Bash

Given a text, $txt, how could I left justify it to a given width in Bash?
Example (width = 10):
If $txt=hello, I would like to print:
hello     |
If $txt=1234567890, I would like to print:
1234567890|
You can use the printf command, like this:
printf "%-10s |\n" "$txt"
The %s means to interpret the argument as a string, and the -10 tells it to left-justify it to a width of 10 (negative numbers mean left-justify, while positive numbers justify to the right). The \n is required to print a newline, since printf doesn't add one implicitly.
Note that man printf briefly describes this command, but the full format documentation can be found in the C function man page in man 3 printf.
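To reproduce the exact output asked for in the question, including the trailing |, something like this should do:
txt=hello;      printf "%-10s|\n" "$txt"   # prints: hello     |
txt=1234567890; printf "%-10s|\n" "$txt"   # prints: 1234567890|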
You can use the - flag for left justification.
Example:
[jaypal:~] printf "%10s\n" $txt
     hello
[jaypal:~] printf "%-10s\n" $txt
hello
Bash contains a printf builtin:
txt=1234567890
printf "%-10s\n" "$txt"
