format in TCL is not working correctly - format

format in Tcl is not working correctly. I am trying to format some text, write it to a file, and then send that file as mail to a user.
The format looks correct on Linux, but when the mail reaches the user the format is not proper.
code-
puts [format {%-50s%-170s%-50s%-50s} "Test_Id" "Test_Description" "Test_Ran_Count" "Test_Result"]
puts [format {%-50s%-170s%-50s%-50s} $test_id1 "$mail_desc1" $loop_count $test_result]
puts [format {%-50s%-170s%-50s%-50s} $test_id "$mail_desc" $loop_count $test_result]
output-
Test_Id Test_Description Test_Ran_Count Test_Result
test_id_1 To execute - test_id_1 10 PASS
test_id_2 To execute - test_id_2 10 PASS
Here, if test_id_2 is long, the entire format shifts. As I understand format's behavior, if test_id is less than 50 characters it should not shift the other columns, since I am giving %-50s for test_id.

Format fields (which Tcl borrows from C's sprintf() with very few changes) are a bit tricky. When you use %50s (or %-50s — the - just affects alignment) you are setting a minimum field width. To set a maximum field width, you might use %.50s. Or, more likely in your case, you'll set minimum, maximum and alignment: %-50.50s.
Demonstrating with some narrower fields and simple strings of different lengths:
foreach str {abc defgh ijklmnop} {
puts [format ">%-5.5s< |%-5s| /%-.5s/" $str $str $str]
}
Which produces this output:
>abc  < |abc  | /abc/
>defgh< |defgh| /defgh/
>ijklm< |ijklmnop| /ijklm/
As you can see, left aligned, completely fixed width requires giving %-N.Ns (for some N).
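The same minimum/maximum distinction can be checked outside Tcl. As a minimal sketch in Python, whose % operator follows the same sprintf conventions, the hypothetical helper fixed_width below (not part of the original answer) pads and truncates a value to exactly n characters:

```python
def fixed_width(value, n):
    """Left-align value in a field of exactly n characters.

    Hypothetical helper for illustration: in "%-*.*s" the first * is the
    minimum width and the second * is the maximum width (precision).
    """
    return "%-*.*s" % (n, n, value)


for s in ("abc", "defgh", "ijklmnop"):
    # Columns: minimum and maximum, minimum only, maximum only.
    print(">%s< |%-5s| /%.5s/" % (fixed_width(s, 5), s, s))
```

With both widths set, a long value is cut off instead of pushing the following columns to the right.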

Related

Pad Independently Missing Columns per Row in CSV with Bash (based off expected values)

I have a CSV file in which the ideal format for a row is this:
taxID#, scientific name, kingdom, k, phylum, p, class, c, order, o, family, f, genus, g
...where kingdom, phylum, etc. are identifiers, literals ("kingdom", ... "phylum"), and the values that follow the identifiers (k, p, etc.) are the actual values for those kingdoms, phyla, etc.
Example:
240395,Rugosa emeljanovi,kingdom,Metazoa,phylum,Chordata,class,Amphibia,order,Anura,family,Ranidae,genus,Rugosa
However, not all rows possess all levels of taxonomy, i.e. any one row might be missing the columns for an identifier/value pair, say, "class, c," and any 2-column PAIR can be missing independently of the other pairs missing or not. Also, if fields are missing, they will always be missing with their identifier field, so I'd never get "kingdom, phylum" together without the value for "k" between them. Thus much of my file is missing random fields:
...
135487,Nocardia cyriacigeorgica,class,Actinobacteria,order,Corynebacteriales,genus,Nocardia
10090,Mus musculus,kingdom,Metazoa,phylum,Chordata,class,Mammalia,order,Rodentia,family,Muridae,genus,Mus
152507,uncultured actinobacterium,phylum,Actinobacteria,class,Actinobacteria
171953,uncultured Acidobacteria bacterium,phylum,Acidobacteria
77133,uncultured bacterium
...
Question: How can I write a bash shell script that can "pad" every row in a file so that every field pair that may be missing from my ideal format is inserted, and its value column that follows is just blank. Desired output:
...
135487,Nocardia cyriacigeorgica,kingdom,,phylum,,class,Actinobacteria,order,Corynebacteriales,family,,genus,Nocardia
10090,Mus musculus,kingdom,Metazoa,phylum,Chordata,class,Mammalia,order,Rodentia,family,Muridae,genus,Mus
152507,uncultured actinobacterium,kingdom,,phylum,Actinobacteria,class,Actinobacteria,order,,family,,genus,
171953,uncultured Acidobacteria bacterium,kingdom,,phylum,Acidobacteria,class,,order,,family,,genus,
77133,uncultured bacterium,kingdom,,phylum,,class,,order,,family,,genus,
...
Notes:
Notice if a genus was missing, the padded output should end with a comma to denote the value of genus doesn't exist.
taxID# and scientific name (the first two fields) will ALWAYS be present.
I don't care for time/resource efficiency if your solution is brute-forcey.
What I've tried:
I wrote a simple if/then script that checks sequentially if an expected field is gone. pseudocode:
if "$f3" is not "kingdom", pad
but the problem is that if kingdom was truly missing, it will get padded in output but the remaining field variables will be goofed up and I can't just follow that by saying
if "$f5" is not "phylum", pad
because if kingdom were missing, phylum would probably now be in field 3 ($f3), not $f5, that is, if it too weren't missing. (I did this by concatenating onto a string variable the expected output based on the absence of each field, and simply concatenating the original value if the field wasn't missing, and then echoing the finished, supposedly padded row to output).
I'd like to be able to execute my script like this
bash pad.sh prePadding.csv postPadding.csv
but I would accept answers using Mac Excel 2011 if needed.
Thank you!!
Although it should be possible in bash, I would use Perl for this. I tried to make the code as simple to understand as I could.
#!/usr/bin/perl
use strict;
use warnings;

while (<>) {
    chomp;
    my @fields = split ',';
    # Start with every rank empty; fill in the ones present on this line.
    my $kingdom = '';
    my $phylum  = '';
    my $class   = '';
    my $order   = '';
    my $family  = '';
    my $genus   = '';
    # Fields 0 and 1 are the ID and name; the rest come in identifier/value pairs.
    for (my $i = 2; $i < $#fields; $i += 2) {
        if ($fields[$i] eq 'kingdom') { $kingdom = $fields[$i+1]; }
        if ($fields[$i] eq 'phylum')  { $phylum  = $fields[$i+1]; }
        if ($fields[$i] eq 'class')   { $class   = $fields[$i+1]; }
        if ($fields[$i] eq 'order')   { $order   = $fields[$i+1]; }
        if ($fields[$i] eq 'family')  { $family  = $fields[$i+1]; }
        if ($fields[$i] eq 'genus')   { $genus   = $fields[$i+1]; }
    }
    print "$fields[0],$fields[1],kingdom,$kingdom,phylum,$phylum,class,$class,order,$order,family,$family,genus,$genus\n";
}
Which gives me:
perl pad.pl input
135487,Nocardia cyriacigeorgica,kingdom,,phylum,,class,Actinobacteria,order,Corynebacteriales,family,,genus,Nocardia
10090,Mus musculus,kingdom,Metazoa,phylum,Chordata,class,Mammalia,order,Rodentia,family,Muridae,genus,Mus
152507,uncultured actinobacterium,kingdom,,phylum,Actinobacteria,class,Actinobacteria,order,,family,,genus,
171953,uncultured Acidobacteria bacterium,kingdom,,phylum,Acidobacteria,class,,order,,family,,genus,
(or for better reading:)
perl pad.pl input | tableize -t | sed 's/^/ /'
+------+----------------------------------+-------+-------+------+--------------+-----+--------------+-----+-----------------+------+-------+-----+--------+
|135487|Nocardia cyriacigeorgica |kingdom| |phylum| |class|Actinobacteria|order|Corynebacteriales|family| |genus|Nocardia|
+------+----------------------------------+-------+-------+------+--------------+-----+--------------+-----+-----------------+------+-------+-----+--------+
|10090 |Mus musculus |kingdom|Metazoa|phylum|Chordata |class|Mammalia |order|Rodentia |family|Muridae|genus|Mus |
+------+----------------------------------+-------+-------+------+--------------+-----+--------------+-----+-----------------+------+-------+-----+--------+
|152507|uncultured actinobacterium |kingdom| |phylum|Actinobacteria|class|Actinobacteria|order| |family| |genus| |
+------+----------------------------------+-------+-------+------+--------------+-----+--------------+-----+-----------------+------+-------+-----+--------+
|171953|uncultured Acidobacteria bacterium|kingdom| |phylum|Acidobacteria |class| |order| |family| |genus| |
+------+----------------------------------+-------+-------+------+--------------+-----+--------------+-----+-----------------+------+-------+-----+--------+
This would be the answer in bash using associative arrays:
#!/bin/bash
declare -A THIS
while IFS=, read -r -a LINE; do
    # we always get the #ID and name
    if (( ${#LINE[@]} < 2 || ${#LINE[@]} % 2 )); then
        echo Invalid CSV line: "${LINE[@]}" >&2
        continue
    fi
    echo -n "${LINE[0]},${LINE[1]},"
    # reset the identifier -> value map for this line
    THIS=()
    for (( INDEX=2; INDEX < ${#LINE[@]}; INDEX+=2 )); do
        THIS[${LINE[INDEX]}]=${LINE[INDEX+1]}
    done
    for KEY in kingdom phylum class order family; do
        echo -n "$KEY,${THIS[$KEY]},"
    done
    echo "genus,${THIS[genus]}"
done <"$1" >"$2"
It also validates CSV lines so that they contain at least 2 columns (ID and name) and that they have an even number of columns.
The script can be extended to do more error checking (e.g. whether both arguments are passed, whether the input exists, etc.), but it should work as expected when invoked the way you posted it.
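For comparison, the same "collect the pairs into a map, then print every expected rank" idea can be sketched in Python. This is only an illustration under the assumptions stated in the question (the first two fields are always present and the remaining fields come in identifier/value pairs); the names RANKS, pad_row and pad_csv are made up for this sketch.

```python
import csv
import io

# Expected identifier order, taken from the ideal row format in the question.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus"]


def pad_row(fields):
    """Return the row with every rank present, using '' for missing values."""
    taxid, name, rest = fields[0], fields[1], fields[2:]
    # rest[0::2] are the identifiers, rest[1::2] the values that follow them.
    found = dict(zip(rest[0::2], rest[1::2]))
    padded = [taxid, name]
    for rank in RANKS:
        padded += [rank, found.get(rank, "")]
    return padded


def pad_csv(text):
    """Pad every non-empty row of CSV text and return the padded CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for fields in csv.reader(io.StringIO(text)):
        if fields:
            writer.writerow(pad_row(fields))
    return out.getvalue()


sample = (
    "152507,uncultured actinobacterium,phylum,Actinobacteria,class,Actinobacteria\n"
    "77133,uncultured bacterium\n"
)
print(pad_csv(sample), end="")
```

Looking the ranks up by name avoids the positional problem described in the question, where one missing pair shifts every later field.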

Ruby pack and Latin (high-ASCII) characters

An action outputs a fixed-length string via Ruby's pack function
clean = [edc_unico, sequenza_sede, cliente_id.to_s, nome, indirizzo, cap, comune, provincia, persona, note, telefono, email]
string = clean.pack('A15A5A6A40A35A5A30A2A40A40A18A25')
However, the data is in UTF-8 to allow Latin/high-ASCII characters. The result of the pack action is logical: each high-ASCII character takes the space of 2 regular ASCII characters, so the field is padded with one space fewer for each such character, defeating the original purpose.
What would be a concise ruby command to interpret high-ascii characters and thus add an extra space at the end of each variable for each high-ascii character, so that the length can be brought to its proper target? (note: I am assuming there is no directive that addresses this specifically, and the whole lot of pack directives is mind-muddling)
update an example where the second line shifts positions based on accented characters
CNFrigo 539 Via Privata Da Via Iseo 6C 20098San Giuliano Milanese MI02 98282410 02 98287686 12886480156 12886480156 Bo3 Euro Giuseppe Frigo Transport 349 2803433 M.Gianoli#Delanchy.Fr S.Galliard#Delanchy.Fr
CNIn's M 497 Via Istituto S.Maria della Pietà, 30173Venezia Ve041 8690111 340 6311408 0041 5136113 00115180283 02896940273 B60Fm Euro Per Documentazioni Tecniche Inviare Materiale A : Silvia_Scarpa#Insmercato.It Amministrazione : Michela_Bianco#Insmercato.It Silvia Scarpa Per Liberatorie 041/5136171 Sig.Ra Bianco Per Pagamento Fatture 041/5136111 (Solo Il Giovedi Pomeriggio Dalle 14 All Beniservizi.Insmercato#Pec.Gruppopam.It
It looks like you are trying to use pack to format strings to fixed width columns for display. That’s not what it’s for, it is generally used for packing data into fixed byte structures for things like network protocols.
You probably want to use a format string instead, which is better suited for manipulating data for display.
Have a look at String#% (i.e. the % method on string). Like pack it uses another little language which is defined in Kernel#sprintf.
Taking a simplified example, with the two arrays:
plain = ["Iseo", "Next field"]
accent = ["Pietà", "Next field"]
then using pack like this:
puts plain.pack("A10A10")
puts accent.pack("A10A10")
will produce a result that looks like this, where “Next field” isn’t aligned since pack is dealing with the width in bytes, not the displayed width:
Iseo      Next field
Pietà    Next field
Using a format string, like this:
puts "%-10s%-10s" % plain
puts "%-10s%-10s" % accent
produces the desired result, since it is dealing with the displayable width:
Iseo      Next field
Pietà     Next field
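The byte-width versus character-width difference is not specific to Ruby. As a rough sketch in Python (the variable names are chosen only for this illustration), padding the encoded bytes reproduces the misalignment, while padding the character string does not:

```python
plain = "Iseo"
accent = "Piet\u00e0"  # "Pietà", written with an escape so the source stays ASCII

# Character count versus UTF-8 byte count: the accented letter needs 2 bytes.
print(len(plain), len(plain.encode("utf-8")))
print(len(accent), len(accent.encode("utf-8")))

# Padding by characters keeps the next column at the same position.
by_chars = "%-10s%-10s" % (accent, "Next field")

# Padding the encoded bytes to 10 bytes yields one space too few on screen.
by_bytes = accent.encode("utf-8").ljust(10).decode("utf-8") + "Next field"

print(by_chars)
print(by_bytes)
```

Counting characters is still only an approximation of display width; combining marks or double-width characters would need further handling, which this sketch does not attempt.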

SAS format procedure, invalue statement ,UPCASE option does not work

I need to create a SAS informat that will change all case versions of 'Male' and 'Female' to digits.
I found in the documentation that there is an UPCASE option that does the job: "converts all raw data values to uppercase before they are compared to the possible ranges. If you use UPCASE, then make sure the values or ranges you specify are in uppercase".
Unfortunately, after adding the UPCASE option none of the input values is read properly.
The SAS version is 9.2.
My code is below.
options fmtsearch=(WORK);
proc format lib=WORK;
invalue gender UPCASE
MALE = 1
FEMALE = 2
;run;
data _null_;
q='MALE';
x=input(q,gender.);
put q=;
put x=;
run;
The log is:
NOTE: Invalid argument to function INPUT at line 186 column 7.
q=MALE
x=.
q=MALE x=. _ERROR_=1 _N_=1
What is the proper usage of this option?
Very simple, just put UPCASE inside brackets...

Visual Works smalltalk, how to convert Ascii values to characters

Using VisualWorks Smalltalk, I'm receiving a string like '31323334' from a network connection.
I need a string that reads '1234' so I need a way of extracting two characters at a time, converting them to what they represent in ascii, and then building a string of them...
Is there a way to do so?
EDIT(7/24): for some reason many of you are assuming I will only be working with numbers and could just truncate 3s or read every other char. This is not the case, examples of strings read could include any keys on the US standard keyboard (a-z, A-Z,0-9,punctuation/annotation such as {}*&^%$...)
Following along the lines of what Max started to suggest:
x := '31323334'.
in := ReadStream on: x.
out := WriteStream on: String new.
[ in atEnd ] whileFalse: [ out nextPut: (in next digitValue * 16 + (in next digitValue)) asCharacter ].
newX := out contents.
newX will have the result '1234'. Or, if you start with:
x := '454647'
You will get a result of 'EFG'.
Note that digitValue might only recognize upper case hex digits, so an asUppercase may be needed on the string before processing.
There is usually a #fold: or #reduce: method that will let you do that. In Pharo there's also a message #allPairsDo: and #groupsOf:atATimeCollect:. Using one of these methods you could do:
| collectionOfBytes |
collectionOfBytes := '9798'
groupsOf: 2
atATimeCollect: [ :group |
(group first digitValue * 10) + (group second digitValue) ].
collectionOfBytes asByteArray asString "--> 'ab'"
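The #digitValue message in Pharo simply returns the value of the digit for numerical characters. Note that this example reads each pair as a decimal number; for hexadecimal input such as '31323334', multiply the first digit by 16 instead of 10.
If you're receiving the data on a stream you could replace #groupsOf:atATimeCollect: with a loop (result may be any collection that you then convert to a string like above):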
...
[ stream atEnd ] whileFalse: [
result add: (stream next digitValue * 10) + (stream next digitValue) ]
...
In Smalltalk/X, there is a method called "fromHexString:" which the ByteArray class understands. I am not sure, but I think that something similar exists in other ST dialects.
If present, you can solve this with:
(ByteArray fromHexString:'68656C6C6F31323334') asString
and the reverse would be:
'hello1234' asByteArray hexPrintString
Another possible solution is to read the string as a hex number,
fetch the digitBytes (which should give you a byte array) and then convert that to a string.
I.e.
(Integer readFrom:'68656C6C6F31323334' radix:16)
digitBytes asString
One problem with that is that I am not sure in which byte order you will get the digitBytes (LSB or MSB first), and whether that is defined to be the same across architectures or converted at image loading time to use the native order. So it may be required to reverse the string at the end (to be portable, it may even be required to reverse it conditionally, depending on the endianness of the system).
I cannot test this on VisualWorks, but I assume it should work fine there, too.
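To make the intended transformation concrete independently of any Smalltalk dialect, here is a small Python sketch of the same decoding: each pair of hex digits becomes one byte, and the bytes are read as ASCII. The function names are invented for this illustration.

```python
def decode_hex_ascii(hex_text):
    """Turn pairs of hex digits into the ASCII characters they encode."""
    return bytes.fromhex(hex_text).decode("ascii")


def decode_hex_ascii_by_pairs(hex_text):
    """Same result, spelled out two characters at a time like the stream loop."""
    chars = []
    for i in range(0, len(hex_text), 2):
        # int(..., 16) accepts both upper and lower case hex digits.
        chars.append(chr(int(hex_text[i:i + 2], 16)))
    return "".join(chars)


print(decode_hex_ascii("31323334"))
print(decode_hex_ascii_by_pairs("454647"))
```

Both versions assume an even number of hex digits and pure ASCII content, as in the examples from the question.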

Automatically increment filename VideoWriter MATLAB

I have MATLAB set to record three webcams at the same time. I want to capture and save each feed to a file and automatically increment the file name, so that experiment_0001.avi is followed by experiment_0002.avi, etc.
My code looks like this at the moment
set(vid1,'LoggingMode','disk');
set(vid2,'LoggingMode','disk');
avi1 = VideoWriter('X:\ABC\Data Collection\Presentations\Correct\ExperimentA_002.AVI');
avi2 = VideoWriter('X:\ABC\Data Collection\Presentations\Correct\ExperimentB_002.AVI');
set(vid1,'DiskLogger',avi1);
set(vid2,'DiskLogger',avi2);
and I am incrementing the 002 each time.
Any thoughts on how to implement this efficiently?
Thanks.
Don't forget that MATLAB has some roots in the C programming language. That means things like sprintf will work.
Since you are printing an integer zero-padded to 3 digits, you need something like sprintf('%03d',n). The % means there is a value to print that isn't literal text, 0 means zero-pad on the left, 3 means pad to 3 digits, and d means the value is an integer.
Just use sprintf wherever you would use a string. The name stands for "string print formatted", so it returns a string. Here is an idea of what you might do:
set(vid1,'LoggingMode','disk');
set(vid2,'LoggingMode','disk');
outDir = 'X:\ABC\Data Collection\Presentations\Correct';
for n = 1:max_num_captures
    % Format only the file name: sprintf would treat the backslashes
    % in the folder path as escape sequences.
    avi1 = VideoWriter(fullfile(outDir, sprintf('ExperimentA_%03d.AVI', n)));
    avi2 = VideoWriter(fullfile(outDir, sprintf('ExperimentB_%03d.AVI', n)));
    set(vid1,'DiskLogger',avi1);
    set(vid2,'DiskLogger',avi2);
end
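The zero-padding itself is plain sprintf behavior and can be tried in any language with printf-style formatting. A minimal Python sketch follows; numbered_name is a hypothetical helper, and the folder is the one from the question:

```python
from pathlib import PureWindowsPath


def numbered_name(folder, prefix, n, width=3):
    """Build a path such as ExperimentA_002.AVI inside folder.

    "%0*d" zero-pads n on the left to width digits.
    """
    return str(PureWindowsPath(folder) / ("%s_%0*d.AVI" % (prefix, width, n)))


folder = r"X:\ABC\Data Collection\Presentations\Correct"
for n in range(1, 4):
    print(numbered_name(folder, "ExperimentA", n))
```

Passing width=4 would give the four-digit numbering (0001, 0002, ...) mentioned at the start of the question.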

Resources