Configure larger limit of columns on Format module - format

The Format module
The Format module is used to model and combine pretty printers with a syntactic extension that allows typed formats and it helps a lot when you are writing something like a code generator or a data structure printer.
The problem
However, there is a limit of 78 columns that is initialized on the margin of the formatter and will pull to the left anything that takes more than this limit.
I'm printing a lighter version of a Yojson.Basic.json program using the Format module, but when the input is too large, the output is collapsed, and that is not really "prettily".
Preview
Here is how it is is formatted when it is short:
Here is how it is formatted when the indentation becomes too large
I've been trying to exceed and configure this limit to 120 columns, but didn't have any success.
What have I tried?
Using Format.pp_set_margin ppf 120 to reconfigure
Using Format.pp_set_max_indent to a larger value
But they doesn't seem to have any effect and there is no documentation easily available about this limit. I've discovered it only by reading the source code.
What am I doing?
let string_of_cst program =
let ppf = Format.str_formatter in
(* I've enabled colors. *)
Format.pp_set_tags ppf colors;
Format.pp_set_formatter_tag_functions ppf with_colors;
(* [print_json] is my printer. *)
print_json ppf program;
(* Get string out of printer. *)
Format.flush_str_formatter ()
How can I configure a larger limit?

The issue is that the values for margin and max_indent are implicitly constrained to the cone 1 < max_indent < margin and the function set_max_indent silently fails and does nothing if this constraint is not respected.
To avoid this issue, in OCaml ≥4.08, it would be possible to use the new set_geometry function that requires to set both value simultaneously and fails with an exception if the required max_indent is greater than the margin.
Otherwise, you should always set both values at the same time, and always in the order
margin first, and max_indent second. If you don't know which value to chose for max_indent, margin - 10 is generally an alright choice.

Related

custom array printing in gdb

I know gdb has several means of exploring data, some of them quite convenient. However, I cannot combine them to get that I need/want. I would like to display some custom string based on the first n values of a big array starting at <PT_arr>, and the last m values of the same array at a distance (in this case) 4096. Looking something like this:
table beginning:
0x804cfe0 <PT_arr>: 0x00100300 0x00200300 0x00300300 0x00400300
table end:
0x804cfe0 <PT_arr+4064>: 0x00500300 0x00600300 0x00700300 0x00800300
printf let's me add custom text (like table beginning)
the examine x gives me that nice alignment, let's me read many elements and group them by byte, words, etc; and shows addresses at the left (which is ideal for my case).
x aligns the content of regions of memory in an easy to read manner with the size and unit parameters. (what I want)
display is constantly printing. (what I want).
The issue with display (manual), is that unlike examine x (manual) it doesn't have a size or unit parameter.
Is there a way to accomplish that?
Thanks.

Other options to resize barcode for zebra printer using ZPL?

I want to print a Code 128 barcode with a Zebra printer. But I just can't get exactly where I want because the barcode is either too small or too big for the label size of 40x20 mm. Is there anything else I can try besides using the ^BY (Bar Code Field Default) module width and ratio?
^XA^PQ2^LH0,0^FS
^MUM
^GB40,20,0.1,B^FS
^FO1.5,4
^BY0.2
^BCN,10,N,N
^FD*030493LEJCG002999*^FS
^FO8,15
^A0N,3,3
^FD*030493LEJCG002830*^FS
^MUD
^XZ
Above script gives me a label that looks like this:
But when I just decrease the module width to 0.1 (which is the lowest) the barcode becomes too small and may be problematic to scan with a hand scanner:
Code-128 is a fixed-ratio code, so you would appear to have the choice of two sizes. You may be able to solve the problem by using a 300dpi printer in place of a 200.
If you can change the format (and I'm intrigued by the barcode and readable being different values) then you could save a little by printing one number-sequence and one alpha-sequence, as an even count of numerics will be encoded as alphabet C so you'd save one change-alphabet element.
Do you really need the * on each end?
Otherwise, perhaps code 39 (which prints the * if you use the print-interpretation-line option) would suit your purposes better.
Another Possibility is to do on the fly code-set changes, Try something like
^XA^PQ2^LH0,0^FS
^MUM
^GB60,20,0.1,B^FS
^FO1.5,4
^BY0.2
^BCN,10,N,N
^FD>:*>5030493>6LEJCG>5002830>6*^FS
^FO8,15
^A0N,3,3
^FD*030493LEJCG002830*^FS
^MUD
^XZ
This will allow less symbols to encode your data
If you can structure content to have all the alpha chars a one end or the other.
or (Depending on your firmware) you could use auto ^BCN,10,N,N,N,A

Save a figure to file with specific resolution

In an old version of my code, I used to do a hardcopy() with a given resolution, ie:
frame = hardcopy(figHandle, ['-d' renderer], ['-r' num2str(round(pixelsperinch))]);
For reference, hardcopy saves a figure window to file.
Then I would typically perform:
ZZ = rgb2gray(frame) < 255/2;
se = strel('disk',diskSize);
ZZ2 = imdilate(ZZ,se); %perform dilation.
Surface = bwarea(ZZ2); %get estimated surface (in pixels)
This worked until I switched to Matlab 2017, in which the hardcopy() function is deprecated and we are left with the print() function instead.
I am unable to extract the data from figure handler at a specific resolution using print. I've tried many things, including:
frame = print(figHandle, '-opengl', strcat('-r',num2str(round(pixelsperinch))));
But it doesn't work. How can I overcome this?
EDIT
I don't want to 'save' nor create a figure file, my aim is to extract the data from the figure in order to mesure a surface after a dilation process. I just want to keep this information and since 'im processing a LOT of different trajectories (total is approx. 1e7 trajectories), i don't want to save each file to disk (this is costly, time execution speaking). I'm running this code on a remote server (without a graphic card).
The issue I'm struggling with is: "One or more output arguments not assigned during call to "varargout"."
getframe() does not allow for setting a specific resolution (it uses current resolution instead as far as I know)
EDIT2
Ok, figured out how to do, you need to pass the '-RGBImage' argument like this:
frame = print(figHandle, ['-' renderer], ['-r' num2str(round(pixelsperinch))], '-RGBImage');
it also accept custom resolution and renderer as specified in the documentation.
I think you must specify formattype too (-dtiff in my case). I've tried this in Matlab 2016b with no problem:
print(figHandle,'-dtiff', '-opengl', '-r600', 'nameofmyfig');
EDIT:
If you need the CData just find the handle of the corresponding axes and get its CData
f = findobj('Tag','mytag')
Then depending on your matlab version use:
mycdata = get(f,'CData');
or directly
mycdta = f.CData;
EDIT 2:
You can set the tag of your image programatically and then do what I said previously:
a = imshow('peppers.png');
set(a,'Tag','mytag');

Adjusting the MAFFT command line algorithm to better account for gaps

I've been attempting to use the MAFFT command line tool as a means to identify coding regions within a genome. My general process is to align the amino acid consensus sequence of a gene to a translated reading frame of a target sequence. My method has been largely successful. However, I've noticed some peculiar alignments which will unfortunately impede my annotation method. The following is one such example (Note - I've also included a pairwise alignment from the Pairwise2 Biopython module to demonstrate my desired output. Unfortunately, the computation time for Pairwise2 is nearly 20 times slower than MAFFT command line):
from time import *
from Bio.SubsMat import MatrixInfo as matlist
from Bio import pairwise2
from Bio.pairwise2 import format_alignment
from Bio.Align.Applications import MafftCommandline
startTime = time()
sample_tList = [['>Frame 1', 'RIGVGSIPRHLYCQELPLAQPKTCCAETPFRDSPLQGRLGVCPHLASGVALLYGLSTPLTMSGILDRCTCTPNARVFMAEGQVYCTRCLSARSLLPLNLQVPELGVLGLFYRPEEPLRWTLPRAFPTVECSPAGACWLSAIFPIARMTSGNLNFQQRMVRVAAEIYRAGQLTPAVLKVLQVYERGCRWYPIVGPVPGVGVYANSLHVSDKPFPGATHVLTNLPLPQRPKPEDFCPFECAMADVYDIGHGAVMFVAGGKVSWAPRGGDEVRFETVPEELKLIANRLHISFPPHHLVDMSKFAFIVPGSGVSLRVEHQHGCLPADIVPKGNCWWCLFDLLPPGVQNREIRYANQFGYQTKHGVSGKYLQRRLQINGLRAVTDTHGPIVVQYFSVKESWIRHFRLAGEPSLPGFEDLLRIRVESNTSPLADKDEKIFRFGSHKWYGAGKRARKARSGATTTVAHRASSARETRQAKKHEGVDANNAAHLEHYSPPAEGNCGWHCISAIVNRMVNSNFETTLPERVRPSDDWATDEDFVNTIQILRLPAALDRNGACKSAKYVLKLEGEHWTVSVAPGMSPSLLPLECVQGCCEHKGGLGSPDAVEVSGFDPTCLDRLAEVMHLPSSVIPAALAEMSNNSDRPASLVNTAWTVSQFYARHTGGNHRDQVRLGKIISLCQVIEECCCHQNKTNRATPEEVAAKIDQYLRGATSLEECLIKLERVSPPSAADTSFDWNVVLPGVEAAGPTTEQPHANQCCAPVPVVTQEPLDKDSVPLTAFSLSNCYYPAQGDEVRHRERLNSVLSKLEEVVLEEYGLMPTGLGPRPVLPSGLDELKDQMEEDLLKLANAQATSEMMALAAEQVDLKAWVKSYPRWIPPPPPPKVQPRRMKPVKSLPENKPVPAPRRKVRSDPGKSILAVGGPLNFSTPSELVTPLGEPVLMPASQHVSRPVTPLSEPAPVPAPRRIVSRPMTPLSEPTFVFAPWRKSQQVEEANPAAATLTCQDEPLDLSASSQTEYEAYPLAPLENIGVLEAGGQEAEEVLSGISDILDNTNPAPVSSSSSLSSVKITRPKYSAQAIIDSGGPCSGHLQKEKEACLRIMREACDAARLGDPATQEWLSHMWDRVDVLTWRNTSVYQAFRTLDGRFGFLPKMILETPPPYPCGFVMLPHTPTPSVSAESDLTIGSVATEDVPRILGKTENTGNVLNQKPLALFEEEPVCDQPAKDSRTLSRESGDSTTAPPVGTGGAGLPTDLPPLDGVDADGGGLLRTAKGKAERFFDQLSRQVFNIVSHLPVFFSHLFKSDSGYSPGDWGFAAFTLFCLFLCYSYPFFGFAPLLGVFSGSSRRVRMGVFGCWLAFAVGLFKPVSDPVGAACEFDSPECRNILHSFELLKPWDPVRSLVVGPVGLGLAILGRLLGGARYIWHFLLRLGIVADCILAGAYVLSQGRCKKCWGSCIRTAPNEIAFNVFPFTRATRSSLIDLCDRFCAPKGMDPIFLATGWRGCWTGQSPIEQPSEKPIAFAQLDEKRITARTVVSQPYDPNQAVKCLRVLQAGGAMVAEAVPKVVKVSAIPFRAPFFPTGVKVDPECRIVVDPDTFTTALRSGYSTTNLVLGVGDFAQLNGLKIRQISKPSGGGPHLIAALHVACSMVLHMLAGVYVTAVGSCGTGTSDPWCANPFAVPGYGPGSLCTSRLCISQHGLTLPLTALVAGFGLQEIALVVLIFVSIGGMAHRLSCKADMLCILLAIASYVWVPLTWLLCVFPCWLRWFSLHPLTILWLVFFLISVNMPSGILAVVLLVSLWLLGRYTNIAGLVTPYDIHHYTSGPRGVAALATAPDGTYLAAVRRAALTGRTMLFTPSQLGSLLEGAFRTRKPSLNTVNVVGSSMGSGGVFTIDGRIKCVTAAHVLTGNSARVSGVGFNQMLDFDVKGDFAIADCPNWQGVAPKTQFCGDGWTGRAYWLTSSGVEPGVIGDGFAFCFTACGDSGSPVITEAGELVGVHTGSNKQGGGIVTRPSGQFCNVTPIKLSELSEFFAGPKVPLGDVKVGSHIIKDTSEVPSDLCALLAAKPELEGGLSTVQLLCVFFLLWRMMGHAWTPLVAVGFFILNEVLPAVLVRSVFSFGMFALSWLTPWSAQVLMIRLLTAALNRNRVSLIFYSLGAVTGFVADLATTQGHPLQAVMNLSTYAFLPRMMVVTSPVPAIACGVVHLLAIILYLFKYRCLHHVLVGDGAFSAAFFLRYFAEGKLREGVSQSCGMSHESLTGALAIKLSDEDLDFLTKWTDFKCFVSASNMRNAAGQFIEAAYAKALRIELAQLVQVDKVRGTLAKLEAFADTVAPQLSPGDIVVALGHTPVGSIFDLKVGSTKHTLQAIETRVLAGSKMTVARVVDPTPAPPPAPVPIPLPPKVLENGPNAWGGEDRLNKRKRRRMEAVGIFVMDGKKYQKFWDKNSGDVFYEEVHNSTDEWECLRAGDPADFDPETGIQCGHVTIEDKVYNVFTSPSGRRFLVPANPENRRIQWEAARLSVEQALGMMNVDGELTAKELEKLKRIIDKLQGLTKEQCLNCPPVAPAVVAAAWLLLRQRKNFTTGPSPDLTKWPVRLSRTRSSTTNIRLPNRLMVVLCSCAPLFLRLMSSPALMHLLSYLPATGRETLGLMARFGILRPRPPKRKSHLVRKYRLVTLGAVTHLKLVSLISCTLLGATLSGKEFYRIQGLETYLTEPPVTLEAQCMRLPASRPMLLRLMGVPSWPQPCPPVLSCMYRPFQRPSLIILILGLTALNSQSTVVRMLLGTSPNTICPPKALFCLEFFALCGSTCLPMWVSARPFIGLPLTLPRILWLEMGTDFQPRIFRASLKSTFCAHRLCEKTGKLLLLVPSRSSIVGRRRLGQYLALITLRWPTGQRVVLPRASKRHSTRPSPSEKTNLRNYILQFAGALKLILHPAIDPHLQLSAGSLPIFFMNSPVLKSIYRRTCLTAVTTYWLRSPARLREAACRLATRLPPCQTPFTAYMHSTWCSVTLKVVTLMAFCFCKTSSLRTCSRFNPSSIQTTSCCMPSLPPCQITTGGLNITLCVSKRTQRRQPQTRHHFVAGMGVSSLTVTGFLRPSPTIRQAMSLNTTPRRLQYLWTAVLVSMILSGLKSSWLVRSAPARTVTASQARRSSCPCGKNSGPIMKGRSPECAGTAEPRLRTPLPVASTSVLTTPISTSIVLSSGVATRRVLALVVSVNLPWEKAQVLWMRCNKSRISLRGLSCMWSRVSPLLTQVDTKLAADSPLGVASGETKLTCQTVIMPVPPCSPLVKRSTWSLSPPTCCAAGSSSVPPALGKHTGSSNRSRMVMSFTRQLTRPCLTLGLWGCAGSTSQRVRRCNSLPPLVPARGFASWPAVGVLVRIPFWTKQRIAITLMSGFLAKPPLPAEISNNSTRWVLTLIAMFLTSCLRPNRPSGDSDRISVMPSNQITGTNLCPWSTQPVPRWTNLSGMGKSSPPTTGTERTAPSLSTPVKVPHLMWLHCICPLKIHSTGNEPLLLSPGQDMQSSCMTHTGNCRACLIFLRKAHPSTSQCSVTSSSYIEITKNARLLRLAMEINSGLQTSALILSAPFVQIWKGRAPRSPKLHITWGSISHLIHSLLNSQQNSHPTGPWQPRTMKSGLIGWLPAFAPSINIAARALVQAIWWAPRCFAPQGLCHTTSQNLLGARLKCFLRQSSAPAELRIAGSTSMIGSEKLLSPSHMPSLATSKALPVGDVITSPPDTFRASFLRNQLRSGFLAPEKLQRQFAHQMCTSQILKRTSTQRPSPSAGKCWILEKSDWSGKTRRPIFNLKAAISPGINLQATPHTSEFLLILQCIWTPAWALPFATGGLLGPPIGELTSRSPLMITVPKSFCLVHTMVKCLQGTKFWRARSSRLTTQGTNTLGDLNRIQRICTSLLGMVRTGRIIMKRFGRARKGKFIRLLPPASFIFPRALSLNQLATEMKWGLCRASLTKLVNFLWMLSRNFWCPLLISSYFWPFCLASPSPAGWWSFASDWFAPRYSVRALPFTLSNYRRSYEAFLSQCQVDIPTWGVKHPLGILWHHKVSTLIDEMVSRRMYRIMEKAGQAAWKQVVSEATLSRISNLDVVAHFQHLAAIEAETYKYLASRLPMLHNLRMTGSNVTIVYNSTLNQVFAIFPTSGSRPRLHDSQQWLIAVHSSIFSSVVASCTLFVVLWLRIPMLRSVFGFRWLGAIFLLNSRITRCVRLASPGRPLLRSMNPVGLFGAGGMTDAVRTTMTNGSWFRLASAKATPVFTPGWRSCHSATRPSSIPRYLGGTVKFMLTSRTNSFAPSTTGRTPPCLAMTTFQPYFRPTTNIRSTAVIGFTNGCAPSFPLGWFMFRGFSGVRLQAMFQFKSFRHQDQHYRSIRLCCPPGHQLPVWRLAPSDGSQELSVPHGDRDTRVHHHHSQCHRELFTFFSPHAFLLPFLCFDEKGIQSGIWQCVRHRGCVCLYQLRPTCQGVHPTLLGSRSCATASFHDTDHEVGNRFSLSFCHPTGNLNVQVCWGNAPRAVTRNCFLCGVSCRSVLLCSSTPAATAALIFSFITRYVSMAQIGWQKDLTGQWRLLSFFLCLTLFPMEHSPPAIFLTRLVSLCPPPGSITGGMSVVSMRSVLWLRFASSLGLRRTACPGATLVLDTPTSFWTLRADSIVGGRPLLRKGVRLKSRVTSTSKELCLMVPWQPLPEFQRNNGVVSRRLLPHGSTKGAFGVFHYLYASDDICSKGKSRPTARASAPFDLPELCFYLRVHDIRALSEHKGRAHYGGSSCTSLGGVLSHRNLEIHHLQMPFVLARPQVHSGPCPPRRKCRGLSSDCGKPRICRPASRLHYGRHIGARVEKPRVGWQKSCTGSGKPCQICQITTASSKRERRGTASQSISCARCWVRSSPNKTSPEARDRGRKIIREARRSPIFLRLKKMSGTTSPLVSGNCVCRRSRLPLTRAPGHVPCQIQGGVTLWSLVCRRIILCASASQHHPQHDELAFFGHLGVMIGRMCGEWHLTLCLVTYSIRATVWGSLIGENHAAAIKKKKKKKK'], ['>ORF2_GP2', 'MKWGLCKASLTKLANFLWMLSRSFWCPLLISSYFWPFCLASQSPVGWWSFASDWFAPRYSVRALPFTLSNYRRSYEAFLSQCQVDIPTWGVKHPLGVLWHHKVSTLIDEMVSRRMYRIMEKAGQAAWKQVVSEATLSRISGLDVVAHFQHLAAIEAETCKYLASRLPMLHNLRLTGSNVTIVYNSTLDQVFAIFPTPGSRPKLHDFQQWLIAVHSSIFSSVAASCTLFVVLWLRIPMLRSVFGFRWLGATFLLNSW']]
ex_file = open("newTempFile112233.fasta", "w")
for items in sample_tList:
ex_file.write(items[0] + "\n")
ex_file.write(items[1] + "\n")
ex_file.close()
in_file = '.../msa_example.fasta'
mafft_exe = '/usr/local/bin/mafft'
mafft_cline = MafftCommandline(mafft_exe, input=in_file) #have to change file path
#mafft_cline = MafftCommandline(mafft_exe, input=in_file, localpair=True, lexp=-1.5, lop=0.5)
stdout, stderr = mafft_cline()
print(stdout)
test_align = AlignIO.read(io.StringIO(stdout), "fasta")
#print(test_align)
os.remove("newTempFile112233.fasta")
print('Total time = ' + str(time() - startTime))
startTime = time()
matrix = matlist.blosum62
pWise_align = pairwise2.align.localds(sample_tList[0][1], sample_tList[1][1], matrix, -6, -1)
print(format_alignment(*pWise_align[0]))
print('Total time = ' + str(time() - startTime))
I've attempted to change the MAFFT command line alignment algorithm by referencing the help document (http://mafft.cbrc.jp/alignment/software/manual/manual.html). I don't get any error messages, but the alignment output does not change. I'm unsure what adjustments need to be made. I believe that by increasing the gap extension penalty (which is zero by default), the alignment will be improved. I haven't been able to find many documentation examples where custom variables are used when using MAFFT command line on this forum or through Google search. Help is much appreciated. For reference, documentation on the Pairwise2 alignment parameters can be found here: http://biopython.org/DIST/docs/api/Bio.pairwise2-module.html
Managed to figure out a possible solution. The alignment of the example sequences provided results in a long terminal/end gap which should not be present. Changing the MAFFT alignment algorithm using localpair, lexp, and lop had no effect (causing me a good deal of confusion). However, I have noticed differences in the alignment output when each input sequence is reversed. Oddly, the only way I was able to remove the terminal/end gap was to set the lop (gap opening penalty) to a lesser amount relative to lexp (gap extension penalty). I suspect my solution is niche and may not be applicable to other similar occurrences of terminal gaps. Changing the alignment settings also likely reduces the optimal alignment.
Going forward, I plan to use an automated process to run alignments of consensus sequences to raw sequences. In the event I detect irregularities with the alignment output (specifically terminal gaps), I'll attempt to reverse the input sequences and apply custom alignment settings. I suppose if that isn't a consistent solution, I'll figure out a way to refine the alignment output directly.
For anyone curious, I used a lexp value of -1.5 and lop value of 0.5 (now included in a hashed out line in my example code).

MATLAB ConnectedComponentLabeler does not work in for loop

I am trying to get a set of binary images' eccentricity and solidity values using the regionprops function. I obtain the label matrix using the vision.ConnectedComponentLabeler function.
This is the code I have so far:
files = getFiles('images');
ecc = zeros(length(files)); %eccentricity values
sol = zeros(length(files)); %solidity values
ccl = vision.ConnectedComponentLabeler;
for i=1:length(files)
I = imread(files{i});
[L NUM] = step(ccl, I);
for j=1:NUM
L = changem(L==j, 1, j); %*
end
stats = regionprops(L, 'all');
ecc(i) = stats.Eccentricity;
sol(i) = stats.Solidity;
end
However, when I run this, I get an error says indicating the line marked with *:
Error using ConnectedComponentLabeler/step
Variable-size input signals are not supported when the OutputDataType property is set to 'Automatic'.'
I do not understand what MATLAB is talking about and I do not have any idea about how to get rid of it.
Edit
I have returned back to bwlabel function and have no problems now.
The error is a bit hard to understand, but I can explain what exactly it means. When you use the CVST Connected Components Labeller, it assumes that all of your images that you're going to use with the function are all the same size. That error happens because it looks like the images aren't... hence the notion about "Variable-size input signals".
The "Automatic" property means that the output data type of the images are automatic, meaning that you don't have to worry about whether the data type of the output is uint8, uint16, etc. If you want to remove this error, you need to manually set the output data type of the images produced by this labeller, or the OutputDataType property to be static. Hopefully, the images in the directory you're reading are all the same data type, so override this field to be a data type that this function accepts. The available types are uint8, uint16 and uint32. Therefore, assuming your images were uint8 for example, do this before you run your loop:
ccl = vision.ConnectedComponentLabeler;
ccl.OutputDataType = 'uint8';
Now run your code, and it should work. Bear in mind that the input needs to be logical for this to have any meaningful output.
Minor comment
Why are you using the CVST Connected Component Labeller when the Image Processing Toolbox bwlabel function works exactly the same way? As you are using regionprops, you have access to the Image Processing Toolbox, so this should be available to you. It's much simpler to use and requires no setup: http://www.mathworks.com/help/images/ref/bwlabel.html

Resources