I have been trying to get the correct epsilon symbol for labeling the axes in my gnuplot plots. From my understanding, TeX distinguishes between two types of epsilon symbols: ε (varepsilon) and ϵ (epsilon). The latter seems to be unavailable in gnuplot, or I am not able to find the correct way to produce it, and it is the one I want to use in my labels. Do I have to change to a particular font where it is available? I tried some of the available fonts, but nothing worked.
I have used ϵ extensively in my document, so I am reluctant to do a replace-all in the TeX source just for a single plot. Anyhow, I am also curious to know whether or not gnuplot has them both. Thanks in advance.
ϵ is Unicode U+03F5 GREEK LUNATE EPSILON SYMBOL
ε is Unicode U+03B5 GREEK SMALL LETTER EPSILON
Gnuplot is happy to work in UTF-8, so use of either or both is fine so long as you use a font that contains the corresponding glyphs.
As noted in a previous response, if you use one of gnuplot's LaTeX terminals for output (epslatex, cairolatex, tikz, context), the text is passed to LaTeX for processing, so you would use the TeX names.
I am attempting to render a series of Unicode characters onto a spritesheet. This all works quite well for most characters, including Cyrillic ones.
When using GetCharABCWidthsFloat with certain CJKV characters, however, the ABCFLOAT::abcfB member gives a value lower than expected. It does not account for underhangs or overhangs, which is the exact purpose of the ABC widths:
The B spacing is the width of the drawn portion of the character glyph.
Source: ABCFLOAT | Microsoft Docs
As you can see, the characters do not overlap left-to-right, except for the last few:
I get around this with a customizable padding option to handle such cases, but this bloats the rest of the glyphs and thus requires a larger surface:
The font being used is Arial. For the character 美, the ABC comes back as (2, 10, 2), which sums to an advance of 14 pixels, when in fact 17 pixels are needed.
I use TextOut to actually render the glyphs, but I do wonder if someone out there has experienced this and come up with a universal solution.
Using functions like GetTextExtentPoint32W or DrawTextEx to get the rectangle does not allow precise per-character placement, which is the whole point of the ABC. And some unmentioned functions only work with TrueType fonts.
I suspect that certain characters shift to a different font under certain conditions, causing the results to be inaccurate. If that is the case, is there a way to determine whether a character is unavailable in a font, knowing what Windows does automatically so I can reproduce the behaviour? That is, is there some way to determine when a character should fall back to another font, and to determine what that font should be?
I have been on this problem for quite some time, so anyone with experience with these APIs would be greatly welcomed!
From the documentation on GetCharABCWidthsFloat:
The ABC widths of the default character are used for characters outside the range of the currently selected font.
Arial contains a lot of characters, including Cyrillic, but it does not contain CJKV ideographs. Other text-related calls may give you the false impression that it does have those characters (through a default/fallback font mechanism).
Before using (maybe before getting) the ABCFLOAT, you should first check that the characters you want metrics for are within the range of the currently selected font.
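A minimal sketch of such a check, assuming a TrueType font such as Arial is selected into the DC (the helper name FontHasGlyphs is mine): GetGlyphIndicesW with GGI_MARK_NONEXISTING_GLYPHS reports code points that have no glyph in the current font, which is exactly the situation where the ABC widths silently come from the default character.

```
// Sketch: verify that every code point has a real glyph in the font
// currently selected into the DC. GetGlyphIndicesW is documented for
// TrueType fonts (Arial qualifies); error handling kept minimal.
#include <windows.h>
#include <vector>

bool FontHasGlyphs(HDC hdc, const wchar_t* text, int len)
{
    std::vector<WORD> indices(len);
    // With GGI_MARK_NONEXISTING_GLYPHS, code points missing from the
    // font come back as 0xFFFF instead of a real glyph index.
    if (GetGlyphIndicesW(hdc, text, len, indices.data(),
                         GGI_MARK_NONEXISTING_GLYPHS) == GDI_ERROR)
        return false;

    for (WORD gi : indices)
        if (gi == 0xFFFF)   // no glyph here; GDI would silently fall back
            return false;
    return true;
}
```

For 美 under Arial this should report a missing glyph, telling you that the (2, 10, 2) widths belong to the default character rather than the ideograph.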
I'm learning the basics of the FreeType API for use in OpenGL, and I'm confused about one thing. You load the font, then you load each glyph one by one into the font's glyph slot. The glyph has a number of fields, including advance, which has an x and a y field. Now, I understand that y isn't used much, but on the off chance that I am in a situation where y is used, here is what I don't understand: each character is rendered in isolation into the glyph slot, so how can the glyph know that all subsequent characters should be rendered with a specific vertical offset? What if you were to render a lot of the same character in succession? Wouldn't you end up with either a slow diagonal incline or decline in your final text block?
Historically, advance.y is mostly for vertical text, as used in Asia (FT_LOAD_VERTICAL_LAYOUT will trigger it). In the normal rendering case, you should not get non-zero values for both advance.x and advance.y at the same time.
But it is also useful to use FreeType in a more generic way. If you want to write Latin upright text along a 30° incline, you can still use the same structures: you apply (through FT_Set_Transform) the 30° rotation matrix to each glyph, but also to the advance vector, and the result will indeed have a diagonal incline, as intended!
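A minimal sketch of that idea, in the spirit of FreeType's rotated-text tutorial example (bitmap blitting and error handling are left out; the function name and the 30° angle are just for illustration):

```
// Render upright glyphs along a 30-degree baseline. FT_Set_Transform
// rotates each outline AND its advance vector, so advance.y becomes
// non-zero and the pen walks diagonally, as described above.
#include <ft2build.h>
#include FT_FREETYPE_H
#include <math.h>

void draw_inclined(FT_Face face, const char* text)
{
    const double angle = 30.0 * 3.14159265358979 / 180.0;
    FT_Matrix m;                            /* 16.16 fixed-point matrix */
    m.xx = (FT_Fixed)( cos(angle) * 0x10000L);
    m.xy = (FT_Fixed)(-sin(angle) * 0x10000L);
    m.yx = (FT_Fixed)( sin(angle) * 0x10000L);
    m.yy = (FT_Fixed)( cos(angle) * 0x10000L);

    FT_Vector pen = {0, 0};                 /* 26.6 fixed-point position */
    for (const char* p = text; *p; ++p) {
        FT_Set_Transform(face, &m, &pen);
        if (FT_Load_Char(face, (unsigned char)*p, FT_LOAD_RENDER))
            continue;                       /* skip glyphs that fail */
        /* ... blit face->glyph->bitmap at bitmap_left / bitmap_top ... */
        pen.x += face->glyph->advance.x;    /* both components move:   */
        pen.y += face->glyph->advance.y;    /* that gives the incline  */
    }
}
```

Because the same rotated advance is accumulated for every glyph, repeating one character does not drift off course: the diagonal is exactly the baseline you asked for.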
I have an application that needs to separate Japanese characters one by one from an image.
Input: an image with ONE line of Japanese text. It can have halfwidth Katakana and halfwidth numbers, as well as fullwidth Katakana, Hiragana, and numbers, and maybe halfwidth or fullwidth English characters too (let's forget about English characters for the moment).
Issue:
I can easily separate out the characters using adaptive thresholding, dilation, and erosion. But there is one big issue.
Some Japanese characters have internal vertical gaps, like 川, 体, 休, and 非, so simply looking for vertical white gaps doesn't help. Checking the width doesn't help either, because there can be fullwidth (2-byte) characters or halfwidth (1-byte) characters. I seem to need a more refined way to do this.
Any idea how I should proceed with this? Any idea is a good idea :)
Here are a couple of sample images (characters circled in red are the problematic ones):
http://imageshack.us/a/img833/3810/e31z.png
http://imageshack.us/a/img12/2395/7mqn.png
Don't expect to find one single, simple algorithm able to do what you want; be prepared to combine a handful of techniques, including, but not limited to, those you already mentioned.
My personal advice, based on previous personal experience, would be for you to take a look at template matching techniques.
Basically, this is what you'll need to do:
Select a few sample images of each symbol you want to identify to form your templates database.
Develop an algorithm to segment each individual character out of the image. I think you've accomplished that already.
Here it is important that you scale the characters and normalize their perspective so that they match the exact conditions under which the templates were generated. getPerspectiveTransform and warpPerspective might come in handy.
Compare each character against each of your templates, using cv::matchTemplate for example (see the sketch after this list).
Out of the top matches, do some fine selection using heuristics like those you mentioned yourself, namely checking for the existence of gaps in expected places and so on.
Test and retest, refining the heuristics for the closest cases until you reach the desired accuracy.
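A minimal sketch of the comparison step, assuming the segmented character and the template have already been scaled to the same size as described above (the function name matchScore is mine):

```
// Score one segmented character against one template with
// cv::matchTemplate; higher is better.
#include <opencv2/opencv.hpp>

double matchScore(const cv::Mat& character, const cv::Mat& templ)
{
    cv::Mat result;
    // Normalized correlation: 1.0 means a perfect match, and it is
    // fairly robust against uniform brightness differences.
    cv::matchTemplate(character, templ, result, cv::TM_CCORR_NORMED);

    double maxVal = 0.0;
    cv::minMaxLoc(result, nullptr, &maxVal);  // best score in the map
    return maxVal;
}
```

You would then rank the templates by this score and apply the gap heuristics only to the top few candidates.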
If you find yourself dealing with too much variety in lighting conditions, character colors, fonts, sizes, and so on, you'll realize you need a huge database to cover all the various possibilities. In this case, it might help to use some transform that is invariant to the varying conditions. For character identification, I believe skeletonization could work well. Take a look at topological skeletons and morphological skeletons, and also here for a brief example (sketched just below).
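For illustration, here is the classic morphological-skeleton loop in plain OpenCV, as one way to get a representation less sensitive to stroke weight across fonts (the input is assumed to be an 8-bit binary image with the glyph in white; morphSkeleton is a made-up name):

```
// Build the morphological skeleton from repeated erosions/openings.
#include <opencv2/opencv.hpp>

cv::Mat morphSkeleton(cv::Mat img)
{
    cv::Mat skel = cv::Mat::zeros(img.size(), CV_8UC1);
    cv::Mat elem = cv::getStructuringElement(cv::MORPH_CROSS, cv::Size(3, 3));
    cv::Mat eroded, opened, layer;

    while (cv::countNonZero(img) > 0) {
        cv::erode(img, eroded, elem);
        cv::dilate(eroded, opened, elem);   // opening of the current image
        cv::subtract(img, opened, layer);   // pixels the opening removed
        cv::bitwise_or(skel, layer, skel);  // accumulate them as skeleton
        img = eroded.clone();               // peel one layer and repeat
    }
    return skel;
}
```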
I hope OCR is what you need to do. As this link says, OpenCV doesn't support OCR. But there is another open-source project, Tesseract, which will do this. Just check if it helps.
A few more links I got from googling:
OpenCV OCR
OCR example in OpenCV
Hope this helps!
I am trying to crop out the printer marks that are at the edges of a PDF.
The path I want to take to solve this problem is as follows:
Convert the PDF into a bitmap, then traverse the bitmap and try to find the lines; once the lines are found, find the coordinates of their edges and set the cropping coordinates to those.
However, the problems that pop up in my mind with this approach are how to know where the lines end and the actual page starts, and how to differentiate lines from letters.
How do I overcome these hurdles, or is there a better way to crop out the printer marks from a PDF?
There is no general answer that works for ALL PDF files; however, there are a few useful strategies implemented by existing graphic-arts solutions such as callas pdfToolbox (watch it, I'm associated with this product) or PitStop. The strategies center around a number of facts:
Trim and bleed marks are usually simple lines (though thin rectangles are sometimes used as well). They are short and straight (horizontal or vertical).
These marks are usually drawn in specific colours: either CMYK with the colour set to 100%, 100%, 100%, 100%, or, more commonly, a special spot colour called "All". You're almost guaranteed of this because these marks need to show up on every printed separation (sorry for the technical printing terms if you're not familiar with them).
These marks are normally mirrored symmetrically. You're not looking for a single mark; you're looking for a set of them, and this typically helps a lot with recognition (a sketch of such a symmetry check follows this list). Watch out, however, that you're not confused by bad applications that don't place marks with absolute accuracy.
Lastly, though perhaps not important in your application, different regions actually work with different types of marks. Japanese trim and bleed marks, for example, look completely different from European or US marks.
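To make the symmetry point concrete, here is a minimal sketch of one way to test for a mirrored partner, assuming you have already extracted candidate mark segments; the Segment type, the hasMirrorPartner name, and the tolerance value are all mine, not from any particular product:

```
// Accept a candidate mark only if a partner exists mirrored across
// the vertical centre line of the page. Segments are assumed
// normalized so that x0 <= x1 and y0 <= y1.
#include <cmath>
#include <vector>

struct Segment { double x0, y0, x1, y1; };

bool hasMirrorPartner(const Segment& s, const std::vector<Segment>& marks,
                      double pageWidth, double tol = 2.0)
{
    // Mirroring across the vertical axis swaps the x endpoints.
    const double mx0 = pageWidth - s.x1;
    const double mx1 = pageWidth - s.x0;

    for (const Segment& t : marks) {
        if (std::fabs(mx0 - t.x0) < tol && std::fabs(mx1 - t.x1) < tol &&
            std::fabs(s.y0 - t.y0) < tol && std::fabs(s.y1 - t.y1) < tol)
            return true;   // found the symmetric twin, keep this mark
    }
    return false;
}
```

The same check repeated across the horizontal centre line covers top and bottom marks; the tolerance absorbs the sloppy generators mentioned above.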
Does Mathematica support the installation of non-Wolfram fonts for math symbols?
Examples of other math-symbol fonts include the recently released STIX fonts, Microsoft's Cambria font, the MathTime font used under LaTeX, etc.
I have not been able to find a way to substitute math-specific characters like Greek letters, integration operators, etc. It is, however, possible to substitute letters from normal text fonts by selecting the relevant bit of the notebook (including 2D typesetting in text cells separately, it seems) and setting the desired font in the Option Inspector.
This shows the difference between a typeset expression in Adobe Caslon Pro and the default in Times. The x is clearly different in the two fonts.
If you set the OperatorSubstitution option in the inspector to False, you will also get characters such as + and - in the text font rather than Mathematica's custom fonts.
The question is whether it would make sense to use other math fonts for what remains. Obviously it would be nice to use matching Greek letters if they were available. But given that it cannot be guaranteed that even an extensive math font like STIX has all the characters available in Mathematica (think esc-wolf-esc), I can understand why this might not be customisable. In addition, I doubt most people could tell the difference between the Mathematica Times-based fonts, the LaTeX Times fonts, and the STIX fonts, which are also pretty much like Times. The Microsoft Cambria fonts do look different, but they aren't yet widely used in technical publishing.