Image properties "dimensions" with "odd" Unicode code points - Windows

I am poking around in the file properties for images, specifically JPEG files created by a camera/scanner/Adobe/etc.
There is one detail that is different from the rest: the image dimensions property seems to contain Unicode code points that don't appear in the displayed text. The text appears as something like: 3264 x 2448.
As it turns out, there are code points on either end of this string that I cannot figure out. It is probably very straightforward, but after my searching I am at a loss.
The property documentation can be found here:
System.Image.Dimensions
property format: {6444048F-4C8B-11D1-8B70-080036B11A03}
0xd => 13 => property ID (for System.Image.Dimensions)
3264 x 2448 => Image dimensions as they appear on the screen
Here is what I have (Python 3.5 output):
0xd => ‪3264 x 2448‬ 0xd => b"?3264 x 2448?" len: 13
This is the actual string converted to hex bytes.
Hex Bytes: e2 80 aa 33 32 36 34 20 78 20 32 34 34 38 e2 80 ac
Character: ?? ?? ?? 3 2 6 4 x 2 4 4 8 ?? ?? ??
Does anyone know what the "0xe280aa" and "0xe280ac" are and what I am missing?
They are the only "interesting" characters in the entire properties collection for a jpg image. I don't know what they are, or why they are present.

Your property text is encoded in UTF-8.
e2 80 aa is the UTF-8 encoding of Unicode codepoint U+202A LEFT-TO-RIGHT EMBEDDING.
e2 80 ac is the UTF-8 encoding of Unicode codepoint U+202C POP DIRECTIONAL FORMATTING.
These markers are used when embedding left-to-right text in bidirectional text; the property system wraps the value in them so the digits display in the right order even when embedded in right-to-left UI text.
Raymond Chen blogged about this in relation to a similar issue with filenames displayed in Windows Explorer:
Why is there an invisible U+202A at the start of my file name?
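If you need to compare or parse the value programmatically, one option is simply to strip the directional formatting characters before use. A minimal sketch in C++ (the function name and the exact set of stripped code points are my own choice; it assumes you already have the property value as a wide string):
#include <string>

// Remove Unicode bidi control characters such as U+202A (LRE) and U+202C (PDF)
// from a property value like System.Image.Dimensions.
std::wstring StripBidiControls(const std::wstring &value)
{
    std::wstring out;
    out.reserve(value.size());
    for (wchar_t ch : value)
    {
        bool bidi = (ch >= 0x202A && ch <= 0x202E) ||   // LRE, RLE, PDF, LRO, RLO
                    (ch >= 0x2066 && ch <= 0x2069) ||   // LRI, RLI, FSI, PDI
                    ch == 0x200E || ch == 0x200F;       // LRM, RLM
        if (!bidi)
            out.push_back(ch);
    }
    return out;
}
After stripping, the example value compares equal to L"3264 x 2448".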

Get GDI HFONT line height as interpreted by DrawText[Ex]

I want to know which metrics are used to calculate the correct line height (vertical distance between the baselines of 2 adjacent lines of text). "Correct" shall arbitrarily be defined as "whatever DrawTextW does".
The accepted answer here appears to follow what the diagram provided in this MSDN article says:
TEXTMETRICW.tmHeight + TEXTMETRICW.tmExternalLeading;
But that does not appear to be correct. Some testing with 2 pieces of text, each of which consists of 2 lines:
// RECT rc is more than large enough to fit any text
int HeightChinese = DrawTextW(hdc, L"中\r\n文", -1, &rc, 0);
int HeightLatin = DrawTextW(hdc, L"Latin,\r\nlatin!", -1, &rc, 0);
The expected return values should be 2 * <SomethingUnknown>.
One observation is that the return value of DrawTextW always matches the RECT output when DT_CALCRECT is used, for all fonts that I have on my machine. So I will assume that using DT_CALCRECT does not provide any additional value over using the return value of DrawTextW.
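For reference, the comparison looks roughly like this (a trimmed sketch, not the exact code from the gist):
// Plain draw vs. DT_CALCRECT: return value and calculated rectangle agree.
RECT rcDraw = { 0, 0, 2000, 2000 };   // comfortably larger than the text
RECT rcCalc = rcDraw;
int HeightDrawn = DrawTextW(hdc, L"Latin,\r\nlatin!", -1, &rcDraw, 0);
int HeightCalc  = DrawTextW(hdc, L"Latin,\r\nlatin!", -1, &rcCalc, DT_CALCRECT);
// Observed: HeightDrawn == HeightCalc == rcCalc.bottom - rcCalc.top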
For all fonts on my machine, these are true:
HeightChinese == HeightLatin
LOGFONTW.lfHeight == TEXTMETRICW.tmHeight (1).
For most fonts on my machine, this is true:
HeightXxx == 2 * TEXTMETRICW.tmHeight
This already contradicts the formula provided in the other question (TEXTMETRICW.tmExternalLeading does not play a role).
For example, "Arial" with LOGFONTW.lfHeight = 36 will have TEXTMETRICW.tmExternalLeading = 1, and HeightXxx == 72 (not 74). The distance between the lines when taking a screenshot and measuring the pixels is also 72 (so it appears that the return value can be trusted).
At the same time, "Segoe UI" with LOGFONTW.lfHeight = 43 will have TEXTMETRICW.tmExternalLeading = 0, and HeightXxx == 84 (not 86).
This is a list of all anomalous fonts on my system:
"FontName" -- "DrawText return value" vs "2 * TEXTMETRICW.tmHeight"
Ebrima -- 84 vs 86
Leelawadee UI -- 84 vs 86
Leelawadee UI Semilight -- 84 vs 86
Lucida Sans Unicode -- 96 vs 98
Malgun Gothic -- 84 vs 86
Malgun Gothic Semilight -- 84 vs 86
Microsoft Tai Le -- 80 vs 82
Microsoft YaHei -- 82 vs 84
Microsoft YaHei UI Light -- 82 vs 84
MS Gothic -- 66 vs 64
MS UI Gothic -- 66 vs 64
MS PGothic -- 66 vs 64
Nirmala UI -- 84 vs 86
Nirmala UI Semilight -- 84 vs 86
Palatino Linotype -- 84 vs 86
Segoe UI -- 84 vs 86
Segoe UI Black -- 84 vs 86
Segoe UI Historic -- 84 vs 86
Segoe UI Light -- 84 vs 86
Segoe UI Semibold -- 84 vs 86
Segoe UI Semilight -- 84 vs 86
Segoe UI Symbol -- 84 vs 86
SimSun -- 66 vs 64
NSimSun -- 66 vs 64
SimSun-ExtB -- 66 vs 64
Verdana -- 76 vs 78
Webdings -- 62 vs 64
Yu Gothic UI -- 84 vs 86
Yu Gothic UI Semibold -- 84 vs 86
Yu Gothic UI Light -- 84 vs 86
Yu Gothic UI Semilight -- 84 vs 86
MS Mincho -- 66 vs 64
MS PMincho -- 66 vs 64
Ubuntu Mono -- 62 vs 64
Sometimes the return value is 2 bigger, sometimes it is 2 smaller than the calculated value.
I have looked at the other values in TEXTMETRICW, and I've also looked at the extra data available in OUTLINETEXTMETRICW, but I could not find any pattern that would explain the observations.
So then, what are the correct metrics to calculate line height? I understand that I could call DrawTextW with DT_CALCRECT to get this value, but I want to understand where this information comes from (and thus, how a font designer could control it in a predictable way).
Here is a gist with a complete Windows application that demonstrates this. All the interesting stuff is in WM_PAINT. Search for #EDIT for some interesting code switches and breakpoints. At the time of posting this question, my GitHub account has been flagged, and the Gist is temporarily unavailable. I hope this can be resolved quickly.
(1) I am using EnumFontFamiliesEx to enumerate all fonts, and it happens to provide LOGFONTW structs with positive lfHeight values. That means I am using cell height rather than character height. While character height is the more typical way of specifying font height, that is mostly irrelevant here; it just so happens that cell height is equal to TEXTMETRICW.tmHeight, while character height isn't. The relevant value for the calculations is TEXTMETRICW.tmHeight, not LOGFONTW.lfHeight.
As Jonathan Potter pointed out, the formula is simply TEXTMETRICW.tmHeight, and if the DT_EXTERNALLEADING flag is set, it is TEXTMETRICW.tmHeight + TEXTMETRICW.tmExternalLeading.
I reverse-engineered DrawTextExW with Ghidra, and the reason the numbers were sometimes off is not DrawTextExW itself. DrawTextExW internally calls DT_InitDrawTextInfo, which in turn uses GetTextMetricsW and calculates the line height according to the above formula.
However, consider this code to probe all fonts:
LOGFONTW Probe = {};
Probe.lfCharSet = DEFAULT_CHARSET;
EnumFontFamiliesExW(hdc, &Probe, InitializeFontInfo_EnumFontFamiliesCallback, NULL, 0);
// FontInfo is a container of FONT_INFO (a struct pairing the two records) defined elsewhere.
static int CALLBACK InitializeFontInfo_EnumFontFamiliesCallback(const LOGFONTW *LogFont, const TEXTMETRICW *TextMetric, DWORD FontType, LPARAM lParam)
{
    // Record the LOGFONTW and TEXTMETRICW exactly as reported by the enumeration.
    FONT_INFO tmp = {};
    tmp.LogFont = *LogFont;
    tmp.TextMetric = *TextMetric;
    FontInfo.push_back(tmp);
    return 1;   // continue enumerating
}
Here, for the Segoe UI font, for example, LogFont->lfHeight will be 43.
And so, TextMetric->tmHeight will also be 43, which, you would think, makes sense to some degree.
However:
If you go ahead and select this LogFont into an HDC and then use GetTextMetricsW, like so:
HFONT Font = CreateFontIndirectW(LogFont);
SelectObject(hdc, Font);
TEXTMETRICW TextMetric = {};
GetTextMetricsW(hdc, &TextMetric);
Then TextMetric.tmHeight == 42 even though LogFont->lfHeight == 43.
In other words, the TEXTMETRICW values provided to the EnumFontFamiliesExW callback cannot be trusted. You could argue that the bug is elsewhere, and that selecting a font with LogFont->lfHeight == 43 should really also produce TextMetric.tmHeight == 43, but I suppose that's too much to ask. My guess is that there is a floating-point conversion going on somewhere in there that occasionally produces a rounding error for some values.
DrawText() only uses TEXTMETRIC.tmExternalLeading if the DT_EXTERNALLEADING flag is set when you call it - you don't seem to have taken that into account.
The line height formula is basically:
int iLineHeight = tm.tmHeight + ((format & DT_EXTERNALLEADING) ? tm.tmExternalLeading : 0);
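Putting the pieces together, a minimal sketch (my own glue code, not from either answer) of computing the expected two-line height for the font currently selected into the DC:
// Query the metrics of the font actually selected into hdc; do not rely on the
// TEXTMETRICW passed to the EnumFontFamiliesExW callback (see above).
TEXTMETRICW tm = {};
GetTextMetricsW(hdc, &tm);
UINT format = 0;   // pass DT_EXTERNALLEADING here if you draw with that flag
int LineHeight = tm.tmHeight + ((format & DT_EXTERNALLEADING) ? tm.tmExternalLeading : 0);
int ExpectedTwoLines = 2 * LineHeight;   // should match DrawTextW's return value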

Postscript image output wrong on some color models

I have written C code to generate PostScript files from PWG raster files. The output is correct for the following formats (color model - bit depth): black-1, black-8, black-16, rgb-8, rgb-16, gray-1, gray-8, gray-16, srgb-8, srgb-16, adobergb-8, sgray-1, sgray-8, cmyk-1, cmyk-8, cmyk-16.
But the output for adobergb-16 and sgray-16 is wrong. I get a pattern similar to the input file, but the colors are all pixelated.
The actual code is very large, so I am only outlining what I do:
take all the image pixels in an unsigned char* buffer (this sometimes becomes very large)
encode the pixels using the deflate algorithm from zlib
display the result
For adobergb-16 I am setting PS colorspace to /DeviceRGB and the decode array is /Decode [0 1 0 1 0 1].
For sgray-16 I am setting the PS colorspace to /DeviceGray and the decode is /Decode [0 1]
These settings are the same ones I use for adobergb-8 and sgray-8.
EDIT 1:
Adding the example files I used to test HERE
If you want any further information or the code snippets, please feel free to ask.
Well you've set "/BitsPerComponent 16"; as I said above, that's not a legal value, since PostScript only supports 1, 2, 4, 8 and 12 bits per component.
Running this file through Adobe Acrobat Distiller gives:
%%[ Error: rangecheck; OffendingCommand: imageDistiller; ErrorInfo: BitsPerComponent 16 ]%%
Rewriting your image like this:
gsave
/DeviceRGB setcolorspace
/Input currentfile /FlateDecode filter def
4958 7017 scale
<<
/ImageType 1
/Width 4958
/Height 7017
/BitsPerComponent 8
/Decode [0 1 0 1 0 1]
/DataSource {3 string 0 1 2 {1 index exch Input read {pop}if Input read pop put } for} bind
/ImageMatrix [4958 0 0 -7017 0 7017]
>> image
This sets BitsPerComponent to 8, discards the top byte of every 16-bit value, and the output then works as expected.
When I said 'a nice simple small example' I didn't mean 30 MB of data; I am certain that much is not necessary to exhibit the problem. When posting examples, make a simple, small example and use that. I haven't bothered to download your other files.
To reiterate; you cannot set /BitsPerComponent 16, PostScript does not support 16 bits per component.
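Since PostScript tops out at 12 bits per component, the usual fix on the generator side is to reduce the 16-bit samples to 8 bits before deflating them. A minimal sketch, valid as both C and C++ (the function name is mine, and it assumes the PWG samples are stored big-endian, most significant byte first):
#include <stddef.h>

/* Reduce 16-bit samples to 8 bits in place by keeping the most significant
   byte of each sample (equivalent to value >> 8). 'pixels' holds sampleCount
   16-bit samples; the new length in bytes equals sampleCount. */
size_t ReduceTo8Bit(unsigned char *pixels, size_t sampleCount)
{
    for (size_t i = 0; i < sampleCount; ++i)
        pixels[i] = pixels[2 * i];
    return sampleCount;
}
With that, /BitsPerComponent 8 matches the data you deflate.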

NFC: APDU and SNEP length limitation

I'm working on a project to exchange large amounts of data from a PC to an Android device through NFC. I'm using an ACR122 reader.
The following is a general example of the data sent:
// ADPU
FF FF 00 00 00 nn // CLA, INS, P1, P2, Le, Lc
D4 40 // TFI, PD0
01 // (Mi), Target
// LLCP
13 20 // DSAP, PTYPE, SSAP
00 // Sequence
D4 40 // TFI, PD0
// SNEP
10 02 // Protocol Version, Action
nn nn nn nn // Total SNEP Length
// NDEF Header
A2 // First byte (MB = 1, ME = 0, Cf = 1, SR = 0, Il, TNF)
22 // Type length
mm mm mm mm // Payload length
// NDEF Content
61.....65 // Type (34 bytes in that case)
01.....01 // Payload (mm mm mm mm bytes)
Here I send a normal record (not a short record), so the NDEF header carries a 4-byte payload length.
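For reference, a long-format (SR = 0) record like the one above can be assembled as follows. This is only a sketch: the helper name is mine, the flag byte is taken verbatim from the dump, and no ID field is emitted (IL = 0):
#include <cstdint>
#include <vector>

// Long-format NDEF record: flags byte, type length, 4-byte big-endian payload
// length, then type and payload. The type length must fit in a single byte.
std::vector<uint8_t> BuildLongNdefRecord(uint8_t flagsAndTnf,   // e.g. 0xA2 above
                                         const std::vector<uint8_t> &type,
                                         const std::vector<uint8_t> &payload)
{
    std::vector<uint8_t> record;
    record.push_back(flagsAndTnf);                        // MB/ME/CF/SR/IL + TNF
    record.push_back(static_cast<uint8_t>(type.size()));  // type length
    uint32_t len = static_cast<uint32_t>(payload.size());
    record.push_back(static_cast<uint8_t>(len >> 24));    // payload length,
    record.push_back(static_cast<uint8_t>(len >> 16));    // 4 bytes, big-endian
    record.push_back(static_cast<uint8_t>(len >> 8));
    record.push_back(static_cast<uint8_t>(len));
    record.insert(record.end(), type.begin(), type.end());
    record.insert(record.end(), payload.begin(), payload.end());
    return record;
}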
Finally, my question is: how can we send such a large payload given the 1-byte APDU Lc field?
If this limitation is only due to the PN532 chip or PC/SC, what alternative hardware would you suggest?
Thank you for any clarification.
EDIT:
I found what I was looking for here:
Sending Extended APDU to Javacard
It's a hardware limitation; the PN532 doesn't support extended APDUs.
As you've already found out, the ACR122 does not support extended APDUs due to a limitation of the PN532 chip.
However, there is no need to pack the entire SNEP transfer into a single APDU. You can split the payload into multiple smaller frames and send them one after another. It's only important that the NDEF header gets transmitted as a whole in the first frame.
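A minimal sketch of that chunking idea (the maximum fragment size is an assumed parameter; in practice it is bounded by the negotiated LLCP MIU, and each fragment is still wrapped in its own LLCP frame when sent):
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split a complete NDEF message into fragments of at most maxFragment bytes;
// each fragment is then sent in its own frame/APDU.
std::vector<std::vector<uint8_t>> SplitPayload(const std::vector<uint8_t> &ndef,
                                               size_t maxFragment)
{
    std::vector<std::vector<uint8_t>> fragments;
    for (size_t offset = 0; offset < ndef.size(); offset += maxFragment)
    {
        size_t len = std::min(maxFragment, ndef.size() - offset);
        fragments.emplace_back(ndef.begin() + offset, ndef.begin() + offset + len);
    }
    return fragments;
}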

NSButton tag ID returns inaccurate values past 7

I have a grid of 25 NSButtons. I'm attempting to set a tag on each of them, from 1 to 25, and link them to one IBAction containing this:
- (IBAction)buttonClicked:(id)sender {
    NSLog(@"Clicked button %lo.", [sender tag]);
}
However, I'm running into a problem. It works fine from buttons 1-7, but the 8th one returns 10, the 9th returns 11, and the 10th returns 12. I experimentally set a button's tag to 88, and it returned 130. Is this a bug, or am I going about this the wrong way?
Your button values are correct, you're just printing them wrong: in octal format (the 'o' in %lo) instead of decimal. That's why your 8 prints out as 10 -- that is 8 in octal representation, and 130 is octal for 88 decimal.
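To see the effect, print the same value with a decimal and an octal conversion (plain C formatting, which is what NSLog's format specifiers follow):
#include <stdio.h>

int main(void)
{
    long tag = 88;
    printf("%ld decimal prints as %lo with an octal conversion\n", tag, tag);
    // Output: 88 decimal prints as 130 with an octal conversion
    return 0;
}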
You should use a decimal conversion; since tag is an NSInteger, %ld is appropriate (not the octal %lo):
NSLog(@"Clicked button %ld.", (long)[sender tag]);
Dropping the length modifier (plain %o) would not change anything; it is the octal 'o' conversion, not the 'l' length modifier, that changes the displayed number.

What's the best approach to generate an image from user-submitted text on the server side (e.g. Rails)?

I see that ImageMagick is capable of generating an image from Pango-formatted text, which looks like quite a good approach.
I just want to know if there is anything else out there, and what the most recommended way of doing this is.
ImageMagick is probably the easiest, but Ghostscript can also be used to render images containing text.
Here's a little PostScript program that displays some text.
%!
5 5 moveto
/Palatino-Roman 20 selectfont
(Some Text) show
showpage
Running the file through ps2eps calculates the bounding box and adds this information as comments conforming to the Document Structuring Conventions.
%!PS-Adobe-2.0 EPSF-2.0
%%BoundingBox: 5 5 97 20
%%HiResBoundingBox: 5.500000 5.000000 97.000000 19.500000
%%EndComments
% EPSF created by ps2eps 1.64
%%BeginProlog
save
countdictstack
mark
newpath
/showpage {} def
/setpagedevice {pop} def
%%EndProlog
%%Page 1 1
5 5 moveto
/Palatino-Roman 20 selectfont
(Some Text) show
showpage
%%Trailer
cleartomark
countdictstack
exch sub { end } repeat
restore
%%EOF
Then ImageMagick's convert utility can render this as an image.
The ps2eps step is necessary so the final image is cropped to the interesting part, rather than sitting at the bottom of a page-sized image.
Here's a typescript of the whole sequence. 0> is the command prompt.
0> cat > t.ps
%!
5 5 moveto
/Palatino-Roman 20 selectfont
(Some Text) show
showpage
0> ps2eps t.ps
Input files: t.ps
Processing: t.ps
Calculating Bounding Box...ready. %%BoundingBox: 5 5 97 20
Creating output file t.eps...** Warning **: Weird heading line -- %! -- ready.
0> convert t.eps t.png
