JPOS Q2 : Unable to see raw ISO 8583 data - jpos

[Update]
I was able to bring up the jPOS client and server simulators on the same box by following this post: http://jpos.org/blog/2013/07/setting-up-the-client-simulator/ (the setup for the server simulator is very similar to the one described there).
Next I captured the traffic with tcpdump (and also looked at it in Wireshark), but it is not what I expected. Here is what I see (note the data part):
Data (325 bytes)
0000 3c 69 73 6f 6d 73 67 3e 0a 20 20 3c 21 2d 2d 20 <isomsg>. <!--
0010 6f 72 67 2e 6a 70 6f 73 2e 69 73 6f 2e 70 61 63 org.jpos.iso.pac
0020 6b 61 67 65 72 2e 58 4d 4c 50 61 63 6b 61 67 65 kager.XMLPackage
0030 72 20 2d 2d 3e 0a 20 20 3c 66 69 65 6c 64 20 69 r -->. <field i
0040 64 3d 22 30 22 20 76 61 6c 75 65 3d 22 31 38 30 d="0" value="180
0050 30 22 2f 3e 0a 20 20 3c 66 69 65 6c 64 20 69 64 0"/>. <field id
0060 3d 22 37 22 20 76 61 6c 75 65 3d 22 30 37 32 30 ="7" value="0720
0070 30 30 33 36 33 39 22 2f 3e 0a 20 20 3c 66 69 65 003639"/>. <fie
0080 6c 64 20 69 64 3d 22 31 31 22 20 76 61 6c 75 65 ld id="11" value
0090 3d 22 37 39 39 38 31 33 22 2f 3e 0a 20 20 3c 66 ="799813"/>. <f
00a0 69 65 6c 64 20 69 64 3d 22 31 32 22 20 76 61 6c ield id="12" val
00b0 75 65 3d 22 37 39 39 38 30 35 22 2f 3e 0a 20 20 ue="799805"/>.
00c0 3c 66 69 65 6c 64 20 69 64 3d 22 36 33 22 20 76 <field id="63" v
00d0 61 6c 75 65 3d 22 4d 6f 6e 20 4a 75 6c 20 32 30 alue="Mon Jul 20
00e0 20 30 30 3a 33 36 3a 33 39 20 50 44 54 20 32 30 00:36:39 PDT 20
00f0 31 35 22 2f 3e 0a 20 20 3c 69 73 6f 6d 73 67 20 15"/>. <isomsg
0100 69 64 3d 22 31 32 30 22 3e 0a 20 20 20 20 3c 66 id="120">. <f
0110 69 65 6c 64 20 69 64 3d 22 30 22 20 76 61 6c 75 ield id="0" valu
0120 65 3d 22 32 39 31 31 30 30 30 31 22 2f 3e 0a 20 e="29110001"/>.
0130 20 3c 2f 69 73 6f 6d 73 67 3e 0a 3c 2f 69 73 6f </isomsg>.</iso
0140 6d 73 67 3e 0a msg>.
Data: 3c69736f6d73673e0a20203c212d2d206f72672e6a706f73...
[Length: 325]
The data is the XML rendering of the ISO message. I was expecting the binary ISO 8583 encoding, where the first bytes are the MTI, followed by the bitmap and the fields.
After looking at the client simulator configuration, I realized that it uses an XML channel and packager. I then looked at the available channels and packagers here: jpos.org/doc/javadoc/org/jpos/iso/packager/package-summary.html jpos.org/doc/javadoc/org/jpos/iso/channel/package-summary.html
After changing the client to PostChannel and PostPackager, I still had problems: the client times out. Is there a way to see the actual raw data via tcpdump/Wireshark? The closest I have found is the Postilion format, which prepends the data length to the raw data.

After experimenting with PostChannel and PostPackager, I got it running and could see the message. The fix was to change both the server simulator and the client simulator configurations to use the same channel and packager.
This is what I changed in each:
Server simulator: edit src/dist/deploy/05_serversimulator.xml to use the desired channel and packager
<channel class="org.jpos.iso.channel.PostChannel" logger="Q2"
packager="org.jpos.iso.packager.PostPackager">
Client simulator: edit ./src/dist/deploy/10_clientsimulator_channel.xml to use the desired channel and packager
<channel class="org.jpos.iso.channel.PostChannel" logger="Q2"
packager="org.jpos.iso.packager.PostPackager">
And then fire up the client and server simulators.

Channels handle the connection to the other party and add whatever framing the chosen implementation defines: message headers, length headers, a TPDU, and so on.
The PostChannel you use here adds a 2-byte length header containing the size of the message. This lets the receiver read exactly the right number of bytes from the TCP stream.
Packagers pack and unpack the fields of the message. They define each field's layout (for example fixed length, or variable length with a length prefix) and its encoding (hex, BCD, ASCII).
Out of the box, the client and server simulators use XML so that the concepts are easier to follow.
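As a rough illustration of what a 2-byte length header does on the wire, here is a small Python sketch that frames and unframes a payload. It assumes the header is a big-endian unsigned 16-bit count of the bytes that follow, which matches the description above. The function names are made up for this example and are not part of jPOS.

```python
import io
import struct


def frame(payload: bytes) -> bytes:
    """Prefix payload with a 2-byte big-endian length (assumed layout)."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in a 2-byte length header")
    return struct.pack(">H", len(payload)) + payload


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes; a stream may return fewer bytes per read call."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed in the middle of a message")
        buf += chunk
    return bytes(buf)


def read_frame(stream) -> bytes:
    """Read the length header, then exactly that many payload bytes."""
    (length,) = struct.unpack(">H", read_exact(stream, 2))
    return read_exact(stream, length)


# Two messages sent back to back are separated correctly by the receiver.
wire = io.BytesIO(frame(b"first message") + frame(b"second"))
print(read_frame(wire))
print(read_frame(wire))
```

In a packet capture, such a message shows up as two length bytes followed by the packed ISO 8583 data, instead of the XML text seen with the XML channel.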


nifi convert text file to json

I'm trying to load log text files from an FTP server into Elasticsearch.
The log files look like this:
0:0:21: Processing events from events
0:0:21: Processing croned build types from q_type
0:0:21: Process croned releases from trls
0:0:22: Processing croned regression list from regression
0:0:22: Processing commit loop
In data provenance (hex view, because the other views don't show anything) I see the data like this:
0x00000090 66 69 65 6C 64 3A 20 52 4E 20 53 74 61 74 75 73 field: RN Status
0x000000A0 2E 20 4F 62 6A 65 63 74 20 72 65 66 65 72 65 6E . Object referen
0x000000B0 63 65 20 6E 6F 74 20 73 65 74 20 74 6F 20 61 6E ce not set to an
0x000000C0 20 69 6E 73 74 61 6E 63 65 20 6F 66 20 61 6E 20 instance of an
0x000000D0 6F 62 6A 65 63 74 2E 0D 0A 30 3A 30 3A 31 34 3A object...0:0:14:
0x000000E0 20 43 61 6E 27 74 20 72 65 61 64 20 69 73 73 75 Can't read issu
0x000000F0 65 3A 20 41 49 2D 32 34 37 20 63 75 73 74 6F 6D e: AI-247 custom
0x00000100 20 66 69 65 6C 64 3A 20 52 4E 20 53 65 63 74 69 field: RN Secti
0x00000110 6F 6E 2E 20 4F 62 6A 65 63 74 20 72 65 66 65 72 on. Object refer
0x00000120 65 6E 63 65 20 6E 6F 74 20 73 65 74 20 74 6F 20 ence not set to
0x00000130 61 6E 20 69 6E 73 74 61 6E 63 65 20 6F 66 20 61 an instance of a
0x00000140 6E 20 6F 62 6A 65 63 74 2E 0D 0A 30 3A 30 3A 31 n object...0:0:1
0x00000150 34 3A 20 43 61 6E 27 74 20 72 65 61 64 20 69 73 4: Can't read is
0x00000160 73 75 65 3A 20 41 49 2D 32 34 37 20 63 75 73 74 sue: AI-247 cust
0x00000170 6F 6D 20 66 69 65 6C 64 3A 20 52 4E 20 44 6F 63 om field: RN Doc
0x00000180 20 69 6E 20 56 65 72 2E 20 4F 62 6A 65 63 74 20 in Ver. Object
0x00000190 72 65 66 65 72 65 6E 63 65 20 6E 6F 74 20 73 65 reference not se
0x000001A0 74 20 74 6F 20 61 6E 20 69 6E 73 74 61 6E 63 65 t to an instance
0x000001B0 20 6F 66 20 61 6E 20 6F 62 6A 65 63 74 2E 0D 0A of an object...
0x000001C0 30 3A 30 3A 31 34 3A 20 43 61 6E 27 74 20 72 65 0:0:14: Can't re
0x000001D0 61 64 20 69 73 73 75 65 3A 20 41 49 2D 32 34 37 ad issue: AI-247
I can fetch the file with the GetFTP processor, but how do I convert it to JSON so I can send it to Elasticsearch?
I am new to NiFi and hope I'm not missing something basic. Any help will be appreciated.
Thanks
You can use the ConvertRecord processor with a CSVReader for the input (configure to use : as the delimiter) and a JsonRecordSetWriter for the output.
NiFi can automatically infer the schema, but as it doesn't appear you have a header line for the incoming data, this will probably not be helpful. In that case, you can use the Schema Registry to hold two schemas -- one for the incoming log lines, indicating what each field should be called and the data type, and one for the JSON output. Bryan Bende has written a great article about this process.
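To make the target record shape concrete, here is a small Python sketch, run outside NiFi, of the kind of line-to-JSON mapping the two schemas would describe. The field names (hours, minutes, seconds, message) are hypothetical. Note that the sample messages themselves contain colons (for example "Can't read issue: AI-247 ..."), so this sketch splits only on the first three colons; a plain ":" delimiter would break such messages into extra columns.

```python
import json


def line_to_record(line: str) -> dict:
    """Split 'H:M:S: message' into a record; field names are hypothetical."""
    hours, minutes, seconds, message = line.rstrip("\r\n").split(":", 3)
    return {
        "hours": int(hours),
        "minutes": int(minutes),
        "seconds": int(seconds),
        "message": message.strip(),
    }


sample = [
    "0:0:21: Processing events from events",
    "0:0:22: Processing commit loop",
]
# One JSON document per log line, ready to be indexed.
for line in sample:
    print(json.dumps(line_to_record(line)))
```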

How to scrub VT100/ANSI control chars in Net::Telnet

I am using Net::Telnet to connect to an HP ProCurve switch, log in, and back up the config. However, I ran into an issue where waitfor returns VT100/ANSI control chars:
< 0x00000: ff fd 18 ff fd 1f ff fb 01 1b 5b 32 4a 1b 5b 3f ..........[2J.[?
< 0x00010: 37 6c 1b 5b 33 3b 32 33 72 1b 5b 3f 36 6c 1b 5b 7l.[3;23r.[?6l.[
< 0x00020: 31 3b 31 48 1b 5b 3f 32 35 6c 1b 5b 31 3b 31 48 1;1H.[?25l.[1;1H
< 0x00030: 48 50 20 4a 39 37 32 38 41 20 32 39 32 30 2d 34 HP J9728A 2920-4
< 0x00040: 38 47 20 53 77 69 74 63 68 0d 0d 0a 53 6f 66 74 8G Switch...Soft
< 0x00050: 77 61 72 65 20 72 65 76 69 73 69 6f 6e 20 57 42 ware revision WB
< 0x00060: 2e 31 35 2e 31 32 2e 30 30 31 35 0d 0d 0a 0d 0d .15.12.0015.....
< 0x00070: 0a 43 6f 70 79 72 69 67 68 74 20 28 43 29 20 31 .Copyright (C) 1
< 0x00080: 39 39 31 2d 32 30 31 34 20 48 65 77 6c 65 74 74 991-2014 Hewlett
< 0x00090: 2d 50 61 63 6b 61 72 64 20 44 65 76 65 6c 6f 70 -Packard Develop
< 0x000a0: 6d 65 6e 74 20 43 6f 6d 70 61 6e 79 2c 20 4c 2e ment Company, L.
< 0x000b0: 50 2e 0d 0a 0d 0a 20 20 20 20 20 20 20 20 20 20 P.....
Unfortunately, this screws up waitfor because if I try to waitfor(/^password:/i) it will return a string with those control chars in it, or wait forever since the regex is never matched.
Is there any way to have Net::Telnet automatically remove those control characters? Is there any way to have waitfor only care about ASCII printable characters?
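For reference, the bytes in the dump fall into two groups: 3-byte telnet option negotiations (0xff followed by a command byte and an option byte) and ANSI CSI escape sequences (ESC [ ... final byte). The Python sketch below only illustrates regular expressions that remove both groups. It is not a Net::Telnet feature, and the patterns would have to be ported to Perl and applied to the received data before matching. The function name scrub is made up for the example.

```python
import re

# ESC [ parameters intermediates final-byte (ANSI/VT100 CSI sequence)
ANSI_CSI = re.compile(rb"\x1b\[[0-?]*[ -/]*[@-~]")
# IAC (0xff) + WILL/WONT/DO/DONT (0xfb-0xfe) + one option byte
TELNET_NEG = re.compile(rb"\xff[\xfb-\xfe].", re.DOTALL)


def scrub(data: bytes) -> bytes:
    """Drop telnet negotiations and CSI escape sequences from data."""
    data = TELNET_NEG.sub(b"", data)
    return ANSI_CSI.sub(b"", data)


# First bytes of the capture above, followed by the banner text.
raw = bytes.fromhex(
    "ff fd 18 ff fd 1f ff fb 01 1b 5b 32 4a 1b 5b 3f"
    "37 6c 1b 5b 33 3b 32 33 72 1b 5b 3f 36 6c 1b 5b"
    "31 3b 31 48 1b 5b 3f 32 35 6c 1b 5b 31 3b 31 48"
) + b"HP J9728A 2920-48G Switch\r\r\n"
print(scrub(raw))
```

This sketch handles only CSI sequences and 3-byte negotiations; other escape types and telnet subnegotiation blocks are left in place.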

Chunk size appears on Browser page

I'm implementing a small web server on a Wi-Fi microcontroller. To aid development and testing, I have ported it to a Windows console program.
I use chunked transfer encoding. The following is what shows up in the browser:
0059
Hello World
0
The 59 is the hex size of the chunk and the 0 is the terminating zero-size chunk.
This is the data captured via Wireshark.
The first message I send contains the headers:
0000 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d HTTP/1.1 200 OK.
0010 0a 53 65 72 76 65 72 3a 20 54 72 61 6e 73 66 65 .Server: Transfe
0020 72 2d 45 6e 63 6f 64 69 6e 67 3a 20 63 68 75 6e r-Encoding: chun
0030 6b 65 64 0d 0a 43 6f 6e 74 65 6e 74 2d 54 79 70 ked..Content-Typ
0040 65 3a 20 74 65 78 74 2f 68 74 6d 6c 0d 0a 43 61 e: text/html..Ca
0050 63 68 65 2d 43 6f 6e 74 72 6f 6c 3a 20 6d 61 78 che-Control: max
0060 2d 61 67 65 3d 33 36 30 30 2c 20 6d 75 73 74 2d -age=3600, must-
0070 72 65 76 61 6c 69 64 61 74 65 0d 0a 0d 0a revalidate....
The next block is the chunked data
0000 30 30 35 39 0d 0a 3c 68 74 6d 6c 3e 0a 3c 68 65 0059..<html>.<he
0010 61 64 3e 3c 74 69 74 6c 65 3e 57 65 62 20 53 65 ad><title>Web Se
0020 72 76 65 72 3c 2f 74 69 74 6c 65 3e 0a 3c 2f 68 rver</title>.</h
0030 65 61 64 3e 0a 3c 62 6f 64 79 3e 0a 3c 68 31 3e ead>.<body>.<h1>
0040 48 65 6c 6c 6f 20 57 6f 72 6c 64 3c 2f 68 31 3e Hello World</h1>
0050 0a 3c 2f 62 6f 64 79 3e 3c 2f 68 74 6d 6c 3e 0d .</body></html>.
0060 0a 30 0d 0a 0d 0a .0....
The chunk sizes are displayed in both Chrome and IE.
Can anyone see a problem with my data that would cause this?
Thanks
Solved:
I had mistakenly removed the server name, so the Server header swallowed the next header: the line reads "Server: Transfer-Encoding: chunked". The browser therefore takes "Transfer-Encoding: chunked" as the server name, never sees a Transfer-Encoding header, and treats the chunk sizes as ordinary data to display.
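The effect can be reproduced with a few lines of Python. This is a minimal sketch, not a full HTTP parser: parse_headers splits each header line at the first colon, and encode_chunk produces one chunk in the size-CRLF-data-CRLF layout seen in the capture. Both function names are made up for the example.

```python
def parse_headers(raw: bytes) -> dict:
    """Map lower-cased header names to values, splitting at the first colon."""
    head = raw.split(b"\r\n\r\n", 1)[0]
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("ascii")] = value.strip().decode("ascii")
    return headers


def encode_chunk(data: bytes) -> bytes:
    """Encode one chunk: hex size, CRLF, data, CRLF."""
    return format(len(data), "x").encode("ascii") + b"\r\n" + data + b"\r\n"


broken = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Transfer-Encoding: chunked\r\n"
    b"Content-Type: text/html\r\n\r\n"
)
headers = parse_headers(broken)
# The chunked marker ended up inside the Server value.
print(headers["server"])
print("transfer-encoding" in headers)
```

Once the Server line has a value of its own, Transfer-Encoding becomes a separate header again and the client decodes the chunks instead of displaying their sizes.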

How do I connect to a websocket manually, with netcat/socat/telnet?

I am trying to connect to the reference websocket echo server "manually", in order to learn how the protocol works (I am using socat for that). However, the server invariably closes the connection without providing an answer. Any idea why?
Here is what I do:
socat - TCP:echo.websocket.org:80
Then, I paste the following text in the terminal:
GET /?encoding=text HTTP/1.1
Origin: http://www.websocket.org
Connection: Upgrade
Host: echo.websocket.org
Sec-WebSocket-Key: P7Kp2hTLNRPFMGLxPV47eQ==
Upgrade: websocket
Sec-WebSocket-Version: 13
I sniffed the parameters of the connection with the developer tools, in firefox, on the same machine, where this works flawlessly: therefore, I would assume they are correct. However after that, the server closes the connection immediately, without providing an answer. Why? How can I implement the protocol "manually"?
I would like to type test in my terminal and get the server to reply with what I typed (it works in a web browser).
I think you need to translate \n (line feed) to CRLF (carriage return and line feed) on the socket stream. info socat gives detailed documentation, which includes this modifier:
crnl Converts the default line termination character NL ('\n', 0x0a)
to/from CRNL ("\r\n", 0x0d0a) when writing/reading on this chan-
nel (example). Note: socat simply strips all CR characters.
So I think you should be able to do this:
socat - TCP:echo.websocket.org:80,crnl
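To see exactly which bytes the server expects, the request can also be built programmatically. The Python sketch below is a hedged illustration: build_handshake and accept_for are names invented here. It terminates every header line with CRLF, ends the request with an empty line, and computes the Sec-WebSocket-Accept value the server should answer with according to RFC 6455.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for the accept computation.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_handshake(host: str, path: str = "/", key: str = "") -> bytes:
    """Build the upgrade request with CRLF line endings."""
    if not key:
        # The key is a base64-encoded random 16-byte nonce.
        key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def accept_for(key: str) -> str:
    """Value the server should send back in Sec-WebSocket-Accept."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


request = build_handshake("echo.websocket.org", "/?encoding=text")
print(request.decode("ascii"))
```

When pasting into a terminal, every line ends in a bare LF, which is why the crnl option is needed on the socat side.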
I'd like to add that my WebSocket tool websocat can help in debugging the WebSocket protocol, especially when combined with socat:
$ websocat - ws-c:sh-c:"socat -v -x - tcp:echo.websocket.org:80" --ws-c-uri ws://echo.websocket.org
> 2018/07/03 16:30:06.021658 length=157 from=0 to=156
47 45 54 20 2f 20 48 54 54 50 2f 31 2e 31 0d 0a GET / HTTP/1.1..
48 6f 73 74 3a 20 65 63 68 6f 2e 77 65 62 73 6f Host: echo.webso
63 6b 65 74 2e 6f 72 67 0d 0a cket.org..
43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 55 70 67 72 Connection: Upgr
61 64 65 0d 0a ade..
55 70 67 72 61 64 65 3a 20 77 65 62 73 6f 63 6b Upgrade: websock
65 74 0d 0a et..
53 65 63 2d 57 65 62 53 6f 63 6b 65 74 2d 56 65 Sec-WebSocket-Ve
72 73 69 6f 6e 3a 20 31 33 0d 0a rsion: 13..
53 65 63 2d 57 65 62 53 6f 63 6b 65 74 2d 4b 65 Sec-WebSocket-Ke
79 3a 20 59 76 36 32 44 31 57 6d 7a 79 79 31 65 y: Yv62D1Wmzyy1e
69 6d 62 47 6d 68 69 61 67 3d 3d 0d 0a imbGmhiag==..
0d 0a ..
--
< 2018/07/03 16:30:06.164057 length=201 from=0 to=200
48 54 54 50 2f 31 2e 31 20 31 30 31 20 57 65 62 HTTP/1.1 101 Web
20 53 6f 63 6b 65 74 20 50 72 6f 74 6f 63 6f 6c Socket Protocol
20 48 61 6e 64 73 68 61 6b 65 0d 0a Handshake..
43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 55 70 67 72 Connection: Upgr
61 64 65 0d 0a ade..
44 61 74 65 3a 20 54 75 65 2c 20 30 33 20 4a 75 Date: Tue, 03 Ju
6c 20 32 30 31 38 20 31 33 3a 31 35 3a 30 30 20 l 2018 13:15:00
47 4d 54 0d 0a GMT..
53 65 63 2d 57 65 62 53 6f 63 6b 65 74 2d 41 63 Sec-WebSocket-Ac
63 65 70 74 3a 20 55 56 6a 32 74 35 50 43 7a 62 cept: UVj2t5PCzb
58 49 32 52 4e 51 75 70 2f 71 48 31 63 5a 44 6e XI2RNQup/qH1cZDn
38 3d 0d 0a 8=..
53 65 72 76 65 72 3a 20 4b 61 61 7a 69 6e 67 20 Server: Kaazing
47 61 74 65 77 61 79 0d 0a Gateway..
55 70 67 72 61 64 65 3a 20 77 65 62 73 6f 63 6b Upgrade: websock
65 74 0d 0a et..
0d 0a ..
--
ABCDEF
> 2018/07/03 16:30:12.707919 length=13 from=157 to=169
82 87 40 57 f5 88 01 15 b6 cc 05 11 ff ..@W.........
--
< 2018/07/03 16:30:12.848398 length=9 from=201 to=209
82 07 41 42 43 44 45 46 0a ..ABCDEF.
--
ABCDEF
> 2018/07/03 16:30:14.528333 length=6 from=170 to=175
88 80 18 ec 05 a8 ......
--
< 2018/07/03 16:30:14.671629 length=2 from=210 to=211
88 00 ..
--
If a manually driven socat -v -x - TCP:echo.websocket.org:80,crnl session (mentioned in the other answer) fails, you can compare it with the websocat-driven socat session shown above.
Reverse (server) example with socat debug dump:
socat -v -x tcp-l:1234,fork,reuseaddr exec:'websocat -t ws-u\:stdio\: mirror\:'
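The frames in the client session dump can be decoded by hand. The Python sketch below is a minimal decoder written for this illustration (decode_frame is not part of websocat or socat). It handles only short frames with a payload of up to 125 bytes and shows how the 4-byte masking key is XORed over the client payload.

```python
def decode_frame(frame: bytes):
    """Return (fin, opcode, payload) for a short WebSocket frame."""
    fin = bool(frame[0] & 0x80)
    opcode = frame[0] & 0x0F
    masked = bool(frame[1] & 0x80)
    length = frame[1] & 0x7F
    if length > 125:
        raise ValueError("extended payload lengths are not handled in this sketch")
    pos = 2
    key = b""
    if masked:
        key = frame[pos:pos + 4]  # client-to-server frames carry a masking key
        pos += 4
    payload = frame[pos:pos + length]
    if masked:
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return fin, opcode, payload


# Client frame and server echo taken from the dump above.
client = bytes.fromhex("82 87 40 57 f5 88 01 15 b6 cc 05 11 ff")
server = bytes.fromhex("82 07 41 42 43 44 45 46 0a")
print(decode_frame(client))
print(decode_frame(server))
```

Both frames decode to the same binary payload, ABCDEF followed by a newline. The final 88 80 ... and 88 00 frames are the close frames (opcode 8) with empty payloads.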
Alternatively, here is a way to connect to a secure (wss) WebSocket and read the stream from the command line using only core PHP.
php -r '$sock=stream_socket_client("tls://echo.websocket.org:443",$e,$n,30,STREAM_CLIENT_CONNECT,stream_context_create(null));if(!$sock){echo"[$n]$e".PHP_EOL;}else{fwrite($sock,"GET / HTTP/1.1\r\nHost: echo.websocket.org\r\nAccept: */*\r\nConnection: Upgrade\r\nUpgrade: websocket\r\nSec-WebSocket-Version: 13\r\nSec-WebSocket-Key: ".base64_encode(random_bytes(16))."\r\n\r\n");while(!feof($sock)){var_dump(fgets($sock,2048));}}'
A similar example, pulling from another wss server: (Do not get rekt)
php -r '$sock=stream_socket_client("tls://stream.binance.com:9443",$e,$n,30,STREAM_CLIENT_CONNECT,stream_context_create(null));if(!$sock){echo"[$n]$e".PHP_EOL;}else{fwrite($sock,"GET /stream?streams=btcusdt@kline_1m HTTP/1.1\r\nHost: stream.binance.com:9443\r\nAccept: */*\r\nConnection: Upgrade\r\nUpgrade: websocket\r\nSec-WebSocket-Version: 13\r\nSec-WebSocket-Key: ".base64_encode(random_bytes(16))."\r\n\r\n");while(!feof($sock)){var_dump(explode(",",fgets($sock,512)));}}'

wcslen() works differently in Xcode and VC++

I found that wcslen() in VC++ 2010 returns the correct character count, whereas Xcode does not.
For example, the code below correctly returns 11 in VC++ 2010, but incorrectly returns 17 in Xcode 4.2.
const wchar_t *p = L"123abc가1나1다";
size_t plen = wcslen(p);
I guess the Xcode app stores the wchar_t string as UTF-8 in memory, which is another strange thing.
How can I get 11 in Xcode too, just like in VC++?
I ran this program on a Mac Mini running MacOS X 10.7.2 (Xcode 4.2):
#include <stdio.h>
#include <wchar.h>

int main(void)
{
    const wchar_t p[] = L"123abc가1나1다";
    size_t plen = wcslen(p);
    if (fwide(stdout, 1) <= 0)
    {
        fprintf(stderr, "Failed to make stdout wide-oriented\n");
        return -1;
    }
    wprintf(L"String <<%ls>>\n", p);
    putwc(L'\n', stdout);
    wprintf(L"Length = %zu\n", plen);
    for (size_t i = 0; i < sizeof(p)/sizeof(*p); i++)
        wprintf(L"Character %zu = 0x%X\n", i, p[i]);
    return 0;
}
When I do a hex dump of the source file, I see:
0x0000: 23 69 6E 63 6C 75 64 65 20 3C 73 74 64 69 6F 2E #include <stdio.
0x0010: 68 3E 0A 23 69 6E 63 6C 75 64 65 20 3C 77 63 68 h>.#include <wch
0x0020: 61 72 2E 68 3E 0A 0A 69 6E 74 20 6D 61 69 6E 28 ar.h>..int main(
0x0030: 76 6F 69 64 29 0A 7B 0A 20 20 20 20 63 6F 6E 73 void).{. cons
0x0040: 74 20 77 63 68 61 72 5F 74 20 70 5B 5D 20 3D 20 t wchar_t p[] =
0x0050: 4C 22 31 32 33 61 62 63 EA B0 80 31 EB 82 98 31 L"123abc...1...1
0x0060: EB 8B A4 22 3B 0A 20 20 20 20 73 69 7A 65 5F 74 ...";. size_t
0x0070: 20 70 6C 65 6E 20 3D 20 77 63 73 6C 65 6E 28 70 plen = wcslen(p
0x0080: 29 3B 0A 20 20 20 20 69 66 20 28 66 77 69 64 65 );. if (fwide
0x0090: 28 73 74 64 6F 75 74 2C 20 31 29 20 3C 3D 20 30 (stdout, 1) <= 0
0x00A0: 29 0A 20 20 20 20 7B 0A 20 20 20 20 20 20 20 20 ). {.
0x00B0: 66 70 72 69 6E 74 66 28 73 74 64 65 72 72 2C 20 fprintf(stderr,
0x00C0: 22 46 61 69 6C 65 64 20 74 6F 20 6D 61 6B 65 20 "Failed to make
0x00D0: 73 74 64 6F 75 74 20 77 69 64 65 2D 6F 72 69 65 stdout wide-orie
0x00E0: 6E 74 65 64 5C 6E 22 29 3B 0A 20 20 20 20 20 20 nted\n");.
0x00F0: 20 20 72 65 74 75 72 6E 20 2D 31 3B 0A 20 20 20 return -1;.
0x0100: 20 7D 0A 20 20 20 20 77 70 72 69 6E 74 66 28 4C }. wprintf(L
0x0110: 22 53 74 72 69 6E 67 20 3C 3C 25 6C 73 3E 3E 5C "String <<%ls>>\
0x0120: 6E 22 2C 20 70 29 3B 0A 20 20 20 20 70 75 74 77 n", p);. putw
0x0130: 63 28 4C 27 5C 6E 27 2C 20 73 74 64 6F 75 74 29 c(L'\n', stdout)
0x0140: 3B 0A 20 20 20 20 77 70 72 69 6E 74 66 28 4C 22 ;. wprintf(L"
0x0150: 4C 65 6E 67 74 68 20 3D 20 25 7A 75 5C 6E 22 2C Length = %zu\n",
0x0160: 20 70 6C 65 6E 29 3B 0A 20 20 20 20 66 6F 72 20 plen);. for
0x0170: 28 73 69 7A 65 5F 74 20 69 20 3D 20 30 3B 20 69 (size_t i = 0; i
0x0180: 20 3C 20 73 69 7A 65 6F 66 28 70 29 2F 73 69 7A < sizeof(p)/siz
0x0190: 65 6F 66 28 2A 70 29 3B 20 69 2B 2B 29 0A 20 20 eof(*p); i++).
0x01A0: 20 20 20 20 20 20 77 70 72 69 6E 74 66 28 4C 22 wprintf(L"
0x01B0: 43 68 61 72 61 63 74 65 72 20 25 7A 75 20 3D 20 Character %zu =
0x01C0: 30 78 25 58 5C 6E 22 2C 20 69 2C 20 70 5B 69 5D 0x%X\n", i, p[i]
0x01D0: 29 3B 0A 20 20 20 20 72 65 74 75 72 6E 20 30 3B );. return 0;
0x01E0: 0A 7D 0A .}.
0x01E3:
The output when compiled with GCC is:
String <<123abc
Length = 11
Character 0 = 0x31
Character 1 = 0x32
Character 2 = 0x33
Character 3 = 0x61
Character 4 = 0x62
Character 5 = 0x63
Character 6 = 0xAC00
Character 7 = 0x31
Character 8 = 0xB098
Character 9 = 0x31
Character 10 = 0xB2E4
Character 11 = 0x0
Note that the string is truncated at the zero byte - I think that is probably a bug in the system, but it seems a little unlikely that I'd manage to find one on my first attempt at using wprintf(), so it is more likely I'm doing something wrong.
You're right: in the multi-byte UTF-8 source code, the string occupies 17 bytes (8 one-byte Basic Latin (ASCII) characters, and 3 characters each encoded using 3 bytes). So a raw strlen() on the source string would return 17.
GCC version is:
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Just for giggles, I tried clang, and I get a different result. Compiled using:
clang -o row row.c -Wall -std=c99
using:
Apple clang version 2.1 (tags/Apple/clang-163.7.1) (based on LLVM 3.0svn)
Target: x86_64-apple-darwin11.3.0
Thread model: posix
The output when compiled with clang is:
String <<123abc가1나1다>>
Length = 17
Character 0 = 0x31
Character 1 = 0x32
Character 2 = 0x33
Character 3 = 0x61
Character 4 = 0x62
Character 5 = 0x63
Character 6 = 0xEA
Character 7 = 0xB0
Character 8 = 0x80
Character 9 = 0x31
Character 10 = 0xEB
Character 11 = 0x82
Character 12 = 0x98
Character 13 = 0x31
Character 14 = 0xEB
Character 15 = 0x8B
Character 16 = 0xA4
Character 17 = 0x0
So, now the string appears correctly, but the length is given as 17 instead of 11. Superficially, you can take your choice of bugs - string looks OK (in a terminal - /Applications/Utilities/Terminal - acclimatized to UTF8) but length is wrong, or length is right but string does not appear correctly.
I note that sizeof(wchar_t) in both gcc and clang is 4.
The left hand does not understand what the right hand is doing. I think there's a case for claiming both are broken, in different ways.
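The two lengths are easy to cross-check outside C. The short Python sketch below counts code points and UTF-8 bytes for the same string, written with escapes so that the result does not depend on the source file encoding. It only illustrates where 11 and 17 come from and does not explain the compilers' behavior.

```python
# Same text as the C literal: 123abc + U+AC00 + 1 + U+B098 + 1 + U+B2E4
text = "123abc\uac001\ub0981\ub2e4"

code_points = len(text)                       # what wcslen() should count
utf8 = text.encode("utf-8")                   # bytes as stored in the source file
utf32_units = len(text.encode("utf-32-le")) // 4  # 4-byte wchar_t units

print(code_points, len(utf8), utf32_units)
print([hex(ord(c)) for c in text[6:]])
print(utf8[6:9].hex())
```

The 11 matches the GCC output (0xAC00, 0xB098, 0xB2E4 as single characters), while the 17 matches the clang output, where each UTF-8 byte (0xEA, 0xB0, 0x80, ...) appears as its own wide character.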
