If you were writing to a text file in Fortran using WRITE/FORMAT, how would you set up the FORMAT statement to write a non-printable character? Basically a non-keyboard/invisible character. I know in C++ you can give it an ASCII number; I'm not sure how Fortran would do it.
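For reference, here is the C++ approach the question mentions, as a minimal sketch: write the character by giving its numeric code and casting it to char. The BEL code (7) and the file name are just illustrative assumptions.

#include <fstream>

int main() {
    std::ofstream out("demo.txt");
    // Insert a non-printable character by its ASCII code:
    // 7 is BEL, an invisible control character.
    out << "before" << static_cast<char>(7) << "after\n";
}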
I need to sort some strings of Japanese/Chinese text, stored in UTF-16 format. To start with the sorting, I first normalise each character using the tolower function, but this function gives me the same reply (val: 31) for some characters. I tried the toupper function as well, but there was no change. If I first convert from UTF-16 to UTF-8, then things start working correctly. Could anyone help me understand what I am doing wrong? This bug is limited to these languages only.
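A likely cause (an assumption, since the code isn't shown): std::tolower from <cctype> is only defined for values representable as unsigned char, plus EOF, so passing raw UTF-16 code units to it is undefined behaviour. A minimal C++ sketch of the distinction, using the wide-character variant from <cwctype>:

#include <cctype>
#include <cwctype>
#include <iostream>

int main() {
    // std::tolower is only defined for unsigned-char values and EOF;
    // passing a UTF-16 code unit such as 0x3042 (HIRAGANA A)
    // is undefined behaviour.
    std::cout << static_cast<char>(std::tolower('A')) << "\n";  // prints a

    // For wide characters, use the wide variant instead.
    wchar_t w = static_cast<wchar_t>(std::towlower(L'A'));      // L'a'
    std::cout << static_cast<char>(w) << "\n";                  // prints a
}

Note also that Japanese and Chinese characters have no upper/lower case at all, so for real collation a locale-aware library (for example ICU) is the usual tool.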
I'm developing software that stores its data in a binary file format. However, as a courtesy to innocent shell users who might cat such a file to inspect its contents, I'm thinking of having an ASCII-compatible "magic string" at the start of the file that gives the name and version of the binary format.
I'm thinking of having at least ten lines (\n) in the message so that head with its default settings doesn't hit the binary part.
Now, I wonder if there is any control character or escape code that would hint to the shell that the following content isn't interpretable as printable text and should just be ignored? I tried 0x00 (the null byte) and 0x04 (Ctrl-D), but they seem to be simply ignored when catting the file.
cat regards a file as text; there is no way to trigger an end-of-file, since EOF is not actually a character.
The other way around works, of course: a format can specify that the binary content only starts after a certain marker.
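A minimal sketch of the layout described above (the header text and payload are illustrative assumptions): an ASCII preamble containing ten newlines, followed by the raw binary data.

#include <cstdint>
#include <fstream>
#include <string>

int main() {
    std::ofstream out("data.bin", std::ios::binary);

    // Human-readable preamble: name + version, padded to ten newlines
    // so that `head` (10 lines by default) stops before the binary part.
    std::string header = "MYFORMAT version 1\n";
    header.append(9, '\n');
    out.write(header.data(), static_cast<std::streamsize>(header.size()));

    // Binary payload begins here.
    std::uint32_t payload = 0xDEADBEEF;
    out.write(reinterpret_cast<const char*>(&payload), sizeof(payload));
}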
I'm trying to understand exactly who among the set of TTY, kernel, line discipline, and shell actually deals with any given part of my input. In particular, I've been working through this article:
The TTY demystified
For example, if I want an actual control character to show up verbatim in Bash, I can use Ctrl+V to quote it. E.g. I can type Ctrl+V then Ctrl+H to get a literal backspace character. Pretty neat!
echo '3^H'
3
My question is this: if I'm in something that reads in canonical mode (I believe cat does this), I can insert a null character by typing Ctrl+V then Ctrl+2 (effectively Ctrl+@; ^@ is the caret notation for the null character).
Bash won't allow me to enter a verbatim null character on one of its lines, though, and it looks like Python and other readline programs won't either.
Does anybody know why this is, or a general-purpose workaround?
The C library uses a literal null as a string terminator, so anything using that is unable to represent strings containing literal nulls.
Programs which need to support literal nulls define their own string data type.
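A small C++ demonstration of that terminator behaviour: C-style string functions stop at the first null byte, while a length-counted type such as std::string carries embedded nulls as ordinary data.

#include <cstring>
#include <iostream>
#include <string>

int main() {
    // The C library sees only what precedes the embedded null.
    const char raw[] = "ab\0cd";
    std::cout << std::strlen(raw) << "\n";  // prints 2 -- stops at the null

    // A length-counted string keeps the null as ordinary data.
    std::string s("ab\0cd", 5);
    std::cout << s.size() << "\n";          // prints 5
}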
I want to convert a string from the Windows-1252 (CP1252) character set to UTF-8. For this I used the iconv library in my C++ application, which runs on Linux.
I used the API iconv() and converted my string.
There is a character è in my input. UTF-8 also supports this character, so when my conversion is over, my output should contain the same character è.
But when I look at the output, the character è shows up as the two-character sequence Ã¨ (the correct UTF-8 bytes 0xC3 0xA8 rendered as if they were CP1252), which I don't want.
One more point: if the converter finds any unknown character, it should automatically be replaced with the Unicode REPLACEMENT CHARACTER � (U+FFFD), which is not happening.
How can I achieve the above two points with the iconv library?
I used the below APIs to convert the string:
1) iconv_open("UTF-8", "CP1252")
2) iconv() - pass the parameters required
3) iconv_close(cd)
Can anybody help me sort out this issue, please?
Please use //IGNORE to skip characters that cannot be converted (note that it silently drops them rather than replacing them with U+FFFD):
iconv_open("UTF-8//IGNORE","CP1252")
I have some UTF text starting with "ef bb bf". How can I turn this message into human-readable text? vim, gedit, etc. interpret the file as plain text and still show the ef bb bf bytes even when I force them to read the file with several UTF encodings. I tried the "recode" tool; it doesn't work. Even PHP's utf8_decode failed to produce the expected text output.
Please help, how can I convert this file so that I can read it?
ef bb bf is the UTF-8 BOM (byte order mark). Strip off the first three bytes and try to utf8_decode the remainder.
$text = "\xef\xbb\xbf....";
echo utf8_decode(substr($text, 3));
Is it UTF-8, UTF-16, or UTF-32? It matters a lot! I assume you want to convert the text into old-fashioned ASCII (where every character is 1 byte long).
UTF-8 should already be (at least mostly) readable, as it uses 1 byte for standard ASCII characters and multiple bytes only for special/multilingual characters (character codes > 127). It sounds like your file isn't UTF-8, or you'd already be able to read it! Online content is generally UTF-8.
Unicode character codes are the same as the old ASCII codes up to 127.
UTF-16 and UTF-32 encode every character in 2 and 4 bytes respectively (UTF-16 actually uses 4 bytes for characters outside the Basic Multilingual Plane), whether or not a character would fit in a single byte. That makes the file unreadable in a text editor that is expecting UTF-8.
gedit supports UTF-16 and UTF-32, but you need to add those encodings explicitly in the open dialog box (and possibly select one explicitly instead of relying on auto-detect).
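Since recognising the BOM is the key step above, here is a small sketch that sniffs the leading bytes of a file and reports the likely Unicode encoding (the byte patterns are the standard BOMs; the file name is an illustrative assumption):

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Return the encoding suggested by a leading byte order mark, if any.
// UTF-32LE must be tested before UTF-16LE: its BOM also starts ff fe.
std::string sniff_bom(const std::vector<unsigned char>& b) {
    if (b.size() >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00)
        return "UTF-32LE";
    if (b.size() >= 4 && b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF)
        return "UTF-32BE";
    if (b.size() >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
        return "UTF-8";
    if (b.size() >= 2 && b[0] == 0xFF && b[1] == 0xFE)
        return "UTF-16LE";
    if (b.size() >= 2 && b[0] == 0xFE && b[1] == 0xFF)
        return "UTF-16BE";
    return "no BOM (UTF-8 without BOM, or a legacy code page)";
}

int main() {
    std::ifstream in("mystery.txt", std::ios::binary);
    std::vector<unsigned char> head(4, 0);
    in.read(reinterpret_cast<char*>(head.data()),
            static_cast<std::streamsize>(head.size()));
    head.resize(static_cast<std::size_t>(in.gcount()));
    std::cout << sniff_bom(head) << "\n";
}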