Does V8 have Unicode support?

I'm using V8 to run JavaScript from native (C++) code. To call a JavaScript function I need to convert all the parameters to V8 data types.
For example, code to convert a char* to a V8 data type:
char* value;
...
v8::String::New(value);
Now I need to pass Unicode characters (wchar_t) to JavaScript.
First of all, does V8 support Unicode characters? If so, how do I convert wchar_t/std::wstring to a V8 data type?

I'm not sure if this was the case at the time this question was asked, but at the moment the V8 API has a number of functions which support UTF-8, UTF-16 and Latin-1 encoded text:
https://github.com/v8/v8/blob/master/include/v8.h
The relevant functions to create new string objects are:
String::NewFromUtf8 (UTF-8 encoded, obviously)
String::NewFromOneByte (Latin-1 encoded)
String::NewFromTwoByte (UTF-16 encoded)
Alternatively, you can avoid copying the string data and construct a V8 string object that refers to existing data (whose lifecycle you control):
String::NewExternalOneByte (Latin-1 encoded)
String::NewExternalTwoByte (UTF-16 encoded)

Unicode just maps characters to numbers (code points). What you need is a proper encoding, like UTF-8 or UTF-16.
V8 seems to support UTF-8 (v8::String::WriteUtf8) and a 16-bit type that is not described further (Write). I would give it a try and write some UTF-16 into it.
In Unicode applications, Windows stores UTF-16 in std::wstring. Maybe try something like:
std::wstring yourString;
// Assumes wchar_t is 16 bits wide (as on Windows), so the data is already UTF-16.
v8::String::New(reinterpret_cast<const uint16_t*>(yourString.c_str()), yourString.length());

No, it doesn't have Unicode support; the above solution is fine.

The following code did the trick:
wchar_t path[1024] = L"gokulestás";
v8::String::New((uint16_t*)path, wcslen(path));
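
The cast from wchar_t* to uint16_t* above only holds where wchar_t is 16 bits wide, as on Windows; on Linux and macOS wchar_t is typically 32 bits. As a minimal sketch under that assumption, a helper like the following (WideToUtf16 is a hypothetical name, not part of V8) produces UTF-16 code units on either kind of platform:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of V8): converts a std::wstring to UTF-16
// code units whether wchar_t is 16 or 32 bits wide.
std::vector<uint16_t> WideToUtf16(const std::wstring& in) {
    std::vector<uint16_t> out;
    out.reserve(in.size());
    for (wchar_t wc : in) {
        uint32_t cp = static_cast<uint32_t>(wc);
        if (sizeof(wchar_t) == 2 || cp < 0x10000) {
            // Already a UTF-16 code unit, or a code point in the BMP.
            out.push_back(static_cast<uint16_t>(cp));
        } else if (cp <= 0x10FFFF) {
            // Supplementary code point: split into a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<uint16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<uint16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            // Not a valid code point: emit U+FFFD REPLACEMENT CHARACTER.
            out.push_back(0xFFFD);
        }
    }
    return out;
}
```

The resulting buffer and its size could then be handed to a two-byte string constructor such as the String::NewFromTwoByte mentioned in the first answer; the exact arguments depend on the V8 version in use.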

Related

Universal encoding detector in golang?

Some sample code uses http.DetectContentType(buffer[:n]), but it only detects a limited set of charsets; in cases like ANSI it reports UTF-8.
Is there a universal solution for this problem?

To check whether a byte slice is valid UTF-8, you can use utf8.Valid.

Can you use UTF-8 code for HttpServletRequest.setAttribute?

For example, take this:
https://alvinalexander.com/blog/post/servlets/how-put-object-request-httpservletrequest-servlet
request.setAttribute("YOUR_KEY", yourVariable);
How do I make yourVariable a UTF-8 encoded string?
Thanks!

In Java Servlets, request-scoped variables are internal to the JVM, so you don't have to worry about encoding them. They're just regular Java strings, which are internally stored as a series of 16-bit characters. You only have to worry about encoding strings as UTF-8 (or decoding them from UTF-8) when sending them outside of the JVM (or receiving them from outside of the JVM). You could encode a Java string into a byte buffer using UTF-8, but then it would just be a byte buffer, not a string. You're best off treating strings within the JVM as regular String instances and only UTF-8 encoding them when sending them to a destination that expects UTF-8. If you're using the string in a JSP, then (assuming that the JSP is using UTF-8) the string will be encoded as UTF-8 during the rendering of the JSP.

Decoding unicode code point into utf8 using ICU

I have a unicode character code point stored as a string.
std::string code = "0663";
I need to decode it into utf8 and get as a standard std::string using the ICU library.
I decided to use ICU to get a cross-platform bit-independent solution.

Untested:
Convert the string into an int32_t.
Treat the int32_t as a UChar32.
Create a UnicodeString with UnicodeString::setTo from the UChar32.
Create a string object with UnicodeString::toUTF8String from the UnicodeString.
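
To show what those steps compute, here is a minimal sketch that does the same conversion by hand, without ICU, so it is a stand-in for the ICU calls rather than an example of them. It assumes the stored string is a hexadecimal code point (so "0663" means U+0663); CodePointHexToUtf8 is a hypothetical helper name.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: parses a hexadecimal code point such as "0663" and
// returns its UTF-8 encoding. std::stoul throws if the text is not a number.
std::string CodePointHexToUtf8(const std::string& hex) {
    uint32_t cp = static_cast<uint32_t>(std::stoul(hex, nullptr, 16));
    // Surrogates and values above U+10FFFF are not encodable; use U+FFFD.
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) cp = 0xFFFD;

    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For "0663" this yields the two bytes 0xD9 0xA3. With ICU, the parsed value would instead be passed through UnicodeString::setTo and UnicodeString::toUTF8String as listed in the steps.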

iconv is not working properly on Linux (C++)

I want to convert a string from the 1252 character set to UTF-8. For this I used the iconv library in my C++ application, which runs on Linux.
I used the iconv() API to convert my string.
There is a character è in my input. UTF-8 also supports this character, so when the conversion is done, my output should contain the same character è.
But when I look at the output, the character è has been converted to Ã¨, which I don't want.
One more point: if the converter finds an unknown character, it should automatically be replaced with the Unicode REPLACEMENT CHARACTER � (U+FFFD), which is not happening.
How can I achieve these two things with the iconv library?
I used the following APIs to convert the string:
1) iconv_open("UTF-8","CP1252")
2) iconv() - pass the required parameters
3) iconv_close(cd)
Can anybody help me sort out this issue, please?

Please use this to handle invalid characters. With //IGNORE, iconv silently discards characters it cannot convert instead of failing:
iconv_open("UTF-8//IGNORE","CP1252")
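
On the è symptom: è is the single byte 0xE8 in CP1252 and the two bytes 0xC3 0xA8 in UTF-8, and those two bytes read as "Ã¨" when something displays them as CP1252. So the output described in the question may already be correct UTF-8 that is being viewed with the wrong encoding, or it may have been converted twice. A minimal sketch for checking the actual bytes, assuming glibc's iconv (where the input pointer is char**) and a hypothetical helper name Cp1252ToUtf8:

```cpp
#include <iconv.h>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts CP1252 bytes to UTF-8 using iconv.
// Throws if the converter cannot be opened or the input has a byte
// that CP1252 does not define.
std::string Cp1252ToUtf8(const std::string& input) {
    iconv_t cd = iconv_open("UTF-8", "CP1252");
    if (cd == reinterpret_cast<iconv_t>(-1)) {
        throw std::runtime_error("iconv_open failed");
    }

    std::string in_copy = input;            // iconv needs a mutable buffer
    char* in_ptr = &in_copy[0];
    size_t in_left = in_copy.size();

    // Each CP1252 byte becomes at most 3 bytes in UTF-8.
    std::string out(in_copy.size() * 3, '\0');
    char* out_ptr = &out[0];
    size_t out_left = out.size();

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == static_cast<size_t>(-1)) {
        throw std::runtime_error("iconv conversion failed");
    }
    out.resize(out.size() - out_left);      // keep only the bytes written
    return out;
}
```

If the result for an input of 0xE8 is exactly 0xC3 0xA8, the conversion worked and the remaining problem is in how the output is displayed or processed afterwards.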

libxml2 questions about xmlChar*

I'm using libxml2. All of its functions work with xmlChar*. I found that xmlChar is an unsigned char.
So I have some questions about how to work with it.
1) For example, if I'm working with a UTF-16 or UTF-32 file, how does libxml2 process it and return xmlChar from its functions? Will I lose some characters?
2) If I want to do something with this string, should I cast it to char* or wchar_t*, and how?
Will I lose some characters?

xmlChar is for handling UTF-8 encoding only.
So, to answer your questions:
No, you won't lose any characters if using UTF-16 or UTF-32. Just use iconv or any other library to encode your UTF-16 or UTF-32 data as UTF-8 before passing it to the API.
Do not just "cast" the string. Convert it to another encoding if needed.
