Need a Lua script to convert text into a UTF-8 encoded string

I want to create a Lua script that converts text to a UTF-8 encoded string.
The problem is that I am using Lua 5.2, so I cannot use LuaJIT, which has libraries for this.
So I need a function that will do this task for me.
For example, if I pass "hey this is sam this side", it should give me the UTF-8 encoded string "\x68\x65\x79\x20\x74\x68\x69\x73\x20\x69\x73\x20\x73\x61\x6d\x20\x74\x68\x69\x73\x20\x73\x69\x64\x65"
The requirement is that it must use plain Lua only.

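You can do it like this:
-- Replace every byte of str with "\x" plus its two-digit uppercase hex value.
local str = "hey this is sam this side"
local answer = string.gsub(str, "(.)", function (x)
  return string.format("\\x%02X", string.byte(x))
end)
print(answer)
This prints:
\x68\x65\x79\x20\x74\x68\x69\x73\x20\x69\x73\x20\x73\x61\x6D\x20\x74\x68\x69\x73\x20\x73\x69\x64\x65
Lua strings are byte sequences, so if the script is saved as UTF-8, non-ASCII characters come out as one escape per byte of their UTF-8 encoding.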

Related

Failed to compare UTF-8 characters in Ruby

I'm using Ruby with Cucumber for automation.
I'm trying to pass Japanese characters as a parameter to a user-defined function that verifies them against the database.
This is the statement I used:
x=$objDB.run_select_query_verifyText('select name from xxxx where id=1','ごせり槎ゃぱ')
The run_select_query_verifyText() function connects to the database, fetches the records, and verifies the text passed as a parameter (the Japanese characters).
It returns true if the string matches the table data in the database, and false otherwise.
But I always get false, and I found that the Japanese string is converted to "??????" when the data is compared.
Note: My program works fine with English characters.
Your problem is most likely with character encodings. The database returns the content in a different encoding than the Ruby string you are working with. You need to figure out what the database encoding is and make sure both are the same.
If you are using Ruby 1.9, you can check the current encoding with yourstring.encoding and change it to e.g. UTF-8 with yourstring.encode("UTF-8").
If you are on Ruby 1.8, things are a bit more tricky, as the String class doesn't natively support encodings. You can use e.g. the character-encodings gem to work around this.
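To illustrate the Ruby 1.9+ advice, here is a minimal sketch of how two strings with the same text but different encodings compare as unequal until both are transcoded. The helper name same_text? and the choice of Shift_JIS as the "database" encoding are assumptions made for the example, and a shorter Japanese string is used as a stand-in. Your driver may return something else, so check it with .encoding first.

```ruby
# encoding: utf-8

# Hypothetical helper: transcode both values to UTF-8 before comparing.
def same_text?(expected, actual)
  expected.encode("UTF-8") == actual.encode("UTF-8")
end

expected = "こんにちは"                   # UTF-8 literal from the test script
from_db  = expected.encode("Shift_JIS")   # stand-in for a value returned in another encoding

puts expected.encoding                    # UTF-8
puts from_db.encoding                     # Shift_JIS
puts expected == from_db                  # false: same text, different bytes and encoding
puts same_text?(expected, from_db)        # true once both are UTF-8
```

If the comparison still fails after transcoding, the "??????" most likely means the text was already lost on the way in or out of the database, so the connection's character set would be the next thing to check.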

Open a file with user input Ruby

I take a string variable from a user like this:
mail = gets
and I want to use this variable to open a file.
file = File.new(mail, "r") ##obviously this isn't working
How do I actually use this mail variable to open a file of that name?
Thanks
mail = gets.chomp
gets returns the string with a trailing \n, which chomp removes.
I prefer mail = gets.strip.
strip seems to be slightly slower than chomp, but I find it a little more readable.
If you're curious about the benchmark, check out the gist here.
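As a rough sketch of the difference: chomp removes only the trailing line ending, while strip removes whitespace on both ends. The helper read_named_file and the temporary file below are made up for the demonstration. In a real script the input would come from gets.

```ruby
require "tmpdir"

# Hypothetical helper: open the file named by a raw line of user input.
def read_named_file(raw_input)
  path = raw_input.chomp                 # drop the trailing newline that gets leaves in place
  File.open(path, "r") { |f| f.read }
end

p "mail.txt\n".chomp                     # "mail.txt"
p "  mail.txt \n".chomp                  # "  mail.txt " (only the line ending is removed)
p "  mail.txt \n".strip                  # "mail.txt"

Dir.mktmpdir do |dir|
  path = File.join(dir, "mail.txt")
  File.write(path, "hello")
  puts read_named_file(path + "\n")      # hello
end
```

chomp is the safer choice if a file name could legitimately begin or end with a space. strip is more forgiving of stray whitespace typed by the user.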

Print number of characters in UTF-8 string

For example:
local a = "Lua"
local u = "Луа"
print(a:len(), u:len())
output:
3 6
How can I output number of characters in utf-8 string?
If you need to use Unicode/UTF-8 in Lua, you need to use external libraries, because Lua only works with 8-bit strings. One such library is slnunicode. Here is example code that calculates the length of your string:
local unicode = require "unicode"
local utf8 = unicode.utf8
local a = "Lua"
local u = "Луа"
print(utf8.len(a), utf8.len(u)) --> 3 3
In Lua 5.3, you can use utf8.len to get the length of a UTF-8 string:
local a = "Lua"
local u = "Луа"
print(utf8.len(a), utf8.len(u))
Output: 3 3
You don't.
Lua is not Unicode aware. All it sees is a string of bytes. When you ask for the length, it gives you the length of that byte string. If you want to use Lua to interact in some way with Unicode strings, you have to either write a Lua module that implements those interactions or download such a module.
Another alternative is to wrap the native OS UTF-8 string functions and let the OS do the heavy lifting. This depends on which OS you use: I've done this on OS X and it works a treat. Windows would be similar. Of course, it opens another can of worms if you just want to run a script from the command line, so it depends on your app.

Ruby replace html entity with hexadecimal equivalent

Is there a way to use gsub (or something else) in Ruby to replace a string with its hexadecimal equivalent? In MySQL you'd do something like this:
self.connection.execute("UPDATE `dvd_actor` SET actor = replace(actor, '&pound;', CHAR(163));")
I'm rewriting this in Rails and using gsub, something like this:
self.actor = actor.gsub(/&pound;/, "£").strip if actor =~ /&pound;/
But I already have all the lines written with the hexadecimal character code, and I'm trying to avoid finding out which character is which (some of them require copy/pasting because I don't have them on the English keyboard).
I tried this (which I saw in a post here):
actor.gsub(/"/) { "0x134".hex } if actor =~ /"/
But that doesn't do the trick, it produces a number.
Or better yet, maybe there's a gem that already does that? Basically take the HTML values and fix them? Oh, that would be nice.
I would try "0x134".hex.to_s(16). It converts "0x134" into "134".
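The block form of gsub inserts the block's return value converted to a string, which is why returning "0x134".hex puts digits into the text. One way to get the character instead is Integer#chr with an explicit encoding. The sketch below assumes a small hand-written lookup table (entity_code_points, hypothetical and far from complete) and leaves unknown entities untouched.

```ruby
# encoding: utf-8

# Hypothetical lookup table: entity name => code point. Extend as needed.
def entity_code_points
  { "pound" => 0xA3, "quot" => 0x22, "amp" => 0x26 }
end

# Replace each known named entity with the character for its code point.
def replace_entities(text)
  text.gsub(/&(\w+);/) do
    code = entity_code_points[$1]
    code ? code.chr(Encoding::UTF_8) : $&   # keep unknown entities as they are
  end
end

puts "0xA3".hex.chr(Encoding::UTF_8)               # £
puts replace_entities("Price: &pound;5 &amp; up")  # Price: £5 & up
```

This keeps the hexadecimal codes you already have, with no need to type the characters themselves.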
I believe I found it: a gem called htmlentities is supposed to do just what I want. So I have this:
require "htmlentities"
fixer = HTMLEntities.new  # assumed setup; the original snippet did not show how fixer was created
ampersands = where("actor LIKE ?", "%&%;%")
ampersands.each do |actor|
  fixed_actor = fixer.decode(actor.actor)
  self.update(actor.id, :actor => fixed_actor)
end

Char to UTF code in vbscript

I'd like to create a .properties file from a VBScript, to be used in a Java program. I'm going to use some strings in languages that use characters outside the ASCII range, so I need to replace these characters with their Unicode escape codes. This would be \u0061 for a, \u0062 for b, and so on.
Is there a way to get the UTF code for a char in VBScript?
VBScript has the AscW function that returns the Unicode (wide) code of the first character in the specified string.
Note that AscW returns the character code as a decimal number, so if you need it in a specific format, you'll have to write some additional code for that (and the problem is, VBScript doesn't have decent string formatting functions). For example, if you need the code formatted as \unnnn, you could use a function like this:
WScript.Echo ToUnicodeChar("✈") ' \u2708
Function ToUnicodeChar(Char)
    Dim str
    ' AscW returns a signed 16-bit value; mask it so codes above 7FFF stay positive
    str = Hex(AscW(Char) And &HFFFF&)
    ToUnicodeChar = "\u" & String(4 - Len(str), "0") & str
End Function
