AWS lambda function to speak number as digit in alexa - aws-lambda

I have tried to use say-as interpret-as to make Alexa speak number in digits
Example - 9822 must not read in words instead '9,8,2,2'
One of the two ways I have tried is as follows:
this.emit(':tell',"Hi "+clientname+" your "+theIntentConfirmationStatus+" ticket is sent to "+ "<say-as interpret-as='digits'>" + clientno + "</say-as>",'backup');
The other one is this:
this.response.speak("Hi "+clientname+" your "+theIntentConfirmationStatus+" ticket is sent to "+ "<say-as interpret-as='digits'>" + clientno + "</say-as>");
Both are not working but working on a separate fresh function.

Actually your code SHOULD work.
Maybe you can try in test simulator and send us the code your script produces? Or the logs?
I've tried the following:
<speak>
1. The numbers are: <say-as interpret-as="digits">5498</say-as>.
2. The numbers are: <say-as interpret-as="spell-out">5498</say-as>.
3. The numbers are: <say-as interpret-as="characters">5498</say-as>.
4. The numbers are: <prosody rate="x-slow"><say-as interpret-as="digits">5498</say-as></prosody>.
5. The number is: 5498.
</speak>
Digits, Spell-out and Characters all have the effect you want.
If you want to Alexa to say it extra slow, use the prosody in #4.
Try using examples #2 or #3, maybe this works out?
Otherwise the example from Amod will work too.

You can split number into individual digits using sample function ( please test it for your possible inputs-its not tested for all input). You can search for similar function on stackoverflow
function getNumber(tablenumber) {
var number = (""+tablenumber).split("");
var arrayLength = number.length;
var tmp =" ";
for (var i = 0; i < arrayLength; i++) {
var tmp = tmp + myStringArray[i] + ", <break time=\"0.4s\"/> ";
}
return tmp;
}
In your main function... call this
var finalresult = getNumber(clientno);
this.emit(':tell',"Hi "+clientname+" your "+theIntentConfirmationStatus+" ticket is sent to "+ finalresult ,'backup');

Edited: Yep, nightflash's answer is great.
You could also break the numbers up yourself if you need other formatting, such as emphasizing particular digits, add pauses, etc. You would need to use your Lambda code to convert the numeric string to multiple digits separated by spaces and any other formatting you need.
Here's an example based on the answers in this post:
var inputNumber = 12354987;
var output = '';
var sNumber = inputNumber.toString();
for (var i = 0, len = sNumber.length; i < len; i += 1) {
// just adding spaces here, but could be SSML attributes, etc.
output = output + sNumber.charAt(i) + ' ';
}
console.log(output);
This code could be refactored and done many other ways, but I think this is about the easiest to understand.

Related

How to display a list of number of words of each length - Javascript

Hi guys I am really stuck in this one situation :S I have a local .txtfile with a random sentence and my program is meant to :
I am finding it difficult to execute the third question. My code is ..
JavaScript
lengths.forEach((leng) => {
counter[leng] = counter[leng] || 0;
counter[leng]++;
});
$("#display_File_most").text(counter);
}
}
r.readAsText(f);
}
});
</script>
I have used this question for help but no luck - Using Javascript to find most common words in string?
I believe I have to store the sentence in an array and loop through it, uncertain if that is the correct step or if there is quicker way of finding the solution so I ask you guys.
Thanks for your time & I hope my question made sense :)
If you think of your solution as separated well done tasks, it would be really simple to find it. Here you have them together:
Convert the words into an array. Your guts were right about this :)
var source = "Hello world & good morning. The date is 18/09/2018";
var words = source.split(' ');
The next step is to find out the length of each word
var lengths = words.map(function(word) {
return word.length;
});
Finally the most complicated part is to get the number of occurrences for each length. One idea is to use an object to use key/value where key is the length and value is its count (source: https://stackoverflow.com/a/10541220/1505348)
Now you will see under the counter object have each word length with its repetition number on the source string.
var source = "Hello world & good morning. The date is 18/09/2018";
var words = source.split(' ');
var lengths = words.map(function(word) {
return word.length;
});
var counter = {};
lengths.forEach((leng) => {
counter[leng] = counter[leng] || 0;
counter[leng]++;
});
console.log(counter);
3.Produce a list of number of words of each length in sentence (not done).
Based on the question would this not be the solution?
var words = str.split(" ");
var count = {};
for (var i = 0; i<words.length; i++){
count[words[i].length] = (count [words[i].length] || 0) + 1
}

How to find a frequent character in a string written in pseudocode. Thanks

Most Frequent Character
Design a program that prompts the user to enter a string, and displays the character that appears most frequently in the string.
It is a homework question, but my teacher wasn't helpful and its driving me crazy i can't figure this out.
Thank You in advance.
This is what i have so far!
Declare String str
Declare Integer maxChar
Declare Integer index
Set maxChar = 0
Display “Enter anything you want.”
Input str
For index = 0 To length(str) – 1
If str[index] =
And now im stuck. I dont think its right and i dont know where to go with it!
It seems to me that the way you want to do it is:
"Go through every character in the string and remember the character we've seen most times".
However, that won't work. If we only remember the count for a single character, like "the character we've seen most times is 'a' with 5 occurrences", we can't know if perhaps the character in the 2nd place doesn't jump ahead.
So, what you have to do is this:
Go through every character of the string.
For every character, increase the occurrence count for that character. Yes, you have to save this count for every single character you encounter. Simple variables like string or int are not going to be enough here.
When you're done, you're left with a bunch of data looking like "a"=5, "b"=2, "e"=7,... you have to go though that and find the highest number (I'm sure you can find examples for finding the highest number in a sequence), then return the letter which this corresponds to.
Not a complete answer, I know, but that's all I'm going to say.
If you're stuck, I suggest getting a pen and a piece of paper and trying to calculate it manually. Try to think - how would you do it without a computer? If your answer is "look and see", what if the text is 10 pages? I know it can be pretty confusing, but the point of all this is to get you used to a different way of thinking. If you figure this one out, the next time will be easier because the basic principles are always the same.
This is the code I have created to count all occurences in a string.
String abc = "aabcabccc";
char[] x = abc.toCharArray();
String _array = "";
for(int i = 0; i < x.length; i++) //copy distinct data to a new string
{
if(_array.indexOf(x[i]) == -1)
_array = _array+x[i];
}
char[] y = _array.toCharArray();
int[] count1 = new int[_array.length()];
for(int j = 0; j<x.length;j++) //count occurences
{
count1[new String(String.valueOf(y)).indexOf(x[j])]++;
}
for(int i = 0; i<y.length;i++) //display
{
System.out.println(y[i] + " = " + count1[i]);
}

Converting an if code into forloop statement

Right now i have to write a code that will print out "*" for each collectedDots, however if it doesn't collect any collectedDots==0 then it print out "". Using too many if statements look messy and i was wandering how you would implement the forloop in this case.
As a general principle the kind of rearrangement you've done here is good. You have found a way to express the rule in a general way rather than as a sequence of special cases. This is much easier to reason about and to check, and it's obviously extensible to cases where you have more than 3 dots.
You probably have made an error in confusing your target number and the iteration value, I assume that collectedDots contains the number of dots you have (as per your if statement) and so you need to introduce a variable to count up to that value
for (int i =0; i <= collectedDots; i++)
{
stars = "*";
System.out.print(stars);
}
Ok, so you already have a variable called collectedDots that is a number which tells you how many stars to print?
So your loop would be something like
for every collected dot
print *
But you can't just print it out, you need to return a string that will be printed out. So it's more like
for every collected dot
add a * to our string
return the string
They key difference between this and your attempt so far is that you were assigning a star to be your string each time through the loop, then at the end of it, you return that string–no matter how many times you assign a star to the string, the string will always just be one star.
You also need a separate variable to keep track of your loop, this should do the trick:
String stars = "";
for(int i = 0; i < collectedDots; i++)
{
stars = stars + "*";
}
return stars;
You are almost correct. Just need to change range limit of looping. Looping initial value is set to 1. So whenever you have collectedDots = 0, it will not go in loop and will return "", as stars is intialized with "" before loop.
String stars = "";
for (int i =1; i <= collectedDots; i++)
{
stars = "*";
System.out.print(stars);
}
return stars;

AS3 dynamic text algorithm

Afternoon,
I have an odd algorithm. I would like to populate a string of code dynamically based on some user entry.
I have a multi-dimensional array with data in it and a multi-line input text field.
What I want is for a user to be able to enter some text
example:
00
01 - 02 - 03
comments: 12
my code would identify the numbers an treat everything else as text.
Thus, if my array is data[x][#], the # will correspond to their entry.
I would get
algorithm_string = data[x][0] + "\n" + data[x][1] + " - " + data[x][2] + " - " + data[x][3] + "\n" + "comments: " + data[x][12]
So the algorithm would construct the above, and then I could run through the code.
for(var x:int = 0; x < data.length; x++){
some_object._display_text.text = algorithm_string;
}
Ok so I want to first say that relying on a user to put in the entry exactly the way you want is probably not a good idea. They WILL make mistakes and your code WILL eventually not work as expected. I would recommend using 5 inputs restricted to numeric input, and labeling each field with which number should go in it.
However, you can accomplish what you are trying to do above like this:
var parts:Array = myInput.text.split(" ");
for (var i:int=0; i<parts.length, i++){
if(!isNaN(parseInt(parts[i]))){
// you have a number here.
data[x].push(parts[i]);
} else {
//this was not a number so ignore it
}
}
Again let me state I think you should refactor how you get the numbers, but that code will grab the numbers out and put them in the 0,1,2,3,and 4 indexes of your data[x], but relies on the user perfectly inputting the text every time.
Good luck! (refactor) :)

How to convert Chinese characters to Pinyin [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed last year.
The community reviewed whether to reopen this question 4 months ago and left it closed:
Original close reason(s) were not resolved
Improve this question
For sorting Chinese language text, I want to convert Chinese characters to Pinyin, properly separating each Chinese character and grouping successive characters together.
Can you please help me in this task by providing the logic or source code for doing this?
Please let me know if any open source or lib already present for this.
Short answer: you don't.
Long answer: There is no one-to-one mapping for 汉字 to 汉语拼音. Just some quick examples:
把 can be "ba" in the third tone or fourth tone.
了 can be "le" toneless or "liao" third tone.
乐 can be "le" or "yue", both in the fourth tone.
落 can be "luo", "la" or "lao", all in the fourth tone.
And so on. I have a beginners' book on this topic that has 207 examples. I stress that this is a beginners' book and is by no means complete. Each one has a page or two of examples of use and conditions under which you choose the appropriate pronunciation. It is not something that could be easily programmed (if at all).
And this doesn't even address the other slippery thing you want to deal with: the separation of characters into grouped words. The very notion of a word is a bit slippery in Chinese. (There's two terms that correspond, roughly to "word" in Chinese for example: 字 and 词. The first is the character, the second groups of characters that are put together into one concept. (I frequently get asked by Chinese speakers how many "words" I can read when they really mean "characters".) While in some cases the distinction is clear (the 词 "乌鸦", for example, is "crow" -- the two 字 must be together to express the idea properly and it would be incorrect to translate it as "black crow"), in others it is not so clear. What does "你好" translate to? Is it one word meaning, idiomatically, "hello"? Or is it two words translating literally to "you good"? Each of the characters involved stands alone or in groups with other words, but together they mean something entirely different from their individual meanings. Given this, how, precisely, do you plan to group the 汉语拼音 transliterations (which are difficult to impossible to get right in the first place!) into "words"?
While #JUST MY correct OPINION's answer addresses some of the difficulties of converting characters into pinyin, it is not an impossible problem to solve.
I have written a library (pinyinify) that solves this task with decent accuracy. Even though there is not a one-to-one mapping between characters and pinyin, my library can usually decide which pronunciation is correct. For example, "我受不了了" correctly converts to "wǒ shòubùliǎo le", with two different pronunciations of 了.
My approach to solving the problem is pretty simple:
First segment the text into words. For example, 我喜欢旅游 would be divided into three words: 我 喜欢 旅游. This is also not a simple process, but there are many libraries for it. jieba is one of the more popular libraries for this purpose.
Use a dictionary to convert the words into pinyin.
If the word is not in the dictionary, fall back to converting the individual characters to pinyin using their most common pronunciation.
CoreFoundation provides certain method to do the conversion:
CFMutableStringRef string = CFStringCreateMutableCopy(NULL, 0, CFSTR("中文"));
CFStringTransform(string, NULL, kCFStringTransformMandarinLatin, NO);
CFStringTransform(string, NULL, kCFStringTransformStripDiacritics, NO);
NSLog(#"%#", string);
The output is
zhong wen
the following code writing in C# can help you to simply convert chinese words that including in gb2312 encodec(just 2312 of often used Simplified-Chinese words) to pinyin.like convert "今天天气不错" to "JinTianTianQiBuCuo".
sometimes a chinese word is not one to one map to a pinyin,it depends on the context we talk about.like the "行" in "自行车"(bike) is pronounced "Xing",but in "银行"(bank) it pronounced "Hang".so if you have problem with this,you may find more complex solution to handle this.
sorry for my poor english.i hope this could give you a little help.
public class ChineseToPinYin
{
private static int[] pyValue = new int[]
{
-20319,-20317,-20304,-20295,-20292,-20283,-20265,-20257,-20242,-20230,-20051,-20036,
-20032,-20026,-20002,-19990,-19986,-19982,-19976,-19805,-19784,-19775,-19774,-19763,
-19756,-19751,-19746,-19741,-19739,-19728,-19725,-19715,-19540,-19531,-19525,-19515,
-19500,-19484,-19479,-19467,-19289,-19288,-19281,-19275,-19270,-19263,-19261,-19249,
-19243,-19242,-19238,-19235,-19227,-19224,-19218,-19212,-19038,-19023,-19018,-19006,
-19003,-18996,-18977,-18961,-18952,-18783,-18774,-18773,-18763,-18756,-18741,-18735,
-18731,-18722,-18710,-18697,-18696,-18526,-18518,-18501,-18490,-18478,-18463,-18448,
-18447,-18446,-18239,-18237,-18231,-18220,-18211,-18201,-18184,-18183, -18181,-18012,
-17997,-17988,-17970,-17964,-17961,-17950,-17947,-17931,-17928,-17922,-17759,-17752,
-17733,-17730,-17721,-17703,-17701,-17697,-17692,-17683,-17676,-17496,-17487,-17482,
-17468,-17454,-17433,-17427,-17417,-17202,-17185,-16983,-16970,-16942,-16915,-16733,
-16708,-16706,-16689,-16664,-16657,-16647,-16474,-16470,-16465,-16459,-16452,-16448,
-16433,-16429,-16427,-16423,-16419,-16412,-16407,-16403,-16401,-16393,-16220,-16216,
-16212,-16205,-16202,-16187,-16180,-16171,-16169,-16158,-16155,-15959,-15958,-15944,
-15933,-15920,-15915,-15903,-15889,-15878,-15707,-15701,-15681,-15667,-15661,-15659,
-15652,-15640,-15631,-15625,-15454,-15448,-15436,-15435,-15419,-15416,-15408,-15394,
-15385,-15377,-15375,-15369,-15363,-15362,-15183,-15180,-15165,-15158,-15153,-15150,
-15149,-15144,-15143,-15141,-15140,-15139,-15128,-15121,-15119,-15117,-15110,-15109,
-14941,-14937,-14933,-14930,-14929,-14928,-14926,-14922,-14921,-14914,-14908,-14902,
-14894,-14889,-14882,-14873,-14871,-14857,-14678,-14674,-14670,-14668,-14663,-14654,
-14645,-14630,-14594,-14429,-14407,-14399,-14384,-14379,-14368,-14355,-14353,-14345,
-14170,-14159,-14151,-14149,-14145,-14140,-14137,-14135,-14125,-14123,-14122,-14112,
-14109,-14099,-14097,-14094,-14092,-14090,-14087,-14083,-13917,-13914,-13910,-13907,
-13906,-13905,-13896,-13894,-13878,-13870,-13859,-13847,-13831,-13658,-13611,-13601,
-13406,-13404,-13400,-13398,-13395,-13391,-13387,-13383,-13367,-13359,-13356,-13343,
-13340,-13329,-13326,-13318,-13147,-13138,-13120,-13107,-13096,-13095,-13091,-13076,
-13068,-13063,-13060,-12888,-12875,-12871,-12860,-12858,-12852,-12849,-12838,-12831,
-12829,-12812,-12802,-12607,-12597,-12594,-12585,-12556,-12359,-12346,-12320,-12300,
-12120,-12099,-12089,-12074,-12067,-12058,-12039,-11867,-11861,-11847,-11831,-11798,
-11781,-11604,-11589,-11536,-11358,-11340,-11339,-11324,-11303,-11097,-11077,-11067,
-11055,-11052,-11045,-11041,-11038,-11024,-11020,-11019,-11018,-11014,-10838,-10832,
-10815,-10800,-10790,-10780,-10764,-10587,-10544,-10533,-10519,-10331,-10329,-10328,
-10322,-10315,-10309,-10307,-10296,-10281,-10274,-10270,-10262,-10260,-10256,-10254
};
private static string[] pyName = new string[]
{
"A","Ai","An","Ang","Ao","Ba","Bai","Ban","Bang","Bao","Bei","Ben",
"Beng","Bi","Bian","Biao","Bie","Bin","Bing","Bo","Bu","Ba","Cai","Can",
"Cang","Cao","Ce","Ceng","Cha","Chai","Chan","Chang","Chao","Che","Chen","Cheng",
"Chi","Chong","Chou","Chu","Chuai","Chuan","Chuang","Chui","Chun","Chuo","Ci","Cong",
"Cou","Cu","Cuan","Cui","Cun","Cuo","Da","Dai","Dan","Dang","Dao","De",
"Deng","Di","Dian","Diao","Die","Ding","Diu","Dong","Dou","Du","Duan","Dui",
"Dun","Duo","E","En","Er","Fa","Fan","Fang","Fei","Fen","Feng","Fo",
"Fou","Fu","Ga","Gai","Gan","Gang","Gao","Ge","Gei","Gen","Geng","Gong",
"Gou","Gu","Gua","Guai","Guan","Guang","Gui","Gun","Guo","Ha","Hai","Han",
"Hang","Hao","He","Hei","Hen","Heng","Hong","Hou","Hu","Hua","Huai","Huan",
"Huang","Hui","Hun","Huo","Ji","Jia","Jian","Jiang","Jiao","Jie","Jin","Jing",
"Jiong","Jiu","Ju","Juan","Jue","Jun","Ka","Kai","Kan","Kang","Kao","Ke",
"Ken","Keng","Kong","Kou","Ku","Kua","Kuai","Kuan","Kuang","Kui","Kun","Kuo",
"La","Lai","Lan","Lang","Lao","Le","Lei","Leng","Li","Lia","Lian","Liang",
"Liao","Lie","Lin","Ling","Liu","Long","Lou","Lu","Lv","Luan","Lue","Lun",
"Luo","Ma","Mai","Man","Mang","Mao","Me","Mei","Men","Meng","Mi","Mian",
"Miao","Mie","Min","Ming","Miu","Mo","Mou","Mu","Na","Nai","Nan","Nang",
"Nao","Ne","Nei","Nen","Neng","Ni","Nian","Niang","Niao","Nie","Nin","Ning",
"Niu","Nong","Nu","Nv","Nuan","Nue","Nuo","O","Ou","Pa","Pai","Pan",
"Pang","Pao","Pei","Pen","Peng","Pi","Pian","Piao","Pie","Pin","Ping","Po",
"Pu","Qi","Qia","Qian","Qiang","Qiao","Qie","Qin","Qing","Qiong","Qiu","Qu",
"Quan","Que","Qun","Ran","Rang","Rao","Re","Ren","Reng","Ri","Rong","Rou",
"Ru","Ruan","Rui","Run","Ruo","Sa","Sai","San","Sang","Sao","Se","Sen",
"Seng","Sha","Shai","Shan","Shang","Shao","She","Shen","Sheng","Shi","Shou","Shu",
"Shua","Shuai","Shuan","Shuang","Shui","Shun","Shuo","Si","Song","Sou","Su","Suan",
"Sui","Sun","Suo","Ta","Tai","Tan","Tang","Tao","Te","Teng","Ti","Tian",
"Tiao","Tie","Ting","Tong","Tou","Tu","Tuan","Tui","Tun","Tuo","Wa","Wai",
"Wan","Wang","Wei","Wen","Weng","Wo","Wu","Xi","Xia","Xian","Xiang","Xiao",
"Xie","Xin","Xing","Xiong","Xiu","Xu","Xuan","Xue","Xun","Ya","Yan","Yang",
"Yao","Ye","Yi","Yin","Ying","Yo","Yong","You","Yu","Yuan","Yue","Yun",
"Za", "Zai","Zan","Zang","Zao","Ze","Zei","Zen","Zeng","Zha","Zhai","Zhan",
"Zhang","Zhao","Zhe","Zhen","Zheng","Zhi","Zhong","Zhou","Zhu","Zhua","Zhuai","Zhuan",
"Zhuang","Zhui","Zhun","Zhuo","Zi","Zong","Zou","Zu","Zuan","Zui","Zun","Zuo"
};
/// <summary>
/// 把汉字转换成拼音(全拼)
/// </summary>
/// <param name="hzString">汉字字符串</param>
/// <returns>转换后的拼音(全拼)字符串</returns>
public static string Convert(string hzString)
{
// 匹配中文字符
Regex regex = new Regex("^[\u4e00-\u9fa5]$");
byte[] array = new byte[2];
string pyString = "";
int chrAsc = 0;
int i1 = 0;
int i2 = 0;
char[] noWChar = hzString.ToCharArray();
for (int j = 0; j < noWChar.Length; j++)
{
// 中文字符
if (regex.IsMatch(noWChar[j].ToString()))
{
array = System.Text.Encoding.Default.GetBytes(noWChar[j].ToString());
i1 = (short)(array[0]);
i2 = (short)(array[1]);
chrAsc = i1 * 256 + i2 - 65536;
if (chrAsc > 0 && chrAsc < 160)
{
pyString += noWChar[j];
}
else
{
// 修正部分文字
if (chrAsc == -9254) // 修正“圳”字
pyString += "Zhen";
else
{
for (int i = (pyValue.Length - 1); i >= 0; i--)
{
if (pyValue[i] <= chrAsc)
{
pyString += pyName[i];
break;
}
}
}
}
}
// 非中文字符
else
{
pyString += noWChar[j].ToString();
}
}
return pyString;
}
}
You can use the following method:
from __future__ import unicode_literals
from pypinyin import lazy_pinyin
hanzi_list = ['如何', '将', '汉字','转为', '拼音']
pinyin_list = [''.join(lazy_pinyin(_)) for _ in hanzi_list]
Output:
['ruhe', 'jiang', 'hanzi', 'zhuanwei', 'pinyin']
i had this problem and i found a solution in PHP (which could be cleaner i suppose but it works). I had some troubles because the file given in this topic is from hexa unicode.
1) Import the data from ftp://ftp.cuhk.hk/pub/chinese/ifcss/software/data/Uni2Pinyin.gz (thanks pierr) to your database or whatever
2) Import your data in an array as $pinyinArray[$hexaUnicode] = $pinyin;
3) Use this code:
/*
* Decimal representation of $c
* function found there: http://www.cantonese.sheik.co.uk/phorum/read.php?2,19594
*/
function uniord($c)
{
$ud = 0;
if (ord($c{0})>=0 && ord($c{0})<=127)
$ud = $c{0};
if (ord($c{0})>=192 && ord($c{0})<=223)
$ud = (ord($c{0})-192)*64 + (ord($c{1})-128);
if (ord($c{0})>=224 && ord($c{0})<=239)
$ud = (ord($c{0})-224)*4096 + (ord($c{1})-128)*64 + (ord($c{2})-128);
if (ord($c{0})>=240 && ord($c{0})<=247)
$ud = (ord($c{0})-240)*262144 + (ord($c{1})-128)*4096 + (ord($c{2})-128)*64 + (ord($c{3})-128);
if (ord($c{0})>=248 && ord($c{0})<=251)
$ud = (ord($c{0})-248)*16777216 + (ord($c{1})-128)*262144 + (ord($c{2})-128)*4096 + (ord($c{3})-128)*64 + (ord($c{4})-128);
if (ord($c{0})>=252 && ord($c{0})<=253)
$ud = (ord($c{0})-252)*1073741824 + (ord($c{1})-128)*16777216 + (ord($c{2})-128)*262144 + (ord($c{3})-128)*4096 + (ord($c{4})-128)*64 + (ord($c{5})-128);
if (ord($c{0})>=254 && ord($c{0})<=255) //error
$ud = false;
return $ud;
}
/*
* Translate the $string string of a single chinese charactere to unicode
*/
function chineseToHexaUnicode($string) {
return strtoupper(dechex(uniord($string)));
}
/*
*
*/
function convertChineseToPinyin($string,$pinyinArray) {
$pinyinValue = '';
for ($i = 0; $i < mb_strlen($string);$i++)
$pinyinValue.=$pinyinArray[chineseToHexaUnicode(mb_substr($string, $i, 1))];
return $pinyinValue;
}
$string = '龙江省五大';
echo convertChineseToPinyin($string,$pinyinArray);
echo: (long2)(jiang1)(sheng3,xing3)(wu3)(da4,dai4)
Of course, $pinyinArray is your array of data (hexoUnicode => pinyin)
Hope it will help someone.
If you use Visual Studio, this might be an option:
Microsoft.International.Converters.PinYinConverter
How to install:
First, download the Visual Studio International Pack 2.0, Official Download. Once the download is complete install the run file VSIPSetup.msi installation (x86 operating system on the default installation directory (C:\Program Files\Microsoft Visual Studio International Feature Pack 2.0).
After installation, you need to add a reference in VS, respectively reference:
C:\Program Files\Microsoft Visual Studio International Pack\Simplified Chinese Pin-Yin Conversion Library (Pinyin)
and
C:\Program Files\Microsoft Visual Studio International Pack\Traditional Chinese to Simplified Chinese Conversion Library and Add-In Tool (Traditional and Simplified Huzhuan to)
How to use:
public static string GetPinyin(string str)
{
string r = string.Empty;
foreach (char obj in str)
{
try
{
ChineseChar chineseChar = new ChineseChar(obj);
string t = chineseChar.Pinyins[0].ToString();
r += t.Substring(0, t.Length - 1);
}
catch
{
r += obj.ToString();
}
}
return r;
}
Source:
http://www.programering.com/a/MzM3cTMwATA.html

Resources