Processing language - writing a basic program of data - processing
So I have just started studying Processing at university, but unfortunately missed the first few lectures, so with the textbooks at hand I am attempting to learn this myself. Could someone let me know if I'm right or wrong with this? You would be saving my life. Kinda... thanks :D
This is the question:
Write a program which declares and initialises the following variables with their values. Add comments to your program to separate the different subsections.
i. noOfstudents = 50
ii. examMark = -50
iii. priceOfShoes = 59.99
iv. income = 10,750.99
v. greetingMessage = 'hello there'
vi. alphabet = 'A'
vii. lossOfIncome = -20.30
viii. sum = 0.0000000000076
and this is the code I have typed up for the question;
void setup() {
int noOfStudents;
noOfStudents = 50; //Number of Students
float examMark;
examMark = -50; // Exam Mark
float shoePrice;
shoePrice = 59.99; // The price of the shoes
double income;
income = 10750.99; // The income
print ("hello world"); //displays hello world
char message;
message = 'A'; //displays the character A
float lossOfIncome = -20.30;
double lossOfIncome2 = 0.0000000000076; //The loss of income
}
You could declare and initialise your variables on one line, e.g. int noOfStudents = 50;. Also, you are using the print() method for the greeting, but the question asks for a greetingMessage variable, so you more likely want to use the String variable type.
You might also want to take a look at Ch. 4 of Daniel Shiffman's book Learning Processing, or at least the examples for Ch. 4 online:
http://www.learningprocessing.com/examples/
Related
How to access members of a class with dynamic names Cocos2d-x C++
I have some numbers in my header that I want to access in the code like this:
int _number0;
int _number1;
Then in the implementation:
_number0 = 10;
_number1 = 20;
int i;
for (i = 0; i < 2; i++) {
    auto number = _number+i; // This is where I'm lost: how do I write the right-hand side so that I get this int by a name built from a string plus an integer?
    CCLOG("Number: %i", number);
    // Output:
    // Number: 10
    // Number: 20
}
I was thinking of pseudocode like this:
auto number = dynamic_cast<Int*>(this->findTheMemberWithName("level%i",i));
Is there any way to do something like this in C++? Thanks for any guidance. Greetings.
I think std::map should fulfill your requirement. P.S.: In fact, why don't you use an array or a vector to do the job? It is really not a good idea to make up variable names like this in C++.
How to find a frequent character in a string written in pseudocode. Thanks
Most Frequent Character: Design a program that prompts the user to enter a string, and displays the character that appears most frequently in the string. It is a homework question, but my teacher wasn't helpful and it's driving me crazy; I can't figure this out. Thank you in advance. This is what I have so far:
Declare String str
Declare Integer maxChar
Declare Integer index
Set maxChar = 0
Display "Enter anything you want."
Input str
For index = 0 To length(str) - 1
    If str[index] =
And now I'm stuck. I don't think it's right and I don't know where to go with it!
It seems to me that the way you want to do it is: "Go through every character in the string and remember the character we've seen most times". However, that won't work. If we only remember the count for a single character, like "the character we've seen most times is 'a' with 5 occurrences", we can't know whether the character in 2nd place later jumps ahead.
So, what you have to do is this: Go through every character of the string. For every character, increase the occurrence count for that character. Yes, you have to save this count for every single character you encounter. Simple variables like string or int are not going to be enough here. When you're done, you're left with a bunch of data looking like "a"=5, "b"=2, "e"=7,... You have to go through that and find the highest number (I'm sure you can find examples for finding the highest number in a sequence), then return the letter which this corresponds to.
Not a complete answer, I know, but that's all I'm going to say. If you're stuck, I suggest getting a pen and a piece of paper and trying to calculate it manually. Try to think: how would you do it without a computer? If your answer is "look and see", what if the text is 10 pages? I know it can be pretty confusing, but the point of all this is to get you used to a different way of thinking. If you figure this one out, the next time will be easier because the basic principles are always the same.
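The two passes described above (count every character, then pick the highest count) can be sketched in Python. This is only an illustration of the idea, not a solution in the course's pseudocode; the function name most_frequent_char is made up for the example, and ties are resolved in favour of the character seen first.

```python
def most_frequent_char(text):
    """Return (character, count) for the character that occurs most often."""
    if not text:
        raise ValueError("text must not be empty")

    # Pass 1: keep a separate count for every character we encounter.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1

    # Pass 2: walk the counts and remember the highest one seen so far.
    best_char, best_count = None, 0
    for ch, n in counts.items():
        if n > best_count:
            best_char, best_count = ch, n
    return best_char, best_count


print(most_frequent_char("aabcabccc"))  # ('c', 4)
```

The dictionary plays the role of the "a"=5, "b"=2, ... data mentioned above; a single integer such as maxChar cannot hold that information on its own.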
This is the code I have created to count all occurrences in a string.
String abc = "aabcabccc";
char[] x = abc.toCharArray();
String _array = "";
// Copy each distinct character to a new string
for (int i = 0; i < x.length; i++) {
    if (_array.indexOf(x[i]) == -1)
        _array = _array + x[i];
}
char[] y = _array.toCharArray();
int[] count1 = new int[_array.length()];
// Count occurrences: a character's index in _array is its slot in count1
for (int j = 0; j < x.length; j++) {
    count1[_array.indexOf(x[j])]++;
}
// Display each distinct character with its count
for (int i = 0; i < y.length; i++) {
    System.out.println(y[i] + " = " + count1[i]);
}
Should I always avoid lower case class names?
The following is incompatible with the Dart Style Guide. Should I not do this? I'm extending the num class to build a unit of measure library for medical applications. Using lower case looks more elegant than Kg or Lbs. In some cases using lower case is recommended for safety, i.e. mL instead of Ml.
class kg extends num {
  String uom = "kg";
  num _value;
  lbs toLbs() => new lbs(2.20462 * _value);
}
class lbs extends num {
  String uom = "lbs";
  num _value;
  kg toKg() => new kg(_value / 2.20462);
}
For your case I might pick a unit (e.g. milligrams) and make other units multiples of it. You can use division for conversion:
const mg = 1; // The unit
const g = mg * 1000;
const kg = g * 1000;
const lb = mg * 453592;
main() {
  const numPoundsPerKilogram = kg / lb;
  print(numPoundsPerKilogram); // 2.20462...
  const twoPounds = lb * 2;
  const numGramsInTwoPounds = twoPounds / g;
  print(numGramsInTwoPounds); // 907.184
}
It's best to make the unit small, so other units can be integer multiples of it (ints are arbitrary precision).
It's up to you if you use them. There might be situations where it can be important. One I can think of currently is when you have an open source project and you don't want to alienate potential contributors. If you have no special reason, stick with the guidelines; if you have a good reason, deviate from them.
Don't use classes for units. Use them for quantities:
class Weight {
  static const double LBS_PER_KG = 2.20462;
  num _kg;
  Weight.fromKg(this._kg);
  Weight.fromLbs(lbs) { this._kg = lbs / LBS_PER_KG; }
  get inKg => _kg;
  get inLbs => LBS_PER_KG * _kg;
}
Take a look at the Duration class for ideas.
Coding conventions serve to help you write quality, readable code. If you find that ignoring a certain convention helps to improve the readability, that is your decision. However, code is rarely only seen by one pair of eyes. Other programmers will be confused when reading your code if it doesn't follow the style guide. Following coding conventions allows others to quickly dive into your code and easily understand what is going on. Of course, it is possible that you really will be the only one to ever view this code, in which case this is moot. I would avoid deviating from the style guide in most cases, except where the advantage is very obvious. For your situation I don't think the advantage outweighs the disadvantage.
I wouldn't really take those coding standards too seriously. I think these guidelines are much more reasonable: less arbitrary, and more useful: Java: Code Conventions Javascript: http://javascript.crockford.com/code.html There are many more you can choose from - pick whatever works best for you! PS: As long as you're talking about Dart, check out this article: Google Dart to ultimately replace Javascript ... not!
Making a list of integers more human friendly
This is a bit of a side project I have taken on to solve a no-fix issue for work. Our system outputs a code to represent a combination of things on another thing. Some example codes are:
9-9-0-4-4-5-4-0-2-0-0-0-2-0-0-0-0-0-2-1-2-1-2-2-2-4
9-5-0-7-4-3-5-7-4-0-5-1-4-2-1-5-5-4-6-3-7-9-72
9-15-0-9-1-6-2-1-2-0-0-1-6-0-7
The max number in one of the slots I've seen so far is about 150 but they will likely go higher. When the system was designed there was no requirement for what this code would look like. But now the client wants to be able to type it in by hand from a sheet of paper, something the code above isn't suited for. We've said we won't do anything about it, but it seems like a fun challenge to take on. My question is where is a good place to start loss-less compressing this code? Obvious solutions such as store this code with a shorter key are not an option; our database is read only. I need to build a two way method to make this code more human friendly.
1) I agree that you definitely need a checksum: data entry errors are very common, unless you have really well trained staff and independent duplicate keying with automatic cross-checking.
2) I suggest http://en.wikipedia.org/wiki/Huffman_coding to turn your list of numbers into a stream of bits. To get the probabilities required for this, you need a decent sized sample of real data, so you can make a count, setting Ni to the number of times number i appears in the data. Then I suggest setting Pi = (Ni + 1) / (Sum_i (Ni + 1)), which smooths the probabilities a bit. Also, with this method, if you see e.g. numbers 0-150 you could add a bit of slack by entering numbers 151-255 and setting them to Ni = 0. Another way round rare large numbers would be to add some sort of escape sequence.
3) Finding a way for people to type the resulting sequence of bits is really an applied psychology problem but here are some suggestions of ideas to pinch.
3a) Software licences: just encode six bits per character in some 64-character alphabet, but group characters in a way that makes it easier for people to keep place, e.g. BC017-06777-14871-160C4
3b) UK car license plates. Use a change of alphabet to show people how to group characters, e.g. ABCD0123EFGH4567IJKL...
3c) A really large alphabet: get yourself a list of 2^n words for some decent sized n and encode n bits as a word, e.g. GREEN ENCHANTED LOGICIAN...
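Step 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: the function names are made up for the example, the "sample of real data" is just the first example code from the question, and the raw counts Ni + 1 are used as Huffman weights (dividing them by their sum to get Pi would not change the resulting tree).

```python
import heapq
from collections import Counter


def smoothed_counts(samples, max_symbol):
    """Add-one smoothing: every symbol 0..max_symbol gets weight N_i + 1."""
    seen = Counter(samples)
    return {s: seen[s] + 1 for s in range(max_symbol + 1)}


def huffman_code(weights):
    """Build a prefix code {symbol: bit string} from {symbol: weight}."""
    if len(weights) == 1:
        return {next(iter(weights)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(weights.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + code for s, code in c0.items()}
        merged.update({s: "1" + code for s, code in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(numbers, code):
    return "".join(code[n] for n in numbers)


def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete symbol")
    return out


sample = [int(x) for x in "9-9-0-4-4-5-4-0-2-0-0-0-2-0-0-0-0-0-2-1-2-1-2-2-2-4".split("-")]
code = huffman_code(smoothed_counts(sample, 255))  # 151-255 included as slack
bits = encode(sample, code)
print(len(bits), "bits for", len(sample), "numbers")
```

Because every symbol up to 255 has a non-zero weight, a number that never appeared in the sample still gets a (long) code instead of failing to encode.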
i worried about this problem a while back. it turns out that you can't do much better than base64 - trying to squeeze a few more bits per character isn't really worth the effort (once you get into "strange" numbers of bits encoding and decoding becomes more complex). but at the same time, you end up with something that's likely to have errors when entered (confusing a 0 with an O etc). one option is to choose a modified set of characters and letters (so it's still base 64, but, say, you substitute ">" for "0"). another is to add a checksum. again, for simplicity of implementation, i felt the checksum approach was better. unfortunately i never got any further - things changed direction - so i can't offer code or a particular checksum choice. ps i realised there's a missing step i didn't explain: i was going to compress the text into some binary form before encoding (using some standard compression algorithm). so to summarize: compress, add checksum, base64 encode; base64 decode, check checksum, decompress.
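The summarized pipeline can be sketched with Python's standard library. The answer names neither a compression algorithm nor a checksum, so zlib and CRC-32 here are assumptions chosen only because they are readily available, and the function names are made up for the example. For inputs as short as the example codes, zlib's fixed overhead can outweigh any saving, so treat this as an illustration of the order of the steps rather than of the achievable length.

```python
import base64
import zlib


def to_typed_code(raw):
    """compress -> add checksum -> base64 encode"""
    payload = zlib.compress(raw.encode("ascii"), 9)
    checksum = zlib.crc32(payload).to_bytes(4, "big")  # assumed checksum choice
    return base64.b64encode(payload + checksum).decode("ascii")


def from_typed_code(text):
    """base64 decode -> check checksum -> decompress"""
    blob = base64.b64decode(text, validate=True)
    payload, checksum = blob[:-4], blob[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        raise ValueError("checksum mismatch: the code was probably mistyped")
    return zlib.decompress(payload).decode("ascii")


original = "9-15-0-9-1-6-2-1-2-0-0-1-6-0-7"
typed = to_typed_code(original)
print(typed)
print(from_typed_code(typed) == original)  # True
```

Checking the checksum before decompressing means a mistyped character is reported as a typing error instead of surfacing as an obscure decompression failure.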
This is similar to what I have used in the past. There are certainly better ways of doing this, but I used this method because it was easy to mirror in Transact-SQL which was a requirement at the time. You could certainly modify this to incorporate Huffman encoding if the distribution of your id's is non-random, but it's probably unnecessary. You didn't specify language, so this is in c#, but it should be very easy to transition to any language. In the lookup you'll see commonly confused characters are omitted. This should speed up entry. I also had the requirement to have a fixed length, but it would be easy for you to modify this. static public class CodeGenerator { static Dictionary<int, char> _lookupTable = new Dictionary<int, char>(); static CodeGenerator() { PrepLookupTable(); } private static void PrepLookupTable() { _lookupTable.Add(0,'3'); _lookupTable.Add(1,'2'); _lookupTable.Add(2,'5'); _lookupTable.Add(3,'4'); _lookupTable.Add(4,'7'); _lookupTable.Add(5,'6'); _lookupTable.Add(6,'9'); _lookupTable.Add(7,'8'); _lookupTable.Add(8,'W'); _lookupTable.Add(9,'Q'); _lookupTable.Add(10,'E'); _lookupTable.Add(11,'T'); _lookupTable.Add(12,'R'); _lookupTable.Add(13,'Y'); _lookupTable.Add(14,'U'); _lookupTable.Add(15,'A'); _lookupTable.Add(16,'P'); _lookupTable.Add(17,'D'); _lookupTable.Add(18,'S'); _lookupTable.Add(19,'G'); _lookupTable.Add(20,'F'); _lookupTable.Add(21,'J'); _lookupTable.Add(22,'H'); _lookupTable.Add(23,'K'); _lookupTable.Add(24,'L'); _lookupTable.Add(25,'Z'); _lookupTable.Add(26,'X'); _lookupTable.Add(27,'V'); _lookupTable.Add(28,'C'); _lookupTable.Add(29,'N'); _lookupTable.Add(30,'B'); } public static bool TryPCodeDecrypt(string iPCode, out Int64 oDecryptedInt) { //Prep the result so we can exit without having to fiddle with it if we hit an error. 
oDecryptedInt = 0; if (iPCode.Length > 3) { Char[] Bits = iPCode.ToCharArray(0,iPCode.Length-2); int CheckInt7 = 0; int CheckInt3 = 0; if (!int.TryParse(iPCode[iPCode.Length-1].ToString(),out CheckInt7) || !int.TryParse(iPCode[iPCode.Length-2].ToString(),out CheckInt3)) { //Unsuccessful -- the last check ints are not integers. return false; } //Adjust the CheckInts to the right values. CheckInt3 -= 2; CheckInt7 -= 2; int COffset = iPCode.LastIndexOf('M')+1; Int64 tempResult = 0; int cBPos = 0; while ((cBPos + COffset) < Bits.Length) { //Calculate the current position. int cNum = 0; foreach (int cKey in _lookupTable.Keys) { if (_lookupTable[cKey] == Bits[cBPos + COffset]) { cNum = cKey; } } tempResult += cNum * (Int64)Math.Pow((double)31, (double)(Bits.Length - (cBPos + COffset + 1))); cBPos += 1; } if (tempResult % 7 == CheckInt7 && tempResult % 3 == CheckInt3) { oDecryptedInt = tempResult; return true; } return false; } else { //Unsuccessful -- too short. return false; } } public static string PCodeEncrypt(int iIntToEncrypt, int iMinLength) { int Check7 = (iIntToEncrypt % 7) + 2; int Check3 = (iIntToEncrypt % 3) + 2; StringBuilder result = new StringBuilder(); result.Insert(0, Check7); result.Insert(0, Check3); int workingNum = iIntToEncrypt; while (workingNum > 0) { result.Insert(0, _lookupTable[workingNum % 31]); workingNum /= 31; } if (result.Length < iMinLength) { for (int i = result.Length + 1; i <= iMinLength; i++) { result.Insert(0, 'M'); } } return result.ToString(); } }
How to convert Chinese characters to Pinyin [closed]
For sorting Chinese language text, I want to convert Chinese characters to Pinyin, properly separating each Chinese character and grouping successive characters together. Can you please help me with this task by providing the logic or source code for doing this? Please let me know if any open-source library already exists for this.
Short answer: you don't.
Long answer: There is no one-to-one mapping for 汉字 to 汉语拼音. Just some quick examples: 把 can be "ba" in the third tone or fourth tone. 了 can be "le" toneless or "liao" third tone. 乐 can be "le" or "yue", both in the fourth tone. 落 can be "luo", "la" or "lao", all in the fourth tone. And so on. I have a beginners' book on this topic that has 207 examples. I stress that this is a beginners' book and is by no means complete. Each one has a page or two of examples of use and conditions under which you choose the appropriate pronunciation. It is not something that could be easily programmed (if at all).
And this doesn't even address the other slippery thing you want to deal with: the separation of characters into grouped words. The very notion of a word is a bit slippery in Chinese. There's two terms that correspond, roughly to "word" in Chinese for example: 字 and 词. The first is the character, the second groups of characters that are put together into one concept. (I frequently get asked by Chinese speakers how many "words" I can read when they really mean "characters".) While in some cases the distinction is clear (the 词 "乌鸦", for example, is "crow" -- the two 字 must be together to express the idea properly and it would be incorrect to translate it as "black crow"), in others it is not so clear. What does "你好" translate to? Is it one word meaning, idiomatically, "hello"? Or is it two words translating literally to "you good"? Each of the characters involved stands alone or in groups with other words, but together they mean something entirely different from their individual meanings.
Given this, how, precisely, do you plan to group the 汉语拼音 transliterations (which are difficult to impossible to get right in the first place!) into "words"?
While @JUST MY correct OPINION's answer addresses some of the difficulties of converting characters into pinyin, it is not an impossible problem to solve. I have written a library (pinyinify) that solves this task with decent accuracy. Even though there is not a one-to-one mapping between characters and pinyin, my library can usually decide which pronunciation is correct. For example, "我受不了了" correctly converts to "wǒ shòubùliǎo le", with two different pronunciations of 了. My approach to solving the problem is pretty simple:
1) First segment the text into words. For example, 我喜欢旅游 would be divided into three words: 我 喜欢 旅游. This is also not a simple process, but there are many libraries for it. jieba is one of the more popular libraries for this purpose.
2) Use a dictionary to convert the words into pinyin. If the word is not in the dictionary, fall back to converting the individual characters to pinyin using their most common pronunciation.
CoreFoundation provides a function to do the conversion:
CFMutableStringRef string = CFStringCreateMutableCopy(NULL, 0, CFSTR("中文"));
CFStringTransform(string, NULL, kCFStringTransformMandarinLatin, NO);
CFStringTransform(string, NULL, kCFStringTransformStripDiacritics, NO);
NSLog(@"%@", string);
The output is zhong wen
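The segment-then-look-up approach can be sketched in Python. This is not pinyinify's actual code: the two tiny dictionaries are hypothetical stand-ins for real word and character dictionaries, and greedy longest-match segmentation is swapped in for a proper segmenter such as jieba so that the example has no dependencies.

```python
# Hypothetical toy dictionaries; real ones hold many thousands of entries.
WORD_PINYIN = {"我": "wǒ", "喜欢": "xǐhuan", "旅游": "lǚyóu"}
CHAR_PINYIN = {"我": "wǒ", "喜": "xǐ", "欢": "huān", "旅": "lǚ", "游": "yóu", "了": "le"}


def segment(text, vocabulary):
    """Greedy longest-match segmentation (a simple stand-in for jieba)."""
    max_len = max(len(word) for word in vocabulary)
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; a single character always matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


def to_pinyin(text):
    parts = []
    for word in segment(text, WORD_PINYIN):
        if word in WORD_PINYIN:
            parts.append(WORD_PINYIN[word])
        else:
            # Fallback: per-character lookup using the most common reading;
            # characters with no entry are passed through unchanged.
            parts.append("".join(CHAR_PINYIN.get(ch, ch) for ch in word))
    return " ".join(parts)


print(segment("我喜欢旅游", WORD_PINYIN))
print(to_pinyin("我喜欢旅游"))
```

Looking words up before characters is what lets a context-dependent reading (such as the two readings of 了 mentioned above) be stored once per word instead of being guessed per character.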
the following code writing in C# can help you to simply convert chinese words that including in gb2312 encodec(just 2312 of often used Simplified-Chinese words) to pinyin.like convert "今天天气不错" to "JinTianTianQiBuCuo". sometimes a chinese word is not one to one map to a pinyin,it depends on the context we talk about.like the "行" in "自行车"(bike) is pronounced "Xing",but in "银行"(bank) it pronounced "Hang".so if you have problem with this,you may find more complex solution to handle this. sorry for my poor english.i hope this could give you a little help. public class ChineseToPinYin { private static int[] pyValue = new int[] { -20319,-20317,-20304,-20295,-20292,-20283,-20265,-20257,-20242,-20230,-20051,-20036, -20032,-20026,-20002,-19990,-19986,-19982,-19976,-19805,-19784,-19775,-19774,-19763, -19756,-19751,-19746,-19741,-19739,-19728,-19725,-19715,-19540,-19531,-19525,-19515, -19500,-19484,-19479,-19467,-19289,-19288,-19281,-19275,-19270,-19263,-19261,-19249, -19243,-19242,-19238,-19235,-19227,-19224,-19218,-19212,-19038,-19023,-19018,-19006, -19003,-18996,-18977,-18961,-18952,-18783,-18774,-18773,-18763,-18756,-18741,-18735, -18731,-18722,-18710,-18697,-18696,-18526,-18518,-18501,-18490,-18478,-18463,-18448, -18447,-18446,-18239,-18237,-18231,-18220,-18211,-18201,-18184,-18183, -18181,-18012, -17997,-17988,-17970,-17964,-17961,-17950,-17947,-17931,-17928,-17922,-17759,-17752, -17733,-17730,-17721,-17703,-17701,-17697,-17692,-17683,-17676,-17496,-17487,-17482, -17468,-17454,-17433,-17427,-17417,-17202,-17185,-16983,-16970,-16942,-16915,-16733, -16708,-16706,-16689,-16664,-16657,-16647,-16474,-16470,-16465,-16459,-16452,-16448, -16433,-16429,-16427,-16423,-16419,-16412,-16407,-16403,-16401,-16393,-16220,-16216, -16212,-16205,-16202,-16187,-16180,-16171,-16169,-16158,-16155,-15959,-15958,-15944, -15933,-15920,-15915,-15903,-15889,-15878,-15707,-15701,-15681,-15667,-15661,-15659, -15652,-15640,-15631,-15625,-15454,-15448,-15436,-15435,-15419,-15416,-15408,-15394, 
-15385,-15377,-15375,-15369,-15363,-15362,-15183,-15180,-15165,-15158,-15153,-15150, -15149,-15144,-15143,-15141,-15140,-15139,-15128,-15121,-15119,-15117,-15110,-15109, -14941,-14937,-14933,-14930,-14929,-14928,-14926,-14922,-14921,-14914,-14908,-14902, -14894,-14889,-14882,-14873,-14871,-14857,-14678,-14674,-14670,-14668,-14663,-14654, -14645,-14630,-14594,-14429,-14407,-14399,-14384,-14379,-14368,-14355,-14353,-14345, -14170,-14159,-14151,-14149,-14145,-14140,-14137,-14135,-14125,-14123,-14122,-14112, -14109,-14099,-14097,-14094,-14092,-14090,-14087,-14083,-13917,-13914,-13910,-13907, -13906,-13905,-13896,-13894,-13878,-13870,-13859,-13847,-13831,-13658,-13611,-13601, -13406,-13404,-13400,-13398,-13395,-13391,-13387,-13383,-13367,-13359,-13356,-13343, -13340,-13329,-13326,-13318,-13147,-13138,-13120,-13107,-13096,-13095,-13091,-13076, -13068,-13063,-13060,-12888,-12875,-12871,-12860,-12858,-12852,-12849,-12838,-12831, -12829,-12812,-12802,-12607,-12597,-12594,-12585,-12556,-12359,-12346,-12320,-12300, -12120,-12099,-12089,-12074,-12067,-12058,-12039,-11867,-11861,-11847,-11831,-11798, -11781,-11604,-11589,-11536,-11358,-11340,-11339,-11324,-11303,-11097,-11077,-11067, -11055,-11052,-11045,-11041,-11038,-11024,-11020,-11019,-11018,-11014,-10838,-10832, -10815,-10800,-10790,-10780,-10764,-10587,-10544,-10533,-10519,-10331,-10329,-10328, -10322,-10315,-10309,-10307,-10296,-10281,-10274,-10270,-10262,-10260,-10256,-10254 }; private static string[] pyName = new string[] { "A","Ai","An","Ang","Ao","Ba","Bai","Ban","Bang","Bao","Bei","Ben", "Beng","Bi","Bian","Biao","Bie","Bin","Bing","Bo","Bu","Ba","Cai","Can", "Cang","Cao","Ce","Ceng","Cha","Chai","Chan","Chang","Chao","Che","Chen","Cheng", "Chi","Chong","Chou","Chu","Chuai","Chuan","Chuang","Chui","Chun","Chuo","Ci","Cong", "Cou","Cu","Cuan","Cui","Cun","Cuo","Da","Dai","Dan","Dang","Dao","De", "Deng","Di","Dian","Diao","Die","Ding","Diu","Dong","Dou","Du","Duan","Dui", 
"Dun","Duo","E","En","Er","Fa","Fan","Fang","Fei","Fen","Feng","Fo", "Fou","Fu","Ga","Gai","Gan","Gang","Gao","Ge","Gei","Gen","Geng","Gong", "Gou","Gu","Gua","Guai","Guan","Guang","Gui","Gun","Guo","Ha","Hai","Han", "Hang","Hao","He","Hei","Hen","Heng","Hong","Hou","Hu","Hua","Huai","Huan", "Huang","Hui","Hun","Huo","Ji","Jia","Jian","Jiang","Jiao","Jie","Jin","Jing", "Jiong","Jiu","Ju","Juan","Jue","Jun","Ka","Kai","Kan","Kang","Kao","Ke", "Ken","Keng","Kong","Kou","Ku","Kua","Kuai","Kuan","Kuang","Kui","Kun","Kuo", "La","Lai","Lan","Lang","Lao","Le","Lei","Leng","Li","Lia","Lian","Liang", "Liao","Lie","Lin","Ling","Liu","Long","Lou","Lu","Lv","Luan","Lue","Lun", "Luo","Ma","Mai","Man","Mang","Mao","Me","Mei","Men","Meng","Mi","Mian", "Miao","Mie","Min","Ming","Miu","Mo","Mou","Mu","Na","Nai","Nan","Nang", "Nao","Ne","Nei","Nen","Neng","Ni","Nian","Niang","Niao","Nie","Nin","Ning", "Niu","Nong","Nu","Nv","Nuan","Nue","Nuo","O","Ou","Pa","Pai","Pan", "Pang","Pao","Pei","Pen","Peng","Pi","Pian","Piao","Pie","Pin","Ping","Po", "Pu","Qi","Qia","Qian","Qiang","Qiao","Qie","Qin","Qing","Qiong","Qiu","Qu", "Quan","Que","Qun","Ran","Rang","Rao","Re","Ren","Reng","Ri","Rong","Rou", "Ru","Ruan","Rui","Run","Ruo","Sa","Sai","San","Sang","Sao","Se","Sen", "Seng","Sha","Shai","Shan","Shang","Shao","She","Shen","Sheng","Shi","Shou","Shu", "Shua","Shuai","Shuan","Shuang","Shui","Shun","Shuo","Si","Song","Sou","Su","Suan", "Sui","Sun","Suo","Ta","Tai","Tan","Tang","Tao","Te","Teng","Ti","Tian", "Tiao","Tie","Ting","Tong","Tou","Tu","Tuan","Tui","Tun","Tuo","Wa","Wai", "Wan","Wang","Wei","Wen","Weng","Wo","Wu","Xi","Xia","Xian","Xiang","Xiao", "Xie","Xin","Xing","Xiong","Xiu","Xu","Xuan","Xue","Xun","Ya","Yan","Yang", "Yao","Ye","Yi","Yin","Ying","Yo","Yong","You","Yu","Yuan","Yue","Yun", "Za", "Zai","Zan","Zang","Zao","Ze","Zei","Zen","Zeng","Zha","Zhai","Zhan", "Zhang","Zhao","Zhe","Zhen","Zheng","Zhi","Zhong","Zhou","Zhu","Zhua","Zhuai","Zhuan", 
"Zhuang","Zhui","Zhun","Zhuo","Zi","Zong","Zou","Zu","Zuan","Zui","Zun","Zuo" }; /// <summary> /// 把汉字转换成拼音(全拼) /// </summary> /// <param name="hzString">汉字字符串</param> /// <returns>转换后的拼音(全拼)字符串</returns> public static string Convert(string hzString) { // 匹配中文字符 Regex regex = new Regex("^[\u4e00-\u9fa5]$"); byte[] array = new byte[2]; string pyString = ""; int chrAsc = 0; int i1 = 0; int i2 = 0; char[] noWChar = hzString.ToCharArray(); for (int j = 0; j < noWChar.Length; j++) { // 中文字符 if (regex.IsMatch(noWChar[j].ToString())) { array = System.Text.Encoding.Default.GetBytes(noWChar[j].ToString()); i1 = (short)(array[0]); i2 = (short)(array[1]); chrAsc = i1 * 256 + i2 - 65536; if (chrAsc > 0 && chrAsc < 160) { pyString += noWChar[j]; } else { // 修正部分文字 if (chrAsc == -9254) // 修正“圳”字 pyString += "Zhen"; else { for (int i = (pyValue.Length - 1); i >= 0; i--) { if (pyValue[i] <= chrAsc) { pyString += pyName[i]; break; } } } } } // 非中文字符 else { pyString += noWChar[j].ToString(); } } return pyString; } }
You can use the following method:
from __future__ import unicode_literals
from pypinyin import lazy_pinyin
hanzi_list = ['如何', '将', '汉字', '转为', '拼音']
pinyin_list = [''.join(lazy_pinyin(_)) for _ in hanzi_list]
Output:
['ruhe', 'jiang', 'hanzi', 'zhuanwei', 'pinyin']
I had this problem and I found a solution in PHP (which could be cleaner, I suppose, but it works). I had some trouble because the file given in this topic is keyed by hexadecimal Unicode code points.
1) Import the data from ftp://ftp.cuhk.hk/pub/chinese/ifcss/software/data/Uni2Pinyin.gz (thanks pierr) into your database or whatever
2) Load your data into an array as $pinyinArray[$hexaUnicode] = $pinyin;
3) Use this code:
/*
 * Decimal representation of $c
 * function found there: http://www.cantonese.sheik.co.uk/phorum/read.php?2,19594
 */
function uniord($c) {
    $ud = 0;
    if (ord($c[0]) >= 0 && ord($c[0]) <= 127)
        $ud = ord($c[0]); // single-byte (ASCII) character
    if (ord($c[0]) >= 192 && ord($c[0]) <= 223)
        $ud = (ord($c[0]) - 192) * 64 + (ord($c[1]) - 128);
    if (ord($c[0]) >= 224 && ord($c[0]) <= 239)
        $ud = (ord($c[0]) - 224) * 4096 + (ord($c[1]) - 128) * 64 + (ord($c[2]) - 128);
    if (ord($c[0]) >= 240 && ord($c[0]) <= 247)
        $ud = (ord($c[0]) - 240) * 262144 + (ord($c[1]) - 128) * 4096 + (ord($c[2]) - 128) * 64 + (ord($c[3]) - 128);
    if (ord($c[0]) >= 248 && ord($c[0]) <= 251)
        $ud = (ord($c[0]) - 248) * 16777216 + (ord($c[1]) - 128) * 262144 + (ord($c[2]) - 128) * 4096 + (ord($c[3]) - 128) * 64 + (ord($c[4]) - 128);
    if (ord($c[0]) >= 252 && ord($c[0]) <= 253)
        $ud = (ord($c[0]) - 252) * 1073741824 + (ord($c[1]) - 128) * 16777216 + (ord($c[2]) - 128) * 262144 + (ord($c[3]) - 128) * 4096 + (ord($c[4]) - 128) * 64 + (ord($c[5]) - 128);
    if (ord($c[0]) >= 254 && ord($c[0]) <= 255) // error
        $ud = false;
    return $ud;
}
/*
 * Translate $string, a single Chinese character, to its hexadecimal Unicode code point
 */
function chineseToHexaUnicode($string) {
    return strtoupper(dechex(uniord($string)));
}
/*
 * Convert each character of $string to pinyin using the lookup array
 */
function convertChineseToPinyin($string, $pinyinArray) {
    $pinyinValue = '';
    for ($i = 0; $i < mb_strlen($string); $i++)
        $pinyinValue .= $pinyinArray[chineseToHexaUnicode(mb_substr($string, $i, 1))];
    return $pinyinValue;
}
$string = '龙江省五大';
echo convertChineseToPinyin($string, $pinyinArray);
Output: (long2)(jiang1)(sheng3,xing3)(wu3)(da4,dai4)
Of course, $pinyinArray is your array of data (hexaUnicode => pinyin). Hope it will help someone.
If you use Visual Studio, this might be an option: Microsoft.International.Converters.PinYinConverter
How to install: First, download the Visual Studio International Pack 2.0 (Official Download). Once the download is complete, run the installer VSIPSetup.msi. On an x86 operating system the default installation directory is C:\Program Files\Microsoft Visual Studio International Feature Pack 2.0. After installation, you need to add references in VS to: C:\Program Files\Microsoft Visual Studio International Pack\Simplified Chinese Pin-Yin Conversion Library (Pinyin) and C:\Program Files\Microsoft Visual Studio International Pack\Traditional Chinese to Simplified Chinese Conversion Library and Add-In Tool (Traditional/Simplified conversion)
How to use:
public static string GetPinyin(string str)
{
    string r = string.Empty;
    foreach (char obj in str)
    {
        try
        {
            ChineseChar chineseChar = new ChineseChar(obj);
            // Take the first listed reading and drop its last character
            string t = chineseChar.Pinyins[0].ToString();
            r += t.Substring(0, t.Length - 1);
        }
        catch
        {
            // Conversion failed: keep the character unchanged
            r += obj.ToString();
        }
    }
    return r;
}
Source: http://www.programering.com/a/MzM3cTMwATA.html