Challenge on ASCII Codes - ascii

My friend shared a challenge puzzle with me but i dont have any idea on this? I believe that its related to ascii characters

Related

How can I use stanford-openIE for Chinese text

I am working on on Stanford-openIE but I do not know whether it supports Chinese text or not. If it supports Chinese language, How can I use stanford-openIE for Chinese text?
Any guidance will be appreciated.
Stanford's OpenIE system was developed for English. It's based off of universal dependencies, meaning that in theory it shouldn't be too hard to adapt to other languages; but, nonetheless, it's highly unlikely that it would work out of the box.
At minimum, the relation triple segmenter would have to be adapted for Chinese. For some of the more subtle functionality, the code to mark natural logic polarity and the code to score prepositional phrase deletions would have to be rewritten.

Checking for typographical error / comparing strings

I'm currently making a language learning app and I want to check where the user made a typo and notify him where he made the typo. For example, the "correct answer" is TODAY IS GREAT but they only typed TDAY IS GREAT or TDAYIS GREAT, in which case the only mistake is in the O of TODAY and I need to make that O bold to emphasize (I know how to make it bold I just need to know the location). Can anyone suggest a good algorithm for this? Thanks!
EDIT: Finally used it, and while it works great at pointing out the number of mistakes, I'm having trouble finding the index of where the mistake is. For example, the user answer is "TEAM" but the correct answer is "THEME". In that case, I should mark the H and the second E. I tried to put the start and end index but it gets even more confusing which letter to mark if for example the string is extremely long and there are even more duplicate consecutive letters in the substring (especially in the korean language). Could anyone give me some ideas on how to do this? Thanks!
Have you tried the "Longest Common Subsequence" algorithm?
http://en.wikipedia.org/wiki/Longest_common_subsequence_problem

How can I find out the language from a character?

Given a Unicode character, we want to find out what languages include this character, and more importantly, understand whether or not each language is Left-To-Right.
For example, the character A might be both English and Spanish which are both LTR languages.
I want this for my own text editor.
Can anyone help me in finding an API function or something that solves my problem?
Thanks in advance
Unicode-wise, LTR/RTL is a property of characters, not of the languages that use that character. This matters because embedded English in an Arabic text should be displayed left-to-right, even if for simplicity the document as a whole may be marked as Arabic. If you're using JCL, these properties can be obtained using the UnicodeIsLeftToRight and UnicodeIsRightToLeft functions. Note that characters may be neither left-to-right nor right-to-left, and also note that JCL uses a private copy of the Unicode character list that may be a subtly different version from what any specific version of Windows uses.
Regarding the question in the title, you would need to carry out an extensive study of the use of characters in the languages of the world. There are a few thousands of languages, though many of them have no regular writing system; on the other hand, some languages have several writing systems. Different variants of a language may have different repertoires of characters.
So it would be a major effort, though some data has been compiled e.g. in the CLDR repertoire – but the concept “characters used in a language” is far from clear. (Are the characters æ, è, and ö used in English? They sure appear in some forms of written English.)
So it would be unrealistic to expect to find a library routine for such purposes.
Apparently your real need was for deciding whether a character is a left-to-right character or a right-to-left character. But for completeness, I have provided an answer to what you actually asked and that might be relevant in some other contexts.

Virus Signatures and Genetic Algorithms

I would like to know how one achieves the following signature. I have read online that (al least in the past) researchers will take the "suspected" file the binary code, convert it to assembly, examine it, pick sections of code that appear to be unusual, and identifying the corresponding bytes in the machine code.
But then how is the bellow virus string signature achieved?
MIRC.Julie=6463632073656e6420246e69636b20433a5c57696e646f77735c4a756c696531362c4a50472e636f6d0a0d6e31333d207d0a0d6e31343d200a0d6e31353d206374637020313a70696e673a2f6463632073656e6420246e69636b20433a5c57696e646f77735c4a756c696531362c4a50
Also, (although this might sound completely crazy) that string above must mean something, i can only guess a sequence of actions, actual code, etc. So if it was once "translated" in this form (virus signature) from assembly, is it possible to convert it back?
Just in case you might wonder why am asking what even I think is a weird question. This is why... I am preparing my BSc final year computer science project, and at this point I am wondering whether it would be possible to maybe generate/estimate/evaluate/predict virus signatures by using GA's (Genetic Algorithms). Maybe that will help make my question a bit easier to understand, I hope.
Thanks!
You cannot revert it back because normally virus signatures are encrypted. The way they are obtained is by extracting the binary mallicious code from an executable and then converting it to hexadecimal representation.
Hopefully helps
The virus signature shown is probably dependent on the scanner that generated it. I find it extremely easy to believe that all virus scanners create their signatures in different ways. Without a source, there's no way to explain how it was developed, and even with a source I doubt this is something that AV companies will reveal, since it allows virus developers the opportunity to avoid detection.
"Generate/estimate/evaluate/predict" are four different problems and not all of them are best done with a GA. You need to select your problem before selecting an algorithm.

Word wrap algorithms for Japanese

In a recent web application I built, I was pleasantly surprised when one of our users decided to use it to create something entirely in Japanese. However, the text was wrapped strangely and awkwardly. Apparently browsers don't cope with wrapping Japanese text very well, probably because it contains few spaces, as each character forms a whole word. However, that's not really a safe assumption to make as some words are constructed of several characters, and it is not safe to break some character groups into different lines.
Googling around hasn't really helped me understand the problem any better. It seems to me like one would need a dictionary of unbreakable patterns, and assume that everywhere else is safe to break. But I fear I don't know enough about Japanese to really know all the words, which I understand from some of my searching, are quite complicated.
How would you approach this problem? Are there any libraries or algorithms you are aware of that already exist that deal with this in a satisfactory way?
Japanese word wrap rules are called kinsoku shori and are surprisingly simple. They're actually mostly concerned with punctuation characters and do not try to keep words unbroken at all.
I just checked with a Japanese novel and indeed, both words in the syllabic kana script and those consisting of multiple Chinese ideograms are wrapped mid-word with impunity.
Below listed projects are useful to resolve Japanese wordwrap (or wordbreak from another point of view).
budou (Python): https://github.com/google/budou
mikan (JS): https://github.com/trkbt10/mikan.js
mikan.sharp (C#): https://github.com/YoungjaeKim/mikan.sharp
mikan has regex-based approach while budou uses natural language processing.

Resources