I am working on on Stanford-openIE but I do not know whether it supports Chinese text or not. If it supports Chinese language, How can I use stanford-openIE for Chinese text?
Any guidance will be appreciated.
Stanford's OpenIE system was developed for English. It's based off of universal dependencies, meaning that in theory it shouldn't be too hard to adapt to other languages; but, nonetheless, it's highly unlikely that it would work out of the box.
At minimum, the relation triple segmenter would have to be adapted for Chinese. For some of the more subtle functionality, the code to mark natural logic polarity and the code to score prepositional phrase deletions would have to be rewritten.
I am trying to develop a Sinhala (My native language) to English translator. Still I am thinking for an approach.
If I however parse a sentence of my language, then can I use that for generating english sentence with the help of stanford parser or any other parser. Or is there any other method you can recommend.
And I am thinking of a bottom up parser for my language, but still have no idea how to implement. Any suggestions for steps I can follow.
Thanks Mathee
This course on Coursera may help you implement a translator. From what I know based on that course, you can use a training set tagged by parts of speech (i.e. noun, verb, etc.) and use that training test to parse other sentence. I suggest looking into hidden Markov models.
My Pyramids parser is an unconventional single-sentence parser for English. (It is capable of parsing other languages too, but a grammar must be specified.) The parser can not only parse English into parse trees, but can convert back and forth between parse trees and word-level semantic graphs, which are graphs that describe the semantic relationships between all the words in a sentence. The correct word order is reconstructed based on the contents of the graph; all that needs to be provided, other than the words and their relationships, is the type of sentence (statement, question, command) and the linguistic category of each word (noun, determiner, verb, etc.). From there it is straight-forward to join the tokens of the parse tree into a sentence.
The parser is a (very early) alpha pre-release, but it is functional and actively maintained. I am currently using it to translate back and forth between English and an internal semantic representation used by a conversational agent (a "chat bot", but capable of more in-depth language understanding). If you decide to use the parser, do let me know. I will be happy to provide any assistance you might need with installing, using, or improving it.
I suppose maybe it's because I don't know the keywords to google for, but I can't find any sources on how to read those formulas you see on wikipedia, like this for instance:
Erlang Distribution
I've searched in the math world and computer science world. It feels like it is assumed that we're supposed to understand it out of thin air. Beginner lessons seem scarce.
So far I know how sigma works. And that upside-down shape that is used as the half-life logo is called lambda. But what the heck is it trying to say?? Why is there a semi-colon in the function, etc..
If there is a book on this stuff I'd buy it in an instant. It is probably very basic stuff but I never had experience in theoretical math or even know where to look.
Does anyone know what this subject is called, and what to google for?
Formulas with this symbols usually are statistics or probability notations.
Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
Greek letters used in mathematics, science, and engineering
you can find info here
Notation in probability and statistics
here
I think the colon in alt.: \scriptstyle \theta \;=\; \frac{1}{\lambda} > 0\, scale (real) in the box in Wikipedia is just saying that there is an alternative definition, in which you specify theta rather than lambda, and in that definition what is called theta is the reciprocal of lambda in the other definition.
I once complained to a much better mathematician than I was that I came unstuck with formulas with some of the weirder greek letters in because I couldn't write them recognisably in my handwriting (which is bad enough for the latin alphabet). He said a lot of the people he knew simply said "let x be funny-squiggle-thing" and rewrote with sensible letters. I really wish I'd thought of that.
In general, letters in weird alphabets behave pretty much like sensible letters, at least in the sort of thing you are pointing at. It's done as a sort of type-checking - usually all of the letters pinched from some particular foreign language are related in some way - e.g. all parameters. Unfortunately that doesn't hold exactly in the Wikipedia example you quote, where two of the greek letters stand for functions - one is definitely the Gamma function. I suspect the other is http://en.wikipedia.org/wiki/Digamma_function, but I'm not really sure.
Check out the resources list here: http://en.wikipedia.org/wiki/Greek_alphabet
I would say your best bet is still searching in Google (or other search engine, whatever float your boat) about the specific formula you are trying to learn. Sometimes a symbol may be used in different meaning in different formula.
Anyway, there is a good resource in here that explained a lot of math symbols, not just the Greek symbol.
Some link that may interest you here and here.
First, find a Greek alphabet (upper and lower case) to refer to, so that you can at least call lambda by it's name. No one starts out knowing automatically what the various Greek characters are, not even Greeks.
Second, Read the actual article, usually either the character is defined (as lambda happens to be in the Wikipedia page you references) or it's standard nomenclature (in which case you've done the right thing by looking for a basic article on the function in question-- I do this all the time so don't feel bad.) Or, as a third option, it's a crappy paper. Happens sometimes. It's kind of a pain, though, since you can't just do a text search on the lambda character in a PDF.
(Someone educate me on that if I'm wrong....)
Third, try to pick out which unfamiliar symbols are variables (like lambda) and which are operators (like sigma, and it's helpers.) It's the operators that can sometimes cause real trouble. A variable is just a name for something, but operators come freighted with more meaning, more rules, and more syntax. It's not always obvious which symbols are operators, either.
Finally, and specifically for computer science, a good introductory book (college freshman/sophomore level) on discrete math will hopefully treat most of the basic notations and operators to at least get your feet on the ground. Nowadays, you kids and your newfangled internet might be able to get something similar from Udacity, Edx, Course RA, or the Khan Academy.
Basically, it's a lot of hard work, especially on your own, but you're already doing most of the right things.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
What's so difficult about the subject that algorithm designers are having a hard time tackling it?
Is it really that complex?
I'm having a hard time grasping why this topic is so problematic. Can anyone give me an example as to why this is the case?
Auditory processing is a very complex task. Human evolution has produced a system so good that we don't realize how good it is. If three persons are talking to you at the same time you will be able to focus in one signal and discard the others, even if they are louder. Noise is very well discarded too. In fact, if you hear human voice played backwards, the first stages of the auditory system will send this signal to a different processing area than if it is real speech signal, because the system will regard it as "no-voice". This is an example of the outstanding abilities humans have.
Speech recognition advanced quickly from the 70s because researchers were studying the production of voice. This is a simpler system: vocal chords excited or not, resonation of vocal tractus... it is a mechanical system easy to understand. The main product of this approach is the cepstral analysis. This led automatic speech recognition (ASR) to achieve acceptable results. But this is a sub-optimal approach. Noise separation is quite bad, even when it works more or less in clean environments, it is not going to work with loud music in the background, not as humans will.
The optimal approach depends on the understanding of the auditory system. Its first stages in the cochlea, the inferior colliculus... but also the brain is involved. And we don't know so much about this. It is being a difficult change of paradigm.
Professor Hynek Hermansky compared in a paper the current state of the research with when humans wanted to fly. We didn't know what was the secret —The feathers? wings flapping?— until we discovered Bernoulli's force.
Because if people find it hard to understand other people with a strong accent why do you think computers will be any better at it?
I remember reading that Microsoft had a team working on speech recognition, and they called themselves the "Wreck a Nice Beach" team (a name given to them by their own software).
To actually turn speech into words, it's not as simple as mapping discrete sounds, there has to be an understanding of the context as well. The software would need to have a lifetime of human experience encoded in it.
This kind of problem is more general than only speech recognition.
It exists also in vision processing, natural language processing, artificial intelligence, ...
Speech recognition is affected by the semantic gap problem :
The semantic gap characterizes the
difference between two descriptions of
an object by different linguistic
representations, for instance
languages or symbols. In computer
science, the concept is relevant
whenever ordinary human activities,
observations, and tasks are
transferred into a computational
representation
Between an audio wave form and a textual word, the gap is big,
Between the word and its meaning, it is even bigger...
beecos iyfe peepl find it hard to arnerstand uvver peepl wif e strang acsent wie doo yoo fink compootrs wyll bee ani bettre ayt it?
I bet that took you half a second to work out what the hell I was typing and all Iw as doing was repeating Simons answer in a different 'accent'. The processing power just isn't there yet but it's getting there.
The variety in language would be the predominant factor, making it difficult. Dialects and accents would make this more complicated. Also, context. The book was read. The book was red. How do you determine the difference. The extra effort needed for this would make it easier to just type the thing in the first place.
Now, there would probably be more effort devoted to this if it was more necessary, but advances in other forms of data input have come along so quickly that it is not deemed that necessary.
Of course, there are areas where it would be great, even extremely useful or helpful. Situations where you have your hands full or can't look at a screen for input. Helping the disabled etc. But most of these are niche markets which have their own solutions. Maybe some of these are working more towards this, but most environments where computers are used are not good candidates for speech recognition. I prefer my working environment to be quiet. And endless chatter to computers would make crosstalk a realistic problem.
On top of this, unless you are dictating prose to the computer, any other type of input is easier and quicker using keyboard, mouse or touch. I did once try coding using voice input. The whole thing was painful from beginning to end.
Because Lernout&Hauspie went bust :)
(sorry, as a Belgian I couldn't resist)
The basic problem is that human language is ambiguous. Therefore, in order to understand speech, the computer (or human) needs to understand the context of what is being spoken. That context is actually the physical world the speaker and listener inhabit. And no AI program has yet demonstrated having adeep understanding of the physical world.
Speech synthesis is very complex by itself - many parameters are combined to form the resulting speech. Breaking it apart is hard even for people - sometimes you mishear one word for another.
Most of the time we human understand based on context. So that a perticular sentence is in harmony with the whole conversation unfortunately computer have a big handicap in this sense. It is just tries to capture the word not whats between it.
we would understand a foreigner whose english accent is very poor may be guess what is he trying to say instead of what is he actually saying.
To recognize speech well, you need to know what people mean - and computers aren't there yet at all.
You said it yourself, algorithm designers are working on it... but language and speech are not an algorithmic constructs. They are the peak of the development of the highly complex human system involving concepts, meta-concepts, syntax, exceptions, grammar, tonality, emotions, neuronal as well as hormon activity, etc. etc.
Language needs a highly heuristic approach and that's why progress is slow and prospects maybe not too optimistic.
I once asked a similar question to my instructor; i asked him something like what challenge is there in making a speech-to-text converter. Among the answers he gave, he asked me to pronounce 'p' and 'b'. Then he said that they differ for a very small time in the beginning, and then they sound similar. My point is that it is even hard to recognize what sound is made, recognizing voice would be even harder. Also, note that once you record people's voices, it is just numbers that you store. Imagine trying to find metrics like accent, frequency, and other parameters useful for identifying voice from nothing but input such as matrices of numbers. Computers are good at numerical processing etc, but voice is not really 'numbers'. You need to encode voice in numbers and then do all computation on them.
I would expect some advances from Google in the future because of their voice data collection through 1-800-GOOG411
It's not my field, but I do believe it is advancing, just slowly.
And I believe Simon's answer is somewhat correct in a way: part of the problem is that no two people speak alike in terms of the patterns that a computer is programmed to recognize. Thus, it is difficult to analysis speech.
Computers are not even very good at natural language processing to start with. They are great at matching but when it comes to inferring, it gets hairy.
Then, with trying to figure out the same word from hundreds of different accents/inflections and it suddenly doesn't seem so simple.
Well I have got Google Voice Search on my G1 and it works amazingly well. The answer is, the field is advancing, but you just haven't noticed!
If speech recognition was possible with substantially less MIPS than the human brain, we really could talk to the animals.
Evolution wouldn't spend all those calories on grey matter if they weren't required to do the job.
Spoken language is context sensitive, ambiguous. Computers don't deal well with ambiguous commands.
I don't agree with the assumption in the question - I have recently been introduced to Microsoft's speech recognition and am impressed. It can learn my voice after a few minutes and usually identifies common words correctly. It also allows new words to be added. It is certainly usable for my purposes (understanding chemistry).
Differentiate between recognising the (word) tokens and understanding the meaning of them.
I don't yet know about other languages or operating systems.
The problem is that there are two types of speech recognition engines. Speaker-trained ones such as Dragon are good for dictation. They can recognize almost any spoke text with fairly good accuracy, but require (a) training by the user, and (b) a good microphone.
Speaker-independent speech rec engines are most often used in telephony. They require no "training" by the user, but must know ahead of time exactly what words are expected. The application development effort to create these grammars (and deal with errors) is huge. Telephony is limited to a 4Khz bandwidth due to historical limits in our public phone network. This limited audio quality greatly hampers the speech rec engines' ability to "hear" what people are saying. Digits such as "six" or "seven" contain an ssss sound that is particularly hard for the engines to distinguish. This means that recognizing strings of digits, one of the most basic recognition tasks, is problematic. Add in regional accents, where "nine" is pronounced "nan" in some places, and accuracy really suffers.
The best hope are interfaces that combine graphics and speech rec. Think of an IPhone application that you can control with your voice.