I have read the H.262 document and a draft version of ISO/IEC 13818-2, and I cannot figure out the difference between the two. Can anyone point it out?
Thanks in advance
ISO/IEC 13818-2 is identical to ITU-T Rec. H.262. The first two parts of MPEG-2 were developed in collaboration with the ITU-T. In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC. The text of ITU-T Recommendation H.262 was approved on 10 July 1995, and the identical text is also published as ISO/IEC International Standard 13818-2.
I am working with Stanford OpenIE, but I do not know whether it supports Chinese text or not. If it supports Chinese, how can I use Stanford OpenIE for Chinese text?
Any guidance will be appreciated.
Stanford's OpenIE system was developed for English. It's based on Universal Dependencies, meaning that in theory it shouldn't be too hard to adapt to other languages; but nonetheless, it's highly unlikely that it would work out of the box.
At minimum, the relation triple segmenter would have to be adapted for Chinese. For some of the more subtle functionality, the code to mark natural logic polarity and the code to score prepositional phrase deletions would have to be rewritten.
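For reference, here is a minimal sketch of calling the (English-only) OpenIE annotator through a locally running CoreNLP server. It assumes you have already started the server on port 9000; the sentence is just a placeholder.

import json
import requests

# Assumes a CoreNLP server is already running locally, e.g.:
#   java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000
props = {"annotators": "tokenize,ssplit,pos,lemma,depparse,natlog,openie",
         "outputFormat": "json"}
resp = requests.post("http://localhost:9000/",
                     params={"properties": json.dumps(props)},
                     data="Barack Obama was born in Hawaii.".encode("utf-8"))

# Each sentence in the JSON response carries an "openie" list of triples.
for sentence in resp.json()["sentences"]:
    for triple in sentence["openie"]:
        print(triple["subject"], "|", triple["relation"], "|", triple["object"])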
This question was asked in GATE 2015 CSE. I think design specs should not be discussed in an SRS.
Ans: (C). Design specifications should not be discussed in the SRS; instead, they are reserved for the SDD (Software Design Description).
An SRS does discuss (A), (B), and (D).
An SRS document does not contain design specifications. It mainly contains the following:
Functional and non-functional requirements
Scope
Target audience
Purpose
UML diagrams based on the requirements
The first part of this question has been split off into its own question, here: Analyzing Text for Accents
Question: How could accents be added to generated speech?
What I've come up with:
I do not mean just accent marks, or inflection, or anything singular like that. I mean something like a full British accent, or a Scottish accent, or Russian, etc.
I would think that this could be done outside of the language as well. Ex: something in Russian could be generated with a British accent, or something in Mandarin could have a Russian accent.
I think the basic process would be this:
1. Analyze the text.
2. Compare it with a database (or something like that) to determine what needs an accent, how strong it should be, etc.
3. Generate the speech in the specified language. (Easy with normal text-to-speech processors.)
4. Determine the specified accent based on the analyzed text. (This is the part in question. I think an array of amplitudes and filters would work best for the next step.)
5. Mesh speech and accent. (This would be the easy part; it could probably be done by multiplying the speech by the accent, like many other DSP methods do.)
This is really more of a general DSP question, but I'd like to come up with a programmatic algorithm for this rather than just a general idea.
This question isn't really "programming" per se: it's linguistics. The programming is comparatively easy. The analysis is going to be really difficult, and in truth you're probably better off getting the user to specify the accent. Or are you going for an automated story reader?
However, a basic accent is doable with modern text-to-speech. Are you aware of the International Phonetic Alphabet? http://en.wikipedia.org/wiki/International_Phonetic_Alphabet
It basically lists all the sounds a human voice can possibly make. An accent is then just a mapping (a function) from the alphabet to itself. For instance, to make an American accent sound British to an American person (though not sufficient to make it sound British to a British person), you can de-rhotacise all the "r" sounds in the middle of a word. So, for instance, the alveolar trill would be replaced with the voiced uvular fricative. (There are lots of corner cases to work out just for this.)
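As a toy illustration of "an accent is a mapping from the alphabet to itself" (the substitution table below is invented for demonstration and nowhere near a complete accent):

# Toy accent map: substitute individual phones; everything else passes through.
ACCENT_MAP = {
    "ɹ": "ʁ",  # swap one rhotic realization for another
    "æ": "a",
}

def apply_accent(phones, mapping=ACCENT_MAP):
    # Map each phone through the accent's substitution table.
    return [mapping.get(p, p) for p in phones]

# "red" as a phone sequence:
print(apply_accent(["ɹ", "ɛ", "d"]))  # ['ʁ', 'ɛ', 'd']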
Long and short: it's not easy, which is probably why no one has done it. I'm sure a couple of linguistics professors out there would say it's impossible, but that's what linguistics professors do. You'll basically need to read several thick textbooks on accents and pronunciation to make any headway with this problem. Good luck!
What is an accent?
An accent is not a sound filter; it's a pattern of acoustic realization of text in a language. You can't take a recording of American English, run it through an "array of amplitudes and filters", and have British English pop out. What DSP is useful for is implementing prosody, not accent.
Basically (and simplest to model), an accent consists of rules for phonetic realization of a sequence of phonemes. Perception of accent is further influenced by prosody and by which phonemes a speaker chooses when reading text.
Speech generation
The process of speech generation has two basic steps:
Text-to-phonemes: Convert written text to a sequence of phonemes (plus suprasegmentals like stress, and prosodic information like utterance boundaries). This is somewhat accent-dependent (e.g. the output for "laboratory" differs between American and British speakers).
Phoneme-to-speech: given the sequence of phonemes, generate audio according to the dialect's rules for the phonetic realization of phonemes. (Typically you concatenate diphones and then acoustically adjust the prosody.) This step is highly accent-dependent, and it is the one that imparts the main quality of the accent: a particular phoneme, even if shared between two accents, may have strikingly different acoustic realizations.
Normally these are paired. While you could have a British-accented speech generator that uses American pronunciations, that would sound odd.
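Here is a toy sketch of that two-stage split. The tiny lexicon and the stub synthesizer are invented for demonstration; they are not any engine's real data or API.

# Stage 1: text -> phonemes, accent-dependent (note the two entries differ).
LEXICON = {
    ("laboratory", "en-US"): ["l", "æ", "b", "ɹ", "ə", "t", "ɔ", "ɹ", "i"],
    ("laboratory", "en-GB"): ["l", "ə", "b", "ɒ", "ɹ", "ə", "t", "ɹ", "i"],
}

def text_to_phonemes(word, accent):
    return LEXICON[(word, accent)]

# Stage 2: phonemes -> audio. A real engine would concatenate recorded
# diphones or run formant synthesis here; stubbed as a string for brevity.
def phonemes_to_audio(phonemes, accent):
    return "<audio %s: %s>" % (accent, " ".join(phonemes))

# The two stages are normally paired on the SAME accent:
print(phonemes_to_audio(text_to_phonemes("laboratory", "en-GB"), "en-GB"))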
Generating speech with a given accent
Writing a text-to-speech program is an enormous amount of work (in particular, to implement one common scheme, you have to record a native speaker speaking each possible diphone in the language), so you'd be better off using an existing one.
In short, if you want a British accent, use a British English text-to-phoneme engine together with a British English phoneme-to-speech engine.
For common accents like American and British English, Standard Mandarin, Metropolitan French, etc., there will be several choices, including open-source ones that you will be able to modify (as below). For example, look at FreeTTS and eSpeak. For less common accents, existing engines unfortunately may not exist.
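As a quick illustration with eSpeak, you can inspect stage 1 and run stage 2 from the command line. This is a sketch: voice names such as en-us and en-rp vary between eSpeak versions, so treat them as assumptions.

import subprocess

def phonemes(text, voice):
    # -q: don't play audio; -x: print phoneme mnemonics to stdout instead
    out = subprocess.run(["espeak", "-q", "-x", "-v", voice, text],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

# Stage 1 output differs by accent for the same text:
print(phonemes("laboratory", "en-us"))  # American voice
print(phonemes("laboratory", "en-rp"))  # British (RP) voice

# Stage 2: synthesize a wav file with the matching accent:
subprocess.run(["espeak", "-v", "en-rp", "-w", "laboratory.wav", "laboratory"],
               check=True)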
Speaking text with a foreign accent
English-with-a-foreign-accent is socially not very prestigious, so complete systems probably don't exist.
One strategy would be to combine an off-the-shelf text-to-phoneme engine for a native accent with a phoneme-to-speech engine for the foreign language. For example, a native Russian speaker who learned English in the U.S. would plausibly use American pronunciations of words like laboratory, and map its phonemes onto his native Russian phonemes, pronouncing them as in Russian. (I believe there is a website that does this for English and Japanese, but I don't have the link.)
The problem is that the result is too extreme. A real English learner would attempt to recognize and generate phonemes that do not exist in his native language, and would also alter his realization of his native phonemes to approximate the native pronunciation. How closely the result matches a native speaker of course varies, but using the pure foreign extreme sounds ridiculous (and mostly incomprehensible).
So to generate plausible American-English-with-a-Russian-accent (for instance), you'd have to write a text-to-phoneme engine; you could use existing American English and Russian text-to-phoneme engines as a starting point. For the phoneme-to-speech side, the ideal would be to record a speaker who actually has the accent. If you're not willing to find and record such a speaker, you could probably still get a decent approximation using DSP to combine the samples from those two engines. eSpeak, for instance, uses formant synthesis rather than recorded samples, so it might be easier to combine information from multiple languages there.
Another thing to consider is that foreign speakers often modify the sequence of phonemes under the influence of their native language's phonotactics, typically by simplifying consonant clusters, inserting epenthetic vowels, or diphthongizing or breaking vowel sequences.
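As a toy illustration of one such adjustment, epenthesis could be sketched like this. The vowel inventory and the cluster limit below are invented, not a real phonotactic model.

def epenthesize(phonemes, vowel="ɪ", max_cluster=2):
    # Insert an epenthetic vowel so no consonant run exceeds max_cluster.
    vowels = set("aeiouɪʊəɛɔæɒ")
    out, run = [], 0
    for p in phonemes:
        if p in vowels:
            run = 0
        else:
            run += 1
            if run > max_cluster:
                out.append(vowel)
                run = 1
        out.append(p)
    return out

# A 'strength'-like cluster /strɛŋkθ/ becomes learner-style /stɪrɛŋkɪθ/:
print("".join(epenthesize(list("strɛŋkθ"))))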
There is some literature on this topic.
I need a regex to separate references from a mountain of PsycINFO lit searches that look like this:
http://rubular.com/r/bKMoDpAJvY
(I can't post the text - something about this edit control bungs it up horribly)
I just want matches consisting of all the text between the numbering, but it is doing my head in. An explanation would also be fabulous, so I can learn.
Does teststring.split(/^\d+\./) work for you?
With String#split you get an array out of your string; the string is split at each match of the regex, in this case a number at the beginning of a line, followed by a dot, optionally some spaces, and the end of the line.
My test code:
teststring = DATA.read
# Split on lines consisting only of a number, a dot and optional trailing spaces.
teststring.split(/^\d+\.\s*$/).each{|m|
  puts "==========="
  puts m
}
__END__
1.
Reframing the rocky road: From causal analysis to mindreading as the drama of disposition inference. [References].
Ames, Daniel R.
Psychological Inquiry. Vol.20(1), Jan 2009, pp. 19-23.
AN: Peer Reviewed Journal: 2009-04633-002.
Comments on an article by Glenn D. Reeder (see record 2009-04633-001). My misgivings with Reeder's account are relatively minor. For one, I am not sure that the "multiple inference model" label quite captures the essential part of Reeder's argument. Although it suggests the plurality of judgments that perceivers often make, it does not seem to reflect Reeder's central point that, for intentional behaviors, perceivers typically make motive inferences and these guide trait inferences. Another stumbling point for me was the identification of five categories that accounted for "the majority of studies" on dispositional inference (attitude attribution, moral attribution, ability attribution, the silent interview paradigm, and the quiz-role paradigm). These are noteworthy paradigms, to be sure, but they hardly seem to exhaust the research on dispositional inference, which I take as a perceiver's ascription of an enduring trait to a target. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
Publication Date
Jan 2009
Year of Publication
2009
E-Mail Address
Ames, Daniel R.: da358#columbia.edu
Other Publishers
Lawrence Erlbaum; US
Link to the Ovid Full Text or citation:
http://ovidsp.ovid.com/ovidweb.cgi?T=JS&CSC=Y&NEWS=N&PAGE=fulltext&D=psyc6&AN=2009-04633-002
Link to the External Link Resolver:
http://diglib1.bham.ac.uk:3210/sfxlcl3?sid=OVID:psycdb&id=pmid:&id=doi:10.1080%2F10478400902744253&issn=1047-840X&isbn=&volume=20&issue=1&spage=19&pages=19-23&date=2009&title=Psychological+Inquiry&atitle=Reframing+the+rocky+road%3A+From+causal+analysis+to+mindreading+as+the+drama+of+disposition+inference.&aulast=Ames&pid=%3Cauthor%3EAmes%2C+Daniel+R%3C%2Fauthor%3E%3CAN%3E2009-04633-002%3C%2FAN%3E%3CDT%3EComment%2FReply%3C%2FDT%3E
2.
Everyday Solutions to the Problem of Other Minds: Which Tools Are Used When? [References].
Ames, Daniel R.
Malle, Bertram F [Ed]; Hodges, Sara D [Ed]. (2005). Other minds: How humans bridge the divide between self and others. (pp. 158-173). xiii, 354 pp. New York, NY, US: Guilford Press; US.
AN: Book: 2005-09375-010.
(from the chapter) Intuiting what the people around us think, want, and feel is essential to much of social life. Some scholars have gone so far as to declare the "problem of other minds"--whether a person can know if anyone else has thoughts and, if so, what they are--intractable. And yet countless times a day, we solve such problems with ease, if not perfectly then at least to our own satisfaction. What strategies underlie these everyday solutions? And how are these tools employed? This chapter offers 4 contingencies about when various inferential tools might be used. First, that affect qualifies behavior in the near term: perceived remorseful affect can lead to ascriptions of good intent to harm-doers in the short run, but repeated harm drives long-run ascriptions of bad intent. Second, that perceived similarity governs projection and stereotyping: perceptions of general similarity to a target typically draw a mindreader toward projection and away from stereotyping; perceived dissimilarity does the opposite. Third, that cumulative behavioral evidence supersedes extratarget strategies: projection and stereotyping will drive mindreading when behavioral evidence is ambiguous, but as apparent evidence accumulates, inductive judgments will dominate. Fourth, that negative social intention information weighs heavily in mindreading: within a mindreading strategy, cues signaling negative social intentions may dominate neutral or positive cues; between mindreading strategies, those strategies that signal negative social intentions may dominate. These contingencies have varying degrees of empirical support and would benefit from additional research and thinking. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
Publication Date
2005
Year of Publication
2005
Link to the Ovid Full Text or citation:
http://ovidsp.ovid.com/ovidweb.cgi?T=JS&CSC=Y&NEWS=N&PAGE=fulltext&D=psyc5&AN=2005-09375-010
Link to the External Link Resolver:
http://diglib1.bham.ac.uk:3210/sfxlcl3?sid=OVID:psycdb&id=pmid:&id=doi:&issn=&isbn=1-59385-187-1&volume=&issue=&spage=158&pages=158-173&date=2005&title=Other+minds%3A+How+humans+bridge+the+divide+between+self+and+others.&atitle=Everyday+Solutions+to+the+Problem+of+Other+Minds%3A+Which+Tools+Are+Used+When%3F&aulast=Ames&pid=%3Cauthor%3EAmes%2C+Daniel+R%3C%2Fauthor%3E%3CAN%3E2005-09375-010%3C%2FAN%3E%3CDT%3EChapter%3C%2FDT%3E
results in:
===========
===========
Reframing the rocky road: From causal analysis to mindreading as the drama of disposition inference. [References].
Ames, Daniel R.
Psychological Inquiry. Vol.20(1), Jan 2009, pp. 19-23.
AN: Peer Reviewed Journal: 2009-04633-002.
Comments on an article by Glenn D. Reeder (see record 2009-04633-001). My misgivings with Reeder's account are relatively minor. For one, I am not sure that the "multiple inference model" label quite captures the essential part of Reeder's argument. Although it suggests the plurality of judgments that perceivers often make, it does not seem to reflect Reeder's central point that, for intentional behaviors, perceivers typically make motive inferences and these guide trait inferences. Another stumbling point for me was the identification of five categories that accounted for "the majority of studies" on dispositional inference (attitude attribution, moral attribution, ability attribution, the silent interview paradigm, and the quiz-role paradigm). These are noteworthy paradigms, to be sure, but they hardly seem to exhaust the research on dispositional inference, which I take as a perceiver's ascription of an enduring trait to a target. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
Publication Date
Jan 2009
Year of Publication
2009
E-Mail Address
Ames, Daniel R.: da358#columbia.edu
Other Publishers
Lawrence Erlbaum; US
Link to the Ovid Full Text or citation:
http://ovidsp.ovid.com/ovidweb.cgi?T=JS&CSC=Y&NEWS=N&PAGE=fulltext&D=psyc6&AN=2009-04633-002
Link to the External Link Resolver:
http://diglib1.bham.ac.uk:3210/sfxlcl3?sid=OVID:psycdb&id=pmid:&id=doi:10.1080%2F10478400902744253&issn=1047-840X&isbn=&volume=20&issue=1&spage=19&pages=19-23&date=2009&title=Psychological+Inquiry&atitle=Reframing+the+rocky+road%3A+From+causal+analysis+to+mindreading+as+the+drama+of+disposition+inference.&aulast=Ames&pid=%3Cauthor%3EAmes%2C+Daniel+R%3C%2Fauthor%3E%3CAN%3E2009-04633-002%3C%2FAN%3E%3CDT%3EComment%2FReply%3C%2FDT%3E
===========
Everyday Solutions to the Problem of Other Minds: Which Tools Are Used When? [References].
Ames, Daniel R.
Malle, Bertram F [Ed]; Hodges, Sara D [Ed]. (2005). Other minds: How humans bridge the divide between self and others. (pp. 158-173). xiii, 354 pp. New York, NY, US: Guilford Press; US.
AN: Book: 2005-09375-010.
(from the chapter) Intuiting what the people around us think, want, and feel is essential to much of social life. Some scholars have gone so far as to declare the "problem of other minds"--whether a person can know if anyone else has thoughts and, if so, what they are--intractable. And yet countless times a day, we solve such problems with ease, if not perfectly then at least to our own satisfaction. What strategies underlie these everyday solutions? And how are these tools employed? This chapter offers 4 contingencies about when various inferential tools might be used. First, that affect qualifies behavior in the near term: perceived remorseful affect can lead to ascriptions of good intent to harm-doers in the short run, but repeated harm drives long-run ascriptions of bad intent. Second, that perceived similarity governs projection and stereotyping: perceptions of general similarity to a target typically draw a mindreader toward projection and away from stereotyping; perceived dissimilarity does the opposite. Third, that cumulative behavioral evidence supersedes extratarget strategies: projection and stereotyping will drive mindreading when behavioral evidence is ambiguous, but as apparent evidence accumulates, inductive judgments will dominate. Fourth, that negative social intention information weighs heavily in mindreading: within a mindreading strategy, cues signaling negative social intentions may dominate neutral or positive cues; between mindreading strategies, those strategies that signal negative social intentions may dominate. These contingencies have varying degrees of empirical support and would benefit from additional research and thinking. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
Publication Date
2005
Year of Publication
2005
Link to the Ovid Full Text or citation:
http://ovidsp.ovid.com/ovidweb.cgi?T=JS&CSC=Y&NEWS=N&PAGE=fulltext&D=psyc5&AN=2005-09375-010
Link to the External Link Resolver:
http://diglib1.bham.ac.uk:3210/sfxlcl3?sid=OVID:psycdb&id=pmid:&id=doi:&issn=&isbn=1-59385-187-1&volume=&issue=&spage=158&pages=158-173&date=2005&title=Other+minds%3A+How+humans+bridge+the+divide+between+self+and+others.&atitle=Everyday+Solutions+to+the+Problem+of+Other+Minds%3A+Which+Tools+Are+Used+When%3F&aulast=Ames&pid=%3Cauthor%3EAmes%2C+Daniel+R%3C%2Fauthor%3E%3CAN%3E2005-09375-010%3C%2FAN%3E%3CDT%3EChapter%3C%2FDT%3E
The first element is an empty string (everything before the first "1.") and is superfluous; you may delete it.
I found another solution with String#scan:
(teststring + "99.\n").scan(/^\d+\.\s*\n(.*?)(?=^\d+\.\s*\n)/m).each{|m|
puts "==========="
puts m
}
Explanation:
^\d+\.\s*\n — match a number at the start of a line, followed by a dot, optional trailing whitespace, and the newline
(.*?) — capture everything, but non-greedily (use the shortest possible hit)
(?=^\d+\.\s*\n) — look ahead for the next entry's number line, but don't consume it
m — multiline mode, so . also matches newlines
(teststring + "99.\n") — without this the solution would lose the last entry, since no following number line would satisfy the lookahead; so we append a dummy 'end tag'
I'm searching for an implementation of a Croatian word-stemming algorithm, ideally in Java, but I would also accept any other language.
Is there somewhere a community of English-speaking developers who are developing search applications for the Croatian language?
Thanks
Slavic languages are highly inflected. The most accurate and fast approach would be a combination of rules and large mappings/dictionaries.
Work has been done, but it has been held back. The Croatian morphological lexicon will help, but it's behind a slow API. More work can be found covering Bosnian, Serbian and Croatian together than Croatian alone.
Large mappings aren't always convenient (and one could effectively build a better rule-based transformer from the mappings/dictionaries/corpus).
Implementing with Hunspell and affix files could be a great way to get community and Java support. E.g., Google search: hr_hr.aff
Not tested: one should be able to reverse all the words, build a trie of the ending characters, traverse it using some rules (e.g. LCS), and build an accurate statistical transformer using corpus text.
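A rough sketch of that untested idea in Python (the suffix list is a toy stand-in for rules mined from a corpus):

SUFFIXES = ["a", "e", "i", "om", "ima", "ski", "oga", "ije"]

def build_trie(suffixes):
    # Index each suffix by its characters in reverse (ending-first).
    trie = {}
    for s in suffixes:
        node = trie
        for ch in reversed(s):
            node = node.setdefault(ch, {})
        node["$"] = True  # marks a complete suffix
    return trie

def strip_longest_suffix(word, trie, min_stem=3):
    # Walk the word backwards through the trie, remembering the longest
    # complete suffix that still leaves a plausible stem.
    node, best = trie, 0
    for i, ch in enumerate(reversed(word), start=1):
        if ch not in node:
            break
        node = node[ch]
        if "$" in node and len(word) - i >= min_stem:
            best = i
    return word[:len(word) - best] if best else word

trie = build_trie(SUFFIXES)
print(strip_longest_suffix("hrvatski", trie))  # -> 'hrvat'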
For Hunspell itself, the best I can do is some Python:
import hunspell

hs = hunspell.HunSpell(
    '/usr/share/myspell/hr_HR.dic',
    '/usr/share/myspell/hr_HR.aff')

# The following should return ['hrvatska']:
print(hs.stem('hrvatski'))
Here you can find a recent implementation in Python, done at FFZG: a stemmer for Croatian.
We performed a basic evaluation of the stemmer on a lemmatized newspaper corpus as the gold standard, with a precision of 0.986 and recall of 0.961 (F1 0.973) for adjectives and nouns. On all parts of speech we obtained a precision of 0.98 and recall of 0.92 (F1 0.947).
It is released under a GNU licence, but feel free to contact the author for further help (I only know the original author, Nikola, but not his student).