Does there exist a cellular automaton rule that is RANDOM (like rule 30) and has 3 colors?
This is rather a research problem: you have to run statistical tests on the cellular automaton (CA) rule you find to show that it is random. If you would like to do a research project like this, check out The Wolfram Science Summer School.
For now, let's see what information and tools can get you started.
First of all, I would read Chapter 6: Starting from Randomness, Section 5: Randomness in Class 3 Systems, in the "New Kind of Science" (NKS) book, and the surrounding chapters, for a better understanding of the subject.
I would also look at the many free apps exploring 3-color rules at The Wolfram Demonstrations Project.
Next, you can start from good candidates found on page 64. Follow that link and read the image captions about 3-color CAs with seemingly random behavior. The online book is free (you may need to register once). I would also recommend reading pages 62-70, which explain those images.
Also take a look at "Random Sequence Generation by Cellular Automata" by Stephen Wolfram.
If you do not have Mathematica, then Wolfram|Alpha can provide tons of valuable information. Here are the queries for the CAs from the NKS book: rule 177, rule 912, and rule 2040. Note how Wolfram|Alpha gives you, for example, difference-pattern images: highly divergent (fast-spreading) patterns indicate chaos and randomness.
If you have Mathematica, it is easy to evolve CAs (and further test their randomness properties, say, with a chi-squared test). This is how you set up the 3-color, range-1 totalistic CAs from the pictures in the NKS book (you can dig further with Hypothesis Testing):
ArrayPlot[CellularAutomaton[{#, {3, 1}}, {{1}, 0}, 50], Mesh -> True,
    PixelConstrained -> 7, ColorRules -> {0 -> White, 1 -> Red},
    Epilog -> Text[Style["Rule " <> ToString[#], Red, Bold, 25], {50, 340}]] & /@
  {177, 912, 2040} // Column
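If you would rather verify things outside Mathematica, here is a minimal Python sketch (my own, not from the NKS book) that evolves a 3-color, range-1 totalistic rule using Wolfram's code numbering and runs a crude chi-squared uniformity test on the center column:

import numpy as np
from scipy.stats import chisquare

def totalistic_ca(code, steps, k=3, r=1):
    # Wolfram's totalistic numbering: digit s of `code` in base k is the
    # new cell value when the neighborhood total equals s.
    n_totals = (k - 1) * (2 * r + 1) + 1      # totals range over 0 .. 6 for k=3, r=1
    digits = [(code // k**i) % k for i in range(n_totals)]
    width = 2 * r * steps + 1                 # wide enough to contain the light cone
    row = np.zeros(width, dtype=int)
    row[width // 2] = 1                       # single nonzero cell on a zero background
    history = [row.copy()]
    for _ in range(steps):
        total = sum(np.roll(row, s) for s in range(-r, r + 1))
        row = np.take(digits, total)
        history.append(row.copy())
    return np.array(history)

# Evolve rule 2040 and check (very crudely) whether the three colors
# occur equally often in the center column.
grid = totalistic_ca(2040, 400)
center = grid[:, grid.shape[1] // 2]
observed = np.bincount(center, minlength=3)
print(observed, chisquare(observed))          # a small p-value would reject uniformity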
Hi
I have seen this paper (and also "Game Theory applied to gene expression analysis" and "Game Theory and Microarray Data Analysis"), whose authors have used game theory for their microarray DEG analysis (the "microarray game").
Is there any simple guide from you (or other online resources) that describes how to use the related formulas for checking game-theory concepts in the DEG analysis of RNA-seq experiments? (Basically, is it even practical?)
Maybe there is some software for doing such an investigation painlessly.
NOTE 1: For example, please have a look at the "Game Theory Method" section in the first paper above:
"Let N = {1, ..., n} be a set of genes. A microarray game is a coalitional game (N, w) where the function w assigns to each coalition S ⊆ N a frequency of associations, between a condition and an expression property, of genes realized in the coalition S."
Imagine we have 150 genes up-regulated in females and 80 up-regulated in males (using de novo assembly and the DESeq2 package); now how can I use game theory for mining something new, or some extra connections, among this collection of genes?
NOTE 2: I have asked this question on Biostars, but got no answer after 8 weeks.
Thanks
I'm reading the book 'The Practice of Programming' by Brian W. Kernighan and Rob Pike. Chapter 3 provides the algorithm for a Markov chain approach that reads a source text and uses it to generate random text that "reads well" (meaning the output is closer to proper-sounding English than gibberish):
set w1 and w2 to the first two words in the source text
print w1 and w2
loop:
    randomly choose w3, one of the successors of prefix w1 and w2 in the source text
    print w3
    replace w1 and w2 by w2 and w3
    repeat loop
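For concreteness, here is one way that pseudocode might look in Python (my own sketch, not the book's code; "source.txt" is a placeholder name):

import random
from collections import defaultdict

def build_chain(words):
    # Map each (w1, w2) prefix to the list of words that follow it.
    successors = defaultdict(list)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        successors[(w1, w2)].append(w3)
    return successors

def generate(words, n_words):
    successors = build_chain(words)
    w1, w2 = words[0], words[1]          # set w1 and w2 to the first two words
    out = [w1, w2]                       # print w1 and w2
    for _ in range(n_words):
        choices = successors.get((w1, w2))
        if not choices:                  # no successor: the very case asked about below
            break
        w3 = random.choice(choices)      # randomly choose a successor w3
        out.append(w3)                   # print w3
        w1, w2 = w2, w3                  # replace w1 and w2 by w2 and w3
    return " ".join(out)

print(generate(open("source.txt").read().split(), 100))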
My question is: what's the standard way to handle the situation where the new values for w2 and w3 have no successor in the source text?
Many thanks in advance!
Here are your options:
Choose a word at random? (Always works)
Choose a new W2 at random? (Can conceivably still loop)
Back up to previous W1 and W2? (Can conceivably still loop)
I'd probably go with trying #2 or #3 once, then fall back to #1, which will always work.
The situation you are describing considers 3-grams, that is, the statistical frequency of a 3-tuple in a given dataset. To create a Markov matrix with no absorbing states, that is, no points where f_2(w1,w2) -> w3 and f_2(w2,w3) = 0, you'll have to extend the possibilities. A generalized extension to @ThomasW's answer would be:
If the set predictor f_2(w1,w2) != 0 draw from that
If the set predictor f_1(w2) != 0 draw from that
If the set predictor f_0() != 0 draw from that
That is, draw as normal from the 3-gram set, then the 2-gram set, then the 1-gram set. At the last step you'll simply be drawing a word at random, weighted by its statistical frequency.
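In Python that backoff might look like this (a sketch under the f_2/f_1/f_0 notation above; the training code is my own assumption):

import random
from collections import defaultdict

def train(words):
    f2 = defaultdict(list)               # (w1, w2) -> successors, the 3-gram predictor
    f1 = defaultdict(list)               # w2 -> successors, the 2-gram predictor
    for a, b, c in zip(words, words[1:], words[2:]):
        f2[(a, b)].append(c)
        f1[b].append(c)
    f0 = list(words)                     # the 1-gram set: every word, with repeats
    return f2, f1, f0

def next_word(w1, w2, f2, f1, f0):
    # Back off: 3-gram set, then 2-gram set, then a random word. Since the
    # lists keep repeats, random.choice weights by statistical frequency.
    if f2[(w1, w2)]:
        return random.choice(f2[(w1, w2)])
    if f1[w2]:
        return random.choice(f1[w2])
    return random.choice(f0)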
I believe that this is a serious problem in NLP, one without a straightforward solution. One approach is to tag the parts of speech in addition to the actual words, in order to generalize the mappings. Using parts of speech, the program can at least predict what part of speech should follow the words W2 and W3 if there is no precedent for the word sequence. "Once this mapping has been performed on training examples, we can train a tagging model on these training examples. Given a new test sentence we can then recover the sequence of tags from the model, and it is straightforward to identify the entities identified by the model." From Columbia notes.
I'm having trouble getting started. I am in a Financial Engineering program, and I am trying to use a book written in 2003 to help me model partial differential equations, the Black-Scholes model, etc.
But in the introductory chapter there is a very basic ODE interest-rate problem, and my output is very different from the book's.
DSolve[{y'[t] == ry[t], y[0] == P}, y[t], t]
is what I put in. The book has a very neat solution of {{y[t] -> P E^(r t)}}.
What I get is something like this (note: I can't post the actual output):
{{y[t] -> Integrate[ry[K[1]], {K[1], 1, t}] - Integrate[ry[K[1]], {K[1], 1, 0}] + P}}
What are the big K's? Is this just some fallback output when Mathematica can't generate a symbolic solution? Is it because of some problem with my setup or file system? Also, are there any suggestions for using old Mathematica books where the provided code may be out of date? I just need to find a way to move forward and apply this to my studies.
Lastly, sometimes with other ODEs I get results different from my source; e.g. I followed a Mathematica ODE tutorial and my output was different too. In some places my version of Mathematica won't calculate, or drops certain variables or constants in the solution, or there is no output. I have browsed for general troubleshooting for DSolve, but have found no persistent, recognized bug. I am wondering if there is something wrong in my file system, or something else. Please help!
You have a space missing between the r and the y[t]. Without the space, Mathematica parses ry[t] as a single unknown function named ry, so DSolve can only return a formal integral over it (the K[1] you see is just a generated dummy integration variable).
Try:
DSolve[{y'[t] == r y[t], y[0] == P}, y[t], t]
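With the space added, DSolve should return the book's answer, {{y[t] -> E^(r t) P}}.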
I have been wondering for some time how Google Translate (or maybe a hypothetical translator) detects the language of the string entered in the "from" field. I have been thinking about this, and the only thing I can come up with is looking for words that are unique to a language in the input string. Another way could be to check sentence formation or other semantics in addition to keywords. But this seems a very difficult task considering the number of languages and their semantics. I did some research and found that there are ways that use n-gram sequences and statistical models to detect the language. I would appreciate a high-level answer too.
Take the Wikipedia in English. Check the probability that after the letter 'a' comes a 'b' (for example), and do that for all combinations of letters; you will end up with a matrix of probabilities.
If you do the same for the Wikipedia in different languages you will get different matrices for each language.
To detect the language, just use all those matrices and use the probabilities as a score. Let's say that in English you'd get these probabilities:
t->h = 0.3, h->e = 0.2
and in the Spanish matrix you'd get that
t->h = 0.01, h->e = 0.3
The word 'the', using the English matrix, would give you a score of 0.3 + 0.2 = 0.5,
and using the Spanish one: 0.01 + 0.3 = 0.31.
The English matrix wins, so the text has to be English.
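A toy version of this in Python (the two training strings are stand-ins; real matrices would be built from, say, full Wikipedia dumps):

from collections import Counter

def bigram_probs(text):
    # Estimate P(next letter | letter) from letter-bigram counts; crude, since
    # stripping non-letters joins pairs across word boundaries.
    letters = [c for c in text.lower() if c.isalpha()]
    pairs = Counter(zip(letters, letters[1:]))
    firsts = Counter(letters[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}

def score(text, probs):
    # Sum the transition probabilities of the text's letter pairs, as above.
    letters = [c for c in text.lower() if c.isalpha()]
    return sum(probs.get(p, 0.0) for p in zip(letters, letters[1:]))

corpora = {
    "English": "the quick brown fox jumps over the lazy dog",
    "Spanish": "el veloz murcielago hindu comia feliz cardillo y kiwi",
}
models = {lang: bigram_probs(text) for lang, text in corpora.items()}
print(max(models, key=lambda lang: score("the", models[lang])))   # -> English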
If you want to implement a lightweight language guesser in the programming language of your choice, you can use the method of Cavnar and Trenkle '94, "N-Gram-Based Text Categorization". You can find the paper on Google Scholar; it is pretty straightforward.
Their method builds an N-gram statistic for every language it should be able to guess from some text in that language. Then the same statistic is built for the unknown text as well and compared to the previously trained statistics by a simple out-of-place measure.
If you use unigrams + bigrams (possibly + trigrams) and compare the 100-200 most frequent N-grams, your hit rate should be over 95% if the text to guess is not too short.
There was a demo available here but it doesn't seem to work at the moment.
There are other ways of language guessing, including computing the probability of N-grams and using more advanced classifiers, but in most cases the approach of Cavnar and Trenkle should perform sufficiently well.
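The out-of-place comparison itself is only a few lines; this Python sketch follows my reading of the paper (it skips the word-padding the authors use):

from collections import Counter

def profile(text, n_max=3, size=200):
    # Ranked list of the most frequent character n-grams, n = 1 .. n_max.
    grams = Counter()
    for n in range(1, n_max + 1):
        grams.update(text[i:i + n] for i in range(len(text) - n + 1))
    return [g for g, _ in grams.most_common(size)]

def out_of_place(doc_profile, lang_profile):
    # Sum over the document's n-grams of how far their rank moved relative
    # to the language profile; missing n-grams get the maximum penalty.
    ranks = {g: i for i, g in enumerate(lang_profile)}
    missing = len(lang_profile)
    return sum(abs(i - ranks[g]) if g in ranks else missing
               for i, g in enumerate(doc_profile))

# Usage: guess = min(trained, key=lambda lang: out_of_place(profile(text), trained[lang]))
# where `trained` maps each language to a profile built from a training corpus.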
You don't have to do deep analysis of text to have an idea of what language it's in. Statistics tells us that every language has specific character patterns and frequencies. That's a pretty good first-order approximation. It gets worse when the text is in multiple languages, but still it's not something extremely complex.
Of course, if the text is too short (e.g. a single word, or worse, a single short word), statistics doesn't work; you need a dictionary.
An implementation example.
Mathematica is a good fit for implementing this. It recognizes words (i.e. has dictionaries) in the following languages:
dicts = DictionaryLookup[All]
{"Arabic", "BrazilianPortuguese", "Breton", "BritishEnglish", \
"Catalan", "Croatian", "Danish", "Dutch", "English", "Esperanto", \
"Faroese", "Finnish", "French", "Galician", "German", "Hebrew", \
"Hindi", "Hungarian", "IrishGaelic", "Italian", "Latin", "Polish", \
"Portuguese", "Russian", "ScottishGaelic", "Spanish", "Swedish"}
I built a little naive function to calculate, for each of those languages, the fraction of a sentence's words found in its dictionary:
f[text_] :=
 SortBy[{#[[1]], #[[2]]/Length@k} & /@ (Tally@(First /@
       Flatten[DictionaryLookup[{All, #}] & /@ (k =
           StringSplit[text]), 1])), -#[[2]] &]
So, just by looking up words in dictionaries, you may get a good approximation, even for short sentences:
f["we the people"]
{{BritishEnglish,1},{English,1},{Polish,2/3},{Dutch,1/3},{Latin,1/3}}
f["sino yo triste y cuitado que vivo en esta prisión"]
{{Spanish,1},{Portuguese,7/10},{Galician,3/5},... }
f["wszyscy ludzie rodzą się wolni"]
{{"Polish", 3/5}}
f["deutsch lernen mit jetzt"]
{{"German", 1}, {"Croatian", 1/4}, {"Danish", 1/4}, ...}
You might be interested in the WiLI benchmark dataset for written language identification. The high-level answer, which you can also find in the paper, is the following:
Clean the text: remove things you don't want or need; make Unicode unambiguous by applying a normal form.
Feature extraction: count n-grams, create tf-idf features, or something like that.
Train a classifier on the features: neural networks, SVMs, Naive Bayes, whatever you think could work (see the sketch below).
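With scikit-learn, those three steps collapse into a small pipeline; this is only a sketch, and the four one-sentence training samples are obviously stand-ins for a real corpus such as WiLI:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Stand-in training data; a real model would train on thousands of samples.
texts = ["the quick brown fox", "der schnelle braune fuchs",
         "le renard brun rapide", "el rapido zorro marron"]
labels = ["en", "de", "fr", "es"]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),   # step 2: character n-gram tf-idf
    MultinomialNB(),                                        # step 3: Naive Bayes classifier
)
clf.fit(texts, labels)
print(clf.predict(["ein brauner fuchs"]))   # likely ['de'], given the shared character n-grams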
The application I'm working on is a "configurator" of sorts. It's written in C# and I even wrote a rules engine to go with it. The idea is that there are a bunch of propositional logic statements, and the user can make selections. Based on what they've selected, some other items become required or completely unavailable.
The propositional logic statements generally take the following forms:
A => ~X
ABC => ~(X+Y)
A+B => Q
A(~(B+C)) => ~Q
A <=> B
The symbols:
=> -- Implication
<=> -- Material Equivalence
~ -- Not
+ -- Or
Two letters side-by-side -- And
I'm very new to Prolog, but it seems like it might be able to handle all of the "rules processing" for me, allowing me to get out of my current rules engine (it works, but it's not as fast or easy to maintain as I would like).
In addition, all of the available options fall in a hierarchy. For instance:
Outside
    Color
        Red
        Blue
        Green
    Material
        Wood
        Metal
If an item at the second level (feature, such as Color) is implied, then an item at the third level (option, such as Red) must be selected. Similarly if we know that a feature is false, then all of the options under it are also false.
The catch is that every product has its own set of rules. Is it a reasonable approach to set up a knowledge base containing these operators as predicates, and then at runtime start building all of the rules for the product?
The way I imagine it might work would be to set up the idea of components, features, and options. Then set up the relationships between them (for instance, if a feature is false, then all of its options are false). At runtime, add the product's specific rules. Then pass all of the user's selections to a function, retrieving as output which items are true and which items are false.
I don't know all the implications of what I'm asking about, as I'm just getting into Prolog, but I'm trying to avoid going down a bad path and wasting lots of time in the process.
Some questions that might help target what I'm trying to find out:
Does this sound do-able?
Am I barking up the wrong tree?
Are there any drawbacks or concerns to trying to create all of these rules at runtime?
Is there a better system for this kind of thing out there that I might be able to squeeze into a C# app (Silverlight, to be exact)?
Are there other competing systems that I should examine?
Do you have any general advice about this sort of thing?
Thanks in advance for your advice!
Sure, but Prolog has a learning curve.
Rule-based inference is Prolog's game, though you may have to rewrite many rules into Horn clauses. A+B => Q is doable (it becomes q :- a. q :- b. or q :- (a;b).) but your other examples must be rewritten, including A => ~X.
Depends on your Prolog compiler, specifically whether it supports indexing for dynamic predicates.
Search around for terms like "forward checking", "inference engine" and "business rules". Various communities keep inventing different terminologies for this problem.
Constraint Handling Rules (CHR) is a logic programming language, implemented as a Prolog extension, that is much closer to rule-based inference/forward chaining/business rules engines. If you want to use it, you'll still have to learn basic Prolog, though.
Keep in mind that Prolog is a programming language, not a silver bullet for logical inference. It cuts some corners of first-order logic to keep things efficiently computable. This is why it only handles Horn clauses: they can be mapped one-to-one with procedures/subroutines.
You can also throw in DCGs to generate bills of materials. The idea is
roughly that terminals can be used to indicate subproducts, and
non-terminals to define more and more complex combinations of subproducts
until you arrive at your final configurable products.
Take for example the two attribute value pairs Color in {red, blue, green}
and Material in {wood, metal}. These could specify a door knob, whereby
not all combinations are possible:
knob(red,wood) --> ['100101'].
knob(red,metal) --> ['100102'].
knob(blue,metal) --> ['100202'].
You could then define a door as:
door ... --> knob ..., panel ...
Interestingly you will not see any logic formula in such a product specification,
only facts and rules, and a lot of parameters passed around. You can use the
parameters in a knowledge acquisition component. By just running uninstantiated
goals you can derive possible values for the attribute value pairs. The predicate
setof/3 will sort and remove duplicates for you:
?- setof(Color,Material^Bill^knob(Color,Material,Bill,[]),Values).
Values = [blue, red]
?- setof(Material,Color^Bill^knob(Color,Material,Bill,[]),Values).
Values = [metal, wood]
Now you know the range of the attributes and you can let the end-user successively
pick an attribute and a value. Assume he takes the attribute Color and its value blue.
The range of the attribute Material then shrinks accordingly:
?- setof(Material,Bill^knob(blue,Material,Bill,[]),Values).
Values = [metal]
In the end, when all attributes have been specified, you can read off the
article numbers of the subproducts. You can use this for price calculation,
by adding some facts that give you additional information on the article
numbers, or to generate ordering lists etc.:
?- knob(blue,metal,Bill,[]).
Bill = ['100202']
Best Regards
P.S.:
Oh, it seems that the bill-of-materials idea used in the product
configurator goes back to Clocksin & Mellish. At least I found a
corresponding comment here:
http://www.amzi.com/manuals/amzi/pro/ref_dcg.htm#DCGBillMaterials