How to avoid exponential notation using Scratch? - algorithm

In my program, I have a large string of numbers that have been compiled together, and I'm switching it back and forth between different base values. But when I switch back to decimal, the computer directly switches to a number using exponential notation. The program I'm using is Scratch, but as long as any algorithms that are given are readable, I should be able to translate.
Essentially, I just need a way to go from like 1.0e13 to 10000000000000. Any ideas?

This script is the best I could muster:
And a sample output:
As well as a project containing the custom block for your convenience: https://scratch.mit.edu/projects/150067538/
Unfortunately, Scratch still rounds numbers, so your answers won't always be 100% exact, but at least they won't be in scientific (e) notation. If somebody else has an even better solution, I'd love to see it.

Like PullJosh said, (Hey again PullJosh!) scratch rounds numbers off the Scientific Notation so it won't be exactly accurate but their is always a solution to a problem!
My theory is that you can put each digit of the scientific notation into a list. This will make the conversion much easier! I will not take a picture of my code but send you the link to it as the code is massive mostly because I added some code that will detect if your scientific notation is a number and it can convert numbers like 1.123e2.
https://scratch.mit.edu/projects/341550388/editor
You can use the code without credit,yay! Just put it in your backpack and you're good to go.
Edit: Also, if you need more help with Scratch and stuff, feel free to follow me # endermite334 (you don't have to) and I will be happy to help you!

Related

GAUL library seems do not work

I am learning how to use GAUL right now. I started from the first example struggle.c
I can understand it and run it successfully. However it seems like the best result can never be the same as the target string.
The target string is "When we reflect on this struggle, we may console ourselves with the full belief, that the war of nature is not incessant, that no fear is felt, that death is generally prompt, and that the vigorous, the healthy, and the happy survive and multiply."
The GA run 50 times and the best result is something like that
"When w^ yeil^ct%on%this strsggln,#we may console ourselves nith,she gbll ^eomef' that&thk wir#od(nqure bl nfx kgciss\nt,)what no#bear is-[egt, wh_t deaxh is g_jerally promph, an s[at+th] vgormxs, rhe'he_jshy,&apd the hapsy survivTna#kqitiphy."
Is this normal or I installed wrong somehow? Thanks.
Here is the link for the tutorial of struggle.c
http://gaul.sourceforge.net/tutorial/simple.html
I'm not familiar with GAUL but for this type of example elitism normally makes a big difference. If you can turn that on you'll probably see much faster convergence as you're not constantly discarding good partial solutions.
As an alternative to increasing the number of generations, you could instead try increasing the population size so that you have more diversity to work with. You could also try altering the mutation rate. If you're using the 0.2 mentioned on the page that you're linking to it could be a bit high. When mutation is too frequent it undermines the progress that the algorithm is making.
If you just want to see the program find the target string you could try using a shorter string. The longer the string the more generations it will take to converge.

How can I learn to read formulas with greek symbols?

I suppose maybe it's because I don't know the keywords to google for, but I can't find any sources on how to read those formulas you see on wikipedia, like this for instance:
Erlang Distribution
I've searched in the math world and computer science world. It feels like it is assumed that we're supposed to understand it out of thin air. Beginner lessons seem scarce.
So far I know how sigma works. And that upside-down shape that is used as the half-life logo is called lambda. But what the heck is it trying to say?? Why is there a semi-colon in the function, etc..
If there is a book on this stuff I'd buy it in an instant. It is probably very basic stuff but I never had experience in theoretical math or even know where to look.
Does anyone know what this subject is called, and what to google for?
Formulas with this symbols usually are statistics or probability notations.
Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
Greek letters used in mathematics, science, and engineering
you can find info here
Notation in probability and statistics
here
I think the colon in alt.: \scriptstyle \theta \;=\; \frac{1}{\lambda} > 0\, scale (real) in the box in Wikipedia is just saying that there is an alternative definition, in which you specify theta rather than lambda, and in that definition what is called theta is the reciprocal of lambda in the other definition.
I once complained to a much better mathematician than I was that I came unstuck with formulas with some of the weirder greek letters in because I couldn't write them recognisably in my handwriting (which is bad enough for the latin alphabet). He said a lot of the people he knew simply said "let x be funny-squiggle-thing" and rewrote with sensible letters. I really wish I'd thought of that.
In general, letters in weird alphabets behave pretty much like sensible letters, at least in the sort of thing you are pointing at. It's done as a sort of type-checking - usually all of the letters pinched from some particular foreign language are related in some way - e.g. all parameters. Unfortunately that doesn't hold exactly in the Wikipedia example you quote, where two of the greek letters stand for functions - one is definitely the Gamma function. I suspect the other is http://en.wikipedia.org/wiki/Digamma_function, but I'm not really sure.
Check out the resources list here: http://en.wikipedia.org/wiki/Greek_alphabet
I would say your best bet is still searching in Google (or other search engine, whatever float your boat) about the specific formula you are trying to learn. Sometimes a symbol may be used in different meaning in different formula.
Anyway, there is a good resource in here that explained a lot of math symbols, not just the Greek symbol.
Some link that may interest you here and here.
First, find a Greek alphabet (upper and lower case) to refer to, so that you can at least call lambda by it's name. No one starts out knowing automatically what the various Greek characters are, not even Greeks.
Second, Read the actual article, usually either the character is defined (as lambda happens to be in the Wikipedia page you references) or it's standard nomenclature (in which case you've done the right thing by looking for a basic article on the function in question-- I do this all the time so don't feel bad.) Or, as a third option, it's a crappy paper. Happens sometimes. It's kind of a pain, though, since you can't just do a text search on the lambda character in a PDF.
(Someone educate me on that if I'm wrong....)
Third, try to pick out which unfamiliar symbols are variables (like lambda) and which are operators (like sigma, and it's helpers.) It's the operators that can sometimes cause real trouble. A variable is just a name for something, but operators come freighted with more meaning, more rules, and more syntax. It's not always obvious which symbols are operators, either.
Finally, and specifically for computer science, a good introductory book (college freshman/sophomore level) on discrete math will hopefully treat most of the basic notations and operators to at least get your feet on the ground. Nowadays, you kids and your newfangled internet might be able to get something similar from Udacity, Edx, Course RA, or the Khan Academy.
Basically, it's a lot of hard work, especially on your own, but you're already doing most of the right things.

Code generation by genetic algorithms

Evolutionary programming seems to be a great way to solve many optimization problems. The idea is very easy and the implementation does not make problems.
I was wondering if there is any way to evolutionarily create a program in ruby/python script (or any other language)?
The idea is simple:
Create a population of programs
Perform genetic operations (roulette-wheel selection or any other selection), create new programs with inheritance from best programs, etc.
Loop point 2 until program that will satisfy our condition is found
But there are still few problems:
How will chromosomes be represented? For example, should one cell of chromosome be one line of code?
How will chromosomes be generated? If they will be lines of code, how do we generate them to ensure that they are syntactically correct, etc.?
Example of a program that could be generated:
Create script that takes N numbers as input and returns their mean as output.
If there were any attempts to create such algorithms I'll be glad to see any links/sources.
If you are sure you want to do this, you want genetic programming, rather than a genetic algorithm. GP allows you to evolve tree-structured programs. What you would do would be to give it a bunch of primitive operations (while($register), read($register), increment($register), decrement($register), divide($result $numerator $denominator), print, progn2 (this is GP speak for "execute two commands sequentially")).
You could produce something like this:
progn2(
progn2(
read($1)
while($1
progn2(
while($1
progn2( #add the input to the total
increment($2)
decrement($1)
)
)
progn2( #increment number of values entered, read again
increment($3)
read($1)
)
)
)
)
progn2( #calculate result
divide($1 $2 $3)
print($1)
)
)
You would use, as your fitness function, how close it is to the real solution. And therein lies the catch, that you have to calculate that traditionally anyway*. And then have something that translates that into code in (your language of choice). Note that, as you've got a potential infinite loop in there, you'll have to cut off execution after a while (there's no way around the halting problem), and it probably won't work. Shucks. Note also, that my provided code will attempt to divide by zero.
*There are ways around this, but generally not terribly far around it.
It can be done, but works very badly for most kinds of applications.
Genetic algorithms only work when the fitness function is continuous, i.e. you can determine which candidates in your current population are closer to the solution than others, because only then you'll get improvements from one generation to the next. I learned this the hard way when I had a genetic algorithm with one strongly-weighted non-continuous component in my fitness function. It dominated all others and because it was non-continuous, there was no gradual advancement towards greater fitness because candidates that were almost correct in that aspect were not considered more fit than ones that were completely incorrect.
Unfortunately, program correctness is utterly non-continuous. Is a program that stops with error X on line A better than one that stops with error Y on line B? Your program could be one character away from being correct, and still abort with an error, while one that returns a constant hardcoded result can at least pass one test.
And that's not even touching on the matter of the code itself being non-continuous under modifications...
Well this is very possible and #Jivlain correctly points out in his (nice) answer that genetic Programming is what you are looking for (and not simple Genetic Algorithms).
Genetic Programming is a field that has not reached a broad audience yet, partially because of some of the complications #MichaelBorgwardt indicates in his answer. But those are mere complications, it is far from true that this is impossible to do. Research on the topic has been going on for more than 20 years.
Andre Koza is one of the leading researchers on this (have a look at his 1992 work) and he demonstrated as early as 1996 how genetic programming can in some cases outperform naive GAs on some classic computational problems (such as evolving programs for Cellular Automata synchronization).
Here's a good Genetic Programming tutorial from Koza and Poli dated 2003.
For a recent reference you might wanna have a look at A field guide to genetic programming (2008).
Since this question was asked, the field of genetic programming has advanced a bit, and there have been some additional attempts to evolve code in configurations other than the tree structures of traditional genetic programming. Here are just a few of them:
PushGP - designed with the goal of evolving modular functions like human coders use, programs in this system store all variables and code on different stacks (one for each variable type). Programs are written by pushing and popping commands and data off of the stacks.
FINCH - a system that evolves Java byte-code. This has been used to great effect to evolve game-playing agents.
Various algorithms have started evolving C++ code, often with a step in which compiler errors are corrected. This has had mixed, but not altogether unpromising results. Here's an example.
Avida - a system in which agents evolve programs (mostly boolean logic tasks) using a very simple assembly code. Based off of the older (and less versatile) Tierra.
The language isn't an issue. Regardless of the language, you have to define some higher-level of mutation, otherwise it will take forever to learn.
For example, since any Ruby language can be defined in terms of a text string, you could just randomly generate text strings and optimize that. Better would be to generate only legal Ruby programs. However, it would also take forever.
If you were trying to build a sorting program and you had high level operations like "swap", "move", etc. then you would have a much higher chance of success.
In theory, a bunch of monkeys banging on a typewriter for an infinite amount of time will output all the works of Shakespeare. In practice, it isn't a practical way to write literature. Just because genetic algorithms can solve optimization problems doesn't mean that it's easy or even necessarily a good way to do it.
The biggest selling point of genetic algorithms, as you say, is that they are dirt simple. They don't have the best performance or mathematical background, but even if you have no idea how to solve your problem, as long as you can define it as an optimization problem you will be able to turn it into a GA.
Programs aren't really suited for GA's precisely because code isn't good chromossome material. I have seen someone who did something similar with (simpler) machine code instead of Python (although it was more of an ecossystem simulation then a GA per se) and you might have better luck if you codify your programs using automata / LISP or something like that.
On the other hand, given how alluring GA's are and how basically everyone who looks at them asks this same question, I'm pretty sure there are already people who tried this somewhere - I just have no idea if any of them succeeded.
Good luck with that.
Sure, you could write a "mutation" program that reads a program and randomly adds, deletes, or changes some number of characters. Then you could compile the result and see if the output is better than the original program. (However we define and measure "better".) Of course 99.9% of the time the result would be compile errors: syntax errors, undefined variables, etc. And surely most of the rest would be wildly incorrect.
Try some very simple problem. Say, start with a program that reads in two numbers, adds them together, and outputs the sum. Let's say that the goal is a program that reads in three numbers and calculates the sum. Just how long and complex such a program would be of course depends on the language. Let's say we have some very high level language that lets us read or write a number with just one line of code. Then the starting program is just 4 lines:
read x
read y
total=x+y
write total
The simplest program to meet the desired goal would be something like
read x
read y
read z
total=x+y+z
write total
So through a random mutation, we have to add "read z" and "+z", a total of 9 characters including the space and the new-line. Let's make it easy on our mutation program and say it always inserts exactly 9 random characters, that they're guaranteed to be in the right places, and that it chooses from a character set of just 26 letters plus 10 digits plus 14 special characters = 50 characters. What are the odds that it will pick the correct 9 characters? 1 in 50^9 = 1 in 2.0e15. (Okay, the program would work if instead of "read z" and "+z" it inserted "read w" and "+w", but then I'm making it easy by assuming it magically inserts exactly the right number of characters and always inserts them in the right places. So I think this estimate is still generous.)
1 in 2.0e15 is a pretty small probability. Even if the program runs a thousand times a second, and you can test the output that quickly, the chance is still just 1 in 2.0e12 per second, or 1 in 5.4e8 per hour, 1 in 2.3e7 per day. Keep it running for a year and the chance of success is still only 1 in 62,000.
Even a moderately competent programmer should be able to make such a change in, what, ten minutes?
Note that changes must come in at least "packets" that are correct. That is, if a mutation generates "reax z", that's only one character away from "read z", but it would still produce compile errors, and so would fail.
Likewise adding "read z" but changing the calculation to "total=x+y+w" is not going to work. Depending on the language, you'll either get errors for the undefined variable or at best it will have some default value, like zero, and give incorrect results.
You could, I suppose, theorize incremental solutions. Maybe one mutation adds the new read statement, then a future mutation updates the calculation. But without the calculation, the additional read is worthless. How will the program be evaluated to determine that the additional read is "a step in the right direction"? The only way I see to do that is to have an intelligent being read the code after each mutation and see if the change is making progress toward the desired goal. And if you have an intelligent designer who can do that, that must mean that he knows what the desired goal is and how to achieve it. At which point, it would be far more efficient to just make the desired change rather than waiting for it to happen randomly.
And this is an exceedingly trivial program in a very easy language. Most programs are, what, hundreds or thousands of lines, all of which must work together. The odds against any random process writing a working program are astronomical.
There might be ways to do something that resembles this in some very specialized application, where you are not really making random mutations, but rather making incremental modifications to the parameters of a solution. Like, we have a formula with some constants whose values we don't know. We know what the correct results are for some small set of inputs. So we make random changes to the constants, and if the result is closer to the right answer, change from there, if not, go back to the previous value. But even at that, I think it would rarely be productive to make random changes. It would likely be more helpful to try changing the constants according to a strict formula, like start with changing by 1000's, then 100's then 10's, etc.
I want to just give you a suggestion. I don't know how successful you'd be, but perhaps you could try to evolve a core war bot with genetic programming. Your fitness function is easy: just let the bots compete in a game. You could start with well known bots and perhaps a few random ones then wait and see what happens.

How to calculate indefinite integral programmatically

I remember solving a lot of indefinite integration problems. There are certain standard methods of solving them, but nevertheless there are problems which take a combination of approaches to arrive at a solution.
But how can we achieve the solution programatically.
For instance look at the online integrator app of Mathematica. So how do we approach to write such a program which accepts a function as an argument and returns the indefinite integral of the function.
PS. The input function can be assumed to be continuous(i.e. is not for instance sin(x)/x).
You have Risch's algorithm which is subtly undecidable (since you must decide whether two expressions are equal, akin to the ubiquitous halting problem), and really long to implement.
If you're into complicated stuff, solving an ordinary differential equation is actually not harder (and computing an indefinite integral is equivalent to solving y' = f(x)). There exists a Galois differential theory which mimics Galois theory for polynomial equations (but with Lie groups of symmetries of solutions instead of finite groups of permutations of roots). Risch's algorithm is based on it.
The algorithm you are looking for is Risch' Algorithm:
http://en.wikipedia.org/wiki/Risch_algorithm
I believe it is a bit tricky to use. This book:
http://www.amazon.com/Algorithms-Computer-Algebra-Keith-Geddes/dp/0792392590
has description of it. A 100 page description.
You keep a set of basic forms you know the integrals of (polynomials, elementary trigonometric functions, etc.) and you use them on the form of the input. This is doable if you don't need much generality: it's very easy to write a program that integrates polynomials, for example.
If you want to do it in the most general case possible, you'll have to do much of the work that computer algebra systems do. It is a lifetime's work for some people, e.g. if you look at Risch's "algorithm" posted in other answers, or symbolic integration, you can see that there are entire multi-volume books ("Manuel Bronstein, Symbolic Integration Volume I: Springer") that have been written on the topic, and very few existing computer algebra systems implement it in maximum generality.
If you really want to code it yourself, you can look at the source code of Sage or the several projects listed among its components. Of course, it's easier to use one of these programs, or, if you're writing something bigger, use one of these as libraries.
These expert systems usually have a huge collection of techniques and simply try one after another.
I'm not sure about WolframMath, but in Maple there's a command that enables displaying all intermediate steps. If you do so, you get as output all the tried techniques.
Edit:
Transforming the input should not be the really tricky part - you need to write a parser and a lexer, that transforms the textual input into an internal representation.
Good luck. Mathematica is very complex piece of software, and symbolic manipulation is something that it does the best. If you are interested in the topic take a look at these books:
http://www.amazon.com/Computer-Algebra-Symbolic-Computation-Elementary/dp/1568811586/ref=sr_1_3?ie=UTF8&s=books&qid=1279039619&sr=8-3-spell
Also, going to the source wouldn't hurt either. These book actually explains the inner workings of mathematica
http://www.amazon.com/Mathematica-Book-Fourth-Stephen-Wolfram/dp/0521643147/ref=sr_1_7?ie=UTF8&s=books&qid=1279039687&sr=1-7

Choosing a multiplier for a (string) hash function

Do you have any advice/rules on selecting a multiplier to use in a (multiplicative) hash function. The function is computing the hash value of a string.
You want to use something that is relatively prime to the size of your set. That way, when you loop around, you won't end up on the same numbers you just tried.
I had an interesting discussion with a coworker about hash function recently. Our conclusions were as follows:
If you really need to write a good hash function that minimizes collisions more than the default implementations available in the standard languages you need an advanced degree in mathematics.
If you're writing applications where a custom hash function will noticeably improve the performance of your application, you're Google and you've got plenty of Math PhDs to do the work.
Sorry to not directly answer your question, but the bottom line is that there's really no need to write your own hash function for String. What language are you working with? I'd imagine there's an easy way to compute a "good enough" hash code.
Historically 33 seems like a popular choice, and it tends to work pretty well. No one knows why though. For more details, look here

Resources