So, the answer to this basically said that I really should look into encoding the genes of my creatures*, which I did!
So, I created the following neat little (byte[]-) structure:
Gene = { X, X, X, X, Y, Y, Y, Y, Z, Z, Z, Z }
Where
X = Represents a certain trait in a creature.
Y = These blocks control how, if and when crossover and mutation will happen (16 possible values; I think that should be more than enough!)
Z = The length of the strand (basically, this is for future builds, where I want to let the evolution control even the length of the entire strand).
(So Z and Y can be thought of as META-information)
(Before you ask, yes, that's 12 bytes :) )
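Purely to illustrate the layout, here is a minimal sketch of how I picture slicing such a gene (Python rather than the actual byte[], and the values are made up):

# Illustrative only: a 12-byte gene split into its X/Y/Z blocks.
gene = bytes([3, 7, 1, 9,      # X: trait values
              0, 2, 5, 1,      # Y: crossover / mutation control (meta)
              0, 0, 0, 12])    # Z: strand length (meta, for future builds)

x_block = gene[0:4]    # traits
y_block = gene[4:8]    # how, if and when crossover/mutation happen
z_block = gene[8:12]   # intended total strand length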
My questions to you are as follows:
How would I connect these to the characteristics of each 'creature'?
Basically, I see it this way (and this will probably be how I will implement it):
Each 'creature' can run around, eat and breed: basic stuff. I don't think (at least I sure hope not!) that I will need a fitness function per se; rather, I hope that evolution, as in the race for food, partners and space, will drive the creatures to evolve.
Is this view wrong? Would it be easier (do note, I'm a programmer, not a mathematician!) to look at it as one big graph and "simply" take it from there?
Or, tl;dr: Can you point me in the right direction to articles, studies and/or examples of implementation of this?
(Even more tl;dr: How do I translate a Gene into, say, the length of a leg?)
*Read the question, I'm building a sort of simulator.
I haven't seen metainformation like your Y and Z in a genetic algorithm's gene sequence before (in my limited exposure to the technique). Is your genetic algorithm a nontraditional one?
How many traits does a creature have? If your X's are representing the values of traits, and a gene sequence can have variable length (Z), then what happens if there are not enough X's defined for all the traits? What happens if there are more X's than the traits you have for a creature?
Z should be a fixed value.
Y should be parameters of your genetic evolution routine.
There should be an X (or set of X's) for every trait of a creature (no more, no less).
If the number of X's you have is fixed, then for each trait, you assign a particular index (or set of indices) to represent that trait.
EDIT:
You should determine the encoding of the traits that your X's represent: for the length of a leg, e.g., you could have a few positions represent the leg length. If positions 3-5 held the leg length, you could represent it in your X-vector like this:
[...101......]
Where the dots are other trait representations. The above snippet represents a leg length of 5 (whatever that means). The below genome still has 5 as the leg length, but other traits filled out as well.
[001101011011]
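A minimal sketch of that decoding (Python; the position range and the integer interpretation of the leg length are purely illustrative):

# Hypothetical sketch: decode positions 3-5 of the example genome as the leg length.
genome = "001101011011"

def decode_leg_length(genome, start=3, end=6):
    """Interpret genome[start:end] as an unsigned binary integer."""
    return int(genome[start:end], 2)

print(decode_leg_length(genome))   # -> 5, in whatever units you choose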
Looking at Mitchell, 1998, An Introduction to Genetic Algorithms, Chap. 3.3, I found a reference to Forrest and Jones, 1994, Modeling Complex Adaptive Systems with Echo, which refers to the software Echo that seems to do what you are looking for (evolving creatures in a world). For the moment I can't find a link to it, but here is a dissertation on implementing jEcho, by Brian McIndoe.
I want to define a continuous function on an interval of the real line -- [a,b), [a,b], (a,b), or (a,b] (where a < b and either could possibly be infinite) -- in Isabelle, or state that a function is continuous on such an interval.
I believe what I want is continuous_on s f because
"continuous_on s f ⟷ (∀x∈s. (f ⤏ f x) (at x within s))"
(which I got from here: https://isabelle.in.tum.de/library/HOL/HOL/Topological_Spaces.html#Topological_Spaces.topological_space_class.at_within|const)
So I think "s" should be an interval but I'm not sure how best to represent that.
So I would like to know how I could set "s" to any of the types of set intervals I list above. Thank you!
The answer to your question
In every ordered type you can write {a..b} for the closed interval from a to b. For the open/half-open variants there are {a<..b}, {a..<b}, {a<..<b}, {a..}, {..b}, {a<..}, and {..<b}.
These are defined in HOL.Set_Interval and their internal names are things like lessThan, atLeastAtMost, etc.
So in your case, you can simply write something like continuous_on {a..b} f.
It takes some time to understand how to do arguments about limits, continuity, etc. in a theorem prover efficiently. Don't be afraid to ask, either here or on the Zulip. You can of course always do things the hard way (with e.g. ε–δ reasoning) but once you become more proficient in using the library a lot of things become much easier. Which brings me to:
Addendum: Filters
Note however that to really reason about continuity in Isabelle in an idiomatic way you will have to understand filters, which are (among other things) a way to talk about ‘neighbourhoods of points’. There is the definition continuous, which takes a function and a filter and can be used to say that a function is continuous at a given point under some conditions.
For instance, there are filters
nhds x, which describes the neighbourhoods of x (i.e. everything that is sufficiently close to x)
at x within A, which describes the pointed neighbourhood of x intersected with A – i.e. you approach x without ever reaching it and while staying fully inside the set A
at x, which is simply at x within UNIV, i.e. the pointed neighbourhood of x without further restrictions
at_left x and at_right x for e.g. the real numbers, which are defined as at x within {..<x} and at x within {x<..}, i.e. the left and right neighbourhoods of x
Then you can write things like continuous (at x) F or continuous (at_right x) F. If I recall correctly, continuous_on A f is equivalent to ∀x∈A. continuous (at x within A) f.
Filters are also useful for many other related things, like limits, open/closed sets, uniform continuity, and derivatives.
This paper explains how filters are used in the Isabelle analysis library. It also has a short section about continuity.
I have read a lot of threads here discussing edit-distance based fuzzy-searches, which tools like Elasticsearch/Lucene provide out of the box, but my problem is a bit different. Suppose I have a dictionary of words, {'cat', 'cot', 'catalyst'}, and a character similarity relation f(x, y)
f(x, y) = 1, if characters x and y are similar
= 0, otherwise
(These "similarities" can be specified by the programmer)
such that, say,
f('t', 'l') = 1
f('a', 'o') = 1
f('f', 't') = 1
but,
f('a', 'z') = 0
etc.
Now if we have a query 'cofatyst', the algorithm should report the following matches:
('cot', 0)
('cat', 0)
('catalyst', 0)
where the number is the 0-based starting index of the match found. I have tried the Aho-Corasick algorithm, and while it works great for exact matching and in the case when a character has relatively few "similar" characters, its performance drops exponentially as we increase the number of similar characters per character. Can anyone point me to a better way of doing this? Fuzziness is an absolute necessity, and it must take into account character similarities (i.e., not blindly depend on just edit distances).
One thing to note is that in the wild, the dictionary is going to be really large.
I might try using cosine similarity, with the position of each character as a feature and mapping the product between features via a match function based on your character relations.
Not very specific advice, I know, but I hope it helps you.
edited: Expanded answer.
With cosine similarity, you compute how similar two vectors are. In your case the normalisation might not make sense. So, what I would do is something very simple (I might be oversimplifying the problem): first, treat the CxC matrix as a dependency matrix holding the probability that two characters are related (e.g., P('t' | 'l') = 1). This also allows you to have partial dependencies, to differentiate between perfect and partial matches. After this, I would compute, for each position, the probability that the letters of the two words are not the same (using the complement of P(t_i, t_j)), and then aggregate the results with a sum.
It will count the number of terms that are different for a specific pair of words, and it allows you to define partial dependencies. Furthermore, the implementation is very simple and should scale well. This is why I am not sure if I misunderstood your question.
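If it helps, here is a minimal sketch of that counting idea (Python; I have simplified the relation to a plain yes/no similarity set, and the dictionary and query are the ones from the question):

# Hypothetical sketch: slide each dictionary word over the query and count,
# per alignment, how many positions are neither equal nor "similar".
similar = {('t', 'l'), ('a', 'o'), ('f', 't')}

def char_match(a, b):
    return a == b or (a, b) in similar or (b, a) in similar

def mismatches(word, query, offset):
    window = query[offset:offset + len(word)]
    return sum(0 if char_match(w, q) else 1 for w, q in zip(word, window))

query = 'cofatyst'
for word in ['cat', 'cot', 'catalyst']:
    for offset in range(len(query) - len(word) + 1):
        if mismatches(word, query, offset) == 0:
            print(word, offset)    # -> cat 0, cot 0, catalyst 0

This only illustrates the scoring; for a really large dictionary you would still need an index or some pruning on top of it.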
I am using the Fuse JavaScript Library for a project of mine. It is a JavaScript file which works on JSON datasets. It is quite fast. Have a look at it.
It implements the full Bitap algorithm, leveraging a modified version of the Diff, Match & Patch tool by Google (according to its site).
The code is simple enough to understand how the algorithm is implemented.
I was given a brain puzzle from lonpos.cc as a present. I was curious how many different solutions there were, and I quite enjoy writing algorithms and code, so I started writing an application to brute force it.
The puzzle looks like this: http://www.lonpos.cc/images/LONPOSdb.jpg / http://cdn100.iofferphoto.com/img/item/191/498/944/u2t6.jpg
It's a board of 20x14 "points", and all puzzle pieces can be flipped and turned. I wrote an application where each piece (and the puzzle) is represented like this:
01010
00100
01110
01110
11111
01010
Now my application so far is reasonably simple.
It takes the list of pieces and a blank board, pops off piece #0, flips it in every direction, and for each orientation tries to place it at every x and y coordinate. If it successfully places a piece, it passes a copy of the new "board" (with that piece used up) to a recursive function, which tries all combinations for the remaining pieces.
Explained in pseudocode:
bruteForce(Board base, List pieces) {
    if (pieces.isEmpty()) { /* found a complete solution */ return; }
    Piece piece = pieces.pop();
    for (Piece p : piece.allFlipsAndRotations()) {
        for (int y = 0; y < base.height; y++) {
            for (int x = 0; x < base.width; x++) {
                if (canplace(p, x, y)) {
                    Board newBoard = base.clone();
                    newBoard.placePiece(p, x, y);
                    bruteForce(newBoard, pieces);
                }
            }
        }
    }
    pieces.push(piece);  // put the piece back for the caller
}
Now I'm trying to find out ways to make this quicker. Things I've thought of so far:
Making it solve in parallel - Implemented, now using 4 threads.
Sorting the pieces, and only trying to place the pieces that will fit in the x,y space we're trying to fill (e.g. if we're near the bottom row and only have 4 "points" from our position to the bottom, don't try the pieces that are 8 high).
Not duplicating the board, instead using placePiece and removePiece or something like it.
Checking for "invalid" boards, aka if a piece is impossible to reach (boxed in completely).
Anyone have any creative ideas on how I can do this quicker? Or any way to mathematically calculate how many different combinations there are?
I don't see any obvious way to do things fast, but here are some tips that might help.
First off, if you ignore the bumps, you have a 6x4 grid to fill with 1x2 blocks. Each of the blocks has 6 positions where it can have a bump or a hole. Therefore, you're trying to find an arrangement of the blocks such that at each edge, a bump is matched with a hole. Also, you can represent the pieces much more efficiently using this information.
Next, I'd recommend trying all ways to place a block in a specific spot rather than all places to place a specific block anywhere. This will reduce the number of false trails you go down.
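For illustration, a rough sketch of that search order (Python; the set-of-cells encoding of the board and pieces is just one possible representation, not your actual classes):

# Hypothetical sketch: always target the first empty cell and try every unused
# piece, in every orientation, in every way that covers exactly that cell.
def count_solutions(empty_cells, pieces_left, orientations):
    if not empty_cells:
        return 1                                  # board full: one solution
    target = min(empty_cells)                     # the specific spot to fill
    count = 0
    for piece in list(pieces_left):
        for shape in orientations[piece]:         # deduplicated flips/rotations
            for ax, ay in shape:                  # put this offset onto target
                cells = {(target[0] + dx - ax, target[1] + dy - ay)
                         for dx, dy in shape}
                if cells <= empty_cells:
                    count += count_solutions(empty_cells - cells,
                                             pieces_left - {piece},
                                             orientations)
    return count

Here empty_cells would be a set of (x, y) board coordinates, pieces_left a set of piece names, and orientations[piece] the precomputed set of offset-sets for that piece's flips and rotations.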
This looks like the Exact Cover Problem. You basically want to cover all fields on the board with your given pieces. I can recommend Dancing Links, published by Donald Knuth. In the paper you find a clear example for the pentomino problem which should give you a good idea of how it works.
You basically set up a system that keeps track of all possible ways to place a specific block on the board. By placing a block, you would cover a set of positions on the field. These positions can't be used to place any other blocks. All possibilities would then be erased from the problem setting before you place another block. The dancing links allows for fast backtracking and erasing of possibilities.
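To make the exact-cover reduction concrete, here is a minimal recursive sketch in Python (plain sets rather than the dancing-links structure from the paper; the column and row encodings are placeholders):

# Hypothetical sketch: columns = constraints still to satisfy, rows = candidate
# placements, each mapped to the set of columns it would cover.
def exact_cover(columns, rows, solution=None):
    if solution is None:
        solution = []
    if not columns:
        yield list(solution)                      # everything covered exactly once
        return
    # Knuth's heuristic: branch on the column with the fewest candidates.
    col = min(columns, key=lambda c: sum(1 for r in rows.values() if c in r))
    for name, covered in rows.items():
        if col in covered and covered <= columns:
            solution.append(name)
            yield from exact_cover(columns - covered, rows, solution)
            solution.pop()

In the pentomino-style encoding from the paper, you would have one column per board cell plus one column per piece, and one row per (piece, orientation, position) placement, so every cell is covered and every piece is used exactly once. Dancing links then makes the cover/uncover steps fast; this sketch only shows the search structure.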
Is it possible to obtain the maximum difference between two columns (for example starting and ending weights)?
Right now I'm leaning towards no, as this would require a new column with the difference between the two columns for each row, and then taking the max of that. Doing it the way I originally intended doesn't work either, since arithmetic operations are not allowed in the conditions of select operations (e.g. SIGMA (c1 - c2 < c3 - c4)(Table) is not allowed).
Disclosure: this is part of a homework question.
It can be done, exactly in the way you planned, but you need generalized projection for that. The generalized projection is the operator
Π(E1, E2,..., En)R
where R is a relation, and E1...En are expressions in the form a⊕b, where a and b are attributes of R or constants, and ⊕ is an arbitrary binary operator between them. The result is a relation with attributes E1...En.
This would allow you to project the differences into a new relation (R' := Π(x-y)R), then find the maximum on that, just as you planned.
If we're not allowed to use generalized projection, then I think we have no means to actually subtract an attribute from another, or to calculate anything from them at all, as the definition of projection allows only attribute names, and the definition of selection allows only expressions of the form aθb, where a and b are attributes or constants and θ is a binary relational operator. (This is logical, in its way: if we have a relation R(X,Y), we have no idea about the type of X or Y, which makes operations on them quite meaningless.)
I think generalized projection is a great extension to relational algebra. It's obviously immensely useful in real life, and it can be defended even from a more scientific point of view: if we allow binary conditional operators on the values like "X > 50", then we made assumptions on the type already, rendering that point kind of moot. Your instructor may disagree, though.
If you're looking to do this in the real world, you should be able to do this with a subquery (or a view, which amounts to much the same thing), something like:
select max (diff) from (
select high - low as diff from blah blah blah
)
Whether this applies to the abstract world of relational algebra, I couldn't say. I'm too busy fixing those damn real-world problems :-)
I am a mechanical engineer with a computer science question. This is an example of what the equations I'm working with are like:
x = √((y-z)×2/r)
z = f×(L/D)×(x/2g)
f = something crazy with x in it
etc…(there are more equations with x in it)
The situation is this:
I need r to find x, but I need x to find z. I also need x to find f, which is part of finding z. So I guess a value for x, and then I use that value to find r and f. Then I go back and use the values I found for r and f to find x. I keep doing this until the guessed value and the calculated value are the same.
My question is:
How do I get the computer to do this? I've been using Mathcad, but an example in another language like C++ is fine.
The very first thing you should do when faced with an iterative algorithm is to write down on paper the sequence that will result from your idea:
E.g.:
x_0 = ..., f_0 = ..., r_0 = ...
x_1 = ..., f_1 = ..., r_1 = ...
...
x_n = ..., f_n = ..., r_n = ...
Now, you have an idea of what you should implement (even if you don't know how). If you don't manage to find a closed form expression for one of the x_i, r_i or whatever_i, you will need to solve one dimensional equations numerically. This will imply more work.
Now, for the implementation part: if you have never written a program, you should seriously ask someone in person who can help you (or hire an intern and have them write the code). We cannot help you start from scratch with, e.g., C programming, but we are willing to help you with the specific problems that will arise when you write the program.
Please note that your algorithm is not guaranteed to converge, even if you strongly think there is a unique solution. Solving non-linear equations is a difficult subject.
It appears that Mathcad has many abstractions for iterative algorithms without the need to actually implement them directly using a "lower level" language. Perhaps this question is better suited for the Mathcad forums at:
http://communities.ptc.com/index.jspa
If you are using Mathcad, it has the functionality built in. It is called a solve block.
Start with the keyword "given"
Given
define the guess values for all unknowns
x:=2
f:=3
r:=2
...
define your constraints
x = √((y-z)×2/r)
z = f×(L/D)×(x/2g)
f = something crazy with x in it
etc…(there are more equations with x in it)
calculate the solution
find(x, y, z, r, ...)=
Check Mathcad help or Quicksheets for examples of the exact syntax.
The simple answer to your question is this pseudo-code:
X = startingX;
lastF = Infinity;
F = 0;
tolerance = 1e-10;
while ((lastF - F)^2 > tolerance)
{
    lastF = F;
    X = ?;
    R = ?;
    F = FunctionOf(X, R);
}
This may not do what you expect at all. It may give a valid but nonsense answer or it may loop endlessly between alternate wrong answers.
This is standard substitution to convergence. There are more advanced techniques like DIIS but I'm not sure you want to go there. I found this article while figuring out if I want to go there.
In general, it really pays to think about how you can transform your problem into an easier problem.
In my experience it is better to pose your problem as a univariate bounded root-finding problem and use Brent's method if you can.
Failing that, the next option is multivariate minimization with something like BFGS.
Iterative solutions are horrible, but they are more easily handled once you think of them as X2 = f(X1), where X is the input vector and you're trying to drive the difference between X1 and X2 to zero.
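For illustration, a minimal sketch of that root-finding reformulation (Python with SciPy; calc_x is only a placeholder for the real chain of equations, and the bracket [0, 10] is made up):

from scipy.optimize import brentq

# Placeholder for the full chain of equations: given a guess for x,
# compute f, r, z, ... and return the x that falls out at the end.
def calc_x(x_guess):
    return 0.5 * x_guess + 1.0

# A consistent solution is a fixed point, i.e. a root of calc_x(x) - x.
def residual(x):
    return calc_x(x) - x

# Brent's method needs a bracket [a, b] where the residual changes sign.
x_solution = brentq(residual, 0.0, 10.0)
print(x_solution)   # 2.0 for the placeholder above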
As the commenters have noted, the mathematical aspects of your question are beyond the scope of the help you can expect here, and are even beyond the help you could be offered based on the detail you posted.
However, I think that even if you understood the mathematics thoroughly there are computer science aspects to your question that should be addressed.
When you write your code, try to organize it into functions that depend only upon the parameters you pass in. So write a subroutine that takes in values for y, z, and r and returns x. Make another that takes in f, L, D, x, and g and returns z. Now you have testable routines that you can check to make sure they are computing correctly. Check the input values inside the routines: for instance, in computing x you will get a divide-by-zero error if you pass in 0 for r. Think about how you want to handle this.
If you are going to solve this problem iteratively, you will need a method that decides, based on the results of one iteration, what the values for the next iteration will be. This should also be encapsulated in a subroutine. Now, if you are using a language that allows only one value to be returned from a subroutine (which is the case in most common computation languages: C, C++, Java, C#), you need to package up all your variables into some kind of data structure to return them. You could use an array of reals or doubles, but it would be nicer to make an object, so you can reference the variables by name and not by position (less chance of error).
Another aspect of iteration is knowing when to stop. Certainly you'll do so when you get a solution that converges. Make this decision into another subroutine. Now when you need to change the convergence criteria there is only one place in the code to go to. But you need to consider other reasons for stopping - what do you do if your solution starts diverging instead of converging? How many iterations will you allow the run to go before giving up?
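For illustration, a rough sketch of that structure (Python; the update formula and the z_of_x / r_of_x helpers are placeholders, not the real equations):

import math

# Hypothetical sketch: each computation in its own testable routine, plus an
# explicit stopping rule that also guards against running forever.
def next_x(y, z, r):
    if r == 0:
        raise ValueError("r must be nonzero")
    return math.sqrt(2.0 * (y - z) / r)           # placeholder form of x = ...

def has_converged(x_old, x_new, tol=1e-10):
    return abs(x_new - x_old) < tol

def solve(x0, y, z_of_x, r_of_x, max_iterations=1000):
    x = x0
    for _ in range(max_iterations):
        x_new = next_x(y, z_of_x(x), r_of_x(x))
        if has_converged(x, x_new):
            return x_new
        x = x_new
    raise RuntimeError("did not converge within max_iterations")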
Another aspect of iteration on a computer is round-off error. Mathematically, 10^40/10^38 is 100. Mathematically, 10^20 + 1 > 10^20. These statements are not necessarily true in floating-point computations. Your calculations may need to take this into account or you will end up with numbers that are garbage. This is an example of a cross-cutting concern that does not lend itself to encapsulation in a subroutine.
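A quick illustration of the second statement in double-precision floating point (Python):

# Round-off: adding 1 to 10^20 is lost entirely in double precision.
print(1e20 + 1 == 1e20)     # True: 1 is far below the spacing between doubles near 1e20
print((1e20 + 1) - 1e20)    # 0.0, not 1.0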
I would suggest that you go look at the Python language, and the pythonxy.com extensions. There are people in the associated forums that would be a good resource for helping you learn how to do iterative solving of a system of equations.