Game Data Structure in OCaml - data-structures

I am currently working on a game / simulation of a computer-like logistics system (like the Minecraft mod Applied Energistics).
The main part of the game is the 2d grid of blocks.
All the blocks have common properties like a position.
But then there are supposed to be different kinds of Blocks like:
Item-Containers,
Input and Export Busses,
etc.
In an imperative object-oriented language (like Java) I would implement this with a main block class holding the common properties like position, and then subclasses that inherit from the block class. These subclasses would implement the different properties of the different block types.
In OCaml I am a little bit lost.
I could create objects which inherit, but this does not work like in Java.
For example:
I can't put objects of different subclasses in one list together.
I also wanted to approach the data structure differently by separating data from logic. I would not add methods to the objects. I tried using records instead of objects.
I don't know how to implement the different block types.
I tried using custom data types like this:
type blockType = Container | Input | Output | Air
type block = {blockType : blockType; pos : int * int}
I struggled to add the individual additional properties. I tried to add an entity field to the block record type which would hold the additional properties:
type entity = Container of inventory | OutputEntity of facing | InputEntity of facing | NoEntity
(Where inventory and facing are also custom types)
This solution doesn't really feel fitting.
One problem I have is that there are logic operations I want to perform on blocks of type Input and Output. I have to repeat code like this:
let rotateBlock block =
  match block.blockType with
  | Input -> {block with entity = nextDirection block.entity}
  | Output -> {block with entity = nextDirection block.entity}
  | _ -> block
That's not that bad with two types, but I plan to add many more, so it is a big negative in terms of scalability.
Another point of criticism of this kind of structure is that it is a little bit inconsistent. I use a field in the record to implement the different types on the block level and multiple constructors on the entity level. I did it to be able to access the position of every block easily with block.pos instead of using pattern matching.
I am not really happy with this solution.
Request
I hope somebody can point me in the right direction regarding the data structure.

You're trying to satisfy competing goals. You can't have both a rigid static model of blocks and a dynamic, extensible block type, so you need to choose. Fortunately, OCaml provides solutions for both, and even for something in between, but as always with middle-ground solutions, they are kind of bad at both. So let's try.
Rigid static hierarchy using ADT
We can use sum types to represent a static hierarchy of objects. In this case, it would be easy for us to add new methods, but hard to add new types of objects. As the base type, we will use a polymorphic record that is parametrized with a concrete block type (the concrete block type could be polymorphic itself, which would allow us to build a third layer of hierarchy, and so on).
type pos = {x : int; y : int}
type 'a block = {pos : pos; info : 'a}
type block_info = Container of container | Input of facing | Air | Solid
where info is an additional concrete block specific payload, i.e., a value of type block_info. This solution allows us to write polymorphic functions that accept different blocks, e.g.,
let distance b1 b2 =
  sqrt ((float (b1.pos.x - b2.pos.x)) ** 2. +. (float (b1.pos.y - b2.pos.y)) ** 2.)
The distance function has type 'a block -> 'b block -> float and will compute the distance between two blocks of any type.
This solution has several drawbacks:
Hard to extend. Adding a new kind of block is hard: you basically need to design beforehand which blocks you need and hope that you won't need to add a new one in the future. It looks like you expect to add new block types, so this solution might not suit you. However, I believe you will actually need only a very small number of block kinds: if you treat each block as a syntactical element of the world grammar, you will soon notice that the minimal set of block kinds is quite small, especially if you make your blocks recursive, i.e., allow block composition (mixtures of different blocks in the same block).
You can't put blocks of different kinds in the same container, because to do this you need to forget the type of the block. If you do this, you will eventually end up with a container of positions. We will try to alleviate this in our middle-ground solution by using existential types.
Your type model doesn't impose the right constraints. The real-world constraint is that the world is composed of blocks, and each coordinate either has a block or doesn't have one (i.e., it is the void). In your model, two blocks may have the same coordinates.
Not so rigid hierarchy using GADT
We may relax a few restrictions of the previous solution by using an existential GADT. The idea of an existential is that you can forget the kind of a block and later recover it. This is essentially the same as a variant type (or the dynamic type in C#). With existentials, you can have an unbounded number of block kinds and even put them all in the same container. Essentially, an existential is defined as a GADT that forgets its type, e.g., as a first approximation:
type block = Block : {pos : pos; info : 'a} -> block
So now we have a unified block type that is locally (existentially) quantified over the type of the block payload. You may even go further and use an extensible block_info type as the payload, e.g.,
type block_info = ..
type block_info += Air
Instead of building an existential representation by yourself (it's a nice exercise in GADT), you may opt to use some existing libraries. Search for "universal values" or "universals" in the OPAM repositories, there are a couple of solutions.
This solution is more dynamic and allows us to store blocks of different kinds in the same container, since after forgetting the payload they all share one type. The hierarchy is extensible. This comes with a price, of course: now we can't have one single point of definition for a particular method; in fact, method definitions will be scattered around your program (somewhat close to the Common Lisp CLOS model). However, this is an expected price for an extensible dynamic system. Also, we lose the static property of our model, so we will use lots of wildcards in pattern matching, and we can't rely on the type system to check that we covered all possible combinations. And the main problem is still there: our model is not right.
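To make the "scattered definitions and wildcards" point concrete, here is a minimal, self-contained sketch that uses the extensible block_info directly as the payload; the facing type, the Input kind, and the rotate function are illustrative assumptions, not part of the answer above.

(* Open payload type: new kinds can be added anywhere in the program. *)
type pos = {x : int; y : int}
type facing = North | East | South | West

type block_info = ..
type block_info += Air
type block_info += Input of facing

type block = {pos : pos; info : block_info}

(* Because the type is open, every function on it is necessarily partial
   and needs a wildcard for kinds defined elsewhere. *)
let rotate b =
  match b.info with
  | Input North -> {b with info = Input East}
  | Input East  -> {b with info = Input South}
  | Input South -> {b with info = Input West}
  | Input West  -> {b with info = Input North}
  | _ -> b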
Not so rigid structure with OO
OCaml has an object-oriented layer (hence the "O" in the name), so you can build classical OO hierarchies, e.g.,
class block x y = object
  val x = x
  val y = y
  method x = x
  method y = y
  method with_x x = {< x = x >}
  method with_y y = {< y = y >}
end

class input_block x y facing = object
  inherit block x y
  val facing = facing
  method facing = facing
  method with_facing f = {< facing = f >}
end
This solution is essentially close to the first one, except that your hierarchy is now extensible, at the price that the set of methods is now fixed. And although you can put different blocks in the same container by forgetting the concrete type of a block via upcasting, this won't make much sense, since OCaml doesn't have a downcast operator, so you will end up with a container of coordinates. And we still have the same problem: our model is not right.
Dynamic world structure using Flyweights
This solution kills two bunnies at the same time (and I believe this is the way it is implemented in Minecraft). Let's start with the second problem. If you represent every item in your world with a concrete record that has all the attributes of that item, you will end up with lots of duplicates and extreme memory consumption. That's why in real-world applications a pattern called Flyweight is used. So if you care about scalability, you will end up using this approach anyway. The idea of the Flyweight pattern is that your objects share attributes via finite mappings, and the objects themselves are represented as identifiers, e.g.,
type block = int
type world = {
  map : pos Int.Map.t;
  facing : facing Int.Map.t;
  air : Int.Set.t;
}
where 'a Int.Map.t is a mapping from int to 'a, and Int.Set.t is a set of integers (I'm using the Core library here).
In fact, you may even decide that you don't need a closed world type, and just have a bunch of finite mappings, where each particular module adds and maintains its own set of mappings. You can use abstract types to store these mappings in a central repository.
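Here is a minimal, standard-library-only sketch of the "bunch of finite mappings" idea; the module names and attributes are illustrative assumptions (the snippet above uses Core's Int.Map instead).

module IntMap = Map.Make (Int)

type pos = {x : int; y : int}
type facing = North | East | South | West
type block = int                          (* a block is just an identifier *)

(* Each subsystem owns and maintains the mapping for its own attribute. *)
module Positions = struct
  let table : pos IntMap.t ref = ref IntMap.empty
  let set id p = table := IntMap.add id p !table
  let get id = IntMap.find_opt id !table
end

module Facings = struct
  let table : facing IntMap.t ref = ref IntMap.empty
  let set id f = table := IntMap.add id f !table
  let get id = IntMap.find_opt id !table
end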
You may also consider the following representation of the block type: instead of one integer, use two integers. The first integer denotes the identity of a block and the second denotes its equality.
type block = {id : int; eq : int}
The idea is that every block in the game will have a unique id that distinguishes it from other blocks even if they are as alike as two drops of water. And eq will denote structural equality of two blocks, i.e., two blocks with exactly the same attributes will have the same eq number. This solution is hard to implement if your world structure is not closed, though (as in that case the set of attributes is not closed).
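A minimal interning sketch of the {id; eq} idea, assuming a hypothetical attrs type as a stand-in for a block's attributes: id is always fresh, while structurally equal attribute records share one eq number.

(* [attrs] is an illustrative placeholder for whatever attributes a block has. *)
type attrs = {kind : string; facing : int option}
type block = {id : int; eq : int}

let next_id = ref 0
let next_eq = ref 0
let interned : (attrs, int) Hashtbl.t = Hashtbl.create 16

let make attrs =
  let eq =
    match Hashtbl.find_opt interned attrs with
    | Some eq -> eq                        (* seen before: reuse its eq number *)
    | None ->
      let eq = !next_eq in
      incr next_eq;
      Hashtbl.add interned attrs eq;
      eq
  in
  let id = !next_id in
  incr next_id;
  {id; eq}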
The main drawback of this solution is that it is so dynamic that it sort of leaves the OCaml type system out of work. That's a reasonable penalty; in fact, you can't have a dynamic system that is fully verified at compile time (unless you have a language with dependent types, but that is a completely different story).
To summarize, if I were devising this kind of game, I would use the last solution, mainly because it scales well to a large number of blocks thanks to hashconsing (another name for the Flyweight pattern). If scalability is not an issue, then I would build a static structure of blocks with different composition operators, e.g.,
type block =
  | Regular of regular
  | ...
  | Compose of compose_kind * block * block
and compose_kind = Horizontal | Vertical | Inplace
And now the world is just a block. This solution is purely mathematical though, and doesn't really scale to larger worlds.
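As a self-contained toy version of this idea, with an illustrative regular type standing in for the elided constructors:

type regular = Air | Dirt | Stone

type compose_kind = Horizontal | Vertical | Inplace

type block =
  | Regular of regular
  | Compose of compose_kind * block * block

(* The whole world is a single block built by composition. *)
let world =
  Compose (Vertical,
           Compose (Horizontal, Regular Dirt, Regular Stone),
           Regular Air)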

Sounds like fun.
I can't put objects of different subclasses in one list together.
You can actually. Suppose you had lots of different block-objects that
all had a 'decay' method. You could have a function "get me the
decayables" and it could put all those blocks in a list for you, and
you could then at timed intervals iterate over the list and apply the
decay method on each of those blocks. This is all well typed and easy
to do with OCaml's object system. What you can't do is take out a
'decayable' from that list and say, actually, this was always also
an AirBlock, and I want to treat it like a full-fledged AirBlock now,
instead of a decayable.
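A minimal sketch of that "collect the decayables" idea: objects of different classes that share a decay method can live in one list once upcast to a common object type. The classes and the decay behaviour here are hypothetical.

type decayable = < decay : unit >

class wood_block (x : int) (y : int) = object
  val mutable hp = 10
  method pos = (x, y)
  method decay = hp <- hp - 1
end

class leaf_block (x : int) (y : int) = object
  val mutable present = true
  method pos = (x, y)
  method decay = present <- false
end

let decayables : decayable list =
  [ (new wood_block 0 0 :> decayable);
    (new leaf_block 1 2 :> decayable) ]

(* Run decay on a timed tick. Note: once upcast, the original class cannot
   be recovered, which is exactly the limitation described above. *)
let tick () = List.iter (fun d -> d#decay) decayables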
...
type blockType = Container | Input | Output | Air
You can only have 240 or so variants per type. If you plan on having more
blocks than this, an easy way to gain extra space would be to
categorize your blocks and work with e.g. Solid Rock | Liquid Lava
rather than Rock | Lava.
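A tiny sketch of that categorization idea; the category and constructor names are illustrative.

type solid = Rock | Wood | Ore
type liquid = Lava | Water

(* The top-level variant stays small; each category carries its own payload. *)
type block =
  | Solid of solid
  | Liquid of liquid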
type block = {blockType : blockType; pos : int * int}
What's the position of a block in your inventory? What's the position
of a block that's been mined out of its place in the world, and is
now sort of sitting on the ground, waiting to be picked up? Why not
keep the positions in the array indices or map key that you're using
to represent the locations of blocks in the world? Otherwise you also
have to consider what it means for blocks to have the same position,
or impossible positions.
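A minimal sketch of keeping positions in the map key rather than inside the block; the payload reuses the question's constructors, minus Air, since an empty coordinate is simply an absent binding.

module Pos = struct
  type t = int * int
  let compare = compare
end

module PosMap = Map.Make (Pos)

type block = Container | Input | Output
type world = block PosMap.t

(* A coordinate maps to at most one block, so duplicate positions are
   impossible by construction. *)
let block_at (world : world) pos = PosMap.find_opt pos world
let place (world : world) pos b = PosMap.add pos b world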
let rotateBlock block =
  match block.blockType with
  | Input -> {block with entity = nextDirection block.entity}
  | Output -> {block with entity = nextDirection block.entity}
  | _ -> block
I don't really follow this input/output stuff, but it seems that in this
function you're interested in some kind of property like "has a next direction
to face, if rotated". Why not name that property and make it what you match on?
type block = {
  id : blockType;
  burnable : bool;
  consumable : bool;
  wearable : bodypart option;  (* None - not wearable *)
  hitpoints : int option;      (* None - not destructible *)
  oriented : direction option; (* None - doesn't have distinct faces *)
}
let rotateBlock block =
  match block.oriented with
  | None -> block
  | Some dir -> {block with oriented = Some (nextDirection dir)}
let burn block =
  match block.burnable, block.hitpoints with
  | false, _ | true, None -> block
  | true, Some hp when hp > 5 -> {block with hitpoints = Some (hp - 5)}
  | true, Some _ -> ash  (* ash: a burnt-out block, assumed defined elsewhere *)

The block type is interesting because each kind will have different operations.
Item-Containers,
Input and Export Busses,
etc.
type container = {specificinfo : int (* etc. ... *)}
type bus = (* ... *)
type block =
  | Item of position * container
  | Bus of position * bus
type inventory = block list
My intuition says that you could perhaps use a GADT to create the type of the operations on blocks and easily implement an evaluator for the simulator.
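A minimal sketch of that "GADT of operations plus evaluator" idea; the block type and the operations are illustrative placeholders, not from the answer.

type block = {position : int * int; rotation : int}

(* Each operation is indexed by the type of result it produces. *)
type _ op =
  | Position : block -> (int * int) op        (* read a block's position *)
  | Rotate   : block -> block op              (* return a rotated copy    *)
  | Both     : 'a op * 'b op -> ('a * 'b) op  (* run two operations       *)

(* The evaluator's return type follows the operation's index. *)
let rec eval : type a. a op -> a = function
  | Position b -> b.position
  | Rotate b -> {b with rotation = (b.rotation + 90) mod 360}
  | Both (l, r) -> (eval l, eval r)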
UPDATE
To answer your comment:
If you have information common to all your variants, you need to extract it; you can imagine something like:
type block_info =
  | Item of specific_item_type (* ... *)
  | Bus of specific_bus_type
type block = {position : Vector.t; information : block_info}
let get_block_position b = b.position

Related

fat arrow in Idris

I hope this question is appropriate for this site, it's just about the choice of concrete syntax in Idris compared to Haskell, since both are very similar. I guess it's not that important, but I'm very curious about it. Idris uses => for some cases where Haskell uses ->. So far I've seen that Idris only uses -> in function types and => for other things like lambdas and case _ of. Did this choice come from realizing that it's useful in practice to have a clear syntactical distinction between these use cases? Is it just an arbitrary cosmetic choice and I'm overthinking it?
Well, in Haskell, type signatures and values are in different namespaces, so something defined in one is at no risk of clashing with something in the other. In Idris, types and values occupy the same namespace, which is why you don't see e.g. data Foo = Foo as you would in Haskell, but rather data Foo = MkFoo: the type is called Foo, and the constructor is called MkFoo, as there is already a value (the type Foo) bound to the name Foo; e.g. data Pair = MkPair (see http://docs.idris-lang.org/en/latest/tutorial/typesfuns.html#tuples).
So it's probably for the best that it didn't reuse the arrow that constructs function types for lambdas as well; those are rather different things. You can still combine them, e.g. the (Int -> Int) (\x => x).
I think it is because they interpret the -> symbol differently.
From Wikipedia:
A => B means if A is true then B is also true; if A is false then nothing is said about B
which seems right for case expressions, and
-> may mean the same as =>, or it may have the meaning for functions given below
which is
f: X -> Y means the function f maps the set X into the set Y
So my guess is that Idris just uses -> for the narrow second meaning, i.e. for mapping one type to another in type signatures, whereas Haskell uses the broader interpretation, where it means the same as =>.

Java-8 stream expression to 'OR' several enum values together

I am aggregating a bunch of enum values (different from the ordinal values) in a foreach loop.
int output = 0;
for (TestEnum testEnum : setOfEnums) {
    output |= testEnum.getValue();
}
Is there a way to do this in streams API?
If I use a lambda like this in a Stream<TestEnum> :
setOfEnums.stream().forEach(testEnum -> output |= testEnum.getValue());
I get a compile time error that says, 'variable used in lambda should be effectively final'.
Predicate represents a boolean-valued function; you need to use the reduce method of the stream to aggregate a bunch of enum values.
If we assume that you have a HashSet named setOfEnums:
// int initialValue = 0; // this would be effectively final for the stream pipeline, as long as it is not modified afterwards
final int initialValue = 0; // or make it explicitly final
int output = setOfEnums.stream()
        .map(TestEnum::getValue)
        .reduce(initialValue, (e1, e2) -> e1 | e2);
You need to reduce the stream of enums like this:
int output = Arrays.stream(TestEnum.values()).mapToInt(TestEnum::getValue).reduce(0, (acc, value) -> acc | value);
I like the recommendations to use reduction, but perhaps a more complete answer would illustrate why it is a good idea.
In a lambda expression, you can reference variables like output that are in scope where the lambda expression is defined, but you cannot modify the values. The reason for that is that, internally, the compiler must be able to implement your lambda, if it chooses to do so, by creating a new function with your lambda as its body. The compiler may choose to add parameters as needed so that all of the values used in this generated function are available in the parameter list. In your case, such a function would definitely have the lambda's explicit parameter, testEnum, but because you also reference the local variable output in the lambda body, it could add that as a second parameter to the generated function. Effectively, the compiler might generate this function from your lambda:
private void generatedFunction1(TestEnum testEnum, int output) {
    output |= testEnum.getValue();
}
As you can see, the output parameter is a copy of the output variable used by the caller, and the OR operation would only be applied to the copy. Since the original output variable wouldn't be modified, the language designers decided to prohibit modification of values passed implicitly to lambdas.
To get around the problem in the most direct way, setting aside for the moment that the use of reduction is a far better approach, you could wrap the output variable in a wrapper (e.g. an int[] array of size 1, or an AtomicInteger). The wrapper's reference would be passed by value to the generated function, and since you would now update the contents of output, not the value of output, output remains effectively final, so the compiler won't complain. For example:
AtomicInteger output = new AtomicInteger();
setOfEnums.stream().forEach(testEnum -> output.set(output.get() | testEnum.getValue()));
or, since we're using AtomicInteger, we may as well make it thread-safe in case you later choose to use a parallel Stream,
AtomicInteger output = new AtomicInteger();
setOfEnums.stream().forEach(testEnum -> (output.getAndUpdate(prev -> prev | testEnum.getValue())));
Now that we've gone over an answer that most resembles what you asked about, we can talk about the superior solution of using reduction, that other answers have already recommended.
There are two kinds of reduction offered by Stream: stateless reduction (reduce()) and stateful reduction (collect()). To visualize the difference, consider a conveyor belt delivering hamburgers, where your goal is to collect all of the hamburger patties into one big hamburger. With stateful reduction, you would start with a new hamburger bun and then collect the patty out of each hamburger as it arrives, adding it to the stack of patties in the hamburger bun you set up to collect them. In stateless reduction, you start out with an empty hamburger bun (called the "identity", since that empty hamburger bun is what you end up with if the conveyor belt is empty), and as each hamburger arrives on the belt, you make a copy of the previously accumulated burger, add the patty from the one that just arrived, and discard the previous accumulated burger.
The stateless reduction may seem like a huge waste, but there are cases when copying the accumulated value is very cheap. One such case is when accumulating primitive types -- primitive types are very cheap to copy, so stateless reduction is ideal when crunching primitives in applications such as summing, ORing, etc.
So, using stateless reduction, your example might become:
int output = setOfEnums.stream()
        .mapToInt(TestEnum::getValue) // or .mapToInt(testEnum -> testEnum.getValue())
        .reduce(0, (resultSoFar, value) -> resultSoFar | value);
Some points to ponder:
Your original for loop is probably faster than using streams, except perhaps if your set is very large and you use parallel streams. Don't use streams for the sake of using streams. Use them if they make sense.
In my first example, I showed the use of Stream.forEach(). If you ever find yourself creating a Stream and just calling forEach(), it is more efficient just to call forEach() on the collection directly.
You didn't mention what kind of Set you are using, but I hope you are using EnumSet<TestEnum>. Because it is implemented as a bit field, it performs much better (O(1)) than any other kind of Set for all operations, even copying. EnumSet.noneOf(TestEnum.class) creates an empty set, EnumSet.allOf(TestEnum.class) gives you a set of all enum values, etc.

Vectorize object oriented implementation in MATLAB

I'm trying to optimize given object-oriented code in MATLAB. It is an economic model and consists of a Market and Agents. The time-consuming part is updating certain attributes of all Agents during each timestep, which is implemented in a for loop.
However, I have failed to vectorize the object-oriented code.
Here is an example (note: the second thing that slows down the code is the fact that new entries are appended to the end of the vector; I'm aware of that and will fix it as well):
for i = 1:length(obj.traders)
    obj.traders(i).update(obj.Price, obj.Sentiment(end), obj.h);
end
Where update looks like
function obj = update(obj, price, s, h)
    obj.pos(end+1) = obj.p;
    obj.wealth(end+1) = obj.w(1,1,1);
    obj.g(end+1) = s;
    obj.price = price;
    obj.Update_pos(sentiment, h);
    if (obj.c)
        obj.Switch_Pos;
    end
    ...
My first idea was to try something like
obj.traders(:).update(obj.Price,obj.Sentiment(end),obj.h);
Which didn't work. If someone has any suggestions how to vectorize this code, while keeping the object oriented implementation, I would be very happy.
I cannot provide a complete solution as this depends on the details of your implementation, but here are some tips which you could use to improve your code:
Remembering that a MATLAB object generally behaves like a struct, assignment of a constant value to a field can be done using [obj.field] = deal(val); e.g.:
[obj.trader.price] = deal(obj.Price);
This can also be extended to non-constant RHS, using cell, like so:
[aStruct.(fieldNamesCell{idx})] = deal(valueCell{:}); %// or deal(numericVector(:));
To improve the update function, I would suggest adding several lines where you create the RHS vectors/cells, followed by "simultaneous" assignment to all relevant fields of the objects in the array.
Other than that consider:
setfield: s = setfield(s,{sIndx1,...,sIndxM},'field',{fIndx1,...,fIndxN},value);
structfun:
s = structfun(@(x) x(1:3), s, 'UniformOutput', false, 'ErrorHandler', @errfn);
"A loop-based solution can be flexible and easily readable".
P.S.
On a side note, I'd suggest you name the obj in your functions according to the class name, which would make it more readable to others, i.e.:
function obj=update(obj,price,s,h) => function traderObj=update(traderObj,price,s,h)

Understanding immutable composite types with fields of mutable types in Julia

Initial note: I'm working in Julia, but this question probably applies to many languages.
Setup: I have a composite type as follows:
type MyType
    x::Vector{String}
end
I write some methods to act on MyType. For example, I write a method that allows me to insert a new element in x, e.g. function insert!(d::MyType, itemToInsert::String).
Question: Should MyType be mutable or immutable?
My understanding: I've read the Julia docs on this, as well as more general (and highly upvoted) questions on Stackoverflow (e.g. here or here), but I still don't really have a good handle on what it means to be mutable/immutable from a practical perspective (especially for the case of an immutable composite type, containing a mutable array of immutable types!)
Nonetheless, here is my attempt: If MyType is immutable, then it means that the field x must always point to the same object. That object itself (a vector of Strings) is mutable, so it is perfectly okay for me to insert new elements into it. What I am not allowed to do is try and alter MyType so that the field x points to an entirely different object. For example, methods that do the following are okay:
MyType.x[1] = "NewValue"
push!(MyType.x, "NewElementToAdd")
But methods that do the following are not okay:
MyType.x = ["a", "different", "string", "array"]
Is this right? Also, is the idea that the objects an immutable type's fields are locked to are those created within the constructor?
Final Point: I apologise if this appears to duplicate other questions on SO. As stated, I have looked through them and wasn't able to get the understanding that I was after.
So here is something mind bending to consider (at least to me):
julia> immutable Foo
           data::Vector{Float64}
       end

julia> x = Foo([1.0, 2.0, 4.0])
Foo([1.0,2.0,4.0])

julia> append!(x.data, x.data); pointer(x.data)
Ptr{Float64} @0x00007ffbc3332018
julia> append!(x.data, x.data); pointer(x.data)
Ptr{Float64} @0x00007ffbc296ac28
julia> append!(x.data, x.data); pointer(x.data)
Ptr{Float64} @0x00007ffbc34809d8
So the data address is actually changing as the vector grows and needs to be reallocated! But - you can't change data yourself, as you point out.
I'm not sure there is a 100% right answer here, really. I primarily use immutable for simple types like the Complex example in the docs in some performance-critical situations, and I do it for "defensive programming" reasons, e.g. the code has no need to write to the fields of this type, so I make it an error to do so. They are a good choice IMO whenever the type is a sort of extension of a number, e.g. Complex or RGBColor, and I use them in place of tuples, as a kind of named tuple (tuples don't seem to perform well in Julia right now anyway, whereas immutable types perform excellently).

debugging chained relation declaration in alloy

I am using Alloy to model graph transformation.
I specify my transformation as different transformations which are applied to different part of the graph.
So I have a signature :
sig Transformation {
    nodes : some Node,
    added_node : one Special_Node
}
To apply this transformation I declare 3 relations in the fact part of the signature, which apply to different parts of the graph. The left part of a relation relates to the input graph and the right side to the output graph:
some mapping_rel0_nodes : rel0In one -> one rel0Out | {
    C1 && C2 && C3
}
&&
some mapping_rel1_nodes : rel1In -> some (rel1Out + special_Node) | {
    C1' && C2' && C3'
}
&&
some mapping_rel2_nodes : rel2In -> some (rel2Out + special_Node) | {
    C1'' && C2'' && C3''
} &&
out.nodes <: connections = ~mapping_rel2_nodes.inpCnx.mapping_rel2_nodes +
                           ~mapping_rel1_nodes.inpCnx.mapping_rel1_nodes +
                           ~mapping_rel0_nodes.inpCnx.mapping_rel0_nodes
Each relation applies to a different, disjoint part of the graph, but those parts are linked by connections between them. The CX, CX' and CX'' are constraints applied to the relations. A Node has the following signature:
sig Node {
    connections : set Node
} {
    this !in this.@connections
}
To obtain the new connection I take all the connections in the input graph inpCnx and use the mapping obtained for each point to get the associated connections in the new graph.
My questions are :
Are the mapping_relX_nodes still known at this step of the fact?
When I control them in the evaluator and i do the operation manually on the appropriate instance it works, but expressed as fact, it returns no instances. I read this post and I was wondering if there are other tools to control the expression and variable, like debug print or else?
The relations have the same arity, but the rel0 is bijective and the others are just binary relation. Is there any constraint due to the bijectivity of rel0 that the union of these relations has to be bijective?
In my experience in the evaluator, when a tuple is duplicated, one copy is deleted: {A$0->B$0, A$0->B$0} becomes {A$0->B$0}. But sometimes it could be necessary to keep both; is there any way to have both?
Thanks in advance.
You ask:
Are the mapping_relX_nodes still known at this step of the fact?
Without a full working model to test, it's hard to give an absolutely firm answer. But Alloy is purely declarative, and the uses of mapping_rel1_nodes etc. do not appear to be local variables, so the references in the fourth conjunct of your fact will be bound in the same way as the references in the other conjuncts. (Or not bound, if they are not bound.)
When I control them in the evaluator and i do the operation manually on the appropriate instance it works, but expressed as fact, it returns no instances. I read this post and I was wondering if there are other tools to control the expression and variable, like debug print or else?
Not that I know of. In my experience, when something seems to work as expected in the evaluator but I can't get it to work in a fact or predicate, it's almost invariably a failure on my part to get the semantics of the fact or predicate correct.
The relations have the same arity, but the rel0 is bijective and the others are just binary relation. Is there any constraint due to the bijectivity of rel0 that the union of these relations has to be bijective?
No (not unless I am totally misunderstanding your question).
In my experience in the evaluator, when a tuple is duplicated, one copy is deleted: {A$0->B$0, A$0->B$0} becomes {A$0->B$0}. But sometimes it could be necessary to keep both; is there any way to have both?
Yes; Alloy works with sets. (So the duplicate is not "deleted" -- it's just that sets don't have duplicates.) To distinguish two tuples which would otherwise be identical, you can either (a) add another value to the tuple (so pairs become triples, triples become 4-tuples, and n-tuples become tuples with arity n+1), or (b) define a signature for objects representing the tuples. Since members of signatures have object identity, rather than value identity, they can be used to distinguish different occurrences of pairs like A$0->B$0.
