Can I get the variable name from its assignee? - ruby

Is it possible to have a method that prints the name of the variable its object is assigned to? Example:
class Test
  def who_am_I_assigned_to
    puts "#{??}"
  end
end
THIS = Test.new
THIS.who_am_I_assigned_to
# >> THIS

Technically yes, sort of.
But you'd have to recursively search through all the instance_variables of every object. And, as Cary points out, an object can be assigned to more than one variable at a time.
Don't do that.
The reason I was looking for this behavior was to use the variable name as a position within a grid of data as well. So if I had a 2-dimensional Grid, the variable names would be like "AA", "AB", "BC", etc. If it knew its name, it would also know its position.
What you're describing is an "action at a distance" anti-pattern, where changes to one part of a program change another part of the program with no obvious connection. This violates encapsulation; good encapsulation allows you to understand a single piece of a system by just looking at its inputs and outputs. Violating encapsulation means that in order to understand one part of the system you have to understand every part of the system, leading to a maintenance nightmare. Modern languages and practices strive to avoid this as much as possible.
For example, a variable's name should never matter to the behavior of the program. You should be able to safely perform a Rename Variable refactoring to name a variable according to what makes sense to the user of the object. In your example this would alter the behavior of the program, violating the Principle of Least Astonishment.
Instead you'd have an object to represent your Grid and this would manage the relationships between Nodes in the Grid. Rather than passing around individual Nodes you'd pass around the Grid.
Or each Node can know who their neighboring Nodes are. An example of this would be a traditional Tree, Graph, or Linked List structure. The advantage here is there is no fixed position and the data structure can grow or shrink in any direction. Any Node can be passed around and it knows its position within the structure.
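The Grid approach described above could be sketched roughly as follows. This is only an illustration: the two-letter labels come from the question, while the method names (position_of, label_of) and the letter-to-index mapping are assumptions made for the example. The point is that the Grid owns the mapping between labels and positions, so no variable name carries meaning.

```ruby
# A minimal sketch of a Grid that owns the label <-> position mapping.
# The label scheme ("AA", "AB", ...) is taken from the question; the
# method names are hypothetical.
class Grid
  LETTERS = ("A".."Z").to_a.freeze

  def initialize
    @cells = {} # maps [row, col] => stored value
  end

  # "BC" => [1, 2]: first letter is the row, second is the column.
  def position_of(label)
    [LETTERS.index(label[0]), LETTERS.index(label[1])]
  end

  # [1, 2] => "BC"
  def label_of(position)
    row, col = position
    "#{LETTERS[row]}#{LETTERS[col]}"
  end

  def [](label)
    @cells[position_of(label)]
  end

  def []=(label, value)
    @cells[position_of(label)] = value
  end
end

grid = Grid.new
grid["BC"] = :some_node
```

Renaming the local variable grid changes nothing about where :some_node lives; the position is data held by the Grid, not a property of a name.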


How to decide to convert to categorical variable or keep it numeric?

This might be a basic or trivial question with a straightforward answer. Still, I would like to ask it to clear my doubt once and for all.
Take the example of Passenger Class in the famous Titanic data. Functionally it is indeed categorical data, so it makes perfect sense to convert it to a categorical variable. As I understand it, algorithms then tend to look for a pattern specific to each class. But at the same time, if you treat it as a numeric variable, it might also denote a range for a decision tree. Say, passengers between first class and second class.
It looks like both are correct, and both will affect the machine learning algorithm's output in different ways.
Which one is appropriate, and is there an extensive discussion of this anywhere? Should we use such ambiguous variables both as numeric and, via a copy, as categorical, which might prove to be a technique for uncovering more patterns?
I suppose it's up to you whether you'd rather interpret a continuous PassengerClass variable as "for every one-unit increase in PassengerClass, the passenger's likelihood of survival goes up/down X%," versus a categorical (factor) PassengerClass as "the likelihoods of survival for groups 2 and 3 (for example, leaving 1st-class passengers as the base group) are X% and Y% higher, respectively, than the base group, holding all else constant."
I think about variables like PassengerClass almost as "treatment groups." Yes, I suppose you could interpret it as continuous, but I think it makes more sense to consider the unique effects of each class, like "people who were given the drug versus those who weren't" - you can very easily compare the impacts of being in a higher-numbered class (e.g. 2 or 3) to being in the base class, 1, which again would be left out.
The problem with mapping categorical notions to numerical is that some algorithms (e.g. neural networks) will interpret the value itself as having a meaning, i.e. you would get different results if you assign values 1,2,3 to passenger classes than, for example 0,1,2 or 3,2,1. The correspondence between the passenger classes and numbers is purely conventional and doesn't necessarily convey any additional meaning.
One could argue that the lower the number, the "better" the class is; however, it's still hard to interpret it as "the first class is twice as good as the second class", unless you define some measure of "goodness" that makes the relation between the numbers "1" and "2" sensible.
In this example, you have categorical data that is ordinal - meaning you can rank the categories (from best accommodations to worst, for example) but they're still categories. Regardless of how you label them, there's no actual information about the relative distances among your categories. You can put them in a table, but not (correctly) on a number line. In cases like this, it's generally best to treat your categorical data as independent categories.
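To make "independent categories" concrete, here is a small sketch of one-hot encoding, written in Ruby only for illustration. The helper name one_hot is hypothetical, and a real project would normally rely on its ML library's own encoder. Each class becomes its own 0/1 column, so no distance between classes is implied.

```ruby
# A minimal sketch of one-hot encoding (the helper name is hypothetical).
# Each distinct category gets its own 0/1 column, so the model sees no
# numeric distance between class 1, class 2 and class 3.
def one_hot(values)
  categories = values.uniq.sort
  values.map do |value|
    categories.map { |category| category == value ? 1 : 0 }
  end
end

passenger_classes = [1, 3, 2, 3]
encoded = one_hot(passenger_classes)
# encoded is [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1]]
```

Relabelling the classes as 0, 1, 2 or 3, 2, 1 would only reorder the columns, which is exactly the label independence the answers above argue for.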

Save/Restore Ruby's Random

I'm trying to create a game that I want to always run the same way given the same seed. That means that random events, whatever they may be, will always be the same for two players using the same seed.
However, given the user's ability to save and load the game, Ruby's Random would reset every time a save is loaded, making the whole principle void if two players save and load at different points.
The only solution I have imagined for this is, whenever a save file is loaded, to generate the same number of random numbers as before, thus getting Ruby's Random into the same state it was in before the load. However, to do that I'd need to extend it so that a counter is updated every time a random number is generated.
Does anyone know how to do that, or have a better way to restore the state of Ruby's Random?
PS: I cannot use an instance of Random (Random.new) and Marshal it. I have to use Ruby's default.
Sounds like Marshal.dump/Marshal.load may be exactly what you want. The Random class documentation explicitly states "Random objects can be marshaled, allowing sequences to be saved and resumed."
You may still have problems with synchronization across games, since different user-based decisions can take you through different logic paths and thus use the sequence of random numbers in entirely different ways.
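As a rough illustration of the Marshal suggestion, the sketch below saves a Random instance mid-sequence and resumes it later. It assumes the game is allowed to hold its own Random.new instance (which the question says it cannot), and the variable names are made up for the example.

```ruby
# A minimal sketch: snapshot a Random instance and resume the sequence.
# Assumes the game can own its generator; names here are illustrative.
rng = Random.new(42)
3.times { rng.rand(100) }          # the game consumes some numbers

saved_state = Marshal.dump(rng)    # bytes that could go into the save file

after_save = Array.new(5) { rng.rand(100) }    # play continues

restored = Marshal.load(saved_state)           # later: load the save
after_load = Array.new(5) { restored.rand(100) }
# after_load repeats the numbers in after_save
```

The dumped string can be written alongside the rest of the save data, so no draw counter is needed.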
I'd suggest maybe saving the 'current' data to a file when the user decides to save (or when the program closes) depending on what you prefer.
This can be done using the File class in ruby.
This would mean you'd need to keep track of turns and pass that along with the save data. Or you could loop through the data in the file and find out how many turns have occurred that way I suppose.
So you'd have something like:
def load_game(path)
  # File.read opens the file, reads all of it, and closes it again.
  data = File.read(path)
  # What you do below here depends on how you decide to store the data in save_game.
  data
end

def save_game(path, data)
  # File.write opens the file, writes the data, and closes it again.
  File.write(path, data)
end
I haven't really tested the above code, so treat it as a sketch. It's mainly just the concept I'm trying to get across.
Hopefully that helps?
There are many generators that compute each random number in the sequence from the previous value alone, so if you used one of those you need only save the last random number as part of the state of the game. An example is a basic linear congruential generator, which has the form:
z(n+1) = (az(n) + b) mod c
where a, b and c are typically large (known) constants, and z(0) is the seed.
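A small sketch of that idea follows. The class name Lcg is made up, and the constants are one commonly published choice (the Numerical Recipes values with modulus 2**32), used here purely for illustration rather than as a recommendation. The whole generator state is a single integer, so saving the game only requires storing that one value.

```ruby
# A minimal linear congruential generator: z(n+1) = (a*z(n) + b) mod c.
# Constants are illustrative (Numerical Recipes values); the class name
# is hypothetical.
class Lcg
  A = 1_664_525
  B = 1_013_904_223
  C = 2**32

  attr_reader :state

  def initialize(seed)
    @state = seed % C
  end

  # Advances the generator and returns the new value.
  def next_value
    @state = (A * @state + B) % C
  end
end

game_rng = Lcg.new(42)
3.times { game_rng.next_value }

saved = game_rng.state                         # store this in the save file
expected = Array.new(3) { game_rng.next_value }

resumed_rng = Lcg.new(saved)                   # rebuild from the saved value
resumed = Array.new(3) { resumed_rng.next_value }
```

Because the last value is the full state, a generator rebuilt from it continues the sequence exactly where the original left off.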
An arguably better one is the so-called "multiply-with-carry" method.

Ruby : The best way to manage a large 3d array

I would like to know the best way to manage a large 3d array with dimensions something like:
x = 1000
y = 1000
z = 100
=> 100000000 objects
And each cell is an object with some amount of data.
Even simple operations take a very long time, even when all the data is collapsed into a single flat array (I first tried an array of arrays of arrays of objects):
class Test
  def initialize
    @name = "Test"
  end
end
qtt = 1000 * 1000 * 100
# The block form fills every slot with its own Test instance.
cells = Array.new(qtt) { Test.new }
I read somewhere that a DB could be a good thing for such cases.
What do you think about this?
What am I trying to do?
This "matrix" represents a world. Each element is a 1m x 1m x 2m block that could be of a different kind (water, mud, stone, ...). Some blocks could be empty too.
But the user should be able to remove blocks anywhere, which changes everything around them (if there was water behind a block, it will flow through the hole, for example).
In fact, what I wish to make is not Minecraft but a really small clone of Dwarf Fortress (http://www.bay12games.com/dwarves/)
Other interesting things
In my model the ground is at level 10. This means that [0,10] is empty sky in most cases.
Only hills and parts of mountains could be present on those layers.
The underground is basically unknown and not dug. So we should not have to add instances for unused blocks.
What we should add to the model from the beginning: gems, gold, and water, which could be stored without having to store the adjacent stone/mud/earth blocks.
At the beginning of the game, 80% of the cube doesn't need to be loaded in memory.
Each time we dig we create new blocks: the empty block we dug and the blocks around it.
The only things we should index are:
underground rivers
underground lakes
lava rivers
Holding that many objects in memory is never a good thing. A flat-file or database-centric approach would be a lot more efficient and easier to maintain.
What I would do - The object-oriented approach
Store the parameters of the blocks as simple data and construct the objects dynamically.
Create a Block class to represent a block in the game, and give it variables to hold the parameters of that particular block:
class Block
  # location of the Block
  attr_accessor :x, :y, :z
  # an individual id for the Block
  attr_accessor :id
  # to define the block type (rock, water etc.)
  attr_accessor :block_type
  # and add any other attributes of a Block...
end
I'd then create a few methods that would enable me to serialise/de-serialise the data to a file or database.
As you've stated it works on a board, you'd also need a Board class to represent it that would maintain the state of the game as well as perform actions on the Block objects. Using the x, y, z attributes from each Block you can determine its location within the game. Using this information you can then write a method in the Block class that locates those blocks adjacent to the current one. This would enable you to perform the "cascading" effects you talk about where one Block is affected by actions on another.
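One way the Board idea might look is sketched below. It is an assumption-laden illustration rather than a finished design: Block is reduced to a Struct, the Board stores only blocks that actually exist in a Hash keyed by [x, y, z], and the adjacency lookup lives on the Board instead of on Block as suggested above, simply because the Board is what holds the other blocks.

```ruby
# A minimal sketch of a sparse Board. Only existing blocks are stored,
# keyed by their coordinates. Names other than Block/Board are
# hypothetical.
Block = Struct.new(:x, :y, :z, :block_type)

class Board
  # The six face-adjacent directions in a 3D grid.
  NEIGHBOR_OFFSETS = [
    [1, 0, 0], [-1, 0, 0],
    [0, 1, 0], [0, -1, 0],
    [0, 0, 1], [0, 0, -1]
  ].freeze

  def initialize
    @blocks = {} # [x, y, z] => Block
  end

  def add(block)
    @blocks[[block.x, block.y, block.z]] = block
  end

  def at(x, y, z)
    @blocks[[x, y, z]]
  end

  # Returns the stored blocks that touch the given block face to face.
  def neighbors(block)
    NEIGHBOR_OFFSETS.map { |dx, dy, dz|
      at(block.x + dx, block.y + dy, block.z + dz)
    }.compact
  end
end

board = Board.new
stone = board.add(Block.new(0, 0, 0, :stone))
board.add(Block.new(1, 0, 0, :water))
board.add(Block.new(5, 5, 5, :mud))
adjacent_types = board.neighbors(stone).map(&:block_type)
```

With this layout, removing a block and checking its neighbors for water touches at most six Hash entries, regardless of how large the world is.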
Accessing the data efficiently
This will rely entirely on how you choose to serialise the Block objects. I would probably choose a binary format with fixed-size records to reduce unnecessary data reads, store the objects by their id parameter, and then use something like memory-mapped I/O (MMIO) to do fast random-access reads/writes on a large data file in an Array-like manner. This will allow you to access the data quickly and efficiently, without the memory overhead. How you read the data will relate to your adjacent blocks method above.
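The fixed-size binary record idea can be sketched with plain seek/read/write, which is swapped in here for memory mapping because Ruby's standard library has no mmap wrapper. The record layout (three 16-bit coordinates plus one byte for the type) and the helper names are assumptions for illustration. A StringIO stands in for the data file so the example is self-contained; a File opened in binary mode would be used the same way.

```ruby
require "stringio"

# Hypothetical record layout: x, y, z as 16-bit little-endian unsigned
# integers, followed by one byte for the block type. 7 bytes per record.
RECORD_FORMAT = "S<S<S<C"
RECORD_SIZE = 7

# Writes the record for block `id` at its fixed offset.
def write_block(io, id, x, y, z, type_code)
  io.seek(id * RECORD_SIZE)
  io.write([x, y, z, type_code].pack(RECORD_FORMAT))
end

# Reads the record for block `id`; returns [x, y, z, type_code] or nil.
def read_block(io, id)
  io.seek(id * RECORD_SIZE)
  bytes = io.read(RECORD_SIZE)
  return nil if bytes.nil? || bytes.bytesize < RECORD_SIZE
  bytes.unpack(RECORD_FORMAT)
end

store = StringIO.new(String.new) # binary, in-memory stand-in for a file
write_block(store, 0, 1, 2, 3, 1)
write_block(store, 1, 10, 20, 5, 2)
write_block(store, 2, 999, 999, 99, 3)

second_block = read_block(store, 1)
```

Because every record has the same size, the byte offset of any block is just id * RECORD_SIZE, which is what makes the file behave like an array on disk.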
You can of course also choose the DB storage route which will allow you to isolate the Blocks and do lookups on particular blocks in a higher-level manner, however that might give you a bit of extra overhead.
It sounds like an interesting project, I hope this helps a bit! :)
P.S. With regard to the comment above by @Linuxious about choosing a different language: yes, this might be true in some cases, but a skilled programmer never blames his tools. A program is only as efficient as the programmer makes it...unless you're writing it in Java ;)

How can copying an entire list and appending an element be more efficient than just appending it?

Some claim that appending to immutable lists is more efficient. Is this true? How?
Producing a modified version of a list by allocating an array large enough to hold the modified version and copying over all of the unmodified elements is somewhat expensive, regardless of whether the modification is an append, an insert, a deletion, replacement, or anything else. The cost is roughly comparable to that of producing an unmodified, but distinct, copy of the list.
If an object Foo wishes to maintain a list of elements in such a way that it can only be changed when Foo changes it, there are two common approaches it can use to do so:
1. It can use an "immutable list" type which guarantees that any instance which has ever been exposed to the outside world will forever hold the same sequence of objects. The object `Foo` would be free to expose references to this list, since nobody would be able to alter it. If `Foo` wants to e.g. add an item to its list, it would generate a new immutable list which contains all the items in the old list, plus the new one, and start holding a reference to that instead of the old one.
2. It can create a list object which is mutable, but is never exposed to the outside world. If anyone needs to retrieve the sequence of items from the list, `Foo` would copy the list's contents into a new list, which the caller could use in any way it sees fit without affecting `Foo`'s list.
If one uses approach #1, then every time Foo alters the list it must create a new "immutable list" instance, but Foo could answer a request for the list's contents without having to copy it. If one uses approach #2, adding items to the list (and other modifications) will be cheaper, but answering a request for the list's contents will require copying the list. Whether it's better to use approach #1 or approach #2 will depend upon how often the list is updated, versus how often the application will need a copy of it.
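The two approaches can be sketched in Ruby as follows. The class names are hypothetical, and a frozen Array stands in for a true immutable list type. The first holder pays for a copy on every add but hands out its list for free; the second adds cheaply but copies on every read.

```ruby
# Approach #1: hold a frozen list; every change builds a new frozen list.
class ImmutableListHolder
  def initialize
    @items = [].freeze
  end

  def add(item)
    @items = (@items + [item]).freeze # copy on write
    self
  end

  # Safe to share directly: callers cannot modify a frozen array.
  def items
    @items
  end
end

# Approach #2: hold a private mutable list; copy it when asked.
class DefensiveCopyHolder
  def initialize
    @items = []
  end

  def add(item)
    @items << item # cheap in-place append
    self
  end

  # Copy on read, so callers cannot affect the internal list.
  def items
    @items.dup
  end
end

immutable = ImmutableListHolder.new.add(1)
snapshot = immutable.items
immutable.add(2)                  # snapshot still holds only [1]

defensive = DefensiveCopyHolder.new.add(1)
copy = defensive.items
copy << 99                        # does not touch the holder's list
```

Which class does less total copying depends on the ratio of add calls to items calls, as described above.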
Immutable objects can be shared between threads without synchronization. Synchronization negatively affects scaling, and can potentially be more costly than the copy.

Weak-memoizing result of multi-parameter function in OCaml

I am looking for a way to memoize the results of an OCaml function f that takes two parameters (or more, in general). In addition (and this is the difficult part), I want the map underlying this process to forget a result altogether if either of the values for the two parameters is garbage collected.
For a function that takes exactly one argument, this can be done with the Weak module and its Make functor in a straightforward way. To generalise this to something that can memoize functions of higher arity, a naive solution is to create a weak map from tuples of values to result values. But this will not work correctly with respect to garbage collection, as the tuple of values only exists within the scope of the memoization function, not the client code that calls f. In fact, the weak reference will be to the tuple, which is going to be garbage collected right after memoization (in the worst case).
Is there a way to do this without re-implementing Weak.Make?
Hash-consing is orthogonal to my requirements and is, in fact, not really desirable for my values.
Thanks!
Instead of indexing by tuples you could have a tree structure. You'd have one weak table indexed by the first function parameter whose entries are secondary weak tables. The secondary tables would be indexed by the second function parameter and contain the memoized results. This structure will forget the memoized function results as soon as either function parameter is GCed. However, the secondary tables themselves will be retained as long as the first function parameter is live. Depending on the sizes of your function results and the distribution of different first parameters, this could be a reasonable tradeoff.
I haven't tested this, either. Also it seems reasonably obvious.
One idea is to perform your own garbage collection.
For simplicity, let's assume that all arguments have the same type k.
In addition to the main weak table containing the memoized results keyed by k * k, create a secondary weak table containing single arguments of type k. The idea is to scan the main table once in a while and to remove the bindings that are no longer wanted. This is done by looking up the arguments in the secondary table; then if any of them is gone you remove the binding from the main table.
(Disclaimer: I haven't tested this; it may not work or there may be better solutions)
I know this is an old question, but my colleagues have recently developed an incremental computation library, called Adapton, that can handle this functionality. You can find the code here. You probably want to use the LazySABidi functor (the others are for benchmarking). You can look in the Applications folder for examples of how to use the library. Let me know if you have any more questions.
