Evaluating execution flow for visual programming languages - algorithm

I'm reading about visual programming languages these days. So I've thought up two "paradigms". In both of them, you have one start point, and several end points.
Now, you could either begin at the start point or move in reverse from the end points (the order of end points is known).
Beginning from the start point feels weird, because you can have "splits" in the data flow. Say, if I have an interger, and this integer is needed by two functions simultaenously. Bad. I don't want to get into concurrent coding. Atleast not yet. Or should I?
Beginning at the end points feels much better. You start at the first end point. Check whatever is needed, and evaluate that. I believe this is the lazy evaluation. But the problem comes when you have multiple inputs. How do you decide the order in which to evaluate the inputs?
Can you point me to some articles/papers/something on the internet. Or mabye tell me a few keywords to look for?

If I get what you mean, using the same integer in two functions, is exactly that: you just use it twice, no need to bring concurrency in. If the 'implementation' you were thinking about destroyed input values, you could take a copy before using it.
int i = 2;
int j = fun1(i);
int k = fun2(i);
int res = fun3(j, k);
would become:
i = 2[A]
|
Clone[B]
/ \
/ \
/ \
i_1 i_2
| |
fun1[C] fun2[D]
| |
j k
\ /
\ /
\ /
fun3[E]
|
res
But there's no need of concurrency in order to evaluate the graph. You can just evaluate 'parallel' branches left to right (as indicated by the A-B-C-... labelling - see also here).
Top-down (aka from start to end), left-to-right feels more natural than bottom-up, provided bottom-up actually has a well defined meaning. Regarding the latter point, assuming you do have results for the program, you can't always compute the inputs: think about what happens when funXXX are not injective (for example fun1(x) = x*x) and thus not invertible.
I hope I'm not completely misinterpreting your train of thought.

Moving forward, what you want is the topological sort of your dependency graph - that is, an order in which to execute nodes such that you never execute a node before its dependencies. This assumes, naturally, that there are no cycles in your graph.
Moving backwards, what you're doing is recursively resolving the graph. Starting with the end node, for each dependency that is not yet calculated, you recursively invoke the procedure on that node, until all input values are evaluated. This has the advantage that you never process nodes that aren't required by a particular end state.
Which of the two approaches is best depends somewhat on what precisely you're doing.

Related

Minimum cost path / least cost path

I am currently using a library from skimage.graph and the function route_through_array to get the least cost path from one point to another in a cost map. The problem is that i have multiple start points and multiple end points, which leads to thousands of iterations. This i am fixing currently with two for loops. The following code is just an illustration:
img=np.random.rand(400,400)
img=img.astype(dtype=int)
indices=[]
costs=[]
start=[[1,1],[2,2],[3,3],[4,5],[6,17]]
end=[[301,201],[300,300],[305,305],[304,328],[336,317]]
for i in range(len(start)):
for j in range(len(end)):
index, weight = route_through_array(img, start[i],end[j])
indices.append(index)
costs.append(weight)
From what i understand from the documentation the function accepts many many end points but i do not know how to pass them in a function. Any ideas?
This should be possible much more efficiently by directly interacting with the skimage.graph.MCP Cython class. The convenience wrapper route_through_array isn't general enough. Assuming I understand your question correctly, what you are looking for is basically the MCP.find_costs() method.
Your code will then look like (neglecting imports)
img = np.random.rand(400,400)
img = img.astype(dtype=int)
starts = [[1,1], [2,2], [3,3], [4,5], [6,17]]
ends = [[301,201], [300,300], [305,305], [304,328], [336,317]]
# Pass full set of start and end points to `MCP.find_costs`
from skimage.graph import MCP
m = MCP(img)
cost_array, tracebacks_array = m.find_costs(starts, ends)
# Transpose `ends` so can be used to index in NumPy
ends_idx = tuple(np.asarray(ends).T.tolist())
costs = cost_array[ends_idx]
# Compute exact minimum cost path to each endpoint
tracebacks = [m.traceback(end) for end in ends]
Note that the raw output cost_array is actually a fully dense array the same shape as img, which has finite values only where you asked for end points. The only possible issue with this approach is if the minimum path from more than one start point converges to the same end point. You will only get the full traceback for the lower of these convergent paths through the code above.
The traceback step still has a loop. This is likely possible to remove by using the tracebacks_array and interacting with `m.offsets, which would also remove the ambiguity noted above. However, if you only want the minimum cost(s) and best path(s), this loop can be omitted - simply find the minimum cost with argmin, and trace that single endpoint (or a few endpoints, if multiple are tied for lowest) back.

Why are epsilon transitions used in NFA?

I'm trying to understand how to create NFA-s from regular expressions, but I am really confused from epsilon transitions. I have this example in my textbook , but I don't understand why epsilon transitions are used and how does one know when to use them.
In general, espilon-transitions are used when they are convenient. For example, when constructing an NFA from a regular expression, you start by constructing small parts of the automaton corresponding to parts of the expression. To connect them, you need to put a transition. But if there is no symbol to be read there, an epsilon transition is a simple way to do this. They are, however never necessary, you can always find a solution without them.
In your example, just apply the algorithm described in your textbook. It tells you when to use them.
The epsilon transitions
from 1 to 2 probably connects the parts for (a|b)* and for ac
1->5 and 8->1 probably result from the *
5->6 and 5->7 probably result from the alternative in |
Epsilon-transitions in NFAs are a natural representation of choice or disjunction or union in regular expressions. That is, a regular expression like r + s (or r | s or r U s depending on your preferred notation) is naturally represented as an NFA consisting of two independent NFAs, one for r and one for s, joined using e-transitions as follows:
e
----->q0----->(r)
|
| e
|
V
(s)
When used to connect states in more complicated ways, the effect may not be as easy or natural to describe, but essentially these transitions let you choose unconditionally among multiple options. So, if I have seen a part of the input already and there are a few different ways the string could end, I can represent that by using e-transitions to states that handle the different possibilities.
In your example, the e-transitions are not really serving any very useful function and are merely artifacts of the conversion algorithm you have used. That algorithm includes them because, in the general case, they may be useful or necessary. In your specific case this was not true, so they look out of place.

Queue-like datastructure that is readable from anywhere

I'm working on a game in Yoyo Game Maker where I have items moving along conveyor belts. As the items only move in one direction, I thought it would make most sense to use a queue or queue-like data structure to store the items. However, to be able to render the items, I need to be able to read all of them at any point in the queue, not just the head or tail.
[[a] [b] [c] [d]]
|
V
a <- [[ ] [b] [c] [d]] <- e
|
V
[[b] [c] [d] [e]]
| | | |
V V V V
b c d e
I could simply use an array that manually moves all its values forward by one slot every turn (using a for loop), but somehow that seems inefficient, laggy, or at the very least, bad form. My programming instincts recoil at the thought of using such a system, anyway.
Is this a correct assumption to make? Is an array really the best way of implementing a structure like this? Should I even be worried about efficiency, or are the differences in this case negligible?
Some advice or examples (in any programming language) would be greatly appreciated.
It might "seem" inefficient to use an array, but it most likely won't be. Take into consideration how many items will actually be on the conveyor belt at any single time. If you want quick random access to any index in a data structure, you must use an array, a DS List or a DS Grid(wouldn't make sense here, tough).
Using a DS List, you can use ds_list_delete(your_list, 0) to 'dequeue', much like a DS Queue, and ds_list_insert(your_list, 0, value) to 'enqueue' an item.
Iterating through the list is then very simple:
for ( var i = 0; i < ds_list_size( your_list ); i++ ) {
var item = your_list[|i];
}
It might also be relevant to add, that, in a game I'm working on, objects are build using components, which basically means that all of the enemies, the player, and such have a for-loop in their Step events to iterate through and update all components if need be. At most I have about 80 such objects in the game at any time and performance hasn't been a problem yet.
You should always first try to get a working solution, then testing it in conditions specific to your game: eg. if you are going to need a 100 items in the structure, try that, and if the performance is not satisfactory, only then optimize.
I had made a game in game maker which had a similar requirement. If you know about the frog game it had some woods floating on a river.
Movement: For that, I had created some number of the wood object which moves in the particular direction. when frog collides with any one of them he/she starts to float in that direction. Also, if it goes out if window in one direction I would place it back at the other side of window.
Picking: I think to pick up one of the item, You will need to define the collision inside each object. After collision you can define you next steps.

suggestions on constructing nested statements properly

I am having trouble constructing my own nested selection statements (ifs) and repetition statements (for loops, whiles and do-whiles). I can understand what most simple repetition and selection statements are doing and although it takes me a bit longer to process what the nested statements are doing I can still get the general gist of the code (keeping count of the control variables and such). However, the real problem comes down to the construction of these statements, I just can't for the life of me construct my own statements that properly aligns with the pseudo-code.
I'm quite new to programming in general so I don't know if this is an experience thing or I just genuinely lack a very logical mind. It is VERY demoralising when it takes me a about an hour to complete 1 question in a book when I feel like it should just take a fraction of the time.
Can you people give me some pointers on how I can develop proper nested selection and repetition statements?
First of all, you need to understand this:
An if statement defines behavior for when **a decision is made**.
A loop (for, while, do-while) signifies **repetitive (iterative) work being done** (such as counting things).
Once you understand these, the next step, when confronted with a problem, is to try to break that problem up in small, manageable components:
i.e. decisions, that provide you with the possible code paths down the way,
and various work you need to do, much of which will end up being repetitive,
hence the need for one or more loops.
For instance, say we have this simple problem:
Given a positive number N, if N is even, count (display numbers) from
0(zero) to N in steps of 2, if N is odd, count from 0 to N in steps of
1.
First, let's break up this problem.
Step 1: Get the value of N. For this we don't need any decision, simply get it using the preferred method (from file, read console, etc.)
Step 2: Make a decision: is N odd or even?
Step 3: According to the decision made in Step 2, do work (count) - we will iterate from 0 to N, in steps of 1 or 2, depending on N's parity, and display the number at each step.
Now, we code:
//read N
int N;
cin<<N;
//make decision, get the 'step' value
int step=0;
if (N % 2 == 0) step = 2;
else step = 1;
//do the work
for (int i=0; i<=N; i=i+step)
{
cout >> i >> endl;
}
These principles, in my opinion, apply to all programming problems, although of course, usually it is not so simple to discern between concepts.
Actually, the complicated phase is usually the problem break-up. That is where you actually think.
Coding is just translating your thinking so the computer can understand you.

Lua: Code optimization vector length calculation

I have a script in a game with a function that gets called every second. Distances between player objects and other game objects are calculated every second there. The problem is that there can be thoretically 800 function calls in 1 second(max 40 players * 2 main objects(1 up to 10 sub-objects)). I have to optimize this function for less processing. this is my current function:
local square = math.sqrt;
local getDistance = function(a, b)
local x, y, z = a.x-b.x, a.y-b.y, a.z-b.z;
return square(x*x+y*y+z*z);
end;
-- for example followed by: for i = 800, 1 do getDistance(posA, posB); end
I found out, that the localization of the math.sqrt function through
local square = math.sqrt;
is a big optimization regarding to the speed, and the code
x*x+y*y+z*z
is faster than this code:
x^2+y^2+z^2
I don't know if the localization of x, y and z is better than using the class method "." twice, so maybe square(a.x*b.x+a.y*b.y+a.z*b.z) is better than the code local x, y, z = a.x-b.x, a.y-b.y, a.z-b.z;
square(x*x+y*y+z*z);
Is there a better way in maths to calculate the vector length or are there more performance tips in Lua?
You should read Roberto Ierusalimschy's Lua Performance Tips (Roberto is the chief architect of Lua). It touches some of the small optimizations you're asking about (such as localizing library functions and replacing exponents with their mutiplicative equivalents). Most importantly, it conveys one of the most important and overlooked ideas in engineering: sometimes the best solution involves changing your problem. You're not going to fix a 30-million-calculation leak by reducing the number of CPU cycles the calculation takes.
In your specific case of distance calculation, you'll find it's best to make your primitive calculation return the intermediate sum representing squared distance and allow the use case to call the final Pythagorean step only if they need it, which they often don't (for instance, you don't need to perform the square root to compare which of two squared lengths is longer).
This really should come before any discussion of optimization, though: don't worry about problems that aren't the problem. Rather than scouring your code for any possible issues, jump directly to fixing the biggest one - and if performance is outpacing missing functionality, bugs and/or UX shortcomings for your most glaring issue, it's nigh-impossible for micro-inefficiencies to have piled up to the point of outpacing a single bottleneck statement.
Or, as the opening of the cited article states:
In Lua, as in any other programming language, we should always follow the two
maxims of program optimization:
Rule #1: Don’t do it.
Rule #2: Don’t do it yet. (for experts only)
I honestly doubt these kinds of micro-optimizations really help any.
You should be focusing on your algorithms instead, like for example get rid of some distance calculations through pruning, stop calculating the square roots of values for comparison (tip: if a^2<b^2 and a>0 and b>0, then a<b), etc etc
Your "brute force" approach doesn't scale well.
What I mean by that is that every new object/player included in the system increases the number of operations significantly:
+---------+--------------+
| objects | calculations |
+---------+--------------+
| 40 | 1600 |
| 45 | 2025 |
| 50 | 2500 |
| 55 | 3025 |
| 60 | 3600 |
... ... ...
| 100 | 10000 |
+---------+--------------+
If you keep comparing "everything with everything", your algorithm will start taking more and more CPU cycles, in a cuadratic way.
The best option you have for optimizing your code isn't not in "fine tuning" the math operations or using local variables instead of references.
What will really boost your algorithm will be eliminating calculations that you don't need.
The most obvious example would be not calculating the distance between Player1 and Player2 if you already have calculated the distance between Player2 and Player1. This simple optimization should reduce your time by a half.
Another very common implementation consists in dividing the space into "zones". When two objects are on the same zone, you calculate the space between them normally. When they are in different zones, you use an approximation. The ideal way of dividing the space will depend on your context; an example would be dividing the space into a grid, and for players on different squares, use the distance between the centers of their squares, that you have computed in advance).
There's a whole branch in programming dealing with this issue; It's called Space Partitioning. Give this a look:
http://en.wikipedia.org/wiki/Space_partitioning
Seriously?
Running 800 of those calculations should not take more than 0.001 second - even in Lua on a phone.
Did you do some profiling to see if it's really slowing you down? Did you replace that function with "return (0)" to verify performance improves (yes, function will be lost).
Are you sure it's run every second and not every millisecond?
I haven't see an issue running 800 of anything simple in 1 second since like 1987.
If you want to calc sqrt for positive number a, take a recursive sequense
x_0 = a
x_n+1 = 1/2 * (x_n + a / x_n)
x_n goes to sqrt(a) with n -> infinity. first several iterations should be fast enough.
BTW! Maybe you'll try to use the following formula for length of vector instesd of standart.
local getDistance = function(a, b)
local x, y, z = a.x-b.x, a.y-b.y, a.z-b.z;
return x+y+z;
end;
It's much more easier to compute and in some cases (e.g. if distance is needed to know whether two object are close) it may act adequate.

Resources