I have the following scenario (preliminary apologies for length, but I wanted to be as descriptive as possible):
I am presented with a list of "recipes" (Ri) that must be fulfilled, in the order presented, to complete a given task. Each recipe consists of a list of the parts (Pj) required to complete it. A recipe typically requires up to 3 or 4 parts, but might require as many as 16. An example recipe list might look like:
R1 = {P1}
R2 = {P4}
R3 = {P2, P3, P4}
R4 = {P1, P4}
R5 = {P1, P2, P2} //Note that more than 1 of a given part may be needed. (Here, P2)
R6 = {P2, P3}
R7 = {P3, P3}
R8 = {P1} //Note that recipes may recur within the list. (Same as R1)
The longest list might consist of a few hundred recipes, but typically contains many recurrences of some recipes, so eliminating identical recipes will generally reduce the list to fewer than 50 unique recipes.
I have a bank of machines (Mk), each of which has been pre-programmed (this happens once, before list processing has begun) to produce some (or all) of the available types of parts.
An iteration of the fulfillment process occurs as follows:
The next recipe in the list is presented to the bank of machines.
On each machine, one of its available programs is selected to produce one of the parts required by this recipe, or, if it is not required for this recipe, it is set "offline."
A "crank" is turned, and each machine (that has not been "offlined") spits out one part.
Combining the parts produced by one turn of the crank fulfills the recipe. Order is irrelevant, e.g., fulfilling recipe {P1, P2, P3} is the same as fulfilling recipe {P1, P3, P2}.
The machines operate instantaneously, in parallel, and have unlimited raw materials, so there are no resource or time/scheduling constraints. The size k of the bank of machines must be at least equal to the number of elements in the longest recipe, and thus has roughly the same range (typically 3-4, possibly up to 16) as the recipe lengths noted above. So, in the example above, k=3 (as determined by the size of R3 and R5) seems a reasonable choice.
The question at hand is how to pre-program the machines so that the bank is capable of fulfilling all of the recipes in a given list. The machine bank shares a common pool of memory, so I'm looking for an algorithm that produces a programming configuration that eliminates (entirely, or as much as possible) redundancy between machines, so as to minimize the amount of total memory load. The machine bank size k is flexible, i.e., if increasing the number of machines beyond the length of the longest recipe in a given list produces a more optimal solution for the list (but keeping a hard limit of 16), that's fine.
For now, I'm considering this a unicost problem, i.e., each program requires the same amount of memory, although I'd like the flexibility to add per-program weighting in the future. In the example above, considering all recipes, P1 occurs at most once, P2 occurs at most twice (in R5), P3 occurs at most twice (in R7), and P4 occurs at most once, so I would ideally like to achieve a configuration that matches this - only one machine configured to produce P1, two machines configured to produce P2, two machines configured to produce P3, and one machine configured to produce P4. One possible minimal configuration for the above example, using machine bank size k=3, would be:
M1 is programmed to produce either P1 or P3
M2 is programmed to produce either P2 or P3
M3 is programmed to produce either P2 or P4
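A candidate configuration like the one above can be checked mechanically: a recipe is fulfillable exactly when each required part (counting repeats) can be assigned to a distinct machine that has a program for it, which is a bipartite matching question. The following is a minimal sketch of such a check, plus the per-part lower bound described above; the function names `can_fulfil` and `memory_lower_bound` are hypothetical, and this only verifies a configuration, it does not find one.

```python
from collections import Counter

def can_fulfil(recipe, machines):
    """True if every part in `recipe` can go to a distinct machine (augmenting-path matching)."""
    match = {}  # machine index -> index of the recipe slot it is producing

    def try_assign(slot, seen):
        for m, programs in enumerate(machines):
            if recipe[slot] in programs and m not in seen:
                seen.add(m)
                # Use a free machine, or re-route the slot currently using it.
                if m not in match or try_assign(match[m], seen):
                    match[m] = slot
                    return True
        return False

    return all(try_assign(slot, set()) for slot in range(len(recipe)))

def memory_lower_bound(recipes):
    """Max multiplicity of each part over all recipes: the fewest programs each part needs."""
    need = Counter()
    for recipe in recipes:
        for part, count in Counter(recipe).items():
            need[part] = max(need[part], count)
    return need

recipes = [["P1"], ["P4"], ["P2", "P3", "P4"], ["P1", "P4"],
           ["P1", "P2", "P2"], ["P2", "P3"], ["P3", "P3"], ["P1"]]
machines = [{"P1", "P3"}, {"P2", "P3"}, {"P2", "P4"}]  # M1, M2, M3 from the example

all_ok = all(can_fulfil(r, machines) for r in recipes)
bound = memory_lower_bound(recipes)
total_programs = sum(len(m) for m in machines)
print(all_ok, dict(bound), total_programs)
```

For the example, the total number of programs equals the sum of the lower bounds, so that configuration carries no redundancy.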
Since there are no job-shop-type constraints here, my intuition tells me that this should reduce to a set-cover problem - something like the minimal unate set-cover problem found in designing digital systems. But I can't seem to adapt my (admittedly limited) knowledge of those algorithms to this scenario. Can someone confirm or deny the feasibility of this approach, and, in either case, point me towards some helpful algorithms? I'm looking for something I can integrate into an existing chunk of code, as opposed to something prepackaged like Berkeley's Espresso.
Thanks!
This reminds me of the graph coloring problem used for register allocation in compilers.
Step 1: if the same part is repeated in a recipe, rename it; e.g., R5 = {P1, P2, P2'}
Step 2: insert all the parts into a graph with edges between parts in the same recipe
Step 3: color the graph so that no two connected nodes (parts) have the same color
The colors are the machine identities to make the parts.
This is sub-optimal because the renamed parts create false constraints in other recipes. You may be able to fix this with "coalescing." See Briggs.
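The three steps can be sketched as follows. This is only an illustration: it uses a simple greedy coloring (highest-degree node first), which is not guaranteed to use the fewest colors, and it does no coalescing. The names `rename_duplicates` and `color_parts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def rename_duplicates(recipe):
    """Step 1: the n-th occurrence of a part becomes the node (part, n)."""
    seen = Counter()
    nodes = []
    for part in recipe:
        nodes.append((part, seen[part]))
        seen[part] += 1
    return nodes

def color_parts(recipes):
    # Step 2: build the graph, with an edge between parts that share a recipe.
    neighbours = {}
    for recipe in recipes:
        nodes = rename_duplicates(recipe)
        for node in nodes:
            neighbours.setdefault(node, set())
        for a, b in combinations(nodes, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Step 3: greedy coloring; each node takes the smallest color unused by its neighbours.
    color = {}
    for node in sorted(neighbours, key=lambda n: (-len(neighbours[n]), n)):
        used = {color[n] for n in neighbours[node] if n in color}
        color[node] = next(c for c in range(len(neighbours)) if c not in used)
    return color

recipes = [["P1"], ["P4"], ["P2", "P3", "P4"], ["P1", "P4"],
           ["P1", "P2", "P2"], ["P2", "P3"], ["P3", "P3"], ["P1"]]
color = color_parts(recipes)

# Each color is a machine; its programs are the parts that received that color.
machines = {}
for (part, _), c in color.items():
    machines.setdefault(c, set()).add(part)
print(machines)
```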
It is a programming assignment in Prolog to write a program that takes in input of processes, resources, etc and either print out the safe order of execution or simply return false if there is no safe order of execution.
I am very new to Prolog and I just simply can't get myself to think in the way it is demanding me to think (which is, formulating the problem logically and letting the compiler figure out how to do the operations) and I am stuck just thinking procedurally. How would you formulate such a problem logically, in terms of predicates and whatnot?
The sample input goes as follows: a list of processes, a list of pairs of resource and the number of available instances of that resource and facts allocated(x,y) with x being a process and y being a list of resources allocated to x and finally requested(x,y) such that x is a process and y is a list of resources requested by x.
For the life of me I can't think of it in terms of anything but C++. I am too stuck. I don't even need code, just clarity.
edit: here's a sample input. I seriously just need to see what I need to do. I am completely clueless.
processes([p1, p2, p3, p4, p5]).
available_resources([[r1, 2], [r2, 2], [r3, 2], [r4, 1], [r5, 2]]).
allocated(p1, [r1, r2]).
allocated(p2, [r1, r3]).
allocated(p3, [r2]).
allocated(p4, [r3]).
allocated(p5, [r4]).
requested(p5, [r5]).
What you want to do is apply the "state search" approach.
Start with an initial state S0.
Apply a transformation to S0 according to allowed rules, giving S1. The rules must allow only consistent states to be created. For example, the rules may not generate new resources ex nihilo.
Check whether the new state S1 fulfills the condition of a "final state" or "solution state" permitting you to declare victory.
If not, apply a transformation to S1, according to allowed rules, giving S2.
etc.
Applying transformations may get you to generate a state from which no progress is possible but which is not a "final state" either. You are stuck. In that case, dump a few of the latest transformations, moving back to an earlier state, and try other transformations.
Through this you get a tree of states through the state space as you explore the different possibilities to reach one of the final states (or the single final state, depending on the problem).
What we need is:
A state description ;
A set of allowed state transformations (sometimes called "operators") ;
A criterion to decide whether we are blocked in a state ;
A criterion to decide whether we have found a final state ;
Maybe a heuristic to decide which state to try next. If the state space is small enough, we can try everything blindly, otherwise something like A* might be useful.
The exploration algorithm itself. Starting with an initial state, it applies operators, generating new states, backtracks if blocked, and terminates if a final state has been reached.
State description
A state (at any time t) is described by the following relevant information:
a number of processes
a number of resources, several of the same kind
information about which processes have allocated which resources
information about which processes have requested which resources
As with anything else in informatics, we need a data structure for the above.
The default data structure in Prolog is the term, a tree of symbols. The extremely useful list is only another representation of a particular tree. One has to choose a representation so that it speaks to the human and can still be manipulated easily by Prolog code. So how about a term like this:
[process(p1),owns(r1),owns(r1),owns(r2),wants(r3)]
This expresses the fact that process p1 owns two resources r1 and one resource r2 and wants r3 next.
The full state is then a list of lists specifying information about each process, for example:
[[process(p1),owns(r1),owns(r1),owns(r2),wants(r3)],
[process(p2),owns(r3),wants(r1)],
[process(p3),owns(r3)]]
Operator description
Prolog does not allow "mutable state", so an operator is a transformation from one state to another, rather than a patching of a state to represent some other state.
The fact that states are not modified in-place is of course very important because we (probably) want to retain the states already visited so as to be able to "back track" to an earlier state in case we are blocked.
I suppose the following operators may apply:
In state StateIn, process P requests resource R which it needs but can't obtain.
request(StateIn, StateOut, P, R) :- .... code that builds StateOut from StateIn
In state StateIn, process P obtains resource R which is free.
obtain(StateIn, StateOut, P, R) :- .... code that builds StateOut from StateIn
In state StateIn, process P frees resource R which it owns.
free(StateIn, StateOut, P, R) :- .... code that builds StateOut from StateIn
The code would be written such that if StateIn were
[[process(p1),owns(r1),owns(r1),owns(r2),wants(r3)],
[process(p2),owns(r3),wants(r1)],
[process(p3),owns(r3)]]
then free(StateIn, StateOut, p1, r2) would construct a StateOut
[[process(p1),owns(r1),owns(r1),wants(r3)],
[process(p2),owns(r3),wants(r1)],
[process(p3),owns(r3)]]
which becomes the new current state in the search loop. Etc. Etc.
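For readers who think more easily in a procedural language, the same "build a new state, never patch the old one" idea can be sketched in Python. This is only an analogy to the Prolog operator, not Prolog code; the tuple-based state encoding and the function name `free` are assumptions made for the illustration.

```python
def free(state, process, resource):
    """Return a new state in which `process` has released one `resource`,
    or None if the precondition (the process owns it) does not hold."""
    for i, entry in enumerate(state):
        if entry[0] == ("process", process) and ("owns", resource) in entry:
            j = entry.index(("owns", resource))
            new_entry = entry[:j] + entry[j + 1:]          # drop one owns(resource)
            return state[:i] + (new_entry,) + state[i + 1:]
    return None

state_in = (
    (("process", "p1"), ("owns", "r1"), ("owns", "r1"), ("owns", "r2"), ("wants", "r3")),
    (("process", "p2"), ("owns", "r3"), ("wants", "r1")),
    (("process", "p3"), ("owns", "r3")),
)
state_out = free(state_in, "p1", "r2")
print(state_out[0])
```

Because tuples are immutable, `state_in` is untouched and remains available for backtracking.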
A criterion to decide whether we are blocked in the current state
Often being "blocked" means that no operators are applicable to the state because the preconditions of none of the operators hold.
In this case the criterion seems to be "the state implies a deadlock".
So a predicate check_deadlock(StateIn) needs to be written. It has to test the state description for any deadlock conditions (performing its own little search, in fact).
A criterion to decide whether we have found a final state
This is underspecified. What is a final state for this problem?
In any case, there must be a check_final(StateIn) predicate which returns true if StateIn is, indeed, a final state.
Note that the finality criterion may also be about the whole path from the start state to the current state. In that case: check_path([StartState,NextState,...,CurrentState]).
The exploration algorithm
This can be relatively short in Prolog as you get depth-first search & backtracking for free if you don't use specific heuristics and keep things primitive.
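To make explicit what Prolog does implicitly, here is a sketch of the exploration loop in Python: depth-first search that backs out of blocked states and stops at the first final state. The function `depth_first` and the toy number problem used to exercise it are hypothetical; for the assignment, `successors` would apply the operators and `is_blocked` would be the deadlock check.

```python
def depth_first(state, successors, is_final, is_blocked, visited=None):
    """Return the list of states from `state` to a final state, or None if there is none."""
    if visited is None:
        visited = set()
    visited.add(state)
    if is_final(state):
        return [state]
    if is_blocked(state):
        return None                      # dead end: the caller tries its next alternative
    for nxt in successors(state):
        if nxt in visited:
            continue                     # do not re-explore a state already seen
        path = depth_first(nxt, successors, is_final, is_blocked, visited)
        if path is not None:
            return [state] + path
    return None

# Toy problem: reach 11 from 1 by doubling or adding 3; overshooting is a dead end.
path = depth_first(
    1,
    successors=lambda n: [n * 2, n + 3],
    is_final=lambda n: n == 11,
    is_blocked=lambda n: n > 11,
)
print(path)
```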
You are all set!
I've been given as an assignment to write using prolog a solver for
the battleships solitaire puzzle. To those unfamiliar, the puzzle deals
with a 6 by 6 grid on which a series of ships are placed according to the provided
constraints on each row and column, i.e. the first row must contain 3 squares with ships, the second row must contain 1 square with a ship, the third row must contain 0 squares etc for the other rows and columns.
Each puzzle comes with its own set of constraints and revealed squares, typically two. An example can be seen here:
battleships
So, here's what I've done:
step([ShipCount,Rows,Cols,Tiles],[ShipCount2,Rows2,Cols2,Tiles2]):-
ShipCount2 is ShipCount+1,
nth1(X,Cols,X1),
X1\==0,
nth1(Y,Rows,Y1),
Y1\==0,
not(member([X,Y,_],Tiles)),
pairs(Tiles,TilesXY),
notdiaglist(X,Y,TilesXY),
member(T,[1,2,3,4,5,6]),
append([X,Y],[T],Tile),
append([Tile],Tiles,Tiles2),
dec_elem1(X,Cols,Cols2),dec_elem1(Y,Rows,Rows2).
dec_elem1(1,[A|Tail],[B|Tail]):- B is A-1.
dec_elem1(Count,[A|Tail],[A|Tail2]):- Count1 is Count-1,dec_elem1(Count1,Tail,Tail2).
neib(X1,Y1,X2,Y2) :- X2 is X1,(Y2 is Y1 -1;Y2 is Y1+1; Y2 is Y1).
neib(X1,Y1,X2,Y2) :- X2 is X1-1,(Y2 is Y1 -1;Y2 is Y1+1; Y2 is Y1).
neib(X1,Y1,X2,Y2) :- X2 is X1+1,(Y2 is Y1 -1;Y2 is Y1+1; Y2 is Y1).
notdiag(X1,Y1,X2,Y2) :- not(neib(X1,Y1,X2,Y2)).
notdiag(X1,Y1,X2,Y2) :- neib(X1,Y1,X2,Y2),((X1 == X2,t(Y1,Y2));(Y1 == Y2,t(X1,X2))).
notdiaglist(X1,Y1,[]).
notdiaglist(X1,Y1,[[X2,Y2]|Tail]):-notdiag(X1,Y1,X2,Y2),notdiaglist(X1,Y1,Tail).
t(X1,X2):- X is abs(X1-X2), X==1.
pairs([],[]).
pairs([[X,Y,Z]|Tail],[[X,Y]|Tail2]):-pairs(Tail,Tail2).
I represent a state with a list: [Count,Rows,Columns,Tiles]. The last state must be
[10,[0,0,0,0,0,0],[0,0,0,0,0,0], somelist]. A puzzle starts from an initial state, for example
initial([1, [1,3,1,1,1,2] , [0,2,2,0,0,5] , [[4,4,1],[2,1,0]]]).
I try to find a solution in the following manner:
run:-initial(S),step(S,S1),step(S1,S2),....,step(S8,F).
Now, here's the difficulty: if I restrict myself to one type of ship part by using member(T,[1])
instead of
member(T,[1,2,3,4,5,6])
it works fine. However, when I use the full range of possible values for T, which are needed
later, the query runs for so long that it never seems to finish. This puzzles me, since:
(a) it works for 6 types of ships but only for 8 steps instead of 9
(b) going from a single type of ship to 6 types increases the number
of options for just the last step by a factor of 6, which
shouldn't have such a dramatic effect.
So, what's going on?
To answer your question directly, what's going on is that Prolog is trying to sift through an enormous space of possibilities.
You're correct that altering that line increases the search space of the last call by a factor of six, but note that the size of the search space of, say, nine calls isn't proportional to 9 times the size of one call. Prolog will backtrack on failure, so it's proportional (bounded above, actually) to the size of the possible results of one call raised to the ninth power.
That means we can expect the size of the space Prolog needs to search to grow by at most a factor of 6^9 = 10077696 when we allow T to take on 6 times as many values.
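The arithmetic is easy to reproduce. The helper below (a hypothetical name, `growth_bound`) just evaluates the upper bound on the growth factor for a chain of backtracking calls, and shows why eight steps can still finish while nine may not.

```python
def growth_bound(extra_choices_per_step, steps):
    """Upper bound on how much the search space grows when every step gets
    `extra_choices_per_step` times as many alternatives."""
    return extra_choices_per_step ** steps

eight = growth_bound(6, 8)
nine = growth_bound(6, 9)
print(eight, nine)
```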
Of course, it doesn't help that (as far as I was able to tell) a solution doesn't exist if we call step 9 times starting with initial anyways. Since that last call is going to fail, Prolog will keep trying until it's exhausted all possibilities (of which there are a great many) before it finally gives up.
As far as a solution goes, I'm not sure I know enough about the problem. If the value of T is the kind of ship part that fits in the grid (e.g. single square, half of a 2-square ship, part of a 3-square ship), you should note that it gives you a lot more information than the numbers on the rows/columns.
Right now, in pseudocode, your step looks like this:
Find a (X,Y) pair that has non-zero markings on its row/column
Check that there isn't already a ship there
Check that it isn't diagonal to a ship
Pick a kind of ship-part for it to be.
I'd suggest you approach like this:
Finish any already placed ship-bits to form complete ships (if we can't: fail)
Until we're finished:
Find acceptable places to place ship
Check that the markings on the row/column aren't zero
Try to place an entire ship here. (instead of a single part)
By using the most specific information that we have first (in this case, the previously placed parts), we can reduce the amount of work Prolog has to do and make things return reasonably fast.
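As a rough illustration of "place an entire ship" rather than a single part, the sketch below enumerates the positions where a whole ship of a given length still fits the remaining row and column counts. It is written in Python rather than Prolog, uses 0-based (x, y) cells, and deliberately ignores the adjacency rules and already placed tiles, so it shows only the counting filter. The name `ship_placements` is hypothetical.

```python
def ship_placements(length, rows, cols):
    """Yield lists of (x, y) cells where a ship of `length` fits the remaining counts.
    rows[y] and cols[x] are the numbers of ship squares still to be placed."""
    # Horizontal: one row must absorb the whole ship, each column one square.
    for y, row_left in enumerate(rows):
        if row_left >= length:
            for x in range(len(cols) - length + 1):
                if all(cols[x + i] >= 1 for i in range(length)):
                    yield [(x + i, y) for i in range(length)]
    # Vertical: skipped for length 1, which was already covered above.
    if length > 1:
        for x, col_left in enumerate(cols):
            if col_left >= length:
                for y in range(len(rows) - length + 1):
                    if all(rows[y + i] >= 1 for i in range(length)):
                        yield [(x, y + i) for i in range(length)]

rows = [1, 3, 1, 1, 1, 2]   # counts taken from the initial/1 fact in the question
cols = [0, 2, 2, 0, 0, 5]
two = list(ship_placements(2, rows, cols))
three = list(ship_placements(3, rows, cols))
print(len(two), len(three))
```

Each candidate removes several squares at once, so far fewer choice points are left for backtracking than with single-square steps.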
I am taking a course on models of computation and currently we are doing finite state machines. One of my tasks is to draw an FSM that performs division by 3; to simplify the model, the machine only accepts numbers that are multiples of 3. I am not sure how exactly this works, especially since I imagine an FSM putting out only single binary values. Could you guys give examples (division by 2 or 4) or hints on how to approach this?
This is what you need, I think (sorry about the bad picture). The 'E' represents epsilon/lambda/no-output. The label of the edges denotes 'input/output'. For each symbol read there is also a corresponding output which may be lambda (no output).
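One common construction, which may differ in detail from the pictured machine, is binary long division read most-significant bit first: the state is the running remainder (0, 1 or 2), and every input bit produces one quotient bit. The transition table below is an assumption in that sense, and this variant emits leading zeros instead of epsilon outputs.

```python
# (state, input bit) -> (next state, output bit); the state is the remainder so far.
TRANSITIONS = {
    (0, 0): (0, 0), (0, 1): (1, 0),
    (1, 0): (2, 0), (1, 1): (0, 1),
    (2, 0): (1, 1), (2, 1): (2, 1),
}

def divide_by_3(bits):
    """Run the machine over a binary string; return (quotient bits, final state)."""
    state = 0
    out = []
    for ch in bits:
        state, bit = TRANSITIONS[(state, int(ch))]
        out.append(str(bit))
    return "".join(out), state

quotient, remainder = divide_by_3("1001")   # 9 in binary
print(quotient, remainder)
```

An input is a multiple of 3 exactly when the machine ends in state 0, so state 0 serves as the accepting state.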
This is just an experiment I'm trying to wrap my brain around.
So I've got two registers r1 r2 and two wires w1 w2. What I want is, if both r's are 1, both w's should be 1. If one r is 1, the corresponding w should be 1 and the other should be 0. If both r's are 0, w1 should be 1 and w2 should be 0.
11=>11
10=>10
01=>01
00=>10
The caveat is I want the assign for w1 not to include r2 directly, and vice versa. So, I've got (in Verilog for instance--a VHDL answer would be perfectly fine too)
assign w1 = r1 | !w2;
assign w2 = r2 | !w1;
Which is necessary but not sufficient. All the cases above are true, but 00=>01 is also true. In fact when r1=r2=0, it just creates a cycle of wires without a driver, so I think the result is non-deterministic.
Is there any way to get the result I'm looking for without including r2 in the assignment for w1, or vice versa? (And without introducing new variables). Basically just to ensure that in a wire-cycle, w1 is pulled high and w2 pulled low?
No, I think there is no clean way to do this without extra wires/signals and without your cross dependency.
By the way, your "cyclical wires" are commonly referred to as combinational loops, and it is good practice to avoid them.
As for the simulation of a VHDL model with a combinational loop, the result is deterministic provided the simulator converges to a stable point, i.e., no more signal value changes. If signal values change continuously, you are likely to reach your simulator's iteration limit. I don't know about Verilog, but I assume it is deterministic as well.
As for synthesis, tools will either reject this construct and raise an error, or try to handle it, with a possibly very bad impact on timing.
Again, even if your simulation is ok and your synthesis tool allows this, combinational loops should be avoided.
What you have currently is very similar to an SR latch, and as such it has a metastable condition (also known as a race condition).
From your truth table above though, it looks like w2 should be set only to r2.
assign w2 = r2;
That change should fix your race condition; though as expressed above, be wary of the restrictions created by combinational loops.
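Both claims can be checked by brute force. The sketch below is a plain Python model of the two assignments, not a Verilog simulation: it lists every (w1, w2) pair that is self-consistent for the original cross-coupled equations, and then evaluates the variant with w2 = r2. The function names are hypothetical.

```python
from itertools import product

def stable_points(r1, r2):
    """All (w1, w2) satisfying w1 = r1 | !w2 and w2 = r2 | !w1 simultaneously."""
    return [(w1, w2) for w1, w2 in product((0, 1), repeat=2)
            if w1 == (r1 | (1 - w2)) and w2 == (r2 | (1 - w1))]

def without_loop(r1, r2):
    """Variant from this answer: w2 = r2, w1 = r1 | !w2 (no feedback into w2)."""
    w2 = r2
    w1 = r1 | (1 - w2)
    return (w1, w2)

for r1, r2 in product((0, 1), repeat=2):
    print((r1, r2), stable_points(r1, r2), without_loop(r1, r2))
```

With r1 = r2 = 0 the original equations have two stable points, which is the ambiguity described in the question; the variant yields exactly the requested truth table.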
I currently have an application which can contain 100s of user defined formulae. Currently, I use reverse polish notation to perform the calculations (pushing values and variables on to a stack, then popping them off the stack and evaluating). What would be the best way to start parallelizing this process? Should I be looking at a functional language?
The calculations are performed on arrays of numbers, so for example a simple A+B could actually mean 100s of additions. I'm currently using Delphi, but this is not a requirement going forward. I'll use the tool most suited to the job. Formulae may also be dependent on each other, so we may have one formula C=A+B and a second one D=C+A, for example.
Let's assume your formulae (equations) are not cyclic, as otherwise you cannot "just" evaluate them. If you have vectorized equations like A = B + C where A, B and C are arrays, let's conceptually split them into equations on the components, so that if the array size is 5, this equation is split into
a1 = b1 + c1
a2 = b2 + c2
...
a5 = b5 + c5
Now assuming this, you have a large set of equations on simple quantities (whether integer, rational or something else).
If you have two equations E and F, let's say that F depends_on E if the right-hand side of F mentions the left-hand side of E, for example
E: a = b + c
F: q = 2*a + y
Now to get towards how to calculate this, you could always use randomized iteration to solve this (this is just an intermediate step in the explanation), following this algorithm:
1 while (there is at least one equation which has not been computed yet)
2 select one such pending equation E so that:
3 for every equation D such that E depends_on D:
4 D has been already computed
5 calculate the left-hand side of E
This process terminates with the correct answer regardless of how you make your selections on line 2. Now the cool thing is that it also parallelizes easily. You can run it in an arbitrary number of threads! What you need is a concurrency-safe queue which holds those equations whose prerequisites (those the equations depend on) have been computed but which have not been computed themselves yet. Every thread pops (thread-safely) one equation from this queue at a time, calculates the answer, then checks whether there are now new equations all of whose prerequisites have been computed, and adds those equations (thread-safely) to the work queue. Done.
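A minimal sketch of that work-queue scheme, in Python for brevity: the equation format (a name mapped to its dependency names and a function of their values), the function name `evaluate` and the worker count are all assumptions made for the illustration. It assumes the equations are acyclic and omits error handling, so a failing formula would leave the other workers waiting.

```python
import queue
import threading

def evaluate(equations, inputs, workers=4):
    """equations: name -> (dependency names, function taking their values in order)."""
    values = dict(inputs)
    if not equations:
        return values
    # Count only prerequisites that are equations themselves; inputs are already known.
    pending = {n: sum(d in equations for d in deps) for n, (deps, _) in equations.items()}
    dependents = {n: [] for n in equations}
    for name, (deps, _) in equations.items():
        for d in deps:
            if d in equations:
                dependents[d].append(name)
    ready = queue.Queue()                       # the concurrency-safe work queue
    for name, count in pending.items():
        if count == 0:
            ready.put(name)
    lock = threading.Lock()
    remaining = len(equations)

    def worker():
        nonlocal remaining
        while True:
            name = ready.get()
            if name is None:                    # sentinel: everything has been computed
                return
            deps, func = equations[name]
            with lock:
                args = [values[d] for d in deps]
            result = func(*args)                # the actual work happens outside the lock
            with lock:
                values[name] = result
                remaining -= 1
                for child in dependents[name]:
                    pending[child] -= 1
                    if pending[child] == 0:     # all prerequisites are now computed
                        ready.put(child)
                if remaining == 0:
                    for _ in range(workers):
                        ready.put(None)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return values

equations = {
    "C": (["A", "B"], lambda a, b: a + b),      # C = A + B
    "D": (["C", "A"], lambda c, a: c + a),      # D = C + A
}
result = evaluate(equations, {"A": 1, "B": 2})
print(result)
```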
Without knowing more, I would suggest taking a SIMD-style approach if possible. That is, create threads to compute all formulas for a single data set. Trying to divide the computation of formulas to parallelise them wouldn't yield much speed improvement: the logic required to split the computations into discrete units suitable for threading would be hard to write and harder to get right, and the overhead would cancel out any speed gains. It would also suffer quickly from diminishing returns.
Now, if you've got a set of formulas that are applied to many sets of data, then the parallelisation becomes easier and scales better. Each thread does all computations for one set of data. Create one thread per CPU core and set its affinity to that core. Each thread instantiates one instance of the formula evaluation code. Create a supervisor which loads a single data set and passes it to an idle thread. If no threads are idle, wait for the first thread to finish processing its data. When all data sets are processed and all threads have finished, exit. Using this method, there's no advantage to having more threads than there are cores on the CPU, as thread switching is slow and will have a negative effect on overall speed.
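The one-worker-per-data-set scheme can be sketched with a thread pool, which plays the role of the supervisor by handing each data set to whichever worker is idle. The names `evaluate_all` and `run` and the formula format are hypothetical, the formulas are assumed to be listed in dependency order, and core affinity is not set here. In CPython, threads only give a real speed-up when the per-formula work releases the interpreter lock (array libraries typically do), so treat this as a structural sketch.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def evaluate_all(formulas, data_set):
    """Evaluate every formula, in order, against one data set."""
    values = dict(data_set)
    for name, func in formulas:           # assumed to be in dependency order
        values[name] = func(values)
    return values

def run(formulas, data_sets):
    # One worker per core; the pool assigns each data set to an idle worker.
    with ThreadPoolExecutor(max_workers=os.cpu_count() or 1) as pool:
        return list(pool.map(lambda d: evaluate_all(formulas, d), data_sets))

formulas = [
    ("C", lambda v: v["A"] + v["B"]),     # C = A + B
    ("D", lambda v: v["C"] + v["A"]),     # D = C + A
]
results = run(formulas, [{"A": 1, "B": 2}, {"A": 10, "B": 20}])
print(results)
```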
If you've only got one data set then it is not a trivial task. It would require parsing the evaluation tree for branches without dependencies on other branches and farming those branches to separate threads running on each core and waiting for the results. You then get problems synchronizing the data and ensuring data coherency.