For example, in Hugging Face's example:
encoded_input = tokenizer("Do not meddle in the affairs of wizards, for they are subtle and quick to anger.")
print(encoded_input)
{'input_ids': [101, 2079, 2025, 19960, 10362, 1999, 1996, 3821, 1997, 16657, 1010, 2005, 2027, 2024, 11259, 1998, 4248, 2000, 4963, 1012, 102],
'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
The input_ids vector already encodes the order of each token in the original sentence. Why does the model need positional encoding on top of that, with an extra vector to represent it?
The reason is the design of the neural architecture. BERT consists of self-attention and feedforward sub-layers, and neither of them is sequential.
The feedforward layers process each token independently of others.
The self-attention views the input states as an unordered set of states. Attention can be interpreted as soft probabilistic retrieval from a set of values according to some keys. The position embeddings are there so the keys can contain information about their relative order.
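To make the "unordered set" point concrete, here is a minimal sketch in plain Python. It assumes a single attention head with identity query/key/value projections (real BERT layers use learned projections and several heads), and the function names and toy vectors are made up for illustration. Shuffling the inputs only shuffles the outputs, until position vectors are added:

```python
import math

def self_attention(states):
    """Single-head dot-product self-attention with identity Q/K/V projections (a simplification)."""
    dim = len(states[0])
    outputs = []
    for query in states:
        # Scaled dot-product score of this query against every state (the keys).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in states]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # softmax numerator, shifted for stability
        total = sum(weights)
        # Weighted average of the states (the values).
        outputs.append([sum(w / total * value[d] for w, value in zip(weights, states))
                        for d in range(dim)])
    return outputs

def add_positions(states, positions):
    """Add a position vector to each state, element-wise."""
    return [[s + p for s, p in zip(state, pos)] for state, pos in zip(states, positions)]

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy token embeddings (hypothetical values)
shuffled = [tokens[2], tokens[0], tokens[1]]    # same tokens, different order
positions = [[0.0, 0.1], [0.2, 0.0], [0.3, 0.3]]  # toy position embeddings (hypothetical values)

plain = self_attention(tokens)
plain_shuffled = self_attention(shuffled)
# Without positions, token [1, 1] gets the same output wherever it stands:
# plain[2] and plain_shuffled[0] agree up to rounding.

with_pos = self_attention(add_positions(tokens, positions))
with_pos_shuffled = self_attention(add_positions(shuffled, positions))
# With positions added, the output for token [1, 1] depends on where it stands.
```

The list order of `tokens` plays the role of input_ids order here: the layer itself never looks at it, so the order has to be injected into the vectors.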
I am developing a technique for sorting a table that contains only 0s and 1s, such as:
{{1, 1, 0, 1, 1, 1, 1, 1},
{1, 1, 0, 0, 0, 0, 1, 0},
{1, 1, 1, 1, 1, 1, 1, 0},
{1, 1, 1, 1, 1, 1, 1, 0},
{1, 1, 1, 0, 0, 0, 1, 0},
{1, 1, 1, 1, 1, 1, 1, 0},
{0, 0, 0, 0, 0, 1, 0, 1},
{1, 1, 1, 1, 1, 0, 0, 0},
{1, 1, 1, 1, 1, 1, 0, 1},
{0, 0, 0, 1, 0, 1, 0, 1},
{1, 1, 1, 1, 1, 0, 0, 0},
{1, 1, 1, 1, 1, 0, 0, 0}}
The objective is to count the total per column and sort the table:
I. Descending based on the total per column.
II. By coverage. For instance, in the 1st row the 3rd value is 0. We'll have to find the 1st column that has 1 in the 3rd column and re-sort the columns. In other words, 1 stands for coverage and we have to make sure that we cover everything within the first few columns.
I managed to get the total per column, as follows:
For (i=0; i<m; i++)
For (j=0; j< TS.Size(); j++)
if (tc.detected()==1)
TS_Detect[j][i]= 1
else
TS_Detect[j][i]= 0
TC_Sum=(2, TS.Size())
For (k=0; k<TS.Size(); k++)
TC_Sum(0, k)=k
For (l=0; l< m; l++)
Flag=TS_Detect[l][k]
If (flag == 1)
TC_Sum(1, k)= TC_Sum(1, k)+1
int temp
For (g=0; g<TC_Sum.length-1; g++)
For (b=1; b< TC_Sum.length-1; b++)
If (TC_Sum[b-1]< TC_Sum[b])
temp= TC_Sum[b-1]
TC_Sum[b-1]= TC_Sum[b]
TC_Sum[b]= temp
return TC_Sum
The problem now is that I couldn't sort the original array (TS_Detect) based on the column numbers from TC_Sum.
Consequently, I would like to re-sort the table so if a column has 0, the next one will be 1.
The expected output for the above example will look like:
{{1, 1, 0, 1, 1, 1, 1, 1},
{1, 1, 1, 1, 1, 1, 1, 0},
{1, 1, 0, 0, 0, 0, 1, 0},
{1, 1, 1, 1, 1, 1, 1, 0},
{0, 0, 0, 0, 0, 1, 0, 1},
{1, 1, 1, 0, 0, 0, 1, 0},
{1, 1, 1, 1, 1, 1, 1, 0},
{0, 0, 0, 1, 0, 1, 0, 1},
{1, 1, 1, 1, 1, 0, 0, 0},
{1, 1, 1, 1, 1, 1, 0, 1},
{1, 1, 1, 1, 1, 0, 0, 0},
{1, 1, 1, 1, 1, 0, 0, 0}}
Any suggestions, please?
I'm not sure what language you are using, but I think my answer is general enough.
I assume that you have a list of lists, let's call it A.
A = [ [0,1,0,0] , [1,0,1,1] , [0,0,0,0] ]
You've used your counting algorithm above to make another list, call it S for sum.
S = [ 1 , 3 , 0 ]
You now want to sort A based on the values of S.
To make things easy, let's define a third list that we'll call I for index.
I = [ 0 , 1 , 2 ]
The list I would continue with 3, 4, 5, 6, ... depending on the number of elements in your list.
What you need now is a sort function that allows you to sort based on a key. Such a sort function usually takes the thing you want to sort along with a function for comparing two items.
In this case, sort I. The sort function is then passed indices. Compare these indices based on the values in S. The result is a list I* containing indices sorted according to S. You can now reorder A based on I*.
I am not sure what language you are using, but the following Python code accomplishes this:
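from functools import cmp_to_key

def MyComparison(i,j):
    # Negative when index i should come first, i.e. when S[i] > S[j] (descending order).
    return S[j]-S[i]

A = [ [0,1,0,0] , [1,0,1,1], [0,0,0,0] ]
S = [ 1 , 3 , 0 ]
I = [ 0 , 1 , 2 ]
# Python 3 removed sorted(..., cmp=...); cmp_to_key adapts a comparison function.
Istar = sorted(I, key=cmp_to_key(MyComparison))
#The above returns: [1, 0, 2]. If this is the wrong order, reverse the result.
[A[x] for x in Istar]
#The above returns: [[1, 0, 1, 1], [0, 1, 0, 0], [0, 0, 0, 0]]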
Note that the comparison function returns a negative number, zero, or a positive number depending on the relative ranking of the items compared.
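The same index-sorting idea can be written with a key function instead of a comparison function. This is only a sketch: `order_by_sum` is a hypothetical helper name, and it assumes the rows of A are what you want to reorder and that their sums are the sort key.

```python
def order_by_sum(rows):
    """Return (order, reordered_rows), sorted by descending row sum.

    sorted() is stable, so rows with equal sums keep their original relative order.
    """
    sums = [sum(row) for row in rows]
    order = sorted(range(len(rows)), key=lambda i: sums[i], reverse=True)
    return order, [rows[i] for i in order]

A = [[0, 1, 0, 0], [1, 0, 1, 1], [0, 0, 0, 0]]
order, sorted_rows = order_by_sum(A)
print(order)        # [1, 0, 2]
print(sorted_rows)  # [[1, 0, 1, 1], [0, 1, 0, 0], [0, 0, 0, 0]]
```

Keeping the index list around is what lets you apply the same permutation to any other table that is aligned with A.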
I'm trying to write a Prolog program that receives a representation of an unsolved Hashi board and answers with all the possible solutions, using restrictions. I'm having a hard time figuring out the best (or a very good) way of representing the board with and without the bridges. The program is supposed to draw the boards so the solutions are easy to read.
board(
[[3, 0, 6, 0, 0, 0, 6, 0, 3],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[2, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 3, 0, 0, 2, 0, 0, 0],
[0, 3, 0, 0, 0, 0, 4, 0, 1]]
).
For example, this representation is only good without the bridges, since it holds no info about them. Drawing this board would basically mean turning the 0's into spaces, and the board would be drawn like this:
3   6       6   3

  1

2         1


1   3     2
  3         4   1
which is a decent representation of a real hashi board.
The point now is to be able to draw the same thing, but also draw bridges if there are any. I must be able to do so before I even think of writing the restrictions themselves, since going at it with a bad representation will make my job a lot more difficult.
I started thinking of solutions like this:
if every element of the board would be a list:
[NumberOfConnections, [ListOfConnections]]
but this gives me no info for the drawing, and what would the list of connections really have?
maybe this:
[Index, NumberOfConnections, [ListOfIndex]]
this way every "island" would have a unique ID and the list of connections would have ids
but drawing still sounds kinda hard, in the end the bridges can only be horizontal or vertical
Anyway, can anyone think of a better representation that makes it easiest to achieve the final goal of the program?
Nice puzzle, I agree. Here is a half-way solution in ECLiPSe, a Prolog dialect with constraints (http://eclipseclp.org).
The idea is to have, for every field of the board, four variables N, E, S, W (for North, East, etc) that can take values 0..2 and represent the number of connections on that edge of the field. For the node-fields, these connections must sum up to the given number. For the empty fields, the connections must go through (N=S, E=W) and not cross (N=S=0 or E=W=0).
Your example solves correctly:
?- hashi(stackoverflow).
3 = 6 = = = 6 = 3
|   X       X   |
| 1 X       X   |
| | X       X   |
2 | X     1 X   |
| | X     | X   |
| | X     | X   |
1 | 3 - - 2 X   |
  3 = = = = 4   1
but the wikipedia one doesn't, because there is no connectedness constraint yet!
:- lib(ic). % uses the integer constraint library
hashi(Name) :-
    board(Name, Board),
    dim(Board, [Imax,Jmax]),
    dim(NESW, [Imax,Jmax,4]),   % 4 variables N,E,S,W for each field
    ( foreachindex([I,J],Board), param(Board,NESW,Imax,Jmax) do
        Sum is Board[I,J],
        N is NESW[I,J,1],
        E is NESW[I,J,2],
        S is NESW[I,J,3],
        W is NESW[I,J,4],
        ( I > 1    -> N #= NESW[I-1,J,3] ; N = 0 ),
        ( I < Imax -> S #= NESW[I+1,J,1] ; S = 0 ),
        ( J > 1    -> W #= NESW[I,J-1,2] ; W = 0 ),
        ( J < Jmax -> E #= NESW[I,J+1,4] ; E = 0 ),
        ( Sum > 0 ->
            [N,E,S,W] #:: 0..2,
            N+E+S+W #= Sum
        ;
            N = S, E = W,
            (N #= 0) or (E #= 0)
        )
    ),
    % find a solution
    labeling(NESW),
    print_board(Board, NESW).
print_board(Board, NESW) :-
    ( foreachindex([I,J],Board), param(Board,NESW) do
        ( J > 1 -> true ; nl ),
        Sum is Board[I,J],
        ( Sum > 0 ->
            write(Sum)
        ;
            NS is NESW[I,J,1],
            EW is NESW[I,J,2],
            symbol(NS, EW, Char),
            write(Char)
        ),
        write(' ')
    ),
    nl.
symbol(0, 0, ' ').
symbol(0, 1, '-').
symbol(0, 2, '=').
symbol(1, 0, '|').
symbol(2, 0, 'X').
% Examples
board(stackoverflow,
[]([](3, 0, 6, 0, 0, 0, 6, 0, 3),
[](0, 0, 0, 0, 0, 0, 0, 0, 0),
[](0, 1, 0, 0, 0, 0, 0, 0, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0),
[](2, 0, 0, 0, 0, 1, 0, 0, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0),
[](1, 0, 3, 0, 0, 2, 0, 0, 0),
[](0, 3, 0, 0, 0, 0, 4, 0, 1))
).
board(wikipedia,
[]([](2, 0, 4, 0, 3, 0, 1, 0, 2, 0, 0, 1, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 1),
[](0, 0, 0, 0, 2, 0, 3, 0, 2, 0, 0, 0, 0),
[](2, 0, 3, 0, 0, 2, 0, 0, 0, 3, 0, 1, 0),
[](0, 0, 0, 0, 2, 0, 5, 0, 3, 0, 4, 0, 0),
[](1, 0, 5, 0, 0, 2, 0, 1, 0, 0, 0, 2, 0),
[](0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 4, 0, 2),
[](0, 0, 4, 0, 4, 0, 0, 3, 0, 0, 0, 3, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
[](2, 0, 2, 0, 3, 0, 0, 0, 3, 0, 2, 0, 3),
[](0, 0, 0, 0, 0, 2, 0, 4, 0, 4, 0, 3, 0),
[](0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0),
[](3, 0, 0, 0, 0, 3, 0, 1, 0, 2, 0, 0, 2))
).
For drawing bridges, you could use character 179 for single vertical bridges, 186 for double vertical bridges, 196 for single horizontal bridges, and 205 for double horizontal bridges. This depends on which extended ASCII set is in use, though. It works in the most common ones, such as code page 437.
For internal representation, I'd use -1 and -2 for single and double bridges in one direction, and -3 and -4 in the other. You could use just about any symbol that isn't 0-8, but this has the added benefit of simply adding the bridges to the island (converting (-3, -4) to (-1, -2)) to check the solution. If the sum is 0, that island is solved.
What a cool puzzle! I did a few myself, and I don't see an obvious way to make solving them deterministic, which is a nice property for a puzzle to have. Games like Tetris derive much of their ongoing play value from the fact that you don't get bored--even a good strategy can continually be refined. This has a practical ramification: if I were coding this, I would spend no further time trying to find a deterministic algorithm. I would instead focus on the generate/test paradigm Prolog excels at.
If you know you're going to do generate-and-test, you know already where all your effort at optimization is going to go: making your generator more intelligent (so it generates better candidates) and making your test fast. So I'm looking at your board representation and I'm asking myself: is it going to be easy and fast to generate alternatives from this? And we both know the answer is no, for several reasons:
Finding alternative islands to connect to from any particular island is going to be highly inefficient: searching a list forward and backward and then indexing all the other lists by the current offset. This is a huge amount of list finagling, which won't be cheap.
Detecting and preventing a bridge crossing is going to be interesting.
More to the point, the proper way to encode bridges is not obvious with this design. Islands can be separated by great distances--are you going to put a 0/1/2 in every connecting cell? If so, you have a data duplication problem; if not, you're going to have some fun calculating which location should hold the bridge count.
It's just an intuition, but having a heterogeneous data structure like this where the "kind" of element is determined entirely by whether the indices are odd or even, strikes me as unwelcome.
I think what you've got for the board layout is a great input format, but I don't think it's going to serve you well as an intermediate representation. The game is clearly a graph problem. This suggests one of the two classic graph data structures might be more helpful: the adjacency list, or the edge matrix. Either of these will expedite choosing alternatives for bridge layout, but it's not obvious to me (maybe to someone who does more graph theory) how one would prevent bridge crossings. Ideally, your data structure would simply prevent bridge crossings from occurring. Next best would be preventing the generator from generating candidate solutions with bridge crossings; worst would be to simply fail them at the test stage.
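To illustrate the adjacency-list idea (in Python rather than Prolog, purely as a sketch): the hypothetical helper below scans outward from each island in the four directions and records the first island it meets. The board is the one from the question.

```python
def island_neighbors(board):
    """Map each island (row, col) to the islands it could be bridged to."""
    rows, cols = len(board), len(board[0])
    adjacency = {}
    for r in range(rows):
        for c in range(cols):
            if board[r][c] == 0:
                continue  # empty cell, not an island
            found = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # up, down, left, right
                rr, cc = r + dr, c + dc
                # Skip over empty cells until an island or the edge is reached.
                while 0 <= rr < rows and 0 <= cc < cols and board[rr][cc] == 0:
                    rr, cc = rr + dr, cc + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    found.append((rr, cc))
            adjacency[(r, c)] = found
    return adjacency

BOARD = [
    [3, 0, 6, 0, 0, 0, 6, 0, 3],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [2, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 3, 0, 0, 2, 0, 0, 0],
    [0, 3, 0, 0, 0, 0, 4, 0, 1],
]

neighbors = island_neighbors(BOARD)
print(neighbors[(0, 0)])  # [(4, 0), (0, 2)]
print(neighbors[(2, 1)])  # [(8, 1)]
```

Each pair of mutually visible islands is a candidate edge that carries 0, 1 or 2 bridges. Two candidate edges can cross only if one is horizontal, the other vertical, and their cell spans intersect, which could be precomputed once per board.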
I'm trying to solve an algorithm problem involving chess.
Suppose I have a king in A8 and want to move it to H1 (only with allowed moves).
How could I find out how many possibilities (paths) there are when making exactly k moves, for any given k?
(e.g. How many paths/possibilities are there if I want to move the king from A8 to H1 in 15 moves?)
One trivial solution is to see it as a graph problem and use any standard path-finding algorithm, counting each move as having cost 1. So, let's say I want to move my king from A8 to H1 in 10 moves. I would simply search all paths which sum up to 10.
My question is, if there are other more clever and efficient ways of doing this?
I was also wondering, if there could be something more "mathematical" and straightforward to find this number and not so "algorithmic" and "brute-force-like"?
This is a straight-forward O(N^3) dynamic programming problem.
Simply assign a 3D array as follows:
Let Z[x][y][k] be the number of paths of exactly k steps that reach the destination from position (x,y) on the board.
The base cases are:
foreach x in 0 to 7,
foreach y in 0 to 7,
Z[x][y][0] = 0 // forall x,y: 0 ways to reach H1 from
// anywhere else with 0 steps
Z[7][7][0] = 1 // 1 way to reach H1 from H1 with 0 steps
The recursive case is:
foreach k in 0 to K-1,
    foreach x in 0 to 7,
        foreach y in 0 to 7,
            Z[x][y][k+1] = Z[x-1][y][k]
                         + Z[x+1][y][k]
                         + Z[x][y-1][k]
                         + Z[x][y+1][k]
                         + ...;  // plus the four diagonal neighbours; only include
                                 // positions in the summation that are on the board
                                 // and that a king can reach in one move
Your answer is then:
return Z[0][0][K]; // number of ways to reach H1(7,7) from A8(0,0) with K moves
(There is a faster way to do this in O(n^2) by decomposing the moves into two sets of horizontal and vertical moves and then combining these and multiplying by the number of interleavings.)
See this related question and answer: No of ways to walk M steps in a grid
You could use an adjacency matrix. If you multiply such a matrix with itself, you get the number of paths of length 2 from each point to each other point; in general, the k-th power gives the number of paths of length k. Example:
Graph: complete K3 graph : A<->B<->C<->A
Matrix:
[0 ; 1 ; 1]
[1 ; 0 ; 1]
[1 ; 1 ; 0]
Paths for length 2: M * M
[2 ; 1 ; 1]
[1 ; 2 ; 1]
[1 ; 1 ; 2]
Length 3 would then be M * M * M
[2 ; 3 ; 3]
[3 ; 2 ; 3]
[3 ; 3 ; 2]
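A small Python sketch of this idea, using plain lists and exponentiation by squaring. The helper names are made up for illustration, and the 64x64 king matrix assumes the squares are numbered row by row, with one corner as index 0 and the opposite corner as index 63.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of lists."""
    inner, width = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(width)]
            for row in a]

def mat_pow(m, k):
    """Raise a square matrix to the k-th power by repeated squaring."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    base = m
    while k > 0:
        if k & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        k >>= 1
    return result

def king_adjacency(size=8):
    """Adjacency matrix of king moves; square (r, c) has index r * size + c."""
    n = size * size
    adj = [[0] * n for _ in range(n)]
    for r in range(size):
        for c in range(size):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < size and 0 <= cc < size:
                        adj[r * size + c][rr * size + cc] = 1
    return adj

K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(mat_pow(K3, 3))  # [[2, 3, 3], [3, 2, 3], [3, 3, 2]]

# Corner-to-corner king paths with exactly 7 moves: only the straight diagonal.
print(mat_pow(king_adjacency(), 7)[0][63])  # 1
```

With repeated squaring, the count for k moves needs only about log2(k) matrix multiplications.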
.......E <-end
........
........
........
........
........
........
S....... <-start
Unfortunately you can't use "any standard path finding algorithm" because your paths might not be shortest-paths. You'd have to specifically use a naive search which considered all paths (depth-first or breadth-first, for example).
However, because you don't care how you got to a tile, you can use a technique called dynamic programming. For every location (i,j), the number of ways to get there in n moves (let's call it ways_{i,j}(n)) is:
ways_{i,j}(n) = ways_{i-1,j}(n-1) + ways_{i+1,j}(n-1) + ways_{i,j-1}(n-1) + ways_{i,j+1}(n-1) + ways_{i+1,j+1}(n-1) + ways_{i-1,j+1}(n-1) + ways_{i+1,j-1}(n-1) + ways_{i-1,j-1}(n-1)
That is, the king can move from any of the adjacent squares in 1 move:
ways_{i,j}(n) = \sum_{(i',j') \in neighbors(i,j)} ways_{i',j'}(n-1)
Thus you'd do, for example in python:
SIZE = 8
cache = {}
def ways(pos, n):
    r,c = pos # row,column
    if not (0<=r<SIZE and 0<=c<SIZE):
        # off edge of board: no ways to get here
        return 0
    elif n==0:
        # starting position: only one way to get here
        return 1 if (r,c)==(0,0) else 0
    else:
        args = (pos,n)
        if not args in cache:
            # sum over all eight squares the king could have come from
            cache[args] = ways((r-1,c), n-1) + ways((r+1,c), n-1) + ways((r,c-1), n-1) + ways((r,c+1), n-1) + ways((r-1,c-1), n-1) + ways((r+1,c-1), n-1) + ways((r-1,c+1), n-1) + ways((r+1,c+1), n-1)
        return cache[args]
Demo:
>>> ways((7,7), 15)
2269878170
The above technique is called memoization, and is simpler to write than dynamic programming, because you don't need to really think about the order in which you do things. You can see the cache grow as we perform a series of larger and larger queries:
>>> cache
{}
>>> ways((1,0), 1)
1
>>> cache
{((1, 0), 1): 1}
>>> ways((1,1), 2)
2
>>> cache
{((1, 0), 1): 1, ((0, 1), 1): 1, ((2, 1), 1): 0, ((1, 2), 1): 0, ((0, 0), 1): 0, ((2, 0), 1): 0, ((0, 2), 1): 0, ((2, 2), 1): 0, ((1, 1), 2): 2}
>>> ways((2,1), 3)
9
>>> cache
{((1, 0), 1): 1, ((0, 1), 1): 1, ((2, 1), 1): 0, ((1, 2), 1): 0, ((0, 0), 1): 0, ((2, 0), 1): 0, ((0, 2), 1): 0, ((2, 2), 1): 0, ((1, 1), 2): 2, ((4, 1), 1): 0, ((3, 0), 1): 0, ((3, 2), 1): 0, ((4, 0), 1): 0, ((4, 2), 1): 0, ((3, 1), 2): 0, ((1, 1), 1): 1, ((3, 1), 1): 0, ((2, 0), 2): 2, ((2, 3), 1): 0, ((1, 3), 1): 0, ((3, 3), 1): 0, ((2, 2), 2): 1, ((1, 0), 2): 2, ((3, 0), 2): 0, ((0, 3), 1): 0, ((1, 2), 2): 2, ((4, 3), 1): 0, ((3, 2), 2): 0, ((2, 1), 3): 9}
(In Python, you can also use a @cached or @memoized decorator, for example functools.lru_cache from the standard library, to avoid having to write the caching code in the last else: block by hand. Other languages have other ways to automatically perform memoization.)
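For instance, a decorator-based variant might look like the sketch below. It uses functools.lru_cache from the standard library; splitting the position into separate r and c arguments and the KING_STEPS name are just choices made for this illustration.

```python
from functools import lru_cache

SIZE = 8
# The eight king moves: every (dr, dc) in {-1, 0, 1}^2 except standing still.
KING_STEPS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

@lru_cache(maxsize=None)
def ways(r, c, n):
    """Number of n-move king paths from (0, 0) to (r, c) on a SIZE x SIZE board."""
    if not (0 <= r < SIZE and 0 <= c < SIZE):
        return 0  # off the board
    if n == 0:
        return 1 if (r, c) == (0, 0) else 0
    # The king arrives from any adjacent square, having used n-1 moves to get there.
    return sum(ways(r + dr, c + dc, n - 1) for dr, dc in KING_STEPS)

print(ways(7, 7, 7))  # 1: the straight diagonal is the only 7-move path
print(ways(2, 1, 3))  # 9
```

The decorator keys the cache on the argument tuple, so the arguments must be hashable, which is why plain integers are used here.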
The above was a top-down approach. It can sometimes produce very large stacks (your stack will grow with n). If you want to be super-efficient to avoid unnecessary work, you can do a bottom-up approach, where you simulate all positions the king could be, for 1 step, 2 steps, 3 steps, ...:
from pprint import pprint
SIZE = 8
def ways(n):
    grid = [[0 for col in range(SIZE)] for row in range(SIZE)]
    grid[0][0] = 1
    def inGrid(r,c):
        return all(0<=coord<SIZE for coord in (r,c))
    def adjacentSum(pos, grid):
        r,c = pos
        total = 0
        for neighbor in [(1,0),(1,1),(0,1),(-1,1),(-1,0),(-1,-1),(0,-1),(1,-1)]:
            delta_r,delta_c = neighbor
            (r2,c2) = (r+delta_r,c+delta_c)
            if inGrid(r2,c2):
                total += grid[r2][c2]
        return total
    for _ in range(n):
        # careful: grid must be replaced atomically, not element-by-element
        grid = [[adjacentSum((r,c), grid) for c in range(SIZE)] for r in range(SIZE)]
    pprint(grid)
    return grid
Demo:
>>> ways(0)
[[1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
>>> ways(1)
[[0, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
>>> ways(2)
[[3, 2, 2, 0, 0, 0, 0, 0],
[2, 2, 2, 0, 0, 0, 0, 0],
[2, 2, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
>>> ways(3)
[[6, 11, 6, 4, 0, 0, 0, 0],
[11, 16, 9, 5, 0, 0, 0, 0],
[6, 9, 6, 3, 0, 0, 0, 0],
[4, 5, 3, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
>>> ways(4)
[[38, 48, 45, 20, 9, 0, 0, 0],
[48, 64, 60, 28, 12, 0, 0, 0],
[45, 60, 51, 24, 9, 0, 0, 0],
[20, 28, 24, 12, 4, 0, 0, 0],
[9, 12, 9, 4, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
I have some problems with solving a puzzle. I haven't found a solution for this puzzle anywhere. I tried to write one in Prolog, but I think my solution won't be fast (I generate every candidate solution and discard those that aren't possible or correct).
This is my problem:
(I found the name of the puzzle; here is the link with all of its rules: http://en.wikipedia.org/wiki/Kuromasu).
Now I have a different question: which method would be quite easy to write and quite fast at solving it in Prolog? I thought about transforming my list of fields into an undirected graph, or maybe there is another method to search my list vertically (head after head)?
In:
0, 0, 0, 5, 0, 0, 0
0, 5, 0, 0, 0, 0, 2
0, 0, 0, 0, 7, 0, 4
0, 0, 0, 0, 0, 0, 0
8, 0, 13,0, 0, 0, 0
5, 0, 0, 0, 0, 6, 0
0, 0, 0, 8, 0, 0, 0
Result:
0, #, 0, 5, 0, 0, #
0, 5, 0, 0, 0, #, 2
0, #, 0, #, 7, 0, 4
#, 0, 0, 0, 0, 0, #
8, 0, 13,0, 0, 0, 0
5, 0, 0, 0, #, 6, 0
#, 0, 0, 8, 0, 0, #
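The test step of such a generate-and-test approach can be sketched as follows (in Python rather than Prolog, purely as an illustration; the function names are hypothetical). It walks outward from each numbered cell, which also covers the "vertical search" without transposing the lists. Only two Kuromasu rules are checked here: the visibility counts and the ban on orthogonally adjacent black cells. The rule that all white cells must stay connected is left out.

```python
BLACK = "#"

def visible_count(grid, r, c):
    """Count the white cells seen from (r, c) along its row and column, including itself."""
    count = 1
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        while 0 <= rr < len(grid) and 0 <= cc < len(grid[0]) and grid[rr][cc] != BLACK:
            count += 1
            rr, cc = rr + dr, cc + dc
    return count

def clues_satisfied(grid):
    """Every numbered cell must see exactly as many white cells as its number."""
    return all(visible_count(grid, r, c) == cell
               for r, row in enumerate(grid)
               for c, cell in enumerate(row)
               if cell != BLACK and cell > 0)

def no_adjacent_black(grid):
    """No two black cells may touch horizontally or vertically."""
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell != BLACK:
                continue
            if c + 1 < len(row) and row[c + 1] == BLACK:
                return False
            if r + 1 < len(grid) and grid[r + 1][c] == BLACK:
                return False
    return True

RESULT = [
    [0, "#", 0, 5, 0, 0, "#"],
    [0, 5, 0, 0, 0, "#", 2],
    [0, "#", 0, "#", 7, 0, 4],
    ["#", 0, 0, 0, 0, 0, "#"],
    [8, 0, 13, 0, 0, 0, 0],
    [5, 0, 0, 0, "#", 6, 0],
    ["#", 0, 0, 8, 0, 0, "#"],
]

print(clues_satisfied(RESULT), no_adjacent_black(RESULT))  # True True
```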
This type of puzzle is called Kuromasu. Here is a page that solves it with SWI-Prolog and finite domain constraints: http://jfoutelet.developpez.com/articles/kuromasu/