For example, in Huggingface's example:
encoded_input = tokenizer("Do not meddle in the affairs of wizards, for they are subtle and quick to anger.")
print(encoded_input)
{'input_ids': [101, 2079, 2025, 19960, 10362, 1999, 1996, 3821, 1997, 16657, 1010, 2005, 2027, 2024, 11259, 1998, 4248, 2000, 4963, 1012, 102],
'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
The input_ids vector already encodes the order of each token in the original sentence. Why does the model need positional encoding on top of that, with an extra vector to represent position?
The reason is the design of the neural architecture. BERT consists of self-attention and feedforward sub-layers, and neither of them is sequential.
The feedforward layers process each token independently of others.
The self-attention views the input states as an unordered set of states. Attention can be interpreted as soft probabilistic retrieval from a set of values according to some keys. The position embeddings are there so the keys can contain information about their relative order.
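The "unordered set" behaviour can be checked numerically. The following is a minimal sketch, not BERT itself: it assumes a single attention head with identity projections (queries, keys and values are the input states), and the toy embeddings and position vectors are made up for illustration. Without position vectors, shuffling the tokens only shuffles the outputs; once position vectors are added, the outputs change.

```python
import math

def attend(states):
    """Single-head dot-product self-attention with Q = K = V = states (an assumption)."""
    dim = len(states[0])
    outputs = []
    for query in states:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in states]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(weights)
        outputs.append([sum(w / total * value[d] for w, value in zip(weights, states))
                        for d in range(dim)])
    return outputs

def add(xs, ys):
    """Element-wise sum of two lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]

def same_up_to_perm(original, shuffled, perm):
    """True if shuffled[j] equals original[perm[j]] for every j (up to rounding)."""
    return all(math.isclose(a, b)
               for j, p in enumerate(perm)
               for a, b in zip(shuffled[j], original[p]))

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # hypothetical token embeddings
positions = [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5]]   # hypothetical position embeddings
perm = [2, 0, 1]
shuffled = [tokens[p] for p in perm]

plain = attend(tokens)
plain_shuffled = attend(shuffled)
with_pos = attend(add(tokens, positions))
with_pos_shuffled = attend(add(shuffled, positions))

print(same_up_to_perm(plain, plain_shuffled, perm))        # True: order is invisible
print(same_up_to_perm(with_pos, with_pos_shuffled, perm))  # False: order now matters
```

The order of input_ids only determines which row of the input matrix a token lands in; attention itself never looks at the row index, which is why the position has to be added to the token's vector.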
I have a list of 23 utilities that can be in a state of either enabled or disabled. I've ordered them from 0-22.
Some of these utilities are dependent on others, meaning they cannot be enabled without one or multiple dependency utilities first being enabled. I've put the indices of each utility's dependencies in a list for each utility; for example, if utilities 0-1 had no dependencies, but utility 2 had dependencies on utilities 0 and 9, the full dependency list would look something like:
[ [], [], [0, 9], ... ]
What I want to do is devise an algorithm (pseudocode is fine, implementation does not matter) for generating a list of all possible 23-bit bitvectors---each bit in each bitvector with an index that we could label 0-22 corresponding to a single utility, each bitvector itself representing a possible combination of the status of all 23 utilities---that ignores combinations where the dependency requirements provided by the dependency list (described above) would not be satisfied. For example (assume right-to-left numbering):
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0 ],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ],
//skip[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 ] this would not be included (2 enabled, but 0 and/or 9 are not. See prev. example of dependency list)
...
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
]
First step, get rid of all circular dependencies. If A depends on B, which depends on C, which depends on A, all three will be on or off together. So we can transfer all of their dependencies to A, then fill in B and C at the last step. This amounts to identifying the strongly connected components of the dependency graph, which Kosaraju's algorithm does efficiently.
Second step, do a topological sort of the remaining utilities by dependency. This puts them into a list where each utility only depends on ones that appear before it.
And now we can recurse down that list. The first utility can be 0 or 1. Each subsequent utility must be 0 if any of its dependencies is 0; otherwise it can be 0 or 1. Finally, for the utilities eliminated in the first step because they were part of a circular dependency, fill them in with the value of the utility that was kept.
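The second and third steps can be sketched as follows. This is a minimal illustration, assuming the circular dependencies have already been collapsed so that the dependency graph has no cycles; the function name `valid_states` and the three-utility dependency list are made up for the example.

```python
def valid_states(deps):
    """Yield every on/off assignment (tuple of 0/1, index = utility) allowed by deps.

    deps[u] lists the utilities that must be enabled before u can be enabled.
    Assumes the dependency graph is acyclic.
    """
    n = len(deps)

    # Depth-first topological sort: dependencies are appended before dependents.
    order, seen = [], set()

    def visit(u):
        if u in seen:
            return
        seen.add(u)
        for d in deps[u]:
            visit(d)
        order.append(u)

    for u in range(n):
        visit(u)

    state = [0] * n

    def extend(pos):
        if pos == n:
            yield tuple(state)
            return
        u = order[pos]
        state[u] = 0                          # disabling is always allowed
        yield from extend(pos + 1)
        if all(state[d] for d in deps[u]):    # enabling needs every dependency on
            state[u] = 1
            yield from extend(pos + 1)
            state[u] = 0

    yield from extend(0)

# Hypothetical example: utility 2 depends on utilities 0 and 1.
example = [[], [], [0, 1]]
for bits in valid_states(example):
    print(bits)
```

Because it is a generator, the 23-utility case can be consumed one bitvector at a time instead of holding up to 2^23 lists in memory.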
var sudoku = map[int][9]int{ // map of 9 x 9
0: {3, 0, 6, 5, 0, 8, 4, 0, 0},
1: {5, 2, 0, 0, 0, 0, 0, 0, 0},
2: {0, 8, 7, 0, 0, 0, 0, 3, 1},
3: {0, 0, 3, 0, 1, 0, 0, 8, 0},
4: {9, 0, 0, 8, 6, 3, 0, 0, 5},
5: {0, 5, 0, 0, 9, 0, 6, 0, 0},
6: {1, 3, 0, 0, 0, 0, 2, 5, 0},
7: {0, 0, 0, 0, 0, 0, 0, 7, 4},
8: {0, 0, 5, 2, 0, 6, 3, 0, 0},
}
fmt.Println("Old value:", sudoku[1][0])
sudoku[1][0] = append(sudoku[1][0],10)
fmt.Println("New value:", sudoku[1][0])
The value is not changed, and I get the error message: cannot assign to sudoku[1][0]
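sudoku[1][0] is an int, and you can't append to an int, only to slices. Also, [9]int is an array, not a slice.
You most likely want to change an element, not append to it, so use a simple assignment. Note that Go does not allow assigning to an element of an array stored in a map (this is the "cannot assign" error), so either declare the map as map[int][]int so that its values are slices, or copy the row out (row := sudoku[1]), change row[0], and store it back (sudoku[1] = row). With slice values:
sudoku[1][0] = 10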
This will output (try it on the Go Playground):
Old value: 5
New value: 10
Strongly recommended to take the Go Tour if you're not clear with the basics.
I'm trying to write a Prolog program that receives a representation of an unsolved Hashi board and answers with all the possible solutions, using constraints. I'm having a hard time figuring out the best (or at least a very good) way of representing the board, both with and without the bridges. The program is supposed to draw the boards so the solutions are easy to read.
board(
[[3, 0, 6, 0, 0, 0, 6, 0, 3],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[2, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 3, 0, 0, 2, 0, 0, 0],
[0, 3, 0, 0, 0, 0, 4, 0, 1]]
).
For example, this representation is only good without the bridges, since it holds no information about them. Drawing this board would basically mean turning the 0's into spaces, and the board would be drawn like this:
3   6       6   3

  1

2         1


1   3     2
  3         4   1
which is a decent representation of a real hashi board.
The point now is to be able to draw the same thing, but also draw the bridges if there are any. I must be able to do this before I even think about writing the constraints themselves, since starting out with a bad representation will make my job a lot more difficult.
I started thinking of solutions like this:
if every element of the board would be a list:
[NumberOfConnections, [ListOfConnections]]
but this gives me no info for the drawing, and what would the list of connections really have?
maybe this:
[Index, NumberOfConnections, [ListOfIndex]]
this way every "island" would have a unique ID and the list of connections would have ids
but drawing still sounds kinda hard, in the end the bridges can only be horizontal or vertical
Anyway, can anyone think of a better representation that makes it easiest to achieve the final goal of the program?
Nice puzzle, I agree. Here is a half-way solution in ECLiPSe, a Prolog dialect with constraints (http://eclipseclp.org).
The idea is to have, for every field of the board, four variables N, E, S, W (for North, East, etc) that can take values 0..2 and represent the number of connections on that edge of the field. For the node-fields, these connections must sum up to the given number. For the empty fields, the connections must go through (N=S, E=W) and not cross (N=S=0 or E=W=0).
Your example solves correctly:
?- hashi(stackoverflow).
3 = 6 = = = 6 = 3
|   X       X   |
| 1 X       X   |
| | X       X   |
2 | X     1 X   |
| | X     | X   |
| | X     | X   |
1 | 3 - - 2 X   |
  3 = = = = 4   1
but the Wikipedia one doesn't, because there is no connectedness constraint yet!
:- lib(ic). % uses the integer constraint library

hashi(Name) :-
    board(Name, Board),
    dim(Board, [Imax,Jmax]),
    dim(NESW, [Imax,Jmax,4]), % 4 variables N,E,S,W for each field
    ( foreachindex([I,J],Board), param(Board,NESW,Imax,Jmax) do
        Sum is Board[I,J],
        N is NESW[I,J,1],
        E is NESW[I,J,2],
        S is NESW[I,J,3],
        W is NESW[I,J,4],
        ( I > 1 -> N #= NESW[I-1,J,3] ; N = 0 ),
        ( I < Imax -> S #= NESW[I+1,J,1] ; S = 0 ),
        ( J > 1 -> W #= NESW[I,J-1,2] ; W = 0 ),
        ( J < Jmax -> E #= NESW[I,J+1,4] ; E = 0 ),
        ( Sum > 0 ->
            [N,E,S,W] #:: 0..2,
            N+E+S+W #= Sum
        ;
            N = S, E = W,
            (N #= 0) or (E #= 0)
        )
    ),
    % find a solution
    labeling(NESW),
    print_board(Board, NESW).

print_board(Board, NESW) :-
    ( foreachindex([I,J],Board), param(Board,NESW) do
        ( J > 1 -> true ; nl ),
        Sum is Board[I,J],
        ( Sum > 0 ->
            write(Sum)
        ;
            NS is NESW[I,J,1],
            EW is NESW[I,J,2],
            symbol(NS, EW, Char),
            write(Char)
        ),
        write(' ')
    ),
    nl.

symbol(0, 0, ' ').
symbol(0, 1, '-').
symbol(0, 2, '=').
symbol(1, 0, '|').
symbol(2, 0, 'X').

% Examples
board(stackoverflow,
[]([](3, 0, 6, 0, 0, 0, 6, 0, 3),
[](0, 0, 0, 0, 0, 0, 0, 0, 0),
[](0, 1, 0, 0, 0, 0, 0, 0, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0),
[](2, 0, 0, 0, 0, 1, 0, 0, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0),
[](1, 0, 3, 0, 0, 2, 0, 0, 0),
[](0, 3, 0, 0, 0, 0, 4, 0, 1))
).
board(wikipedia,
[]([](2, 0, 4, 0, 3, 0, 1, 0, 2, 0, 0, 1, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 1),
[](0, 0, 0, 0, 2, 0, 3, 0, 2, 0, 0, 0, 0),
[](2, 0, 3, 0, 0, 2, 0, 0, 0, 3, 0, 1, 0),
[](0, 0, 0, 0, 2, 0, 5, 0, 3, 0, 4, 0, 0),
[](1, 0, 5, 0, 0, 2, 0, 1, 0, 0, 0, 2, 0),
[](0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 4, 0, 2),
[](0, 0, 4, 0, 4, 0, 0, 3, 0, 0, 0, 3, 0),
[](0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
[](2, 0, 2, 0, 3, 0, 0, 0, 3, 0, 2, 0, 3),
[](0, 0, 0, 0, 0, 2, 0, 4, 0, 4, 0, 3, 0),
[](0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0),
[](3, 0, 0, 0, 0, 3, 0, 1, 0, 2, 0, 0, 2))
).
For drawing bridges, you could use ASCII 179 for single vertical bridges, 186 for double vertical bridges, 196 for single horizontal bridges, and 205 for double horizontal bridges. This depends on which extended ASCII set is in use, though. These codes work in the most common one (IBM code page 437, where they are box-drawing characters).
For internal representation, I'd use -1 and -2 for single and double bridges in one direction, and -3 and -4 in the other. You could use just about any symbol that isn't 0-8, but this has the added benefit of simply adding the bridges to the island (converting (-3, -4) to (-1, -2)) to check the solution. If the sum is 0, that island is solved.
What a cool puzzle! I did a few myself, and I don't see an obvious way to make solving them deterministic, which is a nice property for a puzzle to have. Games like Tetris derive much of their ongoing play value from the fact that you don't get bored--even a good strategy can continually be refined. This has a practical ramification: if I were coding this, I would spend no further time trying to find a deterministic algorithm. I would instead focus on the generate/test paradigm Prolog excels at.
If you know you're going to do generate-and-test, you know already where all your effort at optimization is going to go: making your generator more intelligent (so it generates better candidates) and making your test fast. So I'm looking at your board representation and I'm asking myself: is it going to be easy and fast to generate alternatives from this? And we both know the answer is no, for several reasons:
Finding alternative islands to connect to from any particular island is going to be highly inefficient: searching a list forward and backward and then indexing all the other lists by the current offset. This is a huge amount of list finagling, which won't be cheap.
Detecting and preventing a bridge crossing is going to be interesting.
More to the point, the proper way to encode bridges is not obvious with this design. Islands can be separated by great distances--are you going to put a 0/1/2 in every connecting cell? If so, you have a data duplication problem; if not, you're going to have some fun calculating which location should hold the bridge count.
It's just an intuition, but having a heterogeneous data structure like this where the "kind" of element is determined entirely by whether the indices are odd or even, strikes me as unwelcome.
I think what you've got for the board layout is a great input format, but I don't think it's going to serve you well as an intermediate representation. The game is clearly a graph problem. This suggests one of the two classic graph data structures might be more helpful: the adjacency list, or the edge matrix. Either of these will expedite choosing alternatives for bridge layout, but it's not obvious to me (maybe to someone who does more graph theory) how one would prevent bridge crossings. Ideally, your data structure would simply prevent bridge crossings from occurring. Next best would be preventing the generator from generating candidate solutions with bridge crossings; worst would be to simply fail them at the test stage.
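To make the graph view concrete, here is a minimal sketch (in Python rather than Prolog, purely for illustration) that turns the input grid into an edge list and tests two candidate bridges for crossing. The function names are made up; the board is the one from the question. Each island is linked to its nearest island to the east and to the south, so every edge is stored west-to-east or north-to-south.

```python
BOARD = [
    [3, 0, 6, 0, 0, 0, 6, 0, 3],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [2, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 3, 0, 0, 2, 0, 0, 0],
    [0, 3, 0, 0, 0, 0, 4, 0, 1],
]

def islands(board):
    """Coordinates (row, col) of every numbered cell."""
    return [(r, c) for r, row in enumerate(board) for c, v in enumerate(row) if v > 0]

def candidate_bridges(board):
    """Edges from each island to its nearest island to the east and to the south."""
    rows, cols = len(board), len(board[0])
    edges = []
    for r, c in islands(board):
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            while rr < rows and cc < cols and board[rr][cc] == 0:
                rr, cc = rr + dr, cc + dc
            if rr < rows and cc < cols:
                edges.append(((r, c), (rr, cc)))
    return edges

def crosses(e1, e2):
    """True if one edge is horizontal, the other vertical, and they intersect."""
    for h, v in ((e1, e2), (e2, e1)):
        (hr1, hc1), (hr2, hc2) = h
        (vr1, vc1), (vr2, vc2) = v
        if hr1 == hr2 and vc1 == vc2 and hc1 < vc1 < hc2 and vr1 < hr1 < vr2:
            return True
    return False

edges = candidate_bridges(BOARD)
print(len(edges))                                           # 15 candidate bridges
print(crosses(((4, 0), (4, 5)), ((2, 1), (8, 1))))          # True
```

With an edge list like this, a generator only has to pick a bridge count of 0, 1 or 2 per edge, and the crossing pairs can be computed once up front and excluded before any candidate is tested.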
I'm trying to solve an algorithm problem involving chess.
Suppose I have a king in A8 and want to move it to H1 (only with allowed moves).
How could I find out how many possibilities (paths) there are using exactly k moves, for any given k?
(e.g. How many paths/possibilities are there if I want to move the king from A8 to H1 in 15 moves?)
One trivial solution is to see it as a graph problem and use any standard
path finding algorithm counting each move as having cost 1. So, let's say I want to move my king from A8 to H1 in 10 moves. I would simply search all paths which sum up to 10.
My question is, if there are other more clever and efficient ways of doing this?
I was also wondering, if there could be something more "mathematical" and straightforward to find this number and not so "algorithmic" and "brute-force-like"?
This is a straight-forward O(N^3) dynamic programming problem.
Simply assign a 3D array as follows:
Let Z[x][y][k] be the number of k-step paths that reach the destination from position (x,y) on the board.
The base cases are:
foreach x in 0 to 7,
    foreach y in 0 to 7,
        Z[x][y][0] = 0 // forall x,y: 0 ways to reach H1 from
                       // anywhere else with 0 steps
Z[7][7][0] = 1 // 1 way to reach H1 from H1 with 0 steps
The recursive case is:
foreach k in 1 to K,
    foreach x in 0 to 7,
        foreach y in 0 to 7,
            Z[x][y][k] = Z[x-1][y][k-1]
                       + Z[x+1][y][k-1]
                       + Z[x][y-1][k-1]
                       + Z[x][y+1][k-1]
                       + ...; // only include positions in
                              // the summation that are on the board
                              // and that a king can make
Your answer is then:
return Z[0][0][K]; // number of ways to reach H1(7,7) from A8(0,0) with K moves
(There is a faster way to do this in O(n^2) by decomposing the moves into two sets of horizontal and vertical moves and then combining these and multiplying by the number of interleavings.)
See this related question and answer: No of ways to walk M steps in a grid
You could use an adjacency matrix. Multiplying the matrix by itself gives the number of paths of length 2 between every pair of points, and in general the k-th power gives the number of paths of length k. Example:
Graph: complete K3 graph : A<->B<->C<->A
Matrix:
[0 ; 1 ; 1]
[1 ; 0 ; 1]
[1 ; 1 ; 0]
Paths for length 2: M * M
[2 ; 1 ; 1]
[1 ; 2 ; 1]
[1 ; 1 ; 2]
Length 3 would then be M * M * M
[2 ; 3 ; 3]
[3 ; 2 ; 3]
[3 ; 3 ; 2]
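Applied to the chess question, the same idea can be sketched as follows. This is a minimal pure-Python illustration, assuming a 64x64 adjacency matrix in which two squares are adjacent when a king can step between them; the helper names are made up. Repeated squaring keeps the number of matrix multiplications logarithmic in the number of moves.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, exponent):
    """m raised to a non-negative integer power, by repeated squaring."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    while exponent:
        if exponent & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        exponent >>= 1
    return result

def king_adjacency(size=8):
    """Adjacency matrix of king moves on a size x size board, plus a cell -> index map."""
    cells = [(r, c) for r in range(size) for c in range(size)]
    index = {cell: i for i, cell in enumerate(cells)}
    adj = [[0] * len(cells) for _ in cells]
    for (r, c), i in index.items():
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                target = (r + dr, c + dc)
                if (dr, dc) != (0, 0) and target in index:
                    adj[i][index[target]] = 1
    return adj, index

k3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(mat_pow(k3, 3))  # [[2, 3, 3], [3, 2, 3], [3, 3, 2]]

adj, index = king_adjacency()
paths = mat_pow(adj, 15)[index[(0, 0)]][index[(7, 7)]]
print(paths)  # number of 15-move king paths from corner (0,0) to corner (7,7)
```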
.......E <-end
........
........
........
........
........
........
S....... <-start
Unfortunately you can't use "any standard path finding algorithm" because your paths might not be shortest-paths. You'd have to specifically use a naive search which considered all paths (depth-first or breadth-first, for example).
However, because you don't care how you got to a tile, you can use a technique called dynamic programming. For every location (i,j), the number of ways to get there in n moves (let's call it ways_{i,j}(n)) is:
ways_{i,j}(n) = ways_{i-1,j}(n-1) + ways_{i+1,j}(n-1) + ways_{i,j-1}(n-1) + ways_{i,j+1}(n-1) + ways_{i+1,j+1}(n-1) + ways_{i-1,j+1}(n-1) + ways_{i+1,j-1}(n-1) + ways_{i-1,j-1}(n-1)
That is, the king can move from any of the adjacent squares in 1 move:
ways_{i,j}(n) = \sum_{(a,b) \in neighbors(i,j)} ways_{a,b}(n-1)
Thus you'd do, for example in python:
SIZE = 8
cache = {}
def ways(pos, n):
    r,c = pos # row,column
    if not (0<=r<SIZE and 0<=c<SIZE):
        # off edge of board: no ways to get here
        return 0
    elif n==0:
        # starting position: only one way to get here
        return 1 if (r,c)==(0,0) else 0
    else:
        args = (pos,n)
        if args not in cache:
            # sum over all eight king moves: four orthogonal, then four diagonal
            cache[args] = (ways((r-1,c), n-1) + ways((r+1,c), n-1)
                           + ways((r,c-1), n-1) + ways((r,c+1), n-1)
                           + ways((r-1,c-1), n-1) + ways((r-1,c+1), n-1)
                           + ways((r+1,c-1), n-1) + ways((r+1,c+1), n-1))
        return cache[args]
Demo:
>>> ways((7,7), 15)
2269878170
The above technique is called memoization, and is simpler to write than dynamic programming, because you don't need to really think about the order in which you do things. Starting again from an empty cache, you can see it grow as we perform a series of larger and larger queries:
>>> cache.clear()
>>> cache
{}
>>> ways((1,0), 1)
1
>>> cache
{((1, 0), 1): 1}
>>> ways((1,1), 2)
2
>>> cache
{((1, 0), 1): 1, ((0, 1), 1): 1, ((2, 1), 1): 0, ((1, 2), 1): 0, ((0, 0), 1): 0, ((0, 2), 1): 0, ((2, 0), 1): 0, ((2, 2), 1): 0, ((1, 1), 2): 2}
>>> ways((2,1), 3)
9
>>> cache
{((1, 0), 1): 1, ((0, 1), 1): 1, ((2, 1), 1): 0, ((1, 2), 1): 0, ((0, 0), 1): 0, ((0, 2), 1): 0, ((2, 0), 1): 0, ((2, 2), 1): 0, ((1, 1), 2): 2, ((4, 1), 1): 0, ((3, 0), 1): 0, ((3, 2), 1): 0, ((4, 0), 1): 0, ((4, 2), 1): 0, ((3, 1), 2): 0, ((1, 1), 1): 1, ((3, 1), 1): 0, ((2, 0), 2): 2, ((2, 3), 1): 0, ((1, 3), 1): 0, ((3, 3), 1): 0, ((2, 2), 2): 1, ((1, 0), 2): 2, ((0, 3), 1): 0, ((1, 2), 2): 2, ((3, 0), 2): 0, ((4, 3), 1): 0, ((3, 2), 2): 0, ((2, 1), 3): 9}
(In Python, you can also use a @cached or @memoized decorator, for example functools.lru_cache from the standard library, to avoid having to write the caching code in the last else: block yourself. Other languages have other ways to automatically perform memoization.)
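As a minimal sketch of the decorator approach, here is the same recursion with the standard library's functools.lru_cache doing the bookkeeping. The name ways_cached and the flattened (r, c, n) signature are choices made for this example; the start square is still assumed to be (0,0).

```python
from functools import lru_cache

SIZE = 8
# The eight king moves as (row delta, column delta) pairs.
KING_MOVES = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

@lru_cache(maxsize=None)
def ways_cached(r, c, n):
    """Number of n-move king paths from (0,0) to (r,c) on a SIZE x SIZE board."""
    if not (0 <= r < SIZE and 0 <= c < SIZE):
        return 0                       # off the board
    if n == 0:
        return 1 if (r, c) == (0, 0) else 0
    return sum(ways_cached(r + dr, c + dc, n - 1) for dr, dc in KING_MOVES)

print(ways_cached(1, 1, 2))  # 2
print(ways_cached(2, 1, 3))  # 9
```

ways_cached.cache_info() reports hits and misses, which plays the role of inspecting the hand-written cache dictionary.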
The above was a top-down approach. It can sometimes produce very large stacks (your stack will grow with n). If you want to be super-efficient to avoid unnecessary work, you can do a bottom-up approach, where you simulate all positions the king could be, for 1 step, 2 steps, 3 steps, ...:
from pprint import pprint

SIZE = 8
def ways(n):
    grid = [[0 for col in range(SIZE)] for row in range(SIZE)]
    grid[0][0] = 1
    def inGrid(r,c):
        return all(0<=coord<SIZE for coord in (r,c))
    def adjacentSum(pos, grid):
        r,c = pos
        total = 0
        for neighbor in [(1,0),(1,1),(0,1),(-1,1),(-1,0),(-1,-1),(0,-1),(1,-1)]:
            delta_r,delta_c = neighbor
            (r2,c2) = (r+delta_r,c+delta_c)
            if inGrid(r2,c2):
                total += grid[r2][c2]
        return total
    for _ in range(n):
        # build the new grid row by row, so that grid[r][c] is row r, column c
        grid = [[adjacentSum((r,c), grid) for c in range(SIZE)] for r in range(SIZE)]
        # careful: grid must be replaced atomically, not element-by-element
    pprint(grid)
    return grid
Demo:
>>> ways(0)
[[1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
>>> ways(1)
[[0, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
>>> ways(2)
[[3, 2, 2, 0, 0, 0, 0, 0],
[2, 2, 2, 0, 0, 0, 0, 0],
[2, 2, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
>>> ways(3)
[[6, 11, 6, 4, 0, 0, 0, 0],
[11, 16, 9, 5, 0, 0, 0, 0],
[6, 9, 6, 3, 0, 0, 0, 0],
[4, 5, 3, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]
>>> ways(4)
[[38, 48, 45, 20, 9, 0, 0, 0],
[48, 64, 60, 28, 12, 0, 0, 0],
[45, 60, 51, 24, 9, 0, 0, 0],
[20, 28, 24, 12, 4, 0, 0, 0],
[9, 12, 9, 4, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]]