How to find boundary point using bfs algorithm - algorithm

I think of the 2D array as a coordinate and try to find a coordinate value with a value of 1.
So far, it's a very easy BFS problem, but what I want to do is look at the following picture.
While I'm looking for 1 or after I've found it all, I would like to know the coordinate values surrounding the boundary in the order of the arrow or the other direction.
What options do I need to add to get those information?
Below is the BFS code that I use now. I can get coordinate values from the BFS function as shown in the second picture.
class Node
{
public int x;
public int y;
public Node(int x, int y)
{
this.x = x;
this.y = y;
}
};
private int[] dx = new int[8] { -1, 0, 1, 0, 1, -1, -1, 1 };
private int[] dy = new int[8] { 0, -1, 0, 1, 1, -1, 1, -1 };
private Queue<Node> q = new Queue<Node>();
bool[,] visit = new bool[15, 15];
int[,] coordinates = new int[15, 15] { { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0 },
{ 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0 },
{ 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0 },
{ 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0 },
{ 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }};
void BFS(int[,] pixel, int x, int y)
{
q.Enqueue(new Node(x, y));
visit[x, y] = true;
while (q.Count != 0)
{
Node cur = q.Dequeue();
for (int i = 0; i < 8; i++)
{
int r = cur.x + dx[i];
int c = cur.y + dy[i];
if (r >= 0 && c >= 0 && r < 15 && c < 15)
{
if (!visit[r, c] && pixel[r, c] == 1)
{
q.Enqueue(new Node(r, c));
visit[r, c] = true;
}
}
}
}
}
void main()
{
for (int y = 0; y < 15; y++)
{
for (int x = 0; x < 15; x++)
{
if (!visit[x, y] && coordinates[x, y] == 1)
{
BFS(coordinates, x, y);
}
}
}
}

we do not need BFS for finding boundary '1' values. We can simply loop over 2D grid and then for each '1', we can just check whether all of it's 4 adjacent (i.e up, down, left, right) values are '1' or not. If at least one of them is not '1', then it is a boundary point. Thanks!

find a coordinate value with a value of 1
Start by pre-processing the matrix
- Search all 1 values (this can also be done recursively)
- If 1 value does not have a 0 neighbor, it means it it not on edge - change it to 0.
After the pre-processing you are left with only the edge 1 values, and all others are 0.
I would like to know the coordinate values surrounding the boundary in
the order of the arrow or the other direction
To find out if the edge forms a closed loop, and get the nodes in the right order
apply BFS to the pre-processed matrix.
Seek a path from a node of your choice, back to the same node(a loop).

Related

Rust performance issue - High complexity code

I bought RUST by its performance so I decided to translate one project where performance matters a lot, from JAVA 11 to Rust.
The thing is the Version written in JAVA performance pretty much better more than 3x in single thread, +10X in multi thread
For context purpose: The most complex code is a function that trys to find an assigment between 2 sets, imagine that you have houses and stores, the stores have a fixed capacity and houses have necesities, you want to find the best assignment to walk less.
With all this in mind, I guess that the problem is how I use the variables, maybe clone() is called too much automatically, maybe reference access cause some unknown behavior.
Any upgrade that reduce while loop time will be great because it iterate over 5000 times. Sorry for the long code but I think everything is relevant in this case. You can't copy and paste this code if you want I can send you the git project link.
PD: I'm running with cargo run --release
pub fn evaluate(elem: &Element) -> EvaluatedElement {
let p1 = properties::get_cast::<f64>("p1");
let p2 = properties::get_cast::<usize>("p2");
let p3 = properties::get_cast::<usize>("p3");
let p4 = properties::get_cast::<f64>("p4");
let p5 = properties::get_array::<usize>("p5");
let mut kinds1 = kind1::get_map(); //almost 300 elements
let kinds2 = = kind2::get_map(); //almost 300 elements
let usables = elem.usables();
for (i, &a) in usables.iter().enumerate() {
if !a {
&kinds1.remove(&(i + 1));
}
}
let mut assignations = HashMap::new();
for k in (1..=p2).rev() {
let mut kinds2_sub = HashMap::with_capacity((&kinds2).len());
for (_, p) in kinds2.iter() {
if p.val1[k - 1] == 0 {
continue;
}
&kinds2_sub.insert(p.id, Kind2Sub {
parent: p.clone(),
val2: p.val1[k - 1],
val3: std::f64::MAX,
kind1_id: std::usize::MAX,
});
}
let mut opt_kind1_id: Option<usize> = Option::None;
while !&kinds2_sub.is_empty() {//arround 5500 times loop
for mut l in kinds2_sub.values_mut() {
match opt_kind1_id {
None => (),
Some(id) => if !l.kind1_id == id { continue; },
}
l.val3 = std::f64::MAX;
l.kind1_id = std::usize::MAX;
for b in kinds1.values_mut() {
let dist_b_l = calc_dist(b.id, l.id);
if dist_b_l > p4
|| (p1 as usize).min(l.val2) > p4 + b.val3
|| b.val2 < k
|| (l.val2 < (2 * p4) && (b.val3 as i16 - l.val2 as i16) < 0)
{ continue; }
let tmp = dist_b_l * p1.min(l.val2 as f64);
if l.val3 > tmp {
l.val3 = tmp;
l.kind1_id = b.id;
}
}
}
let lc = kinds2_sub.values_mut().min_by(|x, y| x.val3.partial_cmp(&y.val3).unwrap()).unwrap();
let obc = kinds1.get_mut(&lc.kind1_id);
let bc = obc.unwrap_or_else(|| {
panic!("No assignation able")
});
let b_c_id = (*bc).id;
let l_c_id = (*lc).id;
let time = if lc.val2 < (2usize * p1 as usize) { lc.val2 } else { p1 as usize };
let val = (*bc).val3 as i16 - time as i16;
let assignation = Assignation { kind1_id: (*bc).id, kind2_id: lc.id, val3: k, val4: 0 };
let assignation_id = assignation.id();//id() = fn concatenate first 3 values
if !assignations.contains_key(&assignation_id) {
assignations.insert(assignation.id(), assignation);
}
let mut assignation = assignations.get_mut(&assignation_id).unwrap_or_else(|| panic!("Assignation not found {}", assignation_id));
if val >= 0 {
assignation.val4 += time;
lc.val2 -= time;
(*bc).val3 -= time;
} else {
assignation.val4 += (*bc).val3;
lc.val2 -= (*bc).val3;
(*bc).val3 = 0;
}
if (*bc).val3 < p4 {
&kinds1.remove(&b_c_id);
}
if lc.val2 == 0 {
&kinds2_sub.remove(&l_c_id);
}
opt_kind1_id = Some(b_c_id);
}
}
let assignations_values = assignations.iter().map(|(_, v)| v.clone()).collect();
EvaluatedElement::evaluation(assignations_values)
}
Now I have a 4X increase.
Step Value Time Used Stores
RUST -> BI 90 2672540 28057 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
Java -> BI 90 2672625 4704 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
FIX: -> BI 90 2672540 1093 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
The "bug"
match opt_kind1_id {
None => (),
Some(id) => if !l.kind1_id == id { continue; },
}
The fix
if let Some(id) = opt_kind1_id {
if l.kind1_id != id {
continue;
}
}
This continue skips 90% of finding a new value

How can I create a random array between a range of two other arrays?

I need to generate an array of random 20 bytes between a given range of arrays. Since arrays are comparable in Rust, this works:
let low = [0u8; 20];
let high = [2u8; 20];
assert_eq!(true, low < high);
assert_eq!(false, low > high);
assert_eq!(true, low == [0u8; 20]);
For these bounds:
let low: [u8; 20] = [98, 0, 1, 0, 2, 6, 99, 3, 0, 5, 23, 3, 5, 6, 11, 8, 0, 2, 0, 17];
let high: [u8; 20] = [99, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1];
These would be a valid result:
[98, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[99, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
These are not:
[98, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[99, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]
I want to do something like:
use rand::prelude::*;
fn main() {
let low = [0u8; 20];
let high = [2u8; 20];
let value = rand::thread_rng().gen_range(low, high);
println!("{:?}", value);
}
but I get following error:
error[E0277]: the trait bound `[u8; 20]: rand::distributions::uniform::SampleUniform` is not satisfied
--> src\main.rs:6:36
|
6 | let value = rand::thread_rng().gen_range(low, high);
| ^^^^^^^^^ the trait `rand::distributions::uniform::SampleUniform` is not implemented for `[u8; 20]`
I tried implementing SampleUniform and UniformSampler without much success. Is there a simple way to implement this?
If you want to treat the byte arrays as big integers, use the
num-bigint crate with the rand feature enabled:
use bigint::{ToBigInt, RandBigInt};
let low = -10000.to_bigint().unwrap();
let high = 10000.to_bigint().unwrap();
let b = rng.gen_bigint_range(&low, &high);
You could also use unsigned integers instead of signed. There are methods to convert to and from big endian byte arrays:
from_bytes_be
to_bytes_be
See also:
How do I generate a random num::BigUint?

Matrix columns permutation with cublas

I have an input matrix A of size 10x20, I want to permute its columns as follows:
p=[1 4 2 3 5 11 7 13 6 12 8 14 17 9 15 18 10 16 19 20] ;%rearrange the columns of A
A=A(:,p);
To do so, I constructed a permutation matrix I corresponding to the permutation vector p and permuted A can be obtained by performing the following multiplication:
A=A*I
I tested the permutation in Matlab and everything is ok. Now, I want to test it in cuda using cublas.
The input matrix A is entered in column major. The permuation matrix I in column major as well. The following code is to simply test the permutation:
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cublas_v2.h>
#define cudacall(call) \
do \
{ \
cudaError_t err = (call); \
if(cudaSuccess != err) \
{ \
fprintf(stderr,"CUDA Error:\nFile = %s\nLine = %d\nReason = %s\n", __FILE__, __LINE__, cudaGetErrorString(err)); \
cudaDeviceReset(); \
exit(EXIT_FAILURE); \
} \
} \
while (0)
#define cublascall(call) \
do \
{ \
cublasStatus_t status = (call); \
if(CUBLAS_STATUS_SUCCESS != status) \
{ \
fprintf(stderr,"CUBLAS Error:\nFile = %s\nLine = %d\nCode = %d\n", __FILE__, __LINE__, status); \
cudaDeviceReset(); \
exit(EXIT_FAILURE); \
} \
\
} \
while(0)
__global__ void sgemm_kernel(float *A_d, float *I_d)
{
int m=10,n=20,k=20;
int lda=k, ldb=k;
cublasHandle_t hdl;
cublasStatus_t status = cublasCreate_v2(&hdl);
const float alpha=1.0F, beta=0.0f;
status=cublasSgemm(hdl,CUBLAS_OP_N,CUBLAS_OP_N,k,n,k,&alpha,A_d,lda,I_d,ldb,&beta,A_d,lda);
}
int main(int argc, char* argv[])
{float A[10*20]={-0.0614, -0.0199, 0.0024, -0.0414, 0.1736, -0.0595, -0.2794, 0.1946, -0.0647, -0.0025,
-0.0036, 0.0628, -0.0827, 0.3679, -0.1913, 0.0500, -0.0245, 0.3855, -0.1298, -0.0334,
-0.0241, -0.0564, 0.0098, -0.2862, -0.0474, 0.0333, -0.3049, 0.2851, -0.1242, 0.0162,
0.0241, 0.0270, -0.0670, 0.3129, -0.2428, 0.0947, -0.1878, 0.0889, -0.0208, 0.0075,
-0.1559, 0.1437, -0.1916, 0.2297, -0.0833, -0.1805, 0.2522, -0.1738, 0.1027, -0.1273,
0.0716, 0.1882, -0.0963, 0.1081, 0.0958, -0.0713, 0.1931, 0.0874, -0.4186, 0.0345,
-0.1912, 0.0501, -0.1396, -0.0989, -0.0338, 0.1773, 0.1088, 0.0389, -0.0117, 0.0014,
0.1648, -0.1705, -0.0575, -0.0133, -0.0570, 0.2124, -0.0193, 0.1535, 0.0857, -0.1308,
0.1971, 0.0882, -0.2577, 0.1662, -0.2498, -0.0365, -0.1805, 0.0921, 0.0912, 0.0178,
-0.0379, 0.0080, 0.0572, -0.0067, 0.0591, -0.0136, 0.0471, -0.0163, 0.0082, -0.0338,
-0.2436, 0.1116, 0.0732, -0.0319, 0.0550, 0.2821, 0.0240, 0.0109, -0.0034, 0.1212,
-0.0061, 0.2497, -0.0542, -0.0939, 0.0651, 0.0063, -0.1367, 0.0580, 0.7389, -0.1143,
-0.3786, 0.1288, 0.0001, 0.2604, -0.1094, -0.3624, -0.0184, 0.0538, 0.0329, 0.0040,
0.0603, 0.1422, 0.1037, -0.1846, 0.4046, -0.3738, -0.3487, 0.3846, -0.0849, 0.0135,
-0.1850, 0.3571, -0.0543, -0.0025, -0.2880, 0.0600, 0.2605, -0.0474, 0.0010, -0.0333,
-0.1974, 0.4788, -0.2441, 0.3847, -0.1235, -0.3503, -0.1785, -0.1095, 0.3158, 0.0062,
-0.0509, -0.0502, 0.2154, 0.2237, -0.0671, 0.0377, 0.0519, 0.1530, -0.1675, 0.1856,
-0.0380, -0.0026, 0.4700, 0.0097, -0.2394, 0.0717, -0.2101, 0.2841, -0.1799, -0.0924,
-0.2678, 0.4485, 0.0044, 0.0030, -0.0439, 0.4337, 0.1819, -0.0180, -0.5443, 0.0864,
0.0390, -0.0235, -0.0706, 0.0138, 0.0633, -0.0147, 0.0444, -0.0334, 0.0557, 0.0507}
float I[20*20]={1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1};
float *A_d, *I_d;
cudacall(cudaMalloc(&A_d,10*20*sizeof( float )));
cudacall(cudaMalloc(&I_d, 20*20*sizeof(float )));
cudacall(cudaMemcpy(A_d, A, 10*20*sizeof(float), cudaMemcpyHostToDevice));
cudacall(cudaMemcpy(I_d, I, 20*20*sizeof(float), cudaMemcpyHostToDevice));
sgemm_kernel<<<1,1>>>(A_d, I_d);
cudacall(cudaDeviceSynchronize());
cudacall(cudaMemcpy(A, A_d, 10*20*sizeof(float), cudaMemcpyDeviceToHost));
cudacall(cudaFree(A_d));
cudacall(cudaFree(I_d));
return 0;
}
I couldn't get a correct result.
CUBLAS doesn't support in-place operations (in fact no parallel BLAS I am aware of supports it). You cannot pass A_d and use it in the multiplication and as the matrix in the operation. You must use a different memory allocation to hold the result.
So
C <- 1*(A * B) + 0*C
is legal, whereas
A <- 1*(A * B) + 0*A
is not.
cublasSgemm is a host function, so it should be called from a function without the __global__ qualifier.

It is possible to get a binary array from a BinData::Record instance?

I'm using Ruby gem Bindata, using the following code:
require 'bindata'
class Rectangle < BinData::Record
endian :little
uint16 :len
string :name, :read_length => :len
uint32 :width
uint32 :height
end
rectangle = rectangle.new
rectangle.len = 12
It is possible to get from rectangle instance an array like [0, 1, 1, 0, 0, ...] with the binary representation of all the fields inside the object?
BinData::Base#to_binary_s returns "the string representation of this data object":
rectangle.to_binary_s
#=> "\f\x00\x00\x00\x00\x00\x00\x00\x00\x00"
This can be converted to a bit string via String#unpack:
rectangle.to_binary_s.unpack('b*')
#=> ["00110000000000000000000000000000000000000000000000000000000000000000000000000000"]
Or to a bit array via:
rectangle.to_binary_s.unpack('b*')[0].chars.map(&:to_i)
#=> [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Get matrix from textfile in Mathematica

What's the easiest way to turn this text file into matrix? It has one row per line, where O means 0 and X means 1
$url = "http://hyperpublic.com/challenge2input.txt";
StringCases[Import[$url, "Lines"], {"O" -> 0, "X" -> 1}]
I first saved that text in a file tmp.txt.
In[180]:= words = ReadList["~danl/tmp.txt", Word];
vals = Map[Characters, words] /. {"O" -> 0, "X" -> 1};
In[182]:= vals[[1]]
Out[182]= {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
Daniel Lichtblau
Wolfram Research

Resources