I want to determine the relationship between two 3D geometries:
Whether they overlap.
If they overlap, what the overlapping part is and what its volume is.
I want to know whether ArcGIS has a mature method for this, because implementing the underlying math myself is too difficult.
So far I have found a related class, GeometryEngine, in ArcGIS, but it does not seem to work:
var onMapLocation = new MapPoint(0, 0, 0, SpatialReferences.Wgs84);
var num = 1;
List<MapPoint> points = new List<MapPoint>();
points.Add(new MapPoint(onMapLocation.X, onMapLocation.Y + num, onMapLocation.Z + num, onMapLocation.SpatialReference));
points.Add(new MapPoint(onMapLocation.X + num, onMapLocation.Y + num, onMapLocation.Z + num, onMapLocation.SpatialReference));
points.Add(new MapPoint(onMapLocation.X + num, onMapLocation.Y, onMapLocation.Z + num, onMapLocation.SpatialReference));
points.Add(new MapPoint(onMapLocation.X, onMapLocation.Y, onMapLocation.Z + num, onMapLocation.SpatialReference));
Esri.ArcGISRuntime.Geometry.Polygon polygon1 = new Esri.ArcGISRuntime.Geometry.Polygon(points);
var num2 = 2;
points = new List<MapPoint>();
points.Add(new MapPoint(onMapLocation.X, onMapLocation.Y + num2, onMapLocation.Z + num2, onMapLocation.SpatialReference));
points.Add(new MapPoint(onMapLocation.X + num2, onMapLocation.Y + num2, onMapLocation.Z + num2, onMapLocation.SpatialReference));
points.Add(new MapPoint(onMapLocation.X + num2, onMapLocation.Y, onMapLocation.Z + num2, onMapLocation.SpatialReference));
points.Add(new MapPoint(onMapLocation.X, onMapLocation.Y, onMapLocation.Z + num2, onMapLocation.SpatialReference));
Esri.ArcGISRuntime.Geometry.Polygon polygon2 = new Esri.ArcGISRuntime.Geometry.Polygon(points);
var g1 = GeometryEngine.Difference(polygon1, polygon2);
The result g1 is empty.
Here is the reference:
https://developers.arcgis.com/net/latest/wpf/api-reference/html/M_Esri_ArcGISRuntime_Geometry_GeometryEngine_Difference.htm
I used the wrong method. The correct methods are as follows:
var b = GeometryEngine.Intersects(polygon1, polygon2);
var g3 = GeometryEngine.Intersection(polygon1, polygon2);
var g2 = GeometryEngine.Intersections(polygon1, polygon2);
public static bool Intersects(Geometry geometry1, Geometry geometry2);
public static Geometry? Intersection(Geometry geometry1, Geometry geometry2);
public static IReadOnlyList<Geometry> Intersections(Geometry geometry1, Geometry geometry2);
The Magical Sequence
A Magical Sequence is defined as shown.
Magical[1] = 0
Magical[2] = 1
Magical[n] = 1*Magical[n-1] + 2*Magical[n-2] + 3*Magical[n-3] + ... + (n-1)*Magical[1] + n*1, for n > 2
Given n (1 <= n <= 10^9 ), find Magical[n].
Example 1: input: 3
Output: 4
Explanation:
Magical[n] = 1*Magical[n-1] + 2*Magical[n-2] + 3*1
Magical[3] = 1*Magical[2] + 2*Magical[1] + 3*1
Magical[3] = 1*1 + 2*0 + 3*1
Magical[3] = 4
Example 2: input: 4
Output: 10
Magical[4] = 1*Magical[3]+2*Magical[2]+3*Magical[1]+4*1
= 1*4+2*1+3*0+4 = 10
Example 3: input: 5
Output: 26
Magical[5] = 1*Magical[4]+2*Magical[3]+3*Magical[2]+4*Magical[1]+5*1
= 1*10+2*4+3*1+4*0+5 = 26
I tried something like the following:
int CuckooNum(int n)
{
if (1 == n)
{
return 0;
}
else if (2 == n)
{
return 1;
}
std::vector<int> vec;
vec.resize(n);
vec[0] = 4;
vec[1] = 0;
vec[2] = 1;
int multiplyer = n;
int result = 0;
for (int index=3; index <= n; index++)
{
result += multiplyer * vec[index-1];
vec[index] = result;
multiplyer--;
}
return result;
}
long long func(int n)
{
if (n==1) return 0;
else if (n==2) return 1;
else return 1*func(n-1)+2*func(n-2)+n;
}
As n can be very large (10^9), a direct O(n^2) implementation is not feasible.
A specific algorithm is needed. I will focus here on the algorithm and propose an O(log n) solution.
To simplify the explanation, I rename magical[] as x[].
Moreover, we can define x[0] = 1. Then,
x[n] = x[n-1] + 2*x[n-2] + 3*x[n-3] + ... + (n-1)*x[1] + n*x[0]
As
x[n-1] = 1*x[n-2] + 2*x[n-3] + ... + (n-2)*x[1] + (n-1)*x[0]
it follows, by subtracting the second equation from the first, that
x[n] - x[n-1] = x[n-1] + x[n-2] + x[n-3] + ... + x[1] + x[0] = S[n-1]
where S[n] denotes the sum of all terms up to index n (x[0] included).
Moreover,
S[n] = S[n-1] + x[n] = 2*S[n-1] + x[n-1]
Therefore, the iteration can be written in a simple matrix form:
(x[n] S[n])^t = A * (x[n-1] S[n-1])^t, where A is the 2x2 matrix with rows (1 1) and (1 2).
Or, defining the vector (x[n] S[n])^t as Z[n]:
Z[n] = A * Z[n-1]
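Note: this formula is valid for n >= 4 only, as the first x[n] values do not satisfy the simple recurrence relation (with x[0] = 1, the defining sum would give x[1] = 1 and x[2] = 2 instead of the prescribed 0 and 1).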
It follows that
Z[n] = A^(n-3) Z[3] with Z[3] = (4 6)^t, since x[3] = 4 and S[3] = 1 + 0 + 1 + 4 = 6
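As a quick check of the matrix form, here are the first two steps worked out by hand, using only the values already given in the examples of the question:

```latex
\[
Z_4 = A\,Z_3
    = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}
      \begin{pmatrix} 4 \\ 6 \end{pmatrix}
    = \begin{pmatrix} 10 \\ 16 \end{pmatrix},
\qquad
Z_5 = A\,Z_4
    = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}
      \begin{pmatrix} 10 \\ 16 \end{pmatrix}
    = \begin{pmatrix} 26 \\ 42 \end{pmatrix}
\]
```

The first components, 10 and 26, match Magical[4] and Magical[5] from the examples.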
Classically, this calculation can be performed with O(log n) complexity, iteratively calculating A^2, A^4, A^8 etc.
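Note that the values grow rapidly: a 64-bit integer overflows once n exceeds a few dozen, so for large n the result is presumably expected modulo some number (the problem statement above does not give one).
Here is an example C++ implementation. Note that this implementation is not optimized; for example, it doesn't use the fact that all the matrices are symmetric.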
#include <iostream>
#include <array>
using Matr22 = std::array<std::array<long long int, 2>, 2>;
using Vect2 = std::array<long long int, 2>;
Matr22 Matrsquare (const Matr22 &m) {
Matr22 m2;
m2[0][0] = m[0][0]*m[0][0] + m[0][1]*m[1][0];
m2[0][1] = m[0][0]*m[0][1] + m[0][1]*m[1][1];
m2[1][0] = m[1][0]*m[0][0] + m[1][1]*m[1][0];
m2[1][1] = m[1][0]*m[0][1] + m[1][1]*m[1][1];
return m2;
}
Matr22 Mult (const Matr22 &m1, const Matr22 &m2) {
Matr22 y;
y[0][0] = m1[0][0]*m2[0][0] + m1[0][1]*m2[1][0];
y[0][1] = m1[0][0]*m2[0][1] + m1[0][1]*m2[1][1];
y[1][0] = m1[1][0]*m2[0][0] + m1[1][1]*m2[1][0];
y[1][1] = m1[1][0]*m2[0][1] + m1[1][1]*m2[1][1];
return y;
}
Vect2 Mult (const Matr22 &m, const Vect2& x) {
Vect2 y;
y[0] = m[0][0] * x[0] + m[0][1] * x[1];
y[1] = m[1][0] * x[0] + m[1][1] * x[1];
return y;
}
// Matrix exponentiation by repeated squaring: O(log exp) multiplications
Matr22 Mult_exp (const Matr22 &m, int exp) {
Matr22 y = {1, 0, 0, 1}; // identity matrix
if (exp == 0) return y;
Matr22 M2k = m; // holds m^(2^k) after k squarings
while (exp) {
if (exp%2) y = Mult (y, M2k);
M2k = Matrsquare (M2k);
exp /= 2;
}
return y;
}
long long int Magical (int n) {
if (n == 1) return 0;
if (n == 2) return 1;
if (n == 3) return 4;
Matr22 A = {1, 1, 1, 2};
Vect2 z = {4, 6}; // corresponds to n=3
auto Ak = Mult_exp (A, n-3);
z = Mult (Ak, z);
return z[0];
}
int main() {
int n;
std::cout << "Input n: ";
std::cin >> n;
auto ans = Magical (n);
std::cout << "Magical[" << n << "] = " << ans << '\n';
}
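A sketch of the same computation in Java, with every product reduced modulo a fixed number so that large n does not overflow. The modulus 1,000,000,007 is an assumption (the problem statement gives none), and the names MagicalMod, magical and direct are hypothetical. The direct method evaluates the original definition in O(n^2) and is only there to cross-check small n.

```java
final class MagicalMod {
    // Assumption: hypothetical modulus, not given in the problem statement.
    static final long MOD = 1_000_000_007L;

    // 2x2 matrix product with every entry reduced modulo MOD.
    static long[][] mul(long[][] a, long[][] b) {
        long[][] c = new long[2][2];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                c[i][j] = (a[i][0] * b[0][j] + a[i][1] * b[1][j]) % MOD;
        return c;
    }

    // Magical[n] mod MOD via Z[n] = A^(n-3) * Z[3], for n >= 1.
    static long magical(int n) {
        if (n == 1) return 0;
        if (n == 2) return 1;
        long[][] result = {{1, 0}, {0, 1}}; // identity
        long[][] power = {{1, 1}, {1, 2}};  // the matrix A
        for (int e = n - 3; e > 0; e >>= 1) {
            if ((e & 1) == 1) result = mul(result, power);
            power = mul(power, power);
        }
        // First component of A^(n-3) * (4, 6)^t
        return (result[0][0] * 4 + result[0][1] * 6) % MOD;
    }

    // Direct O(n^2) evaluation of the definition, for checking small n only.
    static long direct(int n) {
        long[] x = new long[Math.max(n, 2) + 1];
        x[1] = 0;
        x[2] = 1;
        for (int m = 3; m <= n; m++) {
            long sum = m; // the trailing m*1 term
            for (int k = 1; k < m; k++)
                sum = (sum + k * x[m - k]) % MOD;
            x[m] = sum;
        }
        return x[n];
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++)
            System.out.println("Magical[" + n + "] = " + magical(n));
    }
}
```

Calling MagicalMod.magical(1_000_000_000) needs only about 30 squarings, which is the point of the O(log n) approach.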
My task:
I have two matrices: A (18 x 4194304) and B (18 x 1024).
For each 18-element vector in A, I have to compute the distance to every 18-element vector in B and find the minimum distance and its index.
My code:
__device__
void GetMin(float &dist, int &idx)
{
float dist2;
int idx2;
dist2 = __shfl_down_sync(0xFFFFFFFF, dist, 16, 32);
idx2 = __shfl_down_sync(0xFFFFFFFF, idx, 16);
if (dist > dist2)
{
dist = dist2;
idx = idx2;
}
dist2 = __shfl_down_sync(0xFFFFFFFF, dist, 8, 32);
idx2 = __shfl_down_sync(0xFFFFFFFF, idx, 8);
if (dist > dist2)
{
dist = dist2;
idx = idx2;
}
dist2 = __shfl_down_sync(0xFFFFFFFF, dist, 4, 32);
idx2 = __shfl_down_sync(0xFFFFFFFF, idx, 4);
if (dist > dist2)
{
dist = dist2;
idx = idx2;
}
dist2 = __shfl_down_sync(0xFFFFFFFF, dist, 2, 32);
idx2 = __shfl_down_sync(0xFFFFFFFF, idx, 2);
if (dist > dist2)
{
dist = dist2;
idx = idx2;
}
dist2 = __shfl_down_sync(0xFFFFFFFF, dist, 1, 32);
idx2 = __shfl_down_sync(0xFFFFFFFF, idx, 1);
if (dist > dist2)
{
dist = dist2;
idx = idx2;
}
}
__global__
void CalcMinDist_kernel(const float *A, const float *B, float *output, const int nNumPixels, int nNumImages)
{
int tx = threadIdx.x + blockIdx.x * blockDim.x;
int ty = threadIdx.y;
int lane_id = tx % 32;
float dist = 0;
int idx = 0;
float fMin = 99999999;
int nMinIdx = -1;
for(int i = lane_id; i < 1024; i += 32)
{
dist = 0;
for(int j = 0; j < nNumImages; ++j)
{
int img_idx = blockIdx.x * ty + j * nNumPixels;
dist += (A[img_idx] - B[i * nNumImages + j]) *
(A[img_idx] - B[i * nNumImages + j]);
}
idx = i;
GetMin(dist, idx);
if(threadIdx.x == 0)
{
if(fMin > dist)
{
fMin = dist;
nMinIdx = idx;
}
}
}
if(threadIdx.x == 0)
{
output[blockIdx.x * ty] = nMinIdx;
}
}
Looking at the profiler, I'm memory bound, and do have ~90% occupancy. Is there any way to speed up this operation?
Let me know if I need to provide any other information.
Actually, I would look at the algorithm first. This is a geometric problem - treat it as such.
You should represent the B data using a different data structure, e.g. by clustering or by building a partition structure (e.g. a k-d tree). That will let you avoid actually computing the distance to most B elements. (You could also consider a projection onto fewer dimensions, but the benefit of this may be more elusive.)
With respect to the access pattern - you would probably benefit from having consecutive threads working on consecutive elements of the 18-element-long vectors, rather than having threads work on complete 18-element-long vectors individually. That would better fit the memory layout - right now, a warp read is of many elements which are at distance 18 from each other. If I understand the code correctly anyway.
(I also think the GetMin() could avoid some of the index swaps, but that's not significant since you only perform very few of those.)
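A minimal sketch of the partition-structure idea above, written on the host in Java rather than in CUDA. It assumes the reference vectors from B are available as the rows of a float[1024][18] array; the class KdTreeSketch and its methods are hypothetical, not part of any library. The tree is built once, and each query then skips subtrees that cannot contain a closer point. In 18 dimensions the pruning may turn out to be weak, so it is worth measuring against the brute-force scan before committing to it.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper: an implicit k-d tree over a fixed, non-empty set of vectors.
final class KdTreeSketch {
    private final float[][] pts;   // reference vectors, e.g. the rows of B
    private final Integer[] order; // point indices arranged as an implicit tree
    private final int dim;
    private int bestId;            // best candidate of the query in progress
    private float bestDist;

    KdTreeSketch(float[][] points) {
        pts = points;
        dim = points[0].length;
        order = new Integer[points.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        build(0, order.length, 0);
    }

    // Squared Euclidean distance between two vectors of equal length.
    static float sqDist(float[] a, float[] b) {
        float s = 0f;
        for (int k = 0; k < a.length; k++) {
            float d = a[k] - b[k];
            s += d * d;
        }
        return s;
    }

    // Sort the range along the current axis; the median becomes the subtree root.
    private void build(int lo, int hi, int depth) {
        if (hi - lo <= 1) return;
        final int axis = depth % dim;
        Arrays.sort(order, lo, hi, Comparator.comparingDouble(i -> pts[i][axis]));
        int mid = (lo + hi) >>> 1;
        build(lo, mid, depth + 1);
        build(mid + 1, hi, depth + 1);
    }

    // Index (into the original array) of the reference vector nearest to q.
    int nearest(float[] q) {
        bestId = -1;
        bestDist = Float.POSITIVE_INFINITY;
        search(q, 0, order.length, 0);
        return bestId;
    }

    private void search(float[] q, int lo, int hi, int depth) {
        if (lo >= hi) return;
        int axis = depth % dim;
        int mid = (lo + hi) >>> 1;
        int id = order[mid];
        float d = sqDist(q, pts[id]);
        if (d < bestDist) { bestDist = d; bestId = id; }
        float diff = q[axis] - pts[id][axis];
        boolean left = diff < 0; // which side of the splitting plane q lies on
        search(q, left ? lo : mid + 1, left ? mid : hi, depth + 1);
        // The far side can only help if the splitting plane is closer than the best so far.
        if (diff * diff < bestDist)
            search(q, left ? mid + 1 : lo, left ? hi : mid, depth + 1);
    }
}
```

A query is then new KdTreeSketch(b).nearest(q); the instance keeps per-query state in fields, so this sketch is not thread-safe.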
I am new to Java 8. I would like to know whether this is the correct way to write Java 8 code, and whether the given code can be improved further with respect to Java 8 functional programming.
public class OperationByJava8 {
public int add(int a, int b) {
Operation op = (num1, num2) -> num1 + num2;
return op.operate(a, b);
}
public int subtract(int a, int b) {
Operation op = (num1, num2) -> num1 - num2;
return op.operate(a, b);
}
public int multiply(int a, int b) {
Operation op = (num1, num2) -> num1 * num2;
return op.operate(a, b);
}
public int devide(int a, int b) {
Operation op = (num1, num2) -> {
if (num2 == 0) {
throw new IllegalArgumentException("denominator cannot be zero");
}
return num1 / num2;
};
return op.operate(a, b);
}
public static void main(String[] args) {
OperationByJava8 op = new OperationByJava8();
System.out.println(" Addition(12, 12) :" + op.add(12, 12));
System.out.println(" Subtract(12, 12) :" + op.subtract(12, 12));
System.out.println(" Multiply(12, 12) :" + op.multiply(12, 12));
System.out.println(" Devide (12, 12) :" + op.devide(12, 12));
}
}
@FunctionalInterface
interface Operation {
int operate(int a, int b);
}
Your approach is completely fine and would be good to proceed with.
If you want to reduce the code further, you could create a single function as follows for addition, subtraction and multiplication:
public int apply(int a, int b, IntBinaryOperator func) {
return func.applyAsInt(a, b);
}
But then you'll need a separate function for division in order to throw the IllegalArgumentException when the second argument is 0.
Note that I've used the IntBinaryOperator functional interface to avoid having to create the Operation interface, but if the latter is more meaningful then you can keep it.
Alternatively, you can define the functions inline and then invoke them:
IntBinaryOperator add = (num1, num2) -> num1 + num2;
IntBinaryOperator subtract = (num1, num2) -> num1 - num2;
IntBinaryOperator multiply = (num1, num2) -> num1 * num2;
IntBinaryOperator division = (num1, num2) -> {
if (num2 == 0)
throw new IllegalArgumentException("denominator cannot be zero");
return num1 / num2;
};
System.out.println(" Addition(12, 12) :" + add.applyAsInt(12, 12));
System.out.println(" Subtract(12, 12) :" + subtract.applyAsInt(12, 12));
System.out.println(" Multiply(12, 12) :" + multiply.applyAsInt(12, 12));
System.out.println(" Devide (12, 12) :" + division.applyAsInt(12, 12));
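As one further variation on the inline lambdas above, a sketch that keeps the operators in a map keyed by name, so that all four operations, including the checked division, go through one apply method. The class name OperatorTable and the string keys are hypothetical choices for illustration; IntBinaryOperator is the standard java.util.function interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

final class OperatorTable {
    // Hypothetical lookup table: operation name -> implementation.
    static final Map<String, IntBinaryOperator> OPS = new LinkedHashMap<>();
    static {
        OPS.put("add", (a, b) -> a + b);
        OPS.put("subtract", (a, b) -> a - b);
        OPS.put("multiply", (a, b) -> a * b);
        OPS.put("divide", (a, b) -> {
            if (b == 0) {
                throw new IllegalArgumentException("denominator cannot be zero");
            }
            return a / b;
        });
    }

    // Looks up the named operation and applies it to a and b.
    static int apply(String name, int a, int b) {
        IntBinaryOperator op = OPS.get(name);
        if (op == null) {
            throw new IllegalArgumentException("unknown operation: " + name);
        }
        return op.applyAsInt(a, b);
    }

    public static void main(String[] args) {
        for (String name : OPS.keySet()) {
            System.out.println(name + "(12, 12) = " + apply(name, 12, 12));
        }
    }
}
```

Whether this is clearer than four named methods depends on how the operations are selected; it mainly pays off when the operation is chosen at run time.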
I've tried to implement an inefficient function to generate the points, normals, tex coords, and index list of a sphere.
Ignoring the lines, when I draw the sphere with OpenGL I get the following output which is clearly wrong:
Can anyone help me understand what's wrong with my code?
public SphereObject()
{
super();
// inefficient but quick sphere data
int num_points = 16;
double as = Math.PI / num_points;
double theta, phi;
double [] p;
ArrayList<double []> points = new ArrayList<double []>();
ArrayList<Integer> edges = new ArrayList<Integer>();
ArrayList<double []> normals = new ArrayList<double []>();
ArrayList<double []> tex = new ArrayList<double []>();
theta = Math.PI;
phi = Math.PI / 2;
for(int row = 0; row < num_points; row++)
{
for(int col = 0; col < num_points; col++)
{
p = new double[3];
p[0] = Math.sin(theta) * Math.cos(phi - as);
p[1] = Math.cos(theta) * Math.cos(phi - as);
p[2] = Math.sin(phi - as);
points.add(p);
normals.add(p);
tex.add(new double [] {0, 0});
p = new double[3];
p[0] = Math.sin(theta + 2 * as) * Math.cos(phi - as);
p[1] = Math.cos(theta + 2 * as) * Math.cos(phi - as);
p[2] = Math.sin(phi - as);
points.add(p);
normals.add(p);
tex.add(new double [] {1, 0});
p = new double[3];
p[0] = Math.sin(theta + 2 * as) * Math.cos(phi);
p[1] = Math.cos(theta + 2 * as) * Math.cos(phi);
p[2] = Math.sin(phi);
points.add(p);
normals.add(p);
tex.add(new double [] {1, 1});
p = new double[3];
p[0] = Math.sin(theta) * Math.cos(phi);
p[1] = Math.cos(theta) * Math.cos(phi);
p[2] = Math.sin(phi);
points.add(p);
normals.add(p);
tex.add(new double [] {0, 1});
// make triangles
edges.add(points.size()-1);
edges.add(points.size()-3);
edges.add(points.size()-4);
edges.add(points.size()-1);
edges.add(points.size()-2);
edges.add(points.size()-3);
theta -= 2 * as;
}
phi -= as;
}
sphereVertices = new double[points.size() * 3];
sphereTexcoords = new double[tex.size() * 2];
sphereNormals = new double[normals.size() * 3];
sphereIndices = new short[edges.size() * 1];
for(int c1 = 0; c1 < points.size(); c1 += 3)
{
sphereVertices[c1] = points.get(c1)[0];
sphereVertices[c1+1] = points.get(c1)[1];
sphereVertices[c1+2] = points.get(c1)[2];
}
for(int c1 = 0; c1 < tex.size(); c1 += 2)
{
sphereTexcoords[c1] = tex.get(c1)[0];
sphereTexcoords[c1+1] = tex.get(c1)[1];
}
for(int c1 = 0; c1 < normals.size(); c1 += 3)
{
sphereNormals[c1] = normals.get(c1)[0];
sphereNormals[c1+1] = normals.get(c1)[1];
sphereNormals[c1+2] = normals.get(c1)[2];
}
for(int c1 = 0; c1 < edges.size(); c1++)
{
sphereIndices[c1] = edges.get(c1).shortValue();
}
mVertBuff = fillBuffer(sphereVertices);
mTexCoordBuff = fillBuffer(sphereTexcoords);
mNormBuff = fillBuffer(sphereNormals);
mIndBuff = fillBuffer(sphereIndices);
}
My OpenGL code is below. The getVertices() functions et al return the buffers created in the Sphere constructor above.
Matrix.translateM(modelViewMatrix, 0, 0, 0, kObjectScale);
Matrix.scaleM(modelViewMatrix, 0, kObjectScale, kObjectScale, kObjectScale);
GLES20.glUseProgram(shaderProgramID);
GLES20.glVertexAttribPointer(vertexHandle, 3, GLES20.GL_FLOAT, false, 0, sphere.getInstance().getVertices());
GLES20.glVertexAttribPointer(normalHandle, 3, GLES20.GL_FLOAT, false, 0, sphere.getInstance().getNormals());
GLES20.glVertexAttribPointer(textureCoordHandle, 2, GLES20.GL_FLOAT, false, 0, sphere.getInstance().getTexCoords());
GLES20.glEnableVertexAttribArray(vertexHandle);
GLES20.glEnableVertexAttribArray(normalHandle);
GLES20.glEnableVertexAttribArray(textureCoordHandle);
GLES20.glActiveTexture(GLES20.GL_TEXTURE0);
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, mTextures.get(2).mTextureID[0]);
Matrix.translateM(modelViewMatrix, 0, (float)result[0], (float)result[1], (float)result[2]);
Matrix.rotateM(modelViewMatrix, 0, 0, 1, 0, 0);
Matrix.rotateM(modelViewMatrix, 0, 0, 0, 1, 0);
Matrix.rotateM(modelViewMatrix, 0, 0, 0, 0, 1);
Matrix.scaleM(modelViewMatrix, 0, 5, 5, 5);
Matrix.multiplyMM(modelViewProjection, 0, vuforiaAppSession.getProjectionMatrix().getData(), 0, modelViewMatrix, 0);
GLES20.glEnable(GLES20.GL_BLEND);
GLES20.glUniformMatrix4fv(mvpMatrixHandle, 1, false, modelViewProjection, 0);
GLES20.glUniform1i(texSampler2DHandle, 0);
GLES20.glDrawElements(GLES20.GL_TRIANGLES, sphere.getInstance().getNumObjectIndex(), GLES20.GL_UNSIGNED_SHORT, sphere.getInstance().getIndices());
GLES20.glDisable(GLES20.GL_BLEND);
GLES20.glDisableVertexAttribArray(vertexHandle);
GLES20.glDisableVertexAttribArray(normalHandle);
GLES20.glDisableVertexAttribArray(textureCoordHandle);
The fillBuffer code is as follows:
protected Buffer fillBuffer(double[] array)
{
// Convert to floats because OpenGL doesn't work on doubles, and manually
// casting each input value would take too much time.
// Each float takes 4 bytes
ByteBuffer bb = ByteBuffer.allocateDirect(4 * array.length);
bb.order(ByteOrder.LITTLE_ENDIAN);
for (double d : array)
bb.putFloat((float) d);
bb.rewind();
return bb;
}
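The problem is in how you copy the points into the final arrays. The loop below steps the list index by 3, so it copies only every third point and fills only the first third of sphereVertices: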
for(int c1 = 0; c1 < points.size(); c1 += 3)
{
sphereVertices[c1] = points.get(c1)[0];
sphereVertices[c1+1] = points.get(c1)[1];
sphereVertices[c1+2] = points.get(c1)[2];
}
Instead of using the same index for both the array and the list, use separate ones:
for(int c1 = 0, i= 0; c1 < points.size(); c1++)
{
sphereVertices[i++] = points.get(c1)[0];
sphereVertices[i++] = points.get(c1)[1];
sphereVertices[i++] = points.get(c1)[2];
}
The same applies to the other arrays (with two components per entry for the texture coordinates).
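Since all three copies follow the same pattern, one way to avoid repeating the index bookkeeping is a small helper. This is only a sketch; the name flatten and the class FlattenSketch are hypothetical and do not appear in the original code.

```java
import java.util.List;

final class FlattenSketch {
    // Copies the first `stride` components of every entry into one flat array,
    // using separate indices for the list and for the output array.
    static double[] flatten(List<double[]> items, int stride) {
        double[] out = new double[items.size() * stride];
        int i = 0;
        for (double[] item : items) {
            for (int k = 0; k < stride; k++) {
                out[i++] = item[k];
            }
        }
        return out;
    }
}
```

With it, the three loops would become flatten(points, 3), flatten(tex, 2) and flatten(normals, 3).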
I have the following OpenCL code:
__constant SomeConstants[] = { 5, 7, 242, 74 };
long aslong(__global int * arr, int offset) {
return ... <CalculateLong> ...;
};
void calcSomething(__global int * numArray, int * result) {
long tempArray[] = { numArray[0], numArray[1], numArray[2], numArray[3] };
long x0 = (result[0] = aslong(numArray, 0)) + tempArray[0];
long x1 = (result[1] = aslong(numArray, 8)) + tempArray[1];
long x2 = (result[2] = aslong(numArray, 16)) + tempArray[2];
long x3 = (result[3] = aslong(numArray, 32)) + tempArray[3];
// do some calculations here
result[0] = x0;
result[1] = x1;
result[2] = x2;
result[3] = x3;
};
__kernel void calc(__global int * numArray, __global long * sum) {
long arr[] = { SomeConstants[0], SomeConstants[1], SomeConstants[2], SomeConstants[3] };
calcSomething(numArray, arr);
sum[get_global_id(0)] = arr[0];
sum[get_global_id(0)+1] = arr[1];
sum[get_global_id(0)+2] = arr[2];
sum[get_global_id(0)+3] = arr[3];
};
The code above works as expected, but the problem is that I can't add any more calculations to calcSomething. When I add an empty loop to calcSomething as follows:
void calcSomething(__global int * numArray, int * result) {
long tempArray[] = { numArray[0], numArray[1], numArray[2], numArray[3] };
long x0 = (result[0] = aslong(numArray, 0)) + tempArray[0];
long x1 = (result[1] = aslong(numArray, 8)) + tempArray[1];
long x2 = (result[2] = aslong(numArray, 16)) + tempArray[2];
long x3 = (result[3] = aslong(numArray, 32)) + tempArray[3];
for (int i = 0; i < 18; i++) {}
result[0] = x0;
result[1] = x1;
result[2] = x2;
result[3] = x3;
};
result[0] and result[1] store the correct values, but result[2] and result[3] are completely wrong. When I comment out the for loop, result[2] and result[3] are correct. Any idea what happens to x0, x1, x2 and x3 when the for loop runs?
EDITED ANSWER: Your problem is located in the operations:
sum[get_global_id(0)] = arr[0];
sum[get_global_id(0)+1] = arr[1];
sum[get_global_id(0)+2] = arr[2];
sum[get_global_id(0)+3] = arr[3];
You are writing to the global sum[] array with overlapping indexes, so many work-items write to the same memory locations, which produces undefined results.
You need either to remove the overlap or to use atomic operations on the global array.
Example:
gid 0 -> Write to sum[0,1,2,3]
gid 1 -> Write to sum[1,2,3,4]
//....
The for loop has nothing to do with the problem. You are just seeing the effect of different scheduling in a multithreaded environment; that is not the root of the problem.
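A small sketch of the index overlap described above, simulated on the host in Java. It counts the slots of sum[] that are written by more than one work-item for a given stride between work-items. The stride of 4 (each work-item owning four consecutive slots) is an assumed fix, not something stated in the original code, and the name countConflicts is hypothetical.

```java
final class SumIndexOverlap {
    // Each work-item gid writes slots stride*gid + 0..3.
    // Returns the number of slots written by more than one work-item.
    // Assumes workItems >= 1 and stride >= 1.
    static int countConflicts(int workItems, int stride) {
        int[] writers = new int[stride * (workItems - 1) + 4];
        for (int gid = 0; gid < workItems; gid++) {
            for (int k = 0; k < 4; k++) {
                writers[stride * gid + k]++;
            }
        }
        int conflicts = 0;
        for (int w : writers) {
            if (w > 1) conflicts++;
        }
        return conflicts;
    }

    public static void main(String[] args) {
        System.out.println("stride 1: " + countConflicts(3, 1) + " contested slots");
        System.out.println("stride 4: " + countConflicts(3, 4) + " contested slots");
    }
}
```

In the kernel, the stride-4 variant would correspond to indexing sum with 4 * get_global_id(0) + k, provided the buffer is allocated four times as large.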
Further edit:
The code also has problems that are NOT related to OpenCL. The only OpenCL problem is the one already described above.
There are many places where C variables are addressed incorrectly or implicitly cast and then misused. For example, in aslong() 8 consecutive ints are added to make a long, but a long is made of 8 bytes, not 8 ints.
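To illustrate the size mismatch, here is a sketch in Java, where int is 32 bits and long is 64 bits, just as in OpenCL C. A long therefore holds exactly two ints. Which of the two ints supplies the high half is an assumption here, and since the body of aslong() is elided in the question, this is not a reconstruction of it; the names LongPacking and pack are hypothetical.

```java
final class LongPacking {
    // Combines two 32-bit ints into one 64-bit long.
    // Assumption: `hi` supplies the upper 32 bits and `lo` the lower 32 bits.
    static long pack(int hi, int lo) {
        // The mask stops sign extension of `lo` from overwriting the upper half.
        return ((long) hi << 32) | (lo & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(pack(1, 2)));
    }
}
```

Reading a long from an int array at offset i would then consume the two entries i and i + 1, not eight.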