Why does my code give a different output when a zero is added in front? - bit

Can anyone explain why the output is 65?
#include <stdio.h>
int main()
{
int b=0101;
printf("%d",b);
return 0;
}

Because 0101 is an octal number (it is in base 8). In C, integer literals that start with 0 are octal, so the value is 1 * 64 + 0 * 8 + 1 * 1 = 65.

Related

Qsort comparison

I'm converting C++ code to Go, but I have difficulties in understanding this comparison function:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <iostream>
using namespace std;
typedef struct SensorIndex
{ double value;
int index;
} SensorIndex;
int comp(const void *a, const void* b)
{ SensorIndex* x = (SensorIndex*)a;
SensorIndex* y = (SensorIndex*)b;
return abs(y->value) - abs(x->value);
}
int main(int argc , char *argv[])
{
SensorIndex *s_tmp;
s_tmp = (SensorIndex *)malloc(sizeof(SensorIndex)*200);
double q[200] = {8.48359,8.41851,-2.53585,1.69949,0.00358129,-3.19341,3.29215,2.68201,-0.443549,-0.140532,1.64661,-1.84908,0.643066,1.53472,2.63785,-0.754417,0.431077,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256};
for( int i=0; i < 200; ++i ) {
s_tmp[i].value = q[i];
s_tmp[i].index = i;
}
qsort(s_tmp, 200, sizeof(SensorIndex), comp);
for( int i=0; i<200; i++)
{
cout << s_tmp[i].index << " " << s_tmp[i].value << endl;
}
}
I expected that the "comp" function would sort from the highest (absolute) value to the lowest, but in my environment (gcc 32 bit) the result is:
1 8.41851
0 8.48359
2 -2.53585
3 1.69949
11 -1.84908
5 -3.19341
6 3.29215
7 2.68201
10 1.64661
14 2.63785
12 0.643066
13 1.53472
4 0.00358129
9 -0.140532
8 -0.443549
15 -0.754417
16 0.431077
17 -0.123256
18 -0.123256
19 -0.123256
20 -0.123256
...
Moreover one thing that seems strange to me is that by executing the same code with online services I get different values (cpp.sh, C++98):
0 8.48359
1 8.41851
5 -3.19341
6 3.29215
2 -2.53585
7 2.68201
14 2.63785
3 1.69949
10 1.64661
11 -1.84908
13 1.53472
4 0.00358129
8 -0.443549
9 -0.140532
12 0.643066
15 -0.754417
16 0.431077
17 -0.123256
18 -0.123256
19 -0.123256
20 -0.123256
...
Any help?
This behavior is caused by using abs, a function that works with int, and passing it double arguments. The doubles are being implicitly cast to int, truncating the decimal component before comparing them. Essentially, this means you take the original number, strip off the sign, and then strip off everything to the right of the decimal and compare those values. So 8.123 and -8.9 are both converted to 8, and compare equal. Since the inputs are reversed for the subtraction, the ordering is in descending order by magnitude.
Your cpp.sh output reflects this; all the values with a magnitude between 8 and 9 appear first, then 3-4s, then 2-3s, 1-2s and less than 1 values.
If you wanted to fix this to actually sort in descending order in general, you'd need a comparison function that properly used the double-friendly fabs function, e.g.
int comp(const void *a, const void* b)
{ SensorIndex* x = (SensorIndex*)a;
SensorIndex* y = (SensorIndex*)b;
double diff = fabs(y->value) - fabs(x->value);
if (diff < 0.0) return -1;
return diff > 0;
}
Update: On further reading, it looks like std::abs from <cmath> has worked with doubles for a long time, but std::abs for doubles was only added to <cstdlib> (where the integer abs functions dwell) in C++17. And the implementers got this stuff wrong all the time, so different compilers would behave differently at random. In any event, both the answers given here are right; if you haven't included <cmath> and you're on pre-C++17 compilers, you should only have access to integer based versions of std::abs (or ::abs from math.h), which would truncate each value before the comparison. And even if you were using the correct std::abs, returning the result of double subtraction as an int would drop fractional components of the difference, making any values with a magnitude difference of less than 1.0 appear equal. Worse, depending on specific comparisons performed and their ordering (since not all values are compared to each other), the consequences of this effect could chain, as comparison ordering changes could make 1.0 appear equal to 1.6 which would in turn appear equal to 2.5, even though 1.0 would be correctly identified as less than 2.5 if they were compared to each other; in theory, as long as each number is within 1.0 of every other number, the comparisons might evaluate as if they're all equal to each other (pathological case yes, but smaller runs of such errors would definitely happen).
Point is, the only way to figure out the real intent of this code is to figure out the exact compiler version and C++ standard it was originally compiled under and test it there.
There is a bug in your comparison function. You return an int, which means you lose the distinction between element values whose absolute difference is less than 1!
int comp(const void* a, const void* b)
{
SensorIndex* x = (SensorIndex*)a;
SensorIndex* y = (SensorIndex*)b;
// what about differences between 0.0 and 1.0?
return abs(y->value) - abs(x->value);
}
You can fix it like this:
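int comp(const void* a, const void* b)
{
    const SensorIndex* x = (const SensorIndex*)a;
    const SensorIndex* y = (const SensorIndex*)b;
    // std::abs on doubles needs <cmath>
    const double ax = std::abs(x->value);
    const double ay = std::abs(y->value);
    if (ay < ax) return -1; // larger magnitude sorts first
    if (ay > ax) return 1;
    return 0; // qsort expects 0 for equal elements
}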
A more modern (and safer) way to do this would be to use std::vector and std::sort:
// use a vector for dynamic arrays
std::vector<SensorIndex> s_tmp;
for(int i = 0; i < 200; ++i) {
s_tmp.push_back({q[i], i});
}
// use std::sort
std::sort(std::begin(s_tmp), std::end(s_tmp), [](SensorIndex const& a, SensorIndex const& b){
return std::abs(b.value) < std::abs(a.value);
});

C/C++ rand() function for biased expectation

I am using the <stdlib.h> rand() function to generate 100 random integers within the range [0 ... 9]. I generate them with an equal distribution as follows:
int random_numbers[100];
for(register int i = 0; i < 100; i++){
random_numbers[i] = rand() % 10;
}
This is working fine. But now I want to get 100 numbers where I want around 50% of those numbers to be 5. How do I do that?
Extended Problem
I want to get 100 numbers. What if I want 50% of those numbers to be between 0 and 2? I mean 50 percent of those numbers should consist only of the numbers 0, 1 and 2. How do I do that?
I am expecting generalised steps which can be applied beyond the boundary of 10 or 100.
Hmmm, how about choosing a random number between 0 and 17, and if the number is greater than 9, change it to 5?
For 0 - 17, you would get a distribution like
0,1,2,3,4,5,6,7,8,9,5,5,5,5,5,5,5,5
Code:
int random_numbers[100];
for(register int i = 0; i < 100; i++){
random_numbers[i] = rand() % 18;
if (random_numbers[i] > 9) {
random_numbers[i] = 5;
}
}
You basically add a set of numbers beyond your desired range that, when translated to 5 give you equal numbers of 5 and non-5.
In order to get around 50% of these numbers to be in [0, 2] range you can split the full range of rand() into two equal halves and then use the same %-based technique to map the first half to [0, 2] range and the second half to [3, 9] range.
int random_numbers[100];
for(int i = 0; i < 100; i++)
{
int r = rand();
random_numbers[i] = r <= RAND_MAX / 2 ? r % 3 : r % 7 + 3;
}
To get around 50% of these numbers to be 5, a similar technique will work. Just map the second half to the [0, 9] range with 5 excluded:
int random_numbers[100];
for(int i = 0; i < 100; i++)
{
int r = rand();
if (r <= RAND_MAX / 2)
r = 5;
else if ((r %= 9) >= 5)
++r;
random_numbers[i] = r;
}
I think it is easy to solve the particular problem of 50% using the techniques mentioned by other answers. Let us try to answer the question for a general case -
Let us say you want a distribution where you want the numbers {A1, A2, .. An} with the percentages {P1, P2, .. Pn} and sum of Pi is 100% (and all the percentages are integers, if not it can be adjusted).
We will create an array of 100 size and fill it with the numbers A1-An.
int distribution[100];
Now we fill in each number as many times as its percentage.
int position = 0;
for (int i = 0; i < n; i++) {
for( int j = 0; j < P[i]; j++) {
// Add a check here to make sure the sum hasn't crossed 100
distribution[position] = A[i];
position ++;
}
}
Now that this initialization is done once, you can draw a random number as -
int number = distribution[rand() % 100];
In case your percentages are not integers but say you want precision of 0.1%, you can create an array of 1000 instead of 100.
In both cases, the goal is to select 50% from one set and 50% from another. Code could call rand() and use some bits (one) for choosing the group and the remaining bits for value selection.
If the range of numbers needed is much smaller than RAND_MAX, a first attempt could use:
int rand_special_50percent(int n, int special) {
int r = rand();
int r_div_2 = r/2;
if (r%2) {
return special;
}
int y = r_div_2%(n-1); // n-1 values remain once special is excluded
if (y >= special) y++;
return y;
}
int rand_low_50percent(int n, int low_special) {
int r = rand();
int r_div_2 = r/2;
if (r%2) {
return r_div_2%(low_special+1);
}
// n - low_special - 1 values remain above low_special (7 for n = 10, low_special = 2)
return r_div_2%(n - low_special - 1) + low_special + 1;
}
Sample
int r5 = rand_special_50percent(10, 5);
int preferred_low_value_max = 2;
int r012 = rand_low_50percent(10, preferred_low_value_max);
Advanced:
With n above RAND_MAX/2, additional calls to rand() are needed.
When using rand()%n, unless (RAND_MAX+1u)%n == 0 (n is a divisor of RAND_MAX+1), a bias is introduced. The above code does not compensate for that.
C++11 solution (not optimal but easy)
std::piecewise_constant_distribution can generate random real numbers (float or double) for given intervals and weights for the each interval.
It is not optimal because it generates a double and converts it to int. Also, getting exactly 50 values from [0,3) in 100 samples is not guaranteed; you should only expect around 50.
For your case : 2 intervals - [0,3), [3,100) and their weights [1,1]
Equal weights, so ~50% of the numbers from [0,3) and ~50% from [3,100)
#include <iostream>
#include <vector>
#include <map>
#include <random>
int main()
{
std::random_device rd;
std::mt19937 gen(rd());
// two intervals, [0,3) and [3,100), with equal weights;
// the boundaries must be strictly increasing
std::vector<double> intervals{0, 3, 100};
std::vector<double> weights{1, 1};
std::piecewise_constant_distribution<> d(intervals.begin(), intervals.end(), weights.begin());
std::map<int, int> hist;
for(int n=0; n<100; ++n) {
++hist[(int)d(gen)];
}
for(auto p : hist) {
std::cout << p.first << " : generated " << p.second << " times"<< '\n';
}
}
Output:
0 : generated 22 times
1 : generated 19 times
2 : generated 16 times
4 : generated 1 times
5 : generated 2 times
8 : generated 1 times
12 : generated 1 times
17 : generated 1 times
19 : generated 1 times
22 : generated 2 times
23 : generated 1 times
25 : generated 1 times
29 : generated 1 times
30 : generated 2 times
31 : generated 1 times
36 : generated 1 times
38 : generated 1 times
44 : generated 1 times
45 : generated 1 times
48 : generated 1 times
49 : generated 1 times
51 : generated 1 times
52 : generated 1 times
53 : generated 1 times
57 : generated 2 times
58 : generated 3 times
62 : generated 1 times
65 : generated 2 times
68 : generated 1 times
71 : generated 1 times
76 : generated 2 times
77 : generated 1 times
85 : generated 1 times
90 : generated 1 times
94 : generated 1 times
95 : generated 1 times
96 : generated 2 times

C++ sizeof(struct)

code like this:
#include <stdio.h>
int main(){
struct{
unsigned char a:4;
unsigned char b:4;
}i;
struct{
unsigned char a:4;
unsigned char b:4;
unsigned char c:4;
}j;
i.a = 1;
i.b = 1;
j.a = 1;
j.b = 1;
j.c = 1;
printf("size of i is: %zu, size of j is: %zu", sizeof(i), sizeof(j)); // sizeof yields size_t, so use %zu
return 0;
}
Why is the output 1 2? That is, i occupies 1 byte and j occupies 2 bytes. We know an unsigned char takes 1 byte, so why is the size of i not 2? I am sorry for my English.
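Every object in C++ occupies a whole number of bytes, so the size of a struct is rounded up to the next byte.
In struct i, a and b are 4 bits each, which add up to 8 bits, i.e. 1 byte (assuming the compiler packs adjacent bit-fields together, which is typical but implementation-defined).
In j, the three fields add up to 12 bits, which do not fit in 1 byte, so the size is 2 bytes, with the remaining 4 bits as padding.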
Reference: http://www.cplusplus.com/forum/general/51911/

Best way to achieve CUDA Vector Diagonalization

What I want to do is feed in my m x n matrix, and in parallel, construct n square diagonal matrices for each column of the matrix, perform an operation on each square diagonal matrix, and then recombine the result. How do I do this?
So far, I start off with an m x n matrix, the result of a previous matrix computation in which each element is calculated using the function y = f(g(x)).
This gives me a matrix with n column elements [f1, f2...fn] where each fn represents a column vector of height m.
From here, I want to differentiate each column of the matrix with respect to g(x). Differentiating fn(x) w.r.t. g(x) results in a square matrix with elements f'(x). Under constraint, this square matrix reduces to a Jacobian with the elements of each row along the diagonal of the square matrix, and equal to fn', all other elements equaling zero.
Hence the reason why it is necessary to construct the diagonal for each of the vector rows fn.
To do this, I take a target vector defined as A(hA x 1) which was extracted from the larger A(m x n) matrix. I then prepared a zeroed matrix defined as C(hA x hA) which will be used to hold the diagonals.
The aim is to diagonalize the vector A into a square matrix with each element of A sitting on the diagonal of C, everything else being zero.
There are probably more efficient ways to accomplish this using some pre-built routine without building a whole new kernel, but please be aware that for these purposes, this method is necessary.
The kernel code (which works) to accomplish this is shown here:
_cudaDiagonalizeTest << <5, 1 >> >(d_A, matrix_size.uiWA, matrix_size.uiHA, d_C, matrix_size.uiWC, matrix_size.uiHC);
__global__ void _cudaDiagonalizeTest(float *A, int wA, int hA, float *C, int wC, int hC)
{
int ix, iy, idx;
ix = blockIdx.x * blockDim.x + threadIdx.x;
iy = blockIdx.y * blockDim.y + threadIdx.y;
idx = iy * wA + ix;
C[idx * (wC + 1)] = A[idx];
}
I am a bit suspicious that this is a very naive approach to a solution and was wondering if someone could give an example of how I could do the same using
a) reduction
b) thrust
For vectors of large row size, I would like to be able to use the GPU's multithreading capabilities to chunk the task into small jobs, and combine each result at the end with __syncthreads().
The picture below shows what the desired result is.
I have read NVIDIA's article on reduction, but did not manage to achieve the desired results.
Any assistance or explanation would be very much welcomed.
Thanks.
Matrix A is the target with 4 columns. I want to take each column, and copy its elements into Matrix B as a diagonal, iterating through each column.
I created a simple example based on thrust. It uses column-major order to store the matrices in a thrust::device_vector. It should scale well with larger row/column counts.
Another approach could be based off the thrust strided_range example.
This example does what you want (fill the diagonals based on the input vector). However, depending on how you proceed with the resulting matrix to your "Differentiating" step, it might still be worth investigating if a sparse storage (without all the zero entries) is possible, since this will reduce memory consumption and ease iterating.
#include <thrust/device_vector.h>
#include <thrust/scatter.h>
#include <thrust/sequence.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/functional.h>
#include <iostream>
template<typename V>
void print_matrix(const V& mat, int rows, int cols)
{
for(int i = 0; i < rows; ++i)
{
for(int j = 0; j < cols; ++j)
{
std::cout << mat[i + j*rows] << "\t";
}
std::cout << std::endl;
}
}
struct diag_index : public thrust::unary_function<int,int>
{
diag_index(int rows) : rows(rows){}
__host__ __device__
int operator()(const int index) const
{
return (index*rows + (index%rows));
}
const int rows;
};
int main()
{
const int rows = 5;
const int cols = 4;
// allocate memory and fill with demo data
// we use column-major order
thrust::device_vector<int> A(rows*cols);
thrust::sequence(A.begin(), A.end());
thrust::device_vector<int> B(rows*rows*cols, 0);
// fill diagonal matrix
thrust::scatter(A.begin(), A.end(), thrust::make_transform_iterator(thrust::make_counting_iterator(0),diag_index(rows)), B.begin());
print_matrix(A, rows, cols);
std::cout << std::endl;
print_matrix(B, rows, rows*cols);
return 0;
}
This example will output:
0 5 10 15
1 6 11 16
2 7 12 17
3 8 13 18
4 9 14 19
0 0 0 0 0 5 0 0 0 0 10 0 0 0 0 15 0 0 0 0
0 1 0 0 0 0 6 0 0 0 0 11 0 0 0 0 16 0 0 0
0 0 2 0 0 0 0 7 0 0 0 0 12 0 0 0 0 17 0 0
0 0 0 3 0 0 0 0 8 0 0 0 0 13 0 0 0 0 18 0
0 0 0 0 4 0 0 0 0 9 0 0 0 0 14 0 0 0 0 19
An alternate answer that does not use thrust is as follows:
_cudaMatrixTest << <5, 5 >> >(d_A, matrix_size.uiWA, matrix_size.uiHA, d_C, matrix_size.uiWC, matrix_size.uiHC);
__global__ void _cudaMatrixTest(float *A, int wA, int hA, float *C, int wC, int hC)
{
int ix, iy, idx;
ix = blockIdx.x * blockDim.x + threadIdx.x;
iy = blockIdx.y * blockDim.y + threadIdx.y;
idx = iy * wA + ix;
C[idx * wC + (idx % wC)] = A[threadIdx.x * wA + (ix / wC)];
}
where d_A is
0 5 10 15
1 6 11 16
2 7 12 17
3 8 13 18
4 9 14 19
Both answers are viable solutions. The question is, which is better/faster?

Algorithm for Simple Squared Squares

I want to split a square in unequal squares.
After some searching on the web I found this link.
This is the output I need:
Does anyone have idea for this?
As Yves Daoust said the algorithm to solve this is going to be slow. The first challenge is to determine what squares COULD be combined to fit into your big square. Then figure out if they WILL fit in there.
I would first filter by area.
To answer the first part you need to look for a combination of squares that will fit into your big one. There are likely multiple combinations, as a 5x5 square takes up the same area as a 3x3 together with a 4x4 square. This is an O(2^n) problem in itself.
Then attempt to arrange them.
I would make a matrix that is the size of your big square. Then starting at the topmost then right most index add in a square by marking the matrix positions as occupied by that square. Then move to the next unoccupied space, based on the previous rules adding an unused square. If no square fits then remove the previous square and continue to the next. This is a method begging for recursion.
As I said at the beginning this is a SLOW way to do it but it will give you a solution if one exists.
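I used a dynamic programming approach for this, but it only works up to about n = 50. It enumerates the sets of distinct squares whose areas sum to n * n, which covers the area filter rather than the arrangement step. I store each solution as a bitset for efficiency.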
You can compile the code yourself with:
$ g++ -O3 -std=c++11 squares.cpp -o squares
#include <bitset>
#include <iostream>
#include <list>
#include <vector>
using namespace std;
constexpr auto N = 116;
class FastSquareList {
public:
FastSquareList() = default;
FastSquareList(int i) { mask_.set(i); }
FastSquareList operator+(int i) const {
FastSquareList result = *this;
result.mask_.set(i);
return result;
}
bool has(int i) const { return mask_.test(i); }
void display() const {
for (auto i = 1; i <= N; ++i) {
if (has(i)) {
cout << i * i << " ";
}
}
cout << endl;
}
private:
bitset<N + 1> mask_;
};
int main() {
int n;
cin >> n;
vector<list<FastSquareList> > v(n * n + 1);
for (int i = 1; i <= n; ++i) {
v[i * i].push_back(i);
for (int a = i * i + 1; a <= n * n; ++a) {
int p = a - i * i;
for (const auto& l : v[p]) {
if (l.has(i)) {
continue;
}
v[a].emplace_back(l + i);
}
}
}
for (const auto& l : v[n * n]) {
l.display();
}
cout << "solutions count = " << v[n*n].size() << endl;
return 0;
}
an example:
$ ./squares
15
9 16 36 64 100
25 36 64 100
1 4 9 16 25 49 121
4 36 64 121
4 100 121
4 16 25 36 144
1 16 64 144
81 144
4 16 36 169
4 9 16 196
4 25 196
225
solutions count = 12

Resources