MPI zero count data exchange - parallel-processing

I have some questions regarding exchanging zero-count data (say, by MPI_Send and MPI_Recv), for which I have trouble finding answers in MPI docs:
1) As I understand, it is legal (by MPI standard) to have count equal 0. Or is it implementation defined?
1a) In functions like MPI_Gatherv can some counts be zero?
2) If count is zero, does buffer still have to be a valid pointer? Or can it be NULL/uninitialized?
3) Even if count is 0, there is some communication over the network, i.e. some header/meta data is still communicated. Am I right?

1) It is legal to have count equal to zero.
1a) It is legal for some counts to be zero in MPI_Gatherv() (and MPI_Scatterv(), MPI_Alltoallv() and friends).
2) The standard does not mandate the pointer to be valid if the count is zero
3) A zero-size message is still a message and, as a direct consequence, some metadata is exchanged. MPI_Recv(..., count=0, ...) only returns after a zero-size message has been received (and hence sent), and that could not happen if nothing at all were sent.


Could a CRC32 key with a most or least significant bit of 0 be valid?

I have a server receiving UDP packets whose payload is a number of CRC32-checksummed 4-byte words. The header of each UDP packet has a 2-byte field holding the "repeating" key used for the words in the payload. As I understand it, CRC32 keys must start and end with a 1 in their binary representation. In other words, the least and most significant bits of the key must be 1, not 0. My issue is that, for example, in the first UDP packet received the key field reads 0x11BC, which has the binary representation 00010001 10111100. So the 1s are aligned to neither the left nor the right edge of the key word: there are 0s at both ends. Is my understanding of valid CRC32 keys wrong, then? I ask because I'm trying to write code that checks each word using the key as is, and it always seems to give a remainder, meaning every word in the payload has an error, yet the instructions I've been given guarantee that the first packet in the sample has no errors.
Although it is true that CRC polynomials always have the top and bottom bit set, often this is dealt with implicitly; a 32-bit CRC is actually a 33-bit calculation and the specified polynomial ordinarily omits the top bit.
So e.g. the standard quoted polynomial for a CCITT CRC16 is 0x1021, which does not have its top bit set.
It is normal to include the LSB, so if you're certain you know which way around the polynomial has been specified then either the top or the bottom bit of your word should be set.
However, for UDP purposes, have you possibly also made a byte ordering error on one side of the connection or the other? Network byte ordering is conventionally big endian, whereas most processors today are little endian. Is one side of the link switching byte order but not the other?
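The point about the implicit top bit can be made concrete with a 16-bit sketch. This is only an illustration built on the CCITT polynomial 0x1021 mentioned above, not the poster's actual protocol; the function names crc16_ccitt, reflect16 and swap_bytes16 are made up for this example, and the 0xFFFF initial value is an assumption (a common convention, not something the question states).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-16 with the CCITT polynomial as it is usually quoted: 0x1021.
// The x^16 term (bit 16 of the full 17-bit value 0x11021) is implicit: it is
// the bit that falls off the top of the 16-bit register on each left shift.
std::uint16_t crc16_ccitt(const std::uint8_t* data, std::size_t len,
                          std::uint16_t init = 0xFFFF) {
    assert(data != nullptr || len == 0);
    const std::uint16_t poly = 0x1021;  // top bit omitted, bottom bit set
    std::uint16_t crc = init;
    for (std::size_t i = 0; i < len; ++i) {
        crc = static_cast<std::uint16_t>(crc ^ (data[i] << 8));
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 0x8000)
                crc = static_cast<std::uint16_t>((crc << 1) ^ poly);
            else
                crc = static_cast<std::uint16_t>(crc << 1);
        }
    }
    return crc;
}

// Bit-reverse a 16-bit value: the polynomial written "the other way around".
std::uint16_t reflect16(std::uint16_t v) {
    std::uint16_t r = 0;
    for (int i = 0; i < 16; ++i) {
        r = static_cast<std::uint16_t>((r << 1) | (v & 1));
        v = static_cast<std::uint16_t>(v >> 1);
    }
    return r;
}

// Swap the two bytes of a 16-bit value (big endian <-> little endian).
std::uint16_t swap_bytes16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}
```

With these definitions, reflect16(0x1021) gives 0x8408, which has its top bit set but not its bottom bit, matching the "either the top or the bottom bit" remark. It may also be worth noting that swap_bytes16(0x11BC) gives 0xBC11, which has both its top and bottom bits set, so a byte-order mix-up looks like a plausible thing to check first.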

maximum field number in protobuf message

The official documentation for protocol buffers, https://developers.google.com/protocol-buffers/docs/proto3, says the maximum field number for fields in a protobuf message is 2^29-1. But why is there this limit?
Can anyone please explain in some detail? I am a newbie to this.
I read the answers to the question "why 2^29-1 is the biggest key in protocol buffers".
But I am still not clear on it.
Each field in an encoded protocol buffer has a header (called key or tag) prefixed to the actual encoded value. The encoding spec defines this key:
Each key in the streamed message is a varint with the value (field_number << 3) | wire_type – in other words, the last three bits of the number store the wire type.
Here the spec says the tag is a varint whose lowest 3 bits encode the wire type. A varint can encode a 64-bit value, so going by this definition alone the limit would be 2^61-1.
In addition to this, the Language Guide narrows this down to a 32 bit value at max.
The smallest field number you can specify is 1, and the largest is 2^29 - 1, or 536,870,911.
The reasons for this are not given, so I can only speculate about them:
Artificial limit as no one is expecting a message to have that many fields. Just think about fitting a message with that many fields into memory.
As the key is a varint, it isn't simply the next 4 bytes in the raw buffer, but rather a variable number of bytes (Java code reading a varint32). Each byte has 7 bits of actual data and 1 bit indicating whether the end has been reached. It could be that, for performance reasons, limiting the range was deemed better.
Since proto3 is the 3rd version of protocol buffers, it could be that either proto1 or proto2 defined the tag to be a varint32. To keep backwards compatibility this limit is still true in proto3 today.
Because of this line:
#define GOOGLE_PROTOBUF_WIRE_FORMAT_MAKE_TAG(FIELD_NUMBER, TYPE) \
static_cast<uint32>((static_cast<uint32>(FIELD_NUMBER) << 3) | (TYPE))
This line creates a "tag", which leaves only 29 (32 - 3) bits to store the field number.
I don't know why Google uses uint32 instead of uint64 though, since the key is encoded as a varint; maybe they think 2^29-1 fields is large enough for a single message declaration.
I suspect this is simply so that a field-header (wire-type and tag-number) can be decoded and handled as a 32-bit value. The wire-type is always the 3 least significant bits, leaving 29 bits for the tag number. Technically "varint" should support 64 bits, but it makes sense to limit it to reasonable numbers, not least because "varint" encoding means that larger numbers take more bytes to encode.
Edit: I realise now that this is similar to the linked post, but... it remains true! Each field in protobuf is prefixed by a "varint" that expresses what field (tag-number) follows, and what data type it is (wire-type). The latter is important especially so that unexpected fields (version differences) can be stored or skipped correctly. It is convenient for that field-header to be trivially processed by most frameworks, and most frameworks are fine with 32-bit integers.
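To make the 32 - 3 = 29 arithmetic concrete, here is a minimal sketch of packing and unpacking a field header in a 32-bit value. The names make_tag, tag_field_number, tag_wire_type and kMaxFieldNumber are invented for this illustration and are not the protobuf library's API; only the (field_number << 3) | wire_type layout and the wire-type values come from the encoding spec.

```cpp
#include <cassert>
#include <cstdint>

// Wire types live in the low 3 bits of the key (values from the encoding spec).
enum WireType : std::uint32_t {
    kVarint = 0,
    kFixed64 = 1,
    kLengthDelimited = 2,
    kFixed32 = 5
};

// Largest field number that still fits next to 3 wire-type bits in 32 bits.
const std::uint32_t kMaxFieldNumber = (1u << 29) - 1;  // 536,870,911

std::uint32_t make_tag(std::uint32_t field_number, WireType type) {
    assert(field_number >= 1 && field_number <= kMaxFieldNumber);
    return (field_number << 3) | type;
}

std::uint32_t tag_field_number(std::uint32_t tag) { return tag >> 3; }
std::uint32_t tag_wire_type(std::uint32_t tag) { return tag & 0x7u; }
```

For example, make_tag(1, kVarint) is 0x08, and make_tag(kMaxFieldNumber, kFixed32) is 0xFFFFFFFD, which uses all 32 bits. A field number of 2^29 would need a 33rd bit, which is one way to see where the limit comes from if the header is to be handled as a 32-bit integer.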
This is another question rather than a comment. The document says:
Field numbers in the range 16 through 2047 take two bytes. So you
should reserve the numbers 1 through 15 for very frequently occurring
message elements. Remember to leave some room for frequently occurring
elements that might be added in the future.
Because for the first byte, top 5 bits are used for field number, and bottom 3 bits for field type, isn't it that field number from 31 (because zero is not used) to 2047 take two bytes? (and I also guess the second bytes' lower 3 bits are used also for field type.. I'm in the middle of reading it, so I'll fix it when I know it)
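One way to check the byte counts is to compute the varint length of the key directly. In a varint, each byte carries only 7 payload bits, because the top bit of every byte is the continuation flag. With 3 of those 7 bits taken by the wire type, the first byte has room for just 4 bits of field number, hence 1 through 15. The sketch below assumes nothing beyond the (field_number << 3) | wire_type layout; the names varint_size and tag_size are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes a value occupies as a base-128 varint
// (7 payload bits per byte, 1 continuation bit).
int varint_size(std::uint64_t value) {
    int bytes = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++bytes;
    }
    return bytes;
}

// Encoded size of the key for a field number. The wire type only fills the
// low 3 bits, so it never changes the length for field numbers >= 1.
int tag_size(std::uint32_t field_number) {
    assert(field_number >= 1 && field_number <= (1u << 29) - 1);
    return varint_size(static_cast<std::uint64_t>(field_number) << 3);
}
```

Here tag_size(15) is 1 and tag_size(16) is 2, while tag_size(2047) is 2 and tag_size(2048) is 3, which agrees with the ranges quoted from the documentation. The largest allowed field number, 2^29 - 1, gives a 5-byte key.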

Is there any algorithm for random number generation whose pattern cannot be revealed?

Is it possible to generate random numbers such that the pattern for getting the next random number never repeats, even until the universe ends?
I read this security rule-of-thumb:
All processes which require non-trivial random numbers MUST attempt to
use openssl_pseudo_random_bytes(). You MAY fallback to
mcrypt_create_iv() with the source set to MCRYPT_DEV_URANDOM. You MAY
also attempt to directly read bytes from /dev/urandom. If all else
fails, and you have no other choice, you MUST instead generate a value
by strongly mixing multiple sources of available random or secret
values.
http://phpsecurity.readthedocs.org/en/latest/Insufficient-Entropy-For-Random-Values.html
In layman's terms, no. In order to generate a particular form of data, such as a string or integer, you must have an algorithm of some sort, which obviously cannot be 100% untraceable...
Basically, the final product must come from a series of events (an algorithm), which is impossible to keep 'unrevealed'.
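#include <cstdlib>  // rand()
// Stand-in for an arbitrary-precision integer; a fixed-width type eventually overflows.
typedef unsigned long long bignum;
bignum getUniqueRandom()
{
    static bignum sum = 0;
    // rand() may return 0, so add 1 to keep every result strictly greater than the last.
    sum += static_cast<bignum>(rand()) + 1;
    return sum;
}
That way the next random number will always be greater than the previous one (by a random amount between 1 and RAND_MAX + 1), and as a result the numbers returned will never repeat.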
edit:
The actual approach when randomness requirements are so high is to use a hardware random number generator; there are, for example, chips that measure atomic decay events from background radiation to generate truly random seeds. Of course, the nature of randomness is such that there is never a guarantee a pattern can't repeat, or you'd be damaging the actual randomness of the result. But the pattern can't be reproduced by any technical/mathematical means, so the repeats are meaningless.
At every step in the execution of a computer program, the total internal state determines what the next total internal state will be. This internal state must be represented by some number of bits--all of the memory used by the program, registers of the processor, anything else that affects it. There can only be 2**N possible states given N bits of state information.
Since any given state T will lead to the same state T+1 (that's what "deterministic" means), the algorithm must eventually repeat itself after no more than 2**N steps. So what limits the cycle length of an RNG is the number of bits of internal state. A simple LCG might have only 32 bits of state, and therefore a cycle <= 2^32. Something like Mersenne Twister has 19968 bits of internal state, and its period is 2^19937-1.
So for any deterministic algorithm to be "unrepeatable in the history of the Universe", you'll probably need most of the atoms of the Universe to be memory for its internal state.
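The 2**N bound is easy to observe on a toy generator. The following sketch uses an 8-bit LCG, x -> (a*x + c) mod 256, purely as an illustration; the function name lcg8_cycle_length and the parameter choices are assumptions of this example, not anything from a real library.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Length of the cycle that x -> (a*x + c) mod 256 eventually enters when
// started from `seed`. With 8 bits of state it can never exceed 2^8 = 256.
int lcg8_cycle_length(std::uint8_t a, std::uint8_t c, std::uint8_t seed) {
    std::vector<int> first_seen(256, -1);  // step at which each state was first visited
    std::uint8_t x = seed;
    int step = 0;
    while (first_seen[x] == -1) {
        first_seen[x] = step++;
        x = static_cast<std::uint8_t>(a * x + c);  // truncation wraps modulo 2^8
    }
    // x is the first revisited state; the steps since its first visit form the cycle.
    int length = step - first_seen[x];
    assert(length >= 1 && length <= 256);
    return length;
}
```

With a = 5 and c = 3 the generator visits all 256 states before repeating, which is the best an 8-bit state can do. Weaker parameters such as a = 1, c = 2 repeat after only 128 steps. Either way, the loop is guaranteed to terminate because there are only 256 possible states to visit.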

Common algorithm example (with counters)

I'm trying to find an example of a common algorithm (the sort that you could find in a basic computer science / telecoms text book) which meets the following conditions:
there is a counter a (which could count integers, bits, time etc.)
a is reset when either: (i) a reaches or exceeds a predefined threshold x or (ii) another event occurs.
Ideally (although not strictly necessary) the "other event" that causes a to reset would be another counter b. Both a and b would reset if b reaches a predefined threshold y (and similarly, both a and b would reset if a reaches the predefined threshold x). This could be presented by:
Initialize x and y thresholds
while (true)
    if (*particular event relevant to a*)
        a++;
    if (*particular event relevant to b*)
        b++;
    if (a>=x) OR (b>=y)
        *Something happens*
        a=0;
        b=0;
Any thoughts would be much appreciated!
Many thanks
What about higher-level network packet reception?
1. received_bytes = 0
2. You ask your lower-level layer to receive min(maximum_packet_size, request_length - received_bytes) bytes.
   a. You take the bytes you've got and add them to your buffer: received_bytes += n_bytes_received_this_time
   b. If the number of received bytes hasn't reached the size you need, you repeat step 2.
   (c.) If an error occurs, you handle that.
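The reception loop described above could be sketched roughly as follows. The lower layer is modelled as a callback, LowerLayerRecv, which is a hypothetical interface invented for this example (as is the function name receive_exact). Treating a return value of 0 as an error is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical lower-layer interface: writes up to `max_bytes` into `dst`
// and returns how many bytes were delivered (0 is treated as an error here).
using LowerLayerRecv =
    std::function<std::size_t(unsigned char* dst, std::size_t max_bytes)>;

std::vector<unsigned char> receive_exact(std::size_t request_length,
                                         std::size_t maximum_packet_size,
                                         const LowerLayerRecv& lower_recv) {
    assert(maximum_packet_size > 0);
    std::vector<unsigned char> buffer(request_length);
    std::size_t received_bytes = 0;  // the counter starts at zero for each request
    while (received_bytes < request_length) {  // loop until the threshold is reached
        // Never ask for more than one packet, nor more than is still missing.
        std::size_t want = std::min(maximum_packet_size,
                                    request_length - received_bytes);
        std::size_t n = lower_recv(buffer.data() + received_bytes, want);
        if (n == 0 || n > want)  // the "other event": an error aborts the request
            throw std::runtime_error("lower layer failed");
        received_bytes += n;  // advance the counter by what actually arrived
    }
    return buffer;
}
```

In terms of the question, received_bytes plays the role of counter a and request_length the role of threshold x: the counter starts over for the next request either when the threshold is reached or when an error interrupts the transfer.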

MPI_Scatter redundant parameters?

My question is rather simple, the MPI_Scatter function definition is:
#include <mpi.h>
void MPI::Comm::Scatter(const void* sendbuf, int sendcount,
const MPI::Datatype& sendtype, void* recvbuf,
int recvcount, const MPI::Datatype& recvtype,
int root) const
Are 'sendcount' and 'sendtype' redundant?
In which cases can it happen that sendcount != recvcount?
Edit:
Maybe some clarification is needed about the question. I understand that the reason may be that, for the root, the data is some 'struct X' and for the receivers it is some 'struct Y', which somehow also makes sense (it all fits 'OK').
If that's the case... I don't get why it is necessary to say again that the total size of the data expected to be received is the same as the size of the data sent. If it's just a matter of casting the view of the data, I'd only do the cast. In fact, the buffer is a (void *).
MPI allows the datatypes on the sending and on the receiving end to be different, as long as they are constructed from the same basic datatypes. There are many cases where this comes in handy, e.g. scattering rows of a matrix from the root process into columns in the other processes. Sending and receiving rows is straightforward in C and C++, as the memory layout of the matrices is row-major. Sending and receiving columns requires that a special strided vector type is constructed first. Usually this type is constructed for a specified number of rows and columns, and then one has to supply a count of 1 when receiving the data.
There are also many other cases when sendcount and recvcount might differ. Mind also that recvcount does not specify the size of the message to be received but rather the capacity of the receive buffer and that capacity may be way larger than the size of the message.
MPI_Scatter() breaks the message into equal pieces, so that each one is processed either on a child node or on the root itself. Knowing this:
Are 'sendcount' and 'sendtype' redundant?
- How could they be? sendcount is the number of elements sent, and sendtype is the type of those elements. The two carry different information.
And for the last question:
In which cases can it happen that sendcount != recvcount?
- When you want to sort a sequence of numbers, you send blocks of size N and type int to your nodes. You want the same back, but sorted.
