what is "|" operator in Go? [closed] - go

What does this | operator do in Go? I found it in:
import "log"
log.SetFlags(log.Ldate | log.Lmicroseconds | log.Llongfile)
When I checked the log.SetFlags(flag) method, it accepts an int. I don't understand how it operates on this int value.

The | operator is bitwise OR, as mentioned in the Arithmetic operators section of the spec.
It performs a bitwise OR of two integers; in this case, it combines multiple flags into one.
In the log package, the flags have the following values:
const (
    Ldate         = 1 << iota     // the date in the local time zone: 2009/01/23
    Ltime                         // the time in the local time zone: 01:23:23
    Lmicroseconds                 // microsecond resolution: 01:23:23.123123. assumes Ltime.
    Llongfile                     // full file name and line number: /a/b/c/d.go:23
    Lshortfile                    // final file name element and line number: d.go:23. overrides Llongfile
    LUTC                          // if Ldate or Ltime is set, use UTC rather than the local time zone
    Lmsgprefix                    // move the "prefix" from the beginning of the line to before the message
    LstdFlags     = Ldate | Ltime // initial values for the standard logger
)
Ldate: 1 (or 0b00001)
Lmicroseconds: 4 (or 0b00100)
Llongfile: 8 (or 0b01000)
Performing a bitwise OR of all three gives you 0b01101, or 13. This is a common way of using "bit flags" and combining them.
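For illustration, here is a minimal runnable sketch (standard library only) that prints the individual flag bits and their combination:

package main

import (
    "fmt"
    "log"
)

func main() {
    flags := log.Ldate | log.Lmicroseconds | log.Llongfile
    // Each flag occupies its own bit, so OR-ing them never loses information.
    fmt.Printf("%05b | %05b | %05b = %05b (%d)\n",
        log.Ldate, log.Lmicroseconds, log.Llongfile, flags, flags)
    // Output: 00001 | 00100 | 01000 = 01101 (13)
    log.SetFlags(flags)
}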

The | operator is an arithmetic operator called bitwise OR, used in integer operations.
Example
var a uint = 60 /* 60 = 0011 1100 */
var b uint = 13 /* 13 = 0000 1101 */
c := a | b      /* 61 = 0011 1101 */
Here,
log.Ldate, log.Lmicroseconds, and log.Llongfile all represent int values.
The bitwise OR of their values is 1 | 4 | 8 = 13, so the flags are set to 13, which is an int value.
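A sketch of the other direction, i.e., how code that receives such an int (like log.SetFlags) can test whether a particular flag is present, using bitwise AND:

package main

import (
    "fmt"
    "log"
)

func main() {
    flags := log.Ldate | log.Lmicroseconds | log.Llongfile // 1 | 4 | 8 = 13
    // AND with a single flag is non-zero exactly when that bit is set.
    fmt.Println(flags&log.Ldate != 0)     // true
    fmt.Println(flags&log.Ltime != 0)     // false: Ltime (2) was not OR'ed in
    fmt.Println(flags&log.Llongfile != 0) // true
}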

Related

Why does fmt.Printf() log a `const` in another package like a memory address?

How does fmt.Printf() treat %x when the parameter is a var or a const? I read in books that a const has no type before one is specified, which I find hard to understand.
package main

import (
    "fmt"
    "time"
)

const AValue int32 = 1049088

func main() {
    fmt.Printf("%#x\n", 1049088)
    fmt.Printf("%#x\n", AValue)
    fmt.Printf("%#x\n", int(time.Friday))
    fmt.Printf("%#x\n", time.Friday)
}
This logs:
0x100200       // 1049088
0x100200       // AValue
0x5            // int(time.Friday)
0x467269646179 // time.Friday is of type time.Weekday (an int)
Is `0x467269646179` some kind of address of time.Friday?
From the docs:
Except when printed using the verbs %T and %p, special formatting
considerations apply for operands that implement certain interfaces.
In order of application:
...
5. If an operand implements method String() string, that method will be invoked to convert the object to a string, which will then be
formatted as required by the verb (if any).
For string the verb formatting is defined as:
%s the uninterpreted bytes of the string or slice
%q a double-quoted string safely escaped with Go syntax
%x base 16, lower-case, two characters per byte
%X base 16, upper-case, two characters per byte
So 0x467269646179 is the base-16, lower-case, two-characters-per-byte encoding of the output of time.Friday.String(), i.e. the string "Friday": 'F' = 0x46, 'r' = 0x72, 'i' = 0x69, 'd' = 0x64, 'a' = 0x61, 'y' = 0x79.
https://play.golang.org/p/avV-X2uiL1D
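For a quick check, here is a minimal runnable sketch reproducing the outputs above; %x on a plain string shows the same byte-per-two-hex-digits encoding directly:

package main

import (
    "fmt"
    "time"
)

func main() {
    // time.Friday.String() returns "Friday"; %x prints each byte as two
    // lower-case hex digits: 'F'=46 'r'=72 'i'=69 'd'=64 'a'=61 'y'=79.
    fmt.Printf("%x\n", "Friday")          // 467269646179
    fmt.Printf("%#x\n", time.Friday)      // 0x467269646179
    fmt.Printf("%#x\n", int(time.Friday)) // 0x5
}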

Algorithm for hashing/encoding multiple values into a single integer value

There is an algorithm for "hashing" or encoding multiple values into a single integer by assigning exponentially increasing numeric values to the individual values. This approach is used in particular in Windows DLLs.
A possible use case can be a client application requesting a list of items matching certain status codes from an API.
For example, if we have the following values:
* open
* assigned
* completed
* closed
...we assign a numeric value to each:
* open - 1
* assigned - 2
* completed - 4
* closed - 8
etc., where each following value is 2x the previous.
Encoding
When we need to pass a combination of any of these values, we add up the corresponding numeric values. For example, for "open, assigned" it is 3, for "assigned, completed, closed" it is 14. This covers all of the unique combinations. As we can see, the "encoding" part is very straightforward.
Decoding
To decode the value, the only way I can think of is switch..case statements, like so (pseudocode):
1 = open
2 = assigned
3 = open + assigned
4 = completed
5 = open + completed
6 = assigned + completed
7 = open + assigned + completed
8 = closed
9 = open + closed
10 = assigned + closed
11 = open + assigned + closed
12 = completed + closed
13 = open + completed + closed
14 = assigned + completed + closed
15 = open + assigned + completed + closed
This algorithm obviously works under the following assumptions:
* each value is used only once
* both sides know the matching numeric values
Questions:
What is a more optimal way/algorithm to "decode" the values instead of the very elaborate switch..case statements?
Is there a name for this algorithm?
Note: the question is tagged with winapi mostly for discoverability. The algorithm is fairly universal.
What you are describing is formally known as a bit mask, where each bit in an integer is assigned a meaning. Bits are assigned numeric values that are powers of 2 (bit 0 = 2^0 = 1, bit 1 = 2^1 = 2, bit 2 = 2^2 = 4, bit 3 = 2^3 = 8, etc).
You can use the OR and AND bitwise logical operators to set/query individual bits in an integer, e.g.:
const DWORD State_Open      = 1;
const DWORD State_Assigned  = 2;
const DWORD State_Completed = 4;
const DWORD State_Closed    = 8;

void DoSomething(DWORD aStates)
{
    ...
    if (aStates & State_Open)
        // open is present
    else
        // open is not present
    if (aStates & State_Assigned)
        // assigned is present
    else
        // assigned is not present
    if (aStates & State_Completed)
        // completed is present
    else
        // completed is not present
    if (aStates & State_Closed)
        // closed is present
    else
        // closed is not present
    ...
}

DWORD lState = State_Open | State_Assigned | State_Completed | State_Closed;
// whatever combination you need ...
DoSomething(lState);
In Delphi/Pascal, this is better handled using a Set instead, which is internally implemented as a bit mask, e.g.:

type
  State = (State_Open, State_Assigned, State_Completed, State_Closed);
  States = set of State;

procedure DoSomething(aStates: States);
begin
  ...
  if State_Open in aStates then
    // open is present
  else
    // open is not present
  if State_Assigned in aStates then
    // assigned is present
  else
    // assigned is not present
  if State_Completed in aStates then
    // completed is present
  else
    // completed is not present
  if State_Closed in aStates then
    // closed is present
  else
    // closed is not present
  ...
end;

var
  lState: States;
begin
  ...
  lState := [State_Open, State_Assigned, State_Completed, State_Closed];
  // whatever combination you need ...
  DoSomething(lState);
  ...
end;
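To answer the first question directly: instead of enumerating all 16 combinations in a switch, decode by testing each bit with AND in a loop. Here is a minimal sketch in Go (the state names mirror the question; the algorithm is the same in any language):

package main

import "fmt"

// Each state gets its own bit: 1, 2, 4, 8.
const (
    StateOpen = 1 << iota
    StateAssigned
    StateCompleted
    StateClosed
)

var names = map[int]string{
    StateOpen:      "open",
    StateAssigned:  "assigned",
    StateCompleted: "completed",
    StateClosed:    "closed",
}

// decode lists the states whose bits are set in the encoded value.
func decode(encoded int) []string {
    var out []string
    for _, bit := range []int{StateOpen, StateAssigned, StateCompleted, StateClosed} {
        if encoded&bit != 0 {
            out = append(out, names[bit])
        }
    }
    return out
}

func main() {
    fmt.Println(decode(14)) // [assigned completed closed]
}

This way, adding a fifth state means adding one constant and one map entry, not doubling the number of switch cases.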

How to Create LMDB for Caffe Using C

I need to create LMDBs dynamically that can be read by Caffe's data layer, and the constraint is that only C is available for doing so. No Python.
Another person examined the byte-level contents of a Caffe-ready LMDB file here: Caffe: Understanding expected lmdb datastructure for blobs
This is a good illustrative example but obviously not comprehensive. Drilling down led me to the Datum message type, defined by caffe.proto, and the ensuing caffe.pb.h file created by protoc from caffe.proto, but this is where I hit a dead end.
The Datum class in the .h file defines a method that appears to be a promising lead:
void SerializeWithCachedSizes(::google::protobuf::io::CodedOutputStream* output) const
I'm guessing this is where the byte-level magic happens for encoding messages before they're sent.
Question: can anyone point me to documentation (or anything) that describes how the encoding works, so I can replicate an abridged version of it? In the illustrative example, the LMDB file contains MNIST data and metadata, and 0x08 seems to signify that the next value is "Number of Channels". And 0x10 and 0x18 designate heights and widths, respectively. 0x28 appears to designate an integer label being next. And so on, and so forth.
I'd like to gain a comprehensive understanding of all possible bytes and their meanings.
Additional digging yielded answers on the following page: https://developers.google.com/protocol-buffers/docs/encoding
Caffe.proto defines Datum by:
optional int32 channels = 1;
optional int32 height = 2;
optional int32 width = 3;
optional bytes data = 4;
optional int32 label = 5;
repeated float float_data = 6;
optional bool encoded = 7;
The LMDB record's header in the illustrative example cited above is "08 01 10 1C 18 1C 22 90 06", so with the Google documentation's decoder ring, these hexadecimal values begin to make sense:
08    = Field 1 tag: tags are encoded as (field_number << 3) | wire_type, so 0x08 is field 1, wire type 0 (varint)
01    = Value of Field 1 (i.e., number of channels) is 1
10    = Field 2 tag (field 2, wire type 0)
1C    = Value of Field 2 (i.e., height) is 28
18    = Field 3 tag (field 3, wire type 0)
1C    = Value of Field 3 (i.e., width) is 28
22    = Field 4 tag (field 4, wire type 2 = length-delimited)
90 06 = Length of Field 4 (i.e., number of data bytes) is 784 (= 28 x 28) using the varint encoding methodology
Given this, efficiently creating LMDB entries directly in C for custom, non-image data sets that are readable by Caffe's data layer becomes straightforward.
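As a sanity check on the decoding above, a minimal varint/tag encoder reproduces the example header byte for byte. This sketch is in Go for brevity (the wire format itself is language-agnostic, and the helper names are illustrative):

package main

import "fmt"

// appendVarint appends v in protobuf base-128 varint encoding:
// 7 bits per byte, least-significant group first, MSB = continuation.
func appendVarint(buf []byte, v uint64) []byte {
    for v >= 0x80 {
        buf = append(buf, byte(v)|0x80)
        v >>= 7
    }
    return append(buf, byte(v))
}

// appendTag appends a field tag, encoded as (field_number << 3) | wire_type.
func appendTag(buf []byte, field, wire uint64) []byte {
    return appendVarint(buf, field<<3|wire)
}

func main() {
    var b []byte
    b = appendTag(b, 1, 0) // channels, wire type 0 (varint)
    b = appendVarint(b, 1)
    b = appendTag(b, 2, 0) // height
    b = appendVarint(b, 28)
    b = appendTag(b, 3, 0) // width
    b = appendVarint(b, 28)
    b = appendTag(b, 4, 2)   // data, wire type 2 (length-delimited)
    b = appendVarint(b, 784) // length prefix; the 784 data bytes would follow
    fmt.Printf("% X\n", b)   // 08 01 10 1C 18 1C 22 90 06
}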

How to create a hack-proof unique code

I am creating a bunch of unique codes in order to run a promotional campaign.
The campaign will run for a total of 20 million unique items. The validity of each code will be one year. I am currently looking for the best possible option.
I can use only 0-9 and A-Z in the code, so that limits me to 36 unique characters. The end user will need to key the unique code into the system to get offers. The unique codes will not be tied to any user or transaction to begin with.
One way to generate unique codes is to create incremental numbers and then convert them to base 36. The problem with this is that it's easily hackable: users can start entering codes in incremental fashion and redeem offers not meant for them. I am thinking of introducing some kind of randomization. Need suggestions regarding the same.
Note: the limit is a maximum of 8 characters per code.
Use a cryptographically strong random number generator to generate 40-bit numbers (i.e., random 5-byte arrays). Converting each number to base 36 will yield a sequence of random codes of up to eight characters. Run an additional check on each code to make sure that there are no duplicates; using a hash set on the converted strings will let you perform this task in a reasonable time.
Here is an example implementation in Java:
Set<String> codes = new HashSet<>();
SecureRandom rng = new SecureRandom();
byte[] data = new byte[5];
for (int i = 0; i != 100000; i++) {
    rng.nextBytes(data);
    long val = ((long)(data[0] & 0xFF))
             | (((long)(data[1] & 0xFF)) << 8)
             | (((long)(data[2] & 0xFF)) << 16)
             | (((long)(data[3] & 0xFF)) << 24)
             | (((long)(data[4] & 0xFF)) << 32);
    String s = Long.toString(val, 36);
    codes.add(s);
}
System.out.println("Generated " + codes.size() + " codes.");
Use a Guid (C# code):
string code = Guid.NewGuid().ToString().Substring(0,8).ToUpperInvariant();
Since we have a hexadecimal representation, we get the digits and the characters a to f. That gives 16^8 possible codes, which is more than 4 billion: one valid code in every ~214 candidates for 20 million codes.
Guid.NewGuid().ToString() yields a string like "6b984c2f-5866-4745-ac34-d5088a56070f". Since the first group has a length of 8 characters we can just take the first 8 chars and convert them to upper case. The result looks like "6B984C2F".
Note that this can yield duplicate codes. We can avoid this like this:
var codes = new HashSet<string>();
while (codes.Count < 20000000) {
    string code = Guid.NewGuid().ToString().Substring(0, 8).ToUpperInvariant();
    codes.Add(code);
}
A HashSet allows you to add an item more than once but always keeps only one copy (just like mathematical sets).
If you want to use the full range of possible values, the one-liner above does not do it. With the whole alphabet plus digits we get 36^8 ≈ 2.8 * 10^12 possible codes, i.e., one valid code in every ~141,055 candidates for 20 million codes. That's better, but still not completely hack-proof. You will also need to limit the number of entry attempts, use a CAPTCHA, etc.
const string Base = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const int CodeLength = 8;
const int NumCodes = 20000000;

var random = new Random();
var codes = new HashSet<string>();
var chars = new char[CodeLength];
while (codes.Count < NumCodes) {
    for (int i = 0; i < CodeLength; i++) {
        int pos = random.Next(Base.Length);
        chars[i] = Base[pos];
    }
    string code = new string(chars);
    codes.Add(code);
}
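For completeness, here is a minimal sketch of the same idea in Go, combining the cryptographically strong RNG from the first answer with the full 36-character alphabet (randomCode is an illustrative name, not a library function):

package main

import (
    "crypto/rand"
    "fmt"
    "math/big"
)

const alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

// randomCode draws each of the 8 characters uniformly from the
// 36-character alphabet using a cryptographically strong RNG.
func randomCode() (string, error) {
    buf := make([]byte, 8)
    max := big.NewInt(int64(len(alphabet)))
    for i := range buf {
        n, err := rand.Int(rand.Reader, max)
        if err != nil {
            return "", err
        }
        buf[i] = alphabet[n.Int64()]
    }
    return string(buf), nil
}

func main() {
    codes := make(map[string]struct{}) // map keys deduplicate, like a hash set
    for len(codes) < 10 {              // use 20000000 for the real campaign
        code, err := randomCode()
        if err != nil {
            panic(err)
        }
        codes[code] = struct{}{}
    }
    fmt.Println("Generated", len(codes), "codes.")
}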

UUID format: 8-4-4-4-12 - Why?

Why are UUIDs presented in the format "8-4-4-4-12" (digits)? I've had a look around for the reason but can't find the decision that calls for it.
Example of UUID formatted as hex string:
58D5E212-165B-4CA0-909B-C86B9CEE0111
It's separated into time, version, clock_seq_hi, clock_seq_lo, and node fields, as indicated in the following RFC.
From the IETF RFC4122:
4.1.2. Layout and Byte Order
To minimize confusion about bit assignments within octets, the UUID
record definition is defined only in terms of fields that are
integral numbers of octets. The fields are presented with the most
significant one first.
Field                      Data Type                Octet  Note
time_low                   unsigned 32-bit integer  0-3    The low field of the timestamp
time_mid                   unsigned 16-bit integer  4-5    The middle field of the timestamp
time_hi_and_version        unsigned 16-bit integer  6-7    The high field of the timestamp multiplexed with the version number
clock_seq_hi_and_reserved  unsigned 8-bit integer   8      The high field of the clock sequence multiplexed with the variant
clock_seq_low              unsigned 8-bit integer   9      The low field of the clock sequence
node                       unsigned 48-bit integer  10-15  The spatially unique node identifier
In the absence of explicit application or presentation protocol
specification to the contrary, a UUID is encoded as a 128-bit object,
as follows:
The fields are encoded as 16 octets, with the sizes and order of the
fields defined above, and with each field encoded with the Most
Significant Byte first (known as network byte order). Note that the
field names, particularly for multiplexed fields, follow historical
practice.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| time_low |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| time_mid | time_hi_and_version |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|clk_seq_hi_res | clk_seq_low | node (0-1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| node (2-5) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The format is defined in IETF RFC4122 in section 3. The output format is defined where it says "UUID = ..."
3. Namespace Registration Template

Namespace ID: UUID
Registration Information:
Registration date: 2003-10-01
Declared registrant of the namespace:
JTC 1/SC6 (ASN.1 Rapporteur Group)
Declaration of syntactic structure:
A UUID is an identifier that is unique across both space and time,
with respect to the space of all UUIDs. Since a UUID is a fixed
size and contains a time field, it is possible for values to
rollover (around A.D. 3400, depending on the specific algorithm
used). A UUID can be used for multiple purposes, from tagging
objects with an extremely short lifetime, to reliably identifying
very persistent objects across a network.
The internal representation of a UUID is a specific sequence of
bits in memory, as described in Section 4. To accurately
represent a UUID as a URN, it is necessary to convert the bit
sequence to a string representation.
Each field is treated as an integer and has its value printed as a
zero-filled hexadecimal digit string with the most significant
digit first. The hexadecimal values "a" through "f" are output as
lower case characters and are case insensitive on input.
The formal definition of the UUID string representation is
provided by the following ABNF [7]:
UUID                   = time-low "-" time-mid "-"
                         time-high-and-version "-"
                         clock-seq-and-reserved
                         clock-seq-low "-" node
time-low               = 4hexOctet
time-mid               = 2hexOctet
time-high-and-version  = 2hexOctet
clock-seq-and-reserved = hexOctet
clock-seq-low          = hexOctet
node                   = 6hexOctet
hexOctet               = hexDigit hexDigit
hexDigit               = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
                         "a" / "b" / "c" / "d" / "e" / "f" /
                         "A" / "B" / "C" / "D" / "E" / "F"
128 bits
The "8-4-4-4-12" format is just for reading by humans. The UUID is really a 128-bit number.
Note that the string format requires more than double the storage of the 128-bit number (36 characters vs. 16 bytes). I would suggest using the number internally, and using the string format only when it needs to be shown in a UI or exported to a file.
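A short sketch illustrating the point: stripping the hyphens leaves 32 hex digits, which decode to exactly 16 bytes (128 bits). Go is used here just for illustration:

package main

import (
    "encoding/hex"
    "fmt"
    "strings"
)

func main() {
    s := "58D5E212-165B-4CA0-909B-C86B9CEE0111"
    // Drop the hyphens; the remainder is 32 hex digits = 16 bytes = 128 bits.
    raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
    if err != nil {
        panic(err)
    }
    fmt.Println(len(raw), "bytes") // 16 bytes
    fmt.Printf("% X\n", raw)       // 58 D5 E2 12 16 5B 4C A0 90 9B C8 6B 9C EE 01 11
}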
