What do the bytes (like `0x34`) mean in the module specs? - cosmos-sdk

In the specs of the Cosmos SDK, what does the byte 0x34 in a line like this mean?
Redelegations: 0x34 | DelegatorAddr | ValidatorSrcAddr | ValidatorDstAddr -> amino(redelegation)
And I guess amino is the decoder?

When you build a blockchain with Cosmos SDK, there is a key-value store abstraction for interacting with the on-chain data. Each module has its own KV store. A module may want to store different types of data in its store and so prefixes are used to differentiate between different types in the store.
In your example, in the staking module there are "redelegations" and each redelegation is stored in the store with a prefix of 0x34. There are keys for other types as well.
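As a rough illustration of how such a prefixed key might be assembled, here is a minimal Python sketch. It assumes the | in the spec line means "concatenate these byte strings", and it uses made-up 20-byte addresses. The helper name redelegation_key is hypothetical; it is not the SDK's own function (the SDK is written in Go, and real versions may lay the key out differently).

```python
# Prefix byte for redelegations, as given in the spec line (0x34 == 52).
REDELEGATION_PREFIX = bytes([0x34])


def redelegation_key(delegator: bytes, validator_src: bytes, validator_dst: bytes) -> bytes:
    """Join the prefix byte and the three addresses into one store key.

    Hypothetical helper for illustration only.
    """
    return REDELEGATION_PREFIX + delegator + validator_src + validator_dst


# Placeholder 20-byte addresses, purely for illustration.
delegator = bytes([0x01]) * 20
validator_src = bytes([0x02]) * 20
validator_dst = bytes([0x03]) * 20

key = redelegation_key(delegator, validator_src, validator_dst)
print(key.hex())
```

Because every redelegation key starts with the same byte, a store can iterate over "all redelegations" by scanning the keys that share that prefix.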

0x34 is the hexadecimal number 34, which corresponds to the decimal number 3*16+4=52. In these specs the | is not a logical or; it denotes concatenation of the byte strings on either side, so the key is the prefix byte followed by the three addresses.

Related

What is the name 0x41 (65) in a snmp variable bindings reply?

I am attempting to understand SNMP (in general, and v3). The goal is to include an snmp agent in an embedded device running an RTOS.
I've already been through over a dozen RFCs with at least another dozen more to go. Each one creates more questions than it answers. (1052, 1065, 1067, 1155, 1156, 1157, 1212, 1213, 1592, 1905, 2578, 2579, 2580, 3410, 3411, 3412, 3413, 3414, 3415, 3416, 3417, 3418, 3584... )
I implemented mDNS-SD and 802.1X EAPOL with just a couple RFCs and it wasn't this confusing.
Many of the reviews of books I considered complain of the same inconsistency and vagueness in the material. I bought a couple of books that had better reviews.
Searching online isn't getting anywhere largely because the keywords aren't finding things I want answers to. So I must not even know the best keywords to search with.
Eventually, I decided to just try to reverse engineer what's going on. I installed Wireshark on a Linux PC, along with the snmpd and snmp tools, so I could sniff it. Here is what I have, and I can't align what I see with what I read.
This is a v3 sniff. It's a reply to the first request from a manager. This question is just zeroing in on one of the things that I want to understand. I can't decode and examine a plaintext PDU, because I can't get a request in v2 or v1.
Wireshark shows this reply to a manager. It's apparently the first step in whatever authentication is to be used.
The book I have shows this as the protocol on the wire. And I am trying to parse out the variable bindings.
Here are the variable bindings from Wireshark
A "sequence" that is 15 bytes long (x30 x0f)
This, from the RFC, says that the list is a SEQUENCE of VarBinds, where each VarBind is the object name, and the value in ObjectSyntax. So it's looking okay so far.
Here is the next segment inside the SEQUENCE (Wireshark highlighted all 14 bytes)
An object ID that is 10 bytes long (x06, x0a)
Here is the actual object:
The objectName is the object ID, and it is x2b x6 x1 x6 x3 xf x1 x1 x4 x0 or (1.3).6.1.6.3.15.1.1.4.0
Given that this is ISO, ORG, DOD, INTERNET, 6?... I have to assume "6" is an object under internet branch I've not yet come across. Likely something to do with the v3 security.
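For reference, the byte-to-OID step above can be reproduced with a short Python sketch. It assumes standard BER rules for OBJECT IDENTIFIER content octets: the first sub-identifier packs the first two arcs as 40*X+Y (so x2b = 43 = 40*1+3 gives 1.3), and each later sub-identifier is base-128 with the high bit marking continuation. The function name decode_oid is just an illustrative choice.

```python
def decode_oid(content: bytes) -> str:
    """Decode the content octets of a BER OBJECT IDENTIFIER into dotted form."""
    subids = []
    value = 0
    for octet in content:
        # Each octet carries 7 bits; the high bit means "more octets follow".
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:
            subids.append(value)
            value = 0
    first = subids[0]
    # The first sub-identifier encodes the first two arcs as 40*X + Y.
    if first < 80:
        arcs = [first // 40, first % 40]
    else:
        arcs = [2, first - 80]
    return ".".join(str(arc) for arc in arcs + subids[1:])


# The ten content octets from the capture (after the x06 x0a header).
oid_bytes = bytes([0x2B, 0x06, 0x01, 0x06, 0x03, 0x0F, 0x01, 0x01, 0x04, 0x00])
print(decode_oid(oid_bytes))
```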
Next, is the value.
This is a type x41 (65), with a length of 1, and a value of 7.
Well, in "ObjectSyntax" what is x41? I can't find it defined anywhere.
For that matter, all these RFCs use words for identifiers, and I can find only a fraction of what their actual numeric values are.
Wireshark knew what it was... It's saying "Counter32"... is that what x41 is supposed to be? If so, it's nowhere near 32 bits. It's only one byte. Again, I'd like to find its definition.
Also, somewhere, (I can't even recall which RFC) it said the reply to an OID request is to append the value to the requested object, not replace the zero (example: request: 1.3.6.1.4.300.1 -> reply 1.3.6.1.4.300.1.15 so it is a value of 15 ). This OID has a trailing zero, and I'm not sure why.
Can anyone point me to some useful, concise, condensed information explaining this material? Every RFC requires that I go back and read some previous (and sometimes obsoleted) RFC, and I've now got over 25 of them already. I don't think it should take this many RFCs to be able to write a "simple" snmp agent. A month of researching, and most of what I have to show for it is how to read MIB files. Although that takes some mental gymnastics too.
"Simple" is rather deceptive (as more than one book reviewer has stated).
RFC 1157 specifies that SNMP messages are encoded with "a subset of the basic encoding rules of ASN.1". I don't think the official basic encoding rules (BER) specification is available for free, but it's not hard to find explainers online (here's one I found with a simple search). To your question about the 0x41 byte, this is a BER identifier. The 2 most-significant bits (01) tell you the "class" (i.e. something like a namespace) is "application". The "form" bit (0) tells you that it's a primitive type (i.e. not a sequence). Finally the "tag" is 1. Consulting the SNMPv2-SMI MIB (RFC 2578) you can find this definition:
Counter32 ::=
[APPLICATION 1]
IMPLICIT INTEGER (0..4294967295)
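To make the bit layout concrete, here is a minimal Python sketch that splits a BER identifier octet into class, form, and tag number. It only handles the single-octet form (tag numbers below 31), which is an assumption that holds for the octets in this capture; the names CLASSES and decode_identifier are illustrative.

```python
# The two most-significant bits select one of four tag classes.
CLASSES = ("universal", "application", "context-specific", "private")


def decode_identifier(octet: int):
    """Split a single BER identifier octet into (class, constructed, tag number)."""
    tag_class = CLASSES[octet >> 6]    # bits 8-7: class
    constructed = bool(octet & 0x20)   # bit 6: 0 = primitive, 1 = constructed
    tag_number = octet & 0x1F          # bits 5-1: tag (31 would mean long form)
    return tag_class, constructed, tag_number


for octet in (0x30, 0x06, 0x41):
    print(hex(octet), decode_identifier(octet))
```

Applied to the capture, 0x30 comes out as a constructed universal tag 16 (SEQUENCE), 0x06 as a primitive universal tag 6 (OBJECT IDENTIFIER), and 0x41 as a primitive application tag 1, which is the Counter32 definition quoted above.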
You also asked about why a 32-bit integer is encoded with a single byte. This requires you to distinguish between the scope of the SNMP standard versus the ASN.1 standard. ASN.1 only has a single INTEGER type, which 1) has an unlimited range, 2) is always signed (two's complement), and 3) should be encoded in the least number of octets possible. This actually means that a Counter32 (or any other 32-bit unsigned integer type) might use up to 5 bytes for its encoding (see this answer I gave to a question about that).
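A small Python sketch of that minimal-length rule, restricted to non-negative values since that is all a Counter32 can hold. The function name is an illustrative choice, and it produces only the content octets, not the identifier and length octets in front of them.

```python
def ber_unsigned_content(value: int) -> bytes:
    """Return the minimal two's-complement content octets for a non-negative INTEGER."""
    if value < 0:
        raise ValueError("this sketch only covers non-negative values")
    # One extra bit is needed for the sign, so a value whose top bit would
    # land on an octet boundary gets a leading 0x00 octet.
    length = value.bit_length() // 8 + 1
    return value.to_bytes(length, "big", signed=True)


for value in (7, 127, 128, 4294967295):
    print(value, ber_unsigned_content(value).hex())
```

The value 7 from the capture fits in one octet, while the largest Counter32 value, 4294967295, needs a leading zero octet and therefore five octets in total.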
Finally, you asked about the way the replies are modifying the requested OID. I was confused about this for a long time, but when I figured it out, I realized it's actually pretty simple. I think the best place to start is with this excerpt from RFC 1157:
Each instance of any object type defined in the MIB is identified in
SNMP operations by a unique name called its "variable name." In
general, the name of an SNMP variable is an OBJECT IDENTIFIER of the
form x.y, where x is the name of a non-aggregate object type defined
in the MIB and y is an OBJECT IDENTIFIER fragment that, in a way
specific to the named object type, identifies the desired instance.
This naming strategy admits the fullest exploitation of the semantics
of the GetNextRequest-PDU (see Section 4), because it assigns names
for related variables so as to be contiguous in the lexicographical
ordering of all variable names known in the MIB.
The type-specific naming of object instances is defined below for a
number of classes of object types. Instances of an object type to
which none of the following naming conventions are applicable are
named by OBJECT IDENTIFIERs of the form x.0, where x is the name of
said object type in the MIB definition.
For example, suppose one wanted to identify an instance of the
variable sysDescr The object class for sysDescr is:
iso org dod internet mgmt mib system sysDescr
1   3   6   1        2    1   1      1
Hence, the object type, x, would be 1.3.6.1.2.1.1.1 to which is
appended an instance sub-identifier of 0. That is, 1.3.6.1.2.1.1.1.0
identifies the one and only instance of sysDescr.
So, to summarize, the OID that comes from the MIB doesn't refer to a concrete object, but to the "object type". Each concrete object (i.e. "instance") is identified by a suffix of one or more sub-identifiers (i.e. the y in this explanation). For singleton objects, this suffix is always 0. However, I think most SNMP objects are found in tables, not in singleton objects. I don't actually know of a good explanation of this in the standards, so I'll give it my best shot.
Like any table, SNMP tables are made up of rows and columns. In SNMP, however, the rows are called "entries", and each entry defines a custom type to describe the columns. Here's a simple example from the IF-MIB:
ifTable OBJECT-TYPE
SYNTAX SEQUENCE OF IfEntry
MAX-ACCESS not-accessible
STATUS current
DESCRIPTION
"A list of interface entries. The number of entries is
given by the value of ifNumber."
::= { interfaces 2 }
ifEntry OBJECT-TYPE
SYNTAX IfEntry
MAX-ACCESS not-accessible
STATUS current
DESCRIPTION
"An entry containing management information applicable to a
particular interface."
INDEX { ifIndex }
::= { ifTable 1 }
IfEntry ::=
SEQUENCE {
ifIndex InterfaceIndex,
ifDescr DisplayString,
ifType IANAifType,
ifMtu Integer32,
ifSpeed Gauge32,
ifPhysAddress PhysAddress,
ifAdminStatus INTEGER,
ifOperStatus INTEGER,
ifLastChange TimeTicks,
ifInOctets Counter32,
ifInUcastPkts Counter32,
ifInNUcastPkts Counter32, -- deprecated
ifInDiscards Counter32,
ifInErrors Counter32,
ifInUnknownProtos Counter32,
ifOutOctets Counter32,
ifOutUcastPkts Counter32,
ifOutNUcastPkts Counter32, -- deprecated
ifOutDiscards Counter32,
ifOutErrors Counter32,
ifOutQLen Gauge32, -- deprecated
ifSpecific OBJECT IDENTIFIER -- deprecated
}
So, ifTable has an OID of 1.3.6.1.2.1.2.2, and ifEntry has an OID of 1.3.6.1.2.1.2.2.1. Each item in IfEntry also has its own definition, which includes the OID relative to ifEntry. Generally they match up with the entry's data type, so, for example, ifIndex, as the first column in IfEntry, has an OID of ifEntry.1. Confusingly, when you do a simple Get-Next walk, you will traverse in column-major order, meaning you will get all the ifIndexes, followed by all the ifDescrs, and so on.
So, with all that explained, I'm now prepared to explain the instance identifiers for these tables. Notice above that ifEntry defines
INDEX { ifIndex }
This means, first, that each row is guaranteed to have a unique ifIndex, and, more importantly, that the ifIndex is used as the instance identifier for the entire entry. For example, you can pick any column in the IfEntry data type, let's say ifOperStatus (1.3.6.1.2.1.2.2.1.8), and use Get-Next to find the first instance of that column. Let's say its OID is 1.3.6.1.2.1.2.2.1.8.1, and its value is 1 (up). The last sub-identifier tells you that it belongs to the row whose ifIndex is 1. To find the name of that interface, you can then query ifDescr.1, and to find its speed setting, you can query ifSpeed.1, and so forth. In this case, it is possible to query ifIndex.1, which will just return 1, but in many tables, the INDEX columns are not-accessible, meaning you can only find out what instances there are by walking some other column. Some tables also use multiple indices, or use OCTET STRING or even OBJECT IDENTIFIER rather than INTEGER typed indices. The rules for encoding and decoding those are in RFC 2578 section 7.7.
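Here is a minimal Python sketch of how those instance OIDs are put together, and of why a walk comes out column by column. The column numbers for ifDescr and ifSpeed are assumed from their position in the IfEntry sequence above, and the two-row table is made up. OIDs are modelled as tuples because Python compares tuples lexicographically, which is the same ordering Get-Next follows.

```python
# OID of ifEntry, as given above.
IF_ENTRY = (1, 3, 6, 1, 2, 1, 2, 2, 1)

# Column sub-identifiers (assumed from the order of fields in IfEntry).
COLUMNS = {"ifIndex": 1, "ifDescr": 2, "ifSpeed": 5, "ifOperStatus": 8}


def instance_oid(column: str, if_index: int) -> tuple:
    """Build column OID + instance identifier (here, the row's ifIndex)."""
    return IF_ENTRY + (COLUMNS[column], if_index)


def dotted(oid: tuple) -> str:
    return ".".join(str(part) for part in oid)


# A made-up table with two rows (ifIndex 1 and 2) and two columns.
instances = [instance_oid(col, idx) for idx in (1, 2) for col in ("ifDescr", "ifOperStatus")]

# Sorting lexicographically gives the order a Get-Next walk would visit them.
walk_order = [dotted(oid) for oid in sorted(instances)]
print(dotted(instance_oid("ifOperStatus", 1)))
print(walk_order)
```

All the ifDescr instances sort ahead of all the ifOperStatus instances, which is the column-major traversal described earlier.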

How to differentiate code terminology in MEDICAL SERVICE LINES?

In the MEDICAL_SERVICE_LINES table, there is a field ‘PROCEDURE’. The data dictionary notes that this is ‘CPT, HCPCS, or ICD-10-PCS (less commonly)’. Is there a field that indicates which of these terminologies the code is from?
Can you use modifiers to help identify? Or are the code formats the best tool like:
CPT:
5 numbers or 4 numbers and a letter (in that order)
HCPCS:
1 letter and 4 numbers (in that order).
This customer receives PLAID and is not in Sentinel. (data dictionary here)
The code formats would be the best to distinguish definitively what type of code it is. The modifiers are not filled out all the time (some claims may not have modifiers attached to the procedure).
Your layout of the code format is correct (see section HCPCS Coding here for additional confirmation). HCPCS Level 1 is comprised of CPT codes. HCPCS Level 2/3 is what we typically regard as just "HCPCS"
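If it helps, the format rules above can be sketched as a small Python classifier. The function name and the sample values are made up to show the shapes only; anything that fits neither pattern (which may include ICD-10-PCS codes) is returned as "other" rather than guessed at.

```python
import re

# CPT: 5 digits, or 4 digits followed by a letter.
CPT_PATTERN = re.compile(r"\d{5}|\d{4}[A-Z]")
# HCPCS (Level 2): 1 letter followed by 4 digits.
HCPCS_PATTERN = re.compile(r"[A-Z]\d{4}")


def classify_procedure_code(code: str) -> str:
    """Guess the terminology of a PROCEDURE value from its format alone."""
    code = code.strip().upper()
    if CPT_PATTERN.fullmatch(code):
        return "CPT"
    if HCPCS_PATTERN.fullmatch(code):
        return "HCPCS"
    return "other"


# Format-only sample values, not meant as real codes.
for sample in ("12345", "1234A", "A1234", "0AB12CD"):
    print(sample, classify_procedure_code(sample))
```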

create a URL shortener with Base 62?

I understood the process to shorten the URL with base 62 at How do I create a URL shortener?.
Steps given are
Think of an alphabet we want to use. In your case, that's [a-zA-Z0-9]. It contains 62 letters.
Take an auto-generated, unique numerical key (the auto-incremented id of a MySQL table for example).
For this example, I will use 125₁₀ (125 with a base of 10).
Now you have to convert 125₁₀ to X₆₂ (base 62)
My question is: why not just create a unique numerical key and return it? What is the advantage of converting the numerical key to base 62 and then finally to some alphanumeric string?
Is it because the final alphanumeric string will be much shorter than the unique numerical key?
Yes. The idea is to make it short and usable in a URL. A number in base 62 will use fewer characters than the same number in base 10. Notice also that URL shorteners use short hosts, such as g.co.
I can see you understand that, yes, a number written in base 62 takes fewer characters than a number in base 10, just as a number in base 10 takes fewer characters than a number in base 2 (e.g. 0101 is 3 characters longer than just '5').
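To put numbers on the length difference, here is a minimal Python sketch of the conversion. It assumes the alphabet is ordered a-z, then A-Z, then 0-9, as in the [a-zA-Z0-9] description from the question; a real shortener may order or choose its alphabet differently.

```python
import string

# 62 symbols: a-z, A-Z, 0-9 (ordering is an assumption for this sketch).
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def base62_encode(number: int) -> str:
    """Convert a non-negative integer id into a base-62 string."""
    if number < 0:
        raise ValueError("id must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def base62_decode(text: str) -> int:
    """Convert a base-62 string back into the integer id."""
    number = 0
    for char in text:
        number = number * 62 + ALPHABET.index(char)
    return number


print(base62_encode(125))                       # the id from the example
print(len(str(10**9)), len(base62_encode(10**9)))  # digits vs. base-62 characters
```

With this alphabet the id 125 becomes a two-character string, and an id of one billion shrinks from 10 decimal digits to 6 characters.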
So, I'll answer specifically "Why".
Sometimes a link is shortened to be more visually pleasing. A company worried about their public perception likely doesn't want their links to look like an error code due to how long they are so they resort to shortening. That's why some url shortening services allow you to add your own "vanity url" which customizes the domain name, so that a link can be shortened and branded.
Other times a link is shortened to minimize character count when working with constraints, like Twitter. For example, at my company we shortened the links in our automated Twilio messages because SMS messages that contain more than 160 characters are technically 2 concatenated messages so it is more expensive to send.
And finally, if the link is being shared through a medium that cannot be directly clicked on (e.g. verbally, on paper), making it shorter makes it much easier to type into an address bar manually. (Imagine trying to type the url to this SO question when someone is reading it to you.) I assume this is also at least partially why the base used for these links usually stops at around 62. If you start including other arbitrary characters to raise the base and consequently make the link marginally shorter, it'll become harder to communicate, read and type. ("domain.name/5omeC0d3" vs "domain.name/🈲}♠ ")

cryptocoin address generation

Recently have been looking at crypto currencies, mostly Bitcoin and Dogecoin. I'm using this source for my project. I've got Bitcoin to work perfectly, and since the Bitcoin gem did not have native support for Dogecoin I had to self-implement it.
I also noticed that another GitHub user had tried to implement Dogecoin support, which as of now does not generate addresses correctly.
The problem seems to be in this particular line. (corresponding to the format of the crypto address)
:address_version => "30"
:address_version = PUBKEY_ADDRESS in base58.h
base58.h
PUBKEY_ADDRESS contains the value 30.
Specifying this particular number gives an address beginning with the letter 'L' (a Litecoin address), whereas Dogecoin requires 'D'.
Does this have anything to do with Doge using scrypt? I have no technical expertise in this field. How do I go about generating a Dogecoin pubkey/private key pair?
30 in decimal will give you an address that begins with the letter D.
30 in hexadecimal (48 in decimal) will give you an address that begins with the letter L.
I think that bitcoin-ruby first converts :address_version from hexadecimal to decimal, so :address_version should be 1E (30 in decimal).
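To check the effect of the version byte, here is a minimal Python sketch of the usual Base58Check encoding: version byte + 20-byte hash + 4-byte double-SHA-256 checksum, then base 58. The all-zero public-key hash is a dummy value, and this is a standalone illustration rather than bitcoin-ruby's own code.

```python
import hashlib

# Base58 alphabet: no 0, O, I, or l.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload with a 4-byte double-SHA-256 checksum."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is written as a leading '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


dummy_pubkey_hash = bytes(20)  # placeholder hash, for illustration only

print(int("30", 16), int("1E", 16))          # "30" read as hex is 48; "1E" is 30
print(base58check(30, dummy_pubkey_hash))    # version 30 (0x1E)
print(base58check(48, dummy_pubkey_hash))    # version 48 (0x30)
```

With version 30 the address starts with D, and with version 48 it starts with L, whatever the 20-byte hash is.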

Creating an id from name and address data. Hash/Digest

My problem:
I'm looking for a way to represent a person's name and address as an encoded id. The id should contain only alpha-numeric characters, be collision-proof, and be represented in the smallest number of characters possible. My first thought was to simply use a cryptographic hash function like MD5 or SHA1, but this seems like overkill (security isn't important - doesn't need to be one-way) and I'd prefer to find something that would produce a shorter id. Does anyone know of an existing algorithm that fits this problem?
In other words, what is the best way to implement the following function so that the return value is the same consistently for the same input, collisions are unlikely, and ids are less than 20 characters?
>>> make_fake_id(fname = 'Oscar', lname = 'Grouch', stnum = '1', stname = 'Sesame', zip = '12345')
N1743123734
Application Context (for those that are interested):
This will be used for a record linkage app. Given an input name and address we search a very large database for the best match and return the database id and other data (how we do this is not important here). If there isn't a match I need to generate this pseudo/generated/derived id from the search input (entity's name and address data). Every search record should result in an output record with either a real (the actual database id resulting from a match/link) or this generated pseudo/generated/derived id. The pseudo id will be prefixed with a character (e.g. N) to differentiate it from a real id.
I know you said no to MD5 and SHA1, but I think you should consider them anyway. As well as being well studied hashing algorithms, the length gives you more protection against possible collisions. No hash is collision-proof, but the cryptographic ones generally are less collision-prone than something you could come up with yourself.
Use a cryptographic hash for its collision resistance, not its other qualities
Use as many bytes from the hash as you want (truncate)
convert to alpha-numeric characters
You can also truncate the alpha-numeric string instead of the hash
An easy way to do this: hash the data, encode in base64, remove all non-alpha-numeric characters, truncate.
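import base64, hashlib, re
N_HASH_CHARS = 11
def digest(name, address):
    # Encode to bytes before hashing (required on Python 3).
    raw = hashlib.md5((name + "|" + address).encode("utf-8")).digest()
    b64 = base64.b64encode(raw).decode("ascii")
    # Drop '+', '/' and '=' from the base64 text, then truncate.
    alnum_hash = re.sub(r'[^a-zA-Z0-9]', "", b64)
    return alnum_hash[:N_HASH_CHARS]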
How many alpha-numeric characters should you keep? Each character gives you around 5.95 bits of entropy (log(62,2)). 11 characters give you 65.5 bits of entropy, which should be enough to avoid a collision for the first 2**32.7 users (about 7 billion).
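The figures above can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch that assumes each character is uniformly random over 62 symbols, which is only approximately true for a filtered base64 string. It also uses the standard birthday approximation to show how likely a collision already is at the 2**32.7 mark.

```python
import math

ALPHABET_SIZE = 62
N_CHARS = 11

bits_per_char = math.log2(ALPHABET_SIZE)   # about 5.95
total_bits = N_CHARS * bits_per_char       # about 65.5
birthday_bound = 2 ** (total_bits / 2)     # about 7 billion ids


def collision_probability(n_ids: float, bits: float) -> float:
    """Birthday approximation: 1 - exp(-n^2 / (2 * 2^bits))."""
    return -math.expm1(-(n_ids ** 2) / (2 * 2 ** bits))


print(round(bits_per_char, 2), round(total_bits, 1))
print(f"{birthday_bound:.2e}")
print(round(collision_probability(birthday_bound, total_bits), 3))
print(collision_probability(1e6, total_bits))  # a much smaller population
```

Under that approximation the chance of at least one collision is already around 39% at the birthday bound, so it is better read as a rough ceiling than a safe limit; for populations in the millions the probability is tiny.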
A good solution is somewhat dependent on your application. Do you know how many users and what the set of all users is? If you provide more details you would get better help.
I agree with the other poster suggesting serial numbers. OTOH, if you really, really really want to do something else:
Create a SHA1 hash from the data, and store it in a table with a serial number field.
Then, when you get the data, calculate the hash, look it up on the table, get the serial, and that's your id. If it's not on the table, insert it.
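A minimal in-memory sketch of that lookup, with a Python dict standing in for the database table; the class name IdRegistry and the sample records are made up for illustration.

```python
import hashlib


class IdRegistry:
    """Maps the SHA1 of a record to a serial number, assigning new ones on demand."""

    def __init__(self):
        self._serial_by_hash = {}  # stands in for the hash -> serial table

    def get_id(self, data: str) -> int:
        key = hashlib.sha1(data.encode("utf-8")).hexdigest()
        if key not in self._serial_by_hash:
            # Not seen before: insert it with the next serial number.
            self._serial_by_hash[key] = len(self._serial_by_hash) + 1
        return self._serial_by_hash[key]


registry = IdRegistry()
first = registry.get_id("Oscar|Grouch|1|Sesame|12345")
second = registry.get_id("Some|Other|2|Street|54321")
again = registry.get_id("Oscar|Grouch|1|Sesame|12345")
print(first, second, again)
```

The serial stays short no matter how long the hash is, at the cost of needing persistent storage for the table.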
I wonder whether you intend to "assign" these ids to the users? If so, I would expect your users to hate anything that you propose; who would want a user id of "AAAAA01"?
So, if these ids are visible to the user, then you should just let them pick what they like and check them for uniqueness (easy). If they are not visible to the user (e.g., internal primary key), then just generate them sequentially using an appropriate technique such as an Oracle Sequence or SQL Server AutoNumber (also easy).
If these ids are an attempt to detect a user that is registering more than once, then I would agree that you should consider a cryptographic hash followed by a full comparison of the registration data (name, address, etc.). However, to be usable, you will need to translate the data into a canonical form (standardized letter case, whitespace, canonical street address, etc.) before computing the hash or making the comparison. Otherwise, you will mismatch based on trivial differences.
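As a very rough sketch of what that canonical form might look like in Python: the rules here (lower-casing, dropping punctuation, collapsing whitespace) are only a starting point I'm assuming for illustration. Real address canonicalization, such as mapping "St" to "Street", needs much more than this.

```python
import re


def canonicalize(text: str) -> str:
    """Normalize case, punctuation, and whitespace before hashing or comparing."""
    text = text.strip().lower()
    text = re.sub(r"[^\w\s]", "", text)  # drop punctuation
    return re.sub(r"\s+", " ", text)     # collapse runs of whitespace


print(canonicalize("  1  Sesame   St. "))
print(canonicalize("1 SESAME ST"))
```

Both inputs reduce to the same string, so they would hash to the same id.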
EDIT: Now that I understand the problem space better based on your edits, I think that it is highly unlikely that your algorithm (so far) will catch most matches. Beyond my suggestion to canonicalize the inputs, I recommend that you consider an approach that results in a ranked list of a handful of possible matches (to be resolved by a human if possible) rather than an all-or-nothing attempt at a single match. In other words, I recommend a search approach rather than a lookup approach.
Is that feasible in your situation?
Well, if there's more than one person at the same address with the same name, you're toast here (without adding code to detect this and add a discriminator of some kind).
But assuming that is not an issue, the street address and zip code portion of the full address is sufficient to guarantee uniqueness there, so adding enough data from the name should take care of the issue...
Do you have access to a database, or other persistence mechanism, where you could generate and maintain key values for each address? Then keep the address and individual entities in two keyed dictionary structures, where the key is autogenerated for each new distinct address, person encountered... and then use the autogenerated alpha-numeric key...
You could use AAAAA01 for first person at first address,
AAAAA02 for second person at first address,
AAAAB07 for the seventh resident at the second address, etc.
If you don't have any way to generate and maintain these entity-key mappings, then you need to use the full street address/zip and full name, or a hash value of the same, although the hash value approach has a small chance of generating duplicates...
