What is the algorithm behind Cassandra's token function? - algorithm

The Token function in my driver doesn't support a composite partition key, but it works very well with a single partition key, it takes a binary in 8 bits form as an input and pass it to murmur3 hash function and extract the 64-signed-little-integer (Token) from the result of murmur3 and ignore any extra binary buffer.
So my hope is to generate the binary for a composite partition key and then pass it to murmur3 as usual, an algorithm or bitwise operations will be really helpful or at least a source in any programming language.
I don't mean murmur3 part, only the token side which converts/mixes the composite partition key and outputs raw bytes in binary form.

Take a look at the drivers since they have generate the token to find the correct coordinator. https://github.com/datastax/java-driver/blob/8be7570a3c7fbba773ae2581bbf26e8196e7d6fb/driver-core/src/main/java/com/datastax/driver/core/Token.java#L112
Its slightly different than the typical murmur3 due to a bug when it was made and inability to change it without breaking existing clusters. So I would recommend copying it from them or better yet, use the existing drivers to find the token.

Finally I found a solution to my question :The Algorithm to compute the token for a composite partition key :
Primary_key((text, int)) -> therefore the partition key is a composite_partition_key (text, int).
Example : a row with composite_partition_key ('hello', 1)
Applying the algorithm :
1- lay-out the components of the composite partition key in big-endian (16 bits) presentation :
first_component = 'hello' -> 68 65 6c 6c 6f
sec_component = 1 -> 00 00 00 01
68 65 6c 6c 6f 00 00 00 01
2- add two-bytes length of component before each component
first_component = 'hello', length= 5-> 00 05 68 65 6c 6c 6f
sec_component = 1, therefore length= 4 -> 00 04 00 00 00 01
00 05 68 65 6c 6c 6f 00 04 00 00 00 01
3- add zero-value after each component
first_component = 'hello' -> 00 05 68 65 6c 6c 6f 00
sec_component = 1 -> 00 04 00 00 00 01 00
4- result
00 05 68 65 6c 6c 6f 00 00 04 00 00 00 01 00
now pass the result as whatever binary base your murmur3 function understand (make sure it's cassandra variant).

Related

parse the following ibeacon packet

I am trying to parse this ibeacon packet received by scanning through a hci socket
b'\x01\x03\x00\x18\xbe\x99m\xf3\x14\x1e\x02\x01\x1a\x1a\xffL\x00\x02\x15e\xec\xe2\x90\xc7\xdbM\xd0\xb8\x1aV\xa6-b 2\x00\x00\x00\x02\xc5\xcc'
hex format 01 03 00 18 be 99 6d f3 14 1e 02 01 1a 1a ff 4c 00 02 15 65 ec e2 90 c7 db 4d d0 b8 1a 56 a6 2d 62 20 32 00 00 00 02 c5 cc
the parameters after applying the parser are
'UUID': '65ece290c7db4dd0b81a56a62d622032', 'MAJOR': '0000', 'MINOR': '0002', 'TX': -59, 'RSSI': -60
I am not sure if the RSSI portion of this parsing is right.
Referring to this https://stackoverflow.com/a/19040616/10355673
the last bit of the beacon advertising packet is the TX power value.
so how do we get the rssi value? here, I have taken rssi to be cc and tx to be c5. Is this correct?
There are flags headers before the manufacturer advertisement sequence shown below, but you really don't care about the flags. Here are the bytes you care about:
ff # manufacturee adv type
4c 00 # apple Bluetooth company code
02 15 # iBeacon type code
65 ec e2 90 c7 db 4d d0 b8 1a 56 a6 2d 62 20 32 # proximity uuid
00 00 # major
00 02 # minor
c5 # measured power (tx power)
cc # crc
Proximity UUUD: 65ece290-c7db-4dd0-b81a-56a62d622032,
Major: 0,
Minor: 2,
Measured Power: -59 dBm
The RSSI is not part of the transmitted packet, but a measurement taken by the receiver based on the strength of the signal. It will typically be a slightly different value for each packet that is received. You get this value from an API on a mobile device or embedded system that fetches it from the bluetooth chip.

In WinDbg commands, how can I refer to a register with the same name as a global variable?

Suppose I want to debug this program using the WinDbg, cdb, or ntsd debuggers for Windows:
/* test.c */
#include <stdio.h>
int rip = 42;
int main(void)
{
puts("Hello world!");
return (0);
}
I compile the program for AMD64 and run it under WinDbg. I set a breakpoint at main(), and when the breakpoint hits, I want to inspect the value at the RIP register (program counter), and the memory around that value if the value is treated as a pointer.
I can see the value of the register directly with r rip, but when I try to look at the memory around that address, WinDbg shows me a different address! Having read the symbols in test.pdb, WinDbg sees that rip is a global variable declared in the C code and shows me the memory around &rip.
0:000> bu test!main
0:000> g
Breakpoint 0 hit
test!main:
00007ff6`de1868d0 4883ec28 sub rsp,28h
0:000> r rip
rip=00007ff6de1868d0
0:000> db rip
00007ff6`de1f2000 2a 00 00 00 ff ff ff ff-01 00 00 00 00 00 00 00 *...............
00007ff6`de1f2010 01 00 00 00 02 00 00 00-ff ff ff ff ff ff ff ff ................
00007ff6`de1f2020 00 00 00 00 00 00 00 00-43 46 92 e5 1b df 00 00 ........CF......
00007ff6`de1f2030 bc b9 6d 1a e4 20 ff ff-00 00 00 00 00 00 00 00 ..m.. ..........
00007ff6`de1f2040 00 01 00 00 00 00 00 00-ca b0 1e de f6 7f 00 00 ................
00007ff6`de1f2050 00 00 00 00 00 80 00 00-00 00 00 00 00 80 00 00 ................
00007ff6`de1f2060 d0 66 fc c2 f2 01 03 00-ab 90 ec 5e 22 c0 b2 44 .f.........^"..D
00007ff6`de1f2070 a5 dd fd 71 6a 22 2a 15-00 00 00 00 00 00 00 00 ...qj"*.........
0:000> ? rip
Evaluate expression: 140698265264128 = 00007ff6`de1f2000
0:000> ? dwo(rip)
Evaluate expression: 42 = 00000000`0000002a
This is really annoying, but as long as I'm aware of it, it isn't a problem when manually reading data like this. But if I want to use the register value, for example in scripting the debugger, then there is no easy workaround:
0:000> bu test!main ".if (dwo(rip) == 0n42) { .echo Whoops! I don't want to get here! }"
0:000> g
Whoops! I don't want to get here!
test!main:
00007ff6`de1868d0 4883ec28 sub rsp,28h
This problem, that symbols in the program hide register names, makes things really difficult for me. An actual scenario this broke:
I wanted to set a breakpoint on CreateFileW(), a very commonly called Windows API function.
Since I only cared about one particular file, I wanted to inspect the filename, which is passed in the RCX register, and continue past the breakpoint unless the filename matched the file I wanted.
But I couldn't write this condition, because another module in the program defined a symbol foobar!rcx, and any references to rcx I make in the command to execute on the breakpoint refer to that global variable!
So how do I tell WinDbg that yes, I really meant to read the register? And what if I want to write that register? There must be a simple thing I am missing here.
As noted in passing by another question, you can put an at sign (#) in front of a register name to force it to be interpreted as a register or pseudo-register, bypassing the attempt to parse it as a hexadecimal number or a symbol.
Registers and Pseudo-Registers in MASM Expressions
You can use registers and pseudo-registers within MASM expressions. You can add an at sign (#) before all registers and pseudo-registers. The at sign causes the debugger to access the value more quickly. This at sign is unnecessary for the most common x86-based registers. For other registers and pseudo-registers, we recommend that you add the at sign, but it is not actually required. If you omit the at sign for the less common registers, the debugger tries to parse the text as a hexadecimal number, then as a symbol, and finally as a register.

How to tell GCC to place an inline assembly instruction at a specific position?

I am working on the bootloader for a processor architecture that is based on ORPSoC. To execute a program, the bootloader loads it into memory and then jumps to the beginning of that program.
Now I need the custom instruction l.cust1 inserted in the delay slot of the jump. This instruction is implemented by the processor and activates decryption of the following instructions. That is the reason why it has to be placed in the delay slot. Any later, and the program could not be executed, as its instructions are encrypted. Similarly, if the decryption is activated too early, the bootloader crashes because it is not encrypted.
I am now wondering whether it is possible to tell GCC where to place the l.cust1 instruction. Currently I have to manually modify the bootloader binary accordingly.
Inserting inline assembly __asm__("l.cust1\n\t"); in the bootloader's C source code results in the instruction being added somewhere before the relevant jump:
1fc2e10: 9c 21 01 b4 l.addi r1,r1,436
1fc2e14: 70 00 00 00 l.cust1 # switching on decryption
1fc2e18: 18 40 01 ff l.movhi r2,0x1ff
1fc2e1c: 9c 72 ff ff l.addi r3,r18,-1
1fc2e20: a8 42 7c 94 l.ori r2,r2,0x7c94
1fc2e24: 9c 90 00 04 l.addi r4,r16,4
1fc2e28: 85 62 00 60 l.lwz r11,96(r2)
1fc2e2c: 48 00 58 00 l.jalr r11
1fc2e30: 9d c0 00 00 l.addi r14,r0,0
However, I need it to be located in the delay slot of the jump:
1fc2e10: 9c 21 01 b4 l.addi r1,r1,436
1fc2e14: 9d c0 00 00 l.addi r14,r0,0
1fc2e18: 18 40 01 ff l.movhi r2,0x1ff
1fc2e1c: 9c 72 ff ff l.addi r3,r18,-1
1fc2e20: a8 42 7c 94 l.ori r2,r2,0x7c94
1fc2e24: 9c 90 00 04 l.addi r4,r16,4
1fc2e28: 85 62 00 60 l.lwz r11,96(r2)
1fc2e2c: 48 00 58 00 l.jalr r11
1fc2e30: 70 00 00 00 l.cust1 # switching on decryption
Put the l.cust1 in the same inline assembler statement as the jump, which must be declared volatile as it has side effects and have "memory" in the clobber list as it depends on the contents of memory.
You will want to use asm volatile and an artificial dependency to the code that you wish the inline assembly to proceed. The example taken from the gcc documentation is below:
asm volatile ("mtfsf 255,%1" : "=X"(sum): "f"(fpenv));
sum = x + y;
According to the documentation volatile alone is not enough. In other words the compiler still might change the order of the following code:
asm volatile("mtfsf 255,%0" : : "f" (fpenv));
sum = x + y;
By adding the dependency you ensure the placement of the inline assembly. In the first code snippet by adding a dependency on sum it is ensured that the compiler will not reorder the code as it believes it will be generating a different result due to the dependency. You can find another example of this on this webpage in the "C code optimization" section.

Method to calculate 'mechListMIC' for SPNEGO GSS-API(NTLMSSP_AUTH) accept-completed(0) state

I am trying to learn and implement SMB2 Server. I am very interested to learn GSS-API (NTLMSSP, NTLMSSP_AUTH) inside. So, I am doing experiment with my own component of GSS-API. I read the description of mechListMIC in RFC4178 & RFC2478. But I couldn’t understand how to calculate mechListMIC for ‘SessionSetup Response, Unknown message type’ response.
Actually, I can generate the mechListMIC for negTokenInit phase of ‘NegotiateProtocol Response’. But the problem is, when client sends ‘SessionSetup Request, NTLMSSP_AUTH, User: Domain\Administrator, Unknown message type’ request, I can’t understand how is it generating ‘mechListMIC: 01 00 00 00 78 1E E9 4A DB 99 7F E9 00 00 00 00’ and how should I send response back in ‘SessionSetup Response, Unknown message type’ with corresponding mechListMIC based on the previous SessionSetup Request.
I tried with the following Info:
SMB2.CSessionSetup.securityBlob.GSSAPI.InitialContextToken.InnerContextToken.SpnegoToken.NegTokenInit.MechTypes , hex data = 30 0C 06 0A 2B 06 01 04 01 82 37 02 02 0A
AND
SMB2.CSessionSetup.securityBlob.GSSAPI.NegotiationToken.NegTokenResp.MechListMic, hex data = 01 00 00 00 78 1E E9 4A DB 99 7F E9 00 00 00 00
SecBuffer SignBuffers[2];
SignBufferDesc.ulVersion = SECBUFFER_VERSION; // SECBUFFER_VERSION = 0
SignBufferDesc.cBuffers = 2;
SignBufferDesc.pBuffers = SignBuffers;
SignBuffers[0] = 30 0C 06 0A 2B 06 01 04 01 82 37 02 02 0A;
SignBuffers[1] = 01 00 00 00 78 1E E9 4A DB 99 7F E9 00 00 00 00;
SignBuffers[0].BufferType = SECBUFFER_DATA; // SECBUFFER_DATA = 1
SignBuffers[1].BufferType = SECBUFFER_TOKEN; // SECBUFFER_TOKEN = 2
Can anyone please tell me what information do I need to use inside HMAC-MD5 (key, data) algorithm to generate mechListMIC for SessionSetup Response and how?
If it is possible to create a step-by-step example using my test case to calculate mechListMIC for ‘SessionSetup Response, Unknown message type’ response, that would be very helpful for me. Please let me know if you need any further information.
Thanks,
Shishir
Please find the answer in MSDN site
http://social.msdn.microsoft.com/Forums/gu-IN/os_fileservices/thread/d00b4e1a-077b-4620-99c7-da7bf86d5212

Whats going on with this byte array?

I have a byte array:
00 01 00 00 00 12 81 00 00 01 00 C8 00 00 00 00 00 08 5C 9F 4F A5 09 45 D4 CE
It is read via StreamReader using UTF8 encoding
// Note I can't change this code, to many component dependent on it.
using (StreamReader streamReader =
new StreamReader(responseStream, Encoding.UTF8, false))
{
string streamData = streamReader.ReadToEnd();
if (requestData.Callback != null)
{
requestData.Callback(response, streamData);
}
}
When that function runs I get the following returned to me (i converted to a byte array)
00 01 00 00 00 12 EF BF BD 00 00 01 00 EF BF BD 00 00 00 00 00 08 5C EF BF BD 4F EF BF BD 09 45 EF BF BD
Somehow I need to take whats returned to me and get it back to the right encoding and the right byte array, but I've tried alot.
Please be aware, I'm working with WP7 limited API.
Hopefully you guys can help.
Thanks!
Update for help...
if I do the following code, it's almost right, only thing that is wrong is the 5th to last byte gets split out.
byte[] writeBuf1 = System.Text.Encoding.UTF8.GetBytes(data);
string buf1string = System.Text.Encoding.BigEndianUnicode.GetString(writeBuf1, 0, writeBuf1.Length);
byte[] writeBuf = System.Text.Encoding.BigEndianUnicode.GetBytes(buf1string);
The original byte array is not encoded as UTF-8. The StreamReader therefore replaces each invalid byte with the replacement character U+FFFD. When that character gets encoded back to UTF-8, this results in the byte sequence EF BF BD. You cannot construct the original byte value from the string because the information is completely lost.

Resources