Gdb printing long values in hex without having to guess the length - debugging

How do I print long values in gdb?
When using just x, i.e. x $rdi, the value (in hex) is cut off after 8 bytes.
If I use x/32bx (or some other length), the bytes are separated by spaces, which is not nice, but okay. The problem is that when there is a long value I want to print, I have to guess its size and pass it to x/. If the value is 256 bytes long, the output looks messy because it is separated by spaces, and I also have to make a lot of guesses, then look through a long and ugly string of bytes to find where the value ends and is followed by 0x00s in order to know how long it is (and the value can obviously contain 0x00s in between, which makes working this out even more confusing).
If I try to print it as an integer, it gets cut off as well. I'd like to be able to easily tell how long a value is and not have it be cut off.

One way to display long (as well as long long and unsigned long long) values is the g size modifier, which tells x to print 8-byte "giant" words. For example, suppose we have a program which just stores
unsigned long long int a = 1234567891234567898;
int b = 23;
unsigned long long int c = 1111111111111111111;
by typing x/20xg $rsp (after the values have been moved to the stack) we get
0x7fffffffdd90: 0x0000000000000000 0x0000001755555040
0x7fffffffdda0: 0x112210f4c023b6da 0x0f6b75ab2bc471c7
0x7fffffffddb0: 0x0000000000000000 0x00007ffff7e08b25
...
The long numbers at [rsp+0x10] and [rsp+0x18] are a and c respectively, and the 0x17 at [rsp+0xc] is b.
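To see why the 0x17 of b shows up in the upper half of a giant word rather than on its own, it can help to mimic what x/xg does with the bytes. The following is only a sketch under that assumption, not gdb's actual implementation: giant_word is a hypothetical helper that assembles eight consecutive bytes into one 64-bit value the way a little-endian x86-64 target stores it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine the 8 bytes starting at mem[offset] into one
 * 64-bit value, least significant byte first (little-endian), which is how
 * an x86-64 target lays out the giant words that x/xg prints. */
uint64_t giant_word(const unsigned char *mem, size_t offset)
{
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--)
        value = (value << 8) | mem[offset + i];
    return value;
}
```

With the bytes 40 50 55 55 at offset 0x8 and the int 23 (17 00 00 00) at offset 0xc, giant_word(mem, 8) yields 0x0000001755555040, which matches the second column of the first output line.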

Related

How can I see the actual binary content of a VB6 Double variable?

I have hunted about quite a bit but can't find a way to get at the Hexadecimal or Binary representation of the content of a Double variable in VB6. (Are Double variables held in IEEE754 format?)
The provided Hex(x) function is no good because it integerizes its input first.
So if I want to see the exact bit pattern produced by Atn(1), Hex(Atn(1)) does NOT produce it.
I'm trying to build a mathematical function containing If clauses. I want to be able to see that the values returned on either side of these boundaries are, as closely as possible, in line.
Any suggestions?
Yes, VB6 uses standard IEEE format for Double. One way to get what you want without resorting to memcpy() tricks is to use two UDTs. The first would contain one Double, the second a static array of 8 Byte. LSet the one containing the Double into the one containing the Byte array. Then you can examine each Byte from the Double one by one.
If you need to see code let us know.
[edit]
At the module level:
Private byte_result() As Byte
Private Type double_t
    dbl As Double
End Type
Private Type bytes_t
    byts(1 To 8) As Byte
End Type
Then:
Function DoubleToBytes(aDouble As Double) As Byte()
    Dim d As double_t
    Dim b As bytes_t
    d.dbl = aDouble
    LSet b = d
    DoubleToBytes = b.byts
End Function
To use it:
Dim Indx As Long
byte_result = DoubleToBytes(12345.6789#)
For Indx = 1 To 8
    Debug.Print Hex$(byte_result(Indx)),
Next
This is air code but it should give you the idea.
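For comparison, the same idea (reinterpreting the eight bytes of a Double without converting its value) can be sketched in C. This is an illustration only, not part of the VB6 answer: double_bits is a hypothetical name, and the sketch assumes that double is a 64-bit IEEE 754 type on the platform.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 8 bytes of a double into a 64-bit integer so the raw IEEE 754
 * bit pattern can be printed or compared. Assumes double is 64 bits wide. */
uint64_t double_bits(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Printing double_bits(1.0) in hex gives 3ff0000000000000. Note that this shows the most significant byte first, whereas the VB6 loop above prints the bytes in memory order.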

Explanation of a macro in kernel

In kernel 2.4.37, there is a macro in page.h like this:
struct page *mem_map;
struct page *page;
#define VALID_PAGE(page) ((page - mem_map) < max_mapnr)
I know mem_map is an array of struct page, page is a struct, so what does page - mem_map mean?
It computes the index of the corresponding page in the mem_map array, i.e. which page number it is within that array. Call that the pfn, or page frame number, in Linux terms (Linux assumes that the mem_map array starts at pfn 0 and runs up to the maximum pfn); adding PHYS_PFN_OFFSET to the pfn gives you the actual physical page frame in your memory map.
__page_to_pfn
max_mapnr is the limit of maximum number of mapped pages or maximum page frame number.
set_max_mapnr
I hope that clears up your doubts.
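A minimal sketch of the same check outside the kernel may make this concrete. The struct page below is a stand-in and valid_page is a hypothetical function, not kernel code; the explicit cast spells out the conversion to unsigned long that the comparison with max_mapnr performs implicitly. In standard C the subtraction is only defined when both pointers point into the same array.

```c
#include <stddef.h>

struct page { int flags; };  /* stand-in; the real struct page is much larger */

/* Mirrors VALID_PAGE: the pointer difference is an element index, and
 * converting it to unsigned long makes a negative index (page below
 * mem_map) wrap to a huge value that fails the "< max_mapnr" test. */
int valid_page(const struct page *page, const struct page *mem_map,
               unsigned long max_mapnr)
{
    ptrdiff_t index = page - mem_map;
    return (unsigned long)index < max_mapnr;
}
```

With mem_map pointing at element 4 of a larger array and max_mapnr set to 5, elements 4 to 8 pass the check, while element 9 (index 5) and element 2 (index -2) both fail it.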
Hmm, I'm not sure, but maybe it is a comparison of pointer addresses?
I mean, if one of them is an array and it is not dereferenced, I suppose the operations are applied to the addresses.
Edit (clarification):
So in this case I think the operation checks whether "page" lies within the address range of the "mem_map" array.
We can represent it like this: Graphic representation
Utility of the macro:
"mem_map" is the address of the beginning of the array; suppose it is 0x0...5 (counting addresses in units of one struct page, since pointer subtraction counts elements).
The size of the "mem_map" array (max_mapnr) is 5.
We want to know whether the address of "page" is within the range of the "mem_map" array.
True case:
Suppose "page" is in "mem_map", at element 2. We can suppose its address is something like 0x0...7.
Now we do the operation: ((0x0...7 - 0x0...5) < 5).
We obtain 2, so the address of "page" is inside mem_map.
False case:
If instead "page" is beyond the end of the array (0x0...D), the result will be 8. 8 is not less than "max_mapnr" (5), so this page is not in the "mem_map" array.
And if the address is below the start of the array (0x0...2):
The result of (0x0...2 - 0x0...5) will be a negative value (-3). In that case the comparison with "max_mapnr" (an unsigned long) cannot be done on the signed value as it is.
I found a topic that explains why better than I can:
Signed/unsigned comparisons
To summarize:
You cannot mix negative (signed) and unsigned values in an operation in C and keep the sign, because the compiler converts them automatically. In other words, when you write (-3 - U_nbr), it is the same as writing (((unsigned)-3) - U_nbr). In addition, if you compile with gcc and enable -Wsign-compare (part of -Wextra for C code) and you do not cast the value yourself, you will normally get a compiler warning for such a mixed comparison.
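To test this, I ran the following code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void)
{
    unsigned long test = 0x0000F;
    unsigned long test2 = 0x0000A;
    unsigned long weird = 0x00002;
    const char *pt1 = "This is first test string !";
    const char *pt2 = "This is a test string";
    /* Unsigned arithmetic: a result that would be negative wraps around. */
    printf("Try to make operation on two unsigned long result must be 5: %lu\n", (test - test2));
    printf("Try to make operation between unsigned long result must be negative, so he will be cast: %lu\n", (weird - test2));
    /* Pointer differences are signed (ptrdiff_t), so cast them to match %lu.
       Subtracting pointers into two different objects is not defined by the
       C standard; it is done here only to show the wrap-around. */
    printf("Let's try the same with real adresses: %lu\n", (unsigned long)(pt2 - pt1));
    printf("And this is what happens with negative value: %lu\n", (unsigned long)(pt1 - pt2));
    /* strlen returns size_t, which is printed with %zu. */
    printf("For be sure, this is the lenght of string 1. %zu\n", strlen(pt1));
    return (0);
}
The output is: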
Try to make operation on two unsigned long result must be 5: 5
Try to make operation between unsigned long result must be negative, so he will be cast: 18446744073709551608
Let's try the same with real adresses: 28
And this is what happens with negative value: 18446744073709551588
For be sure, this is the lenght of string 1. 27
So, as we can see, the negative value is converted to unsigned long and wraps around to a huge value. If you compare that value with max_mapnr, you will see that it is "out of range".
Thanks to AnshuMan Gupta for the "weird case".

Last byte in Huffman compression

I am wondering what the best way is to handle the last byte in Huffman compression. I have some nice C++ code that compresses text files very well, but currently I also have to write the number of coded chars (which equals the input file size) to my coded file, because I have no idea how to handle the last byte better.
For example, the last char to compress is 'a', whose code is 011, and I am just starting a new byte to write, so the last byte will look like:
011 + 5 bits of trash, which I set to zeros, for example.
And when I decode this coded file, it may happen that the code 00000 (or one with fewer zeros) is the code for some char, so I will get a trash char at the end of my decoded file.
As I wrote in the first paragraph, I avoid this by saving the number of chars of the input file in the coded file, and while decoding I read the coded file until I reach that number (not until end of file, so that I don't get to those 5 zeros from the example).
It's not really efficient: the size of the coded file is increased by a long number.
How can I handle this in a better way?
Your approach (write the number of encoded bytes to the file) is perfectly reasonable. If you want to try a different avenue, you could consider inventing a new "pseudo-EOF" character that marks the end of the input (I'll denote it as □). Whenever you want to compress a string s, you instead compress the string s□. This means that when you build up your encoding tree, you would include one copy of the □ character so that you have a unique encoding for □. Then, when you write out the string to the file, you would write out the bits for the characters of the string as normal, then write out the bit pattern for □. If there are leftover bits, you can just leave them set arbitrarily.
The advantage of this approach is that as you decode the file, if at any point you find the □ character, you can immediately stop decoding bits because you know that you have hit the end of the file. This does not require you to store the number of bytes that were written out anywhere: the encoding implicitly marks its own endpoint.
The disadvantage of this setup is that it might increase the length of the bit patterns used by certain characters, since you will need to assign a bit pattern to □ in addition to all the other characters.
I teach an introductory programming course and we use Huffman encoding as one of our assignments. We have students use the above approach, since it's a bit easier than having to write out the number of bits or bytes before the file contents. For more details, you could take a look at this handout or these lecture slides from the course.
Hope this helps!
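The decoding side of the pseudo-EOF idea can be sketched as follows. The three-symbol code table ('a' = 0, 'b' = 10, pseudo-EOF = 11) and the name decode_until_eof are made up for illustration; a real decoder would walk the Huffman tree built from the input instead of hard-coding the codes.

```c
#include <stddef.h>

/* Hypothetical 3-symbol prefix code, chosen only for illustration:
 *   'a' = 0, 'b' = 10, pseudo-EOF = 11
 * Decodes bits most significant first and stops at the pseudo-EOF code, so
 * padding bits after it are never interpreted. Returns the number of chars
 * written; out must hold outcap bytes (outcap >= 1), including the NUL. */
size_t decode_until_eof(const unsigned char *data, size_t nbytes,
                        char *out, size_t outcap)
{
    size_t n = 0;
    int saw_one = 0;                 /* 1 after the first bit of "10" or "11" */

    for (size_t i = 0; i < nbytes; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            int b = (data[i] >> bit) & 1;
            if (saw_one && b) {      /* 11: pseudo-EOF, stop decoding */
                out[n] = '\0';
                return n;
            }
            if (b) {                 /* first bit of a two-bit code */
                saw_one = 1;
                continue;
            }
            if (n + 1 < outcap)      /* 0 decodes to 'a', 10 to 'b' */
                out[n++] = saw_one ? 'b' : 'a';
            saw_one = 0;
        }
    }
    out[n] = '\0';                   /* no pseudo-EOF found: input exhausted */
    return n;
}
```

For the single byte 0x2C (bits 00101100) this decodes "aab" and stops at the 11 pattern, so the two trailing padding zeros are not turned into two spurious 'a' characters.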
I know this is an old question, but still, there's an alternate, so it might help someone.
When you're writing your compressed file to output, you probably have some integer keeping track of where you are in the current byte (for bit shifting).
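char c, p;
p = '\0';                                  // byte currently being assembled
int curr = 7;                              // next bit position to fill, from 7 down to 0
while (infile.get(c))
{
    std::string trav = GetTraversal(c);    // code for c as a string of '0' and '1'
    for (std::size_t i = 0; i < trav.size(); i++)
    {
        if (trav[i] == '1')
            p |= (1 << curr);
        if (--curr < 0)                    // byte is full: write it and start a new one
        {
            outfile.put(p);
            p = '\0';
            curr = 7;
        }
    }
}
if (curr < 7)                              // flush a partially filled last byte
    outfile.put(p);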
At the end of this block, (curr+1)%8 equals the number of trash bits in the last data byte. You can then store it at the end as a single extra byte, and just keep it in mind when you're decompressing.
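On the decompression side, the stored count tells the decoder how many bits to trust. A small sketch, assuming a hypothetical file layout in which all data bytes are followed by exactly one byte holding the trash-bit count (valid_bit_count is an illustrative name, not part of the code above):

```c
#include <stddef.h>

/* Assumed layout (hypothetical): all data bytes, then one trailing byte
 * holding the number of trash bits (0..7) in the last data byte.
 * Returns how many bits of the data are real code bits. */
size_t valid_bit_count(const unsigned char *file, size_t nbytes)
{
    if (nbytes < 2)
        return 0;                              /* no data bytes at all */
    size_t data_bytes = nbytes - 1;
    unsigned trash = file[nbytes - 1] & 7u;    /* mask guards against bad input */
    return data_bytes * 8 - trash;
}
```

The decoder then reads exactly that many bits and ignores whatever follows in the last data byte.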

Little endian data and sha 256

I have to generate sha256 hashes of data that is in little endian form. I would like to know whether I have to convert it to big endian first, before using the sha 256 algorithm, or whether the algorithm is "endian-agnostic".
EDIT: Sorry, I think I wasn't clear. What I would like to know is the following: The sha256 algorithm requires padding the end of a message with certain bits. The first step is to append a 1 bit to the end of the message, then to pad it with zeros up to the end of the block. At the very end, you must add the length of the message in bits. What I would like to know is whether this padding can be performed in little endian. For example, for a 640 bit message, I could write the last word as 0x280 (in big endian), or 0x80020000 (in little endian). Can this padding be done in little endian?
SHA256 is endian-agnostic if all you want is a good hash. But if you are writing SHA256 yourself and want to get the same results as a correct implementation, then you must play games on little endian hardware. SHA256 combines arithmetic addition (mod 2^32) and boolean operations, and thus is not endian-agnostic internally.
The SHA-256 implementation itself should take care of padding - you shouldn't have to deal with that unless you're implementing your own specialized SHA-256 code. If you are, note that the padding rules specified in the "pre-processing step" say that the length is a 64-bit big-endian integer. See SHA-2 - Wikipedia
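As a sketch of the padding rule only (not a full SHA-256 implementation), the following writes the 0x80 byte, the zero fill, and the 64-bit big-endian bit length. The function name sha256_pad and the requirement that the caller supplies an output buffer of at least len + 72 bytes are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy msg into out and append SHA-256 padding: a single 1 bit (byte 0x80),
 * zero bytes, then the message length in bits as a 64-bit big-endian
 * integer. Returns the padded length, always a multiple of 64 bytes.
 * Assumption: out has room for at least len + 72 bytes. */
size_t sha256_pad(const unsigned char *msg, size_t len, unsigned char *out)
{
    size_t padded = ((len + 8) / 64 + 1) * 64;
    uint64_t bits = (uint64_t)len * 8;

    memcpy(out, msg, len);
    out[len] = 0x80;
    memset(out + len + 1, 0, padded - len - 1);
    for (int i = 0; i < 8; i++)          /* least significant byte goes last */
        out[padded - 1 - i] = (unsigned char)(bits >> (8 * i));
    return padded;
}
```

For the 640-bit (80-byte) message from the question, the padded length is 128 bytes and the final eight bytes are 00 00 00 00 00 00 02 80, regardless of the byte order of the machine running the code.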
It's hard to even figure out what "endian-agnostic" would mean, but the order of all the bits, bytes and words for a hash algorithm matter a whole lot, so I sure wouldn't use that term.
Let me reply regarding sha 256 as well as sha 512.
in short:
The algorithm itself is endian agnostic. The endian sensitive parts are when data is imported from a byte buffer to the algorithm working variables and when it is exported back to the digest result - also a byte buffer. If the import / export include casting, then endian matters.
Where could casting occur:
In sha 512 there is a working buffer of 128 bytes.
In my code it's defined like this:
union
{
    U64 w[80];          /* see the U64 example below */
    byte buffer[128];
};
Input data is copied to this byte buffer and then the work is done on w. This means the data was cast to some 64 bit type, so it has to be byte-swapped. In my case it's swapped on little endian machines.
A better method would be to prepare a get macro that takes each byte and places it in its correct place in the U64 type.
When the algorithm is done the digest result is output from the working variables to some byte buffer, if this is done by memcpy it will also have to be swapped.
Another casting could occur when implementing sha 512 - which is designed for 64 bit machines - on 32 bit machines. In my case I have a 64 bit type that is defined:
typedef struct {
    uint high;
    uint low;
} U64;
Assume I define it for little endian as well, as follows:
typedef struct {
    uint low;
    uint high;
} U64;
And then the initialization of the algorithm's k constants is done like this:
static const SHA_U64 k[80] =
{
    {0xD728AE22, 0x428A2F98}, {0x23EF65CD, 0x71374491}, ...
    ...
    ...
};
But I need the logical value of k[0].high to be the same on any machine.
So in this example I will need another k array with the high and low values swapped.
After the data is stored in the working parameters any bitwise manipulation would have the same result on both big/little endian machines.
A good method is to avoid any casting:
Import bytes from input buffer to your working parameters using macro.
Work with logical values without thinking about the memory mapping.
Export output to digest result with a macro.
Macros for reading 32 bits from a byte buffer into an int32 (BE = big endian, LE = little endian):
#define GET_BE_BYTES_FROM32(a) \
    ((((NQ_UINT32) (a)[0]) << 24) | \
     (((NQ_UINT32) (a)[1]) << 16) | \
     (((NQ_UINT32) (a)[2]) << 8) | \
     ((NQ_UINT32) (a)[3]))
#define GET_LE_BYTES_FROM32(a) \
    ((((NQ_UINT32) (a)[3]) << 24) | \
     (((NQ_UINT32) (a)[2]) << 16) | \
     (((NQ_UINT32) (a)[1]) << 8) | \
     ((NQ_UINT32) (a)[0]))
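The export direction mentioned above (writing working variables back to the digest buffer) can be handled the same way. The sketch below uses plain functions instead of macros and the standard uint32_t type; load_be32 and store_be32 are hypothetical names and not part of the code this answer describes.

```c
#include <stdint.h>

/* Read 4 bytes as a big-endian 32-bit value, independent of host byte order. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value as 4 big-endian bytes, e.g. into the digest buffer. */
void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

Because both functions work through shifts on logical values, they behave identically on big and little endian machines and no byte swapping is needed.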

VB6: Surely this simple Hex addition is wrong?

I'm getting odd results in some VB6 code which I've narrowed to this:
Debug.Print Hex(&hEDB80000 + &h8300)
Shows EDB78300
That can't be right, can it? Surely it should be EDB88300?
Am I going mad?
Don't forget how negative numbers are expressed in binary, and that VB6 and VB.NET interpret numbers like &h8300 differently.
Because &hEDB80000 doesn't fit in 16-bits, VB interprets it as a long (32-bits). Because the high bit is set, VB6 knows it's negative.
Let's undo the two's complement (in a 32-bit world) to figure out the decimal value
(~&hEDB80000 + 1) = &h1247FFFF + 1 = &h12480000 = 306708480
since the sign bit was set, that's -306708480
Because &h8300 fits in 16-bits, VB interprets it as an integer (16-bits). Because the high bit is set, VB6 knows that it's negative.
Let's undo the two's complement (in a 16-bit world)
(~&h8300 + 1) = &h7CFF + 1 = &h7D00 = 32000
since the sign bit was set, that's -32000. When the addition happens, both values are considered to be longs (32-bits).
(-306708480) + (-32000) = -306740480
Let's put that back into two's complement hex
~(306740480 - 1) = ~(&h12487D00 - 1) = ~(&h12487CFF) = &hEDB78300
So &hEDB78300 is the correct answer.
Notes:
I personally think the confusion happens because of the following:
&h0004000 is interpreted as 16384 // Fits in 16-bits, sign bit is not set
&h0008000 is interpreted as -32768 // Fits in 16-bits, sign bit is set
&h0010000 is interpreted as 65536 // Requires 32-bits, sign bit is not set
As mentioned in the other post, you can get around this by explicitly marking values as Longs:
&h0004000& is interpreted as 16384
&h0008000& is interpreted as 32768
&h0010000& is interpreted as 65536
Fundamentally, because VB6 sees &h8300 as an Integer having the value -32000. To get the result you were expecting, you would need to explicitly mark it as a Long:
Debug.Print Hex(&hEDB80000 + &h8300&)
What you were doing was adding a Long to an Integer. To do that, VB6 first extends the Integer to a Long; since &h8300 represents a negative number, the Long it is converted to ends up with the value &hFFFF8300. Armed with that value, you can see that the result returned by VB6 is correct.
FF + B8 = B7 with carry bit set
FF + ED + carry bit = ED
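The widening step can be reproduced outside VB6. The C sketch below works on raw bit patterns and wraps modulo 2^32; it ignores VB6's overflow checking, and the names sign_extend16 and add_long_and_integer are invented for this illustration.

```c
#include <stdint.h>

/* Widen a 16-bit pattern to 32 bits the way a signed Integer is promoted
 * to a Long: copy the sign bit into the upper 16 bits. */
uint32_t sign_extend16(uint16_t bits)
{
    return (bits & 0x8000u) ? (0xFFFF0000u | bits) : bits;
}

/* Add a 32-bit pattern and a sign-extended 16-bit pattern, modulo 2^32. */
uint32_t add_long_and_integer(uint32_t long_bits, uint16_t int_bits)
{
    return long_bits + sign_extend16(int_bits);
}
```

Here sign_extend16(0x8300) gives 0xFFFF8300, and add_long_and_integer(0xEDB80000, 0x8300) gives 0xEDB78300, the same value VB6 printed.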

Resources