Problem decoding h264 over RTP TCP stream - ffmpeg

I'm trying to receive an RTP stream carrying H.264 over TCP from my Hikvision DS-KH8350-WTE1 intercom. By reverse engineering I was able to replicate how the original Hikvision software (Hik-Connect on iPhone and iVMS-4200 on macOS) connects and negotiates streaming. I'm now getting the very same stream as the original apps (verified through Wireshark), and I need to make sense of it. I know it's RTP because I inspected how iVMS-4200 handles it using /usr/bin/sample on macOS, which yields:
! : 2 CStreamConvert::InputData(void*, int) (in libOpenNetStream.dylib) + 52 [0x11ff7c7a6]
+ ! : 2 SYSTRANS_InputData (in libSystemTransform.dylib) + 153 [0x114f917f2]
+ ! : 1 CRTPDemux::ProcessH264(unsigned char*, unsigned int, unsigned int, unsigned int) (in libSystemTransform.dylib) + 302 [0x114fa2c04]
+ ! : | 1 CRTPDemux::AddAVCStartCode() (in libSystemTransform.dylib) + 47 [0x114fa40f1]
+ ! : 1 CRTPDemux::ProcessH264(unsigned char*, unsigned int, unsigned int, unsigned int) (in libSystemTransform.dylib) + 476 [0x114fa2cb2]
+ ! : 1 CRTPDemux::ProcessVideoFrame(unsigned char*, unsigned int, unsigned int) (in libSystemTransform.dylib) + 1339 [0x114fa29b3]
+ ! : 1 CMPEG2PSPack::InputData(unsigned char*, unsigned int, FRAME_INFO*) (in libSystemTransform.dylib) + 228 [0x114f961d6]
+ ! : 1 CMPEG2PSPack::PackH264Frame(unsigned char*, unsigned int, FRAME_INFO*) (in libSystemTransform.dylib) + 238 [0x114f972fe]
+ ! : 1 CMPEG2PSPack::FindAVCStartCode(unsigned char*, unsigned int) (in libSystemTransform.dylib) + 23 [0x114f97561]
I can break on these with lldb and see the arriving packet data, which matches the format I'm describing.
The packet signatures look like the following:
0x24 0x02 0x05 0x85 0x80 0x60 0x01 0x57 0x00 0x00 0x00 0x02 0x00 0x00 0x27 0xde
0x0d 0x80 0x60 0x37 0x94 0x71 0xe3 0x97 0x10 0x77 0x20 0x2c 0x51 | 0x7c
0x85 0xb8 0x00 0x00 0x00 0x00 0x01 0x65 0xb8 0x00 0x00 0x0a 0x35 ...
0x24 0x02 0x05 0x85 0x80 0x60 0x01 0x58 0x00 0x00 0x00 0x02 0x00 0x00 0x27 0xde
0x0d 0x80 0x60 0x37 0x95 0x71 0xe3 0x97 0x10 0x77 0x20 0x2c 0x51 | 0x7c 0x05 0x15 0xac ...
0x24 0x02 0x05 0x85 0x80 0x60 0x01 0x59 0x00 0x00 0x00 0x02 0x00 0x00 0x27 0xde
0x0d 0x80 0x60 0x37 0x96 0x71 0xe3 0x97 0x10 0x77 0x20 0x2c 0x51 | 0x7c 0x05 0x5d 0x00 ...
By reverse engineering the original software I figured out that 0x7c85 indicates a key frame. During processing, the genuine software replaces the 0x7c85 bytes with the H.264 key-frame NALU start sequence 00 00 00 01 65. That's H.264 Annex B format.
The 0x7c05 packets always follow and carry the remaining payload of the key frame. No NALU start code is added during their handling (the 0x7c05 prefix is stripped and the rest of the bytes are copied). None of the bytes preceding 0x7cXX make it into an mp4 recording (which makes sense, since that's the RTP header, although I'm not sure whether it's entirely standard RTP or contains something custom from Hikvision).
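That replacement is exactly what FU-A fragmentation from RFC 6184 prescribes, which suggests the payload is standard after the Hikvision-specific prefix: 0x7c is an FU indicator (F=0, NRI=3, type 28 = FU-A) and 0x85 is an FU header with the start bit set and NAL type 5 (IDR). The reconstructed NAL header is (0x7c & 0xE0) | (0x85 & 0x1F) = 0x65, and likewise 0x81 gives 0x61, matching both observations above. A minimal depacketizer sketch under that assumption (Python; the names are mine, and payload is assumed to start at the FU indicator, with the preceding header bytes already stripped):

def depacketize_fua(payload: bytes, annexb_out: bytearray) -> None:
    fu_indicator = payload[0]   # F(1) | NRI(2) | type(5); 0x7c -> type 28 = FU-A
    fu_header = payload[1]      # S(1) | E(1) | R(1) | NAL type(5)
    if fu_header & 0x80:        # start fragment (0x85, 0x81, ...)
        # Rebuild the original NAL header: NRI bits from the indicator,
        # NAL type from the FU header. 0x7c 0x85 -> 00 00 00 01 65.
        nal_header = (fu_indicator & 0xE0) | (fu_header & 0x1F)
        annexb_out += b"\x00\x00\x00\x01" + bytes([nal_header])
    # Middle (0x7c05) and end fragments just contribute their payload bytes.
    annexb_out += payload[2:]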
If you pay close attention, the header contains two separate bytes indicating sequence order, and they always match, so I'm sure no packet loss is occurring.
I also observed non-key frames arriving as 0x7c81 and being converted to the 00 00 00 01 61 NALU, but I want to focus solely on a single key frame for now, mostly because a movie recorded with the original software always begins with the 00 00 00 01 65 key frame (which obviously makes sense).
To get a working mp4 I decided to copy-paste the mp4 header of a genuine iVMS-4200 recording (meaning every byte preceding the first frame NALU 00 00 00 01 65 in the mp4 file). I know the resolution will match the actual camera footage. With the strategy of waiting for a key frame, replacing 0x7c85 with the 00 00 00 01 65 NALU and appending the remaining bytes, or only appending the bytes in the 0x7c05 case, I do seem to get something that could eventually work.
When I attempt to ffplay the custom-crafted mp4 I do get something (with a little stretch of imagination it's actually the camera's fisheye image forming), but clearly there is a problem.
It seems that around the 3rd or 4th 0x7c05 packet (the failing packet differs on every run), the copied bytes make the h264 stream incorrect. Just by eye-inspecting the bytes I don't see anything unusual.
The failing packet is around offset 750 decimal (I know it's around this place because I keep stripping bytes away to see whether the same number of frames arrives before it breaks).
Moreover, I dumped those bytes from the original software using lldb, taking my own Python implementation out of the equation, and I ran into the very same problem with the original packets.
The mp4 header I use should work, since it works for original recordings even if I manipulate the number of frames and leave just the first keyframe.
Correct me if I'm wrong, but the phase of converting this to MPEG2-PS (which iVMS-4200 does and I don't) should be entirely optional and should not be related to my problem.
Update:
I went down the path of setting up a recording and only then dumping the original iVMS-4200 packets. I edited the recorded movie to contain only the keyframe of interest, and it works. I found differences, but I cannot explain why they are there yet:
Somehow 00 00 01 E0 13 FA 88 00 02 FF FF is inserted in the genuine recording (that's the 4th packet), but I have no idea how this byte string was generated or what its purpose is.
When I fixed the first difference, the next one appeared:
The pattern is striking. But what is 00 00 01 E0 13 FA 88 00 02 FF FF actually? And why is it inserted after 18 03 25 10 & 2F F4 80 FC?
The 00 00 01 E0 signature would suggest those are Packetized Elementary Stream (PES) headers.
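Those eleven bytes do parse cleanly as one. Assuming a standard ISO/IEC 13818-1 PES header (which the CMPEG2PSPack step in the stack trace above suggests), they break down as: start-code prefix 00 00 01, stream id E0 (a video elementary stream), PES packet length 13 FA, flag bytes 88 00 (marker bits '10' plus data_alignment_indicator set; no PTS/DTS), header-data length 02, and two FF stuffing bytes. A small parsing sketch for inspection (Python; the function name is mine):

import struct

def parse_pes_header(buf: bytes):
    assert buf[0:3] == b"\x00\x00\x01"     # PES start-code prefix
    stream_id = buf[3]                     # 0xE0-0xEF = video stream
    (packet_length,) = struct.unpack(">H", buf[4:6])
    flags1, flags2, header_len = buf[6], buf[7], buf[8]
    # flags1 = 0x88: '10' marker bits + data_alignment_indicator
    # flags2 = 0x00: no PTS/DTS present
    # header_len = 0x02: two optional bytes follow (the 0xFF 0xFF stuffing)
    return stream_id, packet_length, header_len

print(parse_pes_header(bytes.fromhex("000001E013FA880002FFFF")))
# -> (224, 5114, 2)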

Going for the mp4 container wasn't a good choice after all. It turns out the RTP stream essentially yields a raw h264 stream. To inspect its structure I converted the genuine mp4 recording to .264 like this:
ffmpeg -i recording.mp4 -codec copy recording.264
It's essentially an SPS (00 00 00 01 67) and a PPS (00 00 00 01 68) followed by the frame-data NALUs that I was getting in the stream.
Raw h264 turned out to be a much simpler structure to aim for, and I don't have to deal with those Packetized Elementary Stream (PES) headers anymore. That yields a correct image. In my case I just took the SPS & PPS settings that the original recordings use from recording.264. That could definitely be resolved dynamically somehow, but I didn't bother.
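For reference, resolving the parameter sets dynamically shouldn't be hard: they can be pulled straight out of any Annex B stream by NAL type. A sketch (Python; the helper name is mine, and it assumes 4-byte start codes as found in recording.264):

def extract_parameter_sets(annexb: bytes):
    # Split on the 4-byte Annex B start code; each chunk is one NAL unit.
    sps = pps = None
    for nal in annexb.split(b"\x00\x00\x00\x01"):
        if not nal:
            continue
        nal_type = nal[0] & 0x1F       # low 5 bits of the NAL header byte
        if nal_type == 7:              # SPS (header byte 0x67 here)
            sps = nal
        elif nal_type == 8:            # PPS (header byte 0x68 here)
            pps = nal
        if sps and pps:
            break
    return sps, pps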

Related

How to explain embedded message binary wire format of protocol buffer?

I'm trying to understand the protocol buffer encoding method. When translating a message to binary (or hexadecimal) format, I can't understand how the embedded message is encoded.
I guess maybe it's related to memory addresses, but I can't find the exact relationship.
Here is what I've done.
Step 1: I defined two messages in the test.proto file:
syntax = "proto3";
package proto_test;
message Education {
string college = 1;
}
message Person {
int32 age = 1;
string name = 2;
Education edu = 3;
}
Step 2: Then I generated some Go code:
protoc --go_out=. test.proto
Step 3: Then I checked the encoded format of the message:
p := proto_test.Person{
    Age:  666,
    Name: "Tom",
    Edu: &proto_test.Education{
        College: "SOMEWHERE",
    },
}
var b []byte
out, err := p.XXX_Marshal(b, true)
if err != nil {
    log.Fatalln("fail to marshal with error: ", err)
}
fmt.Printf("hexadecimal format:% x \n", out)
fmt.Printf("binary format:% b \n", out)
which outputs,
hexadecimal format:08 9a 05 12 03 54 6f 6d 1a fd 96 d1 08 0a 09 53 4f 4d 45 57 48 45 52 45
binary format:[ 1000 10011010 101 10010 11 1010100 1101111 1101101 11010 11111101 10010110 11010001 1000 1010 1001 1010011 1001111 1001101 1000101 1010111 1001000 1000101 1010010 1000101]
What I understand is:
08 - field 1, varint wire type
9a 05 - varint for 666
12 - field 2, length-delimited wire type (string)
03 - length, which is 3 bytes
54 6f 6d - ASCII for "Tom"
1a - field 3, length-delimited wire type (embedded message)
fd 96 d1 08 - ? (here is what I don't understand)
0a - field 1, length-delimited wire type (string)
09 - length, which is 9 bytes
53 4f 4d 45 57 48 45 52 45 - ASCII for "SOMEWHERE"
What does fd 96 d1 08 stand for?
It seems like d1 08 is always there, but fd 96 sometimes changes; I don't know why. Thanks for answering :)
Update:
I debugged the marshal process and reported a bug here.
At that location you would expect the number of bytes in the embedded message.
I have repeated your experiment in Python.
from test_pb2 import Person  # module generated by: protoc --python_out=. test.proto

msg = Person()
msg.age = 666
msg.name = "Tom"
msg.edu.college = "SOMEWHERE"
I got a different result, the one I would expect. A varint stating the size of the embedded message.
0x08
0x9A, 0x05
0x12
0x03
0x54 0x6F 0x6D
0x1A
0x0B <- Equals 11 decimal.
0x0A
0x09
0x53 0x4F 0x4D 0x45 0x57 0x48 0x45 0x52 0x45
Next I deserialized your bytes:
msg2 = Person()
data = bytearray(b'\x08\x9a\x05\x12\x03\x54\x6f\x6d\x1a\xfd\x96\xd1\x08\x0a\x09\x53\x4f\x4d\x45\x57\x48\x45\x52\x45')
msg2.ParseFromString(data)
print(msg2)
The result of this is perfect:
age: 666
name: "Tom"
edu {
college: "SOMEWHERE"
}
The conclusion I come to is that there are some different ways of encoding in Protobuf. I do not know what was done in this case, but I know the example of a negative 32-bit varint: a positive int32 is encoded in up to five bytes, while a negative value is sign-extended to 64 bits and encoded in ten bytes.
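For reference, the base-128 varint arithmetic is easy to check by hand with a small sketch (Python; the helper names are mine). It confirms that 9a 05 is 666, that a correct length prefix for the 11-byte embedded message would be the single byte 0b, and that fd 96 d1 08, read as a varint, would claim an implausibly large length. (It may also be relevant that the XXX_-prefixed methods in Go generated code are internal implementation details; the documented entry point is proto.Marshal.)

def encode_varint(n: int) -> bytes:
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)    # continuation bit: more bytes follow
        else:
            out.append(b)
            return bytes(out)

def decode_varint(buf: bytes) -> int:
    result = shift = 0
    for b in buf:
        result |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            break
    return result

print(encode_varint(666).hex())                   # 9a05
print(encode_varint(11).hex())                    # 0b
print(decode_varint(bytes.fromhex("fd96d108")))   # 18107261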

Gameboy emulation - Clarification needed on CD instruction

I'm currently in the process of writing a Gameboy emulator, and I've noticed something that seems strange to me.
My emulator is hitting a jump instruction 0xCD (for example CD B6 FF), but my understanding was that a jump should only target an address within cartridge ROM (0x7FFF maximum), because I'm assuming the CPU can only execute instructions from ROM, not RAM. The ROM in question is Dr. Mario, which I'd expect to only be carrying out valid operations. 0xFFB6 is in high RAM, which seems odd to me.
Am I correct in my thinking? If I am, presumably that means my program counter is somehow ending up at the wrong address and the CD is actually part of another instruction's data, not an instruction itself?
I'd be grateful for some clarification, thanks.
For reference, I've been using Gameboy Opcodes and CPU docs to implement the instructions. I know they contain a few errors, and I think I've accounted for them (for example, 0xE2 being listed as a two-byte instruction, when it's only one)
Just checked Dr. Mario 1.1: it copies the VBlank interrupt routine to hFFB6 at startup; then, when VBlank happens, the routine at 0:01A6 is called, which calls the OAM DMA transfer routine.
During an OAM DMA transfer the CPU can only access HRAM, so a short routine that waits for the transfer to complete has to live in HRAM. The OAM DMA transfer takes 160 µs, so you usually write a loop that waits that long after specifying the OAM transfer source.
This is the part of the initialization routine run at startup that copies the DMA transfer routine to HRAM:
...
ROM0:027E 0E B6 ld c,B6 ;destination hFFB6
ROM0:0280 06 0A ld b,0A ;length 0xA
ROM0:0282 21 86 23 ld hl,2386 ;source 0:2386
ROM0:0285 2A ldi a,(hl) ;copy OAM DMA transfer routine from source
ROM0:0286 E2 ld (ff00+c),a ;paste to destination
ROM0:0287 0C inc c ;destination++
ROM0:0288 05 dec b ;length--
ROM0:0289 20 FA jr nz,0285 ;loop until DMA transfer routine is copied
...
When VBlank happens, it jumps to the routine at 0:01A6:
ROM0:0040 C3 A6 01 jp 01A6
Which contains a call to our OAM DMA transfer routine, waiting for DMA to be completed:
ROM0:01A6 F5 push af
ROM0:01A7 C5 push bc
ROM0:01A8 D5 push de
ROM0:01A9 E5 push hl
ROM0:01AA F0 B1 ld a,(ff00+B1)
ROM0:01AC A7 and a
ROM0:01AD 28 0B jr z,01BA
ROM0:01AF FA F1 C4 ld a,(C4F1)
ROM0:01B2 A7 and a
ROM0:01B3 28 05 jr z,01BA
ROM0:01B5 F0 EF ld a,(ff00+EF)
ROM0:01B7 A7 and a
ROM0:01B8 20 09 jr nz,01C3
ROM0:01BA F0 E1 ld a,(ff00+E1)
ROM0:01BC FE 03 cp a,03
ROM0:01BE 28 03 jr z,01C3
ROM0:01C0 CD B6 FF call FFB6 ;OAM DMA transfer routine is in HRAM
...
OAM DMA transfer routine:
HRAM:FFB6 3E C0 ld a,C0
HRAM:FFB8 E0 46 ld (ff00+46),a ;source is wC000
HRAM:FFBA 3E 28 ld a,28 ;loop start
HRAM:FFBC 3D dec a
HRAM:FFBD 20 FD jr nz,FFBC ;wait for the OAM DMA to be completed
HRAM:FFBF C9 ret ;ret to 0:01C3
Here is my analysis:
Looking for CD B6 FF in the raw ROM, I can only find it in one place, which is 0x01C0 (448 in decimal).
So I decided to disassemble the ROM, to see if it is a valid instruction.
I used gb-disasm to disassemble the ROM. Here are the values from 0x150 (ROM start) to address 0x201.
[0x00000100] 0x00 NOP
[0x00000101] 0xC3 0x50 0x01 JP $0150
[0x00000150] 0xC3 0xE8 0x01 JP $01E8
[0x00000153] 0x01 0x0E 0xD0 LD BC,$D00E
[0x00000156] 0x0A LD A,[BC]
[0x00000157] 0xA7 AND A
[0x00000158] 0x20 0x0D JR NZ,$0D ; 0x167
[0x0000015A] 0xF0 0xCF LDH A,[$CF] ; HIMEM
[0x0000015C] 0xFE 0xFE CP $FE
[0x0000015E] 0x20 0x04 JR NZ,$04 ; 0x164
[0x00000160] 0x3E 0x01 LD A,$01
[0x00000162] 0x18 0x01 JR $01 ; 0x165
[0x00000164] 0xAF XOR A
[0x00000165] 0x02 LD [BC],A
[0x00000166] 0xC9 RET
[0x00000167] 0xFA 0x46 0xD0 LD A,[$D046]
[0x0000016A] 0xE0 0x01 LDH [$01],A ; SB
[0x0000016C] 0x18 0xF6 JR $F6 ; 0x164
[0x000001E8] 0xAF XOR A
[0x000001E9] 0x21 0xFF 0xDF LD HL,$DFFF
[0x000001EC] 0x0E 0x10 LD C,$10
[0x000001EE] 0x06 0x00 LD B,$00
[0x000001F0] 0x32 LD [HLD],A
[0x000001F1] 0x05 DEC B
[0x000001F2] 0x20 0xFC JR NZ,$FC ; 0x1F0
[0x000001F4] 0x0D DEC C
[0x000001F5] 0x20 0xF9 JR NZ,$F9 ; 0x1F0
[0x000001F7] 0x3E 0x0D LD A,$0D
[0x000001F9] 0xF3 DI
[0x000001FA] 0xE0 0x0F LDH [$0F],A ; IF
[0x000001FC] 0xE0 0xFF LDH [$FF],A ; IE
[0x000001FE] 0xAF XOR A
[0x000001FF] 0xE0 0x42 LDH [$42],A ; SCY
[0x00000201] 0xE0 0x43 LDH [$43],A ; SCX
The way we have to disassemble a ROM is by following the flow of instructions. For example, we know that the main program starts at position 0x150. So we should start disassembling there. Then we follow instruction by instruction until we hit any JUMP instruction (JP, JR, CALL, RET, etc). From that moment on the flow of the program is forked in two and we should follow both paths to disassemble.
The thing to understand here is that if I show you a random memory position in a ROM, you cannot tell me whether it is data or instructions. The only way to find out is by following the program flow. We need to define blocks of code that start at a jump destination and end at another jump instruction.
gb-disasm skips any memory position that is not inside a code block. 0x16C marks the end of a block.
[0x0000016C] 0x18 0xF6 JR $F6 ; 0x164
The next block starts on 0x1E8. We know that because it is the destination address of a jump located on 0x150.
[0x00000150] 0xC3 0xE8 0x01 JP $01E8
The memory block from 0x16E until 0x1E8 is not considered a code block. That's why you don't see memory position 0x01C0 listed as an instruction.
So there you are: it is very likely that you are interpreting the instructions in the wrong way. If you want to be 100% sure, you can disassemble the whole ROM and check whether any instruction points into 0x16E-0x1E8 and reads it as raw data, such as a tile or something.
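To make the flow-following idea concrete, here is a toy sketch of it (Python). Only a handful of opcode lengths and branches are modeled, just enough for the listings above; a real tool needs the complete Game Boy opcode table:

SIZES = {0xC3: 3, 0xCD: 3, 0xFA: 3, 0x01: 3, 0x21: 3,
         0x18: 2, 0x20: 2, 0x28: 2, 0x06: 2, 0x0E: 2,
         0x3E: 2, 0xE0: 2, 0xF0: 2, 0xFE: 2}          # anything else: 1 byte

def trace_code(rom: bytes, entries):
    code, work = set(), list(entries)     # entries, e.g. [0x100, 0x40]
    while work:
        pc = work.pop()
        while pc < len(rom) and pc not in code:
            code.add(pc)
            op = rom[pc]
            if op in (0xC3, 0xCD):        # jp a16 / call a16: absolute fork
                work.append(rom[pc + 1] | rom[pc + 2] << 8)
            elif op in (0x18, 0x20, 0x28):            # jr: relative fork
                disp = rom[pc + 1]
                work.append(pc + 2 + (disp - 256 if disp > 127 else disp))
            if op in (0xC3, 0xC9, 0x18):  # unconditional jp/ret/jr ends the block
                break
            pc += SIZES.get(op, 1)
    return code                           # addresses never reached are data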
Please leave a comment if you agree with the analysis.

What's the memory layout of UTF-16 encoded strings with Visual Studio 2015?

WinAPI uses wchar_t buffers. As I understand it, we need to use UTF-16 to encode all our arguments to WinAPI.
There are two variants of UTF-16: UTF-16BE and UTF-16LE. Let's encode the string "Example": 0x45 0x78 0x61 0x6d 0x70 0x6c 0x65. With UTF-16BE the bytes are placed like this: 00 45 00 78 00 61 00 6d 00 70 00 6c 00 65. With UTF-16LE it's 45 00 78 00 61 00 6d 00 70 00 6c 00 65 00 (omitting the BOM). The byte representations of the same string are different.
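A quick way to reproduce those two byte layouts (shown in Python purely for illustration):

s = "Example"
print(s.encode("utf-16-be").hex(" "))   # 00 45 00 78 00 61 00 6d 00 70 00 6c 00 65
print(s.encode("utf-16-le").hex(" "))   # 45 00 78 00 61 00 6d 00 70 00 6c 00 65 00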
According to the docs, Windows uses UTF-16LE. This means we should encode all strings as UTF-16LE or it won't work.
At the same time my compiler (VS2015) appears to use UTF-16BE for the strings I hard-code (something like L"my test string"). Yet WinAPI works fine with these strings. Why does it work? What am I missing?
Update 1:
To test the byte representation of hard-coded strings I used the following code:
std::string charToHex(wchar_t ch)
{
    const char alphabet[] = "0123456789ABCDEF";
    std::string result(4, ' ');
    result[0] = alphabet[static_cast<unsigned int>((ch & 0xf000) >> 12)];
    result[1] = alphabet[static_cast<unsigned int>((ch & 0xf00) >> 8)];
    result[2] = alphabet[static_cast<unsigned int>((ch & 0xf0) >> 4)];
    result[3] = alphabet[static_cast<unsigned int>(ch & 0xf)];
    return result;
}
Little endian or big endian describes the way that variables of more than 8 bits are stored in memory. The test you have devised doesn't test memory layout, it's working with wchar_t types directly; the upper bits of an integer type are always the upper bits, no matter if the CPU is big endian or little endian!
This modification to your code will show how it really works.
std::string charToHex(wchar_t *pch)
{
    const char alphabet[] = "0123456789ABCDEF";
    std::string result;
    // reinterpret_cast (not static_cast) is needed to view the object's raw bytes
    unsigned char *pbytes = reinterpret_cast<unsigned char *>(pch);
    for (size_t i = 0; i < sizeof(wchar_t); ++i)
    {
        result.push_back(alphabet[(pbytes[i] & 0xf0) >> 4]);
        result.push_back(alphabet[pbytes[i] & 0x0f]);
    }
    return result;
}

ACR122 - Card Emulation

How can I get the NFC contactless reader ACR122U to behave as a tag (card emulation mode)?
The prospectus claims that the device can do card emulation, but the SDK does not seem to provide an example or documentation for this feature.
Does anybody know how to do this?
Is there additional software required?
Please note that my target platform is MS Windows.
Thanks in advance
For "Card Emulation" or in other words, "Configure as target and wait for initiators", please refer to here: http://code.google.com/p/nfcip-java/source/browse/trunk/nfcip-java/doc/ACR122_PN53x.txt
** Command to PN532 **
0xd4 0x8c TgInitAsTarget instruction code
0x00 Acceptable modes
(0x00 = allow all, 0x01 = only allow to be
initialized as passive, 0x02 = allow DEP only)
6 bytes (MIFARE):
0x08 0x00 SENS_RES
0x12 0x34 0x56 NFCID1
0x40 SEL_RES
18 bytes (FeliCa):
0x01 0xfe 0xa2 0xa3 0xa4 0xa5 0xa6 0xa7
NFCID2
0xc0 0xc1 0xc2 0xc3 0xc4 0xc5 0xc6 0xc7
?
0xff 0xff System parameters?
0xaa 0x99 0x88 0x77 0x66 0x55 0x44 0x33 0x22 0x11
NFCID3
0x00 ?
0x00 ?
This is the response when an initiator activated this target:
** Response from PN532 **
0xd5 0x8d TgInitAsTarget response code
0x04 Mode
(0x04 = DEP, 106kbps)
Let me know if it works!
Also, you can try sending the following APDU in hex to put the reader in card emulation mode:
FF 00 00 00 27 D4 8C 00 08 00 12 34 56 40 01 FE A2 A3 A4 A5 A6 A7 C0 C1 C2 C3 C4 C5 C6 C7 FF FF AA 99 88 77 66 55 44 33 22 11 00 00
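If you want to script that on Windows, the APDU can be sent over PC/SC, for instance with the pyscard library. A sketch (it assumes the ACR122U is the first enumerated reader and omits error handling):

from smartcard.System import readers

# The card-emulation APDU from above, verbatim
apdu = list(bytes.fromhex(
    "FF 00 00 00 27 D4 8C 00 08 00 12 34 56 40 01 FE"
    " A2 A3 A4 A5 A6 A7 C0 C1 C2 C3 C4 C5 C6 C7 FF FF"
    " AA 99 88 77 66 55 44 33 22 11 00 00"))

reader = readers()[0]            # assumes the ACR122U is the first PC/SC reader
conn = reader.createConnection()
conn.connect()
data, sw1, sw2 = conn.transmit(apdu)
print("response:", bytes(data).hex(" "), "status:", hex(sw1), hex(sw2))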
For getting the ACR122 (or rather the PN532 NFC controller chip inside it) into card emulation mode, you would do about the following:
ReadRegister:
> FF000000 08 D406 6305 630D 6338
< D507 xx yy zz 9000
Update register values:
xx = xx | 0x004; // CIU_TxAuto |= InitialRFOn
yy = yy & 0x0EF; // CIU_ManualRCV &= ~ParityDisable
zz = zz & 0x0F7; // CIU_Status2 &= ~MFCrypto1On
WriteRegister:
> FF000000 11 D408 6302 80 6303 80 6305 xx 630D yy 6338 zz
< D509 9000
SetParameters:
> FF000000 03 D412 30
< D513 9000
TgInitAsTarget
> FF000000 27 D48C 05 0400 123456 20 000000000000000000000000000000000000 00000000000000000000 00 00
< D58D xx ... 9000
Where xx should be equal to 0x08.
Communicate using a sequence of TgGetData and TgSetData commands:
> FF000000 02 D486
< D587 xx <C-APDU> 9000
Where xx is the status code (should be 0x00 for success) and C-APDU is the command sent from the reader.
> FF000000 yy D48E <R-APDU>
< D58F xx 9000
Where yy is 2 + the length of the R-APDU (response) and xx is the status code (should be 0x00 for success).
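Putting the TgGetData/TgSetData exchange into a loop might look like the sketch below (pyscard again, building on the transmit example earlier; handle_capdu is a hypothetical callback mapping each incoming C-APDU to your R-APDU bytes, and the framing follows the byte layout described above):

def emulation_loop(conn, handle_capdu):
    while True:
        # TgGetData: ask the PN532 for the next C-APDU from the reader
        data, sw1, sw2 = conn.transmit([0xFF, 0x00, 0x00, 0x00, 0x02, 0xD4, 0x86])
        if data[:2] != [0xD5, 0x87] or data[2] != 0x00:
            break                          # non-zero status: link lost or error
        r_apdu = handle_capdu(bytes(data[3:]))
        # TgSetData: hand the R-APDU back; Lc is 2 + len(R-APDU)
        frame = [0xD4, 0x8E] + list(r_apdu)
        conn.transmit([0xFF, 0x00, 0x00, 0x00, len(frame)] + frame)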
You can use LibNFC. It has example code for this.
I still never got this working properly in Windows unfortunately. You will probably have to compile libnfc for specific drivers.
Also, the ACR122U seems to be pretty poorly supported by many libraries; apparently it's not really designed for this use, and there are particular issues for card emulation too (such as the timeout). We really all need to stop buying the ACR122U. I just bought what was popular and easy to get hold of, but I regret it now.
To future browsers/searchers coming across this: please check the compatibility section on the libnfc site and buy something that they recommend!

Function pointer incorrect in Visual Studio 2005, code starts at 1 byte offset

The code in question hooks into explorer.exe but was crashing on entry to the callback function:
Unhandled exception at 0x60055b50 (redacted.dll) in explorer.exe: 0xC0000005: Access violation writing location 0x548b0cca.
Callstack:
> redacted.dll!myCallWndProcRetCallback(int nCode=0x00000000, unsigned int wParam=0x00000000, long lParam=0x015afa58) Line 799 C++
user32.dll!_DispatchHookW@16() + 0x31 bytes
user32.dll!_fnHkINLPCWPRETSTRUCTW@20() + 0x5e bytes
user32.dll!___fnDWORD@4() + 0x24 bytes
ntdll.dll!_KiUserCallbackDispatcher@12() + 0x13 bytes
user32.dll!_NtUserMessageCall@28() + 0xc bytes
user32.dll!_SendMessageW@16() + 0x49 bytes
explorer.exe!CTaskBand::_FindIndexByHwnd() + 0x21 bytes
explorer.exe!CTaskBand::_HandleShellHook() + 0x48 bytes
explorer.exe!CTaskBand::v_WndProc() + 0x660 bytes
explorer.exe!CImpWndProc::s_WndProc() + 0x3f bytes
Visual Studio 2005 gave the following disassembly:
--- c:\projects\redacted.cpp -------------------------
//------------------------------------------------------------------------------
LRESULT CALLBACK myCallWndProcRetCallback(int nCode, WPARAM wParam, LPARAM lParam) {
60055B50 inc dword ptr [ebx+548B0CC4h]
60055B56 and al,18h
60055B58 mov eax,dword ptr [g_callWndProcRetHook (600B9EE8h)]
60055B5D push esi
and the memory around 0x548B0CC4 shows as all ??????, so it is not mapped memory; hence the crash.
The machine code at the start of myCallWndProcRetCallback is this:
0x60055B50: ff 83 c4 0c 8b 54 24 18 a1 e8 9e 0b 60 56 52 57 50 ff 15 8c a6 09 60 5f 5e 83 c4 08 c2 0c 00 cc 8b 4c 24 04 8b 01 8b 50
But Visual Studio also sometimes gives the following disassembly for this function:
--- c:\projects\redacted.cpp -------------------------
60055B51 add esp,0Ch
if ( nCode == HC_ACTION && lParam != NULL) {
60055B54 mov edx,dword ptr [esp+18h]
60055B58 mov eax,dword ptr [g_callWndProcRetHook (600B9EE8h)]
60055B5D push esi
This looks like the correct disassembly, but it starts 1 byte later than the disassembly above! You can see the instructions are the same from 0x60055B58 onward.
So it looks like the linker says the function is at 0x60055B50 but the code actually starts at 0x60055B51. I have confirmed the former is the callback passed to the Windows hook, so when Windows calls back into the function it executes bad code.
The question I have is: how could the linker get this wrong? I did a rebuild and the problem went away; it seems random. At the time the /FORCE:MULTIPLE linker option was in effect, but without it no link error is reported for this callback.
A late addition: could this be related to relocation or rebasing of the DLL? If the relocation were off by one byte, could that cause the problem?
Relocations will almost never be off by 1 byte; the .dll image has to be aligned to the granularity of allocations returned by VirtualAlloc which should be 64k on most machines.
How long has this code worked? If it's random then the /FORCE:MULTIPLE might be suspect. Or you could be using Incredibuild...
