Access violations on string operations in Delphi - windows

I have a Delphi application that reads data from a file and stores it in an array. Each row in the file contains an address, a lineTypeIndicator and data. This is the algorithm (it contains the code that I believe is critical):
AssignFile(inputFile, inFileName);
Reset(inputFile);
while not EOF(inputFile) do
begin
Readln(inputFile,fileLineBuffer);
if Copy(fileLineBuffer, 8, 2) = '01' then //Never managed to catch the error here
begin
break;
end;
//extract the address from the line and use it to determine max and min address.
end;
//Now that you have min and max address, use it to set the length of a char array
SetLength(memoryArray,(lastAddress - firstAddress) * 2);
Reset(inputFile);
while not EOF(inputFile) do
begin
Readln(inputFile,fileLineBuffer);
if Copy(fileLineBuffer, 8, 2) = '01' then //I caught all the errors here
begin
break;
end;
//extract the address and data from the fileLineBuffer and place it in the corresponding place in an array
end;
This code is executed every time the user clicks the corresponding button on a form. It runs fine the first few times it is executed, but after a few runs I get this:
MyProgram.exe faulted with message: 'access violation at 0x00406111:
write of address 0x00090d1c' (this varies). Process stopped. Use step
or run to continue.
To me, this smells like some kind of heap overflow. I have tried replacing
if Copy(fileLineBuffer, 8, 2) = '01' then
with
lineTypeBuffer := Copy(fileLineBuffer, 8, 2);
if lineTypeBuffer = '01' then
or
if (fileLineBuffer[8] = '0') and (fileLineBuffer[9] = '1') then
but it did not help.
Any suggestions on how I should approach this problem?
P.S. Tried running it on Win7 32 bit and Win7 64 bit - no difference
P.P.S. sorry for the long question.

The only explanation for
Copy(fileLineBuffer, 8, 2) = '01'
resulting in an access violation is that you have corrupted the heap.
Something else in your program is writing out of bounds and corrupting the heap. Such problems can be tricky to diagnose because the fault is typically in one part of the code, but the error occurs elsewhere. Some code corrupts the heap, and then a subsequent heap operation fails because of the earlier heap corruption.
I am confident in my diagnosis because Delphi string variables are known to work, Copy is known to work, and string equality testing is known to work. In other words there is no fault in the line of code at which the error occurs. Ergo the error is elsewhere.
Some debugging tools that might help:
FastMM in full debug mode.
Range checking (enabled from the project's compiler options).
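To illustrate what range checking buys you, here is a minimal C++ sketch (not Delphi, and not the asker's code). It assumes a buffer indexed by address minus firstAddress; the names storeByte and tryStore are hypothetical. A bounds-checked write fails at the faulty line itself, instead of corrupting the heap and crashing later in an unrelated string operation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: stores one byte for `address` into a buffer whose
// first element corresponds to `firstAddress`.
// vector::at() is bounds-checked, so a bad index throws std::out_of_range
// at the faulty write instead of silently corrupting the heap.
void storeByte(std::vector<char>& memory, std::size_t firstAddress,
               std::size_t address, char value) {
    memory.at(address - firstAddress) = value;
}

// Returns true if the write was in bounds, false if it was rejected.
bool tryStore(std::vector<char>& memory, std::size_t firstAddress,
              std::size_t address, char value) {
    try {
        storeByte(memory, firstAddress, address, value);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a 4-element buffer starting at address 100, writes to 100..103 succeed and a write to 104 or 99 is rejected. Delphi's range checking gives a comparable immediate error for an out-of-range array index.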


Virtual Pascal RunTime Error (or no output) if including Printer unit?

I'm using Virtual Pascal on Windows and have a weird problem: an attempt to Reset() any file that doesn't exist causes a later WriteLn() to the CRT to fail. Example:
{$I+}
program TestIt;
uses Printer; { attempts to open LPT1 as Lst but LPT1 doesn't exist on system }
begin
{ Generates a RunTime Error with $I+ and no output with $I- }
writeln('Test It');
end.
This also fails:
program TestIt;
var
SomeFile : text;
begin
{$I-}
Assign(SomeFile, 'a:\filepath_not_exist');
{ Without $I- the Reset generates a RunTime Error as expected }
Reset(SomeFile);
{$I+}
{ Generates a RunTime Error with $I+ and no output with $I- }
writeln('Test It');
end.
This works as expected:
{$I+}
program TestIt;
begin
writeln('Test It');
end.
Any ideas why this may be happening and how to fix it? Is the source available for WriteLn() or Assign() ? I'm able to change the Printer unit to not Reset() to work around it until needing to get printing working but I don't think failure to open a file should cause the screen output to stop working.
Probably the error state is kept in some variable (the one queried by ioresult) and causes an error to be raised in the next I/O function. If you don't use automatic error checking ($I+), you must call ioresult after every operation.
Such an implementation is not ideal (errors from opening "somefile" spill into a writeln to a second file, the output, as happens in the second example), but it happens, especially in older compilers and especially on targets that are less DOS-like.
Also, errors for a non-existent drive are much harder to catch than errors for a non-existent file. Keep those two cases fundamentally apart.
You can test this theory by querying ioresult just before the writeln, in the hope that it resets the state.
Afaik VP has been dead for almost a decade now. What you see (read: download) is what you get.
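The theory above can be modelled with a small C++ sketch. This is a toy model under the assumption that the runtime keeps one shared, sticky error code; it is not Virtual Pascal's actual implementation, and ToyRuntime and its members are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model (an assumption, not Virtual Pascal's real runtime): one error
// code shared by every file, as the answer suspects.
struct ToyRuntime {
    int pendingError = 0;              // sticky state read by ioResult()
    std::vector<std::string> screen;   // lines that reached the "CRT"

    // A failed Reset records an error code and leaves it pending.
    void resetMissingFile() { pendingError = 2; }

    // Like IOResult: returns the pending code and clears it.
    int ioResult() {
        int code = pendingError;
        pendingError = 0;
        return code;
    }

    // Any later I/O call refuses to run while an error is pending.
    bool writeLine(const std::string& text) {
        if (pendingError != 0) return false;
        screen.push_back(text);
        return true;
    }
};
```

In this model a writeLine right after resetMissingFile fails, while calling ioResult in between clears the state and lets the write through. That is the behaviour the suggested test would confirm or rule out.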

No direct packet access in BPF program with just CAP_BPF?

Up until Linux 5.8, CAP_SYSADMIN was required to load any but the most basic BPF program. The recently introduced CAP_BPF is a welcome addition, as it allows running software that leverages BPF with fewer privileges.
Certain types of BPF programs can access packet data. The pre-4.7 way of doing it is via bpf_skb_load_bytes() helper. As the verifier got smarter, it became possible to perform "direct packet access", i.e. to access packet bytes by following pointers in the context structure. E.g:
static const struct bpf_insn prog[] = {
// BPF_PROG_TYPE_SK_REUSEPORT: gets a pointer to sk_reuseport_md (r1).
// Get packet data pointer (r2) and ensure length >= 2, goto Drop otherwise
BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1,
offsetof(struct sk_reuseport_md, data)),
BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_1,
offsetof(struct sk_reuseport_md, data_end)),
BPF_MOV64_REG(BPF_REG_4, BPF_REG_2),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 2),
BPF_JMP_REG(BPF_JGT, BPF_REG_4, BPF_REG_3, /* Drop: */ +4),
// Ensure first 2 bytes are 0, goto Drop otherwise
BPF_LDX_MEM(BPF_H, BPF_REG_4, BPF_REG_2, 0),
BPF_JMP_IMM(BPF_JNE, BPF_REG_4, 0, /* Drop: */ +2),
// return SK_PASS
BPF_MOV32_IMM(BPF_REG_0, SK_PASS),
BPF_EXIT_INSN(),
// Drop: return SK_DROP
BPF_MOV32_IMM(BPF_REG_0, SK_DROP),
BPF_EXIT_INSN()
};
It is required to ensure that the accessed bytes are within bounds explicitly. The verifier will reject the program otherwise.
The program above loads successfully if the caller bears CAP_SYSADMIN. Supposedly, CAP_BPF should suffice as well, but it doesn't (Linux 5.13). Earlier kernels behave similarly. The verifier output follows:
Permission denied
0: (79) r2 = *(u64 *)(r1 +0)
1: (79) r3 = *(u64 *)(r1 +8)
2: (bf) r4 = r2
3: (07) r4 += 2
4: (2d) if r4 > r3 goto pc+4
R3 pointer comparison prohibited
processed 5 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
I understand that arbitrary pointer comparison is restricted as it reveals kernel memory layout. However, comparing a pointer to a packet data offset by a certain amount with a pointer to the packet end is safe.
I'd like to find a way to load the program without granting CAP_SYSADMIN.
Is there a way to write bounds checks in a way that doesn't trigger pointer comparison error?
The relevant code is in check_cond_jmp_op(). It looks like one can't get away with pointer comparison, even with the latest kernel version.
If there's no way to write bounds check in a way that keeps verifier happy, I wonder if lifting the limitation is on the roadmap.
As a workaround, I can grant CAP_PERFMON on top of CAP_BPF, removing the "embargo" on pointer comparison. The program then loads successfully. I can probably restrict perf_event_open() and other superfluous bits with seccomp. It doesn't feel nice, though.
Reproducer.
To make direct packet accesses in your program, you will need CAP_PERFMON in addition to CAP_BPF. I'm not aware of any way around it.
Why?
Because of Spectre vulnerabilities, someone able to perform arithmetic on unbounded pointers (i.e., all except stack and map value pointers) can read arbitrary memory via speculative out-of-bounds loads.
Such operations thus need to be forbidden for unprivileged users. Allowing CAP_BPF users to perform those operations would essentially give read access to arbitrary memory to CAP_BPF. For those reasons, I doubt this limitation will be lifted in the future.

Delphi XE8 Printers VCL and DocumentProperties strange issue

I'm using the vcl.printers unit (delphi XE8) and I'm facing an error when "talking" to a printer.
I traced into the vcl.printers and found this code (written by EMB people):
if OpenPrinter(ADevice, FPrinterHandle, nil) then
begin
if DeviceMode = 0 then // alloc new device mode block if one was not passed in
begin
DeviceMode := GlobalAlloc(GHND,
DocumentProperties(0, FPrinterHandle, ADevice, nil, nil, 0));
if DeviceMode <> 0 then
begin
DevMode := GlobalLock(DeviceMode);
if DocumentProperties(0, FPrinterHandle, ADevice, DevMode^,
DevMode^, DM_OUT_BUFFER) < 0 then
begin
GlobalUnlock(DeviceMode);
GlobalFree(DeviceMode);
DeviceMode := 0;
DevMode := nil;
end
end;
end;
if DeviceMode <> 0 then
SetPrinterCapabilities(DevMode^.dmFields);
end;
The
DocumentProperties(0, FPrinterHandle, ADevice, nil, nil, 0)
returns the correct buffer size the first time (I haven't written the value down anywhere). On the second execution it returns 4294967295 bytes, which is really -1 shown as unsigned because the declaration is wrong, and which signals an error.
As you can see, the VCL code handles errors very poorly, since there is no error check!
But what error do I have here, and why?
DocumentProperties lives in winspool.drv.
To recover from it I need to reboot the PC, and even then it only works once, which is very annoying for debugging.
The printer is simply "PDFCreator".
I tried on another PC and it seems OK, even if I run it many times.
I also have two networked laser printers.
Thanks
I had the same problem in the past with two customers. I tracked it down to the Printers unit (printers.pas), which was hard to do without a debugger on a remote computer in another part of my country.
I tracked it all the way down to this line in the function SetPrinter in printers.pas:
DeviceMode := GlobalAlloc(GHND, DocumentProperties (0, FPrinterHandle, ADevice, StubDevMode, StubDevMode, 0));
I split it into two steps: call DocumentProperties first, store the result in an integer variable, check it, and only call GlobalAlloc if the value is greater than 0. I also logged the stored values to a debug file. The error really was in the DocumentProperties function, from SPOOL.DLL if I remember correctly. The function returned -1 as the size for the device, but only on one of this customer's computers (he runs my program on 4 or 5).
Of all my customers (close to 200 clients) I have had this issue on two computers. The first one more or less fixed itself, and I never found out how. The second one I was working on just a few minutes ago, and in the end I found a solution: a simple change to the shortcut. I started the automatic compatibility troubleshooter in Windows 10, clicked the "test program" button, and it worked, with no error when choosing printers or opening their properties. Then I tried again with the shortcut alone and the error returned. That made me think it had to be connected to how Windows runs the program, so I changed the shortcut's "Run as administrator" setting from checked to unchecked.
And no problem.
On almost every computer with Windows 10 I do check "Run as administrator" with no problems. I think there was an update or some issue with spool.dll that connects these dots.
P.S. If you google this behavior together with Delphi and DocumentProperties, you will find that it is a known problem. Some say it is connected to x86 versus x64 mode, but this is what I found.
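Two points from this thread can be sketched in C++ (not Delphi, and without calling the Windows API). First, the 4294967295 in the question is the bit pattern of -1 read as an unsigned 32-bit value. Second, the size query should be checked before its result is used as an allocation size. The function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The same 32 bits read as unsigned: how -1 shows up as 4294967295.
std::uint32_t asUnsigned(std::int32_t value) {
    return static_cast<std::uint32_t>(value);
}

// Hypothetical guard for a size query that reports failure as a negative
// number. Returns the number of bytes to allocate, or 0 when the query
// failed, so the caller can skip the allocation entirely.
std::size_t bufferSizeFromResult(std::int32_t queryResult) {
    return queryResult > 0 ? static_cast<std::size_t>(queryResult) : 0;
}
```

Passing the raw result straight into an allocator, as the quoted VCL line does, turns a failure code into a request for roughly 4 GB. Checking the sign first keeps the failure visible.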

std::copy runtime_error when working with uint16_t's

I'm looking for input as to why this breaks. See the addendum for contextual information, but I don't really think it is relevant.
I have an std::vector<uint16_t> depth_buffer that is initialized to have 640*480 elements. This means that the total space it takes up is 640*480*sizeof(uint16_t) = 614400.
The code that breaks:
void Kinect360::DepthCallback(void* _depth, uint32_t timestamp) {
lock_guard<mutex> depth_data_lock(depth_mutex);
uint16_t* depth = static_cast<uint16_t*>(_depth);
std::copy(depth, depth + depthBufferSize(), depth_buffer.begin());/// the error
new_depth_frame = true;
}
where depthBufferSize() will return 614400 (I've verified this multiple times).
My understanding of std::copy(first, amount, out) is that first specifies the memory address to start copying from, amount is how far in bytes to copy until, and out is the memory address to start copying to.
Of course, it can be done manually with something like
#pragma unroll
for(auto i = 0; i < 640*480; ++i) depth_buffer[i] = depth[i];
instead of the call to std::copy, but I'm really confused as to why std::copy fails here. Any thoughts???
Addendum: the context is that I am writing a derived class that inherits from FreenectDevice to work with a Kinect 360. Officially the error is a Bus Error, but I'm almost certain this is because libfreenect interprets an error in the DepthCallback as a Bus Error. Stepping through with lldb, it's a standard runtime_error being thrown from std::copy. If I manually enter depth + 614400 it will crash, though if I have depth + (640*480) it will chug along. At this stage I am not doing something meaningful with the depth data (rendering the raw depth appropriately with OpenGL is a separate issue xD), so it is hard to tell if everything got copied, or just a portion. That said, I'm almost positive it doesn't grab it all.
Contrasted with the corresponding VideoCallback and the call inside of copy(video, video + videoBufferSize(), video_buffer.begin()), I don't see why the above would crash. If my understanding of std::copy were wrong, this should crash too since videoBufferSize() is going to return 640*480*3*sizeof(uint8_t) = 640*480*3 = 921600. The *3 is from the fact that we have 3 uint8_t's per pixel, RGB (no A). The VideoCallback works swimmingly, as verified with OpenGL (and the fact that it's essentially identical to the samples provided with libfreenect...). FYI none of the samples I have found actually work with the raw depth data directly, all of them colorize the depth and use an std::vector<uint8_t> with RGB channels, which does not suit my needs for this project.
I'm happy to just ignore it and move on in some senses because I can get it to work, but I'm really quite perplexed as to why this breaks. Thanks for any thoughts!
The way std::copy works is that you provide start and end points of your input sequence and the location to begin copying to. The end point that you're providing is off the end of your sequence, because your depthBufferSize function is giving an offset in bytes, rather than the number of elements in your sequence.
If you remove the multiply by sizeof(uint16_t), it will work. At that point, you might also consider calling std::copy_n instead, which takes the number of elements to copy.
Edit: I just realised that I didn't answer the question directly.
Based on my understanding of std::copy, it shouldn't be throwing exceptions with the input you're giving it. The only thing in that code that could throw a runtime_error is the locking of the mutex.
Considering you have undefined behaviour as a result of running off of the end of your buffer, I'm tempted to say that has something to do with it.
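A minimal sketch of the fix, assuming a 640x480 frame of uint16_t values as in the question. The function and constant names are illustrative and not part of libfreenect. Pointer arithmetic on a uint16_t* counts elements, so the end of the input range is depth + 307200, not depth + 614400.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWidth = 640;
constexpr std::size_t kHeight = 480;
// Number of uint16_t elements in one frame: 307200.
constexpr std::size_t kDepthElements = kWidth * kHeight;
// Size of the same frame in bytes: 614400. Not a valid element offset.
constexpr std::size_t kDepthBytes = kDepthElements * sizeof(std::uint16_t);

// Copies one depth frame using an element count for the end pointer.
void copyDepthFrame(const std::uint16_t* depth, std::vector<std::uint16_t>& out) {
    out.resize(kDepthElements);
    std::copy(depth, depth + kDepthElements, out.begin());
}

// Same copy expressed with std::copy_n, which takes the element count directly.
void copyDepthFrameN(const std::uint16_t* depth, std::vector<std::uint16_t>& out) {
    out.resize(kDepthElements);
    std::copy_n(depth, kDepthElements, out.begin());
}
```

This also explains why the video callback works: its elements are uint8_t, so the byte count and the element count happen to be the same number.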

Pixel modifying code runs quick in main app, really slow in Delphi 6 DirectShow filter with other problems

I have a Delphi 6 application that sends bitmaps to a DirectShow DLL in real-time, 25 frames a second. The DirectShow DLL is my code too and is also written in Delphi 6 using the DSPACK DirectShow component suite. I have a simple block of code that goes through each pixel in the bitmap modifying the brightness and contrast of the image, if a certain flag is set, otherwise the bitmap is pushed out the DirectShow DLL unmodified (push source video filter). The code used to be in the main application and then I just moved it into the DirectShow DLL. When it was in the main application it ran fine. I could see the changes in the bitmap as expected. However, now that the code resides in the DirectShow DLL it has the following problems:
When the code block below is active the DirectShow DLL is really slow. I have a quad core i5 and it's really slow. I can also see a big spike in the CPU consumption. In contrast, the very same code running in the main application ran fine on an old single core P4. It did hit the CPU noticeably on that old machine but the video was smooth and there were no problems. The images are only 352 x 288 pixels in size.
I don't see the expected changes to the visible bitmap. I can trace the code in the DirectShow DLL and see the numerical values of each pixel properly altered by the code, but the viewable image in the Graph Edit ActiveMovie window looks completely unchanged.
If I deactivate the code, which I can do in real-time, the ActiveMovie window shows video that is as smooth as glass, perfectly rendered with the CPU barely touched. If I reactivate the code the video is now really choppy, probably showing only 1 to 2 frames a second with a long delay before the first frame is shown, and the CPU spikes. Not completely, but a lot more than I would expect.
I tried compiling the DirectShow DLL with everything on including range checking, overflow checking, etc. and there were no warnings or errors during run-time. I then tried compiling for fastest speed and it still had the exact same problems listed above. Something is really wrong and I can't figure out what. Note, I do indeed lock the canvas before modifying the bitmap and unlock it after I'm done. If it weren't for the "everything on" compilation run I noted above I'd say it felt like an FPU Exception was being raised and silently swallowed with every pixel computation, but as I said, no errors or Exceptions are occurring.
UPDATE: I am putting this here so that the solution, which is embedded in one of Roman R's comments, is plainly visible. The problem was that I was not setting the PixelFormat property to pf24bit before accessing the ScanLine property. As Roman suggested, not doing this must make the TBitmap code create a temporary copy of the bitmap. As soon as I added the line of code below, the problems went away, both the changes not being visible and the soft page faults. It's an insidious problem because the only thing affected is the pointer you get from the ScanLine property, since (assumption) it points to a temporary copy of the bitmap. That must be why the subsequent TextOut() call still worked: it operated on the original copy of the bitmap.
clip.PixelFormat := pf24bit; // The missing code line that fixes the problem.
Here's the code block I've been referring to:
function IntToByte(i: Integer): Byte;
begin
if i > 255 then
Result := 255
else if i < 0 then
Result := 0
else
Result := i;
end;
// ---------------------------------------------------------------
procedure brightnessTurboBoost(var clip: TBitmap; rangeExpansionPowerOf2: integer; shiftValue: Byte);
var
p0: PByte;
x,y: Integer;
begin
if (rangeExpansionPowerOf2 = 0) and (shiftValue = 0) then
exit; // These parameter settings will not change the pixel values.
for y := 0 to clip.Height-1 do
begin
p0 := clip.scanline[y];
// Can't just do the whole buffer as a big block of bytes since the
// individual scan lines may be padded for CPU alignment.
for x := 0 to (clip.Width - 1) * 3 do
begin
if rangeExpansionPowerOf2 >= 1 then
p0^ := IntToByte((p0^ shl rangeExpansionPowerOf2) + shiftValue)
else
p0^ := IntToByte(p0^ + shiftValue);
Inc(p0);
end;
end;
end;
There are a few things to say about this code snippet.
First of all, you are using the Scanline property of the TBitmap class. I have not been dealing with Delphi for many years, so I might be wrong about this, but I am under the impression that Scanline is not actually a thin accessor, is it? It might be internally hiding things that can dramatically affect performance, such as "if he wants to access the bits of the image, then we have to first convert it to a DIB before returning pointers". So something that looks so simple might turn out to be a killer.
"if rangeExpansionPowerOf2 >= 1 then" in the inner loop body? You don't really want to repeat this comparison for every byte. Either make two separate functions or duplicate the whole loop in two versions, for zero and non-zero rangeExpansionPowerOf2, and do this if only once.
"for ... to (clip.Width - 1) * 3 do": I am not really sure that Delphi optimizes the upper boundary evaluation so that it happens only once. You might be doing that multiplication three times for every pixel, while you could do it only once for the whole image.
For top performance, IntToByte should definitely be implemented in MMX to avoid the ifs and process multiple bytes at once.
Still, since you say the images are only 352x288, I would suspect that #1 (the Scanline access) is what is ruining the performance.
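The loop-level suggestions above can be sketched in C++ (not Delphi). This assumes a raw 24-bit buffer with padded rows; the names adjustBrightness, clampToByte and stride are illustrative. The row length is computed once, and the shift-or-not decision is made once, outside both loops.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Clamp an int to the 0..255 byte range (same role as IntToByte).
inline std::uint8_t clampToByte(int v) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// Adjusts a 24-bit image stored as `height` rows of `stride` bytes, where
// only the first width * 3 bytes of each row are pixel data and the rest
// is alignment padding that must be left untouched.
void adjustBrightness(std::uint8_t* pixels, std::size_t width,
                      std::size_t height, std::size_t stride,
                      int shiftBits, int offset) {
    const std::size_t rowBytes = width * 3;  // computed once, not per pixel
    if (shiftBits >= 1) {                    // decided once for the whole image
        for (std::size_t y = 0; y < height; ++y) {
            std::uint8_t* row = pixels + y * stride;
            for (std::size_t x = 0; x < rowBytes; ++x)
                row[x] = clampToByte((row[x] << shiftBits) + offset);
        }
    } else {
        for (std::size_t y = 0; y < height; ++y) {
            std::uint8_t* row = pixels + y * stride;
            for (std::size_t x = 0; x < rowBytes; ++x)
                row[x] = clampToByte(row[x] + offset);
        }
    }
}
```

At 352x288 this kind of restructuring is unlikely to matter much on its own, which is consistent with the answer's suspicion that the Scanline access dominates.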
