Does the close() system call close all references to the file? - linux-kernel

Let's say we have a pipe, my_pipe, which occupies file descriptors 3 and 4 in the process's FDT (initial FDT diagram).
First, we close the standard output stream using close(1) (updated FDT diagram).
Next, using dup(my_pipe[1]), we create a copy of the pipe's write-end file descriptor at the lowest available FDT entry - in our case entry 1, which was freed in the previous step.
The pipe's write end is now referred to by both entry 1 and entry 4 (updated FDT diagram).
Next, we call close(my_pipe[1]).
My question is: which of the file descriptors referring to the pipe's write end will be closed?
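For concreteness, here is the sequence as a runnable C sketch (assuming pipe() handed out the usual numbers, so my_pipe[1] holds the value 4). close() takes a descriptor number, not a reference to every copy, so close(my_pipe[1]) releases only entry 4; the underlying open file description stays open as long as any descriptor references it:

#include <unistd.h>

int main(void)
{
    int my_pipe[2];
    pipe(my_pipe);       /* typically yields fd 3 (read end), fd 4 (write end) */

    close(1);            /* entry 1 (stdout) is now free */
    dup(my_pipe[1]);     /* copies fd 4 into the lowest free entry: 1 */

    close(my_pipe[1]);   /* my_pipe[1] still holds 4, so only entry 4 closes;
                          * entry 1 still refers to the pipe's write end, and
                          * the open file description stays open while any
                          * descriptor references it */

    write(1, "still writable\n", 15);   /* lands in the pipe via entry 1 */
    return 0;
}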

Related

codesign an old Director projector on OS X 10.13 - perhaps manipulate __LINKEDIT segment value

(See Updates at end of this post)
For $reasons, I need to codesign an old Director projector that we can no longer re-publish (no access to original source code or to Director).
I'm doing this because when run without being signed, the app now opens a Finder window with a prompt saying "Where is..." asking for a file that's one of the embedded projector resources.
But... if I cd into the Projector.app contents (it's not really called that, but you get the idea), find the projector binary inside Contents/MacOS/, and run that binary from the terminal, the app launches and runs fine once it has decompressed the (presumably) attached archive at the end of the binary:
/BuildRoot/Library/Caches/com.apple.xbs/Sources/AppleFSCompression/AppleFSCompression-96.30.2/Common/ChunkCompression.cpp:50: Error: unsupported compressor 8
/BuildRoot/Library/Caches/com.apple.xbs/Sources/AppleFSCompression/AppleFSCompression-96.30.2/Libraries/CompressData/CompressData.c:353: Error: Unknown compression scheme encountered for file '/System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/Exceptions.plist'
/BuildRoot/Library/Caches/com.apple.xbs/Sources/AppleFSCompression/AppleFSCompression-96.30.2/Common/ChunkCompression.cpp:50: Error: unsupported compressor 8
/BuildRoot/Library/Caches/com.apple.xbs/Sources/AppleFSCompression/AppleFSCompression-96.30.2/Libraries/CompressData/CompressData.c:353: Error: Unknown compression scheme encountered for file '/System/Library/CoreServices/CoreTypes.bundle/Contents/Library/AppExceptions.bundle/Exceptions.plist'
271 blocks
1120 blocks
274 blocks
136 blocks
255 blocks
120 blocks
1487 blocks
575 blocks
1128 blocks
570 blocks
104 blocks
2042 blocks
4889 blocks
677 blocks
388 blocks
363 blocks
700 blocks
23010 blocks
...app opens and runs correctly at this point
I can't ask our users to do this (they're very non-technical) so I'm guessing that the "Where is..." prompt is some aspect of the OS X Gatekeeper, and hence I'm hoping that signing the binary will make it click-runnable again.
When I try and codesign the binary App.app/Contents/MacOS/projector I get:
main executable failed strict validation
Setting the --no-strict codesign option gives a bit more detail:
the __LINKEDIT segment does not cover the end of the file (can't be processed)
I think that's because the Director projector is a binary with a bundled archive, containing the rest of the application's resources, appended to the end of the executable. Some googling shows that other projects have similar problems with their embedded resources.
I've tried using macho_edit to see if I could modify the binary, but with no joy. I've also tried signing using jtool, but again, this didn't work.
So now, with the binary open in MachOView, I'm hoping that I can hex-edit the binary and change the value of the __LINKEDIT segment so it covers the end of the file, and hence make the codesigning work. But I have no idea what the modified value should be, or what else, if anything, I need to change. Any tips appreciated.
Update 1 - in response to Kamil.S's answer
I've tried adjusting the File Size value in the __LINKEDIT segment so that File Offset + File Size matches the actual size of the binary. (I tried a few times; you actually need to change the VM Size to the same value as the File Size, or you get Killed: 9 by the OS. The same happens if you set File Size to the total size of the binary.) No luck: with the new File Size and VM Size values I can still run the binary, but I still can't codesign it. I do, however, get a slightly different error message:
file not in an order that can be processed (link edit information does not fill the __LINKEDIT segment)
Update 2 - https://github.com/pyinstaller/pyinstaller/wiki/Recipe-OSX-Code-Signing#pyinstaller-fix-implementation has a bit more detail on the same problem:
PyInstaller breaks OSX code signing because it appends python code at the end of the binary. Appending data at the end of the executable breaks the Mach-O format structure. The codesign utility complains with the following messages.
the __LINKEDIT segment does not cover the end of the file (can't be processed)
file not in an order that can be processed (link edit information does not fill the __LINKEDIT segment)
Fix __LINKEDIT: File Size (offset + File Size == exe size); VM Size - same as File Size.
Fix LC_SYMTAB: String Table Size - last data in the Mach-O file (offset + size == exe size on the filesystem). The data appended to the executable will become part of the String Table (the last data section in the Mach-O file).
I'll take a look at fixing LC_SYMTAB to see if that helps.
__LINKEDIT's File Offset + File Size should be equal to the physical executable size. You can tinker with File Size in MachOView by double-clicking the value, editing it, and saving; the executable should still be fine. Just don't touch File Offset, because that will definitely break it.
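To double-check the arithmetic before editing, here is a minimal sketch that prints the File Size value __LINKEDIT would need. It assumes a thin 64-bit Mach-O (an old projector may well be 32-bit, in which case the same walk applies with the 32-bit structs) and trims error handling; the actual patching is still done in MachOView or a hex editor:

#include <fcntl.h>
#include <mach-o/loader.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <binary>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    struct stat st;
    fstat(fd, &st);
    uint8_t *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);

    struct mach_header_64 *mh = (struct mach_header_64 *)base;
    if (mh->magic != MH_MAGIC_64) {
        fprintf(stderr, "not a thin 64-bit Mach-O\n");
        return 1;
    }
    /* Walk the load commands looking for the __LINKEDIT segment. */
    struct load_command *lc = (struct load_command *)(mh + 1);
    for (uint32_t i = 0; i < mh->ncmds; i++) {
        if (lc->cmd == LC_SEGMENT_64) {
            struct segment_command_64 *seg = (struct segment_command_64 *)lc;
            if (strcmp(seg->segname, "__LINKEDIT") == 0)
                printf("fileoff=%llu filesize=%llu file=%lld -> File Size should be %lld\n",
                       seg->fileoff, seg->filesize, (long long)st.st_size,
                       (long long)st.st_size - (long long)seg->fileoff);
        }
        lc = (struct load_command *)((uint8_t *)lc + lc->cmdsize);
    }
    return 0;
}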
If the app can't find its external resources when being run normally from the Finder, that sounds like a result of Gatekeeper Path Randomization. The OS moves the app to a hidden read-only location when it's run, and it can't find the resources.
I don't think that signing the app binary itself will fix this problem and prevent the OS from doing path randomization. The user either needs to move the application to a different directory after extracting it, or you need to distribute the application inside a disk image that has been signed with your Developer ID certificate. DropDMG (linked from the post above) can create signed disk images; that might be worth a quick try.
Director uses a scheme widely used on Windows, usually called an "overlay": it attaches some data at the end of the physical file, beyond the size of the executable image.
This approach is not supported with Mach-O files. The physical file has to end with the last byte covered by the __LINKEDIT segment, and even the order of items inside the __LINKEDIT segment is well defined. The reason for this used to be prebinding; now it is codesigning.
The appended data is the initial DIR/DXR that Director wants to load first. I guess this was fixed later by adding the DIR/DXR into the application bundle, but I am no longer into Director, so I am not sure about this.

file system recognition on Windows and BIOS parameter block position

I am writing a boot sector for the FAT12 file system for a floppy disk.
Code #1:
start: jmp main
TIMES 3 - ($-$$) DB 0
OEMname: DB '12345678'
;rest of the BPB information below
Code #2:
TIMES 3 - ($-$$) DB 0
OEMname: DB '12345678'
;rest of the BPB information below
In both cases, OEMname starts at byte 3 (as confirmed by the assembler listing).
When the boot sector is written to the disk, Windows recognizes the partition when code #1 is used, but fails to recognize it when code #2 is used, i.e. it complains that the drive is not formatted.
Why does Windows succeed to identify the file system in one case but not the other?
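For what it's worth, Microsoft's FAT specification requires the first field of the boot sector (BS_jmpBoot, bytes 0-2) to be a real x86 jump: either EB xx 90 or E9 xx xx. Code #1 begins with the jmp main instruction, so byte 0 is a jump opcode; code #2 begins with three zero bytes from the TIMES padding, which no recognizer will accept. A sketch of roughly that check (the exact test Windows applies may be looser than the spec):

#include <stdint.h>
#include <stdio.h>

/* The FAT spec says BS_jmpBoot should be "EB xx 90" or "E9 xx xx";
 * in practice recognition keys off the jump opcode in byte 0. */
static int looks_like_fat_boot_sector(const uint8_t *s)
{
    return s[0] == 0xEB || s[0] == 0xE9;
}

int main(void)
{
    uint8_t code1[3] = { 0xEB, 0x3C, 0x90 };  /* code #1: a jmp, then padding */
    uint8_t code2[3] = { 0x00, 0x00, 0x00 };  /* code #2: three TIMES zero bytes */

    printf("code #1: %s\n", looks_like_fat_boot_sector(code1) ? "recognized" : "not formatted");
    printf("code #2: %s\n", looks_like_fat_boot_sector(code2) ? "recognized" : "not formatted");
    return 0;
}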

8086 Assembly / MS-DOS, passing file name from the command line

Say I have PROGRAM.ASM - I have the following in the data segment:
.data
Filename db 'file.txt', 0
Fhndl dw ?
Buffer db ?
I want 'file.txt' to be dynamic I guess? Once compiled, PROGRAM.exe needs to be able to accept a file name via the command line:
c:\> PROGRAM anotherfile.txt
EXECUTION GOES HERE
How do I enable this? Thank you in advance.
DOS stores the command line in a legacy structure called the Program Segment Prefix ("PSP"). And I do mean legacy. This structure was designed to be backwards-compatible with programs ported from CP/M.
Where's the PSP?
You know how programs built as .COM files always start with ORG 100h? The reason is precisely that: for .COM programs, the PSP is always stored at the beginning of the code segment (at CS:0h). The PSP is 100h bytes long, and the actual program code starts right after it (that is, at CS:100h).
The address is also conveniently available at DS:00h and ES:00h, since the key characteristic of the .COM format is that all the segment registers start with the same value (and a COM program typically never changes them).
To read the command line from a .COM program, you can pick up its length at CS:80h (or DS:80h, etc., as long as you haven't changed those registers). The command line itself starts at CS:81h and takes up the rest of the PSP, ending with a Carriage Return (0Dh) as a terminator, so the command line is never more than 126 bytes long.
(and that is why the command line has been limited to 126 bytes in DOS forever, despite the fact we all wished for years it could be made longer. Since WinNT provides a different mechanism to access the command line, the WinNT/XP/etc. command line doesn't suffer from this size limitation).
For an .EXE program, you can't rely on CS:00h, because the startup code segment can be just about anywhere in memory. However, when the program starts, DOS places the PSP's segment address in both DS and ES. So, at startup, DS:00h and ES:00h will always point to the PSP, for both .EXE and .COM programs.
If you didn't keep track of the PSP address at the beginning of the program, and you've changed both DS and ES, you can always ask DOS for the segment value at any time via INT 21h, function 62h. The segment portion of the PSP address is returned in BX (the offset being, of course, 0h).
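If you want to sanity-check the layout before writing the assembly version, DOS C compilers in the Borland/Watcom family expose the PSP segment as _psp in <dos.h>; a minimal, compiler-specific sketch that prints the command tail:

#include <dos.h>      /* _psp and MK_FP: Borland/Watcom-style DOS compilers */
#include <stdio.h>

int main(void)
{
    unsigned char far *len  = (unsigned char far *)MK_FP(_psp, 0x80);
    char far          *tail = (char far *)MK_FP(_psp, 0x81);
    int i;

    /* *len bytes of command tail follow at 81h, terminated by 0Dh (not NUL). */
    for (i = 0; i < *len; i++)
        putchar(tail[i]);
    putchar('\n');
    return 0;
}

The assembly version is the same lookup: with DS still pointing at the PSP, the length byte lives at [DS:80h] and the text starts at [DS:81h].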

Wrapping a binary data file to self-convert to CSV?

I'm writing custom firmware for a SparkFun Logomatic V2 that records binary data to a file on a 2GB micro-SD card. The data file size will range from 100 MB to 1 GB.
The format of the binary data is in flux as the board's firmware evolves (it will actually be dynamically reconfigurable at run-time). Rather than create and maintain a separate decoder/converter program for each version of firmware/configuration, I'd much rather make the data files self-converting to CSV format by starting the data file with a Bash script that is written to the data file before data recording starts.
I know how to create a Here Document, but I suspect Bash would be unable to quickly parse and convert a gigabyte of binary data, so I'd like to make the process run much faster by having the script first compile some C code (assume GCC is present and in the path), then run the resulting program, passing the binary data to stdin.
To make the problem more concrete, assume the firmware will create binary data consisting of 4 16-bit integer values: A timestamp (unsigned) followed by 3 accelerometer axes (signed). There is no separator between records (mainly because I'm saturating the SPI interface to the uSD card).
So, I think I need a script with TWO here documents: One for the C code (parameterized by expanded Bash variables), and another for the binary data. Here's where I am so far:
#!/usr/bin/env bash
# Produced by firmware version 0.0.0.0.0.1 alpha
# Configuration for this data run:
header_string="Time, X, Y, Z"
column_count=4
# Create the converter executable
# Use "<<-" to permit code to be indented for readability.
# Allow variable expansion/substitution.
gcc -xc - -o /tmp/convertit <<-THE_C_CODE
    #include <stdio.h>
    #include <stdint.h>
    int main (void) {
        /* One record: unsigned timestamp + 3 signed accelerometer axes */
        int16_t v[${column_count}];
        puts("${header_string}");
        /* Read ${column_count} shorts from stdin; stop on EOF/short read.
           The format string below assumes ${column_count} == 4. */
        while (fread(v, sizeof v[0], ${column_count}, stdin) == ${column_count})
            printf("%u,%d,%d,%d\n", (uint16_t)v[0], v[1], v[2], v[3]);
        return 0;
    }
THE_C_CODE
# Pass the binary data to the converter
# Hard-quote the Here tag to prevent subsequent expansion/substitution
/tmp/convertit >./$1.csv <<'THE_BINARY_DATA'
...
... hundreds of megabytes of semi-random data ...
...
THE_BINARY_DATA
rm /tmp/convertit
exit 0
Does that look about right? I don't yet have a real data file to test this with, but I wanted to verify the idea before going much further.
Will Bash complain if the closing lines are missing? This may happen if data capture terminates unexpectedly due to a shock knocking loose the battery or uSD card. Or if the firmware borks.
Is there a faster or better method I should consider? For example, I wonder if Bash will be too slow to copy the binary data as fast as the C program can consume it: Should the C program open the data file directly?
TIA,
-BobC
You may want to have a look at makeself. It allows you to turn any .tar.gz archive into a self-extracting file which is platform independent (something like a shell script that contains a here document), and it lets you configure a script contained within the archive to be run when the container script is run. This way you can use makeself for packaging: inside the archive you put your data files and a decoder written in C or Bash or whatever language you find suitable, which makes it easy to distribute your data and decoder together.
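For example (all file names made up), packing a data directory plus a decode script into a single runnable file looks like this; running logdata.run then unpacks the archive into a temporary directory and invokes the embedded script:

makeself.sh ./datadir logdata.run "Logomatic data" ./decode.sh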
While it is possible to decode binary data using shell tools (e.g. using od), it's very cumbersome and inefficient. I'd recommend using either a C program or Perl, which is also likely to be found on almost any machine (check this page).

Doing a zero-copy move of data from a Linux kernel buffer to hard disk

I am trying to move data from a buffer in kernel space onto the hard disk without incurring any additional copies from the kernel buffer to user buffers or any other kernel buffers. Any ideas/suggestions would be most helpful.
The use case is basically a demux driver which collects data into a demux buffer in kernel space; this buffer has to be emptied periodically by copying the contents into a FUSE-based partition on the disk. As the buffer gets full, a user process is signalled, which then determines the sector numbers on the disk the contents need to be copied to.
I was hoping to mmap the above demux kernel buffer into user address space and issue a write system call to the raw partition device. But from what I can see, this data is being cached by the kernel on its way to the hard disk driver, so I am assuming that involves additional copies by the Linux kernel.
At this point I am wondering if there is any other mechanism to do this without involving additional copies by the kernel. I realize this is an unusual usage scenario for non-embedded environments, but I would appreciate any feedback on possible options.
BTW - I have tried using O_DIRECT when opening the raw partition, but the subsequent write call fails if the buffer being passed is the mmapped buffer.
Thanx!
You need to expose your demux buffer as a file descriptor (presumably, if you're using mmap() then you're already doing this - great!).
On the kernel side, you then need to implement the splice_read member of struct file_operations.
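In skeleton form, that hook looks like the following (names are placeholders; the real demux_splice_read would wrap the buffer's pages in a struct splice_pipe_desc and hand them over with splice_to_pipe()):

#include <linux/fs.h>
#include <linux/splice.h>

/* Driver-specific: moves pages of the demux buffer into the pipe. */
static ssize_t demux_splice_read(struct file *in, loff_t *ppos,
                                 struct pipe_inode_info *pipe,
                                 size_t len, unsigned int flags);

static const struct file_operations demux_fops = {
    .owner       = THIS_MODULE,
    .splice_read = demux_splice_read,
    /* ... plus open, mmap, etc. ... */
};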
On the userspace side, create a pipe(), then use splice() twice - once to move the data from the demux file descriptor into the pipe, and a second time to move the data from the pipe to the disk file. Use the SPLICE_F_MOVE flag.
As documented in the splice() man page, it will avoid actual copies where it can, by copying references to pages of kernel memory rather than the pages themselves.
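The userspace loop is short. A sketch under those assumptions (/dev/demux0 and /dev/sdb1 are placeholder names; error handling is trimmed, and in the real program the destination offset would come from the sector numbers the user process computes):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int src = open("/dev/demux0", O_RDONLY);   /* demux device with splice_read */
    int dst = open("/dev/sdb1", O_WRONLY);     /* raw partition: placeholder */
    int pfd[2];

    if (src < 0 || dst < 0 || pipe(pfd) < 0) {
        perror("setup");
        return 1;
    }

    for (;;) {
        /* Move page references from the demux buffer into the pipe. */
        ssize_t n = splice(src, NULL, pfd[1], NULL, 65536, SPLICE_F_MOVE);
        if (n <= 0)
            break;
        /* ...then from the pipe to the disk. Pass an offset pointer here
         * if the sectors computed in userspace dictate where to write. */
        while (n > 0) {
            ssize_t m = splice(pfd[0], NULL, dst, NULL, n, SPLICE_F_MOVE);
            if (m <= 0) {
                perror("splice to disk");
                return 1;
            }
            n -= m;
        }
    }
    return 0;
}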
