Why does SetFilePointerEx(1) succeed for an empty file that's open for reading only? - winapi

I do not understand why the following code, which sets the position of an open file handle relative to the beginning of the file (i.e. sets the absolute position), succeeds when asked to set a positive position for an empty file that is open for reading only:
LARGE_INTEGER offset;
offset.QuadPart = 100;
LARGE_INTEGER pos = {0};
return ::SetFilePointerEx(_h, offset, &pos, FILE_BEGIN) != 0;
It returns a non-zero result, and the pos variable receives the value 100. That behavior is counter-intuitive for a GENERIC_READ file of size zero. What is the logic? I understand that this is normal behavior for files with write access.
P.S. The file is not opened for overlapped I/O, and overall it is as simple as it can be, with no fancy flags.
Does SetFilePointerEx ever fail at all for valid handles, positive absolute positions and plain files?

SetFilePointerEx internally calls ZwSetInformationFile with the FilePositionInformation class, passing a FILE_POSITION_INFORMATION structure as input.
The documentation states only one restriction on this value:
If the file was opened or created with the
FILE_NO_INTERMEDIATE_BUFFERING option, the value of CurrentByteOffset
must be an integral multiple of the sector size of the underlying
device.
In addition, CurrentByteOffset.QuadPart must always be >= 0, so the position cannot be negative.
There are no other restrictions on the position value. You can set it to anything, regardless of the file size. The call never even reaches the file system; it is handled by the I/O manager.
All it does is set CurrentByteOffset in the FILE_OBJECT.
How is this value used? ZwReadFile and ZwWriteFile take an optional ByteOffset parameter:
Pointer to a variable that specifies the starting byte offset in the
file where the read operation will begin. If an attempt is made to
read beyond the end of the file, ZwReadFile returns an error.
If the call to ZwCreateFile set either of the CreateOptions flags
FILE_SYNCHRONOUS_IO_ALERT or FILE_SYNCHRONOUS_IO_NONALERT, the I/O
Manager maintains the current file position. If so, the caller of
ZwReadFile can specify that the current file position offset be used
instead of an explicit ByteOffset value. This specification can be
made by using one of the following methods:
Specify a pointer to a LARGE_INTEGER value with the HighPart member
set to -1 and the LowPart member set to the system-defined value
FILE_USE_FILE_POINTER_POSITION.
Pass a NULL pointer for ByteOffset.
ZwReadFile updates the current file position by adding the number of
bytes read when it completes the read operation, if it is using the
current file position maintained by the I/O Manager.
Even when the I/O Manager is maintaining the current file position,
the caller can reset this position by passing an explicit ByteOffset
value to ZwReadFile. Doing this automatically changes the current file
position to that ByteOffset value, performs the read operation, and
then updates the position according to the number of bytes actually
read. This technique gives the caller atomic seek-and-read service.
So we can either pass an explicit ByteOffset value, or make an additional API call to store the position in the FILE_OBJECT first; the I/O manager then takes it from there when no explicit ByteOffset pointer is given.
Note: for asynchronous I/O we must always pass an explicit ByteOffset value, or the call simply fails (pipes and mailslot files are the exception).
With ReadFile and WriteFile, ByteOffset is taken from the OVERLAPPED parameter. If the OVERLAPPED pointer is NULL, a NULL ByteOffset pointer is passed and CurrentByteOffset from the FILE_OBJECT is used. If the OVERLAPPED pointer is not NULL, the exact offset from OVERLAPPED is passed explicitly as ByteOffset and CurrentByteOffset in the FILE_OBJECT is ignored.
It is also always fine to pass a pointer to OVERLAPPED, not only for asynchronous file handles. For asynchronous handles the parameter is mandatory; for synchronous handles it is optional.
Passing the offset directly to the read/write call is faster and better than using a separate API call, which takes time, can (theoretically) fail, and so on.
Using SetFilePointer probably makes sense only in legacy code where it is called from a huge number of places, to avoid modifying too much code.

Related

writing partial data with libwebsockets

I'm using the libwebsockets v2.4.
The doc seems unclear to me about what I have to do with the returned value of the lws_write() function.
If it returns -1, it's an error and I'm invited to close the connection. That's fine for me.
But when it returns a value that is strictly less than the buffer length I pass, should I assume that I have to write the remaining bytes later (in another WRITABLE callback occurrence)? Is it even possible to have this situation?
Also, should I use the lws_send_pipe_choked() before using the lws_write(), considering that I always use lws_write() in the context of a WRITABLE callback?
My understanding is that lws_write() always returns the requested buffer length unless an error occurs.
If you look at lws_issue_raw() (whose result is what lws_write() returns) in output.c (https://github.com/warmcat/libwebsockets/blob/v2.4.0/lib/output.c#L157), you can see that if the length written by lws_ssl_capable_write() is less than the provided length, lws allocates a buffer on wsi->trunc_alloc to hold the remaining bytes so that they can be sent later.
Concerning your second question, I think it is safe to call lws_write() in the context of a WRITABLE callback without checking if the pipe is choked. However, if you happen to loop on lws_write() in the callback, lws_send_pipe_choked() must be called in order to protect the subsequent calls to lws_write(). If you don't, you might stumble upon this assertion https://github.com/warmcat/libwebsockets/blob/v2.4.0/lib/output.c#L83 and the usercode will crash.

Flag value of shmat() function

This function is used for attaching allocated memory segment to the calling process. It takes three arguments. First argument corresponds to identifier of memory segment. Second argument is pointer to memory segment. For second argument, NULL or 0 value is passed to the function, since when we allocate the shared memory, we know only its identifier not its memory address.
However, I cannot find what the third argument does. Some code that I have encountered sets the flag value to 0. NULL and 0 have the same meaning in C, and I think no additional adjustments are needed; hence, NULL is passed to the function as the third argument.
Is there anyone who can explain the purpose of the flag value in the shmat() function?
Four flags are defined:
SHM_RDONLY - the segment is attached for reading; default is Read/Write
SHM_RND - the attach occurs at the address equal to shmaddr rounded down to the nearest multiple of SHMLBA (usually defined as the page size)
SHM_REMAP - flag may be specified in shmflg to indicate that the mapping of the segment should replace any existing mapping in the range starting at shmaddr and continuing for the size of the segment. This flag is Linux-specific.
SHM_EXEC - allow the contents of the segment to be executed. Linux-specific.
Passing the value 0 means that all flags are unset. I wouldn't use NULL here, since NULL implies the parameter type is a pointer, which it is not.
See the shmat(2) man page.
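As a rough illustration of the difference between passing 0 and passing SHM_RDONLY, the sketch below attaches the same segment twice: once read/write (shmflg = 0) and once read-only. The function name shm_flag_demo, the 4096-byte size, and the "hello" payload are arbitrary choices for the example.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Creates a private segment, writes through a read/write attachment
 * (shmflg = 0), then reads the same bytes back through a second,
 * read-only attachment (shmflg = SHM_RDONLY).
 * Returns 0 on success, -1 on any failure. */
int shm_flag_demo(char *out, size_t out_len)
{
    const char msg[] = "hello";
    int result = -1;

    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    /* shmaddr = NULL lets the kernel pick the address; shmflg = 0
     * means no flags are set, so the mapping is read/write. */
    char *rw = shmat(id, NULL, 0);
    if (rw != (void *)-1) {
        memcpy(rw, msg, sizeof msg);

        /* Read-only view of the same segment; a store through
         * this pointer would fault. */
        const char *ro = shmat(id, NULL, SHM_RDONLY);
        if (ro != (void *)-1) {
            snprintf(out, out_len, "%s", ro);
            result = 0;
            shmdt(ro);
        }
        shmdt(rw);
    }

    /* Mark the segment for removal so it does not outlive the process. */
    shmctl(id, IPC_RMID, NULL);
    return result;
}
```

Note that the third argument is an int in both calls, which is why 0 rather than NULL is the natural way to say "no flags".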

Checking for valid user memory in kernel mode with copy_to_user

So, I tried using this:
copy_to_user(p, q, 0)
I want to copy from q to p and if it doesn't work, then I want to know if p points to an invalid address.
copy_to_user returns the number of bytes that weren't copied successfully but in this case, there are 0 bytes and I can't know for sure if p points to an invalid address.
Is there another way to check if p points to a valid user memory?
Yes. You need to check the size value manually each time before calling copy_to_user(). If it is 0 or not in the valid range, you shouldn't call copy_to_user() at all. This way you can rely on the copy_to_user() return value.
The function copy_to_user is defined in /usr/src/linux-3.0.6-gentoo/include/asm-generic/uaccess.h:
static inline long copy_to_user(void __user *to,
                const void *from, unsigned long n)
{
        might_fault();
        if (access_ok(VERIFY_WRITE, to, n))
                return __copy_to_user(to, from, n);
        else
                return n;
}
The function access_ok checks the accessibility of to (the user memory). So you can use access_ok to check whether the memory is valid (to is not NULL and lies in user space).
Its first argument is VERIFY_READ or VERIFY_WRITE. VERIFY_READ checks whether the memory region is readable; VERIFY_WRITE checks whether the memory region is readable as well as writable.
source of method access_ok
And what do you consider 'valid user memory'? What do you need this for?
Let's say we only care about the target buffer residing in userspace range (for archs with joint address spaces). From this alone we see that testing the address without the size is pointless - what if the address is the last byte of userspace? Appropriate /range/ check is done by access_ok.
Second part is whether there is a page there or a read/write can be performed without servicing a page fault. Is this of any concern for you? If you read copy_from/whatever you will see it performs the read/write and only catches the fault. There is definitely KPI to check whether the target page can be written to without a fault, but you would need to hold locks (mmap_sem and likely more) over your check and whatever you are going to do next, which is likely not what you wanted to do.
So far it seems you are trying

Can I pass an integer to `access_ok()` as its second argument?

In LDD3's example, access_ok() is placed at the beginning of the ioctl method of a kernel module to check whether a pointer passed from userspace is valid. This is correct when the userspace application calls the ioctl() system call and passes it the address of a variable. In some cases, however, ioctl() is invoked with a value instead of a pointer as its third argument, and that value ends up as the second argument of access_ok() in the kernel module.
I've tried passing an integer as access_ok()'s second argument and it works fine. No error was reported. But I am not sure whether this usage is correct.
For example, suppose I invoke ioctl() in userspace with '3' as its third argument. Then, in the ioctl() method of struct file_operations, access_ok() will receive 3 as its second argument. Because access_ok() expects a pointer, it treats 3 as a userspace pointer. Obviously, that's wrong...
Actually, access_ok's check is rough. The description of the function (in the source file) says:
Note that, depending on architecture, this function probably just
checks that the pointer is in the user space range - after calling
this function, memory access functions may still return -EFAULT.
E.g., according to the source arch/x86/include/asm/uaccess.h, on x86 access_ok just checks that the given address points into the lower area (because the kernel resides in the upper area). So it returns true for an address equal to 3.
It is copy_from_user/copy_to_user that return the final verdict about user memory accessibility.
Userspace programs can give you any random value as a pointer, so access_ok() must be able to handle any random value.
So it is definitely OK to call access_ok() with a non-pointer value.
However, unless you are actually going to try to access that memory location, calling access_ok() is utterly pointless.
(For that matter, you should, if possible, avoid access_ok() and just check the actual userspace accesses (get_user() etc.) for errors.)

Are socket file descriptors > FD_SETSIZE valid on OS X?

On OS X, we're finding that calls to ::socket(...) can sometimes return a file descriptor that is greater than FD_SETSIZE. According to the docs for ::socket(...) this is a successful request, as only return values of -1 indicate failure. However, throughout our codebase – before using the descriptor – we're testing that it's within the range 0 to FD_SETSIZE inclusive.
Is this the correct thing to be doing, or can we safely go ahead and use the sockets whose descriptors are greater than FD_SETSIZE?
The reason this is happening is because we're calling ::setrlimit(...) with the RLIMIT_NOFILE flag to up the maximum number of file descriptors. We need to do this because our application has to keep a large number of files open simultaneously. However, this value is obtained via a call to sysctl(...) with the KERN_MAXFILESPERPROC flag and set at run-time, whereas FD_SETSIZE is a compile-time setting.
You can use socket descriptors which have a value which is greater than FD_SETSIZE, just not with an fd_set.
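One way to act on this is to keep the FD_SETSIZE check only where an fd_set is actually involved, and wait on larger descriptors with an interface that takes the descriptor by value. The sketch below uses poll() for that; poll() is not mentioned in the question, so treat it as one possible substitute, and the helper names fits_in_fd_set and wait_readable are invented for the example.

```c
#include <poll.h>
#include <sys/select.h>

/* An fd_set only has room for descriptors 0 .. FD_SETSIZE - 1,
 * so only these may be passed to FD_SET() and select(). */
int fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Waits up to timeout_ms for fd to become readable, using poll(),
 * which has no FD_SETSIZE ceiling on the descriptor value.
 * Returns 1 if readable, 0 if not (timeout, or only error/hangup
 * events were reported), -1 if poll() itself fails. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&p, 1, timeout_ms);
    if (n < 0)
        return -1;
    return (n > 0 && (p.revents & POLLIN)) ? 1 : 0;
}
```

With this split, a descriptor that fails fits_in_fd_set() is still a perfectly good socket; it just has to stay away from select().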
