How to edit a bio write request - linux-kernel

I'm writing a device-mapper target that has to modify incoming writes. For example:
An incoming bio request contains a bio_vec with a page full of 0s, and I want to change it to all 1s.
I tried using bvec_kmap_local() to get the page address, so that I can read the data and, if needed, adjust it using memcpy or similar. Initial tests seemed to work, but if I run something like mkfs on the created dm device, I get a lot of segmentation faults. I narrowed the problem down: it only happens if I actually write something to the mapped page. However, I see no reason why this should cause a fault, as I (after checking probably 100 times) don't access invalid memory. Is my method of editing the write actually correct?
Thanks! I can provide way more information if needed

So apparently you're not supposed to modify the pages of an incoming bio request in place, since write submitters (such as file systems) expect their data buffers to remain unchanged. Instead, you should allocate a new bio backed by pages you own, fill those pages with the data you want written, and submit that bio in place of the original.

Related

Save file from POST request to disk without storing in memory with Python's BaseHTTPServer

I'm writing an HTTP server in Python 2 with BaseHTTPServer. It has to accept multiple connections at the same time, and on each connection the user can send a large file through a POST request. However, my understanding is that the whole request is stored in the server's memory before being processed, so multiple files uploaded at the same time can exceed the amount of memory on the server. Is there any way to stream the request directly to a file on disk instead of storing it in memory?
BaseHTTPServer doesn't come with a POST handler out of the box, so you'll have to implement it yourself or find an implementation that works for you. (These are easy to search for; here's one I found that looked straightforward.)
Your question is similar to this question about limiting the max-size of POST; the answer points out you'll need to read through all that data in order to ensure proper browser functionality. The comments to that answer seem to indicate the use of other techniques ("e.g. AJAX and realtime notifications via WebSocket." #dmitry-nedbaylo)
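One way to avoid holding the whole upload in memory is to copy the request body from the handler's input stream to disk in fixed-size chunks. The following is only a minimal sketch under some assumptions: it uses Python 3, where BaseHTTPServer was renamed http.server; it assumes the client sends a Content-Length header; and copy_stream, UploadHandler, the chunk size and the temp-file naming are hypothetical choices, not anything from the question.

```python
import os
import tempfile
from http.server import BaseHTTPRequestHandler

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune as needed


def copy_stream(src, dst, length, chunk_size=CHUNK_SIZE):
    """Copy up to `length` bytes from src to dst, one chunk at a time.

    Returns the number of bytes actually copied, which is less than
    `length` if the source ends early.
    """
    remaining = length
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:  # the client closed the connection early
            break
        dst.write(chunk)
        remaining -= len(chunk)
    return length - remaining


class UploadHandler(BaseHTTPRequestHandler):
    # Hypothetical handler: every POST body is written to a new temp file.
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        fd, path = tempfile.mkstemp(prefix="upload-")
        with os.fdopen(fd, "wb") as out:
            copied = copy_stream(self.rfile, out, length)
        # 201 if the full body arrived, 400 if it was cut short.
        self.send_response(201 if copied == length else 400)
        self.end_headers()
```

With this shape, at most one chunk per connection is held in memory at a time, regardless of the size of the uploaded file.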

Where are page permissions stored on hardware and how can I alter them directly?

I'm trying to write a pseudo kernel driver (it uses CVE-2018-8120 to get kernel permission, so it's technically not a driver) and I want to be as safe as possible when entering ring0. I'm writing a function to read and write MSRs from userland, and before the transition to ring0 I'm trying to guarantee that the void pointer given to my function can be written. I decided the ideal way to do this was to make it writable if it is not already.
The problem is that the only way I know how to do this is with VirtualProtect() and NtAllocateVirtualMemory, but VirtualProtect() sometimes fails and returns an error instead. I want to know precisely where these access permissions are stored (in RAM? in some special CPU register?), how I can obtain their address, and how I can modify them directly.
User-mode code should never try to muck around in kernel data structures, and any properly written kernel will prevent it anyway. The best way for user mode code to ensure that an address can be written is to write to it. If the page was not already writeable, the page fault will cause the kernel to make it so.
Nevertheless, the kernel code /cannot/ rely on the application having done so, for two reasons:
1) Even if the application does it properly, the page might be unmapped again before (or after) entering ring 0.
2) The kernel should /never/ rely on application code to do the right thing. It always has to protect itself.
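The access permissions are stored in the page directory and page table entries, which live in RAM. CR3 holds the physical address of the top-level paging structure, and CR0 holds global control bits such as PG (paging enable) and WP (write protect).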
More information can be found here: https://wiki.osdev.org/Paging.
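To make the layout a little more concrete, here is a small sketch that decodes the permission bits of a raw x86-64 page table entry value. It only illustrates which bits carry the permissions; decode_pte is a hypothetical helper, and the value it is given would have to come from somewhere else (it does not read real page tables).

```python
# x86-64 page-table-entry flag bits, as defined by the paging format.
PTE_PRESENT = 1 << 0       # P: the page is mapped
PTE_WRITABLE = 1 << 1      # R/W: writes are allowed when set
PTE_USER = 1 << 2          # U/S: user-mode access is allowed when set
PTE_NO_EXECUTE = 1 << 63   # NX/XD: instruction fetches are forbidden when set


def decode_pte(entry):
    """Return the permission flags encoded in a raw 64-bit entry value."""
    return {
        "present": bool(entry & PTE_PRESENT),
        "writable": bool(entry & PTE_WRITABLE),
        "user": bool(entry & PTE_USER),
        "no_execute": bool(entry & PTE_NO_EXECUTE),
    }
```

For example, decode_pte(0x8000000000000005) describes a made-up entry that is present, user-accessible, read-only and non-executable.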

Windows memory management: check if a page is in memory

Is there a way, in Windows, to check if a page is in memory or on disk (swap space)?
The reason I want to know this is to avoid causing a page fault if the page is on disk, by not accessing that page.
There is no documented way that I am aware of for accomplishing this in user mode.
That said, it is possible to determine this in kernel mode, but this would involve inspecting the Page Table Entries, which belong to the Memory Manager. That is not something you would want to do in any sort of production code.
What is the real problem you're trying to solve?
The whole point of Virtual Memory is to abstract this sort of thing away. If you are storing your own data in user-land, put it in a data structure that supports caching and don't think about pages.
If you are writing code in kernel-space, I know that in Linux you need to convert a memory address from a user-land one to a kernel-space one; then there are API calls in the VMM to get at the page_table_entry, and subsequently the page struct, from the address. Once that is done, you use logical operators to check for flags, one of which is "swapped". If you are trying to make something fast, though, traversing and messing with memory at the page level might not be the most efficient (or safest) thing to do.
More information is needed in order to provide a more complete answer.

dojo.io.script.get caching

I'm loading data using dojo.io.script.get. Each request can be big, and I need to issue lots of them.
The question is: after the data has been loaded and later dismissed, is it cached by the browser?
In other words, when I load data with the content "myFunc('blah blah blah')", it will execute the myFunc function. What happens to the browser's memory after execution? If I load it, say, 100 times and the string passed to myFunc is, say, 1GB each time, will the browser run out of memory?
Thanks.
Andrei
One of the things I have learned about Dojo is that the source code is a great reference.
My quick inspection of dojo/io/script.js shows that there is some logic involving dead code tags and destroying script tags so I guess it should protect against the memory leaks you mention. (Of course, you should always test this kind of stuff yourself, just to be sure).

Text difference patch

I am trying to write a piece of code which will allow the user to type text into a textbox which then gets saved on the server. When the user types some more text in the textbox, I want only the difference to be sent to the server.
Is there a difference algorithm for JS which I can use to send only information about the difference? Essentially, it should be able to tell the difference between two versions of the textbox contents.
It could also be language agnostic and I can port it.
Thank you for your time.
UPDATE
In simple words: I have a text area whose contents are saved every X seconds. To save bandwidth, I only want to send the difference from the last saved revision (which I can, say, keep in a variable; initially it will be empty). The JS has to compare the last revision with the current state of the textbox and generate a change list to send to the server.
UPDATE 2
Something like www.etherpad.com
Google DiffMatchPatch has a JavaScript implementation; I've used it with much success.
http://code.google.com/p/google-diff-match-patch/
The Python difflib module does this and more. It's very flexible but might be challenging to port to Javascript.
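To show the shape of the change list such a module can produce, here is a small sketch built on difflib.SequenceMatcher. The helper names make_patch and apply_patch and the (start, end, replacement) tuple format are assumptions made for illustration; they are not part of difflib.

```python
from difflib import SequenceMatcher


def make_patch(old, new):
    """Return a list of (start, end, replacement) edits turning old into new.

    start and end index into `old`. Unchanged ranges are omitted, so only
    the differences need to be sent.
    """
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old, patch):
    """Apply the edits from make_patch to old (the receiving side)."""
    result = old
    # Apply from the end so that earlier indices stay valid.
    for start, end, replacement in reversed(patch):
        result = result[:start] + replacement + result[end:]
    return result
```

The client would send the result of make_patch(last_saved, current), and the server would call apply_patch on its stored copy. Both sides must agree on the last saved revision for the indices to line up.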
Regarding your update, I'm first wondering why you need to worry about bandwidth. Unless your users are typing a lot of text into an edit box (which has its own usability issues) then there just aren't that many bytes to send. Send the whole text box each time you autosave. Users can't type fast enough to really notice the use of bandwidth.
Or, you could meet halfway. Every time you autosave, check to see whether the user has only added new text to the end compared to the last time. If so, send an "append" type update with just the new text. If the user has gone back and edited anything else, then send a "replace" type update where you send the whole text. This takes care of the common append-only case without severely complicating your implementation.
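The append-or-replace idea can be sketched in a few lines. This is only an illustration of the logic in Python (it would port directly to JavaScript); make_update and apply_update and the ("append" | "replace", text) pair format are hypothetical names chosen for the example.

```python
def make_update(last_saved, current):
    """Return None if nothing changed, else a (kind, text) pair.

    kind is "append" when the user only added text at the end,
    and "replace" when anything else was edited.
    """
    if current == last_saved:
        return None
    if current.startswith(last_saved):
        return ("append", current[len(last_saved):])
    return ("replace", current)


def apply_update(last_saved, update):
    """Server side: rebuild the current text from the stored revision."""
    kind, text = update
    return last_saved + text if kind == "append" else text
```

After each successful autosave, the client would store the current text as the new last saved revision.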
Instead of calculating a diff between 2 texts, which is difficult, you could always record the keystrokes and the caret position in the textbox while people are editing. If you send this over every now and then (and clear the buffer), the server can play back the exact same sequence.
This code-smells of premature optimization. Perhaps you should implement your solution first and then see about optimizing your transfer rates using diffs. How much text are you looking at? Because the request and response packets are going to be more or less the same size with only a few bytes difference for your message, so the savings could be very minimal.
At the very least, complete your solution without optimization and profile your network traffic using tools like Firebug and then test to see how much worse the performance is with what you would consider to be the maximum text block that could be sent.
Finally, you could always use the TypeWatch jQuery plugin to listen for change events in the textbox. You can set a delay so that once the user finishes typing and the delay elapses, the callback function is triggered. This means that the text will only be sent when the user types something, and only when they are finished typing. This will be significantly more efficient than repeatedly polling the server.
Depends on how far you are ready to go.
You might want to look at the deltav algorithm, which is used by svn in particular: http://svn.apache.org/repos/asf/subversion/trunk/notes/svndiff
