Preamble:
Recently I came across an interesting story about people who seem to be sending emails with attachments that contain child pornography. Here is an example (this one is a JPEG, but I'm hearing about it being done with PDFs, which generally can't be previewed):
https://www.youtube.com/watch?v=zislzpkpvZc
This can pose a real threat to people in investigative journalism, because even if you delete the file after it has been opened from a Temp directory, it may still be recoverable with forensics software. Merely having opened the file already puts you in the realm of committing a felony.
This can also pose a real problem for a group's security consultants. Let's say person A emails criminal files, and person B, suspicious of the email, forwards it to their program's security manager. To analyze the file, the consultant may have to download it to a hard drive, even if they then load it in a VM or sandbox. Even if they figure out what it is, they are still in a legal minefield where bad timing could land them in jail for 20 years. Thinking about this, if the data were to enter only RAM, then upon a power-down all traces of the opened file would disappear.
Question: I have an OK understanding of how computer architecture works, but the problem described above got me wondering. Is there a limitation at the OS, hardware, or firmware level that prevents a program from opening a stream of downloading data directly into RAM? If not, say you try to open a PDF: is it possible for the file to be passed to the program as a stream of downloading bytes instead, so that the final file is never retained on the hard drive?
Unfortunately I can only give a Linux/Unix-based answer to this, but hopefully it is helpful and extends to Windows too.
There are many ways to pass data between programs without writing to the hard disk; it is usually more a question of whether the applications involved support it (the web browser and PDF reader in your example). Streams can be passed via pipes and sockets, but the problem is that the receiving program may find it more convenient to seek back in the stream at certain points rather than hold all of the data in memory, and that can be a more efficient use of resources too. Hence many programs do not support streamed input. A pipe can indeed be made to look like a file, but if the application tries to seek backward, it will get an error.
If there were more demand for streaming data into applications, it would probably be supported more widely, as there are no major barriers. Currently it is more common to store PDFs in a temporary file when they are viewed in a plugin rather than downloaded. Video can be different, though.
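To make the seek problem concrete, here is a rough C# sketch (which runs on Linux under .NET as well; the URL is a placeholder) of a program consuming a download purely as an in-memory stream. The response stream behaves like a pipe in that it cannot seek backwards, so a viewer that needs random access has to buffer the data itself, for example entirely in RAM:

    // Rough sketch: read a download straight into memory, never touching the disk.
    using System;
    using System.IO;
    using System.Net.Http;
    using System.Threading.Tasks;

    class StreamToRamDemo
    {
        static async Task Main()
        {
            using var client = new HttpClient();

            // Start reading as soon as the headers arrive; nothing is written to a temp file here.
            using var response = await client.GetAsync(
                "https://example.com/document.pdf",          // placeholder URL
                HttpCompletionOption.ResponseHeadersRead);
            using var network = await response.Content.ReadAsStreamAsync();

            // The catch described above: like a pipe, this stream cannot seek backwards.
            Console.WriteLine($"Seekable: {network.CanSeek}");   // typically False

            // A viewer that needs random access must buffer the data itself, e.g. entirely in RAM.
            using var buffer = new MemoryStream();
            await network.CopyToAsync(buffer);
            buffer.Position = 0;                                 // seeking works now, all in memory
            Console.WriteLine($"Buffered {buffer.Length} bytes without touching the disk");
        }
    }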
An alternative is to use a RAM drive. It is common for a Linux system to have at least one set up by default (tmpfs), although on Windows it seems you have to install additional software. Using one removes the limitations above, and it is fairly easy to set a web browser to use it for temporary files.
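As a small illustration, here is a sketch in C# (.NET on Linux) of writing a "temporary" file to /dev/shm, which is a tmpfs mount on most distributions, instead of the on-disk temp directory; the path and file contents are placeholders:

    using System;
    using System.IO;

    class TmpfsDemo
    {
        static void Main()
        {
            // /dev/shm is tmpfs on most Linux distributions; adjust if your system differs.
            // (Note that tmpfs pages can still be swapped out unless swap is disabled or encrypted.)
            string ramPath = Path.Combine("/dev/shm", "preview-" + Guid.NewGuid().ToString("N") + ".pdf");

            byte[] downloadedBytes = { 0x25, 0x50, 0x44, 0x46 };   // stand-in for the real download ("%PDF")

            File.WriteAllBytes(ramPath, downloadedBytes);           // lands in RAM-backed storage, not the HDD
            // ... hand ramPath to the viewer here ...
            File.Delete(ramPath);
        }
    }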
Related
TL;DR: How can I stream content (specifically music and videos) from a cloud storage solution like Google Drive without having the entire file cached first? My goal is to create a Netflix/YouTube-esque experience with my movie/music library.
This seems to be an issue that many people are having, and many forum posts say that PlexCloud is the solution, but it isn't available anymore, so I want to find another way.
Essentially, I would like to free up space on my local machine, offloading my movies and music to the cloud. I would like these files to be available instantly from any of my devices.
The solutions I have come across so far are:
Google File Stream (or similar)
Expandrive
CloudMounter
These apps mount your cloud storage as a network drive and allow you to store files in the cloud with "instant" access. They sound great in principle, but the issue with all of them is that the entire file has to be cached before you can watch or listen. This defeats the whole purpose of keeping the files in the cloud, because every time you want to watch a video the entire file has to be cached first. That is very inconvenient for me: I have a rather slow internet connection, monthly transfer limits, and I have to wait until the file has been cached before I can watch.
The closest I've come to making this work is with Kodi, but the interface is horrible on anything other than a TV; on desktop or mobile it's useless! Functionally, though, the way it retrieves files is perfect: its website says it only caches up to ~60 MB at a time, meaning you can start watching or listening instantly, and the file doesn't need to be cached in its entirety.
So my questions are:
Is there an alternative to Kodi that works on all major OSes, where files are instantly available and the caching works like YouTube or Netflix, with only a small portion of the file cached at a time?
Is it actually possible to play a video natively in the OS (in an app like VLC) before the entire video is stored on the local disk, either in storage or in cache?
If so, how would I go about doing this?
A few conditions for the solution:
I don't want to have to use the browser every time - a desktop/mobile app, Finder, or File Explorer is essential.
Ideally something that will run on Android TV, or at least is able to use Chromecast.
Files must be instantly accessible - nothing that will cache the entire file first (unless this is impossible due to how OS's work).
If possible, I would prefer NOT to have to go through some massively complicated setup involving coding, terminal commands, or a dedicated server. The solution must use cloud storage, ideally with an app that works on all major OSes.
Thanks in advance for help and suggestions!
I've spent days trying to find the answer from many sources, but unfortunately haven't reached any solution.
The problem is how to prevent users from accessing an application's data (videos, images, etc.). For example, I have a desktop or mobile application, and it has some folders that contain videos. I don't want users to be able to access these directly and copy them.
In addition, since these files are big, any conversion of a file takes a lot of time, which slows down the application. I think encrypting a whole file would take too long.
I'm asking my question independently of any environment; it could be a Windows or Android application. Is there any method or technique to achieve this?
Edit: A way to encode/decode the files quickly would help me, or some kind of password-protection solution...
Sorry for my English.
Short version: Not really. All you can do is obfuscate the files, and (on mobile) you have limited resources with which to do this (CPU, RAM).
No matter what you do to the files, your program must contain all the information required to decode them or they will not be usable. Ergo, the determined attacker will be able to get the files.
If all you are trying to do is keep out the casual person, then you probably don't need to do anything - extracting files from within mobile applications is generally beyond the normal user.
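If obfuscation is good enough for your purposes, a cheap reversible transform will keep the casual user out without the cost of real encryption. A minimal sketch (C#; the key and names are made up, and this is not real protection, because the key ships inside your app):

    using System.IO;

    static class Obfuscator
    {
        // The key has to be embedded in (or derivable by) the application, which is exactly
        // why a determined attacker can always undo this.
        private static readonly byte[] Key = { 0x5A, 0x13, 0xC7, 0x2E };

        // XOR is its own inverse, so the same routine scrambles and unscrambles.
        public static void XorFile(string inputPath, string outputPath)
        {
            using var input = File.OpenRead(inputPath);
            using var output = File.Create(outputPath);

            var buffer = new byte[81920];
            long position = 0;
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                    buffer[i] ^= Key[(int)((position + i) % Key.Length)];
                position += read;
                output.Write(buffer, 0, read);
            }
        }
    }

Streaming the XOR like this keeps the cost proportional to how much of the file you actually read, which also addresses the concern about large files being slow to convert.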
Whiteboard Overview
The images below are 1000 x 750 px, ~130 kB JPEGs hosted on ImageShack.
Internal
Global
Additional Information
I should mention that each user (of the client boxes) will be working straight off the /Foo share. Due to the nature of the business, users will never need to see or work on each other's documents concurrently, so conflicts of this nature will never be a problem. Access needs to be as simple as possible for them, which probably means mapping a drive to their respective /Foo/username sub-directory.
Additionally, no one but my applications (in-house and the ones on the server) will be using the FTP directory directly.
Possible Implementations
Unfortunately, it doesn't look like I can use off-the-shelf tools such as WinSCP because some other logic needs to be intimately tied into the process.
I figure there are two simple ways for me to accomplish the above on the in-house side.
Method one (slow):
Walk the /Foo directory tree every N minutes.
Diff with the previous tree using a combination of timestamps (which can be faked by file copying tools, but that is not relevant in this case) and checksumming.
Merge changes with off-site FTP server.
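Roughly, I picture method one as something like this sketch (the share path and the choice of MD5 are placeholders; the FTP merge step is left out):

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Security.Cryptography;

    class PollingDiff
    {
        // Hash every file under the share. Run this every N minutes.
        static Dictionary<string, string> Snapshot(string root)
        {
            var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            using var md5 = MD5.Create();
            foreach (var file in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
            {
                using var stream = File.OpenRead(file);
                result[file] = BitConverter.ToString(md5.ComputeHash(stream));
            }
            return result;
        }

        // Compare against the previous pass; anything new or modified is queued for the FTP merge.
        static IEnumerable<string> Changed(Dictionary<string, string> previous, Dictionary<string, string> current)
        {
            foreach (var entry in current)
                if (!previous.TryGetValue(entry.Key, out var oldHash) || oldHash != entry.Value)
                    yield return entry.Key;
        }
    }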
Method two:
Register for directory change notifications (e.g., using ReadDirectoryChangesW from the WinAPI, or FileSystemWatcher if using .NET).
Log changes.
Merge changes with off-site FTP server every N minutes.
I'll probably end up using something like the second method due to performance considerations.
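Roughly, I picture the second method as something like this (the share path and the queue are placeholders; de-duplication, error handling, and the periodic FTP merge are omitted):

    using System;
    using System.Collections.Concurrent;
    using System.IO;

    class ChangeLogger
    {
        static readonly ConcurrentQueue<string> PendingUploads = new ConcurrentQueue<string>();

        static void Main()
        {
            using var watcher = new FileSystemWatcher(@"D:\Foo")   // the mapped share, for illustration
            {
                IncludeSubdirectories = true,
                NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite | NotifyFilters.Size
            };

            watcher.Created += (s, e) => PendingUploads.Enqueue(e.FullPath);
            watcher.Changed += (s, e) => PendingUploads.Enqueue(e.FullPath);
            watcher.Renamed += (s, e) => PendingUploads.Enqueue(e.FullPath);

            watcher.EnableRaisingEvents = true;

            Console.WriteLine("Logging changes; press Enter to stop.");
            Console.ReadLine();
            // Every N minutes a separate worker would drain PendingUploads, de-duplicate it,
            // and merge the changes with the off-site FTP server.
        }
    }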
Problem
Since this synchronization must take place during business hours, the first problem that arises is during the off-site upload stage.
While I'm transferring a file off-site, I effectively need to prevent users from writing to it (e.g., use CreateFile with FILE_SHARE_READ or something) while I'm reading from it. The internet upstream speed at their office is nowhere near adequate for the file sizes they'll be working with, so it's quite possible that they'll come back to a file and attempt to modify it while I'm still reading from it.
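(By CreateFile with FILE_SHARE_READ I mean something like the managed sketch below; the helper name is just illustrative. While the returned stream stays open, other processes can still read the file, but any attempt to open it for writing fails.)

    using System.IO;

    static class UploadLock
    {
        // Managed equivalent of CreateFile with GENERIC_READ + FILE_SHARE_READ:
        // concurrent readers are fine, writers are blocked until the stream is disposed.
        public static Stream OpenForUpload(string path)
        {
            return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
        }
    }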
Possible Solution
The easiest solution to the above problem would be to create a copy of the file(s) in question elsewhere on the file-system and transfer those "snapshots" without disturbance.
The files (some will be binary) that these guys will be working with are relatively small, probably ≤20 MB, so copying (and therefore temporarily locking) them will be almost instant. The chances of them attempting to write to the file in the same instant that I'm copying it should be close to nil.
This solution seems kind of ugly, though, and I'm pretty sure there's a better way to handle this type of problem.
One thing that comes to mind is something like a file system filter that takes care of the replication and synchronization at the IRP level, kind of like what some A/Vs do. This is overkill for my project, however.
Questions
This is the first time I've had to deal with this type of problem, so perhaps I'm overthinking it.
I'm interested in clean solutions that don't require going overboard with the complexity of their implementations. Perhaps I've missed something in the WinAPI that handles this problem gracefully?
I haven't decided what I'll be writing this in, but I'm comfortable with: C, C++, C#, D, and Perl.
After the discussion in the comments, my proposal would be as follows:
Create a partition on your data server, about 5GB for safety.
Create a Windows Service project in C# that would monitor your data drive / location.
When a file has been modified, create a local copy of it, preserving the directory structure, and place it on the new partition.
Create another service that would do the following:
Monitor bandwidth usage.
Monitor file creation on the temporary partition.
Transfer several files at a time (using threading) to your FTP server, respecting current bandwidth usage and increasing or decreasing the number of worker threads depending on network traffic.
Remove files from the partition once they have transferred successfully.
So basically you have your drives:
C: Windows Installation
D: Share Storage
X: Temporary Partition
Then you would have the following services:
LocalMirrorService - Watches D: and copies to X: with the dir structure
TransferClientService - Moves files from X: to the FTP server and removes them from X:
It also uses multiple threads to move several files at once and monitors bandwidth.
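A sketch of the TransferClientService's inner step might look like the following (host, credentials, and paths are placeholders; the bandwidth throttling and the worker-thread pool are left out):

    using System.IO;
    using System.Net;

    static class FtpUploader
    {
        // Push one snapshot file from X: to the FTP server and clear it once the upload succeeds.
        public static void UploadAndRemove(string localPath, string remoteUri)
        {
            var request = (FtpWebRequest)WebRequest.Create(remoteUri);   // e.g. "ftp://ftp.example.com/Foo/user/file.doc"
            request.Method = WebRequestMethods.Ftp.UploadFile;
            request.Credentials = new NetworkCredential("ftpUser", "ftpPassword");

            using (var source = File.OpenRead(localPath))
            using (var target = request.GetRequestStream())
            {
                source.CopyTo(target);
            }

            using var response = (FtpWebResponse)request.GetResponse();
            if (response.StatusCode == FtpStatusCode.ClosingData)   // 226: transfer complete
                File.Delete(localPath);                             // remove the snapshot from X:
        }
    }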
I would bet this is the idea you already had in mind, but it seems like a reasonable approach, as long as you're good at application development and able to build a solid system that handles most issues.
When a user edits a document in Microsoft Word, for instance, the file will change on the share and may be copied to X: even though the user is still working on it. Within Windows there are APIs to check whether the file handle is still open; if it is, you can watch for the moment the user actually closes the document, so that all their edits are complete, and only then migrate it to drive X:.
That being said, if the user is working on the document and their PC crashes for some reason, the file handle may not be released until the document is opened again at a later date, which can cause issues.
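A cheap way to approximate the "is the document still open?" check is simply to try opening the file with no sharing and see whether that fails; something like this hypothetical probe:

    using System.IO;

    static class FileLockProbe
    {
        // Returns true if some other process (e.g. Word) still holds a handle to the file.
        public static bool IsStillOpenElsewhere(string path)
        {
            try
            {
                using var probe = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None);
                return false;   // no other process currently has the file open
            }
            catch (IOException)
            {
                return true;    // sharing violation: an editor (or a crashed session) still holds it
            }
        }
    }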
For anyone in a similar situation (I'm assuming the person who asked the question implemented a solution long ago), I would suggest an implementation of rsync.
rsync.net's Windows Backup Agent does what is described in method 1, and can be run as a service as well (see "Advanced Usage"). Though I'm not entirely sure if it has built-in bandwidth limiting...
Another (probably better) solution that does have bandwidth limiting is Duplicati. It also properly backs up currently-open or locked files. Uses SharpRSync, a managed rsync implementation, for its backend. Open source too, which is always a plus!
I'm concerned about the dangers of using memory-mapped I/O, via CreateFileMapping, on FAT filesystems. The specific scenario is users opening documents directly from USB sticks (yeah, you try banning them from doing this!).
The MSDN Managing Memory-Mapped Files article doesn't say anything about file system constraints.
Update
I didn't have any real reason to be concerned, just a vague feeling that I'd read about problems with them at some point (my career spans over 25 years, so there are a lot of vague depths in my memory, going all the way back to 8-bit micros!). Whether or not they should be supported is an important recommendation for me to make, so I wanted to ask whether anyone could corroborate my concerns. Thanks for putting my mind at rest.
Memory-mapped files are one of my favorite features. There is absolutely no danger: they are one of the core, heavily optimized Windows I/O features. Starting an EXE or loading a DLL is implemented internally as a memory-mapped file mapping.
They are supported on all types of file systems, including FAT.
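For reference, this is roughly all it takes from managed code, where MemoryMappedFile wraps CreateFileMapping/MapViewOfFile (the drive letter and file name are placeholders; the call is the same whether the volume is NTFS, FAT, or a network share):

    using System;
    using System.IO.MemoryMappedFiles;

    class MapFromUsb
    {
        static void Main()
        {
            // Map a document sitting on a FAT-formatted USB stick.
            using var mapping = MemoryMappedFile.CreateFromFile(@"E:\document.dat");
            using var view = mapping.CreateViewAccessor();

            byte firstByte = view.ReadByte(0);   // pages are faulted in from the stick on demand
            Console.WriteLine($"First byte: {firstByte:X2}");
        }
    }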
By the way, atzz says that memory-mapped files are allowed on network drives. I can add that it is not only allowed, but actually recommended to use memory-mapped files with files on the network as well. In that case the I/O is cached very effectively, which is not the case with other (C/C++) I/O.
If you want an EXE not to crash when it is opened from a CD or over the network, you can mark the program executable with a bit in its header (the /SWAPRUN linker switch; see http://msdn.microsoft.com/en-us/library/chzz5ts6.aspx). There is no such option for documents opened from a USB stick.
But what exact problem are the users having? Are they not using the "Safely Remove Hardware" icon? Then they have to learn to do so, just as they had to learn not to cut the power but to shut the computer down properly.
Could you explain why you consider memory-mapped files dangerous, in which situations you have seen problems, and whether other I/O operations avoid those problems?
Yes it does. It even supports mapping of files on CDFS or on network drives. What is the source of your doubts?
Two years ago, we shipped a multi-gigabyte Windows application, with lots of video files. Now we're looking to release a significant update, with approximately 1 gigabyte of new and changed data.
We're currently looking at DVD fulfillment houses (like these folks, for example), which claim to be able to ship DVDs to our customers for $5 and up. Does anyone have any experience with these companies?
We've also looked at a bunch of network-based "updater" software. Unfortunately, most of these tools are intended for much smaller programs. Are there any libraries or products which handle gigabyte-sized updates well?
Thank you for your advice!
BITS (Background Intelligent Transfer Service) is a Microsoft component for downloading files piece by piece using unused bandwidth. You can basically have your clients trickle-download the new video files. The problem, however, is that you'll have to update your program to utilize BITS first.
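As a very rough illustration only (the URL and target path are placeholders, and a shipping updater would talk to the BITS COM API or a wrapper library rather than shelling out), you can queue a background download with the bitsadmin command-line tool:

    using System.Diagnostics;

    static class BitsDemo
    {
        // Queue a low-impact background download via the (deprecated but still present) bitsadmin tool.
        public static void QueueUpdateDownload()
        {
            Process.Start("bitsadmin.exe",
                "/transfer VideoUpdate /download /priority normal " +
                "https://updates.example.com/pack1.dat C:\\ProgramData\\MyApp\\pack1.dat");
        }
    }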
Depending on who the end user is you have a few options:
Shipping DVD's
This option tends to be rather expensive, and it may not be the best way: what if you are shipping it to someone who no longer has the software installed?
HTTP hosting (using Akamai, or any other CDN)
This works rather well for other companies, for example Apple and I believe Microsoft as well.
Bittorrent
It is not just used for illegal content. It lets you offload some of the workload of distributing the file, and it is a fast protocol. If you make sure that the seeding machine has the correct file, the BitTorrent protocol will ensure that the end user gets a file with the exact same hash.
You can use the rsync algorithm: http://samba.anu.edu.au/rsync/
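That link points to the rsync technical report. The heart of the algorithm is a cheap rolling weak checksum that lets one side slide a window over its file one byte at a time and spot blocks the other side already has; a strong hash is then computed only for candidate matches. A rough C# sketch of that weak checksum, following the published formulas (not production code, and assuming modest block sizes):

    static class RollingChecksum
    {
        const uint M = 1 << 16;

        // Weak checksum of data[offset .. offset+length-1].
        public static (uint a, uint b) Compute(byte[] data, int offset, int length)
        {
            uint a = 0, b = 0;
            for (int i = 0; i < length; i++)
            {
                a = (a + data[offset + i]) % M;
                b = (b + (uint)(length - i) * data[offset + i]) % M;
            }
            return (a, b);
        }

        // Slide the window one byte to the right: drop `outgoing`, add `incoming`.
        public static (uint a, uint b) Roll((uint a, uint b) s, byte outgoing, byte incoming, int length)
        {
            uint a = (s.a + M - outgoing + incoming) % M;
            uint b = (uint)((s.b + (ulong)length * M - (ulong)length * outgoing + a) % M);
            return (a, b);
        }

        // The 32-bit value that actually gets compared: low 16 bits are a, high 16 bits are b.
        public static uint Combine((uint a, uint b) s) => s.a + (s.b << 16);
    }

This is why rsync-based approaches can ship only the changed portions of a gigabyte-sized update instead of the whole file.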