I've been looking at the Dropbox Mac client, and I'm researching how to implement a similar interface for a different service.
How exactly do they interface with Finder like this? I highly doubt the objects shown in the folder are actual documents downloaded on every load; they must be downloaded dynamically as they are needed. So how can you display these items in Finder without having actual file system objects?
Does anyone know how this is achieved in Mac OS X?
Or can anyone offer pointers to Apple APIs or other open source projects that have a similar integration with Finder?
Dropbox is not powered by either MacFUSE or WebDAV, although those might be perfectly fine solutions for what you're trying to accomplish.
If it were powered by those things, it wouldn't work when you weren't connected, as both of those rely on the server to store the actual information and Dropbox does not. If I quit Dropbox (done via the menu item) and disconnect from the net, I can still use the files. That's because the files are actually stored here on my hard drive.
It also means that the files don't need to be "downloaded on every load," since they are actually stored on my machine here. Instead, only the deltas are sent over the wire, and the Dropbox application (running in the background) patches the files appropriately. Going the other way, the Dropbox application watches for the files in the Dropbox folder, and when they change, it sends the appropriate deltas to the server, which propagates them to any other clients.
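If you want to play with the delta idea yourself, here's a minimal sketch of block-level change detection. This is not Dropbox's actual protocol; the 4 MB block size and the fixed-offset comparison are assumptions of mine, and real rsync-style tools use rolling checksums so that insertions don't shift every subsequent block:

```python
import hashlib

BLOCK = 4 * 1024 * 1024  # assumed block size, not a documented Dropbox value

def signature(path):
    """SHA-256 of each fixed-size block: the fingerprint of one file version."""
    sig = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(BLOCK)
            if not chunk:
                break
            sig.append(hashlib.sha256(chunk).hexdigest())
    return sig

def blocks_to_send(old_sig, new_path):
    """Indexes of blocks that differ from the previous version;
    only these blocks need to cross the wire."""
    new_sig = signature(new_path)
    return [i for i, h in enumerate(new_sig)
            if i >= len(old_sig) or old_sig[i] != h]
```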
This setup has some decided advantages: it works when offline, it is an order of magnitude faster, and it is transparent to other apps, since they just see files on the disk. However, I have no idea how it deals with merge conflicts (which could easily arise with one or more clients offline), which are not an issue if the server is the only copy and every edit changes that central copy.
Where Dropbox really shines is that they have an additional trick that badges the items in the Dropbox folder with their current sync status. But that's not what you're asking about here.
As far as the question at hand, you should definitely look into MacFUSE and WebDAV, which might be perfect solutions to your problem. But the Dropbox way of doing things, with a background application changing actual files on the disk, might be a better tradeoff.
Dropbox is likely using FSEvents to watch for changes to the file system. It's a great API and can even bundle up changes that happened while your app was not running. It's the same API that Spotlight uses. The menu bar app likely does the actual observing itself (since restarting it can fix hung uploads, for instance).
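Here's roughly what that observation loop looks like in Python using the watchdog package. This is an assumption on my part, since Dropbox's client code isn't public, but watchdog wraps FSEvents on OS X and follows the same pattern:

```python
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

WATCH_PATH = "/Users/me/Dropbox"  # placeholder path

class SyncHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        if not event.is_directory:
            # a real client would queue event.src_path for a delta upload here
            print(event.event_type, event.src_path)

observer = Observer()
observer.schedule(SyncHandler(), WATCH_PATH, recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
```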
There's no way they're using MacFUSE, as that would require installing the MacFUSE kernel extension to make Dropbox work, and since I definitely didn't install it, I highly doubt they're using it.
Two suggestions:
MacFUSE
WebDAV
The former will allow you to write an app that appears as a filesystem and does all the right things; the latter will allow you to move everything server-side and let the user simply mount your service as a file share.
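To give a feel for the MacFUSE route, here's a minimal read-only filesystem using the fusepy bindings. The mount point, file name, and contents are placeholders; the read() callback is where a real client would fetch data from your service on demand:

```python
import errno
import stat
import time

from fuse import FUSE, FuseOSError, Operations

class HelloFS(Operations):
    """One-file read-only filesystem: the kernel asks us for metadata
    and data, so the content can come from anywhere (e.g., a remote API)."""

    DATA = b"fetched on demand\n"

    def getattr(self, path, fh=None):
        now = time.time()
        if path == "/":
            return dict(st_mode=stat.S_IFDIR | 0o755, st_nlink=2,
                        st_ctime=now, st_mtime=now, st_atime=now)
        if path == "/hello.txt":
            return dict(st_mode=stat.S_IFREG | 0o444, st_nlink=1,
                        st_size=len(self.DATA),
                        st_ctime=now, st_mtime=now, st_atime=now)
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", "..", "hello.txt"]

    def read(self, path, size, offset, fh):
        # this is where a network fetch would go in a real client
        return self.DATA[offset:offset + size]

if __name__ == "__main__":
    FUSE(HelloFS(), "/tmp/mnt", foreground=True)
```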
Dropbox on the client is written in Python.
The client seems to use an SQLite database to index files.
I suppose Dropbox splits files into chunks to reduce bandwidth usage.
By the way, if two people have the same file, even if they do not know each other, the server can optimize and avoid transferring the file again, simply copying it on the server side.
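A sketch of how that server-side dedup could work, combining the SQLite index and chunking ideas above (the chunk size and schema are my assumptions, not Dropbox's):

```python
import hashlib
import sqlite3

db = sqlite3.connect("index.db")
db.execute("CREATE TABLE IF NOT EXISTS chunk (hash TEXT PRIMARY KEY, data BLOB)")

def store(data):
    """Content-addressed storage: chunks are keyed by hash, so the same
    chunk uploaded by two unrelated users is stored exactly once."""
    h = hashlib.sha256(data).hexdigest()
    db.execute("INSERT OR IGNORE INTO chunk (hash, data) VALUES (?, ?)",
               (h, data))
    db.commit()
    return h

def split(path, size=4 * 1024 * 1024):
    """Fixed-size chunking; a file is then just an ordered list of hashes."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            data = f.read(size)
            if not data:
                break
            hashes.append(store(data))
    return hashes
```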
To me it feels like a heavily modified revision control system. It has all the features: updates files based on deltas, options to recover or restore old revisions of files. It almost feels like they are using git (GitFS?), or some filesystem they designed.
You could also give File Conveyor a try. It's a Python daemon capable of instantly detecting FS changes (on Linux through inotify, on OS X through FSEvents), processing the files and syncing them to one or more destinations.
Supported protocols: FTP, SFTP, Amazon S3 (CloudFront is also supported), Rackspace Cloud Files. Can easily be extended. Uses django-storages.
"processing files": e.g. optimizing images, transcoding videos — this was originally conceived to be used for sending static assets to a CDN in the context of speeding up websites)
Related
TL;DR: How can I stream content (specifically music and videos) from a cloud storage solution like Google Drive without having the entire file cached first? My goal is to create a Netflix/YouTube-esque experience with my movie/music library.
So, this seems to be an issue that many people are having, and so many forum posts say that PlexCloud is the solution, but it isn't available anymore, so I want to find another way.
Essentially, I would like to free up space on my local machine, offloading my movies and music to the cloud. I would like these files to be available instantly from any of my devices.
The solutions I have come across so far are:
Google File Stream (or similar)
Expandrive
CloudMounter
These apps mount your cloud storage as a network drive and allow you to store files on the cloud with "instant" access. They sound great in principle, but the issue with all of them is that the entire file has to be cached before you can watch or listen. This defeats the whole purpose of having the files in the cloud: every time you want to watch a video, the entire file has to be cached first, which is very inconvenient for me since I have a rather slow internet connection and monthly transfer limits.
The closest I've gotten to making this work is with Kodi, but the interface is horrible on anything other than a TV; on desktop or mobile, it's useless! As far as functionality goes, though, the way it retrieves files is perfect: their website says it only caches up to ~60 MB at a time, meaning you can start watching or listening instantly, without the file being cached in its entirety.
So my questions are:
Is there an alternative to Kodi that works on all major OSes, where files are instantly available and the caching works like YouTube or Netflix, with only a small portion of the file cached at once?
Is it actually possible to play a video natively in the OS (in an app like VLC) before the entire video is stored on the local disk, either in storage or in cache?
If so, how would I go about doing this?
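(For what it's worth, I gather that question 2 boils down to HTTP range requests: any server that supports them will hand back just the bytes a player asks for. A quick illustration in Python, with a placeholder URL:)

```python
import requests

url = "https://example.com/movie.mp4"  # placeholder direct-download URL
# ask for just the first 1 MB; a player keeps issuing these as you watch
resp = requests.get(url, headers={"Range": "bytes=0-1048575"}, stream=True)
print(resp.status_code)  # 206 Partial Content if the server honors ranges
with open("first_mb.bin", "wb") as out:
    for chunk in resp.iter_content(65536):
        out.write(chunk)
```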
A few conditions for the solution:
I don't want to have to use the browser every time - a desktop/mobile app, Finder, or File Explorer is essential.
Ideally something that will run on Android TV, or at least is able to use Chromecast.
Files must be instantly accessible - nothing that will cache the entire file first (unless this is impossible due to how OS's work).
If possible, I would prefer NOT to have to go through some massively complicated setup involving coding, terminal commands, or a dedicated server. The solution must use cloud storage, ideally with an app that works on all major OSes.
Thanks in advance for help and suggestions!
I've searched far and wide, but as far as I can see, nothing suitable is available.
TL;DR: How can I use rsync (or something like rsync) with a SharePoint installation?
Long description
We have a large install base of Macs (~50%), Windows (~40%), and Linux (~10%), so our environment is quite heterogeneous. Since our work is experimental, we produce a considerable number of datasets that we need to share and, more importantly, back up.
Right now we use external hard drives to store these files and folders, since our computers cannot hold this amount of data (50+ GB per dataset, for instance), and when we need to share, we share "physically." We mainly use rsync with some kind of backend (what kind is not important), but this solution requires computers to be left on, acting as servers.
For reasons that I will not bother you with, we cannot leave a computer on after work.
OneDrive for Business seemed a very promising technology to use, since we have more than 1 TB per user. We could start syncing our datasets from our computers and hard drives, and we could share even when computers are turned off.
We are aware that we may hit some drawbacks, such as not being able to actually share, or limits on the number of objects (files/directories), but we will handle those later.
I prefer rsync, but right now we're open to any solution.
OneDrive for Business has a download that will allow you to synchronize a directory locally. https://onedrive.live.com/about/en-us/download/
For a Linux platform, you should be able to use onedrive-d found here:
https://github.com/xybu/onedrive-d
I know it's an old question, but it's unanswered. A possible solution is https://rclone.org/. Rclone is a command-line program to sync files and directories to and from the cloud.
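For example, after configuring a remote with rclone config (rclone has a OneDrive backend), a nightly sync can be scripted like this; the remote name and paths below are placeholders:

```python
import subprocess

# "od:" is a placeholder remote created beforehand with `rclone config`;
# the source and destination paths are examples only.
subprocess.run(
    ["rclone", "sync", "/data/datasets", "od:datasets", "--progress"],
    check=True,
)
```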
I'm going to be working with a medium-sized remote group on a large (but independent) project that will generate many GB to TB of data.
To keep users from having to store 500 GB of data on their personal machines, and to keep everyone in sync, we need a command-line/Python utility to control selective syncing of dependencies on multiple operating systems, or at least OS X and Linux.
So, for example, someone who needs to work on the folder:
startrek/startrekiii
May require the folders:
startrek/nimoy/common
startrek/nimoy/[user]
startrek/shatner/common
startrek/shatner/[user]
but not:
startrek/startrekii, startrek/nimoy/[some_other_user], etc
From their command line (or a UI) they would run:
sync startrekiii
And they'd also receive startrek/nimoy/common, etc
Likewise, we'll have an unsync command that removes those folders from the user's HD, as long as they are not in use by another sync.
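To make the idea concrete, here's the kind of wrapper we're imagining, sketched in Python over plain rsync. The server path, local root, and dependency table are all hypothetical:

```python
import subprocess

REMOTE = "fileserver:/projects"   # hypothetical rsync-over-SSH source
LOCAL = "/Users/me/projects"      # hypothetical local root

# hypothetical dependency table, using the folders from the example above
DEPS = {
    "startrek/startrekiii": [
        "startrek/nimoy/common",
        "startrek/nimoy/{user}",
        "startrek/shatner/common",
        "startrek/shatner/{user}",
    ],
}

def sync(target, user):
    """Pull the target folder plus everything it depends on."""
    for folder in [target] + DEPS.get(target, []):
        folder = folder.format(user=user)
        # --relative with the /./ marker recreates the path under LOCAL
        subprocess.run(
            ["rsync", "-a", "--relative",
             f"{REMOTE}/./{folder}/", f"{LOCAL}/"],
            check=True,
        )

sync("startrek/startrekiii", user="alice")
```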
Of the cloud sync/storage solutions, Dropbox seems to offer the most granular control over this, allowing you to sync specific folders and subfolders; however, from everything I can find, this granular control is strictly limited to their UI.
We're completely open to alternative solutions if you have them, we just need something as easily deployable as possible and don't have the budget for Aspera or something to that effect.
Two other important notes:
Because of one very central part of our pipeline which pulls files from those dependent folders (over which we have limited API control), the paths need to be consistent on their respective platform. So ~/Dropbox/startrek/nimoy can never be ~/Dropbox/startrek/startrekiii/nimoy.
Many of the people using this will be artists and otherwise non-technical people, whose experience with csh or bash extends to simple things like changing directories and moving files around.
Has anyone found a way to hack into Dropbox's selective sync, and/or know of a better alternative?
So, I need to set up file storage for our team. I also have an SVN server. The ability to roll back changes and to track who created or deleted a file is very necessary and important for our project.
Any ideas? It doesn't have to involve SVN. I can connect using WebDAV, but only in read-only mode (because it has no LOCK support).
You can set up the SVN server to allow exactly that.
Read the chapter in the SVN book about WebDAV and autoversioning.
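The gist of that chapter is a mod_dav_svn Location block with autoversioning switched on; with that, every WebDAV save becomes an automatic commit. The repository paths below are examples:

```apache
<Location /repos>
  DAV svn
  SVNPath /var/svn/repos
  # With autoversioning on, a plain WebDAV PUT (e.g. a save from
  # Finder or Windows Explorer) is turned into an automatic commit.
  SVNAutoversioning on
</Location>
```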
So, what you want is the ability to roll back changes and limit who can make them, but without the bother of checking files in and out?
Maybe Subversion isn't for you. I've done similar sharing with Dropbox, and there's now Box.net, which is supposed to be like Dropbox on steroids. Dropbox (and I assume Box.net too) has some features that are very nice:
You can set up folder sharing between particular teams. That way, you can control who can and cannot access these files.
Dropbox automatically saves each and every version of a file, so you can always go back to previous versions -- even if that file has been deleted.
Files are stored locally. All a user has to know is to save a particular file in a particular folder, and everyone has access to it. I've successfully used Dropbox to collaborate with managers who make the Pointy-Haired Boss in Dilbert look like a high-tech genius.
There's also SkyDrive and Google Drive, but I don't find them as universal or as easy to use as Dropbox. It's possible to use Dropbox without ever going to the Dropbox website. To the non-geek, it appears to be magic, as files I've written and edited appear on their drive. It took me a few weeks to train one person that he didn't have to email me his document when he made changes, because I already had it.
Dropbox gives you 2 GB of space for free, which doesn't sound like a lot. However, my first hard drive was a whopping 20 MB, which was twice the size of the standard 10 MB drive at that time. If you're not storing a lot of multimedia presentations or doing a lot of Photoshop, 2 GB might be more than enough for your project.
I know Windows 7 and later have some sort of versioning system built in. I know this because anytime someone mentions that Mac OS X has Time Machine, some Windows geek pipes in stating that Windows has the same thing, only better! Unfortunately, Windows is not my forte, so I don't know much about this specific feature. I believe the default is once per day, but it can be changed. This might be the perfect solution if everyone is on Windows.
Subversion can do autoversioning as Stefan stated. Considering his position in the Subversion community (especially his work on TortoiseSVN), he knows his stuff. Unfortunately I don't know too much about it since I've never used or seen this feature implemented. It's probably due to the fact that I work mainly with developers who know what a version control system is, and therefore have no need for something that does the versioning for them.
Also, don't forget to check whether you can use your corporate SharePoint, which does something very much like what you want. I am not too impressed with SharePoint, but if the facility is there and your company can give you the support, it is something you probably want to look into.
Whiteboard Overview
[Two whiteboard diagrams, "Internal" and "Global" (1000 x 750 px, ~130 kB JPEGs originally hosted on ImageShack), showed the network layout.]
Additional Information
I should mention that each user (of the client boxes) will be working straight off the /Foo share. Due to the nature of the business, users will never need to see or work on each other's documents concurrently, so conflicts of this nature will never be a problem. Access needs to be as simple as possible for them, which probably means mapping a drive to their respective /Foo/username sub-directory.
Additionally, no one but my applications (in-house and the ones on the server) will be using the FTP directory directly.
Possible Implementations
Unfortunately, it doesn't look like I can use off-the-shelf tools such as WinSCP, because some other logic needs to be intimately tied into the process.
I figure there are two simple ways to accomplish the above on the in-house side.
Method one (slow):
Walk the /Foo directory tree every N minutes.
Diff with the previous tree using a combination of timestamps (these can be faked by file-copying tools, but that's not relevant in this case) and checksums.
Merge changes with off-site FTP server.
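A sketch of method one in Python (the hash function and walk strategy are my choices; a production version would skip hashing files whose timestamps haven't changed):

```python
import hashlib
import os

def snapshot(root):
    """Map of relative path -> (mtime, sha256) for every file under root."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            digest = hashlib.sha256()
            with open(full, "rb") as f:
                for block in iter(lambda: f.read(1 << 20), b""):
                    digest.update(block)
            state[rel] = (os.path.getmtime(full), digest.hexdigest())
    return state

def diff(old, new):
    """Paths to upload and paths to delete on the off-site server."""
    changed = [p for p, (_, d) in new.items()
               if p not in old or old[p][1] != d]
    deleted = [p for p in old if p not in new]
    return changed, deleted
```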
Method two:
Register for directory change notifications (e.g., using ReadDirectoryChangesW from the WinAPI, or FileSystemWatcher if using .NET).
Log changes.
Merge changes with off-site FTP server every N minutes.
I'll probably end up using something like the second method due to performance considerations.
Problem
Since this synchronization must take place during business hours, the first problem that arises is during the off-site upload stage.
While I'm transferring a file off-site, I effectively need to prevent the users from writing to the file (e.g., use CreateFile with FILE_SHARE_READ or something) while I'm reading from it. The internet upstream speeds at their office are nowhere near symmetrical to the file sizes they'll be working with, so it's quite possible that they'll come back to the file and attempt to modify it while I'm still reading from it.
Possible Solution
The easiest solution to the above problem would be to create a copy of the file(s) in question elsewhere on the file-system and transfer those "snapshots" without disturbance.
The files (some will be binary) that these guys will be working with are relatively small, probably ≤20 MB, so copying (and therefore temporarily locking) them will be almost instant. The chances of them attempting to write to the file in the same instant that I'm copying it should be close to nil.
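That snapshot approach might look like this in Python, using the standard library's ftplib (the host, credentials, and paths are placeholders):

```python
import ftplib
import os
import shutil
import tempfile

def upload_snapshot(path, host, user, password, remote_name):
    """Copy first (a near-instant local lock), then upload the copy so
    users can keep editing the original during the slow transfer."""
    fd, snap = tempfile.mkstemp()
    os.close(fd)
    try:
        shutil.copy2(path, snap)  # near-instant for files of ~20 MB or less
        with ftplib.FTP(host, user, password) as ftp, open(snap, "rb") as f:
            ftp.storbinary(f"STOR {remote_name}", f)
    finally:
        os.remove(snap)
```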
This solution seems kind of ugly, though, and I'm pretty sure there's a better way to handle this type of problem.
One thing that comes to mind is something like a file system filter that takes care of the replication and synchronization at the IRP level, kind of like what some A/Vs do. This is overkill for my project, however.
Questions
This is the first time I've had to deal with this type of problem, so perhaps I'm overthinking it.
I'm interested in clean solutions that don't require going overboard with the complexity of their implementations. Perhaps I've missed something in the WinAPI that handles this problem gracefully?
I haven't decided what I'll be writing this in, but I'm comfortable with: C, C++, C#, D, and Perl.
After the discussions in the comments my proposal would be like so:
Create a partition on your data server, about 5GB for safety.
Create a Windows Service project in C# that monitors your data drive/location.
When a file has been modified, create a local copy of it, with the same directory structure, on the new partition.
Create another service that would do the following:
Monitor Bandwidth Usages
Monitor file creations on the temporary partition.
Transfer several files at a time (using threading) to your FTP server, abiding by the current bandwidth usage and decreasing/increasing the number of worker threads depending on network traffic.
Remove files from the partition once they have been successfully transferred.
So basically you have your drives:
C: Windows Installation
D: Share Storage
X: Temporary Partition
Then you would have following services:
LocalMirrorService - Watches D: and copies to X: with the dir structure
TransferClientService - Moves files from X: to ftp server, removes from X:
It also uses multiple threads to move several files at once and monitors bandwidth.
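The bandwidth-abiding part is essentially a token bucket. The services above would be C#, but the logic is language-agnostic; here's a compact Python sketch of it (the rate value is just an example):

```python
import time

class Throttle:
    """Token-bucket rate limiter: consume() blocks until enough byte
    budget has accumulated, capping the average transfer rate."""

    def __init__(self, rate):          # rate in bytes per second
        self.rate = float(rate)
        self.allowance = float(rate)   # allow a one-second initial burst
        self.last = time.monotonic()

    def consume(self, nbytes):
        assert nbytes <= self.rate, "send chunks no larger than one second of rate"
        while True:
            now = time.monotonic()
            self.allowance = min(self.rate,
                                 self.allowance + (now - self.last) * self.rate)
            self.last = now
            if self.allowance >= nbytes:
                self.allowance -= nbytes
                return
            time.sleep((nbytes - self.allowance) / self.rate)

# usage: throttle = Throttle(512 * 1024)  # ~512 KB/s cap
#        throttle.consume(len(chunk)) before writing each chunk to the socket
```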
I would bet this is the idea you had in mind, but it seems like a reasonable approach, as long as you're really good with application development and able to create a solid system that handles most issues.
When a user edits a document in Microsoft Word, for instance, the file will change on the share and may be copied to X: even though the user is still working on it. Within Windows there is an API to check whether a file handle is still open; if it is, you can watch for the user actually closing the document, so that all their edits are complete, before migrating it to drive X:.
That said, if the user's PC crashes while the document is open, the file handle may not be released until the document is opened again at a later date, which can cause issues.
For anyone in a similar situation (I'm assuming the person who asked the question implemented a solution long ago), I would suggest an implementation of rsync.
rsync.net's Windows Backup Agent does what is described in method 1, and can be run as a service as well (see "Advanced Usage"). Though I'm not entirely sure if it has built-in bandwidth limiting...
Another (probably better) solution that does have bandwidth limiting is Duplicati. It also properly backs up currently-open or locked files. Uses SharpRSync, a managed rsync implementation, for its backend. Open source too, which is always a plus!