I've searched far and wide, but as far as I can see nothing is available.
TLDR: How can I use rsync with a SharePoint installation? (Or something like rsync)
Long description
We have a large install base of Macs (~50%), Windows (~40%), and Linux (~10%), so our environment is pretty heterogeneous. Our work is experimental, so we produce a considerable number of datasets that we need to share and, more importantly, back up.
Right now we use external hard drives to store these files and folders, since our computers cannot hold this amount of data (50GB++ per dataset, for instance). And when we need to share, we share "physically". We mainly use rsync with some kind of backend (what kind is not important), but this solution requires computers to be left turned on to act as servers.
For reasons that I will not bother you with, we cannot leave a computer on after work.
OneDrive for Business seemed a very promising technology to use, since we have more than 1TB per user. We could start syncing our datasets from our computers and hard drives, and we could share even when computers are turned off.
We are aware that we may hit some drawbacks, such as not being able to actually share, or limits on the number of objects (files/directories), but we will handle them later.
I prefer rsync, but right now we're open to any solution.
OneDrive for Business has a download that will allow you to synchronize a directory locally. https://onedrive.live.com/about/en-us/download/
For a Linux platform, you should be able to use onedrive-d found here:
https://github.com/xybu/onedrive-d
I know that it's an old question, but it's unanswered. Maybe a solution could be https://rclone.org/. Rclone is a command line program to sync files and directories to and from the cloud.
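As a rough illustration of how rclone could stand in for rsync here, the sketch below only builds the argument list for an `rclone sync` call. The remote name `onedrive` and both paths are assumptions (the remote name is whatever was chosen during `rclone config`). Note that `rclone sync` makes the destination match the source, so it can delete files on the destination; the `--dry-run` flag shows what would happen without changing anything.

```python
import shlex


def build_rclone_sync(source_dir, remote, remote_path, dry_run=True):
    """Return the argument list for an `rclone sync` call.

    `remote` is the name given to the remote in `rclone config`;
    the value used below ("onedrive") is an assumption for illustration.
    """
    cmd = ["rclone", "sync", source_dir, f"{remote}:{remote_path}"]
    if dry_run:
        # --dry-run reports what would be transferred or deleted, changing nothing.
        cmd.append("--dry-run")
    return cmd


# Hypothetical local dataset folder and remote folder.
cmd = build_rclone_sync("/data/dataset-01", "onedrive", "datasets/dataset-01")
print(shlex.join(cmd))
```

Once the dry run looks right, the same list (built with `dry_run=False`) could be handed to `subprocess.run(cmd, check=True)`.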
Preamble:
Recently I came across an interesting story about people who seem to be sending emails with documents that contain child pornography. This is an example (this one is a JPEG, but I'm hearing about it being done with PDFs, which generally can't be previewed):
https://www.youtube.com/watch?v=zislzpkpvZc
This can pose a real threat to people in investigative journalism, because even if you delete the file after it has been opened from Temp, the file may still be recovered by forensics software. Even just having opened the file already puts you in the realm of committing a felony.
This can also pose a real problem for a group's security consultants. Let's say person A emails criminal files, and person B is suspicious of the email and forwards it to the security manager for their program. In order to analyze the file, the consultant may have to download it to a hard drive, even if they load it in a VM or sandbox. Even if they figure out what it is, they are still in a legal minefield where bad timing could land them in jail for 20 years. Thinking about this: if the data were only ever to enter RAM, then upon a power-down all traces of the opened file would disappear.
Question: I have an OK understanding of how computer architecture works, but the problem presented above made me start wondering. Is there a limitation, at the OS, hardware, or firmware level, that prevents a program from opening a stream of downloading information directly into RAM? If not, let's say you try to open a PDF: is it possible for the file to instead be passed to the program as a stream of downloading bytes, in a way that makes retention of the final file on the HDD impossible?
Unfortunately I can only give a Linux/Unix based answer to this, but hopefully it is helpful and extends to Windows too.
There are many ways to pass data between programs without writing to the hard disk; it is usually more a question of whether the applications support it (the web browser and PDF reader in your example). Streams can be passed via pipes and sockets, but the problem is that the receiving program may find it more convenient to seek back in the data at certain points rather than hold all of it in memory, which can also be a more efficient use of resources. Hence many programs expect a real file instead of a stream. A pipe can be made to look like a file, but if the application tries to seek backward, it will get an error.
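To make the seeking problem concrete, here is a minimal Python sketch: data arriving through a pipe is not seekable, but a program that is willing to hold everything in memory can copy it into an in-memory buffer and seek there. The function name and the sample bytes are made up for illustration. Note that memory pages may still be written to swap by the OS, which this does not address.

```python
import io
import os


def read_stream_into_memory(stream, chunk_size=64 * 1024):
    """Copy a non-seekable stream into an in-memory, seekable buffer."""
    buffer = io.BytesIO()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty read means the writer closed the stream
            break
        buffer.write(chunk)
    buffer.seek(0)             # rewind so the consumer starts at the beginning
    return buffer


# Simulate a download arriving through a pipe (small enough for the pipe buffer).
read_fd, write_fd = os.pipe()
with os.fdopen(write_fd, "wb") as writer:
    writer.write(b"%PDF-1.4 example bytes")

with os.fdopen(read_fd, "rb") as reader:
    pipe_seekable = reader.seekable()          # a pipe cannot seek
    in_memory = read_stream_into_memory(reader)

print(pipe_seekable, in_memory.seekable())
```

The receiving application would have to be written this way itself; nothing here forces an existing PDF reader to avoid temporary files.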
If there were more demand for streaming data to applications, it would probably be supported in more cases, as there are no major barriers. Currently it is more common just to store PDFs in a temporary file if they are viewed in a plugin and not downloaded. Video can be different though.
An alternative is to use a RAM drive. It is common for a Linux system to have at least one set up by default (tmpfs), although on Windows it seems you have to install additional software. Using one of these removes the above limitations, and it is fairly easy to set a web browser to use it for temporary files.
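A small sketch of the RAM-drive idea from a program's point of view, assuming a Linux system where `/dev/shm` (or `/run/shm`) is a tmpfs mount; those paths and the helper names are assumptions, and on a system without them the code falls back to the ordinary temp directory, which is usually on disk.

```python
import os
import tempfile


def ram_backed_dir(candidates=("/dev/shm", "/run/shm")):
    """Return the first writable candidate directory, or None if there is none."""
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return None


def write_temp_copy(data, directory=None):
    """Write data to a named temporary file and return its path.

    With directory=None, tempfile uses the system default (normally on disk).
    """
    with tempfile.NamedTemporaryFile(dir=directory, suffix=".pdf", delete=False) as tmp:
        tmp.write(data)
        return tmp.name


target = ram_backed_dir()
path = write_temp_copy(b"example bytes", directory=target)
print(path)
os.remove(path)   # clean up the demonstration file
```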
So, I need to set up file storage for our team. I also have an SVN server. The ability to do rollbacks and to see who created or deleted a file is essential for our project.
Any ideas? Maybe without SVN. I can connect using WebDAV but only in read-only mode (because there is no LOCKS support in it).
You can set up the SVN server to allow exactly that.
Read the chapter in the SVN book about WebDAV and Autoversioning
So, what you want is the ability to roll back changes, and limit who can make the changes, but without the bother of checking in and out files?
Maybe Subversion isn't for you. I've done similar sharing with Dropbox, and there's now Box.net, which is supposed to be like Dropbox on steroids. Dropbox (and I assume Box.net too) has some features that are very nice:
You can setup folder sharing between particular teams. That way, you can say who can and cannot access these files.
Dropbox automatically saves each and every version of a file, so you can always go back to previous versions -- even if that file has been deleted.
Files are stored locally. All a user has to know is to save a particular file in a particular folder, and everyone has access to it. I've successfully used Dropbox to collaborate with managers who make the Pointy-Haired Boss in Dilbert look like a high-tech genius.
There's also Skydrive and Google Drive, but I don't find them as universal as Dropbox or as easy to use. It's possible to use Dropbox without ever going to the Dropbox website. To the non-geek, it appears to be magic as files I've written and edited appear on their drive. It took me a few weeks to train one person that he didn't have to email me his document when he made changes because I already had it.
Dropbox gives you 2 GB of space for free, which doesn't sound like a lot. However, my first hard drive was a whopping 20 MB, which was twice the size of the standard 10 MB drive at that time. If you're not storing a lot of multimedia presentations or doing a lot of Photoshop, 2 GB might be more than enough for your project.
I know Windows 7 and later has some sort of versioning system built into it. I know this because any time someone mentions that Mac OS X has Time Machine, some Wingeek pipes in stating that Windows has the same thing, only better! Unfortunately, Windows is not my forte, so I don't know too much about this specific feature. I believe the default is once per day, but it can be changed. This might be the perfect solution if everyone is on Windows.
Subversion can do autoversioning as Stefan stated. Considering his position in the Subversion community (especially his work on TortoiseSVN), he knows his stuff. Unfortunately I don't know too much about it since I've never used or seen this feature implemented. It's probably due to the fact that I work mainly with developers who know what a version control system is, and therefore have no need for something that does the versioning for them.
Also, don't forget to check whether you can use your corporate SharePoint, which does something very much like what you want. I am not too impressed with SharePoint, but if the facility is there, and your company can give you the support, it is something you probably want to look into.
Pretty much we've all done an installer here and there - and all of us did an installation of some behemoth of a program. Why do some installations take so much time? Case in point: Adobe CS suite (with newer versions you can take a vacation) or Visual Studio.
I know there are files to copy - most of the time unpack even. There are some registry keys to set (if under Windows), maybe a service or couple to start. Some installations probably even check hardware/software combination. All of this does not justify sllloooow installation time in some of the programs.
How can I speed it up?
It obviously depends on what you're installing. As Colin Pickard pointed out, you'll be shifting huge quantities of data onto the disk (plus an optional virus check, etc.).
For installations I've built recently, we have to request the shut down of some Windows services, wait for that, and check that they really have shut down before continuing. That takes time.
I confess that in the above, that's not parallelised, whereas it could be. I suspect that installations are not necessarily optimised. They may well be the last thing that the team put together prior to release, and they may well figure that you're only going to do it once (and forget the pain upon completion). Obviously not an ideal state of affairs!
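As a sketch of what parallelising the "request stop, wait, verify" step might look like, the Python below stops several services concurrently instead of one after another. The `stop` and `is_stopped` callables are placeholders for whatever service-control calls a real installer would use; the fake versions here simply pretend each service needs 0.2 seconds to shut down.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def stop_and_wait(service, stop, is_stopped, poll=0.05, timeout=5.0):
    """Request a stop, then poll until the service reports that it has stopped."""
    stop(service)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_stopped(service):
            return service, True
        time.sleep(poll)
    return service, False          # timed out; caller decides what to do


def stop_all(services, stop, is_stopped, **kwargs):
    """Stop every service concurrently and return {name: stopped?}."""
    with ThreadPoolExecutor(max_workers=len(services) or 1) as pool:
        results = pool.map(
            lambda s: stop_and_wait(s, stop, is_stopped, **kwargs), services
        )
        return dict(results)


# Stand-ins for real service control (hypothetical names and timings).
stop_times = {}


def fake_stop(name):
    stop_times[name] = time.monotonic() + 0.2   # "stopped" 0.2 s after the request


def fake_is_stopped(name):
    return time.monotonic() >= stop_times[name]


started = time.monotonic()
outcome = stop_all(["svc-a", "svc-b", "svc-c"], fake_stop, fake_is_stopped)
elapsed = time.monotonic() - started
print(outcome, round(elapsed, 2))
```

With three services the waits overlap, so the total is close to the slowest single shutdown rather than the sum of all three.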
Visual Studio on my machine is 3.03GB - 16,842 files in 1,979 folders. Passing 3GB through virus scan and auditing software and onto the filesystem is too much for my (dualcore,2GB,sata2) system - it's CPU or IO bound the whole way through the process. That's why it takes so long.
Most installers not only pack, but also compress their contents, so at installation time all of these files must be decompressed. All of the data that is decompressed must be written to disk after it is decompressed as well.
Look at the time a zip operation takes on several files. It's also slow.
Many installers maintain a log that is flushed to disk after each primitive operation, so that even if the installation hits a fatal failure the log is preserved and can be sent to the software vendor. Such flushing adds up and contributes significantly to the overall time.
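For illustration only, this is roughly what "flush after each primitive operation" means in code. The class name, step names, and log location are invented; the point is that every entry costs a `flush()` plus an `os.fsync()`, and a synchronous write to the device is slow compared with a buffered one.

```python
import os
import tempfile


class InstallLog:
    """Append-only log that is forced to disk after every entry."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def record(self, message):
        self._file.write(message + "\n")
        self._file.flush()                # push Python's buffer to the OS
        os.fsync(self._file.fileno())     # ask the OS to write it to the device

    def close(self):
        self._file.close()


# Hypothetical installation steps, logged one by one.
log_path = os.path.join(tempfile.mkdtemp(), "install.log")
log = InstallLog(log_path)
for step in ("copy files", "write registry keys", "start service"):
    log.record(f"done: {step}")
log.close()
```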
Back in the old days, Help was not trivial but possible: generate some funky .rtf file with special tags, run it through a compiler, and you got a WinHelp file (.hlp) that actually works really well.
Then, Microsoft decided that WinHelp was not hip and cool anymore and switched to CHM, up to the point they actually axed WinHelp from Vista.
Now, CHM may be nice, but everyone who has tried to open a .chm file on the network will know the nice "Navigation to the webpage was canceled" screen that is caused by security restrictions.
While there are ways to make CHM work off the network, this is hardly a good choice, because when a user presses the Help button he wants help, not to have to change some funky settings.
Bottom line: I find CHM absolutely unusable. But with WinHelp not being an option anymore either, I wonder what the alternatives are, especially when it comes to integrating with my application (i.e. for WinHelp and CHM there are functions that allow you to jump directly to a topic)?
PDF has the disadvantage of requiring the Adobe Reader (or one of the more lightweight ones that not many people use). I could live with that seeing as this is kind of standard nowadays, but can you tell it reliably to jump to a given page/anchor?
HTML files seem to be the best choice, you then just have to deal with different browsers (CSS and stuff).
Edit: I am looking to create my own help files. As I am a fan of the "No Setup, Just Extract and Run" philosophy, I have had that problem many times in the past, because many of my users will run the application off the network, which causes exactly this problem.
So I am looking for a more robust and future-proof way to provide help to my users without having to code a different help system for each application I make.
CHM is a really nice format, but that Security Stuff makes it unusable, as a Help system is supposed to provide help to the user, not to generate even more problems.
HTML would be the next best choice, ONLY IF you would serve them from a public web server. If you tried to bundle it with your app, all the files (and images (and stylesheets (and ...) ) ) would make CHM look like a gift from gods.
That said, when actually bundled in the installation package, (instead of being served over the network), I found the CHM files to work nicely.
OTOH, another pitfall about CHM files: Even if you try to open a CHM file on a local disk, you may bump into the security block if you initially downloaded it from somewhere, because the file could be marked as "came from external source" when it was obtained.
I don't like the HTML option, and actually moved from plain HTML to CHM by compressing and indexing the files. We even use them for a handful of non-Windows customers.
It simply solved the constant little breakage of people putting it on the network (nesting depth limited, strange locking effects), antivirus that died in directories with 30000 html files, and 20 minutes decompression time while installing on an older system, browser safety zones and features, miscalculations of needed space in the installer etc.
And that doesn't even include the people who start "correcting" them, third-party products with faulty "integration" attempts, complaints about slowness (browser start-up), etc.
We had all waited years for the problems to go away as OSes and hardware improved, but the problems kept recurring in a bewildering number of varieties, and enough was enough. We found chmlib and switched, having decided that we could always fall back on something based on it, with a simple external reader, if the OS-provided ones stopped working.
Meanwhile we also have our own compiler, so we are free of MS and future-proof. That doesn't mean we will never change (solutions with local web servers seem to be the favourite nowadays), but at least we have a choice.
Our software is both distributed locally to the clients and served from a network share. We opted for generating both a CHM file and a set of HTML files for serving from the network. Users starting the program locally use the CHM file, and users getting the program served from a network share have to use the HTML files.
We use Help and Manual and can thus easily produce both types of output from the same source project. The HTML files also contain search capabilities and don't require a web server, so though it isn't an optimal solution, it works fine.
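A minimal sketch of how an application might pick between the two outputs at start-up, assuming it knows its own installation directory. The file names `manual.chm` and `index.html` and the `help` subfolder are assumptions, and this check only recognises UNC paths; a share mapped to a drive letter would look local and would need a separate check.

```python
from pathlib import PureWindowsPath


def is_unc_path(app_dir):
    """True if app_dir is a UNC path (server and share, two leading backslashes)."""
    return PureWindowsPath(app_dir).drive.startswith("\\\\")


def help_entry_point(app_dir):
    """Pick the HTML help when running from a share, the CHM file otherwise."""
    base = PureWindowsPath(app_dir) / "help"        # hypothetical layout
    if is_unc_path(app_dir):
        return str(base / "index.html")
    return str(base / "manual.chm")


print(help_entry_point(r"\\fileserver\apps\tool"))
print(help_entry_point(r"C:\Program Files\tool"))
```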
So far all the single-file types for Windows seem broken in one way or another:
WinHelp - obsoleted
HtmlHelp (CHM) - obsoleted on Vista, doesn't work from a network share, other than that works really nicely
Microsoft Help 2 (HXS) - this seems to work right up until the point when it doesn't, corrupted indexes or similar, this is used by Visual Studio 2005 and above, as an example
If you don't want to use an installer and you don't want the user to perform any extra steps to allow CHM files over the network, why not fall back to WinHelp? Vista does not include WinHlp32.exe out of the box, but it is freely available as a download for both Vista and Server 2008.
It depends on how important the online documentation is to your product. A good documentation infrastructure can be complex to establish, but once done it pays off. Here is how we do it:
Help source DITA compliant XML, stored in SCC (ClearCase).
Help editing XMetal
Help compilation, customized Open DITA Toolkit, with custom Perl/Java preprocessing
Help source cross references applications resources at compile time, .RC files etc
Help deliverables from single source, PDF, CHM, Eclipse Help, HTML.
Single source repository produces help for multiple products 10+ with thousands of shared topics.
From what you describe I would look at Eclipse Help. It's not simple to integrate into .NET or MFC applications: you basically have to do the help mapping to resolve the request to a URL, then fire the URL at an Eclipse Help wrapper or a browser.
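The "help mapping" step could be sketched like this: a table from the application's help context IDs to topic pages, resolved into a URL that is then handed to a browser. The base URL, context IDs, and page names are all hypothetical and not taken from Eclipse Help itself.

```python
from urllib.parse import quote

# Hypothetical location of the served help and hypothetical context-ID table.
HELP_BASE_URL = "http://localhost:8080/help/"
TOPIC_MAP = {
    1001: "getting-started.html",
    1002: "import-data.html#csv",
}


def resolve_help_url(context_id, base_url=HELP_BASE_URL, topics=TOPIC_MAP):
    """Map a help context ID to a topic URL, falling back to the index page."""
    page = topics.get(context_id, "index.html")
    # Keep "/" and "#" intact so sub-folders and anchors survive quoting.
    return base_url + quote(page, safe="/#")


print(resolve_help_url(1002))
print(resolve_help_url(9999))
```

The application's F1 handler would then call something like `webbrowser.open(resolve_help_url(context_id))`.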
Is the question how to generate your own help files, or what is the best help file format?
Personally, I find CHM to be excellent. One of the first things I do when setting up a machine is to download the PHP Manual in CHM format (http://www.php.net/download-docs.php) and add a hotkey to it in Crimson Editor. So when I press F1 it loads the CHM and performs a search for the word my cursor is on (great for quick function reference).
If you are doing "just extract and run", you are going to run into security issues. This is especially true if your users are running Vista (or later). Is there a reason why you want to avoid packaging your applications inside an installer? Using an installer would alleviate the "external source" problem. You would be able to use .chm files without any problems.
We use InstallAware to create our install packages. It's not cheap, but is very good. If cost is your concern, WIX is open source and pretty robust. WIX does have a learning curve, but it's easy to work with.
PDF has the disadvantage of requiring the Adobe Reader
I use Foxit Reader on Windows at home and at work. A lot smaller and very quick to open. Very handy when you are wondering what exactly a80000326.pdf is and why it is clogging up your documents folder.
I think the solution we're going to end up going with for our application is hosting the help files ourselves. This gives us immediate access to the files and the ability to keep them up to date.
What I plan is to have the content loaded into a huge series of XML files, each one containing help for a specific item. This XML would contain links to other XML files. We would use XSLT to display the contents as necessary.
Depending on the licensing, we may build a client-specific XSLT file in order to tailor the look and feel to what they need. We may need to be able to only show help for particular versions of our product as well and that can be done by filtering out stuff in the XSLT.
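To give a feel for the version filtering, here is a small sketch that does the filtering step with Python's `xml.etree.ElementTree` rather than XSLT (the Python standard library has no XSLT processor, so this is a stand-in for the same idea). The element names and the `min-version` attribute are an invented schema, not the one described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical help file for one item, with a link to another help file.
HELP_XML = """
<help item="export-dialog">
  <section min-version="1.0"><title>Exporting</title></section>
  <section min-version="2.0"><title>Exporting to PDF</title></section>
  <link target="import-dialog.xml"/>
</help>
"""


def parse_version(text):
    """Turn "2.0" into (2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def sections_for_version(xml_text, product_version):
    """Return the titles of sections that apply to the given product version."""
    root = ET.fromstring(xml_text)
    wanted = parse_version(product_version)
    return [
        section.findtext("title")
        for section in root.findall("section")
        if parse_version(section.get("min-version", "0")) <= wanted
    ]


print(sections_for_version(HELP_XML, "1.5"))
```

In an XSLT stylesheet the same condition would be a predicate on the section template, with the product version passed in as a stylesheet parameter.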
I use a commercial package called AuthorIT that can generate a number of different formats, such as chm, html, pdf, word, windows help, xml, xhtml, and some others I have never heard of (does dita ring a bell?).
It is a content management system oriented towards the needs of technical documentation writers.
The advantage is that you can use and re-use the same content to build a set of guides, and then generate them in different formats.
So the bottom line relative to the question of choosing chm or html or whatever is that if you are using this you are not locked into a given format, but you can provide several among which the user can choose, and you can even add more formats as you go along, at no extra cost.
If you just have one guide to create it won't be worth your while, but if you have a documentation set to manage then it is the best to my knowledge. Their support is very helpful also.