I hope this is not a stupid question. I simply want to duplicate a file from Isolated Storage to be used as a backup. However, speed is really important in this case, and I wondered what the fastest way to do that is. Should I open the file from the IS, read it into a stream, then create a backup file and write to it? From what I've seen so far this takes at least half a second, which is a lot.
There's no API for copy/duplicate, so yes, your approach is the best way.
If you want to avoid the half-second delay then you'll need to address it through your application design - e.g. writing new data to a new file, or perhaps using smaller files.
If you're interested in the details of IsolatedStorage performance, then this blog has done a superb analysis:
http://appangles.com/blogs/mickn/wp7/?p=6
Hello and thanks for checking out my question,
I am working on a project that analyses film and visualizes the data I get from it. I'm quite new to programming and only have some basic experience in Java and JavaScript.
For my project I want to store the dB levels of a movie in a CSV file, to later work with the data in Processing. I couldn't find anything for Mac (OS X) that wasn't too complex for me to comprehend.
Help would be much appreciated!
Thank you.
You're going to have to break your problem down into smaller steps.
Step 1: Generating the CSV file.
There are probably a million different ways to do this, and that can be pretty confusing. But break this down into smaller sub-steps and then take those steps one at a time. Can you get a movie playing in Processing? There is a Video library that does just that. Then can you get the volume level every X seconds? You might start with a separate sketch that just prints something to the console every X seconds. For getting the volume, you might try out the Minim library. If that doesn't work, Google is your friend, and remember to keep breaking your problem down into smaller steps!
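To make that concrete, here is a rough sketch of what a volume-logging sketch might look like, assuming you've extracted the film's audio track to a separate file that Minim can load; the file name, the "time"/"level" column names, and the sample interval are all placeholders to adjust:

    import ddf.minim.*;

    Minim minim;
    AudioPlayer player;
    PrintWriter output;
    int lastSample = 0;
    int interval = 500;  // sample every 500 ms, adjust to taste

    void setup() {
      minim = new Minim(this);
      // "film_audio.mp3" is a placeholder for the film's audio track, extracted beforehand
      player = minim.loadFile("film_audio.mp3");
      player.play();
      output = createWriter("levels.csv");
      output.println("time,level");
    }

    void draw() {
      if (millis() - lastSample >= interval) {
        // rough conversion of the 0..1 RMS level to dB (+0.0001 avoids log(0))
        float db = 20 * log(player.mix.level() + 0.0001) / log(10);
        output.println(millis() + "," + db);
        lastSample = millis();
      }
      if (!player.isPlaying()) {
        output.flush();
        output.close();
        exit();
      }
    }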
Step 2: Loading the CSV file.
Now that you have the CSV file, you have to load it into Processing. There are several functions in the reference that might come in handy. Again, start with an example program that just prints the values to the console. Get that working perfectly before moving on.
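For example, assuming the CSV from step 1 has a header row with "time" and "level" columns (placeholders, use whatever names you actually wrote), a sketch that just prints the values might look like this:

    void setup() {
      // "header" tells Processing that the first row contains column names
      Table table = loadTable("levels.csv", "header");
      for (TableRow row : table.rows()) {
        float time = row.getFloat("time");
        float level = row.getFloat("level");
        println(time + ": " + level);
      }
    }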
Step 3: Visualizing the data.
Now that you have the data in your Processing code, you can start thinking about how you want to visualize it. Maybe start with a simple line chart that shows the volume over time.
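A rough sketch of that, again assuming a "level" column from the earlier steps and guessing a range of about -60 dB to 0 dB for the y-axis (the x-axis here is just the row index):

    Table table;

    void setup() {
      size(800, 400);
      table = loadTable("levels.csv", "header");
    }

    void draw() {
      background(255);
      stroke(0);
      noFill();
      beginShape();
      for (int i = 0; i < table.getRowCount(); i++) {
        float x = map(i, 0, table.getRowCount() - 1, 0, width);
        // assumes levels roughly between -60 dB and 0 dB; adjust to your data
        float y = map(table.getFloat(i, "level"), -60, 0, height, 0);
        vertex(x, y);
      }
      endShape();
    }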
If you get stuck on a specific step, then try to break it down into smaller sub-steps. Create an example program that just tests one of those smaller sub-steps (also known as an MCVE), and you'll be able to ask a more specific code-oriented question. Good luck, sounds like an interesting project!
This may be a question for Survey Monkey, but I felt that someone here may have encountered something like this before. Is there a way to work with the Survey Monkey (SM) API to add the information from a survey straight into a database of my own? I realize that I can export the information to output files, but I was wondering if there is a way to access the information in the SM database directly. I feel like this might raise some privacy concerns for SM. Has anyone attempted this, or would my best option be to create my own surveys without a third-party website?
I had a similar issue and here's my solution.
I was doing health-related surveys which contain HIPAA-protected Personal Health Info. Zapier is NOT HIPAA safe, so the "zap the results over to Google Drive" solution didn't work.
So I wanted a quick n dirty way to grab SM survey data and begin to design a data structure to analyse and store this data. I figured that I would start with <1000 results, sort it out, then build out a bigger/fancier structure as needed.
I just downloaded CSV's of the SM individual responses, munged the downloaded CSV files to make a Python CSV reader happy, then wrote a Python 3.5 script to grab the survey data and spit it out into a couple of output CSV files designed for different analytic purposes.
It was really quick and easy to alter the Python script to deliver different subsets of data to different output files, and really quick and easy to see if these output (CSV or XLS) files really told me what I wanted to know.
This is a really quick and easy way to start analysing right away without spending too much time on procedural overhead. You can alter CSV (or XLS) tables really quickly and easily, so you can mix and match data / derivative data as much as you want. A wise person once told me "don't think, do." So the more you analyse on small runs of data, the better your final Big Buildout In The Sky will look.
Yah, you can spend a lot of time writing an API integration and setting up a database, but if you are not completely sure what you want out of the SM data, start small. Hope this helps.
My dearest stackoverflowers,
I want to access the serialized data contained in files with extensions that are strange to me. The bulk of the data seems to be in a .st and an .idt file.
The program is meant to be run on Windows, and the unix file command gives me only false positives. Any ideas on either what these extensions mean or on how to investigate and extract their contents?
Below I provide the full list of extensions, in the hope that somebody recognizes them. Googling also gives me false positives. For example, .st is commonly used for Atari emulation files.
Thanks in advance!
.cix
.cmp
.cnt
.dam
.das
.drf
.idt
.irc
.lxp
.mp
.mbr
.str
.vlf
.rpf
.st
Some general advice on how to approach this:
One way to approach this is to use a site like http://filext.com/ to try to figure out where the files came from. This can be tough, because it's not like there's a file extension standard anywhere - anyone can use any extension, so you're going to have a lot of conflicts/disambiguation issues to solve.
Sometimes you can get lucky: if you open the files in a plain text editor, you can occasionally see readable string data, which can help identify the general sort of data contained in a file and therefore narrow down the possible sources. For example, I have often used this technique to help people who received a file as an email attachment with no extension figure out what file type it was, add the extension, and then open it in the appropriate program.
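If you'd rather poke around programmatically, here is a small sketch (plain Java, nothing specific to these formats) that dumps the first bytes of a file and prints runs of printable ASCII, much like the Unix strings tool:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class FilePeek {
        public static void main(String[] args) throws IOException {
            byte[] data = Files.readAllBytes(Paths.get(args[0]));

            // Hex dump of the first bytes; many formats start with a recognizable "magic number".
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < Math.min(data.length, 32); i++) {
                hex.append(String.format("%02X ", data[i] & 0xFF));
            }
            System.out.println("first bytes: " + hex);

            // Print runs of printable ASCII, similar to the Unix strings tool.
            StringBuilder run = new StringBuilder();
            for (byte b : data) {
                if (b >= 32 && b < 127) {
                    run.append((char) b);
                } else {
                    if (run.length() >= 4) System.out.println(run);
                    run.setLength(0);
                }
            }
            if (run.length() >= 4) System.out.println(run);
        }
    }

If the first bytes match a well-known magic number (for example "PK" for zip-based formats), that alone narrows things down a lot.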
There are also sites like http://www.oldversion.com/ that keep old versions of programs that you (typically) can download for free. This is especially helpful if the data you're working with was created 5+ years ago in a proprietary program that is no longer available or purchasable from the vendor who created it.
Once you have a good idea of which files belong to which programs, you're probably going to spend a lot of time looking for online resources describing their structure. If nothing is available, or you can get a copy of the original program but it won't open the files you're interested in, or you still want raw access to the data, then try generating some sample output files from data that you input, and go Rosetta Stone on it, comparing your known file to the original file.
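A minimal sketch of that byte-level comparison, again in plain Java; it just reports the first offsets where two files differ:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class FileDiff {
        public static void main(String[] args) throws IOException {
            byte[] a = Files.readAllBytes(Paths.get(args[0]));
            byte[] b = Files.readAllBytes(Paths.get(args[1]));
            int shown = 0;
            for (int i = 0; i < Math.min(a.length, b.length) && shown < 20; i++) {
                if (a[i] != b[i]) {
                    // report the first few offsets where the two files differ
                    System.out.printf("offset %d: %02X != %02X%n", i, a[i] & 0xFF, b[i] & 0xFF);
                    shown++;
                }
            }
            if (a.length != b.length) {
                System.out.println("lengths differ: " + a.length + " vs " + b.length);
            }
        }
    }

Change one known field in the program, save again, and the differing offsets tell you roughly where that field lives in the file.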
From there, the additional knowledge you'll probably want is what language/compiler the software was written in, which can give you a lead on which code libraries were used to serialize the data in the first place. Once you know all that, it's a matter of reading through any available documentation on the serialization process and then writing a deserializer.
The one thing this technique won't solve: if you're dealing with corrupt or truncated data files, it may be very difficult to tell the difference between a damaged file and a misunderstanding of the file structure. The "Rosetta Stone" technique might be helpful in that case.
Depending on how many different pieces of source software you're talking about, this sounds like a pretty big project. Good luck!
We want to insert something into the middle of a binary file, but we want to do as little I/O as possible. We don't want to read the second part of the file and write it back out. Is there any way to do this? We know that, in theory, files are stored as blocks in the file system. Could we just break the chain of blocks and insert a new one between them?
The same idea could also be used to join two files: attach one after the other. Is there any fast way to do this?
The problem comes from writing a large file. We want to write it as many small files and then join them together.
I'd say don't go there...
It would involve low-level, file-system specific coding that could possibly mess up the file system.
If this is really, really important for your business, perhaps you could 'emulate' blocks inside a file. This would, however, result in files whose intended structure cannot be resolved by other 'parties' (other software using the file).
You most certainly don't want to go messing with the file system directly. Let the OS worry about that. If you have to ask this question, you probably don't have the requisite knowledge to manipulate things at that level with greater efficiency than the OS does.
Rather than adding data in the middle of a huge file, consider a different design that allows you to append data to the end of the file (or to add it in some other, more efficient way). Also, if you're dealing with a huge amount of data, maybe a database would be more efficient than a gargantuan file.
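For the join-two-files part specifically, appending one file after another is cheap, sequential I/O, and it avoids rewriting the first file. A minimal sketch in Java (the file paths are just command-line placeholders):

    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class JoinFiles {
        public static void main(String[] args) throws IOException {
            // args[0] is the target file; every remaining argument is appended to it in order
            try (OutputStream out = Files.newOutputStream(Paths.get(args[0]),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
                for (int i = 1; i < args.length; i++) {
                    Files.copy(Paths.get(args[i]), out);  // streams each part onto the end
                }
            }
        }
    }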
I would be interested in hearing more about what you're trying to do. Perhaps there's an easier way.
I'm currently building a CMS that needs to save a lot of pictures per article. I have a lot of questions :-)
I need to show the pictures in a few sizes, with or without a watermark. In addition I need to keep the original picture too, for archive and admin purposes. What I'm thinking of doing right now is saving the pictures in the database in two versions: 1. the original picture, 2. a web-optimized version.
It is a really convenient way to save all the images in a table, but is it really a good idea? Let's say the database will contain hundreds of thousands of pictures and the original pictures are around 3 MB each; the database could then easily reach several hundred gigabytes. Is this really a good strategy?
On the other hand, I save a smaller version of each picture. This version needs to be shown in a few sizes, with and without the watermark. Currently I'm thinking of doing this on each request: the request will have a width parameter, and based on that I can decide the size and whether to apply the watermark (I'll cache this work, of course). Again, is this a good strategy? Will it really work, or is it very expensive extra work?
Is it really better to save this in the DB? I mean, each request for an article will need around 50 further requests for its images, and each of those requires opening and closing a connection to the database.
Technologies I'm going to use: .NET, SQL Server 2008, NHibernate.
The best approach would be to store the images in the filesystem and just the IDs (or paths) in the database, for performance and maintenance reasons. Backing up and restoring is much easier on the filesystem, and pushing the DBMS to do that work is not the best idea: you would need to transfer the images from the database to the application and then push them to the client, and I just don't believe that's its job. Put a lighttpd daemon or something similar in front for image hosting and let it do its job.
But if you like the idea, since you are going with SQL Server 2008, you can use FILESTREAM to store your images in your tables. It creates files in a storage location that you choose and stores the binary data in the filesystem while still providing transactional features and data integrity, which is a big bonus. Take a look at that option. As I recall, it performs well and keeps the actual database much more compact.
About dynamic resizing, I'd say avoid it. Storage is cheaper than CPU time, so create the various thumbnails and watermarked versions at upload time, store them once somewhere, and reuse them when required. Do not perform the same operations again and again. Alternatively, you can generate each size on the first request for it; that way it is easier to add new sizes later or to purge the cache periodically and remove unused files. You will also be able to back up just the original versions.
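Just to illustrate the generate-once-at-upload idea, here is a rough sketch (shown in Java purely for illustration, since the question's stack is .NET; the target widths and output names are made up):

    import java.awt.Graphics2D;
    import java.awt.Image;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import javax.imageio.ImageIO;

    public class MakeThumbnails {
        // hypothetical target widths; use whatever sizes your CMS templates need
        static final int[] WIDTHS = {200, 640, 1024};

        public static void main(String[] args) throws IOException {
            BufferedImage original = ImageIO.read(new File(args[0]));
            for (int w : WIDTHS) {
                int h = original.getHeight() * w / original.getWidth();
                BufferedImage scaled = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
                Graphics2D g = scaled.createGraphics();
                g.drawImage(original.getScaledInstance(w, h, Image.SCALE_SMOOTH), 0, 0, null);
                g.dispose();
                // store next to the original; a real system would also write the watermarked variants
                ImageIO.write(scaled, "jpg", new File("thumb_" + w + ".jpg"));
            }
        }
    }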
Putting the images in the database has a couple of advantages: ACID transactions and backup consistency come to mind. If you absolutely need those, then put the images in the database. As you pointed out, this comes with a price: you'll need a huge database infrastructure (machines, licenses, an operations team), and each image retrieval is a huge DB I/O effort.
A lot of things will be much easier if you store only metadata in the DB and put the image blobs on a filesystem.
Two approaches to come to a decision:
What is the killer feature you absolutely (absolutely as in "if I don't have that, the whole thing will not work at all") need from the image-in-database approach? If there is one, go for it.
Do a back-of-the-napkin business case, calculating the total cost of the image-in-database approach (project efforts, infrastructure, machine, license, operation) and compare that with an image-in-filesystem approach. That should give some hints on how to proceed.