How to avoid occasional corrupted downloads

My website hosts an MSI file that users need to download. There is nothing special about the file: it lives in a directory on the web server, with a regular HREF pointing to it that users click on. Occasionally a user will complain that they can't open the MSI file because Windows Installer claims the file is corrupt. Redownloading the file doesn't help. I end up emailing the file as an attachment, which usually works.
My guess is that the file is getting corrupted either in the user's browser cache or perhaps in the cache of an intermediary proxy the user goes through.
Why does this happen? Is there a technique / best practice that will minimize the chances of corruption or, failing that, make sure users get a fresh copy of the file if it does get corrupted during download?

Well, if the cause really is just the cache, then I think you could simply rename the file before having them download it again. This would work for any proxies too.
Edit: Also, I believe most browsers won't reuse a cached copy unless the GET and POST parameters remain the same, and the same probably applies to any URL in general. Try adding a unique GET (or POST) parameter to the end of the URL of each download. You could use the current time, a random number, etc. Rather than a hyperlink, you could have a button that, when clicked, submits a form with a unique parameter to the download URL.
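As a rough illustration of that idea (PHP assumed here, since the question doesn't name a server-side language; the file path is a placeholder), the link can carry a unique query string so browsers and proxies treat every download as a fresh URL:

<?php
// Sketch: emit a download link with a unique query string so browser and
// proxy caches treat each request as a distinct URL.
// '/downloads/setup.msi' is a placeholder path for the real installer.
$href = '/downloads/setup.msi?nocache=' . uniqid();
?>
<a href="<?php echo htmlspecialchars($href, ENT_QUOTES); ?>">Download the installer</a>

Note that the query string only defeats caches; it does nothing against corruption that happens in transit.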

My advice would be:
Recommend that users avoid IE (especially older versions), which is prone to truncated downloads, cache pollution, and so on.
Advise users to clear the cache before re-downloading the file.
Host the file on FTP instead of HTTP.
Provide an MD5 checksum so users can verify the download (a sketch follows below).
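For the checksum point, a minimal sketch in PHP (the path and file name are placeholders): compute the MD5 of the hosted file and publish it next to the download link so users can compare it with a hash of what they actually received:

<?php
// Minimal sketch: print an MD5 checksum for the hosted installer so it can
// be published next to the download link. The path is a placeholder.
$file = '/var/www/downloads/setup.msi';
if (is_readable($file)) {
    echo 'MD5: ' . md5_file($file) . PHP_EOL;
} else {
    echo 'File not found' . PHP_EOL;
}

On Windows, users can compare it against the output of certutil -hashfile setup.msi MD5.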

Related

All my JavaScript files have some code added at the bottom

My website uses CodeIgniter. Today I found that all of my website's JavaScript files, including the jQuery file, have some code added at the bottom. The code is like this:
/*4fd970*/
You are blocked by day limit
/*/4fd970*/
My folders' permissions are set to 755.
I wonder why this code is being added to my files. Has someone hacked my site?
Is it caused by my server?
This is some kind of virus. It happened to me too. Only the index.php and index.html files got modified, right? I think this is a password stealer: it steals your FTP passwords from FileZilla or some other FTP software and then automatically modifies the index pages.
Yes, someone hacked your site... though not very well. The same thing happened to me, but when I looked closer, there was also similar code added to EVERY HTML file as well (not at the end, but somewhere in the middle of the page):
<!--0f2490--><script type="text/javascript" language="javascript" > You are blocked by day limit</script><!--/0f2490-->
As best my web host and I could determine, someone made a failed attempt to insert some kind of malware into the code. It was formatted in such a way (see the code above) that it wasn't very apparent. If you select all (Ctrl+A), it shows up a lot more easily.
Some additional info from my web host:
http://sitecheck.sucuri.net/
web site: xxxxxx
status: Site infected with malware
web trust: Not Blacklisted
Malware entry: MW:JS:ENCODED:BADINJECTED1
Description: We identified a suspicious javascript block that results from a failed attempt to inject malicious content (blackhole injection). The site was compromised, but due to an error by the attackers, the malware was not properly added.
Yes you were hacked, and so was I.
You should first change your server access password. Someone probably managed to get his hands on it and uploaded the malware to your server.
Remove all the infected files and upload backup versions.
In my case my antivirus told me my .js and .php files were infected with Exploit:JS/Blacole.BW. You could also see those "4fd970" markers littering the code. In some cases the hacker will change the code to have your users download some malware. If you only have a few scripts, you can restore a backup version of all of them. If deleting everything is not an option, you can make a diff against the previous version and you should be able to find out what was changed.
Check for any file that shouldn't be there (a quick scan like the sketch below helps).
I also had an .htaccess file added to my server to make users download the Blacole malware. I replaced it with a proper one.
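If restoring everything from backup isn't an option, a quick way to see which files were touched is to scan the web root for the injected marker. A rough PHP sketch (the path is a placeholder; the marker is the one from this question):

<?php
// Rough sketch: recursively scan a web root and report every file that
// contains the injected marker string. Path and marker are examples.
$root   = '/var/www/html';
$marker = '4fd970';

$files = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
);
foreach ($files as $file) {
    if (!$file->isFile()) {
        continue;
    }
    $contents = file_get_contents($file->getPathname());
    if ($contents !== false && strpos($contents, $marker) !== false) {
        echo 'Suspicious: ' . $file->getPathname() . PHP_EOL;
    }
}

Pair this with a diff against a known-good backup to confirm nothing else was altered.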

Files are not changing when I update them via FTP

I made some changes to a CSS file, uploaded it and saw no change. I cleared my browser's cache, repeated the process and still nothing. I also tried another browser and then experimented with other files - all with the same result. I then deleted the CSS file altogether - the website still looks the same and I can still see the file in the browser's console.
I can only get results if I actually change the file names altogether (which is really inconvenient). I don't think there is an issue with FTP overwriting the file, as there are no errors in FileZilla's logs.
Is there another way a website can cache itself? Would anyone know why this is occurring?
EDIT:
I also tried this in cPanel's File Manager and viewed it on another PC - same result
Squid and other web accelerators often sit between a hosted server and your browser. Although they are supposed to invalidate their caches when the backing file changes, that invalidation isn't always propagated to specification or acted on properly.
Indeed, there can be multiple caches between you and the server, each of which has a chance of hanging onto old data.
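One common way to get past a stale upstream cache without renaming files is to version the URL. A hedged sketch, assuming the pages are generated by PHP and the stylesheet lives at css/style.css (both are assumptions, not details from the question):

<?php
// Sketch: append the stylesheet's modification time as a query string, so
// any intermediate cache sees a new URL whenever the file really changes.
// 'css/style.css' is a placeholder path relative to this script.
$cssPath = __DIR__ . '/css/style.css';
$version = is_readable($cssPath) ? filemtime($cssPath) : time();
?>
<link rel="stylesheet" href="/css/style.css?v=<?php echo $version; ?>">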
First, use Firebug or "Inspect Element" in Chrome.
Verify that the CSS file the browser is loading is the one you think it should be loading.
Good luck.
Browsers can cache things. Did you try Shift+F5 on your webpage to force a reload of everything?
Maybe the main server has a caching setup in front of other servers; check with your IT department. If this is the case, you need to tell them to invalidate the cache across all the caching servers.
I had the same issue with FileZilla. To solve it, you need to clear the FileZilla cache or change the names of the files you are uploading.
Open FileZilla and click on the Edit menu.
Choose Clear Private Data.
In the new dialog box, check the categories you'd like to clear: Quickconnect history, Reconnect information, Site Manager entries, Transfer queue.
Finally, click OK to confirm.

Getting images from a URL to a directory

Well, straight to the point: I want to take a URL and get all the images under it, for example
www.blablabla.com/images
I want to get all the images in this images folder. I already know how to get an image from a specific URL, but I don't know how to get all of them without knowing each exact path. Is there a way to get a list of all the items under a URL path, or something like that?
Well, basically, this can't be done. Well, not under normal circumstances anyway. The problem is that you don't know what files are in that directory.
...unless the server has "directory listing" on. This is considered a security vulnerability, so the chance this is the case isn't too high. (The idea is that you are exposing details about your server that you don't have to, and while it is no problem on its own, it might make things that can be a security problem known to the world.)
This means that if the server is yours, you can turn directory listing on, or that when the server happens to have it turned on, you can visit the URL (www.blablabla.com/images) and see a listing of all the files in that directory. This doesn't always look exactly the same, but in general the common thing is that you will get an HTML page with links to all the files in the directory. As such, all you would need to do is retrieve the page and parse the links, ending up with the URLs to the images you want.
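If the listing happens to be on, it is just an HTML page of links, so the retrieval can look roughly like this sketch (PHP used for illustration; the question doesn't specify a language, and the URL is the example from the question):

<?php
// Illustrative sketch: fetch a directory-listing page and collect the links
// that look like image files. Requires allow_url_fopen to be enabled.
$base = 'http://www.blablabla.com/images/';
$html = @file_get_contents($base);
$imageUrls = [];

if ($html !== false) {
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // listing markup is often sloppy; ignore warnings
    foreach ($doc->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        if (preg_match('/\.(jpe?g|png|gif)$/i', $href)) {
            $imageUrls[] = $base . ltrim($href, '/');
        }
    }
}

print_r($imageUrls);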
If the server is yours, I would recommend at least looking into any other options you might have. One such option could be to make a script that provides all the URLs instead of relying on directory listing. This does not have some of the more unfortunate implications that directory listing has (like showing non-images that happen to be in the same directory) and can be more flexible.
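Such a script could be as small as the following sketch (again PHP, with a hypothetical images/ directory), returning the image URLs as JSON for the client to fetch:

<?php
// Sketch of a listing endpoint: return the URLs of every image in one
// directory as JSON, instead of enabling directory listing.
// 'images/' is a placeholder directory under the document root.
header('Content-Type: application/json');

$urls = [];
foreach (glob(__DIR__ . '/images/*.{jpg,jpeg,png,gif}', GLOB_BRACE) as $path) {
    $urls[] = '/images/' . rawurlencode(basename($path));
}

echo json_encode($urls);

GLOB_BRACE is not available on every platform; a loop over the extensions works just as well.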
Another way to do this might be to use a protocol other than HTTP, such as FTP, SFTP or SCP. These protocols do not have the same flexibility as a script, but they are even safer, as they easily allow you to restrict access to both the directory listing and your images to only people with the correct login details (or private keys). (Of course, if such a protocol is available for your use and it's not your own server, you could use it as well.)

Prevent direct-linking to .zip files

I'd like to prevent direct linking to the .zip files I offer for download on my website.
I've been reading posts for hours now, but I'm not sure which method is the best to achieve that. PHP seems not to be safe, the htaccess referrer can be empty, etc.
Which method do you guys use or would suggest?
Cheers
See: http://www.alistapart.com/articles/hotlinking/
and: http://www.webmasterworld.com/forum92/2787.htm
Referrer checking is one option, but as you noted they can be empty or spoofed.
Another possibility is to set a cookie when someone visits normal pages on your site, and check for that when the person tries to download the zip file. This could be gotten around (e.g. by the hot-linker embedding an appropriate cookie-setting page as a 1x1 image alongside the hot link), but it's less likely they'll figure it out. It'll also exclude people who block cookies, of course.
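A hedged sketch of the cookie idea in PHP (the cookie name, paths and file names are all made up for illustration): set a cookie on the normal pages, and serve the zip through a small script that checks for it:

<?php
// On the normal site pages (hypothetical): mark visitors who browsed the site.
setcookie('site_visit', '1', time() + 3600, '/');

<?php
// download.php (hypothetical): only stream the zip when the cookie is present.
if (empty($_COOKIE['site_visit'])) {
    header('HTTP/1.1 403 Forbidden');
    exit('Please use the download page on the site.');
}

$file = __DIR__ . '/files/archive.zip'; // placeholder path; ideally outside the public web root
header('Content-Type: application/zip');
header('Content-Disposition: attachment; filename="archive.zip"');
header('Content-Length: ' . filesize($file));
readfile($file);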
Another possibility is to generate limited-time-access URLs on the download page, something along the lines of http://example.com/download.php?file=file.zip&code=some-random-string-here. The link would only be usable for a small number of downloads and/or a short period of time, after which it would no longer function.
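The limited-time links can be built by signing the file name and an expiry timestamp with a server-side secret; a rough PHP sketch (the secret, parameter names and helper functions are invented for this example):

<?php
// Rough sketch of expiring download links. Secret, paths and parameter
// names are placeholders invented for this example.
$secret = 'change-this-secret';

// When rendering the download page: build a link that is valid for 10 minutes.
function makeDownloadUrl(string $file, string $secret): string {
    $expires = time() + 600;
    $code    = hash_hmac('sha256', $file . '|' . $expires, $secret);
    return '/download.php?file=' . rawurlencode($file)
         . '&expires=' . $expires . '&code=' . $code;
}

// In download.php: verify the signature and expiry before streaming the file.
function isValidDownload(array $query, string $secret): bool {
    if (empty($query['file']) || empty($query['expires']) || empty($query['code'])) {
        return false;
    }
    if ((int) $query['expires'] < time()) {
        return false; // link has expired
    }
    $expected = hash_hmac('sha256', $query['file'] . '|' . $query['expires'], $secret);
    return hash_equals($expected, $query['code']);
}

Limiting the number of downloads per code would additionally need a small store (a database row or file) keyed on the code; the sketch above only handles the time limit.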

Programmatically reset browser cache in Ruby (or remove select item from browser cache)?

I would like to create a rake task or something to clear the browser cache. The issue is, I am running a Flash app, and if I change the data, I more often than not need to reset the browser cache so it drops the old SWF and picks up the new XML data.
How do you reset the browser cache with ruby? Or even more precisely, how can I only remove a select item from the browser cache?
Thanks for the help!
I see a few possible solutions:
Write a shell script that deletes the cache's temporary files from disk (what browser are you using?). I'm not sure deleting the files on disk will necessarily work if the browser has them cached in memory.
Use an HTTP header (No-Cache) to avoid caching in the browser; Adobe has documentation on No-Cache. You could set this header only in development mode, so that in production the SWF is cached.
Depending on your browser, force a page and cache refresh (e.g. Ctrl+F5 in Firefox).
I'm not sure how you're loading the xml data, but in the past, I've gotten around the issue by appending a random number to the path of the xml file:
xml.load("data.xml?"+Math.random());
Basically, Flash will always think the file is at a different URL, so it won't be able to find a match in your cache.
Again, I'm not sure how you're loading the XML data, so I'm not sure if this applies to your situation.
Hope it helps, though.
You cannot reset the browser cache, and even if you could, it would sometimes not be sufficient, because caching can occur not only on the server and/or client but also on any number of nodes your response passes through on its way from your server to your client.
The only tool at your disposal is the caching headers.
You can set them to no-cache; just keep in mind that the browser will then hit the server every time.
Since you're using Safari, here's an article describing how to use AppleScript to clear the cache. But you can probably just skip the AppleScript part and remove the files directly in the rake task. The only catch might be that you have to restart the browser for it to take effect, but that could be done with a kill on the process and an "open /Applications/Safari.app" (I'm assuming you're on a Mac; on Windows it would be something like start "c:\program files\Safari...").

Resources