Check on which pages an image is used?

Is there a certain way to check which pages on a website use a specific image?
Say I have an image that I no longer use on a page, so I'd like to delete it from my server. But I'm not entirely sure whether it's being used on other pages; is there a way to check if it's still being shown elsewhere?

You can hook your website up to Google Webmaster Tools and wait a little; after a while, 404 errors will appear there. This way you can track unused resources and dead ends, including images.
There is a better way if you have direct access to the web server: visit every page on your website (or let Google crawl it), then sort the files by their last-accessed time; the ones that have not been accessed recently are not in use. You have to make sure the images are actually fetched from the pages, so I would browse in a fresh session with history and caching disabled.
How do I sort the files according to their timestamps in Unix?
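In a Unix shell, ls -ltu lists files sorted by their last-access time (newest first), so the files at the bottom are the least recently used. A rough PHP sketch of the same idea for an image directory; the path is a placeholder, and it only works if the filesystem actually records access times (a volume mounted with noatime won't):

    <?php
    // Hypothetical sketch: list every image under /path/to/images with its
    // last-access time, oldest first, so rarely used files float to the top.
    $dir   = '/path/to/images';                        // placeholder path
    $files = glob($dir . '/*.{jpg,jpeg,png,gif}', GLOB_BRACE);

    usort($files, function ($a, $b) {
        return fileatime($a) - fileatime($b);          // oldest access first
    });

    foreach ($files as $f) {
        echo date('Y-m-d H:i', fileatime($f)) . '  ' . basename($f) . "\n";
    }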

Related

Website is different when I upload it via FTP

I have only been developing for a few weeks and I bought a domain, but when I upload the files to the live site, the website looks different from what I uploaded. This gets fixed when I clear my cache. The problem is that my visitors see the page one way, and after I update it they still see the previous version!
Is there any possible solution for this? I don't want my visitors to have to clear their cache every time I make a change to my website!
This is quite probably due to CSS caching: the browser is loading a cached version. You can control how long files are cached in a few ways; ETags and .htaccess rules (on Apache) are the most common.
A very simple trick is to add a GET-style parameter to the end of your stylesheet URL (where you load your main style in the head of the document), just like this:
main.css?v=2
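A hedged sketch of how that version parameter can be generated automatically in PHP, using the stylesheet's modification time so the URL changes whenever the file does (the path is an assumption about your layout):

    <?php
    // Append main.css's modification time as the version parameter so
    // browsers re-fetch the stylesheet only when the file actually changes.
    $cssFile = __DIR__ . '/main.css';                  // adjust to your layout
    $version = file_exists($cssFile) ? filemtime($cssFile) : 1;
    ?>
    <link rel="stylesheet" href="main.css?v=<?php echo $version; ?>">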

My OpenCart website is not displaying images on the Wayback Machine

My OpenCart website is not displaying images on Alexa.com's Wayback Machine ("How did
www.aaa------aa.com
look in the past?"). I have checked multiple dates, but the images do not display anywhere.
My OpenCart images are also not getting crawled by Google Merchant Center; it says this is because of robots.txt, but I have removed the images from the robots.txt file and it still shows the same error.
My website is working fine otherwise and I am also getting orders, but I want to know whether these two issues are interrelated and what the best solution to this problem would be.
Thanks,
Gaurav
From what you are saying, I think the problem with the Wayback Machine is that your image cache got cleared, so of course most of the image URLs are dead, since OpenCart serves about 98% of images from the cache after they are resized. So no problem there; I think this happens with every OpenCart setup, as cached images get cleared from time to time. In the same fashion, please check the image folder and its sub-directories for any .htaccess file that blocks everything but local requests. One last note: Google usually takes a day or more to pick up changes to robots.txt and other data.
And if nothing else works, try specifically allowing access to one directory with robots.txt.
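For example, a robots.txt along these lines would explicitly allow the image directories (assuming the default OpenCart layout, where resized images end up under image/cache/):

    User-agent: *
    Allow: /image/
    Allow: /image/cache/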
Hope I could help you.

Making websites available offline

I am using HTML5 offline storage. The goal is to make the whole site available offline. So intuitively, no server requests means all the pages need to be on the client. The only way I know of to accomplish such a task is to make the site into one page and then show/hide portions with jQuery when the user "navigates". Is there a better way?
The HTML5 offline spec allows multiple pages to be saved offline, so you don't need to put all your content onto one page.
EDIT: link to the spec: http://www.whatwg.org/specs/web-apps/current-work/multipage/offline.html
Be careful that your jQuery does not still point to the cloud; you'll need to save the relevant .js files locally.
N.B. If your whole site can be generated and saved as individual .html files, then all you need to do is save those files in the correct (relative) directory structure.
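As a minimal sketch of what that spec describes, a cache manifest simply lists the pages and assets to store offline. The file names below are placeholders; every page needs manifest="site.appcache" on its html element, and the manifest must be served with the text/cache-manifest MIME type:

    CACHE MANIFEST
    # v1 - change this comment to force clients to re-download everything

    index.html
    about.html
    css/main.css
    js/jquery.js
    js/site.js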

Prevent direct-linking to .zip files

I'd like to prevent direct linking to the .zip files I offer for download on my website.
I've been reading posts for hours now, but I'm not sure which method is best to achieve that. PHP doesn't seem safe, the .htaccess referrer can be empty, etc.
Which method do you guys use or would suggest?
Cheers
See: http://www.alistapart.com/articles/hotlinking/
and: http://www.webmasterworld.com/forum92/2787.htm
Referrer checking is one option, but as you noted they can be empty or spoofed.
Another possibility is to set a cookie when someone visits normal pages on your site, and check for that when the person tries to download the zip file. This could be gotten around (e.g. by the hot-linker embedding an appropriate cookie-setting page as a 1x1 image alongside the hot link), but it's less likely they'll figure it out. It will also exclude people who block cookies, of course.
Another possibility is to generate limited-time-access URLs on the download page, something along the lines of http://example.com/download.php?file=file.zip&code=some-random-string-here. The link would only be usable for a small number of downloads and/or a short period of time, after which it would no longer function.
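A rough PHP sketch of that last idea, keeping the download.php?file=...&code=... shape mentioned above; the extra expires parameter, the zips/ directory, and the secret are assumptions you would adapt:

    <?php
    // download.php - hypothetical limited-time download link.
    // A link looks like: download.php?file=file.zip&expires=1700000000&code=...
    // where code = HMAC of "file|expires" computed with a server-side secret.
    define('SECRET', 'change-me');                     // keep this private
    define('ZIP_DIR', __DIR__ . '/zips');              // directory holding the .zip files

    $file    = basename($_GET['file'] ?? '');          // strip any path components
    $expires = (int) ($_GET['expires'] ?? 0);
    $code    = (string) ($_GET['code'] ?? '');

    $expected = hash_hmac('sha256', $file . '|' . $expires, SECRET);

    if ($expires < time() || !hash_equals($expected, $code) || !is_file(ZIP_DIR . "/$file")) {
        http_response_code(403);
        exit('Link expired or invalid.');
    }

    header('Content-Type: application/zip');
    header('Content-Disposition: attachment; filename="' . $file . '"');
    readfile(ZIP_DIR . "/$file");

The download page would build the link with the same HMAC, e.g. $code = hash_hmac('sha256', "file.zip|$expires", SECRET); with $expires set to time() + 600 for a ten-minute window.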

Content Watermarking

We have members-only paid content that is frequently copied and republished without our permission.
We are trying to 'watermark' our content by including each customer's user ID in a fake CSS class, for example <p class='userid_1234'> (except not so obvious, of course :), which would help us track down the source of the copying; we place that class somewhere in the article body.
The problem is that by including user-specific information in an article, the article content becomes ineligible for caching, because it is now unique to each user.
This bumps the page load time from ~0.8 ms to ~2.5 s for each article page view.
Does anyone know of any watermarking strategies that can still be used with caching?
Alternatively, what can be done to speed up database access? (Ha ha, that's just a tiny topic, I'm sure...)
We're using the ExpressionEngine CMS, but I'd like to hear about any strategies; they don't have to be EE-specific.
If you're talking about images, then you could use PHP to add a watermark to them:
How can I add an image onto an image in PHP like a watermark
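One common way to do that with PHP's GD extension, as a hedged sketch (the file names are placeholders, and a real version would add error handling):

    <?php
    // Stamp watermark.png onto photo.jpg in the bottom-right corner.
    $photo     = imagecreatefromjpeg('photo.jpg');
    $watermark = imagecreatefrompng('watermark.png');

    $margin = 10;
    $x = imagesx($photo) - imagesx($watermark) - $margin;
    $y = imagesy($photo) - imagesy($watermark) - $margin;

    // GD's default alpha blending keeps the watermark's transparent areas transparent.
    imagecopy($photo, $watermark, $x, $y, 0, 0, imagesx($watermark), imagesy($watermark));

    header('Content-Type: image/jpeg');
    imagejpeg($photo);
    imagedestroy($photo);
    imagedestroy($watermark);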
It's a tool to help track down the lazy copiers who just copy the source code as-is. This is not preventative, nor is it a deterrent. – Ian
Going by your comment above, you are happy with users copying your content, just not without the formatting etc. So what you could do is provide users with an embed-type source code for that particular content, just like YouTube does with videos. Into that embed source code you could add your own links back to your site, use your own CSS, etc.
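For example, the snippet handed to members might be nothing more than an iframe pointing back at your site (the domain and article ID here are made up):

    <iframe src="https://example.com/embed/article/1234"
            width="600" height="400" frameborder="0"></iframe>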
That way you can still allow members to use the content, but it will always come out the way you intended, with links back to your site.
Thanks
You could always cache a version that uses a special string, like #!username!#, and then later fill it in with PHP based on which user is viewing it.
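A small sketch of that, where get_cached_article() and current_username() are hypothetical stand-ins for whatever your cache and authentication layers provide:

    <?php
    // The cached article HTML contains the literal token #!username!#.
    // It is swapped for the viewing user at serve time, so the expensive
    // article markup stays cacheable while the watermark stays per-user.
    $html = get_cached_article($articleId);                        // hypothetical cache call
    echo str_replace('#!username!#', current_username(), $html);   // hypothetical auth call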
Another way, I believe, is to switch from caching on the server to letting the browser cache it locally for a little while. That way it is only cached per user, and it reduces the calls to your database. Because an article is pretty static, you could just let the local machine cache it and pull in comments via JavaScript.
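In PHP that kind of browser-side caching is just a response header, for example:

    <?php
    // Let each visitor's browser cache the article privately for five minutes;
    // nothing is cached on the server, and repeat views skip the database.
    header('Cache-Control: private, max-age=300');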
This last one is probably not what you are really looking for, but I'm going to say it anyway. You could stop treating your users like thieves and instead treat the thieves as thieves. Go to the people hosting the servers your content is on and send them an email telling them that copyrighted premium content is being hosted on their servers without your permission. You can even automate that process.
How do you find out what sites are posting your content? Put a link to your site in the body content, and do a Google Search/Blog Search for articles linking to that site. To automate it, use Google Blog Search, because it offers RSS feeds. Anything that links back to your site could go into a database with a link to the page; someone could look at it, and if it is the entire article, do a Whois lookup and send them an email.
What makes you think adding CSS to something is going to stop people from copying it without that CSS? It's more likely that they are just copying the source of the content you are showing them and ignoring all the styling around it. For example, I use Tamper Data to look at all HTTP requests made by Firefox; if I can see it on the page, I can see it in the logs. Even with all the "protection" some sites try to put in place, it generally never works. I can grab what I want without using any screen capture/recording.
If you were serving FLVs, for example, I would easily be able to grab the source of those even if you overlaid them with some CSS. I think the best approach would be to contact the sites publishing your premium content and ask them to remove it. It's either that or watermark the actual content on the fly while sending it to the browser.
