I track traffic to the domains that I own through a "home brew" cookie system. On one site, I have noticed that I get a lot of HTTP referer traffic from one particular domain (http://www.getbig.com/). I did some sleuthing and found out what this person has done. This person has attempted to use my site's logo as their avatar on a forum. However, instead of linking to the image in the "img" tag:
<img src="http://www.example.com/image.jpg" width="" height="" alt="" border ="" />
they have linked to my main domain name:
<img src="http://www.example.com/" width="" height="" alt="" border ="" />
Every single time a page is loaded where this person has posted in this forum, a new hit gets registered. This is artificially inflating my visitor statistics, and I would like to stop it. If they had simply linked to the image, I could just change the image name, but they have linked to the site itself and I am not sure what to do. Aside from sending them a "cease and desist", what technical options do I have?
The practice is called hotlinking, or at least it would be if it were done correctly, as you pointed out. There are a few solutions to "stop" it from happening.
The most common one is to serve a different page or image instead of the expected one. Apache's mod_rewrite (or a similar module) lets you rewrite URLs based on particular criteria, such as the Referer header in this case. At a minimum, you will need to be allowed to create your own .htaccess file. There are tools to help generate the .htaccess content.
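For example, a minimal .htaccess sketch (the getbig.com referer comes from the question; nohotlink.png is a made-up placeholder image you would create yourself):
RewriteEngine On
# skip the placeholder itself so the rewrite doesn't loop
RewriteCond %{REQUEST_URI} !^/nohotlink\.png$
# match requests referred from the forum in question
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?getbig\.com [NC]
# serve the replacement image instead of whatever was requested
RewriteRule ^ /nohotlink.png [L]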
A less informative way to do this is to deny access via an environment variable: check the Referer header with SetEnvIf and deny access based on it. This only returns an HTTP 403 response code.
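A minimal sketch of that variant, assuming Apache 2.2-style access control directives:
# set an environment variable when the Referer matches the forum
SetEnvIf Referer "^https?://(www\.)?getbig\.com" hotlinker
Order Allow,Deny
Allow from all
# requests carrying that variable get a plain 403 response
Deny from env=hotlinker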
If you don't have this sort of access, you could read the Referer header at the application level and make the decision there. This might only be a partial solution depending on how the content is delivered (i.e. whether the request is handled by the web server alone or by an additional application layer such as PHP).
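A minimal sketch of that application-level check, written here as Express-style JavaScript purely for illustration (the middleware and response text are assumptions; a PHP app would inspect $_SERVER['HTTP_REFERER'] along the same lines):
// hypothetical middleware: reject requests referred from the forum
app.use(function (req, res, next) {
  var referer = req.headers.referer || '';
  if (/^https?:\/\/(www\.)?getbig\.com/i.test(referer)) {
    return res.status(403).send('Hotlinking is not allowed');
  }
  next();
});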
Contact the user in question. This is less scalable and doesn't stop them if they don't agree with your kind request.
The first three are solutions to stop hotlinking in general. They can be adjusted to match only a particular referer.
In this particular case, I doubt any of these will have a significant effect unless you serve a picture in response. Since the URI doesn't contain an actual image name but only the protocol and domain name, the browsers opening the page are unlikely to show anything relevant for the img tag at the moment. Serving different content won't change this situation unless that content is an image. Serving an image explaining why you don't allow hotlinking (even when they request the main page) would probably have a bigger impact on the user.
It is difficult to say how your statistics will be affected by these solutions. If they are collected on the main page, they could return to normal, since that page will no longer be served. If they're based on the access logs, it might be a different story.
What I would recommend is checking the Referer header, and if it's coming from http://www.getbig.com/, serving the absolute filthiest image you can find on the internet instead of your website.
It's much, much easier to just send them an email, though (this is my actual advice).
I am making a website with Node.js and Express, and I want to add an image among text from a local src. Much like how, if I were using HTML to mimic a Wikipedia article, I would do this:
<p> Coins are really cool, people use them as a heavy kind of money. Obviously paper money is better though</p>
<img src="uploads/pictureofcoins.png" alt="nicklesAndDimes">
<p> some pictures have coins of important people on them, sometimes just birds though </p>
In Express.js the process seems to look something like this. First I have an app.js file where I provide a folder directory, then I set the image source to point to that directory. I believe you do this with the following:
app.use(express.bodyParser({ keepExtensions: true, uploadDir: __dirname + '/public/uploads/' }));
Then my Jade file looks almost identical to the HTML:
p Coins are really cool, people use them as a heavy kind of money. Obviously paper money is better though
img(src='localhost:3000/public/uploads/pictureofcoins.png')
p some pictures have coins of important people on them, sometimes just birds though
I know that the pictureofcoins.png image is in the uploads folder. There's something conceptual I'm not getting: the icon appears as a broken image icon in Chrome.
You are using a full URL; in that case you need to specify the protocol as well.
So try adding http:// or // to your image source URL.
Note that when using a full URL, make sure you don't hardcode the host and port. It is possible to retrieve that information from the HTTP request.
You can also try /public/uploads/pictureofcoins.png as a root-relative link, to make sure it works regardless of host and port.
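For example, a minimal sketch of the server side, assuming the image lives in ./public/uploads/ next to app.js:
var express = require('express');
var app = express();

app.set('views', __dirname + '/views');
app.set('view engine', 'jade');

// expose ./public under the /public URL prefix, so
// /public/uploads/pictureofcoins.png maps to ./public/uploads/pictureofcoins.png
app.use('/public', express.static(__dirname + '/public'));

app.get('/', function (req, res) {
  res.render('index');
});

app.listen(3000);
The Jade template can then use a root-relative path, which works regardless of host and port:
img(src='/public/uploads/pictureofcoins.png', alt='nicklesAndDimes')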
I am working on a bootstrap based responsive website. The dropdown menus in the main website navigation are opened with a click rather than a hover. There is no index content for each section, only specific page links in the dropdown.
Is there any SEO penalty for having content located at:
www.mysite.com/books/moby-dick
when
www.mysite.com/books
results in a 404 error?
I could generate index pages with links to all children if I had to, but I'd rather avoid creating any content that isn't meant to be viewed directly.
I would like to organize the pages by "folder" using mod_rewrite which I have a pretty good handle on at this point.
The way I understand it, search engines place no relevance on one page's URL in relation to another page's URL. Are you looking for some documentation? Which search engine do you want documentation for? I'm not even sure that type of info is "documented", but if you think about it, the order of words in a URL only has meaning to us as humans. An engine doesn't place importance on URLs further up or down the URL hierarchy; it wouldn't make sense.
I don't even think your moby-dick page would have a positive or negative impact on the domain's home page. Google, at least, treats every URL as a unique page, hence the "PageRank" algorithm, not a site-hierarchy algorithm.
I have a web application that needs to serve a large amount of small images per page (up to 100). I can use caching to reduce calls to the database/backend, but there is a noticeable impact from having to make so many separate requests for the images themselves, as the images take some time to request and render, especially on slower connections.
What good practices exist for serving several images on a page? I'm aware of using a CDN (e.g. S3 + Cloudfront) to reduce bottlenecking on http requests and serve content from a closer geographical location, as well as potentially loading images/content via Ajax only once they come to the user's view in the browser. Are there other techniques that might provide significant performance gains for image-heavy pages? It doesn't really matter whether they relate to hardware, frontend or something else.
Thanks.
Loading 100 images in one page request increases the page load time, as each image takes time to load in the browser.
A simple technique is to initially load only one default image: the source of each of the 100 images points to a common placeholder, and a single image won't take much time to load.
Once the page has loaded the rest of its content, load each individual image with the help of jQuery.
Use the Lazy Load jQuery plugin to load the real images after the page loads,
like this:
<img class="lazy" src="default.jpg" data-original="img1.jpg" >
<img class="lazy" src="default.jpg" data-original="img2.jpg" >
<img class="lazy" src="default.jpg" data-original="img3.jpg" >
.......
<img class="lazy" src="default.jpg" data-original="img100.jpg">
and in your script use the following code:
$(document).ready(function(){
$("img.lazy").lazyload();
});
You may also add an Expires header to each image, which allows the browser to cache them rather than requesting them again on the next page load.
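If the images are served by Apache, a minimal .htaccess sketch (assuming mod_expires is enabled; the one-month lifetime is an arbitrary example):
<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType image/png "access plus 1 month"
  ExpiresByType image/jpeg "access plus 1 month"
</IfModule>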
Hope it helps.
You can use a different domain for images; these will be fetched over separate connections from those used for the current domain.
You can also host your images on a web server optimized to serve static content - this will be faster than a dynamic server.
The above can be extended to several such domains: if the browser is set to open 4 connections per domain, each domain you add will parallelize to an additional 4 (which is also one of the benefits of using a CDN).
Another common technique that may apply is the use of CSS sprites - if you have a bunch of images that are commonly used together, you can put them all in a single image and use CSS to only show the bits that are needed where they are needed.
You can always combine the images into a single image and use CSS to display only parts of it at a time (commonly called CSS sprites)
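As a minimal sketch of the sprite technique (icons.png, the class names, and the 16px icon size are all made up for illustration):
.icon { display: inline-block; width: 16px; height: 16px; background-image: url('icons.png'); }
.icon-home   { background-position: 0 0; }      /* first icon in the sheet */
.icon-search { background-position: -16px 0; }  /* second icon, shifted left */

<span class="icon icon-home"></span>
<span class="icon icon-search"></span>
Both icons come from a single HTTP request for icons.png.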
Google also has a rather in depth article about how they implemented "Instant Previews" that covers some of the optimizations:
http://googlecode.blogspot.com/2010/11/instant-previews-under-hood.html?m=1
I'm still not sure how it works (but that's not the point :D). As far as I've noticed, almost the whole content is in an iframe and the chat window is outside the iframe. Requests are probably made via Ajax, and the URLs change like this: const_part_of_url#something, so only the URL anchors (or whatever they're called) change.
Two things are bothering me:
What about Googlebot: is it able to index those pages correctly (not Gmail, but say some web page using similar "technology"), first because of the iframe, and second because only the anchors in the URLs change?
Is it possible to make some other part of the URL change, not only the anchors?
The thing is, I have an mp3 search engine where you can listen to these mp3s too, and this kind of floating, "not-reloading" player with a playlist would be kinda cool :D But I'm very concerned about proper page indexing and other SEO blah blah... so I don't really know if it's worth trying :D
Cheers
You can detect robots and not feed them user-eyes-only content...
Edit: you can also load it on demand with JavaScript... bots won't load it.
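As a minimal sketch of the load-on-demand idea (the #player container and the /player.html fragment are made-up names; this assumes jQuery is already on the page):
$(document).ready(function () {
  // fetch the player markup only after the page has loaded;
  // crawlers that don't execute JavaScript will never request it
  $('#player').load('/player.html');
});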
I like that, these days, we have an option for how we get our web content from the server: we can make an old-style HTTP request (with its own URL in the browser) or we can make an AJAX call and replace parts of the DOM on the fly.
My question is this: how do you decide which method to use when there's an option to use either?
In the "old days" we'd have to redraw the entire page (including the parts that didn't change) if we wanted to show updated content. Now that AJAX has matured we don't need to do that any more; we could, conceivably, render a "page" once and just update the changing parts as needed. But what would be the consequences of doing so? Is there a good rule of thumb for doing a full-page reload vs a partial-page reload via AJAX?
If you want people to be able to bookmark individual pages, use HTTP requests.
If you are changing context, use HTTP requests.
If you are dividing functionality between different pages for better maintainability, use HTTP requests.
If you want to maximize your page views, use HTTP requests.
Lots of good reasons to still use HTTP requests. Stack Overflow is a wonderful example of those divisions between AJAX and HTTP requests. Figure out why each function is HTTP or AJAX, and I'm sure you will derive lots more reasons for when to use each.
My simple rule:
Do everything with AJAX, especially if this is an application and not just pages of content. Unless people are likely to want to link directly to content, like on a blog; then it is easier to do regular full pages.
Of course there are many blended combinations, it doesn't have to be one or the other completely.
A fair amount of development overhead goes into partial-page reloads with AJAX. You need to create additional JavaScript handlers for all returned data. If you return full HTML blocks, you still need to specify where the content should go and whether or not it replaces other content. You would potentially have to re-render header tags to reflect content changes, and you would have to implement a history solution to make sure search engines can index each page (using the SWFAddress jQuery plugin, for example). If you return JSON-encoded data, you have an additional processing step.
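As a minimal sketch of that extra handling (the .ajax link class and #content container are made-up names; this uses the browser's native history.pushState rather than the SWFAddress plugin mentioned above):
function render(url) {
  // fetch an HTML fragment and decide where it goes in the DOM
  $.get(url, function (html) {
    $('#content').html(html);
  });
}

$('a.ajax').on('click', function (e) {
  e.preventDefault();
  render(this.href);
  // keep the address bar bookmarkable
  history.pushState({ url: this.href }, '', this.href);
});

// handle back/forward so partial loads stay navigable
window.addEventListener('popstate', function (e) {
  if (e.state) { render(e.state.url); }
});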
The bandwidth saved by avoiding a full page refresh is offset by an increase in JS code and event bindings, which can affect page rendering speed as well as visual effects.
It all really depends on your target audience and the overall feel you are trying to go for on your page. AJAX and preloaders are flashy, and people love flashy things. If you believe the end-user experience will improve with partial page loads, by all means implement them.