Not long ago I came across this website: http://www.danasoft.com/
This website provides dynamically updating signatures, which are pretty cool in my opinion.
There is just one thing that I don't get and would really like to know how to do.
Here's a direct link to an image on the website: http://www.danasoft.com/vipersig.jpg Try refreshing. Notice it changes? How do I achieve that? How do I have a direct link to a file like www.mypage.com/thing.jpeg output different images each time?
Basically, the URL is not actually retrieving the file directly each time, but rather the server is intercepting that URL and serving a (possibly random) image from a larger set of images. Depending on whether the server is running Apache, IIS, etc, the implementation could vary... This could also probably be achieved with the MVC routing engine by defining a custom route handler for URLs ending in '.jpg', but I'm not actually sure.
EDIT:
See this discussion for more detail on the MVC implementation.
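To make the idea concrete, here is a minimal sketch in Python using only the standard library. It is not how danasoft.com does it; the path /thing.jpeg, the IMAGES list, and the names pick_image, SigHandler and run are all made up for illustration. The point is only that the server decides what bytes to send for a URL, regardless of what the URL looks like.

```python
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-ins for real image files; a real server would read them from disk.
IMAGES = [b"\xff\xd8fake-jpeg-1", b"\xff\xd8fake-jpeg-2", b"\xff\xd8fake-jpeg-3"]


def pick_image(images, rng=random):
    """Return one image body chosen at random from the given set."""
    return rng.choice(images)


class SigHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The URL looks like a static file, but the response is chosen per request.
        if self.path == "/thing.jpeg":
            body = pick_image(IMAGES)
            self.send_response(200)
            self.send_header("Content-Type", "image/jpeg")
            # Ask browsers and proxies not to reuse an earlier response.
            self.send_header("Cache-Control", "no-store")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def run(port=8000):
    """Start the demo server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), SigHandler).serve_forever()
```

Calling run() and refreshing http://127.0.0.1:8000/thing.jpeg would hit the handler each time, so each request can get a different body.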
I've got a web app which heavily uses AngularJS / AJAX and I'd like it to be crawlable by Google and other search engines. My understanding is that I need to do something special to make it work, as described here: https://developers.google.com/webmasters/ajax-crawling
Unfortunately, that looks quite nasty and I'd rather not introduce the hash tags. What I'd like to do is to serve a static page to Googlebot (based on the User-Agent), either directly or by sending it a 302 redirect. That way, the web app can be the same, and the whole Googlebot workaround is nicely isolated until it is no longer necessary.
My worry is that Google may mistakenly assume that I'm trying to trick Googlebot, while my goal is to help it. What do you guys think about this approach, and what would you recommend?
Recently I came upon this excellent post from yearofmoo, explaining in detail how to make your Angular app SEO friendly. In essence, when bots see a URI containing '#!', they know it's an AJAX page and will try to reach the same URI by replacing '#!' with '?_escaped_fragment_='. This alternative URI tells bots that they should expect to find a definitive static version of the page they were accessing.
Of course, to achieve this you'd have to introduce hash tags into your URIs. I don't see why you are trying to avoid them. Isn't Gmail using hash tags?
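For illustration, the '#!' to '?_escaped_fragment_=' mapping described above could be sketched like this in Python. The function name to_escaped_fragment_url is made up, and the escaping here is a simplification: it percent-encodes more characters than the scheme strictly requires, which should be harmless but is an assumption on my part.

```python
from urllib.parse import quote


def to_escaped_fragment_url(url):
    """Rewrite a '#!' URL into the '?_escaped_fragment_=' form a crawler would request."""
    if "#!" not in url:
        return url  # not an AJAX-crawlable URL, leave it alone
    base, fragment = url.split("#!", 1)
    # Append to an existing query string if there is one.
    separator = "&" if "?" in base else "?"
    # Percent-encode characters such as '&', '#', '+', '%' and spaces in the fragment.
    return base + separator + "_escaped_fragment_=" + quote(fragment, safe="/=")
```

Your server would then look for the _escaped_fragment_ query parameter and answer with a static snapshot of the corresponding page.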
Yeah, unfortunately, if you want to be indexed, you have to adhere to the scheme :( If you're running a Ruby app, there's a gem that implements the crawling scheme for any Rack app:
gem install google_ajax_crawler
A writeup of how to use it is at http://thecodeabode.blogspot.com.au/2013/03/backbonejs-and-seo-google-ajax-crawling.html, and the source code is at https://github.com/benkitzelman/google-ajax-crawler
Have a look at these links and it will give you a good direction:
Set up your own Prerender service using Prerender.io open source code:
https://prerender.io/
Use a different existing service such as BromBone, Seo.js or SEO4AJAX:
http://www.brombone.com/
http://getseojs.com/
http://www.seo4ajax.com/
Create your own service for rendering and serving snapshots to search engines. Read this article. It will give you the big picture:
http://scotch.io/tutorials/javascript/angularjs-seo-with-prerender-io
As of May 2014 GoogleBot now executes JavaScript. Check WebmasterTools to see how Google sees your site.
http://googlewebmastercentral.blogspot.no/2014/05/understanding-web-pages-better.html
Edit: Note that this does not mean other crawlers (Bing, Facebook, etc.) will execute JavaScript. You may still need to take additional steps to ensure that these crawlers can see your site.
I'm just getting started with ImageResizer and I'm stuck on what seem like totally basic questions:
I have an uploader that I use to put images into a directory that's not directly accessible over HTTP. (If I just put an image at, say, /images/myimage.jpg, then anyone could access it by just asking for it, whereas I want to limit access via thumbnails, watermarks, etc.) So I want to put it at /offlimits/myimage.jpg, but be able to serve it up at /public/images/myimage.jpg.
I don't really want to dump all the images in the same offlimits folder, because putting lots of files in one folder makes Windows unhappy. But I don't want to expose the details of that subdirectory structure either, so where do I put the mapping between the public facing url and the actual image location?
Most generally, I don't necessarily want an image extension at all, so I'd like to say /public/image_id?width=100... and have this map to /offlimits/sub1/sub2/sub3/image_id.jpg.
Can anyone advise about how to set this up?
Three part questions are generally frowned upon here at SO, but I'll bite anyhow :)
If you're allowing access to images based on authentication, then you need to use ASP.NET's URL Authorization feature. ImageResizer supports URL Authorization rules. If you just don't want the source files available, and want to force them resized or watermarked, read the docs on how to implement arbitrary rules like this.
You can rewrite image paths to your heart's content with Config.Current.Rewrite, which works just like the PostRewrite event mentioned earlier. Just remember you'll have to keep it all straight in your head later.
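As for where the mapping lives: one common option is to not store it at all and derive the private path from the image id. The sketch below is plain Python, not ImageResizer or ASP.NET code; PRIVATE_ROOT, private_path and the hash-based folder layout are assumptions for illustration. The same logic could sit inside whatever rewrite hook you use.

```python
import hashlib
import re
from pathlib import PurePosixPath

# Hypothetical private root; it is never exposed in public URLs.
PRIVATE_ROOT = PurePosixPath("/offlimits")


def private_path(image_id, levels=3, ext=".jpg"):
    """Map a public image id to a sharded private path, derived from a hash of the id."""
    # Reject anything that could escape the private root (slashes, dots, etc.).
    if not re.fullmatch(r"[A-Za-z0-9_-]+", image_id):
        raise ValueError("invalid image id")
    digest = hashlib.sha256(image_id.encode("utf-8")).hexdigest()
    # Two hex characters per level gives at most 256 subfolders per directory.
    shards = [digest[i * 2:i * 2 + 2] for i in range(levels)]
    return PRIVATE_ROOT.joinpath(*shards, image_id + ext)
```

Because the path is computed from the id, a request for /public/image_id can be resolved without a lookup table, and the subdirectory names never appear in the public URL.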
Image extensions are good things. Don't fight them. They let the server figure out the right mime-type to send and help errant browsers recover from related bugs. They prevent issues on several platforms and make the Save As dialog work. They significantly improve server efficiency as well, since handling logic doesn't have to wait as long. This is particularly relevant because of the design of the IIS/ASP.NET modules system.
Seaside by default points example.com/myapp to whatever application is registered at myapp. I'd like to have a core application that can also handle these links, or some other way of handling these links.
So far, I have a home application that is also registered as the default application, so http://mydomain.com will resolve to it, but if I generate a link, like http://mydomain.com/more-info, Seaside tries to resolve an application registered at more-info. What if I want my home application to handle the link? Or handle it in some other way?
I'm hosting Seaside with Apache, so I could use Apache's URL rewriting engine to rewrite http://mydomain.com/more-info to http://mydomain.com/home/more-info, which would be handled by my home app.
Is there a better way to do this? Also, if a link exists to an explanation of the Seaside request/response lifecycle, that'd be sweet.
What you are trying to do is not common practice in Seaside applications. If you want to generate a link from one page to another page in your application, you generally use a callback attached to an anchor:
html anchor callback: [ self call: moreInfoComponent]
In such cases, you do not care what the URL looks like, and Seaside generates the URL for you. Such generated URLs never have a nested structure but use parameters.
More information on the Seaside request/response cycle can be found in the online book (chapters "Fundamentals" and "Sequencing Components").
However, if you indeed want to have such a nested url (to make urls bookmarkable), there are different approaches, depending on what you actually want to achieve. You can either take a look at the approach for handling expired sessions (in the book) or at the Seaside-REST package.
Btw, the mapping of urls to applications happens through (instances of) WADispatcher. If you inspect the result of the following expression, you can see the dispatcher tree of Seaside. It's entirely customizable by adding new applications, dispatchers, etc...
WAAdmin defaultServerManager adaptors first requestHandler
Hope this helps you on your way...
We have members-only paid content that is frequently copied and republished without our permission.
We are trying to ‘watermark’ our content by including each customer’s user id in a fake CSS class, for example <p class='userid_1234'> (except not so obvious, of course :), which would help us track the source of the copying, and then we place that class somewhere in the article body.
The problem is, by including user-specific information into an article, it makes it so that the article content is ineligible for caching because it is now unique to each user.
This bumps the page load time from ~.8ms to ~2.5sec for each article page view.
Does anyone know of any watermarking strategies that can still be used with caching?
Alternatively, what can be done to speed up database access? (Ha, ha, that there’s just a tiny topic, I’m sure...)
We're using the CMS Expression Engine, but I'd like to hear about any strategies. They don't have to be EE-specific.
If you're talking about images then you could use PHP to add a watermark to the images.
How can I add an image onto an image in PHP like a watermark
It's a tool to help track down the lazy copiers who just copy the source code as-is. This is not preventative, nor is it a deterrent. – Ian 12 hours ago
Going by your above comment you are happy with users copying your content, just not without the formatting etc. So what you could do is provide the users an embed type of source code for that particular content just like YouTube does with videos. Into that embed source code you could add your own links back to your site, utilize your own CSS etc.
That way you can still allow the members to use the content but it will always come out the way you intended it with links back to your site.
Thanks
You could always cache a version that uses a special string, like #!username!#, and then later fill it in with PHP based on which user is viewing it.
Another way, I believe, is to switch from caching on the server to letting the browser cache the page locally for a little while. That way it is only cached per user, and it reduces the calls to your database. Because an article is pretty static, you could just let the local computer cache it and pull in comments via JavaScript.
This last one is probably not one you are really looking for, but I'm gonna come out and say it anyway. You could not treat your users like thieves, and instead treat the thieves as thieves. Go to the person hosting the servers your content is on and send them an email telling them copyrighted premium content is being hosted on their servers without your permission. You can even automate that process.
How to find out what sites are posting your content? Put a link in the body content to your site, and do a Google Search/Blog Search for articles linking to that site. To automate it, use Google Blog Search because it offers RSS feeds. Anyone that has a link back to your site could go into a database with a link to the page; someone could look at it and, if it is the entire article, do a Whois and send them an email.
What makes you think adding CSS to something is going to stop people from copying it without that CSS? It's more likely that they are just copying the source of the content you are showing them and ignoring all the styling around it. For example, I use Tamper Data to look at all HTTP requests made by Firefox; if I can see it on the page, I can see it in the logs. Even with all the "protection" some sites try to put in place, it generally never works. I can grab what I want without using any screen capture/recording.
If you were serving FLVs, for example, I would easily be able to grab the source of that even if you overlaid it with some CSS. I think the best approach would be to contact the sites publishing your premium content and ask them to remove it. It's either that or watermark the actual content on the fly while sending it to the browser.
When I look at Amazon.com and I see their URL for pages, it does not have .htm, .html or .php at the end of the URL.
It is like:
http://www.amazon.com/books-used-books-textbooks/b/ref=topnav_storetab_b?ie=UTF8&node=283155
Why and how? What kind of extension is that?
Your browser doesn't care about the extension of the file, only the content type that the server reports. (Well, unless you use IE because at Microsoft they think they know more about what you're serving up than you do). If your server reports that the content being served up is Content-Type: text/html, then your browser is supposed to treat it like it's HTML no matter what the file name is.
Typically, it's implemented using a URL rewriting scheme of some description. The basic notion is that the web should be moving to addressing resources with proper URIs, not classic old URLs which leak implementation detail, and which are vulnerable to future changes as a result.
A thorough discussion of the topic can be found in Tim Berners-Lee's article Cool URIs Don't Change, which argues in favour of reducing the irrelevant cruft in URIs as a means of helping to avoid the problems that occur when implementations do change, and when resources do move to a different URL. The article itself contains good general advice on planning out a URI scheme, and is well worth a read.
More specifically than most of these answers:
Browsers don't use the file extension to determine what kind of file is being served (unless the browser is Internet Explorer). Instead, they use the Content-type HTTP header, which is sent down the wire before the content of the image, HTML page, download, or whatever. For example:
Content-type: text/html
denotes that the page you are viewing should be interpreted as HTML, and
Content-type: image/png
denotes that the page is a PNG image.
Web servers often use the file extension if the file is served directly from disk to determine what Content-type to assign, but web applications can also generate pages with any Content-type they like in response to a request. No matter the filename's structure or extension, so long as the actual content of the page matches with the declared Content-type, the data renders as intended.
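A small Python sketch of both halves of that paragraph, using only the standard library. The names content_type_for_file and app are made up for the example. The first function shows the extension-based guess a server might make for a file on disk; the second is a bare WSGI application that serves HTML at a URL with no extension at all.

```python
import mimetypes


def content_type_for_file(filename, default="application/octet-stream"):
    """Guess a Content-Type from the file extension, as servers do for files on disk."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


def app(environ, start_response):
    """Minimal WSGI app: whatever the URL looks like, the header declares the type."""
    body = b"<h1>Books</h1>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

The app never looks at an extension; the browser treats the response as HTML purely because of the declared Content-Type.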
Websites that use Apache are probably using mod_rewrite, which enables them to rewrite URLs (and make them more user and SEO friendly).
You can read more here http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
and here http://www.sitepoint.com/article/apache-mod_rewrite-examples/
EDIT: There are rewriting modules for IIS as well.
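To show what a rewrite rule does conceptually, here is a toy version in Python. This is not mod_rewrite syntax, and the /books/ pattern and book.php target are invented for the example; it only mimics the idea of mapping a clean public URL onto the script that really handles it.

```python
import re

# Hypothetical rule: (pattern for the public URL, internal target with a backreference).
REWRITE_RULES = [
    (re.compile(r"^/books/(\d+)/?$"), r"/book.php?id=\1"),
]


def rewrite(path):
    """Return the internal path for the first matching rule, or the path unchanged."""
    for pattern, target in REWRITE_RULES:
        if pattern.match(path):
            return pattern.sub(target, path)
    return path
```

The visitor only ever sees /books/42, while the server internally runs the script with id=42.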
Traditionally the file extension represents the file that is being served.
For example
http://someserver/somepath/image.jpg
Later, that same approach was used to let a script process the parameters:
http://somerverser/somepath/script.php?param=1234&other=7890
In this case the file was a PHP script that processed the "request" and presented a dynamically created file.
Nowadays, applications are much more complex than that (namely Amazon, which you mentioned).
There is no longer a single script that handles the request (but a much more complex app with several files, methods, functions, objects, etc.), and the URL is more like the entry point for a web application (it may have a script behind it, but that's another matter). So now web apps like Amazon, and yes, Stack Overflow, don't show a file in the URL; anything coming in is processed by the app on the server side.
websites urls without file extension?
Here, questions represents the web app and 322747 the parameter.
I hope this little explanation helps you to understand better all the other answers.
Well, how about having an index.html file in the directory and then typing the path into the browser? I see that my Firefox and IE7 both put the trailing slash in automatically; I don't have to type it. This is more suited to people like me who do not think every single URL on earth should invoke PHP, Perl, CGI and 10,000 other applications just to send a few kilobytes of data.
A lot of people are using a more "RESTful" type of architecture... or at least, REST-looking URLs.
This site (Stack Overflow) doesn't show a file extension... it's using ASP.NET MVC.
Depending on the settings of your server you can use (or not) any extension you want. You could even set extensions to be ".JamesRocks" but it won't be very helpful :)
Anyway, just in case you're new to web programming: all that gibberish on the end there is the arguments to a GET operation, not the page's extension.
A number of posts have mentioned this, and I'll weigh in. It absolutely is a URL rewriting system, and a number of platforms have ways to implement this.
I've worked for a few larger ecommerce sites, and it is now a very important part of the web presence, and offers a number of advantages.
I would recommend taking the technology you want to work with and researching samples of the URL rewriting mechanism for that platform. For .NET, for example, google 'asp.net url rewriting' or use an add-on framework like MVC, which provides this functionality out of the box.
In Django (a web application framework for Python), you design the URLs yourself, independent of any file name, or even any path on the server for that matter.
You just say something like "I want /news/<number>/ urls to be handled by this function"
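To give a feel for what that means, here is a tiny stand-alone imitation of the idea in plain Python. This is not Django's actual API; the route decorator, ROUTES list, dispatch function and news_detail view are all invented for the sketch. It just shows a URL pattern being tied to a function rather than to a file.

```python
import re

ROUTES = []  # list of (compiled pattern, view function)


def route(pattern):
    """Register a view function for URLs matching the given regular expression."""
    def register(view):
        ROUTES.append((re.compile(pattern), view))
        return view
    return register


@route(r"^/news/(?P<number>\d+)/$")
def news_detail(number):
    # The captured part of the URL arrives as an argument (a string).
    return "news item " + number


def dispatch(path):
    """Call the first view whose pattern matches; return None if nothing matches."""
    for pattern, view in ROUTES:
        match = pattern.match(path)
        if match:
            return view(**match.groupdict())
    return None
```

With this in place, dispatch("/news/42/") calls news_detail with number set to "42", and no file named 42 or news needs to exist anywhere on the server.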