Serving HTML snapshots from an alternative server when using #! and _escaped_fragment_

We are currently developing a set of (jQuery) plugins that let our software be used from our clients' sites. Our clients include these plugins in their pages; the plugins get their data from our API (on another server) and render the application in the client page.
Within the plugins, the user navigates through hash changes. We would really like to do some SEO for these URLs. Google provides a way to index AJAX-enabled sites by serving HTML snapshots as alternative content: https://developers.google.com/webmasters/ajax-crawling/docs/specification.
This is all very nice when you serve the AJAX-enabled application from your own server. But in our situation, the host page is served from client.com, and our JavaScript is served from include.oursystem.com. The plugins get their data through CORS calls to api.oursystem.com. We have no access to the server of client.com.
Long story, short question: is there a way to serve the HTML snapshots from an alternative server we control, e.g. api.oursystem.com? Or are there other alternatives for indexing our application on the client pages?
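For context, the scheme in the linked specification has the crawler rewrite a #! URL into an _escaped_fragment_ query on the same origin as the host page, which is the crux of the problem here. A minimal sketch of that mapping (our own illustration, not code from the spec):

    // Per the AJAX crawling specification, a crawler rewrites
    //   http://client.com/page#!/products/42
    // into
    //   http://client.com/page?_escaped_fragment_=%2Fproducts%2F42
    // i.e. the snapshot request goes to client.com, not to our servers.
    function toEscapedFragmentUrl(hashBangUrl) {
      var parts = hashBangUrl.split('#!');
      var separator = parts[0].indexOf('?') === -1 ? '?' : '&';
      return parts[0] + separator + '_escaped_fragment_=' +
        encodeURIComponent(parts[1] || '');
    }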

Related

Can I use GitHub Pages to host a web page made with the Spring framework (.jsp)?

As the title states, I was wondering if I could host a web page using GitHub Pages when the page is made with the Spring framework (.jsp files).
Or can only client-side web languages (HTML, JS, etc.) be used on GitHub Pages?
Nothing can be executed server side on GitHub, besides GitHub Actions.
That means GitHub Pages can only host the sources for your project; it cannot be a release/deployment destination for server-side code.
For instance, HelioHost might have a free tier allowing JSP deployment.

How do I launch and serve a subpage from my web application?

I am using Spring to create a web application in which a user can upload a zipped folder containing an index.html file along with all its resources (much like an Adobe Captivate generated web page). The user should then be able to request the uploaded web pages in the form of inner web pages.
I can get as far as unzipping the folder, but I have no idea how to serve the index.html inside it.
How do I achieve this?
Quite honestly, Spring imposes no restrictions on (and offers no particular advantages for) displaying your subpages inside another page. However, you can use Spring MVC to dynamically serve the web pages from the uploaded folder.
Moreover, you have to play the trick from the browser side. An iframe seems to be the best option on the client side, though there are many others; please check this thread.
You can write some small endpoints in a Spring controller that accept the folder path or folder name as a parameter, pick the necessary pages from the requested folder, and serve them to the user.
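As an illustration of the iframe option, here is a minimal client-side sketch. It assumes a hypothetical controller mapping that serves the unzipped upload's files as static content under /uploads/{folderName}/; the endpoint path is made up for the example:

    // Show an uploaded page as an "inner web page" inside the current one.
    // Assumes a (hypothetical) server endpoint that serves files from the
    // unzipped upload directory at /uploads/{folderName}/...
    function showUploadedPage(folderName) {
      var frame = document.createElement('iframe');
      frame.src = '/uploads/' + encodeURIComponent(folderName) + '/index.html';
      frame.style.width = '100%';
      frame.style.height = '600px';
      frame.style.border = 'none';
      document.getElementById('inner-page-container').appendChild(frame);
    }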
Another approach could be to use a headless browser for server-side rendering and return the output to the client as screenshots. Please check this thread for more details.
I hope this helps you!

SEO-friendly static snapshots of single-page web apps without using a dynamic server

Scenario:
I'm hosting a single-page application on GitHub Pages. When accessed via a browser, JavaScript processes the #! URLs and updates the UI.
I also want my app's content to appear in search results. Since I'm already using GulpJS, I want to add a task that creates and saves pre-rendered HTML snapshots of my page to accommodate web crawlers. (A rough sketch of such a task follows at the end of this entry.)
Problem:
My content is served by GitHub Pages, so I have no way to handle the _escaped_fragment_ parameter that web crawlers send. Every tutorial I've found on AJAX SEO assumes you host your content on, e.g., a Node.js or Apache server and have a way to process such URL parameters.
Question:
Can a single-page web app be SEO-friendly when hosted on a static file server (e.g. GitHub Pages)? Is there a special directory I can use? Some special Sitemap or robots.txt configuration? Something else?
What I've found already:
Frequently Asked Questions (via Google Webmasters)
How do I create an HTML snapshot? (via Google Webmasters)
AngularJS and SEO (via Yearofmoo)
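Here is the promised sketch of such a Gulp task, under stated assumptions (none of which come from the question): PhantomJS is on the PATH, capture.js is a small hypothetical PhantomJS script that loads a URL, waits for the app to render, and prints the resulting HTML to stdout, and the #! routes are listed by hand.

    var gulp = require('gulp');
    var fs = require('fs');
    var execFile = require('child_process').execFile;

    var routes = ['', 'about', 'projects']; // the app's #! routes, maintained by hand

    gulp.task('snapshots', function (done) {
      if (!fs.existsSync('snapshots')) fs.mkdirSync('snapshots');
      var pending = routes.length;
      routes.forEach(function (route) {
        var url = 'http://localhost:8080/#!/' + route;
        execFile('phantomjs', ['capture.js', url], { maxBuffer: 10 * 1024 * 1024 },
          function (err, html) {
            if (err) return done(err);
            // GitHub Pages cannot rewrite ?_escaped_fragment_= requests, so the
            // snapshots are committed as plain files a crawler can be pointed at.
            fs.writeFileSync('snapshots/' + (route || 'index') + '.html', html);
            if (--pending === 0) done();
          });
      });
    });

Whether a crawler ever requests those files is a separate problem; this only covers generating them at build time.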

Having multiple AngularJS apps for one site?

I am developing a site that can be broken down into a handful of main pages. These pages can be thought of as isolated from each other, except that they share session data (i.e. the session ID and logged-in username).
Initially, I was going to build the site as a SPA using ng-view (i.e. make the pages into AngularJS views). But then, I don't see any benefit for my site in being implemented that way, and it would require extra time and effort to support SEO (Making AJAX Applications Crawlable).
Going with an approach that provides no benefits and even creates extra work doesn't seem too smart. So I thought to myself: why don't I make the main pages of my site into individual AngularJS apps? The parts of the site that need to be indexed by search engines are simply the initial screens of some of those apps, so I wouldn't need to do extra work for SEO. (Note: the initial screens are rendered by the Django server with data for search engines to crawl, so they are not blank.)
Each app may or may not have its own set of partials, depending on its requirements.
Example:
mydomain.com/item_page/1234 (load "item" app)
mydomain.com/dashboard (load "dashboard" app)
mydomain.com/account (load "account" app and default to "tab_1" view)
mydomain.com/account#tab_1 (load "tab_1" view of "account" app)
mydomain.com/account#tab_2 (load "tab_2" view of "account" app)
mydomain.com/post_item (load "post" app)
This is solely my own idea, and I haven't seen any AngularJS examples comprised of multiple AngularJS apps. I would like to know:
Is the multiple-AngularJS-apps-for-one-site approach feasible? What caveats should I be aware of? Are there any example sites out in the wild taking this approach?
If it is feasible, how do I share the session data between the apps?
Note that this post is about multiple AngularJS apps for one site, not multiple AngularJS apps on the same page.
There is nothing wrong with such an approach, as long as you keep the size of the downloaded JS small enough and ensure good caching. One example of such an application is GitHub: when you go to the Issues page, it loads an HTML page, the common GitHub JS libraries, and page-specific JS code; navigation and actions inside that page are handled by the single page-specific script. If you go to another section (like Code), a new page with new page-specific JS code is loaded. Another example is the Amazon AWS console, which even uses different frameworks for different pages. (Neither GitHub nor Amazon uses Angular, but the approach works for any JS-based framework, even GWT.)
As for sharing session data between pages, you can embed the information directly in the page itself using inline scripts or hidden elements: when your server generates a page, it should also render the session information into it. Another approach is to download the session data once and store it in local storage/session storage.
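A minimal sketch of the inline-script variant, with all names (window.SESSION, the shared.session module) made up for the example:

    // In every server-rendered page:
    //   <script>window.SESSION = {"sessionId": "...", "username": "alice"};</script>
    // Each Angular app then includes this shared module to read it.
    angular.module('shared.session', []).factory('Session', ['$window',
      function ($window) {
        // Prefer the server-embedded data; fall back to sessionStorage so a
        // page that was not freshly rendered can still recover the session.
        var data = $window.SESSION ||
          JSON.parse($window.sessionStorage.getItem('session') || '{}');
        $window.sessionStorage.setItem('session', JSON.stringify(data));
        return data;
      }]);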

Getting Google to crawl AJAX / dynamically generated content

I've got a rather unique situation that I don't believe any of the other topics here address.
I have an ecommerce module that is dynamically loaded / embedded into third-party sites: no iframe, just JSON sent straight to the web client and rendered into the page content. I have no access to these third-party sites at all, beyond my JavaScript file being loaded from their pages and dynamically generating the content.
I'm aware of the #! method, but it's no good here: my JS does generate "URLs" within the embedded platform, but they're fake, exist only for the address bar, and I don't believe Google's crawlers can reach that far.
So my question is: is there a meta tag we can set to point outside the URL, i.e. back to my server with static, crawlable content? E.g. pointing the canonical at my server... but again, I don't think that would work.
If you implement #!, you have to make sure the URL you're embedded in supports the _escaped_fragment_ parameter versions, which you probably can't ensure. It's server-side stuff.
You probably can't influence the canonical tag of the page either; that again has to be done server side. Any meta tag you set via JavaScript will not be seen by a bot.
Disqus solved the problem by providing an API so the embedding websites can fetch their comments server side and render them as plain HTML (WordPress has a plugin for this). Disqus is also one of the few systems whose AJAX pages Google has worked out how to crawl.
Some plugins ask people to also include a plain link along with the JavaScript; a sketch of that pattern follows below. Be careful with this, as you may break Google's guidelines if you do it wrong. But you may be able to integrate the plain link with your plugin so that it directs bots and users to a crawlable version of the content.
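A sketch of that plain-link embed pattern, with every name and URL invented for the example: the client pastes a snippet whose fallback link points at a crawlable HTML version on your own server, and your script replaces it with the dynamic module for regular visitors.

    <!-- Pasted into the third-party page (all names hypothetical). -->
    <div id="shop-widget">
      <!-- Bots and no-JS visitors follow this link to crawlable content. -->
      <a href="https://api.example-shop.com/catalog/12345">Browse the catalog</a>
    </div>
    <script async src="https://include.example-shop.com/widget.js"
            data-shop-id="12345"></script>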
Look into Google's crawlable ajax standard (and why it's a bad idea) and canonical URLs.
Now you can actually do this. A complete guide and examples can be found here: https://github.com/kubrickology/Logical-escaped_fragment
