Ajax Crawlable site - Get HTML Snapshot using WaTin

I'd like to use WatiN to get HTML snapshots so that my single-page AJAX app is crawlable, as per Google's specification.
Can somebody please show me a code example of how to return an HTML snapshot to Googlebot using WatiN?
This is what I have so far, but honestly I'm not sure where I'm going with this; I haven't found a single example on the web that really clears it up:
Controller:
public PartialViewResult Solutions()
{
    // Googlebot request
    HtmlString htmlSnapshot = (HtmlString)GetHtmlSnapshot(Request.Path);
    //return ?????????
    // Normal user request
    return PartialView("Solutions", null);
}
public PartialViewResult MarketData()
{
    return PartialView("MarketData", null);
}
public IHtmlString GetHtmlSnapshot(string uglyUrl)
{
    if (uglyUrl.Contains("?_escaped_fragment_="))
    {
        // The parameter name ends with an underscore; the pretty URL carries the state after "#!".
        string prettyUrl = uglyUrl.Replace("?_escaped_fragment_=", "#!");
        string decodedUrl = HttpUtility.UrlDecode(prettyUrl);
        FireFox firefox = new FireFox();
        firefox.GoTo(decodedUrl);
        //?????????
    }
}
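One piece of this can be sketched independently of WatiN: mapping the crawler's `_escaped_fragment_` URL back to the `#!` URL that the browser should load. Under Google's AJAX crawling scheme, the crawler requests `page?_escaped_fragment_=state` in place of `page#!state`, with special characters in the state percent-encoded. The following is a minimal sketch of that mapping, written in Java purely for illustration (the same string handling carries over to C#). The class and method names are hypothetical, and it assumes `_escaped_fragment_` is the last query parameter. It does not cover rendering the page or returning the snapshot.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: maps an "_escaped_fragment_" URL back to its "#!" form. */
final class EscapedFragment {
    private static final String PARAM = "_escaped_fragment_=";

    /** Returns the pretty URL, or null when the URL is not a snapshot request. */
    static String toPrettyUrl(String uglyUrl) {
        int idx = uglyUrl.indexOf(PARAM);
        // The parameter only counts when it directly follows '?' or '&'.
        if (idx <= 0 || (uglyUrl.charAt(idx - 1) != '?' && uglyUrl.charAt(idx - 1) != '&')) {
            return null;
        }
        // Everything before the separator is kept as-is (assumption: the
        // parameter is the last one in the query string).
        String base = uglyUrl.substring(0, idx - 1);
        String escaped = uglyUrl.substring(idx + PARAM.length());
        // Undo the percent-encoding applied to the fragment by the crawler.
        String fragment = URLDecoder.decode(escaped, StandardCharsets.UTF_8);
        return base + "#!" + fragment;
    }

    public static void main(String[] args) {
        System.out.println(toPrettyUrl("http://example.com/Solutions?_escaped_fragment_=tab%3Dpricing"));
        System.out.println(toPrettyUrl("http://example.com/Solutions?lang=en&_escaped_fragment_=tab%3Dpricing"));
        System.out.println(toPrettyUrl("http://example.com/Solutions"));
    }
}
```

In the controller above, the input would probably need to include the query string (for example `Request.RawUrl`), since `Request.Path` does not contain it.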

Related

How to get rid of forcefully added underscore with timestamp

I'm working on a project with MVC5 and ran into an issue where MVC is trying to "help" me, I think. I'm trying to tidy up views by extracting their JavaScript without fiddling around with the bundle config.
Messing around with generic loading in the bundle folder would be an option, but I'd rather avoid that.
So here's my code, which is working:
public static string GetControllerName(this ControllerBase controller)
{
    var name = controller.GetType().Name;
    if (name.ToLower().EndsWith("controller"))
        return name.Substring(0, name.Length - 10);
    return name;
}
public static IHtmlString LoadViewScripts(this HtmlHelper helper)
{
    var view = helper.ViewContext.View as RazorView;
    if (view != null && !string.IsNullOrEmpty(view.ViewPath))
    {
        var controllerName = helper.ViewContext.Controller.GetControllerName();
        var vPath = string.Format("~/Scripts/View/{0}/{1}.js", controllerName, Path.GetFileName(view.ViewPath));
        var realPath = helper.ViewContext.HttpContext.Server.MapPath(vPath);
        if (File.Exists(realPath))
        {
            var urlPath = UrlHelper.GenerateContentUrl(vPath, helper.ViewContext.HttpContext);
            var lwrite = File.GetLastWriteTime(realPath);
            return new HtmlString(string.Format("<script type='text/javascript' language='javascript' src='{0}?v={1}'></script>", urlPath, lwrite.Ticks));
        }
    }
    return new HtmlString(string.Empty);
}
The result I am passing to the src attribute:
http://localhost/AppName/Scripts/View/Map/SelectSite.cshtml.js?v=635569414679160391
What I am getting (network view of Firefox):
http://localhost/AppName/Scripts/View/Map/SelectSite.cshtml.js?v=635569414679160391&_=1421342715176
What I am expecting (network view of Firefox):
http://localhost/AppName/Scripts/View/Map/SelectSite.cshtml.js?v=635569414679160391
Does anyone know how to stop MVC from "helping" me cache/refresh things in this particular case?

OK. So it turns out there were two reasons why it wasn't working for me.
First, the Kendo option that controls caching (cache busting is enabled by default) wasn't listed in the API documentation. Setting it like this fixed my window issue:
content: {
    url: targetUrlSiteSelect,
    cache: true
}
Second, my additional script tag still carried the underscore parameter. I later found the solution in a comment here:
jQuery version 1.5 - ajax - <script> tag timestamp problem
Comment by Blaise:
$.ajaxPrefilter('script', function(options) { options.cache = true; });

Session id issue: Different id for vf and developer console

I am facing the following issue:
On load of a Visualforce page I make an HTTP callout to an internal Salesforce page, but I run into an authentication problem.
If I run the same HTTP callout from the Developer Console, I get a successful response, but the same code does not work from the Visualforce page. The reason is that the session ID differs between the Developer Console and the Visualforce domain.
To fetch the session ID I am using "UserInfo.getSessionId()".
I have also tried {!$Api.Session_ID}, but that does not work either.
My controller:
public with sharing class HttpRequestForPage
{
    public HttpRequestForPage()
    {
        requestForPage('https://ap1.salesforce.com/home/home.jsp');
    }
    public void requestForPage(String pageUrl)
    {
        HttpResponse responseOfPage;
        String responseString;
        HttpRequest request = new HttpRequest();
        request.setMethod('GET');
        // Use the URL passed in rather than a second hard-coded copy.
        request.setEndpoint(pageUrl);
        request.setHeader('Cookie', 'sid=' + UserInfo.getSessionId());
        try
        {
            responseOfPage = new Http().send(request);
        }
        catch (Exception e)
        {
            System.debug(e);
            return; // no response to read if the callout failed
        }
        responseString = responseOfPage.getBody();
        System.debug('responseString==' + responseString);
    }
}

Rather than setting a cookie for the session, try using the Authorization header.
request.setHeader('Authorization','Bearer '+UserInfo.getSessionId());
You will also need to set ap1.salesforce.com as an endpoint in the remote site settings.
If you are requesting pages from within Salesforce you can just use the PageReference getContent() method.
PageReference home = new PageReference('https://ap1.salesforce.com/home/home.jsp');
blob homeblob = home.getContent();
string homeContent = homeblob.toString();

Testing of the PhpBB forum

I am testing a phpBB forum with the help of HtmlUnit. I want to add a message to one of the forum's topics. I enter the message in the textarea and click the submit button, but the new page doesn't contain my message. What seems to be the problem?
final ArrayList<HtmlForm> formList=(ArrayList<HtmlForm>) page.getByXPath("//form[@id='postform']");
final HtmlTextArea myMessage=formList.get(0).getTextAreaByName("message");
myMessage.type("text");
final HtmlSubmitInput submit_post = formList.get(0).getInputByName("post");
page=submit_post.click();
if (page.asText().contains("text")) {
    System.out.println("right");
}
Thank you for your help!

One way around this is to construct a custom HTTP POST request.
Use Fiddler to capture the traffic and then replicate it with HtmlUnit. It should look something like this:
public Page postMessage() throws Exception
{
    URL url = new URL("YOURURL");
    WebRequest requestSettings = new WebRequest(url, HttpMethod.POST);
    requestSettings.setAdditionalHeader("Accept", "*/*");
    requestSettings.setAdditionalHeader("Content-Type", "application/x-www-form-urlencoded");
    requestSettings.setAdditionalHeader("Referer", "REFERER");
    requestSettings.setAdditionalHeader("Accept-Language", "en-US,en;q=0.8");
    requestSettings.setAdditionalHeader("Accept-Encoding", "gzip,deflate,sdch");
    requestSettings.setRequestBody("YOURPOSTREQUEST");
    Page page = webClient.getPage(requestSettings);
    return page;
}
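The request body placeholder above has to be a form-encoded string, so any field values need escaping before they are joined together. Below is a minimal sketch of building such a body with the Java standard library only. The `FormBody` class is hypothetical; the field names `message` and `post` come from the question, and the values are made up. A real phpBB post will most likely need the additional hidden fields that show up in the captured traffic.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds an application/x-www-form-urlencoded body. */
final class FormBody {
    /** Encodes each name and value, then joins the pairs with '&'. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("message", "Hello & welcome"); // illustrative value
        fields.put("post", "Submit");             // illustrative value
        System.out.println(encode(fields));
    }
}
```

The resulting string could then be passed to `requestSettings.setRequestBody(...)` in place of the placeholder.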

Returning 'other than HTML' formatted errors for AJAX calls to Web API

I'm investigating the Web API as part of an MVC 4 project as an alternative way to provide an AJAX-based API. I've extended AuthorizeAttribute for the MVC controllers so that, if an AJAX request is detected, a JSON-formatted error is returned. The Web API, however, returns errors as HTML. Here's the AuthorizeAttribute that I'm using with the MVC controllers:
public class AuthorizeAttribute: System.Web.Mvc.AuthorizeAttribute
{
    protected override void HandleUnauthorizedRequest(AuthorizationContext filterContext)
    {
        base.HandleUnauthorizedRequest(filterContext);
        filterContext.Result = new RedirectToRouteResult(
            new RouteValueDictionary
            {
                { "area", "" },
                { "controller", "Error" },
                { "action", ( filterContext.HttpContext.Request.IsAjaxRequest() ? "JsonHttp" : "Http" ) },
                { "id", "401" },
            });
    }
}
How could I reproduce this to provide equivalent functionality for the Web API?
I realize that I need to extend System.Web.Http.AuthorizeAttribute instead of System.Web.Mvc.AuthorizeAttribute, but this uses an HttpActionContext rather than an AuthorizationContext, so I'm stuck, given my limited knowledge of the Web API and the seemingly incomplete documentation on MSDN.
Am I even correct in thinking that this would be the right approach?
I would appreciate any guidance.

To get the equivalent functionality in a Web API filter you can set the HttpActionContext.Response property to an instance of HttpResponseMessage that has the right redirect status code and location header:
protected override void HandleUnauthorizedRequest(HttpActionContext actionContext) {
    var response = new HttpResponseMessage(HttpStatusCode.Redirect);
    response.Headers.Location = new Uri("my new location");
    actionContext.Response = response;
}

I would very much go with Marcin's answer - at the end of the day, he has written the code!
All I would add is that, as Marcin says, your best bet is to have a dedicated controller that returns the errors as appropriate, rather than setting the 401 response code with JSON content in the attribute.
The main reason is that Web API does the content negotiation for you; if you do it yourself (deciding whether to serve JSON or HTML), you lose all that functionality.

Sitecore Extranet login and PDF converter or Webclient

I'm having a problem using Winnovative's PdfConverter on pages that are protected by Extranet security (which is based on ASP.NET Membership).
I've tried several different approaches. The following works on my local machine, but not anywhere else.
Code for the login page; this should bypass the login process for the converter:
// check that the current "user" isn't logged in and is the Winnovative UserAgent
if (!Sitecore.Context.IsLoggedIn && Request.UserAgent.Contains(".NET CLR"))
{
    //Login with a dummy user I've created
    Sitecore.Security.Authentication.AuthenticationManager.Login("extranet\\pdf", "pdf", true);
    //redirect to former page
}
The page that generates the PDF uses this code:
private void PDFPrint(string url)
{
    PdfConverter pdfConverter = new PdfConverter();
    pdfConverter.LicenseKey = "our license";
    url = Request.Url.Scheme + "://" + Request.Url.Host + url;
    byte[] downloadBytes = pdfConverter.GetPdfFromUrlBytes(url);
    HttpResponse response = HttpContext.Current.Response;
    response.Clear();
    response.AddHeader("Content-Type", "binary/octet-stream");
    response.AddHeader("Content-Disposition", "attachment; filename=" + Sitecore.Context.Item.Name + ".pdf" + "; size=" + downloadBytes.Length.ToString());
    response.Flush();
    response.BinaryWrite(downloadBytes);
    response.Flush();
    response.End();
}
The Exception I'm getting is this:
"Could not get the metafile from url. Could not get image from url.The URL is not accessible.."
I've also tried this trick from the Winnovative FAQ to no avail:
http://www.winnovative-software.com/FAQ.aspx#authenticationQ
I've also tried to use WebClient or HttpWebRequest to retrieve the content.
But nothing I do seems to work other than locally.
Basically, I want to find a way of getting Winnovative's converter to use the currently logged-in user or my custom "pdf" user, or some other way of getting the HTML from the response.
I hope this question isn't too vague; I find it rather hard to ask. In short, I want to get some HTML content from a page on a Sitecore solution I control, which is protected by Sitecore's normal Extranet security. This HTML content should be in string or byte[] format.
Help me Stackoverflowers, you're my only hope! :P

I contacted Sitecore to ask if they had a solution.
Their solution was to create a Processor that would set an active user based on some criteria.
This is the code I made for my site (it's probably not the best solution as UserAgent can be spoofed):
public class MyResolver : HttpRequestProcessor
{
    // Methods
    public override void Process(HttpRequestArgs args)
    {
        var userAgent = args.Context.Request.UserAgent ?? "";
        SiteContext site = Sitecore.Context.Site;
        if (site.Name == "site_name_in_webconfig" && userAgent.Contains("this_should_only_be_in_thepdfcreators_userAgent"))
        {
            Sitecore.Security.Accounts.User pdfuser = Sitecore.Security.Accounts.User.FromName("extranet\\theUser", true);
            AuthenticationManager.SetActiveUser(pdfuser);
        }
    }
}
and then add the following to the web.config, before the UserResolver:
<processor type="Namespace.MyResolver, Assembly" />
I hope this will help some others out there.

I've found a similar issue on the ASP.NET forums and the answer to that was to use a newer version of the PDF tool: SessionState Problems ?
