I am using HtmlAgilityPack to get the meta tags and other descriptions of a page. The code works fine for simple websites such as Tumblr, Twitter, and Stack Overflow.
But when I try to load major sites such as Google, it returns only the title, "Google", and no description tag. Similarly, for Facebook it returns no description, and the title it returns is "Update your browser | Facebook".
I am new to this package; I installed the latest version from NuGet in MS WebMatrix. The code I am using is:
@using HtmlAgilityPack;
@{
    Layout = "~/_SiteLayout.cshtml";
    var Title = "";
    var Description = "";
    using(var client = new WebClient()){
        var html = client.DownloadString("http://www.facebook.com");
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var title = doc.DocumentNode.Descendants("title").FirstOrDefault();
        if(title != null){
            Title = title.InnerText;
        }
        var description = doc.DocumentNode.Descendants("meta")
            .Where(n => n.GetAttributeValue("name", String.Empty)
            .Contains("description")).FirstOrDefault();
        if(description != null){
            Description = description.GetAttributeValue("content", string.Empty);
        }
    }
}
Judging by the returned title, this looks like an outdated-browser issue. How can I fix this?
After searching for a long time, I got the solution from Mike Brind on the ASP.NET Forums.
var Image = "";
using(var client = new WebClient()){
    // Identify as a desktop browser; without a User-Agent some sites serve a fallback page.
    client.Headers.Add("user-agent", "Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36");
    // "method" and "version" are not standard HTTP request headers; kept as in the original answer.
    client.Headers.Add("method", "GET");
    client.Headers.Add("version", "HTML/1.1");
    client.Headers.Add("accept",
        "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
    var html = client.DownloadString("http://www.google.com");
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
    // Page title
    var title = doc.DocumentNode.Descendants("title").FirstOrDefault();
    if(title != null){
        Title = title.InnerText;
    }
    // <meta name="description" content="...">
    var description = doc.DocumentNode.Descendants("meta")
        .Where(n => n.GetAttributeValue("name", String.Empty)
        .Contains("description")).FirstOrDefault();
    if(description != null){
        Description = description.GetAttributeValue("content", string.Empty);
    }
    // <link rel="shortcut icon" href="...">
    var image = doc.DocumentNode.Descendants("link")
        .Where(n => n.GetAttributeValue("rel", String.Empty)
        .Contains("shortcut icon")).FirstOrDefault();
    if(image != null) {
        Image = image.GetAttributeValue("href", string.Empty);
    }
}
This was the code I needed. The key point is that when a person makes a request from their computer, the browser sends some details about itself to the server (request headers such as User-Agent), whereas I wasn't sending anything. That's why Facebook and Google were not returning the full page to me. Once I included some browser-like headers, they returned the details I needed.
After that, it was good to go!
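For anyone not using WebClient, the same idea can probably be expressed with HttpClient. This is only a minimal sketch under assumptions: the names BrowserLikeFetcher, BuildRequest and DownloadHtmlAsync are hypothetical, and the header values are simply copied from the code above. It shows the browser-like headers being attached to a single request before the HTML is downloaded.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper class; not part of HtmlAgilityPack or the original answer.
public static class BrowserLikeFetcher
{
    // Builds a GET request that carries browser-like headers.
    public static HttpRequestMessage BuildRequest(string url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        // TryAddWithoutValidation stores the raw header value as given.
        request.Headers.TryAddWithoutValidation("User-Agent",
            "Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36");
        request.Headers.TryAddWithoutValidation("Accept",
            "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        return request;
    }

    // Sends the request and returns the response body as a string.
    public static async Task<string> DownloadHtmlAsync(HttpClient client, string url)
    {
        using (var request = BuildRequest(url))
        using (var response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

The returned string can then be passed to doc.LoadHtml(html) exactly as in the code above.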
Related
When scraping the following website, I do not get the table I want to scrape. I am waiting for the dynamic text to load, but I never see the correct table in the results.
https://masseyratings.com/nba/games
Here is my Agility Pack code:
var url = "https://masseyratings.com/nba/games";
HtmlWeb web = new HtmlWeb();
var doc = web.LoadFromBrowser(url, o =>
{
    var webBrowser = (WebBrowser)o;
    // WAIT until the dynamic text is set
    return !string.IsNullOrEmpty(webBrowser.Document.GetElementById("mytable0").InnerText);
});
int docLen = doc.Text.Length;
currentSiteData = doc.Text.ToString();
I am not getting any error; I just don't see the table of data. Strangely, the HTML tags are also being capitalized.
How can I get the correct data into the currentSiteData variable for further processing?
I was able to fix the problem by using the "PuppeteerSharp" and "AngleSharp" NuGet packages.
Here is the code that works:
using PuppeteerSharp;
using AngleSharp;
var browserFetcher = new BrowserFetcher();
await browserFetcher.DownloadAsync(BrowserFetcher.DefaultChromiumRevision);
var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = true
});
var page = await browser.NewPageAsync();
await page.GoToAsync("https://masseyratings.com/nba/games");
var content = await page.GetContentAsync();
var context = BrowsingContext.New(AngleSharp.Configuration.Default);
var document = await context.OpenAsync(req => req.Content(content));
var currentSiteData = document.Source.Text.ToString();
This is a follow-up to my previous question: Xamarin.Forms App return data to calling App
That works perfectly and I can share images anywhere except to Facebook comments. When I tap the camera icon in the comment box, my app can be selected and I can select the image; SetResult and Finish are called, the app closes, and it sends data to Facebook. However, I then get the error: "The image could not be uploaded, try again".
I can't find any fundamental difference between posting to a status and posting to a comment, so I'm guessing it's subtle. Any thoughts on how I can change my intent so that it posts properly?
Adding the code for completeness:
Bitmap b = null;
string url;
if (!string.IsNullOrEmpty(this.saleItems[i].ImageUrl))
{
    url = this.saleItems[i].ImageUrl;
}
else
{
    url = await FileHelper.GetLocalFilePathAsync(this.saleItems[i].Id);
}
//download
using (var webClient = new WebClient())
{
    var imageBytes = webClient.DownloadData(url);
    if (imageBytes != null && imageBytes.Length > 0)
    {
        b = BitmapFactory.DecodeByteArray(imageBytes, 0, imageBytes.Length);
    }
}
//set local path
var tempFilename = "test.png";
var sdCardPath = Android.OS.Environment.ExternalStorageDirectory.AbsolutePath;
var filePath = System.IO.Path.Combine(sdCardPath, tempFilename);
using (var os = new FileStream(filePath, FileMode.Create))
{
    b.Compress(Bitmap.CompressFormat.Png, 100, os);
}
b.Dispose();
var imageUri = Android.Net.Uri.Parse($"file://{sdCardPath}/{tempFilename}");
var sharingIntent = new Intent();
sharingIntent.SetAction(Intent.ActionSend);
sharingIntent.SetType("image/*");
sharingIntent.PutExtra(Intent.ExtraText, "some txt content");
sharingIntent.PutExtra(Intent.ExtraStream, imageUri);
sharingIntent.AddFlags(ActivityFlags.GrantReadUriPermission);
//await SaleItemDataService.Instance.BuySaleItemAsync(this.saleItem);
SetResult(Result.Ok, sharingIntent);
Finish();
Use the following:
Intent sharingIntent = new Intent();
string imageUri = "file://" + requestedUri;
sharingIntent.SetData(Android.Net.Uri.Parse(imageUri));
private async void refresh_Tapped(object sender, TappedRoutedEventArgs e)
{
    httpclient.CancelPendingRequests();
    string url = "http://gensav.altervista.org/";
    var source = await httpclient.GetStringAsync(url); //PROBLEM
    source = WebUtility.HtmlDecode(source);
    HtmlDocument result = new HtmlDocument();
    result.LoadHtml(source);
    List<HtmlNode> toftitle = result.DocumentNode.Descendants().Where
        (x => (x.Attributes["style"] != null
        && x.Attributes["style"].Value.Contains("font-size:14px;line-height:20px;margin-bottom:10px;"))).ToList();
    var li = toftitle[0].InnerHtml.Replace("<br>", "\n");
    li = li.Replace("<span style=\"text-transform: uppercase\">", "");
    li = li.Replace("</span>", "");
    postTextBlock.Text = li;
}
What this code does is basically retrieve a string from a website (the HTML source, which is parsed right after). This code runs whenever I click a button: the first time I click it, it works correctly, but the second time I think GetStringAsync returns an incomplete task and execution then continues with the old value of source. Indeed, my TextBlock does not update.
Is there a solution?
You are probably getting a cached response.
Maybe this will work for you:
httpclient.CancelPendingRequests();
// disable caching
httpclient.DefaultRequestHeaders.Add("Cache-Control", "no-cache");
string url = "http://gensav.altervista.org/";
var source = await httpclient.GetStringAsync(url);
...
You can also add a meaningless, unique value to your URL so that each request looks different to the cache, like this:
string url = "http://gensav.altervista.org/" + "?nocache=" + Guid.NewGuid();
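If the URL may already contain a query string, plain concatenation with "?" would produce a malformed URL. The cache-busting idea could be wrapped in a small helper, roughly as sketched here. This is an illustration under assumptions, not a tested fix for this app: the names CacheBuster and AddNoCacheParameter are hypothetical, and the parameter name "nocache" has no special meaning to the server.

```csharp
using System;

// Hypothetical helper; the names are for illustration only.
public static class CacheBuster
{
    // Returns the given absolute URL with a unique "nocache" query parameter appended.
    public static string AddNoCacheParameter(string url)
    {
        var builder = new UriBuilder(url);
        string token = "nocache=" + Guid.NewGuid().ToString("N");

        // The Query getter includes a leading '?' when a query is present,
        // so strip it before appending the new parameter.
        string existing = builder.Query.TrimStart('?');
        builder.Query = existing.Length == 0 ? token : existing + "&" + token;

        return builder.Uri.ToString();
    }
}
```

Usage would then look like: var source = await httpclient.GetStringAsync(CacheBuster.AddNoCacheParameter(url));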
To prevent HTTP responses from being cached, I do this (in WP8.1):
HttpBaseProtocolFilter filter = new HttpBaseProtocolFilter();
filter.CacheControl.ReadBehavior =
Windows.Web.Http.Filters.HttpCacheReadBehavior.MostRecent;
filter.CacheControl.WriteBehavior =
Windows.Web.Http.Filters.HttpCacheWriteBehavior.NoCache;
_httpClient = new HttpClient(filter);
Initialize your HttpClient in this manner to prevent caching behaviour.
I am dynamically generating images in C#, and when the page loads I want the user to be able to view more than one image at a time inside a FlipView. Unfortunately, when I add the images to the FlipView, only one image is visible at a time, and the user has to scroll to see another image. Using a GridView solves my problem, but I don't want the GridView-style scrollbar. Here is my code:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(
    new Uri(@"http://indiamp3.com/music/index.php"));
HttpWebResponse response = (HttpWebResponse)request.GetResponseAsync().Result;
StreamReader sr = new StreamReader(response.GetResponseStream());
string stream = sr.ReadToEnd();
// Keep only the part of the page between the site title and the "Random Albums" heading
string reader = stream.Split(new string[] {
    "IndiaMp3.Com - Download & play free Indian hindi mp3 songs", "Random Albums" },
    StringSplitOptions.None)[1];
List<string> readerLink = new List<string>();
readerLink.AddRange(reader.Split(new string[] { "<img src='", ".jpg'" },
    StringSplitOptions.None));
readerLink.RemoveAll(s => s.Contains("audio") == false);
List<string> songList = new List<string>();
List<string> imageLink = new List<string>();
imageLink.AddRange(reader.Split(new string[] { "href='", "'>" },
    StringSplitOptions.None));
imageLink.RemoveAll(s => s.Contains("action=") == false);
imageLink = imageLink.Distinct().ToList();
List<Image> images = new List<Image>();
foreach (string link in readerLink)
{
    songList.Add(link.Split('/')[3]);
    // imageLink.Add("http://indiamp3.com/music/" + link + ".jpg");
    var image = new Image();
    image.Source = new BitmapImage(new Uri(@"http://indiamp3.com/music/" +
        link + ".jpg", UriKind.Absolute));
    image.Stretch = Stretch.None;
    image.HorizontalAlignment = HorizontalAlignment.Left;
    image.VerticalAlignment = VerticalAlignment.Center;
    images.Add(image);
    fvLatest.Items.Add(image); // fvLatest is the name of the FlipView
}
I've searched for the same thing, and the only answer I found is this:
I do not think FlipView is designed for that. If you want to display multiple flip views, you could try to use other controls to render your content, and navigate by code.
Best regards,
Sheldon _Xiao[MSFT]
@ MSDN
I'm trying to upload photos to Facebook using MVC3 and C#. The code runs without errors, but the photos are not uploaded to Facebook. I have added the App ID and App Secret. I have tried many approaches over many days, but with no result. Here is the code of my controller:
[HttpPost][HttpGet]
public ActionResult Profile(HttpPostedFileBase file, FacebookOAuthResult facebookOAuthResult) {
    dynamic args = new ExpandoObject();
    args = new Dictionary<string, object>();
    args["message"] = "hi";
    args["picture"] = "http://apps.facebook.com/Uploads/photos";
    string accesstoken=FacebookWebContext.Current.AccessToken;
    FacebookClient fbApp = new FacebookClient(accesstoken);
    try {
        fbApp.Post("MYAPPID" + "/Photos", args);
    } catch (FacebookOAuthException ex) {
        //
    }
    // Verify that the user selected a file
    if (file != null && file.ContentLength > 0) {
        var path1 = Path.Combine(Server.MapPath("~/Content/uppoads"), file.FileName);
        //file.SaveAs(path1);
        fbApp.Post("MYAPPID" + "/photos", path1);
    }
    // redirect back to the index action to show the form once again
    return RedirectToAction("Profile");
}
Could anyone help me to find the solution? Thanks in advance.
You are posting the local path of the photo to Facebook; Facebook doesn't know what that path refers to.
You should post the photo as binary data in the POST body.
var media = new Facebook.FacebookMediaObject();
var filebytes = System.IO.File.ReadAllBytes(path1);
media.SetValue(filebytes);
fbApp.Post("248050331932489" + "/photos", media);