I've got a partial view, and I'm trying to use iTextSharp to convert its HTML to PDF. How can I render the view's HTML to a string so I can pass it to iTextSharp's HtmlParser?
I've tried something like this with no luck. Any ideas?
var contents = System.IO.File.ReadAllText(Url.Action("myPartial", "myController", new { id = 1 }, "http"));
I have created a special ViewResult class that you can return as the result of an Action.
You can see the code on bitbucket (look at the PdfFromHtmlResult class).
So what it basically does is:
Render the view through the Razor engine (or any other registered engine) to HTML.
Pass the HTML to iTextSharp.
Return the PDF as the ViewResult (with the correct MIME type, etc.).
My ViewResult class looks like:
public class PdfFromHtmlResult : ViewResult {
public override void ExecuteResult(ControllerContext context) {
if (context == null) {
throw new ArgumentNullException("context");
}
if (string.IsNullOrEmpty(this.ViewName)) {
this.ViewName = context.RouteData.GetRequiredString("action");
}
if (this.View == null) {
this.View = this.FindView(context).View;
}
// First get the html from the Html view
using (var writer = new StringWriter()) {
var vwContext = new ViewContext(context, this.View, this.ViewData, this.TempData, writer);
this.View.Render(vwContext, writer);
// Convert to pdf
var response = context.HttpContext.Response;
using (var pdfStream = new MemoryStream()) {
var pdfDoc = new Document();
var pdfWriter = PdfWriter.GetInstance(pdfDoc, pdfStream);
pdfDoc.Open();
using (var htmlRdr = new StringReader(writer.ToString())) {
var parsed = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(htmlRdr, null);
foreach (var parsedElement in parsed) {
pdfDoc.Add(parsedElement);
}
}
pdfDoc.Close();
response.ContentType = "application/pdf";
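// A Content-Disposition value needs a disposition type, not just a file name
response.AddHeader("Content-Disposition", "attachment; filename=" + this.ViewName + ".pdf");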
byte[] pdfBytes = pdfStream.ToArray();
response.OutputStream.Write(pdfBytes, 0, pdfBytes.Length);
}
}
}
}
With the correct extension methods (see BitBucket), etc, the code in my controller is something like:
public ActionResult MyPdf(int id) {
var myModel = findDataWithID(id);
// this assumes there is a MyPdf.cshtml/MyPdf.aspx as the view
return this.PdfFromHtml(myModel);
}
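Note: Your approach does not work for two reasons. File.ReadAllText reads from a file system path, not from a URL. And even if you fetched the URL with an HTTP client, the request would originate on the server, so it would lose all the cookies (and therefore the session information) that are stored on the client.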
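The Content-Disposition header set in PdfFromHtmlResult needs a disposition type ("attachment" or "inline") in addition to the file name. The following is a minimal sketch of building that value, assuming System.Net.Mime.ContentDisposition is available in your target framework. The class and method names (PdfHeaders, BuildAttachmentHeader) are hypothetical and not part of the original code.

```csharp
using System.Net.Mime;

public static class PdfHeaders
{
    // Hypothetical helper: builds a Content-Disposition value for a PDF download.
    public static string BuildAttachmentHeader(string viewName)
    {
        var disposition = new ContentDisposition
        {
            Inline = false,              // "attachment" asks the browser to save the file
            FileName = viewName + ".pdf" // file name suggested to the browser
        };
        return disposition.ToString();
    }
}
```

The returned string could then be passed to response.AddHeader("Content-Disposition", ...) in place of a hand-built value.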
I am trying to fill and combine multiple forms without flattening them (they need to stay interactive for users). I have PDF files containing the forms I am trying to fill, and the form fields have their fonts set in Adobe Acrobat. After I combine the forms, the fields lose their original fonts. Here is my program.
using iText.Forms;
using iText.Kernel.Pdf;
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
namespace PdfCombineTest
{
class Program
{
static void Main(string[] args)
{
Stream file1;
Stream file2;
using (var stream = new FileStream("./pdf-form-1.pdf", FileMode.Open, FileAccess.Read))
{
file1 = Program.Fill(stream, new[] { KeyValuePair.Create("Text1", "TESTING"), KeyValuePair.Create("CheckBox1", "Yes") });
}
using (var stream = new FileStream("./pdf-form-2.pdf", FileMode.Open, FileAccess.Read))
{
file2 = Program.Fill(stream, new[] { KeyValuePair.Create("Text2", "text 2 text") });
}
using (Stream output = Program.Combine(new[] { file1, file2 }))
{
using (var fileStream = File.Create("./output.pdf"))
{
output.CopyTo(fileStream);
}
}
}
public static Stream Combine(params Stream[] streams)
{
MemoryStream copyStream = new MemoryStream();
PdfWriter writer = new PdfWriter(copyStream);
writer.SetSmartMode(true);
writer.SetCloseStream(false);
PdfPageFormCopier formCopier = new PdfPageFormCopier();
using (PdfDocument combined = new PdfDocument(writer))
{
combined.InitializeOutlines();
foreach (var stream in streams)
{
using (PdfDocument document = new PdfDocument(new PdfReader(stream)))
{
document.CopyPagesTo(1, document.GetNumberOfPages(), combined, formCopier);
}
}
}
copyStream.Seek(0, SeekOrigin.Begin);
return copyStream;
}
public static Stream Fill(Stream inputStream, IEnumerable<KeyValuePair<string, string>> keyValuePairs)
{
MemoryStream outputStream = new MemoryStream();
PdfWriter writer = new PdfWriter(outputStream);
writer.SetCloseStream(false);
using (PdfDocument document = new PdfDocument(new PdfReader(inputStream), writer))
{
PdfAcroForm acroForm = PdfAcroForm.GetAcroForm(document, true);
acroForm.SetGenerateAppearance(true);
IDictionary<string, iText.Forms.Fields.PdfFormField> fields = acroForm.GetFormFields();
foreach (var kvp in keyValuePairs)
{
fields[kvp.Key].SetValue(kvp.Value);
}
}
outputStream.Seek(0, SeekOrigin.Begin);
return outputStream;
}
}
}
After several hours of debugging, I've noticed that PdfPageFormCopier excludes the default resources, which contain the fonts, when merging form fields. Is there a way around this? The project I'm working on currently does this process in iTextSharp, and it works as intended. However, we are looking to migrate to iText7.
Here are links to some sample PDFs I made. I can't upload the actual PDFs I'm working with, but these display the same problem.
https://www.dropbox.com/s/pukt91d4xe8gmmo/pdf-form-1.pdf?dl=0
https://www.dropbox.com/s/c52x6bc99gnrvo6/pdf-form-2.pdf?dl=0
So my solution was to modify the PdfPageFormCopier class from iText. The main issue is in the function below.
public virtual void Copy(PdfPage fromPage, PdfPage toPage) {
if (documentFrom != fromPage.GetDocument()) {
documentFrom = fromPage.GetDocument();
formFrom = PdfAcroForm.GetAcroForm(documentFrom, false);
}
if (documentTo != toPage.GetDocument()) {
documentTo = toPage.GetDocument();
formTo = PdfAcroForm.GetAcroForm(documentTo, true);
}
if (formFrom == null) {
return;
}
//duplicate AcroForm dictionary
IList<PdfName> excludedKeys = new List<PdfName>();
excludedKeys.Add(PdfName.Fields);
excludedKeys.Add(PdfName.DR);
PdfDictionary dict = formFrom.GetPdfObject().CopyTo(documentTo, excludedKeys, false);
formTo.GetPdfObject().MergeDifferent(dict);
IDictionary<String, PdfFormField> fieldsFrom = formFrom.GetFormFields();
if (fieldsFrom.Count <= 0) {
return;
}
IDictionary<String, PdfFormField> fieldsTo = formTo.GetFormFields();
IList<PdfAnnotation> annots = toPage.GetAnnotations();
foreach (PdfAnnotation annot in annots) {
if (!annot.GetSubtype().Equals(PdfName.Widget)) {
continue;
}
CopyField(toPage, fieldsFrom, fieldsTo, annot);
}
}
Specifically the line here.
excludedKeys.Add(PdfName.DR);
If you walk through the code in the CopyField() function, you eventually end up in the PdfFormField class. You can see the constructor below.
public PdfFormField(PdfDictionary pdfObject)
: base(pdfObject) {
EnsureObjectIsAddedToDocument(pdfObject);
SetForbidRelease();
RetrieveStyles();
}
The function RetrieveStyles() tries to set the font for the field based on the default appearance. However, that does not work because of the function below.
private PdfFont ResolveFontName(String fontName) {
PdfDictionary defaultResources = (PdfDictionary)GetAcroFormObject(PdfName.DR, PdfObject.DICTIONARY);
PdfDictionary defaultFontDic = defaultResources != null ? defaultResources.GetAsDictionary(PdfName.Font) :
null;
if (fontName != null && defaultFontDic != null) {
PdfDictionary daFontDict = defaultFontDic.GetAsDictionary(new PdfName(fontName));
if (daFontDict != null) {
return GetDocument().GetFont(daFontDict);
}
}
return null;
}
It checks whether the font exists in the default resources, which were explicitly excluded in the PdfPageFormCopier class, so it never finds the font.
So my solution was to create my own class that implements the IPdfPageExtraCopier interface. I copied the code from the PdfPageFormCopier class and removed the one line excluding the default resources. Then I use my own copier class in my code. Not the prettiest solution but it works.
I am attempting to upload an image using MVC 6; however, I am not able to find the class HttpPostedFileBase. I have checked the GitHub and did not have any luck. Does anyone know the correct way to upload a file in MVC6?
MVC 6 uses a different mechanism to upload files. You can find more examples on GitHub or in other sources. Use IFormFile as a parameter of your action, or a collection of files (or IFormFileCollection) if you want to upload several files at the same time:
public async Task<IActionResult> UploadSingle(IFormFile file)
{
FileDetails fileDetails;
using (var reader = new StreamReader(file.OpenReadStream()))
{
var fileContent = reader.ReadToEnd();
var parsedContentDisposition = ContentDispositionHeaderValue.Parse(file.ContentDisposition);
var fileName = parsedContentDisposition.FileName;
}
...
}
[HttpPost]
public async Task<IActionResult> UploadMultiple(ICollection<IFormFile> files)
{
var uploads = Path.Combine(_environment.WebRootPath,"uploads");
foreach(var file in files)
{
if(file.Length > 0)
{
var fileName = ContentDispositionHeaderValue.Parse(file.ContentDisposition).FileName.Trim('"');
await file.SaveAsAsync(Path.Combine(uploads,fileName));
}
...
}
}
You can see the current contract of IFormFile in the ASP.NET sources. See also ContentDispositionHeaderValue for additional file info.
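The file name parsed from the Content-Disposition header is supplied by the client, so it is worth reducing it to a bare file name before combining it with a server path. The following is a minimal sketch under that assumption. The class and method names (UploadNames, ToSafeFileName) are hypothetical and not part of ASP.NET.

```csharp
using System;

public static class UploadNames
{
    // Hypothetical helper: strips quotes and any directory part from a
    // client-supplied file name so it cannot escape the upload folder.
    public static string ToSafeFileName(string rawFileName)
    {
        if (string.IsNullOrWhiteSpace(rawFileName))
        {
            throw new ArgumentException("File name is empty.", nameof(rawFileName));
        }

        // The header value is often wrapped in double quotes.
        string trimmed = rawFileName.Trim().Trim('"');

        // Treat both separators as path separators, whatever the server OS is.
        int lastSeparator = trimmed.LastIndexOfAny(new[] { '/', '\\' });
        string leaf = lastSeparator >= 0 ? trimmed.Substring(lastSeparator + 1) : trimmed;

        if (leaf.Length == 0 || leaf == "." || leaf == "..")
        {
            throw new ArgumentException("File name is not usable.", nameof(rawFileName));
        }

        return leaf;
    }
}
```

The result could then be passed to Path.Combine(uploads, ...) instead of the raw header value.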
There is no HttpPostedFileBase in MVC6. You can use IFormFile instead.
Example: https://github.com/aspnet/Mvc/blob/dev/test/WebSites/ModelBindingWebSite/Controllers/FileUploadController.cs
Snippet from the above link:
public FileDetails UploadSingle(IFormFile file)
{
FileDetails fileDetails;
using (var reader = new StreamReader(file.OpenReadStream()))
{
var fileContent = reader.ReadToEnd();
var parsedContentDisposition = ContentDispositionHeaderValue.Parse(file.ContentDisposition);
fileDetails = new FileDetails
{
Filename = parsedContentDisposition.FileName,
Content = fileContent
};
}
return fileDetails;
}
I was searching around for quite a while trying to piece this together in .net core and ended up with the below. The Base64 conversion will be next to be done so that the retrieval and display is a little easier. I have used IFormFileCollection to be able to do multiple files.
[HttpPost]
public async Task<IActionResult> Create(IFormFileCollection files)
{
Models.File fileIn = new Models.File();
if(model != null && files != null)
{
foreach (var file in files)
{
if (file.Length > 0)
{
var fileName = ContentDispositionHeaderValue.Parse(file.ContentDisposition).FileName.Trim('"');
byte[] fileBytes = null;
using (var fileStream = file.OpenReadStream())
using (var ms = new MemoryStream())
{
fileStream.CopyTo(ms);
fileBytes = ms.ToArray();
//string s = Convert.ToBase64String(fileBytes);
// act on the Base64 data
}
fileIn.Filename = fileName;
fileIn.FileLength = Convert.ToInt32(file.Length);
fileIn.FileType = file.ContentType;
fileIn.FileTypeId = model.FileTypeId;
fileIn.FileData = fileBytes;
_context.Add(fileIn);
await _context.SaveChangesAsync();
}
}
}
return View();
}
EDIT
Below are the actions that return the files as a list and then download one of them.
public JsonResult GetAllFiles()
{
var files = _context.File
.Include(a => a.FileCategory)
.Select(a => new
{
id = a.Id.ToString(),
fileName = a.Filename,
fileData = a.FileData,
fileType = a.FileType,
friendlyName = a.FriendlyName,
fileCategory = a.FileCategory.Name.ToLower()
}).ToList();
return Json(files);
}
public FileStreamResult DownloadFileById(int id)
{
// Fetching file encoded code from database.
var file = _context.File.SingleOrDefault(f => f.Id == id);
var fileData = file.FileData;
var fileName = file.Filename;
// Convert the Base64-encoded data to a byte array
byte[] bytes = Convert.FromBase64String(fileData);
// Converting byte array to memory stream.
MemoryStream stream = new MemoryStream(bytes);
// Create final file stream result.
FileStreamResult fileStream = new FileStreamResult(stream, "*/*");
// File name with file extension.
fileStream.FileDownloadName = fileName;
return fileStream;
}
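The download action above calls Convert.FromBase64String, which takes a Base64 string, while the upload action stores the raw bytes. The following is a minimal sketch of the round trip, assuming the file data is persisted as Base64 text. The class and method names (Base64RoundTrip, Encode, Decode) are hypothetical.

```csharp
using System;

public static class Base64RoundTrip
{
    // Encode raw file bytes as Base64 text before saving them.
    public static string Encode(byte[] fileBytes)
    {
        return Convert.ToBase64String(fileBytes);
    }

    // Decode the stored Base64 text back into the original bytes.
    public static byte[] Decode(string base64)
    {
        return Convert.FromBase64String(base64);
    }
}
```

If the data is stored as a byte array instead, it can be handed to MemoryStream directly and no decoding step is needed.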
Let's say I want a screen scraper that doesn't care whether you pass it the URL of an HTML page, an XML document, or a text file.
examples:
http://tonto.eia.doe.gov/oog/info/wohdp/dslpriwk.txt
http://google.com
This will work if the page is HTML or a text file:
public class ScreenScrapingService : IScreenScrapingService
{
public XDocument Scrape(string url)
{
var scraper = new HtmlWeb();
var stringWriter = new StringWriter();
var xml = new XmlTextWriter(stringWriter);
scraper.LoadHtmlAsXml(url, xml);
var text = stringWriter.ToString();
return XDocument.Parse(text);
}
}
However, if it is an XML file such as:
http://www.eia.gov/petroleum/gasdiesel/includes/gas_diesel_rss.xml
[Test]
public void Scrape_ShouldScrapeSomething()
{
//arrange
var sut = new ScreenScrapingService();
//act
var result = sut.Scrape("http://www.eia.gov/petroleum/gasdiesel/includes/gas_diesel_rss.xml");
//assert
}
Then I get the error:
An exception of type 'System.Xml.XmlException' occurred in System.Xml.dll but was not handled in user code
Is it possible to write this so that it doesn't care what the URL ultimately is?
To see the exact exception in Visual Studio, press Ctrl+Alt+E and enable Common Language Runtime Exceptions. It seems that LoadHtmlAsXml expects HTML, so your best bet is probably to use WebClient.DownloadString(url) together with an HtmlDocument whose OptionOutputAsXml property is set to true, as shown below. When parsing that output fails, catch the exception and parse the downloaded string directly:
public XDocument Scrape(string url)
{
var wc = new WebClient();
var htmlorxml = wc.DownloadString(url);
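var doc = new HtmlDocument() { OptionOutputAsXml = true };
// Load the downloaded markup; without this call the document stays empty
doc.LoadHtml(htmlorxml);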
var stringWriter = new StringWriter();
doc.Save(stringWriter);
try
{
return XDocument.Parse(stringWriter.ToString());
}
catch
{
//it only gets here when the string is xml already
try
{
return XDocument.Parse(htmlorxml);
}
catch
{
return null;
}
}
}
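The fallback above relies on bare catch blocks, which also hide unrelated failures. The following is a minimal sketch of a narrower check that tries strict XML parsing first and reports failure only for malformed XML. The class and method names (XmlSniffer, TryParseXml) are hypothetical, and a caller would still need to fall back to the HTML parser when it returns null.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class XmlSniffer
{
    // Returns the parsed document when the content is well-formed XML,
    // or null when it is not (for example, loosely written HTML or plain text).
    public static XDocument TryParseXml(string content)
    {
        if (string.IsNullOrWhiteSpace(content))
        {
            return null;
        }

        try
        {
            return XDocument.Parse(content);
        }
        catch (XmlException)
        {
            // Only malformed XML is swallowed; other exceptions still surface.
            return null;
        }
    }
}
```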
I store a few PDF files in my database in binary format using the code below in my controller:
[HttpPost]
public ActionResult Index(HttpPostedFileBase file)
{
Image newImage = new Image();
newImage.MimeType = file.ContentType;
var binaryReader = new BinaryReader(file.InputStream);
newImage.Data = binaryReader.ReadBytes(file.ContentLength);
binaryReader.Close();
objImage.InsertImage(newImage.Data);
return View();
}
Now I want to download them again based on the id passed to the controller. How can I make the PDF file download?
This is my code for the PDF download. What else do I need to add?
public ActionResult Download(int id)
{
DataSet da = new DataSet();
da = objImage.getUserImage(id);
DataTable dt = new DataTable();
dt = da.Tables[0];
Byte[] imagedata=(Byte[])dt.Rows[0]["UsImage"];
}
Return an ActionResult:
public ActionResult Download(int id)
{
...
byte[] imagedata = (byte[])dt.Rows[0]["UsImage"];
return File(imagedata, "image/png");
}
and if you want the browser to popup a Save As dialog instead of displaying the image inline specify a filename:
public ActionResult Download(int id)
{
...
byte[] imagedata = (byte[])dt.Rows[0]["UsImage"];
return File(imagedata, "image/png", "foo.png");
}
Obviously the MIME type and the filename could come from your database as well. In this example I have hardcoded them but you could adapt this code.
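If only the file name is stored, the MIME type can be derived from its extension. The following is a minimal sketch, assuming a small hand-maintained lookup table is enough for the file types you store. The class name, method name, and table contents are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MimeTypes
{
    // Hypothetical lookup table; extend it with the types you actually store.
    private static readonly Dictionary<string, string> ByExtension =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".pdf", "application/pdf" },
            { ".png", "image/png" },
            { ".jpg", "image/jpeg" },
            { ".txt", "text/plain" }
        };

    // Maps a file name to a MIME type, falling back to a generic binary type.
    public static string FromFileName(string fileName)
    {
        string extension = Path.GetExtension(fileName) ?? string.Empty;
        string mimeType;
        return ByExtension.TryGetValue(extension, out mimeType)
            ? mimeType
            : "application/octet-stream";
    }
}
```

The result could be passed as the second argument of File(...), with the stored file name as the third.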
return File(result.Content, result.Extension.Replace(".", ""));
public ActionResult Download(int id)
{
DataSet da = new DataSet();
da = objImage.getUserImage(id);
DataTable dt = new DataTable();
dt = da.Tables[0];
Byte[] imagedata=(Byte[])dt.Rows[0]["UsImage"];
return File(imagedata, "image/png");
}
public ActionResult GetPdf(int id)
{
ProjectProfile projectprofile = db.ProjectProfiles.Find(id);
var image = projectprofile.pdf;
return File(image, "application/pdf");
}
I just want to generate a PDF document of the details shown in the view when a button is clicked.
In order to generate a PDF file you will need a third-party library, as this functionality is not built into the .NET Framework. iTextSharp is a popular one.
So for example you could write a custom action result:
public class PdfResult : ActionResult
{
public override void ExecuteResult(ControllerContext context)
{
var response = context.HttpContext.Response;
response.ContentType = "application/pdf";
var cd = new ContentDisposition
{
Inline = true,
FileName = "test.pdf",
};
response.AddHeader("Content-Disposition", cd.ToString());
using (var doc = new Document())
using (var writer = PdfWriter.GetInstance(doc, response.OutputStream))
{
doc.Open();
doc.Add(new Phrase("Hello World"));
}
}
}
and then have your controller action return this result:
public class HomeController : Controller
{
public ActionResult Index()
{
return new PdfResult();
}
}