I want to convert multiple HTML files to one PDF. I used the code below to create the PDF, but it produces an empty PDF with no contents.
ArrayList<String> ar = new ArrayList<String>();
ar.add("/resources/html1.html");
String outputFile = "C:/WholePDF.pdf";
Document document = new Document();
PdfWriter.getInstance(document, new FileOutputStream(outputFile));
document.open();
for (String htmlfile : ar) {
String html = Utilities.readFileToString(htmlfile);
ElementList list = XMLWorkerHelper.parseToElementList(html, null);
for (Element e : list) {
document.add(e);
}
document.newPage();
}
document.close();
I am using iText 7 HTML2PDF 3.0.5 to convert my HTML to PDF. My HTML contains an ordered list, and after converting to PDF the list is overlapped by the content of the list.
Before, I was using iText XMLWorker. With it I didn't have this issue, but I had other issues with tables. To fix them I moved to iText 7 html2pdf.
Could someone please help me with this issue?
Here is my sample code snippet:
public static byte[] html2pdf(String htmlString) throws Exception {
Document pdfDoc = null;
ByteArrayOutputStream baos = null;
try {
baos = new ByteArrayOutputStream();
PdfWriter pdfWriter = new PdfWriter(baos);
PdfDocument pdfDocument = new PdfDocument(pdfWriter);
ConverterProperties converterProperties = new ConverterProperties();
HtmlConverter.convertToPdf(htmlString, pdfDocument, converterProperties);
pdfDocument.close();
return baos.toByteArray();
} catch (Exception exception) {
return null;
} finally {
try {
baos.close();
} catch (Exception ignored) {
}
}
}
(Screenshots: sample1 generated PDF, sample2 generated PDF, sample1 expected result, sample2 expected result.)
We were using Apache Commons FileUpload to upload a file, with the code below.
ServletFileUpload upload = new ServletFileUpload(itemFactory);
List<FileItem> items = upload.parseRequest(request);
Iterator<FileItem> iterator = items.iterator();
while (iterator.hasNext()) {
FileItem item = iterator.next();
if (item.isFormField()) {
String name = item.getFieldName();
String value = item.getString();
conf.put(name, value);
} else {
InputStream is = item.getInputStream();
byte[] bytes = ByteStreams.toByteArray(is);
String query = new String(bytes, "UTF-8");
conf.put("test", query);
}
}
But we recently moved to Spring Boot and are trying to use Spring multipart to upload a file. The code that replaces the above is:
DefaultMultipartHttpServletRequest requestMain = (DefaultMultipartHttpServletRequest) request;
Iterator<String> fileNameIterator = requestMain.getFileNames();
ListIterator<MultipartFile> iterator = null;
while (fileNameIterator.hasNext()) {
List<MultipartFile> multipartFiles = requestMain.getFiles(fileNameIterator.next());
iterator = multipartFiles.listIterator();
while (iterator.hasNext()) {
MultipartFile item = iterator.next();
if (item.isEmpty()) {
// String name = item.getFieldName();
// String value = item.getString();
// conf.put(name, value);
} else {
InputStream is = item.getInputStream();
byte[] bytes = ByteStreams.toByteArray(is);
String query = new String(bytes, "UTF-8");
conf.put("query", query);
}
}
}
But I am not able to figure out how to check whether an item is a form field in Spring Boot. In Apache Commons FileUpload this was done with the item.isFormField() method.
You don't need to check whether the item is a form field.
requestMain.getFiles(fileNameIterator.next()) returns a list of MultipartFile objects, so you already know they are not form fields.
To get the form fields, use:
requestMain.getParameterMap()
Or, for a specific field, use:
requestMain.getParameter("field")
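As a minimal sketch of how the form fields could then be copied into conf: getParameterMap() returns a Map<String, String[]>, so no isFormField() check is needed. The class and method names here (FormFields, copyInto) are hypothetical, and conf is assumed to be a Map<String, String>; in a controller the first argument would be requestMain.getParameterMap().

```java
import java.util.Map;

// Hypothetical helper, not part of Spring or the servlet API.
final class FormFields {
    private FormFields() {}

    // Copies the first value of every form field into conf.
    // parameterMap has the shape returned by ServletRequest.getParameterMap().
    static void copyInto(Map<String, String[]> parameterMap, Map<String, String> conf) {
        for (Map.Entry<String, String[]> entry : parameterMap.entrySet()) {
            String[] values = entry.getValue();
            // Skip fields that were submitted without any value.
            if (values != null && values.length > 0) {
                conf.put(entry.getKey(), values[0]);
            }
        }
    }
}
```

Only the first value of each field is kept, which matches the single-valued conf.put(name, value) of the original Apache loop; a field submitted several times would need different handling.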
I am trying to fill and combine multiple forms without flattening (I need to keep them interactive for users). However, I noticed a problem. I have PDF files that contain the forms I am trying to fill. The form fields have their fonts set in Adobe PDF. After I combine the forms, the fields lose their original fonts. Here is my program.
using iText.Forms;
using iText.Kernel.Pdf;
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
namespace PdfCombineTest
{
class Program
{
static void Main(string[] args)
{
Stream file1;
Stream file2;
using (var stream = new FileStream("./pdf-form-1.pdf", FileMode.Open, FileAccess.Read))
{
file1 = Program.Fill(stream, new[] { KeyValuePair.Create("Text1", "TESTING"), KeyValuePair.Create("CheckBox1", "Yes") });
}
using (var stream = new FileStream("./pdf-form-2.pdf", FileMode.Open, FileAccess.Read))
{
file2 = Program.Fill(stream, new[] { KeyValuePair.Create("Text2", "text 2 text") });
}
using (Stream output = Program.Combine(new[] { file1, file2 }))
{
using (var fileStream = File.Create("./output.pdf"))
{
output.CopyTo(fileStream);
}
}
}
public static Stream Combine(params Stream[] streams)
{
MemoryStream copyStream = new MemoryStream();
PdfWriter writer = new PdfWriter(copyStream);
writer.SetSmartMode(true);
writer.SetCloseStream(false);
PdfPageFormCopier formCopier = new PdfPageFormCopier();
using (PdfDocument combined = new PdfDocument(writer))
{
combined.InitializeOutlines();
foreach (var stream in streams)
{
using (PdfDocument document = new PdfDocument(new PdfReader(stream)))
{
document.CopyPagesTo(1, document.GetNumberOfPages(), combined, formCopier);
}
}
}
copyStream.Seek(0, SeekOrigin.Begin);
return copyStream;
}
public static Stream Fill(Stream inputStream, IEnumerable<KeyValuePair<string, string>> keyValuePairs)
{
MemoryStream outputStream = new MemoryStream();
PdfWriter writer = new PdfWriter(outputStream);
writer.SetCloseStream(false);
using (PdfDocument document = new PdfDocument(new PdfReader(inputStream), writer))
{
PdfAcroForm acroForm = PdfAcroForm.GetAcroForm(document, true);
acroForm.SetGenerateAppearance(true);
IDictionary<string, iText.Forms.Fields.PdfFormField> fields = acroForm.GetFormFields();
foreach (var kvp in keyValuePairs)
{
fields[kvp.Key].SetValue(kvp.Value);
}
}
outputStream.Seek(0, SeekOrigin.Begin);
return outputStream;
}
}
}
After several hours of debugging, I've noticed that PdfPageFormCopier excludes the default resources, which contain the fonts, when merging form fields. Is there a way around this? The project I'm working on currently does this process in iTextSharp and it works as intended. However, we are looking to migrate to iText 7.
Here are links to some sample PDFs I made. I can't upload the actual PDFs I'm working with, but these display the same problem.
https://www.dropbox.com/s/pukt91d4xe8gmmo/pdf-form-1.pdf?dl=0
https://www.dropbox.com/s/c52x6bc99gnrvo6/pdf-form-2.pdf?dl=0
So my solution was to modify the PdfPageFormCopier class from iText. The main issue is in the function below.
public virtual void Copy(PdfPage fromPage, PdfPage toPage) {
if (documentFrom != fromPage.GetDocument()) {
documentFrom = fromPage.GetDocument();
formFrom = PdfAcroForm.GetAcroForm(documentFrom, false);
}
if (documentTo != toPage.GetDocument()) {
documentTo = toPage.GetDocument();
formTo = PdfAcroForm.GetAcroForm(documentTo, true);
}
if (formFrom == null) {
return;
}
//duplicate AcroForm dictionary
IList<PdfName> excludedKeys = new List<PdfName>();
excludedKeys.Add(PdfName.Fields);
excludedKeys.Add(PdfName.DR);
PdfDictionary dict = formFrom.GetPdfObject().CopyTo(documentTo, excludedKeys, false);
formTo.GetPdfObject().MergeDifferent(dict);
IDictionary<String, PdfFormField> fieldsFrom = formFrom.GetFormFields();
if (fieldsFrom.Count <= 0) {
return;
}
IDictionary<String, PdfFormField> fieldsTo = formTo.GetFormFields();
IList<PdfAnnotation> annots = toPage.GetAnnotations();
foreach (PdfAnnotation annot in annots) {
if (!annot.GetSubtype().Equals(PdfName.Widget)) {
continue;
}
CopyField(toPage, fieldsFrom, fieldsTo, annot);
}
}
Specifically, this line:
excludedKeys.Add(PdfName.DR);
If you walk through the code in the CopyField() function, you will eventually end up in the PdfFormField class. You can see the constructor below.
public PdfFormField(PdfDictionary pdfObject)
: base(pdfObject) {
EnsureObjectIsAddedToDocument(pdfObject);
SetForbidRelease();
RetrieveStyles();
}
The function RetrieveStyles() will try to set the font for the field based on the default appearance. However, that will not work because of the function below.
private PdfFont ResolveFontName(String fontName) {
PdfDictionary defaultResources = (PdfDictionary)GetAcroFormObject(PdfName.DR, PdfObject.DICTIONARY);
PdfDictionary defaultFontDic = defaultResources != null ? defaultResources.GetAsDictionary(PdfName.Font) :
null;
if (fontName != null && defaultFontDic != null) {
PdfDictionary daFontDict = defaultFontDic.GetAsDictionary(new PdfName(fontName));
if (daFontDict != null) {
return GetDocument().GetFont(daFontDict);
}
}
return null;
}
As you can see, it looks for the font in the default resources (DR), which were explicitly excluded in the PdfPageFormCopier class, so it will never find the font.
So my solution was to create my own class that implements the IPdfPageExtraCopier interface. I copied the code from the PdfPageFormCopier class and removed the one line excluding the default resources. Then I use my own copier class in my code. Not the prettiest solution but it works.
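The effect of that one excluded key can be shown with a toy model in Java that uses plain maps in place of PDF dictionaries. This is not iText code: copyExcluding and resolveFontName are hypothetical stand-ins for the CopyTo call with excludedKeys and for ResolveFontName, and the font name Helv is only an example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model: nested maps stand in for the AcroForm dictionary.
final class DefaultResourcesDemo {
    private DefaultResourcesDemo() {}

    // Shallow copy of a dictionary that skips the excluded keys.
    static Map<String, Object> copyExcluding(Map<String, Object> source, Set<String> excludedKeys) {
        Map<String, Object> copy = new HashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            if (!excludedKeys.contains(entry.getKey())) {
                copy.put(entry.getKey(), entry.getValue());
            }
        }
        return copy;
    }

    // Follows AcroForm -> DR -> Font -> fontName; returns null when any step is missing.
    static Object resolveFontName(Map<String, Object> acroForm, String fontName) {
        Object defaultResources = acroForm.get("DR");
        if (!(defaultResources instanceof Map)) {
            return null;
        }
        Object fonts = ((Map<?, ?>) defaultResources).get("Font");
        if (!(fonts instanceof Map) || fontName == null) {
            return null;
        }
        return ((Map<?, ?>) fonts).get(fontName);
    }

    // Builds a sample source form with one font in its default resources.
    static Map<String, Object> sampleForm() {
        Map<String, Object> fonts = new HashMap<>();
        fonts.put("Helv", "Helvetica font dictionary");
        Map<String, Object> defaultResources = new HashMap<>();
        defaultResources.put("Font", fonts);
        Map<String, Object> form = new HashMap<>();
        form.put("DR", defaultResources);
        form.put("DA", "/Helv 12 Tf 0 g");
        return form;
    }
}
```

In this model, copying sampleForm() with "DR" in the excluded set makes resolveFontName return null for Helv, while copying with only "Fields" excluded lets the lookup succeed, which is the behavior change the custom copier relies on.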
I've got a partial view, and I'm trying to use iTextSharp to convert the HTML to PDF. How can I convert the HTML to a string so I can use iTextSharp's HtmlParser?
I've tried something like this with no luck. Any ideas?
var contents = System.IO.File.ReadAllText(Url.Action("myPartial", "myController", new { id = 1 }, "http"));
I have created a special ViewResult class that you can return as the result of an Action.
You can see the code on bitbucket (look at the PdfFromHtmlResult class).
So what it basically does is:
1. Render the view through the Razor engine (or any other registered engine) to HTML.
2. Give the HTML to iTextSharp.
3. Return the PDF as the ViewResult (with the correct MIME type, etc.).
My ViewResult class looks like:
public class PdfFromHtmlResult : ViewResult {
public override void ExecuteResult(ControllerContext context) {
if (context == null) {
throw new ArgumentNullException("context");
}
if (string.IsNullOrEmpty(this.ViewName)) {
this.ViewName = context.RouteData.GetRequiredString("action");
}
if (this.View == null) {
this.View = this.FindView(context).View;
}
// First get the html from the Html view
using (var writer = new StringWriter()) {
var vwContext = new ViewContext(context, this.View, this.ViewData, this.TempData, writer);
this.View.Render(vwContext, writer);
// Convert to pdf
var response = context.HttpContext.Response;
using (var pdfStream = new MemoryStream()) {
var pdfDoc = new Document();
var pdfWriter = PdfWriter.GetInstance(pdfDoc, pdfStream);
pdfDoc.Open();
using (var htmlRdr = new StringReader(writer.ToString())) {
var parsed = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(htmlRdr, null);
foreach (var parsedElement in parsed) {
pdfDoc.Add(parsedElement);
}
}
pdfDoc.Close();
response.ContentType = "application/pdf";
response.AddHeader("Content-Disposition", "attachment; filename=" + this.ViewName + ".pdf");
byte[] pdfBytes = pdfStream.ToArray();
response.OutputStream.Write(pdfBytes, 0, pdfBytes.Length);
}
}
}
}
With the correct extension methods (see BitBucket), etc, the code in my controller is something like:
public ActionResult MyPdf(int id) {
var myModel = findDataWithID(id);
// this assumes there is a MyPdf.cshtml/MyPdf.aspx as the view
return this.PdfFromHtml(myModel);
}
Note: Your method does not work because the HTML is retrieved on the server, so you lose all cookies (= session information) that are stored on the client.
I have created some files in isolated storage.
In the "car" files, I would like to store some other values such as model, color, etc.
So my question is: is it possible to have multi-line files in isolated storage?
If yes, how can I read them back with the StreamReader?
// I want to store many parameters in a file and find them again with the StreamReader
protected override void OnNavigatedTo(System.Windows.Navigation.NavigationEventArgs e)
{
//Receive the parameters from the listbox
base.OnNavigatedTo(e);
string parameter = this.NavigationContext.QueryString["parameter"];
this.tbTitre.Text = parameter;
try
{
//Create a new StreamReader
StreamReader editionDevisReader = null;
IsolatedStorageFile probyOrange = IsolatedStorageFile.GetUserStoreForApplication();
//Read the file from the specified location.
editionDevisReader = new StreamReader(new IsolatedStorageFileStream("devis\\"+parameter+".txt", FileMode.Open, probyOrange));
//Read the contents of the file .
string textFile = editionDevisReader.ReadLine();
//Write the contents of the file to the TextBlock on the page.
tbTitre.Text = textFile;
//Read the remaining lines; ReadLine() returns null at the end of the file.
while ((textFile = editionDevisReader.ReadLine()) != null)
{
RowDefinition rowdefinition = new RowDefinition();
TextBlock textblock = new TextBlock();
textblock.Text = textFile;
}
editionDevisReader.Close();
}
catch
{
//If the file hasn't been created yet.
tbTitre.Text = "Please create the file first";
}
}
Thanks a lot, all.
Yes, you can save anything (up to a point) in a file:
using (var store = IsolatedStorageFile.GetUserStoreForApplication())
{
using (var isfs = new IsolatedStorageFileStream("myfile.txt", FileMode.OpenOrCreate, store))
{
using (var sw = new StreamWriter(isfs))
{
sw.Write("anything really. Here it's just a string but could be a serialized object, etc.");
sw.Close();
}
}
}
You can then read the file with:
var result = string.Empty;
try
{
using (var store = IsolatedStorageFile.GetUserStoreForApplication())
{
if (!store.FileExists("myfile.txt"))
{
return result;
}
using (var isfs = new IsolatedStorageFileStream("myfile.txt", FileMode.Open, store))
{
using (var sr = new StreamReader(isfs))
{
string lineOfData;
while ((lineOfData = sr.ReadLine()) != null)
{
result += lineOfData + Environment.NewLine; // keep the line breaks between entries
}
}
}
}
}
catch (IsolatedStorageException)
{
result = string.Empty; // may have partial data/file before error
}
return result;
You can use
StreamWriter.WriteLine
and
StreamReader.ReadLine
to write and read blocks of text separated by line feeds.
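A minimal sketch of that line-per-value approach, written in Java with java.nio.file rather than isolated storage, so the file location and the class name LinePerParameter are assumptions for illustration. Each parameter (model, color, and so on) goes on its own line, and the reader loops until readLine() returns null.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: stores one parameter per line in a text file.
final class LinePerParameter {
    private LinePerParameter() {}

    // Writes each parameter on its own line.
    static void writeLines(Path file, List<String> parameters) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String parameter : parameters) {
                writer.write(parameter);
                writer.newLine();
            }
        }
    }

    // Reads the file back, one parameter per line, until readLine() returns null.
    static List<String> readLines(Path file) throws IOException {
        List<String> parameters = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                parameters.add(line);
            }
        }
        return parameters;
    }
}
```

This only works as long as no single value contains a line break itself; values that can span lines would need a different format, such as a serialized object.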