I am trying to create images from a PDF using iTextSharp. The code produces one file per page of the PDF, but the generated files cannot be previewed in any image viewer; each one is reported as corrupted. Below is the code I have written.
try
{
    PdfReader reader = null;
    int currentPage = 1;
    int pageCount = 0;
    string destinationFolderPath = string.Format(@"{0}PageImages\{1}", BaseDataPath, Convert.ToString(documentId));
    if (!Directory.Exists(destinationFolderPath))
    {
        Directory.CreateDirectory(destinationFolderPath);
    }
    reader = new PdfReader(filePath);
    reader.RemoveUnusedObjects();
    pageCount = reader.NumberOfPages;
    string ext = ".png";
    for (int i = 1; i <= pageCount; i++)
    {
        PdfReader reader1 = new PdfReader(filePath);
        string destinationFilePath = string.Format(@"{0}/{1}{2}", destinationFolderPath, Convert.ToString(i), ext);
        reader1.RemoveUnusedObjects();
        Document doc = new Document(reader1.GetPageSizeWithRotation(currentPage));
        PdfCopy pdfCpy = new PdfCopy(doc, new FileStream(destinationFilePath, FileMode.Create));
        doc.Open();
        for (int j = 1; j <= 1; j++)
        {
            PdfImportedPage page = pdfCpy.GetImportedPage(reader1, currentPage);
            //pdfCpy.SetFullCompression();
            pdfCpy.AddPage(page);
            currentPage += 1;
        }
        doc.Close();
        pdfCpy.Close();
        reader1.Close();
        reader.Close();
    }
}
catch (Exception ex)
{
    throw ex;
}
Could someone please suggest what is wrong here?
Thanks
You are creating a PDF file using PdfCopy, but you are storing that PDF as if you were creating a PNG file:
string ext = ".png";
string destinationFilePath =
    string.Format(@"{0}/{1}{2}",
        destinationFolderPath, Convert.ToString(i), ext);
PdfCopy pdfCpy = new PdfCopy(doc,
    new FileStream(destinationFilePath, FileMode.Create));
You can't open a .png file in a PDF viewer. Your operating system will try to open the file you're creating as if it were an image, but the bytes of that "image" will be PDF bytes and your image viewer won't recognize it.
Change this line:
string ext = ".png";
To this:
string ext = ".pdf";
And you'll be able to open your file in a PDF viewer.
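To check what a generated file really contains, regardless of its extension, one option is to look at its first few bytes. The following is a minimal sketch, not part of iTextSharp: the class name FileSignature and its methods are hypothetical helpers that use only System.IO. A PDF starts with the ASCII characters "%PDF", while a PNG starts with a fixed 8-byte signature.

```csharp
using System;
using System.IO;

// Hypothetical helper (not an iTextSharp API): identifies a file by its leading bytes.
public static class FileSignature
{
    private static readonly byte[] PdfMagic = { 0x25, 0x50, 0x44, 0x46 }; // "%PDF"
    private static readonly byte[] PngMagic = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    // Returns "pdf", "png", or "unknown" based only on the first bytes.
    public static string Detect(byte[] header)
    {
        if (StartsWith(header, PdfMagic)) return "pdf";
        if (StartsWith(header, PngMagic)) return "png";
        return "unknown";
    }

    // Reads up to 8 bytes from the start of the file and classifies them.
    public static string DetectFile(string path)
    {
        byte[] header = new byte[8];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            int n;
            while (total < header.Length &&
                   (n = fs.Read(header, total, header.Length - total)) > 0)
            {
                total += n;
            }
        }
        Array.Resize(ref header, total);
        return Detect(header);
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Calling FileSignature.DetectFile on one of the files written by the question's code would be expected to report "pdf" even though the file name ends in .png, which is why image viewers reject it.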
By the way: your code is awkward. For instance, I don't understand why you'd create a loop to execute something only once:
for (int j = 1; j <= 1; j++)
Also: if it's your intention to convert PDF pages to PNG, reconsider. iTextSharp doesn't convert PDF to images.
Related
I have a requirement to generate a PDF from HTML that contains a table with almost 2000 rows. When I try to generate the PDF, it renders only about 250 rows, skips the remaining rows and HTML content, and produces just 6 pages. What could be causing this?
string HtmlStream = "Some large content", FileName = "abcd.pdf";
object TargetFile = FileName;
string ModifiedFileName = string.Empty;
string FinalFileName = string.Empty;
ModifiedFileName = TargetFile.ToString();
ModifiedFileName = ModifiedFileName.Insert(ModifiedFileName.Length - 4, "1");
SelectPdf.HtmlToPdf converter = new SelectPdf.HtmlToPdf();
// set converter options
string pdf_page_size = "A4";
SelectPdf.PdfPageSize pageSize = (SelectPdf.PdfPageSize)Enum.Parse(typeof(SelectPdf.PdfPageSize),
    pdf_page_size, true);
converter.Options.PdfPageSize = pageSize;
converter.Options.PdfPageOrientation = SelectPdf.PdfPageOrientation.Portrait;
converter.Options.MarginLeft = 10;
converter.Options.MarginRight = 10;
converter.Options.MarginTop = 10;
converter.Options.MarginBottom = 10;
converter.Options.MaxPageLoadTime = 1000;
// create a new PDF document by converting the HTML string
SelectPdf.PdfDocument doc = converter.ConvertHtmlString(HtmlStream);
doc.Save(ModifiedFileName.ToString());
I am trying to extract images from a PDF using the following code. It works well for some filters, such as DCTDecode, but not for JPXDecode. A "Parameter is not valid" error occurs at the point where image.GetDrawingImage() is called.
using System.Drawing.Imaging;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
...
PdfReader pdf = new PdfReader(currfilename);
PdfReaderContentParser parser = new PdfReaderContentParser(pdf);
ImageRender listener = new ImageRender();
for (int i = 1; i <= pdf.NumberOfPages; i++)
{
    try
    {
        parser.ProcessContent(i, listener); // calls RenderImage() at this point
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
    }
}
public void RenderImage(ImageRenderInfo renderInfo)
{
    PdfImageObject image = renderInfo.GetImage();
    PdfName filter = image.Get(PdfName.FILTER) as PdfName;
    if (renderInfo.GetRef() != null && image != null)
    {
        using (System.Drawing.Image dotnetImg = image.GetDrawingImage()) // exception occurs at this point
        {
            if (dotnetImg != null)
            {
                ImageNames.Add(string.Format("{0}.tiff", renderInfo.GetRef().Number));
                using (MemoryStream ms = new MemoryStream())
                {
                    dotnetImg.Save(ms, ImageFormat.Tiff);
                    Images.Add(ms.ToArray());
                }
            }
        }
    }
}
I tried these links for a solution:
Extract images using iTextSharp
Extract Image from a particular page in PDF
I was able to extract the raw image bytes using the PdfReader.GetStreamBytesRaw() function, but a "Parameter is not valid" exception always occurs at the point where System.Drawing.Image.FromStream() is called with the memory stream.
I also checked the question "Parameter is not valid" exception from System.Drawing.Image.FromStream() method, but could not find anything helpful.
Please help.
The JPXDecode filter corresponds to JPEG 2000 compression, which the .NET Framework's System.Drawing classes cannot decode. This other question on SO may help: JPEG 2000 support in C#.NET
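Before handing image bytes to System.Drawing, it can help to check whether they are JPEG 2000 data at all. The sketch below is only an illustration: Jpeg2000Sniffer is a hypothetical helper, not an iTextSharp or .NET API. It looks for the 12-byte JP2 signature box or the start of a raw JPEG 2000 codestream, and separately for the ordinary JPEG start marker used by DCTDecode data.

```csharp
using System;

// Hypothetical helper: recognises JPEG 2000 data by its leading bytes.
public static class Jpeg2000Sniffer
{
    // JP2 file format: signature box at offset 0.
    private static readonly byte[] Jp2Signature =
        { 0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20, 0x0D, 0x0A, 0x87, 0x0A };

    // Raw codestream: SOC marker (FF 4F) followed by SIZ marker (FF 51).
    private static readonly byte[] CodestreamSignature = { 0xFF, 0x4F, 0xFF, 0x51 };

    // Ordinary JPEG (DCTDecode) data starts with the SOI marker and another marker byte.
    private static readonly byte[] JpegSignature = { 0xFF, 0xD8, 0xFF };

    public static bool IsJpeg2000(byte[] data)
    {
        return StartsWith(data, Jp2Signature) || StartsWith(data, CodestreamSignature);
    }

    public static bool IsJpeg(byte[] data)
    {
        return StartsWith(data, JpegSignature);
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A listener could call Jpeg2000Sniffer.IsJpeg2000 on the extracted bytes and route those images to a decoder that supports JPEG 2000, while still using System.Drawing for the formats it does handle.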
Using FreeImage.dll solved the problem. The code is as follows:
using FreeImageAPI;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
...
imagecount = 0;
PdfReader pdf = new PdfReader(currfilename);
PdfReaderContentParser parser = new PdfReaderContentParser(pdf);
ImageRender listener = new ImageRender();
for (int i = 1; i <= pdf.NumberOfPages; i++)
{
    try
    {
        parser.ProcessContent(i, listener); // calls RenderImage() at this point
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
    }
}
if (listener.Images.Count > 0)
{
    for (int j = 0; j < listener.Images.Count; ++j)
    {
        string imgpath = Environment.CurrentDirectory + "\\Image" + imagecount + ".bmp";
        // create a memory stream over the raw image bytes
        MemoryStream imageStream = new MemoryStream(listener.Images[j]);
        // create a FIBITMAP from that stream
        FIBITMAP dib = FreeImage.LoadFromStream(imageStream);
        if (dib.IsNull) continue;
        // turn it into a normal Bitmap
        Bitmap bitmap = FreeImage.GetBitmap(dib);
        bitmap.Save(imgpath);
        // unload the FIBITMAP
        FreeImage.UnloadEx(ref dib);
        bitmap.Dispose();
        System.Drawing.Image img = System.Drawing.Image.FromFile(imgpath);
        imagecount++; // advance the counter so the next image gets its own file name
    }
}
// Callback in the ImageRender listener class
public void RenderImage(ImageRenderInfo renderInfo)
{
    PdfImageObject image = renderInfo.GetImage();
    if (renderInfo.GetRef() != null && image != null)
    {
        byte[] tempImage = image.GetImageAsBytes();
        ImageNames.Add(string.Format("{0}.bmp", renderInfo.GetRef().Number));
        Images.Add(tempImage);
    }
}
I followed the instructions given here to add FreeImage .NET to the solution.
I am trying to save a stream of image data to a file. I was able to save it to the Pictures library, but I want to save it to a file in the root of my application/project instead.
I tried the code below, but it doesn't work.
using (MediaLibrary mediaLibrary = new MediaLibrary())
    mediaLibrary.SavePicture(@"\DefaultScreen.jpg", stream);
In this case you should use the application's local (isolated) storage.
Here is a simple solution to do this:
using (IsolatedStorageFile isoStore = IsolatedStorageFile.GetUserStoreForApplication())
{
    if (!isoStore.FileExists(fileName))
    {
        var sr = Application.GetResourceStream(new Uri(fileName, UriKind.Relative));
        using (var br = new BinaryReader(sr.Stream))
        {
            byte[] data = br.ReadBytes((int)sr.Stream.Length);
            string strBaseDir = string.Empty;
            const string DelimStr = "/";
            char[] delimiter = DelimStr.ToCharArray();
            string[] dirsPath = fileName.Split(delimiter);
            // Recreate the directory structure
            for (int i = 0; i < dirsPath.Length - 1; i++)
            {
                strBaseDir = Path.Combine(strBaseDir, dirsPath[i]);
                isoStore.CreateDirectory(strBaseDir);
            }
            using (BinaryWriter bw = new BinaryWriter(isoStore.CreateFile(fileName)))
            {
                bw.Write(data);
            }
        }
    }
}
Here you can find all info about data in Windows Phone:
http://msdn.microsoft.com/en-us/library/windowsphone/develop/ff402541(v=vs.105).aspx
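The snippet above copies a packaged resource into isolated storage. For the case in the question, where the image data is already available as a stream, a buffered copy into the target file stream is enough. This is a minimal sketch: StreamSaver and CopyStream are hypothetical names, and the helper deliberately uses only Stream.Read and Stream.Write so that it does not depend on newer framework methods.

```csharp
using System;
using System.IO;

// Hypothetical helper: copies everything from one stream into another.
public static class StreamSaver
{
    // Copies from the current position of source to target; returns the byte count.
    public static long CopyStream(Stream source, Stream target)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (target == null) throw new ArgumentNullException("target");

        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            target.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

With the isolated storage API shown above, the target would be the stream returned by isoStore.CreateFile(fileName), opened in a using block so that the file is closed once the copy finishes.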
In my Windows Phone 7.1 app I am loading an HTML file from a local path into a WebBrowser control. For this I converted a PNG image to Base64 using the code below, but the resulting Base64 data URI does not load the image in the WebBrowser.
Please help me find where I made a mistake.
string s = "data:image/jpg;base64,";
imgStream = Assembly.GetExecutingAssembly().GetManifestResourceStream("NewUIChanges.Htmlfile.round1.png");
byte[] data = new byte[(int)imgStream.Length];
int offset = 0;
while (offset < data.Length)
{
    int bytesRead = imgStream.Read(data, offset, data.Length - offset);
    if (bytesRead <= 0)
    {
        throw new EndOfStreamException("Stream wasn't as long as it claimed");
    }
    offset += bytesRead;
}
base64 = Convert.ToBase64String(data);
Stream htmlStream = Assembly.GetExecutingAssembly().GetManifestResourceStream("NewUIChanges.Htmlfile.equity_built.html");
StreamReader reader = new StreamReader(htmlStream);
string htmlcontent = reader.ReadToEnd();
htmlcontent = htmlcontent.Replace("round1.png", s + base64);
wb.NavigateToString(htmlcontent);
If you get no error, data really contains your image, and round1.png exists in htmlcontent, then it is probably just an image type mismatch: the file is a PNG, but the data URI declares it as JPEG. Try this:
string s = "data:image/png;base64,";
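To avoid hard-coding a media type that may not match the file, the data URI prefix can be derived from the image bytes themselves. The following is a sketch under stated assumptions: DataUri, MimeTypeFor, and Build are hypothetical names, and only PNG, JPEG, and GIF signatures are recognised. Note that the registered media type for JPEG is image/jpeg, not image/jpg.

```csharp
using System;

// Hypothetical helper: builds a data URI whose media type matches the image bytes.
public static class DataUri
{
    // Guesses the media type from the leading bytes; falls back to a generic type.
    public static string MimeTypeFor(byte[] data)
    {
        if (StartsWith(data, new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }))
            return "image/png";
        if (StartsWith(data, new byte[] { 0xFF, 0xD8, 0xFF }))
            return "image/jpeg";
        if (StartsWith(data, new byte[] { 0x47, 0x49, 0x46, 0x38 })) // "GIF8"
            return "image/gif";
        return "application/octet-stream";
    }

    // Produces "data:<media type>;base64,<payload>".
    public static string Build(byte[] data)
    {
        if (data == null) throw new ArgumentNullException("data");
        return "data:" + MimeTypeFor(data) + ";base64," + Convert.ToBase64String(data);
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

In the question's code, the replacement would then become htmlcontent.Replace("round1.png", DataUri.Build(data)), with no separate prefix string to keep in sync with the file type.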
I'm using WebSupergoo's ABCpdf to convert HTML pages to PDF via AddImageUrl.
It works great, but the resulting PDF does not allow the user to select and copy text. Everything is one 'image'.
Is it possible to make the text selectable? Which settings should I use?
This is my current code. The commented-out "Flatten" call does not seem to do anything relevant. HttpStream simply forwards the PDF to the user as a document.
var doc = new Doc();
doc.HtmlOptions.UseScript = true;
doc.Units = "mm";
doc.MediaBox.String = "0 0 210 297";
doc.Rect.String = doc.MediaBox.String;
doc.Rect.Inset(10.0, 10.0);
doc.SetInfo(0, "License", abcpdfkey);
doc.HtmlOptions.UseScript = true;
doc.HtmlOptions.AddMovies = true;
doc.HtmlOptions.RetryCount = 0;
doc.HtmlOptions.ContentCount = 1;
doc.Page = doc.AddPage();
for (int i = doc.AddImageUrl(url); doc.Chainable(i); i = doc.AddImageToChain(i))
{
    doc.Page = doc.AddPage();
}
int pageCount = doc.PageCount;
for (int j = 1; j <= pageCount; j++)
{
    doc.PageNumber = j;
    // doc.Flatten();
}
this.HttpStream(doc.GetData(), filename);
Before sending the PDF to the HTTP stream, you can set the encryption properties.
The CanCopy property controls whether the user can copy text from the PDF.
To set it, add the following code:
doc.Encryption.CanCopy = true;
You may need to set doc.Encryption.CanExtract as well.