I have about 2000 TIFF images at 8 x 11 inches; they range in size from 10 KB to 1 MB. With the code below I loop through a directory and insert each TIFF file onto a new page of a new PDF file that I create. The process takes about 10 minutes. I'm running this on a Windows 8 server with 12 GB of RAM and two 2.13 GHz processors (nothing else running). I'd like to see if I can get the time down.
I'm not sure this is the most efficient way of doing it. It works, just a little slowly. It might well be the fastest way, but I was wondering if anyone has a better way to do the above process.
// OutputFile is the destination PDF path and TIFFiles is the source directory; both are set elsewhere.
using (FileStream output = new FileStream(OutputFile, FileMode.Create, FileAccess.Write, FileShare.None))
{
    using (Document doc = new Document(PageSize.LETTER))
    {
        using (PdfWriter writer = PdfWriter.GetInstance(doc, output))
        {
            doc.Open();
            foreach (string file in Directory.GetFiles(TIFFiles, "*.tif", SearchOption.TopDirectoryOnly))
            {
                Console.Write("\rMerging : {0}....", Path.GetFileName(file));
                // Load the TIFF and place it at the bottom-left corner of the current page.
                iTextSharp.text.Image tiff = iTextSharp.text.Image.GetInstance(file);
                tiff.SetAbsolutePosition(0, 0);
                writer.DirectContent.AddImage(tiff);
                doc.NewPage();
            }
            Console.WriteLine("");
            Console.WriteLine("End Time: {0}", DateTime.Now.ToString("hh:mm:ss"));
            doc.Close();
            Console.ReadLine();
        }
    }
}
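If it would help to see where those ten minutes actually go, one option (a sketch, not part of the original code; it reuses writer, doc, and TIFFiles from the snippet above, and loadTime/writeTime are illustrative names) is to time image loading separately from PDF writing. Note that some of the write cost only shows up when doc.Close() runs.
// Drop-in replacement for the foreach loop above, inside the same using blocks.
// Stopwatch lives in System.Diagnostics.
Stopwatch loadTime = new Stopwatch();
Stopwatch writeTime = new Stopwatch();
foreach (string file in Directory.GetFiles(TIFFiles, "*.tif", SearchOption.TopDirectoryOnly))
{
    loadTime.Start();
    iTextSharp.text.Image tiff = iTextSharp.text.Image.GetInstance(file);
    tiff.SetAbsolutePosition(0, 0);
    loadTime.Stop();

    writeTime.Start();
    writer.DirectContent.AddImage(tiff);
    doc.NewPage();
    writeTime.Stop();
}
Console.WriteLine("Loading images: {0}  Writing pages: {1}", loadTime.Elapsed, writeTime.Elapsed);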
I'm writing a PowerPoint add-in (VSTO) which, very roughly, does the following:
1. Loop through all slides
2. Loop through all tags on each slide and write the tag information to a List object
3. Do some calculations on the List object and update some of its columns
4. Loop through the List object and write the tags back to the corresponding slides/tags
I noticed that this code gets slower with every run - the 10th run is 3-4 times slower than the first one.
I also see the memory usage go up with every run.
What tools can I use to find out where the bottleneck is?
How can I find the problem that makes my code slower on every run?
Is there a way to clear the memory etc. that the add-in uses after each run?
I'm sorry to ask such a general question, but please let me know if you need more details.
My presentation has 32 slides, and for each slide I want to write tags (in this example, 49 tags). Because of updates etc. the code is executed several times by the user. I simulate this by running it 11 times automatically, and I see the same behavior as with user interaction: writing the tags for the 32 slides gets slower with every execution.
I reduced my code to the minimum that still shows this behavior. I also tried deleting the tags on every slide first, but without success.
private void btn_DoIt10Times_Click(object sender, EventArgs e)
{
    // Time the first run.
    Stopwatch watch = new Stopwatch();
    watch.Start();
    SlideTags_Write();
    watch.Stop();
    //MessageBox.Show("Time spend: " + watch.Elapsed);

    // Intermediate runs.
    for (int i = 1; i <= 10; i++)
    {
        SlideTags_Write();
    }

    // Time the final run and compare it to the first.
    Stopwatch watch2 = new Stopwatch();
    watch2.Start();
    SlideTags_Write();
    watch2.Stop();
    MessageBox.Show("Time 1st run: " + watch.Elapsed + "\n Time 11th run: " + watch2.Elapsed);
}
public void SlideTags_Write()
{
    PowerPoint.Presentation oPresentation = Globals.ThisAddIn.Application.ActivePresentation;
    foreach (PowerPoint.Slide oSlide in oPresentation.Slides)
    {
        //for (int iTag = 1; iTag <= oSlide.Tags.Count; iTag++)
        //{
        //    oSlide.Tags.Delete(oSlide.Tags.Name(iTag));
        //}
        for (int iTag = 1; iTag < 50; iTag++)
        {
            oSlide.Tags.Add("Tag_" + iTag.ToString(), "Tag Value " + iTag.ToString());
        }
    }
}
I am building an app for Windows Phone 7 where I take a screenshot every second and save all the screenshots to the media library with file names 1.jpg, 2.jpg, 3.jpg, and so on. When I read the images back from the library, I get them in random order, like (1.jpg, 2.jpg, 3.jpg, 7.jpg, 13.jpg, 4.jpg, 15.jpg, 5.jpg), not in sequence.
How can I get all the images in sequence? Here is my code:
using (MediaLibrary mediaLibrary = new MediaLibrary())
{
    PictureCollection AllScreenShot = mediaLibrary.Pictures;
    foreach (Picture picture in AllScreenShot)
    {
        using (IsolatedStorageFile storage = IsolatedStorageFile.GetUserStoreForApplication())
        {
            if (!storage.DirectoryExists("SavedImg"))
                storage.CreateDirectory("SavedImg");
            if (storage.FileExists("SavedImg" + "\\" + picture.Name))
                storage.DeleteFile("SavedImg" + "\\" + picture.Name);
            using (IsolatedStorageFileStream file = storage.CreateFile("SavedImg" + "\\" + picture.Name))
                picture.GetImage().CopyTo(file);
        }
    }
}
Put the pictures into a list and sort that list before you copy them. Something like this:
List<Picture> pictures = mediaLibrary.Pictures.ToList();
List<Picture> ordered = pictures.OrderBy(p => p.Name).ToList();
The key you pass to OrderBy is what decides the order. Note that ordering by the raw name is a string sort, so "10.jpg" comes before "2.jpg"; since your file names are plain numbers, sort on the numeric part of the name instead, as in the sketch below.
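A minimal sketch of that numeric ordering, plugged into the loop from the question (it assumes every picture name has the form "<number>.jpg"; anything else would need a more defensive parse):
// Requires: using System.IO; using System.Linq; using Microsoft.Xna.Framework.Media;
using (MediaLibrary mediaLibrary = new MediaLibrary())
{
    // Sort "1.jpg", "2.jpg", ..., "10.jpg" by numeric value rather than alphabetically.
    List<Picture> orderedScreenshots = mediaLibrary.Pictures
        .OrderBy(p => int.Parse(Path.GetFileNameWithoutExtension(p.Name)))
        .ToList();

    foreach (Picture picture in orderedScreenshots)
    {
        // Copy each picture to isolated storage exactly as in the original loop,
        // now in capture order: 1.jpg, 2.jpg, 3.jpg, ...
    }
}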
I have merged many small files into one big file. On the app's first start this file is read, and the small files are created one by one on the file system (isolated storage).
When this file contains 44 small files and is ~200 KB, the algorithm takes about 120 ms on the device.
When it contains 140 even smaller files and is ~400 KB, the algorithm takes about 3000 ms on the device.
If I take only 44 files from each, the first one still takes ~120 ms, while the second takes ~800 ms.
This baffles me.
The format of the data in the file is simple:
INT32   ENTRIES COUNT
then, repeated ENTRIES COUNT times:
  STRING  ENTRY NAME
  INT32   ENTRY DATA LENGTH
  BYTE[]  ENTRY DATA
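The code that produces this container isn't shown; a minimal sketch of a packer that writes this exact layout with BinaryWriter might look like the following (the method name and paths are illustrative, not from the repro project):
// Hypothetical packer: writes the entry count, then name/length/data per entry,
// matching the BinaryReader-based loader below.
static void PackFiles(string[] inputFiles, string outputPath)
{
    using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
    using (var writer = new BinaryWriter(output))
    {
        writer.Write(inputFiles.Length);              // INT32  entries count
        foreach (string path in inputFiles)
        {
            byte[] data = File.ReadAllBytes(path);
            writer.Write(Path.GetFileName(path));     // STRING entry name (length-prefixed)
            writer.Write(data.Length);                // INT32  entry data length
            writer.Write(data);                       // BYTE[] entry data
        }
    }
}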
To me this looks like magic in the Windows Phone IsolatedStorage mechanisms.
There is no reason for the second file to be 7-8 times slower when copying an equal number of entries.
Repro project - https://www.dropbox.com/s/6bjsve7p8wew3kb/IsoStorageWonder.zip?m
Code:
public static void CopyCache(ILogger logger)
{
    using (var isoStorage = IsolatedStorageFile.GetUserStoreForApplication())
    {
        // _dataFilePath and _rootCacheDir are fields defined elsewhere in the repro project.
        var streamInfo = Application.GetResourceStream(new Uri(_dataFilePath, UriKind.RelativeOrAbsolute));
        isoStorage.CreateDirectory("HttpCache");
        using (var binaryReader = new BinaryReader(streamInfo.Stream))
        {
            int itemsCount = binaryReader.ReadInt32();
            // ENTRIES_COUNT is a constant defined elsewhere; the repro copies a fixed
            // number of entries (at most itemsCount).
            for (int i = 0; i < ENTRIES_COUNT; i++)
            {
                string fileName = binaryReader.ReadString();
                int length = binaryReader.ReadInt32();
                byte[] data = binaryReader.ReadBytes(length);

                using (var fileStream = new IsolatedStorageFileStream(
                    Path.Combine(_rootCacheDir, fileName),
                    FileMode.Create,
                    FileAccess.Write,
                    FileShare.None,
                    isoStorage))
                {
                    fileStream.Write(data, 0, data.Length);
                }
            }
        }
    }
}
MAGIC!
I had a similar problem with WebClient performance. In the emulator a request takes 0.3-0.5 seconds; on the device, 8-22 seconds. I was very confused. But in my case the solution was very simple: DO NOT TEST PERFORMANCE ON THE DEVICE IN DEBUG MODE. What I do:
Deploy the project to your device.
Stop debugging.
Close your app on the phone (better yet, reboot the device).
Then everything works like a charm.
In your test app IsoStorageWonder:
Emulator: 551 ms
Emulator (256 MB): 564 ms
HTC Radar WP7.8, debug mode: 1835 ms
HTC Radar WP7.8, not debug mode: 958 ms
I hope my research helps you. Regards
Update:
Test with output2:
Emulator: 440 ms
Emulator (256 MB): 447 ms
HTC Radar WP7.8, debug mode: 287 ms // very nice
HTC Radar WP7.8, not debug mode: 144 ms // also nice
I've got a Windows 8 program that uses a file picker and uploads the selected image to a server.
The server provides an API which needs the image converted to a base64 string, and the image must be less than 7 MB.
I'm using the code below:
FileOpenPicker openPicker = new FileOpenPicker();
openPicker.ViewMode = PickerViewMode.Thumbnail;
openPicker.SuggestedStartLocation = PickerLocationId.PicturesLibrary;
openPicker.FileTypeFilter.Add(".jpg");
openPicker.FileTypeFilter.Add(".jpeg");
openPicker.FileTypeFilter.Add(".png");
StorageFile file = await openPicker.PickSingleFileAsync();
if (file != null)
{
    // Application now has read/write access to the picked file
    bitmap = new BitmapImage();
    byte[] buf;
    using (var stream = await file.OpenStreamForReadAsync())
    {
        buf = ReadToEnd(stream); // ReadToEnd is a helper defined elsewhere that reads the whole stream into a byte[]
    }
    using (var stream = await file.OpenAsync(FileAccessMode.Read))
    {
        base64String = Convert.ToBase64String(buf);
        bitmap.SetSource(stream);
    }
}
And the bitmap goes to the server.
But there is a problem: the bitmap's size is much bigger than the JPG's size, for example, and none of the small JPGs make it to the server because their bitmap version is larger than 7 MB.
Can I convert an image to a base64 string without converting it to a bitmap?
In this code you read the image (already encoded as JPEG) and convert it to a base64 string.
You cannot reduce the size of the base64 output without reducing the size of the image.
To do so, you can use a BitmapDecoder/BitmapEncoder and resize the image to a smaller size.
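A rough sketch of that resize using the WinRT imaging APIs (untested; the 50% scale factor is just an example, and in practice you would adjust it or loop until the base64 payload drops under 7 MB):
// Needs: using Windows.Graphics.Imaging; using Windows.Storage.Streams;
// and using System.Runtime.InteropServices.WindowsRuntime; for AsBuffer().
using (IRandomAccessStream input = await file.OpenAsync(FileAccessMode.Read))
using (var resized = new InMemoryRandomAccessStream())
{
    BitmapDecoder decoder = await BitmapDecoder.CreateAsync(input);

    // Re-encode the picked image at half its original dimensions.
    BitmapEncoder encoder = await BitmapEncoder.CreateForTranscodingAsync(resized, decoder);
    encoder.BitmapTransform.ScaledWidth = decoder.PixelWidth / 2;
    encoder.BitmapTransform.ScaledHeight = decoder.PixelHeight / 2;
    await encoder.FlushAsync();

    // Read the resized image back and base64-encode it for the upload.
    resized.Seek(0);
    byte[] resizedBytes = new byte[resized.Size];
    await resized.ReadAsync(resizedBytes.AsBuffer(), (uint)resized.Size, InputStreamOptions.None);
    base64String = Convert.ToBase64String(resizedBytes);
}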
Regards
So I'm taking 10,000 two-page PDF files and merging them into one with iTextSharp.
This is some loose code of what I'm doing:
Document document = new Document();
using (PdfWriter writer = PdfWriter.GetInstance(document, new FileStream("merged.pdf", FileMode.Create)))
{
    PdfContentByte cb = writer.DirectContent;
    PdfReader reader = null;
    foreach (string thisFile in files)
    {
        reader = new PdfReader(thisFile);
        var page1 = writer.GetImportedPage(reader, 1);
        var page2 = writer.GetImportedPage(reader, 2);
        cb.AddTemplate(page1, 1f, 0, 0, 1f, 0, 0);
        cb.AddTemplate(page2, 1f, 0, 0, 1f, 0, 0);
    }
}
I'm trying to understand where the bottlenecks are; my performance tests point to two places: reading each file in with PdfReader, and the dispose that saves the output file, which is called from the using PdfWriter block.
I'm getting about 25% utilization across all 16 cores for this process. I tried a solid-state drive instead of my 7200 rpm SATA drive and it's almost exactly the same speed.
How can I speed this process up? Distributing the task isn't an option because the read speed between computers would be even slower. Even if it means changing to another language or library, or writing this at a lower level, I need to get this process done much faster than I currently am. Right now the merge takes about 10 minutes.
So I finally solved this. Here are my performance results, with the code of the winning approach below.
I used the same machine for all three of these tests.
iTextSharp - content builder directly on a pdfwriter
Windows 2008 64 bit
NTFS partition
merges about 30 pages per second during processing
significant overhead at the end when closing out the pdfwriter
25 pages per second overall
iTextSharp - PDFCopy
Windows 2008 64 bit
NTFS partition
writes the output to disk instead of memory so no overhead at the end
40 pages per second
iText (java) - PDFCopy (exact same code, just ported to java)
Ubuntu 12.04 64 bit server edition
EXT3 partition (going to try ext4 soon)
also writes the output to disk during processing
250 pages per second
I haven't tried to figure out why the same code runs faster in Java on Ubuntu, but I'll take it. In general, I defined all the major variables outside of this function, since it gets called 36,000 times during this process.
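For context, the class-level members that addPage relies on aren't shown; under the PdfCopy approach described above, they would presumably be set up roughly like this (a sketch; the names come from the method below, and the output path and setup methods are illustrative):
// Hypothetical class-level members assumed by addPage below.
Document document;
PdfCopy copyPdf;
PdfReader reader;
PdfImportedPage copiedPage;

public void openOutput(string outputPdf)
{
    // Called once before the ~36,000 addPage calls.
    document = new Document();
    copyPdf = new PdfCopy(document, new FileStream(outputPdf, FileMode.Create));
    document.Open();
}

public void closeOutput()
{
    // Called once after the last addPage call; writes out and closes the merged PDF.
    document.Close();
}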
public void addPage(String inputPdf, String barcodeText, String pageTitle)
{
    try
    {
        //read in the pdf
        reader = new PdfReader(inputPdf);
        //all pdfs must have 2 pages (front and back).
        //set to throw an out of bounds error if not. caught up stream
        for (int i = 1; i <= Math.Min(reader.NumberOfPages, 2); i++)
        {
            //import the page from source pdf
            copiedPage = copyPdf.GetImportedPage(reader, i);
            // add the page to the new document
            copyPdf.AddPage(copiedPage);
        }
        //cleanup this page, keeps a big memory leak away
        copyPdf.FreeReader(reader);
        copyPdf.Flush();
    }
    finally
    {
        reader.Close();
    }
}
Give the PdfSmartCopy a try. Not sure if it's faster or not.
Document document = new Document();
using (FileStream output = new FileStream("merged.pdf", FileMode.Create))
using (PdfSmartCopy copy = new PdfSmartCopy(document, output))
{
    document.Open();
    foreach (string thisFile in files)
    {
        PdfReader reader = new PdfReader(thisFile);
        copy.AddPage(copy.GetImportedPage(reader, 1));
        copy.AddPage(copy.GetImportedPage(reader, 2));
        // Release the reader's resources before moving to the next file; keeps memory in check.
        copy.FreeReader(reader);
        reader.Close();
    }
    document.Close();
}