Reading Excel spreadsheet using EPPlus - model-view-controller

Could someone point me in the right direction on how to read an Excel spreadsheet and loop through all the rows and columns to retrieve values using EPPlus and MVC? So far I have seen examples that create a spreadsheet, but I did not find any that open an Excel file and read values from it. Any help would be appreciated.
TIA
Sue..

Simple example
// Get the file we are going to process
var existingFile = new FileInfo(filePath);
// Open and read the XlSX file.
using (var package = new ExcelPackage(existingFile))
{
// Get the work book in the file
var workBook = package.Workbook;
if (workBook != null)
{
if (workBook.Worksheets.Count > 0)
{
// Get the first worksheet
var currentWorksheet = workBook.Worksheets.First();
// read some data
object col1Header = currentWorksheet.Cells[1, 1].Value;
}
}
}

A simple example of how you can read an Excel file using EPPlus in .NET 4.5:
public void readXLS(string FilePath)
{
FileInfo existingFile = new FileInfo(FilePath);
using (ExcelPackage package = new ExcelPackage(existingFile))
{
//get the first worksheet in the workbook
ExcelWorksheet worksheet = package.Workbook.Worksheets[1];
int colCount = worksheet.Dimension.End.Column; //get Column Count
int rowCount = worksheet.Dimension.End.Row; //get row count
for (int row = 1; row <= rowCount; row++)
{
for (int col = 1; col <= colCount; col++)
{
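// Value is null for empty cells, so fall back to an empty string before calling ToString()
Console.WriteLine(" Row:" + row + " column:" + col + " Value:" + (worksheet.Cells[row, col].Value ?? "").ToString().Trim());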
}
}
}
}

Related

How to retrieve column names from a excel sheet?

Using EPPlus I'm writing data to multiple sheets. If a sheet does not exist yet I add it; otherwise I retrieve the used row count, append data after the last used row, and save the file.
FileInfo newFile = new FileInfo("Excel.xlsx");
using (ExcelPackage xlPackage = new ExcelPackage(newFile))
{
var ws = xlPackage.Workbook.Worksheets.FirstOrDefault(x => x.Name == language.Culture);
if (ws == null)
{
worksheet = xlPackage.Workbook.Worksheets.Add(language.Culture);
//writing data
}
else
{
worksheet = xlPackage.Workbook.Worksheets[language.Culture];
colCount = worksheet.Dimension.End.Column;
rowCount = worksheet.Dimension.End.Row;
//write data
}
worksheet.Cells[worksheet.Dimension.Address].AutoFitColumns();
xlPackage.Save();
}
And it is working great.
Now I want to retrieve the column names of each sheet in the excel using LinqToExcel and this is my code
string sheetName = language.Culture;
var excelFile = new ExcelQueryFactory(excelPath);
IQueryable<Row> excelSheetValues = from workingSheet in excelFile.Worksheet(sheetName) select workingSheet;
string[] headerRow = excelFile.GetColumnNames(sheetName).ToArray();
The line that reads the header row throws an exception:
An OleDbException exception was caught
External table is not in the expected format.
But I don't want to use OleDb directly; I want to work with LinqToExcel.
Note: When I'm working with a single sheet rather than multiple sheets
it works fine and retrieves all columns. Where am I going wrong?
(Based on OP's Comments)
The AutoFitColumns function has always been a little touchy. The important thing to remember is to call it AFTER you load the cell data.
But if you want to use a minimum width (when columns are very narrow and you want to enforce a minimum), I find EPPlus to be unreliable. It seems to always use the DefaultColWidth of the worksheet even if you pass a minimumWidth to one of the function overloads.
Here is how I get around it:
[TestMethod]
public void Autofit_Column_Range_Test()
{
//http://stackoverflow.com/questions/31165959/how-to-retrieve-column-names-from-a-excel-sheet
//Throw in some data
var datatable = new DataTable("tblData");
datatable.Columns.Add(new DataColumn("Nar", typeof(int))); //Without the workaround this column would not be autofitted, since the default width of a new worksheet is usually 8.43
datatable.Columns.Add(new DataColumn("Wide Column", typeof(int)));
datatable.Columns.Add(new DataColumn("Really Wide Column", typeof(int)));
for (var i = 0; i < 20; i++)
{
var row = datatable.NewRow();
row[0] = i;
row[1] = i * 10;
row[2] = i * 100;
datatable.Rows.Add(row);
}
var existingFile2 = new FileInfo(@"c:\temp\temp.xlsx");
if (existingFile2.Exists)
existingFile2.Delete();
using (var package = new ExcelPackage(existingFile2))
{
//Add the data
var ws = package.Workbook.Worksheets.Add("Sheet1");
ws.Cells.LoadFromDataTable(datatable, true);
//Keep track of the original default of 8.43 (excel default unless the user has changed it in their local Excel install)
var orginaldefault = ws.DefaultColWidth;
ws.DefaultColWidth = 15;
//Even if you pass in a minimumWidth as the first parameter like '.AutoFitColumns(15)' EPPlus usually ignores it and goes with DefaultColWidth
ws.Cells[ws.Dimension.Address].AutoFitColumns();
//Set it back to what it was so it respects the user's local setting
ws.DefaultColWidth = orginaldefault;
package.Save();
}
}

Kendogrid population

I'm new to all this Kendo stuff and I need help populating a Kendo grid from a CSV file.
The csv data is stored in an array of strings returned by a service.
Data looks like :
0: "Module,LogLevel,LogType,LoggedTime,LogMessage"
1: "00D02D5A4B66 ,CommServer ,Level3 ,Information ,03/16/2015 00:32:57:5716 ,[ISOMMessageHandler::Initialize]-[EventCount:20,ObjectRetryCount:6]"
2: "00D02D5A4B66 ,CommServer ,Level1 ,Information ,03/16/2015 00:32:57:5716 ,ISOMProtocolHandler::HandleConnectGeneric] - Before UpdatePanelTouched - CommServerID : 1, ConnectionMode : 2"
3: "00D02D5A4B66 ,CommServer ,Level4 ,Information ,03/16/2015 00:32:57:5716 ,[PanelDataConfigurationHandler : UpdatePanelConnectionStatus] : CommServerID 1, CommMode : 2"
I need to display the data at index 0 as the column titles
and the rest in the cells of those columns.
My advice is to make a wrapper method yourself and get it into JSON.
I needed a wrapper, as Thomas suggested.
Here is my wrapper function:
function csvJSON(lines) {
var result = [];
var headers = lines[0].split(",");
headers.unshift("MAC");
for (var i = 1; i < lines.length; i++) {
var obj = {};
var currentline = lines[i].split(",");
for (var j = 0; j < headers.length; j++) {
obj[headers[j]] = currentline[j];
}
result.push(obj);
}
return result;
}
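One caveat with the wrapper above: the sample log messages contain commas themselves (for example `[EventCount:20,ObjectRetryCount:6]`), so a plain `split(",")` cuts the message short, and the values keep their trailing spaces. The following is a minimal sketch of one way around that, assuming only the last field (LogMessage) can contain commas and that every data row starts with an extra MAC field missing from the header line. The names `csvHeaders`, `csvToRows`, and `toColumns` are hypothetical helpers, not part of Kendo.

```javascript
// Hypothetical helper: header names from line 0, with "MAC" prepended
// because the data rows carry a leading MAC field the header line omits.
function csvHeaders(lines) {
  return ["MAC"].concat(lines[0].split(",").map(function (h) {
    return h.trim();
  }));
}

// Hypothetical helper: one object per data line.
// Assumption: only the last field (LogMessage) may contain commas.
function csvToRows(lines) {
  var headers = csvHeaders(lines);
  var rows = [];
  for (var i = 1; i < lines.length; i++) {
    var parts = lines[i].split(",");
    // Keep the leading fields as-is and re-join everything beyond them
    // into the last field, so commas inside the message survive.
    var head = parts.slice(0, headers.length - 1);
    var tail = parts.slice(headers.length - 1).join(",");
    var fields = head.concat(tail);
    var obj = {};
    for (var j = 0; j < headers.length; j++) {
      obj[headers[j]] = (fields[j] || "").trim();
    }
    rows.push(obj);
  }
  return rows;
}

// Hypothetical helper: column definitions in the { field, title } shape
// that the Kendo grid's "columns" option accepts.
function toColumns(headers) {
  return headers.map(function (h) {
    return { field: h, title: h };
  });
}
```

The results could then be handed to the grid along the lines of `$("#grid").kendoGrid({ dataSource: { data: csvToRows(lines) }, columns: toColumns(csvHeaders(lines)) })`, where `#grid` is a placeholder selector. If fields other than the last one can contain commas, a proper CSV parser with quoting support would be needed instead.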

Performance difference when reading/writing many files with EPPlus versus Spreadsheet Gear

I've made a simple performance test between EPPlus and Spreadsheet Gear to see if there is any significant difference that would justify buying Spreadsheet Gear.
I am no expert at either application so it's possible the tests aren't written the most efficient way.
The test does the following:
1. Opens an existing Excel-file with 1000 rows and 3 columns. Saves the three values into an entity that is saved into a List<>.
2. Open a new Excel-object
3. Create a header row (bold) with the title of each column.
4. Write back the 1000 entities.
5. Save the new Excel file.
If I run this test once, EPPlus comes out the winner (approximate times: EPPlus = 280ms, SG = 500ms). If I instead run the test 10 times in a row (a for-loop opening, copying, and saving 10 separate times), Spreadsheet Gear is faster (approximate times per file: EPPlus = 165ms, SG = 95ms). For 20 tests the approximate times are EPPlus = 160ms / file and SG = 60ms / file.
It seems that (to a certain extent at least) Spreadsheet Gear gets faster and faster the more files I create.
Could anyone explain why EPPlus is the slower one when running consecutive tests? And can I make changes to the code to change this?
EPPlus test function:
var timer = new Stopwatch();
timer.Start();
var data = new List<Item>();
using (var excelIn = new ExcelPackage(new FileInfo(folder + fileIn)))
{
var sheet = excelIn.Workbook.Worksheets[1];
var row = 2;
while (sheet.Cells[row, 1].Value != null)
{
data.Add(new Item()
{
Id = int.Parse(sheet.Cells[row, 1].Text),
Title = sheet.Cells[row, 2].Text,
Value = int.Parse(sheet.Cells[row, 3].Text)
});
row++;
}
}
using (var excelOut = new ExcelPackage())
{
var sheet = excelOut.Workbook.Worksheets.Add("Out");
sheet.Cells.LoadFromCollection(data);
sheet.InsertRow(1, 1);
sheet.Cells[1, 1, 1, 3].Style.Font.Bold = true;
sheet.Cells[1, 1].Value = "Id";
sheet.Cells[1, 2].Value = "Title";
sheet.Cells[1, 3].Value = "Value";
excelOut.SaveAs(new FileInfo(folder + "EPPlus_" + Guid.NewGuid() + ".xlsx"));
}
timer.Stop();
return timer.ElapsedMilliseconds;
Spreadsheet Gear:
var timer = new Stopwatch();
timer.Start();
var data = new List<Item>();
var excelIn = Factory.GetWorkbook(folder + fileIn);
var sheetIn = excelIn.Worksheets[0];
var rowIn = 1;
while (sheetIn.Cells[rowIn, 0].Value != null)
{
data.Add(new Item()
{
Id = int.Parse(sheetIn.Cells[rowIn, 0].Text),
Title = sheetIn.Cells[rowIn, 1].Text,
Value = int.Parse(sheetIn.Cells[rowIn, 2].Text)
});
rowIn++;
}
excelIn.Close();
var excelOut = Factory.GetWorkbook();
var sheetOut = excelOut.Worksheets.Add();
sheetOut.Name = "Out";
var rowOut = 0;
sheetOut.Cells[rowOut, 0, rowOut, 2].Font.Bold = true;
sheetOut.Cells[rowOut, 0].Value = "Id";
sheetOut.Cells[rowOut, 1].Value = "Title";
sheetOut.Cells[rowOut++, 2].Value = "Value";
foreach (var item in data)
{
sheetOut.Cells[rowOut, 0].Value = item.Id;
sheetOut.Cells[rowOut, 1].Value = item.Title;
sheetOut.Cells[rowOut++, 2].Value = item.Value;
}
excelOut.SaveAs(folder + "SpreadsheetGear_" + Guid.NewGuid() + ".xlsx", FileFormat.OpenXMLWorkbook);
excelOut.Close();
timer.Stop();
return timer.ElapsedMilliseconds;
Main function
var runs = 1;
var testerG = new TestSpreadsheetGear();
var testerE = new TestEpPlus();
var msE = 0.0;
var msG = 0.0;
var i = 0;
for (i = 0; i < runs; ++i)
{
msG += new TestSpreadsheetGear().Run(folder, originalFile);
}
for(i = 0; i < runs; ++i)
{
msE += new TestEpPlus().Run(folder, originalFile);
}
Console.WriteLine("Spreadsheet time: " + msG + ". Per file: " + msG / runs);
Console.WriteLine("EP Plus time: " + msE + ". Per file: " + msE / runs);
Console.ReadKey();
I believe that the reason for the results you are seeing is the fact that on the first run the .NET CLR must JIT the code. Since SpreadsheetGear is a complete spreadsheet engine under the hood (as opposed to a read / write library) there is more code to JIT - thus the first run is taking longer for SpreadsheetGear than EPPlus (I am speculating here but have a great deal of experience in benchmarking .NET code over the last 10 years).
I do not have EPPlus installed, but I did write a test which tries to do the same thing you are doing with SpreadsheetGear 2012. Since I don't have your starting workbook, I first build the workbook. Then I use more optimal SpreadsheetGear APIs. The first time I run it I get 141 milliseconds for SpreadsheetGear 2012. After the first run I get 9 or 10 milliseconds for each run on an overclocked Core i7-980x running Win7 x86, with a release build run without a debugger.
I have pasted my code below (just paste it into a .NET 4.0 C# console application).
One more thought I have is that this is a very small test case. To really see the performance of SpreadsheetGear 2012 try this with 100,000 rows or even 1 million rows.
Disclaimer: I own SpreadsheetGear LLC
using System;
using System.Collections.Generic;
using System.Diagnostics;
using SpreadsheetGear;
namespace SGvsEPPlus
{
class Program
{
internal struct Item
{
internal Item(int id, string title, int value)
{
Id = id;
Title = title;
Value = value;
}
internal int Id;
internal string Title;
internal int Value;
}
static void Test(int rows)
{
string filename = @"C:\tmp\MyWorkbook.xlsx";
Console.Write("Test({0})...", rows);
var timer = new Stopwatch();
// Create workbook since we don't have poster's original workbook.
timer.Restart();
var workbook = Factory.GetWorkbook();
var values = (SpreadsheetGear.Advanced.Cells.IValues)workbook.Worksheets[0];
for (int row = 1; row <= rows; row++)
{
values.SetNumber(row, 0, row);
values.SetText(row, 1, "Title " + row);
values.SetNumber(row, 2, row * 10);
}
Console.Write("Create workbook={0:0}...", timer.Elapsed.TotalMilliseconds);
// Save workbook
timer.Restart();
workbook.SaveAs(filename, FileFormat.OpenXMLWorkbook);
Console.Write("Save workbook={0:0}...", timer.Elapsed.TotalMilliseconds);
// Track total time of original test.
var totalTimer = Stopwatch.StartNew();
// Open workbook
timer.Restart();
var excelIn = Factory.GetWorkbook(filename);
Console.Write("Open excelIn={0:0}...", timer.Elapsed.TotalMilliseconds);
// Copy workbook to list
timer.Restart();
var sheetIn = excelIn.Worksheets[0];
var valuesIn = (SpreadsheetGear.Advanced.Cells.IValues)sheetIn;
var rowIn = 1;
var data = new List<Item>(rows);
while (valuesIn[rowIn, 0] != null)
{
data.Add(new Item(
(int)valuesIn[rowIn, 0].Number,
valuesIn[rowIn, 1].Text,
(int)valuesIn[rowIn, 2].Number));
rowIn++;
}
excelIn.Close(); // Not necessary but left for consistency.
Console.Write("excelIn->data={0:0}...", timer.Elapsed.TotalMilliseconds);
timer.Restart();
var excelOut = Factory.GetWorkbook();
var sheetOut = excelOut.Worksheets[0];
var valuesOut = (SpreadsheetGear.Advanced.Cells.IValues)sheetOut;
sheetOut.Name = "Out";
var rowOut = 0;
sheetOut.Cells[rowOut, 0, rowOut, 2].Font.Bold = true;
sheetOut.Cells[rowOut, 0].Value = "Id";
sheetOut.Cells[rowOut, 1].Value = "Title";
sheetOut.Cells[rowOut++, 2].Value = "Value";
foreach (var item in data)
{
valuesOut.SetNumber(rowOut, 0, item.Id);
valuesOut.SetText(rowOut, 1, item.Title);
valuesOut.SetNumber(rowOut, 2, item.Value);
rowOut++;
}
Console.Write("data->excelOut={0:0}...", timer.Elapsed.TotalMilliseconds);
timer.Restart();
excelOut.SaveAs(@"C:\tmp\SpreadsheetGear_" + Guid.NewGuid() + ".xlsx", FileFormat.OpenXMLWorkbook);
excelOut.Close(); // Again - not necessary.
Console.WriteLine("Save excelOut={0:0}...", timer.Elapsed.TotalMilliseconds);
Console.WriteLine(" Total={0:0}", totalTimer.Elapsed.TotalMilliseconds);
}
static void Main(string[] args)
{
// Do it three times with 1000 rows. Note that the first
// time takes longer because code must be JITted.
Test(1000);
Test(1000);
Test(1000);
}
}
}

How to read Excel cells and write to database

I am writing a C# application that reads from an Excel sheet. It currently reads the whole sheet, but I am only interested in reading particular cells in a row, then moving to the next row and reading the same cells there. The cells to skip are the same for every row. Below is my sample code for reading Excel:
private void button1_Click(object sender, EventArgs e)
{
Excel.Application xlApp = new Excel.Application();
Excel.Workbook xlWorkbook = xlApp.Workbooks.Open(@"C:/test.xlsx");
Excel._Worksheet xlWorksheet = xlWorkbook.Sheets[1];
Excel.Range xlRange = xlWorksheet.UsedRange;
int rowCount = xlRange.Rows.Count;
int colCount = xlRange.Columns.Count;
for(int i=1; i <= rowCount; i++)
{
for(int j=1; j<=colCount; j++)
{
MessageBox.Show(xlRange.Cells[i,j].Value2.ToString());
}
}
}
You could use OleDb instead of Excel.Interop. Your question only requires reading cell values, so it is easier to treat your Excel file as a data table:
string fileName = @"C:\test.xlsx";
string connectionString = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;" +
"Data Source={0};Extended Properties='Excel 12.0;HDR=YES;IMEX=0'", fileName);
using(OleDbConnection cn = new OleDbConnection(connectionString))
{
cn.Open();
OleDbCommand cmd = new OleDbCommand("SELECT Column1, Column2, Column3 From [Sheet1$]", cn);
OleDbDataReader rd = cmd.ExecuteReader();
while(rd.Read())
{
Console.WriteLine("Loop");
Console.WriteLine(rd.GetString(0));
Console.WriteLine(rd.GetString(1));
Console.WriteLine(rd.GetString(2));
Console.WriteLine();
}
}
A couple of notes:
I'm using the ACE OLEDB provider because your file has the XLSX extension.
I'm assuming your file has column headers in the first row, named Column1, Column2, and so on.

NPOI insert row like excel

How can I use NPOI to insert a row the way Excel does?
The Excel insert command copies the format from the row above.
Thanks!
static void InsertRows(ref HSSFSheet sheet1, int fromRowIndex, int rowCount)
{
sheet1.ShiftRows(fromRowIndex, sheet1.LastRowNum, rowCount, true, false, true);
for (int rowIndex = fromRowIndex; rowIndex < fromRowIndex + rowCount; rowIndex++)
{
HSSFRow rowSource = sheet1.GetRow(rowIndex + rowCount);
HSSFRow rowInsert = sheet1.CreateRow(rowIndex);
rowInsert.Height = rowSource.Height;
for (int colIndex = 0; colIndex < rowSource.LastCellNum; colIndex++)
{
HSSFCell cellSource = rowSource.GetCell(colIndex);
HSSFCell cellInsert = rowInsert.CreateCell(colIndex);
if (cellSource != null)
{
cellInsert.CellStyle = cellSource.CellStyle;
}
}
}
}
Maybe you can look here for some inspiration:
HSSFRow newRow = worksheet.GetRow(destinationRowNum);
HSSFRow sourceRow = worksheet.GetRow(sourceRowNum);
// If the row exist in destination, push down all rows by 1 else create a new row
if (newRow != null)
{
worksheet.ShiftRows(destinationRowNum, worksheet.LastRowNum, 1);
}
else
{
newRow = worksheet.CreateRow(destinationRowNum);
}
I am using this code to create a new row. It might come in handy for you.

Resources