Slow loop over data table collection - for-loop

I am updating values in an Excel workbook using values from a MySQL database. There are just eleven rows in the WorkbookMap list and six DataTables in the RptValuesSet DataSet. I've confirmed that the problem is in this loop, not in the communication with the database: getting the results is fast, but writing them to the workbook is slow. The code below works fine when the DataTables in the DataSet are small; with a 3-column, 7-row result set, I receive the updated Excel workbook almost instantly. However, the loop slows down noticeably as the result set grows; a 3-column, 50-row DataTable results in a 7-10 second delay in returning the updated Excel workbook. I'm not sure I really need to put the DataTables into a collection, but that's the only way I could figure out how to iterate over them. Any tips on optimizing this loop would be much appreciated!
// Create a list to contain the destination for the data in the workbook
List<WorkbookMap> wbMap = new List<WorkbookMap>();
// Create a new data set to contain results from database
DataSet RptValuesSet = new DataSet();
// RptValuesSet populated from database here....
// Create a collection so we can loop thru the dataset
DataTableCollection RptValuesColl = RptValuesSet.Tables;
for (int i = 0; i < RptValuesColl.Count; i++)
{
DataTable tbl = RptValuesColl[i];
// Find the correct entry in the workbook map
for (int j = 0; j < wbMap.Count; j++)
{
if (wbMap[j].SPCall == tbl.TableName)
{
// Write the results to the correct location in the workbook
MovingColumnRef = wbMap[j].StartColumn;
for (int c = 1; c < tbl.Columns.Count; c++)
{
row = wbMap[j].StartRow; // start at the top row for each new column
for (int r = 0; r < tbl.Rows.Count; r++)
{
// Write the database value to the workbook given the sheetName and cell address
UpdateValue(wbMap[j].SheetName, MovingColumnRef + row, tbl.Rows[r][c].ToString(), 0, wbMap[j].String);
row++;
}
MovingColumnRef = IncrementColRef(MovingColumnRef);
}
}
}
}

Without delving deeply into your code, I noticed you said you believe the slowness comes from writing to the sheet. Try putting the data into an array first and then writing that array to the workbook in a single assignment. For example, you would write to the sheet like this, where anchorRangeName is the name of a single-cell range in the workbook:
private void WriteResultToRange(Excel.Workbook wb, string anchorRangeName, object[,] resultArray)
{
Excel.Range resultRange = GetRange(anchorRangeName, wb).get_Resize(resultArray.GetLength(0), resultArray.GetLength(1));
resultRange.Value2 = resultArray;
}
You'll still need to get the data from the database into an array.


office script - range find - return row or array to power automate

I have been trying several different ways to write an Office Script that searches for a value in a cell and returns the matching row or rows to Power Automate.
I believe I need to use range.find in order to make use of the "completeMatch: true" option.
However, I have also tried a filter and a forEach loop to find rows which include the text I am searching for.
I'm after a hint as to which method might be best.
Essentially, I'm trying to do the following:
Power Automate passes a text parameter to the script
The script searches for a match in an Excel business spreadsheet
The script finds the match(es)
The script passes the row(s) back to Power Automate as an array
This is what I have so far. Essentially, it just finds the row number in which the matching result is found. This seems to work better for avoiding partial matches (which happened with the filter method).
Any pointers are most welcome.
function main(workbook: ExcelScript.Workbook, siteNameToFilter: string) {
let activeSheet = workbook.getActiveWorksheet();
let range = activeSheet.getUsedRange();
let values = range.getValues();
/**
* This script searches for the next instance of the text "Fino" on the current worksheet.
*/
// Get the next cell that contains "Fino".
let findCell = range.find("Fino", {
completeMatch: true, /* Don't match if the cell text only contains "fino" as part of another string. */
matchCase: false,
searchDirection: ExcelScript.SearchDirection.forward /* Start at the beginning of the range and go to later columns and rows. */
});
// Set focus on the found cell.
findCell.select();
// Remove the "TK" text value from the cell, as well as any formatting that may have been added.
//tkCell.clear(ExcelScript.ClearApplyTo.all);
let row = findCell.getRow().getUsedRange();
let ur = findCell.getUsedRange();
console.log(row);
}
I think Find may only be returning the first match. It sounds like you want all matches with the siteName. To do this, you'd either want to filter the range or loop through it.
Here's an example that loops through the range and adds the values from the rows containing the site name to an array. After the loop has completed, the array containing the values is returned by the function:
function main(workbook: ExcelScript.Workbook, siteNameToFilter: string) {
let activeSheet = workbook.getActiveWorksheet();
let range = activeSheet.getUsedRange();
let values = range.getValues() as string[][];
let rowCount = range.getRowCount();
let colCount = range.getColumnCount();
// The used range may not start at A1, so keep its offsets within the worksheet.
let rowIndex = range.getRowIndex();
let colIndex = range.getColumnIndex();
let rowsArr: string[][][] = [];
for (let i = 0; i < rowCount; i++) {
for (let j = 0; j < colCount; j++) {
if (values[i][j] === siteNameToFilter) {
rowsArr.push(activeSheet.getRangeByIndexes(rowIndex + i, colIndex, 1, colCount).getValues() as string[][]);
break; // add each matching row only once
}
}
}
return rowsArr;
}
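As a variation on the loop above, here is a minimal sketch that does the matching on the array returned by getValues() alone, so no extra range call is needed per match and the logic can be checked outside Excel. The helper name findMatchingRows is hypothetical (it is not part of the Office Scripts API), and the case-insensitive exact comparison is an assumption meant to resemble completeMatch: true with matchCase: false.

```typescript
// Hypothetical helper, not part of the Office Scripts API.
// Returns every row in which at least one cell equals `needle` exactly
// (case-insensitive). Each matching row appears once in the result.
function findMatchingRows(
  values: (string | number | boolean)[][],
  needle: string
): (string | number | boolean)[][] {
  const target = needle.toLowerCase();
  return values.filter(row =>
    row.some(cell => String(cell).toLowerCase() === target)
  );
}
```

Inside main, this could be used as `return findMatchingRows(range.getValues(), siteNameToFilter);`, which hands Power Automate a plain two-dimensional array of the matching rows.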

Google Spreadsheet Loop & IF statement: Loop through co-worker list and add them to a rotation schedule

EDIT - SOLVED
I have to make a rotation schedule for coworkers for the next year. Some coworkers have standard days off and I do not want to schedule them on those days.
This is the manual outcome I would like to get.
Example: Consultant A does not work on mondays, so I do not want Consultant A to be added to the schedule on a monday.
I then want consultant B to be added to the schedule as a fill-up. Consultant A would be next in line on a tuesday etc. Next would be consultant C but consultant C does not work on wednesdays. Therefore, we need to take consultant D for wednesday and consultant C on a thursday, and so on. When we are at the last consultant of the F column, it needs to start again at consultant A.
I have tried all kinds of formulas, like if statements and arrayformula. But there is no way that I know of to loop through the F column just with formulas.
I am not sure if this is at all clear what I want to achieve here, I am stuck 😄
I am also using an add-on to send the schedule to everyone's agenda; that's also the reason why I'd love to automate this, because it would help me SO much.
I did try myself on some coding, but I am no coder and I am not sure if it would be helpful at all to share my failure 😄 But this is what I've tried so far:
function Loop() {
var ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var EndRow = ss.getLastRow();
for (var i = 2; i <= EndRow; i++) {
var Day = ss.getRange(i,2).getValue();
var Consultants = ss.getRange(i,6).getValue();
var Off = ss.getRange(i,7).getValue ();
var Count = ss.getRange(i,8).getValue();
if(Day == Off){
ss.getRange(i, 3).setValue(Consultants)
}else{
ss.getRange(i, 3).setValue(Consultants)
}
}
}
EDIT:
I found a way without using Apps Script. It costs me some more manual work, and I first tried it with a shorter team list.
The highlighted yellow cells are the cells in which the day off was identical to the work-day cell, so they got switched.
I did have to copy and paste my input list of consultants, but if this is the only manual step, it's fine :)
Try this:
Code:
function myFunction() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sh = ss.getSheetByName("Sheet1");
var dayCol = sh.getRange("B2:B343").getValues().flat(); //get day column values and convert it to 1d array
var dayOffCol = sh.getRange("G2:G9").getValues().flat(); //get day off values and convert it to 1d array
var dayOffColCopy; //initialize copy
var consCol = sh.getRange("F2:F9").getValues().flat(); //get consultants values or column F
var consColCopy; //initialize copy
var tempArray = []; //storage of final value for column C
for(var i = 0; i < dayCol.length; i++){ //loop through dayCol values
var ctr = (i % consCol.length); //modulo used as a counter: it returns to 0 whenever i is divisible by the length of consCol (8 in this example)
//The if statement below resets the values of
//consColCopy and dayOffColCopy once they have been emptied by splice()
if(ctr == 0){
consColCopy = consCol.slice();
dayOffColCopy = dayOffCol.slice();
}
//the loop below will get the first non-matching values of dayCol and dayOffColCopy,
//the first non-matching values will be removed to the copy variables using splice()
//and insert it to tempArray using push()
for(var j = 0; j < dayOffColCopy.length; j++){ //loop through dayOffColCopy values
if(dayCol[i] != dayOffColCopy[j]){
tempArray.push(consColCopy.splice(j, 1));
dayOffColCopy.splice(j, 1);
break; //exit loop
}
}
}
sh.getRange(2, 3, tempArray.length, 1).setValues(tempArray); //set the values of temp array to column C
}
Example Data & Output:
Note: Make sure to use the cell that has data in your range and change the sheet name. I also added comments in my code to explain the process.
References:
Array.prototype.push()
Array.prototype.slice()
Array.prototype.splice()
Class Range
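The rotation logic in the answer above can also be sketched as a pure function, which makes it possible to check the result without a spreadsheet. This is only an illustration under stated assumptions: buildRotation and its parameter names are hypothetical, and unlike the answer's code it writes an empty string when nobody left in the current round can work that day, so the output stays aligned with the day column.

```typescript
// Hypothetical pure version of the rotation logic; all names are illustrative.
function buildRotation(
  days: string[],
  consultants: string[],
  daysOff: string[]
): string[] {
  const schedule: string[] = [];
  let pool: string[] = [];
  let poolOff: string[] = [];
  for (let i = 0; i < days.length; i++) {
    // Refill the pool at the start of every rotation round.
    if (i % consultants.length === 0) {
      pool = consultants.slice();
      poolOff = daysOff.slice();
    }
    // Pick the first remaining consultant whose day off is not this day.
    const j = poolOff.findIndex(off => off !== days[i]);
    if (j === -1) {
      schedule.push(""); // assumption: leave the day empty instead of skipping it
      continue;
    }
    schedule.push(pool.splice(j, 1)[0]);
    poolOff.splice(j, 1);
  }
  return schedule;
}
```

In Apps Script the type annotations would have to be removed (or the file compiled from TypeScript), and the result could be written back with something like `sh.getRange(2, 3, schedule.length, 1).setValues(schedule.map(name => [name]))`.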

Processing: How can I find the number of times two fields are equal in a CSV file?

I'm learning Processing for the first time and I've been tasked to deal with data but it's been terribly confusing for me.
For every line of a CSV file (apart from the header), I want to compare two specific columns of each. i.e. ListA vs ListB
For example, with the data below:
ListA,ListB
Male,Yes
Male,No
Female,Yes
Male,Yes
And for example, I want to check for all instances that a value in ListA is "Male" AND that the corresponding value in ListB is "Yes". In this scenario, I should get the value "2" for the two rows this is true.
How would I do that?
For now, I have a 2D String array of the data in the CSV file. From that I managed to assign specific columns as ListA and ListB. I tried using sort but it would only sort one list and not both at the same time.
Current relevant code:
for (int i=1; i<lines.length; i++) {
listA[i-1] = csv[i][int(whichA)];
listB[i-1] = csv[i][int(whichB)];
}
lA = Arrays.asList(listA);
lB = Arrays.asList(listB);
Not sure if this code really helps make things clearer though. :/
Any help would be appreciated. Thank you.
So something like this should do what you need. Using the listA and listB arrays from your code:
int numRows = 0;
for (int i = 0; i < listA.length; i++) {
// Compare String contents with equals(), not ==
if (listA[i].equals("Male") && listB[i].equals("Yes")) {
numRows++;
// add to a new collection here if you need the data
}
}

LinqToExcel - Need to start at a specific row

I'm using the LinqToExcel library. Working great so far, except that I need to start the query at a specific row. This is because the excel spreadsheet from the client uses some images and "header" information at the top of the excel file before the data actually starts.
The data itself will be simple to read and is fairly generic, I just need to know how to tell the ExcelQueryFactory to start at a specific row.
I am aware of the WorksheetRange<Company>("B3", "G10") option, but I don't want to specify an ending row, just where to start reading the file.
Using the latest v. of LinqToExcel with C#
I just tried this code and it seemed to work just fine:
var book = new LinqToExcel.ExcelQueryFactory(@"E:\Temporary\Book1.xlsx");
var query =
from row in book.WorksheetRange("A4", "B16384")
select new
{
Name = row["Name"].Cast<string>(),
Age = row["Age"].Cast<int>(),
};
I only got back the rows with data.
I suppose that you already solved this, but maybe for others - looks like you can use
var excel = new ExcelQueryFactory(path);
var allRows = excel.WorksheetNoHeader();
//start from 3rd row (zero-based indexing), length = allRows.Count() or computed range of rows you want
for (int i = 2; i < length; i++)
{
RowNoHeader row = allRows.ElementAtOrDefault(i);
//process the row - access columns as you want - also zero-based indexing
}
Not as simple as specifying some Range("B3", ...), but it is another way.
Hope this helps at least somebody ;)
I tried this, and it works fine for my scenario.
//get the sheets info
var faceWrksheet = excel.Worksheet(facemechSheetName);
// get the total rows count.
int _faceMechRows = faceWrksheet.Count();
// append with End Range.
var faceMechResult = excel.WorksheetRange<ExcelFaceMech>("A5", "AS" + _faceMechRows.ToString(), SheetName).
Where(i => i.WorkOrder != null).Select(x => x).ToList();
Have you tried WorksheetRange<Company>("B3", "G")
Unfortunately, in the current iteration of the LinqToExcel framework, there does not appear to be any way to do this.
To get around this, we are requiring the client to put the data to be uploaded in its own "sheet" within the Excel document, with the header row as the first row and the data under it. If they want any "meta data", they will need to include it in another sheet. Below is an example from the LinqToExcel documentation on how to query a specific sheet.
var excel = new ExcelQueryFactory("excelFileName");
var oldCompanies = from c in excel.Worksheet<Company>("US Companies") //worksheet name = 'US Companies'
where c.LaunchDate < new DateTime(1900, 1, 1)
select c;

How can I get the "actual" count of elements in an IEnumerable?

If I write:
for (int i = 0; i < Strutture.Count(); i++)
{
}
and Strutture is an IEnumerable with 200 elements, IIS crashes. That's because I see every time I do Strutture.Count() it executes all LINQ queries linked with that IEnumerable.
So, how can I get the "current" number of elements? Do I need a list?
"That's because I see every time I do Strutture.Count() it executes all LINQ queries linked with that IEnumerable."
Without doing so, how is it going to know how many elements there are?
For example:
Enumerable.Range(0,1000).Where(i => i % 2==0).Skip(100).Take(5).Count();
Without executing the LINQ, how could you know how many elements there are?
If you want to know how many elements there are in the source (e.g. Enumerable.Range) then I suggest you use a reference to that source and query it directly. E.g.
var numbers = Enumerable.Range(0,1000);
numbers.Count();
Also keep in mind some data sources don't really have a concept of 'Count' or if they do it involves going through every single item and counting them.
Lastly, if you're using .Count() repetitively [and you don't expect the value to actually change] it can be a good idea to cache:
var count = numbers.Count();
for (int i =0; i<count; i++) // Do Something
Supplemental:
"At first Count(), LINQ queries are executes. Than, for the next, it just "check" the value :) Not "execute the LINQ query again..." :)" - Markzzz
Then why don't we do that?
var query = Enumerable.Range(0,1000).Where(i => i % 2==0).Skip(100).Take(5);
var result = query.ToArray(); //Gets and stores the result!
int length = result.Length; //Reads the stored length; the query is not run again
:)
"But when I do the first "count", it should store (after the LINQ queries) the new IEnumerable (the state is changed). If I do again .Count(), why LINQ need to execute again ALL queries." - Markzzz
Because you're creating a query that gets compiled down into X,Y,Z. You're running the same query twice however the result may vary.
For example, check this out:
static void Main(string[] args)
{
var dataSource = Enumerable.Range(0, 100).ToList();
var query = dataSource.Where(i => i % 2 == 0);
//Run the query once and return the count:
Console.WriteLine(query.Count()); //50
//Now lets modify the datasource - remembering this could be a table in a db etc.
dataSource.AddRange(Enumerable.Range(100, 100));
//Run the query again and return the count:
Console.WriteLine(query.Count()); //100
Console.ReadLine();
}
This is why I recommended storing the results of the query above!
Materialize the number:
int number = Strutture.Count();
for (int i = 0; i < number; i++)
{
}
or materialize the list:
var list = Strutture.ToList();
for (int i = 0; i < list.Count; i++)
{
}
or use a foreach
foreach(var item in Strutture)
{
}
