Related
I am trying to find a way to filter large data and remove rows in a worksheet, in less than one minute
The goal:
Find all records containing specific text in column 1, and delete the entire row
Keep all cell formatting (colors, font, borders, column widths) and formulas as they are
.
Test Data:
:
.
How the code works:
It starts by turning all Excel features Off
If the workbook is not empty and the text value to be removed exists in column 1
Copies the used range of column 1 to an array
Iterates over every value in array backwards
When it finds a match:
Appends the cell address to a tmp string in the format "A11,A275,A3900,..."
If the tmp variable length is close to 255 characters
Deletes rows using .Range("A11,A275,A3900,...").EntireRow.Delete Shift:=xlUp
Resets tmp to empty and moves on to the next set of rows
At the end, it turns all Excel features back On
.
The main issue is the Delete operation, and total duration time should be under one minute. Any code-based solution is acceptable as long as it performs under 1 minute.
This narrows the scope to very few acceptable answers. The answers already provided are also very short and easy to implement. One performs the operation in about 30 seconds, so there is at least one answer that provides an acceptable solution, and other may find it useful as well
.
My main initial function:
Sub DeleteRowsWithValuesStrings()
Const MAX_SZ As Byte = 240
Dim i As Long, j As Long, t As Double, ws As Worksheet
Dim memArr As Variant, max As Long, tmp As String
Set ws = Worksheets(1)
max = GetMaxCell(ws.UsedRange).Row
FastWB True: t = Timer
With ws
If max > 1 Then
If IndexOfValInRowOrCol("Test String", , ws.UsedRange) > 0 Then
memArr = .Range(.Cells(1, 1), .Cells(max, 1)).Value2
For i = max To 1 Step -1
If memArr(i, 1) = "Test String" Then
tmp = tmp & "A" & i & ","
If Len(tmp) > MAX_SZ Then
.Range(Left(tmp, Len(tmp) - 1)).EntireRow.Delete Shift:=xlUp
tmp = vbNullString
End If
End If
Next
If Len(tmp) > 0 Then
.Range(Left(tmp, Len(tmp) - 1)).EntireRow.Delete Shift:=xlUp
End If
.Calculate
End If
End If
End With
FastWB False: InputBox "Duration: ", "Duration", Timer - t
End Sub
Helper functions (turn Excel features off and on):
Public Sub FastWB(Optional ByVal opt As Boolean = True)
With Application
.Calculation = IIf(opt, xlCalculationManual, xlCalculationAutomatic)
.DisplayAlerts = Not opt
.DisplayStatusBar = Not opt
.EnableAnimations = Not opt
.EnableEvents = Not opt
.ScreenUpdating = Not opt
End With
FastWS , opt
End Sub
Public Sub FastWS(Optional ByVal ws As Worksheet = Nothing, _
Optional ByVal opt As Boolean = True)
If ws Is Nothing Then
For Each ws In Application.ActiveWorkbook.Sheets
EnableWS ws, opt
Next
Else
EnableWS ws, opt
End If
End Sub
Private Sub EnableWS(ByVal ws As Worksheet, ByVal opt As Boolean)
With ws
.DisplayPageBreaks = False
.EnableCalculation = Not opt
.EnableFormatConditionsCalculation = Not opt
.EnablePivotTable = Not opt
End With
End Sub
Finds last cell with data (thanks #ZygD - now I tested it in several scenarios):
Public Function GetMaxCell(Optional ByRef rng As Range = Nothing) As Range
'Returns the last cell containing a value, or A1 if Worksheet is empty
Const NONEMPTY As String = "*"
Dim lRow As Range, lCol As Range
If rng Is Nothing Then Set rng = Application.ActiveWorkbook.ActiveSheet.UsedRange
If WorksheetFunction.CountA(rng) = 0 Then
Set GetMaxCell = rng.Parent.Cells(1, 1)
Else
With rng
Set lRow = .Cells.Find(What:=NONEMPTY, LookIn:=xlFormulas, _
After:=.Cells(1, 1), _
SearchDirection:=xlPrevious, _
SearchOrder:=xlByRows)
If Not lRow Is Nothing Then
Set lCol = .Cells.Find(What:=NONEMPTY, LookIn:=xlFormulas, _
After:=.Cells(1, 1), _
SearchDirection:=xlPrevious, _
SearchOrder:=xlByColumns)
Set GetMaxCell = .Parent.Cells(lRow.Row, lCol.Column)
End If
End With
End If
End Function
Returns the index of a match in the array, or 0 if a match is not found:
Public Function IndexOfValInRowOrCol( _
ByVal searchVal As String, _
Optional ByRef ws As Worksheet = Nothing, _
Optional ByRef rng As Range = Nothing, _
Optional ByRef vertical As Boolean = True, _
Optional ByRef rowOrColNum As Long = 1 _
) As Long
'Returns position in Row or Column, or 0 if no matches found
Dim usedRng As Range, result As Variant, searchRow As Long, searchCol As Long
result = CVErr(9999) '- generate custom error
Set usedRng = GetUsedRng(ws, rng)
If Not usedRng Is Nothing Then
If rowOrColNum < 1 Then rowOrColNum = 1
With Application
If vertical Then
result = .Match(searchVal, rng.Columns(rowOrColNum), 0)
Else
result = .Match(searchVal, rng.Rows(rowOrColNum), 0)
End If
End With
End If
If IsError(result) Then IndexOfValInRowOrCol = 0 Else IndexOfValInRowOrCol = result
End Function
.
Update:
Tested 6 solutions (3 tests each): Excel Hero's solution is the fastest so far (removes formulas)
.
Here are the results, fastest to the slowest:
.
Test 1. Total of 100,000 records, 10,000 to be deleted:
1. ExcelHero() - 1.5 seconds
2. DeleteRowsWithValuesNewSheet() - 2.4 seconds
3. DeleteRowsWithValuesStrings() - 2.45 minutes
4. DeleteRowsWithValuesArray() - 2.45 minutes
5. QuickAndEasy() - 3.25 minutes
6. DeleteRowsWithValuesUnion() - Stopped after 5 minutes
.
Test 2. Total of 1 million records, 100,000 to be deleted:
1. ExcelHero() - 16 seconds (average)
2. DeleteRowsWithValuesNewSheet() - 33 seconds (average)
3. DeleteRowsWithValuesStrings() - 4 hrs 38 min (16701.375 sec)
4. DeleteRowsWithValuesArray() - 4 hrs 37 min (16626.3051757813 sec)
5. QuickAndEasy() - 5 hrs 40 min (20434.2104492188 sec)
6. DeleteRowsWithValuesUnion() - N/A
.
Notes:
ExcelHero method: easy to implement, reliable, extremely fast, but removes formulas
NewSheet method: easy to implement, reliable, and meets the target
Strings method: more effort to implement, reliable, but doesn't meet requirement
Array method: similar to Strings, but ReDims an array (faster version of Union)
QuickAndEasy: easy to implement (short, reliable and elegant), but doesn't meet requirement
Range Union: implementation complexity similar to 2 and 3, but too slow
I also made the test data more realistic by introducing unusual values:
empty cells, ranges, rows, and columns
special characters, like =[`~!##$%^&*()_-+{}[]\|;:'",.<>/?, separate and multiple combinations
blank spaces, tabs, empty formulas, border, font, and other cell formatting
large and small numbers with decimals (=12.9999999999999 + 0.00000000000000001)
hyperlinks, conditional formatting rules
empty formatting inside and outside data ranges
anything else that might cause data issues
I'm providing the first answer as a reference
Others may find it useful, if there are no other options available
Fastest way to achieve the result is not to use the Delete operation
Out of 1 million records it removes 100,000 rows in an average of 33 seconds
.
Sub DeleteRowsWithValuesNewSheet() '100K records 10K to delete
'Test 1: 2.40234375 sec
'Test 2: 2.41796875 sec
'Test 3: 2.40234375 sec
'1M records 100K to delete
'Test 1: 32.9140625 sec
'Test 2: 33.1484375 sec
'Test 3: 32.90625 sec
Dim oldWs As Worksheet, newWs As Worksheet, rowHeights() As Long
Dim wsName As String, t As Double, oldUsedRng As Range
FastWB True: t = Timer
Set oldWs = Worksheets(1)
wsName = oldWs.Name
Set oldUsedRng = oldWs.Range("A1", GetMaxCell(oldWs.UsedRange))
If oldUsedRng.Rows.Count > 1 Then 'If sheet is not empty
Set newWs = Sheets.Add(After:=oldWs) 'Add new sheet
With oldUsedRng
.AutoFilter Field:=1, Criteria1:="<>Test String"
.Copy 'Copy visible data
End With
With newWs.Cells
.PasteSpecial xlPasteColumnWidths
.PasteSpecial xlPasteAll 'Paste data on new sheet
.Cells(1, 1).Select 'Deselect paste area
.Cells(1, 1).Copy 'Clear Clipboard
End With
oldWs.Delete 'Delete old sheet
newWs.Name = wsName
End If
FastWB False: InputBox "Duration: ", "Duration", Timer - t
End Sub
.
At high level:
It creates a new worksheet, and keeps a reference to the initial sheet
AutoFilters column 1 on the searched text: .AutoFilter Field:=1, Criteria1:="<>Test String"
Copies all (visible) data from initial sheet
Pastes column widths, formats, and data to the new sheet
Deletes initial sheet
Renames the new sheet to the old sheet name
It uses the same helper functions posted in the question
The 99% of the duration is used by the AutoFilter
.
There are a couple limitations I found so far, the first can be addressed:
If there are any hidden rows on the initial sheet, it unhides them
A separate function is needed to hide them back
Depending on implementation, it might significantly increase duration
VBA related:
It changes the Code Name of the sheet; other VBA referring to Sheet1 will be broken (if any)
It deletes all VBA code associated with the initial sheet (if any)
.
A few notes about using large files like this:
The binary format (.xlsb) reduce file size dramatically (from 137 Mb to 43 Mb)
Unmanaged Conditional Formatting rules can cause exponential performance issues
The same for Comments, and Data validation
Reading file or data from network is much slower than working with a locall file
A significant gain in speed can be achieved if the source data do not contain formulas, or if the scenario would allow (or want) the formulas to be converted into hard values during the conditional row deletions.
With the above as a caveat, my solution uses the AdvancedFilter of the range object. It's about twice as fast as DeleteRowsWithValuesNewSheet().
Public Sub ExcelHero()
Dim t#, crit As Range, data As Range, ws As Worksheet
Dim r&, fc As Range, lc As Range, fr1 As Range, fr2 As Range
FastWB True
t = Timer
Set fc = ActiveSheet.UsedRange.Item(1)
Set lc = GetMaxCell
Set data = ActiveSheet.Range(fc, lc)
Set ws = Sheets.Add
With data
Set fr1 = data.Worksheet.Range(fc, fc.Offset(, lc.Column))
Set fr2 = ws.Range(ws.Cells(fc.Row, fc.Column), ws.Cells(fc.Row, lc.Column))
With fr2
fr1.Copy
.PasteSpecial xlPasteColumnWidths: .PasteSpecial xlPasteAll
.Item(1).Select
End With
Set crit = .Resize(2, 1).Offset(, lc.Column + 1)
crit = [{"Column 1";"<>Test String"}]
.AdvancedFilter xlFilterCopy, crit, fr2
.Worksheet.Delete
End With
FastWB False
r = ws.UsedRange.Rows.Count
Debug.Print "Rows: " & r & ", Duration: " & Timer - t & " seconds"
End Sub
On my elderly Dell Inspiron 1564 (Win 7 Office 2007) this:
Sub QuickAndEasy()
Dim rng As Range
Set rng = Range("AA2:AA1000001")
Range("AB1") = Now
Application.ScreenUpdating = False
With rng
.Formula = "=If(A2=""Test String"",0/0,A2)"
.Cells.SpecialCells(xlCellTypeFormulas, xlErrors).EntireRow.Delete
.Clear
End With
Application.ScreenUpdating = True
Range("AC1") = Now
End Sub
took about 10 seconds to run. I am assuming that column AA is available.
EDIT#1:
Please note that this code does not set Calculation to Manual. Performance will improve if the Calculation mode is set to Manual after the "helper" column is allowed to calculate.
I know I'm incredibly late with my answer here however future visitors may find it very useful.
Please Note: My approach requires an index column for the rows to end up in the original order, however if you do not mind the rows being in a different order then an index column isn't needed and the additional line of code can be removed.
My approach: My approach was to simply select all the rows in the selected range (column), sort them in ascending order using Range.Sort and then collecting the first and last index of "Test String" within the selected range (column). I then create a range from the first and last indices and use Range.EntrieRow.Delete to remove all the rows which contain "Test String".
Pros:
- It is blazing fast.
- It doesn't remove formatting, formulas, charts, pictures or anything like the method which copies to a new sheet.
Cons:
- A decent size of code to implement however it is all straight-forward.
Test Range Generation Sub:
Sub DevelopTest()
Dim index As Long
FastWB True
ActiveSheet.UsedRange.Clear
For index = 1 To 1000000 '1 million test
ActiveSheet.Cells(index, 1).Value = index
If (index Mod 10) = 0 Then
ActiveSheet.Cells(index, 2).Value = "Test String"
Else
ActiveSheet.Cells(index, 2).Value = "Blah Blah Blah"
End If
Next index
Application.StatusBar = ""
FastWB False
End Sub
Filter And Delete Rows Sub:
Sub DeleteRowFast()
Dim curWorksheet As Worksheet 'Current worksheet vairable
Dim rangeSelection As Range 'Selected range
Dim startBadVals As Long 'Start of the unwanted values
Dim endBadVals As Long 'End of the unwanted values
Dim strtTime As Double 'Timer variable
Dim lastRow As Long 'Last Row variable
Dim lastColumn As Long 'Last column variable
Dim indexCell As Range 'Index range start
Dim sortRange As Range 'The range which the sort is applied to
Dim currRow As Range 'Current Row index for the for loop
Dim cell As Range 'Current cell for use in the for loop
On Error GoTo Err
Set rangeSelection = Application.InputBox("Select the (N=) range to be checked", "Get Range", Type:=8) 'Get the desired range from the user
Err.Clear
M1 = MsgBox("This is recommended for large files (50,000 or more entries)", vbYesNo, "Enable Fast Workbook?") 'Prompt the user with an option to enable Fast Workbook, roughly 150% performace gains... Recommended for incredibly large files
Select Case M1
Case vbYes
FastWB True 'Enable fast workbook
Case vbNo
FastWB False 'Disable fast workbook
End Select
strtTime = Timer 'Begin the timer
Set curWorksheet = ActiveSheet
lastRow = CLng(rangeSelection.SpecialCells(xlCellTypeLastCell).Row)
lastColumn = curWorksheet.Cells(1, 16384).End(xlToLeft).Column
Set indexCell = curWorksheet.Cells(1, 1)
On Error Resume Next
If rangeSelection.Rows.Count > 1 Then 'Check if there is anything to do
lastVisRow = rangeSelection.Rows.Count
Set sortRange = curWorksheet.Range(indexCell, curWorksheet.Cells(curWorksheet.Rows(lastRow).Row, 16384).End(xlToLeft)) 'Set the sort range
sortRange.Sort Key1:=rangeSelection.Cells(1, 1), Order1:=xlAscending, Header:=xlNo 'Sort by values, lowest to highest
startBadVals = rangeSelection.Find(What:="Test String", LookAt:=xlWhole, MatchCase:=False).Row
endBadVals = rangeSelection.Find(What:="Test String", LookAt:=xlWhole, SearchDirection:=xlPrevious, MatchCase:=False).Row
curWorksheet.Range(curWorksheet.Rows(startBadVals), curWorksheet.Rows(endBadVals)).EntireRow.Delete 'Delete uneeded rows, deleteing in continuous range blocks is quick than seperated or individual deletions.
sortRange.Sort Key1:=indexCell, Order1:=xlAscending, Header:=xlNo 'Sort by index instead of values, lowest to highest
End If
Application.StatusBar = "" 'Reset the status bar
FastWB False 'Disable fast workbook
MsgBox CStr(Round(Timer - strtTime, 2)) & "s" 'Display duration of task
Err:
Exit Sub
End Sub
THIS CODE USES FastWB, FastWS AND EnableWS BY Paul Bica!
Times at 100K entries (10k to be removed, FastWB True):
1. 0.2 seconds.
2. 0.2 seconds.
3. 0.21 seconds.
Avg. 0.2 seconds.
Times at 1 million entries (100k to be removed, FastWB True):
1. 2.3 seconds.
2. 2.32 seconds.
3. 2.3 seconds.
Avg. 2.31 seconds.
Running on: Windows 10, iMac i3 11,2 (From 2010)
EDIT
This code was originally designed with the purpose of filtering out numeric values outside of a numeric range and has been adapted to filter out "Test String" so some of the code may be redundant.
Your use of arrays in calculating the used range and row count may effect the performance. Here's another approach which in testing proves efficient across 1m+ rows of data - between 25-30 seconds. It doesn't use filters so will delete rows even if hidden. Deleting a whole row won't effect formatting or column widths of the other remaining rows.
First, check if the ActiveSheet has "Test String". Since you're only interested in Column 1 I used this:
TCount = Application.WorksheetFunction.CountIf(sht.Columns(1), "Test String")
If TCount > 0 Then
Instead of using your GetMaxCell() function I simply used Cells.SpecialCells(xlCellTypeLastCell).Row to get the last row:
EndRow = sht.Cells.SpecialCells(xlCellTypeLastCell).Row
Then loop through the rows of data:
While r <= EndRow
To test if the cell in Column 1 is equal to "Test String":
If sht.Cells(r, 1).Text) = "Test String" Then
To delete the row:
Rows(r).Delete Shift:=xlUp
Putting it all together full code below. I've set ActiveSheet to a variable Sht and added turned of ScreenUpdating to improve efficiency. Since it's a lot of data I make sure to clear variables at the end.
Sub RowDeleter()
Dim sht As Worksheet
Dim r As Long
Dim EndRow As Long
Dim TCount As Long
Dim s As Date
Dim e As Date
Application.ScreenUpdating = True
r = 2 'Initialise row number
s = Now 'Start Time
Set sht = ActiveSheet
EndRow = sht.Cells.SpecialCells(xlCellTypeLastCell).Row
'Check if "Test String" is found in Column 1
TCount = Application.WorksheetFunction.CountIf(sht.Columns(1), "Test String")
If TCount > 0 Then
'loop through to the End row
While r <= EndRow
If InStr(sht.Cells(r, 1).Text, "Test String") > 0 Then
sht.Rows(r).Delete Shift:=xlUp
r = r - 1
End If
r = r + 1
Wend
End If
e = Now 'End Time
D = (Hour(e) * 360 + Minute(e) * 60 + Second(e)) - (Hour(s) * 360 + Minute(s) * 60 + Second(s))
Application.ScreenUpdating = True
DurationTime = TimeSerial(0, 0, D)
MsgBox Format(DurationTime, "hh:mm:ss")
End Sub
I have 250.000 rows and I wanted to erase all rows that have a 0 in col AR. This takes too much time using a filter and deleting only visible cells, so I wrote a code. But still takes 1 minute for 1000 lines. So I will have to take 250 minutes!!! Besides after the first 6 minutes (6k lines) the number showed in AS3 (see code below) freezes, so I don't know if it's still running.
Is there a way to do this more efficiently (using less time)?
My code is:
Sub delrow()
Application.Calculation=xlCalculationManual
With Sheets("bners")
LR3 = Range("A" & Rows.Count).End(xlUp).Row
For i3 = 3 To LR3
range("AS2")=i3
a = Sheets("bners").Range("AR" & i3).Value
If a = 0 Then
Rows(i3).Delete
Else
End If
Next i3
End With
Application.calculate
End Sub
thanks!
Yes, definitely Step -1. But does that alone make it fast?
This batches the deletes 10 at a time (if needed now).
Option Explicit
Dim ws as Range
Sub delrow1()
Dim LR3&, i3&, a&
Set ws = Sheets("bners")
LR3 = ws.Range("A" & Rows.Count).End(xlUp).Row
For i3 = LR3 To 3 Step -1
a = ws.Cells(i3, "AR").Value
If a = 0 Then
Call delrow2(i3)
End If
Next i3
Call delrow2(0) ' flush
End Sub
Sub delrow2(delRow&) ' deletes 10 rows at a time
Static a1&(10), na1&
Dim i1&, zRange As Range
If delRow = 0 Then ' finish;end;flush
For i1 = 1 To na1
ws.Rows(a1(i1)).Delete
Next i1
na1 = 0
Else ' store row in array a1
na1 = na1 + 1
a1(na1) = delRow
If na1 = 10 Then ' del 10 rows
Set zRange = Union( _
Rows(a1(1)), Rows(a1(2)), Rows(a1(3)), Rows(a1(4)), Rows(a1(5)), _
Rows(a1(6)), Rows(a1(7)), Rows(a1(8)), Rows(a1(9)), Rows(a1(10)))
ws.Range(zRange).Rows.Delete
na1 = 0
End If
End If
I liked this method I found a couple weeks ago but didn't remember until last night http://goo.gl/NYtY9R that could easily be adapted for yours
Sub RowKiller()
Dim F As Range, rKill As Range
Set F = Range("A2:A250000")
Set rKill = Nothing
For Each r In F
v = r.Text
If InStr(1, v, "0") = 1 Then
If rKill Is Nothing Then
Set rKill = r
Else
Set rKill = Union(r, rKill)
End If
End If
Next r
If Not rKill Is Nothing Then
rKill.EntireRow.Delete
End If
End Sub
To me very efficient in that it builds up into the Union and then deletes all at once instead of deleting one at a time.
in the example , you with sheets() is totally useless, as you forgot every dot "." before the words cells or range or rows.
I'll try an other approach, by using two VBA arrays (not tested, and might memory overflow).
first array is original data before the macro.
second array is data after the macro
I won't delete rows, i just write my second array from the good lines of the 1rst array,
and then paste it over the sheet
Sub RowKill()
'Declaring Variables :
Dim MaxRows as long 'number of lines in the First Array
Dim NewRows as Long 'number of lines in the Second Array
Dim q as long 'simple loop counter
Dim i as long 'simple loop counter , for the purpose of copying line
Dim Rg As Range 'Range of the original Data (number of lines = MaxRows-2, because the Original example code starts at 3, not 1)
Dim Sh as Worksheet
Dim Array1() as variant 'First VBA Array
Dim Array2() as variant 'Second VBA Array
with Application
.enableevents=false
.screnupdating=false
.Calculation=xlCalculationManual
end with
set Sh=thisworkbook.Sheets("bners")
with Sh
MaxRows = .Range( .Rows.Count , 44).End(xlUp).Row ' note the .rows, and i read on cloumn 44 and not 1
Set Rg = .Range( .cells(3,44) , .cells ( MaxRows,44) ) '44 is the column of .range("AR")
'The Range Rg is important , later we delete the whole thing ^^
Redim Array1 ( 1 to MaxRows, 1 to 44) 'Only if "AR" is your last column
Array1 = Rg.value2 'if you work with dates or time format in your cells, please replace by : Array1 = Rg.value
for q= 3 to MaxRows
if Array1 (q , 44) <> 0 Then 'wasn't sure, because empty cells will trigger too, in wich case: <>"" would be better, or: If not IsEmpty( Array1 (q,44)) .....
call CopyRowToSecondArray ( q , NewRows , Array2)
End If
next q
End With 'Sh
'Rg.delete 'old version
With Sh
.range ( .cells(1,1) , .cells (44 , NewRows).Value2 = Array2 ' again use .value, if you have date or time formating inside the data cells
if NewRows<MaxRows then .range ( .cells(1,NewRows+1) , .cells (44 , MaxRows).Value2 = ""
End with
with Application
.enableevents= True
.screnupdating= True
.Calculation=xlCalculationAutomatic
end with
Set Rg = Nothing
Ser Sh = Nothing
Erase Array1, Array 2
End Sub
Sub CopyRowToSecondArray ( byval q as long , byref NewRows as long , byref Array2 as variant)
Dim i as long
NewRows=NewRows+1
Redim Preserve Array2 (1 to NewRows, 1 to 44)
for i = 1 to 44 'this entire for i loop, might be faster with unknown vba array function (i'm new), please share with me
Array2 ( NewRows , i) = Array1 ( q , i )
next i
end sub
Maybe there is a better way to simply copy a whole line from one array to an other, i don't know...
The code is untested, and , I assumed 44 is the last column (change only in loops and Rg if needed), so copy your work before testing my code.
Hope this helps, and is faster.
Ok so I've searched and searched and can't quite find what I'm looking for.
I have a workbook and what I'm basically trying to do is take the entries from certain ranges (Sheet1 - E4:E12, E14:E20, I4:I7, I9:I12, I14:I17, & I19:I21) and put them in a separate list on Sheet2. I then want the new list on Sheet2 to be sorted by how many times an entry appeared on Sheet1 as well as display the amount.
example http://demonik.doomdns.com/images/excel.png
Obviously as can be seen by the ranges I listed above, this sample is much smaller lol, was just having trouble trying to figure out how to describe everything and figured an image would help.
Basically I am trying to use VBA (the update would be initialized by hitting a button) to copy data from Sheet1 and put all the ranges into one list in Sheet2 that is sorted by how many times it appeared on Sheet1, and then alphabetically.
If a better discription is needed just comment and let me know, I've always been horrible at trying to describe stuff like this lol.
Thanks in advance!
Another detail: I cant have it search for specific things as the data in the ranges on Sheet1 may change. Everything must be dynamic.
I started out with this data
and used the following code to read it into an array, sort the array, and count the duplicate values, then output the result to sheet2
Sub Example()
Dim vCell As Range
Dim vRng() As Variant
Dim i As Integer
ReDim vRng(0 To 0) As Variant
Sheets("Sheet2").Cells.Delete
Sheets("Sheet1").Select
For Each vCell In ActiveSheet.UsedRange
If vCell.Value <> "" Then
ReDim Preserve vRng(0 To i) As Variant
vRng(i) = vCell.Value
i = i + 1
End If
Next
vRng = CountDuplicates(vRng)
Sheets("Sheet2").Select
Range(Cells(1, 1), Cells(UBound(vRng), UBound(vRng, 2))) = vRng
Rows(1).Insert
Range("A1:B1") = Array("Entry", "Times Entered")
ActiveSheet.UsedRange.Sort Range("B1"), xlDescending
End Sub
Function CountDuplicates(List() As Variant) As Variant()
Dim CurVal As String
Dim NxtVal As String
Dim DupCnt As Integer
Dim Result() As Variant
Dim i As Integer
Dim x As Integer
ReDim Result(1 To 2, 0 To 0) As Variant
List = SortAZ(List)
For i = 0 To UBound(List)
CurVal = List(i)
If i = UBound(List) Then
NxtVal = ""
Else
NxtVal = List(i + 1)
End If
If CurVal = NxtVal Then
DupCnt = DupCnt + 1
Else
DupCnt = DupCnt + 1
ReDim Preserve Result(1 To 2, 0 To x) As Variant
Result(1, x) = CurVal
Result(2, x) = DupCnt
x = x + 1
DupCnt = 0
End If
Next
Result = WorksheetFunction.Transpose(Result)
CountDuplicates = Result
End Function
Function SortAZ(MyArray() As Variant) As Variant()
Dim First As Integer
Dim Last As Integer
Dim i As Integer
Dim x As Integer
Dim Temp As String
First = LBound(MyArray)
Last = UBound(MyArray)
For i = First To Last - 1
For x = i + 1 To Last
If MyArray(i) > MyArray(x) Then
Temp = MyArray(x)
MyArray(x) = MyArray(i)
MyArray(i) = Temp
End If
Next
Next
SortAZ = MyArray
End Function
End Result:
Here is a possible solution that I have started for you. What you are asking to be done gets rather complicated. Here is what I have so far:
Option Explicit
Sub test()
Dim items() As String
Dim itemCount() As String
Dim currCell As Range
Dim currString As String
Dim inArr As Boolean
Dim arrLength As Integer
Dim iterator As Integer
Dim x As Integer
Dim fullRange As Range
Set fullRange = Range("E1:E15")
iterator = 0
For Each cell In fullRange 'cycle through the range that has the values
inArr = False
For Each currString In items 'cycle through all values in array, if
'values is found in array, then inArr is set to true
If currCell.Value = currString Then 'if the value in the cell we
'are currently checking is in the array, then set inArr to true
inArr = True
End If
Next
If inArr = False Then 'if we did not find the value in the array
arrLength = arrLength + 1
ReDim Preserve items(arrLength) 'resize the array to fit the new values
items(iterator) = currCell.Value 'add the value to the array
iterator = iterator + 1
End If
Next
'This where it gets tricky. Now that you have all unique values in the array,
'you will need to count how many times each value is in the range.
'You can either make another array to hold those values or you can
'put those counts on the sheet somewhere to store them and access them later.
'This is tough stuff! It is not easy what you need to be done.
For x = 1 To UBound(items)
Next
End Sub
All that this does so far is get unique values into the array so that you can count how many times each one is in the range.
I have an Excel table in which multiple rows are given different coloured backgrounds by VBA macros. These background colours should be locked to the rows. My problem is that when the table is sorted by one column or another the background colours move as the data is reordered.
Can I format in another way to stop this happening so that the cells remain locked?
The code I use to format is:
For Each Row In rng.Rows
If Condition Then
Row.Select
cIndex = ColourIndex(colour)
With Selection.Interior
.ColorIndex = cIndex
End With
End If
Next
An example of my table is like this:
EDIT: Extra Code
Sub Quota(ByVal Type As String)
Dim records As Long
Dim sht1 As Worksheet
Set sht1 = Worksheets("Sheet1")
Dim sht2 As Worksheet
Set sht2 = Worksheets("Sheet2")
records = sht1.Range("A1048576").End(xlUp).Row - 5
Dim rng As Range
Dim rngRowCount As Long
Dim rLastCell As Range
Dim i As Long
sht2.Activate
'Last used cell
Set rLastCell = sht2.Cells.Find(What:="*", After:=Cells(1, 1), LookIn:=xlFormulas, LookAt:= _
xlPart, SearchOrder:=xlByRows, SearchDirection:=xlPrevious, MatchCase:=False)
'All used columns except 1st
Set rng = sht2.Range(Cells(2, 1), rLastCell)
rng.Select
rngRowCount = rng.Rows.CountLarge
For i = 1 To rngRowCount
Dim valueAs String
Dim colour As String
Dim VarX As Long
Dim maxValue As Long
value= sht2.Cells(i + 1, 1).Value
colour = sht2.Cells(i + 1, 2).Value
If Type = "A" Then
VarX = sht2.Cells(i + 1, 3).Value
ElseIf Type = "B" Then
VarX = sht2.Cells(i + 1, 5).Value
End If
maxValue = (records / 100) * VarX
ColourRows value, colour, maxValue
Next i
End Sub
Sub ColourRows(value As String, colour As String, maxValue As Long)
Dim sht1 As Worksheet
Set sht1 = Worksheets("Sheet1")
sht1.Activate
Dim rng As Range
Dim firstSixRowsOnwards As Range
Dim lastColumn As Long
Dim usedColumns As Range
Dim usedColumnsString As String
Dim highlightedColumns As Range
Dim rngDataRowCount As Long
Dim performancevalueAs String
Dim cIndex As Integer
Dim count As Long
count = 0
Dim rLastCell As Range
'End row
rngDataRowCount = sht1.Range("A1048576").End(xlUp).Row
'First 6 rows
Set firstSixRowsOnwards = sht1.Range("A6:XFD1048576")
'Last column
lastColumn = Cells.Find("*", SearchOrder:=xlByColumns, SearchDirection:=xlPrevious).Column
'Used Range
Set rng = sht1.Range(Cells(1, 1), Cells(rngDataRowCount, lastColumn))
'Used Columns
Set usedColumns = sht1.Range(Cells(1, 1), Cells(1048576, lastColumn))
Set rng = Intersect(rng, firstSixRowsOnwards, usedColumns)
For Each Row In rng.Rows
compareValue= Cells(Row.Row, 5)).Value
If (InStr(1, value, compareValue, 1) Then
Dim rowNumber As Long
Row.Select
If count < maxValue Then
cIndex = ColourIndex(colour)
With Selection.Interior
.ColorIndex = cIndex
End With
count = count + 1
Else
cIndex = 3 'red
With Selection.Interior
.ColorIndex = cIndex
End With
End If
End If
Next
End Sub
I believe that if you select your data by column and then sort (instead of a row limited range) then formatting will follow.
EDIT:
If you want to lock the formatting then use conditional formatting that is based on row number, e.g. ROW() = x or ROW() = range of values...
Tested: Use conditional formatting by formula set rule such as =ROW()=3 make sure excel does not double quote it for you, apply this to the entire data range. Row 3 will then always be formatted as you set here.
Setting in vba
Sub test()
Range("A3").Select
With Range("A3")
.FormatConditions.Add Type:=xlExpression, Formula1:="=ROW()=3"
.FormatConditions(1).Interior.ColorIndex = 46
End With
End Sub
Can be done with CF, for example (top rule is >11):
Edit - I inadvertently left out one rule
the second down below uses =ROW($A1)=11:
Here we go:
In this case, what I would do it one of the two things:
Conditional formatting. Needs lot of logics and manual steps so let us leave it.
A macro: Whenever you sort the data, please fire the following function
Sub Option1()
Dim row As Range
Dim rowNum As Integer
Dim tRange As Range
'set range here: in your example, it is A2:D11
Set tRange = ActiveSheet.Range("A2:D11")
'clear colors
tRange.ClearFormats ' clears the previous format
rowNum = 1
For Each row In tRange.Rows
Select Case rowNum
Case 1, 2
row.Interior.Color = RGB(255, 255, 0) ' 1 and 2nd will be yellow
Case 3, 4
row.Interior.Color = 255 ' 3rd and 4th row will be red
Case 5, 6
row.Interior.Color = RGB(0, 0, 255) ' 5 and 6th row will be blue
Case Else
row.Interior.Color = RGB(0, 255, 0) '' all the bottom row would be a Green row
End Select
rowNum = rowNum + 1
Next row
End Sub
Does it help?
How can I count the number of different values (numbers and strings mixed) in a chosen (large) range in VBA?
I think about this in this way:
1. Read in data into one dimensional array.
2. Sort array (quick or merge sort) need to test which
3. Simply count number of different values if sorted array : if(a[i]<>a[i+1]) then counter=counter+1.
Is it the most efficient way to solve this problem?
Edit: I want to do it in Excel.
Here is a VBA Solution
You don't need an Array to get this done. You can also use a collection. Example
Sub Samples()
Dim scol As New Collection
With Sheets("Sheet1")
For i = 1 To 100 '<~~ Assuming the range is from A1 to A100
On Error Resume Next
scol.Add .Range("A" & i).Value, Chr(34) & _
.Range("A" & i).Value & Chr(34)
On Error GoTo 0
Next i
End With
Debug.Print scol.Count
'For Each itm In scol
' Debug.Print itm
'Next
End Sub
FOLLOWUP
Sub Samples()
Dim scol As New Collection
Dim MyAr As Variant
With Sheets("Sheet1")
'~~> Select your range in a column here
MyAr = .Range("A1:A10").Value
For i = 1 To UBound(MyAr)
On Error Resume Next
scol.Add MyAr(i, 1), Chr(34) & _
MyAr(i, 1) & Chr(34)
On Error GoTo 0
Next i
End With
Debug.Print scol.Count
'For Each itm In scol
' Debug.Print itm
'Next
End Sub
Instead of steps 2 and 3, perhaps you could use a Scripting.Dictionary and add each value to the dictionary. Any duplicate entries would cause a runtime error which you could either trap or ignore (resume next). Finally, you could then just return the dictionary's count which would give you the count of unique entries.
Here's a scrap of code I hurriedly threw together:
Function UniqueEntryCount(SourceRange As Range) As Long
Dim MyDataset As Variant
Dim dic As Scripting.Dictionary
Set dic = New Scripting.Dictionary
MyDataset = SourceRange
On Error Resume Next
Dim i As Long
For i = 1 To UBound(MyDataset, 1)
dic.Add MyDataset(i, 1), ""
Next i
On Error GoTo 0
UniqueEntryCount = dic.Count
Set dic = Nothing
End Function
I know that resume next can be considered a 'code smell', but the alternative could be to use the exists function of the dictionary to test whether the specified key already exists and then add the value if did not. I just have a feeling that when I did a similar thing in the past that it was faster to just ignore any errors raised for duplicate keys rather than using exists YMMY. For completeness, here's the other method using exists:
Function UniqueEntryCount(SourceRange As Range) As Long
Dim MyDataset As Variant
Dim dic As Scripting.Dictionary
Set dic = New Scripting.Dictionary
MyDataset = SourceRange
Dim i As Long
For i = 1 To UBound(MyDataset, 1)
if not dic.Exists(MyDataset(i,1)) then dic.Add MyDataset(i, 1), ""
Next i
UniqueEntryCount = dic.Count
Set dic = Nothing
End Function
Whilst the above code is simpler than your proposed method, it would be worth to test the performance of it against your solution.
Building on the idea presented by i_saw_drones, I strongly recommend the Scripting.Dictionary. However, this can be done without On Error Resume Next as shown below. Also, his example requires linking the Microsoft Scripting Runtime library. My example will demonstrate how to do this without needing to do any linking.
Also, since you're doing this in Excel, then you don't need to create the array in step 1 at all. The function below will accept a range of cells, which will be iterated through completely.
(i.e. UniqueCount = UniqueEntryCount(ActiveSheet.Cells) or UniqueCount = UniqueEntryCount(MySheet.Range("A1:D100"))
Function UniqueEntryCount(SourceRange As Range) As Long
Dim MyDataset As Variant
Dim MyRow As Variant
Dim MyCell As Variant
Dim dic As Object
Dim l1 As Long, l2 As Long
Set dic = CreateObject("Scripting.Dictionary")
MyDataset = SourceRange
For l1 = 1 To UBound(MyDataset)
' There is no function to get the UBound of the 2nd dimension
' of an array (that I'm aware of), so use this division to
' get this value. This does not work for >=3 dimensions!
For l2 = 1 To SourceRange.Count / UBound(MyDataset)
If Not dic.Exists(MyDataset(l1, l2)) Then
dic.Add MyDataset(l1, l2), MyDataset(l1, l2)
End If
Next l2
Next l1
UniqueEntryCount = dic.Count
Set dic = Nothing
End Function
It might also be important to note that the above will count a null string "" as a distinct value. If you do not want this to be the case, simply change the code to this:
For l1 = 1 To UBound(MyDataset)
For l2 = 1 To SourceRange.Count / UBound(MyDataset)
If Not dic.Exists(MyDataset(l1, l2)) And MyDataset(l1, l2) <> "" Then
dic.Add MyDataset(l1, l2), MyDataset(l1, l2)
End If
Next l2
Next l1
Sorry this is written in C#. This is how I would do it.
// first copy the array so you don't lose any data
List<value> copiedList = new List<value>(yourArray.ToList());
//for through your list so you test every value
for (int a = 0; a < copiedList.Count; a++)
{
// copy instances to a new list so you can count the values and do something with them
List<value> subList = new List<value>(copiedList.FindAll(v => v == copiedList[i]);
// do not do anything if there is only 1 value found
if(subList.Count > 1)
// You would want to leave 1 'duplicate' in
for (int i = 0; i < subList.Count - 1; i++)
// remove every instance from the array but one
copiedList.Remove(subList[i]);
}
int count = copiedList.Count; //this is your actual count
Have not tested it, please try.
You should wrap this inside a method so there is no messing around with the garbage. Otherwise you would lose the copy of the array only later. (return count)
EDIT: You need a list for this to work, use Array.ToList();