PowerShell Script to Get a Directory Total Size - windows

I need to get the size of a directory, recursively. I have to do this every month so I want to make a PowerShell script to do it.
How can I do it?

Try the following
function Get-DirectorySize() {
param ([string]$root = $(resolve-path .))
gci -re $root |
?{ -not $_.PSIsContainer } |
measure-object -sum -property Length
}
This actually produces a bit of a summary object which will include the Count of items. You can just grab the Sum property though and that will be the sum of the lengths
$sum = (Get-DirectorySize "Some\File\Path").Sum
EDIT Why does this work?
Let's break it down by components of the pipeline. The gci -re $root command will get all items from the starting $root directory recursively and then push them into the pipeline. So every single file and directory under the $root will pass through the second expression ?{ -not $_.PSIsContainer }. Each file / directory when passed to this expression can be accessed through the variable $_. The preceding ? indicates this is a filter expression meaning keep only values in the pipeline which meet this condition. The PSIsContainer method will return true for directories. So in effect the filter expression is only keeping files values. The final cmdlet measure-object will sum the value of the property Length on all values remaining in the pipeline. So it's essentially calling Fileinfo.Length for all files under the current directory (recursively) and summing the values.

If you are interested in including the size of hidden and system files then you should use the -force parameter with Get-ChildItem.

Here's quick way to get size of specific file extensions:
(gci d:\folder1 -r -force -include *.txt,*.csv | measure -sum -property Length).Sum

Thanks to those who posted here. I adopted the knowledge to create this:
# Loops through each directory recursively in the current directory and lists its size.
# Children nodes of parents are tabbed
function getSizeOfFolders($Parent, $TabIndex) {
$Folders = (Get-ChildItem $Parent); # Get the nodes in the current directory
ForEach($Folder in $Folders) # For each of the nodes found above
{
# If the node is a directory
if ($folder.getType().name -eq "DirectoryInfo")
{
# Gets the size of the folder
$FolderSize = Get-ChildItem "$Parent\$Folder" -Recurse | Measure-Object -property length -sum -ErrorAction SilentlyContinue;
# The amount of tabbing at the start of a string
$Tab = " " * $TabIndex;
# String to write to stdout
$Tab + " " + $Folder.Name + " " + ("{0:N2}" -f ($FolderSize.Sum / 1mb));
# Check that this node doesn't have children (Call this function recursively)
getSizeOfFolders $Folder.FullName ($TabIndex + 1);
}
}
}
# First call of the function (starts in the current directory)
getSizeOfFolders "." 0

To refine this answer by #JaredPar to be expanded and more performant:
function Get-DirectorySize() {
param ([string]$root = $(Resolve-Path .))
Get-ChildItem $root -Recurse -File |
Measure-Object -Property Length -Sum |
Select-Object -ExpandProperty Sum
}
Or, to make it more convenient for use explore type data:
Update-TypeData -TypeName System.IO.DirectoryInfo -MemberType ScriptProperty -MemberName Size -Value {
Get-ChildItem $this -Recurse -File |
Measure-Object -Property Length -Sum |
Select-Object -ExpandProperty Sum
}
Then use by Get-ChildItem | Select-Object Name,Length,Size

Related

How to select the file with the maximum number of the specified file

I want to keep only the file with the largest version of the specified zip file in the folder using powershell. I wrote a shell script but it returns all the files. How can I modify the script to select only the file with the largest version?
$files = Get-ChildItem -Filter "*.zip"
$max = $files |Measure-Object -Maximum| ForEach-Object {[int]($_.Split("_")[-1].Split(".")[0])}
$largestFiles = $files | Where-Object {[int]($_.Split("_")[-1].Split(".")[0]) -eq $max}
Write-Output $largestFiles
Expectation:
A1_Fantasic_World_20.zip
A1_Fantasic_World_21.zip
B1_Mythical_Realms_11.zip
B1_Mythical_Realms_12.zip
C1_Eternal_Frame_Corporation_2.zip
C1_Eternal_Frame_Corporation_3.zip
↓
A1_Fantasic_World_21.zip
B1_Mythical_Realms_12.zip
C1_Eternal_Frame_Corporation_3.zip
A1_Fantasic_World's biggest number is 21.B1_Mythical_Realms's is 12.C1_Eternal_Frame_Corporation's is 3. So I want to choose the biggest version of zip.
First you add the calculated properties to your file system objects you use for filtering. Then with a combination of Group-Object, Sort-Object and Select.Object you can filter the desired files.
$FileList =
Get-ChildItem -Filter *.zip |
Select-Object -Property *,
#{
Name = 'Title'
Expression = {($_.BaseName -split '_')[0..$(($_.BaseName -split '_').count - 2)] -join '_' }
},
#{
Name = 'Counter'
Expression = {[INT]($_.BaseName -split '_')[-1]}
}
$LastOnesList =
$FileList |
Group-Object -Property Title |
ForEach-Object {
$_.Group | Sort-Object -Property Counter | Select-Object -Last 1
}
$LastOnesList |
Select-Object -Property Name

Improve the efficiency of my PowerShell script

The below code searches 400+ numbers from a list.txt file to see if it exists within any files within the folder path specified.
The script is very slow and has yet to complete as it did not complete after 25 minutes of running. The folder we are searching is 507 MB (532,369,408 bytes) and it contains 1,119 Files & 480 Folders. Any help to improve the speed of the search and the efficiency is greatly appreciated.
$searchWords = (gc 'C:\temp\list.txt') -split ','
$results = #()
Foreach ($sw in $searchWords)
{
$files = gci -path 'C:\Users\david.craven\Dropbox\Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*' -filter "*$sw*" -recurse
foreach ($file in $files)
{
$object = New-Object System.Object
$object | Add-Member -Type NoteProperty –Name SearchWord –Value $sw
$object | Add-Member -Type NoteProperty –Name FoundFile –Value $file.FullName
$results += $object
}
}
$results | Export-Csv C:\temp\output.csv -NoTypeInformation
The following should speed up your task substantially:
If the intent is truly to look for the search words in the file names:
$searchWords = (Get-Content 'C:\temp\list.txt') -split ','
$path = 'C:\Users\david.craven\Dropbox\Facebook Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*'
Get-ChildItem -File -Path $path -Recurse -PipelineVariable file |
Select-Object -ExpandProperty Name |
Select-String -SimpleMatch -Pattern $searchWords |
Select-Object #{n='SearchWord'; e='Pattern'},
#{n='FoundFile'; e={$file.FullName}} |
Export-Csv C:\temp\output.csv -NoTypeInformation
If the intent is to look for the search words in the files' contents:
$searchWords = (Get-Content 'C:\temp\list.txt') -split ','
$path = 'C:\Users\david.craven\Dropbox\Facebook Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*'
Get-ChildItem -File -Path $path -Recurse |
Select-String -List -SimpleMatch -Pattern $searchWords |
Select-Object #{n='SearchWord'; e='Pattern'},
#{n='FoundFile'; e='Path'} |
Export-Csv C:\temp\output.csv -NoTypeInformation
The keys to performance improvement:
Perform the search with a single command, by passing all search words to Select-String. Note: -List limits matching to 1 match (by any of the given patterns).
Instead of constructing custom objects in a script block with New-Object and Add-Member, let Select-Object construct the objects for you directly in the pipeline, using calculated properties.
Instead of building an intermediate array iteratively with += - which behind the scenes recreates the array every time - use a single pipeline to pipe the result objects directly to Export-Csv.
So there are definitely some basic things in the PowerShell code you posted that can be improved, but it may still not be super fast. Based on the sample you gave us I'll assume you're looking to match the file names against a list of words. You're looping through the list of words (400 iterations) and in each loop you're looping through all 1,119 files. That's a total of 447,600 iterations!
Assuming you can't reduce the number of iterations in the loop, let's start by making each iteration faster. The Add-Member cmdlet is going to be really slow, so switch that approach up by casting a hashtable to the [PSCustomObject] type accelerator:
[PSCustomObject]#{
SearchWord = $Word
File = $File.FullName
}
Also, there is no reason to pre-create an array object and then add each file to it. You can simply capture the ouptut of the foreach loop in a variable:
$Results = Foreach ($Word in $Words)
{
...
So a faster loop might look like this:
$Words = Get-Content -Path $WordList
$Files = Get-ChildItem -Path $Path -Recurse -File
$Results = Foreach ($Word in $Words)
{
foreach ($File in $Files)
{
if ($File.BaseName -match $Word)
{
[PSCustomObject]#{
SearchWord = $Word
File = $File.FullName
}
}
}
}
A simpler approach might be to use Where-Object on the files array:
$Results = Foreach ($Word in $Words)
{
$Files | Where-Object BaseName -match $Word
}
Try both and test out the performance.
So if speeding up the loop doesn't meet your needs, try removing the loop entirely. You could use regex and join all the words together:
$Words = Get-Content -Path $WordList
$Files = Get-ChildItem -Path $Path -Recurse -File
$WordRegex = $Words -join '|'
$Files | Where basename -match $WordRegex

Select directory from a file

I need my program to give me every folder containing files which are out of the Windows' number of characters limit. It means if a file has more than 260 characters (248 for folders), I need it to write the address of the file's parent. And I need it to write it only once. For now, I'm using this code:
$maxLength = 248
Get-ChildItem $newPath -Recurse |
Where-Object { ($_.FullName.Length -gt $maxLength) } |
Select-Object -ExpandProperty FullName |
Split-Path $_.FullName
But the Split-Path won't work (this is the first time I use it). It tells me the -Path parameter has a null value (I can write -Path but it doesn't change anything).
If you want an example of what I need: imagine folder3 has a 230-character address and file.txt has a 280-character address:
C:\users\folder1\folder2\folder3\file.txt
Would write:
C:\users\folder1\folder2\folder3
I'm using PS2, by the way.
Spoiler: the tool you are building may not be able to report paths over the limit since Get-ChildItem cannot access them. You can try nevertheless, and also find other solutions in the links at the bottom.
Issue in your code: $_ only works in specific contexts, for example a ForEach-Object loop.
But here, at the end of the pipeline, you're only left with a string containing the full path (not the complete file object any more), so directly passing it to Split-Path should work:
$maxLength = 248
Get-ChildItem $newPath -Recurse |
Where-Object { ($_.FullName.Length -gt $maxLength) } |
Select-Object -ExpandProperty FullName |
Split-Path
as "C:\Windows\System32\regedt32.exe" | Split-Path would output C:\Windows\System32
Sidenote: what do (Get-Item C:\Windows\System32\regedt32.exe).DirectoryName and (Get-Item C:\Windows\System32\regedt32.exe).Directory.FullName output on your computer ? These both show the directory on my system.
Adapted code example:
$maxLength = 248
Get-ChildItem $newPath -Recurse |
Where-Object { ($_.FullName.Length -gt $maxLength) } |
ForEach-Object { $_.Directory.FullName } |
Select-Object -Unique
Additional information about MAX_PATH:
How do I find files with a path length greater than 260 characters in Windows?
Why does the 260 character path length limit exist in Windows?
http://www.powershellmagazine.com/2012/07/24/jaap-brassers-favorite-powershell-tips-and-tricks/
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx
https://gallery.technet.microsoft.com/scriptcenter/Get-ChildItemV2-to-list-29291aae
you cannot use get-childitem to list paths greater than the windows character limit.
There are a couple of alternatives for you. Try an external library like 'Alphafs' or you can use robocopy. Boe Prox has a script that utilizes robocopy and it is available on technet but i am not sure if it will work on PSV2. Anyway you can give it a try.
I've had a similar problem and resolved it like this:
$PathTooLong = #()
Get-ChildItem -LiteralPath $Path -Recurse -ErrorVariable +e -ErrorAction SilentlyContinue
$e | where {$_.Exception -like 'System.IO.PathTooLongException*'} | ForEach-Object {
$PathTooLong += $_.TargetObject
$Global:Error.Remove($_)
}
$PathTooLong
On every path that is too long, or that the PowerShell engine can't handle, Get-ChildItem will throw an error. This error is saved in the ErrorVariable called e in the example above.
When all errors are collected in $e you can filter out the ones you need by checking the error Exception for the string System.IO.PathTooLongException.
Hope it helps you out.

Extract timestamp from filename and sort

I'm trying to look through each item in a folder and add each item to an array sorted by the datestamp in the filename.
For example, I have three files:
myfile_20150813_040949.txt
myfile_20150812_030949.txt
myfile_20150812_010949.txt
I'm not sure how to parse out the time from each and add them to an array in ascending order. Any help would be appreciated.
I am assuming that you are looking to sort the files by the parsed timestamp that is pulled from the file name with this example. It may not the be the best RegEx approach, but it works in testing.
#RegEx pattern to parse the timestamps
$Pattern = '.*_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\.txt'
$List = New-Object System.Collections.ArrayList
$Temp = New-Object System.Collections.ArrayList
Get-ChildItem | ForEach {
#Make sure the file matches the pattern
If ($_.Name -match $Pattern) {
Write-Verbose "Add $($_.Name)" -Verbose
$Date = $Matches[2],$Matches[3],$Matches[1] -join '/'
$Time = $Matches[4..6] -join ':'
[void]$Temp.Add(
(New-Object PSObject -Property #{
Date =[datetime]"$($Date) $($Time)"
File = $_
}
))
}
}
#Sort the files by the parsed timestamp and add to the main list
$List.AddRange(#($Temp | Sort Date | Select -Expand File))
#Clear out the temp collection
$Temp.Clear()
#Display the results
$List
What you could be doing for this is using the string method .Split() with the [datetime] method of TryParseExact(). Go though each file and add a property for the "FromFileDate" and then sort on that.
$path = "C:\temp"
Get-ChildItem -Filter "*.txt" -Path $path | ForEach-Object{
$date = ($_.BaseName).Split("_",2)[1]
$result = New-Object DateTime
if([datetime]::TryParseExact($date,"yyyyMMdd_hhmmss",[System.Globalization.CultureInfo]::InvariantCulture,[System.Globalization.DateTimeStyles]::None,[ref]$result)){
# This is a good date
Add-Member -InputObject $_ -MemberType NoteProperty -Name "FromFileDate" -Value $result -PassThru
} Else {
# Could not parse date from filename
Add-Member -InputObject $_ -MemberType NoteProperty -Name "FromFileDate" -Value "Could not Parse" -PassThru
}
} | Select-Object Name,fromfiledate | Sort-Object fromfiledate
We take the basename of the each text file and split it into 2 parts from the first underscore. Using TryParseExact we then attempt to convert the "date" string to the format of "yyyyMMdd_hhmmss". Since we use TryParseExact if we have trouble parsing the date then the code will continue.
Sample Output
Name FromFileDate
---- ------------
myfile_20150812_030949.txt 8/12/2015 3:09:49 AM
myfile_20150813_040949.txt 8/13/2015 4:09:49 AM
files.txt Could not Parse
If you didn't want the erroneous data in the output a simple Where-Object{$_.fromfiledate -is [datetime]} would remove those entries.

Count, Sort and Group-By in Powershell

Are there any cool cmdlets that will help me do the following?
I want something in Powershell that is as simple as doing the same in SQL:
select RootElementName , count(*) from Table
group by RootElementName
order by RootElementName
I'm all XML files in a directory, finding the root element of each XML file.
$DirectoryName = "d:\MyFolder\"
$AllFiles = Get-ChildItem $DirectoryName -Force
foreach ($Filename in $AllFiles)
{
$FQFilename = $DirectoryName + $Filename
[xml]$xmlDoc = Get-Content $FQFilename
$rootElementName = $xmlDoc.SelectSingleNode("/*").Name;
Write-Host $FQFilename $rootElementName
}
Desired Result:
RootName Count
-------- -----
Root1 15
MyRoot 16
SomeRoot 24
I know I could could either create two arrays, or an array of objects, store the root elements in the array, and do the counts all using typical code, was just hoping that this new language might have something built-in that I haven't discovered yet.
Could I pipe the "Write-Host $FQFilename $rootElementName " to something that would behave something to the SQL I referred to above?
You can get groups and counts by using Group-Object like this:
$AllFiles | Group-Object RootElementName | Sort-Object Name | Select-Object Name, Count
In your current example, Write-Host doesn't write an object to the pipeline that we can sort or group. Write-Host only prints text to the screen to show the user something, ex. a script menu.
$DirectoryName = "d:\MyFolder\"
$AllFiles = Get-ChildItem $DirectoryName -Force | ForEach-Object {
#The FullName-property contains the absolute path, so there's no need to join the filename and $directoryname
[xml]$xmlDoc = Get-Content $_.FullName
$rootElementName = $xmlDoc.SelectSingleNode("/*").Name
#Outputing an object that we can group and sort
New-Object -TypeName psobject -Property #{
FileName = $_.FullName
RootElementName = $rootElementName
}
}
$grped = $AllFiles | Group-Object RootElementName | Sort-Object Name | Select-Object Name, Count
I'm creating an object with a FileName-property and the RootElementName so you have it if you need to retrieve the filename+rootelement for a list. If not, we could simplify this to:
$DirectoryName = "d:\MyFolder\"
$AllFiles = Get-ChildItem $DirectoryName -Force | ForEach-Object {
#The FullName-property contains the absolute path, so there's no need to join the filename and $directoryname
[xml]$xmlDoc = Get-Content $_.FullName
#Output rootelementname
$xmlDoc.SelectSingleNode("/*").Name
}
$grped = $AllFiles | Group-Object | Sort-Object Name | Select-Object Name, Count

Resources