PowerShell: replace file content with output of previous command [duplicate]

I am having a helluva time trying to understand why this script is not working as intended. It is a simple script in which I am attempting to import a CSV, select a few columns that I want, then export the CSV, overwriting the original. (Basically we have archived data that I only need a few columns from for another project, due to memory size constraints.) This script is very simple, which apparently has an inverse relationship with how much frustration it causes when it doesn't work... Right now the end result is that I end up with an empty CSV instead of a CSV containing only the columns I selected with Select-Object.
$RootPath = "D:\SomeFolder"
$csvFilePaths = Get-ChildItem $RootPath -Recurse -Include *.csv |
    ForEach-Object {
        Import-CSV $_ |
            Select-Object Test_Name, Test_DataName, Device_Model, Device_FW, Data_Avg_ms, Data_StdDev |
            Export-Csv $_.FullName -NoType -Force
    }

Unless you read the input file into memory in full, up front, you cannot safely read from and write back to the same file in a given pipeline.
Specifically, a command such as Import-Csv file.csv | ... | Export-Csv file.csv will erase the content of file.csv.
The simplest solution is to enclose the command that reads the input file in (...), but note that:
The file's content (transformed into objects) must fit into memory as a whole.
There is a slight risk of data loss if the pipeline is interrupted before all (transformed) objects have been written back to the file.
Applied to your command:
$RootPath = "D:\SomeFolder"
Get-ChildItem $RootPath -Recurse -Include *.csv -OutVariable csvFiles |
    ForEach-Object {
        (Import-CSV $_.FullName) | # NOTE THE (...)
            Select-Object Test_Name, Test_DataName, Device_Model, Device_FW,
                          Data_Avg_ms, Data_StdDev |
            Export-Csv $_.FullName -NoType -Force
    }
Note that I've used -OutVariable csvFiles in order to collect the CSV file-info objects in output variable $csvFiles. Your attempt to collect the file paths via $csvFilePaths = ... doesn't work, because it attempts to collect Export-Csv's output, but Export-Csv produces no output.
Also, to be safe, I've changed the Import-Csv argument from $_ to $_.FullName to ensure that Import-Csv finds the input file (because, regrettably, file-info object $_ is bound as a string, which sometimes expands to the mere file name).
A safer solution would be to output to a temporary file first, and (only) on successful completion replace the original file.
With either approach, the replacement file will have default file attributes and permissions; if the original file had special attributes and/or permissions that you want to preserve, you must recreate them explicitly.
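For instance, a minimal sketch of that temp-file approach (the .tmp naming and the $? success check are illustrative assumptions, not part of the original answer):
Get-ChildItem $RootPath -Recurse -Include *.csv | ForEach-Object {
    $tempFile = "$($_.FullName).tmp"   # hypothetical temp-file name
    Import-Csv $_.FullName |
        Select-Object Test_Name, Test_DataName, Device_Model, Device_FW, Data_Avg_ms, Data_StdDev |
        Export-Csv $tempFile -NoTypeInformation
    # Replace the original only if the export completed without error.
    if ($?) { Move-Item -LiteralPath $tempFile -Destination $_.FullName -Force }
}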

As Matt commented, your last $PSItem ($_) no longer relates to Get-ChildItem but to the output of the Select-Object cmdlet, which doesn't have a FullName property.
You can use a different foreach approach:
$RootPath = "D:\SomeFolder"
$csvFilePaths = Get-ChildItem $RootPath -Recurse -Include *.csv
foreach ($csv in $csvFilePaths)
{
    Import-CSV $csv.FullName |
        Select-Object Test_Name,Test_DataName,Device_Model,Device_FW,Data_Avg_ms,Data_StdDev |
        Export-Csv $csv.FullName -NoType -Force
}
Or, keeping your code, add a $CsvPath variable containing the CSV path and use it later on:
$RootPath = "D:\SomeFolder"
Get-ChildItem $RootPath -Recurse -Include *.csv | ForEach-Object{
    $CsvPath = $_.FullName
    Import-CSV $CsvPath |
        Select-Object Test_Name,Test_DataName,Device_Model,Device_FW,Data_Avg_ms,Data_StdDev |
        Export-Csv $CsvPath -NoType -Force
}

So I have figured it out. In the original code I was attempting to pipe the Import-Csv cmdlet straight through instead of capturing its output in a variable first. Here is the code snippet that gets what I wanted to get done, done: declare a variable that uses the Import-Csv cmdlet as its definition, then pipe that variable through to the Select-Object and Export-Csv cmdlets. Thank you all for your assistance, I appreciate it!
$RootPath = "\someDirectory\"
$CsvFilePaths = #(Get-ChildItem $RootPath -Recurse -Include *.csv)
$ColumnsWanted = #('Test_Name','Test_DataName','Device_Model','Device_FW','Data_Avg_ms','Data_StdDev')
for($i=0;$i -lt $CsvFilePaths.Length; $i++){
$csvPath = $CsvFilePaths[$i]
Write-Host $csvPath
$importedCsv = Import-CSV $csvPath
$importedCsv | Select-Object $ColumnsWanted | Export-CSV $csvPath -NoTypeInformation
}

Related

Script in power shell to add checksum as alternate data stream fails with some file names but otherwise works

I want to check files for integrity with a checksum. To make it easier I put the hash into an alternate data stream of the file. When someone alters the file I can verify this with the checksum.
However, when I add a data stream the file's LastWriteTime gets updated, so I added functionality to reverse it.
It works like a charm - mostly. But it fails with some files, about 5%. I have no idea why. It looks like it fails with file names that contain spaces or extra dots, but many others that have spaces and multiple dots in the file name work just fine.
Does anyone know what's going on, how to prevent these failures or how to improve the code?
Thanks!
The code:
$filenames = Get-ChildItem *.xl* -Recurse | % { $_.FullName }
foreach( $filename in $filenames ) { ForEach-Object { $timelwt = Get-ItemProperty $filename | select -expand LastWriteTime | select -expand ticks } {add-content -stream MD5 -value (Get-FileHash -a md5 $filename).hash $filename } { Set-ItemProperty $filename -Name LastWriteTime -Value $timelwt}}
Your code can be reduced to this:
Get-ChildItem *.xl* -Recurse | ForEach-Object {
    $lastWriteTime = $_.LastWriteTime
    $_ | Add-Content -Stream MD5 -Value ($_ | Get-FileHash -a md5).Hash
    $_.LastWriteTime = $lastWriteTime
}
Get-ChildItem with the filter you have in place will return FileInfo objects, which have a settable LastWriteTime property, so there is no reason to use Get-ItemProperty or Set-ItemProperty on them.
As for why your code could be failing: the likely explanation is that some of your file paths contain wildcard metacharacters, and since you're not using -LiteralPath, the cmdlets are defaulting to the -Path parameter (which interprets wildcard metacharacters).
As an aside, I would personally recommend creating a separate checksum file for the files instead of adding an alternate data stream.
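A minimal sketch of that sidecar-file idea, using -LiteralPath so names containing wildcard metacharacters such as [ and ] are handled safely (the .md5 naming convention is an assumption for illustration):
Get-ChildItem *.xl* -Recurse | ForEach-Object {
    # Hash via -LiteralPath so brackets in file names aren't treated as wildcards.
    $hash = (Get-FileHash -LiteralPath $_.FullName -Algorithm MD5).Hash
    # Write the checksum next to the file; the original is never modified,
    # so its LastWriteTime stays intact.
    Set-Content -LiteralPath "$($_.FullName).md5" -Value $hash
}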

process multiple CSV file and delete rows in a single column which has double semi colon characters using powershell

Consider the below CSV file.
input:
ID;ITEM_ID;STATUS;
001;;RELEASED;
002;36530;RELEASED;
003;86246;RELEASED;
004;;RELEASED;
I want to remove the rows where ITEM_ID is missing (shown by ;;) and save the file. I tried doing it on one sample file and it worked as expected.
Import-Csv -Path ".\TestFile.CSV" | where {$_.ITEM_ID -ne ""} | Export-Csv -Path ".\TestFile-temp.CSV" -NoTypeInformation
Remove-Item -Path '.\TestDir\TestFile.csv'
Rename-Item -Path '.\TestDir\TestFile-temp.csv' -NewName 'TestFile.csv'
output:
ID;ITEM_ID;STATUS;
002;36530;RELEASED;
003;86246;RELEASED;
The challenge is that I have multiple CSV files, and when opened in Excel the values don't land in different columns but all in a single column,
so the condition < where {$_.ITEM_ID -ne ""} > doesn't take effect.
Now I have to search/parse each row of each CSV file, find the special character sequence (;;) in that row, delete that line and save the file.
I am good at shell scripting, but I am very new to PowerShell scripting. Can anybody please help me with the logic here, or with another cmdlet that can do the job?
$fileDirectory = "C:\Users\Administrator\Documents\check";
foreach($file in Get-ChildItem $fileDirectory)
{
$csvFileToCheck = Import-Csv -Path $fileDirectory\$file
$noDoubleSemiComma = foreach($line in $csvFileToCheck)
{
if(Select-String << i want the logic here>>)
{
$line
}
}
$noDoubleSemiComma | Export-Csv -Path $fileDirectory\tmp.csv -NoTypeInformation
Remove-Item -Path $fileDirectory\$file
Rename-Item -Path $fileDirectory\tmp.csv -NewName $file
}
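For the literal goal stated above (drop every line containing ;;), one minimal sketch is a raw text filter that skips CSV parsing entirely; treat it as an illustration alongside the answers below:
Get-ChildItem $fileDirectory -Filter '*.csv' | ForEach-Object {
    # Read all lines up front (note the parentheses, as in the first question
    # on this page), keep only lines without ';;', and write them back.
    (Get-Content -LiteralPath $_.FullName) |
        Where-Object { $_ -notmatch ';;' } |
        Set-Content -LiteralPath $_.FullName
}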
As commented, you need to add the parameter -Delimiter ';' to the cmdlet; otherwise a comma is used to parse the fields in the CSV.
As I understand it, you also want to remove the quotes Export-Csv puts around all fields and headers; on PowerShell version 7 you have the option of using the parameter -UseQuotes AsNeeded.
As this is not available for version 5.1, I made a function ConvertTo-CsvNoQuotes some time ago to remove the quotes in a safe way. (Simply replacing them all with an empty string is dangerous, because sometimes values do need quotes.)
Copy that function into your script at the top, then below that, your code could be simplified like this:
$fileDirectory = "C:\Users\Administrator\Documents\check"
Get-ChildItem -Path $fileDirectory -Filter '*.csv' -File | ForEach-Object {
    # for better readability, store the full path of the file in a variable
    $filePath = $_.FullName
    (Import-Csv -Path $filePath -Delimiter ';') | ConvertTo-CsvNoQuotes -Delimiter ';' | Set-Content $filePath -Force
    Write-Host "File '$filePath' modified"
}
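On PowerShell 7+, a sketch of the same loop without the helper function, relying on the built-in -UseQuotes parameter mentioned above:
Get-ChildItem -Path $fileDirectory -Filter '*.csv' -File | ForEach-Object {
    $filePath = $_.FullName
    # Read fully first (the parentheses), then write back without superfluous quotes.
    (Import-Csv -Path $filePath -Delimiter ';') |
        Export-Csv -Path $filePath -Delimiter ';' -NoTypeInformation -UseQuotes AsNeeded
}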
After all the helpful suggestions, I finally nailed it down. As my PowerShell version was 5.1, I had to trim the double quotes after Export-Csv. PowerShell version 7 and later has -UseQuotes, which would have solved that too.
Hope this helps others.
$fileDirectory = "C:\Users\Administrator\Documents\check";
foreach($file in Get-ChildItem $fileDirectory)
{
Import-Csv -Path $fileDirectory\$file -Delimiter ';' | where {$_..ITEM_ID -ne ""} | Export-Csv -Path $fileDirectory\temp.csv -Delimiter ';' -NoTypeInformation
$Test = Get-Content $fileDirectory\temp.csv
$Test.Replace('";"',";").TrimStart('"').TrimEnd('"') | Out-File $fileDirectory\temp.csv -Force -Confirm:$false
Remove-Item -Path $fileDirectory\$file
Rename-Item -Path $fileDirectory\temp.csv -NewName $file
Write-Output "$file file modified."
}
Any suggestion to trim down the number of lines of code is welcome.

PowerShell Format-Table -AutoSize not Producing an Output File

When running the following line in PowerShell with Format-Table -AutoSize included, an empty output file is generated:
Get-ChildItem -Recurse | select FullName,Length | Format-Table -AutoSize | Out-File filelist.txt
The reason I need the output file to be auto-sized is that longer file names from the directory are being truncated. I am trying to pull all file names and file sizes for all files within a folder and its subfolders. When removing the -AutoSize element, an output file is generated with truncated file names:
Get-ChildItem -Recurse | select FullName,Length | Out-File filelist.txt
Like AdminOfThings commented, use Export-CSV to get the untruncated values of your object.
Get-ChildItem -Recurse | select FullName,Length | Export-Csv -Path $myPath -NoTypeInformation
I do not use Out-File much at all, and I only use Format-Table/Format-List for interactive scripts. If I want to write data to a file, Select-Object Column1,Column2 | Sort-Object Column1 | Export-Csv lets me pick exactly which properties of the object to export and sort the records as needed. You can also change the delimiter from a comma to tab/pipe/whatever else you may need.
While the other answer may address the issue, you may have other reasons for wanting to use Out-File. Out-File has a "Width" parameter. If this is not set, PowerShell defaults to 80 characters - hence your issue. This should do the trick:
Get-ChildItem -Recurse | select FullName,Length | Out-File filelist.txt -Width 250
(Use any width large enough for your longest path.)
The Format-* cmdlets in PowerShell are intended only for display in the console. They emit formatting instructions rather than the original objects, so their output is not suited to piping into other cmdlets.
The usual approach to get the data out is with Export-Csv. CSV files are easily imported into other scripts or spreadsheets.
If you really need to output a nicely formatted text file you can use .Net composite formatting with the -f (format) operator. This works similarly to printf() in C. Here is some sample code:
# Get the files for the report
$files = Get-ChildItem $baseDirectory -Recurse
# Path column width
$nameWidth = $files.FullName |
ForEach-Object { $_.Length } |
Measure-Object -Maximum |
Select-Object -ExpandProperty Maximum
# Size column width
$longestFileSize = $files |
ForEach-Object { $_.Length.tostring().Length } |
Measure-Object -Maximum |
Select-Object -ExpandProperty Maximum
# Have to consider that some directories will have no files with
# length strings longer than "Size (Bytes)"
$sizeWidth = [System.Math]::Max($longestFileSize, "Size (Bytes)".Length)
# Left-align paths, right-align file sizes
$formatString = "{0,-$nameWidth} {1,$sizeWidth}"
# Build the report and write it to a file
# An ArrayList is much more efficient than using += with arrays
$lines = [System.Collections.ArrayList]::new($files.Length + 3)
# The [void] cast are just to prevent ArrayList.add() from cluttering the
# console with the returned indices
[void]$lines.Add($formatString -f ("Path", "Size (Bytes)"))
[void]$lines.Add($formatString -f ("----", "------------"))
foreach ($file in $files) {
[void]$lines.Add($formatString -f ($file.FullName, $file.Length.ToString()))
}
$lines | Out-File "Report.txt"

Select directory from a file

I need my program to give me every folder containing files whose paths exceed Windows' character limit. That means if a file path has more than 260 characters (248 for folders), I need it to write the address of the file's parent, and to write it only once. For now, I'm using this code:
$maxLength = 248
Get-ChildItem $newPath -Recurse |
Where-Object { ($_.FullName.Length -gt $maxLength) } |
Select-Object -ExpandProperty FullName |
Split-Path $_.FullName
But the Split-Path won't work (this is the first time I've used it). It tells me the -Path parameter has a null value (writing -Path explicitly doesn't change anything).
If you want an example of what I need: imagine folder3 has a 230-character address and file.txt has a 280-character address:
C:\users\folder1\folder2\folder3\file.txt
Would write:
C:\users\folder1\folder2\folder3
I'm using PS2, by the way.
Spoiler: the tool you are building may not be able to report paths over the limit since Get-ChildItem cannot access them. You can try nevertheless, and also find other solutions in the links at the bottom.
Issue in your code: $_ only works in specific contexts, for example a ForEach-Object loop.
But here, at the end of the pipeline, you're only left with a string containing the full path (not the complete file object any more), so directly passing it to Split-Path should work:
$maxLength = 248
Get-ChildItem $newPath -Recurse |
Where-Object { ($_.FullName.Length -gt $maxLength) } |
Select-Object -ExpandProperty FullName |
Split-Path
as "C:\Windows\System32\regedt32.exe" | Split-Path would output C:\Windows\System32
Sidenote: what do (Get-Item C:\Windows\System32\regedt32.exe).DirectoryName and (Get-Item C:\Windows\System32\regedt32.exe).Directory.FullName output on your computer? These both show the directory on my system.
Adapted code example:
$maxLength = 248
Get-ChildItem $newPath -Recurse |
Where-Object { ($_.FullName.Length -gt $maxLength) } |
ForEach-Object { $_.Directory.FullName } |
Select-Object -Unique
Additional information about MAX_PATH:
How do I find files with a path length greater than 260 characters in Windows?
Why does the 260 character path length limit exist in Windows?
http://www.powershellmagazine.com/2012/07/24/jaap-brassers-favorite-powershell-tips-and-tricks/
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx
https://gallery.technet.microsoft.com/scriptcenter/Get-ChildItemV2-to-list-29291aae
You cannot use Get-ChildItem to list paths greater than the Windows character limit.
There are a couple of alternatives for you: try an external library like AlphaFS, or use robocopy. Boe Prox has a script that utilizes robocopy; it is available on TechNet, but I am not sure whether it will work on PS v2. Anyway, you can give it a try.
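Along the lines of the first linked question above, a rough sketch of the external-tool idea (cmd's dir is used here because it tolerates long paths better than PowerShell 2's Get-ChildItem; the 260 threshold is the file limit from the question, and this is an illustration, not a tested PS2 solution):
cmd /c dir $newPath /s /b |
    Where-Object { $_.Length -gt 260 } |
    Split-Path |
    Select-Object -Unique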
I've had a similar problem and resolved it like this:
$PathTooLong = @()
Get-ChildItem -LiteralPath $Path -Recurse -ErrorVariable +e -ErrorAction SilentlyContinue
$e | where {$_.Exception -like 'System.IO.PathTooLongException*'} | ForEach-Object {
    $PathTooLong += $_.TargetObject
    $Global:Error.Remove($_)
}
$PathTooLong
On every path that is too long, or that the PowerShell engine can't handle, Get-ChildItem will throw an error. This error is saved in the ErrorVariable called e in the example above.
When all errors are collected in $e you can filter out the ones you need by checking the error Exception for the string System.IO.PathTooLongException.
Hope it helps you out.

Compare a log file of file paths to a directory structure and remove files not in log file

I have a file transfer/sync job that is copying files from the main network into a totally secure network using a custom protocol (i.e. no SMB). The problem is that because I can't look back to see what files exist, the destination is filling up, as the copy doesn't remove any files it hasn't touched (the way robocopy /MIR would).
Initially I wrote a script that:
1. Opens the log file and grabs the file paths out (this is quite quick and painless)
2. Does a Get-ChildItem on the destination folder (now using dir /s /b as it's way faster than gci)
3. Compares the two, and then removes the differences.
The problem is that there are more jobs that require this clean-up, but the log files are 100MB and the folders contain 600,000 files, so it's taking ages and using tons of memory; I have actually yet to see one finish. I'd really like some ideas on how to make this faster (memory/CPU use doesn't bother me too much, but speed is essential).
$destinationMatch = "//server/fileshare/folder/"
The log file contains some headers and footers and then 600,000 lines like this one:
"//server/fileshare/folder/dummy/deep/tags/20140826/more_stuff/Deeper/2012-07-02_2_0.dat_v2" 33296B 0B completed
Here's the script:
[CmdletBinding(SupportsShouldProcess=$True)]
param(
    [Parameter(Mandatory=$True)]
    [String]$logName,
    [Parameter(Mandatory=$True)]
    [String]$destinationMatch
)
$logPath = [string]("C:\Logs\" + $logName)
$manifestFile = gci -Path $logPath | where {$_.name -match "manifest"} | sort creationtime -descending | select Name -first 1
$manifestFileName = [string]$manifestFile.name
$manifestFullPath = $logPath + "\" + $manifestFileName

$copiedList = @()
(gc $manifestFullPath -ReadCount 0) | where {$_.trim() -match $DestinationMatch} | % {
    if ( $_ -cmatch '(?<=")[^"]*(?=")' ){
        $copiedList += ($matches[0]).replace("/","\")
    }
}

$dest = $destinationMatch.replace("/","\")
$actualPathString = (gci -Path $dest -Recurse | select fullname).fullname

Compare-Object -ReferenceObject $copiedList -DifferenceObject $actualPathString -PassThru | % {
    $leaf = Split-Path $_ -leaf
    if ($leaf.contains(".")){
        $fsoData = gci -Path $_
        if (!($fsoData.PSIsContainer)){
            Remove-Item $_ -Force
        }
    }
}

$actualDirectory | where {$_.PSIsContainer -and @(gci -LiteralPath $_.FullName -Recurse -WarningAction SilentlyContinue -ErrorAction SilentlyContinue | where {!$_.PSIsContainer}).Length -eq 0} | remove-item -Recurse -Force
Ok, so let's assume that your file copy preserves the last modified date/time stamp. If you really need to pull a directory listing, and compare it against a log, I think you're doing a decent job of it. The biggest slow down is obviously going to be pulling your directory listing. I'll address that shortly. For right now I would propose the following modification of your code:
[CmdletBinding(SupportsShouldProcess=$True)]
param(
    [Parameter(Mandatory=$True)]
    [String]$logName,
    [Parameter(Mandatory=$True)]
    [String]$destinationMatch
)
$logPath = [string]("C:\Logs\" + $logName)
$manifestFile = gci -Path $logPath | where {$_.name -match "manifest"} | sort creationtime -descending | select -first 1

$RegExPattern = [regex]::escape($DestinationMatch)
$FilteredManifest = gc $manifestFile.FullName | where {$_ -match "`"($RegexPattern[^`"]*)`""} | % {$matches[1] -replace '/','\'}

$dest = $destinationMatch.replace("/","\")
$DestFileList = gci -Path $dest -Recurse | select FullName,Attributes

# Remove files that aren't in the manifest ...
$DestFileList |
    where {$FilteredManifest -notcontains $_.FullName -and $_.Attributes -notmatch "Directory"} |
    % { Remove-Item -LiteralPath $_.FullName -Force }

# ... then remove unlisted directories that are now empty
$DestFileList |
    where {$FilteredManifest -notcontains $_.FullName -and $_.Attributes -match "Directory" -and
           (gci -LiteralPath $_.FullName -Recurse -WarningAction SilentlyContinue -ErrorAction SilentlyContinue).Length -eq 0} |
    % { Remove-Item -LiteralPath $_.FullName -Recurse -Force }
This stops you from duplicating efforts. There's no need to get your manifest file and then assign different variables to different properties of the file object; just reference them directly. Then later, when you pull your directory listing of the drive (the slow part here), keep the full name and attributes of the files/folders. That way you can easily filter against Attributes to see what's a directory and what's not, so we can deal with files first, then clean up the directories after the files are gone.
That should be a somewhat more streamlined version of your script. Now, about pulling that directory listing... Here's the deal: using Get-ChildItem is going to be slower than some alternatives (such as dir /s /b), but it saves you from having to duplicate effort later checking what's a file and what's a directory. I suppose if the actual files/folders you are concerned with are a small percentage of the total, then the double work may actually be worth it; pull the list with something like dir /s /b, parse it against the log, and only fetch folder/file info for the specific items you need to address.
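One additional speed idea, not from the answer above: with 600,000 paths, every -notcontains test scans the whole array. Loading the manifest into a HashSet makes each membership check effectively constant-time; a hedged sketch:
# Build a case-insensitive set of the copied paths once ...
$copiedSet = New-Object 'System.Collections.Generic.HashSet[string]' ([System.StringComparer]::OrdinalIgnoreCase)
foreach ($path in $FilteredManifest) { [void]$copiedSet.Add($path) }

# ... then each lookup is O(1) instead of an array scan.
$DestFileList | where { -not $copiedSet.Contains($_.FullName) }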
