Need help on PowerShell column looping

I have a requirement that should be done in Windows PowerShell or the command line. I need to split a CSV file's columns into .txt files.
customer.csv:
id,name
1,a
2,b
I need to split the columns into text files (the row and column counts are dynamic).
The output text files should be as follows:
id.txt:
1
2
name.txt:
a
b
I found the following script with the help of Google:
$a = Import-Csv "D:\Final\customer.csv"
$b = $a[0] | Get-Member | select -Skip 1 | ? { $_.membertype -eq 'noteproperty'}
$b | % { $a | ft -Property $_.name | out-file "$($_.name).txt" }
But the output text files come with column names, extra spaces, etc. I am unable to customize the above code. Kindly provide any help and let me know if anyone needs more information.
Thank you,
Satish Kumar

The problem with your code is the use of ft (Format-Table), which formats the CSV data for display, hence the column headers and padding spaces.
The following PowerShell script is a cleaner way to do it:
$csv = Import-Csv -Path 'D:\Final\customer.csv'
$columns = $csv | Get-Member -MemberType NoteProperty
foreach ($c in $columns)
{
    foreach ($line in $csv)
    {
        Add-Content -Path $($c.Name + '.txt') -Value $line.$($c.Name)
    }
}
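If the CSV is large, a variant of the same idea (same file path assumed) collects each column's values and writes each output file once, instead of reopening it per row with Add-Content. This is a sketch, not part of the original answer:

```powershell
# One write per column file instead of one write per cell.
$csv = Import-Csv -Path 'D:\Final\customer.csv'
$columnNames = ($csv | Get-Member -MemberType NoteProperty).Name
foreach ($name in $columnNames) {
    # ForEach-Object pulls the raw cell values; Set-Content writes them
    # all at once with no headers or padding.
    $csv | ForEach-Object { $_.$name } | Set-Content -Path "$name.txt"
}
```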

Related

Compare columns between 2 files and delete non common columns using Powershell

I have a bunch of files in folder A and their corresponding metadata files in folder B. I want to loop through the data files and check whether the columns are the same as in the metadata file (since incoming data files could have new columns added at any position without notice). If the columns in both files match, no action is to be taken. If the data file has more columns than the metadata file, those columns should be deleted from the incoming data file. Any help would be appreciated. Thanks!
Data file is ps_job.txt
“empid”|”name”|”deptid”|”zipcode”|”salary”|”gender”
“1”|”Tom”|”10″|”11111″|”1000″|”M”
“2”|”Ann”|”20″|”22222″|”2000″|”F”
Meta data file is ps_job_metadata.dat
“empid”|”name”|”zipcode”|”salary”
I would like my output to be
“empid”|”name”|”zipcode”|”salary”
“1”|”Tom”|”11111″|”1000″
“2”|”Ann”|”22222″|”2000″
That's a seemingly simple question with a very complicated answer. However, I've broken down the code for what you will need to do. Here are the steps that need to happen in order for PowerShell to do everything you're asking of it:
Read the .dat file
Save the .dat data into an object
Read the .txt file
Save the .txt header into an object
Check for the differences
Delete the old text file (that had too many columns)
Create a new text file with the new columns
I've made some assumptions in how this looks. However, with the way I've structured the code, it should be easy enough to make modifications as necessary if my assumptions are wrong. Here are my assumptions:
The text file will always have all of the columns that the DAT file has (even though it will sometimes have more)
The .dat file is structured like a text file and can be read directly into PowerShell.
And here is the code, with comments. I've done my best to explain the purpose of each section, but I've written this with the expectation that you have a basic knowledge of PowerShell, especially arrays. If you have questions I'll do my best to answer, though I'll ask that you refer to the section of code your question is about.
###
### The paths. I'm sure you will have multiples of each file. However, I didn't want to attempt to pull in
### the files with this sample code as it can vary so much in your environment.
###
$dat = "C:\StackOverflow\thingy.dat"
$txt = "C:\stackoverflow\ps_job.txt"
###
### This is the section to process the DAT file
###
# This will read the file and put it in a variable
$dat_raw = get-content -Path $dat
# Now, let's separate out the punctuation and give us our object
$dat_array = $dat_raw.split("|")
$dat_object = @()
foreach ($thing in $dat_array)
{
    $dat_object += $thing.Replace("""","")
}
###
### This is the section to process the TXT file
###
# This will read the file and put it into a variable
$txt_raw = get-content -Path $txt
# Now, let's separate out the punctuation and give us our object
$txt_header_array = $txt_raw[0].split("|")
$txt_header_object = @()
foreach ($thing in $txt_header_array)
{
    $txt_header_object += $thing.Replace("""","")
}
###
### Now, let's figure out which columns we're eliminating (if any)
###
$x = 0
$total = $txt_header_object.count
$to_keep = @()
While ($x -lt $total)
{
    if ($dat_object -contains $txt_header_object[$x])
    {
        $to_keep += $x
    }
    $x++
}
### Now that we know which columns to keep, we can apply the changes to each line of the text file.
### We will save each line to a new variable. Then, once we have the new variable, we will replace
### the existing file with a new file that has only the data we want. Note, we will only run this
### code if there's a difference between the files.
if ($total -ne $to_keep.count)
{
    ### This first section will go line by line and 'fix' the number of columns
    $new_text_file = @()
    foreach ($line in $txt_raw)
    {
        if ($line.Length -gt 0)
        {
            # Blank out the array each time
            $line_array = @()
            foreach ($number in $to_keep)
            {
                $line_array += ($line.split("|"))[$number]
            }
            $new_text_file += $line_array -join "|"
        }
        else
        {
            $new_text_file += ""
        }
    }
    ### This second section will delete the original file and replace it with our good
    ### file that has been created.
    Remove-Item -Path $txt
    $new_text_file | Out-File -FilePath $txt
}
This small example can be a starting point for your solution:
$ps_job = Import-Csv D:\ps_job.txt -Delimiter '|'
$ps_job_metadata = (Get-Content D:\ps_job_metadata.txt) -split '\|' -replace '"'
foreach( $d in (Compare-Object $ps_job[0].PSObject.Properties.Name $ps_job_metadata) )
{
    if($d.SideIndicator -eq '<=')
    {
        $ps_job | %{ $_.psobject.Properties.Remove($d.InputObject) }
    }
}
$ps_job | Export-Csv -Path D:\output.txt -Delimiter '|' -NoTypeInformation
I tried this and it works.
$outputFile = "C:\Script_test\ps_job_mod.dat"
$sample = Import-Csv -Path "C:\Script_test\ps_job.dat" -Delimiter '|'
$metadataLine = Get-Content -Path "C:\Script_test\ps_job_metadata.txt" -First 1
$desiredColumns = $metadataLine.Split("|").Replace("`"","")
$sample | select $desiredColumns | Export-Csv $outputFile -Encoding UTF8 -NoTypeInformation -Delimiter '|'
Please note that the smart quotes are inconsistent across the rows and there are empty lines between the rows (I highly recommend reformatting/updating your question).
Anyway, as long as the quoting of the header is consistent between the two files (ps_job.txt and ps_job_metadata.dat):
# $JobTxt = Get-Content .\ps_job.txt
$JobTxt = @'
“empid”|”name”|”deptid”|”zipcode”|”salary”|”gender”
“1”|”Tom”|”10″|”11111″|”1000″|”M”
“2”|”Ann”|”20″|”22222″|”2000″|”F”
'@
# $MetaDataTxt = Get-Content .\ps_job_metadata.dat
$MetaDataTxt = @'
“empid”|”name”|”zipcode”|”salary”
'@
$Job = ConvertFrom-Csv -Delimiter '|' $JobTxt
$MetaData = ConvertFrom-Csv -Delimiter '|' (@($MetaDataTxt) + 'x|')
$Job | Select-Object $MetaData.PSObject.Properties.Name
“empid” ”name” ”zipcode” ”salary”
------- ------ --------- --------
“1” ”Tom” ”11111″ ”1000″
“2” ”Ann” ”22222″ ”2000″
Here's the same answer I posted to your question on Powershell.org
$jobfile = "ps_job.dat"
$metafile = "ps_job_metadata.dat"
$outputfile = "some_file.csv"
$meta = ((Get-Content $metafile -First 1 -Encoding UTF8) -split '\|')
Class ColumnSelector : System.Collections.Specialized.OrderedDictionary {
    Select($line,$meta)
    {
        $meta | foreach{$this.add($_,(iex "`$line.$_"))}
    }
    ColumnSelector($line,$meta)
    {
        $this.select($line,$meta)
    }
}
Import-Csv $jobfile -Delimiter '|' |
    foreach{[pscustomobject]([columnselector]::new($_,$meta))} |
    Export-CSV $outputfile -Encoding UTF8 -NoTypeInformation -Delimiter '|'
Output
PS C:\>Get-Content $outputfile
"empid"|"name"|"zipcode"|"salary"
"1"|"Tom"|"11111"|"1000"
"2"|"Ann"|"22222"|"2000"
Provided you want to keep those curly quotes, and your code page and console font support all the characters, you can do the following:
# Create array of properties delimited by |
$headers = (Get-Content .\ps_job_metadata.dat -Encoding UTF8) -split '\|'
Import-Csv ps_job.dat -Delimiter '|' -Encoding utf8 | Select-Object $headers

PowerShell Format-Table -AutoSize not Producing an Output File

When running the following line in PowerShell including the "Format-Table -AutoSize", an empty output file is generated:
Get-ChildItem -Recurse | select FullName,Length | Format-Table -AutoSize | Out-File filelist.txt
The reason I need the output file to be AutoSized is that longer filenames from the directory are being truncated. I am trying to pull all filenames and file sizes for all files within a folder and its subfolders. When I remove the -AutoSize element, an output file is generated, but with truncated file names:
Get-ChildItem -Recurse | select FullName,Length | Out-File filelist.txt
Like AdminOfThings commented, use Export-CSV to get the untruncated values of your object.
Get-ChildItem -Recurse | select FullName,Length | Export-CSv -path $myPath -NoTypeInformation
I do not use Out-File much at all, and only use Format-Table/Format-List for interactive scripts. If I want to write data to a file, Select-Object Column1,Column2 | Sort-Object Column1 | Export-CSV lets me select the properties of the object I am exporting and sort the records as needed. You can change the delimiter from a comma to tab/pipe/whatever else you may need.
While the other answer may address the issue, you may have other reasons for wanting to use Out-File. Out-File has a "Width" parameter. If this is not set, PowerShell defaults to 80 characters - hence your issue. This should do the trick:
Get-ChildItem -Recurse | select FullName,Length | Out-File filelist.txt -Width 250
(Use any other value as needed.)
The Format-* cmdlets in PowerShell are intended for display in the console. They emit formatting objects rather than your original data, so their output is not useful to pipe to other cmdlets.
The usual approach to get the data out is with Export-Csv. CSV files are easily imported into other scripts or spreadsheets.
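A quick way to see this for yourself (a sanity check, not part of the original answer) is to inspect what Format-Table actually emits:

```powershell
# Format-Table outputs internal formatting records, not your objects,
# so downstream cmdlets can no longer see FullName or Length.
Get-ChildItem | Format-Table | Get-Member |
    Select-Object -ExpandProperty TypeName -Unique
# The type names are Microsoft.PowerShell.Commands.Internal.Format.*
```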
If you really need a nicely formatted text file, you can use .NET composite formatting with the -f (format) operator. This works similarly to printf() in C. Here is some sample code:
# Get the files for the report
$files = Get-ChildItem $baseDirectory -Recurse
# Path column width
$nameWidth = $files.FullName |
    ForEach-Object { $_.Length } |
    Measure-Object -Maximum |
    Select-Object -ExpandProperty Maximum
# Size column width
$longestFileSize = $files |
    ForEach-Object { $_.Length.ToString().Length } |
    Measure-Object -Maximum |
    Select-Object -ExpandProperty Maximum
# Have to consider that some directories will have no files with
# length strings longer than "Size (Bytes)"
$sizeWidth = [System.Math]::Max($longestFileSize, "Size (Bytes)".Length)
# Left-align paths, right-align file sizes
$formatString = "{0,-$nameWidth} {1,$sizeWidth}"
# Build the report and write it to a file
# ArrayLists are much more efficient than using += with arrays
$lines = [System.Collections.ArrayList]::new($files.Length + 3)
# The [void] casts are just to prevent ArrayList.Add() from cluttering the
# console with the returned indices
[void]$lines.Add($formatString -f ("Path", "Size (Bytes)"))
[void]$lines.Add($formatString -f ("----", "------------"))
foreach ($file in $files) {
    [void]$lines.Add($formatString -f ($file.FullName, $file.Length.ToString()))
}
$lines | Out-File "Report.txt"

How to remove first character in the first column from each row in CSV file?

My CSV file has column headers; in each row after that, the first character in the first column is "+".
How can I remove that leading "+" from the first column of each row (other than the header row)?
Thank you.
Sample file:
"col1","col2","col3"...
+"datacol1a","datacol2a","datacol3a"
+"datacol1b","datacol2b","datacol3b"
+"datacol1c","datacol2c","datacol3c"
Desired result:
"col1","col2","col3"...
"datacol1a","datacol2a","datacol3a"
"datacol1b","datacol2b","datacol3b"
"datacol1c","datacol2c","datacol3c"
I tried this PowerShell code (FindReplace.ps1), but it doesn't work:
param([string]$CSVFile="c:\myfolder\myCsv.csv")
(gc $CSVFile) -replace "^+"
If I use this PowerShell code instead, it also deletes the first character of the header row (which I don't want). How can I do this for every row except the header row (row 1)?
Get-Content -Path $CSVFile | ForEach-Object { $_.substring(1,$_.Length - 1) }
I call this powershell code from a BAT file
powershell -file FindReplace.ps1 -CSVFile "c:\myfolder\myCsv.csv" > "c:\myfolder\myCsv2.csv"
Because -replace uses a regex, you need to escape the + sign; it's a special character. You also need to tell the operator what to replace it with (here, an empty string).
Try this:
(Get-Content .\csv1.csv) -replace ('^\+', '') | Out-File .\csv2.csv
This could be another solution.
$csv = get-Content C:\Temp\csv1.csv
$header = $csv | Select-Object -First 1
$data = $csv | Select-Object -Skip 1 | ForEach-Object { $_.Substring(1, $_.Length - 1) }
$($header; $data) | Set-Content -Path C:\Temp\csv2.csv
Edited to add @TheMadTechnician's solution.
Get-Content C:\Temp\text.csv| ForEach-Object {$_.trimstart('+')} | set-content c:\temp\csv232.csv
I left out the aliases for ease of reading.

How to add different string lines to a new column and each row in a csv?

Image of the error log
I want to get the "Messages:" line (in the General tab) into "CCIS_Error_Log_2017-05-03.csv" as a new column, with each row mapped to the corresponding event.
(Get-WinEvent -FilterHashTable @{logname = 'CCIS'} | where {$_.LevelDisplayName -eq "Error"} | select Message) | Set-Content -Path "C:\Logs\temp.txt"
Import-Csv C:\CCIS_Error_Log_2017-05-03.csv | select-object *,@{Name="Messages";Expression={select-string C:\Logs\temp.txt -pattern "Message:" | foreach {$_.Line}}} | Export-Csv C:\Logs\CCIS_Error_Log_2017-05-03.csv -notypeinformation
This is a segment of the script I wrote for the scenario above, but instead of mapping each message to its event, it copies all of the "Messages:" lines from every event into a single row of the CSV file.
I hope I explained it well.
Thanks in advance.
If it is possible to match the events to the entries in the error CSV file by a shared key (the code below assumes both sides have an Id column), then this could be a solution:
$Events = Get-WinEvent -FilterHashTable @{logname = 'CCIS'} | Where {$_.LevelDisplayName -eq "Error"}
$ErrorLog = Import-Csv C:\CCIS_Error_Log_2017-05-03.csv
$ErrorLog | ForEach-Object {
    $LogEntry = $_
    $Event = $Events | Where {$_.Id -eq $LogEntry.Id}
    If ($Event) {
        Write-Output ($LogEntry | Select *,@{Name='Message';Expression={$Event.Message}})
    } Else {
        Write-Warning "Could not locate a matching message for $LogEntry"
    }
} | Export-CSV C:\Logs\CCIS_Error_Log_2017-05-03.csv -NoTypeInformation

Extract timestamp from filename and sort

I'm trying to look through each item in a folder and add each item to an array sorted by the datestamp in the filename.
For example, I have three files:
myfile_20150813_040949.txt
myfile_20150812_030949.txt
myfile_20150812_010949.txt
I'm not sure how to parse out the time from each and add them to an array in ascending order. Any help would be appreciated.
I am assuming that you want to sort the files by the timestamp parsed from the file name, as in your example. This may not be the best regex approach, but it works in testing.
#RegEx pattern to parse the timestamps
$Pattern = '.*_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\.txt'
$List = New-Object System.Collections.ArrayList
$Temp = New-Object System.Collections.ArrayList
Get-ChildItem | ForEach {
    #Make sure the file matches the pattern
    If ($_.Name -match $Pattern) {
        Write-Verbose "Add $($_.Name)" -Verbose
        $Date = $Matches[2],$Matches[3],$Matches[1] -join '/'
        $Time = $Matches[4..6] -join ':'
        [void]$Temp.Add(
            (New-Object PSObject -Property @{
                Date = [datetime]"$($Date) $($Time)"
                File = $_
            })
        )
    }
}
#Sort the files by the parsed timestamp and add to the main list
$List.AddRange(@($Temp | Sort Date | Select -Expand File))
#Clear out the temp collection
$Temp.Clear()
#Display the results
$List
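For comparison, here is a more compact sketch (assuming the same `myfile_yyyyMMdd_HHmmss.txt` naming) that computes the sort key inline instead of keeping two ArrayLists:

```powershell
# Sort files by the timestamp embedded in the name; names that do not
# match the pattern get a $null key and sort together at the front.
$sorted = Get-ChildItem -Filter '*.txt' | Sort-Object {
    if ($_.BaseName -match '_(\d{8})_(\d{6})$') {
        [datetime]::ParseExact($Matches[1] + $Matches[2], 'yyyyMMddHHmmss',
            [System.Globalization.CultureInfo]::InvariantCulture)
    }
}
$sorted
```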
What you could do here is use the string .Split() method together with the static [datetime]::TryParseExact() method. Go through each file, add a "FromFileDate" property to it, and then sort on that.
$path = "C:\temp"
Get-ChildItem -Filter "*.txt" -Path $path | ForEach-Object{
    $date = ($_.BaseName).Split("_",2)[1]
    $result = New-Object DateTime
    if([datetime]::TryParseExact($date,"yyyyMMdd_HHmmss",[System.Globalization.CultureInfo]::InvariantCulture,[System.Globalization.DateTimeStyles]::None,[ref]$result)){
        # This is a good date
        Add-Member -InputObject $_ -MemberType NoteProperty -Name "FromFileDate" -Value $result -PassThru
    } Else {
        # Could not parse date from filename
        Add-Member -InputObject $_ -MemberType NoteProperty -Name "FromFileDate" -Value "Could not Parse" -PassThru
    }
} | Select-Object Name,fromfiledate | Sort-Object fromfiledate
We take the base name of each text file and split it into 2 parts at the first underscore. Using TryParseExact we then attempt to convert the "date" string to a [datetime] with an exact format. Since we use TryParseExact, if we have trouble parsing a date the code will simply continue.
Sample Output
Name FromFileDate
---- ------------
myfile_20150812_030949.txt 8/12/2015 3:09:49 AM
myfile_20150813_040949.txt 8/13/2015 4:09:49 AM
files.txt Could not Parse
If you don't want the erroneous entries in the output, a simple Where-Object{$_.fromfiledate -is [datetime]} would remove them.