I'm using Packetbeat to monitor network traffic for a SIEM-like setup with ELK. I'd like to push it to a large number of machines, but the setup requires manually identifying the interface in packetbeat.yml.
Has anyone been able to script the process of selecting the appropriate interface for Packetbeat to monitor?
I've put this together - it uses three separate .yml files:
ConfigTemplate.yml which contains the rest of the packetbeat.yml minus the interfaces.
Interfaces.yml which is a temp file used to write the interfaces to.
packetbeat.yml which is the final config file packetbeat will use.
The Python script should be in the packetbeat directory along with the config .yml files.
The only limitation is that it needs Python on the host machines - the next stage is to see if it can be done with PowerShell.
Hope this helps anyone else! Any improvements are welcome!
import subprocess
# Ask Packetbeat (via PowerShell) how many capture devices it can see.
devices = subprocess.check_output(["powershell.exe", "(./packetbeat.exe devices).count"])
devicesCount = int(devices.decode('utf-8').strip())
print(devicesCount)
# Read the config template (everything except the interface lines).
with open('ConfigTemplate.yml', 'r') as original:
    data1 = original.read()
# Write one interface line per device index to the temp file.
with open('Interfaces.yml', 'w') as modified:
    for i in range(devicesCount):
        modified.write("packetbeat.interfaces.device: " + str(i) + "\n")
with open('Interfaces.yml', 'r') as original:
    data2 = original.read()
# Stitch the interface lines and the template together into the final config.
with open('packetbeat.yml', 'w') as modified2:
    modified2.write("# ================== Set listening interfaces ==================\n"
                    + data2 + "\n" + data1 + "\n")
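Save it in the packetbeat directory (the script name below is illustrative) and run it before starting the service:
cd C:\path\to\packetbeat
python make_packetbeat_config.py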
PowerShell version -
$count = (C:\path\to\packetbeat.exe devices).count
$line = ''
for($i=0; $i -le ($count-1); $i++){
$line +="packetbeat.interfaces.device:"+" $i `r`n"
}
$line | Out-File -FilePath "C:\path\to\packetbeat\Interfaces.yml"
$configTemplate = Get-Content -Path "C:\path\to\packetbeat\ConfigTemplate.yml"
$interfaces = Get-Content -Path "C:\path\to\packetbeat\Interfaces.yml"
$interfaces + "`r`n" + $configTemplate | Out-File -FilePath "C:\path\to\packetbeat\packetbeat.yml"
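To actually push this out to many machines, one option is PowerShell remoting - a sketch only, not tested, assuming WinRM is enabled on the targets and the script above is saved locally as Set-PacketbeatConfig.ps1 (an illustrative name):
# Run the local config-generation script on each remote host.
# Assumes WinRM is enabled and packetbeat lives at the same path on every target.
$servers = Get-Content C:\path\to\servers.txt
Invoke-Command -ComputerName $servers -FilePath C:\path\to\Set-PacketbeatConfig.ps1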
This code works. I just want to see how much faster someone can make it work.
Back up your Windows 10 batch file in case something goes wrong. The task: find all instances of the string {LINE2 1-9999} and replace them with {LINE2 "the line number the code is on"}. Overwrite the file, encoding as ASCII.
If _61.bat is:
TITLE %TIME% NO "%zmyapps1%\*.*" ARCHIVE ATTRIBUTE LINE2 1243
TITLE %TIME% DOC/SET YQJ8 LINE2 1887
SET ztitle=%TIME%: WINFOLD LINE2 2557
TITLE %TIME% _*.* IN WINFOLD LINE2 2597
TITLE %TIME% %%ZDATE1%% YQJ25 LINE2 3672
TITLE %TIME% FINISHED. PRESS ANY KEY TO SHUTDOWN ... LINE2 4922
Results:
TITLE %TIME% NO "%zmyapps1%\*.*" ARCHIVE ATTRIBUTE LINE2 1
TITLE %TIME% DOC/SET YQJ8 LINE2 2
SET ztitle=%TIME%: WINFOLD LINE2 3
TITLE %TIME% _*.* IN WINFOLD LINE2 4
TITLE %TIME% %%ZDATE1%% YQJ25 LINE2 5
TITLE %TIME% FINISHED. PRESS ANY KEY TO SHUTDOWN ... LINE2 6
Code:
Copy-Item $env:windir\_61.bat -Destination $env:temp\_61.bat
(gc $env:windir\_61.bat) | foreach -Begin {$lc = 1} -Process {
$_ -replace "LINE2 \d*", "LINE2 $lc";
$lc += 1
} | Out-File -Encoding Ascii $env:windir\_61.bat
I expect this to take less than 984 milliseconds. It takes 984 milliseconds. Can you think of anything to speed it up?
The key to better performance in PowerShell code (short of embedding C# code compiled on demand with Add-Type, which may or may not help) is to:
avoid use of cmdlets and the pipeline in general,
especially invocation of a script block ({...}) for each pipeline input object, such as with ForEach-Object and Where-Object
However, it isn't the pipeline per se that is to blame, it is the current inefficient implementation of these cmdlets - see GitHub issue #10982 - and there is a workaround that noticeably improves pipeline performance:
# Faster alternative to:
# 1..10 | ForEach-Object { $_ * 10 }
1..10 | . { process { $_ * 10 } }
# Faster alternative to:
# 1..10 | Where-Object { $_ -gt 5 }
1..10 | . { process { if ($_ -gt 5) { $_ } } }
Avoiding the pipeline requires direct use of .NET framework types as an alternative to cmdlets.
If feasible, use switch statements for array or line-by-line file processing - switch statements generally outperform foreach loops.
To be clear: The pipeline and cmdlets offer clear benefits, so avoiding them should only be done if optimizing performance is a must.
In your case, the following code, which combines the switch statement with direct use of the .NET framework for file I/O, seems to offer the best performance. Note that the input file is read into memory as a whole, as an array of lines, and a copy of that array with the modified lines is created before it is written back to the input file:
$file = "$env:temp\_61.bat" # must be a *full* path.
$lc = 0
$updatedLines = & { switch -Regex -File $file {
'^(.*? LINE2 )\d+(.*)$' { $Matches[1] + ++$lc + $Matches[2] }
default { ++$lc; $_ } # pass non-matching lines through
} }
[IO.File]::WriteAllLines($file, $updatedLines, [Text.Encoding]::ASCII)
Note:
Enclosing the switch statement in & { ... } is an obscure performance optimization explained in this answer.
If case-sensitive matching is sufficient, as suggested by the sample input, you can improve performance a little more by adding the -CaseSensitive option to the switch command.
In my tests (see below), this provided a more than 4-fold performance improvement in Windows PowerShell relative to your command.
Here's a performance comparison via the Time-Command function (a custom benchmarking function; it is not built into PowerShell):
The commands compared are:
The switch command from above.
A slightly streamlined version of your own command.
A PowerShell Core v6.1+ alternative that uses the -replace operator with the array of lines as the LHS and a scriptblock as the replacement expression.
Instead of a 6-line sample file, a 6,000-line file is used.
100 runs are being averaged.
It's easy to adjust these parameters.
# Sample file content (6 lines)
$fileContent = @'
TITLE %TIME% NO "%zmyapps1%\*.*" ARCHIVE ATTRIBUTE LINE2 1243
TITLE %TIME% DOC/SET YQJ8 LINE2 1887
SET ztitle=%TIME%: WINFOLD LINE2 2557
TITLE %TIME% _*.* IN WINFOLD LINE2 2597
TITLE %TIME% %%ZDATE1%% YQJ25 LINE2 3672
TITLE %TIME% FINISHED. PRESS ANY KEY TO SHUTDOWN ... LINE2 4922
'@
# Determine the full path to a sample file.
# NOTE: Using the *full* path is a *must* when calling .NET methods, because
# the latter generally don't see the same working dir. as PowerShell.
$file = "$PWD/test.bat"
# Create the sample file with the sample content repeated N times.
$repeatCount = 1000 # -> 6,000 lines
[IO.File]::WriteAllText($file, $fileContent * $repeatCount)
# Warm up the file cache and count the lines.
$lineCount = [IO.File]::ReadAllLines($file).Count
# Define the commands to compare as an array of scriptblocks.
$commands =
{ # switch -Regex -File + [IO.File]::Read/WriteAllLines()
$i = 0
$updatedLines = & { switch -Regex -File $file {
'^(.*? LINE2 )\d+(.*)$' { $Matches[1] + ++$i + $Matches[2] }
default { ++$i; $_ }
} }
[IO.File]::WriteAllLines($file, $updatedLines, [text.encoding]::ASCII)
},
{ # Get-Content + -replace + Set-Content
(Get-Content $file) | ForEach-Object -Begin { $i = 1 } -Process {
$_ -replace "LINE2 \d*", "LINE2 $i"
++$i
} | Set-Content -Encoding Ascii $file
}
# In PS Core v6.1+, also test -replace with a scriptblock operand.
if ($PSVersionTable.PSVersion.Major -gt 6 -or ($PSVersionTable.PSVersion.Major -eq 6 -and $PSVersionTable.PSVersion.Minor -ge 1)) {
$commands +=
{ # -replace with scriptblock + [IO.File]::Read/WriteAllLines()
$i = 0
[IO.File]::WriteAllLines($file,
([IO.File]::ReadAllLines($file) -replace '(?<= LINE2 )\d+', { (++$i) }),
[text.encoding]::ASCII
)
}
} else {
Write-Warning "Skipping -replace-with-scriptblock command, because it isn't supported in this PS version."
}
# How many runs to average.
$runs = 100
Write-Verbose -vb "Averaging $runs runs with a $lineCount-line file of size $('{0:N2} MB' -f ((Get-Item $file).Length / 1mb))..."
Time-Command -Count $runs -ScriptBlock $commands
Here are sample results from my Windows 10 machine (the absolute timings aren't important, but hopefully the relative performance shown in the Factor column is somewhat representative); the PowerShell Core version used is v6.2.0-preview.4:
# Windows 10, Windows PowerShell v5.1
WARNING: Skipping -replace-with-scriptblock command, because it isn't supported in this PS version.
VERBOSE: Averaging 100 runs with a 6000-line file of size 0.29 MB...
Factor Secs (100-run avg.) Command
------ ------------------- -------
1.00 0.108 # switch -Regex -File + [IO.File]::Read/WriteAllLines()...
4.22 0.455 # Get-Content + -replace + Set-Content...
# Windows 10, PowerShell Core v6.2.0-preview 4
VERBOSE: Averaging 100 runs with a 6000-line file of size 0.29 MB...
Factor Secs (100-run avg.) Command
------ ------------------- -------
1.00 0.101 # switch -Regex -File + [IO.File]::Read/WriteAllLines()…
1.67 0.169 # -replace with scriptblock + [IO.File]::Read/WriteAllLines()…
4.98 0.503 # Get-Content + -replace + Set-Content…
I have a working script whose objective is to parse data files for malformed rows before importing them into Oracle. Processing a 450 MB csv file with more than 1 million rows of 8 columns takes a little over 2.5 hours and maxes out a single CPU core. Small files complete quickly (in seconds).
Oddly, a 350 MB file with a similar number of rows and 40 columns only takes 30 minutes.
My issue is that the files will grow over time, and 2.5 hours tying up a CPU ain't good. Can anyone recommend code optimisations? A similarly titled post recommended local paths - which I'm already doing.
$file = "\Your.csv"
$path = "C:\Folder"
$csv = Get-Content "$path$file"
# Count number of file headers
$count = ($csv[0] -split ',').count
# https://blogs.technet.microsoft.com/gbordier/2009/05/05/powershell-and-writing-files-how-fast-can-you-write-to-a-file/
$stream1 = [System.IO.StreamWriter] "$path\Passed$file-Pass.txt"
$stream2 = [System.IO.StreamWriter] "$path\Failed$file-Fail.txt"
# Two validation steps: (1) the row's column count must be >= the header column count; (2) after splitting off the first column, the remainder must be at least 40 characters.
$csv | Select -Skip 1 | % {
if( ($_ -split ',').count -ge $count -And ($_.split(',',2)[1]).Length -ge 40) {
$stream1.WriteLine($_)
} else {
$stream2.WriteLine($_)
}
}
$stream1.close()
$stream2.close()
Sample data file (the third and fourth data rows are malformed and should land in the Fail output):
C1,C2,C3,C4,C5,C6,C7,C8
ABC,000000000000006732,1063,2016-02-20,0,P,ESTIMATE,2015473497A10
ABC,000000000000006732,1110,2016-06-22,0,P,ESTIMATE,2015473497A10
ABC,,2016-06-22,,201501
,,,,,,,,
ABC,000000000000006732,1135,2016-08-28,0,P,ESTIMATE,2015473497B10
ABC,000000000000006732,1167,2015-12-20,0,P,ESTIMATE,2015473497B10
Get-Content is extremely slow in the default mode that produces an array when the file contains millions of lines, on all PowerShell versions including 5.1. What's worse, you're assigning it to a variable, so nothing else happens until the entire file is read and split into lines. On an Intel i7-3770K CPU at 3.9 GHz, $csv = Get-Content $path takes more than 2 minutes to read a 350 MB file with 8 million lines.
Solution: use IO.StreamReader to read a line and process it immediately.
In PowerShell 2, StreamReader is less optimized than in PS3+, but it is still faster than Get-Content.
Pipelining via | is at least several times slower than direct enumeration via flow-control statements such as the while or foreach statement (not the ForEach-Object cmdlet).
Solution: use the statements.
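A minimal illustration of the difference (not from the original answer; the path is a placeholder):
# Slower: a script block is invoked for every object coming through the pipeline
Get-Content r:\data.csv | ForEach-Object { $_.Length }
# Faster: a foreach *statement* enumerating an array that is already in memory
foreach ($line in [IO.File]::ReadAllLines('r:\data.csv')) { $line.Length }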
Splitting each line into an array of strings is slower than manipulating only one string.
Solution: use the IndexOf and Replace methods (not the operators) to count character occurrences.
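For example, the column count of a line can be derived from a single Replace call instead of splitting - this is exactly the trick used in the if condition of the code below:
$s = 'ABC,000000000000006732,1063,2016-02-20,0,P,ESTIMATE,2015473497A10'
# comma count = original length minus the length with all commas removed
$numCols = $s.Length - $s.Replace(',', '').Length + 1   # 7 commas -> 8 columns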
PowerShell always creates an internal pipeline when loops are used.
Solution: use the Invoke-Command { } trick for 2-3x speedup in this case!
Below is PS2-compatible code.
It's faster in PS3+ (30 seconds for 8 million lines in a 350MB csv on my PC).
$reader = New-Object IO.StreamReader ('r:\data.csv', [Text.Encoding]::UTF8, $true, 4MB)
$header = $reader.ReadLine()
$numCol = $header.Split(',').count
$writer1 = New-Object IO.StreamWriter ('r:\1.csv', $false, [Text.Encoding]::UTF8, 4MB)
$writer2 = New-Object IO.StreamWriter ('r:\2.csv', $false, [Text.Encoding]::UTF8, 4MB)
$writer1.WriteLine($header)
$writer2.WriteLine($header)
Write-Progress 'Filtering...' -status ' '
$watch = [Diagnostics.Stopwatch]::StartNew()
$currLine = 0
Invoke-Command { # the speed-up trick: disables internal pipeline
while (!$reader.EndOfStream) {
$s = $reader.ReadLine()
$slen = $s.length
if ($slen-$s.IndexOf(',')-1 -ge 40 -and $slen-$s.Replace(',','').length+1 -eq $numCol){
$writer1.WriteLine($s)
} else {
$writer2.WriteLine($s)
}
if (++$currLine % 10000 -eq 0) {
$pctDone = $reader.BaseStream.Position / $reader.BaseStream.Length
Write-Progress 'Filtering...' -status "Line: $currLine" `
-PercentComplete ($pctDone * 100) `
-SecondsRemaining ($watch.ElapsedMilliseconds * (1/$pctDone - 1) / 1000)
}
}
} #Invoke-Command end
Write-Progress 'Filtering...' -Completed -status ' '
echo "Elapsed $($watch.Elapsed)"
$reader.close()
$writer1.close()
$writer2.close()
Another approach is to use regex in two passes (it's slower than the above code, though).
PowerShell 3 or newer is required due to the array element property shorthand syntax:
$text = [IO.File]::ReadAllText('r:\data.csv')
$header = $text.substring(0, $text.indexOfAny("`r`n"))
$numCol = $header.split(',').count
$rx = [regex]"\r?\n(?:[^,]*,){$($numCol-1)}[^,]*?(?=\r?\n|$)"
[IO.File]::WriteAllText('r:\1.csv', $header + "`r`n" +
($rx.matches($text).groups.value -join "`r`n"))
[IO.File]::WriteAllText('r:\2.csv', $header + "`r`n" + $rx.replace($text, ''))
If you feel like installing awk, you can do 1,000,000 records in under a second - seems like a good optimisation to me :-)
awk -F, '
NR==1 {f=NF; printf("Expecting: %d fields\n",f)} # First record, get expected number of fields
NF!=f {print > "Fail.txt"; next} # Fail for wrong field count
length($0)-length($1)<40 {print > "Fail.txt"; next} # Fail for wrong length
{print > "Pass.txt"} # Pass
' MillionRecord.csv
You can get gawk for Windows from here.
Windows is a bit awkward with single quotes in parameters, so if running under Windows I would use the same code, but formatted like this:
Save this in a file called commands.awk:
NR==1 {f=NF; printf("Expecting: %d fields\n",f)}
NF!=f {print > "Fail.txt"; next}
length($0)-length($1)<40 {print > "Fail.txt"; next}
{print > "Pass.txt"}
Then run with:
awk -F, -f commands.awk Your.csv
The remainder of this answer relates to a "Beat hadoop with the shell" challenge mentioned in the comments section, and I wanted somewhere to save my code, so it's here. It runs in 6.002 seconds on my iMac over 3.5 GB in 1,543 files, amounting to around 104 million records:
#!/bin/bash
doit(){
awk '!/^\[Result/{next} /1-0/{w++;next} /0-1/{b++} END{print w,b}' "$@"
}
export -f doit
find . -name \*.pgn -print0 | parallel -0 -n 4 -j 12 doit {}
Try experimenting with different looping strategies, for example, switching to a for loop cuts the processing time by more than 50%, e.g.:
[String] $Local:file = 'Your.csv';
[String] $Local:path = 'C:\temp';
[System.Array] $Local:csv = $null;
[System.IO.StreamWriter] $Local:objPassStream = $null;
[System.IO.StreamWriter] $Local:objFailStream = $null;
[Int32] $Local:intHeaderCount = 0;
[Int32] $Local:intRow = 0;
[String] $Local:strRow = '';
[TimeSpan] $Local:objMeasure = 0;
try {
# Load.
$objMeasure = Measure-Command {
$csv = Get-Content -LiteralPath (Join-Path -Path $path -ChildPath $file) -ErrorAction Stop;
$intHeaderCount = ($csv[0] -split ',').count;
} #measure-command
'Load took {0}ms' -f $objMeasure.TotalMilliseconds;
# Create stream writers.
try {
$objPassStream = New-Object -TypeName System.IO.StreamWriter ( '{0}\Passed{1}-pass.txt' -f $path, $file );
$objFailStream = New-Object -TypeName System.IO.StreamWriter ( '{0}\Failed{1}-fail.txt' -f $path, $file );
# Process CSV (v1).
$objMeasure = Measure-Command {
$csv | Select-Object -Skip 1 | Foreach-Object {
if( (($_ -Split ',').Count -ge $intHeaderCount) -And (($_.Split(',',2)[1]).Length -ge 40) ) {
$objPassStream.WriteLine( $_ );
} else {
$objFailStream.WriteLine( $_ );
} #else-if
} #foreach-object
} #measure-command
'Process took {0}ms' -f $objMeasure.TotalMilliseconds;
# Process CSV (v2).
$objMeasure = Measure-Command {
for ( $intRow = 1; $intRow -lt $csv.Count; $intRow++ ) {
if( (($csv[$intRow] -Split ',').Count -ge $intHeaderCount) -And (($csv[$intRow].Split(',',2)[1]).Length -ge 40) ) {
$objPassStream.WriteLine( $csv[$intRow] );
} else {
$objFailStream.WriteLine( $csv[$intRow] );
} #else-if
} #for
} #measure-command
'Process took {0}ms' -f $objMeasure.TotalMilliseconds;
} #try
catch [System.Exception] {
'ERROR : Failed to create stream writers; exception was "{0}"' -f $_.Exception.Message;
} #catch
finally {
$objFailStream.close();
$objPassStream.close();
} #finally
} #try
catch [System.Exception] {
'ERROR : Failed to load CSV.';
} #catch
exit 0;
I am trying to count, in an efficient way, how many times each string in "file1" (strings vary from 40 to 400+ characters) occurs in "file2". file1 has about 2k lines and file2 has about 130k lines. I currently have a Unix solution that does it in about 2 minutes in a VM and about 5 in Cygwin, but I am trying to do it with PowerShell/Python, since the files are on Windows and I am using the output in Excel and with automation (AutoIt).
I have a solution, but it takes WAY too long (in about the same time that Cygwin finished all 2k lines, I had processed only 40-50 lines in PowerShell!)
Although I haven't prepared a solution yet, I am open to using Python as well if there is a solution that can be fast and accurate.
Here is the Unix Code:
while read SEARCH_STRING;
do printf "%s$" "${SEARCH_STRING}";
grep -Fc "${SEARCH_STRING}" file2.csv;
done < file1.csv | tee -a output.txt;
And here is the PowerShell code I currently have:
$Target = Get-Content .\file1.csv
Foreach ($line in $Target){
#Just to keep strings small, since I found that not all
#strings were being compared correctly if they were 250+ chars
$line = $line.Substring(0,180)
$Coll = Get-Content .\file2.csv | Select-string -pattern "$line"
$cnt = $Coll | measure
$cnt.count
}
Any ideas or suggestions will help.
Thanks.
EDIT
I'm trying a modified solution suggested by C.B.
del .\output.txt
$Target = Get-Content .\file1.csv
$file= [System.IO.File]::ReadAllText( "C:\temp\file2.csv" )
Foreach ($line in $Target){
$line = [string]$line.Substring(0, $line.length/2)
$cnt = [regex]::matches( [string]$file, $line).count >> ".\output.txt"
}
But since my strings in file1 vary in length, I kept getting OutOfBound exceptions from the Substring call, so I halved (/2) the input string to try to get a match. And when I halve them, if the string contains an open parenthesis, I get this:
Exception calling "Matches" with "2" argument(s): "parsing "CVE-2013-0796,04/02/2013,MFSA2013-35 SeaMonkey: WebGL
crash with Mesa graphics driver on Linux (C" - Not enough )'s."
At C:\temp\script_test.ps1:6 char:5
+ $cnt = [regex]::matches( [string]$file, $line).count >> ".\output.txt ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : ArgumentException
I don't know if there is a way to raise the input limit in PowerShell (my biggest string at the moment is 406 characters, but it could be bigger in the future), or whether I should just give up and try a Python solution.
Thoughts?
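For what it's worth, the OutOfBound errors can be avoided without halving by clamping the length before calling Substring - a sketch, not from the answers below (the parenthesis error is a separate problem: the string is being treated as a regex, which the accepted answer fixes with [regex]::Escape):
# Take at most 180 characters; shorter lines pass through unchanged
$maxLen = [Math]::Min(180, $line.Length)
$line = $line.Substring(0, $maxLen)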
EDIT
Thanks to @C.B. I got the correct answer, and it matches the output of the Bash script perfectly. Here is the full code that outputs the results to a text file:
$Target = Get-Content .\file1.csv
$file= [System.IO.File]::ReadAllText( "C:\temp\file2.csv" )
Foreach ($line in $Target){
$cnt = [regex]::matches( $file, [regex]::escape($line)).count >> ".\output.txt"
}
Give this a try:
$Target = Get-Content .\file1.csv
$file= [System.IO.File]::ReadAllText( "c:\test\file2.csv" )
Foreach ($line in $Target){
$line = $line.Substring(0,180)
$cnt = [regex]::matches( $file, [regex]::escape($line)).count
}
One issue with your script is that you read file2.csv over and over again, for each line from file1.csv. Reading the file just once and storing the content in a variable should significantly speed things up. Try this:
$f2 = Get-Content .\file2.csv
foreach ($line in (gc .\file1.csv)) {
$line = $line.Substring(0,180)
@($f2 | ? { $_ -match $line }).Count
}
I need to change the local administrator name and password on servers to those that are contained in a .csv
The CSV file contains a list with all the information, whereby the server, administrator name and password are different on each line.
The csv is headed by three columns - Server,Admin,PW
How could this be done using Powershell?
I know I can set them all the same using the following, but they need to be set as per each csv line.
foreach ($strComputer in get-content c:\Servers.txt)
{
$Admin=[adsi]("WinNT://" + $strComputer + "/Administrator, user")
$Admin.psbase.rename("Newname")
$Admin.SetPassword("NewPW")
}
Try this (not tested):
import-csv c:\servers.txt | % {
$Admin=[adsi]("WinNT://" + $($_.Server) + "/Administrator, user")
$Admin.psbase.rename($($_.Admin))
$Admin.SetPassword($($_.PW))
$Admin.SetInfo() # I think it's needed
}
You can use Import-Csv instead of Get-Content; then you can address the variables by using the header names.
Assuming you have a file like:
Server,Admin,PW
bla1,bla2,bla3
blaA,blaB,blaC
the output of
foreach ($line in Import-Csv c:\Servers.txt) { echo $line.server }
would be:
bla1
blaA
Just to complete your code, try this example:
foreach ($line in Import-Csv c:\Servers.txt)
{
$Admin=[adsi]("WinNT://" + $line.Server + "/Administrator, user")
$Admin.psbase.rename($line.Admin)
$Admin.SetPassword($line.PW)
}
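For what it's worth, on PowerShell 5.1 the LocalAccounts cmdlets offer an alternative to the ADSI/WinNT approach. A sketch only, untested - it assumes WinRM remoting is enabled and that the built-in account is still named Administrator on every server:
Import-Csv c:\Servers.txt | ForEach-Object {
    Invoke-Command -ComputerName $_.Server -ArgumentList $_.Admin, $_.PW -ScriptBlock {
        param($newName, $newPW)
        # Rename the built-in account, then set its password
        Rename-LocalUser -Name 'Administrator' -NewName $newName
        Set-LocalUser -Name $newName -Password (ConvertTo-SecureString $newPW -AsPlainText -Force)
    }
}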
How do you create a Unix-format file in PowerShell? I am using the following to create a file, but it always creates it in the Windows format.
"hello world" | out-file -filepath test.txt -append
As I understand it, the CRLF newline characters make it a Windows-format file, whereas the Unix format needs only LF at the end of each line. I tried replacing the CRLF with the following, but it didn't work:
"hello world" | %{ $_.Replace("`r`n","`n") } | out-file -filepath test.txt -append
(This can't work: the string itself contains no CRLF to replace; it is Out-File that appends the Windows line ending as it writes each line.)
There is a cmdlet in the PowerShell Community Extensions called ConvertTo-UnixLineEnding.
One ugly-looking answer (taking input from dos.txt and outputting to unix.txt) is:
[string]::Join( "`n", (gc dos.txt)) | sc unix.txt
but I would really like to be able to make Set-Content do this by itself, and this solution does not stream and therefore does not work well on large files...
This solution will also end the file with a DOS line ending, so it is not 100%.
I've found that this solution, posted above:
sc unix.txt ([byte[]][char[]] "$contenttext") -Encoding Byte
fails on encoding conversions in some cases.
So here is yet another solution (a bit more verbose, but it works directly with bytes):
function ConvertTo-LinuxLineEndings($path) {
$oldBytes = [io.file]::ReadAllBytes($path)
if (!$oldBytes.Length) {
return;
}
[byte[]]$newBytes = @()
[byte[]]::Resize([ref]$newBytes, $oldBytes.Length)
$newLength = 0
for ($i = 0; $i -lt $oldBytes.Length - 1; $i++) {
if (($oldBytes[$i] -eq [byte][char]"`r") -and ($oldBytes[$i + 1] -eq [byte][char]"`n")) {
continue;
}
$newBytes[$newLength++] = $oldBytes[$i]
}
$newBytes[$newLength++] = $oldBytes[$oldBytes.Length - 1]
[byte[]]::Resize([ref]$newBytes, $newLength)
[io.file]::WriteAllBytes($path, $newBytes)
}
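Usage, for example:
ConvertTo-LinuxLineEndings -path 'C:\test.txt'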
Make your file in the Windows CRLF format, then convert all lines to Unix format in a new file:
$streamWriter = New-Object System.IO.StreamWriter("\\wsl.localhost\Ubuntu\home\user1\.bashrc2")
$streamWriter.NewLine = "`n"
gc "\\wsl.localhost\Ubuntu\home\user1\.bashrc" | % {$streamWriter.WriteLine($_)}
$streamWriter.Flush()
$streamWriter.Close()
Not a one-liner, but it works for all lines, including EOF. The new file now shows as Unix format in Notepad on Win11.
Delete the original file and rename the new file to the original name, if you like:
ri "\\wsl.localhost\Ubuntu\home\user1\.bashrc" -Force
rni "\\wsl.localhost\Ubuntu\home\user1\.bashrc2" "\\wsl.localhost\Ubuntu\home\user1\.bashrc"
Two more examples of how you can replace CRLF with LF:
Example:
(Get-Content -Raw test.txt) -replace "`r`n","`n" | Set-Content test.txt -NoNewline
Example:
[IO.File]::WriteAllText('C:\test.txt', ([IO.File]::ReadAllText('C:\test.txt') -replace "`r`n","`n"))
Be aware that this really just replaces CRLF with LF. You might need to add a trailing LF if your Windows file does not end with a trailing CRLF.
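For instance, a minimal sketch that also guarantees the trailing LF (assuming the file fits in memory):
$text = [IO.File]::ReadAllText('C:\test.txt') -replace "`r`n", "`n"
# Ensure the file ends with exactly one trailing LF
if (-not $text.EndsWith("`n")) { $text += "`n" }
[IO.File]::WriteAllText('C:\test.txt', $text)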