Awk command for PowerShell

Is there any command like awk in powershell?
I want to execute this command:
awk '
BEGIN {count=1}
/^Text/{text=$0}
/^Time/{time=$0}
/^Rerayzs/{retext=$0}
{
if (NR % 3 == 0) {
printf("%s\n%s\n%s\n", text, time, retext) > (count ".txt")
count++
}
}' file
to a powershell command.

Usually we like to see what you have tried; it at least shows that you're making an effort and that we aren't just doing your work for you. I think you're new to PowerShell, so I'm going to spoon-feed you an answer, hoping that you use it to learn and expand your knowledge, and have better questions in the future.
I am pretty sure that this will accomplish the same thing as what you laid out. You have to give it an array of input (the contents of a text file, an array of strings, something like that), and it will generate several files depending on how many matches it finds for the trio "Text", "Time", and "Rerayzs". It orders them as Text, then a new line with Time, and then a new line with Rerayzs.
$Text,$Time,$Retext = $Null
$FileCounter = 1
gc c:\temp\test.txt | %{
    Switch($_){
        {$_ -match "^Text"}    {$Text = $_}
        {$_ -match "^Time"}    {$Time = $_}
        {$_ -match "^Rerayzs"} {$Retext = $_}
    }
    If($Text -and $Time -and $Retext){
        ("{0}`n{1}`n{2}") -f $Text,$Time,$Retext > "c:\temp\$FileCounter.txt"
        $FileCounter++
        $Text,$Time,$Retext = $Null
    }
}
That will get the text of a file C:\Temp\Test.txt and will output numbered files to the same location. The file I tested against is:
Text is good.
Rerayzs initiated.
Stuff to not include
Time is 18:36:12
Time is 20:21:22
Text is completed.
Rerayzs failed.
I was left with 2 files as output. The first reads:
Text is good.
Time is 18:36:12
Rerayzs initiated.
The second reads:
Text is completed.
Time is 20:21:22
Rerayzs failed.
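For comparison, the same collect-and-flush logic can be sketched in Python (a hypothetical group_records helper; printing stands in for writing the numbered files):

```python
def group_records(lines):
    """Collect the latest line starting with Text/Time/Rerayzs and
    emit a (text, time, rerayzs) group once all three have been seen."""
    text = time = retext = None
    groups = []
    for line in lines:
        if line.startswith("Text"):
            text = line
        elif line.startswith("Time"):
            time = line
        elif line.startswith("Rerayzs"):
            retext = line
        if text and time and retext:
            groups.append((text, time, retext))
            text = time = retext = None  # reset for the next group
    return groups

# The test file from the answer above
sample = [
    "Text is good.",
    "Rerayzs initiated.",
    "Stuff to not include",
    "Time is 18:36:12",
    "Time is 20:21:22",
    "Text is completed.",
    "Rerayzs failed.",
]
for i, group in enumerate(group_records(sample), start=1):
    print(f"--- {i}.txt ---")
    print("\n".join(group))
```

It produces the same two groups as the PowerShell run described above.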

Related

Powershell IF conditional isn't firing in the way I expected. Unsure what I'm doing wrong

I am writing a simple script that makes use of 7zip's command-line to extract archives within folders and then delete the original archives.
There is a part of my script that isn't behaving how I would expect it to. I can't get my if statement to trigger correctly. Here's a snippet of the code:
if($CurrentRar.Contains(".part1.rar")){
    [void] $RarGroup.Add($CurrentRar)
    # Value of CurrentRar:
    # Factory_Selection_2.part1.rar
    $CurrentRarBase = $CurrentRar.TrimEnd(".part1.rar")
    # Value: Factory_Selection_2
    for ($j = 1; $j -lt $AllRarfiles.Count; $j++){
        $NextRar = $AllRarfiles[$j].Name
        # Value: Factory_Selection_2.part2.rar
        if($NextRar.Contains("$CurrentRarBase.part$j.rar")){
            Write-Host "Test Hit" -ForegroundColor Green
            # Never fires, and I have no idea why
            # [void] $RarGroup.Add($NextRar)
        }
    }
    $RarGroups.Add($RarGroup)
}
if($NextRar.Contains("$CurrentRarBase.part$j.rar")) is the line that I can't get to fire.
If I shorten it to if($NextRar.Contains("$CurrentRarBase.part")), it fires true. But as soon as I add the inline $j it always triggers false. I've tried casting $j to string but it still doesn't work. Am I missing something stupid?
Appreciate any help.
The issue is your for statement combined with the fact that arrays / lists are zero-indexed (meaning they start at 0).
In your case, index 0 of $AllRarfiles is probably part1, but your for statement starts at 1, and the file name at index 1 contains part2, not part1. So the string you build, $NextRar.Contains("$CurrentRarBase.part$j.rar"), is always one part number behind the file at index $j ($j + 1).
As a table comparison:

Index / $j   Value                           Built string for comparison (with index)
0            Factory_Selection_2.part1.rar   Factory_Selection_2.part0.rar
1            Factory_Selection_2.part2.rar   Factory_Selection_2.part1.rar
2            Factory_Selection_2.part3.rar   Factory_Selection_2.part2.rar
3            Factory_Selection_2.part4.rar   Factory_Selection_2.part3.rar
Another, simpler approach
Since it seems you want to group split RAR files that belong together, you could also use a simpler approach with Group-Object:
# collect and group all RAR files.
$rarGroups = Get-ChildItem -LiteralPath 'C:\somewhere\' -Filter '*.rar' | Group-Object -Property { $_.Name -replace '\.part\d+\.rar$' }
# do some stuff afterwards
foreach($rarGroup in $rarGroups){
    Write-Verbose -Verbose "Processing RAR group: $($rarGroup.Name)"
    foreach($rarFile in $rarGroup.Group) {
        Write-Verbose -Verbose "`tCurrent RAR file: $($rarFile.Name)"
        # do some stuff per file
    }
}
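For what it's worth, the same strip-the-suffix-and-group idea can be sketched in Python (hypothetical group_rars helper and made-up file names; same regex as the Group-Object approach):

```python
import re
from collections import defaultdict

def group_rars(filenames):
    """Group multi-part RAR names by their base name, i.e. the name
    with the trailing '.partN.rar' removed."""
    groups = defaultdict(list)
    for name in filenames:
        base = re.sub(r"\.part\d+\.rar$", "", name)
        groups[base].append(name)
    return dict(groups)

files = [
    "Factory_Selection_2.part1.rar",
    "Factory_Selection_2.part2.rar",
    "Other_Pack.part1.rar",
]
print(group_rars(files))
```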

Replacing numbered file names with different number without conflicting with other files in folder

I have hundreds of images that were automatically saved on different occasions using different schemes, and I'm trying to make them consistent. I feel like this should be really simple in PowerShell, but I'm not familiar with it.
Currently, I have a bunch of .txt and .tif files named U6Co_AsFab_(####)_raw for indexes from 0118 to 0141 as follows:
U6Co_AsFab_(0118)_raw.tif
U6Co_AsFab_(0118)_raw.txt
U6Co_AsFab_(0119)_raw.tif
U6Co_AsFab_(0119)_raw.txt
*
*
*
U6Co_AsFab_(0141)_raw.tif
U6Co_AsFab_(0141)_raw.txt
and I need them to be shifted to range from 0139 to 0162 as follows:
U6Co_AsFab_(0139)_raw.tif
U6Co_AsFab_(0139)_raw.txt
U6Co_AsFab_(0140)_raw.tif
U6Co_AsFab_(0140)_raw.txt
*
*
*
U6Co_AsFab_(0162)_raw.tif
U6Co_AsFab_(0162)_raw.txt
In the mix are U6Co_AsFab_(####) (without raw) that are indexed correctly.
I ran the following in Powershell:
Get-ChildItem U6Co_AsFab_* -Include *raw* |
Where { $_ -match 'U6Co_AsFab_(\d+)' } |
Foreach { $num = [int]$matches[1] + 21; Rename-Item $_ (("U6Co_AsFab_{0:0000}" + $_.Extension) -f $num) }
And the filenames conflicted with ones that had yet to be renamed, giving this error:
Rename-Item : Cannot create a file when that file already exists.
I was thinking I could add an If-Else argument that would only rename the file if it didn't already exist in the directory. Then it seems like I could re-execute that code until all the files were re-numbered.
What is the best way to do this? I have a similar issue in other folders as well.
The problem is basically that you increase each index one by one, but the files with the higher indexes already exist. You need to reverse the order.
Try this:
$files = @(Get-ChildItem U6Co_AsFab_* -Include *raw*)
[Array]::Reverse($files)
# your remaining code should work fine
$files | where { $_ -match 'U6Co_AsFab_(\d+)' } |
    foreach {
        $num = [int]$matches[1] + 21
        Rename-Item $_ (("U6Co_AsFab_{0:0000}" + $_.Extension) -f $num)
    }
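The collision-avoidance idea, renaming from the highest index down so a target name never matches a not-yet-renamed file, can be illustrated in Python (hypothetical plan_renames helper; names and the +21 shift taken from the question):

```python
import re

def plan_renames(names, shift=21):
    """Return (old, new) rename pairs, highest index first, so no new
    name collides with a file that has not been renamed yet."""
    pat = re.compile(r"U6Co_AsFab_\((\d+)\)_raw(\.\w+)$")
    pairs = []
    for name in sorted(names, reverse=True):  # zero-padded, so a text sort works
        m = pat.fullmatch(name)
        if m:
            new_num = int(m.group(1)) + shift
            pairs.append((name, f"U6Co_AsFab_({new_num:04d})_raw{m.group(2)}"))
    return pairs

print(plan_renames(["U6Co_AsFab_(0118)_raw.tif", "U6Co_AsFab_(0119)_raw.tif"]))
```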

Powershell: Count instances of strings in a file using a list

I am trying to count, efficiently, the number of times each string in "file1" (strings vary from 40 to 400+ characters) occurs in "file2". file1 has about 2k lines and file2 about 130k lines. I currently have a Unix solution that does it in about 2 minutes in a VM and about 5 in Cygwin, but I am trying to do it with PowerShell/Python since the files are on Windows, I use the output in Excel, and I drive it with automation (AutoIt).
I have a solution, but it takes WAY too long (in about the same time that Cygwin finished all 2k lines, PowerShell had only done 40-50 lines!).
Although I haven't prepared a solution yet, I am open to using Python as well if there is a solution that can be fast and accurate.
Here is the Unix Code:
while read SEARCH_STRING; do
    printf "%s$" "${SEARCH_STRING}";
    grep -Fc "${SEARCH_STRING}" file2.csv;
done < file1.csv | tee -a output.txt;
And here is the PowerShell code I currently have:
$Target = Get-Content .\file1.csv
Foreach ($line in $Target){
    # Just to keep strings small, since I found that not all
    # strings were being compared correctly if they were 250+ chars
    $line = $line.Substring(0,180)
    $Coll = Get-Content .\file2.csv | Select-String -Pattern "$line"
    $cnt = $Coll | measure
    $cnt.count
}
Any ideas or suggestions will help.
Thanks.
EDIT
I'm trying a modified solution suggested by C.B.:
del .\output.txt
$Target = Get-Content .\file1.csv
$file = [System.IO.File]::ReadAllText( "C:\temp\file2.csv" )
Foreach ($line in $Target){
    $line = [string]$line.Substring(0, $line.Length/2)
    $cnt = [regex]::Matches( [string]$file, $line ).Count >> ".\output.txt"
}
But since my strings in file1 vary in length, I kept getting OutOfBounds exceptions from the Substring call, so I halved (/2) the input string to try to get a match. And when I halve them, if the string has an open parenthesis, it tells me this:
Exception calling "Matches" with "2" argument(s): "parsing "CVE-2013-0796,04/02/2013,MFSA2013-35 SeaMonkey: WebGL
crash with Mesa graphics driver on Linux (C" - Not enough )'s."
At C:\temp\script_test.ps1:6 char:5
+ $cnt = [regex]::matches( [string]$file, $line).count >> ".\output.txt ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : ArgumentException
I don't know if there is a way to raise the input limit in PowerShell (my biggest string at the moment is 406 characters, but it could be bigger in the future), or whether I should just give up and try a Python solution.
Thoughts?
EDIT
Thanks to @C.B. I got the correct answer, and it matches the output of the bash script perfectly. Here is the full code, which writes the results to a text file:
$Target = Get-Content .\file1.csv
$file = [System.IO.File]::ReadAllText( "C:\temp\file2.csv" )
Foreach ($line in $Target){
    $cnt = [regex]::Matches( $file, [regex]::Escape($line) ).Count >> ".\output.txt"
}
Give this a try:
$Target = Get-Content .\file1.csv
$file = [System.IO.File]::ReadAllText( "c:\test\file2.csv" )
Foreach ($line in $Target){
    $line = $line.Substring(0,180)
    $cnt = [regex]::Matches( $file, [regex]::Escape($line) ).Count
}
One issue with your script is that you read file2.csv over and over again, once for each line from file1.csv. Reading the file just once and storing its content in a variable should significantly speed things up. Try this:
$f2 = Get-Content .\file2.csv
foreach ($line in (gc .\file1.csv)) {
    $line = $line.Substring(0,180)
    @($f2 | ? { $_ -match $line }).Count
}

Searching Multiple Strings in Huge log files

Powershell question
Currently I have 5-10 log files, each about 20-25GB, and I need to search through each of them to check whether any of 900 different search parameters match. I have written a basic PowerShell script that searches the whole log file for one search parameter; if it matches, it dumps the results into a separate text file. The problem is that it is pretty slow. I was wondering if there is a way to speed this up, either by searching for all 900 parameters at once and only reading the log once, or some other way. Any help would be good, even if it's just improving the script.
Basic overview:
1 csv file with all the 900 items listed under an "item" column
1 log file (.txt)
1 result file (.txt)
1 ps1 file
Here is the code I have below for PowerShell in a PS1 file:
$search = "filepath to csv file"
$log = "filepath to log file"
$result = "filepath to result text file"
$list = Import-Csv $search
foreach ($address in $list) {
    Get-Content $log | Select-String $address.item | Add-Content $result
    # below is just a rudimentary counter of how far through the search it is
    $i = $i + 1
    echo $i
}
900 search terms is quite a large group. Can you reduce its size by using regular expressions? A simple solution reads the file row by row and looks for matches. Set up a collection of regexes or literal strings for the search terms, like so:
$terms = @("Keyword[12]", "KeywordA", "KeyphraseOne")   # Array of regexps
$src = "path-to-some-huge-file"                         # Path to the file
$reader = New-Object IO.StreamReader($src)              # Stream reader to the file
while(($line = $reader.ReadLine()) -ne $null){          # Read one row at a time
    foreach($t in $terms) {                             # For each search term...
        if($line -match $t) {                           # check if the line read is a match...
            $("Hit: {0} ({1})" -f $line, $t)            # and print the match
        }
    }
}
$reader.Close()                                         # Close the reader
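The "search for all 900 parameters at once, reading the log only once" idea amounts to OR-ing the (escaped) terms into one compiled pattern and scanning each line a single time. A minimal Python sketch with made-up data:

```python
import re

def scan(lines, terms):
    """Single pass over the log: one combined alternation regex with all
    terms OR-ed together, instead of one full scan per term."""
    combined = re.compile("|".join(re.escape(t) for t in terms))
    return [line for line in lines if combined.search(line)]

log = ["alpha event", "nothing here", "beta warning", "gamma ok"]
hits = scan(log, ["alpha", "beta"])
print(hits)  # ['alpha event', 'beta warning']
```

With literal terms, escaping keeps any special characters in the 900-item list from being interpreted as regex syntax.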
Surely this is going to be incredibly painful for any parser just based on the file sizes you have there, but if your log files are in a standard format (for example IIS log files), you could consider using a log-parsing app such as Log Parser Studio instead of PowerShell.

UNIX format files with Powershell

How do you create a unix file format in Powershell? I am using the following to create a file, but it always creates it in the windows format.
"hello world" | out-file -filepath test.txt -append
As I understand it, the CRLF newline characters make it a Windows-format file, whereas the Unix format needs only an LF at the end of each line. I tried replacing the CRLF with the following, but it didn't work:
"hello world" | %{ $_.Replace("`r`n","`n") } | out-file -filepath test.txt -append
There is a Cmdlet in the PowerShell Community Extensions called ConvertTo-UnixLineEnding
One ugly-looking answer (taking input from dos.txt and outputting to unix.txt) is:
[string]::Join( "`n", (gc dos.txt)) | sc unix.txt
but I would really like Set-Content to be able to do this by itself, and this solution does not stream, so it does not work well on large files.
This solution will also end the file with a DOS line ending, so it is not 100%.
I've found that the solution
sc unix.txt ([byte[]][char[]] "$contenttext") -Encoding Byte
posted above fails on encoding conversions in some cases.
So here is yet another solution (a bit more verbose, but it works directly with bytes):
function ConvertTo-LinuxLineEndings($path) {
    $oldBytes = [io.file]::ReadAllBytes($path)
    if (!$oldBytes.Length) {
        return
    }
    [byte[]]$newBytes = @()
    [Array]::Resize([ref]$newBytes, $oldBytes.Length)
    $newLength = 0
    for ($i = 0; $i -lt $oldBytes.Length - 1; $i++) {
        if (($oldBytes[$i] -eq [byte][char]"`r") -and ($oldBytes[$i + 1] -eq [byte][char]"`n")) {
            continue
        }
        $newBytes[$newLength++] = $oldBytes[$i]
    }
    $newBytes[$newLength++] = $oldBytes[$oldBytes.Length - 1]
    [Array]::Resize([ref]$newBytes, $newLength)
    [io.file]::WriteAllBytes($path, $newBytes)
}
Make your file in the Windows CRLF format, then convert all lines to Unix format in a new file:
$streamWriter = New-Object System.IO.StreamWriter("\\wsl.localhost\Ubuntu\home\user1\.bashrc2")
$streamWriter.NewLine = "`n"
gc "\\wsl.localhost\Ubuntu\home\user1\.bashrc" | % {$streamWriter.WriteLine($_)}
$streamWriter.Flush()
$streamWriter.Close()
Not a one-liner, but it works for all lines, including EOF. The new file now shows as Unix format in Notepad on Windows 11.
Delete the original file and rename the new file to the original, if you like:
ri "\\wsl.localhost\Ubuntu\home\user1\.bashrc" -Force
rni "\\wsl.localhost\Ubuntu\home\user1\.bashrc2" "\\wsl.localhost\Ubuntu\home\user1\.bashrc"
Two more examples of how you can replace CRLF with LF:
Example:
(Get-Content -Raw test.txt) -replace "`r`n","`n" | Set-Content test.txt -NoNewline
Example:
[IO.File]::WriteAllText('C:\test.txt', ([IO.File]::ReadAllText('C:\test.txt') -replace "`r`n","`n"))
Be aware that this really does just replace CRLF with LF. You might need to add a trailing LF if your Windows file does not end with a trailing CRLF.
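For comparison, the byte-level replacement that both examples perform can be sketched in Python:

```python
def to_unix(data: bytes) -> bytes:
    """Replace CRLF byte pairs with LF, leaving lone LFs (and lone CRs) untouched."""
    return data.replace(b"\r\n", b"\n")

print(to_unix(b"hello\r\nworld\r\n"))  # b'hello\nworld\n'
```

Working on bytes sidesteps the encoding-conversion issues mentioned above, since the file content is never decoded.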