In PowerShell, how can I use lines in a CSV to search a text file and return multiple lines to another CSV?

Example of what I am trying to do.
I have CSV_A.CSV, which contains a list of keywords (each on a new line) such as apple, orange, pear. Note that each keyword occurs in TEXT_FILE exactly once.
I have a text file, TEXT_FILE.TXT, with thousands of lines. I need a script that searches TEXT_FILE for apple, then orange, then pear, and returns each matching line as well as the next 5 lines.
So the end result would be a file that contains 18 lines: 6 for each of the 3 keywords (the matching line plus the 5 that follow).
I have tried the following code, and it gives me the first line for each keyword but nothing more.
# path
$path = 'C:\Users\Documents\4_Testing\TEXT_FILE.TXT'
Import-Csv .\CSV_A.csv | ForEach-Object {
Get-ChildItem -Path $path | Select-String -Pattern "$($_.KeywordColumn)\(" -Context 0, 5 |
Select-Object Line |
add-content -path 'C:\Users\Documents\4_Testing\Output.csv'
}

Change this line:
Select-Object Line |
to:
ForEach-Object { @($_.Line; $_.Context.PostContext) } |
This way, each match will produce an array of 6 strings - the matching line, and the five following.
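Putting the change together, a minimal sketch of the whole pipeline might look like the following. It assumes the CSV header is KeywordColumn, as in the question's code. The temp-folder paths and the sample lines are made up for illustration, and [regex]::Escape is an added precaution (not part of the original answer) in case a keyword contains regex metacharacters.

```powershell
# Hypothetical sample data in a temp folder so the sketch is self-contained.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'context-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$textFile = Join-Path $dir 'TEXT_FILE.TXT'
$csvFile  = Join-Path $dir 'CSV_A.csv'
$outFile  = Join-Path $dir 'Output.csv'

# 'KeywordColumn' is the header name assumed by the question's code.
'KeywordColumn', 'apple', 'pear' | Set-Content $csvFile
'header', 'apple(1)', 'a1', 'a2', 'a3', 'a4', 'a5', 'filler',
'pear(2)', 'p1', 'p2', 'p3', 'p4', 'p5', 'tail' | Set-Content $textFile

$result = Import-Csv $csvFile | ForEach-Object {
    # The pattern is built from the CSV row before the inner pipeline starts.
    $pattern = "$([regex]::Escape($_.KeywordColumn))\("
    Select-String -Path $textFile -Pattern $pattern -Context 0, 5 |
        ForEach-Object { @($_.Line; $_.Context.PostContext) }  # match + next 5 lines
}

$result | Set-Content $outFile
```

With the two sample keywords, $result holds 12 strings: 6 per keyword.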

Related

PowerShell rename files

I have a database full of .pdf and .dwf files.
I need to rename these.
The files are named as follows:
123456 text text.pdf
And should look like this:
123456000_text_text.pdf
I can replace the spaces with the following command:
dir | rename-item -NewName {$_.name -replace " ","_"}
Now I need a command to insert "0" three times after the first 6 digits.
Can someone help me?
Thanks already
You need to restrict the rename to *.pdf and *.dwf files whose names start with 6 digits followed by a space character. Then you can use regex replacements like this:
Get-ChildItem -Path D:\Test -File | Where-Object { $_.Name -match '^\d{6} .*\.(dwf|pdf)$' } |
Rename-Item -NewName { $_.Name -replace '^(\d{6}) ', '${1}000_' -replace '\s+', '_'}
Before:
D:\TEST
123456 text text.dwf
123456 text text.pdf
123456 text text.txt
After:
D:\TEST
123456 text text.txt
123456000_text_text.dwf
123456000_text_text.pdf
Regex details of filename -match:
^ Assert position at the beginning of the string
\d Match a single digit 0..9
{6} Exactly 6 times
\ Match the character “ ” literally
. Match any single character that is not a line break character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\. Match the character “.” literally
( Match the regular expression below and capture its match into backreference number 1
Match either the regular expression below (attempting the next alternative only if this one fails)
dwf Match the characters “dwf” literally
| Or match regular expression number 2 below (the entire group fails if this one fails to match)
pdf Match the characters “pdf” literally
)
$ Assert position at the end of the string (or before the line break at the end of the string, if any)
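To try the filter and the two replacements without touching any files, the same expressions can be applied to plain strings. This is only a sketch: the sample names are the ones from the Before listing, and $names and $renamed are illustrative variable names.

```powershell
# Sample names taken from the "Before" listing above.
$names = '123456 text text.dwf', '123456 text text.pdf', '123456 text text.txt'

$renamed = $names |
    Where-Object { $_ -match '^\d{6} .*\.(dwf|pdf)$' } |          # keep only .dwf/.pdf names
    ForEach-Object {
        # Append 000 and an underscore after the first 6 digits,
        # then turn every remaining run of whitespace into an underscore.
        $_ -replace '^(\d{6}) ', '${1}000_' -replace '\s+', '_'
    }

$renamed
```

When running the real command, adding -WhatIf to Rename-Item previews the renames without performing them.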
What you have is 123456 text text.pdf
Want it to look like 123456000_text_text.pdf
A systematic way to achieve this would be:
$const = "123456 text text.pdf"
$filename = $const -replace " ","_"
$temp = $filename.split("_")[0]
$rep1 = ([string]$temp).PadRight(9,'0')
$output = $filename -replace $temp,$rep1
Write-Host $output -ForegroundColor Green
The great thing about this method is that it always pads the leading number with trailing 0s to a length of 9 digits.

Unwanted space in substring using powershell

I'm fairly new to PS. I'm extracting fields from multiple XML files ($ABB). The $net variable is based on a pattern search and returns a non-static substring on line 2. Here's what I have so far:
$ABB = If ($aa -eq $null ) {"nothing to see here"} else {
$count = 0
$files = @($aa)
foreach ($f in $files)
{
$count += 1
$mo=(Get-Content -Path $f )[8].Substring(51,2)
(Get-Content -Path $f | Select-string -Pattern $lf -Context 0,1) | ForEach-Object {
$net = $_.Context.PostContext
$enet = $net -split "<comm:FieldValue>(\d*)</comm:FieldValue>"
$enet = $enet.trim()}
Write-Host "$mo-nti-$lf-$enet" "`r`n"
}}
The output looks like this: 03-nti-260- 8409.
Note the space before the 8409, which corresponds to the $net variable. I haven't been able to solve this on my own, and my approach could be all wrong. I'm open to any and all suggestions. Thanks for your help.
Because the first line of $net (after $net = $_.Context.PostContext) begins with text matched by the split pattern, -split emits an empty string as the first element of its output. Then, when you stringify the output, the split items are joined by a single space.
You need to select lines that aren't empty:
$enet = $net -split "<comm:FieldValue>(\d*)</comm:FieldValue>" -ne ''
Explanation:
Text matched by the -split pattern but not enclosed in () is removed from the output, and the remaining string is split into multiple elements at each match. When a match falls at the start or end of the string, an empty string is output at that position. Take care to remove those elements if they are not required. Trim() does not help here because it only strips whitespace from each string; it never removes an empty string from the array.
Adding -ne '' to the end of the command removes the empty elements. It is just an inline Boolean condition that, when applied to an array, outputs only the elements for which the condition is true.
You can see an example of the blank line condition below:
123 -split 1          # two elements: an empty string, then 23

23
123 -split 1 -ne ''   # the empty string is filtered out
23
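The same effect can be reproduced with a line shaped like the one in the question. This is a small sketch: the sample value of $net is made up, and $raw is an illustrative name for the unfiltered result.

```powershell
# Hypothetical sample line, shaped like the PostContext line in the question.
$net = '<comm:FieldValue>8409</comm:FieldValue>'

# Without the filter: an empty string before and after the captured digits.
$raw  = $net -split '<comm:FieldValue>(\d*)</comm:FieldValue>'

# With the filter: only the non-empty element survives.
$enet = $net -split '<comm:FieldValue>(\d*)</comm:FieldValue>' -ne ''

"raw count: $($raw.Count)"   # 3 elements: '', '8409', ''
"[$raw]"                     # elements joined by spaces: [ 8409 ]
"[$enet]"                    # [8409]
```

The stray spaces come from string interpolation joining array elements with a space, so removing the empty elements removes the spaces as well.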
Just use -replace to get rid of any whitespace.
For example:
'03-nti-260- 8409' -replace '\s'
<#
# Results
03-nti-260-8409
#>

Trim the blank line and then pick the nth array item

So I have PowerShell code in which I am trying to Get-Content a file, skip the first blank line, and then split the content to get the nth item in the array.
The issue is that it's giving me the nth item of the second line of the file, while I need the nth item of the first line.
Here's my code.
$Ess_keys = "D:\Automation\Encryption\myKeys.txt"
Get-Content $Ess_keys | ? {$_.trim() -ne "" } |ForEach-Object{
$splitUp = $_ -split "\s+"
$PKey = $splitUp[5]}
$Pkey
Here's what the file looks like:
>
Public Key for Encryption: 27743,2195638463
Private Key for Decryption: 2073750047,2195638463
When I run it, this is the output it's giving:
PS C:\Users\wrtty> $pkey
2073750047,2195638463
As you can see, it's picking the 5th array item from the second line, while I need it from the first line.
I also checked whether it's dropping the first non-blank line. But when I run the two commands below, I can see that it's not dropping the first non-blank line.
PS C:\Users\wrtty> Get-Content $Ess_keys | ? {$_.trim() -ne "" }
Public Key for Encryption: 27743,2195638463
Private Key for Decryption: 2073750047,2195638463
PS C:\Users\wrtty> Get-Content $Ess_keys | where {$_ -ne ""}
output
Public Key for Encryption: 27743,2195638463
Private Key for Decryption: 2073750047,2195638463
Any suggestions?
In your attempt, you are overwriting $PKey with each loop iteration. Then you are only outputting $PKey at the end. So you only get the last matched line.
Since it appears you already know the data format within the file, you can use a simple Select-String pattern match to get the data you want.
$pkeys = Select-String -Path "D:\Automation\Encryption\myKeys.txt" -Pattern "Public Key for Encryption: (\S+)" -AllMatches |
Foreach-Object {
$_.Matches.Groups[1].Value
}
$pkeys
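The above code stores ALL public key matched data in $pkeys. If several lines match and you only want the first match, then $pkeys[0] will suffice. If only one line matches, $pkeys is a plain string, so $pkeys[0] would return just its first character; use $pkeys itself in that case. The regex (\S+) matches consecutive non-whitespace characters.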
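One way to make indexing behave the same whether one line or many lines match is to wrap the pipeline in @( ), which always yields an array. The sketch below is a variation on the answer's code, not the answer itself: it writes a made-up copy of the key file to a temp path, and the file name myKeys-demo.txt is hypothetical.

```powershell
# Hypothetical copy of the key file, written to a temp path for the demo.
$file = Join-Path ([System.IO.Path]::GetTempPath()) 'myKeys-demo.txt'
'',
'Public Key for Encryption: 27743,2195638463',
'Private Key for Decryption: 2073750047,2195638463' | Set-Content $file

# @( ) forces an array even when exactly one line matches,
# so [0] returns the whole first value rather than its first character.
$pkeys = @(
    Select-String -Path $file -Pattern 'Public Key for Encryption: (\S+)' |
        ForEach-Object { $_.Matches[0].Groups[1].Value }
)

$pkeys[0]
```

Piping to Select-Object -First 1 instead of indexing is another option that stops at the first match.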
Thanks for your comment AdminOfThings
Below solution worked for me.
Get-Content $Ess_keys | where {$_ -ne ""} |Select-Object -First 1| ForEach-Object{
$splitUp = $_ -split "\s+"
$PKey = $splitUp[5]}

Split pattern output by spaces in Powershell

I need to extract the third column of a string returned after matching a Pattern.
It also needs to be a one-liner
File contains data like this:
f5834eab44ff bfd0bc8498d8 1557718920
dc8087c38a0d a72e89879030 1557691221
e6d7aaf6d76b caf6cd0ef68c 1557543565
Right now it matches the pattern and returns the line.
But I cannot get it to Split on the spaces so I can get the 3rd column (index 2).
select-string -Path $hashlistfile -Pattern 'dc8087c38a0d') | $_.Split(" ") | $_[2]
Output should be:
1557691221
You can grab the Line property from the output object produced by Select-String, split that and then index directly into the result of String.Split():
Select-String -Path $hashlistfile -Pattern dc8087c38a0d |ForEach-Object {
$_.Line.Split(" ")[2]
}
You can only use '$_' inside cmdlets that have a script block option '{ }'. Select-string returns MatchInfo objects.
(select-string dc8087c38a0d $hashlistfile).gettype()
IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     False    MatchInfo                                System.Object
The -split operator seems to deal with it more easily. There's an extra parenthesis ')' after the pattern in your example.
select-string dc8087c38a0d $hashlistfile | foreach { -split $_ | select -index 2 }
1557691221
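Since the question asks for a one-liner, the Line property and the -split operator can also be combined in a single expression. A minimal sketch, assuming the pattern matches exactly one line; the temp file name hashlist-demo.txt is made up, and the data is the sample from the question.

```powershell
# Sample data from the question, written to a hypothetical temp file.
$hashlistfile = Join-Path ([System.IO.Path]::GetTempPath()) 'hashlist-demo.txt'
'f5834eab44ff bfd0bc8498d8 1557718920',
'dc8087c38a0d a72e89879030 1557691221',
'e6d7aaf6d76b caf6cd0ef68c 1557543565' | Set-Content $hashlistfile

# One-liner: take the matching line, split on runs of whitespace, pick index 2.
$third = ((Select-String -Path $hashlistfile -Pattern 'dc8087c38a0d').Line -split '\s+')[2]

$third
```

Splitting on '\s+' rather than a single space avoids empty elements when columns are separated by more than one space.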

Trying to sort tsv files by one column using powershell

I have several .tsv files that I need to merge and then sort by only one column. Unfortunately, that column contains whole numbers (no decimals) of varying length.
I used the following script:
$a=get-content -path\*filename*.tsv -encoding ASCII
$a|sort-object [int]column1|select-object -first ($a.count - $fileCount)|out-file -filepath -encoding ASCII
Unfortunately, the output file is still not sorted. Any suggestions?
P.S. It is actually sorted within each individual file, but since several files were merged into the variable $a, the combined output is not sorted.
Ah, probably better to import them using import-csv. Try this:
gci *filename*.tsv|foreach{$a+=import-csv $_.fullname -delimiter "`t"}
Now you have an array that you can sort by whatever field you want, and can use:
$a|convertto-csv -Delimiter "`t" -NoTypeInformation|select -Skip 1|Out-File output.tsv
Edited to remove header output.
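The sort step itself is not shown above. A minimal sketch of import, numeric sort, and export might look like this, assuming the column is named column1 as in the question. The temp folder, file names, and sample rows are made up, and unlike the command above this version keeps the header row in the output.

```powershell
# Hypothetical sample files in a temp folder so the sketch is self-contained.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'tsv-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
"column1`tname", "100`ta", "9`tb"   | Set-Content (Join-Path $dir 'filename1.tsv')
"column1`tname", "25`tc", "1000`td" | Set-Content (Join-Path $dir 'filename2.tsv')

# Merge: Import-Csv turns every row of every file into an object.
$rows = Get-ChildItem -Path $dir -Filter '*filename*.tsv' |
    ForEach-Object { Import-Csv -Path $_.FullName -Delimiter "`t" }

# Sort: cast to [int] so 9 sorts before 100 (a plain string sort would not).
$sorted = $rows | Sort-Object -Property { [int]$_.column1 }

# Export the merged, sorted rows as tab-separated text.
$sorted | Export-Csv -Path (Join-Path $dir 'merged.tsv') -Delimiter "`t" -NoTypeInformation

$sorted.column1 -join ','   # 9,25,100,1000
```

Import-Csv reads every field as a string, which is why the cast inside the Sort-Object script block is needed for numeric ordering.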
Not tested.
$filename = 'c:\somedir\somefilename.tsv'
$ht=@{}
filter Get-Record { if ($ht[$_] -ne $HeaderLine) {$ht[$_]} }
$HeaderLine = $null
$counter = 0
get-content -path\*filename*.tsv -encoding ASCII |
foreach {
if (-not $HeaderLine)
{ $HeaderLine = $_ }
$counter++
$ht["$($_.split("`t")[1])$counter"] = $_
}
$HeaderLine | set-content $filename #header
$ht.keys | Sort | Get-Record | add-content $filename
It should sort on whatever column you use from the $_.split("`t") array.
