What I want:
Open a file, find a match with a regex, increment the match, replace the match in the text, and save the file.
Is it possible to do this with PowerShell or the FINDSTR command?
"*original file*"
gc matchtest.txt
$match_pat = "^(Match\stext\s)(\d+)"
$newfile = #()
gc matchtest.txt |% {
if ($_ -match $match_pat) {
$incr = 1 + $matches[2]
$newfile += $_ -replace $match_pat,($matches[1] + $incr)
}
else {$newfile += $_}
}
$newfile | out-file matchtest.txt -force
"`n*new file*"
gc matchtest.txt
*original file*
Not match 121
Match text 127
Not match 123
*new file*
Not match 121
Match text 128
Not match 123
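A note on why $incr = 1 + $matches[2] works: $matches[2] is a string, and PowerShell coerces the right-hand operand to the type of the left-hand one. Reversing the operands would concatenate instead; a quick illustration of that coercion rule:
1 + "127"       # -> 128 (the string is coerced to [int])
"127" + 1       # -> "1271" (the int is coerced to [string] and concatenated)
[int]"127" + 1  # -> 128 (an explicit cast makes the intent unambiguous)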
I have a text file with content in this format:
One;Thomas;Newyork;2020-12-31 14:00:00;0
Two;David;London;2021-01-31 12:00:00;0
Three;James;Chicago;2021-01-20 15:00:00;0
Four;Edward;India;2020-12-25 15:00:00;0
According to their date and time, two of these entries are in the past and two are in the future. The last 0 in each string is a flag; for the past entries that flag needs to be changed to 1.
Assume all the entries have been read into an array. I tried this block of code, but it's not solving the problem:
for ($item = 0; $item -lt $entries.count; $item++)
{
    if ($entries.DateTime[$item] -lt (Get-Date -Format "yyyy-MM-dd HH:mm:ss"))
    {
        $cont = Get-Content $entries -ErrorAction Stop
        $string = $entries.number[$item] + ";" + $entries.name[$item] + ";" +
            $entries.city[$item] + ";" + $entries.DateTime[$item]
        $lineNum = $cont | Select-String $string
        $line = $lineNum.LineNumber + 1
        $cont[$line] = $string + ";1"
        Set-Content -path $entries
    }
}
I am getting errors with this concept.
The output should come out as:
One;Thomas;Newyork;2020-12-31 14:00:00;1 (Past Deployment with respect to current date)
Two;David;London;2021-01-31 12:00:00;0
Three;James;Chicago;2021-01-20 15:00:00;0
Four;Edward;India;2020-12-25 15:00:00;1 (Past Deployment with respect to current date)
This output needs to overwrite the file the content was extracted from, i.e. Entries.txt.
param(
    $exampleFileName = "d:\tmp\file.txt"
)
@"
One;Thomas;Newyork;2020-12-31 14:00:00;0
Two;David;London;2021-01-31 12:00:00;0
Three;James;Chicago;2021-01-20 15:00:00;0
Four;Edward;India;2020-12-25 15:00:00;0
"@ | Out-File $exampleFileName
Remove-Variable out -ErrorAction SilentlyContinue
Get-Content $exampleFileName | ForEach-Object {
    # past entries get their trailing flag flipped to 1; the ternary needs PowerShell 7+
    $out += ($_ -and [datetime]::Parse(($_ -split ";")[3]) -lt [datetime]::Now) ? $_.SubString(0, $_.Length - 1) + "1`r`n" : $_ + "`r`n"
}
Out-File -InputObject $out -FilePath $exampleFileName
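Note that the ternary operator (? :) used above requires PowerShell 7 or later. On Windows PowerShell 5.1 the loop body can be written with a plain if/else; a minimal equivalent sketch:
Get-Content $exampleFileName | ForEach-Object {
    if ($_ -and [datetime]::Parse(($_ -split ";")[3]) -lt [datetime]::Now) {
        $out += $_.Substring(0, $_.Length - 1) + "1`r`n"  # past entry: flip the flag to 1
    }
    else {
        $out += $_ + "`r`n"                               # future entry or blank line: keep as-is
    }
}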
I have a directory of .txt files that look like this:
[LINETYPE]S[STARTTIME]00:00:00
[LINETYPE]P[STARTTIME]00:00:00
[LINETYPE]B[STARTTIME]00:59:00
[LINETYPE]C[STARTTIME]00:59:00
[LINETYPE]C[STARTTIME]00:59:30
[LINETYPE]S[STARTTIME]01:00:00
[LINETYPE]P[STARTTIME]01:00:00
[LINETYPE]B[STARTTIME]01:59:00
[LINETYPE]C[STARTTIME]01:59:00
[LINETYPE]C[STARTTIME]01:59:30
[LINETYPE]S[STARTTIME]02:00:00
I'd like to remove all occurrences of [LINETYPE]S except the first, which happens to always be 00:00:00 and on the first line, and then re-save the file to a new location.
That is, [LINETYPE]S[STARTTIME]00:00:00 must always be present, but the other lines that start with [LINETYPE]S need to be removed.
This is what I came up with, which works except it removes all [LINETYPE]S lines, including the first. I can't seem to figure out how to do that part after Googling for a while, so I'm hoping someone can point me in the right direction. Thanks for your help!
Get-ChildItem "C:\Users\Me\Desktop\Samples" -Filter *.txt | ForEach-Object {
Get-Content $_.FullName | Where-Object {
$_ -notmatch "\[LINETYPE\]S"
} | Set-Content ('C:\Users\Me\Desktop\Samples\Final\' + $_.BaseName + '.txt')
}
I couldn't figure out how to do this via a pipeline [blush], so I went with a foreach loop and a compound test.
# fake reading in a text file
# in real life, use Get-Content
$InStuff = @'
[LINETYPE]S[STARTTIME]00:00:00
[LINETYPE]P[STARTTIME]00:00:00
[LINETYPE]B[STARTTIME]00:59:00
[LINETYPE]C[STARTTIME]00:59:00
[LINETYPE]C[STARTTIME]00:59:30
[LINETYPE]S[STARTTIME]01:00:00
[LINETYPE]P[STARTTIME]01:00:00
[LINETYPE]B[STARTTIME]01:59:00
[LINETYPE]C[STARTTIME]01:59:00
[LINETYPE]C[STARTTIME]01:59:30
[LINETYPE]S[STARTTIME]02:00:00
'@ -split [System.Environment]::NewLine
$KeepFirst = '[LINETYPE]S'
$FoundFirst = $False
$FilteredList = foreach ($IS_Item in $InStuff)
{
    if ($IS_Item.StartsWith($KeepFirst))
    {
        if (-not $FoundFirst)
        {
            $IS_Item
            $FoundFirst = $True
        }
    }
    else
    {
        $IS_Item
    }
}
$FilteredList
output ...
[LINETYPE]S[STARTTIME]00:00:00
[LINETYPE]P[STARTTIME]00:00:00
[LINETYPE]B[STARTTIME]00:59:00
[LINETYPE]C[STARTTIME]00:59:00
[LINETYPE]C[STARTTIME]00:59:30
[LINETYPE]P[STARTTIME]01:00:00
[LINETYPE]B[STARTTIME]01:59:00
[LINETYPE]C[STARTTIME]01:59:00
[LINETYPE]C[STARTTIME]01:59:30
At that point, you can send the new collection out to a file. [grin]
Try the following:
Get-ChildItem "C:\Users\Me\Desktop\Samples" -Filter *.txt |
Foreach-Object {
$count = 0
Get-Content $_.FullName |
Where-Object { $_ -notmatch '\[LINETYPE\]S' -or $count++ -eq 0 } |
Set-Content ('C:\Users\Me\Desktop\Samples\Final\' + $_.BaseName + '.txt')
}
The script block passed to Where-Object runs in the same scope as the caller, so variable $count can be directly updated.
The first line that does contain [LINETYPE]S is included, because $count is 0 at that point, after which $count is incremented ($count++); subsequent [LINETYPE]S lines are not included, because $count is then already greater than 0.
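The counter trick is easy to see in isolation; a minimal sketch on an inline array (hypothetical data, pattern shortened to ^S):
$count = 0
'S1', 'P', 'S2', 'S3', 'C' | Where-Object { $_ -notmatch '^S' -or $count++ -eq 0 }
# -> S1, P, C : only the first line starting with S survives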
This code returns unique and shared lines between two files. Unfortunately, it runs forever if the files have 1 million lines. Is there a faster way to do this (e.g., -eq, -match, wildcard, Compare-Object), or are containment operators the optimal approach?
$afile = Get-Content (Read-Host "Enter 'A' file")
$bfile = Get-Content (Read-Host "Enter 'B' file")
$afile |
? { $bfile -notcontains $_ } |
Set-Content lines_ONLY_in_A.txt
$bfile |
? { $afile -notcontains $_ } |
Set-Content lines_ONLY_in_B.txt
$afile |
? { $bfile -contains $_ } |
Set-Content lines_in_BOTH_A_and_B.txt
As mentioned in my answer to a previous question of yours, -contains is a slow operation, particularly with large arrays.
For exact matches you could use Compare-Object and discriminate the output by side indicator:
Compare-Object $afile $bfile -IncludeEqual | ForEach-Object {
    switch ($_.SideIndicator) {
        '<=' { $_.InputObject | Add-Content 'lines_ONLY_in_A.txt' }
        '=>' { $_.InputObject | Add-Content 'lines_ONLY_in_B.txt' }
        '==' { $_.InputObject | Add-Content 'lines_in_BOTH_A_and_B.txt' }
    }
}
If that's still too slow, try reading each file into a hashtable:
$afile = Get-Content (Read-Host "Enter 'A' file")
$ahash = @{}
$afile | ForEach-Object {
    $ahash[$_] = $true
}
and process the files like this:
$afile | Where-Object {
    -not $bhash.ContainsKey($_)
} | Set-Content 'lines_ONLY_in_A.txt'
If that still doesn't help, you need to identify the bottleneck (reading the files, comparing the data, doing multiple comparisons, ...) and proceed from there.
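One way to find the bottleneck is to time each stage separately with Measure-Command; a rough sketch, with A.txt and B.txt standing in for the real file names:
$tRead  = Measure-Command { $afile = Get-Content 'A.txt'; $bfile = Get-Content 'B.txt' }
$tBuild = Measure-Command { $bhash = @{}; foreach ($line in $bfile) { $bhash[$line] = $true } }
$tComp  = Measure-Command { $onlyA = $afile | Where-Object { -not $bhash.ContainsKey($_) } }
"read: {0:n0} ms; build: {1:n0} ms; compare: {2:n0} ms" -f $tRead.TotalMilliseconds, $tBuild.TotalMilliseconds, $tComp.TotalMilliseconds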
Try this:
$All = @()
$All += Get-Content "c:\temp\a.txt" | % { [pscustomobject]@{ Row = $_; File = "A" } }
$All += Get-Content "c:\temp\b.txt" | % { [pscustomobject]@{ Row = $_; File = "B" } }
$All | group row | % {
    $InA = $_.Group.File.Contains("A")
    $InB = $_.Group.File.Contains("B")
    if ($InA -and $InB)
    {
        $_.Group.Row | select -unique | Out-File c:\temp\lines_in_A_And_B.txt -Append
    }
    elseif ($InA)
    {
        $_.Group.Row | select -unique | Out-File c:\temp\lines_Only_A.txt -Append
    }
    else
    {
        $_.Group.Row | select -unique | Out-File c:\temp\lines_Only_B.txt -Append
    }
}
Full code for the best option (@ansgar-wiechers): A-only lines, B-only lines, and lines shared by A and B:
$afile = Get-Content (Read-Host "Enter 'A' file")
$ahash = @{}
$afile | ForEach-Object {
    $ahash[$_] = $true
}
$bfile = Get-Content (Read-Host "Enter 'B' file")
$bhash = @{}
$bfile | ForEach-Object {
    $bhash[$_] = $true
}
$afile | Where-Object {
    -not $bhash.ContainsKey($_)
} | Set-Content 'lines_ONLY_in_A.txt'
$bfile | Where-Object {
    -not $ahash.ContainsKey($_)
} | Set-Content 'lines_ONLY_in_B.txt'
$afile | Where-Object {
    $bhash.ContainsKey($_)
} | Set-Content 'lines_in_BOTH_A_and_B.txt'
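If even the hashtable version is not fast enough, .NET's HashSet[string] exposes the set operations directly; a sketch of that alternative (not from the original answers), reading the same two files:
$aset = [System.Collections.Generic.HashSet[string]]::new([string[]](Get-Content (Read-Host "Enter 'A' file")))
$bset = [System.Collections.Generic.HashSet[string]]::new([string[]](Get-Content (Read-Host "Enter 'B' file")))
$onlyA = [System.Collections.Generic.HashSet[string]]::new($aset); $onlyA.ExceptWith($bset)    # A minus B
$onlyB = [System.Collections.Generic.HashSet[string]]::new($bset); $onlyB.ExceptWith($aset)    # B minus A
$both  = [System.Collections.Generic.HashSet[string]]::new($aset); $both.IntersectWith($bset)  # A intersect B
$onlyA | Set-Content 'lines_ONLY_in_A.txt'
$onlyB | Set-Content 'lines_ONLY_in_B.txt'
$both  | Set-Content 'lines_in_BOTH_A_and_B.txt'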
Considering my suggestion to do a binary search, I have created a reusable Search-SortedArray function for this:
Description
The Search-SortedArray function (alias Search) performs a binary search for a string in a sorted array. If the string is found, the index of the found string in the array is returned; otherwise, $Null is returned.
Function Search-SortedArray ([String[]]$SortedArray, [String]$Find, [Switch]$CaseSensitive) {
    $l = 0; $r = $SortedArray.Count - 1
    While ($l -le $r) {
        $m = [int](($l + $r) / 2)                                     # midpoint
        Switch ([String]::Compare($Find, $SortedArray[$m], !$CaseSensitive)) {
            -1      {$r = $m - 1}                                     # continue in the lower half
            1       {$l = $m + 1}                                     # continue in the upper half
            Default {Return $m}                                       # found: return the index
        }
    }
}; Set-Alias Search Search-SortedArray
$afile |
? {(Search $bfile $_) -eq $Null} |
Set-Content lines_ONLY_in_A.txt
$bfile |
? {(Search $afile $_) -eq $Null} |
Set-Content lines_ONLY_in_B.txt
$afile |
? {(Search $bfile $_) -ne $Null} |
Set-Content lines_in_BOTH_A_and_B.txt
Note 1: Due to the overhead, a binary search will only give advantage with (very) large arrays.
Note 2: The array has to be sorted, otherwise the result will be unpredictable (see the sketch after these notes).
Note 3: The search doesn't account for duplicates; in case of duplicate values, just one index will be returned (which isn't a concern for this specific question).
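In practice that means sorting both inputs up front; hypothetical usage, reusing the Read-Host prompts from the question:
$afile = Get-Content (Read-Host "Enter 'A' file") | Sort-Object
$bfile = Get-Content (Read-Host "Enter 'B' file") | Sort-Object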
Added 2017-11-07, based on the comment from @Ansgar Wiechers:
Quick benchmark with 2 files with a couple thousand lines each (including duplicate lines): binary search: 2400ms; compare-object: 1850ms; hashtable lookup: 250ms
The idea is that a binary search pays off in the long run: the larger the arrays, the more it gains proportionally in performance.
Taking $afile | ? { $bfile -notcontains $_ } as an example, and assuming the performance measurements in the comment with "a couple thousand lines" meaning 3000 lines:
For a standard search, you will need an average of 1500 iterations through $bfile:*1
(3000 + 1) / 2 = 3001 / 2 = 1500
For a binary search, you will need an average of 6.27 iterations through $bfile:
(log2 3000 + 1) / 2 = (11.55 + 1) / 2 = 6.27
In both situations you do this 3000 times (once for each item in $afile).
This means that each single iteration takes:
For a standard search: 250ms / 1500 / 3000 = 56 nanoseconds
For a binary search: 2400ms / 6.27 / 3000 = 127482 nanoseconds
The break-even point will be at about:
56 * ((x + 1) / 2 * 3000) = 127482 * ((log2 x + 1) / 2 * 3000)
which is (according to my calculations) at about 40000 entries.
*1 presuming that a hashtable lookup doesn’t do a binary search itself as it is unaware that the array is sorted
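That break-even estimate can be checked numerically from the per-iteration costs above; a small sketch:
# per-item cost: linear = 56 ns * (x + 1) / 2, binary = 127482 ns * (log2(x) + 1) / 2
foreach ($x in 10000, 20000, 40000, 80000) {
    $linear = 56 * ($x + 1) / 2
    $binary = 127482 * ([math]::Log($x, 2) + 1) / 2
    "{0,6}: linear {1:n0} ns, binary {2:n0} ns per item" -f $x, $linear, $binary
}
# the linear cost overtakes the binary cost within this range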
Added 2017-11-07
Conclusion from the comments: hash tables appear to use a similar associative-array algorithm that can't be outperformed with low-level programming commands.
I have a txt file with this:
1230;
012;
45;
125
and I want to convert these to integers,
but it doesn't work: it just returns the last number.
Here is my code:
$numbertxt = get-content -Path C:\mysticpath\number.txt -Raw
$numbertxt.GetType()
write-host $numbertxt
foreach ($flags in $numbertxt)
{
    $integer = [int]$flags
}
echo $integer
Can somebody help me?
Sorry for my English.
$numbertxt = (get-content -Path C:\mysticpath\number.txt -Raw) -split ';'
$numbertxt.GetType()
write-host $numbertxt
foreach ($flags in $numbertxt)
{
    $integer = [int]$flags
    echo $integer
}
First, an integer can only be made of digits, so you will need to split the contents on ';'. This will make an array of strings that are numbers.
Also, putting the echo inside the foreach loop allows it to display each number as it's processed.
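If every field is known to be numeric, the cast can also be done in one step on the whole array; a compact variant (same file path as in the question, filtering out any element without a digit):
[int[]]((Get-Content -Path C:\mysticpath\number.txt -Raw) -split ';' -match '\d')
# -> 1230, 12, 45, 125 (note that "012" becomes 12)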
Try this method (it checks whether each value is convertible to an integer before printing it):
$res=0;
#verbose version
(Get-Content "c:\temp\test.txt") -split ';' | where {[int]::TryParse($_, [ref] $res)} | foreach {$res}
#short version
(gc "c:\temp\test.txt") -split ';' | ?{[int]::TryParse($_, [ref] $res)} | %{$res}
I'm trying to add leading zeros to a batch of file names before an underscore.
e.g.: going from 123_ABC.pdf to 000123_ABC.pdf
The goal is that before the underscore there should be 6 numbers, and I therefore need to add leading zeros.
I have done this before for cases where I needed to add leading zeros to file names that were purely numbers, which is the code below, but I'm not sure how to adapt it to the scenario above.
Get-ChildItem "[Folder Location]" | ForEach-Object {
$NewName = "{0:d6}$($_.Extension)" -f [int]$_.BaseName
Rename-Item $_.FullName $NewName
}
Any help would be really appreciated.
Thanks
Here's how you can get the new file name according to your specifications:
$input = "123_ABC.pdf","_ABC.pdf", "qksdcfg.pdf", "0140ABC.pdf", "014_0_ABC.pdf"
foreach($filename in $input) {
# split on first underscore
$parts = $filename -split "_",2
# if there are more than 1 parts (= there is an underscore in the filename)
if($parts.Count -gt 1) {
# add leading 0's and join with the file name remainder
"{0:d6}_{1}" -f [int]$parts[0], $parts[1]
} else {
$filename
}
}
Output is:
000123_ABC.pdf
000000_ABC.pdf
qksdcfg.pdf
0140ABC.pdf
000014_0_ABC.pdf
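The d6 in the format string is a standard .NET numeric format specifier: it renders an integer in decimal, padded with leading zeros to six digits. For example:
"{0:d6}" -f 123   # -> 000123
"{0:d6}" -f 14    # -> 000014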
Mixed with your code:
Get-ChildItem "[Folder Location]" | ForEach-Object {
$parts = $_.Name -split "_",2
if($parts.Count -gt 1) {
$NewName = "{0:d6}_{1}" -f [int]$parts[0], $parts[1]
} else {
$NewName = $_.Name
}
Rename-Item $_.FullName $NewName
}
Try something like this:
Get-ChildItem "c:\temp\*.pdf" -file -filter "*_*" | %{
$Array=$_.Name.Split('_')
$NewName="{0:d6}_{1}" -f [int]$Array[0], ($Array[1..($Array.Length -1)] -join '_')
Rename-Item $_.FullName -NewName $NewName
}