Given a runbook/process with 3 steps:
Steps 1 and 2 both write some JSON data to an output variable (the JSON serialization of an object):
Set-OctopusVariable -name "SharedData" -value ($sharedObject | ConvertTo-Json)
Steps 2 and 3 need to read and update the data from the previous steps:
$OctopusParameters["Octopus.Action[StepA].Output.SharedData"]
$OctopusParameters["Octopus.Action[StepB].Output.SharedData"]
Assume that any secondary step can use the shared data object from any previous step. It's just an object that is manipulated by multiple steps along the way.
If I choose to skip step 2, then step 3 won't see the step 2 output var value because the read instruction requires the name of the step (StepA or StepB).
Is there a way for the output var read syntax to just get the value from the previous step instead of an explicitly named step? E.g.:
$OctopusParameters["Octopus.Action[previous step alias].Output.SharedData"]
I already tried doing this using the $OctopusParameters dictionary directly.
In one step:
$OctopusParameters["SharedData"] = ($sharedObject | ConvertTo-Json)
Then this in a subsequent step:
$sharedObject = $OctopusParameters["SharedData"] | ConvertFrom-Json
But it doesn't work: the dictionary read returns null, because the raw dictionary assignment isn't persisted between steps. Sharing only works using the provided Set-OctopusVariable helper or the other prescribed methods, but those lock you into knowing the previous step's name.
Alternatively, is there a way to store data more "globally" for a process execution, for use later, without having to tie it specifically to the output of another step of the process?
The way I approached this problem is by using the $OctopusParameters dictionary to your advantage. As it's a dictionary, it has keys you can inspect. If you want to get the last variable with a given name, just iterate over the keys and take the last one.
For example, suppose you have a deployment process like this:
Step A has code like this:
$sharedObject = [PSCustomObject]@{
StepName = "Step A";
Value = "Value from Step A";
Message = "Step A says Hello!";
};
Set-OctopusVariable -name "SharedData" -value ($sharedObject | ConvertTo-Json)
Whilst Step B has code like this:
$sharedObject = [PSCustomObject]@{
StepName = "Step B";
Value = "Value from Step B";
Message = "Step B says Hello!";
};
Set-OctopusVariable -name "SharedData" -value ($sharedObject | ConvertTo-Json)
Finally, the last step checks for the existence of any output variable ending in SharedData and iterates over each one to print its value to the log.
It then selects the last one, which is the important part. That way, no matter which of Step A or Step B was skipped, it will always get the last step where the variable was set (you can obviously change this logic to suit your requirements):
$MatchingKeys = $OctopusParameters.Keys | Where-Object { $_ -match "^Octopus\.Action.*\.Output\.SharedData$" }
Write-Highlight "Found $($MatchingKeys.Count) matching output variables"
foreach($matchingKey in $matchingKeys) {
$OutputVariableValue = $OctopusParameters[$matchingKey]
Write-Host "$matchingKey value: $OutputVariableValue"
}
Write-Host "Finding last value..."
$lastKey = $matchingKeys | Select-Object -Last 1
Write-Highlight "Last Match: $($OctopusParameters[$lastKey])"
You can also turn the above into a one-liner:
$JsonSharedData = $($OctopusParameters.Keys | Where-Object { $_ -match "^Octopus\.Action.*\.Output\.SharedData$" } | Select-Object -Last 1 | ForEach-Object {$OctopusParameters[$_]})
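If you need the object itself rather than the raw JSON string, you can feed the result straight into ConvertFrom-Json (a small sketch; the StepName property assumes the example objects above):
if ($JsonSharedData) {
    $sharedObject = $JsonSharedData | ConvertFrom-Json
    Write-Highlight "Shared data came from: $($sharedObject.StepName)"
}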
You could wrap it in a conditional depending on whether or not StepB was skipped, e.g.
#{if Octopus.Action[StepB].IsSkipped}
$OctopusParameters["Octopus.Action[StepA].Output.SharedData"]
#{else}
$OctopusParameters["Octopus.Action[StepB].Output.SharedData"]
#{/if}
I'm trying to get my head around PowerShell and write a function as a cmdlet. I found the following code sample in one of the articles, but it doesn't seem to want to work as a cmdlet even though it has a [cmdletbinding()] declaration at the top of the file.
When I try to do something like
1,2,3,4,5 | .\measure-data
it returns an empty response (the function itself works just fine if I invoke it at the bottom of the file and run the file itself).
Here's the code that I am working with, any help will be appreciated :)
Function Measure-Data {
<#
.Synopsis
Calculate the median and range from a collection of numbers
.Description
This command takes a collection of numeric values and calculates the
median and range. The result is written as an object to the pipeline.
.Example
PS C:\> 1,4,7,2 | measure-data
Median Range
------ -----
3 6
.Example
PS C:\> dir c:\scripts\*.ps1 | select -expand Length | measure-data
Median Range
------ -----
1843 178435
#>
[cmdletbinding()]
Param (
[Parameter(Mandatory=$True,ValueFromPipeline=$True)]
[ValidateRange([int64]::MinValue,[int64]::MaxValue)]
[psobject]$InputObject
)
Begin {
#define an array to hold incoming data
Write-Verbose "Defining data array"
$Data=@()
} #close Begin
Process {
#add each incoming value to the $data array
Write-Verbose "Adding $inputobject"
$Data+=$InputObject
} #close process
End {
#take incoming data and sort it
Write-Verbose "Sorting data"
$sorted = $data | Sort-Object
#count how many elements in the array
$count = $data.Count
Write-Verbose "Counted $count elements"
#region calculate median
if ($sorted.count%2) {
<#
if the number of elements is odd, add one to the count
and divide by two to get the middle number. But arrays start
counting at 0, so subtract one
#>
Write-Verbose "processing odd number"
[int]$i = (($sorted.count+1)/2-1)
#get the corresponding element from the sorted array
$median = $sorted[$i]
}
else {
<#
if number of elements is even, find the average
of the two middle numbers
#>
Write-Verbose "processing even number"
$i = $sorted.count/2
#get the lower number
$x = $sorted[$i-1]
#get the upper number
$y = $sorted[-$i]
#average the two numbers to calculate the median
$median = ($x+$y)/2
} #else even
#endregion
#region calculate range
Write-Verbose "Calculating the range"
$range = $sorted[-1] - $sorted[0]
#endregion
#region write result
Write-Verbose "Median = $median"
Write-Verbose "Range = $range"
#define a hash table for the custom object
$hash = @{Median=$median;Range=$Range}
#write result object to pipeline
Write-Verbose "Writing result to the pipeline"
New-Object -TypeName PSobject -Property $hash
#endregion
} #close end
} #close measure-data
this is the article I took the code from:
https://mcpmag.com/articles/2013/10/15/blacksmith-part-4.aspx
edit: maybe I should add that versions of this code from previous parts of the article worked just fine, but after adding all the things that make it a proper cmdlet, like the help section and verbose lines, it just doesn't want to work, and I believe something is missing. I have a feeling this could be because it was written for PowerShell 3 and I am testing it on Windows 10 with PowerShell 5-point-something, but honestly I don't even know in which direction I should look; that's why I'm asking for help.
There is nothing wrong with the code (apart from possible optimizations), but the way you call it can't work:
1,2,3,4,5 | .\measure-data
When you call a script file that contains a named function, it is expected that "nothing happens". Actually, the script runs, but PowerShell does not know which function it should call (there could be multiple), so it just runs any code outside of the functions.
You have two options to fix the problem:
Option 1
Remove the function keyword and the curly braces that belong to it. Keep the [cmdletbinding()] and Param sections.
[cmdletbinding()]
Param (
[Parameter(Mandatory=$True,ValueFromPipeline=$True)]
[ValidateRange([int64]::MinValue,[int64]::MaxValue)]
[psobject]$InputObject
)
Begin {
# ... your code ...
} #close Begin
Process {
# ... your code ...
} #close process
End {
# ... your code ...
}
Now the script itself is the "function" and can be called as such:
1,2,3,4,5 | .\measure-data
Option 2
Turn the script into a module. Basically you just need to save it with a .psm1 extension (there is more to it, but for getting started it will suffice).
In the script where you want to use the function you have to import the module before you can use its functions. If the module is not installed, you can import it by specifying its full path.
# Import module from directory where current script is located
Import-Module $PSScriptRoot\measure-data.psm1
# Call a function of the module
1,2,3,4,5 | Measure-Data
A module is the way to go when there are multiple functions in a single script file. It is also more efficient when a function will be called multiple times, because PowerShell needs to parse it only once (it remembers Import-Module calls).
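For completeness, the .psm1 could also state explicitly what it exports (a small sketch; if there is no Export-ModuleMember call, all functions are exported by default):
# measure-data.psm1
Function Measure-Data {
    # ... the function body from the question goes here unchanged ...
}
# Optional: explicitly control what the module exposes
Export-ModuleMember -Function Measure-Data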
It works as-is; you just need to call it properly. Since the code is now a function, you cannot call it like before, when the code was directly in the file:
# method when code is directly in file with no Function Measure-Data {}
1,2,3,4,5 | .\measure-data
Now that you've defined the function, you instead need to dot-source the file so that it loads your function(s) into memory. Then you can call your function by its name (which happens to be the same as the filename, but doesn't have to be):
# Load the functions by dot-sourcing
. .\measure-data.ps1
# Use the function
1,2,3,4,5 | Measure-Data
You're not passing it an Object but an array of integers. If you change the parameter to:
Param (
[Parameter(Mandatory=$True,ValueFromPipeline=$True)]
[ValidateRange([int64]::MinValue,[int64]::MaxValue)]
[Int[]]$InputObject
)
Now things work:
PS> 1,2,3,4,5 | Measure-Data
Median Range
------ -----
3 4
I am intrigued by this question, How to sort 30Million csv records in Powershell, and came up with a solution which builds temporary files.
Now I am trying to come up with another approach, which comes down to first building a sorted index list ([int[]]) and then picking a bulk of those indices (e.g. 1e6) from the source file and dropping them onto the pipeline:
Function Sort-BigCsv {
[CmdletBinding()] param(
[string]$FilePath,
[String]$Property,
[Int]$BulkSize = 1e6,
[System.Text.Encoding]$Encoding = [System.Text.Encoding]::Default
)
Begin {
if ($FilePath.StartsWith('.\')) { $FilePath = Join-Path (Get-Location) $FilePath }
$Index = 0
$Dictionary = [System.Collections.Generic.SortedDictionary[string, int]]::new()
Import-Csv $FilePath -Encoding $Encoding | Foreach-Object { $Dictionary[$_.$Property] = $Index++ }
$IndexList = [int[]]($Dictionary.Values)
$Dictionary = $Null # we only need the sorted index list
}
Process {
$Start = 0
While ($Start -lt $IndexList.Count) {
[System.GC]::Collect()
$End = $Start + $BulkSize - 1
if ($End -ge $IndexList.Count) { $End = $IndexList.Count - 1 }
Import-Csv $FilePath -Encoding $Encoding |
Select-Object -Index $IndexList[$Start..$End] | # Note that the -Index parameter reorders the list
Sort-Object $Property # Consider a smarter sort as this has already been done before
$Start = $End + 1
}
}
}
Example:
Sort-BigCsv .\Input.Csv Id -BulkSize 100 # | Export-Csv .\Output.Csv
I think the general idea behind this should work, but I have second thoughts about what PowerShell is actually doing in terms of passing the objects on to the next cmdlet (or to the display), and questions arise like:
Will every single item (including multiple items created within one Process block cycle) always immediately be picked up and processed by the next cmdlet?
Will there be any difference for this function if I put everything in the Process block into the End block?
What if the next process block is slower than the current one?
Will it stall the current one?
Or will the items be buffered?
If they are buffered, can I force them to be taken by the next cmdlet, or wait till they are consumed?
Maybe it is just working as intended (it is hard to tell from e.g. the memory size in Task Manager), but I would like to confirm this...
Is there any check and/or control over whether an item is passed on (or is this simply always the case after a Write-Output? Meaning, if the last cmdlet stalls, the first cmdlet will also need to stall...)
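A minimal experiment along these lines could show whether items stream one at a time (hypothetical Test-Upstream/Test-Downstream functions, just for observation):
function Test-Upstream {
    Process {
        Write-Host "Upstream emitting $_"
        $_    # drop the item onto the pipeline
    }
}
function Test-Downstream {
    Process {
        Write-Host "Downstream received $_"
        Start-Sleep -Milliseconds 200    # simulate a slow consumer
    }
}
1..3 | Test-Upstream | Test-Downstream
# Interleaved emit/receive pairs would indicate item-by-item streaming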
The most popular answer to this question involves the following Windows PowerShell code (edited to fix a bug):
$file1 = Get-Content C:\temp\file1.txt
$file2 = Get-Content C:\temp\file2.txt
$Diff = Compare-Object $File1 $File2
$LeftSide = ($Diff | Where-Object {$_.SideIndicator -eq '<='}).InputObject
$LeftSide | Set-Content C:\temp\file3.txt
I always get a zero-byte file as the output, even if I remove the $Diff line.
Why is the output file always null, and how can it be fixed?
PetSerAl, as he routinely does, has provided the crucial pointer in a comment on the question:
Member-access enumeration - the ability to access a member (a property or a method) on a collection and have it implicitly applied to each of its elements, with the results collected in an array - was introduced in PSv3.[1]
Member-access enumeration is not only expressive and convenient, it is also faster than alternative approaches.
A simplified example:
PS> ((Get-Item /), (Get-Item $HOME)).Mode
d--hs- # The value of (Get-Item /).Mode
d----- # The value of (Get-Item $HOME).Mode
Applying .Mode to the collection that the (...)-enclosed command outputs causes the .Mode property to be accessed on each item in the collection, with the resulting values returned as an array (a regular PowerShell array, of type [System.Object[]]).
Caveats: Member-access enumeration handles the resulting array like the pipeline does, which means:
If the array has only a single element, that element's property value is returned directly, not inside a single-element array:
PS> @([pscustomobject] @{foo=1}).foo.GetType().Name
Int32 # 1 was returned as a scalar, not as a single-element array.
If the property values being collected are themselves arrays, a flat array of values is returned:
PS> @([pscustomobject] @{foo=1,2}, [pscustomobject] @{foo=3,4}).foo.Count
4 # a single, flat array was returned: 1, 2, 3, 4
Also, member-access enumeration only works for getting (reading) property values, not for setting (writing) them.
This asymmetry is by design, to avoid potentially unwanted bulk modification; in PSv4+, use .ForEach('<property-name>', <new-value>) as the quickest workaround (see below).
This convenient feature is NOT available, however:
if you're running on PSv2 (categorically)
if the collection itself has a member by the specified name, in which case the collection-level member is applied.
For instance, even in PSv3+ the following does NOT perform member-access enumeration:
PS> ('abc', 'cdefg').Length # Try to report the string lengths
2 # !! The *array's* .Length property value (item count) is reported, not the items'
In such cases - and in PSv2 in general - a different approach is needed:
Fastest alternative, using the foreach statement, assuming that the entire collection fits into memory as a whole (which is implied when using member-access enumeration).
PS> foreach ($s in 'abc', 'cdefg') { $s.Length }
3
5
PSv4+ alternative, using collection method .ForEach(), also operating on the collection as a whole:
PS> ('abc', 'cdefg').ForEach('Length')
3
5
Note: If applicable to the input collection, you can also set property values with .ForEach('<prop-name>', <new-value>), which is the fastest workaround to not being able to use .<prop-name> = <new-value>, i.e. the inability to set property values with member-access enumeration.
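For instance (a small sketch of the setter overload; the sample objects are made up):
$items = [pscustomobject]@{ Name = 'a'; Tag = $null },
         [pscustomobject]@{ Name = 'b'; Tag = $null }
# $items.Tag = 'done'            # fails: member-access enumeration cannot set
$items.ForEach('Tag', 'done')    # sets Tag on every element
$items.Tag                       # -> done, done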
Slowest, but memory-efficient approaches, using the pipeline:
Note: Use of the pipeline is only memory-efficient if you process the items one by one, in isolation, without collecting the results in memory as well.
Using the ForEach-Object cmdlet, as in Burt Harris' helpful answer:
PS> 'abc', 'cdefg' | ForEach-Object { $_.Length }
3
5
For properties only (as opposed to methods), Select-Object -ExpandProperty is an option; it is conceptually clear and simple, and virtually on par with the ForEach-Object approach in terms of performance (for a performance comparison, see the last section of this answer):
PS> 'abc', 'cdefg' | Select-Object -ExpandProperty Length
3
5
[1] Previously, the feature was semi-officially known as just member enumeration, introduced in this 2012 blog post along with the feature itself. A decision to formally introduce the term member-access enumeration was made in early 2022.
Perhaps instead of
$LeftSide = ($Diff | Where-Object {$_.SideIndicator -eq '<='}).InputObject
PowerShell 2 might work better with:
$LeftSide = $Diff | Where-Object {$_.SideIndicator -eq '<='} |
Foreach-object { $_.InputObject }
I have a function which inserts a Y or N menu when called within my PowerShell scripts. It uses a while loop to validate that either a Y or N value is entered. Everything works fine; however, a new line is created each time an error is made. I could use cls and redisplay everything, but that is not ideal. Instead, I would like to find a way to redisplay the Read-Host prompt on the same line while clearing any previously entered answer. Here is my existing code:
# Begin function to display yes or no menu
function ynmenu {
$global:ans = $null
Write-Host -ForegroundColor Cyan "`n Y. [Yes]"
Write-Host -ForegroundColor Cyan "N. [No]`n"
While ($ans -ne "y" -and $ans -ne "n"){
$global:ans = Read-Host "Please select Y or N"
}
}
# End function ynmenu
I have a few other dynamically populated menus which leverage this methodology. Finding a solution to this would resolve the issue with those as well.
I don't think there's any simple way to do that.
But for a yes/no response, you can use $PSCmdlet.ShouldContinue($Query, $Caption) instead, as long as the scope you're in (function, script, etc.) defines the attribute [CmdletBinding(SupportsShouldProcess=$true)]. This shows an appropriate yes/no prompt in ISE and in the console host and avoids manual processing.
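A minimal sketch of that approach (the function name and prompt text are just illustrative):
function Invoke-Something {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param()
    if ($PSCmdlet.ShouldContinue("Do you want to continue?", "Confirm action")) {
        Write-Host "User answered Yes"
    }
    else {
        Write-Host "User answered No"
    }
}
Invoke-Something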
I am a newbie in PowerShell, but this is driving me a bit crazy. I have looked at various questions here but could not find an answer, so here I go. Apologies if this has been covered already.
I have two text files containing columns of numbers. I would like to create an array containing those 2 columns and sort it by column 1 or 2.
If we had
$a=@(1,5,10,15,25)
$b=@(100,99,98,99,10)
we create
$c=$a,$b
My initial thought was to try something like this:
$c | sort { [int]$_[0] }
But it does not work. I have tried many different things so any advice would be appreciated.
I am editing this as my question was not so clear. Ultimately, if I sort $c by ascending column 2, I expect something like:
25,10
10,98
5,99
15,99
1,100
Any idea how to achieve this ?
I am not sure how you have declared your two-dimensional array, because it sounds like you want it declared like this, or something similar:
$c = @(@(1,100),@(5,99),@(10,98),@(15,99),@(25,10))
If it were in that state, then sorting is a breeze:
$c | Sort-Object @{Expression={$_[1]}; Ascending=$True} | %{
"$($_[0]),$($_[1])"
}
Sort-Object works well with one-dimensional arrays. When multiple properties are involved, you need to specify which property to sort on to get the expected output. Since there are no named properties here, we use a calculated expression based on the second "column".
Sample Output
25,10
10,98
5,99
15,99
1,100
If you really want to work with your arrays like that, we need an intermediate step to convert what you have into a form that can be sorted the way you expect.
$a=@(1,5,10,15,25)
$b=@(100,99,98,99,10)
$c = @()
for($i = 0;$i -lt $a.Count; $i++){
$c += ,@($a[$i],$b[$i])
}
After running this code, $c will work just like it does with my sorting.
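Putting the two pieces together gives a quick end-to-end check (reusing the sort from above):
$c | Sort-Object @{Expression={$_[1]}; Ascending=$True} | ForEach-Object {
    "$($_[0]),$($_[1])"
}
# 25,10
# 10,98
# 5,99
# 15,99
# 1,100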
Welcome to the PowerShell world. The syntax is slightly different from classical programming languages; cmdlets usually take their input from the current pipeline. In this case the command you are talking about is Sort-Object, and you can use it directly on the pipeline where you have the array content:
$c = ($a | Sort-Object), ($b | Sort-Object)