Powershell csv calculate total sum - windows

I am currently working with PowerShell. It is new to me, so I am finding this one hard to figure out.
I have three headers in my CSV files.
The headers are: Name, MessageCount and Direction.
The names are email addresses, and they are all the same. Direction is either "Inbound" or "Outbound". MessageCount holds a bunch of different numbers:
Overview
I want to add up those numbers so that I get "Inbound" and "Outbound" totals, along with the email address, for those rows.
I am trying to loop over MessageCount with foreach and add the values together, but it only gives me output like this:
MessageCount

Try something like this:
$data = Import-Csv "path-to-your-csv-file"
# End each line with the pipe so the statement also continues in Windows PowerShell
$data | group Name |
    select Name,
        @{n = "Inbound"; e = {
            (($_.Group | where Direction -eq "Inbound").MessageCount | Measure-Object -Sum).Sum }
        },
        @{n = "Outbound"; e = {
            (($_.Group | where Direction -eq "Outbound").MessageCount | Measure-Object -Sum).Sum }
        }
Code explanation
group Name groups the results by the Name property (in this case, the email address).
select picks properties from an object or creates custom ones with @{n="";e={}}.
($_.Group | where Direction -eq "Outbound").MessageCount takes the rows of the group, keeps those whose Direction equals Outbound and then reads MessageCount from the rows found.
Measure-Object -Sum takes the array and returns an object with properties such as the sum of its values, so we read the sum of MessageCount and return it as a custom property on the output object.
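To see the grouping on concrete data, here is a minimal sketch that builds the rows in memory with ConvertFrom-Csv instead of reading a file. The address and the counts are made up for illustration, and the variable names $csvText and $totals are not from the original answer.

```powershell
# Made-up sample data; a real script would use Import-Csv on a file instead.
$csvText = @"
Name,MessageCount,Direction
user@example.com,5,Inbound
user@example.com,7,Outbound
user@example.com,3,Inbound
"@
$data = $csvText | ConvertFrom-Csv

# One output object per email address, with a total for each direction.
$totals = $data | Group-Object Name | ForEach-Object {
    $group = $_.Group
    [pscustomobject]@{
        Name     = $_.Name
        Inbound  = ($group | Where-Object Direction -eq 'Inbound'  | Measure-Object -Property MessageCount -Sum).Sum
        Outbound = ($group | Where-Object Direction -eq 'Outbound' | Measure-Object -Property MessageCount -Sum).Sum
    }
}

$totals
```

With this sample the single output row should carry an Inbound total of 8 and an Outbound total of 7.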

Related

How to change multiple headers in a table using Powershell

I am trying to change multiple header names within my code that is pulling the Team Statistics table from this site
I am unsure where to manually change them in my code.
For example, I tried manually changing header 8, GF to GFPG in the line where I add the 'TEAM' header, but I get the error:
Exception calling "Add" with "2" argument(s): "Item has already been added. Key in dictionary: 'GF' Key being added: 'GF'"
At C:\NHLScraper.ps1:32 char:5
+ $objHash.Add($headers[$j],$rowdata[$j])
My code:
$url = "https://www.hockey-reference.com/leagues/NHL_2020.html"
#getting the data
$data = Invoke-WebRequest $url
#grab the third table
$table = $data.ParsedHtml.getElementsByTagName("table") | Select -skip 2 | Select -First 1
#get the rows of the Team Statistics table
$rows = $table.rows
#get table headers
$headers = $rows.item(1).children | select -ExpandProperty InnerText
#count the number of rows
$NumOfRows = $rows | Measure-Object
#Manually injecting TEAM header
$headers = @($headers[0];'TEAM';$headers[1..($headers.Length-1)])
#enumerate the remaining rows (we need to skip the header row) and create a custom object
$out = for ($i=2;$i -lt $NumofRows.Count;$i++) {
#define an empty hashtable
$objHash=[ordered]@{}
#getting the child rows
$rowdata = $rows.item($i).children | select -ExpandProperty InnerText
for ($j=0;$j -lt $headers.count;$j++) {
#add each row of data to the hash table using the correlated table header value
$objHash.Add($headers[$j],$rowdata[$j])
}
#turn the hashtable into a custom object
[pscustomobject]$objHash
}
$out | Select TEAM,AvAge,GP,W,L,OL,PTS,PTS%,GF,GA,SOW,SOL,SRS,SOS,TG/G,EVGF,EVGA,PP,PPO,PP%,PPA,PPOA,PK%,SH,SHA,PIM/G,oPIM/G,S,S%,SA,SV%,SO -SkipLast 1 | Export-Csv -Path "C:\$((Get-Date).ToString("'NHL Stats' yyyy-MM-dd")).csv" -NoTypeInformation
You can add a condition to check if the key has already been added and if so, update it or ignore it,
if (!$objHash.Contains($headers[$j])) {
    $objHash.Add($headers[$j], $rowdata[$j])
} else {
    $objHash[$headers[$j]] = $rowdata[$j] # Overwrite values
}
But after looking at your code a few times, this doesn't make sense:
$out = for ($i=2;$i -lt $NumofRows.Count;$i++) {
#define an empty hashtable
$objHash=[ordered]@{} # Overwritten each loop???
#getting the child rows
$rowdata = $rows.item($i).children | select -ExpandProperty InnerText
for ($j=0;$j -lt $headers.count;$j++) {
#add each row of data to the hash table using the correlated table header value
$objHash.Add($headers[$j],$rowdata[$j]) # Dictionary cannot have duplicate keys
}
#turn the hashtable into a custom object
[pscustomobject]$objHash # what do you do with this?
}
The loop runs once per row, and each iteration assigns a fresh hashtable to $objHash, so after the loop $objHash itself holds only the last row. (The [pscustomobject] emitted in each iteration is still collected by the $out = for ... assignment.)
Suggested Solution
You can use another variable to keep track of all the hashtables you are creating along with making sure duplicate keys are not inserted that would throw the exception.
# If you want to change the header value from GF to GFPG, you can do that in the place you have defined $headers
#get table headers
$headers = $rows.item(1).children | select -ExpandProperty InnerText
$headers = $headers | % { if ($_ -eq "GF") { "GFPG" } else { $_ }}
#count the number of rows
$NumOfRows = $rows | Measure-Object
#Manually injecting TEAM header
$headers = @($headers[0];'TEAM';$headers[1..($headers.Length-1)])
#enumerate the remaining rows (we need to skip the header row) and create a custom object
$allData = @{}
$out = for ($i=2;$i -lt $NumofRows.Count;$i++) {
#define an empty hashtable
$objHash=[ordered]@{}
#getting the child rows
$rowdata = $rows.item($i).children | select -ExpandProperty InnerText
for ($j=0;$j -lt $headers.count;$j++) {
#add each row of data to the hash table using the correlated table header value
$objHash[$headers[$j]] = $rowdata[$j]
}
#turn the hashtable into a custom object
[pscustomobject]$objHash
$allData.Add($i, $objHash)
}
I used $allData, with $i as the key, to store each of those results so that they can be accessed later.
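To rename several headers in one pass, one option is a lookup hashtable applied to the header list before the rows are built. This is only a sketch, not part of the original answer: the $renames map is hypothetical (only GF to GFPG comes from the question, the GA entry is purely illustrative), and $headers is a hard-coded stand-in for the scraped values.

```powershell
# Stand-in for the header row scraped from the page (assumed values).
$headers = @('Rk', 'AvAge', 'GP', 'W', 'L', 'GF', 'GA')

# Hypothetical old-name to new-name map; only GF -> GFPG comes from the question.
$renames = @{
    GF = 'GFPG'
    GA = 'GAPG'
}

# Replace each header that has an entry in the map and keep the rest unchanged.
$headers = $headers | ForEach-Object {
    if ($renames.ContainsKey($_)) { $renames[$_] } else { $_ }
}

$headers -join ','
```

Because the renaming happens before the inner loop runs, the keys added to $objHash are already the new names, and the final Select would then have to list GFPG rather than GF.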

Put results in a table and then sorted output

I am writing a script which produces two outputs within a foreach loop: a string $server and an integer $util (let's say I get 20 results).
What is the simplest approach to put my results in a table while the loop runs, so that I can output them sorted (descending) after the loop has finished?
SERVER UTIL
------ ----
SERVER001 95
SERVER002 74
SERVER003 32
SERVER004 12
If you want to sort the results in descending order, you will have to put the results in an array and then sort outside the loop, like so:
$arr = @()
foreach ($item in $collection)
{
    $arr += [pscustomobject]@{
        Server = $item.server
        util = $item.util
    }
}
$arr | Sort-Object -Property Util -Descending
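An alternative sketch that avoids +=, which copies the array on every iteration: assign the loop itself to a variable so that PowerShell collects the emitted objects, then sort afterwards. $collection here is made-up sample data standing in for whatever the real loop iterates over, and $results and $sorted are illustrative names.

```powershell
# Made-up input; in the real script these values come from the loop's own work.
$collection = @(
    [pscustomobject]@{ Server = 'SERVER003'; Util = 32 }
    [pscustomobject]@{ Server = 'SERVER001'; Util = 95 }
    [pscustomobject]@{ Server = 'SERVER004'; Util = 12 }
    [pscustomobject]@{ Server = 'SERVER002'; Util = 74 }
)

# Every object emitted inside the loop body is collected into $results.
$results = foreach ($item in $collection) {
    [pscustomobject]@{
        Server = $item.Server
        Util   = [int]$item.Util   # keep Util numeric so the sort is numeric
    }
}

# Sort once, after the loop has finished.
$sorted = $results | Sort-Object -Property Util -Descending
$sorted | Format-Table -AutoSize
```

Casting Util to [int] matters when the value arrives as text, since strings would be sorted alphabetically rather than numerically.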

Powershell Sorting a tuple not working

This is the most stripped down example of what I am trying and failing to do:
@(1, 3, 2) | Select-Object {New-Object "tuple[Int,Int]" $_, 1} | Sort-Object -Property Item1
It returns:
(2, 1)
(3, 1)
(1, 1)
I'm looking to return the list sorted by the first element of the tuple. Obviously in this little demo I could just do the sorting first, but in my actual use case I perform a calculation on the list during the Select-Object phase and then want to sort based on the results of that calculation.
I'm certainly not married to using tuples but I've tried and failed using other options that haven't had any success. I just need to take a list, add additional information to each element of the list based on the result of a calculation and then sort by that additional information.
Feeling really stupid right now given how long this is taking me to figure out so any help is appreciated.
If you're just adding a custom property to an object, try returning a new object with the additional property. To use your same example with @(1,3,2):
@(1, 3, 2) | % {
    [PSCustomObject] @{
        Item1 = $_
        Item2 = 1
    }
} | Sort-Object -Property Item1
You can sort the tuples themselves if you add them to a list first.
The sort order is always ascending, compared from the first item to the last:
$list = New-Object 'Collections.ArrayList'
foreach ($x in 1, 3, 2) {$null = $list.add([tuple]::create($x, 1))}
$list.Sort()
$list | select item1, item2
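For what it's worth, the pipeline in the question seems to fail because Select-Object treats the script block as a calculated property: each output object wraps the tuple in a property named after the script text and has no Item1 of its own to sort on. A minimal sketch, assuming that diagnosis, which emits the tuples with ForEach-Object instead ($sorted is an illustrative name):

```powershell
$sorted = @(1, 3, 2) |
    ForEach-Object { [tuple]::Create([int]$_, 1) } |   # emit the tuple itself
    Sort-Object -Property Item1                        # Item1 is now a real property

$sorted | Select-Object Item1, Item2
```

Here the calculation that would happen in the Select-Object step goes inside the ForEach-Object block, and the sort runs on its result.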

How to count rows in a csv and then loop through the contents

I have a csv that I want to check the count of rows and then loop through the contents. I'm using the code at the bottom to get the count which works but I'm not sure how I can loop through the csv and get the values in each column.
I've read that I can do it using the select-object cmdlet if I specify the column names however this code will work on a number of csv's all with different column names. How can I make this work?
$csv = Import-Csv -Path $requestFile | measure
if(($csv).count-1 -gt 1){
# do something
}
You don't need to pipe to Measure to get the row count. In fact, the variable you've stored in $csv is not the csv data but the output from Measure, so you should remove the pipe to Measure.
Here's an example:
PS C:\temp> $csv = Import-Csv .\test.csv
PS C:\temp> # Here you can perform your check on the size of the csv
PS C:\temp> $csv.Count
4
PS C:\temp> # ... and you can get all the data like this:
PS C:\temp> $csv
Year        : 1997
Make        : Ford
Model       : E350
Description : ac, abs, moon
Price       : 3000.00

Year        : 1999
Make        : Chevy
Model       : Venture "Extended Edition"
Description :
Price       : 4900.00

Year        : 1999
Make        : Chevy
Model       : Venture "Extended Edition, Very Large"
Description :
Price       : 5000.00

Year        : 1996
Make        : Jeep
Model       : Grand Cherokee
Description : MUST SELL!
              air, moon roof, loaded
Price       : 4799.00
My csv looks like this:
Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""",,5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
Import-Csv creates a list of objects. This list already has a Count property, so you don't need to measure it yourself:
$csv = Import-Csv -Path $requestFile
if ($csv.Count -gt 2) {
# do something
}
Not sure why you'd want to restrict the "do something" to CSVs with more than 2 rows, though.
If you also want to loop over the columns of each row you can do that with a nested loop as described in this answer:
$csv = Import-Csv -Path $requestFile
if ($csv.Count -gt 2) {
    $csv | ForEach-Object {
        foreach ($property in $_.PSObject.Properties) {
            # doSomething is a placeholder; arguments are separated by spaces, not commas
            doSomething $property.Name $property.Value
        }
    }
}
For further help you'd need to explain what you actually want to do with the columns.
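As a self-contained sketch of the nested loop, the snippet below parses a small in-memory CSV (made-up data, via ConvertFrom-Csv) and lists every column name and value without knowing the headers in advance. The names $text and $lines are illustrative, and @() wraps the parsed rows so that Count is reliable even when the input holds a single row.

```powershell
# Made-up CSV text; use Import-Csv -Path $requestFile for a real file.
$text = @"
Year,Make,Model
1997,Ford,E350
1999,Chevy,Venture
"@
$csv = @($text | ConvertFrom-Csv)

"Rows: $($csv.Count)"

$lines = foreach ($row in $csv) {
    foreach ($property in $row.PSObject.Properties) {
        # Name is the column header, Value is that row's cell
        '{0} = {1}' -f $property.Name, $property.Value
    }
}
$lines
```

Since the column names come from PSObject.Properties, the same loop should work unchanged on CSV files with different headers.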

Implementing tables in lua to access specific pieces for later use

I am trying to make a table that stores three parts, each of which will be very long. The first is the name, the second is the EID and the third is the SID. I want to be able to get the information like this: name[1] gives me the first name in the list of names, and likewise for the other two. I'm running into problems because everyone seems to have their own way of doing this, and the approaches are all very different from one another. Right now this is what I have:
info = {
{name = "btest", EID = "19867", SID = "664"},
{name = "btest1", EID = "19867", SID = "664"},
{name = "btest2", EID = "19867", SID = "664"},
{name = "btest3", EID = "19867", SID = "664"},
}
Theoretically speaking, would I be able to just say info.name[1]? If not, how else could I arrange the table so that I can access each part separately?
There are two main "ways" of storing the data:
Horizontal partitioning (Object-oriented)
Store each row of the data in a table. All tables must have the same fields.
Advantages: Each table contains related data, so it's easier passing it around (e.g, f(info[5])).
Disadvantages: A table is to be created for each element, adding some overhead.
This looks exactly like your example:
info = {
{name = "btest", EID = "19867", SID = "664"},
-- etc ...
}
print(info[2].name) -- access second name
Vertical partitioning (Array-oriented)
Store each property in a table. All tables must have the same length.
Advantages: Fewer tables overall, and slightly more time and space efficient (Lua VM uses actual arrays).
Disadvantages: Needs two objects to refer to a row: the table and the index. It's harder to insert/delete.
Your example would look like this:
info = {
names = { "btest", "btest1", "btest2", "btest3", },
EID = { "19867", "19867", "19867", "19867", },
SID = { "664", "664", "664", "664", },
}
print(info.names[2]) -- access second name
So which one should I choose?
Unless you really need the performance, you should go with horizontal partitioning. Working over full rows is far more common, and it gives you more freedom in how you use your structures. If you decide to go full OO, having your data in horizontal form will be much easier.
Addendum
The names "horizontal" and "vertical" come from the table representation of a relational database.
  | names | EID | SID |     | names |
--+-------+-----+-----+     +-------+
1 |       |     |     |     |       |     --+-------+-----+-----+
2 |       |     |     |     |       |     2 |       |     |     |
3 |       |     |     |     |       |     --+-------+-----+-----+
Your info table is an array, so you can access items using info[N] where N is any number from 1 to the number of items in the table. Each field of the info table is itself a table. The 2nd item of info is info[2], so the name field of that item is info[2].name.

Resources