C# lambda expression for getting the group by and count - linq

I have C# data. The normal data looks like below:
ID Count Client MessageType
1 100 MegaTech Missing SO
2 100 MegaTech Not shipped
3 100 NIXIMIXI No PDF
4 100 MegaTech Missing SO
5 100 NIXIMIXI Not shipped
6 100 MegaTech No PDF
7 100 NIXIMIXI Other
8 100 MegaTech Other
9 100 MegaTech No PDF
10 100 NIXIMIXI Missing SO
11 100 MegaTech Not shipped
12 100 NIXIMIXI No PDF
13 100 MegaTech Missing SO
14 100 NIXIMIXI Not shipped
15 100 NIXIMIXI No PDF
16 100 MegaTech other
17 100 NIXIMIXI other
18 100 NIXIMIXI other
So I want the output list to look like below. Note that the MessageType "Other" may contain empty strings or null values.
ID Count Client MessageType
1 100 MegaTech Missing SO
2 100 MegaTech Missing SO
3 100 MegaTech No PDF
4 100 MegaTech No PDF
5 100 MegaTech No PDF
6 100 MegaTech Not shipped
7 100 MegaTech Not shipped
8 100 MegaTech Other
9 100 MegaTech Other
10 Total For 9 Total Items 900
11 100 NIXIMIXI Missing SO
12 100 NIXIMIXI Missing SO
13 100 NIXIMIXI No PDF
14 100 NIXIMIXI No PDF
15 100 NIXIMIXI Not shipped
16 100 NIXIMIXI Not shipped
17 100 NIXIMIXI other
18 100 NIXIMIXI other
19 100 NIXIMIXI other
20 Total For 9 Total Items 900
I want to use a LINQ or lambda expression for this.
I have started like this, but do not know how to produce the count column:
listEOMRep = listEOMRep.OrderBy(x => x.client).ThenBy(x => x.MessageType).ToList();

This is not exactly what you asked for, but it gets the job done, I guess:
void Main()
{
    var source = new List<CustomObject>()
    {
        new CustomObject() { Id = 1, Count = 100, Client = "MegaTech", MessageType = "Not Shipped" },
        new CustomObject() { Id = 2, Count = 250, Client = "NIXIMIXI", MessageType = "Not Shipped" },
        new CustomObject() { Id = 3, Count = 45,  Client = "NIXIMIXI", MessageType = "No PDF" },
        new CustomObject() { Id = 4, Count = 10,  Client = "MegaTech", MessageType = "Not Shipped" },
        new CustomObject() { Id = 5, Count = 100, Client = "MegaTech", MessageType = "No PDF" },
        new CustomObject() { Id = 6, Count = 100, Client = "NIXIMIXI", MessageType = "Not Shipped" },
        new CustomObject() { Id = 7, Count = 100, Client = "NIXIMIXI", MessageType = "Not Shipped" },
        new CustomObject() { Id = 8, Count = 140, Client = "MegaTech", MessageType = "other" }
    };

    var result = source.OrderBy(x => x.Client)
        .ThenBy(x => x.MessageType)
        .GroupBy(s => s.Client)
        .Select(s => new {
            Key = s.Key,
            Object = s,
            TotalFor = s.Count(),
            TotalItems = s.Sum(x => x.Count)
        })
        .Dump();
}

public class CustomObject
{
    public int Id { get; set; }
    public int Count { get; set; }
    public string Client { get; set; }
    public string MessageType { get; set; }
}
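The grouped result above still isn't the flat list with subtotal rows that the question shows. A minimal sketch of that last step, reusing the source list and CustomObject from above and rendering each row as a plain string (an assumption, since the question's output row type isn't shown):

// Sketch: emit one display line per item, then a subtotal line per client.
var lines = new List<string>();
int id = 1;
foreach (var clientGroup in source.GroupBy(s => s.Client).OrderBy(g => g.Key))
{
    foreach (var item in clientGroup.OrderBy(x => x.MessageType))
        lines.Add($"{id++} {item.Count} {item.Client} {item.MessageType}");

    // Subtotal row, mirroring the "Total For ... Total Items ..." line
    // from the desired output.
    lines.Add($"{id++} Total For {clientGroup.Count()} Total Items {clientGroup.Sum(x => x.Count)}");
}
lines.ForEach(Console.WriteLine);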

Filtering data in Linq c#

Val param Status
1   100   1
2   100   1
3   100   1
4   100   1
5   100   1
3   200   0
5   200   0
I want a LINQ filter in C# that filters like this:
Val param Status
1   100   1
2   100   1
4   100   1
1) I want to eliminate rows with Status zero ('0').
2) I want to eliminate all rows with the same Val column value if one of them has Status 0.
Help will be appreciated. Thanks in advance.
Try this:
void Main()
{
    var data = new List<Item>()
    {
        new Item(1, 100, 1),
        new Item(2, 100, 1),
        new Item(3, 100, 1),
        new Item(4, 100, 1),
        new Item(5, 100, 1),
        new Item(3, 200, 0),
        new Item(5, 200, 0)
    };

    IEnumerable<Item> dataWithNonZeroStatus = data.Where(d => d.Status != 0);
    int[] itemsToSkip = data.Except(dataWithNonZeroStatus).Select(v => v.Val).ToArray();
    var result = dataWithNonZeroStatus.Where(item => !itemsToSkip.Contains(item.Val));
    result.Dump();
}

class Item
{
    public int Val { get; set; }
    public int param { get; set; }
    public int Status { get; set; }

    public Item(int val, int par, int status)
    {
        Val = val;
        param = par;
        Status = status;
    }
}
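For what it's worth, the same filtering can be done in a single pass with GroupBy. This is just a sketch against the same Item class:

// Sketch: group rows by Val and keep only groups in which every row has
// a non-zero Status; rows sharing a Val with a zero-status row are
// dropped along with it.
var filtered = data.GroupBy(d => d.Val)
                   .Where(g => g.All(i => i.Status != 0))
                   .SelectMany(g => g)
                   .ToList();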

D3 Filter Issue

I am trying to filter my data list using D3. What I am trying to do is filter my data based on a date I specify and a threshold value for precipitation.
Here is my code:
$(function() {
    $("#datepicker").datepicker();
    $("#datepicker").on("change", function() {
        //var currentDate = $( "#datepicker" ).datepicker( "getDate" )/1000;
        //console.log(currentDate)
    });
});

function GenerateReport() {
    d3.csv("/DataTest.csv", function(data) {
        var startdate = $("#datepicker").datepicker("getDate") / 1000;
        var enddate = startdate + 24 * 60 * 60;
        var data_Date = d3.values(data.filter(function(d) {
            return d["Date"] >= startdate && d["Date"] <= enddate;
        }));
        var x = document.getElementById("threshold").value;
        console.log(data_Date);
        var data_Date_Threshold = data_Date.filter(function(d) {
            return d.Precipitation > x;
        });
        // ...
    });
}
My data set looks like
ID Date Prcip Flow Stage
1010 1522281000 0 0 0
1010 1522281600 0 0 0
1010 1522285200 10 0 0
1010 1522303200 12 200 1.2
1010 1522364400 6 300 2
1010 1522371600 4 400 2.5
1010 1522364400 6 500 2.8
1010 1522371600 4 600 3.5
2120 1522281000 0 0 0
2120 1522281600 0 0 0
2120 1522285200 10 100 1
2120 1522303200 12 1000 2
2120 1522364400 6 2000 3
2120 1522371600 4 2500 3.2
2290 1522281000 0 0 0
2290 1522281600 4 0 0
2290 1522285200 5 200 1
2290 1522303200 10 800 1.5
2290 1522364400 6 1500 3
2290 1522371600 0 1000 2
6440 1522281000 0 0 0
6440 1522281600 4 0 0
6440 1522285200 5 200 0.5
6440 1522303200 10 800 1
6440 1522364400 6 1500 2
6440 1522371600 0 100 1.4
When I use the filter function, I have some problems.
What I have found is that when I use x = 2 to filter on the precipitation value, it does not catch Precipitation = 10 or 12. However, when I use x = 1, it works fine. My guess is that it only looks at the first character (e.g., with x = 2 it treats Precipitation = 10 or 12 as less than 2, since it only sees the leading 1 in 10 and 12). Has anyone had the same issue? Can anyone help me solve this problem?
Thanks.
You are comparing strings. This comparison is therefore done lexicographically.
In order to accomplish what you want, you need to first convert these strings to numbers:
var x = Number(document.getElementById("threshold").value)
var data_Date_Threshold = data_Date.filter(function(d) {return Number(d.Precipitation) > x});
Alternatively, floats:
var x = parseFloat(document.getElementById("threshold").value)
var data_Date_Threshold = data_Date.filter(function(d) {return parseFloat(d.Precipitation) > x});
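As a side note, you can avoid the per-filter conversions by coercing the numeric columns once, when the CSV loads. A minimal sketch, assuming D3's optional row-conversion argument to d3.csv (v3/v4 request API) and the column names from the sample data (the question uses both Prcip and Precipitation, so adjust to whatever your header actually says):

// Sketch: convert numeric columns up front so later filters compare numbers.
d3.csv("/DataTest.csv", function(d) {
    return {
        ID: d.ID,
        Date: +d.Date,           // unary + coerces the string to a number
        Precipitation: +d.Prcip, // exposed under the name the filter uses
        Flow: +d.Flow,
        Stage: +d.Stage
    };
}, function(data) {
    // data now holds numbers; d.Precipitation > x is a numeric comparison
    // as long as x itself is converted too, as shown above.
});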

LinQ Group By and Order By

The list contains:
ID Counter
1 34
5 34
3 55
2 45
4 33
3 123
1 4
5 12
5 133
2 33
I want to group by ID, pick the largest Counter out of every group, and put it into a new list of the same type.
This is the final version of the list:
ID Counter
1 34
2 45
3 123
4 33
5 133
Use Group like this:
var groupedData = from item in list
                  orderby item.id
                  group item by item.id into idGroup
                  select new { Id = idGroup.Key, MaxCounter = idGroup.Max(i => i.counter) };
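If you need the result as a new list of the same element type, as the question asks, rather than a list of anonymous objects, a method-syntax sketch along these lines should work (assuming the id and counter properties used above):

// Sketch: keep the whole item with the largest counter in each id group.
var maxPerId = list
    .GroupBy(item => item.id)
    .Select(g => g.OrderByDescending(i => i.counter).First())
    .OrderBy(item => item.id)
    .ToList();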

How to add multiple columns in Apache Spark

Here is my input data, with four columns and space as the delimiter. I want to add the second and third columns and print the result:
sachin 200 10 2
sachin 900 20 2
sachin 500 30 3
Raju 400 40 4
Mike 100 50 5
Raju 50 60 6
My code is midway there:
from pyspark import SparkContext

sc = SparkContext()

def getLineInfo(lines):
    spLine = lines.split(' ')
    name = str(spLine[0])
    cash = int(spLine[1])
    cash2 = int(spLine[2])
    cash3 = int(spLine[3])
    return (name, cash, cash2)

myFile = sc.textFile("D:\PYSK\cash.txt")
rdd = myFile.map(getLineInfo)
print rdd.collect()
From here I got the result as
[('sachin', 200, 10), ('sachin', 900, 20), ('sachin', 500, 30), ('Raju', 400, 40), ('Mike', 100, 50), ('Raju', 50, 60)]
Now the final result I need is as below, adding the 2nd and 3rd columns and displaying the remaining fields:
sachin 210 2
sachin 920 2
sachin 530 3
Raju 440 4
Mike 150 5
Raju 110 6
Use this:
def getLineInfo(lines):
    spLine = lines.split(' ')
    name = str(spLine[0])
    cash = int(spLine[1])
    cash2 = int(spLine[2])
    cash3 = int(spLine[3])
    return (name, cash + cash2, cash3)
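To print that in the space-separated layout shown above, a short usage sketch (keeping the question's Python 2 print syntax and the same SparkContext setup):

# Sketch: re-map with the fixed getLineInfo, then print one line per record.
rdd = myFile.map(getLineInfo)
for name, total, cash3 in rdd.collect():
    print '%s %d %d' % (name, total, cash3)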

Processing Chromosomal Data in Ruby

Say I have a file of chromosomal data I'm processing with Ruby,
#Base_ID Segment_ID Read_Depth
1 100
2 800
3 seg1 1900
4 seg1 2700
5 1600
6 2400
7 200
8 15000
9 seg2 300
10 seg2 400
11 seg2 900
12 1000
13 600
...
I'm sticking each row into a hash of arrays, with my keys taken from column 2, Segment_ID, and my values from column 3, Read_Depth, giving me
mr_hashy = {
  "seg1" => [1900, 2700],
  ""     => [100, 800, 1600, 2400, 200, 15000, 1000, 600],
  "seg2" => [300, 400, 900],
}
A primer, which is a small segment consisting of two consecutive rows in the above data, prepends and follows each regular segment. Regular segments have a non-empty-string value for Segment_ID and vary in length, while rows with an empty string in the second column are parts of primers. Primer segments always have the same length, 2. Seen above, Base_IDs 1, 2, 5, 6, 7, 8, 12, and 13 are parts of primers. In total, there are four primer segments present in the above data.
What I'd like to do is, upon encountering a line with an empty string in column 2, Segment_ID, add the Read_Depth to the appropriate element in my hash. For instance, my desired result from above would look like:
mr_hashy = {
  "seg1" => [100, 800, 1900, 2700, 1600, 2400],
  "seg2" => [200, 15000, 300, 400, 900, 1000, 600],
}
hash = Hash.new { |h,k| h[k] = [] }

# Throw away the first (header) row
rows = DATA.read.scan(/.+/)[1..-1].map do |row|
  # Throw away the first (entire row) match
  row.match(/(\d+)\s+(\w+)?\s+(\d+)/).to_a[1..-1]
end

last_segment = nil
last_valid_segment = nil
rows.each do |base, segment, depth|
  if segment && !last_segment
    # Put the last two values onto the front of this segment
    hash[segment].unshift( *hash[nil][-2..-1] )
    # Put the first two values onto the end of the last segment
    hash[last_valid_segment].concat(hash[nil][0,2]) if last_valid_segment
    hash[nil] = []
  end
  hash[segment] << depth
  last_segment = segment
  last_valid_segment = segment if segment
end

# Put the first two values onto the end of the last segment
hash[last_valid_segment].concat(hash[nil][0,2]) if last_valid_segment
hash.delete(nil)

require 'pp'
pp hash
#=> {"seg1"=>["100", "800", "1900", "2700", "1600", "2400"],
#=>  "seg2"=>["200", "15000", "300", "400", "900", "1000", "600"]}
__END__
#Base_ID Segment_ID Read_Depth
1 100
2 800
3 seg1 1900
4 seg1 2700
5 1600
6 2400
7 200
8 15000
9 seg2 300
10 seg2 400
11 seg2 900
12 1000
13 600
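One small difference from the desired result: the depths above are strings. If you want integers, as the question's expected hash shows, converting at insertion time should do it (a one-line tweak to the code above):

# Sketch: store depths as integers instead of strings.
hash[segment] << depth.to_i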
Second-ish refactor. I think this is clean, elegant, and most of all complete. It's easy to read, with no hardcoded field lengths or ugly regexes. I vote mine as the best! ;)
def parse_chromo(file_name)
  last_segment = ""
  segments = Hash.new { |segments, key| segments[key] = [] }

  IO.foreach(file_name) do |line|
    next if !line || line[0] == "#"  # skip the header comment
    values = line.split
    if values.length == 3 && last_segment != (segment_id = values[1])
      # A new segment claims the previous bucket's last two values
      # (the primer rows between segments) as its leading entries.
      segments[segment_id] += segments[last_segment].pop(2)
      last_segment = segment_id
    end
    segments[last_segment] << values.last
  end

  segments.delete("")  # drop whatever is left in the pre-first-segment bucket
  segments
end
puts parse_chromo("./chromo.data")
I used this as my data file:
#Base_ID Segment_ID Read_Depth
1 101
2 102
3 seg1 103
4 seg1 104
5 105
6 106
7 201
8 202
9 seg2 203
10 seg2 204
11 205
12 206
13 207
14 208
15 209
16 210
17 211
18 212
19 301
20 302
21 seg3 303
21 seg3 304
21 305
21 306
21 307
Which outputs:
{
"seg1"=>["101", "102", "103", "104", "105", "106"],
"seg2"=>["201", "202", "203", "204", "205", "206", "207", "208", "209", "210", "211", "212"],
"seg3"=>["301", "302", "303", "304", "305", "306", "307"]
}
Here's some Ruby code (nice practice example :P). I'm assuming fixed-width columns, which appears to be the case with your input data. The code accumulates depth values, and once it has collected six of them (a leading primer pair, the segment's own rows, and a trailing primer pair) and is back on a primer row, it assigns the batch to the segment id it saw in between.
require 'pp'

mr_hashy = {}
primer_segment = nil
primer_values = []

while true
  line = gets
  if not line
    break
  end
  # Fixed-width slices: base id, segment id, depth
  base, segment, depth = line[0..11].rstrip, line[12..27].rstrip, line[28..-1].rstrip
  primer_values.push(depth)
  if segment.chomp == ''
    if primer_values.length == 6
      # Flush the accumulated batch to the segment seen in between
      for value in primer_values
        (mr_hashy[primer_segment] ||= []).push(value)
      end
      primer_values = []
      primer_segment = nil
    end
  else
    primer_segment = segment
  end
end

PP::pp(mr_hashy)
Output on input provided:
{"seg1"=>["100", "800", "1900", "2700", "1600", "2400"],
"seg2"=>["200", "15000", "300", "400", "900", "1000"]}
