Linq to CSV select by column value - linq

I know I have asked this question in a different manner earlier today but I have refined my needs a little better.
Given the following CSV file, where the first column is the row title and there could be any number of columns:
year,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017
income,1000,1500,2000,2100,2100,2100,2100,2100,2100,2100
dividends,100,200,300,300,300,300,300,300,300,300
net profit,1100,1700,2300,2400,2400,2400,2400,2400,2400,2400
expenses,500,600,500,400,400,400,400,400,400,400
profit,600,1100,1800,2000,2000,2000,2000,2000,2000,2000
How do I select the profit value for a given year? So I may provide a year of say 2011 and expect to get the profit value of 2000 back.
At the moment I have this, which shows the profit value for each year, but ideally I'd like to specify the year and get the profit value:
var data = File.ReadAllLines(fileName)
.Select(
l => {
var split = l.Split(",".ToCharArray());
return split;
}
);
var profit = (from p in data where p[0] == profitFieldName select p).SingleOrDefault();
var years = (from p in data where p[0] == yearFieldName select p).FirstOrDefault();
int columnCount = years.Count() ;
for (int t = 1; t < columnCount; t++)
Console.WriteLine("{0} : ${1}", years[t], profit[t]);

I've already answered this once today, but this answer is a little more fleshed out and hopefully clearer.
string rowName = "profit";
string year = "2011";
var yearRow = data.First();
var yearIndex = Array.IndexOf(yearRow, year);
// get your 'profits' row, or whatever row you want
var row = data.Single(d => d[0] == rowName);
// return the appropriate index for that row.
return row[yearIndex];
This works for me.
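The lookup above can be wrapped into a small helper that also copes with a year or row that is not in the file. This is only a sketch: the class and method names (CsvRowLookup, GetValue) are made up for illustration, and it assumes the header row comes first and that no field contains an embedded comma.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvRowLookup
{
    // Hypothetical helper: returns the cell for (rowName, columnHeader),
    // or null when the header or the row cannot be found.
    public static string GetValue(IEnumerable<string> lines, string rowName, string columnHeader)
    {
        // Split every line once and keep the result, so the rows are not re-parsed per query.
        List<string[]> rows = lines.Select(l => l.Split(',')).ToList();
        if (rows.Count == 0) return null;

        // Assumption: the first row is the header row ("year,2008,2009,...").
        int columnIndex = Array.IndexOf(rows[0], columnHeader);
        if (columnIndex < 1) return null; // not found, or it matched the title cell

        string[] row = rows.FirstOrDefault(r => r[0] == rowName);
        if (row == null || columnIndex >= row.Length) return null;

        return row[columnIndex];
    }
}
```

With the sample file from the question, CsvRowLookup.GetValue(File.ReadAllLines(fileName), "profit", "2011") would return "2000".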

You have an unfortunate data format, but I think the best thing to do is just to define a class, create a list, and then use your inputs to create objects to add to the list. Then you can do whatever querying you need to get your desired results.
class MyData
{
public string Year { get; set; }
public decimal Income { get; set; }
public decimal Dividends { get; set; }
public decimal NetProfit { get; set; }
public decimal Expenses { get; set; }
public decimal Profit { get; set; }
}
// ...
string dataFile = @"C:\Temp\data.txt";
List<MyData> list = new List<MyData>();
using (StreamReader reader = new StreamReader(dataFile))
{
string[] years = reader.ReadLine().Split(',');
string[] incomes = reader.ReadLine().Split(',');
string[] dividends = reader.ReadLine().Split(',');
string[] netProfits = reader.ReadLine().Split(',');
string[] expenses = reader.ReadLine().Split(',');
string[] profits = reader.ReadLine().Split(',');
for (int i = 1; i < years.Length; i++) // index 0 is a title
{
MyData myData = new MyData();
myData.Year = years[i];
myData.Income = decimal.Parse(incomes[i]);
myData.Dividends = decimal.Parse(dividends[i]);
myData.NetProfit = decimal.Parse(netProfits[i]);
myData.Expenses = decimal.Parse(expenses[i]);
myData.Profit = decimal.Parse(profits[i]);
list.Add(myData);
}
}
// query for whatever data you need
decimal maxProfit = list.Max(data => data.Profit);

Related

Using Bulk Insert dramatically slows down processing?

I'm fairly new to Oracle but I have used the Bulk insert on a couple other applications. Most seem to go faster using it but I've had a couple where it slows down the application. This is my second one where it slowed it down significantly, so I'm wondering if I have something set up incorrectly or maybe I need to set it up differently. In this case I have a console application that processes ~1,900 records. Inserting them individually takes ~2.5 hours, and when I switched over to the Bulk insert it jumped to 5 hours.
The article I based this off of is http://www.oracle.com/technetwork/issue-archive/2009/09-sep/o59odpnet-085168.html
Here is what I'm doing: I retrieve some records from the DB, do calculations, and then write the results out to a text file. After the calculations are done, I have to write those results back to a different table in the DB so we can look back at those calculations later on if needed.
When I make the calculation I add the results to a List. Once I'm done writing out the file I look at that List and if there are any records I do the bulk insert.
With the bulk insert I have a setting in the App.config to set the number of records I want to insert. In this case I'm using 250 records. I assumed it would be better to limit my in memory arrays to say 250 records versus the 1,900. I loop through that list to the count in the App.config and create an array for each column. Those arrays are then passed as parameters to Oracle.
App.config
<add key="UpdateBatchCount" value="250" />
Class
class EligibleHours
{
public string EmployeeID { get; set; }
public decimal Hours { get; set; }
public string HoursSource { get; set; }
}
Data Manager
public static void SaveEligibleHours(List<EligibleHours> listHours)
{
//set the number of records per batch from the config file
int batchCount = int.Parse(ConfigurationManager.AppSettings["UpdateBatchCount"]);
//create the arrays to add values to
string[] arrEmployeeId = new string[batchCount];
decimal[] arrHours = new decimal[batchCount];
string[] arrHoursSource = new string[batchCount];
int i = 0;
foreach (var item in listHours)
{
//Fill the arrays that will be used for one batch insert.
//Once X records have been collected, send the batch. Add 1 to i to compensate for 0 based indexing.
if (i + 1 <= batchCount)
{
arrEmployeeId[i] = item.EmployeeID;
arrHours[i] = item.Hours;
arrHoursSource[i] = item.HoursSource;
i++;
}
else
{
UpdateDbWithEligibleHours(arrEmployeeId, arrHours, arrHoursSource);
//reset counter and array
i = 0;
arrEmployeeId = new string[batchCount];
arrHours = new decimal[batchCount];
arrHoursSource = new string[batchCount];
}
}
//process last array
if (arrEmployeeId.Length > 0)
{
UpdateDbWithEligibleHours(arrEmployeeId, arrHours, arrHoursSource);
}
}
private static void UpdateDbWithEligibleHours(string[] arrEmployeeId, decimal[] arrHours, string[] arrHoursSource)
{
StringBuilder sbQuery = new StringBuilder();
sbQuery.Append("insert into ELIGIBLE_HOURS ");
sbQuery.Append("(EMP_ID, HOURS_SOURCE, TOT_ELIG_HRS, REPORT_DATE) ");
sbQuery.Append("values ");
sbQuery.Append("(:1, :2, :3, SYSDATE) ");
string connectionString = ConfigurationManager.ConnectionStrings["Server_Connection"].ToString();
using (OracleConnection dbConn = new OracleConnection(connectionString))
{
dbConn.Open();
//create Oracle parameters and pass arrays of data
OracleParameter p_employee_id = new OracleParameter();
p_employee_id.OracleDbType = OracleDbType.Char;
p_employee_id.Value = arrEmployeeId;
OracleParameter p_hoursSource = new OracleParameter();
p_hoursSource.OracleDbType = OracleDbType.Char;
p_hoursSource.Value = arrHoursSource;
OracleParameter p_hours = new OracleParameter();
p_hours.OracleDbType = OracleDbType.Decimal;
p_hours.Value = arrHours;
OracleCommand objCmd = dbConn.CreateCommand();
objCmd.CommandText = sbQuery.ToString();
objCmd.ArrayBindCount = arrEmployeeId.Length;
objCmd.Parameters.Add(p_employee_id);
objCmd.Parameters.Add(p_hoursSource);
objCmd.Parameters.Add(p_hours);
objCmd.ExecuteNonQuery();
}
}
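Two details of the batching loop above may be worth checking, independent of the timing: when a batch is full, the item that triggers the flush is not copied into the fresh arrays, and the final call always passes full-length arrays even if only part of them was filled. The following is a minimal sketch of one way to split the list so that every item lands in exactly one batch and the last batch is only as long as needed. The names Batching and InBatches are hypothetical, and whether this has any effect on the run time is not established here.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Splits a list into consecutive batches of at most batchSize items.
    // The last batch may be shorter; no item is skipped or padded.
    public static IEnumerable<List<T>> InBatches<T>(IReadOnlyList<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = 0; start < source.Count; start += batchSize)
        {
            int size = Math.Min(batchSize, source.Count - start);
            var batch = new List<T>(size);
            for (int i = start; i < start + size; i++)
            {
                batch.Add(source[i]);
            }
            yield return batch;
        }
    }
}
```

Each batch could then be turned into the column arrays, for example batch.Select(h => h.EmployeeID).ToArray(), before calling UpdateDbWithEligibleHours, so that ArrayBindCount equals the number of real rows in that batch.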

Comparing two lists with multiple conditions

I have two different lists of the same type. I want to compare both lists and get the values that are not matched.
List of class:
public class pre
{
public int id {get; set;}
public datetime date {get; set;}
public int sID {get; set;}
}
Two lists :
List<pre> pre1 = new List<pre>();
List<pre> pre2 = new List<pre>();
Query which I wrote to get the unmatched values:
var preResult = pre1.where(p1 => !pre
.any(p2 => p2.id == p1.id && p2.date == p1.date && p2.sID == p1sID));
But the result is wrong here. I am getting all the values in pre1.
Here is a solution:
class Program
{
static void Main(string[] args)
{
var pre1 = new List<pre>()
{
new pre {id = 1, date =DateTime.Now.Date, sID=1 },
new pre {id = 7, date = DateTime.Now.Date, sID = 2 },
new pre {id = 9, date = DateTime.Now.Date, sID = 3 },
new pre {id = 13, date = DateTime.Now.Date, sID = 4 },
// ... etc ...
};
var pre2 = new List<pre>()
{
new pre {id = 1, date =DateTime.Now.Date, sID=1 },
// ... etc ...
};
var preResult = pre1.Where(p1 => !pre2.Any(p2 => p2.id == p1.id && p2.date == p1.date && p2.sID == p1.sID)).ToList();
Console.ReadKey();
}
}
Note: the date property contains only the date; the time part will be 00:00:00.
I fixed some typos and tested your code with sensible values, and your code would correctly select unmatched records. As prabhakaran S's answer mentions, perhaps your date values include time components that differ. You will need to check your data and decide how to proceed.
However, a better way to select unmatched records from one list compared against another would be to utilize a left join technique common to working with relational databases, which you can also do in Linq against in-memory collections. It will scale better as the sizes of your inputs grow.
var preResult = from p1 in pre1
join p2 in pre2
on new { p1.id, p1.date, p1.sID }
equals new { p2.id, p2.date, p2.sID } into grp
from item in grp.DefaultIfEmpty()
where item == null
select p1;
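If the dates do turn out to carry differing time components, one option is to compare on the date part only. The sketch below does that with a HashSet of composite keys, which is another way to avoid rescanning the second list for every item. The Pre class here is a stand-in for the question's pre class with renamed properties, and Unmatched.Find is a hypothetical helper; ignoring the time of day is an assumption that has to suit your data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's class (property names are illustrative).
public class Pre
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public int SId { get; set; }
}

public static class Unmatched
{
    // Returns the items of 'first' that have no counterpart in 'second',
    // comparing Id, the date part of Date, and SId.
    public static List<Pre> Find(IEnumerable<Pre> first, IEnumerable<Pre> second)
    {
        // Tuples compare by value, so they work as composite hash keys.
        // DateTime.Date drops the time of day.
        var seen = new HashSet<Tuple<int, DateTime, int>>(
            second.Select(p => Tuple.Create(p.Id, p.Date.Date, p.SId)));

        return first
            .Where(p => !seen.Contains(Tuple.Create(p.Id, p.Date.Date, p.SId)))
            .ToList();
    }
}
```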

Find / Count Redundant Records in a List<T>

I am looking for a way to identify duplicate records...only I want / expect to see them.
So the records aren't duplicated completely but the unique fields I am unconcerned with at this point. I just want to see if they have made X# payments of the exact same amount, via the exact same card, to the exact same person. (Bogus example just to illustrate)
The collection is a List<>; furthermore, whatever X# is, the List<>.Count will be X#. In other words, all the records in the list must match (again, just on the fields I am concerned with) or I will reject it.
The best I can come up with is to take the first record get value of say PayAmount and LINQ the other two to see if they have the same PayAmount value. Repeat for all fields to be matched. This seems horribly inefficient but I am not smart enough to think of a better way.
So any thoughts, ideas, pointers would be greatly appreciated.
JB
Something like this should do it.
var duplicates = list.GroupBy(x => new { x.Amount, x.CardNumber, x.PersonName })
.Where(x => x.Count() > 1);
Working example:
class Program
{
static void Main(string[] args)
{
List<Entry> table = new List<Entry>();
var dup1 = new Entry
{
Name = "David",
CardNumber = 123456789,
PaymentAmount = 70.00M
};
var dup2 = new Entry
{
Name = "Daniel",
CardNumber = 987654321,
PaymentAmount = 45.00M
};
//3 duplicates
table.Add(dup1);
table.Add(dup1);
table.Add(dup1);
//2 duplicates
table.Add(dup2);
table.Add(dup2);
//Find duplicates query
var query = from p in table
group p by new { p.Name, p.CardNumber, p.PaymentAmount } into g
where g.Count() > 1
select new
{
name = g.Key.Name,
cardNumber = g.Key.CardNumber,
amount = g.Key.PaymentAmount,
count = g.Count()
};
foreach (var item in query)
{
Console.WriteLine("{0}, {1}, {2}, {3}", item.name, item.cardNumber, item.amount, item.count);
}
Console.ReadKey();
}
}
public class Entry
{
public string Name { get; set; }
public int CardNumber { get; set; }
public decimal PaymentAmount { get; set; }
}
The meat of which is this:
var query = from p in table
group p by new { p.Name, p.CardNumber, p.PaymentAmount } into g
where g.Count() > 1
select new
{
name = g.Key.Name,
cardNumber = g.Key.CardNumber,
amount = g.Key.PaymentAmount,
count = g.Count()
};
Your unique entries are based on the 3 criteria of Name, Card Number, and Payment Amount, so you group by them and then use .Count() to count how many records share each combination. where g.Count() > 1 filters the groups to duplicates only.
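Since the question asks whether every record in the list matches on the chosen fields, the same grouping idea can be reduced to a yes/no check: project each record to its key fields and see whether exactly one distinct key remains. This is a sketch only; PaymentRecord, its TransactionId field, and PaymentChecks.AllMatch are hypothetical names, and an empty list is treated as a rejection by assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative record type; TransactionId stands for the fields that are allowed to differ.
public class PaymentRecord
{
    public int TransactionId { get; set; }
    public string Name { get; set; }
    public int CardNumber { get; set; }
    public decimal PaymentAmount { get; set; }
}

public static class PaymentChecks
{
    // True when the list is non-empty and all records share the same
    // Name, CardNumber and PaymentAmount.
    public static bool AllMatch(IReadOnlyCollection<PaymentRecord> payments)
    {
        if (payments.Count == 0) return false; // assumption: nothing to accept

        // Anonymous types compare by value, so Distinct collapses equal keys.
        return payments
            .Select(p => new { p.Name, p.CardNumber, p.PaymentAmount })
            .Distinct()
            .Count() == 1;
    }
}
```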

How can I post a list then access it in my controller?

I created a list property in my model like so
public virtual List<String> listOfDays { get; set; }
then I converted and stored it in the list like this:
for (int i = 0; i < 30; i++)
{
var enrollment = new Enrollment();
enrollment.StudentID = id;
enrollment.listOfDays = searchString.ToList();
db.Enrollments.Add(enrollment);
db.SaveChanges();
}
I put a breakpoint here... enrollment.listOfDays = searchString.ToList(); ... and all is well. I see that the conversion was performed and I can see the values in listOfDays.
So I thought I would find a column in my database called listOfDays since I'm doing code first but the property is not there.
Then I thought I'd try accessing it anyway like this...
var classdays = from e in db.Enrollments where e.StudentID == id select e.listOfDays;
var days = classdays.ToList();
//here I get an error message about this not being supported in Linq.
Questions:
Why do you think the property was not in the database?
How can I post this array to my model then access it in a controller?
Thanks for any help.
Thanks to Decker: http://forums.asp.net/members/Decker%20Dong%20-%20MSFT.aspx
Here’s how it works:
Using form collection here…
In [HttpPost]…
private void Update(FormCollection formCollection, int id)
{
// ...
for (int sc = 0; sc < theSelectedCourses.Count(); sc++)
{
var enrollment = new Enrollment();
enrollment.CourseID = Convert.ToInt32(theSelectedCourses[sc]);
enrollment.StudentID = id;
enrollment.listOfDays = formCollection["searchString"]; // bind this as a string instead of a list or array
// ...
}
}
Then in [HttpGet]…
private void PopulateAssignedenrolledData(Student student, int id)
{
var dayList = from e in db.Enrollments where e.StudentID == id select e;
var days = dayList.ToList();
if (days.Count > 0)
{
string dl = days.FirstOrDefault().listOfDays;
string[] listofdays = dl.Split(',');
ViewBag.classDay = listofdays.ToList();
}
}
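The approach above keeps the days in a single comma-separated string and splits it again when reading. That round trip can be sketched as a pair of helpers; DayListCodec and its method names are made up for illustration, and the sketch assumes that no day value contains a comma.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DayListCodec
{
    // Joins the selected days into one string suitable for a single text column.
    public static string Join(IEnumerable<string> days)
    {
        return string.Join(",", days ?? Enumerable.Empty<string>());
    }

    // Splits the stored string back into a list, tolerating null, empty input and stray spaces.
    public static List<string> Split(string stored)
    {
        if (string.IsNullOrWhiteSpace(stored)) return new List<string>();

        return stored.Split(',')
                     .Select(d => d.Trim())
                     .Where(d => d.Length > 0)
                     .ToList();
    }
}
```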

Linq Convert to Custom Dictionary?

.NET 4, I have
public class Humi
{
public int huKey { get; set; }
public string huVal { get; set; }
}
And in another class is this code in a method:
IEnumerable<Humi> someHumi = new List<Humi>(); //This is actually ISingleResult that comes from a LinqToSql-fronted sproc but I don't think is relevant for my question
var humia = new Humi { huKey = 1 , huVal = "a"};
var humib = new Humi { huKey = 1 , huVal = "b" };
var humic = new Humi { huKey = 2 , huVal = "c" };
var humid = new Humi { huKey = 2 , huVal = "d" };
I want to create a single IDictionary <int,string[]>
with key 1 containing ["a","b"] and key 2 containing ["c","d"]
Can anyone point out a decent way to do that conversion with Linq?
Thanks.
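var myDict = someHumi
.GroupBy(h => h.huKey)
.ToDictionary(
g => g.Key,
g => g.Select(h => h.huVal).ToArray());
Create an IEnumerable<IGrouping<int, Humi>> and then project that into a dictionary, selecting huVal so that each value is a string[] rather than a Humi[]. Note that .ToDictionary returns a Dictionary<int, string[]>, not an IDictionary; since Dictionary implements IDictionary, the result can still be assigned to an IDictionary<int, string[]> variable.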
You can use ToLookup() which allows each key to hold multiple values, exactly your scenario (note that each key would hold an IEnumerable<string> of values though not an array):
var myLookup = someHumi.ToLookup(x => x.huKey, x => x.huVal);
foreach (var item in myLookup)
{
Console.WriteLine("{0} contains: {1}", item.Key, string.Join(",", item));
}
Output:
1 contains: a,b
2 contains: c,d
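If arrays are required after all, a lookup can be projected into a dictionary in one more step. The sketch below assumes the Humi class from the question (repeated here so the block stands alone); HumiGrouping.ToValueArrays is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Humi
{
    public int huKey { get; set; }
    public string huVal { get; set; }
}

public static class HumiGrouping
{
    // Groups the values by key via ToLookup, then materializes each group as a string[].
    public static IDictionary<int, string[]> ToValueArrays(IEnumerable<Humi> source)
    {
        return source
            .ToLookup(h => h.huKey, h => h.huVal)
            .ToDictionary(g => g.Key, g => g.ToArray());
    }
}
```

For the four sample objects in the question, the result maps key 1 to ["a", "b"] and key 2 to ["c", "d"].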

Resources