Read Excel using LINQ - linq

I want to read an Excel 2003 file (I cannot change the format, as it comes from a third party) and group the data in a List or Dictionary (I don't know which one is better).
For example, the sheet is laid out like this:
Books Data [first row and first column in excel]
second row( no records)
Code,Name,IBN [third row (second column, third column]
Aust [fourth row, first column]
UX test1 34 [ fifth row (second column, third column]
......
....
Books Data
Code Name IBN
Aust
UX test1 34
UZ test2 345
UN test3 5654
US
UX name1 567
TG nam2 123
UM name3 234
I am reading the Excel data using the following code (with some help from Google):
string filename = @"C:\" + "Book1.xls";
string connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + filename + ";" +
"Extended Properties=Excel 8.0;";
OleDbDataAdapter dataAdapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", connectionString);
DataSet myDataSet = new DataSet();
dataAdapter.Fill(myDataSet, "BookInfo");
DataTable dataTable = myDataSet.Tables["BookInfo"];
var rows = from p in dataTable.AsEnumerable()
where p[0].ToString() != null || p[0].ToString() != "" && p.Field<string>("F2") != null
select new
{ countryName= p[0],
bookCode= p.Field<string>("F2"),
bookName= p.Field<string>("F3")
};
The code above is not good: to get the “Code” I am using “F2”, and for the country I am using p[0]. What should I use to get the code and name for each country?
It also gives me the information I want, but I don't know how to put it in a list, a dictionary or a class so that I can get the data by passing a country name as a parameter.
In short, it should first put all the data in a list or dictionary, and then I can query that list or dictionary to get the data filtered by country.
Thanks

There are two things you need to do:
First, you need to reformat the spreadsheet so that the column headers are on the first row, as the table below shows:
| Country | Code | Name | IBN |
|---------|------|---------|------|
| Aust | UX | test1 | 34 |
| Aust | UZ | test2 | 345 |
| Aust | UN | test3 | 5654 |
| US | UX | name1 | 567 |
| US | TG | name2 | 123 |
| US | UM | name3 | 234 |
Second, use the Linq to Excel library to retrieve the data. It takes care of making the OleDb connection and creating the SQL for you. Below is an example of how easy it is to use the library:
var book = new ExcelQueryFactory("pathToExcelFile");
var australia = from x in book.Worksheet()
where x["Country"] == "Aust"
select new
{
Country = x["Country"],
BookCode = x["Code"],
BookName = x["Name"]
};
Check out the Linq to Excel intro video for more information about the open source project.

Suggestion 1
Check out this link. As AKofC suggests, creating a class to hold your data would be your first port of call. The link I have posted has a small example of the sort of idea we are proposing.
Suggestion 2 with example...
The obvious thing to do from the code you have posted would be to create a new class to store your book information in.
Then you simply define which fields from your excel document it is that you want to pass into the new instance of your bookinformation class.
New Book Information Class:
class MyBookInfo
{
public string CountryName { get; set; }
public string BookCode { get; set; }
public string BookName { get; set; }
}
Method To Retrieve Info:
public void GetMyBookInfoFromExcelDocument()
{
string filename = @"C:\" + "Book1.xls";
string connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + filename + ";" +
"Extended Properties=Excel 8.0;";
OleDbDataAdapter dataAdapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", connectionString);
DataSet myDataSet = new DataSet();
dataAdapter.Fill(myDataSet, "BookInfo");
DataTable dataTable = myDataSet.Tables["BookInfo"];
var rows = from p in dataTable.AsEnumerable()
where p[0].ToString() != null || p[0].ToString() != "" && p.Field<string>("F2") != null
select new MyBookInfo
{
CountryName = p.Field<string>("InsertFieldNameHere"),
BookCode = p.Field<string>("InsertFieldNameHere"),
BookName = p.Field<string>("InsertFieldNameHere")
};
}

From what I understand, I suggest creating a BookData class containing the properties you need, in this case Country, Code, Name, and IBN.
Then once you've filled your DataSet with the Excel stuff, create a new List, and loop through the DataRows in the DataSet adding the Excel values to the List.
Then you can use Linq on the List like so:
List<BookData> results = (from book in bookList
where book.Country == "US"
select book).ToList();
Or something like that. I don't have Visual Studio on me, and Intellisense has spoiled me, so yeah. >__>


How to split a column which has data in XML form to different rows of new Database as KEY VALUE in TALEND

In the old DB I have data in one column like this:
<ADDRESS>
<CITY>ABC</CITY>
<STATE>PQR</STATE>
</ADDRESS>
In my new DB I want this data to be stored in KEY VALUE fashion, like:
USER_ID KEY VALUE
1 CITY ABC
1 STATE PQR
Could someone please help me migrate this kind of data using the TALEND tool?
Design the job as shown below.
tOracleInput---tExtractXMLField---output.
In the tOracleInput component, select the XML column and set its datatype to String.
In the tExtractXMLField component, pass this XML column as the "XML field" and set the Loop XPath expression to "/ADDRESS".
Add two new columns to the output schema of tExtractXMLField, for CITY and STATE.
In the mapping, set the XPath query to "/ADDRESS/CITY" for CITY and to "/ADDRESS/STATE" for STATE.
Now you have both the values in output.
as I explain in your previous post you can follow the same approach for making Key value pair.
how-to-split-one-row-in-different-rows-in-talend
Or you can use tUnpivot component as you did here.
As you said the source data contains special characters, use the expression below to replace them.
Steps: after the Oracle input, add a tMap and use this code to replace the special symbol:
row24.XMLField.replaceAll("&", "<![CDATA["+"&"+"]]>")
once that is done execute the job and see the result it should work.
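To make the effect of that expression concrete, here is a minimal standalone Java sketch. The class and method names (`AmpersandCdataSketch`, `wrapAmpersands`) are made up for illustration and are not part of Talend; it uses `String.replace`, which does a literal replacement, where the tMap expression above uses `replaceAll`.

```java
public class AmpersandCdataSketch {

    // Wraps every '&' in a CDATA section so that the XML parser accepts it.
    // String.replace is literal: neither argument is treated as a regex.
    public static String wrapAmpersands(String xml) {
        return xml.replace("&", "<![CDATA[&]]>");
    }

    public static void main(String[] args) {
        // Prints: <CITY>A <![CDATA[&]]> B</CITY>
        System.out.println(wrapAmpersands("<CITY>A & B</CITY>"));
    }
}
```

One caveat with this approach: it also rewrites the `&` of entities that are already escaped (such as `&amp;`), so it only suits input where every ampersand is a raw one.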
I'd use tJavaFlex.
Component Settings:
tJavaFlex schema:
In the begin part, use
String input = ((String)globalMap.get("row2.xmlField")); // get the xml Fields value
String firstTag = input.substring(input.indexOf("<")+1,input.indexOf(">"));
input = input.replace("<"+firstTag+">","").replace("</"+firstTag+">","");
int tagCount = input.length() - input.replace("</", "<").length();
int closeTagFinish = -1;
for (int i = 0; i<tagCount ; i++) {
In the main part, parse the XML tag name and value, and make sure the output schema contains those two additional columns. The main part will look like:
/*set up the output columns */
output.user_id = ((String)globalMap.get("row2.user_id"));
output.user_first_name = ((String)globalMap.get("row2.user_first_name"));
output.user_last_name = ((String)globalMap.get("row2.user_last_name"));
Then we can calculate the key-value pairs for the XML, without knowing the KEY values.
/*calculate columns out of XML */
int openTagStart = input.indexOf("<",closeTagFinish+1);
int openTagFinish = input.indexOf(">",openTagStart);
int closeTagStart = input.indexOf("<",openTagFinish);
closeTagFinish = input.indexOf(">",closeTagStart);
output.xmlKey = input.substring(openTagStart+1,openTagFinish);
output.xmlValue = input.substring(openTagFinish+1,closeTagStart);
tJavaFlex End part:
}
Output looks like:
.-------+---------------+--------------+------+--------.
| tLogRow_2 |
|=------+---------------+--------------+------+-------=|
|user_id|user_first_name|user_last_name|xmlKey|xmlValue|
|=------+---------------+--------------+------+-------=|
|1 |foo |bar |CITY |ABC |
|1 |foo |bar |STATE |PQR |
'-------+---------------+--------------+------+--------'
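As an alternative to the index arithmetic above, the same key/value extraction can be sketched with the XML parser that ships with the JDK. This is only a sketch under assumptions: the class name `XmlKeyValueSketch` and the method `toKeyValues` are hypothetical, it runs outside Talend, and it assumes the column holds well-formed XML with one level of child elements.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlKeyValueSketch {

    // Returns the child elements of the root as tag name -> text, in document order.
    public static Map<String, String> toKeyValues(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Map<String, String> pairs = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace text nodes between elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    pairs.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return pairs;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<ADDRESS><CITY>ABC</CITY><STATE>PQR</STATE></ADDRESS>";
        // Prints "1 CITY ABC" and then "1 STATE PQR" (user id 1 is hard-coded here).
        for (Map.Entry<String, String> e : toKeyValues(xml).entrySet()) {
            System.out.println("1 " + e.getKey() + " " + e.getValue());
        }
    }
}
```

Because the result is a map, a tag that occurs twice under the root would keep only its last value; a list of pairs would be needed if repeated tags are possible.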

Implementing tables in lua to access specific pieces for later use

I am trying to make a table store 3 parts which will each be huge in length. The first is the name, second is EID, third is SID. I want to be able to get the information like this name[1] gives me the first name in the list of names, and like so for the other two. I'm running into problems with how to do this because it seems like everyone has their own way which are all very very different from one another. right now this is what I have.
info = {
{name = "btest", EID = "19867", SID = "664"},
{name = "btest1", EID = "19867", SID = "664"},
{name = "btest2", EID = "19867", SID = "664"},
{name = "btest3", EID = "19867", SID = "664"},
}
Theoretically speaking, would I be able to just say info.name[1]? Or how else could I arrange the table so I can access each part separately?
There are two main "ways" of storing the data:
Horizontal partitioning (Object-oriented)
Store each row of the data in a table. All tables must have the same fields.
Advantages: Each table contains related data, so it's easier passing it around (e.g, f(info[5])).
Disadvantages: A table is to be created for each element, adding some overhead.
This looks exactly like your example:
info = {
{name = "btest", EID = "19867", SID = "664"},
-- etc ...
}
print(info[2].name) -- access second name
Vertical partitioning (Array-oriented)
Store each property in a table. All tables must have the same length.
Advantages: Fewer tables overall, and slightly more time and space efficient (Lua VM uses actual arrays).
Disadvantages: Needs two objects to refer to a row: the table and the index. It's harder to insert/delete.
Your example would look like this:
info = {
names = { "btest", "btest1", "btest2", "btest3", },
EID = { "19867", "19867", "19867", "19867", },
SID = { "664", "664", "664", "664", },
}
print(info.names[2]) -- access second name
So which one should I choose?
Unless you really need the performance, you should go with horizontal partitioning. Working over full rows is far more common, and it gives you more freedom in how you use your structures. If you decide to go full OO, having your data in horizontal form will be much easier.
Addendum
The names "horizontal" and "vertical" come from the table representation of a relational database.
  | names | EID | SID |
--+-------+-----+-----+
1 |       |     |     |
2 |       |     |     |
3 |       |     |     |
A vertical partition is a single column of this table (for example, names); a horizontal partition is a single row (for example, row 2).
Your info table is an array, so you can access items using info[N] where N is any number from 1 to the number of items in the table. Each field of the info table is itself a table. The 2nd item of info is info[2], so the name field of that item is info[2].name.

Linq - multiple rows as string

Let's assume that I have one table which contains Username and another which contains FirstName. A single user can have multiple FirstNames.
How can I obtain a record which will contain Username and all first names separated by comma?
i. e.
Users
id | Username
1 | Test
Names
UserId | FirstName
1 | Mike
1 | John
I would like to receive a record which will contain
Test, "Mike, John"
How can I do that?
Edit: What if the Users table has more columns which I want to get?
i. e.
id | Username | Status
1 | Test | Active
How to get Test, Active, "Mike, John"?
You can use GroupBy and String.Join
var userGroups = from u in users
join n in names
on u.ID equals n.UserID
group new{n, u} by n.UserID into UserGrp
select new
{
Username = UserGrp.First().u.Username,
Status = UserGrp.First().u.Status,
Names = string.Join(", ", UserGrp.Select(x => x.n.FirstName))
};
foreach (var ug in userGroups)
Console.WriteLine("{0}, {1}, \"{2}\"", ug.Username, ug.Status, ug.Names);

Generate json result based on two related entites

I have two entities in a one-to-many relationship,
Race and Cars (one race contains a lot of cars).
I need to generate a JSON result to pass to jqGrid. I thought it might be possible to do that without creating a new class to hold the properties. I thought I could do something like this:
var jsonData = new
{
total = totalPages,
page = page,
records = totalRecords,
rows = (from c in Races
select new
{
//c.Cars.Id.ToString(), - need iteration
cell = new string[] {
//c.Cars.Id.ToString(), - need iteration
c.Date.ToString(),
c.Type.ToString(),
c.Cars //But how i may loop all Cars colection here?
//c.Cars.Name - need iteration
//c.Cars.Speed - need iteration
}
}).ToArray()
};
But the Cars property represents a collection. How can I iterate over it inside the collection initializer? Or would it be better to create a class which contains all the properties I need?
Any ideas?
Let's say Car has the properties Name, Speed and Id, and Race has the properties Date and Type.
The data will be displayed like this:
Date | Type | Id | Name | Speed
02/03/2011 | A | 1 | MegaName1 | 130
02/03/2011 | A | 2 | MegaName2 | 112
02/03/2011 | A | 3 | MegaName3 | 132
03/05/2011 | B | 4 | MegaName2 | 112
03/05/2011 | B | 5 | MegaName4 | 33
Try the following:
var jsonData = new
{
total = totalPages,
page = page,
records = totalRecords,
rows =
(from race in races
from car in race.Cars
select new
{
cell = new string[]
{
race.Date.ToString(),
race.Type.ToString(),
car.Id.ToString(),
car.Name,
car.Speed.ToString()
}
}).ToArray()
};

MySQL Query fails in my App, but succeeds in MySQL Command Line Tool

I am running this MySQL query against my db that contains a lot of customer data. the query is giant so I'll cut out parts. We're using 5.1.45-51-log Percona SQL Server which is a pretty common version of MySQL.
"select distinct * " +
"from customer_contact_preferences prefs " +
...
"join re_account_attribute uaa on uaa.account_id = ua.id " +
"where ... " +
...
"and uaa.attribute_key = 'R_GROUP' " +
//"and uaa.attribute_value in ( ? ) " + //PROBLEM HERE
"order by customer_contact_id)"
The argument to uaa.attribute_value is '1','2','3'.
In the code, we use org.springframework.jdbc.core.simple.SimpleJdbcDaoSupport.update() to call this query.
When it is called in the code through Spring, it incorrectly returns 0 rows. When I substitute the args (4 people have checked that I did the substitution right), and run it on the mysql command line, the query correctly returns >5 rows. If I comment out the problem line, like i'm doing above, I get >5 and actually too many rows (that's why I need that constraint).
The table is described as follows:
| attribute_key | varchar(255) | NO | | NULL | |
| attribute_value | varchar(255) | NO | | NULL | |
It is so we can store any kind of key/value pair, not restricted to varchar or ints.
Anyway, what is wrong with doing "IN " when you have a string or list of strings? Thank you.
Parameterized IN clauses are supported by Spring JDBC templates only if you use named parameters instead of ?s, see 12.7.3 Passing in lists of values for IN clause. So, you need
getSimpleJdbcTemplate().update("... IN (:params) ...",
Collections.singletonMap("params", ...));
You can't bind a List or array to '?'.
You have to create a '?' for each item in the List or array in the SQL and then bind each one individually.
// Requires java.sql.* and java.util.*; assumes an open java.sql.Connection named connection.
List<String> values = Arrays.asList("foo", "bar", "baz");
StringBuilder sql = new StringBuilder();
sql.append("SELECT * FROM X WHERE ID IN (");
// Emit one '?' per value, separated by commas.
for (int i = 0; i < values.size(); ++i)
{
sql.append('?');
if (i != values.size() - 1)
sql.append(',');
}
sql.append(")");
PreparedStatement ps = connection.prepareStatement(sql.toString());
// JDBC parameter indexes start at 1.
for (int i = 0; i < values.size(); ++i)
{
ps.setString(i + 1, values.get(i));
}
ResultSet rs = ps.executeQuery();
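The placeholder-building step can also be written without a loop. The sketch below is self-contained and only builds the SQL text, so it needs no database; the class name `InClauseSketch` and the helper `placeholders` are hypothetical names chosen for illustration, and the table and column names are the same stand-ins used above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class InClauseSketch {

    // Builds "?,?,?" with one placeholder per value.
    // The values themselves are never concatenated into the SQL.
    public static String placeholders(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("IN (...) needs at least one value");
        }
        return String.join(",", Collections.nCopies(count, "?"));
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("1", "2", "3");
        String sql = "SELECT * FROM X WHERE ID IN (" + placeholders(values.size()) + ")";
        // Prints: SELECT * FROM X WHERE ID IN (?,?,?)
        System.out.println(sql);
    }
}
```

Each value is then still bound individually with `setString`, exactly as in the loop above. Passing the whole list as a single `?` instead makes the driver compare the column against one string, which would explain a query that returns 0 rows.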
