Find ArrayList for no sequence data in Java - logic

I need to get an ArrayList of postal codes by region ID, but my region IDs are not in any particular sequence, and each zip code must stay associated with its region ID. (This data is a small example; I have a CSV file with thousands of zip codes.)
Please help me write the correct logic for this in Java.
Zip Codes V/s Region Ids = {"71854 , 612" ; "71865, 612"; "88008, 765"; "88021, 765" ; "88252 , 612"}

You can do this in two ways.
First, declare a structure that contains the two required values and then fill an array with them, as shown in the example (written in C++ syntax).
struct Data {
    int zip_code, region_id;
    Data() {}
    Data(int z, int r) {
        zip_code = z;
        region_id = r;
    }
};
#define MAX 100
Data my_list[MAX];
Second, declare two parallel arrays and fill them with the required data.
#define MAX 100
int zip_code[MAX] = {71854, 71865, 88008, 88021, 88252};
int region_id[MAX] = {612, 612, 765, 765, 612};
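Since the question asks for Java, here is a minimal sketch of the same idea using a map from region ID to a list of zip codes. The class name RegionZipIndex and the method groupByRegion are hypothetical names chosen for illustration, and the sketch assumes each input row has the form "zip, region" as in the sample data. Zip codes are kept as strings so that leading zeros are not lost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RegionZipIndex {

    // Groups "zip, region" rows into region ID -> list of zip codes.
    // Region IDs may appear in any order; each zip code stays attached to its own region.
    public static Map<Integer, List<String>> groupByRegion(List<String> rows) {
        Map<Integer, List<String>> byRegion = new LinkedHashMap<>();
        for (String row : rows) {
            String[] parts = row.split(",");
            if (parts.length != 2) {
                continue; // assumption: malformed rows are simply skipped
            }
            String zip = parts[0].trim();
            int regionId = Integer.parseInt(parts[1].trim());
            // Create the list the first time a region ID is seen, then append the zip code.
            byRegion.computeIfAbsent(regionId, id -> new ArrayList<>()).add(zip);
        }
        return byRegion;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "71854 , 612", "71865, 612", "88008, 765", "88021, 765", "88252 , 612");
        Map<Integer, List<String>> byRegion = groupByRegion(rows);
        System.out.println(byRegion.get(612)); // [71854, 71865, 88252]
        System.out.println(byRegion.get(765)); // [88008, 88021]
    }
}
```

For the real CSV file, the rows could be read line by line (for example with java.nio.file.Files.readAllLines) and passed to the same method.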

Related

Inserting tuples inside an inner bag using Pig Latin - Hadoop

I am trying to create the following format of relation using Pig Latin:
userid, day, {(pid,fulldate, x,y),(pid,fulldate, x,y), ...}
Relation description: Each user (userid) in each day (day) has purchased multiple products (pid)
I am loading the data as follows:
A = LOAD '**from a HDFS URL**' AS (pid: chararray, userid: chararray, day: int, fulldate: chararray, x: chararray, y: chararray);
B= GROUP A BY (userid, day);
Describe B;
B: {group: (userid: chararray,day: int),A: {(pid: chararray,day: int,fulldate: chararray,x: chararray,userid: chararray,y: chararray)}}
C= FOREACH B FLATTEN(B) AS (userid,day), $1.pid, $1.fulldate,$1.x,$1.y;
Describe C;
C: {userid: chararray,day: int,{(pid: chararray)}},{(fulldate: chararray)},{(x: chararray)},{(y: chararray)}}
The result of Describe C is not in the format I want. What am I doing wrong?
You are correct up to the GROUP BY step. After that, however, you are trying to do something messy; I'm actually not sure what is happening for your alias C. To arrive at the format you are looking for, you will need a nested foreach.
C = FOREACH B {
    data = A.(pid, fulldate, x, y);
    GENERATE FLATTEN(group), data;
}
This allows C to have one record for each (userid, day) and all the corresponding (pid,fulldate, x, y) tuples in a bag.
You can read more about nested foreach here: https://www.safaribooksonline.com/library/view/programming-pig/9781449317881/ch06.html (Search for nested foreach in that link).
My understanding is that B is almost what you're looking for, except you would like the tuple containing userid and day to be flattened, and you would like only pid, fulldate, x, and y to appear in the bag.
First, you want to flatten the tuple group which has fields userid and day, not the bag A which contains multiple tuples. Flattening group unnests the tuple, which only has 1 set of unique values for each row, whereas flattening the bag A would effectively ungroup your previous GROUP BY statement since the values in the bag A are not unique. So the first part should read C = FOREACH B GENERATE FLATTEN(group) AS (userid, day);
Next, you want to keep pid, fulldate, x, and y in separate tuples for each record, but the way you've selected them essentially makes a bag of all the pid values, a bag of all the fulldate values, etc. Instead, try selecting these fields in a way that keeps the tuples nested in the bag:
C = FOREACH B GENERATE
    FLATTEN(group) AS (userid, day),
    A.(pid, fulldate, x, y) AS A;

Tables in Gherkin

I am trying to compare two tables and test the results using Gherkin, but I don't know how to make the @When step accept two lists instead of one, as shown below:
@When("^the integer table :$")
public void the_integer_table_(List<Integer> comp1, List<Integer> comp2) {
    for (int i = 0; i < comp1.size(); i++) {
        compare[i] = comp1.get(i);
    }
    for (int i = 0; i < comp2.size(); i++) {
        compare2[i] = comp2.get(i);
    }
    comparer.comparer_tableau(compare, compare2);
}
Here is my .feature file:
Scenario: Compare the elements of two tables and return a boolean table as a result
Given I am starting a comparision operation
When these two integer table are entered :
|1|2|3|4|5|
|0|2|5|4|5|
Then I should see the correct answer is:
|false|true|false|true|true|
Here is what I get when I run it:
@When("^these two integer table two are entered :$")
public void these_two_integer_table_two_are_entered_(DataTable arg1) {
}
P.S: I did try and look for solutions but didn't find any.
You can change the step definition as shown below: read the values of each row from the single DataTable, then compare them.
@When("^these two integer table are entered :$")
public void these_two_integer_table_are_entered(DataTable arg1) throws Throwable {
    List<DataTableRow> lstRows = arg1.getGherkinRows();
    Integer[] compare = new Integer[lstRows.get(0).getCells().size()];
    System.out.println(compare.length);
    Integer[] compare2 = new Integer[lstRows.get(1).getCells().size()];
    System.out.println(compare2.length);
    // Get the first row values
    for (int i = 0; i < compare.length; i++) {
        compare[i] = Integer.valueOf(lstRows.get(0).getCells().get(i));
        System.out.println(compare[i]);
    }
    // Get the second row values
    for (int i = 0; i < compare2.length; i++) {
        compare2[i] = Integer.valueOf(lstRows.get(1).getCells().get(i));
        System.out.println(compare2[i]);
    }
    comparer.comparer_tableau(compare, compare2);
}
AFAIK, you can have only one table per step, but you can read each row's values inside your step definition.
Cucumber will interpret your When step as using a single DataTable:
When these two integer table are entered :
|1|2|3|4|5|
|0|2|5|4|5|
Instead of trying to enter two tables in one step, try entering them in two separate steps (one table per step) like this:
When the following integer table is entered :
|1|2|3|4|5|
And the following integer table is entered :
|0|2|5|4|5|
I would use a scenario outline like this and handle the logic for retrieving the values of each column in the step definition (split the value string by comma and use the values accordingly).
Scenario Outline: Compare the elements of two tables and return a boolean table as a result
    Given I am starting a comparision operation
    When integer '<table1>' values are entered
    When integer '<table2>' values are entered
    Then I should see the correct '<answer>'
    Examples:
    | table1    | table2    | answer                     |
    | 1,2,3,4,5 | 0,2,5,4,5 | false,true,false,true,true |
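The splitting and comparing that such a step definition would need can be sketched in plain Java, independent of Cucumber. The class TableComparer and its methods parseInts and compare are hypothetical helpers for illustration, not part of the asker's comparer class; the sketch assumes both value strings contain the same number of integers.

```java
import java.util.Arrays;

class TableComparer {

    // Parses a comma-separated cell such as "1,2,3,4,5" into an int array.
    public static int[] parseInts(String cell) {
        return Arrays.stream(cell.split(","))
                .map(String::trim)
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    // Compares two arrays position by position; result[i] is true when the values match.
    public static boolean[] compare(int[] first, int[] second) {
        if (first.length != second.length) {
            throw new IllegalArgumentException("Both tables must have the same number of values");
        }
        boolean[] result = new boolean[first.length];
        for (int i = 0; i < first.length; i++) {
            result[i] = first[i] == second[i];
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] answer = compare(parseInts("1,2,3,4,5"), parseInts("0,2,5,4,5"));
        System.out.println(Arrays.toString(answer)); // [false, true, false, true, true]
    }
}
```

Each step definition would then pass its captured string to parseInts, and the Then step would compare the result against the parsed '<answer>' column.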

Processing: How can I find the number of times two fields are equal in a CSV file?

I'm learning Processing for the first time and I've been tasked to deal with data but it's been terribly confusing for me.
For every line of a CSV file (apart from the header), I want to compare two specific columns of each. i.e. ListA vs ListB
For example, with the data below:
ListA,ListB
Male,Yes
Male,No
Female,Yes
Male,Yes
And for example, I want to check for all instances that a value in ListA is "Male" AND that the corresponding value in ListB is "Yes". In this scenario, I should get the value "2" for the two rows this is true.
How would I do that?
For now, I have a 2D String array of the data in the CSV file. From that I managed to assign specific columns as ListA and ListB. I tried using sort but it would only sort one list and not both at the same time.
Current relevant code:
for (int i=1; i<lines.length; i++) {
listA[i-1] = csv[i][int(whichA)];
listB[i-1] = csv[i][int(whichB)];
}
lA = Arrays.asList(listA);
lB = Arrays.asList(listB);
Not sure if this code really helps make things clearer though. :/
Any help would be appreciated. Thank you.
Something like this should do what you need, using the listA and listB arrays from your code:
int numRows = 0;
for (int i = 0; i < listA.length; i++) {
    // Compare String contents with equals(), not ==
    if ("Male".equals(listA[i]) && "Yes".equals(listB[i])) {
        numRows++;
        // add to a new collection here if you need the data
    }
}

Linq to dataset on dynamic columns and dynamic group by fields

My function has four parameters: a dataset, an array of expressions (aggregate functions), an array of column names on which to apply the expressions, and an array of columns to group by.
My problem is how to handle dynamic columns or fields for the expressions and the group by, since their number can vary (it depends on the array values).
I have written code for a static query, but I need a generic way.
This is my code:
public void ExpressionManipulation(DataSet dsExprEvaluate, string[] strExpressions, string[] colName, string[] groupbyFields)
{
    int groupByLength = groupbyFields.Length;
    // AsEnumerable() is defined on DataTable, not DataSet; this assumes the rows are in the first table
    var groupByQueryEvaluate = from table in dsExprEvaluate.Tables[0].AsEnumerable()
                               group table by new { column1 = table["DataSourceType"], column2 = table["Polarity"] }
                               into groupedTable
                               select new
                               {
                                   x = groupedTable.Key, // Each Key contains column1 and column2
                                   y = groupedTable.Count(),
                                   //z = groupedTable.Max(column1),
                                   z = groupedTable.Sum(table => Convert.ToInt32(table["Polarity"]))
                               };
}
As in the code above, I can have any number of fields in the group by (for now I have used only two, DataSourceType and Polarity), and similarly any number of fields for the expressions (sum, count, etc.), which will be passed in as an array parameter.
Please help me with this; it is driving me crazy.
Thanks in advance.
I figured it out myself. The solution I ended up with is:
var objGroupSumCountkey = dt.AsEnumerable()
    .AsQueryable()
    .GroupBy("new ( it[\"DataSourceType\"] as GroupByColumnName1,it[\"Polarity\"] as GroupByColumnName2)", "it")
    .Select("new ( Sum(Convert.ToDouble(it[\"Polarity\"].ToString())) as SumValue,Count() as TotalCount,it.key)");
In the query above, all the parameters for GroupBy and Select are supplied as strings, so they can be built at run time from the arrays. Note that these string-based overloads are not part of standard LINQ; they come from an add-on such as the Dynamic LINQ library.

How to order integers according to size and track their positions by variable name

I have a program with multiple int variables, where a count is added to a specific variable each time a particular fail condition is encountered. I want the user to be able to see, on a button click, how many failures of each category have occurred. I want to display the values in a DataGridView, ordered from the highest integer down to the lowest, and show in the adjacent column the name of the test step that each value relates to.
My plan was to use Array.Sort to order the integers, but then I lose track of their names and can't fill the adjacent string column. I tried using a Hashtable, but if I use the string as the key it sorts alphabetically rather than numerically, and if I use the integer as the key, duplicate values don't get added to the hash table. Below is one of the attempts I tried, which has the problems described above. Essentially, I want to end up with two arrays in matching order, one holding the names and one the values. FYI, the variables were declared before this section of code; a variable ending in x is the string name for the corresponding (non-x) integer value.
Hashtable sorter = new Hashtable();
sorter[download] = downloadx;
sorter[power] = powerx;
sorter[phase] = phasex;
sorter[eeprom] = eepromx;
sorter[upulse] = upulsex;
sorter[vpulse] = vpulsex;
sorter[wpulse] = wpulsex;
sorter[volts] = voltsx;
sorter[current] = currentx;
sorter[ad] = adx;
sorter[comms] = commsx;
sorter[ntc] = ntcx;
sorter[prt] = prtx;
string list = "";
string[] names = new string[13];
foreach (DictionaryEntry child in sorter)
{
list += child.Value.ToString() + "z";
}
int[] ordered = new int[] { download, power, phase, eeprom, upulse, vpulse, wpulse, volts, current, ad, comms, ntc, prt };
Array.Sort(ordered);
Array.Reverse(ordered);
for (int i = 0; i < sorter.Count; i++)
{
int pos = list.IndexOf("z");
names[i] = list.Substring(0, pos);
list = list.Substring(pos + 1);
}
First question here, so I hope it's not too long-winded.
Thanks
Use a Dictionary. You can order it by value: myDico.OrderBy(x => x.Value).Reverse() gives a numerical descending sort. You then just have to enumerate the result.
I hope I understand your need. Otherwise ignore me.
You want to be using a Dictionary<string, int> to store your numbers. I'm not clear on how you're displaying results at the end - do you have a grid or a list control?
You ask about usings. Which ones do you already have?
EDIT for .NET 2.0
There might be a more elegant solution, but you could implement the logic by putting your rows in a DataTable. Then you can make a DataView of that table and sort by whichever column you like, ascending or descending.
See http://msdn.microsoft.com/en-us/library/system.data.dataview(v=VS.80).aspx for example.
EDIT for .NET 3.5 and higher
As far as sorting a Dictionary by its values:
var sortedEntries = myDictionary.OrderBy(pair => pair.Value);
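If you need the results to be a Dictionary, you can call .ToDictionary(pair => pair.Key, pair => pair.Value) on that, but note that a Dictionary does not guarantee any enumeration order, so keep the ordered sequence if the order matters. For reverse order, use .OrderByDescending(pair => pair.Value).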
