There is a method protected Long doInBackground(URL... urls) {}.
What does URL... urls mean?
private class DownloadFilesTask extends AsyncTask<URL, Integer, Long> {
protected Long doInBackground(URL... urls) {
int count = urls.length;
long totalSize = 0;
for (int i = 0; i < count; i++) {
totalSize += Downloader.downloadFile(urls[i]);
publishProgress((int) ((i / (float) count) * 100));
// Escape early if cancel() is called
if (isCancelled()) break;
}
return totalSize;
}
protected void onProgressUpdate(Integer... progress) {
setProgressPercent(progress[0]);
}
protected void onPostExecute(Long result) {
showDialog("Downloaded " + result + " bytes");
}
}
This is from http://developer.android.com/reference/android/os/AsyncTask.html
The "type... name" syntax is Java's variable-length argument list (varargs). It means zero or more URL values. Inside the method they are accessed by treating urls (in this case) as an array (URL[] urls). In the case of AsyncTask, you pass one or more URLs to the .execute() method: .execute(url1, url2, url3).
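As a minimal sketch of how a varargs parameter behaves, independent of AsyncTask (the class and method names here are made up for illustration):

```java
import java.util.Arrays;

final class VarargsDemo {
    // "String... names" accepts zero or more String arguments;
    // inside the method, names is an ordinary String[].
    static String describe(String... names) {
        return names.length + " value(s): " + Arrays.toString(names);
    }

    public static void main(String[] args) {
        System.out.println(describe());                         // no arguments: empty array
        System.out.println(describe("a"));                      // one argument
        System.out.println(describe("a", "b", "c"));            // several arguments
        System.out.println(describe(new String[] {"x", "y"}));  // an existing array also works
    }
}
```

Calling the method with no arguments gives it an empty array, not null, so urls.length in doInBackground is safe to read.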
I was trying to solve a problem using Java 8 that I have already solved using a simple for loop. However, I have no idea how to do this.
The problem is:
File1 :
1,sdfasfsf
2,sdfhfghrt
3,hdfxcgyjs
File2 :
10,xhgdfgxgf
11,hcvcnhfjh
12,sdfgasasdfa
13,ghdhtfhdsdf
Output should be like
1,sdfasfsf
10,xhgdfgxgf
2,sdfhfghrt
11,hcvcnhfjh
3,hdfxcgyjs
12,sdfgasasdfa
13,ghdhtfhdsdf
I already have this basically working.
The core logic is:
List<String> left = readFile(lhs);
List<String> right = readFile(rhs);
int leftSize = left.size();
int rightSize = right.size();
int size = leftSize > rightSize? leftSize : right.size();
for (int i = 0; i < size; i++) {
if(i < leftSize) {
merged.add(left.get(i));
}
if(i < rightSize) {
merged.add(right.get(i));
}
}
MergeInputs.java
UnitTest
Input files are in src/test/resources/com/linux/test/merge/list of the same repo (only allowed to post two links)
However, I boasted I could do this easily using streams and now I am not sure if this can even be done.
Help is really appreciated.
You may simplify your operation to have less conditionals per element:
int leftSize = left.size(), rightSize = right.size(), min = Math.min(leftSize, rightSize);
List<String> merged = new ArrayList<>(leftSize+rightSize);
for(int i = 0; i < min; i++) {
merged.add(left.get(i));
merged.add(right.get(i));
}
if(leftSize!=rightSize) {
merged.addAll(
(leftSize<rightSize? right: left).subList(min, Math.max(leftSize, rightSize)));
}
Then, you may replace the first part by a stream operation:
int leftSize = left.size(), rightSize = right.size(), min = Math.min(leftSize, rightSize);
List<String> merged=IntStream.range(0, min)
.mapToObj(i -> Stream.of(left.get(i), right.get(i)))
.flatMap(Function.identity())
.collect(Collectors.toCollection(ArrayList::new));
if(leftSize!=rightSize) {
merged.addAll(
(leftSize<rightSize? right: left).subList(min, Math.max(leftSize, rightSize)));
}
But it isn’t really simpler than the loop variant. The loop variant may even be more efficient due to its presized list.
Combining both operations into a single stream operation would be even more complicated (and probably even less efficient).
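For comparison, here is one way the whole merge could be written as a single stream pipeline. This is only a sketch with made-up names (InterleaveDemo, interleave). It checks the bounds of both lists for every index, so it is unlikely to beat the loop:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class InterleaveDemo {
    // For each index up to the longer list's size, emit the element of every
    // list that still has one at that index (left first, then right).
    static <T> List<T> interleave(List<T> left, List<T> right) {
        int max = Math.max(left.size(), right.size());
        return IntStream.range(0, max)
                .boxed()
                .flatMap(i -> Stream.of(left, right)
                        .filter(list -> i < list.size())
                        .map(list -> list.get(i)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> left = Arrays.asList("1,sdfasfsf", "2,sdfhfghrt");
        List<String> right = Arrays.asList("10,xhgdfgxgf", "11,hcvcnhfjh", "12,sdfgasasdfa");
        interleave(left, right).forEach(System.out::println);
    }
}
```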
The core logic should look like this:
int leftSize = left.size();
int rightSize = right.size();
int minSize = Math.min(leftSize,rightSize);
for (int i = 0; i < minSize; i++) {
merged.add(left.get(i));
merged.add(right.get(i));
}
// adding remaining elements
merged.addAll(
minSize < leftSize ? left.subList(minSize, leftSize)
: right.subList(minSize, rightSize)
);
Another option is to alternate between the two lists through an Iterator, for example:
toggle(left, right).forEachRemaining(merged::add);
//OR using stream instead
List<String> merged = Stream.generate(toggle(left, right)::next)
.limit(left.size() + right.size())
.collect(Collectors.toList());
The toggle method is as follows:
<T> Iterator<? extends T> toggle(List<T> left, List<T> right) {
return new Iterator<T>() {
private final int RIGHT = 1;
private final int LEFT = 0;
int cursor = -1;
Iterator<T>[] pair = arrayOf(left.iterator(), right.iterator());
@SafeVarargs
private final Iterator<T>[] arrayOf(Iterator<T>... iterators) {
return iterators;
}
@Override
public boolean hasNext() {
for (Iterator<T> each : pair) {
if (each.hasNext()) {
return true;
}
}
return false;
}
@Override
public T next() {
return pair[cursor = next(cursor)].next();
}
private int next(int cursor) {
cursor=pair[LEFT].hasNext()?pair[RIGHT].hasNext()?cursor: RIGHT:LEFT;
return (cursor + 1) % pair.length;
}
};
}
Based on the current implementation, I will get an ArrayList containing about 1000 unique names in alphabetically sorted order (A-Z or Z-A) from some source.
I need to find the index of the first word starting with a given letter.
To be more precise, when I select a letter, e.g. "M", it should give me the index of the first word starting with "M" in the sorted list.
That way I should be able to find the index of the first word for each of the 26 letters.
Please help me find a solution that doesn't compromise on speed.
UPDATE:
Actually, after getting the 1000 unique names, the sorting is also done by my own code.
If this can be done during the sorting itself, I can avoid iterating over the list again after sorting to find the indices for the letters.
Is that possible?
Thanks,
Sen
I hope this little piece of code will help you. I guessed the question is related to Java, because you mentioned ArrayList.
String[] unsorted = {"eve", "bob", "adam", "mike", "monica", "Mia", "marta", "pete", "Sandra"};
ArrayList<String> names = new ArrayList<String>(Arrays.asList(unsorted));
String letter = "M"; // find index of this
class MyComp implements Comparator<String>{
String first = "";
String letter;
MyComp(String letter){
this.letter = letter.toUpperCase();
}
public String getFirst(){
return first;
}
@Override
public int compare(String s0, String s1) {
if(s0.toUpperCase().startsWith(letter)){
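// compare case-insensitively, consistent with the ordering below; compareTo may return any negative value
if(first.equals("") || s0.toUpperCase().compareTo(first.toUpperCase()) < 0){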
first = s0;
}
}
return s0.toUpperCase().compareTo(s1.toUpperCase());
}
};
MyComp mc = new MyComp(letter);
Collections.sort(names, mc);
int index = names.indexOf(mc.getFirst()); // the index of first name starting with letter
I'm not sure if it's possible to also store the index of the first name in the comparator without much overhead. However, if you implement your own version of a sorting algorithm, e.g. quicksort, you know the indices of the elements and could calculate the index while sorting. This depends on your chosen sorting algorithm and implementation. If I knew how your sorting is implemented, we could add the index calculation to it.
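If one extra pass over the already sorted list is acceptable, the first index for all 26 letters can be collected in a single loop. The following is a sketch under that assumption, not part of the comparator approach above; the names FirstLetterIndex and firstIndexByLetter are made up, and only names starting with A-Z or a-z are counted:

```java
import java.util.Arrays;
import java.util.List;

final class FirstLetterIndex {
    // Returns an int[26]; entry k holds the index of the first name starting
    // with the letter 'A' + k (case-insensitive), or -1 if there is none.
    static int[] firstIndexByLetter(List<String> sortedNames) {
        int[] first = new int[26];
        Arrays.fill(first, -1);
        for (int i = 0; i < sortedNames.size(); i++) {
            String name = sortedNames.get(i);
            if (name.isEmpty()) {
                continue; // nothing to classify
            }
            int k = Character.toUpperCase(name.charAt(0)) - 'A';
            if (k >= 0 && k < 26 && first[k] == -1) {
                first[k] = i; // remember only the first occurrence
            }
        }
        return first;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "adam", "bob", "eve", "marta", "Mia", "mike", "monica", "pete", "Sandra");
        int[] first = firstIndexByLetter(names);
        System.out.println("First M at index " + first['M' - 'A']);
    }
}
```

For about 1000 names this is one linear scan, and it works for both A-Z and Z-A order because it only records the first time each letter is seen.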
So I came up with my own solution for this.
package test.binarySearch;
import java.util.Random;
/**
*
* Binary search to find the index of the first word starting with a given letter
*
* @author Navaneeth Sen <navaneeth.sen@multichoice.co.za>
*/
class SortedWordArray
{
private final String[] a; // ref to array a
private int nElems; // number of data items
public SortedWordArray(int max) // constructor
{
a = new String[max]; // create array
nElems = 0;
}
public int size()
{
return nElems;
}
public int find(String searchKey)
{
return recFind(searchKey, 0, nElems - 1);
}
String array = null;
int arrayIndex = 0;
private int recFind(String searchKey, int lowerBound,
int upperBound)
{
int curIn;
curIn = (lowerBound + upperBound) / 2;
if (a[curIn].startsWith(searchKey))
{
array = a[curIn];
if ((curIn == 0) || !a[curIn - 1].startsWith(searchKey))
{
return curIn; // found it
}
else
{
return recFind(searchKey, lowerBound, curIn - 1);
}
}
else if (lowerBound > upperBound)
{
return -1; // can't find it
}
else // divide range
{
if (a[curIn].compareTo(searchKey) < 0)
{
return recFind(searchKey, curIn + 1, upperBound);
}
else // it's in lower half
{
return recFind(searchKey, lowerBound, curIn - 1);
}
} // end else divide range
} // end recFind()
public void insert(String value) // put element into array
{
int j;
for (j = 0; j < nElems; j++) // find where it goes
{
if (a[j].compareTo(value) > 0) // (linear search)
{
break;
}
}
for (int k = nElems; k > j; k--) // move bigger ones up
{
a[k] = a[k - 1];
}
a[j] = value; // insert it
nElems++; // increment size
} // end insert()
public void display() // displays array contents
{
for (int j = 0; j < nElems; j++) // for each element,
{
System.out.print(a[j] + " "); // display it
}
System.out.println("");
}
} // end class SortedWordArray
class BinarySearchWordApp
{
static final String AB = "12345aqwertyjklzxcvbnm";
static Random rnd = new Random();
public static String randomString(int len)
{
StringBuilder sb = new StringBuilder(len);
for (int i = 0; i < len; i++)
{
sb.append(AB.charAt(rnd.nextInt(AB.length())));
}
return sb.toString();
}
public static void main(String[] args)
{
int maxSize = 100000; // array size
SortedWordArray arr; // reference to array
int[] indices = new int[27];
arr = new SortedWordArray(maxSize); // create the array
for (int i = 0; i < 100000; i++)
{
arr.insert(randomString(10)); //insert it into the array
}
arr.display(); // display array
String searchKey;
for (int i = 97; i < 124; i++)
{
searchKey = (i == 123)?"1":Character.toString((char) i);
long time_1 = System.currentTimeMillis();
int result = arr.find(searchKey);
long time_2 = System.currentTimeMillis() - time_1;
if (result != -1)
{
indices[i - 97] = result;
System.out.println("Found " + result + " in " + time_2 + " ms");
}
else
{
if (!(i == 97))
{
indices[i - 97] = indices[i - 97 - 1];
}
System.out.println("Can't find " + searchKey);
}
}
for (int i = 0; i < indices.length; i++)
{
System.out.println("Index [" + i + "][" + (char)(i+97)+"] = " + indices[i]);
}
} // end main()
}
All comments welcome.
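For comparison, the same lookup can be written as an iterative lower-bound search. This is a minimal sketch rather than the code above: it assumes the list is sorted in ascending natural String order (the same order compareTo gives), and the names PrefixSearch and firstIndexWithPrefix are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class PrefixSearch {
    // Finds the first position whose element is >= prefix (lower bound),
    // then checks whether that element actually starts with the prefix.
    static int firstIndexWithPrefix(List<String> sorted, String prefix) {
        int lo = 0;
        int hi = sorted.size(); // search in the half-open range [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).compareTo(prefix) < 0) {
                lo = mid + 1; // everything up to mid sorts before the prefix
            } else {
                hi = mid;     // mid may be the answer, keep it in range
            }
        }
        return (lo < sorted.size() && sorted.get(lo).startsWith(prefix)) ? lo : -1;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("adam", "bob", "eve", "marta", "mia", "mike", "pete");
        System.out.println(firstIndexWithPrefix(words, "m"));
        System.out.println(firstIndexWithPrefix(words, "z"));
    }
}
```

Because the range is half-open, an empty list or a missing prefix falls out of the loop naturally, with no element access outside the valid indices.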
I need to emit a 2D double array as key and value from a mapper. There are similar questions on Stack Overflow, but they are not answered.
I am doing some matrix multiplication on a given dataset, and after that I need to emit A*Atrans, which is a matrix, as the key and Atrans*D, which is also a matrix, as the value. How do I emit these matrices from the mapper? Each value should correspond to its key.
i.e. key -----> A*Atrans ---------> after multiplication the result is a 2D array declared as double, say matrix "Ekey" (double[][] Ekey)
value ------> Atrans*D ---------> after multiplication the result is matrix "Eval" (double[][] Eval).
After that I need to emit these matrices to the reducer for further calculations.
So in mapper:
context.write(Ekey,Eval);
Reducer:
I need to do further calculations with these Ekey and Eval.
I wrote my class:
UPDATE
public class MatrixWritable implements WritableComparable<MatrixWritable>{
/**
* @param args
*/
private double[][] value;
private double[][] values;
public MatrixWritable() {
// TODO Auto-generated constructor stub
setValue(new double[0][0]);
}
public MatrixWritable(double[][] value) {
// TODO Auto-generated constructor stub
this.value = value;
}
public void setValue(double[][] value) {
this.value = value;
}
public double[][] getValue() {
return values;
}
@Override
public void write(DataOutput out) throws IOException {
out.writeInt(value.length); // write values
for (int i = 0; i < value.length; i++) {
out.writeInt(value[i].length);
}
for (int i = 0; i < value.length; i++) {
for (int j = 0; j < value[i].length; j++) {
out.writeDouble(value[i][j]);
}
}
}
@Override
public void readFields(DataInput in) throws IOException {
value = new double[in.readInt()][];
for (int i = 0; i < value.length; i++) {
value[i] = new double[in.readInt()];
}
values = new double[value.length][value[0].length];
for(int i=0;i<value.length ; i++){
for(int j= 0 ; j< value[0].length;j++){
values[i][j] = in.readDouble();
}
}
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + Arrays.hashCode(value);
return result;
}
/* (non-Javadoc)
* @see java.lang.Object#equals(java.lang.Object)
*/
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (!(obj instanceof MatrixWritable)) {
return false;
}
MatrixWritable other = (MatrixWritable) obj;
if (!Arrays.deepEquals(value, other.value)) {
return false;
}
return true;
}
@Override
public int compareTo(MatrixWritable o) {
// TODO Auto-generated method stub
return 0;
}
public String toString() {
String separator = "|";
StringBuffer result = new StringBuffer();
// iterate over the first dimension
for (int i = 0; i < values.length; i++) {
// iterate over the second dimension
for(int j = 0; j < values[i].length; j++){
result.append(values[i][j]);
result.append(separator);
}
// remove the last separator
result.setLength(result.length() - separator.length());
// add a line break.
result.append(",");
}
return result.toString();
}
}
I am able to emit a matrix as a value from the mapper:
context.write(...,new MatrixWritable(AAtrans));
How do I emit the matrix AtransD as a key from the mapper?
For that I need to write the compareTo() method, right?
What should be included in that method?
First, to implement a custom key you must implement WritableComparable. To implement a custom value you must implement Writable. Since it is often handy to be able to swap keys and values, most people write all custom types as WritableComparable.
The section "Writing a Custom Writable" of Hadoop: The Definitive Guide covers writing a WritableComparable.
The trick with writing out an array is that on the read side you need to know how many elements to read. So the basic pattern is...
On write:
write the number of elements
write each element
On read:
read the number of elements (n)
create an array of the appropriate size
read elements 0 to n-1 and populate the array
Update
You should instantiate your array as empty in the default constructor to prevent a NullPointerException later.
The problem with your implementation is that it assumes that each inner array is of the same length. If that is true, you don't need to calculate the column length more than once. If false, you need to write the length of each row before writing the values of the row.
I would suggest something like this:
// inside write(DataOutput out); row is the number of rows, as calculated above
out.writeInt(row);
for (int i=0; i<row; i++){
double[] rowVals = array[i]; // the i-th row
out.writeInt(rowVals.length); // length of this row
for (int j=0; j<rowVals.length; j++)
out.writeDouble(rowVals[j]);
}
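The write/read pattern above, together with one possible ordering for compareTo(), can be tried out with plain java.io streams and no Hadoop dependency. This is a sketch under stated assumptions: MatrixCodec and its methods are hypothetical names, and the ordering (row count, then row length, then element by element) is just one consistent choice, not something Hadoop prescribes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class MatrixCodec {
    // Write the row count, then for each row its length followed by its values.
    static void write(double[][] m, DataOutput out) throws IOException {
        out.writeInt(m.length);
        for (double[] row : m) {
            out.writeInt(row.length);
            for (double v : row) {
                out.writeDouble(v);
            }
        }
    }

    // Mirror image of write(): read the sizes first, then fill the same array.
    static double[][] read(DataInput in) throws IOException {
        double[][] m = new double[in.readInt()][];
        for (int i = 0; i < m.length; i++) {
            m[i] = new double[in.readInt()];
            for (int j = 0; j < m[i].length; j++) {
                m[i][j] = in.readDouble();
            }
        }
        return m;
    }

    // One possible total order for use in compareTo(): fewer rows first,
    // then shorter rows, then the first differing element decides.
    static int compare(double[][] a, double[][] b) {
        if (a.length != b.length) {
            return Integer.compare(a.length, b.length);
        }
        for (int i = 0; i < a.length; i++) {
            if (a[i].length != b[i].length) {
                return Integer.compare(a[i].length, b[i].length);
            }
            for (int j = 0; j < a[i].length; j++) {
                int c = Double.compare(a[i][j], b[i][j]);
                if (c != 0) {
                    return c;
                }
            }
        }
        return 0;
    }

    // Serializes to a byte array and reads it back, to check both sides agree.
    static double[][] roundTrip(double[][] m) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            write(m, new DataOutputStream(bytes));
            return read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whatever ordering is chosen, compare() should return 0 exactly when equals() would return true, so that keys are grouped consistently.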
Here is my problem: I'm modifying code I found for genetic algorithms to do numerical optimization of a function. Essentially, given a function F and our desired value, the program uses a GA to search for values of x and y which produce that desired value.
I keep tinkering with my fitness function, which I feel is the root of the issue.
The basic code breakdown is:
Generate a random chromosome population
Use a bubble sort based on each chromosome's fitness
Check if any of them happen to solve the function
If one solves it, then stop and print it
Else,
Generate children based on the parents
Sort, check the best answer, loop
I hope someone can point me in the right direction. I'm going to dissect it some more tonight, but I seem to have hit a snag. For functions more complex than the one I have hard-coded, it seems to converge around a random percentage (usually less than 20), but it should be much closer to 0. The simple hard-coded function keeps returning around a 99% difference, so I'm not sure what's going on.
import java.util.*;
import java.util.Comparator;
import java.util.TreeMap;
/**
* Modified from a file created Jul 9, 2003
* Original @author Fabian Jones
* Modified @author Cutright
* @version 2
*/
public class ScratchGA
{
private static int NUM_CHROMOSOMES = 100; //num of chromosomes in population
private static double MUTATE = .01; //chance of a mutation in percent (compared against Math.random()*100), i.e. 0.01%
private static int desiredValue = 60466176; //desired value of function
private static int cutoff = 1000; // number of iterations before cut off
private static int longPrint = 0; //1 means print out each iteration of the population
private boolean done = false;
private Chromosome[] population;
int iteration = 0;
/**
* Constructor for objects of class ScratchGA
*/
public ScratchGA()
{
generateRandomPopulation(NUM_CHROMOSOMES);
printPopulation();
}
/**
* Generate a random population of chromosomes - WORKS
*
*/
private void generateRandomPopulation(int pop)
{
System.out.println("Generating random population of " + pop + ", now." +"\n");
population = new Chromosome[pop];
for(int i=0; i<pop; i++)
{
int rand = (int)(Math.random()*4095); // Range 0 to 4095
population[i] = (new Chromosome(rand, 12));
}
}
/**
* Codesaver for generating a new line in the output
*/
private void newLine()
{
System.out.println("\n");
}
/**
* Prints the population (the chromosomes)
*/
private void printPopulation()
{
int x=1; // variable to print 10 chromosomes on a line
if (iteration==0)
{
System.out.println("Initial population: " + "\n" );
}
else
{
if (longPrint ==1)
{
System.out.println("Population " + iteration + " :" + "\n");
for(int i=0; i<=(NUM_CHROMOSOMES-1); i++)
{
System.out.print(population[i] + " ");
if(x == 10)
{
newLine();
x=1;
}
else
{
x++;
}
}
newLine();
}
else
{
System.out.println("Best answer for iteration " + iteration + " is: " + population[0] + " with a % difference of " +population[0].getFitness());
newLine();
}
}
}
/**
* Start
* Bubblesort initial population by their fitness, see if the first chromosome
* in the sorted array satisfies our constraint.
* IF done ==true or max num of iterations
* Print best solution and its fitness
* ELSE
* generate new population based on old one, and continue on
*/
public void start()
{
// System.out.println("Starting bubblesort... Please Wait.");
bubbleSort();
//System.out.println("After Bubblesort: " );
//printPopulation();
topFitness();
if(done || iteration==cutoff){
System.out.println("DONE!!");
System.out.println("Best solution: " + population[0] + " % Difference: " + population[0].getFitness());
}
else{
iteration++;
generateNewPopulation();
printPopulation();
start();
}
}
/**
* If the top chromosomes fitness (after being sorted by bubblesort) is 100%
* done == true
*/
private void topFitness()
{
if (population[0].getFitness() == 0)
{
done = true;
}
}
/**
* Called from chromosome,
* Tests the x and y values in the function and returns their output
*/
public static double functionTest(int x, int y)
{
return (3*x)^(2*y); // From our desired value we're looking for x=2, y=5
}
/**
* Returns the desired outcome of the function, with ideal x and y
* Stored above in a private static
*/
public static int getDesired()
{
return desiredValue;
}
/**
* Sort Chromosome array, based on fitness
* utilizes a bubblesort
*/
private void bubbleSort()
{
Chromosome temp;
for(int i=0; i<NUM_CHROMOSOMES; i++){
for(int j=1; j<(NUM_CHROMOSOMES-i); j++){
if(population[j-1].getFitness() > population[j].getFitness())
{
//swap elements
temp = population[j-1];
population[j-1] = population[j];
population[j] = temp;
}
}
}
}
/**
* Top 30: Elitism
* Next 60: Offspring of Elitists
* Next 10: Random
*/
private void generateNewPopulation(){
System.out.println("***Generating New Population");
Chromosome[] temp = new Chromosome[100];
for (int i = 0; i < 30; i++)
{
Chromosome x = population[i];
if (shouldMutate())
mutate(x);
temp[i]=x;
}
for (int i = 0; i < 30; i++)
{
temp[i+30] =cross1(population[i], population[i+1]);
temp[i+60] = cross2(population[i], population[i+1]);
}
for (int i = 90; i<100; i++)
{
int rand = (int)(Math.random()*4095); // Range 0 to 4095
Chromosome x = new Chromosome(rand, 12);
temp[i] = x;
}
population = temp;
}
/**
* First cross type, with two parents
*/
private Chromosome cross1(Chromosome parent1, Chromosome parent2){
String bitS1 = parent1.getBitString();
String bitS2 = parent2.getBitString();
int length = bitS1.length();
int num = (int)(Math.random()*length); // number from 0 to length-1
String newBitString = bitS2.substring(0, num) + bitS1.substring(num, length);
Chromosome offspring = new Chromosome();
offspring.setBitString(newBitString);
if(shouldMutate()){
mutate(offspring);
}
return offspring;
}
/**
* Second cross type, parents given in same order as first, but reverses internal workings
*/
private Chromosome cross2(Chromosome parent1, Chromosome parent2){
String bitS1 = parent1.getBitString();
String bitS2 = parent2.getBitString();
int length = bitS1.length();
int num = (int)(Math.random()*length); // number from 0 to length-1
String newBitString = bitS2.substring(0, num) + bitS1.substring(num, length);
Chromosome offspring = new Chromosome();
offspring.setBitString(newBitString);
if(shouldMutate()){
mutate(offspring);
}
return offspring;
}
/**
* Returns a boolean of whether a character should mutate based on the mutation value at top
*/
private boolean shouldMutate(){
double num = Math.random()*100;
return (num <= MUTATE);
}
/**
* Flips one randomly chosen bit in the given chromosome's bit string
*/
private void mutate(Chromosome offspring){
String s = offspring.getBitString();
int num = s.length();
int index = (int) (Math.random()*num);
String newBit = flip(s.substring(index, index+1));
String newBitString = s.substring(0, index) + newBit + s.substring(index+1, s.length());
offspring.setBitString(newBitString);
}
/**
* Flips bits in a string 1 to 0, 0 to 1
*/
private String flip(String s){
return s.equals("0")? "1":"0";
}
}
import java.lang.Comparable;
import java.math.*;
/**
* Modified from a file created on Jul 9, 2003
* Unsure of original author
*
*/
public class Chromosome implements Comparable
{
protected String bitString;
/**
* Constructor for objects of class Chromosome
*/
public Chromosome()
{
}
public Chromosome(int value, int length)
{
bitString = convertIntToBitString(value, length);
}
public void setBitString(String s)
{
bitString = s;
}
public String getBitString()
{
return bitString;
}
public int compareTo(Object o)
{
Chromosome c = (Chromosome) o;
int num = countOnes(this.bitString) - countOnes(c.getBitString());
return num;
}
public double getFitness()
{
String working = bitString;
int x1 = Integer.parseInt(working.substring(0,6),2);
int x2 = Integer.parseInt(working.substring(6),2);
double result = ScratchGA.functionTest(x1,x2);
double percentDiff = ((ScratchGA.getDesired() - result)/ScratchGA.getDesired())*100;
if (percentDiff >= 0)
{
return percentDiff;
}
else
{
return -percentDiff;
}
}
public boolean equals(Object o)
{
if(o instanceof Chromosome)
{
Chromosome c = (Chromosome) o;
return c.getBitString().equals(bitString);
}
return false;
}
public int hashCode()
{
return bitString.hashCode();
}
public String toString()
{
return bitString;
}
public static int countOnes(String bits)
{
int sum = 0;
for(int i = 0; i < bits.length(); ++ i){
String test = bits.substring(i, i+1);
if(test.equals("1")){
sum = sum + 1;
}
}
return sum;
}
public static String convertIntToBitString(int val, int length)
{
int reval = val;
StringBuffer bitString = new StringBuffer(length);
for(int i = length-1; i >=0; --i ){
if( reval - (Math.pow(2, i)) >= 0 ){
bitString.append("1");
reval = (int) (reval - Math.pow(2, i));
}
else{
bitString.append("0");
}
}
return bitString.toString();
}
public static void main(String[] args){
//System.out.println(convertIntToBitString(2046, 10));
Chromosome c = new Chromosome(1234, 10);
//System.out.println(c.fitness());
}
}
Actually, it was a simple error that eluded me and that I should have caught. The major issue was using return (3*x)^(2*y); in Java, ^ is bitwise XOR, not exponentiation. (Whoops.) The problem was rectified by using Math.pow(3*x,2*y); and after a little double check of the fitness function and some other minor changes, it was up and running :)
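The difference is easy to see with the values the code comments name as the target, x=2 and y=5. A small sketch (the class and method names are made up for illustration):

```java
final class PowVsXor {
    // What the original functionTest computed: bitwise XOR of 3*x and 2*y.
    static int xorVersion(int x, int y) {
        return (3 * x) ^ (2 * y);
    }

    // What was intended: (3*x) raised to the power (2*y).
    static double powVersion(int x, int y) {
        return Math.pow(3 * x, 2 * y);
    }

    public static void main(String[] args) {
        System.out.println("XOR: " + xorVersion(2, 5)); // 6 XOR 10
        System.out.println("pow: " + powVersion(2, 5)); // 6 to the power 10
    }
}
```

With XOR, 6 ^ 10 is 12, nowhere near the desired value 60466176, which explains the roughly 99% difference; Math.pow(6, 10) gives exactly 60466176.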
In a C# 4.0 application, I have a Dictionary of strongly typed ILists that all have the same length: a dynamic, strongly typed, column-based table.
I want the user to provide one or more (Python) expressions based on the available columns, which will be aggregated over all rows. In a static context it would be:
IDictionary<string, IList> table;
// ...
IList<int> a = table["a"] as IList<int>;
IList<int> b = table["b"] as IList<int>;
double sum = 0;
for (int i = 0; i < n; i++)
sum += (double)a[i] / b[i]; // Expression to sum up
For n = 10^7 this runs in 0.270 sec on my laptop (win7 x64). Replacing the expression with a delegate taking two int arguments takes 0.580 sec; an untyped delegate takes 1.19 sec.
Creating the delegate from IronPython with
IDictionary<string, IList> table;
// ...
var options = new Dictionary<string, object>();
options["DivisionOptions"] = PythonDivisionOptions.New;
var engine = Python.CreateEngine(options);
string expr = "a / b";
Func<int, int, double> f = engine.Execute("lambda a, b : " + expr);
IList<int> a = table["a"] as IList<int>;
IList<int> b = table["b"] as IList<int>;
double sum = 0;
for (int i = 0; i < n; i++)
sum += f(a[i], b[i]);
it takes 3.2 sec (and 5.1 sec with Func<object, object, object>), which is slower by a factor of 4 to 5.5. Is this the expected overhead for what I'm doing? What could be improved?
If I have many columns, the approach chosen above will not be sufficient any more. One solution could be to determine the required columns for each expression and use only those as arguments. The other solution I've unsuccessfully tried was using a ScriptScope and dynamically resolve the columns. For that I defined a RowIterator that has a RowIndex for the active row and a property for each column.
class RowIterator
{
IList<int> la;
IList<int> lb;
public RowIterator(IList<int> a, IList<int> b)
{
this.la = a;
this.lb = b;
}
public int RowIndex { get; set; }
public int a { get { return la[RowIndex]; } }
public int b { get { return lb[RowIndex]; } }
}
A ScriptScope can be created from an IDynamicMetaObjectProvider, which I expected to be implemented by C#'s dynamic, but at runtime the call is bound to engine.CreateScope(IDictionary), which fails.
dynamic iterator = new RowIterator(a, b) as dynamic;
var scope = engine.CreateScope(iterator);
var expr = engine.CreateScriptSourceFromString("a / b").Compile();
double sum = 0;
for (int i = 0; i < n; i++)
{
iterator.Index = i;
sum += expr.Execute<double>(scope);
}
Next I tried letting RowIterator inherit from DynamicObject and got a running example, but with terrible performance: 158 sec.
class DynamicRowIterator : DynamicObject
{
Dictionary<string, object> members = new Dictionary<string, object>();
IList<int> la;
IList<int> lb;
public DynamicRowIterator(IList<int> a, IList<int> b)
{
this.la = a;
this.lb = b;
}
public int RowIndex { get; set; }
public int a { get { return la[RowIndex]; } }
public int b { get { return lb[RowIndex]; } }
public override bool TryGetMember(GetMemberBinder binder, out object result)
{
if (binder.Name == "a") // Why does this happen?
{
result = this.a;
return true;
}
if (binder.Name == "b")
{
result = this.b;
return true;
}
if (base.TryGetMember(binder, out result))
return true;
if (members.TryGetValue(binder.Name, out result))
return true;
return false;
}
public override bool TrySetMember(SetMemberBinder binder, object value)
{
if (base.TrySetMember(binder, value))
return true;
members[binder.Name] = value;
return true;
}
}
I was surprised that TryGetMember is called with the name of the properties. From the documentation I would have expected that TryGetMember would only be called for undefined properties.
For sensible performance I would probably need to implement IDynamicMetaObjectProvider for my RowIterator to make use of dynamic CallSites, but I couldn't find a suitable example to start with. In my experiments I didn't know how to handle __builtins__ in BindGetMember:
class Iterator : IDynamicMetaObjectProvider
{
IList<int> la;
IList<int> lb;
public Iterator(IList<int> a, IList<int> b)
{
this.la = a;
this.lb = b;
}
public int RowIndex { get; set; }
public int a { get { return la[RowIndex]; } }
public int b { get { return lb[RowIndex]; } }
public DynamicMetaObject GetMetaObject(Expression parameter)
{
return new MetaObject(parameter, this);
}
private class MetaObject : DynamicMetaObject
{
internal MetaObject(Expression parameter, Iterator self)
: base(parameter, BindingRestrictions.Empty, self) { }
public override DynamicMetaObject BindGetMember(GetMemberBinder binder)
{
switch (binder.Name)
{
case "a":
case "b":
Type type = typeof(Iterator);
string methodName = binder.Name;
Expression[] parameters = new Expression[]
{
Expression.Constant(binder.Name)
};
return new DynamicMetaObject(
Expression.Call(
Expression.Convert(Expression, LimitType),
type.GetMethod(methodName),
parameters),
BindingRestrictions.GetTypeRestriction(Expression, LimitType));
default:
return base.BindGetMember(binder);
}
}
}
}
I'm sure my code above is suboptimal; at the very least it doesn't handle the IDictionary of columns yet. I would be grateful for any advice on how to improve the design and/or performance.
I also compared the performance of IronPython against a C# implementation. The expression is simple, just adding the values of two arrays at a specified index. Accessing the arrays directly provides the baseline and theoretical optimum. Accessing the values via a symbol dictionary still has acceptable performance.
The third test creates a delegate from a naive (and intentionally bad) expression tree without any fancy stuff like call-site caching, but it's still faster than IronPython.
Scripting the expression via IronPython takes the most time. My profiler shows that most of the time is spent in PythonOps.GetVariable, PythonDictionary.TryGetValue and PythonOps.TryGetBoundAttr. I think there's room for improvement.
Timings:
Direct: 00:00:00.0052680
via Dictionary: 00:00:00.5577922
Compiled Delegate: 00:00:03.2733377
Scripted: 00:00:09.0485515
Here's the code:
public static void PythonBenchmark()
{
var engine = Python.CreateEngine();
int iterations = 1000;
int count = 10000;
int[] a = Enumerable.Range(0, count).ToArray();
int[] b = Enumerable.Range(0, count).ToArray();
Dictionary<string, object> symbols = new Dictionary<string, object> { { "a", a }, { "b", b } };
Func<int, object> calculate = engine.Execute("lambda i: a[i] + b[i]", engine.CreateScope(symbols));
var sw = Stopwatch.StartNew();
int sum = 0;
for (int iteration = 0; iteration < iterations; iteration++)
{
for (int i = 0; i < count; i++)
{
sum += a[i] + b[i];
}
}
Console.WriteLine("Direct: " + sw.Elapsed);
sw.Restart();
for (int iteration = 0; iteration < iterations; iteration++)
{
for (int i = 0; i < count; i++)
{
sum += ((int[])symbols["a"])[i] + ((int[])symbols["b"])[i];
}
}
Console.WriteLine("via Dictionary: " + sw.Elapsed);
var indexExpression = Expression.Parameter(typeof(int), "index");
var indexerMethod = typeof(IList<int>).GetMethod("get_Item");
var lookupMethod = typeof(IDictionary<string, object>).GetMethod("get_Item");
Func<string, Expression> getSymbolExpression = symbol => Expression.Call(Expression.Constant(symbols), lookupMethod, Expression.Constant(symbol));
var addExpression = Expression.Add(
Expression.Call(Expression.Convert(getSymbolExpression("a"), typeof(IList<int>)), indexerMethod, indexExpression),
Expression.Call(Expression.Convert(getSymbolExpression("b"), typeof(IList<int>)), indexerMethod, indexExpression));
var compiledFunc = Expression.Lambda<Func<int, object>>(Expression.Convert(addExpression, typeof(object)), indexExpression).Compile();
sw.Restart();
for (int iteration = 0; iteration < iterations; iteration++)
{
for (int i = 0; i < count; i++)
{
sum += (int)compiledFunc(i);
}
}
Console.WriteLine("Compiled Delegate: " + sw.Elapsed);
sw.Restart();
for (int iteration = 0; iteration < iterations; iteration++)
{
for (int i = 0; i < count; i++)
{
sum += (int)calculate(i);
}
}
Console.WriteLine("Scripted: " + sw.Elapsed);
Console.WriteLine(sum); // make sure cannot be optimized away
}
Although I don't know all the specific details in your case, a slowdown of only 5x for doing anything this low level in IronPython is actually pretty good. Most entries in the Computer Language Benchmarks Game show a 10-30x slowdown.
A major part of the reason is that IronPython has to allow for the possibility that you've done something sneaky at runtime, and thus can't produce code of the same efficiency.