Understanding of results produced by parallel Java 8 streams - java-8

Not getting the same results every time when I run a parallel stream for reading a file and processing it.
I have data regarding pizzas and want to have a count of different variables using Map and global variables. I should use global variables only. But when I run my code, I am getting different results every time. The input file is not modified at all.
package Assignment;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class GlobalVariables {
static int vegPizzas = 0;
static int N_V_Pizzas = 0;
static int Size_regular = 0;
static int Size_medium = 0;
static int Size_large = 0;
static int Cheese_Burst = 0;
static int Cheese_regular = 0;
static int cheap_cheese = 0;
static Stream<String> reader;
static int rows = 0;
public static void main(String[] args) {
// TODO Auto-generated method stub
try {
reader = Files.lines(Paths.get("data/SampleData.csv")).parallel();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
reader.map(x -> x.split(","))
.filter(x -> (!(x[0].equals("NULL") || x[1].equals("NULL") || x[2].equals("NULL") || x[3].equals("NULL"))))
.map(x -> updateCounts(x)).collect(Collectors.toList());
printResults();
System.out.println(rows);//to have a count of rows after filtering of NULL values is done
}
private static void printResults() {
// TODO Auto-generated method stub
System.out.println("veg pizzas: " + vegPizzas);
System.out.println("non veg pizzas: " + N_V_Pizzas);
System.out.println("regular: " + Size_regular + " medium: " + Size_medium + " large: " +Size_large);
System.out.println("cheese burst pizzas: " + Cheese_Burst);
System.out.println("regular cheese burst: " + Cheese_regular);
System.out.println("cheaper cheese burst: " + cheap_cheese);
}
private static Object updateCounts(String[] x) {
// TODO Auto-generated method stub
// static int vegPizzas = 0;
// static int N_V_Pizzas = 0;
// static int Size_regular = 0;
// static int Size_medium = 0;
// static int Size_large = 0;
// static int Cheese_Burst = 0;
// static int Cheese_regular = 0;
// static int cheap_cheese = 0;
rows++;
int flag_regular = 0;
if(x[9].equals("Y")) {
vegPizzas++;
}else if(x[9].equals("N")) {
N_V_Pizzas++;
}
if(x[6].equals("R")) {
Size_regular++;
flag_regular = 1;
}
else if(x[6].equals("M")) {
Size_medium++;
}
else if(x[6].equals("L")) {
Size_large++;
}
if(x[5].equals("Y")) {
Cheese_Burst++;
if(flag_regular == 1) {
Cheese_regular++;
}
if(Integer.parseInt(x[7]) < 500) {
cheap_cheese++;
}
}
return x;
}
}
//True results or expected results (counts of each variety)
veg: 5303
non-veg: 1786
regular: 1779 medium: 2660 large: 2650
cheese-burst: 3499
regular cheese burst: 900
cheaper cheese burst: 598
//Run -1 results
veg pizzas: 5296
non veg pizzas: 1785
regular: 1779 medium: 2660 large: 2649
cheese burst pizzas: 3498
regular cheese burst: 900
cheaper cheese burst: 598
7060
//Run-2 results
veg pizzas: 5294
non veg pizzas: 1786
regular: 1779 medium: 2659 large: 2648
cheese burst pizzas: 3497
regular cheese burst: 900
cheaper cheese burst: 598
7055
//Run-3 results
veg pizzas: 5303
non veg pizzas: 1786
regular: 1779 medium: 2660 large: 2650
cheese burst pizzas: 3499
regular cheese burst: 900
cheaper cheese burst: 598
7086
I did go through this link. I could not relate my problem with the problem posted in that link. I did notice that if I create a sequential stream I am getting the expected results. Any lead on this can be helpful.

Your updateCounts(String[] x) method, called by the map step of the Stream pipeline is not thread-safe, and it updates static variables.
So when it is called by multiple threads concurrently, it is expected to produce different results in each run (i.e. the final values of the static variables will be different in each run).
The Function passed to map shouldn't have side effects, especially not when used in a parallel Stream.
A better way to make this computation with Streams:
Create a PizzaStatistics class having all the original static counter variables as instance variables.
updateCounts (which should be renamed) would return a new PizzaStatistics instance where the relevant counters are set to 1. It won't update any static variables.
The Stream pipeline would use the terminal operation reduce to produce a single PizzaStatistics instance that contains the totals.

Related

Special numbers challenge in programming

First, sorry for my bad English.
Special numbers are numbers that the sum of the digits is divisible to the number of the digit.
Example: 135 is a special number because the sum of the digits is 1+3+5 = 9, the number of the digit is 3, and 9 is divisible to 3 because 9 % 3 == 0. 2,3,9,13,17,15,225, 14825 are also special numbers.
Requirement:
Write a program that read the number n (n <= 10^6) from a file named SNUMS.INP (SNUMS.INP can contain up to 10^6 numbers) and print the result out into the file SNUMS.OUT. Number n is the order of the special number and the result will be that special number in n order (sorry I don't know how to express it).
Example: n = 3 means you have to print out the 3rd special number which is 3, n = 10 you have to print out 10th special number which is 11, n = 13 you have to print out 13th special number which is 17, n = 15 you have to print out 15th special number which is 20.
The example bellow will demonstrate the file SNUMS.INP and SNUMS.OUT (Remember: SNUMS.INP can contain up to 10^6 numbers)
SNUMS.INP:
2
14
17
22
SNUMS.OUT:
2
19
24
35
I have my own alogrithm but the the running time exceeds 1 second (my SNUMS.INP has 10^6 numbers). So I need the optimal alogrithm so that the running time will be less than or equal 1s.
Guys I decide to post my own code which is written in Java, it always take more than 4 seconds to run. Could you guys please suggest some ideas to improve or how to make it run faster
import java.util.Scanner;
import java.io.*;
public class Test
{
public static void main(String[]args) throws IOException
{
File file = new File("SNUMS.INP");
Scanner inputFile = new Scanner(file);
int order = 1;
int i = 1;
int[] special = new int[1000000+1];
// Write all 10^6 special numbers into an array named "special"
while (order <= 1000000)
{
if (specialNumber(i) == true)
{
special[order] = i;
order++;
}
i++;
}
// Write the result to file
PrintWriter outputFile = new PrintWriter("SNUMS.OUT");
outputFile.println(special[inputFile.nextInt()]);
while (inputFile.hasNext())
outputFile.println(special[inputFile.nextInt()]);
outputFile.close();
}
public static boolean specialNumber(int i)
{
// This method check whether the number is a special number
boolean specialNumber = false;
byte count=0;
long sum=0;
while (i != 0)
{
sum = sum + (i % 10);
count++;
i = i / 10;
}
if (sum % count == 0) return true;
else return false;
}
}
This is file SNUMS.INP (sample) contains 10^6 numbers if you guys want to test.
https://drive.google.com/file/d/0BwOJpa2dAZlUNkE3YmMwZmlBOTg/view?usp=sharing
I've managed to solve it in 0.6 seconds on C# 6.0 (.Net 4.6 IA-64) at Core i7 3.2 GHz with HDD 7200 rpc; hope that precompution will be fast enough at your workstation:
// Precompute beautiful numbers
private static int[] BeautifulNumbers(int length) {
int[] result = new int[length];
int index = 0;
for (int i = 1; ; ++i) {
int sum = 0;
int count = 0;
for (int v = i; v > 0; sum += v % 10, ++count, v /= 10)
;
if (sum % count == 0) {
result[index] = i;
if (++index >= result.Length)
return result;
}
}
}
...
// Test file with 1e6 items
File.WriteAllLines(#"D:\SNUMS.INP", Enumerable
.Range(1, 1000000)
.Select(index => index.ToString()));
...
Stopwatch sw = new Stopwatch();
sw.Start();
// Precomputed numbers (about 0.3 seconds to be created)
int[] data = BeautifulNumbers(1000000);
// File (about 0.3 seconds for both reading and writing)
var result = File
.ReadLines(#"D:\SNUMS.INP")
.Select(line => data[int.Parse(line) - 1].ToString());
File.WriteAllLines(#"D:\SNUMS.OUT", result);
sw.Stop();
Console.Write("Elapsed time {0}", sw.ElapsedMilliseconds);
The output vary from
Elapsed time 516
to
Elapsed time 660
with average elapsed time at about 580 milliseconds
Now that you have the metaphor of abacus implemented below, here are some hints
instead of just incrementing with 1 inside a cycle, can we incremente more aggressively? Indeed we can, but with an extra bit of care.
first, how much aggressive we can be? Looking to 11 (first special with 2 digits), it doesn't pay to just increment by 1, we can increment it by 2. Looking to 102 (special with 3 digits), we can increment it by 3. Is it natural to think we should use increments equal with the number of digits?
now the "extra bit of care" - whenever the "increment by the number of digits" causes a "carry", the naive increment breaks. Because the carry will add 1 to the sum of digits, so that we may need to subtract that one from something to keep the sum of digits well behaved.
one of the issues in the above is that we jumped quite happily at "first special with N digits", but the computer is not us to see it at a glance. Fortunately, the "first special with N digits" is easy to compute: it is 10^(N-1)+(N-1) - 10^(N-1) brings an 1 and the rest is zero, and N-1 brings the rest to make the sum of digits be the first divisible with N. Of course, this will break down if N > 10, but fortunately the problem is limited to 10^6 special numbers, which will require at most 7 digits (the millionth specual number is 6806035 - 7 digits);
so, we can detect the "first special number with N digits" and we know we should try with care to increment it by N. Can we look now better into that "extra care"?.
The code - twice as speedy as the previous one and totally "orthodox" in obtaining the data (via getters instead of direct access to data members).
Feel free to inline:
import java.util.ArrayList;
import java.util.Arrays;
public class Abacus {
static protected int pow10[]=
{1,10,100,1000, 10000, 100000, 1000000, 10000000, 100000000}
;
// the value stored for line[i] corresponds to digit[i]*pow10[i]
protected int lineValues[];
protected int sumDigits;
protected int representedNumber;
public Abacus() {
this.lineValues=new int[0];
this.sumDigits=0;
this.representedNumber=0;
}
public int getLineValue(int line) {
return this.lineValues[line];
}
public void clearUnitLine() {
this.sumDigits-=this.lineValues[0];
this.representedNumber-=this.lineValues[0];
this.lineValues[0]=0;
}
// This is how you operate the abacus in real life being asked
// to add a number of units to the line presenting powers of 10
public boolean addWithCarry(int units, int line) {
if(line-1==pow10.length) {
// don't have enough pow10 stored
pow10=Arrays.copyOf(pow10, pow10.length+1);
pow10[line]=pow10[line-1]*10;
}
if(line>=this.lineValues.length) {
// don't have enough lines for the carry
this.lineValues=Arrays.copyOf(this.lineValues, line+1);
}
int digitOnTheLine=this.lineValues[line]/pow10[line];
int carryOnTheNextLine=0;
while(digitOnTheLine+units>=10) {
carryOnTheNextLine++;
units-=10;
}
if(carryOnTheNextLine>0) {
// we have a carry, the sumDigits will be affected
// 1. the next two statememts are equiv with "set a value of zero on the line"
this.sumDigits-=digitOnTheLine;
this.representedNumber-=this.lineValues[line];
// this is the new value of the digit to set on the line
digitOnTheLine+=units;
// 3. set that value and keep all the values synchronized
this.sumDigits+=digitOnTheLine;
this.lineValues[line]=digitOnTheLine*pow10[line];
this.representedNumber+=this.lineValues[line];
// 4. as we had a carry, the next line will be affected as well.
this.addWithCarry(carryOnTheNextLine, line+1);
}
else { // we an simply add the provided value without carry
int delta=units*pow10[line];
this.lineValues[line]+=delta;
this.representedNumber+=delta;
this.sumDigits+=units;
}
return carryOnTheNextLine>0;
}
public int getSumDigits() {
return this.sumDigits;
}
public int getRepresentedNumber() {
return this.representedNumber;
}
public int getLinesCount() {
return this.lineValues.length;
}
static public ArrayList<Integer> specials(int N) {
ArrayList<Integer> ret=new ArrayList<>(N);
Abacus abacus=new Abacus();
ret.add(1);
abacus.addWithCarry(1, 0); // to have something to add to
int increment=abacus.getLinesCount();
while(ret.size()<N) {
boolean hadCarry=abacus.addWithCarry(increment, 0);
if(hadCarry) {
// need to resynch the sum for a perfect number
int newIncrement=abacus.getLinesCount();
abacus.clearUnitLine();
if(newIncrement!=increment) {
// we switched powers of 10
abacus.addWithCarry(newIncrement-1, 0);
increment=newIncrement;
}
else { // simple carry
int digitsSum=abacus.getSumDigits();
// how much we should add to the last digit to make the sumDigits
// divisible again with the increment?
int units=increment-digitsSum % increment;
if(units<increment) {
abacus.addWithCarry(units, 0);
}
}
}
ret.add(abacus.getRepresentedNumber());
}
return ret;
}
// to understand how the addWithCarry works, try the following code
static void add13To90() {
Abacus abacus; // starts with a represented number of 0
// line==1 means units of 10^1
abacus.addWithCary(9, 1); // so this should make the abacus store 90
System.out.println(abacus.getRepresentedNumber());
// line==0 means units of 10^0
abacus.addWithCarry(13, 0);
System.out.println(abacus.getRepresentedNumber()); // 103
}
static public void main(String[] args) {
int count=1000000;
long t1=System.nanoTime();
ArrayList<Integer> s1=Abacus.specials(count);
long t2=System.nanoTime();
System.out.println("t:"+(t2-t1));
}
}
Constructing the numbers from their digits is bound to be faster.
Remember the abacus? Ever used one?
import java.util.ArrayList;
public class Specials {
static public ArrayList<Integer> computeNSpecials(int N) {
ArrayList<Integer> specials = new ArrayList<>();
int abacus[] = new int[0]; // at index i we have the digit for 10^i
// This way, when we don't have enough specials,
// we simply reallocate the array and continue
while (specials.size() < N) {
// see if a carry operation is necessary
int currDigit = 0;
for (; currDigit < abacus.length && abacus[currDigit] == 9; currDigit++) {
abacus[currDigit] = 0; // a carry occurs when adding 1
}
if (currDigit == abacus.length) {
// a carry, but we don't have enough lines on the abacus
abacus = new int[abacus.length + 1];
abacus[currDigit] = 1; // we resolved the carry, all the digits below
// are 0
} else {
abacus[currDigit]++; // we resolve the carry (if there was one),
currDigit = 0; // now it's safe to continue incrementing at 10^0
}
// let's obtain the current number and the sum of the digits
int sumDigits = 0;
for (int i = 0; i<abacus.length; i++) {
sumDigits += abacus[i];
}
// is it special?
if (sumDigits % abacus.length == 0) {
// only now compute the number and collect it as special
int number = 0;
for (int i = abacus.length - 1; i >= 0; i--) {
number = 10 * number + abacus[i];
}
specials.add(number);
}
}
return specials;
}
static public void main(String[] args) {
ArrayList<Integer> specials=Specials.computeNSpecials(100);
for(int i=0; i<specials.size(); i++) {
System.out.println(specials.get(i));
}
}
}

Efficient tuple search algorithm

Given a store of 3-tuples where:
All elements are numeric ex :( 1, 3, 4) (1300, 3, 15) (1300, 3, 15) …
Tuples are removed and added frequently
At any time the store is typically under 100,000 elements
All Tuples are available in memory
The application is interactive requiring 100s of searches per second.
What are the most efficient algorithms/data structures to perform wild card (*) searches such as:
(1, *, 6) (3601, *, *) (*, 1935, *)
The aim is to have a Linda like tuple space but on an application level
Well, there are only 8 possible arrangements of wildcards, so you can easily construct 6 multi-maps and a set to serve as indices: one for each arrangement of wildcards in the query. You don't need an 8th index because the query (*,*,*) trivially returns all tuples. The set is for tuples with no wildcards; only a membership test is needed in this case.
A multimap takes a key to a set. In your example, e.g., the query (1,*,6) would consult the multimap for queries of the form (X,*,Y), which takes key <X,Y> to the set of all tuples with X in the first position and Y in third. In this case, X=1 and Y=6.
With any reasonable hash-based multimap implementation, lookups ought to be very fast. Several hundred a second ought to be easy, and several thousand per second doable (with e.g a contemporary x86 CPU).
Insertions and deletions require updating the maps and set. Again this ought to be reasonably fast, though not as fast as lookups of course. Again several hundred per second ought to be doable.
With only ~10^5 tuples, this approach ought to be fine for memory as well. You can save a bit of space with tricks, e.g. keeping a single copy of each tuple in an array and storing indices in the map/set to represent both key and value. Manage array slots with a free list.
To make this concrete, here is pseudocode. I'm going to use angle brackets <a,b,c> for tuples to avoid too many parens:
# Definitions
For a query Q <k2,k1,k0> where each of k_i is either * or an integer,
Let I(Q) be a 3-digit binary number b2|b1|b0 where
b_i=0 if k_i is * and 1 if k_i is an integer.
Let N(i) be the number of 1's in the binary representation of i
Let M(i) be a multimap taking a tuple with N(i) elements to a set
of tuples with 3 elements.
Let t be a 3 element tuple. Then T(t,i) returns a new tuple with
only the elements of t in positions where i has a 1. For example
T(<1,2,3>,0) = <> and T(<1,2,3>,6) = <2,3>
Note that function T works fine on query tuples with wildcards.
# Algorithm to insert tuple T into the database:
fun insert(t)
for i = 0 to 7
add the entry T(t,i)->t to M(i)
# Algorithm to delete tuple T from the database:
fun delete(t)
for i = 0 to 7
delete the entry T(t,i)->t from M(i)
# Query algorithm
fun query(Q)
let i = I(Q)
return M(i).lookup(T(Q, i)) # lookup failure returns empty set
Note that for simplicity, I've not shown the "optimizations" for M(0) and M(7). For M(0), the algorithm above would create a multimap taking the empty tuple to the set of all 3-tuples in the database. You can avoid this merely by treating i=0 as a special case. Similarly M(7) would take each tuple to a set containing only itself.
An "optimized" version:
fun insert(t)
for i = 1 to 6
add the entry T(t,i)->t to M(i)
add t to set S
fun delete(t)
for i = 1 to 6
delete the entry T(t,i)->t from M(i)
remove t from set S
fun query(Q)
let i = I(Q)
if i = 0, return S
elsif i = 7 return if Q\in S { Q } else {}
else return M(i).lookup(T(Q, i))
Addition
For fun, a Java implementation:
package hacking;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Random;
import java.util.Scanner;
import java.util.Set;
public class Hacking {
public static void main(String [] args) {
TupleDatabase db = new TupleDatabase();
int n = 200000;
long start = System.nanoTime();
for (int i = 0; i < n; ++i) {
db.insert(db.randomTriple());
}
long stop = System.nanoTime();
double elapsedSec = (stop - start) * 1e-9;
System.out.println("Inserted " + n + " tuples in " + elapsedSec
+ " seconds (" + (elapsedSec / n * 1000.0) + "ms per insert).");
Scanner in = new Scanner(System.in);
for (;;) {
System.out.print("Query: ");
int a = in.nextInt();
int b = in.nextInt();
int c = in.nextInt();
System.out.println(db.query(new Tuple(a, b, c)));
}
}
}
class Tuple {
static final int [] N_ONES = new int[] { 0, 1, 1, 2, 1, 2, 2, 3 };
static final int STAR = -1;
final int [] vals;
Tuple(int a, int b, int c) {
vals = new int[] { a, b, c };
}
Tuple(Tuple t, int code) {
vals = new int[N_ONES[code]];
int m = 0;
for (int k = 0; k < 3; ++k) {
if (((1 << k) & code) > 0) {
vals[m++] = t.vals[k];
}
}
}
#Override
public boolean equals(Object other) {
if (other instanceof Tuple) {
Tuple triple = (Tuple) other;
return Arrays.equals(this.vals, triple.vals);
}
return false;
}
#Override
public int hashCode() {
return Arrays.hashCode(this.vals);
}
#Override
public String toString() {
return Arrays.toString(vals);
}
int code() {
int c = 0;
for (int k = 0; k < 3; k++) {
if (vals[k] != STAR) {
c |= (1 << k);
}
}
return c;
}
Set<Tuple> setOf() {
Set<Tuple> s = new HashSet<>();
s.add(this);
return s;
}
}
class Multimap extends HashMap<Tuple, Set<Tuple>> {
#Override
public Set<Tuple> get(Object key) {
Set<Tuple> r = super.get(key);
return r == null ? Collections.<Tuple>emptySet() : r;
}
void put(Tuple key, Tuple value) {
if (containsKey(key)) {
super.get(key).add(value);
} else {
super.put(key, value.setOf());
}
}
void remove(Tuple key, Tuple value) {
Set<Tuple> set = super.get(key);
set.remove(value);
if (set.isEmpty()) {
super.remove(key);
}
}
}
class TupleDatabase {
final Set<Tuple> set;
final Multimap [] maps;
TupleDatabase() {
set = new HashSet<>();
maps = new Multimap[7];
for (int i = 1; i < 7; i++) {
maps[i] = new Multimap();
}
}
void insert(Tuple t) {
set.add(t);
for (int i = 1; i < 7; i++) {
maps[i].put(new Tuple(t, i), t);
}
}
void delete(Tuple t) {
set.remove(t);
for (int i = 1; i < 7; i++) {
maps[i].remove(new Tuple(t, i), t);
}
}
Set<Tuple> query(Tuple q) {
int c = q.code();
switch (c) {
case 0: return set;
case 7: return set.contains(q) ? q.setOf() : Collections.<Tuple>emptySet();
default: return maps[c].get(new Tuple(q, c));
}
}
Random gen = new Random();
int randPositive() {
return gen.nextInt(1000);
}
Tuple randomTriple() {
return new Tuple(randPositive(), randPositive(), randPositive());
}
}
Some output:
Inserted 200000 tuples in 2.981607358 seconds (0.014908036790000002ms per insert).
Query: -1 -1 -1
[[504, 296, 987], [500, 446, 184], [499, 482, 16], [488, 823, 40], ...
Query: 500 446 -1
[[500, 446, 184], [500, 446, 762]]
Query: -1 -1 500
[[297, 56, 500], [848, 185, 500], [556, 351, 500], [779, 986, 500], [935, 279, 500], ...
If you think of the tuples like a ip address, then a radix tree (trie) type structure might work. Radix tree is used for IP discovery.
Another way maybe to calculate use bit operations and calculate a bit hash for the tuple and in your search do bit (or, and) for quick discovery.

Longest Collatz Sequence

While doing my Java homework which is to implement the Collatz Conjecture, I thought of a different objective which is to find the longest Collatz sequence. My program counts the steps as follows:
public class Collatz {
static int count = 0;
static void bilgi (int n){
int result = n;
System.out.println("Result: "+result+ " Step: "+count);
if (result <= 1) {
result = 1;
} else if (result%2 == 0){
result = result/2;
count = count + 1;
bilgi(result);
} else {
result = (result*3)+1;
count = count + 1;
bilgi(result);
}
}
public static void main(String[] args) {
bilgi(27);
}
}
I want to find the highest step count.
static int bilgi(int n) {
int result = n;
if (result <= 1) return 1;
if (result % 2 == 0) return 1+bilgi(result/2);
return 1+bilgi(3*result+1);
}
Then you collect the results of bilgi(i) calls and select maximal.
The longest progression for any initial starting number less than 100 million is 63,728,127, which has 949 steps. For starting numbers less than 1 billion it is 670,617,279, with 986 steps, and for numbers less than 10 billion it is 9,780,657,630, with 1132 steps
source: http://en.wikipedia.org/wiki/Collatz_conjecture
If you're looking for max between 1 and 100 you could replace:
public static void main(String[] args) {
bilgi(27);
}
with :
public static void main(String[] args) {
static int maxcountsofar = 0;
static int start = 0;
static int thisone = 0;
for (int iloop = 1; iloop <= 100; iloop++)
{
thisone = bilgi(iloop);
if (thisone > maxcountsofar)//if this one is bigger than the highest count so far then
{
start = iloop;//save this information as best so far
maxcountsofar = thisone;
}
}
System.out.println("Result: " + start.Tostring() + " Step: " + maxcountsofar.Tostring() );
//I know this is a really old post but it looked like fun.
}
/*
also, take the println() out of the bilgi() function, it would generate a line for each step encountered which would be worthless and extremely time consuming.
Use Vesper's bigli() because it's much faster than yours.
*/
I know this is an old question, but I was just solving it and I would suggest for anyone doing this, just using an arraylist and getting the .size(), I did it that way, because I wanted to see the values as well.

What algorithm can I use to produce 'Random' value?

Say I have 4 possible results and the probabilities of each result appearing are
1 = 10%
2 = 20%
3 = 30%
4 = 40%
I'd like to write a method like GetRandomValue which if called 1000 times would return
1 x 100 times
2 x 200 times
3 x 300 times
4 x 400 times
Whats the name of an algorithm which would produce such results?
in your case you can generate a random number (int) within 1..10 and if it's 1 then select 1, if it's between 2-3 select 2 and if it's between 4..6 select 3 and if is between 7..10 select 4.
In all if you have some probabilities which sum to 1, you can have a random number within (0,1) distribute your generated result to related value (I simplified in your case within 1..10).
To get a random number you would use the Random class of .Net.
Something like the following would accomplish what you requested:
public class MyRandom
{
private Random m_rand = new Random();
public int GetNextValue()
{
// Gets a random value between 0-9 with equal probability
// and converts it to a number between 1-4 with the probablities requested.
switch (m_rand.Next(0, 9))
{
case 0:
return 1;
case 1: case 2:
return 2;
case 3: case 4: case 5:
return 3;
default:
return 4;
}
}
}
If you just want those probabilities in the long run, you can just get values by randomly selecting one element from the array {1,2,2,3,3,3,4,4,4,4}.
If you however need to retrieve exactly 1000 elements, in those specific quantities, you can try something like this (not C#, but shouldn't be a problem):
import java.util.Random;
import java.util.*;
class Thing{
Random r = new Random();
ArrayList<Integer> numbers=new ArrayList<Integer>();
ArrayList<Integer> counts=new ArrayList<Integer>();
int totalCount;
public void set(int i, int count){
numbers.add(i);
counts.add(count);
totalCount+=count;
}
public int getValue(){
if (totalCount==0)
throw new IllegalStateException();
double pos = r.nextDouble();
double z = 0;
int index = 0;
//we select elements using their remaining counts for probabilities
for (; index<counts.size(); index++){
z += counts.get(index) / ((double)totalCount);
if (pos<z)
break;
}
int result = numbers.get(index);
counts.set( index , counts.get(index)-1);
if (counts.get(index)==0){
counts.remove(index);
numbers.remove(index);
}
totalCount--;
return result;
}
}
class Test{
public static void main(String []args){
Thing t = new Thing(){{
set(1,100);
set(2,200);
set(3,300);
set(4,400);
}};
int[]hist=new int[4];
for (int i=0;i<1000;i++){
int value = t.getValue();
System.out.print(value);
hist[value-1]++;
}
System.out.println();
double sum=0;
for (int i=0;i<4;i++) sum+=hist[i];
for (int i=0;i<4;i++)
System.out.printf("%d: %d values, %f%%\n",i+1,hist[i], (100*hist[i]/sum));
}
}

search tree in scala

I'm trying to put my first steps into Scala, and to practice I took a look at the google code jam storecredit excersize. I tried it in java first, which went well enough, and now I'm trying to port it to Scala. Now with the java collections framework, I could try to do a straight syntax conversion, but I'd end up writing java in scala, and that kind of defeats the purpose. In my Java implementation, I have a PriorityQueue that I empty into a Deque, and pop the ends off untill we have bingo. This all uses mutable collections, which give me the feeling is very 'un-scala'. What I think would be a more functional approach is to construct a datastructure that can be traversed both from highest to lowest, and from lowest to highest. Am I on the right path? Are there any suitable datastructures supplied in the Scala libraries, or should I roll my own here?
EDIT: full code of the much simpler version in Java. It should run in O(max(credit,inputchars)) and has become:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
public class StoreCredit {
private static BufferedReader in;
public static void main(String[] args) {
in = new BufferedReader(new InputStreamReader(System.in));
try {
int numCases = Integer.parseInt(in.readLine());
for (int i = 0; i < numCases; i++) {
solveCase(i);
}
} catch (IOException e) {
e.printStackTrace();
}
}
private static void solveCase(int casenum) throws NumberFormatException,
IOException {
int credit = Integer.parseInt(in.readLine());
int numItems = Integer.parseInt(in.readLine());
int itemnumber = 0;
int[] item_numbers_by_price = new int[credit];
Arrays.fill(item_numbers_by_price, -1); // makes this O(max(credit,
// items)) instead of O(items)
int[] read_prices = readItems();
while (itemnumber < numItems) {
int next_price = read_prices[itemnumber];
if (next_price <= credit) {
if (item_numbers_by_price[credit - next_price] >= 0) {
// Bingo! DinoDNA!
printResult(new int[] {
item_numbers_by_price[credit - next_price],
itemnumber }, casenum);
break;
}
item_numbers_by_price[next_price] = itemnumber;
}
itemnumber++;
}
}
private static int[] readItems() throws IOException {
String line = in.readLine();
String[] items = line.split(" "); // uh-oh, now it's O(max(credit,
// inputchars))
int[] result = new int[items.length];
for (int i = 0; i < items.length; i++) {
result[i] = Integer.parseInt(items[i]);
}
return result;
}
private static void printResult(int[] result, int casenum) {
int one;
int two;
if (result[0] > result[1]) {
one = result[1];
two = result[0];
} else {
one = result[0];
two = result[1];
}
one++;
two++;
System.out.println(String.format("Case #%d: %d %d", casenum + 1, one,
two));
}
}
I'm wondering what you are trying to accomplish using sophisticated data structures such as PriorityQueue and Deque for a problem such as this. It can be solved with a pair of nested loops:
for {
i <- 2 to I
j <- 1 until i
if i != j && P(i-1) + P(j - 1) == C
} println("Case #%d: %d %d" format (n, j, i))
Worse than linear, better than quadratic. Since the items are not sorted, and sorting them would require O(nlogn), you can't do much better than this -- as far as I can see.
Actually, having said all that, I now have figured a way to do it in linear time. The trick is that, for every number p you find, you know what its complement is: C - p. I expect there are a few ways to explore that -- I have so far thought of two.
One way is to build a map with O(n) characteristics, such as a bitmap or a hash map. For each element, make it point to its index. One then only has to find an element for which its complement also has an entry in the map. Trivially, this could be as easily as this:
val PM = P.zipWithIndex.toMap
val (p, i) = PM find { case (p, i) => PM isDefinedAt C - p }
val j = PM(C - p)
However, that won't work if the number is equal to its complement. In other words, if there are two p such that p + p == C. There are quite a few such cases in the examples. One could then test for that condition, and then just use indexOf and lastIndexOf -- except that it is possible that there is only one p such that p + p == C, in which case that wouldn't be the answer either.
So I ended with something more complex, that tests the existence of the complement at the same time the map is being built. Here's the full solution:
import scala.io.Source
object StoreCredit3 extends App {
val source = if (args.size > 0) Source fromFile args(0) else Source.stdin
val input = source getLines ()
val N = input.next.toInt
1 to N foreach { n =>
val C = input.next.toInt
val I = input.next.toInt
val Ps = input.next split ' ' map (_.toInt)
val (_, Some((p1, p2))) = Ps.zipWithIndex.foldLeft((Map[Int, Int](), None: Option[(Int, Int)])) {
case ((map, None), (p, i)) =>
if (map isDefinedAt C - p) map -> Some(map(C - p) -> (i + 1))
else (map updated (p, i + 1), None)
case (answer, _) => answer
}
println("Case #%d: %d %d" format (n, p1, p2))
}
}

Resources