Apache CollectionUtils Performance


I was wondering if anyone has benchmarked Apache CollectionUtils.
In my simple benchmark:
List<Integer> ints = Arrays.asList(3, 4, 6, 7, 8, 0, 9, 2, 5, 2, 1, 35, 11, 44, 5, 1, 2);
long start = System.nanoTime();
ArrayList<Integer> filtered = new ArrayList<Integer>(ints.size());
for (Integer anInt : ints) {
    if (anInt > 10) {
        filtered.add(anInt);
    }
}
long end = System.nanoTime();
System.out.println(filtered + " (" + (end - start) + ")");
Predicate<Integer> predicate = new Predicate<Integer>() {
    @Override
    public boolean evaluate(Integer integer) {
        return integer > 10;
    }
};
start = System.nanoTime();
filtered.clear();
CollectionUtils.select(ints, predicate, filtered);
end = System.nanoTime();
System.out.println(filtered + " (" + (end - start) + ")");
I got the following results:
[35, 11, 44] (127643)
[35, 11, 44] (3060230)
I must say I'm a big fan of this library because it makes the code clean and testable, but I'm currently working on a performance-sensitive project and I'm afraid my affection for this library will hurt performance.
I know this is a really general question, but has anyone used this library in a production environment and noticed performance issues?

Apart from running it multiple times to allow for JVM optimization (I don't know whether, given that Predicate could be a functional interface, the JVM can make use of the invokedynamic bytecode instruction introduced in Java 7), I think your error lies just after the start:
start = System.nanoTime();
filtered.clear();
CollectionUtils.select(ints, predicate,filtered);
end = System.nanoTime();
System.out.println(filtered + " (" + (end - start) + ")");
I don't think you should include the time that filtered.clear() takes in the measurement if you want to compare CollectionUtils with a plain old for-each loop.

Well, you are basically comparing method invocation overhead with inline code with the latter being obviously faster.
As long as you do not do something that really challenges your cpu, I would be very surprised if this would be the cause of performance problems in your application.
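
The points raised in the answers above (run the code many times, keep unrelated work out of the timed region) can be sketched roughly as follows. This is only an illustration under assumptions: filterWithPredicate is a hypothetical stand-in with the same shape as CollectionUtils.select, written with java.util.function.Predicate so that the example needs no external library, and the round count is arbitrary. For measurements you intend to rely on, a dedicated harness such as JMH is the safer choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FilterBenchmarkSketch {
    // Inline loop, as in the question.
    static List<Integer> filterInline(List<Integer> ints) {
        List<Integer> out = new ArrayList<>(ints.size());
        for (Integer i : ints) {
            if (i > 10) {
                out.add(i);
            }
        }
        return out;
    }

    // Hypothetical stand-in for CollectionUtils.select: the test goes through a Predicate call.
    static List<Integer> filterWithPredicate(List<Integer> ints, Predicate<Integer> p) {
        List<Integer> out = new ArrayList<>(ints.size());
        for (Integer i : ints) {
            if (p.test(i)) {
                out.add(i);
            }
        }
        return out;
    }

    // Average nanoseconds per call over the given number of rounds.
    static long averageNanos(Runnable task, int rounds) {
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            task.run();
        }
        return (System.nanoTime() - start) / rounds;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(3, 4, 6, 7, 8, 0, 9, 2, 5, 2, 1, 35, 11, 44, 5, 1, 2);
        Predicate<Integer> overTen = i -> i > 10;

        // Warm-up: let class loading and JIT compilation happen before anything is timed.
        averageNanos(() -> filterInline(ints), 100_000);
        averageNanos(() -> filterWithPredicate(ints, overTen), 100_000);

        System.out.println("inline:    " + averageNanos(() -> filterInline(ints), 100_000) + " ns/call");
        System.out.println("predicate: " + averageNanos(() -> filterWithPredicate(ints, overTen), 100_000) + " ns/call");
    }
}
```

The single-shot timing in the question mostly measures one-time costs such as loading the library classes, which is why averaging over many warmed-up calls usually gives a very different picture.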

Related

One-liner to generate Powerball picks in Swift?

With the U.S.'s large $1.5 Billion lottery this week, I wrote a function in Ruby to make Powerball picks. In Powerball, you choose 5 numbers from the range 1..69 (with no duplicates) and 1 number from the range 1..26.
This is what I came up with:
def pball
Array(1..69).shuffle[0..4].sort + [rand(1..26)]
end
It works by creating an array of integers from 1 to 69, shuffling that array, choosing the first 5 numbers, sorting those, and finally adding on a number from 1 to 26.
To do this in Swift takes a bit more work since Swift doesn't have the built-in shuffle method on Array.
This was my attempt:
func pball() -> [Int] {
let arr = Array(1...69).map{($0, drand48())}.sort{$0.1 < $1.1}.map{$0.0}[0...4].sort()
return arr + [Int(arc4random_uniform(26) + 1)]
}
Since there is no shuffle method, it works by creating an [Int] with values in the range 1...69. It then uses map to create [(Int, Double)], an array of tuple pairs that contain the numbers and a random Double in the range 0.0 ..< 1.0. It then sorts this array using the Double values and uses a second map to return to [Int] and then uses the slice [0...4] to extract the first 5 numbers and sort() to sort them.
In the second line, it appends a number in the range 1...26. I tried adding this to the first line, but Swift gave the error:
Expression was too complex to be solved in reasonable time; consider
breaking up the expression into distinct sub-expressions.
Can anyone suggest how to turn this into a 1-line function? Perhaps there is a better way to choose the 5 numbers from 1...69.

Xcode 8.3 • Swift 3.1
import GameKit
var powerballNumbers: [Int] {
return (GKRandomSource.sharedRandom().arrayByShufflingObjects(in: Array(1...69)) as! [Int])[0..<5].sorted() + [Int(arc4random_uniform(26) + 1)]
}
powerballNumbers // [5, 9, 62, 65, 69, 2]
Swift 2.x
import GameKit
var powerballNumbers: [Int] {
return (GKRandomSource.sharedRandom().arrayByShufflingObjectsInArray(Array(1...69)) as! [Int])[0...4].sort() + [Int(arc4random_uniform(26).successor())]
}
powerballNumbers // [21, 37, 39, 42, 65, 23]

I don't find the "one-liner" concept very compelling. Some languages lend themselves to it; others don't. I would suggest giving Swift a shuffle method to start with:
extension Array {
mutating func shuffle () {
for var i = self.count - 1; i != 0; i-- {
let ix1 = i
let ix2 = Int(arc4random_uniform(UInt32(i+1)))
(self[ix1], self[ix2]) = (self[ix2], self[ix1])
}
}
}
But since I made this mutating, we still need more than one line to express the entire operation because we have to have a var reference to our starting array:
var arr = Array(1...69)
(1...4).forEach {_ in arr.shuffle()}
let result = Array(arr[0..<5]) + [Int(arc4random_uniform(26)) + 1]
If you really insist on the one-liner, and you don't count the code needed to implement shuffle, then you can do it, though less efficiently, by defining shuffle more like this:
extension Array {
func shuffle () -> [Element] {
var arr = self
for var i = arr.count - 1; i != 0; i-- {
let ix1 = i
let ix2 = Int(arc4random_uniform(UInt32(i+1)))
(arr[ix1], arr[ix2]) = (arr[ix2], arr[ix1])
}
return arr
}
}
And here's your one-liner:
let result = Array(1...69).shuffle().shuffle().shuffle().shuffle()[0..<5] + [Int(arc4random_uniform(26)) + 1]
But oops, I omitted your sort. I don't see how to do that without getting the "too complex" error; to work around that, I had to split it into two lines:
var result = Array(1...69).shuffle().shuffle().shuffle().shuffle()[0..<5].sort(<)
result.append(Int(arc4random_uniform(26)) + 1)

How about this:
let winningDraw = (1...69).sort{ _ in arc4random_uniform(2) > 0}[0...4].sort() + [Int(arc4random_uniform(26)+1)]
[edit] The formula above wasn't random, but this one will be:
(1...69).map({Int(rand()%1000*70+$0)}).sort().map({$0%70})[0...4].sort() + [Int(rand()%26+1)]

For the fun of it, a non-GameplayKit (long) one-liner for Swift 3, using the global sequence(state:next:) function to generate random elements from the mutable state array rather than shuffling the array (although mutating the value array 5 times, so some extra copy operations here...)
let powerballNumbers = Array(sequence(state: Array(1...69), next: {
(s: inout [Int]) -> Int? in s.remove(at: Int(arc4random_uniform(UInt32(s.count))))})
.prefix(5).sorted()) + [Int(arc4random_uniform(26) + 1)]
... broken down for readability.
(Possible in future Swift version)
If type inference weren't broken for inout closure parameters (as arguments to closures), we could reduce the above to:
let powerballNumbers = Array(sequence(state: Array(1...69), next: {
$0.remove(at: Int(arc4random_uniform(UInt32($0.count)))) })
.prefix(5).sorted()) + [Int(arc4random_uniform(26) + 1)]
If we'd also allow the following extension
extension Int {
var rand: Int { return Int(arc4random_uniform(UInt32(exactly: self) ?? 0)) }
}
Then we could go on to reduce the one-liner to:
let powerballNumbers = Array(sequence(state: Array(1...69), next: { $0.remove(at: $0.count.rand) }).prefix(5).sorted()) + [26.rand + 1]

Xcode 10 • Swift 4.2
Swift 4.2 added shuffled() to sequences such as ClosedRange and random(in:) to Int, which makes this easy to accomplish in one line:
func pball() -> [Int] {
return (1...69).shuffled().prefix(5).sorted() + [Int.random(in: 1...26)]
}
Further trimmings:
Because of the return type of pball(), the Int can be inferred in the random method call. Also, .prefix(5) can be replaced with [...4]. Finally, return can be omitted from the one-line function:
func pball() -> [Int] {
(1...69).shuffled()[...4].sorted() + [.random(in: 1...26)]
}

How does recursion work in a permutation?

I am looking at a permutation program written here. The code looks like this:
public static void main(String[] args) {
    permutation("", "CAT");
}
private static void permutation(String prefix, String str) {
    int n = str.length();
    if (n == 0) {
        System.out.println(prefix);
    } else {
        for (int i = 0; i < n; i++) {
            permutation(prefix + str.charAt(i), str.substring(0, i) + str.substring(i + 1));
        }
    }
}
For the word CAT, I get the following output:
CAT
CTA
ACT
ATC
TCA
TAC
I can trace through the steps of the recursion and understand how it works to get CAT and CTA, but I don't see how it keeps going. After n == 0 (which is the base case) everything should stop (which would happen after we get CTA).
Other sources:
I read the explanation here, but I'm still having trouble understanding how it keeps going. I feel like I get the concept of recursion. I can use it blindly, but I want to understand HOW it is working here.
There is another version of permutation recursion here, but that is using backtracking, and I understand that one a bit better. It's this tail recursive one I don't understand.
Question:
Can someone please explain how the recursion is working so that we get past CTA in the example above? This isn't homework. I'm just looking into different programs and working through some skillbuilders.
Thanks!

Let's look at what the first call generates:
("" + str.charAt(0), str.substring(0, 0) + str.substring(0 + 1))
p("C", "AT")
("" + str.charAt(1), str.substring(0,1) + str.substring(1 + 1))
p("A", "CT")
("" + str.charAt(2), str.substring(0, 2) + str.substring(2 + 1))
p("T", "CA")
Each call extracts each letter of str and adds it to the current prefix. The first call puts each letter of the original string as the start of a permutation. Then, for each such permutation, the algorithm extracts each letter of the remaining suffix and adds it to the accumulating prefix, so that all possibilities are explored:
C AT
CA T
CT A
"CAT"
"CTA"
A CT
AC T
AT C
"ACT"
"ATC"
T CA
TC A
TA C
"TCA"
"TAC"

Remember that each recursive call has its own state (the values of its local variables and parameters). Only a single call ends after CAT is printed; the others continue where they left off.
Think of each recursive call as a call to an entirely new function that just happens to do the same exact thing.
This is how your function will execute. It might help if you also wrote the values of each local variable (in your case it's just i and the parameters) as well. I just wrote what calls what.
p("", "CAT") -> p("C", "AT") -> p("CA", "T") -> p("CAT", "") -> CAT and return
                                             -> return
                             -> p("CT", "A") -> p("CTA", "") -> CTA and return
                                             -> return
                             -> return
             -> p("A", "CT") -> ...

To understand recursion we must start with "factorial recursion"
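
One way to see the call chain traced in the answers above is to instrument the method so that every call announces when it starts and when it returns. The sketch below is the same algorithm as in the question; the depth parameter, the out list, and the class name PermutationTrace are additions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PermutationTrace {
    // Same recursion as in the question. depth is used only to indent the trace,
    // and out collects the results so they can be inspected afterwards.
    static void permutation(String prefix, String str, int depth, List<String> out) {
        StringBuilder indent = new StringBuilder();
        for (int d = 0; d < depth; d++) {
            indent.append("  ");
        }
        System.out.println(indent + "enter p(\"" + prefix + "\", \"" + str + "\")");
        int n = str.length();
        if (n == 0) {
            // Base case: only THIS call ends here; its callers are still inside their loops.
            out.add(prefix);
        } else {
            for (int i = 0; i < n; i++) {
                permutation(prefix + str.charAt(i),
                            str.substring(0, i) + str.substring(i + 1),
                            depth + 1, out);
            }
        }
        System.out.println(indent + "leave p(\"" + prefix + "\", \"" + str + "\")");
    }

    static List<String> permutations(String s) {
        List<String> out = new ArrayList<>();
        permutation("", s, 0, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(permutations("CAT"));
    }
}
```

In the printed trace, the line leaving p("CTA", "") is followed by the lines leaving p("CT", "A") and p("C", "AT"), and only then does the outermost loop move on to i = 1 and enter p("A", "CT"). The base case stops one branch, not the whole computation.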

Which Scala features have poor performance

I was wondering lately: since Scala runs on the JVM, and the JVM is optimized for certain kinds of operations, are there features whose implementation is really inefficient on the JVM and whose use should therefore be discouraged? Could you also explain why they are inefficient?
The first candidate would be the functional programming features. As far as I know, functions are special classes with an apply method, which obviously creates additional overhead compared to languages where functions are just blocks of code.

Performance tuning is a deep and complex issue, but three things come immediately to mind.
Scala collections are good for expressive power, but not for performance.
Consider:
(1 to 20).map(x => x*x).sum
versus
val a = new Array[Int](20)
var i = 0
while (i < 20) { a(i) = i+1; i += 1 } // (1 to 20)
i = 0
while (i < 20) { a(i) = a(i)*a(i); i += 1 } // map(x => x*x)
var s = 0
i = 0
while (i < 20) { s += a(i); i += 1 } // sum
s
The first is amazingly more compact. The second is 16x faster. Math on integers is really fast; boxing and unboxing is not. The generic collections code is, well, generic, and relies on boxing.
Function2 is only specialized on Int, Long, and Double arguments.
Any other operation on primitives will require boxing. Beware!
Suppose you want to have a function where you can toggle a capability--maybe you want to capitalize letters or not. You try:
def doOdd(a: Array[Char], f: (Char, Boolean) => Char) = {
var i = 0
while (i<a.length) { a(i) = f(a(i), (i&1)==1); i += 1 }
a
}
And then you
val text = "The quick brown fox jumps over the lazy dog".toArray
val f = (c: Char, b: Boolean) => if (b) c.toUpper else c.toLower
scala> println( doOdd(text, f).mkString )
tHe qUiCk bRoWn fOx jUmPs oVeR ThE LaZy dOg
Okay, great! Except what if we
trait Func_CB_C { def apply(c: Char, b: Boolean): Char }
val g = new Func_CB_C {
def apply(c: Char, b: Boolean) = if (b) c.toUpper else c.toLower
}
def doOdd2(a: Array[Char], f: Func_CB_C) = {
var i = 0
while (i<a.length) { a(i) = f(a(i), (i&1)==1); i += 1 }
a
}
instead? Suddenly it's 3x faster. But if it's (Int, Int) => Int, (or any other permutation of Int/Long/Double arguments and Unit/Boolean/Int/Long/Float/Double return values), rolling your own is unnecessary--it's specialized and works at maximum speed.
Just because you can parallelize easily doesn't mean it's a good idea.
Scala's parallel collections will just try to run your code in parallel. It's up to you to make sure there's enough work so that running in parallel is a smart thing to do. There's a lot of overhead in setting up threads and collecting results. Take, for example,
val v = (1 to 1000).to[Vector]
v.map(x => x*(x+1))
versus
val u = (1 to 1000).to[Vector].par
u.map(x => x*(x+1))
The second map is faster, right, because it's parallel?
Hardly! It's 10x slower because of overhead (on my machine; results can vary substantially)
Summary
These are just a few of very many issues that you'll normally never have to worry about except in the most performance-critical parts of your code. There are oodles more, which eventually you'll encounter, but as I mentioned in my comment, it would take a book to cover a decent fraction of them. Note that there are oodles of performance issues in any language, and optimization is often tricky. Save your effort for where it matters!

Amazon Interview- Design Meeting Scheduler [closed]

I recently had an interview in which I was asked to design a meeting scheduler, like the one in the Microsoft Outlook calendar or the Gmail calendar.
I proposed creating an array of 48 entries for each day, with each entry representing a 30-minute slot.
I have to make sure that a new appointment does not collide with an existing meeting.
My solution works fine, but it wastes too much memory.
Can anyone please tell me how to find a better solution for detecting collisions between meetings?
I don't know all the meetings at the beginning; they will be added in arbitrary order later.
Thanks,

Start with an empty list of meetings, each with a start_time and duration. We will maintain a sorted list of meetings by start_time.
To add a meeting to the list, find where it belongs in the list by performing a binary search. Once you find the index, perform two checks to avoid a collision; consider the meetings immediately before and after the to-be-inserted meeting (if they exist).
Assert the before-meeting's start_time + duration does not exceed the new meeting's start_time.
Assert the new meeting's start_time+duration does not exceed the after-meeting's start_time.
If the assertions are satisfied, add the meeting to the list.
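Finding the insertion index takes O(log(list_size)) time. If the list is backed by an array, the insertion itself still takes O(list_size) time, because the later meetings have to be shifted.
Note: This approach assumes that adding a meeting with an overlap is an invalid operation. If overlaps are allowed to exist, you would have to check beyond the meetings immediately preceding and following the new meeting.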
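
The steps above can be sketched as follows. This is a minimal illustration, not a full scheduler: the class name SortedMeetingList is made up, times are assumed to be plain integers (for example minutes since midnight), durations are assumed to be positive, and two meetings that merely touch (one ends exactly when the next starts) are treated as non-colliding, matching the "does not exceed" wording of the two checks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedMeetingList {
    // Parallel lists kept sorted by start time (assumed unit: minutes since midnight).
    private final List<Integer> starts = new ArrayList<>();
    private final List<Integer> durations = new ArrayList<>();

    // Returns true and inserts the meeting if it collides with nothing; otherwise returns false.
    boolean add(int start, int duration) {
        int idx = Collections.binarySearch(starts, start);
        if (idx >= 0) {
            return false; // another meeting already starts at exactly this time
        }
        int pos = -idx - 1; // binarySearch encodes the insertion point as -(pos) - 1

        // Check 1: the meeting before must end no later than the new start.
        if (pos > 0 && starts.get(pos - 1) + durations.get(pos - 1) > start) {
            return false;
        }
        // Check 2: the new meeting must end no later than the next start.
        if (pos < starts.size() && start + duration > starts.get(pos)) {
            return false;
        }
        // Insertion into an array-backed list shifts the later entries.
        starts.add(pos, start);
        durations.add(pos, duration);
        return true;
    }

    int size() {
        return starts.size();
    }
}
```

Only the two neighbours need checking because the list never contains overlapping meetings, so any collision must involve the meeting directly before or directly after the insertion point.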

We can use a tree structure (a BST) to store the requests (Request object: start time, end time, date, priority, etc.). Add, delete, search, and update operations then take O(height_of_tree) time. With a balanced tree, that is O(log n) for each of these operations.
This approach is better than the sorted-list approach: an ArrayList is backed by an array, so inserting in the middle takes O(n) to shift the later elements, and growing the list takes O(n) to copy the elements from the old array to the new one. If we use a linked list instead, binary search is not possible.
Comments welcome!
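
As a rough sketch of the balanced-tree idea, Java's TreeMap (a red-black tree) can map each meeting's start time to its end time. The class name TreeMeetingBook and the use of long timestamps are assumptions for illustration; the answer's Request object with date and priority is not modelled here.

```java
import java.util.Map;
import java.util.TreeMap;

class TreeMeetingBook {
    // start time -> end time (exclusive); the units are whatever the caller uses consistently.
    private final TreeMap<Long, Long> byStart = new TreeMap<>();

    // O(log n): two neighbour lookups plus one insertion in a balanced tree.
    boolean book(long start, long end) {
        if (end <= start) {
            return false; // reject empty or inverted intervals
        }
        // Latest meeting starting at or before the new start: it must already be over.
        Map.Entry<Long, Long> before = byStart.floorEntry(start);
        if (before != null && before.getValue() > start) {
            return false;
        }
        // Earliest meeting starting at or after the new start: it must begin after we end.
        Map.Entry<Long, Long> after = byStart.ceilingEntry(start);
        if (after != null && after.getKey() < end) {
            return false;
        }
        byStart.put(start, end);
        return true;
    }

    // O(log n) removal by start time.
    boolean cancel(long start) {
        return byStart.remove(start) != null;
    }

    int size() {
        return byStart.size();
    }
}
```

Unlike the array-backed list, nothing has to be shifted on insert or removal, which is the trade-off the answer is pointing at.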

Here is my solution which inserts using binary search
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;
public class MeetingScheduler {
    static class Meeting implements Comparable<Meeting> {
        Date startTime;
        Date endTime;
        int duration;
        public static final int MINUTE = 60000;
        // duration in minutes
        Meeting(Date startTime, int duration) {
            this.startTime = startTime;
            this.duration = duration;
            this.endTime = new Date(startTime.getTime() + (MINUTE * duration));
        }
        // Meetings that overlap or touch compare as equal (0); bookRoom relies on this to detect collisions.
        @Override
        public int compareTo(Meeting o) {
            if (this.endTime.compareTo(o.startTime) < 0) {
                return -1;
            } // end time is before the other's start time
            if (this.startTime.compareTo(o.endTime) > 0) {
                return 1;
            } // start time is after the other's end time
            return 0;
        }
        @Override
        public String toString() {
            return "meeting {" +
                    "from " + startTime +
                    ", minutes=" + duration +
                    '}';
        }
    }
    private List<Meeting> meetings = new ArrayList<Meeting>();
    // Returns null if the meeting was booked, or the existing meeting it collides with.
    public Meeting bookRoom(Meeting meeting) {
        if (meetings.isEmpty()) {
            meetings.add(meeting);
            return null;
        } else {
            // binarySearch returns -(insertion point) - 1 when no colliding meeting is found
            int pos = -Collections.binarySearch(meetings, meeting);
            if (pos > 0) {
                meetings.add(pos - 1, meeting);
                return null;
            } else {
                return meetings.get(-pos);
            }
        }
    }
    public List<Meeting> getMeetings() {
        return meetings;
    }
    public static void main(String[] args) {
        MeetingScheduler meetingScheduler = new MeetingScheduler();
        Meeting[] meetingsToBook = new Meeting[]{
                // October 3rd 2014
                new Meeting(new Date(2014 - 1900, 10 - 1, 3, 15, 00), 15),
                new Meeting(new Date(2014 - 1900, 10 - 1, 3, 16, 00), 15),
                new Meeting(new Date(2014 - 1900, 10 - 1, 3, 17, 00), 60),
                new Meeting(new Date(2014 - 1900, 10 - 1, 3, 18, 00), 15),
                new Meeting(new Date(2014 - 1900, 10 - 1, 3, 14, 50), 10),
                new Meeting(new Date(2014 - 1900, 10 - 1, 3, 14, 55), 10)
        };
        for (Meeting m : meetingsToBook) {
            Meeting oldMeeting = meetingScheduler.bookRoom(m);
            if (oldMeeting != null) {
                System.out.println("Could not book room for " + m + " because it collides with " + oldMeeting);
            }
        }
        System.out.println("meetings booked: " + meetingScheduler.getMeetings().size());
        for (Meeting m : meetingScheduler.getMeetings()) {
            System.out.println(m.startTime + "-> " + m.duration + " mins");
        }
    }
}

While the use of a sorted array and binary search is efficient for lookup, please note that an insert takes O(n) even when no collision is found, since the array needs to slide the later meetings over. I'm not sure this is the optimal solution.

If the sorted list is an array, I believe the add operation will take O(n), since you have to shift the meetings that start after the to-be-inserted meeting.

Other examples of magical calculations

I have seen this topic here about John Carmack's magical way to calculate a square root, which refers to this article: http://www.codemaestro.com/reviews/9. It surprised me a lot; I had never realized that calculating sqrt could be so much faster.
I was just wondering what other examples of "magic" exist out there that computer games use to run faster.
UPDATE:
John Carmack is not the author of the magic code. This article tells more. Thanks @moocha.

There is a book that gathers many of these 'magic tricks' and that may interest you: Hacker's Delight.
It contains many bit-twiddling tricks, for example several square root algorithms, which you can see in the Google Books version.

Not exactly a mathematical hack, but I like this one about Roman Numerals in Java6:
public class Example {
    public static void main(String[] args) {
        System.out.println(
            MCMLXXVII + XXIV
        );
    }
}
will give you the expected result (1977 + 24 = 2001), because of a rewrite rule:
class Transform extends TreeTranslator, an internal class of the Java compiler.
Transform visits all statements in the source code, and replaces each variable whose name matches a Roman numeral with an int literal of the same numeric value.
public class Transform extends TreeTranslator {
    @Override
    public void visitIdent(JCIdent tree) {
        String name = tree.getName().toString();
        if (isRoman(name)) {
            result = make.Literal(numberize(name));
            result.pos = tree.pos;
        } else {
            super.visitIdent(tree);
        }
    }
}

I'm a big fan of Bresenham Line, but man the CORDIC rotator enabled all kinds of pixel chicanery for me when CPUs were slower.

Bit Twiddling Hacks has many cool tricks.
Although some of it is dated now, I was awed by some of the tricks in "The Zen of Code Optimization" by Michael Abrash. The implementation of the Game Of Life is mind-boggling.

I have always been impressed by two classic 'magic' algorithms that have to do with dates:
Zeller's congruence for computing the day of week of a given date
Gauss's algorithm to calculate the date of Easter
Some (untested) code follows:
import math
def dayOfWeek(dayOfMonth, month, year):
    # Zeller's congruence counts January and February as months 13 and 14 of the previous year.
    if month < 3:
        month += 12
        year -= 1
    yearOfCentury = year % 100
    century = year // 100
    h = int(dayOfMonth + math.floor(26.0*(month + 1)/10) + yearOfCentury \
        + math.floor(float(yearOfCentury)/4) + math.floor(float(century)/4) \
        + 5*century) % 7
    return ['Saturday', 'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'][h]
def easter(year):
    a = year % 19
    b = year % 4
    c = year % 7
    k = int(math.floor(float(year)/100))
    p = int(math.floor((13 + 8.0*k)/25))
    q = int(math.floor(float(k)/4))
    M = (15 - p + k - q) % 30
    N = (4 + k - q) % 7
    d = (19*a + M) % 30
    e = (2*b + 4*c + 6*d + N) % 7
    day1 = 22 + d + e
    if day1 <= 31: return "March %d" % day1
    day2 = d + e - 9
    if day2 == 26: return "April 19"
    # The April 18 exception applies only when d == 28 and e == 6 (not when d == 29 and e == 5).
    if d == 28 and e == 6 and (11*M + 11) % 30 < 19: return "April 18"
    return "April %d" % day2
print(dayOfWeek(2, 12, 2008))  # 'Tuesday'
print(easter(2008))  # 'March 23'
