Stream<double[]> vs DoubleStream - java-8

I have to convert an array of double values into a stream.
What is the difference between the following two approaches? Which one is better?
double [] dArray = {1.2,2.3,3.4,4.5};
Stream<double[]> usingStream = Stream.of(dArray); //approach 1
DoubleStream usingArrays = Arrays.stream(dArray); //approach 2

Obviously, Stream.of(dArray) gives you a Stream<double[]> whose single element is the input array, which is probably not what you want. You could use that approach if your input was a Double[] instead of a primitive array, since then you would have gotten a Stream<Double> of the elements of the array.
Therefore Arrays.stream(dArray) is the way to go when you need to transform an array of doubles to a stream of doubles.
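To make the difference concrete, here is a minimal sketch that counts the elements each approach produces. The class and method names are made up for illustration; only Stream.of, Arrays.stream and count come from the JDK.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original question.
class StreamOfVsArraysStream {

    // Stream.of on a primitive array: the whole array becomes the single element.
    static long elementsViaStreamOf(double[] values) {
        Stream<double[]> stream = Stream.of(values);
        return stream.count();
    }

    // Arrays.stream on a primitive array: one stream element per array element.
    static long elementsViaArraysStream(double[] values) {
        DoubleStream stream = Arrays.stream(values);
        return stream.count();
    }

    // Stream.of on a boxed Double[]: the varargs overload applies, one element per entry.
    static long elementsViaStreamOfBoxed(Double[] values) {
        Stream<Double> stream = Stream.of(values);
        return stream.count();
    }

    public static void main(String[] args) {
        double[] dArray = {1.2, 2.3, 3.4, 4.5};
        Double[] boxed = {1.2, 2.3, 3.4, 4.5};
        System.out.println(elementsViaStreamOf(dArray));      // 1
        System.out.println(elementsViaArraysStream(dArray));  // 4
        System.out.println(elementsViaStreamOfBoxed(boxed));  // 4
    }
}
```

If you already have a DoubleStream and need objects later, calling boxed() on it gives a Stream<Double>.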

Besides the fact that they are different?
DoubleStream can be thought of as a Stream<Double> (but holding primitives), while Stream<double[]> is a stream of arrays.

Stream.of and Arrays.stream are entirely different things for different purposes and hence should not be compared.
Stream.of, when passed a one-dimensional primitive array as in your example, yields a stream with a single element, the array itself, which in the majority of cases is not what you want.
Arrays.stream, as the name suggests, operates on arrays, whereas Stream.of is more general.
It would have been better and more entertaining had you asked what’s the difference between DoubleStream.of(dArray) and Arrays.stream(dArray).
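For what it's worth, a small sketch of that comparison. As far as I know, both calls give a DoubleStream over the same elements, and the practical difference is that Arrays.stream also has an overload taking a start and end index. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

// Hypothetical demo class comparing the two DoubleStream factories.
class DoubleStreamOfVsArraysStream {

    // DoubleStream.of(double...) accepts the array through its varargs parameter.
    static double[] viaDoubleStreamOf(double[] values) {
        return DoubleStream.of(values).toArray();
    }

    // Arrays.stream(double[]) streams the same elements.
    static double[] viaArraysStream(double[] values) {
        return Arrays.stream(values).toArray();
    }

    // Arrays.stream also offers a range overload: from inclusive, to exclusive.
    static double[] slice(double[] values, int from, int to) {
        return Arrays.stream(values, from, to).toArray();
    }

    public static void main(String[] args) {
        double[] dArray = {1.2, 2.3, 3.4, 4.5};
        System.out.println(Arrays.toString(viaDoubleStreamOf(dArray)));
        System.out.println(Arrays.toString(viaArraysStream(dArray)));
        System.out.println(Arrays.toString(slice(dArray, 1, 3))); // [2.3, 3.4]
    }
}
```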

Related

SystemStackError when array destructuring with splat operator

I have an application that gathers a (large-ish) amount of data into an array and appends it to an existing array. When I use the splat operator (to use with Array.push), I get a SystemStackError: stack level too deep message. 'Large' is in the range of 150k entries (each entry contains additional objects).
What is the preferred method to merge large arrays in Ruby?
gathered_info = function_that_returns_a_large_array_of_hashes()
dump.push(*gathered_info)
If you want to add a bunch of things to an array then the splat will need to expand those as individual arguments, each of which takes stack space. That's bad for large lists for reasons you've discovered.
You can always just use concat on the array directly:
dump.concat(gathered_info)
That's far less cumbersome.
You normally use a splat because there's no alternative that takes an array instead, but that's not the case here. concat does exactly what you need.

Why must we call to_a on an enumerator object?

The chaining of each_slice and to_a confuses me. I know that each_slice is a member of Enumerable and therefore can be called on enumerable objects like arrays, and chars does return an array of characters.
I also know that each_slice will slice the array in groups of n elements, which is 2 in the below example. And if a block is not given to each_slice, then it returns an Enumerator object.
'186A08'.chars.each_slice(2).to_a
But why must we call to_a on the enumerator object if each_slice has already grouped the array by n elements? Why doesn't ruby just evaluate what the enumerator object is (which is a collection of n elements)?
The purpose of enumerators is lazy evaluation. When you call each_slice, you get back an enumerator object. This object does not calculate the entire grouped array up front. Instead, it calculates each “slice” as it is needed. This helps save on memory, and also allows you quite a bit of flexibility in your code.
This stack overflow post has a lot of information in it that you’ll find useful:
What is the purpose of the Enumerator class in Ruby
To give you a cut-and-dried answer to your question “Why must I call to_a when each_slice has already grouped the array?”: it hasn’t grouped anything yet. It hasn’t looped through the array at all. So far it has only defined an object that says that when it goes through the array, you’re going to want elements two at a time. You then have the freedom to either force it to do the calculation on all elements in the enumerable (by calling to_a), or you could alternatively use next or each to go through and then stop partway through (maybe calculate only half of them as opposed to calculating all of them and throwing the second half away).
It’s similar to how the Range class does not build up the list of elements in the range. (1..100000) doesn’t make an array of 100000 numbers, but instead defines an object with a min and max and certain operations can be performed on that. For example (1..100000).cover?(5) doesn’t build a massive array to see if that number is in there, but instead just sees if 5 is greater than or equal to 1 and less than or equal to 100000.
The purpose of this all is performance and flexibility.
It may be worth considering whether your implementation actually needs to make an array up front, or whether you can actually keep your RAM consumption down a bit by iterating over the enumerator. (If your real world scenario is as simple as you described, an enumerator won’t help much, but if the array actually is large, an enumerator could help you a lot).

how does this Ruby code work? (hash) (Learnrubythehardway)

I know I will look like a total noob, but there's something I can't wrap my head around. Let me emphasize that I DID google this thing, but I didn't find what I was looking for.
I'm going through the learnrubythehardway course, and for ex39 this is one of the functions we have defined:
def Dict.hash_key(aDict, key)
  return key.hash % aDict.length
end
The author gives this explanation:
hash_key
This deceptively simple function is the core of how a hash works. What it does is uses the built-in Ruby hash function to convert a string to a number. Ruby uses this function for its own hash data structure, and I'm just reusing it. You should fire up a Ruby console to see how it works. Once I have a number for the key, I then use the % (modulus) operator and the aDict.length to get a bucket where this key can go. As you should know, the % (modulus) operator will divide any number and give me the remainder. I can also use this as a way of limiting giant numbers to a fixed smaller set of other numbers. If you don't get this then use Ruby to explore it
I like this course, but the above paragraph was no help.
Ok, you call the function passing it two arguments (aDict is an array) and it returns something.
(My questions are not totally independent of one another.)
What and how does it do that? (ok, it returns a bucket index, but how do we "get there"?)
What does the key.hash do/what is it?
How does using the % help me get what I need? (What is the use of "modding" the key.hash by the aDict.length?)
"Use Ruby to explore it." - ok, but my question No.2. kinda already suggests that I wouldn't know how to go about doing that.
Thanks in advance.
key.hash is calling Object#hash, which is not to be confused with Hash.
Object#hash converts a string into a number consistently (the same string will always result in the same number, in the same running instance of Ruby).
pry(main)> "abc".hash
=> -1672853150
So now we have a number, but it's way too large for the number of buckets in our Dict structure, which defaults to 256 buckets. So we take it modulo the bucket count to get a number within our bucket range.
pry(main)> "abc".hash % 256
=> 98
This essentially allows us to translate Dict["abc"] into aDict[98].
RE: This example in particular
I'm going to change the order of things in a way that I hope makes more sense:
#2. You can think of a hash as a sort of 'fingerprint' of something. The .hash method will create a (generally) unique output for any given input.
#3. In this case, we know that the hash is a number, so we take the modulo of the generated number by the backing array's length in order to find a (hopefully empty) index that is within our storage's bounds.
#1. That's how. A hashing algorithm will return the same output for any given input. The modulo takes this output and turns it into something we can actually use in an array to find something reliably.
#4. Call hash on something. Call it on a string and then modulo it by the length of an array. Try again on another string. Do that again, and use your result to assign something to that array. Do it again to see that the hash and modulo thing will find that value again.
Further Notes:
By itself, the modulo function is not a good way to pick unique indexes for keys. This example is the first step, but especially in a small array, there is still a relatively large chance for the hashes of different keys to modulo into the same number. That's called a collision, and handling those seems to be outside the scope of this question.

I'm having problems splitting a string number into single digits with Processing

I'm new to Processing and I'm trying to split a string of digits into an array with one digit per element. My goal is then to find how many numbers repeat themselves and print them out in an array. I'm not sure if I'm on the right track, though! I'm aware that there are some missing lines, but as I mentioned before, I'm new and still exploring arrays, modulo and strings.
int[] dig = new string [1233467890];
int n=dig.length;
while(n<0){
arr[i--]=n%10
dig = n % 10;
n = n / 10;
}
println(arr);
Thanks ahead of time for help
Edwin
I think you are mixing things up a little here, especially what strings and arrays are.
An array is a sequence of objects, and these objects may be integers, characters, booleans, circles, cups or balls. A String can be thought of as a very special kind of array: a sequence of characters.
So, as you may have noticed, there is no way to create a "string" of integers. Processing tells you exactly that if you try to run the code you posted:
"cannot convert [] String to [] Int". That means strings and ints are fundamentally different things.
Since I understood neither your goal nor your code, I can't help you any further.
I think it would be a better idea to read and understand the following links, run and understand the more basic examples there, and only then try to program what you want.
http://processing.org/reference/Array.html
http://processing.org/reference/String.html
Best regards
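For anyone landing here later, a rough sketch in plain Java (Processing is Java-based) of what the question seems to be after. It assumes the goal is to split a string of decimal digits into an int array and then count how often each digit occurs. The class and method names are made up, and the input is assumed to contain only the characters 0 to 9.

```java
import java.util.Arrays;

// Hypothetical helper class; assumes the input string holds only '0'..'9'.
class DigitCounter {

    // Splits a string of decimal digits into an int array, one digit per element.
    static int[] toDigits(String number) {
        int[] digits = new int[number.length()];
        for (int i = 0; i < number.length(); i++) {
            digits[i] = number.charAt(i) - '0'; // character code minus '0' gives the digit value
        }
        return digits;
    }

    // Returns an array where counts[d] is how many times digit d occurs.
    static int[] countDigits(int[] digits) {
        int[] counts = new int[10];
        for (int d : digits) {
            counts[d]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] digits = toDigits("1233467890");
        int[] counts = countDigits(digits);
        System.out.println(Arrays.toString(digits));
        for (int d = 0; d < counts.length; d++) {
            if (counts[d] > 1) {
                System.out.println(d + " appears " + counts[d] + " times");
            }
        }
    }
}
```

In a Processing sketch the two helper methods could be pasted in as they are and called from setup(), with println in place of System.out.println.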

Hashtables/Dictionaries that use floats/doubles

I read somewhere about other data structures similar to hashtables and dictionaries that, instead of using ints, use floats/doubles, etc.
Does anyone know what they are?
If you mean using floats/doubles as keys in your hash, that's easy. For example, in .NET, it's just using Dictionary<double,MyValueType>.
If you're talking about having the hash be based off a double instead of an int....
Technically, you can have any element as your internal hash. Normally, this is done using an int or long, since these are fast, and the hashing algorithm is easy to compute.
However, the hash is really just a BitArray at heart, so anything would work. There really isn't much advantage to making this something other than an int or long, other than potentially allowing a larger set of hash values (i.e. if you go to an 8-byte or larger type for your hash).
You mean as keys? That strikes me as tricky.
If you're using them as arbitrary keys, they're no better than integers.
If you expect to calculate a floating-point value and use it to look something up in a hash table, you're living very dangerously. Floating point numbers do not have infinite precision, and calculating the same thing in two slightly different ways can result in very tiny differences in the result. Hash keys rely on getting the exact same thing every time, so you'd have to be careful to round, and round in exactly the same way at all times. This is trickier than it sounds, by the way.
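A quick sketch of the problem, written in Java rather than .NET purely for illustration. The quantize helper and the six-decimal precision are assumptions of this example, not a recommendation for any particular library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo of why computed floating-point values make fragile hash keys.
class FloatingPointKeys {

    // Assumed helper: round to six decimal places and use the resulting long as the key.
    static long quantize(double value) {
        return Math.round(value * 1_000_000d);
    }

    // Stores under 0.3, looks up under 0.1 + 0.2, which is a slightly different double.
    static boolean rawLookupSucceeds() {
        Map<Double, String> table = new HashMap<>();
        table.put(0.3, "found");
        return table.containsKey(0.1 + 0.2);
    }

    // Same lookup, but both sides are rounded the same way before hashing.
    static boolean quantizedLookupSucceeds() {
        Map<Long, String> table = new HashMap<>();
        table.put(quantize(0.3), "found");
        return table.containsKey(quantize(0.1 + 0.2));
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2 == 0.3);           // false
        System.out.println(rawLookupSucceeds());        // false
        System.out.println(quantizedLookupSucceeds());  // true
    }
}
```

Rounding like this only moves the problem: two values that sit on either side of a rounding boundary still end up under different keys, which is part of why it is trickier than it sounds.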
So, what would you do with floating-point hashes?
A hash algorithm is, in general terms, just a function that produces a smaller output from a larger input. Good hash functions have interesting properties like a large change in output for a small change in the input, and an assurance that they produce every possible output value for some input.
It's not hard to write a simple polynomial type hash function that outputs a floating-point value, rather than an integer value, but it's difficult to ensure that the resulting hash function has the desired properties without getting into the details of the particular floating-point representation used.
At least part of the reason that hash functions are nearly always implemented in integer arithmetic is because proving various properties about an integer calculation is easier than doing the same for a floating point calculation.
It's fairly easy to prove that some (sum of prime factors) modulo (another prime) must, necessarily, produce every possible output for some input. Doing the same for a calculation with a bunch of floating-point fractions would be a drag.
Add to that the relative difficulty of storing and transmitting floating-point values without corruption, and it's just not worth it.
Your question history shows that you use .Net, so I'll answer in that context.
If you want a Dictionary that is type aware, such that you can specify it should use floats or doubles for the keys or values, use System.Collections.Generic.Dictionary<TKey, TValue> http://msdn.microsoft.com/en-us/library/xfhwa508.aspx
If you want a Dictionary that is type blind, such that you can use floats AND doubles for keys and values, use System.Collections.Hashtable http://msdn.microsoft.com/en-us/library/system.collections.hashtable.aspx
