Using streams/collect to generate a map in Java 8

I'm playing around with maps/streams in Java 8 and I don't see a straightforward way to convert the following Java 7 code. It seems as though I cannot access a method within a method. If I use a .map() as an intermediate step, I lose access to the outer variable ("item"). Am I missing something?
private void test(final Collection<SomeObject> items) {
    // Java 7
    Map<SomeKey, List<SomeObject>> map = Maps.newHashMap();
    for (SomeObject item : items) {
        SomeKey someKey = item.someMethod().getKey();
        List<SomeObject> list = map.get(someKey);
        if (list == null) {
            list = Lists.newArrayList();
            map.put(someKey, list);
        }
        list.add(item);
    }
    // Java 8 -- does not compile
    Map<SomeKey, List<SomeObject>> map2 =
        items.stream().collect(Collectors.groupingBy(item::someMethod::getKey));
}
Thanks!

The correct Java 8 expression should be:
Map<SomeKey, List<SomeObject>> map2 =
    items.stream().collect(Collectors.groupingBy(item -> item.someMethod().getKey()));
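For reference, here is a minimal, self-contained sketch (with a hypothetical Item class standing in for SomeObject and a plain String standing in for SomeKey) showing that groupingBy collects values into per-key lists:
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Self-contained sketch; Item is a stand-in for SomeObject from the question.
public class GroupingDemo {
    static class Item {
        final String key;
        Item(String key) { this.key = key; }
        String someKey() { return key; }
    }

    public static void main(String[] args) {
        List<Item> items = java.util.Arrays.asList(
                new Item("a"), new Item("a"), new Item("b"));

        // Each distinct key maps to the list of items that produced it.
        Map<String, List<Item>> byKey = items.stream()
                .collect(Collectors.groupingBy(item -> item.someKey()));

        System.out.println(byKey.get("a").size()); // 2
        System.out.println(byKey.get("b").size()); // 1
    }
}
Method references cannot be chained, which is why item::someMethod::getKey is rejected; a lambda that makes both calls explicitly is the usual workaround.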

Related

MVStore Online Back Up

The information in the MVStore docs on backing up a database is a little vague, and I'm not familiar with all the concepts and terminology, so I wanted to see if the approach I came up with makes sense.
I'm a Clojure programmer, so please forgive my Java here:
// db is an MVStore instance
FileStore fs = db.getFileStore();
FileOutputStream fos = new java.io.FileOutputStream(pathToBackupFile);
FileChannel outChannel = fos.getChannel();
try {
    db.commit();
    db.setReuseSpace(false);
    ByteBuffer bb = fs.readFully(0, fs.size());
    outChannel.write(bb);
}
finally {
    outChannel.close();
    db.setReuseSpace(true);
}
Here's what it looks like in Clojure in case my Java is bad:
(defn backup-db
  [db path-to-backup-file]
  (let [fs (.getFileStore db)
        backup-file (java.io.FileOutputStream. path-to-backup-file)
        out-channel (.getChannel backup-file)]
    (try
      (.commit db)
      (.setReuseSpace db false)
      (let [file-contents (.readFully fs 0 (.size fs))]
        (.write out-channel file-contents))
      (finally
        (.close out-channel)
        (.setReuseSpace db true)))))
My approach seems to work, but I wanted to make sure I'm not missing anything or see if there's a better way. Thanks!
P.S. I used the H2 tag because an MVStore tag doesn't exist and I don't have enough reputation to create it.
The docs currently say:
The persisted data can be backed up at any time, even during write
operations (online backup). To do that, automatic disk space reuse
needs to be first disabled, so that new data is always appended at the
end of the file. Then, the file can be copied. The file handle is
available to the application. It is recommended to use the utility
class FileChannelInputStream to do this.
The classes FileChannelInputStream and FileChannelOutputStream convert a java.nio.FileChannel into a standard InputStream and OutputStream. There is existing H2 code in BackupCommand.java that shows how to use them. We can improve upon it by using Java 9's input.transferTo(output) to copy the data:
public void backup(MVStore s, File backupFile) throws Exception {
    try {
        s.commit();
        s.setReuseSpace(false);
        try (RandomAccessFile outFile = new java.io.RandomAccessFile(backupFile, "rw");
             FileChannelOutputStream output = new FileChannelOutputStream(outFile.getChannel(), false)) {
            try (FileChannelInputStream input = new FileChannelInputStream(s.getFileStore().getFile(), false)) {
                input.transferTo(output);
            }
        }
    } finally {
        s.setReuseSpace(true);
    }
}
Note that when you create the FileChannelInputStream you have to pass false to tell it not to close the underlying file channel when the stream is closed. If you don't do that, it will close the file that your FileStore is trying to use. The code uses try-with-resources syntax to make sure that the output file is properly closed.
To try this, I checked out the MVStore code and modified TestMVStore to add a testBackup() method, which is similar to the existing testSimple() code:
private void testBackup() throws Exception {
    // write some records like testSimple
    String fileName = getBaseDir() + "/" + getTestName();
    FileUtils.delete(fileName);
    MVStore s = openStore(fileName);
    MVMap<Integer, String> m = s.openMap("data");
    for (int i = 0; i < 3; i++) {
        m.put(i, "hello " + i);
    }
    // create a backup
    String fileNameBackup = getBaseDir() + "/" + getTestName() + ".backup";
    FileUtils.delete(fileNameBackup);
    backup(s, new File(fileNameBackup));
    // this throws if you accidentally close the input channel you get from the store
    s.close();
    // open the backup and verify
    s = openStore(fileNameBackup);
    m = s.openMap("data");
    for (int i = 0; i < 3; i++) {
        assertEquals("hello " + i, m.get(i));
    }
    s.close();
}
With your example, you are reading into a ByteBuffer, which must fit into memory. The stream transferTo method instead copies through an internal buffer that is currently (as of Java 11) 8192 bytes.
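If you are stuck on Java 8, where InputStream.transferTo is not available, a minimal chunked-copy loop achieves the same effect (a sketch; the 8 KB buffer size is arbitrary):
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Copies the input to the output in fixed-size blocks instead of one large ByteBuffer.
final class StreamCopy {
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
    }
}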

Play 2.5 Errors with Java 8 - wrong number of type arguments and lambda expression not expected here error

I'm attempting to upgrade from Play 2.4 to Play 2.5. Following the migration guide's section "Replaced F.Promise with Java 8's CompletionStage", I replaced F.Promise, map, and flatMap with the suggested replacements (the affected snippet is below).
public CompletionStage<Result> parallel() {
    final long start = System.currentTimeMillis();
    final CompletionStage<WSResponse, Long> getLatency = resp -> System.currentTimeMillis() - start;
    CompletionStage<Long> googleLatency = WS.url("http://google.com").get().thenApplyAsync(getLatency);
    CompletionStage<Long> yahooLatency = WS.url("http://yahoo.com").get().thenApplyAsync(getLatency);
    return googleLatency.thenComposeAsync(googleResponseTime ->
        yahooLatency.thenApplyAsync(yahooResponseTime ->
            ok(format("Google response time: %d; Yahoo response time: %d",
                googleResponseTime, yahooResponseTime)))
    );
}
After running ./activator clean dist, I'm getting the error below:
[error] /Play-2-JS-2.5/app/controllers/Java8Controller.java:74: wrong number of type arguments; required 1
[error] CompletionStage
[error] /Play-2-JS-2.5/app/controllers/Java8Controller.java:74: lambda expression not expected here
[error] resp -> System.currentTimeMillis() - start
For some reason the compiler believes CompletionStage should have only one type argument instead of two in the CompletionStage<WSResponse, Long> getLatency declaration, and it also rejects the lambda expression, even though the pre-migration syntax worked fine in Play 2.4.
I tried switching back to the old calls shown in https://github.com/btgrant-76/Play-2-Java-Scala-Java-8-Async-Comparison/blob/6a85cf31cfb804ef20bacf8e14d30ce46cc9307c/app/controllers/Java8Controller.java#L71-L83, but that doesn't give better results. I've been googling and searching for some time but am not sure how to approach this. Any suggestions with possible examples would be greatly appreciated.
Replace
final CompletionStage<WSResponse, Long> getLatency = resp ->
    System.currentTimeMillis() - start;
with
final Function<WSResponse, Long> getLatency = resp -> System.currentTimeMillis() - start;
since the thenApplyAsync method in the CompletionStage interface accepts a java.util.function.Function.
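As a quick, self-contained sketch of the same idea (with a plain CompletableFuture standing in for the WS call, since the Play classes aren't needed to show the types):
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.function.Function;

// thenApplyAsync takes a Function<T, U>; CompletionStage itself has a single type argument.
public class LatencyDemo {
    public static void main(String[] args) {
        final long start = System.currentTimeMillis();
        final Function<String, Long> getLatency = resp -> System.currentTimeMillis() - start;

        CompletionStage<Long> latency = CompletableFuture
                .supplyAsync(() -> "fake response")   // stands in for WS.url(...).get()
                .thenApplyAsync(getLatency);

        Long ms = latency.toCompletableFuture().join();
        System.out.println("latency: " + ms + " ms");
    }
}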
Hope this helps.
Good luck

How to use Hadoop's MapFileOutputFormat in Flink?

I've gotten stuck while writing a program using Apache Flink. The problem is that I'm trying to generate a Hadoop MapFile as the result of a computation, but the Scala compiler complains about a type mismatch.
To illustrate the problem, the code snippet below tries to generate two kinds of output: one is a Hadoop SequenceFile and the other is a MapFile.
val dataSet: DataSet[(IntWritable, BytesWritable)] =
  env.readSequenceFile(classOf[Text], classOf[BytesWritable], inputSequenceFile.toString)
    .map(mapper(_))
    .partitionCustom(partitioner, 0)
    .sortPartition(0, Order.ASCENDING)

val seqOF = new HadoopOutputFormat(
  new SequenceFileOutputFormat[IntWritable, BytesWritable](), Job.getInstance(hadoopConf)
)

val mapfileOF = new HadoopOutputFormat(
  new MapFileOutputFormat(), Job.getInstance(hadoopConf)
)

val dataSink1 = dataSet.output(seqOF)     // it typechecks!
val dataSink2 = dataSet.output(mapfileOF) // syntax error
As commented above, dataSet.output(mapfileOF) causes the Scala compiler to complain with a type mismatch.
FYI, compared to SequenceFile, MapFile imposes the stronger condition that its key must be WritableComparable.
Before writing the application with Flink, I implemented it with Spark as shown below, and it worked fine (it compiles and runs without errors).
val rdd = sc
  .sequenceFile(inputSequenceFile.toString, classOf[Text], classOf[BytesWritable])
  .map(mapper(_))
  .repartitionAndSortWithinPartitions(partitioner)

rdd.saveAsNewAPIHadoopFile(
  outputPath.toString,
  classOf[IntWritable],
  classOf[BytesWritable],
  classOf[MapFileOutputFormat]
)
Did you check: https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/batch/hadoop_compatibility.html#using-hadoop-outputformats
It contains the following example:
// Obtain your result to emit.
val hadoopResult: DataSet[(Text, IntWritable)] = [...]

val hadoopOF = new HadoopOutputFormat[Text, IntWritable](
  new TextOutputFormat[Text, IntWritable],
  new JobConf)

hadoopOF.getJobConf.set("mapred.textoutputformat.separator", " ")
FileOutputFormat.setOutputPath(hadoopOF.getJobConf, new Path(resultPath))
hadoopResult.output(hadoopOF)

Apache Spark 1.2.1 standalone cluster giving java heap space error

I need to figure out how much heap space (memory) is needed to operate on x MB of data (say x is 600 MB) in a Spark standalone cluster.
Scenario:
I have a standalone cluster with 14 GB of memory and 8 cores. I want to operate on 600 MB of data (reading data from files and writing it to Cassandra).
For this task my SparkConf contains:
.set("spark.cassandra.output.throughput_mb_per_sec","800")
.set("spark.storage.memoryFraction", "0.3")
And --executor-memory=5g --total-executor-cores 6 --driver-memory 6g when submitting the task.
In spite of the above configuration, I get a java heap space error while writing data to Cassandra.
Below is the java code:
public static void main(String[] args) throws Exception {
    String fileName = args[0];
    Long now = new Date().getTime();
    SparkConf conf = new SparkConf(true)
            .setAppName("JavaSparkSQL_" + now)
            .set("spark.cassandra.connection.host", "192.168.1.65")
            .set("spark.cassandra.connection.native.port", "9042")
            .set("spark.cassandra.connection.rpc.port", "9160")
            .set("spark.cassandra.output.throughput_mb_per_sec", "800")
            .set("spark.storage.memoryFraction", "0.3");
    JavaSparkContext ctx = new JavaSparkContext(conf);
    JavaRDD<String> input = ctx.textFile(
            "hdfs://abc.xyz.net:9000/figmd/resources/" + fileName, 12);
    JavaRDD<PlanOfCare> result = input.mapPartitions(new ParseJson())
            .filter(new PickInputData());
    System.out.print("Count --> " + result.count());
    System.out.println(StringUtils.join(result.collect(), ","));
    javaFunctions(result).writerBuilder("ks", "pt_planofcarelarge",
            mapToRow(PlanOfCare.class)).saveToCassandra();
}
What configuration am I supposed to use? Am I missing anything?
Thanks in advance.
JavaRDD's collect method returns a list that contains all of the elements in the RDD.
So in your case it will create a collection with 340,000 elements, which results in a java heap space error. You may want to take a small sample of your data and collect that, or save it directly to disk instead.
For more information about JavaRDD, you can always refer to the official documentation.
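As a rough sketch of that suggestion (reusing the PlanOfCare, javaFunctions, mapToRow, and StringUtils names from the question; the sample size of 10 is arbitrary), the driver-side collect and print could be replaced with a bounded take, while the Cassandra write stays distributed:
// Bring only a bounded sample back to the driver instead of collect()-ing everything.
List<PlanOfCare> sample = result.take(10);
System.out.println(StringUtils.join(sample, ","));

// The write runs on the executors, so no collect() is needed here.
javaFunctions(result)
        .writerBuilder("ks", "pt_planofcarelarge", mapToRow(PlanOfCare.class))
        .saveToCassandra();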

fail to use invokeExact of MethodHandle on dynamic load class object

I have some demo code as follows.
Class<?> myClass = cl.loadClass("com.hp.ac.scriptengine.test." + generateClassName);
Object my_obj = myClass.newInstance();
MethodType mt;
MethodHandle mh;
MethodHandles.Lookup lookup = MethodHandles.lookup();
mt = MethodType.methodType(void.class, int.class);
mh = lookup.findVirtual(my_obj.getClass(), "ToDoit", mt);
mh.invokeExact(my_obj,1);
Here "com.hp.ac.scriptengine.test." + generateClassName refers to a generated class.
I got the message as follows.
java.lang.invoke.WrongMethodTypeException: (I)V cannot be called as (Ljava/lang/Object;I)V
at com.hp.ac.scriptengine.test.compliebyCommandline.main(compliebyCommandline.java:138)
Here, line 138 is mh.invokeExact(my_obj,1);
I tried the demo code from the Java 7 API documentation (such as ... mh.invokeExact("daddy",'d','n') ...) and it works fine. Such a call (mh.invokeExact("daddy",'d','n')) invokes (CC)Ljava/lang/String rather than (Ljava/lang/String;CC)Ljava/lang/String.
But why, in my code, does mh.invokeExact(my_obj,1) invoke (Ljava/lang/Object;I)V rather than (I)V?
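For reference, the working example from the Java 7 API documentation, as a self-contained sketch:
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// The String.replace example from the MethodHandles javadoc.
public class InvokeExactDemo {
    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType mt = MethodType.methodType(String.class, char.class, char.class);
        MethodHandle mh = lookup.findVirtual(String.class, "replace", mt);
        // Here the receiver's static type is String and the result is cast to String,
        // so the call site's type matches the handle's type exactly and invokeExact succeeds.
        String s = (String) mh.invokeExact("daddy", 'd', 'n');
        System.out.println(s); // nanny
    }
}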
I think the problem is with int.class. Try Integer.class or Integer.TYPE instead.
