How to convert List into HashBasedTable using Java 8?

How can we convert a List into HashBasedTable in Java8?
Current Code is like:
import org.glassfish.jersey.internal.guava.HashBasedTable;
import org.glassfish.jersey.internal.guava.Table;
List<ApplicationUsage> appUsageFromDB = computerDao.findAllCompAppUsages(new HashSet<>(currentBatch));
Table<String, String, Integer> table = HashBasedTable.create();
for (ApplicationUsage au : appUsageFromDB) {
    table.put(au.getId(), au.getName(), au);
}
I need to store a composite key in this table and later fetch values by that key.

If those internals are at least Guava 21, you could use Guava's own collector, although I do not see anything wrong with what you are doing with a simple loop.
Table<String, String, ApplicationUsage> result =
    appUsageFromDB.stream()
        .collect(ImmutableTable.toImmutableTable(
            ApplicationUsage::getId,
            ApplicationUsage::getName,
            Function.identity()
        ));

First, you should never rely on internal packages; just add Guava to your project explicitly. You can use the Tables#toTable collector if you want a mutable table as a result; otherwise the immutable one presented in @Eugene's answer is just fine:
import com.google.common.collect.HashBasedTable;
import com.google.common.collect.Table;
import com.google.common.collect.Tables;
// ...
Table<String, String, ApplicationUsage> table2 = appUsageFromDB.stream()
    .collect(Tables.toTable(
        ApplicationUsage::getId,
        ApplicationUsage::getName,
        au -> au,
        HashBasedTable::create
    ));
Also, your code doesn't compile, because it expects Integer as table value, but you're adding ApplicationUsage in your loop. Change types and third argument in table collector accordingly if needed.

Related

Go Doc is indenting/grouping functions unexpectedly. What is causing it?

Go Doc is indenting/making a group without me knowingly telling it to.
Here is a screenshot from my browser showing the problem. The four Parse functions should not be indented:
What is causing this behaviour?
I've tried searching for indents/subsections/grouping in Go Docs but I've not found anything beside feature requests. I'm sure the answer to my question is in the documentation somewhere but I can't find it as I don't have the correct vocabulary.
I'm relatively new to Go and Go Doc, so I'm assuming the answer is something simple which I've overlooked.
Here is an extract of my code. Please let me know if I need to share more code.
status.go
package synop
// Status codes returned when parsing Blocks.
type Status int64
const (
	VALID Status = iota
	NO_DATA
	// code omitted
)
// Other functions omitted
cloudwind_block.go
package synop
import (
	"strings"
)
/*
Extract cloud cover from the cloud-wind block, Cxxxx.
Cloud, C, is the first digit of the block. Cloud cover is given in [okta]:
*/
func ParseCloud(block string) (okta int, s Status) {
	slice := [2]int{0, 1}
	return parseIfValid(block, slice, str2int)
}
/*
Extract wind direction from the cloud-wind block, xDDxxx.
Direction, DD, are the second and third digits.
*/
func ParseDir(block string) (dir string, s Status) {
	slice := [2]int{1, 3}
	return parseIfValid(block, slice, getDir)
}
// Other functions omitted
I have another file, blocks.go, which has almost the same structure as status.go and it does not cause this behaviour. I also don't know if the problem is caused by the preceding type Status or something in the cloudwind_block.go file.
I'm using // for single-line documentation and /* */ for multi line. I've tried making this consistent on the off chance and, as expected, it had no effect.
The reason for the grouping and indentation is that those functions are considered "constructors" of the type under which they are grouped/indented.
https://go.dev/doc/comment#func (if you scroll down a bit, you'll see this):
This example also shows that top-level functions returning a type T or pointer *T, perhaps with an additional error result, are shown alongside the type T and its methods, under the assumption that they are T’s constructors.

How to idiomatically fill empty fields with default values for Kubernetes API objects?

I want to compare two Kubernetes API objects (e.g. v1.PodSpecs): one of them was created manually (expected state), the other one was received from the Kubernetes API/client (actual state).
The problem is that even if the two objects are semantically equal, the manually created struct has zero values for unspecified fields where the other struct has default values, so the two don't match. A simple reflect.DeepEqual() call is therefore not sufficient for comparison.
E.g. after this:
expected := &v1.Container{
Name: "busybox",
Image: "busybox",
}
actual := getContainerSpecFromApi(...)
expected.ImagePullPolicy will be "", while actual.ImagePullPolicy will be "IfNotPresent" (the default value), so the comparison fails.
Is there an idiomatic way to replace zero values with default values in Kubernetes API structs specifically? Or alternatively, is a constructor function that initializes the struct with default values available for them somewhere?
EDIT:
Currently I am using handwritten equality tests for each K8s API object type, but this doesn't seem maintainable to me. I am looking for a simple (set of) function(s) that "knows" the default values for all built-in Kubernetes API object fields (maybe somewhere under k8s.io/api*?). Something like this:
expected = api.ApplyContainerDefaults(expected)
if !reflect.DeepEqual(expected, actual) {
reconcile(expected, actual)
}
There are helpers to fill in default values in place of empty/zero ones.
Look at the SetObjectDefaults_Deployment for Deployment for instance.
Looks like the proper way to call it is via (*runtime.Scheme).Default.
Below is the snippet to show the general idea:
import (
	"reflect"
	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/client-go/kubernetes/scheme"
)
func compare() {
	scheme := scheme.Scheme
	// fetch the existing &appsv1.Deployment via API
	actual := ...
	expected := &appsv1.Deployment{}
	// fill in the fields to generate your expected state
	// ...
	scheme.Default(expected)
	// now you should have your empty values filled in
	if !reflect.DeepEqual(expected.Spec, actual.Spec) {
		reconcile(expected, actual)
	}
}
If you need a less strict comparison, for instance to tolerate some injected containers, then something more relaxed should be used.

How to keep non-nullable properties in late initialization

Following issue: In a client/server environment with Spring-Boot and Kotlin the client wants to create objects of type A and therefore posts the data through a RESTful endpoint to the server.
Entity A is realized as a data class in Kotlin like this:
data class A(val mandatoryProperty: String)
Business-wise that property (which is a primary key, too) must never be null. However, it is not known by the client, as it gets generated quite expensively by a Spring @Service bean on the server.
Now, at the endpoint Spring tries to deserialize the client's payload into an object of type A, however, the mandatoryProperty is unknown at that point in time, which would result in a mapping exception.
There are several ways to circumvent that problem, none of which really convinces me:
1. Don't expect an object of type A at the endpoint, but get a bunch of parameters describing A that are passed on until the entity has actually been created and mandatoryProperty is present. Quite cumbersome actually, since there are a lot more properties than just that single one.
2. Quite similar to 1, but create a DTO. One of my favorites; however, since data classes can't be extended, it would mean duplicating the properties of type A into the DTO (except for the mandatory property) and copying them over. Furthermore, when A grows, the DTO has to grow, too.
3. Make mandatoryProperty nullable and work with the !! operator throughout the code. Probably the worst solution, as it defeats the purpose of nullable and non-nullable variables.
4. The client would set a dummy value for the mandatoryProperty, which is replaced as soon as the property has been generated. However, A is validated by the endpoint and therefore the dummy value must obey its @Pattern constraint. So each dummy value would be a valid primary key, which gives me a bad feeling.
Are there any other, more feasible ways I might have overlooked?
I don't think there is a general-purpose answer to this... So I will just give you my 2 cents regarding your variants...
Your first variant has a benefit which no other really has, i.e. that you will not use the given objects for anything other than what they were designed for (i.e. endpoint or backend purposes only), which however will probably lead to cumbersome development.
The second variant is nice, but could lead to some other development errors, e.g. when you thought you used the actual A but you were rather operating on the DTO instead.
Variants 3 and 4 are in that regard similar to 2... You may use it as A even though it has only the properties of a DTO.
So... if you want to go the safe route, i.e. no one should ever use this object for anything other than its specific purpose, you should probably use the first variant. 4 sounds rather like a hack. 2 & 3 are probably OK; 3 because you actually have no mandatoryProperty when you use it as a DTO...
Still, as you have your favorite (2) and I have one too, I will concentrate on 2 & 3, starting with 2 using a subclass approach with a sealed class as supertype:
sealed class AbstractA {
    // just some properties for demo purposes
    lateinit var sharedResettable: String
    abstract val sharedReadonly: String
}
data class A(
    val mandatoryProperty: Long = 0,
    override val sharedReadonly: String
    // we deliberately do not override the sharedResettable here... also for demo purposes only
) : AbstractA()
data class ADTO(
    // this has no mandatoryProperty
    override val sharedReadonly: String
) : AbstractA()
Some demo code, demonstrating the usage:
// just some random setup:
val a = A(123, "from backend").apply { sharedResettable = "i am from backend" }
val dto = ADTO("from dto").apply { sharedResettable = "i am dto" }
listOf(a, dto).forEach { anA ->
    // somewhere receiving an A... we do not know what it is exactly... it's just an AbstractA
    val param: AbstractA = anA
    println("Starting with: $param sharedResettable=${param.sharedResettable}")
    // set something on it... we do not mind yet, what it is exactly...
    param.sharedResettable = UUID.randomUUID().toString()
    // now we want to store it... but wait... did we have an A here? or a newly created DTO?
    // lets check: (demo purpose again)
    when (param) {
        is ADTO -> store(param) // which now returns an A
        is A -> update(param) // maybe updated also our A so a current A is returned
    }.also { certainlyA ->
        println("After saving/updating: $certainlyA sharedResettable=${certainlyA.sharedResettable /* this was deliberately not part of the data class toString() */}")
    }
}
// assume the following signature for store & update:
fun <T> update(param : T) : T
fun store(a : AbstractA) : A
Sample output:
Starting with: A(mandatoryProperty=123, sharedReadonly=from backend) sharedResettable=i am from backend
After saving/updating: A(mandatoryProperty=123, sharedReadonly=from backend) sharedResettable=ef7a3dc0-a4ac-47f0-8a73-0ca0ef5069fa
Starting with: ADTO(sharedReadonly=from dto) sharedResettable=i am dto
After saving/updating: A(mandatoryProperty=127, sharedReadonly=from dto) sharedResettable=57b8b3a7-fe03-4b16-9ec7-742f292b5786
I did not yet show you the ugly part, but you already mentioned it yourself... How do you transform your ADTO to A and vice versa? I will leave that up to you. There are several approaches here (manually, using reflection or mapping utilities, etc.).
This variant cleanly separates all the DTO-specific from the non-DTO-specific properties. However, it will also lead to redundant code (all the override, etc.). But at least you know which object type you operate on and can set up signatures accordingly.
Something like 3 is probably easier to set up and to maintain (regarding the data class itself ;-)) and if you set the boundaries correctly it may even be clear when there is a null in there and when not... So I am showing that example too, starting with a rather annoying variant (annoying in the sense that it throws an exception when you try to access the variable before it was set), but at least you spare yourself the !! or null-checks here:
data class B(
    val sharedOnly : String,
    var sharedResettable : String
) {
    // why nullable? Let it hurt ;-)
    lateinit var mandatoryProperty: ID // ok... Long is not usable with lateinit... that's why there is this ID instead
}
data class ID(val id : Long)
Demo:
val b = B("backend", "resettable")
// println(b.mandatoryProperty) // uh oh... this hurts now... UninitializedPropertyAccessException on the way
val newB = store(b)
println(newB.mandatoryProperty) // that's now fine...
But: even though accessing mandatoryProperty too early throws an exception, the property is not visible in toString(), and checking whether it has already been initialized (i.e. by using ::mandatoryProperty.isInitialized) does not look nice either.
So I show you another variant (meanwhile my favorite, but... uses null):
data class C(val mandatoryProperty: Long?,
             val sharedOnly : String,
             var sharedResettable : String) {
    // this is our DTO constructor:
    constructor(sharedOnly: String, sharedResettable: String) : this(null, sharedOnly, sharedResettable)
    fun hasID() = mandatoryProperty != null // or isDTO, etc. what you like/need
}
// note: you could extract the val and the method also in its own interface... then you would use an override on the mandatoryProperty above instead
// here is what such an interface may look like:
interface HasID {
    val mandatoryProperty: Long?
    fun hasID() = mandatoryProperty != null // or isDTO, etc. what you like/need
}
Usage:
val c = C("dto", "resettable") // C(mandatoryProperty=null, sharedOnly=dto, sharedResettable=resettable)
when {
    c.hasID() -> update(c)
    else -> store(c)
}.also { newC ->
    // from now on you should know that you are actually dealing with an object that has everything in place...
    println("$newC") // prints: C(mandatoryProperty=123, sharedOnly=dto, sharedResettable=resettable)
}
The last one has the benefit, that you can use the copy-method again, e.g.:
val myNewObj = c.copy(mandatoryProperty = 123) // well, you probably don't do that yourself...
// but the following might rather be a valid case:
val myNewDTO = c.copy(mandatoryProperty = null)
The last one is my favorite as it needs the least code and uses a val (so the property cannot be reassigned accidentally; you operate on a copy instead). You could also just add an accessor for the mandatoryProperty if you do not like using ? or !!, e.g.
fun getMandatoryProperty() = mandatoryProperty ?: throw Exception("You didn't set it!")
Finally, if you have some helper methods like hasID (or isDTO, or whatever) in place, it might also be clear from the context what exactly you are doing. The most important thing is probably to set up a convention that everyone understands, so they know when to apply what or when to expect something specific.

How to allow spark to ignore missing input files?

I want to run a spark job (spark v1.5.1) over some generated S3 paths containing avro files. I'm loading them with:
val avros = paths.map(p => sqlContext.read.avro(p))
Some of the paths will not exist though. How can I get spark to ignore those empty paths? Previously I've used this answer, but I'm not sure how to use that with the new dataframe API.
Note: I'm ideally looking for a similar approach to the linked answer that just makes input paths optional. I don't particularly want to have to explicitly check for the existence of paths in S3 (since that's cumbersome and may make development awkward), but I guess that's my fallback if there's no clean way to implement this now.
I would use the Scala Try type to handle the possibility of failure when reading a directory of avro files. With Try we can make the possibility of failure explicit in our code and handle it in a functional manner:
object Main extends App {
  import scala.util.{Success, Try}
  import org.apache.spark.{SparkConf, SparkContext}
  import com.databricks.spark.avro._
  val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("example"))
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  // the first path exists, the second one doesn't
  val paths = List("/data/1", "/data/2")
  // Wrap the attempt to read the paths in a Try, then use collect to filter
  // and map with a single partial function.
  val avros =
    paths
      .map(p => Try(sqlContext.read.avro(p)))
      .collect {
        case Success(df) => df
      }
  // Do whatever you want with your list of dataframes
  avros.foreach { df =>
    // print each row; println(df.collect()) would only print the array reference
    df.collect().foreach(println)
  }
  sc.stop()
}

Any way to alias a function name to '_'?

In Go, is there any circumstance where the gettext short-form of:
_("String to be translated.")
can be used? One of those times where I'm fairly certain the answer is 'no', but asking just in case I've overlooked something. I'm thinking the best that can be achieved is:
import . "path/to/gettext-package"
...
s := gettext("String to be translated.")
since underscore has a very specific meaning, and attempting to define a function named '_' results in the compile-time error "cannot use _ as value".
No. The blank identifier "... does not introduce a new binding."
In other words, you can declare "things" named _ but you cannot refer to them in any way using that "name".
However, one can get close to the goal:
package main
import "fmt"
var p = fmt.Println
func main() {
	p("Hello, playground")
}
I.e., you can bind any (local or imported) function to a variable and later invoke the function through that variable, getting rid of the package prefix, if you think that's handy. In my opinion it is not, by the way.
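Applied to the gettext case from the question, the same trick might look like the sketch below. The gettext function here is a hypothetical stand-in (it just upper-cases its input so the effect is visible), and T is an arbitrary short name chosen for illustration; a real translation package would take the place of gettext.

```go
package main

import (
	"fmt"
	"strings"
)

// gettext is a hypothetical stand-in for a real translation lookup.
// It upper-cases the message only so that the call has a visible effect.
func gettext(msg string) string {
	return strings.ToUpper(msg)
}

// T is a one-letter variable bound to gettext; "_" cannot be used here
// because the blank identifier can never be referred to.
var T = gettext

func main() {
	fmt.Println(T("String to be translated."))
}
```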
