Latest Value from the KStream for a given key - apache-kafka-streams

I have KTables like the ones below:
KTable<productId1,productValue1>
KTable<productId2,productValue2>
KTable<productId3,productValue3>
Then I have a KStream<custId,productId>, where the productId is any of productId1, productId2, productId3. I want to get the corresponding productValue.
The response would look like KStream<custId, productValueN>.
If the KStream receives productId2 as the productId, then the response would be KStream<custId, productValue2>.
How can this be done?
Thanks

Unfortunately KTables don't support KeyValueMappers: KStream/KTable joins are performed strictly by comparing keys. GlobalKTables support this, making the whole process of joining much easier (see my other answer).
To get this to work then, you need to rekey the customers topic KStream to use the product ID, so that you can perform the join with the products topic KTable.
Once you've obtained the product name, you can rekey the resulting joined stream back to the customer ID. For example:
// Key: product ID, value: product name
KTable<String, String> productsTable =
    streamsBuilder.table(PRODUCTS_TOPIC, Consumed.with(stringSerde, stringSerde),
        Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("productsStore")
            .withKeySerde(stringSerde)
            .withValueSerde(stringSerde));

streamsBuilder.stream(CUSTOMERS_TOPIC, Consumed.with(stringSerde, stringSerde))
    // Key: customer ID, value: product ID
    .map((customerId, productId) -> KeyValue.pair(productId, String.format("%s|%s", customerId, productId)),
        Named.as("mapToProductIdAndCustomerIdProductId"))
    // Key: product ID, value: customer ID | product ID
    .join(productsTable, (customerIdProductId, productName) -> {
            String[] parts = customerIdProductId.split("\\|");
            String customerId = parts[0];
            return String.format("%s|%s", customerId, productName);
        },
        Joined.<String, String, String>as("customersProductsInnerJoin")
            .withKeySerde(stringSerde)
            .withValueSerde(stringSerde))
    // Key: product ID, value: customer ID | product name
    .map((productId, customerIdProductName) -> {
        String[] parts = customerIdProductName.split("\\|");
        String customerId = parts[0];
        String productName = parts[1];
        return KeyValue.pair(customerId, productName);
    }, Named.as("mapToCustomerIdAndProductName"))
    // Key: customer ID, value: product name
    .print(Printed.toSysOut());
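The rekey, join, rekey sequence above can be traced without a Kafka cluster. What follows is a minimal plain-Java sketch, not Kafka Streams code: the class RekeyJoinSketch, the method lookUpProductName and the sample IDs are hypothetical, and a plain Map stands in for the products KTable.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Plain-Java stand-in for the topology above; no Kafka classes are used.
final class RekeyJoinSketch {

    // Processes one customer record (customerId -> productId) against a
    // products "table" (productId -> productName), mirroring the three steps.
    static Optional<Map.Entry<String, String>> lookUpProductName(
            String customerId, String productId, Map<String, String> productsTable) {
        // Step 1: rekey by product ID, carrying the customer ID in the value.
        Map.Entry<String, String> rekeyed = new SimpleEntry<>(productId, customerId);

        // Step 2: inner join on the key; no matching product means no output record.
        String productName = productsTable.get(rekeyed.getKey());
        if (productName == null) {
            return Optional.empty();
        }

        // Step 3: rekey back to the customer ID, with the product name as the value.
        Map.Entry<String, String> result = new SimpleEntry<>(rekeyed.getValue(), productName);
        return Optional.of(result);
    }

    public static void main(String[] args) {
        Map<String, String> products = new HashMap<>();
        products.put("productId1", "productValue1");
        products.put("productId2", "productValue2");

        // A customer whose product exists, then one whose product is unknown.
        System.out.println(lookUpProductName("cust42", "productId2", products));
        System.out.println(lookUpProductName("cust43", "productId9", products));
    }
}
```

In the real topology, changing the key with map before a join also makes Kafka Streams repartition the stream, which this sketch does not model.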

You can accomplish this using a GlobalKTable for products, and then joining the customers KStream with it as follows:
Serde<String> stringSerde = new Serdes.StringSerde();
StreamsBuilder streamsBuilder = new StreamsBuilder();

// Key: product ID, value: product name
GlobalKTable<String, String> productsTable =
    streamsBuilder.globalTable(PRODUCTS_TOPIC, Consumed.with(stringSerde, stringSerde),
        Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("products")
            .withKeySerde(stringSerde)
            .withValueSerde(stringSerde));

// Key: customer ID, value: product ID
streamsBuilder.stream(CUSTOMERS_TOPIC, Consumed.with(stringSerde, stringSerde))
    .join(productsTable,
        // KeyValueMapper: derive the table lookup key (product ID) from the stream record
        (customerId, productId) -> productId,
        // ValueJoiner: receives the stream value (product ID) and the table value (product name)
        (productId, productName) -> productName,
        Named.as("customersProducts"))
    // Key: customer ID, value: product name
    .print(Printed.toSysOut());


Android room data relations

I have multiple object types inside a parent class. Say I have a College class as below:
data class College(
    @PrimaryKey
    val id: String,
    val name: String? = null,
    val description: String? = null,
    val groups: List<Group>? = null,
    val status: String? = null,
)
data class Group(
    @PrimaryKey
    val id: String,
    val name: String? = null,
    val description: String? = null,
    val students: List<Students>? = null,
)
The problem here is that there is no relational id in the child tables. I mean the JSON received doesn't relate the groups inside a college to a collegeId in the Groups table, nor the students inside a group to that group.
The JSON received is as below:
"college": [
    {
        "id": "collegeid",
        "groups": [
            {
                "id": "groupid",
                "name": "BCOM" // Here no collegeId is mentioned inside it
            }
        ]
    }
]
If I use the @Embedded annotation, it throws "Entities and POJOs must have a usable public constructor".
Is there any way, with the above JSON, to set the id of the college inside each group and use it as a foreign key for relations?
I have used TypeConverters and they work fine, but now I need to create relations between these tables with the above type of JSON.
I parse the JSON with Gson.
I believe that you may be getting mixed up with how to utilise relationships/tables.
If you have
@Entity
data class College(
    @PrimaryKey
    val id: String,
    val name: String? = null,
    val description: String? = null,
    val groups: List<Group>? = null,
    val status: String? = null,
)
this will be a table where the groups column will (in theory) contain a List of Groups. There is no relationship, as there is just a column with a single stream of data (probably a JSON string).
However, if you wanted to have a table with the Colleges, a table with the Groups and a table with the Students, then you wouldn't embed the Groups within the College and subsequently the Students within the Groups.
Rather, you would approach this from a database relationship aspect. Assuming that a Group MUST belong to one and only one College and that a Student MUST belong to only a single Group, you would have something along the lines of:-
@Entity
data class College(
    @PrimaryKey
    val id: String,
    val name: String? = null,
    val description: String? = null,
    //val groups: List<Group>? = null, //<<<<< NO NEED (see Group)
    val status: String? = null,
)
@Entity
data class Group(
    @PrimaryKey
    val id: String,
    val name: String? = null,
    val description: String? = null,
    //val students: List<Students>? = null, //<<<<< NO NEED (similar to Group)
    val collegeId: String //<<<<< ADDED for relationships to parent College
)
Students would be handled similarly.
The crucial factor here is the new column/field/variable collegeId, which should contain the id of the parent College.
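Because the JSON does not carry the parent id, one option is to parse it into nested objects first and copy the college id into each group before inserting. The following is a minimal sketch of that flattening step in plain Java (the thread's own code is Kotlin). The classes CollegeDto, GroupDto and GroupRow and the method toGroupRows are hypothetical names; Room and Gson are not used here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CollegeFlattener {

    // Hypothetical shape of a parsed group: nested in a college, no parent id of its own.
    static final class GroupDto {
        final String id;
        final String name;
        GroupDto(String id, String name) { this.id = id; this.name = name; }
    }

    // Hypothetical shape of a parsed college.
    static final class CollegeDto {
        final String id;
        final List<GroupDto> groups; // may be null when the JSON omits "groups"
        CollegeDto(String id, List<GroupDto> groups) { this.id = id; this.groups = groups; }
    }

    // Hypothetical flat row matching a Group entity that has the added collegeId.
    static final class GroupRow {
        final String id;
        final String name;
        final String collegeId;
        GroupRow(String id, String name, String collegeId) {
            this.id = id; this.name = name; this.collegeId = collegeId;
        }
    }

    // Copies the parent college id into every child group.
    static List<GroupRow> toGroupRows(CollegeDto college) {
        List<GroupDto> groups = college.groups == null
                ? Collections.<GroupDto>emptyList() : college.groups;
        List<GroupRow> rows = new ArrayList<>();
        for (GroupDto g : groups) {
            rows.add(new GroupRow(g.id, g.name, college.id));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<GroupDto> groups = new ArrayList<>();
        groups.add(new GroupDto("groupid", "BCOM"));
        for (GroupRow row : toGroupRows(new CollegeDto("collegeid", groups))) {
            System.out.println(row.id + " belongs to " + row.collegeId);
        }
    }
}
```

The same idea would extend one level down: each student row would receive the id of the group it was nested in.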
To get the College(s) with the related Groups, you can have a class that has the @Embedded annotation for the parent (the College) and the @Relation annotation for the children (the Groups), e.g.
data class CollegeWithRelatedGroups(
    @Embedded
    val college: College,
    @Relation(
        entity = Group::class,
        parentColumn = "id",
        entityColumn = "collegeId"
    )
    val groups: List<Group>
)
Working Example
Here's a working example that uses the above and adds (inserts) 3 Colleges, and then 6 Groups. The first College has 3 related Groups, the second 2 and the third 1.
The example then extracts the Colleges with the related Groups outputting the result to the log.
College and also CollegeWithRelatedGroups
@Entity
data class College(
    @PrimaryKey
    val id: String,
    val name: String? = null,
    val description: String? = null,
    //val groups: List<Group>? = null,
    val status: String? = null
)
data class CollegeWithRelatedGroups(
    @Embedded
    val college: College,
    @Relation(
        entity = Group::class,
        parentColumn = "id",
        entityColumn = "collegeId"
    )
    val groups: List<Group>
)
Group (with Foreign Key constraint)
@Entity(
    /* Optional but suggested Foreign Key to enforce referential integrity
     */
    foreignKeys = [
        ForeignKey(
            entity = College::class,
            parentColumns = ["id"],
            childColumns = ["collegeId"],
            /* Optional within Foreign Key - helps maintain referential integrity (if CASCADE)
                ON DELETE will delete the Groups that are related to a College if the College is deleted
                ON UPDATE will change the collegeId of Groups that are related to a College if the id of the College is changed
             */
            onDelete = ForeignKey.CASCADE,
            onUpdate = ForeignKey.CASCADE
        )
    ]
)
data class Group(
    @PrimaryKey
    val id: String,
    val name: String? = null,
    val description: String? = null,
    //val students: List<Student>? = null,
    @ColumnInfo(index = true) /* faster to access via relationship if indexed */
    val collegeId: String
)
AllDao
@Dao
interface AllDao {
    @Insert(onConflict = OnConflictStrategy.IGNORE)
    fun insert(college: College): Long
    @Insert(onConflict = OnConflictStrategy.IGNORE)
    fun insert(group: Group): Long
    @Transaction
    @Query("SELECT * FROM college")
    fun getCollegesWithRelatedGroups(): List<CollegeWithRelatedGroups>
}
TheDatabase (note for brevity and convenience allows main thread processing)
@Database(entities = [College::class, Group::class], version = 1, exportSchema = false)
abstract class TheDatabase : RoomDatabase() {
    abstract fun getAllDao(): AllDao
    companion object {
        @Volatile
        private var instance: TheDatabase? = null
        fun getInstance(context: Context): TheDatabase {
            if (instance == null) {
                instance = Room.databaseBuilder(context, TheDatabase::class.java, "the_database.db")
                    .allowMainThreadQueries()
                    .build()
            }
            return instance as TheDatabase
        }
    }
}
Note there is no need for TypeConverters as objects are not being stored (aka simple types are stored)
Activity Code (MainActivity in this case)
const val TAG = "DBINFO"
class MainActivity : AppCompatActivity() {
    lateinit var db: TheDatabase
    lateinit var dao: AllDao
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        db = TheDatabase.getInstance(this)
        dao = db.getAllDao()
        dao.insert(College("College1","The first College","Opened"))
        dao.insert(College("College2","The second College","Opened"))
        dao.insert(College("New College","A new College","Being built"))
        dao.insert(Group("GroupA","Group A or something","The first Group - hence A","College1",))
        dao.insert(Group("GroupB","Group B or whatever"," The second Group ...", "College2"))
        dao.insert(Group("GroupC","Group C on so on","The third Group ...","New College"))
        dao.insert(Group("GroupX","Group X ...","The Xth group","College1"))
        dao.insert(Group("GroupY","...","...","College1"))
        dao.insert(Group("GroupZ","...","...","College2"))
        for (cwrg in dao.getCollegesWithRelatedGroups()) {
            Log.d(TAG,"College is ID=${cwrg.college.id} Desc=${cwrg.college.description} Name=${cwrg.college.name} Status=${cwrg.college.name}\nIt has ${cwrg.groups.size} Groups. They are:-")
            for (g in cwrg.groups) {
                Log.d(TAG,"\n\t\tGroup is ID=${g.id} Name=${g.name} Desc=${g.description} it reference CollegeID=${g.collegeId}")
            }
        }
    }
}
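Note: the status has not been supplied for the Colleges, so it is null. The log statement above prints the college name in the Status= position, which is why the name appears twice in the output.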
RESULT the log includes :-
D/DBINFO: College is ID=College1 Desc=Opened Name=The first College Status=The first College
It has 3 Groups. They are:-
D/DBINFO: Group is ID=GroupA Name=Group A or something Desc=The first Group - hence A it reference CollegeID=College1
D/DBINFO: Group is ID=GroupX Name=Group X ... Desc=The Xth group it reference CollegeID=College1
D/DBINFO: Group is ID=GroupY Name=... Desc=... it reference CollegeID=College1
D/DBINFO: College is ID=College2 Desc=Opened Name=The second College Status=The second College
It has 2 Groups. They are:-
D/DBINFO: Group is ID=GroupB Name=Group B or whatever Desc= The second Group ... it reference CollegeID=College2
D/DBINFO: Group is ID=GroupZ Name=... Desc=... it reference CollegeID=College2
D/DBINFO: College is ID=New College Desc=Being built Name=A new College Status=A new College
It has 1 Groups. They are:-
D/DBINFO: Group is ID=GroupC Name=Group C on so on Desc=The third Group ... it reference CollegeID=New College
The Database via App Inspection :- [two screenshots, not reproduced here]

Android Room Multimap issue for the same column names

As stated in official documentation, it's preferable to use the Multimap return type for the Android Room database.
With the next very simple example, it's not working correctly!
@Entity
data class User(@PrimaryKey(autoGenerate = true) val _id: Long = 0, val name: String)
@Entity
data class Book(@PrimaryKey(autoGenerate = true) val _id: Long = 0, val bookName: String, val userId: Long)
(I believe a lot of developers have the _id primary key in their tables)
Now, in the Dao class:
@Query(
    "SELECT * FROM user " +
    "JOIN book ON user._id = book.userId"
)
fun allUserBooks(): Flow<Map<User, List<Book>>>
The database tables:
Finally, when I run the above query, here is what I get:
While it should have 2 entries, as there are 2 users in the corresponding table.
PS. I'm using the latest Room version at this point, Version 2.4.0-beta02.
PPS. The issue is in how UserDao_Impl.java is being generated:
all the _id columns have the same index there.
Is there a chance to do something here? (instead of switching to the intermediate data classes).
"all the _id columns have the same index there. Is there a chance to do something here?"
Yes, use unique column names, e.g.
@Entity
data class User(@PrimaryKey(autoGenerate = true) val userid: Long = 0, val name: String)
@Entity
data class Book(@PrimaryKey(autoGenerate = true) val bookid: Long = 0, val bookName: String, val useridmap: Long)
as used in the example below,
or
@Entity
data class User(@PrimaryKey(autoGenerate = true) @ColumnInfo(name = "userid") val _id: Long = 0, val name: String)
@Entity
data class Book(@PrimaryKey(autoGenerate = true) @ColumnInfo(name = "bookid") val _id: Long = 0, val bookName: String, @ColumnInfo(name = "userid_map") val userId: Long)
Otherwise, as you may have noticed, Room uses the value of the last found column with the duplicated name and the User's _id is the value of the Book's _id column.
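The "last column wins" effect can be reproduced outside Room. The following is a plain-Java simulation, assuming a name-to-index lookup that is filled by walking the result columns in order. The class ColumnIndexSketch and the method indexByName are hypothetical, and this is not Room's generated code.

```java
import java.util.HashMap;
import java.util.Map;

final class ColumnIndexSketch {

    // Builds a column-name -> index lookup; a later duplicate overwrites an earlier one.
    static Map<String, Integer> indexByName(String[] columnNames) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < columnNames.length; i++) {
            index.put(columnNames[i], i);
        }
        return index;
    }

    public static void main(String[] args) {
        // Result columns of SELECT * FROM user JOIN book ... when both tables use _id.
        String[] columns = {"_id", "name", "_id", "bookName", "userId"};
        Object[] row = {1L, "Eugene", 3L, "Eugene's book #3", 1L};

        Map<String, Integer> index = indexByName(columns);
        // Looking up "_id" by name yields the book's _id, even when building a User.
        System.out.println("_id resolves to column " + index.get("_id")
                + " with value " + row[index.get("_id")]);
    }
}
```

With unique column names every lookup resolves to its own index, which is why renaming removes the problem.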
Using the above and replicating your data using :-
db = TheDatabase.getInstance(this)
dao = db.getAllDao()
var currentUserId = dao.insert(User(name = "Eugene"))
dao.insert(Book(bookName = "Eugene's book #1", useridmap = currentUserId))
dao.insert(Book(bookName = "Eugene's book #2", useridmap = currentUserId))
dao.insert(Book(bookName = "Eugene's book #3", useridmap = currentUserId))
currentUserId = dao.insert(User(name = "notEugene"))
dao.insert(Book(bookName = "not Eugene's book #4", useridmap = currentUserId))
dao.insert(Book(bookName = "not Eugene's book #5", useridmap = currentUserId))
var mapping = dao.allUserBooks() //<<<<<<<<<< BREAKPOINT HERE
for(m: Map.Entry<User,List<Book>> in mapping) {
}
for convenience and brevity a Flow hasn't been used and the above was run on the main thread.
Then the result is what I believe you are expecting :-
Additional
What if we already have the database structure with a lot of "_id" fields?
Then you have some decisions to make.
You could
do a migration to rename columns to avoid the ambiguous/duplicate column names.
use alternative POJOs in conjunction with changing the extracted output column names accordingly
e.g. have :-
data class Alt_User(val userId: Long, val name: String)
and
data class Alt_Book(val bookId: Long, val bookName: String, val user_id: Long)
along with :-
@Query("SELECT user._id AS userId, user.name, book._id AS bookId, bookName, user_id " +
    "FROM user JOIN book ON user._id = book.user_id")
fun allUserBooksAlt(): Map<Alt_User, List<Alt_Book>>
so user._id is output with the name as per the Alt_User POJO
other columns are output specifically (although you could use * as per allUserBooksAlt2)
:-
@Query("SELECT *, user._id AS userId, book._id AS bookId " +
    "FROM user JOIN book ON user._id = book.user_id")
fun allUserBooksAlt2(): Map<Alt_User, List<Alt_Book>>
same as allUserBooksAlt but also has the extra columns
you would get a warning :-
warning: The query returns some columns [_id, _id] which are not used by any of [a.a.so70190116kotlinroomambiguouscolumnsfromdocs.Alt_User, a.a.so70190116kotlinroomambiguouscolumnsfromdocs.Alt_Book]. You can use @ColumnInfo annotation on the fields to specify the mapping. You can annotate the method with @RewriteQueriesToDropUnusedColumns to direct Room to rewrite your query to avoid fetching unused columns. You can suppress this warning by annotating the method with @SuppressWarnings(RoomWarnings.CURSOR_MISMATCH). Columns returned by the query: _id, name, _id, bookName, user_id, userId, bookId. public abstract java.util.Map<a.a.so70190116kotlinroomambiguouscolumnsfromdocs.Alt_User, java.util.List<a.a.so70190116kotlinroomambiguouscolumnsfromdocs.Alt_Book>> allUserBooksAlt2();
Because of the documented caveat "Note that Room will not rewrite the query if it has multiple columns that have the same name as it does not yet have a way to distinguish which one is necessary.", @RewriteQueriesToDropUnusedColumns doesn't do away with the warning.
if using :-
var mapping = dao.allUserBooksAlt() //<<<<<<<<<< BREAKPOINT HERE
for(m: Map.Entry<Alt_User,List<Alt_Book>> in mapping) {
}
Would result in :-
possibly other options.
However, I'd suggest fixing the issue once and for all by using a migration to rename columns to all have unique names. e.g.

Java 8 stream map custom function and convert it to Map

I have the following object:
public class Book {
    private Long id;
    private Long bookId;
    private String bookName;
    private String owner;
}
It is represented by the following table:
Basically, a book can be owned by multiple owners, and an owner can own multiple books, e.g. owner "a" owns books 1 and 2.
I have a basic function that, when passed a Book object, returns its owner(s) in a List.
private List<String> getBookToOwner(Book book) {
    List<String> a = new ArrayList<>();
    if (book.getOwner() != null && !book.getOwner().isEmpty()) {
        a.addAll(Arrays.asList(book.getOwner().split("/")));
    }
    return a;
}
I want to use that to apply to each book, retrieve their owners and create the following Map.
Map<String, List<Long>> ownerToBookMap;
Like this:
How do I use streams here?
//books is List<Book>
Map<String, List<Long>> ownerToBookMap = books.stream().map(
// apply the above function to get its owners, flatten it and finally collect it to get the above Map object
// Need some help here..
);
You can get the owner list from each book, then use flatMap to flatten the owners into pairs of owner and bookId. Then group by owner using groupingBy and collect the list of bookIds for each owner.
Map<String, List<Long>> ownerToBookMap =
    books.stream()
        .flatMap(b -> getBookToOwner(b)
            .stream()
            .map(o -> new AbstractMap.SimpleEntry<>(o, b.getBookId())))
        .collect(Collectors.groupingBy(Map.Entry::getKey,
            Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
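To see the pipeline run end to end, here is a self-contained sketch with made-up sample data. The reduced Book class, the helper name owners, the class OwnerToBooks and the sample books are assumptions for illustration. It also passes TreeMap::new to groupingBy so the printed key order is deterministic, which the snippet above does not do.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class OwnerToBooks {

    // Reduced stand-in for the Book class in the question (only the fields used here).
    static final class Book {
        private final Long bookId;
        private final String owner; // owners separated by "/", e.g. "a/b"
        Book(Long bookId, String owner) { this.bookId = bookId; this.owner = owner; }
        Long getBookId() { return bookId; }
        String getOwner() { return owner; }
    }

    // Same contract as getBookToOwner in the question: null or empty owner means no owners.
    static List<String> owners(Book book) {
        String owner = book.getOwner();
        return owner == null || owner.isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(owner.split("/"));
    }

    static Map<String, List<Long>> ownerToBookMap(List<Book> books) {
        return books.stream()
                // one (owner, bookId) pair per owner of each book
                .flatMap(b -> owners(b).stream().map(o -> new SimpleEntry<>(o, b.getBookId())))
                // group the pairs by owner, keeping only the book IDs; TreeMap sorts the keys
                .collect(Collectors.groupingBy(Map.Entry::getKey, TreeMap::new,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                new Book(1L, "a/b"), new Book(2L, "a"), new Book(3L, null));
        System.out.println(ownerToBookMap(books)); // prints {a=[1, 2], b=[1]}
    }
}
```

Routing through the helper means a book whose owner is null simply contributes no pairs, whereas calling getOwner().split("/") directly on such a book would throw a NullPointerException.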
Flat-map the owners into a single stream, creating entries with a single owner as the key and a bookId as the value. Then group the entries by key (owner). Finally, use Collectors::mapping to get a List of bookIds instead of the actual entries:
List<Book> books = ...
Map<String, List<Long>> booksByOwner = books.stream()
    .flatMap(book -> Arrays.stream(book.getOwner().split("/"))
        .map(owner -> new AbstractMap.SimpleEntry<>(owner, book.getBookId())))
    .collect(Collectors.groupingBy(
        AbstractMap.SimpleEntry::getKey,
        Collectors.mapping(AbstractMap.SimpleEntry::getValue, Collectors.toList())));
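I use reduce instead of map.
Map<String, List<Long>> ownerToBookMap = books.stream().reduce(
    new HashMap<String, List<Long>>(),
    (acc, b) -> {
        // add the book ID under each of its owners
        getBookToOwner(b).forEach(
            o -> acc.computeIfAbsent(o, k -> new ArrayList<>()).add(b.getBookId()));
        return acc;
    },
    // combiner, only called for parallel streams; this mutating reduce is safe only on sequential streams
    (left, right) -> {
        right.forEach((o, ids) -> left.computeIfAbsent(o, k -> new ArrayList<>()).addAll(ids));
        return left;
    });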

leftjoin on two GlobalKTables

I am trying to join a stream to 2 different GlobalKTables, treating them as lookups: more specifically, devices (by user agent) and geocoding (by IP address).
The issue is with the serialization, but I don't get why. It gets stuck on DEFAULT_VALUE_SERDE_CLASS_CONFIG, but the topic to which I want to write is serialized correctly.
//
// Set up serialization / de-serialization
private static Serde<String> stringSerde = Serdes.String();
private static Serde<PodcastData> podcastSerde = StreamsSerdes.PodCastSerde();
private static Serde<GeoCodedData> geocodedSerde = StreamsSerdes.GeoIPSerde();
private static Serde<DeviceData> deviceSerde = StreamsSerdes.DeviceSerde();
private static Serde<JoinedPodcastGeoDeviceData> podcastGeoDeviceSerde = StreamsSerdes.PodcastGeoDeviceSerde();
private static Serde<JoinedPodCastDeviceData> podcastDeviceSerde = StreamsSerdes.PodcastDeviceDataSerde();
...
GlobalKTable<String, DeviceData> deviceIDTable = builder.globalTable(kafkaProperties.getProperty("deviceid-topic"));
GlobalKTable<String, GeoCodedData> geoIPTable = builder.globalTable(kafkaProperties.getProperty("geoip-topic"));
//
// Stream from source topic
KStream<String, PodcastData> podcastStream = builder.stream(
    kafkaProperties.getProperty("source-topic"),
    Consumed.with(stringSerde, podcastSerde));
//
podcastStream
    // left join the podcast stream to the device table, looking up the device
    .leftJoin(deviceIDTable,
        // get a DeviceData object from the user agent
        (podcastID, podcastData) -> podcastData.getUser_agent(),
        // join podcast and device and return a JoinedPodCastDeviceData object
        (podcastData, deviceData) -> {
            JoinedPodCastDeviceData data =
                JoinedPodCastDeviceData.builder().build();
            data.setPodcastObject(podcastData);
            data.setDeviceData(deviceData);
            return data;
        })
    // left join the podcast stream to the geo table, looking up the geo data
    .leftJoin(geoIPTable,
        // get a Geo object from the ip address
        (podcastID, podcastDeviceData) -> podcastDeviceData.getPodcastObject().getIp_address(),
        // join podcast and geo
        (podcastDeviceData, geoCodedData) -> {
            JoinedPodcastGeoDeviceData data =
                JoinedPodcastGeoDeviceData.builder().build();
            data.setGeoData(geoCodedData);
            data.setDeviceData(podcastDeviceData.getDeviceData());
            data.setPodcastData(podcastDeviceData.getPodcastObject());
            return data;
        })
    //
    .to(kafkaProperties.getProperty("sink-topic"),
        Produced.with(stringSerde, podcastGeoDeviceSerde));
...
...
streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, stringSerde.getClass().getName());
streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, stringSerde.getClass().getName());
The error
ERROR java.lang.String cannot be cast to DeviceData
streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, stringSerde.getClass().getName());
Because of the setting above, the application uses the String serde as the default value serde unless you specify one explicitly when creating a KTable/KStream/GlobalKTable.
Since the expected value type for deviceIDTable is DeviceData, you need to define the value serde on the GlobalKTable, as given below. The same applies to geoIPTable, whose value type is GeoCodedData (use geocodedSerde).
GlobalKTable<String, DeviceData> deviceIDTable = builder.globalTable(kafkaProperties.getProperty("deviceid-topic"),
    Materialized.<String, DeviceData, KeyValueStore<Bytes, byte[]>>as(DEVICE_STORE)
        .withKeySerde(stringSerde)
        .withValueSerde(deviceSerde));
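Why the mistake shows up as a ClassCastException rather than as a deserialization error can be illustrated with a simplified model. This is plain Java, not Kafka code: the class DefaultSerdeSketch, the reduced DeviceData class and the methods readValue and describe are hypothetical. The idea is that generic type parameters are erased at runtime, so a value produced by the String deserializer travels through the typed API unnoticed until something uses it as DeviceData.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

final class DefaultSerdeSketch {

    // Hypothetical stand-in for the DeviceData value type in the question.
    static final class DeviceData {
        final String model;
        DeviceData(String model) { this.model = model; }
    }

    // Stand-in for a configured default deserializer: always produces a String.
    static final Function<byte[], Object> DEFAULT_STRING_DESERIALIZER =
            bytes -> new String(bytes, StandardCharsets.UTF_8);

    // Mimics a table declared with value type V but read with whatever deserializer
    // is configured; the unchecked cast compiles because generics are erased.
    @SuppressWarnings("unchecked")
    static <V> V readValue(byte[] raw, Function<byte[], Object> deserializer) {
        return (V) deserializer.apply(raw);
    }

    // Returns the model, or a marker string if the value has the wrong runtime type.
    static String describe(byte[] raw, Function<byte[], Object> deserializer) {
        try {
            DeviceData device = readValue(raw, deserializer); // the cast is checked here
            return device.model;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        byte[] raw = "Pixel".getBytes(StandardCharsets.UTF_8);
        // Default (String) deserializer: fails only once the value is used as DeviceData.
        System.out.println(describe(raw, DEFAULT_STRING_DESERIALIZER)); // prints "ClassCastException"
        // Explicit deserializer for the value type: works.
        System.out.println(describe(raw,
                b -> new DeviceData(new String(b, StandardCharsets.UTF_8)))); // prints "Pixel"
    }
}
```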

how to select value and count in spring jpa?

I have a table named gifts that contains a field company_value_id, and I want to select company_value_id, count(company_value_id) for all rows, so that the result is a list of objects, each containing company_value_id and count(company_value_id).
I am using Spring Data JPA with annotations as follows:
public interface GiftsRepository extends JpaRepository<Gifts, String> {
    @Query("from Gifts g where g.companyGuid = :companyGuid")
    List<Gifts> getGiftsByCompany(@Param("companyGuid") String companyGuid);
}
please advise, thanks.
I was able to accomplish it as follows:
@Query("select g.value.id, g.value.name, count(g.value.id) from Gift g where g.user.id = :userId group by g.value")
List<Object[]> getUserGifts(
    @Param("userId") String userId);
and in the service layer I extract the values as follows:
List<Object[]> results = giftsRepository
    .getUserGifts(userId);
for (Object[] result : results) {
    String id = (String) result[0];
    String name = (String) result[1];
    int count = ((Number) result[2]).intValue();
}
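If the positional Object[] access becomes hard to follow, the rows could be copied into a small typed holder right after the query returns. This is a minimal sketch in plain Java with no JPA involved: the class GiftCountMapper, the holder GiftCount, the method toGiftCounts and the sample rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class GiftCountMapper {

    // Hypothetical holder for one row of the grouped query result.
    static final class GiftCount {
        final String valueId;
        final String valueName;
        final long count;
        GiftCount(String valueId, String valueName, long count) {
            this.valueId = valueId; this.valueName = valueName; this.count = count;
        }
    }

    // Converts rows shaped as [id, name, count] into typed objects.
    static List<GiftCount> toGiftCounts(List<Object[]> rows) {
        List<GiftCount> result = new ArrayList<>();
        for (Object[] row : rows) {
            // The count may arrive as Long or another Number type, so go through Number.
            result.add(new GiftCount((String) row[0], (String) row[1],
                    ((Number) row[2]).longValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"v1", "Teamwork", 3L});
        rows.add(new Object[] {"v2", "Integrity", 1L});
        for (GiftCount gc : toGiftCounts(rows)) {
            System.out.println(gc.valueId + " " + gc.valueName + " " + gc.count);
        }
    }
}
```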
You need to add a Pageable parameter to your method, like this:
@Query("from Gifts g where g.companyGuid = :companyGuid")
List<Gifts> getGiftsByCompany(@Param("companyGuid") String companyGuid, Pageable pageable);
and the Pageable can be created like this (in Spring Data 2.x and later, use PageRequest.of(...) with the same arguments instead of the constructor):
Pageable pageable = new PageRequest(pageIndex, pageSize, Direction.ASC, sortColumn);

Resources