Specifying a Sharded Collection with Spring Data MongoDB - spring

I am using Spring Boot and Spring Data MongoDB to interface with an underlying sharded MongoDB cluster. My Spring Boot application accesses the cluster via a mongos router.
Using Spring Data MongoDB, you can specify the collection an object is persisted to via @Document(collection = "nameOfCollection"), or it defaults to the class name (first letter lowercase). These collections do not need to exist beforehand; they can be created at runtime.
To shard a collection in MongoDB, you need to
1 - Enable sharding on the Database: sh.enableSharding("myDb")
2 - Shard the collection on a sharded database: sh.shardCollection("myDb.myCollection", {id:"hashed"})
Assuming there is an existing sharded database, does Spring Data MongoDB offer a way to shard a collection with a shard key? As far as I can tell, I cannot shard a collection with Spring, and therefore must configure the sharded collection before my Boot application runs. I find it odd that Spring would allow me to use undefined collections, but does not provide a way to configure the collection.
Edit:
I have seen both Sharding with spring mongo and How configuring access to a sharded collection in spring-data for mongo? which refer more to the deployment of a sharded MongoDB cluster. This question assumes all the plumbing is there and that the collection itself simply must be sharded.

Although this question is old, I had the same question, and it looks like there is now a way to provide a custom shard key.
Annotation-based shard key configuration is available as of spring-data-mongodb 3.x:
https://docs.spring.io/spring-data/mongodb/docs/3.0.x/reference/html/#sharding
@Document("users")
@Sharded(shardKey = { "country", "userId" })
public class User {
    @Id
    Long id;
    @Field("userid")
    String userId;
    String country;
}
As of today, spring-boot-starter-data-mongodb still ships with a 2.x version, though.

Even though this is not a Spring Data solution, a potential workaround is described in "how to execute mongo admin command from java", where the DB can be acquired from a Spring MongoTemplate.
DB db = mongo.getDB("admin");
DBObject cmd = new BasicDBObject();
cmd.put("shardcollection", "testDB.x");
cmd.put("key", new BasicDBObject("userId", 1));
CommandResult result = db.command(cmd);
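The snippet above uses the legacy DB/DBObject API. As a rough sketch of the same idea without any driver dependency, the two admin commands from the question can be built as plain insertion-ordered maps. The class and method names here (ShardCommands, enableSharding, shardCollection) are hypothetical helpers for illustration, not part of Spring Data or the MongoDB driver.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds the sharding admin commands as plain maps (hypothetical helper, no driver required). */
final class ShardCommands {

    /** Command document equivalent to sh.enableSharding("myDb"). */
    static Map<String, Object> enableSharding(String database) {
        Map<String, Object> cmd = new LinkedHashMap<>();
        cmd.put("enableSharding", database);
        return cmd;
    }

    /** Command document equivalent to sh.shardCollection(namespace, {field: "hashed"}) or {field: 1}. */
    static Map<String, Object> shardCollection(String namespace, String field, boolean hashed) {
        Map<String, Object> key = new LinkedHashMap<>();
        Object direction = hashed ? (Object) "hashed" : (Object) Integer.valueOf(1);
        key.put(field, direction);

        Map<String, Object> cmd = new LinkedHashMap<>();
        // The command name has to be the first entry, hence the insertion-ordered map.
        cmd.put("shardCollection", namespace);
        cmd.put("key", key);
        return cmd;
    }
}
```

With the current Java driver on the classpath, such a map could presumably be wrapped in new org.bson.Document(map) and passed to MongoDatabase.runCommand on the admin database; that wiring is left out here because it depends on the driver version in use.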

I was running into the same problem with our update queries, which internally used save().
How was it solved?
I overrode the spring-data-mongodb dependency pulled in by the Spring Boot starter (2.1.x) with a 3.x release, which supports the @Sharded annotation:
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-mongodb</artifactId>
<version>3.1.5</version>
</dependency>
This allows you to write
@Document(collection = "hotelsdevice")
@Sharded(shardKey = { "device" })
public class Hotel extends BaseModel {
With this, Spring Data knows our shard key and can include it in the queries it sends to MongoDB. I assume this will also fix our count() queries, which were failing with the same error: "query need to target a shard".
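As far as I understand the reference documentation, the annotation does not shard the collection itself; it lets Spring Data add the shard key fields to the filter of replace-style writes such as save(), so that mongos can target a single shard. The sketch below only imitates that idea with plain maps. ShardedFilterSketch and saveFilter are made-up names, and this is not Spring Data's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrates why a save() on a sharded collection needs the shard key in its filter. */
final class ShardedFilterSketch {

    /**
     * Builds the filter for a replace-style save: always the _id, plus every
     * shard key field, so that the router can route the write to one shard.
     */
    static Map<String, Object> saveFilter(Map<String, Object> document, List<String> shardKey) {
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("_id", document.get("_id"));
        for (String field : shardKey) {
            filter.put(field, document.get(field));
        }
        return filter;
    }
}
```

For a Hotel document with _id 7 and device "tablet", an empty shard key yields a filter on _id only, while a shard key of "device" yields a filter on both _id and device.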

Related

How spring data Cassandra handles null while saving

I have a Spring Boot application using spring-boot-starter-data-cassandra version 2.1.2.RELEASE.
I need to understand how Spring Data Cassandra internally handles null values in an entity when performing an insert.
I am using the AsyncCassandraOperations.insert(<T>) method to persist these entities. In some cases, a few of an entity's fields may be null. Does this approach impact Cassandra performance, or can it create tombstones in Cassandra? If so, please suggest a better approach.
There are no issues if some of the individual fields of an object are null. There are problems if one of the fields that comprise the key is null.

Java based ETL Application

I want to build a spring framework based ETL application. I should be able to create an exact copy of any table in a database. Hence, the structure of the table is not known to me beforehand. So, creation of entities is not possible within the application.
The idea is to provide some external configuration to the application for each table. The application should then be able to create an exact copy of the table.
I cannot use Spring JPA as it requires creation of entities. Thus, I am planning to use Spring's JdbcTemplate. Will JdbcTemplate be the right framework for my application?
I am not ready to use Pentaho; rather, I want to build something like it in Java.
You can use Spark.
Here is an example of how you can do it
import java.util.Properties;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DemoApp {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName(DemoApp.class.getName())
                .getOrCreate();
        // Read the whole table over JDBC; Spark takes the schema from the database.
        Dataset<Row> table1 = spark.read().jdbc("jdbc:postgresql://127.0.0.1:5432/postgres", "demo.table", getConnectionProperties());
    }

    // Connection settings for the PostgreSQL JDBC driver.
    private static Properties getConnectionProperties() {
        Properties connectionProperties = new Properties();
        connectionProperties.put("user", "postgres");
        connectionProperties.put("password", "password");
        connectionProperties.put("driver", "org.postgresql.Driver");
        connectionProperties.put("stringtype", "unspecified");
        return connectionProperties;
    }
}
You can read several tables and after that join them or do other things you like.
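If Spark is more than the job needs, the configuration-driven idea from the question can also be sketched with plain SQL strings. This is a minimal sketch, assuming the column names are discovered at runtime (for example from java.sql.ResultSetMetaData); TableCopySql and its methods are hypothetical names, and identifiers are not quoted or validated here, so they must come from trusted configuration.

```java
import java.util.Collections;
import java.util.List;

/** Builds the SQL needed to copy a table whose columns are only known at runtime. */
final class TableCopySql {

    /** SELECT that reads every listed column from the source table. */
    static String selectSql(String table, List<String> columns) {
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }

    /** Parameterized INSERT with one placeholder per column. */
    static String insertSql(String table, List<String> columns) {
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + String.join(", ", columns) + ") VALUES (" + placeholders + ")";
    }
}
```

The generated strings could then be handed to JdbcTemplate, for instance queryForList(String) for the read and batchUpdate(String, List<Object[]>) for the write, which avoids the need for entity classes.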

Spring Listener for MongoDB data insertion

I have a requirement where I need to display data inserted into a particular MongoDB collection on a user's dashboard in real time. Please note that data may be inserted by this user or by other users. The dashboard is part of a Spring MVC web application. The MongoDB data layer is written in Spring Data.
I intend to use Server-sent events approach to push the newly inserted data to the dashboard. I am looking for an efficient way to listen to data insertion using Spring. I am even open to a non-Spring approach to implement the Listener that will eventually talk to my Spring SSE emitter.
If all the saves go through your Spring Data layer, you can make use of the MongoDB lifecycle events (see the docs).
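import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.data.mongodb.core.mapping.event.AbstractMongoEventListener;
import org.springframework.data.mongodb.core.mapping.event.AfterSaveEvent;
import org.springframework.stereotype.Component;

@Component
public class MongoListener extends AbstractMongoEventListener<Account> {
    private static final Logger LOG = LoggerFactory.getLogger(MongoListener.class);

    // Invoked by Spring Data after an Account has been saved through the mapping layer.
    @Override
    public void onAfterSave(AfterSaveEvent<Account> event) {
        if (LOG.isDebugEnabled()) {
            LOG.debug("onAfterSave({}, {})", event.getSource(), event.getDocument());
        }
    }
}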
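To get from such a listener to the dashboard, the saved object has to be handed to every open SSE connection. A minimal sketch of that hand-off using only the JDK is shown below; InsertBroadcaster is a hypothetical class, and in a Spring MVC application each subscriber would presumably wrap a call to SseEmitter.send.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Fans out newly saved objects to all currently registered subscribers. */
final class InsertBroadcaster<T> {

    // Copy-on-write list: safe to iterate while other threads subscribe or unsubscribe.
    private final List<Consumer<T>> subscribers = new CopyOnWriteArrayList<>();

    /** Registers a subscriber and returns a handle that removes it again. */
    Runnable subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
        return () -> subscribers.remove(subscriber);
    }

    /** Called from the save listener; delivers the saved object to every subscriber. */
    void publish(T saved) {
        for (Consumer<T> subscriber : subscribers) {
            subscriber.accept(saved);
        }
    }
}
```

The listener would call publish(event.getSource()) in onAfterSave, and the controller that opens an SSE stream would subscribe and run the returned handle when the connection completes or times out.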
If some writes bypass the Spring Data layer, you would have to read the MongoDB oplog and process it, or create a capped collection and use tailable cursors.
Here is a sample project using tailable cursors.
You can do it via the oplog collection and tailable cursors in MongoDB. For example, open the oplog collection with the flags QUERYOPTION_TAILABLE | QUERYOPTION_AWAITDATA (CursorType.TailableAwait in the current MongoDB Java Driver) and run the following query, where db is the "local" database:
MongoCursor<Document> cursor = db.getCollection("oplog.rs").find(Filters.and(Filters.eq("ns", "dbName.collectionName"), Filters.eq("op", "i"))).cursorType(CursorType.TailableAwait).iterator();
where "dbName.collectionName" is the namespace of your collection (the oplog stores it as database name, a dot, and the collection name) and "i" is the insert operation. After receiving events from the cursor, you can send them into a shared stream.
Unfortunately, I'm not familiar enough with Spring Data to provide an example for this, but the approach should be the same.

How should I define non-entity repositories with Spring Data MongoDB?

In my domain I have the usual entities (User, Company, etc.) and also "entities" that don't change; I mean they are fixed values but are stored in the database. My backend is Mongo, so I make use of MongoRepository. I'm also using Spring Data REST.
Let's say I have defined Sector as an entity, which is nothing more than a String wrapped in a Java object.
So this is how I define the repository:
@RepositoryRestResource
public interface SectorRepo extends MongoRepository<Sector,String>{
}
The thing is that this seems inappropriate: I should not define an object that only wraps a string and treat it as an entity, because it isn't one. The only purpose of the Sector collection is to populate a combo box, nothing more.
The problem gets serious when you have more and more of these non-entity objects.
How should I approach this situation so that I can still use MongoRepository + Spring Data REST?
This is similar to a couple of other questions. Please see my answers to both. Hope it helps.
Spring Data MongoDB eliminate POJO's
Storing a JSON schema in mongodb with spring

Spring Data : relationships between 2 different data sources

In a Spring Boot Application project, I have 2 data sources:
a MySQL database (aka "db1")
a MongoDB database (aka "db2")
I'm using Spring Data JPA and Spring Data MongoDB, and it's working great... one at a time.
Say db1 handles "Players" and db2 handles "Teams" (each with a list of player IDs). Is it possible to make the relationship between those two heterogeneous entities work (i.e. @ManyToOne, @Transactional, lazy/eager loading, etc.)?
For example, I want to be able to write:
List<Player> fooPlayers = teamDao.findOneById(foo).getPlayers();
EDIT: If possible, I'd like to find a solution working with any spring data project
Unfortunately, your conundrum has no solution in Spring Data.
One possibility is to create a DAO interface of your own whose implementation queries both of your databases. A very crude and short example would be:
your DAO
{
    yourFind(id)
    {
        // find the team in db2 and return its list of player IDs
        findOneById(id)
        // for each player ID in that list, query db1
        getPlayer(player)
    }
}
I hope this points you in the right direction.
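The outline above can be sketched in Java as follows. This is only an illustration under assumptions: CrossStoreTeamDao is a made-up class, and the two lookups are passed in as functions so the sketch does not depend on any particular repository type. In a real application they would presumably delegate to the MongoDB team repository and the JPA player repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Resolves a team's players by combining lookups against two separate stores. */
final class CrossStoreTeamDao<P> {

    private final Function<String, List<Long>> playerIdsByTeam; // would query db2 (MongoDB)
    private final Function<Long, Optional<P>> playerById;       // would query db1 (MySQL)

    CrossStoreTeamDao(Function<String, List<Long>> playerIdsByTeam,
                      Function<Long, Optional<P>> playerById) {
        this.playerIdsByTeam = playerIdsByTeam;
        this.playerById = playerById;
    }

    /** Loads the team's player IDs from one store, then each player from the other. */
    List<P> findPlayersOfTeam(String teamId) {
        List<P> players = new ArrayList<>();
        for (Long playerId : playerIdsByTeam.apply(teamId)) {
            // IDs without a matching player row are skipped rather than failing.
            playerById.apply(playerId).ifPresent(players::add);
        }
        return players;
    }
}
```

Note that this join happens in application code, so features such as lazy loading or a single transaction spanning both databases do not come for free.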
