Morphia. How to get a part of information from big datastore


I have a problem with Morphia.
Could someone help me?
I am writing a web project on Spring + MongoDB about movies and celebrities.
I have an entity class Genre:
@Entity(value="genres")
public class Genre implements IGenre {
    @Id
    @Indexed
    private ObjectId id;
    @Indexed
    private String name;
    private String description;
    private long quantity;
    private Set<IMovie> movies;
    // getters and setters
}
And entity class Movie:
@Entity(value="movies")
public class Movie implements IMovie {
    @Id
    @Indexed
    private ObjectId id;
    @Indexed
    private String originalTitle;
    private String year;
    private Set<IGenre> genres;
    // getters and setters
}
I have 30 genres. One of them, for example, is Comedy.
I also have 250 000 comedies.
Now I want to paginate movies by genre = comedy.
How can I get only 20 records out of all the comedies?
If I use the @Embedded or @Reference annotation, I still get the entire list at once, and it is too big to use in controllers.

You should change your data schema to support such a query. The schema you use has a circular dependency: your Genre entity holds Movie entities, and each Movie holds Genre entities. Holding all of the movies of a genre inside the genre also makes them hard to query. If I were you, I would use a schema like this:
// With noClassnameStored = true the class name is not stored, so renaming
// the class or moving it to another package causes no problems.
@Entity(noClassnameStored = true)
public class Movie implements IMovie {
    @Id
    @Indexed
    private ObjectId id;
    @Indexed
    private String originalTitle;
    private String year;
    // unique identifiers of the genres instead of embedding the whole genre entity
    private Set<String> genres;
    // getters and setters
}
With such a schema, you can retrieve the movies of a particular genre with a simple $in query on the genres field. An example query for your case:
datastore.find(Movie.class).field("genres").in(Lists.newArrayList("comedy")).limit(20).asList();
The MongoDB page below has suggestions on how to design your schema for different scenarios:
http://docs.mongodb.org/manual/core/data-modeling/#data-modeling-patterns-and-examples

I'm not too familiar with Mongo, but it looks like you would need to implement a custom query here. Whatever you do, you need to pass a start for your page, as well as a page size (20 in your case).
You can do pagination in Morphia by combining .offset(page_start) and .limit(page_size) on a query. So first you would create a query to get movies that belong to a certain genre, and then apply the pagination.
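As a rough illustration of the offset/limit arithmetic, here is a minimal sketch in plain Java. PageWindow is a hypothetical helper, not part of Morphia, and the in-memory list only stands in for a real query; with Morphia, the same two numbers would be passed to .offset(...) and .limit(...).

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper (not part of Morphia): turns a zero-based page index
// into the offset/limit pair that a paginated query needs.
final class PageWindow {
    final int offset; // number of records to skip
    final int limit;  // number of records to return

    PageWindow(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        this.offset = Math.multiplyExact(pageIndex, pageSize);
        this.limit = pageSize;
    }

    // In-memory stand-in for query.offset(offset).limit(limit) on a real datastore.
    <T> List<T> applyTo(List<T> all) {
        return all.stream().skip(offset).limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 95 fake movie ids; the third page (index 2) of size 20 starts at id 40.
        List<Integer> ids = IntStream.range(0, 95).boxed().collect(Collectors.toList());
        PageWindow third = new PageWindow(2, 20);
        System.out.println(third.offset + " " + third.applyTo(ids).size()); // prints "40 20"
    }
}
```

The last page simply comes back shorter than the page size, so the caller does not need a separate count query just to render it.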
It looks like Spring handles this through repositories: http://static.springsource.org/spring-data/data-mongodb/docs/1.0.0.RELEASE/reference/html/#repositories.special-parameters
You'd use a Pageable implementation to pass paging data, without having to make the offset and limit calls yourself. There's an example of "Web pagination" further on in that document.
Hope this helps!

Related

Elastic search with Spring Data for Reactive Repository - Deleting based on nested attributes

I am using Spring Data for Elasticsearch with ReactiveCrudRepository for things like finding and deleting. I noticed that for attributes that sit at the root and are simple objects, deletion works (deleteByAttributeName). However, it does not work for nested objects.
Here are my entities:
Book
@Data
@TypeAlias("book")
@Document(indexName = "book")
public class EsBook {
    @Field(type = FieldType.Long)
    private Long id;
    @Field(type = FieldType.Nested)
    private EsStats stats;
    @Field(type = FieldType.Date, format = DateFormat.date)
    private LocalDate publishDate;
}
Stats
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@EqualsAndHashCode
public class EsStats {
    @Field(type = FieldType.Double)
    private Double averageRating;
    @Field(type = FieldType.Integer)
    private Double totalRatings;
    @Field(type = FieldType.Keyword)
    private String category; // this can be null
}
Here is what I have tried, and what works and what does not:
I used ReactiveCrudRepository to delete documents in the index. For regular fields at the Book level, such as id, or id together with publishDate, deletion works perfectly. As soon as I use an embedded object like Stats, it stops working. The documents and the stats I am sending match, at least visually, but the query never finds or deletes them.
I tried adding @EqualsAndHashCode to Stats, assuming that internally the objects were somehow not considered equal. I also tried changing the double data type to int: in the Elasticsearch document, an average rating that is a whole number such as 3 is saved as 3, but when sent from Java it shows up as 3.0 in the debugger, so I suspected that was the cause. It does not seem to be; even with the data type changed to int, deletion does not work.
public interface ReactiveBookRepository extends ReactiveCrudRepository<EsBook, String> {
    Mono<Void> deleteById(long id); // working
    Mono<Void> deleteByIdAndPublishDate(long id, LocalDate publishDate); // Not working
    Mono<Void> deleteByIdAndStats(long id, LocalDate startDate);
}
Any help will be appreciated.

Have you verified that your Elasticsearch index mapping matches your Spring Data annotations?
Verify that the index mapping defines the stats field as a nested field type.
If not, then try changing your Spring annotation to:
@Field(type = FieldType.Object)
private EsStats stats;

Hibernate Search @IndexedEmbedded on a polymorphic relationship (@Any, @ManyToAny)

I'm using Hibernate Search and looking to index an object that has polymorphic relationships that use @Any and/or @ManyToAny.
@Indexed
public class Foo {
    @Any(metaDef="fooOwnerType", metaColumn=@Column(name="ownerType"))
    @JoinColumn(name="ownerId")
    @IndexedEmbedded // this DOES NOT WORK
    private OwnerType owner;
    @OneToOne
    @IndexedEmbedded // this WORKS
    private User user;
    @OneToOne
    @IndexedEmbedded // this WORKS
    private Company company;
    @Field
    private String description;
}
@Indexed
public class User implements OwnerType {
    @Field
    private String name;
    @Field
    private String address;
}
public class Company implements OwnerType {
    @Field
    private String name;
}
public interface OwnerType {
}
I can search and find Foo objects using text in the description field without issue. What I'd also like to do is find Foo objects when User.name or User.address is matched... but Hibernate Search doesn't seem to index these fields for me due to the polymorphic relationship OwnerType owner.
It works fine, as expected, if I use @IndexedEmbedded directly on a concrete object (User or Company).

Yes, this is expected. @IndexedEmbedded only adds fields for the exposed type of the embedded field. There are no concrete plans to fix it at the moment, but there is a feature request here: https://hibernate.atlassian.net/browse/HSEARCH-438
Also, interfaces cannot be mapped for indexing, only classes can. This will be fixed in Search 6: https://hibernate.atlassian.net/browse/HSEARCH-1656
One way to make your code work would be to turn OwnerType into an abstract class, either a @MappedSuperclass or an @Entity, and move the fields that are common to every subclass there.
EDIT: If the user/company associations are mutually exclusive (only one can be non-null), another way to make it work would be to simply query both. For example instead of querying owner.name, query both fields user.name and company.name. The Hibernate Search DSL allows that: just use .onFields("user.name", "company.name") instead of .onField("owner.name").

How do I get Spring's Data Rest Repository to retrieve data by its name instead of its id

I am using Spring Data's REST repositories from spring-boot-starter-data-rest, with Couchbase as the underlying DBMS.
My POJO for the object is set up like so:
@Document
public class Item {
    @Id @GeneratedValue(strategy = UNIQUE)
    private String id;
    @NotNull
    private String name;
    // other items and getters and setters here
}
Say the Item has an id of "xxx-xxx-xxx-xxx" and a name of "testItem".
The problem is that I need the item to be accessible at /items/testItem, but instead it is accessible at /items/xxx-xxx-xxx-xxx.
How do I get it to use the name instead of the generated id to retrieve the data?

I found out the answer to my own question.
I just need to override the config for the EntityLookup.
@Component
public class SpringDataRestCustomization extends RepositoryRestConfigurerAdapter {
    @Override
    public void configureRepositoryRestConfiguration(RepositoryRestConfiguration config) {
        config.withEntityLookup().forRepository(UserRepository.class).
            withIdMapping(User::getUsername).
            withLookup(UserRepository::findByUsername);
    }
}
Found the info here, though the method name changed slightly.
https://github.com/spring-projects/spring-data-examples/tree/master/rest/uri-customization

If you want to query the item by name and have it behave like querying by id, you should make sure the name is unique too. You can't identify a specific object by name if several objects share the same name, right?
With JPA you could do it like this:
@NotNull
@Column(name="name", nullable=false, unique=true)
private String name;

Sorting on @Transient column in Spring Data Rest via PagingAndSortingRepository

Our application uses PagingAndSortingRepository to serve our REST API. This works great, but we ran into a specific edge case that we can't seem to solve:
We have an alphanumeric field that has to be sortable (e.g. SOMETHING-123). One possible solution was to use something like a regex inside the database query's order by. This was ruled out, as we wanted to stay database independent. Thus we split the column into two columns.
So before we had an Entity with 1 String field:
@Entity
public class TestingEntity {
    @Id
    @GeneratedValue
    private long id;
    private String alphanumeric;
}
And now we have an Entity with 2 additional fields and the old field made @Transient, which is filled at @PostLoad:
@Entity
public class Testing {
    @Id
    @GeneratedValue
    private long id;
    @Transient
    public String alphanumeric;
    @PostLoad
    public void postLoad() {
        this.alphanumeric = this.alphabetic + "-" + this.numeric;
    }
    public void setAlphanumeric(String alphanumeric) {
        int index = alphanumeric.indexOf("-");
        this.alphabetic = alphanumeric.substring(0, index);
        this.numeric = Long.parseLong(alphanumeric.substring(index + 1));
    }
    @JsonIgnore
    private String alphabetic;
    @JsonIgnore
    private Long numeric;
}
This is working great and the additional fields do not get exposed. However, sorting on the field "alphanumeric" obviously does not work anymore. The simplest solution would be to make this request:
localhost:8080/api/testingEntity?sort=alphanumeric,asc
and internally rewrite it to the working request:
localhost:8080/api/testingEntity?sort=alphabetic,asc&sort=numeric,asc
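The rewrite itself can be sketched in plain Java as below. This is only an illustration of the mapping: SortRewriter is a hypothetical helper, not part of Spring Data, it works on the raw "property,direction" strings, and where it would hook into the request handling is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Spring Data): expands a sort on the
// transient "alphanumeric" property into sorts on its two persisted parts.
final class SortRewriter {
    static List<String> rewrite(List<String> sortParams) {
        List<String> rewritten = new ArrayList<>();
        for (String param : sortParams) {
            String[] parts = param.split(",", 2);
            String property = parts[0].trim();
            // Keep the requested direction; assume ascending when none is given.
            String direction = parts.length > 1 ? parts[1].trim() : "asc";
            if (property.equals("alphanumeric")) {
                rewritten.add("alphabetic," + direction);
                rewritten.add("numeric," + direction);
            } else {
                rewritten.add(property + "," + direction);
            }
        }
        return rewritten;
    }

    public static void main(String[] args) {
        // prints "[alphabetic,asc, numeric,asc]"
        System.out.println(rewrite(Arrays.asList("alphanumeric,asc")));
    }
}
```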
What is the best way to tackle this issue?

Save object in database if it does not already exist (Hibernate & Spring)

I'm working on a Hibernate/Spring application to manage some movies.
The class movie has a many to many relationship with the class genre.
Both of these classes have generated IDs using the @GeneratedValue annotation.
The genre is saved through the movie object by using @Cascade(CascadeType.SAVE_UPDATE).
I have placed a unique constraint on the genre's type attribute (which is its name; "Fantasy" for example).
What I would like to do now is have Hibernate check whether there is already a genre with type "Fantasy" and, if there is, use that genre's id instead of trying to insert a new record.
(The latter would obviously throw an error.)
In the end, what I need is something like select-before-update, but more like select-before-save.
Is there such a function in Hibernate?
Some code:
Movie class
@Entity
public class Movie {
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private int id;
    private String name;
    @Lob
    private String description;
    @Temporal(TemporalType.TIME)
    private Date releaseDate;
    @ManyToMany
    @Cascade(CascadeType.SAVE_UPDATE)
    private Set<Genre> genres = new HashSet<Genre>();
    .... //other methods
Genre class
@Entity
public class Genre {
    @Column(unique=true)
    private String type;
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private int id;
    ....//other methods

You may be over-thinking this. Any select-before-update/select-before-save option is going to result in 2 DB round trips, the first for the select, and the second for the insert if necessary.
If you know you won't have a lot of genres from the outset, you do have a couple of options for doing this in 1 RT most of the time:
The Hibernate second-level cache can hold many if not all of your Genres, resulting in a simple hashtable lookup (assuming a single node) when you check for existence.
You can assume all of your genres are already existing, use session.load(), and handle the new insert as a result of the row not found exception that gets thrown when you reference a genre that doesn't already exist.
Realistically, though, unless you're talking about a LOT of transactions, a simple pre-query before save/update is not going to kill your performance.

I haven't heard of a select-before-update/select-before-save function in Hibernate.
In situations like these you should treat Hibernate as if it were JDBC.
First, if you want to know whether such a Genre already exists, you should query for it.
If it does, then SAVE_UPDATE will not create a new one when you add it to a movie.
If it doesn't, Hibernate will create a new Genre row in the database and add the connection to the many_to_many table for you.
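The query-first flow can be sketched as follows. This is a minimal illustration only: GenreRegistry and its nested Genre class are hypothetical, and the HashMap stands in for the genre table, so in the real application the lookup would be a query on the unique type column and the insert would go through Hibernate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the genre table, keyed by the unique "type".
final class GenreRegistry {
    static final class Genre {
        final int id;
        final String type;

        Genre(int id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    private final Map<String, Genre> byType = new HashMap<>();
    private int nextId = 1;

    // Select first; create a new genre only when none with this type exists yet.
    Genre findOrCreate(String type) {
        Genre existing = byType.get(type);
        if (existing != null) {
            return existing;
        }
        Genre created = new Genre(nextId++, type);
        byType.put(type, created);
        return created;
    }

    int size() {
        return byType.size();
    }

    public static void main(String[] args) {
        GenreRegistry registry = new GenreRegistry();
        Genre first = registry.findOrCreate("Fantasy");
        Genre second = registry.findOrCreate("Fantasy");
        System.out.println(first == second); // prints "true": the existing genre is reused
    }
}
```

A movie would then add the returned genre to its genres set, so the cascade sees an already persistent genre when one exists instead of trying to insert a duplicate.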
