Enum data type in Cassandra


I am trying to migrate my database from MySQL to Cassandra. The problem I am facing is with one of the columns, which is defined as an enum (enum('GP','NGP','PGP','PAGP')). Cassandra does not support enum data types (it does support collections, though). Is there a way to implement an enum data type in Cassandra, so that the value of a column is restricted to a fixed set of values? I am using Apache Cassandra version 2.0.7.

See the DataStax Java driver object-mapping API:
http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/reference/crudOperations.html
enum Gender { FEMALE, MALE };
// FEMALE will be persisted as 'FEMALE'
@Enumerated(EnumType.STRING)
private Gender gender;
// FEMALE will be persisted as 0, MALE as 1
@Enumerated(EnumType.ORDINAL)
private Gender gender;
For Cassandra 3.0 (Java driver 3.x), register an enum codec from the driver extras instead:
enum State {INIT, RUNNING, STOPPING, STOPPED}
cluster.getConfiguration().getCodecRegistry()
.register(new EnumNameCodec<State>(State.class));
// schema: create table name_example(id int PRIMARY KEY, state text)
session.execute("insert into name_example (id, state) values (1, ?)", State.INIT);
// state is saved as 'INIT'
http://docs.datastax.com/en/developer/java-driver/3.1/manual/custom_codecs/extras/

As far as I know, and after reading the documentation on CQL types, you cannot use an enum directly in CQL statements (I checked this for the Java clients).
So the option you have is to convert the enum to a String when including the field in a CQL statement. This way your application code uses the enum everywhere, while the backend layer works with its string representation.
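The conversion described above can be sketched in plain Java as follows. This is only an illustration under assumptions: the enum name PlanType and the helper class PlanTypeColumn are hypothetical, and the column is assumed to be a CQL text column. Cassandra itself will not enforce the restriction, so the check happens in application code.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical enum mirroring the MySQL column enum('GP','NGP','PGP','PAGP').
enum PlanType { GP, NGP, PGP, PAGP }

// Hypothetical helper for the backend layer; not part of any driver API.
class PlanTypeColumn {
    // Allowed text values, derived from the enum constants.
    static final Set<String> ALLOWED =
            Arrays.stream(PlanType.values()).map(Enum::name).collect(Collectors.toSet());

    // Application -> database: store the constant's name in a text column.
    static String toColumn(PlanType value) {
        return value.name();
    }

    // Database -> application: reject anything outside the allowed set.
    static PlanType fromColumn(String stored) {
        if (!ALLOWED.contains(stored)) {
            throw new IllegalArgumentException("Unexpected value in column: " + stored);
        }
        return PlanType.valueOf(stored);
    }
}
```

The value returned by toColumn would be bound as an ordinary string parameter in the CQL statement, and fromColumn would be applied to whatever the row returns.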

I was facing the same issue with an integer enum... here's what I did:
MappingConfiguration.Global.Define(
new[] {
new Map<Login>()
.TableName("logins")
.PartitionKey(el => el.UserId)
.Column(el => el.UserId, cm => cm.WithName("user_id"))
.Column(el => el.Gender, cm => cm.WithName("gender_id").WithDbType<int>()),
});
Using C# driver 2.5 and DSE 4.7.

There is more or less native support for enums through the DataStax Java driver's object mapper:
http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/reference/crudOperations.html
As far as I know, you can also write your own custom serializers for Cassandra so that it understands your specific enum, but those jars then have to be placed in the Cassandra folder.
You can also store the enum as a String or as its ordinal int value.

Related

Jooq enum converter uses the ordinal number. How can I switch to use the enum value number instead?

We have an Enum class with customized values. The values are different from their ordinals intentionally for business purpose which I cannot change.
enum class Role(val value: Int) {
EXECUTOR(1),
MONITOR(3),
ADMIN(5);
companion object {
private val map = Role.values().associateBy(Role::value)
fun fromInt(role: Int) = map[role]
}
}
We are using JOOQ and postgres. We use the JOOQ default EnumConverter to convert db role integer values to objects.
ForcedType()
.withUserType("com.company.enums.Role")
.withEnumConverter(true)
.withIncludeExpression("role"),
However, we noticed a problem: the converter maps the database value to the ordinal of the enum instead of to its value. For example, when the role column in the database contains 1, the translated enum is MONITOR, because MONITOR has an ordinal of 1.
How can we store the values of the enum in the database using jOOQ?
Thank you!

You can of course implement a custom converter from scratch, as you suggested in your own answer. But do note that starting from jOOQ 3.16 and https://github.com/jOOQ/jOOQ/issues/12423, you can simplify that implementation by extending the org.jooq.impl.EnumConverter like this:
class RoleConverter : EnumConverter<Int, Role>(
Int::class.java,
Role::class.java,
Role::value
)
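To make the difference between ordinal and value concrete, here is a plain Java rendering of the Role enum from the question. It is a sketch with no jOOQ dependency; the method names fromValue and toValue are illustrative assumptions, and they correspond to the two directions a value-based converter has to implement.

```java
import java.util.HashMap;
import java.util.Map;

// Java rendering of the Role enum from the question; values differ from ordinals.
enum Role {
    EXECUTOR(1), MONITOR(3), ADMIN(5);

    // Lookup table from business value to constant, filled once at class load.
    private static final Map<Integer, Role> BY_VALUE = new HashMap<>();
    static {
        for (Role r : values()) {
            BY_VALUE.put(r.value, r);
        }
    }

    final int value;

    Role(int value) {
        this.value = value;
    }

    // Database integer -> enum, by business value (not by ordinal).
    static Role fromValue(int value) {
        Role role = BY_VALUE.get(value);
        if (role == null) {
            throw new IllegalArgumentException("Unknown role value: " + value);
        }
        return role;
    }

    // Enum -> database integer.
    int toValue() {
        return value;
    }
}
```

With this mapping, a stored 1 resolves to EXECUTOR, whereas an ordinal-based lookup would resolve it to MONITOR, which is the mismatch described in the question.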

I figured it out! jOOQ supports custom converters (https://www.jooq.org/doc/latest/manual/code-generation/custom-data-types/), which is exactly what I need.

Spring Data / Hibernate save entity with Postgres using Insert on Conflict Update Some fields

I have a domain object in Spring which I am saving using JpaRepository.save method and using Sequence generator from Postgres to generate id automatically.
@SequenceGenerator(initialValue = 1, name = "device_metric_gen", sequenceName = "device_metric_seq")
public class DeviceMetric extends BaseTimeModel {
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "device_metric_gen")
@Column(nullable = false, updatable = false)
private Long id;
///// extra fields
My use case requires an upsert instead of a normal save operation (which, I am aware, will update if the id is present). I want to update an existing row if a combination of three columns (assume a composite unique constraint) is already present, and otherwise create a new row.
This is something similar to this:
INSERT INTO customers (name, email)
VALUES
(
'Microsoft',
'hotline@microsoft.com'
)
ON CONFLICT (name)
DO
UPDATE
SET email = EXCLUDED.email || ';' || customers.email;
One way of achieving the same in Spring-data that I can think of is:
Write a custom save operation in the service layer that
Does a get for the three-column and if a row is present
Set the same id in current object and do a repository.save
If no row present, do a normal repository.save
Problem with the above approach is that every insert now does a select and then save which makes two database calls whereas the same can be achieved by postgres insert on conflict feature with just one db call.
Any pointers on how to implement this in Spring Data?
One way is to write a native query (insert into ... values with all fields listed). The object in question has around 25 fields, so I am looking for a better way to achieve the same.
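For reference, the native-statement route does not necessarily require typing out all 25 fields by hand. The following is a minimal sketch, assuming the column names are known and trusted (they are concatenated into the SQL, so they must never come from user input). The class name UpsertSql and its build method are hypothetical helpers, not part of Spring Data or Hibernate.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that assembles a PostgreSQL INSERT ... ON CONFLICT statement.
class UpsertSql {
    // Builds: INSERT INTO t (cols) VALUES (?, ...) ON CONFLICT (keys) DO UPDATE SET c = EXCLUDED.c, ...
    static String build(String table, List<String> columns, List<String> conflictKeys) {
        String cols = String.join(", ", columns);
        // One positional placeholder per column.
        String params = columns.stream().map(c -> "?").collect(Collectors.joining(", "));
        String keys = String.join(", ", conflictKeys);
        // Every non-key column is overwritten with the incoming (EXCLUDED) value.
        String updates = columns.stream()
                .filter(c -> !conflictKeys.contains(c))
                .map(c -> c + " = EXCLUDED." + c)
                .collect(Collectors.joining(", "));
        if (updates.isEmpty()) {
            throw new IllegalArgumentException("At least one non-key column is required");
        }
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + params + ")"
                + " ON CONFLICT (" + keys + ") DO UPDATE SET " + updates;
    }
}
```

The resulting string would then be executed as a native query with the field values bound in column order. The conflict target must match a unique constraint or unique index on the table.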

As @JBNizet mentioned, you answered your own question by suggesting reading the data first and then updating if found and inserting otherwise. Here's how you could do it using Spring Data and Optional.
Define a findByField1AndField2AndField3 method on your DeviceMetricRepository.
public interface DeviceMetricRepository extends JpaRepository<DeviceMetric, Long> {
Optional<DeviceMetric> findByField1AndField2AndField3(String field1, String field2, String field3);
}
Use the repository in a service method.
@RequiredArgsConstructor
public class DeviceMetricService {
private final DeviceMetricRepository repo;
DeviceMetric save(String email, String phoneNumber) {
DeviceMetric deviceMetric = repo.findByField1AndField2AndField3("field1", "field2", "field3")
.orElse(new DeviceMetric()); // create new object in a way that makes sense for you
deviceMetric.setEmail(email);
deviceMetric.setPhoneNumber(phoneNumber);
return repo.save(deviceMetric);
}
}
A word of advice on observability:
You mentioned that this is a high-throughput use case in your system. Regardless of the approach taken, consider instrumenting timers around this save. This way you can measure the initial performance against any tunings you make in an objective way. Look at this as an experiment and be prepared to pivot to other solutions as needed. If you are always reading these three columns together, ensure they are indexed. With these things in place, you may find that reading to determine update/insert is acceptable.
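A minimal way to put a timer around the save, using only the JDK, could look like the sketch below. The class name Timed and its measure method are hypothetical; in a real service you would more likely report the duration to whatever metrics library you already use.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

// Hypothetical timing helper; not part of Spring or any metrics library.
class Timed {
    // Runs the action, reports the elapsed nanoseconds to the callback, and returns the result.
    static <T> T measure(Supplier<T> action, LongConsumer onElapsedNanos) {
        long start = System.nanoTime();
        try {
            return action.get();
        } finally {
            // Reported even if the action throws, so failed saves are measured too.
            onElapsedNanos.accept(System.nanoTime() - start);
        }
    }
}
```

In the service above this might be used as Timed.measure(() -> repo.save(deviceMetric), nanos -> ...), with the callback recording the duration wherever you collect metrics.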

I would recommend using a named query to fetch a row based on your candidate keys. If a row is present, update it; otherwise create a new row. Both of these operations can be done using the save method.
@NamedQuery(name="getCustomerByNameAndEmail", query="select a from Customers a where a.name = :name and a.email = :email")
You can also declare a @UniqueConstraint on the entity (via @Table(uniqueConstraints = ...)) to make sure that these columns always remain unique when grouped together.
Optional<Customers> customer = customerRepo.getCustomerByNameAndEmail(name, email);
Implement the above method in your repository. All it will do is call the query and pass the name and email as parameters. Make sure to return Optional.empty() if there is no row present.
Customers c;
if (customer.isPresent()) {
c = customer.get();
c.setEmail("newemail@gmail.com");
c.setPhone("9420420420");
customerRepo.save(c);
} else {
c = new Customers(0, "name", "email", "5451515478");
customerRepo.save(c);
}
Pass the ID as 0 and JPA will insert a new row with the ID generated according to the sequence generator.
That said, I do not recommend using a number as an ID. If possible, use a randomly generated UUID for the primary key; it will guarantee uniqueness and avoid any unexpected behaviour that may come with sequence generators.

With spring JPA it's pretty simple to implement this with clean java code.
Using Spring Data JPA's method T getOne(ID id), you're not querying the DB itself but you are using a reference to the DB object (proxy). Therefore when updating/saving the entity you are performing a one time operation.
To be able to modify the object, Spring provides the @Transactional annotation, a method-level annotation declaring that the method starts a transaction and closes it only when the method itself returns.
You'd have to:
Start a jpa transaction
get the Db reference through getOne
modify the DB reference
save it on the database
close the transaction
Not having much visibility of your actual code I'm gonna abstract it as much as possible:
@Transactional
public void saveOrUpdate(DeviceMetric metric) {
DeviceMetric deviceMetric = metricRepository.getOne(metric.getId());
//modify it
deviceMetric.setName("Hello World!");
metricRepository.save(deviceMetric);
}
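The tricky part is not to think of getOne as a SELECT against the database: it returns a lazy reference (proxy) rather than loading the row. Note that with Hibernate the proxy is usually initialized, which does issue a SELECT, the first time a non-identifier method such as a setter is called on it, so check the generated SQL before relying on a single round trip.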

Entity Framework Core 2.1 System.Data.SqlClient.SqlException (0x80131904): Type Flag is not a defined system type

After upgrading to EntityFramework 2.1.11, I am facing the following issue.
System.Data.SqlClient.SqlException (0x80131904): Type Flag is not a defined system type.
I get this error from the internal LINQ-to-SQL translation. Two columns in the database table are of type tinyint, with byte as the corresponding C# type, and querying them in LINQ throws the exception.
The reason is that column == 1 is translated internally as CAST(1 AS Flag) in 2.1, whereas the same query worked in 2.0.
It works if we change == 1 to == Convert.ToByte(1), or assign the value to a byte variable and compare with == variable, but I do not think that is the ideal fix.
This is the piece of code which is throwing error.
var query = await (from pl in _azureContext.Product.AsNoTracking()
where pl.Primary ==1 && pl.Deleted == 0
select new Product
{
ProductId = pl.ProductId,
ProductName = pl.ProductName
}).OrderBy(P => P.ProductName).ToListAsync<Product>();
SQL Internal Translation which throws exception is as follows:
SELECT [pl].[ProductId] , [pl].[ProductName] FROM [Products] AS [pl] WHERE ([pl].[Primary] = CAST(1 AS Flag)) AND ([pl].[Deleted] = CAST(0 AS Flag)) ORDER BY [pl].[ProductName]
The Expected SQL Translation is as follows:
SELECT [pl].[ProductId] , [pl].[ProductName] FROM [Products] AS [pl] WHERE ([pl].[Primary] = 1) AND ([pl].[Deleted] = 0) ORDER BY [pl].[ProductName]
It looks like a bug in Entity Framework Core 2.1. Could anyone please help me with this?
Additional information, based on comments from David:
1) I have not created any custom type for this, and none is missing.
2) The C# data type is byte for pl.Primary and pl.Deleted.
3) In the dbContext I am seeing the following in onModelCreating method.
entity.Property(e => e.Primary).HasColumnType("Flag");
entity.Property(e => e.Deleted).HasColumnType("Flag");
Note: DbContext was generated earlier with .net core 2.0 and no code changes done on that.

The problem is that you have HasColumnType("Flag") in the configuration for your properties. This tells Entity Framework that the type of the column is Flag, obviously not a standard SQL Server data type. The simple solution is to remove that configuration method.
However, those columns are obviously meant to be boolean flags, and you should be using the appropriate data type. This means in C# your type is bool and in SQL Server it is bit. For example, your table would look something like this:
CREATE TABLE Products
(
-- Other columns
[Primary] BIT, -- PRIMARY is a reserved keyword in SQL Server, so the name is bracketed
Deleted BIT
)
and your C# class like this
public class Product
{
// Snip other columns
public bool Primary { get; set; }
public bool Deleted { get; set; }
}

What is the difference between unique_index and unique?

What is the difference between unique_index and unique in GORM?
I am using MySQL 8.0, and I cannot find a description of the difference between unique_index and unique in the manual.
From here, see specifically the Email and MemberNumber fields:
Declaring Models
Models are usually just normal Golang structs, basic Go types, or pointers of them. sql.Scanner and driver.Valuer interfaces are also supported.
Model Example:
type User struct {
gorm.Model
Name string
Age sql.NullInt64
Birthday *time.Time
Email string `gorm:"type:varchar(100);unique_index"`
Role string `gorm:"size:255"` // set field size to 255
MemberNumber *string `gorm:"unique;not null"` // set member number to unique and not null
Num int `gorm:"AUTO_INCREMENT"` // set num to auto incrementable
Address string `gorm:"index:addr"` // create index with name `addr` for address
IgnoreMe int `gorm:"-"` // ignore this field
}

unique is a database constraint that (in this case) prevents multiple records from having the same value for MemberNumber. If such an insert or update is attempted, the operation fails and returns an error.
unique_index creates a database index that also ensures that no two values can be the same. The effect is the same, but it is declared as an index.
In your case: MySQL will use a unique index behind the scenes when using a unique constraint. So when using MySQL, there is no difference when using unique and using unique index.
If you use other database management systems there might be differences.
The differences (if any) will be handled by the database management system internally. For practical purposes you can regard them as the same. The differences will be documented for each database management system.

Mapping enums while using dapper

I have the following problem. I am using Dapper to connect to a database, the field that is a varchar in the database is an enum in my object. There is no problem for Dapper to map the database object to my DTO if the enum has the same name as the string in the database. Unfortunately, the strings in the database are not very user friendly and I was wondering if there is a way to map them or convert (only enums) to use more user friendly versions. For example, database value for a field:
SomeVeIRdLooking_Value
And I would like it to map to:
public enum MyEnum {
MyFormattedValue
}

You can select string values from database and convert it by hand.
public enum MyEnum
{
None,
Success,
Failure
}
var enums = connection.Query<string>("select 'None' union select 'Success' union select 'Failure'")
.Select(x => (MyEnum)Enum.Parse(typeof(MyEnum), x)) // use your own method to parse the enum from the string
.ToList();

This is nearly 8 years later, but in case this helps someone else, you can correct "bad" database values with the query
SELECT *,
CASE DbColumnName
WHEN 'SomeVeIRdLooking_Value'
THEN 'MyFormattedValue'
WHEN 'SomeOtherWierd_Value'
THEN 'MyOtherFormattedValue'
ELSE DbColumnName
END AS DbColumnNameFix
