PostgreSQL custom primary key - Spring

I'm building a project with Hibernate and PostgreSQL as the database. The problem is that I need to store the primary key in a form like 22/2017 or 432/1990.
Let's say the first number is object_id and the second is year_added.
I think what I want is for the two numbers together to form the primary key, so that 22/2017 is different from 22/2016.
The only idea I have is that when a user adds a new object, I take the current year, find the last id used for that year, and increment it.
So the first object added next year should be 1/2018.
So far, only object_id is stored as the primary key in my database.

This solution seems to work fine:
PostgreSQL: Auto-increment based on multi-column unique constraint
Thanks for helping me anyway.
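A minimal sketch of the composite-key idea from the question, written in Python against an in-memory SQLite database rather than PostgreSQL and Hibernate. The table name objects, the name column, and the helper add_object are assumptions made for illustration. The composite primary key means a clash between concurrent writers fails loudly instead of producing a duplicate, but a real multi-user setup would still need locking or a retry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE objects (
           object_id  INTEGER NOT NULL,
           year_added INTEGER NOT NULL,
           name       TEXT,
           PRIMARY KEY (object_id, year_added)  -- 22/2017 differs from 22/2016
       )"""
)

def add_object(conn, year, name):
    """Insert a row whose object_id restarts at 1 for each year_added."""
    with conn:  # commit on success, roll back on error
        (next_id,) = conn.execute(
            "SELECT COALESCE(MAX(object_id), 0) + 1 FROM objects WHERE year_added = ?",
            (year,),
        ).fetchone()
        conn.execute(
            "INSERT INTO objects (object_id, year_added, name) VALUES (?, ?, ?)",
            (next_id, year, name),
        )
    return f"{next_id}/{year}"
```

Calling add_object(conn, 2017, "a") twice and then add_object(conn, 2018, "c") numbers the rows per year, so the 2018 row starts again at 1.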

Related

Randomly generated public unique ids

Currently I'm generating unique ids for rows in my database using int and auto_increment. These ids are public facing, so in the URL you can see something like https://example.com/path/1 or https://example.com/path/2.
After talking with another engineer, I was advised to use randomly generated ids so that they're not guessable.
How can I generate a unique id without looping over the database each time to make sure it's unique? Take Stripe, for example: all of their ids look like price_sdfgsdfg or prod_iisdfgsdfg. What's the best way to generate unique ids for rows like these?
Without knowing which language or database you're using, the simplest way is to use UUIDs.
To avoid downloading every existing key and looping over them, simply try to INSERT INTO whichever table you are using.
If the insert fails (e.g. with an exception), the id is taken, so try again with a new one.
If the insert succeeds, break out of the loop.
This only works when the column is declared NOT NULL and UNIQUE.
That's how I "know" without looping over the whole database of ids or downloading them into local memory.
Using auto_increment won't lead to duplicates either, because the database hands out the next available number atomically, even under concurrent inserts, which is the beauty of databases.
SQL example (MySQL and MariaDB syntax):
CREATE TABLE `test` ( `unique_id` INT NOT NULL, UNIQUE (`unique_id`) ) ENGINE = InnoDB;
Insert a unique_id:
INSERT INTO `test` (`unique_id`) VALUES (999999999);
Great, we have a row. Inserting the same value again fails:
INSERT INTO `test` (`unique_id`) VALUES (999999999);
Error:
#1062 - Duplicate entry '999999999' for key 'unique_id'
When that happens, retry with a different value.
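The insert-and-retry loop described above could look roughly like this in Python. This is a sketch that uses the standard-library sqlite3 module as a stand-in for MySQL; the function name insert_with_random_id and the max_attempts parameter are assumptions, and with a real MySQL driver the exception class would differ.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (unique_id INTEGER NOT NULL UNIQUE)")

def insert_with_random_id(conn, max_attempts=5):
    """Try random ids until the UNIQUE constraint accepts one."""
    for _ in range(max_attempts):
        candidate = secrets.randbelow(2**63)  # fits a signed 64-bit column
        try:
            with conn:  # commit on success, roll back on failure
                conn.execute(
                    "INSERT INTO test (unique_id) VALUES (?)", (candidate,)
                )
            return candidate  # insert succeeded: break out of the loop
        except sqlite3.IntegrityError:
            continue  # id already taken: retry with a new one
    raise RuntimeError("could not find a free id")
```

With a 63-bit id space a collision is rare, so the loop almost always finishes on the first attempt; the retry exists for correctness rather than for the common path.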
If these are public URLs and the content is sensitive, then I definitely do not recommend ints, as someone can trivially guess 1 through 99999999, etc.
In any language, have a look at /dev/urandom.
In shell/bash scripts, I might use uuidgen:
9dccd646-043e-4984-9126-3060b4ced180
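In Python, I'll use pandas to index each row by a hash of its contents:
import pandas as pd
# df is an existing DataFrame; the hash is deterministic, not random
df.set_index(pd.util.hash_pandas_object(df, encoding='utf8'), drop=True, inplace=True)
df.index.rename('hash', inplace=True)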
Lastly, UUIDs aren't perfect: they use only the hexadecimal characters 0-9 and a-f, but they are easy to generate, and most languages ship a generator.
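As a rough Python illustration of the two options, here is a UUID generator next to a Stripe-style prefixed id. The helper names new_uuid and new_public_id are hypothetical, and the prefix format only mimics the ids quoted in the question; it is not Stripe's actual scheme. Either way, the column should still carry a UNIQUE constraint.

```python
import secrets
import uuid

def new_uuid():
    """Random (version 4) UUID as a 36-character string."""
    return str(uuid.uuid4())

def new_public_id(prefix, nbytes=16):
    """Prefixed, URL-safe random id such as 'price_<random text>'."""
    # token_urlsafe draws from the OS random source (e.g. /dev/urandom)
    return f"{prefix}_{secrets.token_urlsafe(nbytes)}"
```

For example, new_public_id("price") returns a string starting with "price_" followed by random URL-safe characters.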
In JavaScript you may want to check out some secure open source apps, for example Jitsi: https://github.com/jitsi/js-utils/blob/master/random/roomNameGenerator.js where they concatenate words:
E.g. Satisfied-Global-Architectural-Bitter

Alternative to ORA_HASH?

We are working with a table in a 3rd party database that does not have a primary key but does have a unique index.
I have therefore been looking at using the ORA_HASH function to produce a de facto unique Id by passing in the values of the columns in the unique index.
Unfortunately, I can already see that we have a few collisions, which means that we can't derive a unique id using this method.
Is there an alternative to ORA_HASH that would provide a unique id for a unique input?
I suppose I could generate an Id using DBMS_CRYPTO.Hash but I'd ideally like to get a numeric value.
Edit
The added complication is that I then need to store these records in another (SQL Server) database and then compare the records from the original and the replica tables. So rank doesn't help me here since records can be added or deleted in the original table.
DBMS_CRYPTO.HASH could be used to generate a high-bit hash (high enough to give you a very low, but not zero, chance of collisions), but it returns 'RAW' not 'NUMBER'.
To guarantee no collisions ever, you need a one-to-one hash function. As far as I know, Oracle does not provide one.
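To illustrate the "high-bit hash as a number" idea outside Oracle, here is a sketch in Python: the raw digest bytes are simply read as an integer. The function name numeric_key, the JSON encoding of the column values, and the nbytes parameter are assumptions for this example. Collisions remain possible, only very unlikely, and truncating the digest to fit a smaller numeric column raises that probability.

```python
import hashlib
import json

def numeric_key(*columns, nbytes=16):
    """Derive a non-negative integer from the unique-index column values."""
    # JSON keeps column boundaries unambiguous: ("ab", "c") != ("a", "bc")
    payload = json.dumps(columns, default=str).encode("utf-8")
    digest = hashlib.sha256(payload).digest()  # raw bytes, like Oracle's RAW
    return int.from_bytes(digest[:nbytes], "big")  # keep the first nbytes
```

The same inputs always give the same number, which is what makes it possible to compare the original and the replica.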
A practical approach would be to create a new table to map unique keys to a newly generated primary key. E.g., unique value ("ABC",123, 888) maps to 838491 (where you generated 838491 using a sequence).
You'd have to update the mapping table periodically, to account for inserted rows, and that would be a pain, but it would let you generate your own PKs and keep track of them without a lot of complication.
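The mapping-table approach can be sketched as follows, again in Python with SQLite standing in for the real databases. The table key_map, its column names, and the helper surrogate_id are hypothetical; AUTOINCREMENT plays the role of the sequence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE key_map (
           id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated surrogate key
           col_a TEXT    NOT NULL,
           col_b INTEGER NOT NULL,
           col_c INTEGER NOT NULL,
           UNIQUE (col_a, col_b, col_c)              -- the original unique index
       )"""
)

def surrogate_id(conn, col_a, col_b, col_c):
    """Return the generated id for a unique key, creating it if missing."""
    key = (col_a, col_b, col_c)
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO key_map (col_a, col_b, col_c) VALUES (?, ?, ?)",
            key,
        )
    (row_id,) = conn.execute(
        "SELECT id FROM key_map WHERE col_a = ? AND col_b = ? AND col_c = ?",
        key,
    ).fetchone()
    return row_id
```

Looking up ("ABC", 123, 888) twice returns the same id both times, while a different key gets a new one.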
Have you tried:
DBMS_UTILITY.GET_HASH_VALUE (
   name      VARCHAR2,
   base      NUMBER,
   hash_size NUMBER)
  RETURN NUMBER;

How to insert a unique random integer in SQLite?

Let's say I'm saving users in a database and I want each user to have a unique random ID (this isn't actually the case, just a simpler example). When I INSERT the user, is there a way to insert a unique random ID?
I know I can easily just do an auto-increment column so that each row would have a unique integer, but I need a random number for this system specifically.
A sample of my standard insert query for a new user:
INSERT INTO Users (RandomID, Bleh, Bleh2) VALUES (<random value here>, 1, 2);
I was wondering the same and found an interesting answer in this article: use a pre-populated table.
Beforehand, you create a table of n rows with unique random numbers, using any method (for example, by trying to insert random numbers into a UNIQUE field and silently ignoring the failure when a number isn't unique, until you have the number of rows needed).
Once that table is created, you are 100% certain that the numbers in it are unique, and you can simply use them sequentially and discard them after use (or not).
When using this method, you need some kind of alert for when you approach the end of the pre-populated table, so that you can generate a new set of values.
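A sketch of the pre-populated table in Python with sqlite3. The table id_pool, the helper names, and the low_water_mark threshold are assumptions. Used values are flagged rather than deleted here, so a later refill cannot hand out a number that was already given away.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE id_pool (value INTEGER NOT NULL UNIQUE,"
    " used INTEGER NOT NULL DEFAULT 0)"
)

def unused_count(conn):
    return conn.execute("SELECT COUNT(*) FROM id_pool WHERE used = 0").fetchone()[0]

def fill_pool(conn, target_size):
    """Add random numbers until target_size unused values are available."""
    while unused_count(conn) < target_size:
        with conn:
            # OR IGNORE silently skips a number that is already in the pool
            conn.execute(
                "INSERT OR IGNORE INTO id_pool (value) VALUES (?)",
                (secrets.randbelow(10**9),),
            )

def take_id(conn, low_water_mark=10):
    """Hand out the next unused value, in insertion order."""
    with conn:
        row = conn.execute(
            "SELECT value FROM id_pool WHERE used = 0 ORDER BY rowid LIMIT 1"
        ).fetchone()
        if row is None:
            raise RuntimeError("id pool is exhausted")
        conn.execute("UPDATE id_pool SET used = 1 WHERE value = ?", (row[0],))
    if unused_count(conn) < low_water_mark:
        print("warning: id pool is running low, call fill_pool()")  # the alert
    return row[0]
```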
SQLite has random(), which returns a random integer, but it is not guaranteed to be unique every time. You can combine it with a timestamp or the rowid to get a unique random number.
Based on the documentation for the C API function sqlite3_randomness(), it might be possible to make a table's primary key random by creating a dummy row and forcing its primary key value to the largest possible ROWID. Any new rows after that should get random keys.
That said, I don't know whether that behavior is contractual or just a current implementation detail. Use at your own risk.
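The dummy-row trick can be tried out from Python as below. This is an experiment under the same caveat as above, not a recommendation: the users table and the add_user helper are made up for the example, the table must not use AUTOINCREMENT, and the generated ids should not be treated as cryptographically unpredictable.

```python
import sqlite3

MAX_ROWID = 9223372036854775807  # largest signed 64-bit integer

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
# Dummy row pinned at the largest possible ROWID
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (MAX_ROWID, "dummy"))

def add_user(conn, name):
    """Insert without an explicit id and return the ROWID SQLite picked."""
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    return cur.lastrowid
```

Because the largest ROWID is already taken, SQLite cannot use "max + 1" and falls back to picking an unused positive ROWID at random for each new row.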

HBase row key design for reads and updates

I'm trying to understand the best way to design the key for my HBase table.
My use case:
Structure right now:
PersonID | BatchDate | PersonJSON
When something about a person is modified, a new PersonJSON and a new BatchDate are inserted into HBase, updating the old record. Every 4 hours, a scan picks up all the people who were modified and pushes them to Hadoop for further processing.
If my key is just PersonID, it's great for updating the data. But performance suffers because I have to add a filter on the BatchDate column to scan all the rows greater than a batch date.
If my key is a composite key like BatchDate|PersonID, I could use startrow and endrow on the row key and get all the rows that have been modified. But then I would have a lot of duplicates, since the key is no longer unique per person, and I could no longer update a person.
Is a Bloom filter on row+col (PersonID+BatchDate) an option?
Any help is appreciated.
Thanks,
Abhishek
In addition to the table with PersonID as the rowkey, it sounds like you need a dual-write secondary index, with BatchDate as the rowkey.
Another option would be Apache Phoenix, which provides support for secondary indexes.
I usually do it in two steps:
Create table one whose key is the combination of BatchDate+PersonID; the value can be empty.
Create table two as you normally would: the key is PersonID and the value is the whole data.
For a date range query, query table one first to get the PersonIDs, and then use the HBase batch get API to fetch the data in batches. It should be very fast.
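The two-table layout can be modelled in plain Python to show how the reads and writes fit together. This is an in-memory simulation only, with no HBase client involved: a sorted list stands in for the row-key ordering of table one, a dict stands in for table two, and the function names and the ISO date strings are assumptions.

```python
import bisect

people = {}      # "table two": PersonID -> PersonJSON (latest version)
index_keys = []  # "table one": sorted row keys of the form "BatchDate|PersonID"

def upsert_person(person_id, batch_date, person_json):
    """Dual write: update the data table and append an index entry."""
    people[person_id] = person_json
    bisect.insort(index_keys, f"{batch_date}|{person_id}")

def modified_between(start_date, end_date):
    """Range scan on the index (start inclusive, end exclusive), then batch get."""
    lo = bisect.bisect_left(index_keys, start_date)  # like startrow
    hi = bisect.bisect_left(index_keys, end_date)    # like endrow
    ids = {key.split("|", 1)[1] for key in index_keys[lo:hi]}
    return {pid: people[pid] for pid in sorted(ids)}
```

Updating a person only touches one row in the data table, while the index may hold several entries for the same person; the set in modified_between removes those duplicates before the batch get.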

Is there any way to generate an ID without a sequence?

The current application uses JPA to auto-generate the table/entity id. Now a requirement calls for manually inserting data into the database using SQL queries.
So the questions are:
1. Is it worth creating a sequence in this schema just for this small requirement?
2. If the answer to 1 is no, what could be a plan B?
1. Yes. A sequence is trivial - why would you not do it?
2. N/A
A few ways:
Use a UUID. UUIDs are large, (pseudo-)randomly generated identifiers, usually written as hexadecimal strings, that are unique for all practical purposes.
Does the data have something unique, like a timestamp or an IP address? If so, use that.
A combination of the current timestamp + some less unique value in the data.
A combination of the current timestamp + some integer i that you keep incrementing.
There are others (including generating a checksum, custom random numbers instead of UUIDs, etc.), but those have the possibility of overlaps, so I'm not mentioning them.
Edit: Minor clarifications
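The "timestamp + incrementing integer" option from the list above could be sketched like this in Python. The function name timestamp_id and the id format are assumptions; the counter is local to one process, so ids are only unique as long as a single process generates them.

```python
import itertools
import time

_counter = itertools.count()  # the integer i that keeps incrementing

def timestamp_id():
    """Millisecond timestamp followed by a process-local counter."""
    millis = time.time_ns() // 1_000_000
    return f"{millis}-{next(_counter)}"
```

Even if two calls land in the same millisecond, the counter part differs, so the combined strings do not collide within the process.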
Are you just doing a single data load into an empty table, and there are no other users concurrently inserting data? If so, you can just use ROWNUM to generate the IDs starting from 1, e.g.
INSERT INTO mytable
SELECT ROWNUM AS ID
,etc AS etc
FROM ...

Resources