Is there a way to get shorter unique IDs for entries in RethinkDB?
We have a URL scheme where <url>/:ID URLs get very long with the default ID
Their docs don't seem to talk about this...
https://rethinkdb.com/api/javascript/uuid/
According to the linked documentation, we should read Wikipedia's article for UUID. Which starts said article with: A universally unique identifier (UUID) is a 128-bit number [...]. So I guess that no: UUID is a fixed-length thing. (Yeah, I learnt something too here.)
RethinkDB will by default use uuid as id . If you want a shorter id, handle it from your application side. Before inserting set id value. Be careful with your id generation logic, if it conflicts with existing ids in your table, it will overwrite them.
Related
In redis I'm planning to store key as a unique string and value will be a list.
I have a use case where I need to do 2 things.
First, I need to get all the values associated with a key by providing the key as input.
Second, I want to get all the keys associated with a value by providing one of the value in the values list.
Second part is where I need the advice, how we can achive this ?
I cannot get all the keys or key value pair and loop through because I will have millions of entries in Redis.
As mentioned in the comment above the retrieving of all keys with associated value at will probably sometimes create a performance issue as this will be a run through large entries.As also suggested in the official documentation about retrieving data from the memory caches you can try and use the following Redis command to get the value and see if that is what can solve your purpose.
GET
MGET
I am implementing a feed for a social network in that newly uploaded post should be served first and so on.I use hashes of posts as keys and posts as values.I need the posts in "newest first order".How to do it?
My idea is
Store the post and timestamp with hash as a key
Get all keys and timestamps
Sort the timestamps in descending order
Then use the respective keys to get th latest images
Question1:But this approach is not good.How to do it ?
EDIT:Question2:Please tell me what algorithm you use to serve feed.If feed is common for all users based on "newest-first",how to implement it?
This is my first time in backend.Please if the question is dumb.
Thanks.
Here are three options for you:
Use a sorted set, using the timestamp as score, and the post-hash as value. The post-hash is also the key in a hash where the actual posts are stored. Commands involved: ZADD, HSET, ZREVRANGEBYSCORE, HGET.
Use a sorted set, using the timestamp as score, and the post with metadata as value. Make sure "post with metadata" is unique, you can include the timestamp and user to achieve this. This will have better performance, but makes it a bit harder if you have to find a specific post. Commands involved: ZADD, ZREVRANGEBYSCORE, ZRANGEBYSCORE.
Use Redis Streams. If you want a uniform insert order independent of client time, Redis can set the timestamp for you. However, stream entries cannot be modified, so either users cannot edit posts, or whenever they edit the post is brought up as new. Commands involved: XADD, XREVRANGE, XDEL.
See:
Redis Commands
Introduction to Redis Streams
I'm writing a script which supposed to merge some data from sql-based db. Each row has a long-integer as a primary key (incremental). I was thinking about hashing these ids so that they'll somehow 'look' like the other ids already in my RethinkDB table. What I'm trying to achive here is to avoid dups in case of an attempt to merge the same data again, but keeping the original integers as ids along with the generated ids of the data saved directly to RethinkDB's table feels weird.
Can I do that?
How does RethinkDB generate auto ids anyways?
And am I approaching this correctly..?
RethinkDB uses a string-encoding of 128 bit UUIDs (basically hashed integers).
The string format looks like this: "HHHHHHHH-HHHH-HHHH-HHHH-HHHHHHHHHHHH" where every 'H' is a hexadecimal digit of the 128 bit integer. The characters 0-9 and a-f (lower case) are used.
If you want to generate such UUIDs from an existing integer, I recommend hashing the integer first. This will give you an even distribution over the whole key space (this makes sharding easier and avoids hotspots).
As a second step you have to format the hash value in a string of the format shown above. If you don't have enough digits, it's fine to leave some of the last 'H' as constant 0.
If you really want to go into the details of UUID generation, here are two links for further reading:
RFC 4122 "A Universally Unique IDentifier (UUID) URN Namespace" https://www.rfc-editor.org/rfc/rfc4122
RethinkDB's implementation of UUID generation and formatting https://github.com/rethinkdb/rethinkdb/blob/next/src/containers/uuid.cc
Let's assume I have a keyspace with a column family that stores user objects and the key of these objects is the username.
How can I use Hector to get a list of users sorted by username?
I tried to use a RangeSlicesQuery, paging works fine with this query, but the results are not sorted in any way.
I'm an absolute Cassandra beginner, can anyone point me to a simple example that shows how to sort a column family by key? Please ask if you need more details on my efforts.
Edit:
The result was not sorted because I used the default RandomPartitioner instead of the OrderPreseveringPartitioner in cassandra.yaml.
Probably it's better not to rely on the sorting by key but to use a secondary index.
Quoting Cassandra - The Definitive Guide
Column names are stored in sorted order according to the value of compare_with. Rows,
on the other hand, are stored in an order defined by the partitioner (for example,
with RandomPartitioner, they are in random order, etc.)
I guess you are using RandomPartitioner which
... return data in an essentially random order.
You should probably use OrderPreservingPartitioner (OPP) where
Rows are therefore stored
by key order, aligning the physical structure of the data with your sort order.
Be aware of inefficiency of OPP.
(edit on Mar 07, 2014)
Important:
This answer is very old now.
It is a system-wide setting. You can set in cassandra.yaml. See this doc. Again, OPP is highly discouraged. This document is for version 1.1, and you can see it is deprecated. It is likely that it is removed from latest version. If you do want to use OPP, you may want to revisit the architecture the architecture.
Or create a row called "meta:userNames" in same column family and put all user names as a look up hash. Something like that.
Users {
key: "meta:userNames" {david:david, paolo:paolo, victor:victor},
key: "paolo" {password:"*****", locale:"it_it"},
key: "david" {password:"*****", locale:"en_us"},
key: "victor" {password:"*****", locale:"en_uk"}
}
First query the meta:userNames columns (that are sorted) and use them to get the user rows. Don't try to get everything via single db query as in SQL driven databases. Use Cassandra as huge Hash Map which provides rapid random access to its data.
just wondering does anyone in here have good idea about generating nice order id?
for example
832-28-394, which show a quite nice and formal order id (rather than just use an database auto increment number like ID=35).
the order id need to look random so it can not be able to guess by user.
e.g. 832-28-395 (shoudnt exist) so there will always some gap between each id.
just like the account number for your bank card?
Cheers
If you are using .NET you can use System.Guid.NewGuid()
The auto-incremented IDs are stored as integer or long integer data. One of the reasons for this is that this format is compact, saving space, including in indexes which are typically inclusive a primary key for use with joins and such.
If you wish to create a nice looking id following a particular format syntax, you'll need to manage the generation of the IDs yourself, and store these in a "regular" column not one that is auto-incremented.
I suggest you keep using "ugly looking" ids, be they auto-incremented or not, and format these value for display purposes only, using whatever format you may desire, including some format that use the values from several columns. Depending on the database system you are using you may be able to declare custom functions, at the level of the database itself, allowing you to obtain the readily formatted value with a simple query (as in
SELECT MakeAFancyId(id_field), some_other_columns, ..
FROM ...
If you cannot use some built-in or custom function at the level of SQL, you'll need to format the value supplied by SQL (an integer of sorts), into the desired format, on the client-side, using the language associated with your UI / presentation framework.
I'd create something where the first eight numbers are loosely in a pattern, and a third quartet looks random but is really a sort of checksum.
So, for example, the first eight digits increment based on the current seconds on the server clock.
The last four could be something like the sum of the first four, plus twice the sum of the second four, which will give either a two or three digit number. The final digit is calculated so that the sum of all 11 digits plus this last one is a multiple of 9.
This is slightly akin to how barcode numbers are verified. You can format the resulting 12 digits any way you want, although it is the first eight that are unique here.
Hash the clock time.
Mod by 100,000 or something.
Format with hyphens.
Check for duplicates. If found, restart.
I would suggest using a autoincrement ID in the database to link tables and as a primary key. Integer fields are always faster than string fields for indexing and well as searching.
You can have the order number field (which is for display) as a different field in the order table which will be used to display. And whenever you are planning to send a URl to a user or display a URL to the user which has order ID (which is a autoincremented number) you can encrypt it with some algorithm.
Both your purpose will be solved.
But I suggest not to make string as primary key. Though you can have a unique constraint on the order number which is going to be displayed.
Hope this helps.
Kalpak Luniya
I would suggest internally you keep the database derived primary key, which is auto-incremented.
For the visible order number, you will probably need a longer length than 8 characters, if you are using this for security.
If you are using Ruby, look at SecureRandom, which will generate sufficiently random strings to accomodate this. For example, you can use SecureRandom.hex(16), and it will give you a 16 digit hex number. I believe it can also give you base 64 strings, which will look weirder but be shorter.
Make sure this is not your only security on an order, as it may not be that hard to find a valid order number within your 8 digit code, especially if some are some sort of checksum.
For security reasons i suggest that you should use Criptographicaly secure random number generator. Think about idea on icreasing User Id length -if you have 1 million users then the probability to gues User ID in first try is 0.01 and 67 tries to increase probability over 0.5