Can I somehow tag data in Redis?

I have an object Company and multiple methods that can be used to get this object. Ex. GetById, GetByEmail, GetByName.
What I'd like is to cache those method calls with a possibility to invalidate all cache entries related to one object at once.
For example, a company is cached. There are 3 entries in cache with following keys:
Company:GetById:123
Company:GetByEmail:foo@bar.com
Company:GetByName:Acme
All three keys are related to one company.
Now let's assume the company has changed; I would then like to invalidate all keys related to it. I didn't find any built-in solution for that purpose.
Tagging cache entries with some common id (companyId for example) and then removing all entries by it would be great, but this feature doesn't seem to exist.

So to answer your question directly: you'd probably want to maintain all the keys related to a given company in a list, scan through that list, and delete each associated key with a DEL (or, better, a non-blocking UNLINK) command.
So something like:
LPUSH companies-keys:Acme Company:GetById:123 Company:GetByEmail:foo@bar.com Company:GetByName:Acme
Then
RPOP companies-keys:Acme
and for each entry you get out of the list:
UNLINK keyname
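A minimal sketch of that pattern in Python with redis-py (the key layout mirrors the example above):

import redis

r = redis.Redis(decode_responses=True)

def cache_company_entry(tag: str, key: str, value: str) -> None:
    # Store the cache entry and remember its key in the company's tag list.
    r.set(key, value)
    r.lpush(f"companies-keys:{tag}", key)

def invalidate_company(tag: str) -> None:
    # Pop each recorded key and UNLINK it (a non-blocking delete).
    while (key := r.rpop(f"companies-keys:{tag}")) is not None:
        r.unlink(key)

With that, cache_company_entry("Acme", "Company:GetById:123", serialized_company) records each key as it is cached, and invalidate_company("Acme") later drops all three lookups in one sweep.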
To answer it less directly: you may want to consider using a Hash rather than separate String keys. That way you can modify a single field in the hash rather than invalidating all the keys associated with the company.
So you could create it with:
HSET companies:123 id 123 email foo@bar.com name Acme
Then you could update a particular field in the company record (HSET also handles updates; the older HMSET is deprecated in its favor):
HSET companies:123 email bar@foo.com
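A sketch of the same idea from redis-py; note that one HSET updates a single field, with no invalidation of sibling keys:

import redis

r = redis.Redis(decode_responses=True)

# One hash per company instead of one String key per lookup method.
r.hset("companies:123", mapping={"id": "123", "email": "foo@bar.com", "name": "Acme"})

# The email changes: rewrite just that field.
r.hset("companies:123", "email", "bar@foo.com")

company = r.hgetall("companies:123")  # {'id': '123', 'email': 'bar@foo.com', 'name': 'Acme'}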
Since being able to look up a given record by different fields sounds really important to your use case, you may also want to consider adding RediSearch and indexing the fields you want to search on. For the set of fields listed above, an index like:
FT.CREATE companies-idx ON HASH PREFIX 1 companies: SCHEMA id TAG email TEXT name TEXT
might be appropriate. Then you could look up a company with a given email like:
FT.SEARCH companies-idx "@email:foo"
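If you're calling this from Python, redis-py (4.x and later) bundles the RediSearch commands; a sketch:

import redis
from redis.commands.search.query import Query

r = redis.Redis(decode_responses=True)

# Equivalent of: FT.SEARCH companies-idx "@email:foo"
results = r.ft("companies-idx").search(Query("@email:foo"))
for doc in results.docs:
    print(doc.id, doc.name)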

Related

Map multiple values to a unique column in Elasticsearch

I want to work with Elasticsearch to process some WhatsApp chats, so I am initially planning the data load.
The problem is that the data exported from WhatsApp doesn't contain a real unique id per user; it only contains the name of the user taken from the contact directory of the device where the chat is exported (i.e. a user can change their number, or have two numbers in the same group).
Because of that, I need to create a custom explicit mapping table between the user names and a self-generated unique id, which gets populated in an additional column.
Then, my question is: "How can I implement such kind of explicit mapping in Elasticsearch to generate an additional unique column?". Alternatively, a valid answer could be a totally different approach to the problem.
PS. As I write, I think the solution could be in the ingestion process, like in a python script, but I still want to post the question to understand if this is something that Elasticsearch can do by itself.
Yes, do it during the indexing process.
If you store the data that maps the name to the id in a separate index, you can do this with an enrich processor when you index the data, adding whichever value you want to the document via an ingest pipeline.
Also: Elasticsearch doesn't have columns, only fields.
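To make that concrete, here is a minimal sketch with the 8.x Python client; all index, field, policy, and pipeline names below are made up for illustration:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# 1. Lookup index that maps user names to self-generated ids.
es.index(index="user-ids", document={"user_name": "Alice", "user_id": "u-001"}, refresh=True)

# 2. Enrich policy: match on user_name, copy user_id into the document.
es.enrich.put_policy(
    name="user-id-policy",
    match={"indices": "user-ids", "match_field": "user_name", "enrich_fields": ["user_id"]},
)
es.enrich.execute_policy(name="user-id-policy")

# 3. Ingest pipeline that applies the enrich processor.
es.ingest.put_pipeline(
    id="add-user-id",
    processors=[{
        "enrich": {"policy_name": "user-id-policy", "field": "user_name", "target_field": "user"}
    }],
)

# 4. Index chat messages through the pipeline; each gains user.user_id.
es.index(index="chats", pipeline="add-user-id",
         document={"user_name": "Alice", "text": "hello"})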

Laravel Lighthouse: how to delete records that match certain conditions (rather than deleting via primary key)

In Laravel Lighthouse GraphQL, I'd love to be able to delete records that match certain conditions rather than passing just an individual ID.
I get this error:
The @delete directive requires the field deletePostTag to only contain a single argument.
This functionality seems currently unsupported, but if I'm wrong and this is actually supported, please let me know, because this would be the most straightforward approach.
So my second approach was to first run a @find query to retrieve the ID of the record I want to delete (based on certain fields equaling certain values).
But https://lighthouse-php.com/4.16/api-reference/directives.html#find shows:
type Query {
  userById(id: ID! @eq): User @find
}
and it does not show how I could provide, instead of the primary key ID, two arguments: a foreign key ID and a string.
How can I most simply accomplish my goal of deleting records that match certain conditions (rather than deleting via primary key)?
I'm not sure about the @delete functionality regarding multiple arguments, but from what you've posted that appears to be unsupported at the moment. For your query, you should instead use something like @all in conjunction with @where, which allows you to filter the collection by as many arguments as you'd like. If your argument list grows beyond 3 or so, take a look at Complex Where Conditions; they have worked very well for my team so far and allow a lot of filtering flexibility.
Also take a look at the directive's docs stating:
You can also delete multiple models at once. Define a field that takes a list of IDs and returns a collection of the deleted models.
So if you return multiple models you'd like to delete from your query, you may use this approach to delete them all at once.

DynamoDB Throughput vs Search time

I've just figured out a big mistake I made while creating my DynamoDB structure.
I created 11 tables, where one of them is the table most frequently referred to and the others are complementary tables.
For example, I have a table called "Names" where I hold names (together with other info), and another table called "NamesMappings" holding all the names added to the "Names" table. Each time a user wants to add a name to the "Names" table, he first tries to put the name into "NamesMappings", and only if that succeeds (meaning the name doesn't already exist) does he add the name to the "Names" table. This helps because the name is not unique and is not the primary key in the "Names" table; with this technique I don't have to search the "Names" table to see whether the name exists. Instead I try to add it to "NamesMappings", and only if that succeeds do I know the name is unique.
First of all, I would like to ask whether this is a common approach or there is a better one.
Next, I realized that with this design I quickly reached 11 tables, each with a default provisioned capacity of 5 reads and 5 writes, which adds up to 55 provisioned reads and writes under the free tier. That explained the charges I get each month: as the number of tables grows and I leave the provisioned capacity at its default (5 read and 5 write units each), the total provisioned capacity keeps growing.
So, what should my conclusion be? Should I try to reduce the number of tables, even if it takes more effort to perform scans and queries inside a table? Or should I keep the tables split as they are, but reduce the capacity of the mapping tables that are used only to indicate whether an item exists in another table?
If I understand your problem correctly, you're missing the whole concept of NoSQL databases.
Your Names table should have a Hash key (which is similar to a Primary key) holding a uniquely generated identifier (a UUID is a great candidate). This automatically makes the table queryable by this unique identifier. You said, however, that you don't know the ID but only the Name. This leads me to think you could create a Global Secondary Index (GSI) on the Name attribute inside the Names table so you can also query by Name. Up to this point, your table structure should look like this:
id | name
Both of them are independently queryable, which gives you a lot of flexibility already.
Now, let's say you want to add the NameMapping attribute (which I don't know the shape of): you can simply add it under the Names table, getting rid of the NamesMappings table and greatly reducing the number of WCUs and RCUs across your account. Your table structure should now look like this:
id | name | mappings
where mappings is, let's say, a JSON object.
Since you can only query on top-level attributes in DynamoDB, you can now run a query against the name attribute, which has a GSI configured. If the query returns nothing, the name is unique. And if you still need some data inside the mappings object, you can query by name and, in your code, apply a map/filter/reduce operation over the mappings attribute to decide what to do next.
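As a minimal boto3 sketch of that uniqueness check (the GSI name "name-index" below is an assumption):

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
names = dynamodb.Table("Names")

def name_exists(name: str) -> bool:
    # Query the GSI on the name attribute instead of scanning the whole table.
    resp = names.query(
        IndexName="name-index",  # hypothetical GSI on the name attribute
        KeyConditionExpression=Key("name").eq(name),
        Limit=1,
    )
    return resp["Count"] > 0

Note that a check-then-write like this is not atomic; two writers could still race, so treat it as an optimization rather than a strict uniqueness guarantee.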
Remember that duplication is just fine in a NoSQL world. This may look scary if you come from a purely SQL background, but in a NoSQL database data should be stored in such a way that you can fetch all the needed information in one go, avoiding "joins" (joins are still possible, but since there are no strong relationships between entities, you have to perform them manually at the code level). For some real context, imagine an Orders table that keeps track of the ordered Products and the Store each Order belongs to: you'd save the Product and Store objects themselves (not just their IDs, as you would in SQL) inside the Order object. If you later query for a given OrderId, you won't need extra calls (aka "joins") to the Product/Store tables, since everything is already stored inside the Order object.

How do I store static data in Laravel?

In our application we will give titles to users based on their points. So, if a user has 10-99 points, that user might get the "Novice" title, but a user with 100-199 points might get the "Regular User" title. I plan on eager loading a user's points using an attribute and relationship, and once I have those points I will use an attribute method to assign the title.
But how do I get the list of possible titles?
I could make a model, a migration, and a seed file, but I feel like these titles won't change much and certainly would never need to be updated in an API call. I could also hardcode an array of points and titles and do a quick lookup to see which title belongs to a user, but then I need to somehow deliver those titles to the user in an Attribute method. Or I could store them in a repository or the cache.
Can I access a repository from within a model? Is it better to store this sort of data in a DB anyways, regardless of how often it's updated or queried?
You could store the titles in your .env file and then use some logic in your PHP to select the correct entry.
LEVEL_1_TITLE=Novice
LEVEL_2_TITLE=Regular
...
// 10-99 points maps to the level 1 title
if ($user->points <= 99) {
    // note: env() returns null once the config is cached, so in production
    // you'd typically expose these through a config file and use config()
    $title = env('LEVEL_1_TITLE');
}
...
Or do the same thing from an array in a class that you create and just select the correct array entry based on the points.

Parse.com post-comment relationship

I would like to build an application like Facebook (it actually has nothing to do with Facebook, but for the nature of the question we can say so).
I currently have a table named Post and another named Comment, and of course I want to represent the one-to-many relationship between them (I read the documentation here, but it wasn't really helpful to me).
In Comment I created a column with a pointer to the Post class, holding the parent Post.
In Post I then created a column with an Array where the ids of the related comments will be stored.
(Each post will have a fairly small number of comments, between 10 and 100.)
Is the technique used here the best one? Are there more efficient methods?
If your array is only storing the objectIDs for the comments then it's probably more idiomatic to use a Relation as the column type rather than an Array.
A Relation is more efficient in that the IDs aren't returned when you retrieve your Post object, so your Post objects will transfer faster. It shares the main disadvantage of storing the object IDs in an Array: you'll still have to run a query to get the Comment objects. The only other downside I can see is the comment count: with an Array you can calculate it from the size of the array, whereas with a Relation you'll have to run a count query (or maintain a separate count field).
With an Array you're also introducing a slight data maintenance/integrity overhead. If your users have the ability to delete comments, you'll need to remove the comment ID from the array as well. That requires either a permissive ACL (to allow a commenter to edit a post they may not have created, which also gives them the ability to edit any value in the post) or a before/after save hook that updates the Post when a Comment is deleted.
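For illustration, linking a Comment into that Relation through the classic Parse REST API could look like the following sketch (the keys and object ids are placeholders):

import requests

headers = {
    "X-Parse-Application-Id": "YOUR_APP_ID",   # placeholder
    "X-Parse-REST-API-Key": "YOUR_REST_KEY",   # placeholder
    "Content-Type": "application/json",
}

# AddRelation attaches an existing Comment to the Post's "comments" Relation.
requests.put(
    "https://api.parse.com/1/classes/Post/POST_OBJECT_ID",
    headers=headers,
    json={
        "comments": {
            "__op": "AddRelation",
            "objects": [
                {"__type": "Pointer", "className": "Comment", "objectId": "COMMENT_ID"}
            ],
        }
    },
)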
