I have a graph with organizations and employees and 2 types of relationships:
(:Employee)-[:Worked]->(:Organization)
(:Employee)-[:Managed]->(:Organization)
Organization has a unique property Id with an index on it. Employee has a property Name without an index. I need to add a new Employee to an organization if it does not exist, or only add a new relationship if it does. But I don't know how to achieve this without an index on Name.
1. Find the organization by Id. OK, this is fast.
match (o:Organization {Id:1})
2. Find an employee that is already linked to the organization, or add a new link and employee if none exists. I can't use the simple
merge (e:Employee {Name: "name"})
merge (e)-[:Worked]->(o)
because I don't have an index on Name (this will be slow), and I need to find only an employee that is connected to the selected organization.
merge (e:Employee { Name: "name" })-[:Worked]->(o) doesn't work: it will add a new employee if an employee with that name already exists but with a different relationship, [:Managed] for example.
You seem to be misunderstanding the requirements for MERGE. An index is not required for MERGE to work.
In your specific case, having an index on :Employee(Name) is not required in order for MERGE (e:Employee {Name: "name"}) to work as expected. However, having such an index will speed up the processing of that MERGE clause, so it is recommended.
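For reference, a sketch of creating that recommended index (the index name is arbitrary; the exact syntax varies with your Neo4j version):
// recent Neo4j (4.x+)
CREATE INDEX employee_name IF NOT EXISTS FOR (e:Employee) ON (e.Name)
// legacy syntax on Neo4j 3.x
// CREATE INDEX ON :Employee(Name)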
There are a few simple things to understand about how Neo4j handles Cypher MERGE operations to avoid undesired or unexpected behavior when using it.
The Cypher MERGE operation is a MATCH or CREATE of the entire pattern. This means that if any element of the pattern does NOT exist, Neo4j will attempt to create the entire pattern.
Always MERGE on each segment of the pattern that might already exist.
After a MERGE operation, you are guaranteed to have a usable reference to all identifiers established during the Cypher MERGE operation, because they were either found or created.
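Putting those rules together for this question, a minimal sketch using the Id and Name values from the question:
// find the organization first; fast thanks to the index on Id
match (o:Organization {Id: 1})
// MERGE each segment that may already exist: first the employee node...
merge (e:Employee {Name: "name"})
// ...then the relationship, so neither is duplicated
merge (e)-[:Worked]->(o)
Without the index on :Employee(Name), the first MERGE falls back to scanning all Employee nodes, which is exactly why the index is recommended.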
MERGE in Cypher doesn't require any index on the node. You can try the following:
merge (:Employee)-[:<relation_name>]->(:Organization)
If the entire pattern already exists, it is matched; otherwise the whole pattern, nodes included, is created along with the relationship.
In Laravel Lighthouse GraphQL, I'd love to be able to delete records that match certain conditions rather than passing just an individual ID.
I get this error:
The @delete directive requires the field deletePostTag to only contain a single argument.
This functionality seems currently unsupported, but if I'm wrong and this is actually supported, please let me know, because this would be the most straightforward approach.
So then my second approach was to try to first run a @find query to retrieve the ID of the record that I want to delete (based on certain fields equaling certain values).
But https://lighthouse-php.com/4.16/api-reference/directives.html#find shows:
type Query {
  userById(id: ID! @eq): User @find
}
and does not show how I could provide (instead of the primary key ID) 2 arguments: a foreign key ID, and a string.
How can I most simply accomplish my goal of deleting records that match certain conditions (rather than deleting via primary key)?
I'm not sure about the @delete functionality regarding multiple arguments, but from what you've posted that appears to be unsupported at the moment. Regarding your query, you should instead use something like @all in conjunction with @where, which would allow you to filter the collection by as many vars/args as you'd like. If your argument list grows beyond 3 or so, I would take a look at Complex Where Conditions. They have worked very well for my team so far, and allow a lot of filtering flexibility.
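For instance, a sketch of such a query (the field and argument names here are illustrative, not from your schema):
type Query {
  postTags(
    post_id: ID @eq
    tag: String @where(operator: "=")
  ): [PostTag!]! @all
}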
Also take a look at the directive's docs stating:
You can also delete multiple models at once. Define a field that takes a list of IDs and returns a collection of the deleted models.
So if you return multiple models you'd like to delete from your query, you may use this approach to delete them all at once.
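A sketch of such a field, following the docs' pattern (the name deletePostTags is hypothetical):
type Mutation {
  deletePostTags(id: [ID!]!): [PostTag!]! @delete
}
You could first resolve the IDs of the records matching your conditions (e.g. with the @all/@where query above) and then pass them to this mutation.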
I've just figured out a big mistake I made while creating the DynamoDB structure.
I've created 11 tables, where one of them is the table most referred to and the others are complementary tables.
For example, I have a table where I hold names (together with other info) called "Names" and another table called "NamesMappings" holding all the names added to the "Names" table. Each time a user wants to add a name to the "Names" table, he first tries to put the name in "NamesMappings", and only if that succeeds (meaning the name doesn't exist yet) does he add the name to the "Names" table. This procedure helps because the name is not unique and is not the primary key in the "Names" table; with this technique I don't have to search inside the "Names" table to check whether the name exists, but instead I can try to add it to the "NamesMappings" table, and only if that succeeds do I know it is a unique name.
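In code, that check is essentially a conditional put. A minimal boto3 sketch (the table, key, and attribute names are placeholders for whatever the real schema uses):
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("NamesMappings")
try:
    # the condition makes the put fail if this name is already present
    table.put_item(
        Item={"name": "alice"},
        ConditionExpression="attribute_not_exists(#n)",
        ExpressionAttributeNames={"#n": "name"},  # "name" is a DynamoDB reserved word
    )
    # success: the name is unique, so it is safe to add to "Names" as well
except ClientError as e:
    if e.response["Error"]["Code"] != "ConditionalCheckFailedException":
        raise
    # the name already exists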
First of all, I would like to ask whether this is a common approach or there is a better one.
Next, I figured out that with this design I soon reached 11 tables, each with 5 provisioned read and 5 provisioned write capacity units, which adds up to 55 provisioned reads and 55 provisioned writes against the free tier. Then I understood why I get all these charges each month: as the number of tables grows, and I leave the provisioned capacity at its default (both read and write capacity at 5), I accumulate more and more provisioned capacity.
So, what should my conclusion be from this? Should I try to reduce the number of tables even if it takes more effort to perform scanning and querying inside a table? Or should I split the tables as I do now but reduce the capacity of the mappings tables that are used only to indicate whether an item exists in another table?
If I understand your problem correctly, you're missing the whole concept of NoSQL databases.
Your Names table should have a hash key (which is similar to a primary key) holding a uniformly generated identifier (a UUID is a great candidate). This automatically makes the table queryable by this unique identifier. You said, however, that you don't know the ID but only know the Name instead. This leads me to think you could create a Global Secondary Index (GSI) on the Name attribute inside the Names table so you can also query by Name. Up to this point, your table structure should look like this:
id | name
Both of them are independently queryable, which gives you a lot of flexibility already.
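A minimal boto3 sketch of that structure (the table, index, and attribute names are assumptions, not from your schema):
import boto3

client = boto3.client("dynamodb")
client.create_table(
    TableName="Names",
    AttributeDefinitions=[
        {"AttributeName": "id", "AttributeType": "S"},
        {"AttributeName": "name", "AttributeType": "S"},
    ],
    KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
    # the GSI makes "name" independently queryable
    GlobalSecondaryIndexes=[{
        "IndexName": "name-index",
        "KeySchema": [{"AttributeName": "name", "KeyType": "HASH"}],
        "Projection": {"ProjectionType": "ALL"},
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)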
Now, let's say you want to add the NameMapping attribute (which I don't know what it looks like): you can simply add it under the Names table, getting rid of the NamesMappings table and greatly reducing the number of WCUs and RCUs across your account. Your table structure should now look like this:
id | name | mappings
where mappings is, let's say, a JSON object.
Since you can only query on top level attributes in DynamoDB, you can now perform a query against the name attribute which has a GSI configured. If the query returns nothing, then name is unique. But let's say you still need some data inside the mappings object, then you could query by name and, in your code, you could apply a map/filter/reduce operation on the mappings attribute and decide what to do next.
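Sketching that uniqueness check with boto3 (again, the table and index names are assumptions carried over from the sketch above):
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Names")

# query the GSI on "name"; an empty result means the name is unused
resp = table.query(
    IndexName="name-index",
    KeyConditionExpression=Key("name").eq("alice"),
)
if not resp["Items"]:
    print("name is unique")
else:
    # otherwise, inspect the mappings attribute of the matching items
    mappings = [item.get("mappings") for item in resp["Items"]]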
Remember that duplication is just fine in a NoSQL world. This may look scary if you come from a purely SQL background, but data in a NoSQL database should be stored in such a way that you can fetch all the needed information in one go, therefore avoiding "joins". (Joins are still possible in a NoSQL database, but since there are no strong relationships between entities, you need to perform them manually at the code level.) To give you some real context, imagine you have an Orders table where you keep track of the ordered Products and the Store that the Order belongs to: you'd save both the Products and the Store objects (and not their IDs, as would happen in the SQL way) inside the Order object, so if you want to query for a given OrderId in the future, you wouldn't need to make extra calls (aka "joins") to the Product/Store tables to fetch the information, since everything would already be stored inside the Order object.
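For illustration, such a denormalized Order item could look like this (all names and values are made up):
order = {
    "id": "order-123",
    # the Store object is embedded rather than referenced by ID
    "store": {"id": "store-9", "name": "Main Street"},
    # likewise, the ordered Products are embedded in full
    "products": [
        {"id": "prod-1", "title": "Widget", "price": 5},
        {"id": "prod-2", "title": "Gadget", "price": 7},
    ],
}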
I have two scenarios that I want to support, but I don't know the best way to design the relations in Elasticsearch. I read the entire Elasticsearch documentation but couldn't find the best way to design types for my scenarios.
Multiple one-to-many.
Let's assume that I have the following tables in my relational database that I want to transfer to Elasticsearch:
Transaction: Id, User1Id, User2Id, …
User: Id, Name
A transaction contains two references to User. As far as I know, I cannot use the parent->child relation with two parents. I need to store transactions and users as separate types because they can change separately, and I need to be able to search transactions by user details and return the users connected with the matching transactions. Any idea how to design such a structure in Elasticsearch?
Many-to-many
Let's assume that we have the following tables:
Order: Id, …
OrderLine: OrderId, UserId, Amount, …
User: Id, Name
An order line is always saved with its order, so I thought I could store an order with its order lines as a nested object relation, but the user must stay in a separate type. Is there any way I can connect the multiple users referenced from order lines with the user type? I assume I can use an application-side join, but I need to retrieve the order and its order lines together and be able to search orders by user data.
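For reference, this is the kind of nested mapping I have in mind for orders and order lines (index and field names are just illustrative, using current Elasticsearch syntax):
PUT /orders
{
  "mappings": {
    "properties": {
      "lines": {
        "type": "nested",
        "properties": {
          "userId": { "type": "keyword" },
          "amount": { "type": "double" }
        }
      }
    }
  }
}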
I could use grandparent and grandchild relations, but then I need to make joins in the application. Any idea how to design this in the best way?
Does anyone have an example of how to create an HBase table with a nested entity?
Example
UserName (string)
SSN (string)
+ Books (collection)
The books collection would look like this, for example:
Books
isbn
title
etc...
I cannot find a single example of how to create a table like this. I see many people talk about it, and how it is a best practice in certain scenarios, but I cannot find an example of how to do it anywhere.
Thanks...
Nested entities aren't an official feature of HBase; it's just a way some people talk about one usage pattern. In this pattern, you use the fact that "columns" in HBase are really just a big map (a bunch of key/value pairs) to model a one-to-many dimension inside the row by adding one column per "row" of the nested entity.
Schema-wise, you don't need to do much on the table itself; when you create a table in HBase, you just specify the name & column family (and associated properties), like so (in hbase shell):
hbase:001:0> create 'UsersWithBooks', 'cf1'
Then it's up to you what you put in it, column-wise. You could insert values like:
hbase:002:0> put 'UsersWithBooks', 'userid1234', 'cf1:username', 'my username'
hbase:003:0> put 'UsersWithBooks', 'userid1234', 'cf1:ssn', 'my ssn'
hbase:004:0> put 'UsersWithBooks', 'userid1234', 'cf1:book_id_12345', '<isbn>12345</isbn><title>mary had a little lamb</title>'
hbase:005:0> put 'UsersWithBooks', 'userid1234', 'cf1:book_id_67890', '<isbn>67890</isbn><title>the importance of being earnest</title>'
The column names are totally up to you, and there's no limit to how many you can have (within reason: see the HBase Reference Guide for more on this). Of course, doing this, you have to do your own legwork re: putting in and getting out values (and you'd probably do it with the Java client in a more sophisticated way than I'm doing with these shell commands; they're just for explanatory purposes). And while you can efficiently scan just a portion of the columns in a table by key (using a column pagination filter), you can't do much with the contents of the cells other than pull them and parse them elsewhere.
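The shell commands above map directly onto client API calls. A sketch with the Python happybase client, as an alternative to the Java client (it assumes an HBase Thrift server is running; the names are taken from the shell example):
import happybase

connection = happybase.Connection("localhost")  # host of the Thrift server
table = connection.table("UsersWithBooks")

# one column per nested "book" row, keyed by the book id
table.put(b"userid1234", {
    b"cf1:username": b"my username",
    b"cf1:book_id_12345": b"<isbn>12345</isbn><title>mary had a little lamb</title>",
})

# fetch the row back and pick out the book columns to parse in app code
row = table.row(b"userid1234")
books = {col: val for col, val in row.items() if col.startswith(b"cf1:book_id_")}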
Why would you do this? Probably only if you wanted atomicity around all the nested rows for one parent row. It's not very common; your best bet is probably to start by modeling them as separate tables, and only move to this approach if you really understand the tradeoffs.
There are some limitations to this. First, this technique only works one level deep: your nested entities can't themselves have nested entities. You can still have multiple different nested child entities in a single parent, with the column qualifier serving as their identifying attribute.
Second, it's not as efficient to access an individual value stored as a nested column qualifier inside a row as it is to access a row in another table, as you learned earlier in the chapter.
Still, there are compelling cases where this kind of schema design is appropriate. If the only way you get at the child entities is via the parent entity, and you'd like to have transactional protection around all children of a parent, this can be the right way to go.
I am somewhat new to LINQ and have a quick question regarding deleting.
Say, for example, I have 2 tables, Orders and OrderItems. Using LINQ, I can easily create a new child record by using
order.Items.Add(new OrderItem());
and this will create the child record in the database and update its foreign key to the orderId. This is great, I like it! However, when I want to remove a child record
order.Items.Remove(orderItem);
I get an error when I submit the changes (because it's not actually deleting the child row (order item), just removing the foreign key Id). Is it possible to do this the way I would like to? I don't want to have to create a whole bunch of repositories and if-ladders to delete all child rows for a large database.
Thanks in advance.
E
You can achieve that in the DB itself by configuring the foreign key relationship to delete child records on deletion of the parent's key.
Note that this happens transparently to Linq2SQL and it will not be aware of it, so it's best to make sure you do not keep the data contexts around after that, since the deleted OrderItem objects will still be present in them.
Set ON DELETE CASCADE for the table in question, which will let SQL Server handle this for you.
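For example (the table and column names here are assumptions about your schema), the constraint in SQL Server would look like:
ALTER TABLE OrderItems
    ADD CONSTRAINT FK_OrderItems_Orders
    FOREIGN KEY (OrderId) REFERENCES Orders(Id)
    ON DELETE CASCADE;
With this in place, deleting an Orders row automatically deletes its OrderItems rows.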