Phoenix/Ecto - restart identity in seedfile

When using the seedfile in a Phoenix app, is it possible to restart the identity for a table if the seed is run more than once?
For example, I'm currently working with the following seed:
PhoenixApp.Repo.delete_all PhoenixApp.Role
PhoenixApp.Repo.insert!(%PhoenixApp.Role{role: "admin"})
PhoenixApp.Repo.insert!(%PhoenixApp.Role{role: "non-admin"})
The first line clears the table (so that the records don't pile up if the seed is run multiple times), and the lines following create the seed records. Running this code once would create two records with the autoincremented primary keys '1' and '2', as you would normally expect. However, if I want to add another entry to the table later on, such as
PhoenixApp.Repo.insert!(%PhoenixApp.Role{role: "superuser"})
the ids of the rows will now be '3', '4', and '5', because the identity was not restarted.
Does Ecto have a command that restarts the table identity as well? I realize that I could add additional records to my table via IEx, but I'd prefer to restart the identity, if possible.

In PostgreSQL, you can execute ALTER SEQUENCE <sequence name> RESTART to reset the sequence to its start value. The sequence name for the id primary key will be #{table_name}_id_seq. You can run the query using Repo.query, for example:
Repo.query("ALTER SEQUENCE comments_id_seq RESTART")
iex(1)> Repo.insert!(%Comment{}).id
11
iex(2)> Repo.delete_all(Comment)
{11, nil}
iex(3)> Repo.insert!(%Comment{}).id
12
iex(4)> Repo.query("ALTER SEQUENCE comments_id_seq RESTART")
{:ok,
%Postgrex.Result{columns: nil, command: :alter_sequence, connection_id: 3360,
num_rows: 0, rows: nil}}
iex(5)> Repo.insert!(%Comment{}).id
1
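In a seeds.exs this can be combined with the delete, so re-running the seeds starts the ids from 1 again. A minimal sketch, assuming the Role schema from the question maps to a roles table (so the sequence name is roles_id_seq):
# priv/repo/seeds.exs
PhoenixApp.Repo.delete_all(PhoenixApp.Role)
# Reset the autoincrement counter; the sequence name roles_id_seq is an assumption.
PhoenixApp.Repo.query!("ALTER SEQUENCE roles_id_seq RESTART")

PhoenixApp.Repo.insert!(%PhoenixApp.Role{role: "admin"})
PhoenixApp.Repo.insert!(%PhoenixApp.Role{role: "non-admin"})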

How to group records based on a child record in Amazon Quicksight

Is there any way to categorise an aggregated record in Amazon Quicksight based on the records being aggregated?
I'm trying to build a pivot table in Amazon QuickSight to pull out counts of various records, split down by what state the record is in, limited to a date range where the record first had activity.
This is all being done in SPICE since the raw data is CSV files in S3. The two CSV files are:
Student
Id: String
Current State: One of pass / fail
Department: String
Test
Id: String
Student Id: String
DateOfTest: DateTime
State: One of pass / fail
The pivot table I'm trying to build would have a row for each department and 3 columns:
fail - Count of students in the department where the current state is fail
pass-never-failed - Count of students in the department where the current state is pass and there are no failed tests for the Student
pass-failed-some - Count of students in the department where the current state is pass and there is at least one failed test
To do this I've created a dataset that contains Student joined to Test. I can then create a Pivot table with:
Row: Department
Column: Current State
Value: Count Distinct(Id[Student])
I thought I'd be able to create a calculated field to give me the three categories and use that in place of Current State using the following calculation:
ifelse(
{Current State} = 'pass',
ifelse(
countIf({Id[Test]}, {State} = 'fail') = 0,
'pass-never-failed',
'pass-failed-some'
),
{Current State}
)
But that shows invalid with the following error:
Mismatched aggregation. Custom aggregations can’t contain both aggregate "`COUNT`" and non-aggregated fields “COUNT(CASE WHEN "State" = _UTF16'fail' THEN "Id[Test]" ELSE NULL END)”, in any combination.
Is there any way to categorise the Students based on an aggregation of the Tests in QuickSight, or do I have to pre-calculate this information in the source data?
I've been able to work around this for now by defining three separate calculations for the three columns and adding these as values in the QuickSight pivot, rather than setting a column dimension at all.
Count Passed Never Failed
distinct_countIf({Id[Student]}, maxOver(
ifelse({State[Test]} = 'fail' AND {Current State[Student]} = 'pass',
1, 0)
, [{Id[Student]}], PRE_AGG) = 0)
Count Passed Failed Some
distinct_countIf({Id[Student]}, maxOver(
ifelse({State[Test]} = 'fail' AND {Current State[Student]} = 'pass',
1, 0)
, [{Id[Student]}], PRE_AGG) > 0)
Count Failed
distinct_countIf({Id[Student]}, {Current State[Student]} = 'fail')
This works, but I'd still like to know if it's possible to build a dimension that I could use for this, as it would be more flexible if new states are added and wouldn't need the special handling of pass.

Query a table with primary key and two conditions on sort key

I'm trying to query a DynamoDB table using the partition key and a sort key. The sort key is a unix date, so I want to request a given partition key with the sort key between two dates. I am currently able to achieve this with a table scan, but I need to move this to a query for the speed benefit. I have been unable to find any decent examples online of people using a partition key and sort key to query their table.
I have carefully read through this https://docs.aws.amazon.com/sdk-for-go/api/service/dynamodb/#DynamoDB.Query and understand that my params must go within the KeyConditionExpression.
I have read through https://github.com/aws/aws-sdk-go/blob/master/service/dynamodb/expression/examples_test.go and understand it on the whole, but I just can't find the syntax for KeyConditionExpression.
I'd have thought it was something like this:
keyCond := expression.Key("accountId").
Equal(expression.Value(accountId)).
And(expression.Key("sortKey").
Between(expression.Value(fromDateDec), expression.Value(toDateDec)))
But this throws:
ValidationException: Invalid KeyConditionExpression: Incorrect operand type for operator or function; operator or function: BETWEEN, operand type: NULL
First, you need KeyAnd to combine the hash key condition and the sort key condition.
// keyCondition represents the key condition where the partition key
// "TeamName" is equal to value "Wildcats" and sort key "Number" is equal
// to value 1
keyCondition := expression.KeyAnd(expression.Key("TeamName").Equal(expression.Value("Wildcats")), expression.Key("Number").Equal(expression.Value(1)))
Now, instead of the equal condition, you can use your between condition as follows:
// keyCondition represents the boolean key condition of whether the value
// of the key "foo" is between values 5 and 10
keyCondition := expression.KeyBetween(expression.Key("foo"), expression.Value(5), expression.Value(10))
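Putting the two together for this question, a sketch might look like the following (the table name "events" is an assumption; the snippet uses the aws, dynamodb and expression packages from aws-sdk-go):
// Partition key equals accountId AND sort key between the two unix dates.
keyCond := expression.KeyAnd(
    expression.Key("accountId").Equal(expression.Value(accountId)),
    expression.KeyBetween(expression.Key("sortKey"),
        expression.Value(fromDateDec), expression.Value(toDateDec)),
)

// Build the expression so the names and values get wired into the query input.
expr, err := expression.NewBuilder().WithKeyCondition(keyCond).Build()
if err != nil {
    // handle the build error
}

out, err := svc.Query(&dynamodb.QueryInput{
    TableName:                 aws.String("events"), // assumed table name
    KeyConditionExpression:    expr.KeyCondition(),
    ExpressionAttributeNames:  expr.Names(),
    ExpressionAttributeValues: expr.Values(),
})
// out.Items now holds the rows for that accountId within the date range.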

Performance-wise for Lua table selection

I'm a bit new to Lua. I have a game in which I need to capture entities and insert them into a table. The maximum possible number of entities at the same time is 14, so I read that an array-based solution is a good fit.
But I noticed that the table size keeps growing even if we delete some values. For example, with 10 values in the table, if I delete the value at index 9, the size doesn't automatically shift down when I insert value number 11.
Example:
local Table = {"hello", "hello", "hello", "hello", "hello", "hello", "hello", "hello", "hello", "hello"}
-- Current Table size = 10
-- Perform delete at index 9
Table[9] = nil
-- Have new Entity to insert
Table[#Table + 1] = "New Value"
-- The table size will keep growing as the game goes on.
So for this type of situation, does an array-based table with nil values inside, which keeps growing as new values are inserted, have better performance, or should I move to a table with keys?
Or should I just stick with an array-based table and perform a full cleanup when the table isn't in use?
If you set an element in a table to nil, then that just stays there as a "hole" in your array.
tab = {1, 2, 3, 4}
tab[2] = nil
-- tab == {1, nil, 3, 4}
-- #tab is actually undefined and could be either 1 or 4 (or something completely unexpected)!
What you need to do is set the field to nil, then shift all the following fields to fill that hole. Luckily, Lua has a function for that, which is table.remove(table, index).
tab = {1, 2, 3, 4}
table.remove(tab, 2)
-- tab == {1, 3, 4}
-- #tab == 3
Keep in mind that this can get very slow as there's lots of memory access involved, so don't go applying this solution when you have a few million elements some day :)
While table.remove(Table, 9) will do the job in your case (removing field from "array" table and shifting remaining fields to fill the hole), you should first consider using "set" table instead.
If you:
- often remove/add elements
- don't care about their order
- often check if table contains a certain element
then the "set" table is your choice. Use it like so
local tab = {
["John"] = true,
["Jane"] = true,
["Bob"] = true,
}
Your elements will be stored as indices in a table.
Remove an element with
tab["Jane"] = nil
Test if table contains an element with
if tab["John"] then
-- tab contains "John"
Advantages compared to array table:
- this will eliminate the performance overhead of removing an element, because the other elements remain intact and no shifting is required
- checking whether an element exists in the table (which I assume is the main purpose of this table) is also faster than with an array table, because it no longer requires iterating over all the elements to find a match; a hash lookup is used instead
Note however that this approach doesn't let you have duplicate values as your elements, because tables can't contain duplicate keys. In that case you can use numbers as values to store the amount of times the element is duplicated in your set, e.g.
local tab = {
["John"] = 1,
["Jane"] = 2,
["Bob"] = 35,
}
Now you have 1 John, 2 Janes and 35 Bobs
https://www.lua.org/pil/11.5.html
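For that counted variant, the add/remove bookkeeping can be wrapped in two small helpers (a sketch; the helper names are just illustrative):
local function add(set, name)
  set[name] = (set[name] or 0) + 1
end

local function remove(set, name)
  local count = set[name]
  if count then
    -- drop the key entirely once the last duplicate is gone
    set[name] = count > 1 and count - 1 or nil
  end
end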

jdbc/insert! on sqlite3 does not manage more than two rows

I am trying to batch-write to a sqlite3 db using a pooled connection as described in clojure-cookbook.
It works for up to two rows. When I insert three rows I get a java.lang.ClassCastException: clojure.lang.MapEntry cannot be cast to clojure.lang.Named exception.
Here's my code:
(def db-spec {:classname "org.sqlite.JDBC"
:subprotocol "sqlite"
:subname "sqlite.db"
:init-pool-size 1
:max-pool-size 1
:partitions 1})
(jdbc/db-do-commands
*pooled-db*
(jdbc/create-table-ddl
:play
[[:primary_id :integer "PRIMARY KEY AUTOINCREMENT"]
[:data :text]]))
(jdbc/insert! *pooled-db* :play {:data "hello"} {:data "hello"})
(jdbc/insert! *pooled-db* :play {:data "hello"} {:data "hello"} {:data "hello"})
What am I missing here?
Thanks
See the docs for this example: https://github.com/clojure/java.jdbc
(j/insert-multi! mysql-db :fruit
[{:name "Apple" :appearance "rosy" :cost 24}
{:name "Orange" :appearance "round" :cost 49}])
The API docs say this:
insert-multi!
function
Usage: (insert-multi! db table rows)
(insert-multi! db table cols-or-rows values-or-opts)
(insert-multi! db table cols values opts)
Given a database connection, a table name and either a sequence of maps (for
rows) or a sequence of column names, followed by a sequence of vectors (for
the values in each row), and possibly a map of options, insert that data into
the database.
When inserting rows as a sequence of maps, the result is a sequence of the
generated keys, if available (note: PostgreSQL returns the whole rows).
When inserting rows as a sequence of lists of column values, the result is
a sequence of the counts of rows affected (a sequence of 1's), if available.
Yes, that is singularly unhelpful. Thank you getUpdateCount and executeBatch!
The :transaction? option specifies whether to run in a transaction or not.
The default is true (use a transaction). The :entities option specifies how
to convert the table name and column names to SQL entities.
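Applied to the table from the question, the multi-row call would look something like this (a sketch, reusing the *pooled-db* and :play table from above; the rows go in a single vector rather than as separate arguments):
(jdbc/insert-multi! *pooled-db* :play
                    [{:data "hello"}
                     {:data "hello"}
                     {:data "hello"}])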

Efficient way to get difference of two streams in RethinkDB

I am running some performance benchmarks on RethinkDB (related to a specific use-case). In my simulation, there are 2 tables: contact and event. A contact has many events. The event table has 2 indices: contact_id and compound index on [campaign_id, node_id, event_type]. The contact table has about 500k contacts and about 1.75 million docs in event table.
The query I am struggling with is to find all the contacts who have sent event_type but not open event_type. Following is the query I got to work:
r.table("events").
get_all([1, 5, 'sent'], {index: 'campaign'})['contact_id'].distinct
.set_difference
(r.table("events").get_all([1, 5, 'open'], {index: 'campaign'})['contact_id'].distinct)
.count.run(conn)
But this query uses set difference, not stream difference. I have also tried using the difference operator:
r.table("events").
get_all([1, 5, 'sent'], {index: 'campaign'})['contact_id'] .difference
(r.table("events").get_all([1, 5, 'open'], {index: 'campaign'})['contact_id'])
.count.run(conn)
This query never finishes, and the weird thing is that even after aborting the query I see (in the RethinkDB dashboard) that the reads don't stop.
What's the most efficient way of doing this kind of query?
Follow up: find all the male contacts who have sent event_type but not open event_type. What I have now is:
r.table("contacts").get_all(r.args(
r.table("events").get_all([1, 5, 'sent'], {index: 'campaign'})['contact_id'].distinct
.set_difference
(r.table("events").get_all([1, 5, 'open'], {index: 'campaign'})['contact_id'].distinct)))
.filter({gender: 1}).count.run(conn)
One way to make this efficient is to denormalize your data. Instead of having separate contact and event tables, just have the contact table and make each contact have an array of events. Then you can write:
r.table('contacts').indexCreate('sent_but_not_open', function(row) {
return row('events').contains('sent').and(
row('events').contains('open').not());
});
That will work well if the number of events per contact is smallish. If you have thousands or millions of events per contact it will break down though.
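Once such an index exists, the query itself is just a getAll on the boolean value (a sketch, assuming the denormalized events array and the sent_but_not_open index above):
r.table('contacts').getAll(true, {index: 'sent_but_not_open'}).count().run(conn);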
RethinkDB doesn't offer a way to diff two streams lazily on the server. The best you could do is to change your compound index to be on [campaign_id, node_id, event_type, contact_id] instead, replace your get_all([1, 5, 'sent'], {index: 'campaign'}) with .between([1, 5, 'sent', r.minval], [1, 5, 'sent', r.maxval], {index: 'campaign'}), and then put .distinct({index: 'campaign'})['contact_id'] on the end. That will give you a stream of distinct contact IDs rather than an array, and these contact IDs will be sorted. You can then do the same for the open events, and diff the two ordered streams in the client by doing a mergesort-like thing.
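Transcribed into query form, that suggestion looks roughly like this (a sketch that keeps the index name 'campaign' for the reshaped four-field index; the client-side merge is only sketched in the comment):
sent_ids = r.table("events")
  .between([1, 5, 'sent', r.minval], [1, 5, 'sent', r.maxval], {index: 'campaign'})
  .distinct({index: 'campaign'})['contact_id']

open_ids = r.table("events")
  .between([1, 5, 'open', r.minval], [1, 5, 'open', r.maxval], {index: 'campaign'})
  .distinct({index: 'campaign'})['contact_id']

# Both streams come back ordered by the index, so the client can walk them in
# parallel (mergesort-style) and keep the ids present in sent_ids but not open_ids.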
