PostgreSQL's JSONB-like indexable column in Tarantool?

In PostgreSQL we can create a JSONB column that can be indexed and accessed something like this:
CREATE TABLE foo (
id BIGSERIAL PRIMARY KEY,
-- createdAt, updatedAt, deletedAt, createdBy, updatedBy, restoredBy, deletedBy
data JSONB
);
CREATE INDEX ON foo((data->>'email'));
INSERT INTO foo(data) VALUES('{"name":"yay","email":"a@1.com"}');
SELECT data->>'name' FROM foo WHERE id = 1;
SELECT data->>'name' FROM foo WHERE data->>'email' = 'a@1.com';
This is very beneficial in the prototyping phase (no migration or locking is needed when adding a column).
Can we do a similar thing in Tarantool?

Sure, Tarantool supports JSON path indices. An example:
-- Initialize / load the database.
tarantool> box.cfg{}
-- Create a space with two columns: id and obj.
-- The obj column is supposed to contain dictionaries with nested data.
tarantool> box.schema.create_space('s',
> {format = {[1] = {'id', 'unsigned'}, [2] = {'obj', 'any'}}})
-- Create primary and secondary indices.
-- The secondary index looks at the nested field obj.timestamp.
tarantool> box.space.s:create_index('pk',
> {parts = {[1] = {field = 1, type = 'unsigned'}}})
tarantool> box.space.s:create_index('sk',
> {parts = {[1] = {field = 2, path = 'timestamp', type = 'number'}}})
-- Insert three tuples: first, third and second.
tarantool> clock = require('clock')
tarantool> box.space.s:insert({1, {text = 'first', timestamp = clock.time()}})
tarantool> box.space.s:insert({3, {text = 'third', timestamp = clock.time()}})
tarantool> box.space.s:insert({2, {text = 'second', timestamp = clock.time()}})
-- Select tuples with timestamp of the last hour, 1000 at max.
-- Sort them by timestamp.
tarantool> box.space.s.index.sk:select(
> clock.time() - 3600, {iterator = box.index.GT, limit = 1000})
---
- - [1, {'timestamp': 1620820764.1213, 'text': 'first'}]
  - [3, {'timestamp': 1620820780.4971, 'text': 'third'}]
  - [2, {'timestamp': 1620820789.5737, 'text': 'second'}]
...
JSON path indices have been available since Tarantool 2.1.2.

Related

How to remove a field in Tarantool space?

I have a field in a Tarantool space that I no longer need.
local space = box.schema.space.create('my_space', {if_not_exists = true})
space:format({
{'field_1', 'unsigned'},
{'field_2', 'unsigned'},
{'field_3', 'string'},
})
How to remove field_2 if it's indexed and if it's not indexed?
There is no convenient way to do it.
The first way is to declare the field as nullable and write NULL into it. The value is still stored physically, but you can hide the field from users.
It's simple and not expensive.
The second way is to write an in-place migration. This is not possible if there are indexed fields after the field you want to drop (field_3 in your example).
It's also dangerous if you have a huge amount of data in this space.
local space = box.schema.space.create('my_space', {if_not_exists = true})
space:create_index('id', {parts = {{field = 1, type = 'unsigned'}}})
space:format({
{'field_1', 'unsigned'},
{'field_2', 'unsigned'},
{'field_3', 'string'},
})
-- Create key_def instance to simplify primary key extraction
local key_def = require('key_def').new(space.index[0].parts)
-- drop previous format
space:format({})
-- Migrate your data
for _, tuple in space:pairs() do
space:delete(key_def:extract_key(tuple))
space:replace({tuple[1], tuple[3]})
end
-- Setup new format
space:format({
{'field_1', 'unsigned'},
{'field_3', 'string'},
})
The third way is to create a new space, migrate the data into it and drop the previous one.
Still it's quite dangerous.
local space = box.schema.space.create('new_my_space', {if_not_exists = true})
space:create_index('id', {parts = {{field = 1, type = 'unsigned'}}})
space:format({
{'field_1', 'unsigned'},
{'field_3', 'string'},
})
-- Migrate your data
for _, tuple in box.space['my_space']:pairs() do
space:replace({tuple[1], tuple[3]})
end
-- Drop the old space
box.space['my_space']:drop()
-- Rename new space
-- Look the new space up by the name it was created with above
local space_id = box.space._space.index.name:get({'new_my_space'}).id
-- In newer versions of Tarantool (2.6+) the space.alter method is available,
-- but in older versions you can update the name via the system "_space" space
box.space._space:update({space_id}, {{'=', 'name', 'my_space'}})

LINQ Left Outer Join with Greater Than and Less Than Date Conditions

I've been struggling with this for a while and can't find the syntax for a LINQ outer join that has multiple conditions based on date. I've been looking into the GroupJoin syntax, but that only lets you compare one field value (normally IDs).
I would like to test if the parent table has a date (e.g. "UpdateDate") that falls within multiple values defined in the child table (e.g. "StartDate" and "EndDate"). If the parent date fits the condition, pull a column or two from the child table. If not, those columns from the child table should be null (classic left join stuff).
I don't think query syntax will work because it only recognizes equijoins.
Is there a way to do this in LINQ using Lambda syntax? I've been trying to use some combination of "SelectMany" and "DefaultIfEmpty" but keep getting stuck trying to define the join.
One way to do this in LINQ:
var q = from a in TableA
from b in TableB.Where(x => a.Date > x.StartDate && a.Date < x.EndDate).DefaultIfEmpty()
select new {...}
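To make the semantics of that query concrete, here is a small language-neutral sketch in Python of a left outer join on a date range. It is only an illustration, not the LINQ translation: the sample rows, the field names (`date`, `start`, `end`, `label`) and the helper `left_join_by_range` are all made up for this example.

```python
from datetime import date

# Hypothetical sample rows standing in for TableA (parent) and TableB (child).
table_a = [
    {"id": 1, "date": date(2020, 5, 10)},
    {"id": 2, "date": date(2021, 1, 1)},
]
table_b = [
    {"label": "spring", "start": date(2020, 3, 1), "end": date(2020, 6, 1)},
    {"label": "autumn", "start": date(2020, 9, 1), "end": date(2020, 12, 1)},
]


def left_join_by_range(parents, children):
    """Pair each parent with every child whose (start, end) range strictly
    contains the parent's date; emit one row with None when nothing matches."""
    rows = []
    for a in parents:
        matches = [b for b in children if b["start"] < a["date"] < b["end"]]
        # "or [None]" plays the role of DefaultIfEmpty(): keep the parent row.
        for b in matches or [None]:
            rows.append((a["id"], b["label"] if b else None))
    return rows


result = left_join_by_range(table_a, table_b)
print(result)
```

Parent 1 falls inside the "spring" range and picks up its label; parent 2 matches nothing and is still returned, with the child column left empty.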
Use the resultSelector parameter of Queryable.GroupJoin to select what you want:
var result = dbContext.Parents.GroupJoin(dbContext.Children,
// outer and inner key Selectors:
parent => parent.Id, // from every parent take the primary key
child => child.ParentId, // from every child take the foreign key to parent
// ResultSelector: take the parent and all his children to make one new object
(parent, children) => new
{
// Select only the Parent properties you actually plan to use:
Id = parent.Id,
Name = parent.Name,
...
Children = children.Select(child => new
{
// select only Child properties you plan to use:
Id = child.Id,
// No need: you know the value: ParentId = child.ParentId,
...
"If the parent date fits the condition, pull a column or two from the child table, otherwise those columns from the child table should be null "
// the cast is needed if BirthDay is a non-nullable DateTime
SpecialColumnA = (parent.BirthDay.Year < 2000) ? (DateTime?)child.BirthDay : null,
SpecialColumnB = (parent.Name == "Kennedy") ? child.Name : null,
});
If the conditions are the same for a lot of columns, consider checking them only once:
SpecialColumns = (parent.Birthday.Year >= 2000) ? null :
// else fill the special columns:
new
{
Name = child.Name,
SomeWeirdProperty = parent.Id + child.Id,
...
},
});

How to create a temporary column + when + order by with Criteria Builder

Here is the SQL statement I am trying to translate to JPA:
select
id,
act_invalidation_id,
last_modification_date,
title,
case when act_invalidation_id is null then 1 else 0 end as test
from act order by test, last_modification_date desc
The actual translation
Root<Act> actRoot = query.from(Act.class);
builder.selectCase()
.when(builder.isNull(actRoot.get("actInvalidation")), 1)
.otherwise(0).as(Integer.class);
Expression<?> actInvalidationPath = actRoot.get("actInvalidation");
Order byInvalidationOrder = builder.asc(actInvalidationPath);
Path<Date> publicationDate = actRoot.get("metadata").get("publicationDate");
Order byLastModificationDate = builder.desc(publicationDate);
query.select(actRoot).orderBy(byInvalidationOrder, byLastModificationDate);
entityManager.createQuery(query).getResultList();
I am trying to create a temporary column (named test) of Integer type and order by this column, then order by the last modification date. The content of this new column is determined by the value of the actInvalidation field.
In short: how do I create a temporary column with integer values and then order by it in JPA?
Thank you
I didn't test this but it should work like this:
Root<Act> actRoot = query.from(Act.class);
// Keep the CASE expression in a variable so it can be used for ordering
Expression<?> test = builder.selectCase()
.when(builder.isNull(actRoot.get("actInvalidation")), 1)
.otherwise(0).as(Integer.class);
Expression<?> actInvalidationPath = actRoot.get("actInvalidation");
Order byInvalidationOrder = builder.asc(actInvalidationPath);
Path<Date> publicationDate = actRoot.get("metadata").get("publicationDate");
Order byLastModificationDate = builder.desc(publicationDate);
Order byTest = builder.asc(test);
query.select(actRoot).orderBy(byTest, byInvalidationOrder, byLastModificationDate);
entityManager.createQuery(query).getResultList();

How to Update previous row column based on the current row column data using LinQ

var customer = from cust in customerData
select new Customer
{
CustomerID = cust["id"],
Name = cust["Name"],
LastVisit = cust["visit"],
PurchasedAmount = cust["amount"],
Tagged = cust["tagged"],
Code = cust["code"]
};
The rows look like this:
Name    LastVisit  PurchasedAmount  Tagged  Code      CustomerID
------  ---------  ---------------  ------  --------  ----------
Joshua  07-Jan-09                   Yes     chiJan01  A001
Joshua             10000
The second row belongs to the first row; its other columns are just empty. How can I merge the PurchasedAmount into the first row using LINQ?
This is probably a more general solution than you need - it will work even if the other values are scattered across rows. The main condition is that the Name column should identify rows that belong together.
customer = from c in customer
group c by c.Name
into g
select new Customer
{
Name = g.Key,
LastVisit = g.Select(te => te.LastVisit).
Where(lv => lv.HasValue).FirstOrDefault(),
PurchaseAmount = g.Select(te => te.PurchaseAmount).
Where(pa => pa.HasValue).FirstOrDefault(),
Tagged = g.Select(te => te.Tagged).
Where(ta => ta.HasValue).FirstOrDefault(),
Code = g.Select(te => te.Code).
Where(co => !string.IsNullOrEmpty(co)).FirstOrDefault(),
CustomerID = g.Select(te => te.CustomerID).
Where(cid => !string.IsNullOrEmpty(cid)).FirstOrDefault()
};
This will return a new IEnumerable with the items grouped by Name and the non-null values selected (same effect as moving PurchasedAmount to the first row and deleting the second in your case).
Note that the code is based on the assumption that LastVisit, PurchaseAmount and Tagged are nullable types (DateTime?, int? and bool?). Thus the usage of HasValue. If, however, they are strings in your case, you have to use !string.IsNullOrEmpty() instead (as for Code and CustomerID).
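The same idea (group by Name, then keep the first non-empty value per column) can be sketched in a few lines of Python. This is an illustration of the technique on the two rows from the question, not a port of the LINQ query; `merge_rows` is a hypothetical helper and empty cells are assumed to be None or an empty string.

```python
def merge_rows(rows, key="Name"):
    """Collapse rows sharing the same key, keeping the first non-empty
    value seen for each column (None and "" count as empty)."""
    merged = {}
    for row in rows:
        target = merged.setdefault(row[key], {})
        for column, value in row.items():
            # Fill the column only while it is still missing or empty.
            if target.get(column) in (None, ""):
                target[column] = value
    return list(merged.values())


# The two rows from the question, with empty cells represented as None.
rows = [
    {"Name": "Joshua", "LastVisit": "07-Jan-09", "PurchasedAmount": None,
     "Tagged": "Yes", "Code": "chiJan01", "CustomerID": "A001"},
    {"Name": "Joshua", "LastVisit": None, "PurchasedAmount": 10000,
     "Tagged": None, "Code": None, "CustomerID": None},
]

merged = merge_rows(rows)
print(merged)
```

As with the LINQ version, this only works if Name reliably identifies the rows that belong together.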

Is a dynamic pivot using LINQ possible?

I have a T-SQL 2005 query which returns:
pid         propertyid  displayname     value
----------- ----------- --------------- ---------------
14270790    74          Low Price       1.3614
14270790    75          High Price      0
14270791    74          Low Price       1.3525
14270791    75          High Price      0
14270792    74          Low Price       1.353
14270792    75          High Price      0
14270793    74          Low Price       1.3625
14270793    75          High Price      0
14270794    74          Low Price       1.3524
14270794    75          High Price      0
What I would like to do is essentially pivot on the displayname field, hopefully producing:
pid         Low Price   High Price
14270790    1.3614      0
14270791    1.3525      0
14270792    1.353       0
14270793    1.3625      0
14270794    1.3524      0
(Not sure how the propertyid field would be output, so I left it out. I was hoping it would simply sit alongside the Low Price and High Price fields to indicate their IDs, but I don't think that will work.)
The problem is that the content of the original displayname field is dynamic: it is produced from a join with a PropertyName table, so the number of pivoted columns is variable. It could therefore contain High Price, Low Price, Open and Close, depending on what the join with that table returns.
It is, of course, relatively easy (regardless of the trouble I'm having writing the initial query!) to produce this pivot in a fixed query or stored proc. However, is it possible to get LINQ to generate a SQL query which would name each column to be produced rather than having to write a dynamic (probably in a stored proc) query which lists out the column names?
Thanks,
Matt.
I'll give you a sample with different data (the data I needed). You can adapt it to your needs. Note that only two LINQ queries are used; most of the other fluff is there to convert a list into a DataTable.
var data = new[] {
new{Student=1, Subject="English", Marks=40},
new{Student=1, Subject="Maths", Marks=50},
new{Student=1, Subject="Science", Marks=60},
new{Student=1, Subject="Physics", Marks=70},
new{Student=1, Subject="Chemistry", Marks=80},
new{Student=1, Subject="Biology", Marks=90},
new{Student=2, Subject="English", Marks=4},
new{Student=2, Subject="Maths", Marks=5},
new{Student=2, Subject="Science", Marks=6},
new{Student=2, Subject="Physics", Marks=7},
new{Student=2, Subject="Chemistry", Marks=8},
new{Student=2, Subject="Biology", Marks=9}
};
/*Here the pivot column is the subject and the static column is student
group the data against the static column(s)*/
var groups = from d in data
group d by d.Student into grp
select new
{
StudentId = grp.Key,
Marks = grp.Select(d2 => new { d2.Subject, d2.Marks }).ToArray()
};
/*get all possible subjects into a separate group*/
var subjects = (from d in data
select d.Subject).Distinct();
DataTable dt = new DataTable();
/*for static cols*/
dt.Columns.Add("STUDENT_ID");
/*for dynamic cols*/
foreach (var subject in subjects)
{
dt.Columns.Add(subject.ToString());
}
/*pivot the data into a new datatable*/
foreach (var g in groups)
{
DataRow dr = dt.NewRow();
dr["STUDENT_ID"] = g.StudentId;
foreach (var mark in g.Marks)
{
dr[mark.Subject] = mark.Marks;
}
dt.Rows.Add(dr);
}
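The two steps used above (collect the distinct pivot values as column names, then fill one output row per static key) are not specific to LINQ. Here is a minimal Python sketch of the same approach applied to a few rows shaped like the question's data; the `pivot` helper and the tuple layout (pid, displayname, value) are assumptions made for this illustration.

```python
def pivot(rows):
    """Turn (pid, displayname, value) rows into a header plus one row per pid,
    with one column per distinct displayname in first-seen order."""
    columns = []   # dynamic column names, discovered from the data
    table = {}     # pid -> {displayname: value}
    for pid, name, value in rows:
        if name not in columns:
            columns.append(name)
        table.setdefault(pid, {})[name] = value
    header = ["pid"] + columns
    # A pid with no value for some column gets None in that cell.
    body = [[pid] + [cells.get(c) for c in columns] for pid, cells in table.items()]
    return header, body


# A few rows shaped like the query result in the question.
rows = [
    (14270790, "Low Price", 1.3614),
    (14270790, "High Price", 0),
    (14270791, "Low Price", 1.3525),
    (14270791, "High Price", 0),
]

header, body = pivot(rows)
print(header)
for line in body:
    print(line)
```

Because the column list is built from the data, an extra displayname such as Open or Close would simply appear as another column.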
This is the closest I could get, but it's not LINQ...
create table #t
(
pointid [int],
doublevalue [float],
title [nvarchar](50)
)
insert into #t
select
distinct top 100
v.pointid, v.doublevalue, p.title
from [property] p
inner join pointvalue v on p.propertyid = v.propertyid
inner join point pt on v.pointid = pt.pointid
where v.pointid in (select top 5 p.pointid from point p where p.instanceid = 36132)
declare @fields nvarchar(250)
set @fields = (select STUFF((SELECT N',[' + title + ']' FROM [property] FOR XML PATH('')), 1, 1, N''))
--select @fields
declare @sql nvarchar(500)
set @sql = 'select * from #t
pivot
(
sum(doublevalue)
for [title] in ('+@fields+')
) as alias'
--select @sql
exec (@sql)
drop table #t
The kicker is that I'm simply asking for every entry in the Property table, meaning there's a lot of columns, in the resulting pivot, which have NULL values.
I think the code would look like this:
var list = from table in Property
group table by table.pid into g
select new
{
pid = g.Key,
LowPrice = g.Where(w => w.pid == g.Key && w.priceType == "low").Select(s => s.value).FirstOrDefault(),
HighPrice = g.Where(w => w.pid == g.Key && w.priceType == "high").Select(s => s.value).FirstOrDefault(),
};
Hope it can help you and have a nice day.
