Create a key-value table using Apache Ignite sqlline - caching

I am trying to create a table using the Apache Ignite 2.9.1 sqlline option. Its value part should hold a collection of generic key-value pairs; this is very important to me.
I want to create the following type of table.
I want to know: can I do this using the Apache Ignite sqlline option?
Please help me understand this.

Tables always have specific types in Apache Ignite. Caches may be of <Object, Object>, but tables (which are also backed by caches) have specific column types. All columns in the PRIMARY KEY are considered key columns.
This example sums it up nicely:
CREATE TABLE IF NOT EXISTS Person (
  id int,
  city_id int,
  name varchar,
  age int,
  company varchar,
  PRIMARY KEY (id, city_id)
) WITH "template=partitioned,backups=1,affinity_key=city_id,key_type=PersonKey,value_type=MyPerson";

Related

How different is this from creating a primary key on a column in Hive?

I read that we cannot create a primary key on a column in a Hive table. But I saw the DDL below somewhere else and executed it, and it worked without any problem.
create table prim(id int, name char(30))
TBLPROPERTIES("PRIMARY KEY"="id");
After this I executed "describe formatted prim" and saw that a key had been created on the column ID:
Table Parameters:
PRIMARY KEY id
I inserted two records with the same ID into the table:
insert into prim values(1,'ABCD');
insert into prim values(1,'EFGH');
Both records were inserted into the table. What baffles me is this: we cannot give the PRIMARY KEY in the create statement, which I can understand, but when it is given in TBLPROPERTIES("PRIMARY KEY"="id"), how different is it from a primary key in an RDBMS?
PRIMARY KEY in TBLPROPERTIES is only a metadata reference that preserves the column's significance. It does not apply any constraint on that column; it can be used as a reference from a design perspective.
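To see that the property is nothing more than table metadata, you can read it back with Hive's SHOW TBLPROPERTIES (a quick illustration using the prim table from the question):
SHOW TBLPROPERTIES prim;
SHOW TBLPROPERTIES prim("PRIMARY KEY");
Hive does not enforce it anywhere in the write path, which is why the duplicate inserts above succeed.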

Update or insert into a Postgres table without any primary key

I have a table which needs to be ingested from an Oracle source to a Greenplum target using the ETL tool Talend. The table is huge, hence we want to load the data incrementally on a daily basis. The table doesn't have any primary or unique key.
The table has a date column, so I am able to get both inserted and updated records since the last update date, but to upsert that data we need a primary key.
Any solution on how to load the data without using a primary key?
You need to define your key in Talend in the schema of the component that inserts into your target table.
You can then use this key to update your table: in the advanced settings of the same component, activate the "Use field options" checkbox and select your key.
This was tested against an Oracle table that has no primary key and worked fine; it should work for you as well.
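If you prefer to handle the upsert on the database side instead of in the Talend component, a common pattern is to land the daily increment in a staging table and replace the matching rows in one transaction. A rough sketch, assuming hypothetical business columns col1 and col2 that identify a row:
BEGIN;
-- drop target rows that the new increment will replace
DELETE FROM target t
USING stage s
WHERE t.col1 = s.col1
  AND t.col2 = s.col2;
-- insert the full increment (new and changed rows)
INSERT INTO target
SELECT * FROM stage;
COMMIT;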

Cassandra timeout during read query (19 million results) at consistency ONE

I have a Cassandra cluster with 2 nodes, and my table structure is <key, Map<list, timestamp>>. I am trying to fetch all rows whose map contains a given key. My query looks like:
Statement select = QueryBuilder.select().all().from(tableName)
        .where(QueryBuilder.containsKey("list", value));
select.setFetchSize(50000);
but I am getting a Cassandra timeout during the read query.
I can decrease setFetchSize, but then it takes too much time to process 19 million rows.
Can anyone please suggest the correct way to solve this problem?
Is there any alternative available for this kind of problem?
Cassandra version = Cassandra 2.2.1
Cassandra data modeling best practices recommend not using collections (list, set, map) to store a massive amount of data. The reason is that when loading a CQL row (SELECT ... WHERE id=xxx), the Cassandra server has to load the entire collection into memory.
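For reference, the layout described in the question would look something like the sketch below (the map key type is assumed to be int, matching the denormalized schemas that follow):
-- the entire "list" map is materialized server-side whenever a row is read
CREATE TABLE big_map_table (
  key text PRIMARY KEY,
  list map<int, timestamp>
);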
Now to answer your questions:
Can anyone please suggest the correct way to solve this problem?
Using a secondary index to retrieve a huge data set (19 million rows) isn't the best approach for your problem.
If your requirement is "give me all rows whose collection contains a given item", the following schemas may be more appropriate:
Solution 1: manual denormalization
CREATE TABLE base_table(
  id text,
  key int,
  value timestamp,
  PRIMARY KEY(id, key)
);
CREATE TABLE denormalized_table_for_searching(
  key int,
  id text,
  value timestamp,
  PRIMARY KEY(key, id)
);
// Give me all pairs (id, value) where key = xxx
// Use an iterator to fetch the data page by page; do not load 19 million rows at once!
SELECT * FROM denormalized_table_for_searching WHERE key=xxx;
Solution 2: automatic denormalization with Cassandra 3.0 materialized views
CREATE TABLE base_table(
  id text,
  key int,
  value timestamp,
  PRIMARY KEY(id, key)
);
CREATE MATERIALIZED VIEW denormalized_table_for_searching
AS SELECT * FROM base_table
WHERE id IS NOT NULL AND key IS NOT NULL
PRIMARY KEY(key, id);
// Give me all pairs (id, value) where key = xxx
// Use an iterator to fetch the data page by page; do not load 19 million rows at once!
SELECT * FROM denormalized_table_for_searching WHERE key=xxx;
Is there any alternative available for this kind of problem?
See the answer to point 1 above :)

Auto-increment IDs in a Cassandra database

How do you create a unique ID in Cassandra column families (like an auto-increment ID in a MySQL database)?
For unique IDs in Cassandra, you'll want to use UUIDs, which are probabilistically (nearly) guaranteed to be unique. There are a few built-in functions in CQL to help with UUIDs.
You asked for a simple query to create a table with a uuid, so there you are:
create:
CREATE TABLE event (uuid uuid PRIMARY KEY, name varchar);
insert:
INSERT INTO event(uuid, name) values(uuid(), 'john');
select:
SELECT * FROM event LIMIT 1;
to select on the name column you must add a new index:
CREATE INDEX idx_event_name ON event(name);
and now you can select with a WHERE clause:
SELECT * FROM event WHERE name = 'john';
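If time-ordered IDs would be useful, CQL also has the timeuuid type together with the now() and dateOf() functions. A quick sketch along the same lines (the event_log table name is illustrative):
CREATE TABLE event_log (id timeuuid PRIMARY KEY, name varchar);
INSERT INTO event_log (id, name) VALUES (now(), 'john');
-- dateOf() extracts the timestamp embedded in the timeuuid
SELECT dateOf(id), name FROM event_log;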
If you really need integer auto-increment IDs, I've written a simple Python module that does that, after going through Stack Overflow and not seeing anything decent for that specific function. If you don't care about the ID being an integer, you're better off using something like a UUID, which is probably safer and more elegant.
link: https://github.com/qdatum/globalcounter

Can we create a function-based primary key in Oracle 10?

There is a requirement in our application to create a unique primary key which depends on the value of another unique column (ERROR_CODE). But our application runs in a geo active-active environment (several active databases which are synchronized by another program).
Therefore, even though we have a unique constraint on the ERROR_CODE field, each database may end up with a row that has a different PK for the same ERROR_CODE. During database synchronization this is a problem, because some child table rows reference the PK stored in one DB while other rows reference the PK stored in the other DB. Because of the unique constraint on ERROR_CODE, the sync process cannot move both rows to each database (which would also not be a good thing to do).
So there is a suggestion to use the hash of the ERROR_CODE field as the PK value.
I would like to know whether we can define a function-based primary key in Oracle.
If the PK field is "ID",
"ID" should be equal to ora_hash(ERROR_CODE).
Is it possible to define the primary key like that in Oracle?
In Oracle 10 you cannot do this, but in Oracle 11 you can: you create a virtual column, and such columns can also be used as a primary key:
ALTER TABLE MY_TABLE ADD (ID NUMBER GENERATED ALWAYS AS (ORA_HASH(ERROR_CODE)) VIRTUAL);
ALTER TABLE MY_TABLE ADD CONSTRAINT my_table_pk PRIMARY KEY (ID) USING INDEX;
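On Oracle 10 itself, a rough workaround in the same spirit (a sketch, not from the answer: a regular column kept in sync by a trigger, with illustrative names) would be:
ALTER TABLE my_table ADD (id NUMBER);
CREATE OR REPLACE TRIGGER my_table_id_trg
BEFORE INSERT OR UPDATE OF error_code ON my_table
FOR EACH ROW
BEGIN
  -- derive the ID from ERROR_CODE, mirroring the virtual-column approach
  -- (ORA_HASH is a SQL function, so it is invoked via SELECT ... FROM dual)
  SELECT ORA_HASH(:NEW.error_code) INTO :NEW.id FROM dual;
END;
/
ALTER TABLE my_table ADD CONSTRAINT my_table_pk PRIMARY KEY (id) USING INDEX;
Note that ORA_HASH can collide for distinct inputs, so a primary key derived from it this way is not guaranteed to stay unique.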
