Attempting to follow the instructions for creating a dictionary using DDL:
-- source table
create table brands (
    id UInt64,
    brand String
)
ENGINE = ReplacingMergeTree(id)
partition by tuple()
order by id;
-- some data
insert into brands values (1, 'cool'), (2, 'neat'), (3, 'fun');
-- dictionary references source table
CREATE DICTIONARY IF NOT EXISTS brand_dict (
    id UInt64,
    brand String
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(
    host 'localhost'
    port 9000
    user 'default'
    password ''
    db 'default'
    table 'brands'
))
LIFETIME(MIN 1 MAX 10)
LAYOUT(FLAT());
-- looks good:
show dictionaries;
-- doesn't work:
-- Code: 36. DB::Exception: Received from localhost:9000. DB::Exception: external dictionary 'brand_dict' not found.
select dictGetString('brand_dict', 'brand', toUInt64(1));
Gives DB::Exception: external dictionary 'brand_dict' not found.
I haven't tried with an XML config yet, so I'm not sure whether this is DDL-specific or whether there's something else I'm doing wrong.
Dictionaries created with DDL require the database to be specified when you reference them:
dictGetString('DATABASE.brand_dict', ...)
UPD: Starting from 21.4, the functions dictGet and dictHas use the current database name if it is not specified for dictionaries created with DDL.
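For example, a minimal sketch assuming the dictionary above was created in the default database:
select dictGetString('default.brand_dict', 'brand', toUInt64(1)); -- qualify the name on versions before 21.4
select dictGetString('brand_dict', 'brand', toUInt64(1));         -- works on 21.4+ when the dictionary is in the current database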
Environment: ClickHouse version 22.3.3.44; database engine: Atomic.
I have a raw table and an MV with schemas like this:
CREATE TABLE IF NOT EXISTS test.Income_Raw on cluster '{cluster}' (
    Id Int64,
    DateNum Date,
    Cnt Int64,
    LoadTime DateTime
) ENGINE = MergeTree
PARTITION BY toYYYYMMDD(LoadTime)
ORDER BY (Id, DateNum);
CREATE MATERIALIZED VIEW test.Income_MV on cluster '{cluster}'
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/{shard}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(DateNum)
ORDER BY (Id, DateNum)
TTL DateNum + INTERVAL 100 DAY
AS SELECT
    DateNum,
    Id,
    argMaxState(Cnt, LoadTime) AS Cnt,
    maxState(LoadTime) AS latest_loadtime
FROM test.Income_Raw
GROUP BY Id, DateNum;
Now I want to add a column named 'Price' to the raw table and the MV,
so I run SQL step by step like below:
// first I alter the raw table
1. alter table test.Income_Raw on cluster '{cluster}' add column Price Int32
// then I run the SQL below to alter the MV
2. detach table test.Income_MV on cluster '{cluster}'
3. alter table test.`.inner_id.{uuid}` on cluster '{cluster}' add column Price Int32
// in step 4 I basically replace 'create' with 'attach' and add 'Price' to the select query
4. attach MATERIALIZED VIEW test.Income_MV on cluster '{cluster}'
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/{shard}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(DateNum)
ORDER BY (Id, DateNum)
TTL DateNum + INTERVAL 100 DAY
AS SELECT
    DateNum,
    Id,
    Price,
    argMaxState(Cnt, LoadTime) AS Cnt,
    maxState(LoadTime) AS latest_loadtime
FROM test.Income_Raw
GROUP BY Id, DateNum, Price;
but at step 4 I hit this error:
Code: 80. DB::Exception: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE Income_MV;
2. CREATE TABLE Income_MV <table definition>;
3. ATTACH TABLE Income_MV FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE Income_MV UUID '<uuid>' <table definition>;. (INCORRECT_QUERY) (version 22.3.3.44 (official build))
The SQL I ran follows the references below:
https://kb.altinity.com/altinity-kb-schema-design/materialized-views/
Clickhouse altering materialized view's select
So my question is: how do I modify the MV's SELECT query? Which step did I get wrong?
I figured out that you just need to:
prepare: use an explicit target table for the MV (the TO clause) instead of the inner table
1. alter the MV target table
2. drop the MV
3. re-create the MV with the new query
A sketch of this approach is below.
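A minimal sketch of the explicit-target-table setup, reusing the tables above (the target table name test.Income_Agg is an assumption):
CREATE TABLE test.Income_Agg on cluster '{cluster}' (
    Id Int64,
    DateNum Date,
    Price Int32,
    Cnt AggregateFunction(argMax, Int64, DateTime),
    latest_loadtime AggregateFunction(max, DateTime)
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/{shard}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(DateNum)
ORDER BY (Id, DateNum)
TTL DateNum + INTERVAL 100 DAY;

CREATE MATERIALIZED VIEW test.Income_MV on cluster '{cluster}'
TO test.Income_Agg
AS SELECT
    DateNum,
    Id,
    Price,
    argMaxState(Cnt, LoadTime) AS Cnt,
    maxState(LoadTime) AS latest_loadtime
FROM test.Income_Raw
GROUP BY Id, DateNum, Price;
-- later schema changes only need: ALTER the target table, DROP the MV, then CREATE the MV again with the new SELECT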
I'm trying to do CRUD operations in Hive. I can run insert queries successfully, but when I try to run update and delete I get the exception below.
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
List of the queries I ran
CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2))
CLUSTERED BY (age) INTO 2 BUCKETS STORED AS ORC;
INSERT INTO TABLE students
VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);
CREATE TABLE pageviews (userid VARCHAR(64), link STRING, came_from STRING)
PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC;
INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')
VALUES ('jsmith', 'mail.com', 'sports.com'), ('jdoe', 'mail.com', null);
INSERT INTO TABLE pageviews PARTITION (datestamp)
VALUES ('tjohnson', 'sports.com', 'finance.com', '2014-09-23'), ('tlee', 'finance.com', null, '2014-09-21');
Source : https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Delete
Update and delete queries I'm trying to run
update students1 set age = 36 where name ='barney rubble';
update students1 set name = 'barney rubble1' where age =36;
delete from students1 where age=32;
Hive version: 2.1 (latest at the time)
Note: I'm aware that Hive is not meant for UPDATE and DELETE on big data sets; I'm still trying them to get familiar with Hive CRUD operations.
Can someone point out where I'm going wrong with the update/delete queries?
Make sure you are setting the properties listed here:
https://community.hortonworks.com/questions/37519/how-to-activate-acid-transactions-in-hive-within-h.html
I tested in Hive 1.1.0 (CDH 5.8.3) and it works with the same example you provided. A sketch of the usual settings follows.
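For reference, these are the ACID-related settings commonly required (a hedged sketch; the exact list is on the linked page, and the table itself must also be declared transactional):
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;  -- Hive 1.x only; removed in Hive 2.x, where bucketing is always enforced
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;
-- the table must be bucketed, stored as ORC, and explicitly transactional
CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2))
CLUSTERED BY (age) INTO 2 BUCKETS STORED AS ORC
TBLPROPERTIES ('transactional'='true');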
I have created a table with a partition:
CREATE TABLE edw_src.pageviewlog_dev
(
accessurl character varying(1000),
msisdn character varying(1000),
customerid integer
)
WITH (
OIDS=FALSE
)
DISTRIBUTED BY (msisdn)
PARTITION BY RANGE(customerid)
(
PARTITION customerid START (0) END (200)
);
Now I want to change the size of accessurl from 1000 to 3000. I am not able to change it; whenever I try, I get this error:
ERROR: "pageviewlog_dev_1_prt_customerid" is a member of a partitioning configuration
HINT: Perform the operation on the master table.
I am able to change it if I change the data type via pg_attribute. Is there any other way to change the size of an existing column, other than pg_attribute?
I have found the solution. Sorry for replying late. Below is the way to do it whenever we face this kind of problem in PostgreSQL and Greenplum:
UPDATE pg_attribute SET atttypmod = 300+4
WHERE attrelid = 'edw_src.ivs_hourly_applog_events'::regclass
AND attname = 'adtransactionid';
Greenplum isn't PostgreSQL, so please don't confuse people by asking a Greenplum question with PostgreSQL in the title.
Don't modify catalog objects like pg_attribute. That will cause lots of problems and isn't supported.
The Admin Guide has the syntax for changing column datatypes and this is all you need to do:
ALTER TABLE edw_src.pageviewlog_dev
ALTER COLUMN accessurl TYPE character varying(3000);
Here is the working example with your table:
CREATE SCHEMA edw_src;
CREATE TABLE edw_src.pageviewlog_dev
(
accessurl character varying(1000),
msisdn character varying(1000),
customerid integer
)
WITH (
OIDS=FALSE
)
DISTRIBUTED BY (msisdn)
PARTITION BY RANGE(customerid)
(
PARTITION customerid START (0) END (200)
);
Output:
NOTICE: CREATE TABLE will create partition "pageviewlog_dev_1_prt_customerid" for table "pageviewlog_dev"
Query returned successfully with no result in 47 ms.
And now alter the table:
ALTER TABLE edw_src.pageviewlog_dev
ALTER COLUMN accessurl TYPE character varying(3000);
Output:
Query returned successfully with no result in 62 ms.
Proof in psql:
\d edw_src.pageviewlog_dev
Table "edw_src.pageviewlog_dev"
Column | Type | Modifiers
------------+-------------------------+-----------
accessurl | character varying(3000) |
msisdn | character varying(1000) |
customerid | integer |
Number of child tables: 1 (Use \d+ to list them.)
Distributed by: (msisdn)
If you are unable to alter the table, it is probably because the catalog is corrupted after you updated pg_attribute directly. You can try dropping the table and recreating it (a sketch follows), or you can open a support ticket to have support attempt to correct the catalog corruption.
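One possible recovery path, sketched with the table above (the backup table name is an assumption):
-- back up the data, drop the possibly-corrupted table, recreate it with the new column size, reload
CREATE TABLE edw_src.pageviewlog_dev_backup AS
SELECT * FROM edw_src.pageviewlog_dev
DISTRIBUTED BY (msisdn);

DROP TABLE edw_src.pageviewlog_dev;

CREATE TABLE edw_src.pageviewlog_dev
(
    accessurl character varying(3000),
    msisdn character varying(1000),
    customerid integer
)
WITH (OIDS=FALSE)
DISTRIBUTED BY (msisdn)
PARTITION BY RANGE(customerid)
(
    PARTITION customerid START (0) END (200)
);

INSERT INTO edw_src.pageviewlog_dev
SELECT * FROM edw_src.pageviewlog_dev_backup;

DROP TABLE edw_src.pageviewlog_dev_backup;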
I've already tried a tool named TOYS. It's free, but unfortunately it didn't work.
Then I tried "RED-Gate Schema Compare for Oracle", but it drops and recreates the table, whereas I just need to alter the table with the newly added/dropped columns.
Any help is highly appreciated.
Thanks
Starting from Oracle 11g you can use the dbms_metadata_diff package, specifically its compare_alter() function, to compare the metadata of two schema objects:
Schema #1 HR
create table tb_test(
col number
)
Schema #2 HR2
create table tb_test(
col_1 number
)
select dbms_metadata_diff.compare_alter(
         'TABLE'    -- schema object type
       , 'TB_TEST'  -- object name in schema #1
       , 'TB_TEST'  -- object name in schema #2
       , 'HR'       -- schema #1 (defaults to the current schema)
       , 'HR2'      -- schema #2
       ) as res
  from dual;
Result:
RES
-------------------------------------------------
ALTER TABLE "HR"."TB_TEST" ADD ("COL_1" NUMBER);
ALTER TABLE "HR"."TB_TEST" DROP ("COL");
I'm using Ruby Sequel (ORM gem) to connect to a Postgres database. I'm not using any models. My insert statements seem to have "RETURNING NULL" appended to them automatically (and thus won't return the newly inserted row id/pk). What's the use of this? Why is it the default? And more importantly, how do I disable it (connection-wide)?
Also, I noticed there's a dataset.returning method, but it doesn't seem to work!
require 'sequel'
db = Sequel.connect 'postgres://user:secret@localhost/foo'
tbl = "public__bar".to_sym #dynamically generated by the app
dat = {x: 1, y: 2}
id = db[tbl].insert(dat) #generated sql -- INSERT INTO "public"."bar" ("x", "y") VALUES (1, 2) RETURNING NULL
Don't know if it matters but the table in question is inherited (using postgres table inheritance)
ruby 1.9.3p392 (2013-02-22) [i386-mingw32]
sequel (3.44.0)
--Edit 1 -- After a bit of troubleshooting--
It looks like the table inheritance COULD BE the problem here. Sequel seems to run a query automatically to determine the pk of a table (in my case the pk is defined on a table up the chain); not finding one, it perhaps appends the "RETURNING NULL"?
SELECT pg_attribute.attname AS pk
FROM pg_class, pg_attribute, pg_index, pg_namespace
WHERE pg_class.oid = pg_attribute.attrelid
  AND pg_class.relnamespace = pg_namespace.oid
  AND pg_class.oid = pg_index.indrelid
  AND pg_index.indkey[0] = pg_attribute.attnum
  AND pg_index.indisprimary = 't'
  AND pg_class.relname = 'bar'
  AND pg_namespace.nspname = 'public'
--Edit 2--
Yup, looks like that's the problem!
If you are using PostgreSQL inheritance please note that the following are not inherited:
Primary Keys
Unique Constraints
Foreign Keys
In general you must declare these on each child table. For example, if you do:
CREATE TABLE my_parent (
id bigserial primary key,
my_value text not null unique
);
CREATE TABLE my_child() INHERITS (my_parent);
INSERT INTO my_child(id, my_value) values (1, 'test');
INSERT INTO my_child(id, my_value) values (1, 'test'); -- works, no error thrown
What you want instead is to do this:
CREATE TABLE my_parent (
id bigserial primary key,
my_value text not null unique
);
CREATE TABLE my_child(
primary key(id),
unique(my_value)
) INHERITS (my_parent);
INSERT INTO my_child(id, my_value) values (1, 'test');
INSERT INTO my_child(id, my_value) values (1, 'test'); -- unique constraint violation thrown
This sounds to me like you have some urgent DDL issues to fix.
You could retrofit the second version's constraints onto the first with:
ALTER TABLE my_child ADD PRIMARY KEY(id);
ALTER TABLE my_child ADD UNIQUE (my_value);
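After the primary key is added, the pk-lookup query Sequel runs (quoted in the question) should find it, so inserts can return the new id instead of RETURNING NULL. A quick check, reusing that query against my_child (assuming it lives in the public schema):
SELECT pg_attribute.attname AS pk
FROM pg_class, pg_attribute, pg_index, pg_namespace
WHERE pg_class.oid = pg_attribute.attrelid
  AND pg_class.relnamespace = pg_namespace.oid
  AND pg_class.oid = pg_index.indrelid
  AND pg_index.indkey[0] = pg_attribute.attnum
  AND pg_index.indisprimary = 't'
  AND pg_class.relname = 'my_child'
  AND pg_namespace.nspname = 'public';
-- expected to return: id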