is there any way to find out the list of all error tables associated with each external table.
Actual Requirement: I am using External tables in Greenplum and data coming from source in form of files,data ingestion to Greenplum via external tables. and I want to report all the rejected rows to source system
Regards,
Gurupreet
http://gpdb.docs.pivotal.io/4340/admin_guide/load/topics/g-viewing-bad-rows-in-the-error-table-or-error-log.html
You basically just use the built-in function gp_read_error_log() and pass in the external table name to get the errors associated with the files. There is an example in the above link too.
The field fmterrtbl of pg_exttable contains the oid of the error table for any external table. So the query to find the error table for all external tables in the database is:
SELECT
external_namespace.nspname AS external_schema, external_class.relname AS external_table,
error_namespace.nspname AS error_schema, error_class.relname AS error_table
FROM pg_exttable AS external_tables
INNER JOIN pg_class AS external_class ON external_class.oid = external_tables.reloid
INNER JOIN pg_namespace AS external_namespace ON external_namespace.oid = external_class.relnamespace
LEFT JOIN (
pg_class AS error_class
INNER JOIN pg_namespace AS error_namespace ON error_namespace.oid = error_class.relnamespace
) ON error_class.oid = external_tables.fmterrtbl
the error_schema and error_table fields will be NULL for external tables with no error tables.
Related
Requirement is as below:
Table 1 - Order(ID, orderId, sequence, reference, valueDate, date)
Table 2 - Audit(AuditId, orderId, sequence, status, updatedBy, lastUpdatedDateTime)
For each record in ORDER table, there could be one or many rows in AUDIT table.
User is given a search form where he could enter any of the search params including status from audit table.
I need a way to join these two tables with dynamic where clause and get the top row for a given ORDER from AUDIT table.
Something like below:
select * from
ORDER t1, AUDIT t2
where t1.orderid = t2.orderId
and t1.sequence = t2.sequence
and t2.status = <userProvidedStatusIfAny>
and t2.lastUpdatedDateTime in (select max(t3.lastUpdatedDateTime) --getting latest record
AUDIT t3 where t1.orderid = t3.orderId
and t1.sequence = t3.sequence)
and t1.valudeDate = <userProvidedDataIfAny> ..... so on
Please help
-I cant do this with #Query because i have to form dynamic where clause
-#OneToMany fetches all the records in audit table for given orderId and sequence
-#Formula in ORDER table works but for only one column in select. i need multiple values from AUDIT table
Please help
I'm completely new to Hive and Stack Overflow. I'm trying to create a table with complex data type "STRUCT" and then populate it using INSERT INTO TABLE in Hive.
I'm using the following code:
CREATE TABLE struct_test
(
address STRUCT<
houseno: STRING
,streetname: STRING
,town: STRING
,postcode: STRING
>
);
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('123', 'GoldStreet', London', W1a9JF') AS address
FROM dummy_table
LIMIT 1;
I get the following error:
Error while compiling statement: FAILED: semanticException [Error
10044]: Cannot insert into target because column number type are
different 'struct_test': Cannot convert column 0 from struct to
array>.
I was able to use similar code with success to create and populate a data type Array but am having difficulty with Struct. I've tried lots of code examples I've found online but none of them seem to work for me... I would really appreciate some help on this as I've been stuck on it for quite a while now! Thanks.
your sql error. you should use sql:
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno','123','streetname','GoldStreet', 'town','London', 'postcode','W1a9JF') AS address
FROM dummy_table LIMIT 1;
You can not insert complex data type directly in Hive.For inserting structs you have function named_struct. You need to create a dummy table with data that you want to be inserted in Structs column of desired table.
Like in your case create a dummy table
CREATE TABLE DUMMY ( houseno: STRING
,streetname: STRING
,town: STRING
,postcode: STRING);
Then to insert in desired table do
INSERT INTO struct_test SELECT named_struct('houseno',houseno,'streetname'
,streetname,'town',town,'postcode',postcode) from dummy;
No need to create any dummy table : just use command :
insert into struct_test
select named_struct("houseno","house_number","streetname","xxxy","town","town_name","postcode","postcode_name");
is Possible:
you must give the columns names in sentence from dummy or other table.
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno','123','streetname','GoldStreet', 'town','London', 'postcode','W1a9JF') AS address
FROM dummy
Or
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno',tb.col1,'streetname',tb.col2, 'town',tb.col3, 'postcode',tb.col4) AS address
FROM table1 as tb
CREATE TABLE IF NOT EXISTS sunil_table(
id INT,
name STRING,
address STRUCT<state:STRING,city:STRING,pincode:INT>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '.';
INSERT INTO sunil_table 1,"name" SELECT named_struct(
"state","haryana","city","fbd","pincode",4500);???
how to insert both (normal and complex)data into table
I have a table pos.pos_inv in hdfs which is partitioned by yyyymm. Below is the query:
select DATE_ADD(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),5),
to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),yyyymm
from pos.pos_inv inv
INNER JOIN pos.POSActvyBrdg Brdg ON Brdg.EIS_POSActvyBrdgId = Inv.EIS_POSActvyBrdgId
where to_date(from_unixtime(unix_timestamp(Inv.nrmlzdwkenddt, 'MM/dd/yyyy')))
BETWEEN DATE_SUB(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),6)
and DATE_ADD(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),6)
and inv.yyyymm=201501
I have provided the partition value for the query as 201501, but still i get the error"
Error while compiling statement: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "inv" Table "pos_inv"
(schema)The partition, yyyymm is int type and actvydt is date stored as string type.
This happens because hive is set to strict mode.
this allow the partition table to access the respective partition /folder in hdfs .
set hive.mapred.mode=unstrict; it will work
In your query error it is said: No partition predicate found for Alias "inv" Table "pos_inv".
So you must put the where clause for the fields of the partitioned table (for pos_inv), and not for the other one (inv), as you've done.
set hive.mapred.mode=unstrict allows you access the whole data rather than the particular partitons. In some case read whole dataset is necessary, such as: rank() over
This happens when the 2 tables have the same column name (possibly same partition column). Try to deal with separate tables with where condition like below
WITH tableA as
(
-- All your where clause here
),
tableB AS
(
-- All your where clause here
)
select tableA.*, tableB.*
I'm trying to insert all data from Pacient table to Pacient_OR table (Object-Relational). Is there a simple way to do that (one script), if Pacient table has column with Pojistovna_ID (foreign key) and in Pacient_OR table there is REF to the Pojistovna_OR. Both Pojistovna and Pojistovna_OR are populated with the same data, but one is relational, second is based on object type.
I tried this (and more ofc):
INSERT INTO pacient_or
(pacient_or.id,
pacient_or.jmeno,
pacient_or.prijmeni,
pacient_or.datum_narozeni,
pacient_or.rodne_cislo,
pacient_or.telefon,
pacient_or.krevni_skupina,
pacient_or.rodinna_anamneza,
pacient_or.adresa,
pacient_or.pojistovna)
SELECT pacient.id,
pacient.jmeno,
pacient.prijmeni,
pacient.datum_narozeni,
pacient.rodne_cislo,
pacient.telefon,
pacient.krevni_skupina,
pacient.rodinna_anamneza,
Adresa_typ(pacient.ulice, pacient.mesto, pacient.psc),
(SELECT Ref(poj)
FROM pacient pac,
pojistovna_or poj
WHERE pac.pojistovna_id = poj.id)
FROM pacient;
This code throws error:
single-row subquery returns more than one row
Do not use pacient pac in the subquery. Link it to the pacient in your main from-clause. And even better do not use a subquery for this.
I am having troubles when i want to create a named calculation from two different tables.
I have the table "CallesDim" with an id(PK) and a description and the table "UbicacionesDim" with an id (PK), another id (FK to "CallesDim") and a description:
--
CallesDim
id PK
Descripcion VARCHAR
--
UbicacionesDim
id PK
CalleId FK to id from CallesDIM
Altura INT
--
I want to concatenate "Descripcion" from "CallesDim" with Altura from "UbicacionesDim".
I try doing this:
CallesDim.Descripcion + ' ' + CONVERT(VARCHAR,UbicacionesDim.Altura)
but i am having the following error:
the multi-part identifier "CallesDim.Descripcion" could not be bound
Any ideas?
Thanks!
In a named calculation you can only access columns from the table that it is defined on.
Which record of the other table should it take in case it would accept columns from other tables? How should it join? All this cannot be configured.
If you need to join two (or more) tables, you can define a named query that can contain joins and access as many tables as you like. A named query can contain everything that you can state in a single select statement.