GENERIC_INTERNAL_ERROR: Field new_test2 not found in log schema. Query cannot proceed! Derived Schema Fields:

There's a Hudi table written as Parquet files in S3 that I am trying to query with Athena. At first it worked fine, but after I added a column and queried the table again I got this error:
GENERIC_INTERNAL_ERROR: Field new_test2 not found in log schema. Query cannot proceed! Derived Schema Fields:
Yet when I query the same table using spark.sql it works fine. I don't know why this is happening; as far as I understand, Athena can handle schema changes, so why does it say the column doesn't exist?
Also, if I try to alter the table to add the column, I get a duplicate-column error, which means Athena can see it.
PS: this error happens on Athena engine v3, but when I set Athena to automatic it works fine.
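For context, a minimal sketch of the sequence that produces this behaviour (database, table, and column names other than new_test2 are hypothetical, and the error text in the comments is paraphrased from the message above):

-- Spark SQL: the column is added to the Hudi table and is immediately queryable there
ALTER TABLE hudi_db.my_hudi_table ADD COLUMNS (new_test2 STRING);
SELECT id, new_test2 FROM hudi_db.my_hudi_table LIMIT 10;   -- works via spark.sql

-- Athena engine v3: the same query fails
SELECT id, new_test2 FROM hudi_db.my_hudi_table LIMIT 10;
-- GENERIC_INTERNAL_ERROR: Field new_test2 not found in log schema. Query cannot proceed!

-- ...yet trying to add the column from Athena reports that it already exists
ALTER TABLE hudi_db.my_hudi_table ADD COLUMNS (new_test2 string);
-- error: duplicate column new_test2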

Related

An error occurred when using Hive to query ES

I created a Hive external table to query existing data in ES, like below:
CREATE EXTERNAL TABLE ods_es_data_inc (
  `agent_id` STRING,
  `dt_server_time` TIMESTAMP
)
COMMENT 'bb_i_app'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'data*',
  'es.nodes' = 'ip',
  'es.port' = 'port',
  'es.net.http.auth.user' = 'user',
  'es.net.http.auth.pass' = 'pass'
);
When I query the date field in the Hive external table, I get the error below:
Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.TimestampWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritableV2 (state=,code=0)
My situation is very similar to this problem, but I already used the TIMESTAMP type when creating the external table.
My component version:
Hive:3.1.0
ES-Hadoop:6.8.7
Elasticsearch:6.7.1
I switched Hive's execution engine from MR to Spark, but the error did not change. Having ruled out the execution engine, I don't know whether this is a version mismatch or a problem with how the table was created.
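Not from the original post, but one workaround sometimes used for this kind of ClassCastException (the error shows the storage handler producing the old TimestampWritable while Hive 3 expects TimestampWritableV2) is to map the ES date field as a string and convert it at query time; a sketch, reusing the table definition above:

CREATE EXTERNAL TABLE ods_es_data_inc (
  `agent_id` STRING,
  `dt_server_time` STRING   -- read the ES date as a plain string to avoid the Writable cast
)
COMMENT 'bb_i_app'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'data*',
  'es.nodes' = 'ip',
  'es.port' = 'port',
  'es.net.http.auth.user' = 'user',
  'es.net.http.auth.pass' = 'pass'
);

-- convert back to a timestamp where one is needed; the exact expression
-- depends on the date format stored in ES
SELECT agent_id, CAST(dt_server_time AS TIMESTAMP) AS dt_server_time
FROM ods_es_data_inc;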

How to obtain BLOB data for insertion in Oracle?

Hello friends, I have a problem with the BLOB data type. I want to migrate some data from one database to another, but I haven't managed to do so for some tables that have BLOB columns. What I have tried is to export a single record in the following way.
First I select the record I want to export to my other database:
select TEMPLATE_DOCUMENT_ID, blob_file from example_table where template_document_id = 32;
Then I export the result from the client, configured to generate an INSERT script, which gives me a script with the data of the record I want to migrate. When I run that script it gives me the following error:
Error report -
ORA-01465: invalid hex number
Do you have any idea how I could get the data in the correct form to make my insert?
Note: the migration is from one Oracle database to another Oracle database.
Obviously the source database is Oracle, but you did not mention the target database. If it is Oracle as well, I would suggest using the Oracle Data Pump tools (expdp/impdp). The documentation is here: https://docs.oracle.com/cd/B19306_01/server.102/b14215/dp_overview.htm
In case you need it, a feature I use quite often is the VIEWS_AS_TABLES option of the tool, as it allows me to export a subset of the data.
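Not part of the original answer, but another common way to move BLOB rows between two Oracle databases, without generating an INSERT script at all (the script route tends to encode the BLOB as a hex literal, which is where errors like ORA-01465 come from), is an INSERT ... SELECT over a database link; a sketch with hypothetical link, user, and TNS names:

-- on the target database: create a link to the source
CREATE DATABASE LINK src_link
  CONNECT TO source_user IDENTIFIED BY source_password
  USING 'SOURCE_TNS_ALIAS';

-- copy the row, BLOB included, with no hex literal in between
INSERT INTO example_table (template_document_id, blob_file)
SELECT template_document_id, blob_file
FROM example_table@src_link
WHERE template_document_id = 32;

COMMIT;

Depending on the Oracle versions involved there are restrictions on remote LOB access, so treat this as a sketch to verify rather than a guaranteed recipe.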

Hive-HBase integration: transactional update with timestamp

I am new to Hadoop and big data, and I'm currently exploring the possibility of moving my data store to HBase. I have come across a problem that some of you might be able to help me with.
I have an HBase table "hbase_testTable" with the column family "ColFam1". I have set the number of versions on "ColFam1" to 10, as I have to keep a history of up to 10 updates to this column family, which works fine. Adding new rows through the HBase shell with an explicit timestamp value also works fine; basically I want to use the timestamp as my version control, so I specify the timestamp as
put 'hbase_testTable', '1001', 'ColFam1:q1', '1000$', 3
where 3 is my version, and everything works fine.
Now I am trying to integrate this with a Hive external table, and I have the mappings set up to match the HBase table, like below:
CREATE EXTERNAL TABLE testtable (id string, q1 string, q2 string, q3 string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,colfam1:q1, colfam1:q2, colfam1:q3")
TBLPROPERTIES ("hbase.table.name" = "testtable", "transactional" = "true");
This works fine with normal insertion: it updates the HBase table and vice versa.
However, even though the external table is made "transactional", I am not able to update the data from Hive. It gives me an error:
FAILED: SemanticException [Error 10294]: Attempt to do update or delete
using transaction manager that does not support these operations
That said, any updates made to the HBase table are reflected immediately in the Hive table.
I can update the HBase table through the Hive external table by inserting a row with the same row key ("rowid") and new data for the column.
Is it possible to control the timestamp being written to the referenced HBase table (like 4, 5, 6, 7, etc.)? Please help.
The timestamp is an important element of HBase versioning. You are trying to supply your own timestamps, which works fine at the HBase level.
One point: you should be very careful that your custom timestamps are unique and non-negative. You can look at "Custom Versioning" in the HBase: The Definitive Guide book.
Now you have Hive on top of HBase. As per the documentation,
there is currently no way to access the HBase timestamp attribute, and queries always access data with the latest timestamp.
That's for the reading part. For putting data, you can look here.
It still says that you have to give a valid timestamp and not any other value.
Future versions are expected to expose the timestamp attribute.
I hope this gives you a better idea of how to deal with custom timestamps in Hive-HBase integration.
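To make the contrast concrete, a minimal sketch using the tables above (the behaviour described in the comments follows the documentation quoted in this answer; whether INSERT ... VALUES works against a storage-handler table depends on the Hive version, so this is an assumption to verify):

-- From Hive, an "update" is just another insert with the same row key; HBase stores
-- it as a new cell version, but the cell timestamp is assigned by the server,
-- not something Hive lets you choose.
INSERT INTO TABLE testtable VALUES ('1001', '1200$', NULL, NULL);

-- Reads through Hive always return only the latest version; cells written from the
-- HBase shell with explicit timestamps (3, 4, 5, ...) are not reachable from here.
SELECT * FROM testtable WHERE id = '1001';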

Populate indexed table in Oracle using Informatica

I'm new to both Oracle and Informatica.
I'm currently working on a small task where I need to select all records from a source table, filter them to keep only records where field1 = 'Y', and finally insert new rows into a target table that contains only the src.field2 and src.field3 values.
These two fields make up the primary key and the index of the target table.
I get an error in Informatica:
"ORA-26002: Table has index defined upon it"
I'd rather not drop the index; is there a workaround?
I've tried altering the index to UNUSABLE, but I got the same error.
Please advise.
Thanks.
Try to use Normal load mode instead of Bulk. You can set it in the session properties for the target.
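Not part of the original answer, but if Bulk (direct-path) load is needed for performance, another commonly used pattern is to drop the index in a pre-session SQL command and rebuild it in a post-session command; a sketch with hypothetical index and table names (if the index backs the primary key constraint, the constraint has to be disabled and re-enabled instead of a plain DROP/CREATE INDEX):

-- pre-session SQL: remove the index so the direct-path load can proceed
DROP INDEX target_table_ix;

-- ... the Informatica bulk load of field2/field3 runs here ...

-- post-session SQL: recreate the index once the load has finished
CREATE INDEX target_table_ix ON target_table (field2, field3);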

How to see the metadata of objects in a Vertica database (like DESC in Oracle)

I would like to know how I can get the metadata of an object in a Vertica database, such as the metadata of a table. Is there any table that stores the metadata of objects (functions, views, tables) in a Vertica database?
For example, in Oracle I could type the following and get a detailed description of a table or see the code of a procedure:
oracle :> desc table_name;
or
oracle :> edit proc_name;
I know that I can see the tables in my schemas with the \dt command, but is there any way I can see the DDL statements that created the objects?
Thanks, but this is not what I want. What I was looking for is the export_objects() function:
select export_objects('','object_name')
This way you will get the creation script for the object.
\d table-name should get you what you need.
Extra tip: If you specify only the schema, you will get all of the objects inside that schema. Sure beats having to enter a loop where you run export_objects() for every object.
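As a further illustration (not from the original answers), the same metadata is exposed through Vertica's v_catalog system schema, so it can also be queried with plain SQL; a sketch, assuming a table object_name in a schema my_schema (hypothetical names):

-- column-level metadata, roughly what Oracle's DESC shows
SELECT column_name, data_type, is_nullable
FROM v_catalog.columns
WHERE table_schema = 'my_schema'
  AND table_name = 'object_name';

-- DDL for a single object
SELECT export_objects('', 'my_schema.object_name');

-- DDL for every object in the schema
SELECT export_objects('', 'my_schema');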
