SparkException: Job aborted

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 76.0 failed 4 times, most recent failure: Lost task 5.3 in stage 76.0 (TID 2334) (10.139.64.5 executor 6): com.databricks.sql.io.FileReadException: Error while reading file <File_Path> It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved. If Delta cache is stale or the underlying files have been removed, you can invalidate Delta cache manually by restarting the cluster.

In addition to what the answer by
AbhishekKhandave-MT suggests, you can also try explicitly repairing the table:
FSCK REPAIR TABLE delta.`path/to/delta`
This also fixes scenarios where the underlying files of the table have actually been changed without the change being reflected in the "_delta_log" transaction log.

There are two things you can try for this error:
Refresh the table.
This invalidates the cached entries of the Apache Spark cache, which include the data and metadata of the given table or view. The invalidated cache is repopulated lazily the next time the cached table, or a query against it, is executed.
REFRESH [TABLE] table_name
Manually restart the cluster.
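For the first option, a minimal example (the table name sales_delta is hypothetical) looks like this:
-- Drop the cached data and metadata for one table; the cache is
-- repopulated lazily the next time the table is queried.
REFRESH TABLE sales_delta;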

Related

AWS DMS Error when trying to replicate Oracle to PostgreSQL

I'm trying to replicate several schemas in an Oracle database to a PostgreSQL database.
When the DMS task is started with the Full load, ongoing replication type, the task fails after some time while the tables are in the Before Load status. This is the error I'm getting when the task fails:
Last Error Task error notification received from subtask 0, thread 0 [reptask/replicationtask.c:2673] [1022301]
Oracle CDC stopped; Error executing source loop; Stream component failed at subtask 0,
component st_0_LBI2ND3ZI65BF6DQQYK4ITPYAY ; Stream component 'st_0_LBI2ND3ZI65BF6DQQYK4ITPYAY'
terminated [reptask/replicationtask.c:2680] [1022301] Stop Reason FATAL_ERROR Error Level FATAL
However, when the same tables are added to a task with the Full Load type, it works without any issue. The error occurs only when trying to run the task for replicating ongoing changes.
I tried searching for this error but couldn't find an exact reason. I have configured the endpoints properly, and both source and target endpoints have the required permissions for replicating changes. How can I get this resolved?
For the replication to work properly, you need to enable SUPPLEMENTAL LOGGING across all the required tables in your source DB.
This can be due to multiple reasons, although the basic cause remains the same: DMS is not able to read the logs in your Oracle database and it times out.
Before proceeding, I assume you have followed all the steps mentioned in the AWS documentation for CDC setup.
As mentioned in the answer above, supplemental logging should be enabled at the database level as well as for all columns and primary keys at the table level, for example:
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
ALTER TABLE schema_name.table_name ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
ALTER TABLE PCUSER.PC_POLICY ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
The log retention period should be long enough that CDC can read the logs before they are deleted. The AWS docs have a troubleshooting page for this issue.
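For example, if your source is an Amazon RDS for Oracle instance (an assumption; on a self-managed database the equivalent lever is your RMAN archived-log deletion policy), the retention window can be raised like this:
-- Keep archived redo logs for 24 hours so DMS CDC can read them
-- before they are purged (RDS for Oracle only)
exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention hours', 24);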
The DMS user that you are using should have read/write/alter access for all the schemas you are trying to read from. In my case this happened several times: after adding new tables to the schema I got this error again, because the user I was using did not have access to read the newly added tables.
It also depends on what you are using to mine the logs. If it is LogMiner, the setup is quite simple; for Binary Reader there are a few extra commands you need to execute, which are mentioned in the setup documentation.
Log in to the database using the same user you are using on DMS and check whether the redo logs exist:
SELECT * FROM V$ARCHIVED_LOG;
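If the full view is too wide to scan, a narrower variant (the column names are from the standard V$ARCHIVED_LOG view) confirms that recent logs exist for each destination:
-- Most recent archived logs first, with their destination IDs
SELECT name, dest_id, first_time, next_time
FROM V$ARCHIVED_LOG
ORDER BY first_time DESC;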
Also check the DEST_ID column in the result. As far as I have read, the default value on DMS is 0. You can check this for your database and set it in the extra connection attributes:
archivedLogDestId=1;
Check whether there are multiple DEST_IDs for your logs. For example, if you see the DEST_ID as 1, confirm using:
SELECT * FROM V$ARCHIVED_LOG WHERE dest_id NOT IN (1)
This should return nothing, but if it returns records, copy those extra
DEST_IDs and put them in the connection attribute below:
additionalArchivedLogDestId=[0,2,3, ...,n]
Finally, if this doesn't work, enable detailed debug logging. In our case LogMiner, and thus the DMS user, did not have access to read the redo logs.
A few extra connection attributes that I used for LogMiner may help you:
addSupplementalLogging=Y;useLogminerReader=Y;archivedLogDestId=1;additionalArchivedLogDestId=[0,2,3];ignoreTxnCtxValidityCheck=false;cdcTimeout=1200

IMPDP uses more disk space than expected

Background:
I've been tasked with importing a large amount of data from a production database to a test database (Oracle 12c Release 2 running on RHEL), and I'm using Data Pump.
The first time I imported the tables, the tables were created and the data was imported as planned, but, due to an issue in my Data Pump parameter file, the constraints were not imported.
My subsequent attempts did not go as well, however. Data Pump began to freeze partway through, and the STATUS command showed that no bytes were being processed.
My Solution Attempts:
I tried using TABLE_EXISTS_ACTION=REPLACE and dropping the tables directly after an attempt. I also dropped the master tables of any Data Pump jobs I was unable to terminate from the utility.
Still, it seemed to hang earlier and earlier in the process as I repeatedly tried to import the tables. df -h returned 100% disk usage every time it hung.
The dump file itself is on a separate drive, so it's not taking up room. I've been trying to clear out space, but it keeps filling up when I run a job and I can't tell where. Oracle flashbacks are disabled and I made sure to purge the Oracle recycle bin.
tl;dr:
Running impdp jobs seems to use up disk space beyond the added tables and the job master tables. Where is this space getting used up, and what can I do to clear it for a successful import?
I figured out the problem:
The database was in archivelog mode in order to set up Streams and Recovery Manager backups. As a result, impdp was generating a flood of archived redo logs recording the imported changes.
To clear out the old archives, I ran the following in RMAN for every database in noarchivelog mode on the server:
connect target /
run {
allocate channel c1 type disk;
delete force noprompt archivelog until time 'SYSDATE-30';
release channel c1;
}
This cleared up 60 gigabytes. I also added the parameter transform=disable_archive_logging:Y to my impdp parameter file, which suppresses archive logging while Data Pump imports run.
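For reference, here is a minimal sketch of an impdp parameter file using that transform (the directory object, dump file, and schema names are hypothetical):
# impdp.par -- invoke as: impdp <user> parfile=impdp.par
DIRECTORY=dpump_dir
DUMPFILE=prod_export.dmp
SCHEMAS=app_schema
TABLE_EXISTS_ACTION=REPLACE
TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y
Note that Oracle documents this transform as having no effect when the database runs in FORCE LOGGING mode, so check that setting if archive logs still pile up.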

After restarting the services, the impala tables are not coming up

After restarting the Impala server, we are not able to see the tables (i.e. the tables are not coming up). Can anyone tell me what order we have to follow to avoid this issue?
Thanks,
Srinivas
You should try running "invalidate metadata;" from impala-shell. This usually clears up tables not being visible, since Impala caches metadata.
From:
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_invalidate_metadata.html
The following example shows how you might use the INVALIDATE METADATA
statement after creating new tables (such as SequenceFile or HBase tables) through the Hive shell. Before the INVALIDATE METADATA statement was issued, Impala would give a "table not found" error if you tried to refer to those table names.
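As a minimal sketch (the database and table names are hypothetical), from impala-shell you can refresh everything or just one table:
-- Rebuild cached metadata for all tables on next access...
INVALIDATE METADATA;
-- ...or, more cheaply, only for a table created outside Impala
INVALIDATE METADATA my_db.my_table;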

SonarQube upgrade fails from v5.1 to v5.2

I am trying to upgrade SonarQube v5.1 to v5.2 and it fails with the error below:
ERROR web[o.s.s.d.m.DatabaseMigrator] Fail to execute database migration: org.sonar.db.version.v52.RemoveDuplicatedComponentKeys
java.lang.IllegalStateException: Error during processing of row: ...
Caused by: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic#7f872fa8 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
2015.11.05 09:08:32 INFO web[o.s.s.d.m.PlatformDatabaseMigration] DB migration failed | time=6911ms
2015.11.05 09:08:32 ERROR web[o.s.s.d.m.PlatformDatabaseMigration] DB Migration or container restart failed. Process ended with an exception
org.jruby.exceptions.RaiseException: (StandardError) An error has occurred, all later migrations canceled:
Fail to execute database migration: org.sonar.db.version.v52.RemoveDuplicatedComponentKeys
I found a workaround: I deleted the duplicated projects from the projects table and then restarted the migration process.
To get an idea of what is going on, execute the following query on your database:
select p.kee, COUNT(p.kee) FROM projects p GROUP BY p.kee HAVING COUNT(p.kee) > 1;
If this query returns any rows, you have to remove the duplicated ones (e.g. the oldest ones). In my case it was easy, because I had no issues in the issues table related to the projects in question.
If you have to mimic the SonarQube 5.2 migration procedure step by step (deleting duplicated projects and updating issues for them), you can find a list of the queries executed during this migration step on GitHub.
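As a starting point for the cleanup itself, here is a hedged sketch that keeps only the newest row per key, assuming the highest id is the row you want to keep (back up the table first):
DELETE FROM projects
WHERE id NOT IN (
  -- the derived table works around MySQL's restriction on deleting
  -- from a table that a subquery in the same statement reads
  SELECT id FROM (SELECT MAX(id) AS id FROM projects GROUP BY kee) keep
);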

"Invalid column index" during a rolling release

I am dropping a table column in my Oracle database. My JRuby connection pool appears to be caching metadata and throws an exception upon select:
ERROR 12:40:00.408000 pid=1832 tid=bok56 :: Java::JavaSql::SQLException: Invalid column index:
SELECT "MY_TABLE".* FROM "MY_TABLE" WHERE "MY_TABLE"."ID" = :a1
Restarting each of my application's several JRuby instances will cause the connections in the pools to grab the correct metadata for the table, but I want to do this rollout without downtime. (During a rolling restart there will be some servers with the old state and some with the new.)
Is there a way to force the connections in my pools to grab new metadata, either upon select or when signaled by the database that the metadata has changed?
Or is there another strategy for removing an unused column from an active table?
I am not averse to a multi-release (or multi-restart) solution.
I'd prefer not to use Oracle Edition-Based Redefinition™.
While I'm running JRuby on Trinidad, I'm hoping folks with Rails or Java applications will have had to solve this problem as well.
