Unable to savepoint/checkpoint Flink state to an AWS S3 bucket - hadoop

I'm trying to checkpoint/savepoint my Flink state, running on EMR, to an S3 bucket on AWS. Please note:
- The instances (master and core nodes) have an IAM role properly set up to access the S3 bucket and all the directories/files inside it (the AmazonS3FullAccess policy is attached to the role and nothing overrides it).
- I can run aws s3 cp xxx s3://flink-bc/checkpoints successfully from the slave and master nodes to copy files to the bucket.
- Using HDFS for savepoints/checkpoints works.
If I set the checkpoints to use HDFS and then try to savepoint to S3, the savepoint operation fails with:
org.apache.flink.util.FlinkException: Triggering a savepoint for the job 16c162c47f225cddad974056c9494b6d failed.
at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:723)
at org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:701)
at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:985)
at org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:698)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1065)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed to trigger savepoint. Decline reason: An Exception occurred while triggering the checkpoint.........
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed to trigger savepoint. Decline reason: An Exception occurred while triggering the checkpoint.
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
and the JobManager logs show:
java.io.IOException: Cannot instantiate file system for URI: s3://flink-bc/savepoints
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:187)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:399)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
at org.apache.flink.runtime.state.filesystem.AbstractFsCheckpointStorage.initializeLocationForSavepoint(AbstractFsCheckpointStorage.java:147)
at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.triggerCheckpoint(CheckpointCoordinator.java:511)
at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.triggerSavepoint(CheckpointCoordinator.java:370)
at org.apache.flink.runtime.jobmaster.JobMaster.triggerSavepoint(JobMaster.java:951)

I faced a similar issue using the latest Flink version (1.10.0) with S3 to store checkpoints in an S3 bucket, so please see the detailed working answer I have provided here.
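For context, the JobManager error "Cannot instantiate file system for URI: s3://..." usually means no S3 filesystem implementation is available to Flink. In Flink 1.10 a common fix is to expose the bundled S3 filesystem jar through the plugins mechanism and point the state directories at the bucket. A minimal sketch, assuming a plain Flink 1.10 distribution (the paths and jar version are assumptions, and an EMR install lays things out differently):

# make the Hadoop-based S3 filesystem available to Flink as a plugin
mkdir -p "$FLINK_HOME/plugins/s3-fs-hadoop"
cp "$FLINK_HOME/opt/flink-s3-fs-hadoop-1.10.0.jar" "$FLINK_HOME/plugins/s3-fs-hadoop/"

# point checkpoints and savepoints at the bucket in flink-conf.yaml
cat >> "$FLINK_HOME/conf/flink-conf.yaml" <<'EOF'
state.checkpoints.dir: s3://flink-bc/checkpoints
state.savepoints.dir: s3://flink-bc/savepoints
EOF

After restarting the cluster, the same s3:// URIs should resolve for both checkpoints and the savepoint trigger.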

Related

Journal node service has issues starting after HDP cluster is kerberized

Below is the error after the cluster was kerberized:
Exception in thread "main" java.io.IOException: Login failure for jn/keystone.mwbsys.com@EXAMPLE.COM from keytab /etc/security/keytabs/jn.service.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user
I added keystone.mwbsys.com to the /etc/hosts file and then restarted the journal nodes, which fixed the issue. I know this is not a permanent solution, but it worked.
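A quick way to check whether the keytab itself is usable is a manual login with the principal and keytab path taken from the error message above; a sketch:

# try to obtain a ticket with the journal node keytab, then list it
kinit -kt /etc/security/keytabs/jn.service.keytab jn/keystone.mwbsys.com@EXAMPLE.COM
klist

If kinit succeeds here, the problem is more likely hostname resolution (as the /etc/hosts workaround suggests) than the keytab contents.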

Hive beeline fails while configuring Azure Blob Storage for HDFS

We have created a Hadoop Kerberos cluster with Azure Blob Storage by following the link below:
https://hadoop.apache.org/docs/stable/hadoop-azure/index.html
We are facing an issue while connecting to the beeline shell of the Hive Thrift server, but the same Azure Blob Storage configuration works fine with a normal (non-kerberized) cluster.
Please find the issue details:
ERROR org.apache.thrift.server.TThreadPoolServer: Error occurred during processing of message.
java.lang.NullPointerException
at org.apache.thrift.transport.TSaslTransport$SaslParticipant.isComplete(TSaslTransport.java:547)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:276)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:761)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:758)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1636)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:758)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Can anyone help us configure HiveServer2 with Azure Blob Storage instead of HDFS in a Kerberos Hadoop cluster?
Thanks,
Selva
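For what it's worth, this NullPointerException in TSaslTransport often shows up when the client opens a SASL connection without the HiveServer2 Kerberos principal in the JDBC URL. A hedged example of a kerberized beeline connection (the hostname and realm are illustrative, not taken from the question):

# authenticate first, then pass the server principal in the connection string
kinit selva@EXAMPLE.COM
beeline -u "jdbc:hive2://hs2-host.example.com:10000/default;principal=hive/hs2-host.example.com@EXAMPLE.COM"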

A lot of AlreadyBeingCreatedException and LeaseExpiredException when writing parquet from spark

I have several parallel Spark jobs doing the same thing: they work on separate input/output dirs, and at the end they write results from a dataframe to parquet, using one of the columns as a partitioner. The jobs with the biggest inputs often fail. Some executors start to fail with the exceptions below, then a stage fails and starts recalculating the failed partition; if the number of failed stages reaches 4, the whole job is canceled (sometimes it doesn't reach 4 and the whole job finishes successfully).
Stages fail with these failure reasons (from the Spark UI):
org.apache.spark.shuffle.FetchFailedException
Connection closed by peer
I tried to find clues on the Internet, and it seems the reason may be speculative execution, but I don't enable it in Spark. Any other ideas what could cause this?
Spark job code:
sqlContext
.createDataFrame(finalRdd, structType)
.write()
.partitionBy(PARTITION_COLUMN_NAME)
.parquet(tmpDir);
Exceptions in executors:
16/09/14 11:04:06 ERROR datasources.DynamicPartitionWriterContainer: Aborting task.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_006023_0/partition=2/part-r-06023-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet] for [DFSClient_NONMAPREDUCE_1489398656_198] for client [10.117.102.72], because this file is already being created by [DFSClient_NONMAPREDUCE_-2049022202_200] on [10.117.102.15]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3152)
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141105_0001_m_006489_0/partition=2/part-r-06489-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet (inode 318361396): File does not exist. Holder DFSClient_NONMAPREDUCE_-1428957718_196 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3625)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3428)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141105_0001_m_006310_0/partition=2/part-r-06310-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet] for [DFSClient_NONMAPREDUCE_-419723425_199] for client [10.117.102.44], because this file is already being created by [DFSClient_NONMAPREDUCE_596138765_198] on [10.117.102.35]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3152)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_005877_0/partition=2/part-r-05877-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet (inode 318359423): File does not exist. Holder DFSClient_NONMAPREDUCE_193375828_196 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3625)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3428)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_005621_0/partition=2/part-r-05621-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet] for [DFSClient_NONMAPREDUCE_498917218_197] for client [10.117.102.36], because this file is already being created by [DFSClient_NONMAPREDUCE_-578682558_197] on [10.117.102.16]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3152)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_006311_0/partition=2/part-r-06311-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet (inode 318359109): File does not exist. Holder DFSClient_NONMAPREDUCE_-60951070_198 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3625)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3428)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3284)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_006215_0/partition=2/part-r-06215-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet (inode 318359393): File does not exist. Holder DFSClient_NONMAPREDUCE_-331523575_197 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3625)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3428)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_006311_0/partition=2/part-r-06311-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet] for [DFSClient_NONMAPREDUCE_1869576560_198] for client [10.117.102.44], because this file is already being created by [DFSClient_NONMAPREDUCE_-60951070_198] on [10.117.102.70]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3152)
[Spark UI screenshot omitted]
We use Spark 1.6 (CDH 5.8).
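The AlreadyBeingCreatedException/LeaseExpiredException pattern is consistent with two attempts of the same task writing the same HDFS file concurrently, which is exactly what speculative execution or overlapping retries produce. Even if speculation is believed to be off, pinning it explicitly at submit time is a cheap check; a sketch (the class name, jar, and other arguments are illustrative):

# force speculative execution off for this job
spark-submit \
  --conf spark.speculation=false \
  --class com.example.ParquetWriterJob \
  parquet-writer.jar

If the failures persist with speculation pinned off, the duplicate writers are more likely retried attempts of tasks whose stage was already resubmitted after the FetchFailedException.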

SEVERE error writing to S3 backup

I'm running OpsCenter 5.1.1 with DataStax Enterprise 4.5.1. It's a 3-node cluster on AWS, and I'm backing up to S3 (still...). I've started seeing a new error; I think this is a different error than any I've posted before.
$ cqlsh
Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.8.39 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
I am seeing this error in the agent.log file:
node1_agent.log: SEVERE: error after writing 15736832/16777216 bytes to https://cassandra-dev-bkup.s3.amazonaws.com/snapshots/407bb4b1-5c91-43fe-9d4f-767115668037/sstables/1430904167-reporting_test-transaction_lookup-jb-288-Index.db?partNumber=2&uploadId=.MA3X4RYssg7xL_Hr7Msgze.J4exDq9zZ_0Y7qEj9gZhJ570j73kZNr5_nbxactmPMJeKf0XyZfEC0KAplWOz9lpyRCtNeeDCvCmtEXDchH8F1J2c57aq4MrxfBcyiZr
java.io.IOException: Error writing request body to server
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3192)
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3175)
at com.google.common.io.CountingOutputStream.write(CountingOutputStream.java:53)
at com.google.common.io.ByteStreams.copy(ByteStreams.java:179)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.writePayloadToConnection(JavaUrlHttpCommandExecutorService.java:308)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.convert(JavaUrlHttpCommandExecutorService.java:192)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.convert(JavaUrlHttpCommandExecutorService.java:72)
at org.jclouds.http.internal.BaseHttpCommandExecutorService.invoke(BaseHttpCommandExecutorService.java:95)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.invoke(InvokeSyncToAsyncHttpMethod.java:128)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.apply(InvokeSyncToAsyncHttpMethod.java:94)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.apply(InvokeSyncToAsyncHttpMethod.java:55)
at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
at com.sun.proxy.$Proxy48.uploadPart(Unknown Source)
at org.jclouds.aws.s3.blobstore.strategy.internal.SequentialMultipartUploadStrategy.prepareUploadPart(SequentialMultipartUploadStrategy.java:111)
at org.jclouds.aws.s3.blobstore.strategy.internal.SequentialMultipartUploadStrategy.execute(SequentialMultipartUploadStrategy.java:93)
at org.jclouds.aws.s3.blobstore.AWSS3BlobStore.putBlob(AWSS3BlobStore.java:89)
at org.jclouds.blobstore2$put_blob.doInvoke(blobstore2.clj:246)
at clojure.lang.RestFn.invoke(RestFn.java:494)
at opsagent.backups.destinations$create_blob$fn__12007.invoke(destinations.clj:69)
at opsagent.backups.destinations$create_blob.invoke(destinations.clj:64)
at opsagent.backups.destinations$fn__12170.invoke(destinations.clj:192)
at opsagent.backups.destinations$fn__11799$G__11792__11810.invoke(destinations.clj:24)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339$fn__12344$fn__12375.invoke(staging.clj:61)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339$fn__12344.invoke(staging.clj:59)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339.invoke(staging.clj:56)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:940)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:944)
at clojure.core.async.impl.ioc_macros$take_BANG_$fn__7592.invoke(ioc_macros.clj:953)
at clojure.core.async.impl.channels.ManyToManyChannel$fn__4097.invoke(channels.clj:102)
at clojure.lang.AFn.run(AFn.java:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
TL;DR -
Your SSTable, which is 38866048 bytes, is both on your filesystem and on S3. This means the file has been transferred and you are in good shape. No need to worry about this error (though I opened an internal ticket to handle this kind of exception rather than throw a dump).
Details - a summary of what I suspect happened:
1) There was a file transfer error when you reached 15736832 bytes out of the 16777216-byte slice of the SSTable.
2) At this point OpsCenter did not finish transferring the table or leave a partial version in S3.
3) Another backup attempt later on moved the SSTable with no error, and a valid backup exists.
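To verify that a later attempt really recovered from a part-upload error like this, comparing the local and remote sizes of the SSTable component is usually enough; a sketch using the bucket and snapshot ID from the log above (the local data path is an assumption and varies by install):

# size of the SSTable index component on the node
ls -l /var/lib/cassandra/data/reporting_test/transaction_lookup/*-jb-288-Index.db
# size of the copy that landed in S3
aws s3 ls s3://cassandra-dev-bkup/snapshots/407bb4b1-5c91-43fe-9d4f-767115668037/sstables/ | grep jb-288-Index.db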

How to set user/group permissions with a Hadoop/Kerberos setup?

I am trying to set up Hadoop with Kerberos, following the CDH3 Security Guide.
Things have gone pretty well so far (HDFS works OK, etc.), but I am getting the following error when I try to submit a job.
I run the HDFS server as user hdfs and Hadoop as a user called mapred. I submit the job using a user called bob, who is in the mapred group.
Following are the values I have in taskcontroller.cfg:
mapred.local.dir=/opt/hadoop-work/local/
hadoop.log.dir=/opt/hadoop-1.0.3/logs
mapreduce.tasktracker.group=mapred
min.user.id=1000
The error I am getting is:
java.io.IOException: Job initialization failed (24) with output: Reading task controller config from /etc/hadoop/taskcontroller.cfg
Can't get group information for mapred - Success.
at org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:192)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1228)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1203)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
at org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:185)
... 8 more
The error always comes with the value given to "mapreduce.tasktracker.group=mapred" in taskcontroller.cfg.
I have been debugging and digging in, and I think the problem is that I have set up the permissions among the different users and groups incorrectly.
Any help is greatly appreciated.
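"Can't get group information for mapred" points at the group lookup on the TaskTracker host rather than at Kerberos itself, so checking the group and the ownership of the setuid task-controller binary is a sensible first step; a sketch (the task-controller path is an assumption and varies by distribution):

# does the mapred group exist on this host, and is bob a member?
getent group mapred
id bob
# the task-controller binary typically needs to be owned by root with
# group mapred, matching mapreduce.tasktracker.group in taskcontroller.cfg
ls -l /usr/lib/hadoop/sbin/Linux-amd64-64/task-controller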
