Bulk insert in Cassandra causing read timeout - Oracle

Sadly, I found a question very similar to mine, but with no real answer: nohostavailableexception-while-bulk-loading-data-into-cassandra
I'm using Cassandra 2.0.8 installed on a RHEL 5 VM with 8 Cores and 8 GB of RAM.
I am running it as a single node for now.
I am trying to initialize it by migrating data off my Oracle database, so I have a program that selects from the Oracle table and then inserts into Cassandra (in a loop). The biggest table has 500,000 records.
During the operation my program keeps dying with read-timeout errors. I tried increasing all the timeout values in cassandra.yaml, but it doesn't help.
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:256)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:172)
at com.datastax.driver.core.SessionManager.execute(SessionManager.java:92)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
My cassandra.yaml timeout settings are
# How long the coordinator should wait for read operations to complete
read_request_timeout_in_ms: 15000        # was 5000
# How long the coordinator should wait for seq or index scans to complete
range_request_timeout_in_ms: 20000       # was 10000
# How long the coordinator should wait for writes to complete
write_request_timeout_in_ms: 30000       # was 20000
# How long a coordinator should continue to retry a CAS operation
# that contends with other proposals for the same row
cas_contention_timeout_in_ms: 1000
# How long the coordinator should wait for truncates to complete
# (This can be much longer, because unless auto_snapshot is disabled
# we need to flush first so we can snapshot before removing the data.)
truncate_request_timeout_in_ms: 300000   # was 60000
# The default timeout for other, miscellaneous operations
request_timeout_in_ms: 20000             # was 10000
Does anyone know how to fix this issue, or a better way to migrate data from one place to another? (Note that some tables are not that simple to migrate; I need to run some extra queries before inserting.)
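One detail that may matter here: the "Timeout during read" in the stack trace above is the DataStax driver's client-side socket read timeout (about 12 seconds by default in the 2.x driver line), which is set on the driver rather than in cassandra.yaml. A minimal sketch, assuming the 2.x Java driver and a made-up keyspace/table, of raising that timeout and throttling the insert loop with executeAsync and a semaphore:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SocketOptions;
import com.google.common.util.concurrent.MoreExecutors;
import java.util.concurrent.Semaphore;

public class OracleToCassandraSketch {
    public static void main(String[] args) throws Exception {
        // "Timeout during read" is the driver's socket read timeout,
        // configured here rather than in cassandra.yaml.
        SocketOptions socketOptions = new SocketOptions().setReadTimeoutMillis(60000);
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withSocketOptions(socketOptions)
                .build();
        Session session = cluster.connect("migration_ks");           // hypothetical keyspace

        PreparedStatement insert = session.prepare(
                "INSERT INTO big_table (id, payload) VALUES (?, ?)"); // hypothetical table
        final Semaphore inFlight = new Semaphore(128);                // cap concurrent writes

        for (long id = 0; id < 500000; id++) {                        // stands in for the Oracle result-set loop
            inFlight.acquire();
            ResultSetFuture f = session.executeAsync(insert.bind(id, "row-" + id));
            f.addListener(new Runnable() {
                public void run() { inFlight.release(); }             // runs on success or failure
            }, MoreExecutors.sameThreadExecutor());
        }
        inFlight.acquire(128);                                        // drain outstanding writes
        cluster.close();
    }
}
Capping the number of in-flight writes keeps a single node from being flooded (and timing out) while still being much faster than one synchronous insert per row.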

Related

Fetching data from a Greenplum table on the order of 600 million rows in Apache NiFi is giving GC overhead limit exceeded

I am trying to fetch data from a Greenplum table using Apache NiFi's QueryDatabaseTableRecord processor. I am seeing a "GC overhead limit exceeded" error, and the NiFi web page becomes unresponsive.
I have set the 'Fetch Size' property to 10000, but it seems the property is not being used in this case.
Other settings:
Database Type : Generic
Max Rows Per Flow File : 1000000
Output Batch Size : 2
JVM min/max memory allocation : 4g/8g
Is there an alternative to avoid the GC errors for this task?
This is a clear case of the "Fetch Size" parameter not being used; see the processor documentation on this.
Try testing the JDBC setFetchSize on its own to see if it works, as in the sketch below.
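A minimal sketch of that standalone test, assuming the PostgreSQL-compatible JDBC driver that Greenplum uses (connection URL, credentials, and table name are placeholders); note that with this driver family the fetch size is only honoured when auto-commit is off and the result set is forward-only:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchSizeTest {
    public static void main(String[] args) throws Exception {
        // Placeholder URL/credentials; Greenplum speaks the PostgreSQL wire protocol.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://gp-host:5432/mydb", "user", "secret")) {
            conn.setAutoCommit(false);               // required for cursor-based (streamed) fetches
            try (Statement stmt = conn.createStatement()) {
                stmt.setFetchSize(10000);            // the value NiFi's 'Fetch Size' should be passing down
                long rows = 0;
                try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) {  // placeholder table
                    while (rs.next()) {
                        rows++;                      // just count; heap usage should stay flat if streaming works
                    }
                }
                System.out.println("Read " + rows + " rows without loading them all into memory");
            }
        }
    }
}
If this runs with a flat heap, the driver streams correctly and the problem is in how NiFi applies the property; if it blows up the same way, the fetch size is being ignored at the JDBC level.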

How do I fix "File could only be replicated to 0 nodes instead of minReplication (=1)."?

I asked a similar question a while ago, and thought I solved this problem, but it turned out that it went away simply because I was working on a smaller dataset.
Numerous people have asked this question and I have gone through every single internet post that I could find and still didn't make any progress.
What I'm trying to do is this:
I have an external table browserdata in Hive that refers to about 1 gigabyte of data.
I try to stick that data into a partitioned table partbrowserdata, whose definition goes like this:
CREATE EXTERNAL TABLE IF NOT EXISTS partbrowserdata (
BidID string,
Timestamp_ string,
iPinYouID string,
UserAgent string,
IP string,
RegionID int,
AdExchange int,
Domain string,
URL string,
AnonymousURL string,
AdSlotID string,
AdSlotWidth int,
AdSlotHeight int,
AdSlotVisibility string,
AdSlotFormat string,
AdSlotFloorPrice decimal,
CreativeID string,
BiddingPrice decimal,
AdvertiserID string,
UserProfileIDs array<string>
)
PARTITIONED BY (CityID int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/maria_dev/data2';
with this query:
insert into table partbrowserdata partition(cityid)
select BidID,Timestamp_ ,iPinYouID ,UserAgent ,IP ,RegionID ,AdExchange ,Domain ,URL ,AnonymousURL ,AdSlotID ,AdSlotWidth ,AdSlotHeight ,AdSlotVisibility ,AdSlotFormat ,AdSlotFloorPrice ,CreativeID ,BiddingPrice ,AdvertiserID ,UserProfileIDs ,CityID
from browserdata;
And every time, on every platform, be it Hortonworks or Cloudera, I get this message:
Caused by:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/maria_dev/data2/.hive-staging_hive_2019-02-06_18-58-39_333_7627883726303986643-1/_task_tmp.-ext-10000/cityid=219/_tmp.000000_3 could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3389)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)
at org.apache.hadoop.ipc.Client.call(Client.java:1504)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:413)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1814)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1610)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:773)
What do I do? I can't understand why this is happening. It does seem like a memory issue, though, because I am able to insert a few rows, but not all of them for some reason. Note that I have plenty of space on HDFS, so an extra gigabyte of data is pennies on the dollar; is it perhaps a RAM issue?
Here's my dfs report output:
I have tried this on all execution engines: spark, tez, mr.
Please do not suggest solutions that say that I need to format the namenode, because they do not work, and they are not solutions in any way.
update:
After looking at the NameNode logs I noticed this, if it helps:
Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
These logs suggest this:
For more information, please enable DEBUG log level on
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
and org.apache.hadoop.net.NetworkTopology
How do I do that?
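For reference, assuming the stock Log4j 1.x logging that these HDP/CDH builds ship, one way is to add the two loggers to the NameNode's log4j.properties and restart it:
log4j.logger.org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy=DEBUG
log4j.logger.org.apache.hadoop.net.NetworkTopology=DEBUG
or to flip them at runtime with the daemonlog command (NameNode host and HTTP port are placeholders for your cluster):
hadoop daemonlog -setlevel namenode-host:50070 org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy DEBUG
hadoop daemonlog -setlevel namenode-host:50070 org.apache.hadoop.net.NetworkTopology DEBUG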
I also noticed a similar unresolved post on here:
HDP 2.2#Linux/CentOS#OracleVM (Hortonworks) fails on remote submission from Eclipse#Windows
update 2:
I just tried partitioning this with Spark, and it works! So this must be a Hive bug...
update 3:
Just tested this on MapR and it worked, but MapR doesn't use HDFS. This is definitely some sort of HDFS + Hive combination bug.
I ended up reaching out to the Cloudera forums, and they answered my question in a matter of minutes: http://community.cloudera.com/t5/Storage-Random-Access-HDFS/Why-can-t-I-partition-a-1-gigabyte-dataset-into-300/m-p/86554#M3981 I tried what Harsh J suggested and it worked perfectly!
Here's what he said:
If you are dealing with unordered partitioning from a data source, you
can end up creating a lot of files in parallel as the partitioning is
attempted.
In HDFS, when a file (or more specifically, its block) is open, the
DataNode performs a logical reservation of its target block size. So
if your configured block size is 128 MiB, then every concurrently open
block will deduct that value (logically) from the available remaining
space the DataNode publishes to the NameNode.
This reservation is done to help manage space and guarantees of a full
block write to a client, so that a client that's begun writing its
file never runs into an out of space exception mid-way.
Note: When the file is closed, only the actual length is persisted,
and the reservation calculation is adjusted to reflect the reality of
used and available space. However, while the file block remains open,
it's always considered to be holding a full block size.
The NameNode further will only select a DataNode for a write if it can
guarantee full target block size. It will ignore any DataNodes it
deems (based on its reported values and metrics) unfit for the
requested write's parameters. Your error shows that the NameNode has
stopped considering your only live DataNode when trying to allocate a
new block request.
As an example, 70 GiB of available space will prove insufficient if
there will be more than 560 concurrent, open files (70 GiB divided
into 128 MiB block sizes). So the DataNode will 'appear full' at the
point of ~560 open files, and will no longer serve as a valid target
for further file requests.
It appears per your description of the insert that this is likely, as
each of the 300 chunks of the dataset may still carry varied IDs,
resulting in a lot of open files requested per parallel task, for
insert into several different partitions.
You could 'hack' your way around this by reducing the request block
size within the query (set dfs.blocksize to 8 MiB for ex.),
influencing the reservation calculation. However, this may not be a
good idea for larger datasets as you scale, since it will drive up the
file:block count and increase memory costs for the NameNode.
A better way to approach this would be to perform a pre-partitioned
insert (sort first by partition and then insert in a partitioned
manner). Hive for example provides this as an option:
hive.optimize.sort.dynamic.partition, and if you use plain Spark
or MapReduce then their default strategy of partitioning does exactly
this.
So, at the end of the day I did set hive.optimize.sort.dynamic.partition=true; and everything started working. But I also did another thing.
Here's one of my earlier posts from when I was investigating this issue: Why do I get "File could only be replicated to 0 nodes" when writing to a partitioned table? I was running into a problem where Hive couldn't partition my dataset because hive.exec.max.dynamic.partitions was set to 100, so I googled the issue, and somewhere on the Hortonworks forums I saw an answer saying that I should just do this:
SET hive.exec.max.dynamic.partitions=100000;
SET hive.exec.max.dynamic.partitions.pernode=100000;
This turned out to be another problem: Hive seems to open as many concurrent files as hive.exec.max.dynamic.partitions allows, so my insert query didn't start working until I decreased these values to 500.
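Putting it together, the session settings that ended up working for me amount to roughly this (the commented-out block-size override is the alternative "hack" from Harsh J's answer, listed only for completeness):
SET hive.optimize.sort.dynamic.partition=true;
SET hive.exec.max.dynamic.partitions=500;
SET hive.exec.max.dynamic.partitions.pernode=500;
-- alternative workaround: shrink the per-file block reservation for this query only
-- SET dfs.blocksize=8388608;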

logstash kafka input performance / config tuning

I use Logstash to transfer data from Kafka to Elasticsearch, and I'm getting the following error:
WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Auto offset commit failed for group kafka-es-sink: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
I tried to adjust the session timeout (to 30000) and max poll records (to 250).
The topic produces 1000 events per second in Avro format. There are 10 partitions (2 servers) and two Logstash instances with 5 consumer threads each.
I have no problems with other topics with ~100-300 events per second.
I think it must be a config issue, because I also have a second connector between Kafka and Elasticsearch on the same topic that works fine (Confluent's kafka-connect-elasticsearch).
The main aim is to compare Kafka Connect and Logstash as connectors. Does anyone have experience with this in general?
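For context, the Logstash kafka input is built on the standard Kafka Java consumer, so the warning is about the same two knobs you would tune on a plain consumer. A minimal sketch with the stock Java client, just to make the trade-off concrete (broker, group, and topic names are placeholders):
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollLoopSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-1:9092");   // placeholder broker
        props.put("group.id", "kafka-es-sink");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        // The two knobs from the warning: either give the group more time between polls...
        props.put("session.timeout.ms", "30000");
        // ...or hand back fewer records per poll so each iteration finishes sooner.
        props.put("max.poll.records", "250");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));  // placeholder topic
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    // Processing here must finish well within session.timeout.ms,
                    // otherwise the coordinator rebalances and the offset commit fails.
                }
            }
        }
    }
}
In Logstash these map to the corresponding options of the kafka input plugin (the same settings the question already adjusted); the batch handed back per poll has to be flushed to Elasticsearch within the session timeout by each of the 5 consumer threads.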

Sqoop-2 fails on large import to single node with custom query using sqoop shell

I am prototyping the migration of a large record set generated by a computationally expensive custom query. This query takes approximately 1-2 hours to return a result set in SQL Developer.
I am attempting to pass this query to a simple Sqoop job with links from JDBC to HDFS.
I have encountered the following errors in my logs:
2016-02-12 10:15:50,690 ERROR mr.SqoopOutputFormatLoadExecutor [org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:257)] Error while loading data out of MR job.
org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0005:Error occurs during loader run
at org.apache.sqoop.connector.hdfs.HdfsLoader.load(HdfsLoader.java:110)
at org.apache.sqoop.connector.hdfs.HdfsLoader.load(HdfsLoader.java:41)
at org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:250)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/username/schema/recordset/.72dee005-3f75-4a95-bbc0-30c6b565d193/f5aeeecc-097e-49ab-99cc-b5032ae18a84.txt (inode 16415): File does not exist. [Lease. Holder: DFSClient_NONMAPREDUCE_-1820866823_31, pendingcreates: 1]
When I check the resulting .txt files in HDFS, they are empty.
Has anyone encountered and solved this? Also, I am noticing additional bugginess with the Sqoop shell. For example, I am unable to check the job status as it always returns UNKNOWN.
I am using sqoop-1.99.6-bin-hadoop200 with Hadoop 2.7.2 (Homebrew install). I am querying a remote Oracle 11 database with the Generic JDBC Connector.
I have already conducted a smaller import job using the schema/table parameters in create job.
I am tempted to migrate the entire schema table by table, then just use Hive to generate and store the record set I want. Would this be a better/easier solution?
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException
This query takes approximately 1-2 hours to return a result set in SQL Developer
I would bet that Sqoop 1.99 creates an empty HDFS file (i.e. the NameNode gets the request, creates the file but does not materialize it for other clients yet, grants an exclusive write lease for Sqoop, and assigns responsibility for writing block#1 to a random DataNode) then waits for the JDBC ResultSet to produce some data... without doing any keep-alive in the meantime.
But alas, after 60 minutes, the NameNode just sees that the lease has expired without any sign of the Sqoop client being alive, so it closes the file -- or rather, makes as if it was never created (no flush has ever occurred).
Any chance you can reduce the time lapse with a /*+ FIRST_ROWS */ hint on the Oracle side?
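For illustration, the hint goes right after the SELECT keyword of the expensive query (the view name below is a placeholder):
SELECT /*+ FIRST_ROWS */ *
FROM expensive_custom_view;  -- asks Oracle to favour returning the first rows quickly over total throughput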

Timeout during read exception in cassandra with datastax java driver

I'm trying to insert a single row that has a few columns adding up to about 500 MB into a Cassandra cluster, and I'm getting the error below.
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/10.138.90.207:9042, /10.138.90.208:9042, /10.138.90.191:9042, /10.138.90.240:9042, /10.138.90.232:9042, /10.138.90.205:9042, /10.138.90.236:9042, /10.138.90.246:9042] - use getErrors() for details)
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:256)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:172)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
at com.tcs.asml.cassandra.Crud.Insert(Crud.java:44)
at com.tcs.asml.factory.PartToolInsert.main(PartToolInsert.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/10.138.90.207:9042, /10.138.90.208:9042, /10.138.90.191:9042, /10.138.90.240:9042, /10.138.90.232:9042, /10.138.90.205:9042, /10.138.90.236:9042, /10.138.90.246:9042] - use getErrors() for details)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:175)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
When I print getErrors() on the exception, it shows a "Timeout during read" error for all nodes in the cluster.
getErrors():
{/10.138.90.207:9042=com.datastax.driver.core.exceptions.DriverException: Timeout during read, /10.138.90.191:9042=com.datastax.driver.core.exceptions.DriverException: Timeout during read, /10.138.90.208:9042=com.datastax.driver.core.exceptions.DriverException: Timeout during read, /10.138.90.240:9042=com.datastax.driver.core.exceptions.DriverException: Timeout during read, /10.138.90.232:9042=com.datastax.driver.core.exceptions.DriverException: Timeout during read, /10.138.90.205:9042=com.datastax.driver.core.exceptions.DriverException: Timeout during read, /10.138.90.236:9042=com.datastax.driver.core.exceptions.DriverException: Timeout during read, /10.138.90.246:9042=com.datastax.driver.core.exceptions.DriverException: Timeout during read}
Cluster details:
one datacenter with 8 nodes, each with 16 GB RAM
A single hard disk in every node.
All nodes are connected with 10 Mbps bandwidth and default latency.
I tried to increase the read timeout using the command below.
cluster.getConfiguration().getSocketOptions().setReadTimeoutMillis(60000);
Below is the cassandra.yaml configuration I am using now:
memtable total space: 4 GB
commit log segment size: 512 MB
read_request_timeout_in_ms: 10000
request_timeout_in_ms: 10000
concurrent reads: 32
concurrent writes: 32
I faced the same issue while trying to insert a 250 MB row, and setting the read timeout to 30 seconds fixed it.
cluster.getConfiguration().getSocketOptions().setReadTimeoutMillis(30000);
But for a 500 MB row it's not working.
Can anyone please give me some ideas on how to tune Cassandra to insert a single row with this much data?
Thanks.
Question: why do you need to store 500 MB or 200 MB of data in a single row in Cassandra? The sweet spot for partition sizes in Cassandra is up to about 100 MB, maybe a few hundred. Cassandra is a data store for fast storage and fast querying; 500 MB in one row won't give you either. So why use Cassandra for this?
