I am trying to create a cache that could write entries to disk. It runs fine in my tests on Windows but when I deploy it on IBM z/OS USS, I end up in the following error. I have 0777 on the directory and there is enough space available on the disk. df -k gives me 405591/901440 for Available/Total. Any insights to where I should be looking to diagnose would be helpful. Following is my cache configuration-
CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder().with(new CacheManagerPersistenceConfiguration(new File(CACHE_DIR, "DictionaryCache")))
.withCache("dictionaryCache", CacheConfigurationBuilder.newCacheConfigurationBuilder(Integer.class, KeywordDictionary.class,
ResourcePoolsBuilder.newResourcePoolsBuilder()./*offheap(200, MemoryUnit.MB).*/disk(200, MemoryUnit.MB).heap(20, EntryUnit.ENTRIES))
.build()).build(true);
Caused by: java.lang.IllegalStateException: Cache 'dictionaryCache' creation in EhcacheManager failed.
at org.ehcache.core.EhcacheManager.createCache(EhcacheManager.java:287) ~[Classification-Engine-Scan-Job-2.0-SNAPSHOT.jar:na]
at org.ehcache.core.EhcacheManager.init(EhcacheManager.java:566) ~[Classification-Engine-Scan-Job-2.0-SNAPSHOT.jar:na]
... 19 common frames omitted
Caused by: org.ehcache.StateTransitionException: Initial table allocation failed.
Initial Table Size (slots) : 64
Allocation Will Require : 1KB
Table Page Source : org.terracotta.offheapstore.disk.paging.MappedPageSource#e88c7380
at org.ehcache.core.StatusTransitioner$Transition.succeeded(StatusTransitioner.java:209) ~[Classification-Engine-Scan-Job-2.0-SNAPSHOT.jar:na]
at org.ehcache.core.Ehcache.init(Ehcache.java:567) ~[Classification-Engine-Scan-Job-2.0-SNAPSHOT.jar:na]
at org.ehcache.core.EhcacheManager.createCache(EhcacheManager.java:260) ~[Classification-Engine-Scan-Job-2.0-SNAPSHOT.jar:na]
... 20 common frames omitted
Related
I am getting OutOfMemory error while deploying EAR (Size : 230MB) file in Websphere server.
Sometime deployment is getting success after increasing the heap size.
I have analyzed the heap dump and found leak suspects but not sure how to proceed here after.
Leak suspect : 217,295,824 bytes (87.23 %) of Java heap is used by 105
instances of java/util/WeakHashMap$Entry
Contains 3 instances of the following leak suspects:
- array of java/lang/Object holding 16,235,440 bytes at 0x6a696c8
- array of java/lang/Object holding 101,373,968 bytes at 0x1125c240
- array of java/lang/Object holding 13,602,688 bytes at 0x5290818
<\n> Total size : 217,295,824 bytes
Size : 1,040 bytes
Name : array of java/util/WeakHashMap$Entry
Number of children : 105
Number of parents : 1
Owner address : 0x2e41fd0
Owner object : java/util/WeakHashMap
Address : 0xb4c2dc0
First single ancestor : org/eclipse/jst/j2ee/internal/archive/JavaEEArchiveUtilities at 0xb4c2dc0
and getting below error in WAS logs
[main] INFO deploylib - Installing application... ADMA5016I: Installation of Kijkglas-ear-1905.01.35 started. ADMA5058I: Application and module versions are validated with versions of deployment targets. ADMA5018I: The EJBDeploy program is running on file /tmp/app6232412827642995266.ear. Starting workbench. EJB Deploy configuration directory: /var/was/profiles/AdminAgent01/ejbdeploy/configuration/ framework search path: /opt/IBM/WebSphere/8.5/deploytool/itp/plugins build:RADWEJB95-I20150829_0214 Creating the project. JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2019/06/07 10:42:59 - please wait. JVMDUMP032I JVM requested System dump using '/var/was/profiles/AdminAgent01/core.20190610.104259.30244.0001.dmp' in response to an event JVMDUMP010I System dump written to /var/was/profiles/AdminAgent01/core.20190610.104259.30244.0001.dmp JVMDUMP032I JVM requested Heap dump using '/var/was/profiles/AdminAgent01/heapdump.20190610.104259.30244.0002.phd' in response to an event JVMDUMP010I Heap dump written to /var/was/profiles/AdminAgent01/heapdump.20190610.104259.30244.0002.phd JVMDUMP032I JVM requested Java dump using '/var/was/profiles/AdminAgent01/javacore.20190610.104259.30244.0003.txt' in response to an event JVMDUMP010I Java dump written to /var/was/profiles/AdminAgent01/javacore.20190610.104259.30244.0003.txt JVMDUMP032I JVM requested Snap dump using '/var/was/profiles/AdminAgent01/Snap.20190610.104259.30244.0004.trc' in response to an event JVMDUMP010I Snap dump written to /var/was/profiles/AdminAgent01/Snap.20190610.104259.30244.0004.trc JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError". An unexpected exception was thrown. Halting execution. Shutting down workbench. Error executing deployment: java.lang.OutOfMemoryError. Error is Java heap space. java.lang.OutOfMemoryError: Java heap space at java.lang.Throwable.fillInStackTrace(Native Method) at java.lang.Throwable.<init>(Throwable.java:67) at java.lang.Throwable.<init>(Throwable.java:78) at java.lang.Error.<init>(Error.java:82) at java.lang.VirtualMachineError.<init>(VirtualMachineError.java:64) at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:69) at java.lang.String.<init>(String.java:207) at java.util.jar.Attributes.read(Attributes.java:424) at java.util.jar.Manifest.read(Manifest.java:264) at java.util.jar.Manifest.<init>(Manifest.java:82) at java.util.jar.JarFile.getManifestFromReference(JarFile.java:200) at java.util.jar.JarFile.getManifest(JarFile.java:182) at sun.net.www.protocol.jar.URLJarFile.isSuperMan(URLJarFile.java:187) at sun.net.www.protocol.jar.URLJarFile.getManifest(URLJarFile.java:155) at java.util.jar.JarFile.maybeInstantiateVerifier(JarFile.java:387) at java.util.jar.JarFile.getInputStream(JarFile.java:488) at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:178) at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source) at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source) at org.apache.xerces.impl.xs.opti.SchemaParsingConfig.parse(Unknown Source) at org.apache.xerces.impl.xs.opti.SchemaParsingConfig.parse(Unknown Source) at org.apache.xerces.impl.xs.opti.SchemaDOMParser.parse(Unknown Source) at org.apache.xerces.impl.xs.traversers.XSDHandler.getSchemaDocument(Unknown Source) at org.apache.xerces.impl.xs.traversers.XSDHandler.resolveSchema(Unknown Source) at org.apache.xerces.impl.xs.traversers.XSDHandler.constructTrees(Unknown Source) at org.apache.xerces.impl.xs.traversers.XSDHandler.constructTrees(Unknown Source) at org.apache.xerces.impl.xs.traversers.XSDHandler.parseSchema(Unknown Source) at org.apache.xerces.impl.xs.XMLSchemaLoader.loadSchema(Unknown Source) at org.apache.xerces.impl.xs.XMLSchemaValidator.findSchemaGrammar(Unknown Source) at org.apache.xerces.impl.xs.XMLSchemaValidator.handleStartElement(Unknown Source) at org.apache.xerces.impl.xs.XMLSchemaValidator.startElement(Unknown Source) at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source) at org.apache.xerces.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source) EJBDeploy level: #build# ADMA5008E: The EJBDeploy program failed on file /tmp/app6232412827642995266.ear. Exception: com.ibm.etools.ejbdeploy.EJBDeploymentException: Error executing EJBDeploy ADMA0063E: An error occurred during Enterprise JavaBeans (EJB) deployment. Exception: com.ibm.etools.ejbdeploy.EJBDeploymentException: Error executing EJBDeploy ADMA5011I: The cleanup of the temp directory for application Kijkglas-ear-1905.01.35 is complete. ADMA5014E: The installation of application Kijkglas-ear-1905.01.35 failed. 2019-06-10 10:43:05,625
[main] FATAL deploylib - Jython Exception in deploy.py : 2019-06-10 10:43:05,630
[main] FATAL deploylib - Traceback (most recent call last): 2019-06-10 10:43:05,630
[main] FATAL deploylib - File "/opt/Nolio/work/WAS/gMyAppWA/all/1905.01.35/deploylib/cfgfiles/gMyAppWA-assembled.cfg", line 606, in ? application.installApplication() 2019-06-10 10:43:05,630
[main] FATAL deploylib - File "<string>", line 779, in installApplication 2019-06-10 10:43:05,630
[main] FATAL deploylib - com.ibm.ws.scripting.ScriptingException: com.ibm.ws.scripting.ScriptingException: WASX7132E: Application install for /opt/Nolio/work/WAS/gMyAppWA/all/1905.01.35/Kijkglas-ear-1905.01.35.ear failed: see previous messages for details. [2019-06-10 10:43:05] [/opt/Nolio/work/WAS/gMyAppWA/all/1905.01.35/deploylib/deploy.ksh] [ERROR] Command /var/was/profiles/AppSrv01/bin/wsadmin.sh -javaoption -Duser.timezone=CET -f deploy.py /opt/Nolio/work/WAS/gMyAppWA/all/1905.01.35/deploylib/cfgfiles/gMyAppWA-assembled.cfg /opt/Nolio/work/WAS/gMyAppWA/all/1905.01.35/deploylib/cfgfiles/gMyAppWA.TST failed. [2019-06-10 10:43:05] [/opt/Nolio/work/WAS/gMyAppWA/all/1905.01.35/deploylib/deploy.ksh] [INFO ] See also deploy.log and wsadmin.log in deploylib-8.1.4 directory.
See /opt/Nolio/work/WAS/log/gMyAppWA/all/stdout.log.2019-06-10_10_37_57_285 and /opt/Nolio/work/WAS/gMyAppWA/all/1905.01.35/deploylib/deploy.log for more information
/gMyAppWA.TST failed. [2019-06-10 10:43:05] [/opt/Nolio/work/WAS/gMyAppWA/all/1905.01.35/deploylib/deploy.ksh] [INFO ] See also deploy.log and wsadmin.log in deploylib-8.1.4 directory.
Is there any rouge process or something blocking in background ?
You didn't write your version, topology (single, network deployment), nor way you deploy your app (console, wsadmin ohter).
As you can see in the log, there is OutOfMemoryError during EJB deploy call.
You need to increase memory for ejb deploy you can either set it file or in OS level. Check this post Getting OutofMemory condition while deploying a large application in WebSphere Application Server
1) Set it in install-root/deploytool/itp/EJBDeploy.sh file
EJBDEPLOY_JVM_HEAP="-Xms1024 -Xmx1024" at the beginning of the
ejbdeploy.sh file.
2) Set it in operating System Environment.
Set EJBDEPLOY_JVM_HEAP= '-Xms1024 -Xmx1024' in OS environment
variable.
I'd also increase memory for your admin server in AdminAgent01 profile, as it looks like you are using admin agent.
Recommend not to update the ejbdeploy.sh. When update the WebSphere to a new fixpack, the ejbdeploy.sh will be restored.
Increasing heap size through admin console
login to admin console
go to "System administration" > "Deployment manager" > "Configuration" tab > "Server Infrastructure" section on the right > "Java and Process Management" > "Process definition"
"Additional Properties" section on the right > "Environment Entries"
"New" entry by providing the Name EJBDEPLOY_JVM_HEAP and value "-Xms256m -Xmx1024m"
Save and synchronize
restart the DM server
I have several parallel Spark jobs doing the same thing, they work on separate input/output dirs, at the end they write results to parquet from dataframe using one of the columns as a partitioner. Jobs with the biggest inputs often fail. Some of executors start to fail with below exceptions, then a stage fails and start recalculating a failed partition, if number of failed stages reaches 4(if it reaches, sometimes it doesn't and the whole job finishes successfully) the whole job is canceled.
Stages fails with these failure reasons(from spark UI):
org.apache.spark.shuffle.FetchFailedException
Connection closed by
peer
I tried to find clues on the Internet and it seems the reason maybe speculative execution, but I don't enable it in Spark, any other ideas what is the reason of that?
Spark job code:
sqlContext
.createDataFrame(finalRdd, structType)
.write()
.partitionBy(PARTITION_COLUMN_NAME)
.parquet(tmpDir);
Exceptions in executors:
16/09/14 11:04:06 ERROR datasources.DynamicPartitionWriterContainer: Aborting task.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_006023_0/partition=2/part-r-06023-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet] for [DFSClient_NONMAPREDUCE_1489398656_198] for client [10.117.102.72], because this file is already being created by [DFSClient_NONMAPREDUCE_-2049022202_200] on [10.117.102.15]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3152)
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141105_0001_m_006489_0/partition=2/part-r-06489-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet (inode 318361396): File does not exist. Holder DFSClient_NONMAPREDUCE_-1428957718_196 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3625)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3428)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141105_0001_m_006310_0/partition=2/part-r-06310-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet] for [DFSClient_NONMAPREDUCE_-419723425_199] for client [10.117.102.44], because this file is already being created by [DFSClient_NONMAPREDUCE_596138765_198] on [10.117.102.35]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3152)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_005877_0/partition=2/part-r-05877-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet (inode 318359423): File does not exist. Holder DFSClient_NONMAPREDUCE_193375828_196 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3625)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3428)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_005621_0/partition=2/part-r-05621-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet] for [DFSClient_NONMAPREDUCE_498917218_197] for client [10.117.102.36], because this file is already being created by [DFSClient_NONMAPREDUCE_-578682558_197] on [10.117.102.16]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3152)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_006311_0/partition=2/part-r-06311-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet (inode 318359109): File does not exist. Holder DFSClient_NONMAPREDUCE_-60951070_198 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3625)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3428)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3284)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_006215_0/partition=2/part-r-06215-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet (inode 318359393): File does not exist. Holder DFSClient_NONMAPREDUCE_-331523575_197 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3625)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3428)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/erm/data/core/internal/ekp/stg/tmp/Z_PLAN_OPER/_temporary/0/_temporary/attempt_201609141104_0001_m_006311_0/partition=2/part-r-06311-482b0b4d-1174-4c76-b203-92b2b47c78cb.parquet] for [DFSClient_NONMAPREDUCE_1869576560_198] for client [10.117.102.44], because this file is already being created by [DFSClient_NONMAPREDUCE_-60951070_198] on [10.117.102.70]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3152)
Spark UI:
We use Spark 1.6 (CDH 5.8)
I'm running OpsCenter 5.1.1 with Datastax Enterprise 4.5.1. It's a 3-node cluster on AWS and I'm backing up to S3 (still...) I've started seeing a new error. I think this is a different error than any I've posted b4.
$ cqlsh
Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.8.39 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
I am seeing this error in the agent.log file
node1_agent.log: SEVERE: error after writing 15736832/16777216 bytes to https://cassandra-dev-bkup.s3.amazonaws.com/snapshots/407bb4b1-5c91-43fe-9d4f-767115668037/sstables/1430904167-reporting_test-transaction_lookup-jb-288-Index.db?partNumber=2&uploadId=.MA3X4RYssg7xL_Hr7Msgze.J4exDq9zZ_0Y7qEj9gZhJ570j73kZNr5_nbxactmPMJeKf0XyZfEC0KAplWOz9lpyRCtNeeDCvCmtEXDchH8F1J2c57aq4MrxfBcyiZr
java.io.IOException: Error writing request body to server
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3192)
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3175)
at com.google.common.io.CountingOutputStream.write(CountingOutputStream.java:53)
at com.google.common.io.ByteStreams.copy(ByteStreams.java:179)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.writePayloadToConnection(JavaUrlHttpCommandExecutorService.java:308)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.convert(JavaUrlHttpCommandExecutorService.java:192)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.convert(JavaUrlHttpCommandExecutorService.java:72)
at org.jclouds.http.internal.BaseHttpCommandExecutorService.invoke(BaseHttpCommandExecutorService.java:95)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.invoke(InvokeSyncToAsyncHttpMethod.java:128)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.apply(InvokeSyncToAsyncHttpMethod.java:94)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.apply(InvokeSyncToAsyncHttpMethod.java:55)
at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
at com.sun.proxy.$Proxy48.uploadPart(Unknown Source)
at org.jclouds.aws.s3.blobstore.strategy.internal.SequentialMultipartUploadStrategy.prepareUploadPart(SequentialMultipartUploadStrategy.java:111)
at org.jclouds.aws.s3.blobstore.strategy.internal.SequentialMultipartUploadStrategy.execute(SequentialMultipartUploadStrategy.java:93)
at org.jclouds.aws.s3.blobstore.AWSS3BlobStore.putBlob(AWSS3BlobStore.java:89)
at org.jclouds.blobstore2$put_blob.doInvoke(blobstore2.clj:246)
at clojure.lang.RestFn.invoke(RestFn.java:494)
at opsagent.backups.destinations$create_blob$fn__12007.invoke(destinations.clj:69)
at opsagent.backups.destinations$create_blob.invoke(destinations.clj:64)
at opsagent.backups.destinations$fn__12170.invoke(destinations.clj:192)
at opsagent.backups.destinations$fn__11799$G__11792__11810.invoke(destinations.clj:24)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339$fn__12344$fn__12375.invoke(staging.clj:61)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339$fn__12344.invoke(staging.clj:59)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339.invoke(staging.clj:56)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:940)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:944)
at clojure.core.async.impl.ioc_macros$take_BANG_$fn__7592.invoke(ioc_macros.clj:953)
at clojure.core.async.impl.channels.ManyToManyChannel$fn__4097.invoke(channels.clj:102)
at clojure.lang.AFn.run(AFn.java:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
TL;DR -
Your SSTable which is 38866048 bytes, is both on your filesystem and on S3. This means the file has transferred over and you are in good shape. No need to worry about this error (though I opened an internal ticket to handle this kind of exception rather than throw a dump).
Details - A summary of what I suspect happened
1) There was a file transfer error when you reached 15736832 out of the 16777216 byte slice of the sstable.
2) At this point OpsCenter did not finish transferring the table or leave a partial version in s3
3) Another backup attempt later on moved the sstable with no error and a valid backup exists.
./bin/accumulo shell -u root
Password: ******
2015-02-14 15:18:28,503 [impl.ServerClient] WARN : There are no tablet servers: check that zookeeper and accumulo are running.
2015-02-14 13:58:52,878 [tserver.NativeMap] ERROR: Tried and failed to load native map library from /home/hduser/hadoop/lib/native::/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
java.lang.UnsatisfiedLinkError: no accumulo in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
at java.lang.Runtime.loadLibrary0(Runtime.java:849)
at java.lang.System.loadLibrary(System.java:1088)
at org.apache.accumulo.tserver.NativeMap.<clinit>(NativeMap.java:80)
at org.apache.accumulo.tserver.TabletServerResourceManager.<init>(TabletServerResourceManager.java:155)
at org.apache.accumulo.tserver.TabletServer.config(TabletServer.java:3560)
at org.apache.accumulo.tserver.TabletServer.main(TabletServer.java:3671)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.accumulo.start.Main$1.run(Main.java:141)
at java.lang.Thread.run(Thread.java:745)
2015-02-14 13:58:52,915 [tserver.TabletServer] ERROR: Uncaught exception in TabletServer.main, exiting
java.lang.IllegalArgumentException: Maximum tablet server map memory 83,886,080 and block cache sizes 28,311,552 is too large for this JVM configuration 48,693,248
at org.apache.accumulo.tserver.TabletServerResourceManager.<init>(TabletServerResourceManager.java:166)
at org.apache.accumulo.tserver.TabletServer.config(TabletServer.java:3560)
at org.apache.accumulo.tserver.TabletServer.main(TabletServer.java:3671)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.accumulo.start.Main$1.run(Main.java:141)
at java.lang.Thread.run(Thread.java:745)
The above error is shown in the tserver_localhost.log. can anyone help me with this issue.
I have hadoop running on single-node mode, zookeeper running, and i followed the instructions in the Readme file of accumulo.
I dont know how to start a tablet server.There was no explanation regarding this in the readme,could anyone help me with this.
This is the confluence of two problems.
First, your Accumulo can't find the native libraries it would use for off-heaping the in-memory-map for live edits. Knowing your version of Accumulo, how you deployed accumulo, and seeing your accumulo-env.sh would be needed to diagnose why it may have failed. (asking on the user mailing list would be best) Take a look at the README for your version under the Building section for "native map support".
For example, the passage for version 1.6.1 gives the following advice for building them yourself without a full source tree:
Alternatively, you can manually unpack the accumulo-native tarball in the
$ACCUMULO_HOME/lib directory. Change to the accumulo-native directory in
the current directory and issue make. Then, copy the resulting 'libaccumulo'
library into the $ACCUMULO_HOME/lib/native/map.
$ mkdir -p $ACCUMULO_HOME/lib/native/map
$ cp libaccumulo.* $ACCUMULO_HOME/lib/native/map
Normally, not having the native libraries available is a soft failure; Accumulo will happily issue a WARN and then rely on a pure-java implementation.
Your second problem is caused by incorrect memory configuration. Accumulo relies on a single configuration parameter to tune memory use for both the native in-memory-map and the java one. The memory for the native implementation is allocated outside of the JVM heap and can be substantial (in the 1-16GB range depending on target workload). When running with the Java implementation, that same configuration value takes away space carved from the max heap size.
Based on your log output, you have configured a total max heap for tabletservers of ~46MB. You have allocated 27MB of this for the block cache and 80MB for the in-memory-map. The error you see is because those two values would result in an OOM.
You can increase the total Java Heap in accumulo-env.sh:
# Probably looks like this
test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx48m -Xms48m "
# change this part to give it more memory --^^^^^^
And/or you can tune how much space should be used for the native maps, block cache, and index cache in accumulo-site.xml
<!-- Amount of space to hold incoming random writes -->
<property>
<name>tserver.memory.maps.max</name>
<value>80M</value>
</property>
<!-- Amount of space for holding blocks of data read out of HDFS -->
<property>
<name>tserver.cache.data.size</name>
<value>7M</value>
</property>
<!-- Amount of space for holding indexes read out of HDFS -->
<property>
<name>tserver.cache.index.size</name>
<value>20M</value>
</property>
How you should balance these three will depend on how much memory you have and what your workload looks like. Keep in mind that more than just those two things need to go into your total Java heap (like atleast one copy of the current cell being written / read on each RPC).
I have found the solution to this.
I have removed all the config files from the config folder in accumulo and used the bootstrap_config.sh file in bin folder,..which created the config files based on the input i have given and after that i initialized accumulo again and i was able to open the shell and the error was gone.
Thanks for the help.
I'm trying to open a read-only Derby database over a network client connection (using ij / derbyclient.jar).
I have created a read-only database:
jar cMf sample.jar sample
The Derby Network Server is started.
I have tried the following connection URLs:
connect 'jdbc:derby:jar://localhost:1527/sample.jar';
connect 'jdbc:derby:jar://localhost:1527/(sample.jar)sample';
connect 'jdbc:derby://localhost:1527/jar:(sample.jar)sample';
But none of the above URLs work.
The only URL that works is:
connect 'jdbc:derby:jar:(sample.jar)sample';
It appears that read-only Derby databases can only be opened in embedded mode. Is this true ?
Solved:
After checking the "derby.log", the problem was that the read-only database needs to be able to create a temporary file.
derby.log:
java.sql.SQLException: Failed to start database 'jar:(sample.jar)sample' with class loader sun.misc.Launcher$AppClassLoader#1d450337, see the next exception for details.
...
Caused by: java.sql.SQLException: Failed to start database 'jar:(sample.jar)sample' with class loader sun.misc.Launcher$AppClassLoader#1d450337, see the next exception for details.
...
Caused by: java.sql.SQLException: Java exception: 'Unable to create temporary file: java.lang.SecurityException'.
...
Caused by: java.lang.SecurityException: Unable to create temporary file
The solution is to define a temporary directory for the database. This can be done with the "derby.storage.tempDirectory" property:
System-wide in "derby.properties":
derby.storage.tempDirectory=c:/temp
Database-wide
CallableStatement cs =
conn.prepareCall("CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(?, ?)");
cs.setString(1, "derby.storage.tempDirectory");
cs.setString(2, "c:/temp");
cs.execute();
cs.close();
The network URL is:
connect 'jdbc:derby://localhost:1527/jar:(sample.jar)sample';