RegionServer throwing InvalidToken exception in logs - hadoop

I have noticed the following error in my region server logs:
org.apache.hadoop.security.token.SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to /apps/hbase/data/data/default/my-table/eb512b4b9f9fa9cb2a1a3930d9c9f18b/r/df1694a4542f419992f86b219541fb6fBlock token with block_token_identifier (expiryDate=1519482398334, keyId=1283446178, userId=hbase, blockPoolId=BP-1872413417-101.33.253.88-1458393583173, blockId=1133036852, access modes=[READ]) is expired.
at org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:591)
at org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490)
at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782)
at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716)
at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)
at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1118)
at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:1056)
at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1411)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1374)
at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:89)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1380)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:259)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:634)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:584)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:247)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:156)
at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:363)
at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:217)
at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2003)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5338)
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2494)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2480)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2462)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6527)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6506)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:579)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2031)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
2018-02-24 20:03:10,947 INFO [RW.default.readRpcServer.handler=151,queue=38,port=16020] hdfs.DFSClient: Access token was invalid when connecting to /101.32.65.239:50010 : org.apache.hadoop.security.token.SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to /apps/hbase/data/data/default/my-table/eb512b4b9f9fa9cb2a1a3930d9c9f18b/r/df1694a4542f419992f86b219541fb6fBlock token with block_token_identifier (expiryDate=1519482398334, keyId=1283446178, userId=hbase, blockPoolId=BP-1872413417-101.33.253.88-1458393583173, blockId=1133036852, access modes=[READ]) is expired.
I see the configs related to this have already been tweaked by us:
dfs.client.read.shortcircuit.streams.cache.expiry.ms = 300000 (hdfs-default.xml)
dfs.client.read.shortcircuit.streams.cache.size = 4096 (hdfs-site.xml)
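For reference, a minimal sketch of how those two settings look when set explicitly in hdfs-site.xml (the values are the ones quoted above; note the first one is only the shipped default unless it is overridden):
<property>
  <name>dfs.client.read.shortcircuit.streams.cache.expiry.ms</name>
  <value>300000</value>
</property>
<property>
  <name>dfs.client.read.shortcircuit.streams.cache.size</name>
  <value>4096</value>
</property>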
I also see a JIRA raised by Nick Dimiduk for this same problem, which is still open; in its comments another user facing the same issue suggested a config, and we are already using that config as well.
Has anyone faced this issue or know about it, and do you have any resolution or suggestion?

Related

Spinnaker & Okta integration failing

Scenario:
Upgraded Spinnaker to 1.12.0. No other config changes were made that would impact this integration (we had to modify an S3 IAM because it quit working). The Okta integration stopped working. The public key was reissued for the ingress during the install process, which may be relevant.
SAML-TRACE shows payload getting to okta and back
Spinnaker throws two different errors depending on browser and how I get there.
Direct link to deck url: (500) No IDP was configured, please update included metadata with at least one IDP (seen in browser and gate)
Okta "chicklet" in okta dashboard: (401) Authentication Failed: Incoming SAML message is invalid
Config details (again none of this changed):
Downloading metadata directly
JKS is being leveraged and is valid
service url is confirmed
alias for JKS is confirmed
I had this issue as well when upgrading from 1.10.13 to 1.12.2. I found lots of these error messages in Gate's logs:
2019-02-19 05:31:30.421 ERROR 1 --- [.0-8084-exec-10] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [org.opensaml.saml2.metadata.provider.MetadataProviderException: No IDP was configured, please update included metadata with at least one IDP] with root cause
org.opensaml.saml2.metadata.provider.MetadataProviderException: No IDP was configured, please update included metadata with at least one IDP
at org.springframework.security.saml.metadata.MetadataManager.getDefaultIDP(MetadataManager.java:795) ~[spring-security-saml2-core-1.0.2.RELEASE.jar:1.0.2.RELEASE]
at org.springframework.security.saml.context.SAMLContextProviderImpl.populatePeerEntityId(SAMLContextProviderImpl.java:157) ~[spring-security-saml2-core-1.0.2.RELEASE.jar:1.0.2.RELEASE]
at org.springframework.security.saml.context.SAMLContextProviderImpl.getLocalAndPeerEntity(SAMLContextProviderImpl.java:127) ~[spring-security-saml2-core-1.0.2.RELEASE.jar:1.0.2.RELEASE]
at org.springframework.security.saml.SAMLEntryPoint.commence(SAMLEntryPoint.java:146) ~[spring-security-saml2-core-1.0.2.RELEASE.jar:1.0.2.RELEASE]
at org.springframework.security.web.access.ExceptionTranslationFilter.sendStartAuthentication(ExceptionTranslationFilter.java:203) ~[spring-security-web-4.2.9.RELEASE.jar:4.2.9.RELEASE]
...
After downgrading back to 1.10.13, I upgraded to the next version, 1.11.0, and found that's when the issue started. Eventually, I looked at Gate's logs from the launch of the Container and found:
2019-02-20 22:31:40.132 ERROR 1 --- [0.0-8084-exec-3] o.o.s.m.provider.HTTPMetadataProvider : Error retrieving metadata from https://000000000000.okta.com/app/00000000000000000/sso/saml/metadata
javax.net.ssl.SSLException: Error in hostname verification
at org.opensaml.ws.soap.client.http.TLSProtocolSocketFactory.verifyHostname(TLSProtocolSocketFactory.java:241) ~[openws-1.5.4.jar:na]
at org.opensaml.ws.soap.client.http.TLSProtocolSocketFactory.createSocket(TLSProtocolSocketFactory.java:186) ~[openws-1.5.4.jar:na]
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707) ~[commons-httpclient-3.1.jar:na]
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387) ~[commons-httpclient-3.1.jar:na]
...
This led me to realize that the TLS certificate was being rejected by Gate. I'm not sure why it suddenly started failing the check. Up to this point, I had it configured as:
$ hal config security authn saml edit --metadata https://000000000000.okta.com/app/00000000000000000/sso/saml/metadata
I ended up downloading the metadata file and redeploying with halyard.
$ wget https://000000000000.okta.com/app/00000000000000000/sso/saml/metadata
$ hal config security authn saml edit --metadata "${PWD}/metadata"
$ hal config version edit --version 1.12.2
$ hal deploy apply
Opened up a private browser window as suggested by the Spinnaker documentation and Gate started redirecting to Okta correctly again.
Issue filed, https://github.com/spinnaker/spinnaker/issues/4017.
So I ended up finding the answer: the Tomcat config for Gate apparently changed in later versions of Spinnaker.
I created this snippet in ~/.hal/default/profiles/gate-local.yml:
server:
  tomcat:
    protocolHeader: X-Forwarded-Proto
    remoteIpHeader: X-Forwarded-For
    internalProxies: .*
I deployed Spinnaker again and it was back to working.
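For anyone following along, the profile change only takes effect after redeploying with halyard, the same way as in the answer above:
$ hal deploy apply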

Vertica admintools error

When I try to connect to the database from admintools, I am getting the following error:
Error: Unable to connect to database
Hint: Username or password could be invalid
I have found the following error in the logs:
Apr 20 08:08:29 [24291] [vsql.connect spawn] Exception: Error! pty.fork() failed: out of pty devices
Do you know what the problem is?
Your node might be down.
Check the logs under /opt/vertica/log, or check /opt/vertica/config/admintools.conf and make sure the restart policy section is right:
[Database:mydb]
host = 11.11.11.11
restartpolicy = ksafe
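As a first check from the command line, something like the following can confirm whether a node is actually down (a sketch only; the database name mydb comes from the config above, and the exact admintools options may differ by Vertica version):
# show cluster/node state for the database (name assumed from the config above)
/opt/vertica/bin/admintools -t view_cluster -d mydb
# look at the most recent admintools logs for the pty error quoted above
ls -lt /opt/vertica/log | head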

Hive Browser Throwing Error

I am trying to run some basic queries in the Hive editor in the Hue browser, but it returns the following error, whereas my Hive CLI works fine and is able to execute queries. Could someone help me?
Fetching results ran into the following error(s):
Bad status for request TFetchResultsReq(fetchType=1,
operationHandle=TOperationHandle(hasResultSet=True,
modifiedRowCount=None, operationType=0,
operationId=THandleIdentifier(secret='r\t\x80\xac\x1a\xa0K\xf8\xa4\xa0\x85?\x03!\x88\xa9',
guid='\x852\x0c\x87b\x7fJ\xe2\x9f\xee\x00\xc9\xeeo\x06\xbc')),
orientation=4, maxRows=-1):
TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="Couldn't
find log associated with operation handle: OperationHandle
[opType=EXECUTE_STATEMENT,
getHandleIdentifier()=85320c87-627f-4ae2-9fee-00c9ee6f06bc]",
sqlState=None,
infoMessages=["*org.apache.hive.service.cli.HiveSQLException:Couldn't
find log associated with operation handle: OperationHandle
[opType=EXECUTE_STATEMENT,
getHandleIdentifier()=85320c87-627f-4ae2-9fee-00c9ee6f06bc]:24:23",
'org.apache.hive.service.cli.operation.OperationManager:getOperationLogRowSet:OperationManager.java:229',
'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:687',
'sun.reflect.GeneratedMethodAccessor14:invoke::-1',
'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43',
'java.lang.reflect.Method:invoke:Method.java:606',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78',
'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36',
'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63',
'java.security.AccessController:doPrivileged:AccessController.java:-2',
'javax.security.auth.Subject:doAs:Subject.java:415',
'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1657',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59',
'com.sun.proxy.$Proxy19:fetchResults::-1',
'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:454',
'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:672',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538',
'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39',
'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39',
'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56',
'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285',
'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1145',
'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:615',
'java.lang.Thread:run:Thread.java:745'], statusCode=3), results=None,
hasMoreRows=None)
This error could be due either to HiveServer2 not running or to Hue not having access to hive_conf_dir.
Check whether HiveServer2 has been started and is running. It uses port 10000 by default.
netstat -ntpl | grep 10000
If it is not running, start HiveServer2:
$HIVE_HOME/bin/hiveserver2
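You can also confirm that HiveServer2 accepts connections outside of Hue with beeline (assuming the default localhost:10000 endpoint and no extra authentication):
$HIVE_HOME/bin/beeline -u jdbc:hive2://localhost:10000 -e "show databases;"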
Also check the Hue configuration file hue.ini. The hive_conf_dir property must be set under the [beeswax] section. If it is not set, add this property under [beeswax]:
hive_conf_dir=$HIVE_HOME/conf
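A sketch of what the relevant hue.ini section might look like, assuming HiveServer2 runs locally on the default port (hive_server_host and hive_server_port are assumptions; hive_conf_dir is the key that matters here, and it should be the literal path rather than the $HIVE_HOME variable):
[beeswax]
  # where Hue reaches HiveServer2 (assumed local, default port)
  hive_server_host=localhost
  hive_server_port=10000
  # directory containing hive-site.xml, i.e. the literal path of $HIVE_HOME/conf
  hive_conf_dir=/path/to/hive/conf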
Restart supervisor after making these changes.

Windows kernel error log

There are many of the same events across all my Hyper-V servers. I haven't found any solution for this event. If you have a solution for this error, can you help me?
>Log Name: Microsoft-Windows-Kernel-EventTracing/Admin
>Source: Microsoft-Windows-Kernel-EventTracing
>Date: 4/28/2016 1:34:27 PM
>Event ID: 2
>Task Category: Session
>Level: Error
>Keywords: Session
>User: NETWORK SERVICE
>Computer: HYPERV01.prod.local
>Description:
>**Session "" failed to start with the following error: 0xC0000022**
0xC0000022 is STATUS_ACCESS_DENIED: the account trying to start the trace session (NETWORK SERVICE here) is being denied access, so its permissions are probably restricted.
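If the restriction turns out to be on the folder the trace session writes to, one hedged way to inspect and fix it from an elevated command prompt is with icacls (the path below is only an example of a common ETW log location, not something taken from your event):
rem inspect the current permissions on the assumed log folder
icacls "C:\Windows\System32\LogFiles\WMI"
rem grant NETWORK SERVICE modify rights, inherited by subfolders and files
icacls "C:\Windows\System32\LogFiles\WMI" /grant "NT AUTHORITY\NETWORK SERVICE":(OI)(CI)M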

Too few bytes (-1) received from OPMN response error

When I entered the Oracle Application Server Control panel, an error message appeared: "Too few bytes (-1) received from OPMN response". It doesn't allow me to do any deployments. How can I fix this error?
Trace Information:
oracle.ias.opmn.optic.OpticBadConnectException: Too few bytes (-1) received from OPMN response
at oracle.ias.opmn.optic.OpmnPhone.rcvResponse(OpmnPhone.java:529)
at oracle.ias.opmn.optic.OpmnPhone.makePhoneCall(OpmnPhone.java:193)
at oracle.ias.opmn.optic.OpmnPhone.request(OpmnPhone.java:130)
at oracle.ias.opmn.optic.OpmnQuery.getBuf(OpmnQuery.java:347)
at oracle.ias.opmn.optic.OpmnQuery.getDom(OpmnQuery.java:467)
at oracle.ias.opmn.optic.OpmnQuery.getIasCluster(OpmnQuery.java:941)
at oracle.sysman.ias.studio.cluster.OpticTopologyAdminBean.initializeAppServers(OpticTopologyAdminBean.java:1117)
at oracle.sysman.ias.studio.cluster.TopologyHelper.prepareData(TopologyHelper.java:1278)
at oracle.sysman.ias.studio.sdk.AbstractController.prepareData(AbstractController.java:875)
at oracle.sysman.emSDK.svlt.PageHandler.handleRequest(PageHandler.java:391)
at oracle.sysman.emSDK.svlt.EMServlet.myDoGet(EMServlet.java:765)
at oracle.sysman.emSDK.svlt.EMServlet.doGet(EMServlet.java:283)
at oracle.sysman.ias.studio.app.StudioConsole.doGet(StudioConsole.java:297)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:743)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
at com.evermind.server.http.ResourceFilterChain.doFilter(ResourceFilterChain.java:64)
at oracle.sysman.ias.studio.app.BrowserVersionFilter.doFilter(BrowserVersionFilter.java:75)
at com.evermind.server.http.EvermindFilterChain.doFilter(EvermindFilterChain.java:15)
at oracle.sysman.ias.studio.app.MultipleJVMFilter.doFilter(MultipleJVMFilter.java:85)
at com.evermind.server.http.EvermindFilterChain.doFilter(EvermindFilterChain.java:17)
at oracle.sysman.ias.studio.app.PostLogonFilter.doFilter(PostLogonFilter.java:80)
at com.evermind.server.http.EvermindFilterChain.doFilter(EvermindFilterChain.java:17)
at oracle.sysman.ias.studio.app.ShortHostnameRedirectFilter.doFilter(ShortHostnameRedirectFilter.java:68)
at com.evermind.server.http.ServletRequestDispatcher.invoke(ServletRequestDispatcher.java:619)
at com.evermind.server.http.ServletRequestDispatcher.forwardInternal(ServletRequestDispatcher.java:368)
at com.evermind.server.http.HttpRequestHandler.doProcessRequest(HttpRequestHandler.java:866)
at com.evermind.server.http.HttpRequestHandler.processRequest(HttpRequestHandler.java:448)
at com.evermind.server.http.AJPRequestHandler.run(AJPRequestHandler.java:302)
at com.evermind.server.http.AJPRequestHandler.run(AJPRequestHandler.java:190)
at oracle.oc4j.network.ServerSocketReadHandler$SafeRunnable.run(ServerSocketReadHandler.java:260)
at com.evermind.util.ReleasableResourcePooledExecutor$MyWorker.run(ReleasableResourcePooledExecutor.java:303)
at java.lang.Thread.run(Thread.java:595)
Error logs in the opmn.log file:
.
.
.
16/02/03 10:14:29 [ons-connect] Local connection 127.0.0.1,6100 invalid form factor
16/02/03 10:16:29 [ons-connect] Local connection 127.0.0.1,6100 invalid form factor
16/02/03 10:18:42 [ons-connect] Local connection 127.0.0.1,6100 invalid form factor
16/02/03 10:19:29 [ons-connect] Local connection 127.0.0.1,6100 invalid form factor
.
.
After a long search, I found that my error was related to owner permissions on the Oracle server. The oracle user manages the instances, so the files under the oracle/ folder must be owned by the oracle user. If you log in as root or another user and change some file permissions by running or rewriting them, especially the config files under j2ee/home/persistence, opmn/conf, and opmn/bin, the oracle user can no longer execute those files. I set the file ownership back to the oracle user, restarted the server, and the error was gone.
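In shell terms, the fix amounted to something like this (ORACLE_HOME and the oracle:oinstall owner/group are assumptions about a typical install; adjust to yours):
# as root: give the directories mentioned above back to the oracle user
chown -R oracle:oinstall $ORACLE_HOME/j2ee/home/persistence $ORACLE_HOME/opmn/conf $ORACLE_HOME/opmn/bin
# then, as the oracle user, bounce the OPMN-managed processes
$ORACLE_HOME/opmn/bin/opmnctl stopall
$ORACLE_HOME/opmn/bin/opmnctl startall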
