Couldn’t acquire exclusive lock on DB at ‘/eventstore/db’ - ubuntu-20.04

I’m trying to install EventStore on Ubuntu 20.04, but every time I run eventstored --what-if (as root, with sudo, or as a regular user) I get the following error message: Couldn't acquire exclusive lock on DB at '/eventstore/db'.
I tried many things:
ensuring that the eventstore user and group own the folder
reinstalling eventstore
rebooting the server
stopping the process with systemctl stop eventstore and starting it again
launching the service first (as root, with sudo, or as a regular user) before running eventstored --what-if
I can’t figure out why I keep getting this message, as if several instances of eventstore were launched at the same time.
EDIT:
Here is my config file (/etc/eventstore/eventstore.conf):
# Paths
Db: /eventstore/db
Index: /eventstore/index
Log: /eventstore/logs
# Certificates configuration
CertificateFile: /etc/eventstore/certs/cert.crt
CertificatePrivateKeyFile: /etc/eventstore/certs/privkey.key
TrustedRootCertificatesPath: /etc/ssl/certs
CertificateReservedNodeCommonName: "*.mathob-jehanno.com"
# Network configuration
IntIp: 37.187.2.103
ExtIp: 37.187.2.103
IntHostAdvertiseAs: mathob-jehanno.com
ExtHostAdvertiseAs: mathob-jehanno.com
HttpPort: 2113
IntTcpPort: 1112
EnableExternalTcp: false
EnableAtomPubOverHTTP: false
# Projections configuration
RunProjections: None

This happened to me before. I was running v20 without the necessary settings (the certificates were missing). The server crashed because of this, but the last message you see is Couldn't acquire exclusive lock on DB at '/eventstore/db'. Look closely: that message may only be a warning, with the real reason for the crash reported earlier in the log, in the stack trace of the original error.

OK, so first of all, the comments helped a lot:
This error message follows another one that gives more detail about what the actual problem is.
eventstored --what-if is supposed to be run while the service is not running, so you need to stop the service first (systemctl stop eventstore).
I then changed the db, index and logs paths to match the default values, which spared me some permission errors.
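Since the lock message tends to be the last line rather than the cause, it can help to pull out the earliest error in the log instead of reading from the bottom. Here is a minimal sketch of that idea. The bracketed severity tokens and the sample lines are assumptions made up for illustration, not real EventStore output, so adjust the pattern to whatever your log files actually contain.

```python
import re

# Assumption: severity shows up as a bracketed token such as [ERR] or [FTL].
# Change LEVEL_PATTERN to match the format of your own log files.
LEVEL_PATTERN = re.compile(r"\[(ERR|FTL|ERROR|FATAL)\]", re.IGNORECASE)


def first_error(lines):
    """Return (line_number, text) of the earliest error-level line, or None."""
    for number, text in enumerate(lines, start=1):
        if LEVEL_PATTERN.search(text):
            return number, text.rstrip("\n")
    return None


# Made-up sample: the lock message comes last, the root cause comes first.
sample = [
    "[INF] starting node",
    "[ERR] certificate file not found",
    "[ERR] Couldn't acquire exclusive lock on DB at '/eventstore/db'",
]

print(first_error(sample))
```

With a real log you would pass the open file object (or its lines) instead of the sample list.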


Windows Process Activation Service (WAS) will not start

IIS 10 will not restart on my PC. When I navigate to localhost, I get a 'localhost refused to connect' message. After looking through the event viewer, it turns out that the issue is that the Windows Process Activation Service (WAS) will not start.
The error message given is:
'The Windows Process Activation Service (WAS) encountered an error while handling key generation. This will prevent WAS from starting correctly. The data field contains the error number.'
When I try to start the service manually from the services app, I get the following:
Has anyone experienced this issue before? Any help would be greatly appreciated, I've been trawling the internet for several days trying to find a solution to no avail.
We've run into this issue several times after recent Windows Updates. In all cases, the following worked (we got it from a Microsoft support rep).
Run the following from an admin PowerShell prompt:
reg delete HKLM\SYSTEM\CurrentControlSet\Services\WAS\Parameters /v GenerateKeys /f
net start w3svc
The keys will be regenerated, and the IIS AppPools can then be started.
According to your error message, WAS cannot access the machine key at startup. Machine keys are usually used to encrypt sensitive information in the config files, and WAS will not be able to start if there is no machine key to use.
The easiest and most common method is to uninstall and reinstall WAS.
If it still cannot start, try deleting the registry entry NanoSet with cmd.
If neither method helps, you can refer to this to delete the machine keys and let WAS create new ones when it starts.
If somebody is still fighting with this issue, please check Event Viewer under the System filter and look for any logs related to WAS. In my case I found the following entry:
The Windows Process Activation Service (WAS) encountered an error while handling key generation. This will prevent WAS from starting correctly. The data field contains the error number.
So I restarted the CNG Key Isolation service, and everything is working now.

Solaris SMF for Oracle DB is ok but not for the Listener. How can an SMF method work under svcadm but not to restart the service when it has failed?

I have 2 questions about Solaris SMF. (I am an SMF newbie.)
I set up the Oracle RDBMS service in SMF as per https://docs.oracle.com/cd/E37838_01/html/E61677/odbstartstop.html
The database part works entirely as expected, so I added a listener as another service instance seeing as the method script has an option of 'listener' as an argument instead of 'db' and will run a lsnrctl start ${LISTENER} instead of using sqlplus to access and then start or stop a database instance.
The svcadm enable and svcadm disable of the service start and stop the listener as expected. The issue is that the framework detects when lsnrctl has stopped but does nothing to restart it. See below:
svc:/site/oracle/db/oracle12lsnr:LISTENER4 (?)
State: maintenance since May 21, 2020 03:25:39 PM BST
Reason: Method failed.
See: http://support.oracle.com/msg/SMF-8000-8Q
See: /var/svc/log/site-oracle-db-oracle12lsnr:LISTENER4.log
Impact: This service is not running.
The "Reason: Method failed." line is not consistent with the fact that invoking the method via svcadm enable (or disable) shows that the method works just fine.
Further investigation - I killed the lsnrctl process from root and got this from svcs -Lv
[ May 22 14:13:30 Executing stop method ("/lib/svc/method/svc-oracle12-database lsnr stop LISTENER4"). ]
LSNRCTL for Solaris: Version 12.1.0.2.0 - Production on 22-MAY-2020 14:13:30
Copyright (c) 1991, 2016, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=orahost.some.domain)(PORT=1521)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
TNS-00511: No listener
Solaris Error: 146: Connection refused
[ May 22 14:13:30 Method "stop" exited with status 95. ]
So the first question has changed and is now: Why would it run the stop method? The db version of this service runs the start method when the database service goes down.
Answer to Q1: the service framework runs the stop method followed by the start method. Once this was established, a fresh look at the method script revealed a flaw: the stop method exited with an error if it couldn't contact the tnslsnr process. (Logic fail. If the tnslsnr process has been killed, you can't test a connection to it!)
To be honest I am struggling with the sheer volume of information to get through. I am currently reading through the pdf version of the URL above. I had a quick look here at Moellenkamp's blog http://blog.moellenkamp.org/archives/18-Auditing-a-single-SMF-service-revisted.html but I've not implemented that auditing service yet - assuming it would help anyway. If anyone has any thoughts as to why this isn't working I'd be really grateful.
The second question is this:
In the example the manifest is stored in /lib/svc/manifest/site/oracle/db and first time around I changed this to /lib/svc/manifest/site/oracle12db since 2 subdirectories (after .../site) seemed a little over the top and this resulted in the service just failing to work in any way (always in maintenance). I had adjusted the manifest xml file to match the changed directory structure. I was baffled and after fiddling around I simply changed the xml files and directory structure to match the example and it all worked. Why would that be? Is there some formula to the layers in the service_name or service_bundle?
I haven't yet read anything that says the directory structure has to be extended as per the example. As far as I can tell I had not made a typo in the xml file, especially as reverting the changes to match the original example only meant altering the service_name and service_bundle lines to match the extended directory structure.
To diagnose the reason for a service failure, always start with the service log, whose path is shown in the svcs output. Or just use "svcs -Lv " to display it directly.
Another 'easy when you know how'.
Upon failure the framework runs the stop method followed by the start method.
I can now scan through the pdf and look to confirm this and also such things as restart and refresh.
I'll vote for user13596356's response because the quick turnaround on checking the log and a bit of input from an SMF question by user40330 from 7 yrs ago got me looking at the service method script which was flawed.
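The flaw in the stop method described above comes down to one decision: if the listener is already gone, "stop" has nothing left to do and should report success, so that the framework can go on to run the start method. The real method script is a shell script; the following is only a sketch of that decision in Python. The helper name stop_exit_status is hypothetical, and the exit values are the ones defined in /lib/svc/share/smf_include.sh (95 matches the status in the log above).

```python
SMF_EXIT_OK = 0          # success, as defined in smf_include.sh
SMF_EXIT_ERR_FATAL = 95  # the status the flawed stop method returned

# TNS codes that appear in the log above when no listener is running.
NO_LISTENER_MARKERS = ("TNS-12541", "TNS-00511")


def stop_exit_status(lsnrctl_returncode, lsnrctl_output):
    """Decide what a stop method should return to SMF (hypothetical helper)."""
    if lsnrctl_returncode == 0:
        return SMF_EXIT_OK
    # The listener is already down, so the goal of "stop" is already met.
    if any(marker in lsnrctl_output for marker in NO_LISTENER_MARKERS):
        return SMF_EXIT_OK
    # Anything else is a genuine failure worth putting the service in maintenance.
    return SMF_EXIT_ERR_FATAL


print(stop_exit_status(1, "TNS-12541: TNS:no listener"))
```

In the shell method script the equivalent would be to check the lsnrctl output (or whether the tnslsnr process exists) before deciding to exit with an error.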

Mesos framework stays inactive due to "Authentication failed: EOF"

I'm currently trying to deploy Eremetic (version 0.28.0) on top of Marathon using the configuration provided as an example. I actually have been able to deploy it once, but suddenly, after trying to redeploy it, the framework stays inactive.
By inspecting the logs I noticed a constant attempt to connect to some service that apparently never succeeds because of some authentication problem.
2017/08/14 12:30:45 Connected to [REDACTED_MESOS_MASTER_ADDRESS]
2017/08/14 12:30:45 Authentication failed: EOF
It looks like the service returning an error is ZooKeeper and more precisely it looks like the error can be traced back to this line in the Go ZooKeeper library. ZooKeeper however seems to work: I've tried to query it directly with zkCli and to run a small Spark job (where the Mesos master is given with zk:// URL) and everything seems to work.
Unfortunately I'm not able to diagnose the problem further. What could it be?
It turned out to be a configuration problem. The master URL was simply wrong and this is how the error was reported.

Redis throws (ERR operation not permitted) error even after running properly for 1 to 2 hrs

I use Redis for caching in my project, set up through Spring. The link below shows what I did in my project.
http://caseyscarborough.com/blog/2014/12/18/caching-data-in-spring-using-redis/
This code had been running fine in the production environment (RHEL 7, EC2 instance) for the last 6 to 8 months. Now it has suddenly started giving an "ERR operation not permitted" error:
org.springframework.dao.InvalidDataAccessApiUsageException: ERR operation not permitted; nested exception is redis.clients.jedis.exceptions.JedisDataException: ERR operation not permitted
at org.springframework.data.redis.connection.jedis.JedisExceptionConverter.convert(JedisExceptionConverter.java:44)
Because of this we are unable to fetch data from the Redis server, so our application doesn't work properly.
I searched for this issue and went through links such as
redis (error) ERR operation not permitted
This says to check whether "requirepass" in the redis.conf file is commented out or not. When I looked at redis.conf in the production environment, it was commented out.
Even though it is commented out, I ran the following command in redis-cli:
"AUTH foobared"
It didn't work.
Note: when we kill the running Redis instance and restart it, it works properly again and no longer gives the "ERR operation not permitted" error.
After a restart the system works properly for another one to two hours, then the same issue arises again, and it goes away again when I restart the Redis server.
Note: I tried upgrading the Redis server from 2.6 to 3, but that didn't help either.
Is your Redis exposed to the internet?
It's possibly a CONFIG SET requirepass attack.
See this SO question and @antirez's comments here
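As a rough first check for the exposure question above, you could scan redis.conf for the two directives that matter here: requirepass and bind. This is only a sketch with a hypothetical helper name (exposure_hints). It looks at the static file only, so it would not show a password set at runtime through CONFIG SET, and it says nothing about firewalls or security groups. It relies on the documented behaviour that Redis listens on all interfaces when no bind directive is given.

```python
LOOPBACK = ("127.0.0.1", "::1")


def exposure_hints(conf_text):
    """Return a list of warnings for redis.conf-style text (hypothetical helper)."""
    directives = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out directives
        parts = line.split(None, 1)
        key = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        directives[key] = value  # a later duplicate overrides an earlier one

    warnings = []
    if "requirepass" not in directives:
        warnings.append("no requirepass set in the file")
    addresses = directives.get("bind", "").split()
    if not addresses or any(a not in LOOPBACK for a in addresses):
        warnings.append("not bound to loopback only")
    return warnings


print(exposure_hints("# requirepass foobared\nport 6379\n"))
```

If both warnings show up and port 6379 is reachable from outside, anyone can connect and run CONFIG SET requirepass, which would match the pattern of the server working after a restart and then locking you out again.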

quickfix session config issues

I've compiled and trolled around the quickfix ( http://www.quickfixengine.org ) source and the examples. I figured a good starting point would be to compile (C++) and run the 'executor' example, then use the 'tradeclient' example to connect to 'executor', and send it order requests.
I created two separate session files, one for 'executor' as the acceptor and one for 'tradeclient' as the initiator. They're both running on the same Win7 PC.
'executor' runs, but tradeclient can't connect to it, and I can't figure out why. I downloaded Mini-fix and was able to send messages to executor, so I know that executor is working. I figure that the problem is with the tradeclient session settings. I've included both of them below, I was hoping someone could point out what's causing them to not communicate. They're both running on the same computer using port 56156.
---- acceptor session.txt ----
[DEFAULT]
ConnectionType=acceptor
ReconnectInterval=5
SenderCompID=EXEC
DefaultApplVerID=FIX.5.0
[SESSION]
BeginString=FIXT.1.1
TargetCompID=SENDER
HeartBtInt=5
#SocketConnectPort=
SocketAcceptPort=56156
SocketConnectHost=127.0.0.1
TransportDataDictionary=pathToXml/spec/FIX50.xml
StartTime=07:00:00
EndTime=23:00:00
FileStorePath=store
---- initiator session.txt ---
[DEFAULT]
ConnectionType=initiator
ReconnectInterval=5
SenderCompID=SENDER
DefaultApplVerID=FIX.5.0
[SESSION]
BeginString=FIXT.1.1
TargetCompID=EXEC
HeartBtInt=5
SocketConnectPort=56156
#SocketAcceptPort=56156
SocketConnectHost=127.0.0.1
TransportDataDictionary=pathToXml/spec/FIX50.xml
StartTime=07:00:00
EndTime=23:00:00
FileLogPath=log
FileStorePath=store
--------end------
Update: Thanks for the responses... It turns out that my log file directories didn't exist. Once I created them, both programs started communicating. There must have been some logging error that didn't throw an exception but prevented proper behavior.
Is there an error condition that I should be checking? I was relying on exceptions, but that's obviously not enough.
It doesn't seem to be the config. Check that your message sequence numbers are in sync, especially since you've been connecting to a different server using the same settings.
Try setting the TargetCompID and SenderCompID on the acceptor to *
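For the missing log directory problem described in the update above, one way to rule it out is to create the directories named in the session file before starting either program. Here is a minimal sketch that reads a settings file like the ones shown and makes sure every FileLogPath and FileStorePath exists. The helper name ensure_session_dirs is hypothetical, and it assumes relative paths are resolved against the current working directory, which may not be how your build of the engine resolves them.

```python
import os

# Settings keys from the session files above whose values are directories.
PATH_KEYS = ("FileLogPath", "FileStorePath")


def ensure_session_dirs(settings_path):
    """Create directories named by FileLogPath/FileStorePath; return their paths."""
    ensured = []
    with open(settings_path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blanks, comments, and section headers such as [SESSION].
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = (part.strip() for part in line.split("=", 1))
            if key in PATH_KEYS and value:
                os.makedirs(value, exist_ok=True)  # no error if it already exists
                ensured.append(value)
    return ensured
```

Calling ensure_session_dirs("session.txt") once per settings file before launching executor and tradeclient would have avoided the silent failure in this case.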
