I have multiple RabbitMQ nodes running on different machines. When installing the nodes I did not specify a common cookie for them to use, so I had to go back and manually change the .erlang.cookie file on each one. My issue is that after doing this I get conflicting error messages. If I run rabbitmqctl status
I get the following error:
DIAGNOSTICS
attempted to contact: ['rabbit@nc-mso-test01']
rabbit@nc-mso-test01:
  * connected to epmd (port 4369) on nc-mso-test01
  * epmd reports node 'rabbit' running on port 25672
  * TCP connection succeeded but Erlang distribution failed
    Authentication failed (rejected by the remote node), please check the Erlang cookie
current node details:
- node name: 'rabbitmq-cli-45@nc-mso-test01'
- home dir: C:\Users\jol
- cookie hash: 9/Hx6l+wLQv3NkmSDFqBog==
Whatever script I call, I get the same error. I tried restarting the service, and removing and reinstalling it through rabbitmq-service, but the error persists. From what I can gather from other posts, the reason might be that the node and the Erlang broker run under separate users, each with a different version of the cookie, and one of them is stuck with the old one.
How can I make the server and node restart, so that both of them use the new cookie file?
I solved my issue. I had missed the fact that a RabbitMQ setup on Windows has two cookie files: one in C:\Windows for the Erlang component, and one in C:\Users\%USER%. From what I understand, if the Erlang VM is started under its own application user and the RabbitMQ node is started under a different user, which was my case, the two cookie files can differ. I had to sync those two files before syncing the cookies across the cluster.
Documentation says:
The cookie file used by the Windows service account and the user running CLI tools must be synchronised. (RabbitMQ Clustering Guide)
On Erlang versions starting with 20.2, the cookie file locations are:
For user running CLI tools - usually C:\Users\%USERNAME%\.erlang.cookie for user %USERNAME%
For the RabbitMQ Windows service - %USERPROFILE%\.erlang.cookie (usually C:\WINDOWS\system32\config\systemprofile)
On Erlang versions prior to 20.2 (e.g. 19.3 or 20.1), the cookie file locations are:
For user running CLI tools - usually C:\Users\%USERNAME%\.erlang.cookie for user %USERNAME%
For the RabbitMQ Windows service - %WINDIR%\.erlang.cookie (usually C:\Windows\.erlang.cookie)
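To check whether the two cookie files listed above are actually in sync, a small comparison script can help. The following is a minimal sketch, not an official tool. It assumes the "cookie hash" printed by rabbitmqctl is the base64-encoded MD5 of the cookie string, and that surrounding whitespace in the file is not part of the cookie; both are assumptions on my part. The function names are made up for illustration.

```python
import base64
import hashlib
from pathlib import Path


def cookie_hash(cookie_path):
    """Return base64(MD5(cookie)) for the cookie stored in cookie_path.

    Assumption: this mirrors the 'cookie hash' shown in rabbitmqctl diagnostics.
    """
    # Strip surrounding whitespace so a trailing newline added by an editor
    # does not change the result (assumed not to be part of the cookie).
    cookie = Path(cookie_path).read_text(encoding="ascii").strip()
    digest = hashlib.md5(cookie.encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def cookies_match(cli_cookie_path, service_cookie_path):
    """Return True if both files contain the same cookie."""
    return cookie_hash(cli_cookie_path) == cookie_hash(service_cookie_path)
```

For example, calling cookies_match with C:\Users\%USERNAME%\.erlang.cookie and the service cookie path for your Erlang version should return True once the files are synchronised. Reading the service cookie may require an elevated prompt.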
I want to start using Vault to rotate credentials for mssql databases, and I need to be able to use a gMSA in my mssql connection string. My organization currently only uses Windows servers and will only provide gMSAs for service accounts.
Specifying the gMSA as the user id in the connection string returns a 400 error: "error creating database object: error verifying connection: InitialBytes InitializeSecurityContext failed 8009030c".
I also tried transitioning my vault services to use the gMSA as their log on user, but this made nodes unable to become a leader node even though they were able to join the cluster and forward requests.
My setup:
I have a Vault cluster running across a few Windows servers. I use nssm to run them as a Windows service since there is no native Windows service support.
nssm is configured to run vault server -config="C:\vault\config.hcl" and uses the Local System account to run under.
When I change the user, the node is able to start up and join the raft cluster as a follower, but cannot obtain leader status, which causes my cluster to become unresponsive once the Local System user nodes are off.
The servers are running on Windows Server 2022 and Vault is at v1.10.3, using integrated raft storage. I have 5 vault nodes in my cluster.
I tried running the following command to configure my database secret engine:
vault write database/config/testdb \
connection_url='server=myserver\testdb;user id=domain\gmsaUser;database=mydb;app name=vault;' \
allowed_roles="my-role"
which caused the error message I mentioned above.
I then tried to change the log on user for the service. I followed these steps to rotate the user:
Updated the directory permissions for everywhere vault is touching (configs, certificates, storage) to include my gMSA user. I gave it read permissions for the config and certificate files and read/write for storage.
Stopped the service
Removed the node as a peer from the cluster using vault operator raft remove-peer instanceName.
Deleted the old storage files
Changed the service user by running sc.exe --% config "vault" obj="domain\gmsaUser" type= own.
Started the service back up and waited for replication
When I completed the last step, I could see the node reappear as a voter in the Vault UI. I was able to directly hit the node using the cli and ui and get a response. This is not an enterprise cluster, so this should have just forwarded the request to the leader, confirming that the clustering portion was working.
Before I got to the last node, I tried running vault operator step-down and was never able to get the leader to rotate. Turning off the last node made the cluster unresponsive.
I did not expect changing the log on user to cause any issue with the node's ability to operate. I reviewed the logs, but there was nothing out of the ordinary, even with the log level set to trace. They do show a successful unseal, standby mode, and joining the raft cluster.
Most of the documentation I have found for the mssql secret engine includes creating a user/pass at the sql server for Vault to use, which is not an option for me. Is there any way I can use the gMSA in my mssql config?
When you put user id into the SQL connection string, the driver tries SQL authentication and no longer attempts Windows authentication (and a gMSA can only be used through Windows authentication).
When setting up the gMSA, did you specify the correct parameter for who is allowed to retrieve the password? The correct one is PrincipalsAllowedToRetrieveManagedPassword; PrincipalsAllowedToDelegateToAccount is incorrect, but it is the first suggestion when using tab completion.
You may also need to run Install-ADServiceAccount ... on the machine you're running Vault on.
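To illustrate the first point, here is a minimal sketch of building a connection_url that leaves out the user id and password keys entirely. The helper name is hypothetical, and the server and database values are the placeholders from the question. Whether Vault's MSSQL plugin then authenticates as the account running the Vault service is an assumption based on how the underlying Go driver is documented to behave on Windows; I have not verified it against a gMSA.

```python
def build_mssql_connection_url(server, database, app_name="vault"):
    """Build a connection string with no 'user id' or 'password' keys.

    Assumption: with no credentials in the string, the driver falls back to
    integrated (Windows) authentication as the process account.
    """
    parts = [("server", server), ("database", database), ("app name", app_name)]
    return "".join(f"{key}={value};" for key, value in parts)
```

With the values from the question, build_mssql_connection_url("myserver\\testdb", "mydb") produces a string that can be passed as connection_url to vault write database/config/testdb. This can only help if the Vault service itself runs as the gMSA.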
The catch seems to be that I'm on AWS.
I found many solutions to this problem, but they assume you can connect to the server directly. My server is running on AWS, so only an RDP connection is available.
As the server is part of the domain, I added it to Server Manager on my domain controller. I tried several times to install and uninstall the RD Diagnoser and the RD License Manager. When I try to run either one, I get "Server Manager cannot open the tool".
I checked with PowerShell, and the component installation completed properly (or at least it seems to have, as far as the system is concerned).
I tried to reset the grace period for the RDS server (info here: https://mangolassi.it/topic/19353/reset-120-day-grace-period-for-windows-rds-server-with-powershell), but it had no effect.
I would need first to get back access to a session, which could be a good start
Thanks for your insights
I am on composer 0.16.0 and Fabric 1.0.4
While experimenting with Historian queries via composer-client, I consistently run into a situation where the network becomes unresponsive, and the only way to revive it seems to be restarting the Fabric and redeploying the network.
The error follows:
Error: Error trying to ping. Error: Error trying to query business network. Error: chaincode error (status: 500, message: Error: The current identity has not been registered: admin)
So, the questions are:
1. Is this a known issue and is there a workaround? Happy to do more diagnostics and file it properly if that helps.
2. Any way to reboot the network without restarting the Fabric?
Thank you!
The error "The current identity has not been registered: admin" is fundamentally caused by the fact that you are restarting your CA server each time. Each restart gives you a new CA server, i.e. a new authority that effectively issues new credentials for 'admin', so the existing admin credentials in your card store are not recognised by the new CA server.
Suggest to
1) clear out old admin cards from your card store, e.g. composer card delete --name admin@tutorial-network
2) re-import your 'admin' card through playground or CLI - and do a composer network ping to retrieve credentials to the card store.
3) Reduce your Historian queries result sets by adding selection criteria
Note: To restart your existing Dev Fabric, just use docker stop to stop your containers and docker start to restart them from the same state (or use docker-compose stop and docker-compose start if you're familiar with those commands). Otherwise, use Docker persistence to persist your data.
https://hyperledger.github.io/composer/tutorials/developer-tutorial.html
Probably good to
I have an ASP.NET MVC application that interacts with RabbitMQ. Everything works great locally.
However, on our deployment server it cannot connect:
DEBUG|MassTransit.RabbitMqTransport.Integration.RabbitMqConnectionCache|Connecting: muyuser@localhost:5672/|
ERROR|MassTransit.RabbitMqTransport.RabbitMqReceiveTransport|RabbitMQ connection failed: Connect failed: muyuser@localhost:5672/|
What I'm able to gather is this
In order to connect to RabbitMQ you need a valid .erlang.cookie in your user's home folder (on Windows)
As best I can tell, this cookie is created when you install rabbitmq
In development we're using localdb which runs as the developer's user (which has this cookie)
In production the application runs under IIS, which uses the application pool and the built-in ApplicationPoolIdentity account, and that account doesn't have a user folder for the .erlang.cookie file to live in.
So the question becomes...what now? How is this intended to work?
Obviously we could create a dedicated user for the web application but our system administrator is understandably very reluctant to do this.
Another clue is that when I tried to RDP in, log in as myself and connect to RabbitMQ, I found that I could not. After troubleshooting I discovered that my cookie didn't match that of others who could! I replaced it with the one from c:\windows\.erlang.cookie and could then connect from the CLI. It seems possible that there is a cookie installed somewhere for the ApplicationPoolIdentity, but that it is an incorrect one. Where would that cookie be located?
Erlang cookies are used for internode communication, whether it is for clustering RabbitMQ or for contacting RabbitMQ via the command line using rabbitmqctl.
If you have problems with an AMQP connection, then the Erlang cookie is not involved.
Take a look at access control https://www.rabbitmq.com/access-control.html to see if your user is properly configured.
At the same time check the server logs to see why the connection is refused.
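Before digging into users and permissions, it can help to confirm that the AMQP port is reachable at all from the deployment server. The sketch below is only a plain TCP check using the Python standard library. It does not speak AMQP or validate credentials, and the function name is made up for illustration; host and port default to the values from the log above.

```python
import socket


def amqp_port_reachable(host="localhost", port=5672, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and connects, raising OSError
        # if the connection is refused, unreachable or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False, nothing is listening on that port or something is blocking it, so the problem is on the network or service side. If it returns True, the access-control settings and the server logs are the next place to look.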
Problem
Enterprise Manager starts and then hangs.
Environment
RAC installation on Windows, comprised of two nodes, node1 and node2. Enterprise Manager is installed on node1. We are able to get dbconsole to run briefly, and then it fails.
emagent.trc from node1 shows what appear to be two sets of relevant errors.
The first set of errors concerns an inability to connect to the EM repository (which is on the same node).
The second error is associated with the "Instance Health Check initialization failed".
emagent.trc (node1)
Thread-5548 ERROR fetchlets.healthCheck: GIM-00105: Shared memory region is corrupted.
Thread-5548 ERROR engine: [oracle_database,clustername_node1name,health_check] : nmeegd_GetMetricData failed : Instance Health Check initialization failed due to one of the following causes: the owner of the EM agent process is not same as the owner of the Oracle instance processes; the owner of the EM agent process is not part of the dba group; or the database version is not 10g (10.1.0.2) and above.
Thread-5668 WARN http: snmehl_connect: connect failed to (node1:1158): No connection could be made because the target machine actively refused it.
Thread-5668 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://node1:1158/em/upload/: retStatus=-1
Thread-5708 ERROR upload: FxferSend: Cannot connect to: https://node1:1158/em/upload/. retStatus=-1
Thread-5708 ERROR upload: Failed to upload file B0000109.xml, ret = -2
I would like to get advice about how to proceed to troubleshoot these two errors in hopes of getting EM to start and stay up.
Regarding the first error, how would one troubleshoot an inability to connect to a web page running on the same node? This would appear to rule out firewall issues, etc. as a cause.
Regarding the second error, dbconsole and agent were started manually from the command line using a domain account, whereas the Oracle service runs under Local System (dbconsole was configured to use Local System on startup but failed, and can only be restarted via emctl start dbconsole.)
This part of the errors is the most promising.
the owner of the EM agent process is not same as the owner of the Oracle instance processes; the owner of the EM agent process is not part of the dba group;
You should check whether all accounts running Oracle are part of the ora_dba group.
See: http://download.oracle.com/docs/html/B13831_01/ap_unix.htm#i634430