Vault fails to auto-unseal after Consul snapshot restore - consul

So basically, I have set up an architecture with Consul/Vault in a Kubernetes cluster on AWS. My Vault auto-unseals with AWS KMS when the pods start.
Recently I've done some testing around backing up Vault using consul snapshot.
The scenario I tested is:
First, take a snapshot of Vault: consul snapshot save vault.prod.snap
Then remove the Vault data with consul kv delete -recurse vault/
Remove the Vault StatefulSets and pods
Restore the snapshot: consul snapshot restore vault.prod.snap
Finally, re-create the Vault StatefulSets
Result:
I got a 500 error on the third key during the auto-unseal that says:
body {“errors”:[“failed to decrypt encrypted stored keys: cipher: message authentication failed”]}
I tried another test where I don't clean Vault with consul kv delete -recurse vault/.
I basically just removed a couple of policies in the UI and then restored. That scenario works correctly; it's only when I restore from "scratch" that my Vault can no longer unseal.
Could somebody give me a hint, please?

I've finally fixed the problem after 3 days.
After fixing the following, my Vault was able to unseal again:
Duplicate node IDs: https://github.com/hashicorp/consul/issues/4741
The Consul server has to be given an ACL token with sufficient rights
For the first issue, I reused a Ruby script and added it to the Consul Docker image to generate a permanent node ID for each Consul agent (sketched below).
Regarding the second issue, my ACL policy for the agent was not granting the permissions it needed to make the necessary updates.
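For reference, a minimal sketch of the node-ID part, assuming the official Consul image is wrapped with a custom entrypoint, python3 is available in the image, and /consul/data is the data path; deriving the ID deterministically from the pod's hostname means restarts and restores reuse the same node ID:
#!/bin/sh
# Hypothetical entrypoint wrapper: give each Consul agent a permanent node ID.
NODE_ID_FILE=/consul/data/node-id   # example location on the data volume
if [ ! -f "$NODE_ID_FILE" ]; then
  # uuid5 of the hostname is stable across pod restarts, so the agent keeps its ID
  python3 -c "import uuid; print(uuid.uuid5(uuid.NAMESPACE_DNS, '$(hostname)'))" > "$NODE_ID_FILE"
fi
exec consul agent -node-id="$(cat "$NODE_ID_FILE")" "$@"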

Related

Use gMSA for HashiCorp Vault MSSQL credential rotation

I want to start using Vault to rotate credentials for mssql databases, and I need to be able to use a gMSA in my mssql connection string. My organization currently only uses Windows servers and will only provide gMSAs for service accounts.
Specifying the gMSA as the user id in the connection string returns a 400 error: error creating database object: error verifying connection: InitialBytes InitializeSecurityContext failed 8009030c.
I also tried transitioning my Vault services to use the gMSA as their log on user, but this left nodes unable to become the leader even though they were able to join the cluster and forward requests.
My setup:
I have a Vault cluster running across a few Windows servers. I use nssm to run them as a Windows service since there is no native Windows service support.
nssm is configured to run vault server -config="C:\vault\config.hcl" and uses the Local System account to run under.
When I change the user, the node is able to start up and join the raft cluster as a follower, but cannot obtain leader status, which causes my cluster to become unresponsive once the nodes still running as Local System are off.
The servers are running on Windows Server 2022 and Vault is at v1.10.3, using integrated raft storage. I have 5 vault nodes in my cluster.
I tried running the following command to configure my database secret engine:
vault write database/config/testdb \
connection_url='server=myserver\testdb;user id=domain\gmsaUser;database=mydb;app name=vault;' \
allowed_roles="my-role"
which caused the error message I mentioned above.
I then tried to change the log on user for the service. I followed these steps to rotate the user:
Updated the directory permissions everywhere Vault touches (configs, certificates, storage) to include my gMSA user. I gave it read permissions for the config and certificate files and read/write for storage.
Stopped the service
Removed the node as a peer from the cluster using vault operator raft remove-peer instanceName.
Deleted the old storage files
Changed the service user by running sc.exe --% config "vault" obj="domain\gmsaUser" type= own.
Started the service back up and waited for replication
When I completed the last step, I could see the node reappear as a voter in the Vault UI. I was able to directly hit the node using the cli and ui and get a response. This is not an enterprise cluster, so this should have just forwarded the request to the leader, confirming that the clustering portion was working.
Before I got to the last node, I tried running vault operator step-down and was never able to get the leader to rotate. Turning off the last node made the cluster unresponsive.
I did not expect changing the log on user to cause any issue with the node's ability to operate. I reviewed the logs, even with the log level set to trace, but there was nothing out of the ordinary. They do show a successful unseal, standby mode, and joining the raft cluster.
Most of the documentation I have found for the mssql secret engine includes creating a user/pass at the sql server for Vault to use, which is not an option for me. Is there any way I can use the gMSA in my mssql config?
When you put user id into the SQL connection string, it will try SQL authentication and no longer attempt Windows authentication (while a gMSA is Windows-authentication based).
When setting up the gMSA account, did you specify the correct parameter for who is allowed to retrieve the password? (Correct: PrincipalsAllowedToRetrieveManagedPassword; incorrect, but the first suggestion with tab completion: PrincipalsAllowedToDelegateToAccount.)
Maybe you need to run Install-ADServiceAccount ... on the machine you're running Vault on.
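A minimal sketch of what that implies for the database config, an untested assumption on my part: with no user id in the connection string, the driver should fall back to Windows integrated authentication, so the connection runs as the gMSA the Vault service logs on with (plugin_name added here for completeness):
vault write database/config/testdb \
    plugin_name=mssql-database-plugin \
    connection_url='server=myserver\testdb;database=mydb;app name=vault;' \
    allowed_roles="my-role"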

Saving Vault secrets in Consul

Please tell me, how can I do the following:
I am creating a secret in Vault.
This secret is automatically added to Consul.
I go to Consul and see the keys from Vault there.
Please point me to documentation or an article.

How to start using Minio without accessKey and secretKey?

I have a Minio server running on Debian using systemd, proxied with NGINX and secured with Let's Encrypt. The docs suggest the service is comparable to Amazon S3, but I can't figure out how to actually use it.
Version: 2019-03-27T22:35:21Z
Release-Tag: RELEASE.2019-03-27T22-35-21Z
Commit-ID: 6df05e489dc789cf26e82810cf5cfeefb1d90761
It looks like, in order to create a bucket or use the Minio CLI mc, there needs to be a registered target along with an accessKey and secretKey. I can't find that information anywhere on the server, and it's not clear to me how to create a new target.
Here is the /etc/default/minio file:
MINIO_VOLUMES="/usr/local/share/minio"
MINIO_OPTS="-C /etc/minio --address :9000"
There are no files in /etc/minio.
It's running and set up, but how can I start actually using the minio server?
Edit: Config JSON
I tried creating a new config file and entering new accessKey and secretKey in the credential field. I was not able to sign in to the Minio Browser app using the same keys.
Edit: Key Files
I tried entering a new access key and secret key into the files /etc/minio/access_key and /etc/minio/secret_key and adding the following lines to the /etc/default/minio environment file:
MINIO_ACCESS_KEY_FILE="/etc/minio/access_key"
MINIO_SECRET_KEY_FILE="/etc/minio/secret_key"
I restarted the service with systemctl restart minio, but I still can't log into the Minio Browser app.
The only thing that worked was providing MINIO_ACCESS_KEY and MINIO_SECRET_KEY in the /etc/default/minio environment file. Every other method failed.
I used the following to generate a secret key that resembles the AWS access keys in the example. From the CLI help text, however, it looks like any access key and secret key would work:
SecureRandom.urlsafe_base64(30)
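For reference, a minimal sketch of the working setup; the key values below are placeholders, substitute your own generated values:
# /etc/default/minio
MINIO_VOLUMES="/usr/local/share/minio"
MINIO_OPTS="-C /etc/minio --address :9000"
MINIO_ACCESS_KEY="AKIAIOSFODNN7EXAMPLE"                       # placeholder access key
MINIO_SECRET_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"   # placeholder secret key
# restart, then log into the Minio Browser with the same two values
systemctl restart minio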

How to reference an AWS access key (secret key) in cloud-init without hard-coding

I want to write a cloud-init script which initializes the REX-Ray Docker plugin (a service which uses AWS credentials in its configuration).
I have considered the following methods. However, these methods have some disadvantages.
Hard-code the access key/secret key in the cloud-init script.
Problem: This is not secure.
Create an IAM role, then obtain the access key and secret key from the instance metadata.
Problem: The access key expires after a certain period,
so I need to restart the REX-Ray daemon process, which makes the service temporarily unavailable.
Please tell me which is the better way to reference the access key/secret key, or another way if one exists.
Thanks in advance.
The Docker plugin should get the credentials automatically. You don't have to do anything. Do not set any environment variables for AWS credentials.
The AWS CLI / AWS SDK will get the credentials automatically from the metadata server.
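A minimal user-data sketch of that approach, assuming the instance has an IAM role attached with the EC2/EBS permissions REX-Ray needs and that the rexray/ebs plugin is the one in use; the region value is an example:
#!/bin/bash
# cloud-init user data: install the REX-Ray EBS plugin without embedding credentials;
# the plugin resolves them from the instance metadata via the attached IAM role.
docker plugin install --grant-all-permissions rexray/ebs EBS_REGION=ap-northeast-1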
You can use the following methods of authentication:
Environment variables
Export both the access and secret keys in the environment as follows:
$ export AWS_ACCESS_KEY_ID="anaccesskey"
$ export AWS_SECRET_ACCESS_KEY="asecretkey"
Shared Credential file
You can use an AWS credentials file to specify your credentials. The default location is $HOME/.aws/credentials on Linux and OS X, or %USERPROFILE%\.aws\credentials for Windows users. If Terraform fails to detect credentials inline or in the environment, it will check this location.
You can optionally specify a different location in the configuration by providing the shared_credentials_file attribute as follows:
provider "aws" {
  region                  = "us-west-2"
  shared_credentials_file = "/Users/tf_user/.aws/creds"
  profile                 = "customprofile"
}
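The shared credentials file itself is plain INI; a minimal sketch that writes one matching the profile above, with placeholder values:
# create the credentials file referenced by shared_credentials_file (placeholder keys)
mkdir -p /Users/tf_user/.aws
cat > /Users/tf_user/.aws/creds <<'EOF'
[customprofile]
aws_access_key_id = anaccesskey
aws_secret_access_key = asecretkey
EOF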
https://www.terraform.io/docs/providers/aws/

Forge configuration: what to put in Server Providers => Amazon => Key/Secret

I created an Ubuntu server on Amazon AWS.
Then I registered for Forge, and I am now trying to configure it.
I selected the source control to be Bitbucket.
I selected Amazon in the Server Provider section, but now I am not sure what to put in Key and Secret.
I found the answer to this question.
We need to create an IAM user and opt for an API access key and secret.
Also remember to give this user at least full EC2 admin access before initiating the process to create and provision the server via Forge.
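A minimal sketch of that with the AWS CLI; the user name forge-deploy is an example, and AmazonEC2FullAccess is the managed policy closest to the "full EC2 admin" access mentioned above:
# create the IAM user, grant EC2 admin rights, and generate API credentials
aws iam create-user --user-name forge-deploy
aws iam attach-user-policy --user-name forge-deploy \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
aws iam create-access-key --user-name forge-deploy
# the AccessKeyId and SecretAccessKey in the last command's output are what
# Forge asks for in the Key and Secret fields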
