Kerberos on my Ansible setup: (('Unspecified GSS failure. Minor code may provide more information', 851968), ('Server not found in Kerberos database', -1765328377))

Been having some issues with setting up kerberos within my lab setup.
Ansible server: Ubuntu
AD server: Win 2016 server
Target server: Win 2016 server
Please note that I can get ansible working with my target server when using local authentication.
What have I done so far?
Read "Using Ansible on Windows with domain user" and
https://osric.com/chris/accidental-developer/2017/01/error-cannot-contact-any-kdc-for-realm-while-getting-initial-credentials/
Here is my inventory file:
[sqlservers]
myserver.mylab.local ansible_host=192.x.x.x
[sqlservers:vars]
ansible_user = ansible-user@MYLAB.LOCAL
ansible_password = xxxxxx
ansible_connection = winrm
ansible_winrm_transport = kerberos
ansible_port = 5986
ansible_winrm_server_cert_validation = ignore
#ansible_winrm_kinit_cmd = "/opt/VA/uxauth/bin/uxconsole -krb -init"
ansible_winrm_kerberos_delegation = true
Contents of the krb5.conf file
[libdefaults]
default_realm = MYLAB.LOCAL
[realms]
MYLAB.LOCAL = {
kdc = adservver.mylab.local
admin_server = adserver.mylab.local
default_domain = mylab.local
}
[domain_realm]
.mylab.local = MYLAB.LOCAL
mylab.local = MYLAB.LOCAL
I get the error message below.
fatal: [myserver.mylab.local]: UNREACHABLE! => {"changed": false, "msg": "kerberos: authGSSClientStep() failed: (('Unspecified GSS failure. Minor code may provide more information', 851968), ('Server not found in Kerberos database', -1765328377))", "unreachable": true}
To test that I can get a kerberos token, I am able to run the commands below.
kinit -C ansible-user@MYLAB.LOCAL
klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: ansible-user@MYLAB.LOCAL
Valid starting Expires Service principal
05/21/21 10:50:42 05/21/21 20:50:42 krbtgt/MYLAB.LOCAL@MYLAB.LOCAL
renew until 05/28/21 10:50:39

Thanks all for your answers. In the end I checked everything, and it looks like a reboot of the AD server resolved the issue.
Also, when specifying the credentials for the playbook to use, the domain name needs to be in capitals. If the domain name is not in uppercase, you get a KDC error.
ansible-user@MYDOMAIN.COM
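The uppercase-realm rule above is easy to enforce before the value ever reaches the inventory. The following is only a small sketch: `normalize_principal` is a made-up helper name, not part of Ansible or pywinrm, and it assumes the usual user@REALM form.

```python
def normalize_principal(principal: str) -> str:
    """Upper-case the realm part of user@REALM; leave the user part untouched."""
    user, sep, realm = principal.rpartition("@")
    if not sep:
        # No "@" at all, so there is no realm to normalise.
        raise ValueError(f"no realm in principal: {principal!r}")
    return f"{user}@{realm.upper()}"


if __name__ == "__main__":
    print(normalize_principal("ansible-user@mylab.local"))
```

The user part is deliberately left alone, since only the realm is reported above as needing capitals.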

Related

Winrm basic: the specified credentials were rejected by the server error

I am trying to connect to a Windows host from a Linux (Zorin) control host using Ansible.
I enabled WinRM on the Windows machine and set all the required authentication methods to true.
WinRM configuration on the Windows host:
PS C:\WINDOWS\system32> winrm get winrm/config
Config
MaxEnvelopeSizekb = 500
MaxTimeoutms = 60000
MaxBatchItems = 32000
MaxProviderRequests = 4294967295
Client
NetworkDelayms = 5000
URLPrefix = wsman
AllowUnencrypted = true
Auth
Basic = true
Digest = true
Kerberos = true
Negotiate = true
Certificate = true
CredSSP = false
DefaultPorts
HTTP = 5985
HTTPS = 5986
TrustedHosts
Service
RootSDDL = O:NSG:BAD:P(A;;GA;;;BA)(A;;GXGR;;;S-1-5-21-2039588290-1060779563-2652726705-1011)(A;;GR;;;IU)S:P(AU;FA;GA;;;WD)(AU;SA;GXGW;;;WD)
MaxConcurrentOperations = 4294967295
MaxConcurrentOperationsPerUser = 1500
EnumerationTimeoutms = 240000
MaxConnections = 300
MaxPacketRetrievalTimeSeconds = 120
AllowUnencrypted = false
Auth
Basic = true
Kerberos = true
Negotiate = true
Certificate = false
CredSSP = false
CbtHardeningLevel = Relaxed
DefaultPorts
HTTP = 5985
HTTPS = 5986
IPv4Filter = *
IPv6Filter = *
EnableCompatibilityHttpListener = false
EnableCompatibilityHttpsListener = false
CertificateThumbprint
AllowRemoteAccess = true
Winrs
AllowRemoteShellAccess = true
IdleTimeout = 7200000
MaxConcurrentUsers = 2147483647
MaxShellRunTime = 2147483647
MaxProcessesPerShell = 2147483647
MaxMemoryPerShellMB = 2147483647
MaxShellsPerUser = 2147483647
Even after setting Basic = true, I still get the "specified credentials were rejected" error. I tried setting AllowUnencrypted = true, but that fails with the following error message:
WSManFault
Message
ProviderFault
WSManFault
Message = WinRM firewall exception will not work since one of the network connection types on this machine is set to Public. Change the network connection type to either Domain or Private and try again.
I changed the network connection type to Private and tried setting AllowUnencrypted = true again, but got the same error as above (WinRM firewall exception will not work since one of the network connection types on this machine is set to Public. Change the network connection type to either Domain or Private and try again.)
I also added a firewall exception rule for port 5985 on the Windows host.
I also granted Read and Execute permissions to the user via winrm configsddl default. It still does not work.
I am supplying the correct credentials. The Ansible hosts file is as follows:
[win]
<IP>
[win:vars]
ansible_user=<username>
ansible_password=<password>
ansible_connection=winrm
ansible_winrm_scheme=http
ansible_winrm_transport=basic
ansible_winrm_port=5985
ansible_winrm_server_cert_validation=ignore
Trying the following ansible command:
ansible win -i hosts -m win_ping
I tried everything I found on the internet, but I am not able to establish the connection through WinRM.
I will be thankful to anyone who provides a solution. My eyes are bleeding red from staring at this error for 4 days.
I changed the ansible_winrm_transport from basic to ntlm. It resolved my issue.
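A likely reason the switch helped, going by the authentication-options table in Ansible's Windows documentation: over plain HTTP, ntlm, kerberos and credssp encrypt the WinRM message themselves, while basic does not and therefore depends on the service having AllowUnencrypted = true (it is false in the Service section above). The lookup below is only an illustrative sketch; the table and function names are made up.

```python
# Whether each transport encrypts the WinRM payload itself when the scheme is http.
MESSAGE_ENCRYPTION_OVER_HTTP = {
    "basic": False,
    "ntlm": True,
    "kerberos": True,
    "credssp": True,
}


def usable_over_http(transport: str, allow_unencrypted: bool) -> bool:
    """Hypothetical check: can this transport talk to an http (5985) listener?

    allow_unencrypted mirrors the Service-side AllowUnencrypted setting.
    """
    encrypts = MESSAGE_ENCRYPTION_OVER_HTTP[transport.lower()]
    return encrypts or allow_unencrypted


if __name__ == "__main__":
    for name in MESSAGE_ENCRYPTION_OVER_HTTP:
        print(name, usable_over_http(name, allow_unencrypted=False))
```

With the service config shown in the question, only the basic transport comes out as unusable, which matches the fix reported above.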

Certificate Auth. : the specified credentials were rejected by the server

It has been several days since I started trying to make all my Windows hosts reachable with Ansible via certificate authentication. I use a script to configure WinRM and to create a self-signed certificate. On many hosts it works fine, and once the script has finished I can connect to them via certificate authentication, but on some others (around 15-20% of them) it is impossible.
I get this error message:
fatal: [SERVERNAME]: UNREACHABLE! => {
"changed": false,
"msg": "certificate: the specified credentials were rejected by the server",
"unreachable": true
}
What is strange is that I don't see the login event in the Windows event viewer. Here is my WinRM configuration:
Service
RootSDDL = O:NSG:BAD:P(A;;GA;;;BA)(A;;GR;;;IU)S:P(AU;FA;GA;;;WD)(AU;SA;GXGW;;;WD)
MaxConcurrentOperations = 4294967295
MaxConcurrentOperationsPerUser = 1500
EnumerationTimeoutms = 240000
MaxConnections = 300
MaxPacketRetrievalTimeSeconds = 120
AllowUnencrypted = false
Auth
Basic = true
Kerberos = true
Negotiate = true
Certificate = true
CredSSP = true
CbtHardeningLevel = Relaxed
DefaultPorts
HTTP = 5985
HTTPS = 5986
IPv4Filter = *
IPv6Filter = *
EnableCompatibilityHttpListener = false
EnableCompatibilityHttpsListener = false
CertificateThumbprint
AllowRemoteAccess = true
Winrs
AllowRemoteShellAccess = true
IdleTimeout = 7200000
MaxConcurrentUsers = 10
MaxShellRunTime = 2147483647
MaxProcessesPerShell = 25
MaxMemoryPerShellMB = 1024
MaxShellsPerUser = 30
Both the listener and the certificate mapping are configured on the windows machine:
Listener
Address = *
Transport = HTTPS
Port = 5986
Hostname
Enabled = true
URLPrefix = wsman
CertificateThumbprint = 927...C26E
ListeningOn = 127.0.0.1, 172.20.x.x
CertMapping
URI = *
Subject = ansibleuser@localhost
Issuer = 579f3eb1c3756339a246843f70e1a89b14fdc244
UserName = ansibleuser
Enabled = true
Password
What I've tried up until now:
Check the presence of the LocalAccountTokenFilterPolicy registry key
Configure the access to WinRM through winrm configSDDL default
Check GPOs
Change the password (check and uncheck "password never expires", etc.)
Create another local admin user
Enable basic and unencrypted authentication
Change the connection type to private (could not, since the servers are domain joined)
Run the script provided by ansible to configure WinRM
I don't understand what is going on and it's driving me nuts. Has anyone encountered this problem before?
I'm open to all suggestions, thanks in advance.
FINALLY found a solution to this problem:
Based on the thread "Client Certificate-based authentication stopped working for PS Remoting", I found out that a registry value named ClientAuthTrustMode should be set to 2; with that, the error message disappears.
Here is a Microsoft article detailing the implications of this value: Overview of TLS - SSL (Schannel SSP)
Here is a simple PowerShell command to flip the switch:
Set-ItemProperty -Path registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL -Name ClientAuthTrustMode -Type DWord -Value 2
Hopefully that will help someone out there.
Thanks for your solution! It solved my problem too.
We have a lot of servers using Ansible and WinRM with certificate-based authentication, but only one of them had the same issue as yours.
The only interesting difference I found in the settings you shared is this:
ListeningOn = 127.0.0.1, 172.20.x.x
This is the same as on my affected server: localhost comes first.
ListeningOn = 127.0.0.1, 19x.1xx.19.98, ::1
The other servers in our setup list the network interface first, like this:
ListeningOn = 19x.16.61.38, 127.0.0.1, ::1
I really don't know whether it matters, but this is the only difference I found.
Thanks a lot.

Ansible Windows Kerberos Authentication: Bad HTTP response returned from server. Code 500

After configuring WinRM on a Windows server and filling in all the information needed to connect:
---
### winrm / win connection ###
ansible_winrm_realm: *My AD Domain*
ansible_connection: winrm
ansible_winrm_kerberos_delegation: yes
ansible_port: 5985
ansible_winrm_transport: kerberos
I got this error:
fatal: [MyServer]: UNREACHABLE! => {"changed": false, "msg": "kerberos: ('http', 'Bad HTTP response returned from server. Code 500')", "unreachable": true}
I have tried a lot of things, including changing my configuration and checking that WinRM is reachable, which it is:
C:\Users\ME>winrs -r:http://myserver:5985/wsman -u:My_User -p:Password ipconfig
My WinRM config:
PS C:\Users\XXXX> winrm get winrm/config/Service
Service
MaxConcurrentOperations = 4294967295
MaxConcurrentOperationsPerUser = 1500
EnumerationTimeoutms = 240000
MaxConnections = 300
MaxPacketRetrievalTimeSeconds = 120
AllowUnencrypted = false
Auth
Basic = false
Kerberos = true
Negotiate = true
Certificate = false
CredSSP = false
CbtHardeningLevel = Relaxed
DefaultPorts
HTTP = 5985
HTTPS = 5986
IPv4Filter = *
IPv6Filter = *
EnableCompatibilityHttpListener = false
EnableCompatibilityHttpsListener = false
CertificateThumbprint
AllowRemoteAccess = true
PS C:\Users\XXXX> winrm get winrm/config/Winrs
Winrs
AllowRemoteShellAccess = true
IdleTimeout = 7200000
MaxConcurrentUsers = 2147483647
MaxShellRunTime = 2147483647
MaxProcessesPerShell = 2147483647
MaxMemoryPerShellMB = 2147483647
MaxShellsPerUser = 2147483647
Since I'm trying to use HTTP instead of HTTPS, the solution is to change the WinRM service config to allow unencrypted connections by running the following command:
Set-Item -Path WSMan:\localhost\Service\AllowUnencrypted -Value true
I ran into this exception and the solution for me was to install pywinrm with its Kerberos extra:
pip3 install pywinrm[kerberos]
Finally solved by upgrading pykerberos to version 1.2.1:
pip3 install pykerberos --upgrade
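Since two of the fixes above come down to which Python packages the control node has, it can help to print their installed versions first. This is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); `installed_versions` is a made-up helper name, and the package list is just the two distributions mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages=("pywinrm", "pykerberos")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter that Ansible itself uses, otherwise it reports on a different set of site-packages.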
As a workaround, you can use Python 2 to run the playbook:
/usr/bin/python2 /usr/bin/ansible-playbook WindowsTest.yml
Running the command below on the target host resolved the issue for me. The service needs to accept unencrypted traffic.
Set-Item -Path WSMan:\localhost\Service\AllowUnencrypted -Value true

Traefik not getting SSL certificates for new domains

I've got Traefik/Docker Swarm/Let's Encrypt/Consul set up, and it's been working fine. It managed to successfully get certificates for the domains admin.domain.tld, registry.domain.tld and staging.domain.tld, but now that I've tried adding containers that are serving domain.tld and matomo.domain.tld those aren't getting any certificates (browser warns of self signed certificate because it's the default Traefik certificate).
My Traefik configuration (that's being uploaded to Consul):
debug = false
logLevel = "DEBUG"
insecureSkipVerify = true
defaultEntryPoints = ["https", "http"]
[entryPoints]
[entryPoints.ping]
address = ":8082"
[entryPoints.http]
address = ":80"
[entryPoints.http.redirect]
entryPoint = "https"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[traefikLog]
filePath = '/var/log/traefik/traefik.log'
format = 'json'
[accessLog]
filePath = '/var/log/traefik/access.log'
format = 'json'
[accessLog.fields]
defaultMode = 'keep'
[accessLog.fields.headers]
defaultMode = 'keep'
[accessLog.fields.headers.names]
"Authorization" = "drop"
[retry]
[api]
entryPoint = "traefik"
dashboard = true
debug = false
[ping]
entryPoint = "ping"
[metrics]
[metrics.influxdb]
address = "http://influxdb:8086"
protocol = "http"
pushinterval = "10s"
database = "metrics"
[docker]
endpoint = "unix:///var/run/docker.sock"
domain = "domain.tld"
watch = true
exposedByDefault = false
network = "net_web"
swarmMode = true
[acme]
email = "my@mail.tld"
storage = "traefik/acme/account"
entryPoint = "https"
onHostRule = true
[acme.httpChallenge]
entryPoint = "http"
Possibly related, in traefik.log I repeatedly (as in almost once per second) get the following (but only for the registry subdomain). Sounds like an issue to persist the data to consul, but there are no errors indicating such an issue.
{"level":"debug","msg":"Looking for an existing ACME challenge for registry.domain.tld...","time":"2019-07-07T11:37:23Z"}
{"level":"debug","msg":"Looking for provided certificate to validate registry.domain.tld...","time":"2019-07-07T11:37:23Z"}
{"level":"debug","msg":"No provided certificate found for domains registry.domain.tld, get ACME certificate.","time":"2019-07-07T11:37:23Z"}
{"level":"debug","msg":"ACME got domain cert registry.domain.tld","time":"2019-07-07T11:37:23Z"}
Update: I managed to find this line in the log:
{"level":"error","msg":"Error getting ACME certificates [matomo.domain.tld] : cannot obtain certificates: acme: Error -\u003e One or more domains had a problem:\n[matomo.domain.tld] acme: error: 400 :: urn:ietf:paramsacme:error:connection :: Fetching http://matomo.domain.tld/.well-known/acme-challenge/WJZOZ9UC1aJl9ishmL2ACKFbKoGOe_xQoSbD34v8mSk: Timeout after connect (your server may be slow or overloaded), url: \n","time":"2019-07-09T16:27:43Z"}
So it seems the issue is the challenge failing because of a timeout. Why the timeout though?
Update 2: More log entries:
{"level":"debug","msg":"Looking for an existing ACME challenge for staging.domain.tld...","time":"2019-07-10T19:38:34Z"}
{"level":"debug","msg":"Looking for provided certificate to validate staging.domain.tld...","time":"2019-07-10T19:38:34Z"}
{"level":"debug","msg":"No provided certificate found for domains staging.domain.tld, get ACME certificate.","time":"2019-07-10T19:38:34Z"}
{"level":"debug","msg":"No certificate found or generated for staging.domain.tld","time":"2019-07-10T19:38:34Z"}
{"level":"debug","msg":"http: TLS handshake error from 10.255.0.2:51981: remote error: tls: unknown certificate","time":"2019-07-10T19:38:34Z"}
But then, after a couple minutes to an hour, it works (for two domains so far).
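With logLevel = "DEBUG" the error entries are easy to miss among the per-request debug lines. The sketch below filters JSON-formatted log lines like the ones quoted above for failed ACME requests. It assumes one JSON object per line and the "Error getting ACME certificates [...]" wording seen in the error entry above; `acme_failures` is a made-up name.

```python
import json
import re

# Captures the bracketed domain list from the error message quoted above.
ACME_ERROR = re.compile(r"Error getting ACME certificates \[([^\]]+)\]")


def acme_failures(lines):
    """Yield (time, [domains]) for error-level ACME entries in JSON log lines."""
    for raw in lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not a JSON line, ignore it
        if not isinstance(entry, dict) or entry.get("level") != "error":
            continue
        match = ACME_ERROR.search(entry.get("msg", ""))
        if match:
            yield entry.get("time"), match.group(1).split()
```

To use it, open the log file (filePath in the [traefikLog] section above) and pass the file object to `acme_failures`; each result gives the timestamp and the domains whose challenge failed.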
Not sure if it's a feature or a bug, but removing the following HTTP-to-HTTPS redirect solved it for me:
[entryPoints.http.redirect]
entryPoint = "https"

WinRM connectivity issue?

I am trying to open an Enter-PSSession to a company server within the company network. I can RDP to the server, ping it, and get the Windows services status using Get-Service -ComputerName DBServer. However, WinRM does not let me open a session on the server.
My PC:
Windows 10
Powershell 5.0
IP: 128.2.60.102
Server:
Windows Server 2012
PowerShell 4.0
IP: 10.1.130.1
On DBServer:
PS C:\Windows\system32> winrm e winrm/config/listerner
WSManFault
Message
ProviderFault
WSManFault
Message = The WS-Management service cannot process the
request. The resource URI does not support the
Enumerate operation.
Error number: -2144108495 0x80338031
The WS-Management service cannot process the request because the WS-
Addressing Action URI in the request is not compatible with the resource.
PS C:\Windows\system32> winrm quickconfig
WinRM service is already running on this machine.
WinRM is already set up for remote management on this computer.
PS C:\Windows\system32> winrm get winrm/config
Config
MaxEnvelopeSizekb = 500
MaxTimeoutms = 60000
MaxBatchItems = 32000
MaxProviderRequests = 4294967295
Client
NetworkDelayms = 5000
URLPrefix = wsman
AllowUnencrypted = false
Auth
Basic = true
Digest = true
Kerberos = true
Negotiate = true
Certificate = true
CredSSP = false
DefaultPorts
HTTP = 5985
HTTPS = 5986
TrustedHosts
Service
RootSDDL = O:NSG:BAD:P(A;;GA;;;BA)(A;;GR;;;IU)S:P(AU;FA;GA;;;WD)(AU;SA;GXGW;;;WD)
MaxConcurrentOperations = 4294967295
MaxConcurrentOperationsPerUser = 1500
EnumerationTimeoutms = 240000
MaxConnections = 300
MaxPacketRetrievalTimeSeconds = 120
AllowUnencrypted = false
Auth
Basic = false
Kerberos = true
Negotiate = true
Certificate = false
CredSSP = false
CbtHardeningLevel = Relaxed
DefaultPorts
HTTP = 5985
HTTPS = 5986
IPv4Filter = *
IPv6Filter = *
EnableCompatibilityHttpListener = false
EnableCompatibilityHttpsListener = false
CertificateThumbprint
AllowRemoteAccess = true
Winrs
AllowRemoteShellAccess = true
IdleTimeout = 7200000
MaxConcurrentUsers = 10
MaxShellRunTime = 2147483647
MaxProcessesPerShell = 25
MaxMemoryPerShellMB = 1024
MaxShellsPerUser = 30
On the client (my machine):
PS C:\windows\system32> Test-WSMan -ComputerName "DBServer"
Test-WSMan : <f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/
1/wsmanfault" Code="2150859046" Machine="MyMachine"><f:Message>WinRM cannot
complete the operation. Verify that the specified computer name is valid, that
the computer is accessible over the network, and that a firewall exception for
the WinRM service is enabled and allows access from this computer. By default,
the WinRM firewall exception for public profiles limits access to remote
computers within the same local subnet. </f:Message></f:WSManFault>
At line:1 char:1
+ Test-WSMan -ComputerName "DBServer"
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (DBServer:String) [Test-WSMan], InvalidOperationException
+ FullyQualifiedErrorId : WsManError,Microsoft.WSMan.Management.TestWSManCommand
PS C:\windows\system32> winrm e winrm/config/listener
Listener
Address = *
Transport = HTTP
Port = 5985
Hostname
Enabled = true
URLPrefix = wsman
CertificateThumbprint
ListeningOn = 127.0.0.1, 128.1.60.202, ::1
PS C:\windows\system32> winrm quickconfig
WinRM service is already running on this machine.
WinRM is already set up for remote management on this computer.
Firewall ports for WinRM are open for both HTTP and HTTPS.
Can anyone help with this issue?
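Ping and RDP succeeding does not show that the WinRM port itself is reachable from the client, so one way to narrow this down is a plain TCP probe of port 5985 on the server. The following is a minimal sketch; `tcp_port_open` is a made-up helper, and "DBServer" and the port number are taken from the output above.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling `tcp_port_open("DBServer", 5985)` from the client and getting False would point at a firewall or routing problem between the two subnets rather than at the WinRM configuration itself.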
Note: The following is likely not the root cause of your problem, but it explains the error message you saw on the server, which is cryptic and deserves an explanation (a simple typo confounded me too and led me here).
You have a typo in the resource URI you're passing to winrm e on your server:
winrm e winrm/config/listerner # note the extra "r"
should be:
winrm e winrm/config/listener
Unfortunately, a reference to a nonexistent resource URI results in the following cryptic error message:
WSManFault
Message
ProviderFault
WSManFault
Message = The WS-Management service cannot process the request. The resource URI does not support the Enumerate operation.
Error number: -2144108495 0x80338031
The WS-Management service cannot process the request because the WS-Addressing Action URI in the request is not compatible with the resource.
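Because the error text gives no hint that the URI is simply misspelled, a wrapper script could check the resource name before calling winrm. This is only a sketch: the list of known resources is a small, non-exhaustive assumption, and `suggest_resource` is a made-up helper.

```python
import difflib

# A few well-known WinRM configuration resources; the list is not exhaustive.
KNOWN_RESOURCES = [
    "winrm/config",
    "winrm/config/client",
    "winrm/config/service",
    "winrm/config/listener",
    "winrm/config/winrs",
]


def suggest_resource(uri: str):
    """Return uri if it is known, else the closest known resource, else None."""
    if uri in KNOWN_RESOURCES:
        return uri
    close = difflib.get_close_matches(uri.lower(), KNOWN_RESOURCES, n=1)
    return close[0] if close else None


if __name__ == "__main__":
    print(suggest_resource("winrm/config/listerner"))
```

For the misspelled URI from the question, the closest match is winrm/config/listener.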
