How to read TimeoutStartSec value in systemD configuration from Application via Dbus interfaces - systemd

In my service configuration TimeoutStartSec == 100s.
According to man page.. my Application need to notify to systemD sd_notify(READY=1) during <100s. If not service is put into failed state.
https://www.freedesktop.org/software/systemd/man/systemd.service.html
But in case of i want to do something ( eg just print out some log said : startup is not done in time ) . before my service is actually set to failed state .
Is there any change to do that...
My idea is create a timer which have same value with TimeoutStartSec == xx s
then i can manage to do something before timer expired.
But the question is TimeoutStartSec == xx is dynamicaly configured by user - in my project..
So i would expect some Dbus interface which will offer to read TimeoutStartSec from my application...
I checked
https://www.freedesktop.org/wiki/Software/systemd/dbus/
but did not found a corresponding property.
I am using systemD on Linux which freely use systemD Dbus interfaces.

I found solution .
SystemD actually provide that info
dbus-send --system --dest=org.freedesktop.systemd1 --print-reply /org/freedesktop/systemd1/unit/ServiceName_2eservice \
org.freedesktop.DBus.Properties.Get string:org.freedesktop.systemd1.Service string:TimeoutStartUSec
Note: your name of service need to modify to get exactly object path ServiceName.service adapt to ServiceName_2eservice

Related

Ambari User & Group Management for Custom Ambari Services

I have been working with Custom Ambari Services for quite some time. I have been able to install several different custom components. I have created several management packs and consider myself very experienced in making third party services work in Ambari.
Whenever I install a custom service I get a user KeyError, for example Elasticsearch:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 38, in <module>
BeforeAnyHook().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
method(env)
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 31, in hook
setup_users()
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/shared_initialization.py", line 50, in setup_users
groups = params.user_to_groups_dict[user],
KeyError: u'elasticsearch'
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-15.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-15.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
A known work around is to execute a python command to turn off user/group management:
python /var/lib/ambari-server/resources/scripts/configs.py -u admin -p admin -n [CLUSTER_NAME] -l [CLUSTER_FQDN] -t 8080 -a set -c cluster-env -k ignore_groupsusers_create -v true
However, this leaves the cluster in an undesirable state if you want to install native services again. If I execute the python command to turn user/group management back on, the next native service install will again fail on the third party user key object error.
Is there a database table that contains the list or key value object of users and groups that ambari manages? Satisfying the original error seems like the only turnkey solution.
I have tried to locate the key value object myself, I have also tried creating the users groups, I have even tried modifying the agent/server code executing the install. Next I will try more but I thought maybe this would be a good first post for SO.
Stuck with the same error for a few hours, here is the result of the investigation.
First of all, we need to know that the Ambari has one main group for all services in stack.
Secondly, the creation of the user is quite hidden with one look you will never guess when and where the creation will be and from where the parameters will come.
And for the last there is quiestion, how we setup the params.user_to_groups_dict[user]?
The 'main group' is set in <stack_name>/<stack_version>/configuration/cluster-env.xml, for me it was HDP/3.0/configuration/cluster-env.xml:
<property>
<name>user_group</name>
<display-name>Hadoop Group</display-name>
<value>hadoop</value>
<property-type>GROUP</property-type>
<description>Hadoop user group.</description>
<value-attributes>
<type>user</type>
<overridable>false</overridable>
</value-attributes>
<on-ambari-upgrade add="true"/>
</property>
That parameter will be used everywhere in services to claim the group, for example zookeeper has the env.xml such as:
<property>
<name>zk_user</name>
<display-name>ZooKeeper User</display-name>
<value>sdp-zookeeper</value>
<property-type>USER</property-type>
<description>ZooKeeper User.</description>
<value-attributes>
<type>user</type>
<overridable>false</overridable>
<user-groups>
<property>
<type>cluster-env</type>
<name>user_group</name>
</property>
</user-groups>
</value-attributes>
<on-ambari-upgrade add="true"/>
</property>
And there is a magic in value-attributes: user-groups with one property that links to the cluster-env to user_group parameter. This is the connection that we are looking for.
The answer is setup the user parameter like zookeeper user.
The Wizard searches the stack and find the right users/groups to manage of the services you have chosen with wizard.
The map that contains the params.user_to_groups_dict will be created in runtime the cluster wizard and avalialbe at /var/lib/ambari-agent/data/command-xy.json:
"clusterLevelParams": {
"stack_version": "3.0",
"not_managed_hdfs_path_list": "[\"/tmp\"]",
"hooks_folder": "stack-hooks",
"stack_name": "HDP",
"group_list": "[\"sdp-hadoop\",\"users\"]",
"user_groups": "{\"httpfs\":[\"hadoop\"],\"ambari-qa\":[\"hadoop\",\"users\"],\"hdfs\":[\"hadoop\"],\"zookeeper\":[\"hadoop\"]}",
"cluster_name": "test",
"dfs_type": "HDFS",
"user_list": "[\"httpfs\",\"hdfs\",\"ambari-qa\",\"sdp-zookeeper\"]"
},

Terraform azurerm_virtual_machine_extension error "extension operations are disallowed"

I have written a Terraform template that creates an Azure Windows VM. I need to configure the VM to Enable PowerShell Remoting for the release pipeline to be able to execute Powershell scripts. After the VM is created I can RDP to the VM and do everything I need to do to enable Powershell remoting, however, it would be ideal if I could script all of that so it could be executed in a Release pipeline. There are two things that prevent that.
The first, and the topic of this question is, that I have to run "WinRM quickconfig". I have the template working such that when I do RDP to the VM, after creation, that when I run "WinRM quickconfig" I receive the following responses:
WinRM service is already running on this machine.
WinRM is not set up to allow remote access to this machine for management.
The following changes must be made:
Configure LocalAccountTokenFilterPolicy to grant administrative rights remotely to local users.
Make these changes [y/n]?
I want to configure the VM in Terraform so LocalAccountTokenFilterPolicy is set and it becomes unnecessary to RDP to the VM to run "WinRM quickconfig". After some research it appeared I might be able to do that using the resource azure_virtual_machine_extension. I add this to my template:
resource "azurerm_virtual_machine_extension" "vmx" {
name = "hostname"
location = "${var.location}"
resource_group_name = "${var.vm-resource-group-name}"
virtual_machine_name = "${azurerm_virtual_machine.vm.name}"
publisher = "Microsoft.Azure.Extensions"
type = "CustomScript"
type_handler_version = "2.0"
settings = <<SETTINGS
{
# "commandToExecute": "powershell Set-ItemProperty -Path 'HKLM:\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Policies\\System' -Name 'LocalAccountTokenFilterPolicy' -Value 1 -Force"
}
SETTINGS
}
When I apply this, I get the error:
Error: compute.VirtualMachineExtensionsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="OperationNotAllowed" Message="This operation cannot be performed when extension operations are disallowed. To allow, please ensure VM Agent is installed on the VM and the osProfile.allowExtensionOperations property is true."
I couldn't find any Terraform documentation that addresses how to set the allowExtensionOperations property to true. On a whim, I tried adding the property "allow_extension_operations" to the os_profile block in the azurerm_virtual_machine resource but it is rejected as an invalid property. I also tried adding it to the os_profile_windows_config block and isn't valid there either.
I found a statement on Microsoft's documentation regarding the osProfile.allowExtensionOperations property that says:
"This may only be set to False when no extensions are present on the virtual machine."
https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.management.compute.models.osprofile.allowextensionoperations?view=azure-dotnet
This implies to me that the property is True by default but it doesn't actually say that and it certainly isn't acting like that. Is there a way in Terraform to set osProfile.alowExtensionOperations to true?
Running into the same issue adding extensions using Terraform, i created a Windows 2016 custom image,
provider "azurerm" version ="2.0.0"
Terraform 0.12.24
Terraform apply error:
compute.VirtualMachineExtensionsClient#CreateOrUpdate: Failure sending request: StatusCode=0
-- Original Error: autorest/azure: Service returned an error.
Status=<nil>
Code="OperationNotAllowed"
Message="This operation cannot be performed when extension operations are disallowed. To allow, please ensure VM Agent is installed on the VM and the osProfile.allowExtensionOperations property is true."
I ran into same error, possible solution depends on 2 things here.
You have to pass provider "azurerm" version ="2.5.0 and you have to pass os_profile_windows_config (see below) parameter in virtual machine resource as well. So, that terraform will consider the extensions that your are passing. This fixed my errors.
os_profile_windows_config {
provision_vm_agent = true
}

libnetwork: Error: unknown command "/var/run/docker/netns/582bd184e561" for "some_app"

I am trying to setup a network in the container (using Docker's libnetwork and libcontainer), but I keep running into this issue. As far as I can tell it's looking into some_app to get some sandbox information?
INFO[3808] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]
INFO[3808] IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]
Error: unknown command "/var/run/docker/netns/582bd184e561" for "some_app"
Run 'some_app --help' for usage.
ERRO[3808] Resolver Setup/Start failed for container 6b81802576bd4f16aa117061f81b5c3e, "setup not done yet"
ERRO[3808] failed to add interface vethef0a693 to sandbox: failed in prefunc: failed to set namespace on link "vethef0a693": invalid argument
ERRO[3808] failed to add interface vethef0a693 to sandbox: failed in prefunc: failed to set namespace on link "vethef0a693": invalid argument
I was wondering if anyone could help me make sense of this and perhaps prevent it. Are these two separate errors?
Thank you
Here is the library I am trying to use
It took me a while to figure this out, but here goes:
Just like in Docker, libnetwork creates a veth interface pair. It then moves one end of the veth pair into the container namespace. During this process libnetwork tries to execute commands registered at runtime on the current instance of the binary (some_app in this case).
These commands do not exist on the external interface of some_app however. They are injected later using a library called reexec. For this to work, reexec needs to be initialized like this:
if reexec.Init() {
return
}
Also note that according to this thread libnetwork is currently not supported for applications outside of Docker.
NB: I discovered this by reading the source code, so I might be wrong but my issue went away after this.

Informatica error 1417 :: Task not yet registered with this service process

I am getting following error while running a workflow in informatica.
Session task instance [worklet.session] : [TM_6775 The master DTM process was unable to connect to the master service process to update the session status with the following message: error message [ERROR: The session run for [Session task instance [worklet.session]] and [ folder id = 206, workflow id = 16042, workflow run id = 65095209, worklet run id = 65095337, task instance id = 13272 ] is not yet registered with this service process.] and error code [1417].]
This error comes randomly for many other sessions, when they are ran through workflow as a whole. However if I "start task" that failed task next time, it runs successfully.
Any help is much appreciated.
Just an idea to try if you use versioning. Check that everthing is checked in correctly. If the mapping, worflow or worklet is checked out then you and informatica will run different versions wich may cause the behaivour to differ when you start it manually.
Infromatica will allways use the checked in version and you will allways use the checked out version.

wsadmin + jython restart WAS appserver

Is it possible to stop/start WAS appserver using wsadmin (jacl/jython). I want to detele all caches on profile and then restart WAS appserver. I'm using wsadmin as standalone.
From wsadmin you may issue a command (using Jython):
AdminControl.invoke(AdminControl.queryNames('WebSphere:*,type=Server,node=%s,process=%s' % ('YourNodeName', 'YourServerName')), 'restart')
works with WAS Base & ND.
With ND you have another option:
AdminControl.invoke(AdminControl.queryNames('WebSphere:*,type=Server,node=%s,process=%s' % ('YourNodeName', 'YourServerName')), 'stop')
# now your server is stopped, you can do any cleanup
# and then start the server with NodeAgent
AdminControl.invoke(AdminControl.queryNames('WebSphere:*,type=NodeAgent,node=%s' % 'YourNodeName'), 'launchProcess', ['YourServerName'], ['java.lang.String'])
Check out the wsadminlib script. It has over 500 methods written for you to perform specific wsadmin tasks. Also check out related wsadminlib blog - you'll definitely want to view the powerpoint on this site to get an overview of usage.
You don't specify which caches you would like to clear. If you want to clear dynacache, wsadminlib offers clearDynaCache, clearAllProxyCaches, and others as well as server restart methods.
Example usage:
import sys
execfile('/opt/software/portalsoftware/wsadminlib/wsadminlib.py')
clearAllProxyCaches()
for (nodename,servername) in listAllAppServers():
clearDynaCache( nodename, servername, dynacachename )
save()
maxwaitseconds=300
restartServer( nodename, servername, maxwaitseconds)

Resources