What is causing the systemd error "Timer unit lacks value setting" (with no other error messages)?

I have the following timer unit:
[Unit]
Description=Timer for Hive management command: process_payments

[Timer]
Unit=hive-manage@process_payments.service
OnCalendar=*:0/20

[Install]
WantedBy=hive.target
When I check the timer status using systemctl status hive-manage@process-payments.timer, I see the following error in the logs:
● hive-manage@process-payments.timer - Timer for Hive management command: process-payments
   Loaded: error (Reason: Invalid argument)
   Active: inactive (dead)

Mar 02 21:28:39 boldidea systemd[1]: hive-manage@process-payments.timer: Timer unit lacks value setting. Refusing.
Mar 02 21:39:06 boldidea systemd[1]: hive-manage@process-payments.timer: Timer unit lacks value setting. Refusing.
Mar 02 21:39:27 boldidea systemd[1]: hive-manage@process-payments.timer: Timer unit lacks value setting. Refusing.
After some searching, it seems most people get an accompanying message that gives more detail on the error; however, I am not getting any context other than "Timer unit lacks value setting".
The message on its own is not very helpful -- I'm not aware of any setting named "value".

It turns out I had an older unit called process-payments, which was later renamed to process_payments (underscore instead of hyphen). I was referencing the old name in my systemctl status command.
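A quick way to catch this kind of stale-name mistake is to ask systemd which timer units it actually knows about before querying one by name. A minimal sketch (the grep pattern is just an assumption based on the unit names above):

# list all timer units, including inactive ones, and filter for ours
systemctl list-timers --all | grep hive-manage
# query the renamed unit explicitly -- note the underscore
systemctl status hive-manage@process_payments.timer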

Related

Kubernetes Pod terminates with Exit Code 143

I am running a containerized Spring Boot application in Kubernetes, but the application automatically exits and restarts with exit code 143 and the error message "Error".
I am not sure how to identify the reason for this error.
My first idea was that Kubernetes stopped the container due to excessive resource usage, as described here, but I can't see the corresponding kubelet logs.
Is there any way to identify the cause/origin of the SIGTERM? Maybe from Spring Boot itself, or from the JVM?
Exit Code 143
It denotes that the process was terminated by an external signal.
The number 143 is the sum 128 + x, where x is the number of the signal sent to the process that caused it to terminate.
In this case x equals 15, which is the number of the SIGTERM signal, meaning the process was asked to shut down (SIGTERM is the catchable termination request, as opposed to the forced kill of SIGKILL).
Hope this helps.
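The 128 + x arithmetic is easy to reproduce in any POSIX shell; a minimal sketch:

# start a long-running process in the background
sleep 100 &
# send it SIGTERM (signal number 15)
kill -TERM $!
# wait reports the child's exit status: 128 + 15 = 143
wait $!
echo $?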
I've just run into this exact same problem. I was able to track down the origin of the exit code 143 by looking at the logs on the Kubernetes node (note: the logs on the node, not the pod). (I use Lens as an easy way to get a node shell, but there are other ways.)
If you then search /var/log/messages for "terminated", you'll see something like this:
Feb 2 11:52:27 np-26992252-3 kubelet[23125]: I0202 11:52:27.541751 23125 kubelet.go:2214] "SyncLoop (probe)" probe="liveness" status="unhealthy" pod="default/app-compute-deployment-56ccffd87f-8s78v"
Feb 2 11:52:27 np-26992252-3 kubelet[23125]: I0202 11:52:27.541920 23125 kubelet.go:2214] "SyncLoop (probe)" probe="readiness" status="" pod="default/app-compute-deployment-56ccffd87f-8s78v"
Feb 2 11:52:27 np-26992252-3 kubelet[23125]: I0202 11:52:27.543274 23125 kuberuntime_manager.go:707] "Message for Container of pod" containerName="app" containerStatusID={Type:containerd ID:c3426d6b07fe3bd60bcbe675bab73b6b4b3619ef4639e1c23bca82692633765e} pod="default/app-compute-deployment-56ccffd87f-8s78v" containerMessage="Container app failed liveness probe, will be restarted"
Feb 2 11:52:27 np-26992252-3 kubelet[23125]: I0202 11:52:27.543374 23125 kuberuntime_container.go:723] "Killing container with a grace period" pod="default/app-compute-deployment-56ccffd87f-8s78v" podUID=89fdc1a2-3a3b-4d57-8a4d-ab115e52dc85 containerName="app" containerID="containerd://c3426d6b07fe3bd60bcbe675bab73b6b4b3619ef4639e1c23bca82692633765e" gracePeriod=30
Feb 2 11:52:27 np-26992252-3 containerd[22741]: time="2023-02-02T11:52:27.543834687Z" level=info msg="StopContainer for \"c3426d6b07fe3bd60bcbe675bab73b6b4b3619ef4639e1c23bca82692633765e\" with timeout 30 (s)"
Feb 2 11:52:27 np-26992252-3 containerd[22741]: time="2023-02-02T11:52:27.544593294Z" level=info msg="Stop container \"c3426d6b07fe3bd60bcbe675bab73b6b4b3619ef4639e1c23bca82692633765e\" with signal terminated"
The bit to look out for is containerMessage="Container app failed liveness probe, will be restarted"
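For reference, a sketch of the same search once you have a shell on the node (the journalctl variant applies to nodes where the kubelet logs to journald instead of /var/log/messages):

# classic syslog-file nodes
grep -i "failed liveness probe" /var/log/messages
# journald-based nodes
journalctl -u kubelet | grep -i probe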

GCP / exporting disk image to Storage bucket fails

I'm trying to export a disk image I've built in GCP as a VMDK to a storage bucket.
The export throws an error message complaining that a service account was not found. I can't remember having deleted such an account; as far as I know, it should have existed since the creation of the project.
How can I re-create the default service account without taking the risk of losing all my Compute Engine resources? Which roles should I give this service account?
[image-export-ext.export-disk.setup-disks]: 2021-10-06T18:52:00Z CreateDisks: Creating disk "disk-export-disk-os-image-export-ext-export-disk-j8vpl".
[image-export-ext.export-disk.setup-disks]: 2021-10-06T18:52:00Z CreateDisks: Creating disk "disk-export-disk-buffer-j8vpl".
[image-export-ext.export-disk]: 2021-10-06T18:52:01Z Step "setup-disks" (CreateDisks) successfully finished.
[image-export-ext.export-disk]: 2021-10-06T18:52:01Z Running step "run-export-disk" (CreateInstances)
[image-export-ext.export-disk.run-export-disk]: 2021-10-06T18:52:01Z CreateInstances: Creating instance "inst-export-disk-image-export-ext-export-disk-j8vpl".
[image-export-ext]: 2021-10-06T18:52:07Z Error running workflow: step "export-disk" run error: step "run-export-disk" run error: operation failed &{ClientOperationId: CreationTimestamp: Description: EndTime:2021-10-06T11:52:07.153-07:00 Error:0xc000712230 HttpErrorMessage:BAD REQUEST HttpErrorStatusCode:400 Id:5314937137696624317 InsertTime:2021-10-06T11:52:02.707-07:00 Kind:compute#operation Name:operation-1633546321707-5cdb3a43ac385-839c7747-2ca655ee OperationGroupId: OperationType:insert Progress:100 Region: SelfLink:https://www.googleapis.com/compute/v1/projects/savvy-bonito-207708/zones/us-east1-b/operations/operation-1633546321707-5cdb3a43ac385-839c7747-2ca655ee StartTime:2021-10-06T11:52:02.708-07:00 Status:DONE StatusMessage: TargetId:840687976797195965 TargetLink:https://www.googleapis.com/compute/v1/projects/savvy-bonito-207708/zones/us-east1-b/instances/inst-export-disk-image-export-ext-export-disk-j8vpl User:494995903825@cloudbuild.gserviceaccount.com Warnings:[] Zone:https://www.googleapis.com/compute/v1/projects/savvy-bonito-207708/zones/us-east1-b ServerResponse:{HTTPStatusCode:200 Header:map[Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Wed, 06 Oct 2021 18:52:07 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} ForceSendFields:[] NullFields:[]}:
Code: EXTERNAL_RESOURCE_NOT_FOUND
Message: The resource '494995903825-compute@developer.gserviceaccount.com' of type 'serviceAccount' was not found.
[image-export-ext]: 2021-10-06T18:52:07Z Workflow "image-export-ext" cleaning up (this may take up to 2 minutes).
[image-export-ext]: 2021-10-06T18:52:08Z Workflow "image-export-ext" finished cleanup.
ERROR
ERROR: build step 0 "gcr.io/compute-image-tools/gce_vm_image_export:release" failed: step exited with non-zero status: 1
Go to IAM & Admin > IAM and check whether your default compute service account (PROJECT_NUMBER-compute@developer.gserviceaccount.com) is still there.
If it was deleted less than 30 days ago, it can be undeleted; after 30 days a default compute service account cannot be recovered.
If all the above fails, then you might need to go the custom service-account route, or share the image with a project that still has a default service account.
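A command-line sketch of that check and recovery (PROJECT_ID and ACCOUNT_UNIQUE_ID are placeholders; undelete needs the deleted account's numeric unique ID, which appears in the Cloud Audit Logs entry for its deletion, and the command shipped under gcloud beta):

# does the default compute service account still exist?
gcloud iam service-accounts list --project PROJECT_ID
# if deleted less than 30 days ago, undelete it by its numeric unique ID
gcloud beta iam service-accounts undelete ACCOUNT_UNIQUE_ID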

Issue installing openwhisk with incubator-openwhisk-devtools

I have a blocking issue installing OpenWhisk with Docker.
I typed make quick-start right after a git pull of the incubator-openwhisk-devtools project. My OS is Fedora 29, Docker version 18.09.0, docker-compose version 1.22.0, Oracle JDK 8.
I get the following error:
[...]
adding the function to whisk ...
ok: created action hello
invoking the function ...
error: Unable to invoke action 'hello': The server is currently unavailable (because it is overloaded or down for maintenance). (code ciOZDS8VySDyVuETF14n8QqB9wifUboT)
[...]
[ERROR] [#tid_sid_unknown] [Invoker] failed to ping the controller: org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for health-0: 30069 ms has passed since batch creation plus linger time
[ERROR] [#tid_sid_unknown] [KafkaProducerConnector] sending message on topic 'health' failed: Expiring 1 record(s) for health-0: 30009 ms has passed since batch creation plus linger time
Please note that controller-local-logs.log is never created.
If I create controller-local-logs.log with touch in the right directory, the file is still empty after I run make quick-start again.
http://localhost:8888/ping gives me the right answer: pong.
http://localhost:9222 is not reachable.
Where am I going wrong?
Thank you in advance
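The Kafka timeouts suggest the invoker never manages to reach the Kafka broker. One way to narrow this down, sketched here on the assumption that the devtools docker-compose stack names its broker service kafka:

# check which compose services are up, restarting, or exited
docker-compose ps
# inspect the broker's own logs for startup failures (service name assumed)
docker-compose logs kafka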

Postgres Backup Restoration Issue

My objective is simple: just take a backup and restore it on another machine that has no relation to the running cluster.
My steps:
1. Remotely run pg_basebackup onto the new machine.
2. rm -fr ../../main/
3. mv backup/main/ ../../main/
4. Start the postgres service.
No errors occur during the backup, but on startup I get this error:
2018-12-13 10:05:12.437 IST [834] LOG: database system was shut down in recovery at 2018-12-12 23:01:58 IST
2018-12-13 10:05:12.437 IST [834] LOG: invalid primary checkpoint record
2018-12-13 10:05:12.437 IST [834] LOG: invalid secondary checkpoint record
2018-12-13 10:05:12.437 IST [834] PANIC: could not locate a valid checkpoint record
2018-12-13 10:05:12.556 IST [833] LOG: startup process (PID 834) was terminated by signal 6: Aborted
2018-12-13 10:05:12.556 IST [833] LOG: aborting startup due to startup process failure
2018-12-13 10:05:12.557 IST [833] LOG: database system is shut down
Based on the answer to a very similar question (How to mount a pg_basebackup on a stand alone server to retrieve accidentally deleted data), and on the fact that that answer helped me get this working glitch-free, the steps are:
1. Do the basebackup, or copy/untar a previously made one, to the right location: /var/lib/postgresql/9.5/main
2. Remove the file backup_label.
3. Run /usr/lib/postgresql/9.5/bin/pg_resetxlog -f /var/lib/postgresql/9.5/main
4. Start the postgres service.
(Replying to this old question because it is the first one I found when looking for the solution to the same problem.)
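Spelled out as shell commands, the same procedure looks roughly like this; a sketch assuming the 9.5 paths above, with the backup source host and replication user as placeholders, and the service name possibly differing per distro:

# 1. take the base backup straight into the data directory (run as the postgres user)
pg_basebackup -h PRIMARY_HOST -U REPL_USER -D /var/lib/postgresql/9.5/main
# 2. remove the backup label left behind by pg_basebackup
rm /var/lib/postgresql/9.5/main/backup_label
# 3. force-reset the transaction log so startup no longer needs the old checkpoint
/usr/lib/postgresql/9.5/bin/pg_resetxlog -f /var/lib/postgresql/9.5/main
# 4. start the service
systemctl start postgresql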

TFS lab - Aborting run: The start date cannot occur after the end date

Recently we have been working on integrating Telerik test cases into a TFS 2012 BDT (build-deploy-test) workflow.
But when there is a failed test in the test plan, all test cases following the failed one are aborted with the following error:
"ID" "Date and time" "Message"
23 "09/23/2014 15:50:34" "Error saving the test results: The start date 2014/9/23 15:50:05 cannot occur after the end date 2014/9/23 7:50:34.
Will retry 3 time(s)."
24 "09/23/2014 15:51:05" "Error saving the test results: The start date 2014/9/23 15:50:05 cannot occur after the end date 2014/9/23 7:50:34.
Will retry 2 time(s)."
25 "09/23/2014 15:51:36" "Error saving the test results: The start date 2014/9/23 15:50:05 cannot occur after the end date 2014/9/23 7:50:34.
Will retry 1 time(s)."
26 "09/23/2014 15:52:06" "Error saving the test results: The start date 2014/9/23 15:50:05 cannot occur after the end date 2014/9/23 7:50:34.
Will retry 0 time(s)."
27 "09/23/2014 15:52:06" "Unexpected error occurred. Aborting run: The start date 2014/9/23 15:50:05 cannot occur after the end date 2014/9/23 7:50:34."
If all test cases pass, the issue does not occur. If we run the MSTest UI tests, no such issue occurs even when a test case fails.
TFS version: 2012 RTM.
All TFS Lab machines are Hyper-V VMs, including the test controller and the build controller, and the build controller, test controller, and test agent VMs all use the China (Beijing) time zone.
We have tried changing the time zone to UTC on the test controller, build controller, and test agent VMs, but the issue still exists.
We have also checked that the host server uses the Beijing time zone. The TFS infrastructure is in the company domain, and time is correctly synced.
There are similar errors in other posts, but we don't have any clue how to fix this, because all clocks are correctly synced.
Thanks M.Radwan.
We found out that our test agent was on 2012 Update 4 while the test controller was still on 2012 RTM.
After upgrading the test controller to 2012 Update 4, the error was gone. (The mismatch fits the symptom: the start and end dates in the error differ by exactly eight hours, the Beijing offset from UTC, so one component was presumably recording local time while the other recorded UTC.)
