Lambda function from a container image runs a bash script on invocation. The script works perfectly but still throws Runtime.ExitError at the end - bash

I am running a Lambda function from a container image. Basically, this image runs a bash script whenever someone invokes it. The script works perfectly, but it still throws a Runtime.ExitError error at the end. How can I fix this?
START RequestId: 13b31468-aa8d-45cc-9c9a-b4fcaaca9b79 Version: $LATEST
main.sh
2022-04-05 16:29:09 arghyadockercli01
2022-03-31 17:38:17 aws-cloudtrail-logs-201043775914-116d21af
2022-04-05 18:45:45 dkabdjkdjse
2022-04-04 14:31:41 dsfdhfjdhdhhjkjfhjdhfj
2022-04-04 05:13:06 ghfjgfhjg
2022-04-01 14:10:49 s3-trail-log-1
2022-03-31 15:46:30 s3trail-bucket
file copied 01
main.sh
2022-04-05 16:29:09 arghyadockercli01
2022-03-31 17:38:17 aws-cloudtrail-logs-201043775914-116d21af
2022-04-05 18:45:45 dkabdjkdjse
2022-04-04 14:31:41 dsfdhfjdhdhhjkjfhjdhfj
2022-04-04 05:13:06 ghfjgfhjg
2022-04-01 14:10:49 s3-trail-log-1
2022-03-31 15:46:30 s3trail-bucket
file copied 01
END RequestId: 13b31468-aa8d-45cc-9c9a-b4fcaaca9b79
REPORT RequestId: 13b31468-aa8d-45cc-9c9a-b4fcaaca9b79 Duration: 11095.96 ms Billed Duration: 11096 ms Memory Size: 128 MB Max Memory Used: 49 MB
RequestId: 13b31468-aa8d-45cc-9c9a-b4fcaaca9b79 Error: Runtime exited without providing a reason
Runtime.ExitError

Related

Compute Engine start node server on instance startup

I am trying to run a discord bot in a node application in a free Compute Engine instance. I am struggling to make a script that actually starts the node app.
I created this script and added it as startup-script metadata from file:
cd code/movo-tron-2000 && npm start &
I checked that the script runs with sudo google_metadata_script_runner --script-type startup --debug, but when I restart the instance, the app doesn't start. Running sudo journalctl -u google-startup-scripts.service prints the following logs:
Apr 20 12:19:08 bot-vm systemd[1]: Starting Google Compute Engine Startup Scripts...
Apr 20 12:19:09 bot-vm startup-script[691]: INFO Starting startup scripts.
Apr 20 12:19:09 bot-vm startup-script[691]: INFO Found startup-script in metadata.
Apr 20 12:19:09 bot-vm startup-script[691]: INFO startup-script: /startup-od52epug/tmpjy_z4vue: line 1: cd: code/mo
Apr 20 12:19:09 bot-vm startup-script[691]: INFO startup-script: Return code 0.
Apr 20 12:19:09 bot-vm startup-script[691]: INFO Finished running startup scripts.
Apr 20 12:19:09 bot-vm systemd[1]: Started Google Compute Engine Startup Scripts.
I see that the script gets executed, but it also gets terminated. The app listens for requests, so it has to keep running rather than be terminated. I assume that the startup script runs on the same thread as the Google Compute Engine startup script, so it gets terminated in order to continue the VM boot. What should I change in my startup script in order to start my app properly and not have it terminated by the instance?
Edit: I set up the following systemd service and script at their corresponding locations
Service:
[Unit]
Description=Start bot
[Service]
ExecStart=/home/me_adi_hf/code/movo-tron-2000/start.sh
[Install]
WantedBy=default.target
Script:
#!/bin/sh
date > /root/bot_report.txt
du -sh /home/ >> /root/bot_report.txt
But when running sudo systemctl start bot.service and then checking its status with sudo systemctl status bot.service, I get this output, indicating an Exec format error:
bot-start.service - Start bot
Loaded: loaded (/etc/systemd/system/bot-start.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2020-04-21 09:43:01 UTC; 9s ago
Process: 19303 ExecStart=/home/me_adi_hf/code/movo-tron-2000/start.sh (code=exited, status=203/EXEC)
Main PID: 19303 (code=exited, status=203/EXEC)
Apr 21 09:43:01 bot-vm systemd[1]: Started Start bot.
Apr 21 09:43:01 bot-vm systemd[19303]: bot-start.service: Failed at step EXEC spawning /home/me_adi_hf/code/movo-tron-2000/start.sh: Exec format error
Apr 21 09:43:01 bot-vm systemd[1]: bot-start.service: Main process exited, code=exited, status=203/EXEC
Apr 21 09:43:01 bot-vm systemd[1]: bot-start.service: Unit entered failed state.
Apr 21 09:43:01 bot-vm systemd[1]: bot-start.service: Failed with result 'exit-code'.
I am not sure what causes the error, since the service file syntax looks correct.

docker installation of openproject: Phusion passenger fails to start after installation

I am trying to install OpenProject using Docker on CentOS 7.6, but Phusion Passenger fails to start after installation. The error suggests that it failed to parse a response:
"The preloader process sent an unparseable response:". I don't know how to fix this issue.
stdout:
-----> Database setup finished.
On first installation, the default admin credentials are login: admin, password: admin
-----> Launching supervisord...
2019-05-08 08:14:46,313 CRIT Supervisor running as root (no user in config file)
2019-05-08 08:14:46,318 INFO supervisord started with pid 1
2019-05-08 08:14:47,321 INFO spawned: 'postgres' with pid 155
2019-05-08 08:14:47,325 INFO spawned: 'apache2' with pid 156
2019-05-08 08:14:47,328 INFO spawned: 'web' with pid 157
2019-05-08 08:14:47,331 INFO spawned: 'worker' with pid 158
2019-05-08 08:14:47,351 INFO spawned: 'postfix' with pid 159
2019-05-08 08:14:47,360 INFO spawned: 'memcached' with pid 160
2019-05-08 08:14:47.634 UTC [172] LOG: database system was shut down at 2019-05-08 08:14:44 UTC
2019-05-08 08:14:47,634 INFO success: postfix entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2019-05-08 08:14:47.649 UTC [172] LOG: MultiXact member wraparound protections are now enabled
2019-05-08 08:14:47.653 UTC [155] LOG: database system is ready to accept connections
2019-05-08 08:14:47.663 UTC [177] LOG: autovacuum launcher started
2019-05-08 08:14:48,670 INFO success: postgres entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-05-08 08:14:48,670 INFO success: apache2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-05-08 08:14:48,670 INFO success: web entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-05-08 08:14:48,670 INFO success: worker entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-05-08 08:14:48,670 INFO success: memcached entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.2. Set the 'ServerName' directive globally to suppress this message
2019-05-08 08:14:50,198 INFO exited: postfix (exit status 0; expected)
--> Downloading a Phusion Passenger agent binary for your platform
--> Installing Nginx 1.15.8 engine
--------------------------
[passenger_native_support.so] trying to compile for the current user (app) and Ruby interpreter...
(set PASSENGER_COMPILE_NATIVE_SUPPORT_BINARY=0 to disable)
Compilation successful. The logs are here:
/tmp/passenger_native_support-15tsfhk.log
[passenger_native_support.so] successfully loaded.
=============== Phusion Passenger Standalone web server started ===============
PID file: /app/tmp/pids/passenger.8080.pid
Log file: /app/log/passenger.8080.log
Environment: production
Accessible via: http://0.0.0.0:8080/
You can stop Phusion Passenger Standalone by pressing Ctrl-C.
Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/
===============================================================================
[ N 2019-05-08 08:15:01.7338 404/Tb age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours)
Forcefully loading the application. Use :environment to avoid eager loading.
[auth_saml] Missing settings from '/app/config/plugins/auth_saml/settings.yml', skipping omniauth registration.
hook registered
App 439 output: [auth_saml] Missing settings from '/app/config/plugins/auth_saml/settings.yml', skipping omniauth registration.
App 439 output: hook registered
Creating scope :order_by_name. Overwriting existing method Sprint.order_by_name.
App 439 output: Creating scope :order_by_name. Overwriting existing method Sprint.order_by_name.
[Worker(host:d0b3748f627a pid:158)] Starting job worker
2019-05-08T08:15:45+0000: [Worker(host:d0b3748f627a pid:158)] Starting job worker
App 439 output: /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `fork': Cannot allocate memory - fork(2) (Errno::ENOMEM)
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `handle_spawn_command'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:78:in `accept_and_process_next_client'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:167:in `run_main_loop'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:207:in `<module:App>'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:30:in `<module:PhusionPassenger>'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:29:in `<main>'
[ E 2019-05-08 08:15:46.6971 404/Tc age/Cor/App/Implementation.cpp:221 ]: Could not spawn process for application /app: The preloader process sent an unparseable response:
Error ID: d7825364
Error details saved to: /tmp/passenger-error-wjSTKF.html
[ E 2019-05-08 08:15:46.7028 404/T8 age/Cor/Con/CheckoutSession.cpp:276 ]: [Client 1-1] Cannot checkout session because a spawning error occurred. The identifier of the error is d7825364. Please see earlier logs for details about the error.
[ W 2019-05-08 08:34:24.7967 404/Tk age/Cor/Spa/SmartSpawner.h:572 ]: An error occurred while spawning an application process: Cannot connect to Unix socket '/tmp/passenger.PKROzbY/apps.s/preloader.hyl9g8': No such file or directory (errno=2)
[ W 2019-05-08 08:34:24.7968 404/Tk age/Cor/Spa/SmartSpawner.h:574 ]: The application preloader seems to have crashed, restarting it and trying again...
App 543 output: [auth_saml] Missing settings from '/app/config/plugins/auth_saml/settings.yml', skipping omniauth registration.
App 543 output: hook registered
App 543 output: Creating scope :order_by_name. Overwriting existing method Sprint.order_by_name.
App 543 output: /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `fork': Cannot allocate memory - fork(2) (Errno::ENOMEM)
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `handle_spawn_command'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:78:in `accept_and_process_next_client'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:167:in `run_main_loop'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:207:in `<module:App>'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:30:in `<module:PhusionPassenger>'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:29:in `<main>'
[ E 2019-05-08 08:34:52.2521 404/Tk age/Cor/App/Implementation.cpp:221 ]: Could not spawn process for application /app: The preloader process sent an unparseable response:
Error ID: c2ce0823
Error details saved to: /tmp/passenger-error-bpsfAC.html
[ E 2019-05-08 08:34:52.2570 404/T8 age/Cor/Con/CheckoutSession.cpp:276 ]: [Client 1-2] Cannot checkout session because a spawning error occurred. The identifier of the error is c2ce0823. Please see earlier logs for details about the error.
Thanks.
The important line in the log is this one:
App 439 output: /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `fork': Cannot allocate memory - fork(2) (Errno::ENOMEM)
This means your container is unable to allocate the memory it needs. It could be that your system is in an OOM state and processes are being killed, or that some other restriction on the daemon prevents it from allocating additional memory.
For reference:
https://success.docker.com/article/docker-daemon-error-cannot-allocate-memory
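To check whether the host really is short on memory, a minimal Python sketch along these lines may help. It assumes a Linux host where /proc/meminfo is readable. The function names parse_meminfo and available_mib are hypothetical helpers made up for this illustration, not part of Docker or Passenger.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: KiB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Field: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_mib(info):
    """Return MemAvailable in MiB, falling back to MemFree if it is absent."""
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib // 1024
```

On the Docker host, passing the contents of /proc/meminfo to parse_meminfo and printing available_mib(...) together with the SwapFree entry gives a rough idea of the headroom. A low value with no free swap would be consistent with fork failing with ENOMEM, although, as noted above, other restrictions can produce the same error.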

Hadoop Container failed even 100 percent completed

I have set up a small cluster with Hadoop 2.7, HBase 0.98 and Nutch 2.3.1. I have written a custom job that first combines the docs of the same domain. After that, each URL of the domain is obtained from a cache (i.e., a list), the corresponding key is used to fetch the object via datastore.get(url_key), and, after the score is updated, the object is written via context.write.
The job should complete after all docs are processed, but I have observed that each attempt fails due to a timeout even though its progress is shown as 100 percent complete. Here is the log:
attempt_1549963404554_0110_r_000001_1 100.00 FAILED reduce > reduce node2:8042 logs Thu Feb 21 20:50:43 +0500 2019 Fri Feb 22 02:11:44 +0500 2019 5hrs, 21mins, 0sec AttemptID:attempt_1549963404554_0110_r_000001_1 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
attempt_1549963404554_0110_r_000001_3 100.00 FAILED reduce > reduce node1:8042 logs Fri Feb 22 04:39:08 +0500 2019 Fri Feb 22 07:25:44 +0500 2019 2hrs, 46mins, 35sec AttemptID:attempt_1549963404554_0110_r_000001_3 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
attempt_1549963404554_0110_r_000002_0 100.00 FAILED reduce > reduce node3:8042 logs Thu Feb 21 12:38:45 +0500 2019 Thu Feb 21 22:50:13 +0500 2019 10hrs, 11mins, 28sec AttemptID:attempt_1549963404554_0110_r_000002_0 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
Why is this so? When an attempt is 100.00 percent complete, it should be marked as successful. Unfortunately, there isn't any error information other than the timeout in my case. How can I debug this problem?
My reducer is, more or less, posted in another question:
Apache Nutch 2.3.1 map-reduce timeout occurred while updating the score
I have observed that, in the three logs mentioned, the execution time varies widely. Please take another look at the job you are executing.
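To make that comparison concrete, here is a small Python sketch that converts the elapsed times from the three attempts into seconds and relates them to the 1800-second timeout shown in the log. The helper name elapsed_seconds is made up for this illustration. The attempt IDs and durations are copied from the log above.

```python
import re


def elapsed_seconds(text):
    """Convert a string such as '5hrs, 21mins, 0sec' into seconds."""
    units = {"hrs": 3600, "mins": 60, "sec": 1}
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(hrs|mins|sec)", text):
        total += int(value) * units[unit]
    return total


# Elapsed times as reported for each failed attempt in the log.
attempts = {
    "attempt_1549963404554_0110_r_000001_1": "5hrs, 21mins, 0sec",
    "attempt_1549963404554_0110_r_000001_3": "2hrs, 46mins, 35sec",
    "attempt_1549963404554_0110_r_000002_0": "10hrs, 11mins, 28sec",
}
TIMEOUT_SECS = 1800  # from "Timed out after 1800 secs" in the log

for name, elapsed in attempts.items():
    secs = elapsed_seconds(elapsed)
    print(name, secs, "seconds elapsed,", secs // TIMEOUT_SECS, "timeout windows")
```

Every attempt ran for many multiples of the timeout before it was killed, so the timeout was probably not triggered by total runtime. In Hadoop, a task is killed when it neither reads input, writes output, nor updates its status for the period set by mapreduce.task.timeout, so a likely place to look is a long silent phase at the end of the reducer.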

Spark streaming from Kafka returns result on local but Not working on Yarn

I am using Cloudera's VM CDH 5.12, Spark v1.6, Kafka (installed by yum) v0.10, Python 2.66 and Scala 2.10.
Below is a simple Spark application that I am running. It takes events from Kafka and prints them after a map and reduce step.
from __future__ import print_function
import sys
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: kafka_wordcount.py <zk> <topic>", file=sys.stderr)
        exit(-1)
    sc = SparkContext(appName="PythonStreamingKafkaWordCount")
    ssc = StreamingContext(sc, 1)
    zkQuorum, topic = sys.argv[1:]
    kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", {topic: 1})
    lines = kvs.map(lambda x: x[1])
    counts = lines.flatMap(lambda line: line.split(" ")) \
        .map(lambda word: (word, 1)) \
        .reduceByKey(lambda a, b: a+b)
    counts.pprint()
    ssc.start()
    ssc.awaitTermination()
When I submit the above code using the following command (local), it runs fine:
spark-submit --master local[2] --jars /usr/lib/spark/lib/spark-examples.jar testfile.py <ZKhostname>:2181 <kafka-topic>
But when I submit the same code using the following command (YARN), it doesn't work:
spark-submit --master yarn --deploy-mode client --jars /usr/lib/spark/lib/spark-examples.jar testfile.py <ZKhostname>:2181 <kafka-topic>
Here is the log generated when running on YARN (cut short; the logs may differ from the Spark settings mentioned above):
INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.134.143
ApplicationMaster RPC port: 0
queue: root.cloudera
start time: 1515766709025
final status: UNDEFINED
tracking URL: http://quickstart.cloudera:8088/proxy/application_1515761416282_0010/
user: cloudera
40 INFO YarnClientSchedulerBackend: Application application_1515761416282_0010 has started running.
40 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 53694.
40 INFO NettyBlockTransferService: Server created on 53694
53 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
54 INFO BlockManagerMasterEndpoint: Registering block manager quickstart.cloudera:56220 with 534.5 MB RAM, BlockManagerId(1, quickstart.cloudera, 56220)
07 INFO ReceiverTracker: Starting 1 receivers
07 INFO ReceiverTracker: ReceiverTracker started
07 INFO PythonTransformedDStream: metadataCleanupDelay = -1
07 INFO KafkaInputDStream: metadataCleanupDelay = -1
07 INFO KafkaInputDStream: Slide time = 10000 ms
07 INFO KafkaInputDStream: Storage level = StorageLevel(false, false, false, false, 1)
07 INFO KafkaInputDStream: Checkpoint interval = null
07 INFO KafkaInputDStream: Remember duration = 10000 ms
07 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@7137ea0e
07 INFO PythonTransformedDStream: Slide time = 10000 ms
07 INFO PythonTransformedDStream: Storage level = StorageLevel(false, false, false, false, 1)
07 INFO PythonTransformedDStream: Checkpoint interval = null
07 INFO PythonTransformedDStream: Remember duration = 10000 ms
07 INFO PythonTransformedDStream: Initialized and validated org.apache.spark.streaming.api.python.PythonTransformedDStream@de77734
10 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 5.8 KB, free 534.5 MB)
10 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 3.5 KB, free 534.5 MB)
20 INFO JobScheduler: Added jobs for time 1515766760000 ms
30 INFO JobScheduler: Added jobs for time 1515766770000 ms
40 INFO JobScheduler: Added jobs for time 1515766780000 ms
After this, the job just keeps repeating the following lines (after a delay set by the streaming context) and doesn't print Kafka's stream, whereas the job with --master local and the exact same code does.
Interestingly, it prints the following line every time a Kafka event occurs (the picture is from a run with increased Spark memory settings).
Note that:
The data is in Kafka, and I can see it in the consumer console.
I have also tried increasing the executor's memory (3g) and the network timeout (800s), but with no success.
Can you see the application's stdout logs through the YARN Resource Manager UI?
Follow your YARN Resource Manager link (http://localhost:8088).
Find your application in the list of running applications and follow the application's link (http://localhost:8088/application_1396885203337_0003/).
Open the "stdout : Total file length is xxxx bytes" link to see the log file in the browser.
Hope this helps.
In local mode the application runs on a single machine, and you get to see all the prints in the code. When it runs on a cluster, everything is distributed and runs on different machines/cores, so you will not be able to see the prints in your console.
Try to get the logs generated by Spark using the command yarn logs -applicationId
It's possible that your <ZKhostname> is an alias that is not defined on the YARN nodes, or that it is not resolved on the YARN nodes for other reasons.
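One way to test the name-resolution theory is to run a short check on each YARN node. The sketch below uses only the Python standard library. The function name resolves is hypothetical, and the host you pass in would be whatever you supply as the ZooKeeper host on the spark-submit command line.

```python
import socket


def resolves(host):
    """Return True if this machine can resolve `host` to an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False
```

If resolves(...) returns True on the machine where local mode works but False on a YARN node, the executors probably cannot reach ZooKeeper under that name, and using a fully qualified name or an IP address would be worth trying.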

Graphs generated but shows waiting for samples in JMeter

Hi, I have the same question as "JMeter: jp@gc Graphs Generator: I got .png just with text 'Waiting for sample...'". The .jtl file was created without empty lines, and I have edited the user.properties file.
I followed the steps mentioned in this link for the graph generator.
sh jmeter -t /home/Annie/JMeter/grp.jmx -n -l /home/Annie/JMeter/g.jtl -JTEST_RESULTS_FILE=/home/Annie/JMeter/g.jtl
Creating summariser <summary>
Created the tree successfully using /home/Annie/JMeter/grp.jmx
Starting the test @ Mon Oct 16 11:27:30 IST 2017 (1508133450438)
Waiting for possible Shutdown/StopTestNow/Heapdump message on port 4445
summary + 1 in 00:00:03 = 0.3/s Avg: 3133 Min: 3133 Max: 3133 Err: 0 (0.00%) Active: 2 Started: 2 Finished: 0
summary + 14 in 00:00:14 = 1.0/s Avg: 2731 Min: 2098 Max: 4216 Err: 0 (0.00%) Active: 0 Started: 5 Finished: 5
summary = 15 in 00:00:18 = 0.9/s Avg: 2757 Min: 2098 Max: 4216 Err: 0 (0.00%)
Tidying up ... @ Mon Oct 16 11:27:48 IST 2017 (1508133468522)
... end of run
The log shows:
WARN o.a.j.v.ViewResultsFullVisualizer:Error loading result renderer: org.apache.jmeter.visualizers.RenderInBrowser
java.lang.NoClassDefFoundError: javafx/embed/swing/JFXPanel
Caused by: java.lang.ClassNotFoundException: javafx.embed.swing.JFXPanel
What should be done to get the graph?
My expectation is that you are using OpenJDK on Linux, which doesn't include JavaFX.
Use your Linux distribution's package manager to get Oracle Java 8, and make sure JMeter is configured to use Oracle Java instead of OpenJDK.
If you are trying to use the PerfMon Metrics Collector listener in GUI mode to test it, make sure a JMeter test is running at that time. It is first of all a listener, so it needs to process sample events in order to display anything. Even a Dummy Sampler firing every N seconds would do. See the "How to Monitor Your Server Health & Performance During a JMeter Load Test" guide for more details.

Resources