High memory consumption in typescript - ts-node

I run a very simple test program and it consumes 130 MB of memory.
What is causing this?
ts-node --files test.ts
// module: test.ts
setTimeout(() => {
  console.log('ok!')
}, 10000);

ts-node needs a lot of memory to transpile the code (TS -> JS), so its memory footprint is larger than that of a plain node execution environment.
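If the overhead matters, two common mitigations are skipping type-checking (ts-node's --transpile-only flag) or compiling ahead of time with tsc and running the output with plain node; a minimal sketch:
ts-node --transpile-only --files test.ts
tsc test.ts && node test.js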

Related

Sometimes when I run `npx hardhat compile` I get this error: FATAL ERROR: NewNativeModule Allocation failed - process out of memory

Sometimes when I run the command npx hardhat compile on my Windows CLI, I get the error below:
Compiling 72 files with 0.7.0
contracts/libraries/ERC20.sol: Warning: SPDX license identifier not provided in source file. Before publishing, consider adding a comment containing "SPDX-License-Identifier: <SPDX-License>" to each source file. Use "SPDX-License-Identifier: UNLICENSED" for non-open-source code.
Please see https://spdx.org for more information.
contracts/libraries/ERC1155/EnumerableSet.sol:158:5: Warning: Variable is shadowed in inline assembly by an instruction of the same name
function add(Bytes32Set storage set, bytes32 value) internal returns (bool) {
^ (Relevant source part starts here and spans across multiple lines).
contracts/libraries/ERC1155/EnumerableSet.sol:224:5: Warning: Variable is shadowed in inline assembly by an instruction of the same name
function add(AddressSet storage set, address value) internal returns (bool) {
^ (Relevant source part starts here and spans across multiple lines).
Compiling 1 file with 0.8.0
<--- Last few GCs --->
[8432:042B0460] 263058 ms: Mark-sweep (reduce) 349.8 (356.3) -> 248.2 (262.4) MB, 434.4 / 0.2 ms (+ 70.9 ms in 3 steps since start of marking, biggest step 69.6 ms, walltime since start of marking 800 ms) (average mu = 0.989, current mu = 0.990) memory[8432:042B0460] 263627
ms: Mark-sweep (reduce) 248.2 (259.4) -> 248.2 (252.1) MB, 555.5 / 0.0 ms (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 556 ms) (average mu = 0.969, current
mu = 0.023) memory p
<--- JS stacktrace --->
FATAL ERROR: NewNativeModule Allocation failed - process out of memory
After some time the error just goes away, usually after I've restarted my system or created a new Hardhat project and imported the code there.
But this is happening too often; what could be the cause?
I've done quite a bit of research, and some answers suggested it might be a problem with Node and the application's memory allocation, but I don't know how I would apply those solutions to a Hardhat project.
Here is a link to one possible solution: https://medium.com/#vuongtran/how-to-solve-process-out-of-memory-in-node-js-5f0de8f8464c
OS: WINDOWS 10
CLI: WINDOWS CMD
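For reference, the fix described in articles like the one linked above usually comes down to raising Node's heap limit before running the command; on Windows CMD that would look roughly like this (the 4096 MB value is only an illustration):
set NODE_OPTIONS=--max-old-space-size=4096
npx hardhat compile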

AWS Lambda Chalice Layers Segmentation Fault

I am deploying a Python 3.7 Lambda function via Chalice. Because the code together with its environment requirements is larger than the 50 MB limit, I am using the "automatic_layer" feature of Chalice to generate a layer with the requirements, which is awswrangler.
Because the generated layer is > 50 MB, I am uploading the generated managed-layer-...-python3.7.zip manually to S3 and creating a Lambda layer from it. Then I re-deploy with Chalice, removing the automatic_layer option and setting the layers to the ARN of the layer I created manually.
The function deployed this way worked fine a couple of times, then started failing occasionally with "Segmentation Fault". The error rate increased quickly and now it fails 100% of the time.
Traceback:
> OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
> START RequestId: 3b98bd4b-6cda-4d21-8090-1a49b17c06fc Version: $LATEST
> OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
> END RequestId: 3b98bd4b-6cda-4d21-8090-1a49b17c06fc
> REPORT RequestId: 3b98bd4b-6cda-4d21-8090-1a49b17c06fc Duration: 7165.04 ms Billed Duration: 7166 ms Memory Size: 128 MB Max Memory Used: 41 MB
> RequestId: 3b98bd4b-6cda-4d21-8090-1a49b17c06fc Error: Runtime exited with error: signal: segmentation fault (core dumped)
> Runtime.ExitError
As awswrangler itself requires boto3 & botocore, and they are already in the Lambda environment, I suspected that there might be a conflict between different versions of boto. I tried the same flow while explicitly including boto3 and botocore in the requirements, but I am still receiving the same segmentation fault error.
Any help is much appreciated.
You could use AWS X-Ray to get more information on the problem: https://docs.aws.amazon.com/lambda/latest/dg/python-tracing.html
Moreover, you might analyze the core dump generated when executing your Lambda's code in a bash shell:
ulimit -c unlimited
cd /tmp
execute your python ...
You should find a file named /tmp/core..... that you can analyze with gdb after downloading it. The command "man core" can help you.
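Once the core file has been downloaded, a minimal gdb session could look like the following (the interpreter path and core file name are illustrative; on the Python 3.7 Lambda runtime the interpreter typically lives under /var/lang/bin). The bt command prints the backtrace of the crashing thread.
gdb /var/lang/bin/python3.7 core.12345
(gdb) bt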

"go test -cpuprofile" does not generate a full trace

Issue
I have a go package, with a test suite.
When I run the test suite for this package, the total runtime is ~7 seconds:
$ go test ./mydbpackage/ -count 1
ok mymodule/mydbpackage 7.253s
However, when I add a -cpuprofile=cpu.out option, the sampling does not cover the whole run:
$ go test ./mydbpackage/ -count 1 -cpuprofile=cpu.out
ok mymodule/mydbpackage 7.029s
$ go tool pprof -text -cum cpu.out
File: mydbpackage.test
Type: cpu
Time: Aug 6, 2020 at 9:42am (CEST)
Duration: 5.22s, Total samples = 780ms (14.95%) # <--- depending on the runs, I get 400ms to 1s
Showing nodes accounting for 780ms, 100% of 780ms total
flat flat% sum% cum cum%
0 0% 0% 440ms 56.41% testing.tRunner
10ms 1.28% 1.28% 220ms 28.21% database/sql.withLock
10ms 1.28% 2.56% 180ms 23.08% runtime.findrunnable
0 0% 2.56% 180ms 23.08% runtime.mcall
...
Looking at the collected samples:
# sample from another run :
$ go tool pprof -traces cpu.out | grep "ms " # get the first line of each sample
10ms runtime.nanotime
10ms fmt.(*readRune).ReadRune
30ms syscall.Syscall
10ms runtime.scanobject
10ms runtime.gentraceback
...
# 98 samples collected, for a total sum of 1.12s
The issue I see is: for some reason, the sampling profiler stops collecting samples, or is blocked/slowed down at some point.
Context
go version is 1.14.6, platform is linux/amd64
$ go version
go version go1.14.6 linux/amd64
This package contains code that interacts with a database, and the tests are run against a live PostgreSQL server.
One thing I tried: t.Skip() internally calls runtime.Goexit(), so I replaced calls to t.Skip and its variants with a simple return; but it didn't change the outcome.
Question
Why aren't more samples collected? Is there some known pattern that blocks/slows down the sampler, or terminates the sampler earlier than it should?
@Volker guided me to the answer in his comments:
-cpuprofile creates a profile in which only goroutines actively using the CPU are sampled.
In my use case, my Go code spends a lot of time waiting for answers from the PostgreSQL server.
Generating a trace using go test -trace=trace.out, and then extracting a network blocking profile using go tool trace -pprof=net trace.out > network.out yielded much more relevant information.
For reference, on top of opening the complete trace using go tool trace trace.out, here are the values you can pass to -pprof= (from the go tool trace docs):
net: network blocking profile
sync: synchronization blocking profile
syscall: syscall blocking profile
sched: scheduler latency profile
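For reference, the full sequence used here was:
go test ./mydbpackage/ -count 1 -trace=trace.out
go tool trace -pprof=net trace.out > network.out
go tool pprof -text -cum network.out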

Spark - Container is running beyond physical memory limits

I have a cluster of two worker nodes.
Worker_Node_1 - 64GB RAM
Worker_Node_2 - 32GB RAM
Background Summary:
I am trying to execute spark-submit on yarn-cluster to run Pregel on a graph, to calculate the shortest path distances from one source vertex to all other vertices and print the values on the console.
Experiment:
For a small graph with 15 vertices, execution completes with application final status: SUCCEEDED.
My code works correctly and prints the shortest distances for a 241-vertex graph with a single source vertex, but there is a problem.
Problem:
When I dig into the log file, the task completes successfully in 4 minutes and 26 seconds, but the terminal keeps showing the application status as Running, and after approximately 12 more minutes the execution terminates saying:
Application application_1447669815913_0002 failed 2 times due to AM Container for appattempt_1447669815913_0002_000002 exited with exitCode: -104 For more detailed output, check application tracking page:http://myserver.com:8088/proxy/application_1447669815913_0002/
Then, click on links to logs of each attempt.
Diagnostics: Container [pid=47384,containerID=container_1447669815913_0002_02_000001] is running beyond physical memory limits. Current usage: 17.9 GB of 17.5 GB physical memory used; 18.7 GB of 36.8 GB virtual memory used. Killing container.
Dump of the process-tree for container_1447669815913_0002_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 47387 47384 47384 47384 (java) 100525 13746 20105633792 4682973 /usr/lib/jvm/java-7-oracle-cloudera/bin/java -server -Xmx16384m -Djava.io.tmpdir=/yarn/nm/usercache/cloudera/appcache/application_1447669815913_0002/container_1447669815913_0002_02_000001/tmp -Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://myserver.com:8020/user/spark/applicationHistory -Dspark.executor.memory=14g -Dspark.shuffle.service.enabled=false -Dspark.yarn.executor.memoryOverhead=2048 -Dspark.yarn.historyServer.address=http://myserver.com:18088 -Dspark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/lib/native -Dspark.shuffle.service.port=7337 -Dspark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/spark/lib/spark-assembly.jar -Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.authenticate=false -Dspark.app.name=com.path.PathFinder -Dspark.master=yarn-cluster -Dspark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/lib/native -Dspark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/lib/native -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1447669815913_0002/container_1447669815913_0002_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class com.path.PathFinder --jar file:/home/cloudera/Documents/Longest_Path_Data_1/Jars/ShortestPath_Loop-1.0.jar --arg /home/cloudera/workspace/Spark-Integration/LongestWorstPath/configFile --executor-memory 14336m --executor-cores 32 --num-executors 2
|- 47384 47382 47384 47384 (bash) 2 0 17379328 853 /bin/bash -c LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/lib/native::/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/lib/native /usr/lib/jvm/java-7-oracle-cloudera/bin/java -server -Xmx16384m -Djava.io.tmpdir=/yarn/nm/usercache/cloudera/appcache/application_1447669815913_0002/container_1447669815913_0002_02_000001/tmp '-Dspark.eventLog.enabled=true' '-Dspark.eventLog.dir=hdfs://myserver.com:8020/user/spark/applicationHistory' '-Dspark.executor.memory=14g' '-Dspark.shuffle.service.enabled=false' '-Dspark.yarn.executor.memoryOverhead=2048' '-Dspark.yarn.historyServer.address=http://myserver.com:18088' '-Dspark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/lib/native' '-Dspark.shuffle.service.port=7337' '-Dspark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/spark/lib/spark-assembly.jar' '-Dspark.serializer=org.apache.spark.serializer.KryoSerializer' '-Dspark.authenticate=false' '-Dspark.app.name=com.path.PathFinder' '-Dspark.master=yarn-cluster' '-Dspark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/lib/native' '-Dspark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/lib/native' -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1447669815913_0002/container_1447669815913_0002_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'com.path.PathFinder' --jar file:/home/cloudera/Documents/Longest_Path_Data_1/Jars/ShortestPath_Loop-1.0.jar --arg '/home/cloudera/workspace/Spark-Integration/LongestWorstPath/configFile' --executor-memory 14336m --executor-cores 32 --num-executors 2 1> /var/log/hadoop-yarn/container/application_1447669815913_0002/container_1447669815913_0002_02_000001/stdout 2> /var/log/hadoop-yarn/container/application_1447669815913_0002/container_1447669815913_0002_02_000001/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
Things I tried:
yarn.scheduler.maximum-allocation-mb – 32GB
mapreduce.map.memory.mb = 2048 (previously it was 1024)
Tried varying --driver-memory up to 24g
Could you please shed more light on how I can configure the Resource Manager so that large graphs (> 300K vertices) can also be processed? Thanks.
Just increasing the default spark.driver.memory from 512m to 2g solved this error in my case.
You can set the memory higher if it keeps hitting the same error. Then you can keep reducing it until it hits the same error again, so that you know the optimum driver memory to use for your job.
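For illustration, assuming the job is launched with spark-submit, the driver memory can be raised either with the dedicated flag or as a conf entry (2g is just the value that worked here):
spark-submit --driver-memory 2g ...
spark-submit --conf spark.driver.memory=2g ...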
The more data you are processing, the more memory is needed by each Spark task. And if your executor is running too many tasks then it can run out of memory. When I had problems processing large amounts of data, it usually was a result of not properly balancing the number of cores per executor. Try to either reduce the number of cores or increase the executor memory.
One easy way to tell that you are having memory issues is to check the Executor tab on the Spark UI. If you see a lot of red bars indicating high garbage collection time, you are probably running out of memory in your executors.
In my case I solved the error by increasing spark.yarn.executor.memoryOverhead, which stands for off-heap memory.
When you increase driver-memory and executor-memory, do not forget this config item.
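A sketch of setting all three together on spark-submit (the values are illustrative and should be tuned to the cluster's container limits):
spark-submit --driver-memory 4g --executor-memory 14g --conf spark.yarn.executor.memoryOverhead=2048 ...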
I had a similar problem:
Key error info:
exitCode: -104
'PHYSICAL' memory limit
Application application_1577148289818_10686 failed 2 times due to AM Container for appattempt_1577148289818_10686_000002 exited with **exitCode: -104**
Failing this attempt.Diagnostics: [2019-12-26 09:13:54.392]Container [pid=18968,containerID=container_e96_1577148289818_10686_02_000001] is running 132722688B beyond the **'PHYSICAL' memory limit**. Current usage: 1.6 GB of 1.5 GB physical memory used; 4.6 GB of 3.1 GB virtual memory used. Killing container.
Increasing both spark.executor.memory and spark.executor.memoryOverhead didn't take effect.
Then increasing spark.driver.memory solved it.
Spark jobs request resources from the resource manager in a different way than MapReduce jobs do. Try to tune the number of executors and the memory/vcores allocated to each executor. Follow http://spark.apache.org/docs/latest/submitting-applications.html

Container is running beyond physical memory. Hadoop Streaming python MR

I am running a Python script which needs a file (genome.fa) as a dependency (reference) to execute. When I run this command:
hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.5.1.jar -file ./methratio.py -file '../Test_BSMAP/genome.fa' -mapper './methratio.py -r -g ' -input /TextLab/sravisha_test/SamFiles/test_sam -output ./outfile
I am getting this Error:
15/01/30 10:48:38 INFO mapreduce.Job: map 0% reduce 0%
15/01/30 10:52:01 INFO mapreduce.Job: Task Id : attempt_1422600586708_0001_m_000009_0, Status : FAILED
Container [pid=22533,containerID=container_1422600586708_0001_01_000017] is running beyond physical memory limits. Current usage: 1.1 GB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container.
I am using Cloudera Manager (Free Edition). These are my configs:
yarn.app.mapreduce.am.resource.cpu-vcores = 1
ApplicationMaster Java Maximum Heap Size = 825955249 B
mapreduce.map.memory.mb = 1GB
mapreduce.reduce.memory.mb = 1 GB
mapreduce.map.java.opts = -Djava.net.preferIPv4Stack=true
mapreduce.map.java.opts.max.heap = 825955249 B
yarn.app.mapreduce.am.resource.mb = 1GB
Java Heap Size of JobHistory Server in Bytes = 397 MB
Can someone tell me why I am getting this error?
I think your Python script is consuming a lot of memory while reading your large input file (clue: genome.fa).
Here is my reasoning (references: http://courses.coreservlets.com/Course-Materials/pdf/hadoop/04-MapRed-6-JobExecutionOnYarn.pdf, "Container is running beyond memory limits", http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/):
Container’s Memory Usage = JVM Heap Size + JVM Perm Gen + Native Libraries + Memory used by spawned processes
The last variable 'Memory used by spawned processes' (the Python code) might be the culprit.
Try increasing the memory size of these two parameters: mapreduce.map.java.opts and mapreduce.reduce.java.opts.
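As a sketch, assuming the same streaming job as above, those properties can be passed as generic -D options right after the jar (the sizes are illustrative; keep -Xmx below the matching *.memory.mb container size):
hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.5.1.jar -D mapreduce.map.memory.mb=2048 -D mapreduce.map.java.opts=-Xmx1638m -D mapreduce.reduce.memory.mb=2048 -D mapreduce.reduce.java.opts=-Xmx1638m -file ./methratio.py -file '../Test_BSMAP/genome.fa' -mapper './methratio.py -r -g ' -input /TextLab/sravisha_test/SamFiles/test_sam -output ./outfile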
Try increasing the number of maps spawned at execution time: you can increase the number of mappers by decreasing the split size (mapred.max.split.size).
It will add some overhead, but it will mitigate the problem.
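For example, a smaller split size can be passed on the same command as another generic option (32 MB here is only an illustration):
-D mapred.max.split.size=33554432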

Resources