Recently I created my first serverless service. Nothing fancy, just a script that uses FFmpeg to encode some CCTV footage and reduce its quality.
After a couple of days, I realized that much of my footage was not there. After staring for some time at the AWS Lambda metrics panel, I assumed that the cause of the problem was too little time and too little memory, so I cranked up the timeout (8 minutes) and the max memory (380 MB). Then I left it working for a couple of days to see if it would get better.
Fast forward to today: I logged in to AWS and noticed that I was never using more than 95 MB of memory (as seen in the image).
[Image: last logs]
Is this right?
Also, looking at the graph, I noticed that I still get some errors. Is increasing the timeout the solution?
[Image: AWS graphs]
Sorry for the poor quality of the question; I really tried to look this up.
Available RAM is not the only thing that's influenced by the Memory setting, which is a bit counterintuitive at first. The memory setting in a Lambda function influences:
The amount of compute (vCPUs) that's available to the function
The amount of network throughput that's available
The amount of system memory / RAM that's available
All of these scale based on the memory. You can think of it like this: if you provision 128 MB of RAM, you also get about 1/8 of a vCPU; if you go to 256 MB, you get a quarter of a vCPU. These are not exact numbers, but it's a useful mental model. Somewhere between 1024 and 1280 MB you get a full vCPU, and after that a second vCPU is added.
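To make that mental model concrete, here is a small sketch (the 1024 MB per vCPU ratio is just the rough approximation from above, not an official AWS figure):

```go
package main

import "fmt"

// approxVCPUs estimates the compute share a Lambda function gets from its
// memory setting, using the rough "one full vCPU at about 1024 MB" mental
// model described above. This is an approximation, not an AWS formula.
func approxVCPUs(memoryMB int) float64 {
	return float64(memoryMB) / 1024.0
}

func main() {
	for _, mb := range []int{128, 256, 380, 1024, 2048} {
		fmt.Printf("%4d MB -> ~%.2f vCPU\n", mb, approxVCPUs(mb))
	}
}
```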
Your workload seems CPU intensive (since there's no GPU to offload it to), so I'd try increasing the memory to give Lambda more compute power and see how it behaves.
Related
We have a Spring Boot application that runs on 20 servers, and we have a load balancer that distributes the requests to our servers.
Since last week we have been having huge problems with CPU usage (100% on all VMs), almost simultaneously, without any noticeable increase in incoming requests.
Before that, we had 8 VMs without any issue.
In peak hours we have 3-4 thousand users with 15-20k requests per 15 minutes.
I am sure it has something to do with the heap usage, since all the CPU usage comes from the GC trying to free up some memory.
At the moment, we have isolated some requests that we thought might be causing the problem onto specific VMs and confirmed that those VMs are stable even though there is traffic. (In the picture below you can see that the OLD gen memory is stable on those VMs.)
The heap memory looked something like this:
The memory keeps increasing, and there are two scenarios: either it reaches a point and, after 10 minutes, drops on its own to 500 MB, or it stays there, causes 100% CPU usage, and remains there forever.
From the heap dumps that we have taken, it seems that most of the memory has been allocated in char[] instances.
We have used a variety of tools (VisualVM, JProfiler, etc.) to try to debug the issue, without any luck.
I don't know if I am missing something obvious or something else.
I also tried changing the GC algorithm from the default to G1, and disabling the Hibernate query plan cache, since a lot of our queries use the IN parameter for filtering.
UPDATE
We managed to reduce the number of requests to our most heavily used API call, and the OLD gen now looks like this. Is that normal?
Java cannot use terabytes of RAM because the GC pause is way too long (minutes). With the recent update to the Go GC, I'm wondering if its GC pauses are short enough for use with huge amounts of RAM, such as a couple of terabytes.
Are there any benchmarks of this yet? Can we use a garbage-collected language with this much RAM now?
tl;dr:
You can't use TBs of RAM with a single Go process right now. The max is 512 GB on Linux, and the most that I've seen tested is 240 GB.
With the current background GC, GC workload tends to be more important than GC pauses.
You can understand GC workload as pointers * allocation rate / spare RAM. Of apps using tons of RAM, only those with few pointers or little allocation will have a low GC workload.
I agree with inf's comment that huge heaps are worth asking other folks about (or testing). JimB notes that Go heaps have a hard limit of 512 GB right now, and 240 GB is the most I've seen tested.
Some things we know about huge heaps, from the design document and the GopherCon 2015 slides:
The 1.5 collector doesn't aim to cut GC work, just cut pauses by working in the background.
Your code is paused while the GC scans pointers on the stack and in globals.
The 1.5 GC has a short pause on a GC benchmark with a roughly 18GB heap, as shown by the rightmost yellow dot along the bottom of this graph from the GopherCon talk:
Folks running a couple production apps that initially had about 300ms pauses reported drops to ~4ms and ~20ms. Another app reported their 95th percentile GC time went from 279ms to ~10ms.
Go 1.6 added polish and pushed some of the remaining work to the background. As a result, tests with heaps up to a bit over 200GB still saw a max pause time of 20ms, as shown in a slide in an early 2016 State of Go talk:
The same application that had 20ms pause times under 1.5 had 3-4ms pauses under 1.6, with about an 8GB heap and 150M allocations/minute.
Twitch, who use Go for their chat service, reported that by Go 1.7 pause times had been reduced to 1ms with lots of running goroutines.
1.8 took stack scanning out of the stop-the-world phase, bringing most pauses well under 1ms, even on large heaps. Early numbers look good. Occasionally applications still have code patterns that make a goroutine hard to pause, effectively lengthening the pause for all other threads, but generally it's fair to say the GC's background work is now usually much more important than GC pauses.
Some general observations on garbage collection, not specific to Go:
The frequency of collections depends on how quickly you use up the RAM you're willing to give to the process.
The amount of work each collection does depends in part on how many pointers are in use.
(That includes the pointers within slices, interface values, strings, etc.)
Rephrased, an application accessing lots of memory might still not have a GC problem if it only has a few pointers (e.g., it handles relatively few large []byte buffers), and collections happen less often if the allocation rate is low (e.g., because you applied sync.Pool to reuse memory wherever you were chewing through RAM most quickly).
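As a minimal illustration of the sync.Pool idea mentioned above (the buffer type and handler are made up for the example), reusing a scratch buffer keeps the allocation rate, and therefore the collection frequency, down:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable scratch buffers instead of allocating a fresh
// one per request, which lowers the allocation rate the GC has to keep up with.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// handle is a hypothetical request handler that needs temporary scratch space.
func handle(payload []byte) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	buf.Write(payload) // ... do the real work with buf ...
	return buf.String()
}

func main() {
	fmt.Println(handle([]byte("example payload")))
}
```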
So if you're looking at something involving heaps of hundreds of GB that's not naturally GC-friendly, I'd suggest you consider any of
writing in C or such
moving the bulky data out of the object graph (see the sketch after this list). For example, you could manage data in an embedded DB like bolt, put it in an outside DB service, or use something like groupcache or memcache if you want more of a cache than a DB
running a set of smaller-heap'd processes instead of one big one
just carefully prototyping, testing, and optimizing to avoid memory issues.
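Here is a minimal sketch of that second option, keeping the bulk of the data outside the pointer-bearing object graph. The layout and names are my own illustration of a common trick in pointer-free Go caches, not code from any of the projects above: values live in one big []byte, and the index map holds no pointers, so the GC has almost nothing to scan no matter how many entries there are.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// blobStore appends every value into one large []byte and indexes it with a
// map whose keys and values contain no pointers. The GC therefore has almost
// nothing to trace, however many entries are stored. Simplified illustration:
// no deletes, no hash-collision handling, no compaction.
type blobStore struct {
	data  []byte
	index map[uint64][2]int // key hash -> {offset, length}
}

func newBlobStore() *blobStore {
	return &blobStore{index: make(map[uint64][2]int)}
}

func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func (s *blobStore) Put(key string, value []byte) {
	off := len(s.data)
	s.data = append(s.data, value...)
	s.index[hashKey(key)] = [2]int{off, len(value)}
}

func (s *blobStore) Get(key string) ([]byte, bool) {
	loc, ok := s.index[hashKey(key)]
	if !ok {
		return nil, false
	}
	return s.data[loc[0] : loc[0]+loc[1]], true
}

func main() {
	s := newBlobStore()
	s.Put("user:42", []byte("a large serialized record"))
	if v, ok := s.Get("user:42"); ok {
		fmt.Printf("%s\n", v)
	}
}
```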
The new Java ZGC garbage collector can now use 16 terabytes of memory and garbage collect in under 10 ms.
When using the desktop PCs in my university (which have 4 GB of RAM), calculations in MATLAB are fairly speedy, but on my laptop (which also has 4 GB of RAM), the exact same calculations take ages. My laptop is much more modern, so I assume it has a similar clock speed to the desktops.
For example, I have written a program that calculates the solid angle subtended by 50 disks at 500 points. On the desktop PCs this calculation takes about 15 seconds; on my laptop it takes about 5 minutes.
Is there a way to reduce the time taken to perform these calculations? E.g., can I allocate more RAM to MATLAB, or can I boot up my PC in a way that optimises it for using MATLAB? I'm thinking that if the processor on my laptop is also doing calculations to run other programs, this will slow down the MATLAB calculations. I've closed all other applications, but I know there's probably a lot of stuff going on that I can't see. Can I boot my laptop up in a way that has fewer of these things going on in the background?
I can't modify the code to make it more efficient.
Thanks!
You might run some of my benchmarks which, along with example results, can be found via:
http://www.roylongbottom.org.uk/
The CPU core used at a particular point in time is essentially the same on Pentiums, Celerons, Core 2s, Xeons and others of the same generation; the only differences are L2/L3 cache sizes and external memory bus speeds. So you can compare most results with similar-vintage 2 GHz CPUs. Things to try, besides simple number-crunching tests:
1 - Try a memory test, such as my BusSpeed, to show that the caches are being used and that RAM is not dead slow.
2 - Assuming Windows, check that the offending program is the one using the most CPU time in Task Manager, and also that, with the program not running, CPU utilisation is around zero.
3 - Check that the CPU temperature is not too high, for example with SpeedFan (free download).
4 - If the disk light is flashing, too much RAM might be being used, with some being swapped in and out. Task Manager's Performance tab would show this. Increasing RAM demands can be checked with some of my reliability tests.
There are many things that go into computing power besides RAM. You mention processor speed, but there are also the number of cores, GPU capability, and more. Programs like MATLAB are designed to take advantage of features like parallelism.
Summary: You can't compare only RAM between two machines and expect to know how they will perform with respect to one another.
Side note: 4 GB is not very much RAM for a modern laptop.
First, you should run a CPU performance benchmark on both computers.
Modern operating systems usually apply the most aggressive power management schemes when running on a laptop. This usually means turning off one or more cores, or setting them to a very low frequency. For example, a quad-core CPU that normally runs at 2.0 GHz could be throttled down to 700 MHz on one core while the other three are basically put to sleep, while it is on battery. (Note: these numbers are not taken from a real example.)
The OS manages the CPU frequency dynamically, adjusting it on the order of seconds. You will need a monitoring tool that actually polls the CPU frequency every second (without doing busy work itself) in order to know whether this is the case.
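For instance, on Linux you could poll the frequency from sysfs with a trivial watcher like the sketch below (the path is the standard cpufreq node; on Windows a tool such as CPU-Z serves the same purpose):

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// Polls the current frequency of CPU 0 once a second via sysfs (Linux only).
// A sudden drop while your computation runs would point to power management
// or thermal throttling rather than the program itself.
func main() {
	const path = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq"
	for {
		raw, err := os.ReadFile(path)
		if err != nil {
			fmt.Println("cpufreq not available:", err)
			return
		}
		khz, _ := strconv.Atoi(strings.TrimSpace(string(raw)))
		fmt.Printf("cpu0: %.0f MHz\n", float64(khz)/1000.0)
		time.Sleep(time.Second)
	}
}
```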
Plugging in the laptop will make the OS use a less aggressive power management scheme.
(If this is found to be unrelated to MATLAB, please "flag" this post and ask moderator to move this question to the SuperUser site.)
I did a proof of concept for a complex transformation in SSIS. I now have performance metrics for this POC, which I created in a virtual machine with 1 GB of memory and 1 core assigned. The SSIS transformations are all file-based (source and target).
Now I want to use these metrics to choose the right number of cores and amount of memory for the production environment.
What would be the right strategy to determine the right number of cores and amount of memory for production if I know the number of files per day and the total file size per day to be transformed?
(edit) Think of total transfer sizes of 100 gigabytes and 5000 files per day!
You'd want to do two other benchmarks: 2 GB memory with 1 core, and 1 GB memory with a dual core. A snapshot of a fairly tiny environment is difficult to extrapolate from without a couple more data points.
Also, with only 1 GB of RAM you'll want to make sure the server isn't running out of memory and paging to disk (which will skew your figures somewhat, as everything becomes reliant on disk access - and you're already reading from disk anyway...). So make sure you know what's happening there as well.
SSIS tries to buffer as much as it can in memory for speed, so more memory is always good :-) The bigger question is what benefit extra cores will give you.
There are a number of areas that affect performance. One is the number of cores: the more cores you have, the more parallel work can be done. This, of course, also depends on how you build your package; certain objects are synchronous, others are asynchronous. Memory is also a factor, but it is limited to 100 MB per data flow component.
Applications like Microsoft Outlook and the Eclipse IDE consume a lot of RAM, as much as 200 MB. Is it OK for a modern application to consume that much memory, given that a few years back we had only 256 MB of RAM? Also, why is this happening? Are we taking resources for granted?
Is it acceptable when most people have 1 or 2 gigabytes of RAM on their PCs?
Think of this: although your 200 MB is small and nothing to worry about given a 2 GB limit, everyone else also has apps that take masses of RAM. Add them together and you find that the 2 GB I have very quickly gets used up. End result: your app appears slow, resource-hungry, and takes a long time to start up.
I think people will start to rebel against resource-hungry applications unless they get 'value for RAM'. You can see this starting to happen on servers as virtualised systems gain popularity: people are complaining about resource requirements and the corresponding server costs.
As a real-world example, I used to code with VC6 on my old 512 MB, 1.7 GHz machine, and things were fine - I could open 4 or 5 copies along with Outlook, Word and a web browser, and my machine was responsive.
Today I have a dual-processor 2.8 GHz server box with 3 GB of RAM, but I cannot realistically run more than 2 copies of Visual Studio 2008; they both take ages to start up (as all that RAM still has to be copied in and set up, along with all the other startup costs we now have), and even Word takes ages to load a document.
So if you can reduce memory usage you should. Don't think that you can just use whatever bloated framework/library/practice you want with impunity.
http://en.wikipedia.org/wiki/Moore%27s_law
also:
http://en.wikipedia.org/wiki/Wirth%27s_law
There are a couple of things you need to think about.
1/ Do you have 256 MB now? I wouldn't think so - my smallest-memory machine has 2 GB, so a 200 MB application is not much of a problem.
2a/ The 200 MB you talk about might not be "real" memory. It may just be address space, in which case it might not all be in physical memory at once. Some bits may only be pulled into physical memory when you choose to do esoteric things.
2b/ It may also be shared with other processes (such as a DLL). This means it could be held in physical memory as only one copy but be present in the address space of many processes. That way, the usage is amortized over those many processes. Both 2a and 2b depend on where your figure of 200 MB actually came from (which I don't know and, running Linux, I'm unlikely to find out without you telling me :-).
3/ Even if it is physical memory, modern operating systems aren't like the old DOS or Windows 3.1 - they have virtual memory where bits of applications can be paged out (data) or thrown away completely (code, since it can always reload from the executable). Virtual memory gives you the ability to use far more memory than your actual physical memory.
Many modern apps will take advantage of the existence of more memory to cache more. Some, like Firefox and SQL Server, have explicit settings for how much memory they will use. In my opinion, it's foolish not to use available memory - what's the point of having 2 GB of RAM if your apps all sit around at 10 MB, leaving 90% of your physical memory unused? Of course, if your app does use caching like this, it had better be good at releasing that memory if page file thrashing starts, or allow the user to limit the cache size manually.
You can see the advantage of this by running a decent-sized query against SQL server. The first time you run the query, it may take 10 seconds. But when you run that exact query again, it takes less than a second - why? The query plan was only compiled the first time and cached for use later. The database pages that needed to be read were only loaded from disk the first time - the second time, they were still cached in RAM. If done right, the more memory you use for caching (until you run into paging) the faster you can re-access data. You'll see the same thing in large documents (e.g. in Word and Acrobat) - when you scroll to new areas of a document, things are slow, but once it's been rendered and cached, things speed up. If you don't have enough memory, that cache starts to get overwritten and going to the old parts of the document gets slow again.
If you can make good use of the RAM, it is your responsibility to use it.
Yes, it is perfectly normal. Also, something big has changed since 256 MB was normal... and do not forget that before that, 640 KB was supposed to be enough for everybody!
Now most software solutions are built with a garbage collector: C#, Java, Ruby, Python... everybody loves them because development can certainly be faster; however, there is one catch.
The same program can be free of memory leaks with either manual or automatic memory deallocation. However, in the second case memory consumption is likely to grow. Why? In the first case, memory is deallocated and kept clean immediately after something becomes useless (garbage). It takes time and computing power to detect that automatically, so most collectors (except for reference counting) wait for garbage to accumulate in order to make the cost of the exploration worthwhile. The longer you wait, the more garbage you can sweep in one blow, but more memory is needed to hold that accumulating garbage. If you force the collector to run constantly, your program will spend more time exploring memory than working on your problem.
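To make that trade-off concrete in one garbage-collected language (Go here, purely as an illustration; the GOGC knob is Go-specific, but other runtimes expose equivalent tuning):

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

var sink []byte // global so the allocations below cannot be optimized away

// churn produces roughly 1 GB of short-lived garbage.
func churn() {
	for i := 0; i < 1_000_000; i++ {
		sink = make([]byte, 1024) // the previous slice becomes garbage
	}
}

// run measures how many collections a churn costs under a given GOGC setting.
func run(label string, gcPercent int) {
	debug.SetGCPercent(gcPercent)
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	churn()
	runtime.ReadMemStats(&after)
	fmt.Printf("%-9s %3d collections, heap in use %d KB\n",
		label, after.NumGC-before.NumGC, after.HeapInuse/1024)
}

func main() {
	run("GOGC=10", 10)   // collect eagerly: more CPU spent on GC, less memory held
	run("GOGC=400", 400) // let garbage accumulate: fewer collections, more memory held
}
```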
You can be completely sure that as long as programmers get more resources, they will spend them on heavier tools in exchange for more freedom, abstraction, and faster development.
A few years ago, 256 MB was the norm for a PC, and Outlook consumed about 30-35 MB or so of memory; that's around 10% of the available memory. Now PCs have 2 GB or more as the norm, and Outlook consumes 200 MB of memory; that's about 10% as well.
The 1st conclusion: as more memory is available, applications use more of it.
The 2nd conclusion: no matter what time frame you pick, there are applications that are true memory hogs (like Outlook) and applications that are very efficient memory-wise.
The 3rd conclusion: the memory consumption of an app doesn't go down with time; otherwise 640 KB would have been enough even today.
It completely depends on the application.