How to see hadoop's heap use? - hadoop

I am doing a school assignment that analyzes heap usage in Hadoop. It involves running two versions of a MapReduce program that calculates the median length of forum comments: the first one is 'memory-unconscious' and its reducer keeps an in-memory list with the length of every comment; the second one is 'memory-conscious' and its reducer uses a very memory-efficient data structure to handle the data.
The purpose is to use both programs to process inputs of different sizes and watch how memory usage grows faster in the first one (until it eventually runs out of memory).
My question is: how can I obtain the heap usage of Hadoop or of the reduce tasks?
I thought the counter "Total committed heap usage (bytes)" would contain this data, but I have found that both versions of the program return almost the same values.
Regarding the correctness of the programs, the 'memory-unconscious' one runs out of memory with a large input and fails, while the other one does not and is able to finish.
Thanks in advance

I don't know which memory-conscious data structure you are using (naming it might help). Many in-memory data structures fall back on virtual memory: once the structure grows past some policy-defined size, the extra elements are dumped to virtual memory, which would explain why you do not get an out-of-memory error. The memory-unconscious version does not do that. In both cases the in-memory size of the data structure stays about the same, which is why you are seeing the same values. To get the real memory usage of the reducer you can use the following:
The Instrumentation interface, added in Java 1.5, lets you measure an object's memory usage (getObjectSize). Nice article about it: LINK
/* Returns the amount of free memory in the Java Virtual Machine. Calling the gc method may result in increasing the value returned by freeMemory. */
long freeMemory = Runtime.getRuntime().freeMemory();
/* Returns the maximum amount of memory that the Java virtual machine will attempt to use. If there is no inherent limit then the value Long.MAX_VALUE will be returned. */
long maximumMemory = Runtime.getRuntime().maxMemory();
/* Returns the total amount of memory in the Java virtual machine. The value returned by this method may vary over time, depending on the host environment.
Note that the amount of memory required to hold an object of any given type may be implementation-dependent. */
long totalMemory = Runtime.getRuntime().totalMemory();
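The Runtime calls above can be combined into a "used heap" figure, which is probably closer to what the question is after than the committed heap. As far as I know, the "Total committed heap usage (bytes)" counter reflects what totalMemory() returns, so it would look similar for both programs. The following is only a minimal sketch: the class name HeapSnapshot and its two helper methods are made up for illustration and are not part of Hadoop.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not a Hadoop class): reports used and peak heap of the current JVM.
public class HeapSnapshot {

    /** Heap currently occupied by objects: committed heap minus the free part of it. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Sum of the peak usage of every heap memory pool since JVM start.
     * The pools may have peaked at different moments, so treat this as an
     * upper bound on the true simultaneous peak.
     */
    public static long peakHeapBytes() {
        long peak = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                MemoryUsage usage = pool.getPeakUsage();
                if (usage != null) {
                    peak += usage.getUsed();
                }
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        // Stand-in for the reducer's in-memory list of comment lengths.
        long[] lengths = new long[5_000_000];
        lengths[0] = 1; // touch the array so it is clearly in use
        long after = usedHeapBytes();
        System.out.println("used before: " + before + " bytes");
        System.out.println("used after:  " + after + " bytes");
        System.out.println("peak (sum of heap pools): " + peakHeapBytes() + " bytes");
    }
}
```

In a real job, one option would be to call these helpers from the reducer's cleanup() method and publish the values through a custom counter, so the two program versions can be compared per task. Since a GC can run at any moment, the peak figure is usually the more stable of the two.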

Related

Do job objects work with memory mapped files?

I am using memory mapped files for a set of very large datasets (each ~150GB). The scenario here is that the total amount of memory consumed while processing the data is around 50TB, because there are many algorithm stages and each algorithm processes another 150GB dataset.
Memory mapped files seem to balance quite well if there is little memory load and the memory manager has enough time to balance the memory.
High Memory Load
The problem, though, is that when there is a high memory load and a lot of data is written into the memory mapped files, the working set of the current process grows to the limit of the RAM. What you can see then is that pages are written to the hard disk, but once I reach the maximum RAM, Windows stalls completely and does not respond again until everything is written.
Another problem is that the memory manager writes the pages only when it reaches the maximum RAM limit. Before that, nothing is written to disk.
I can accept that my process that is writing data and the memory page writer are two independent processes that don't know about each other, so while I'm producing data, the page writer tries its best to write the pages to disk but cannot keep up, since writing to disk takes more time than allocating memory.
Obviously it is never a problem if the memory load of Windows is below the maximum RAM. Once I get near the maximum RAM, things fall apart.
First Attempt: SetProcessWorkingSetSize
My first attempt was to use SetProcessWorkingSetSize, setting the maximum allowed working set size to a strict limit (e.g. 5GB). What happens now is that the memory manager starts writing pages to disk when the process reaches the 5GB working set size, and the process never exceeds that 5GB. My process still writes data, though, and Task Manager shows that the total amount of memory in use keeps growing while the process itself stays at 5GB. So the same situation as before arises: RAM usage grows until the maximum is reached and Windows stalls again.
I was hoping that Windows throttles itself when I'm allocating memory, i.e. that there is a mechanism in Windows that defers memory allocation so that my application automatically slows down, but that doesn't seem to be the case.
Second Attempt: Job Objects
So I thought Job Objects can be a solution as it seemed to me that using SetProcessWorkingSetSize does not work as expected and is only a suggestion to the OS. So I used JOBOBJECT_EXTENDED_LIMIT_INFORMATION
typedef struct _JOBOBJECT_EXTENDED_LIMIT_INFORMATION {
    JOBOBJECT_BASIC_LIMIT_INFORMATION BasicLimitInformation;
    IO_COUNTERS                       IoInfo;
    SIZE_T                            ProcessMemoryLimit;
    SIZE_T                            JobMemoryLimit;
    SIZE_T                            PeakProcessMemoryUsed;
    SIZE_T                            PeakJobMemoryUsed;
} JOBOBJECT_EXTENDED_LIMIT_INFORMATION, *PJOBOBJECT_EXTENDED_LIMIT_INFORMATION;
with SetInformationJobObject, but it doesn't seem to do anything to limit the process's working set, whereas SetProcessWorkingSetSize does limit it.
So my questions are:
Should Job Objects limit the working set, and should they work like SetProcessWorkingSetSize? (I understand that Job Objects are for more than this, but I'm just asking about this special condition.)
Should Windows throttle memory allocation itself, or do I have to do it myself?
If Windows does not throttle, what would be the best approach to throttle my application (possibly also using Job Object notifications)?

relationship between container_memory_working_set_bytes and process_resident_memory_bytes and total_rss

I'm looking to understand the relationship between
container_memory_working_set_bytes vs process_resident_memory_bytes vs total_rss (container_memory_rss) + file_mapped, so as to be better equipped to alert on a possible OOM.
What I see goes against my understanding (which is puzzling me right now), given that the container/pod is running a single process executing a compiled program written in Go.
Why is the difference between container_memory_working_set_bytes and process_resident_memory_bytes so big (the former is nearly 10 times larger)?
Also, the relationship between container_memory_working_set_bytes and container_memory_rss + file_mapped is weird here, something I did not expect after reading the following:
The total amount of anonymous and swap cache memory (it includes transparent hugepages), and it equals to the value of total_rss from memory.stat file. This should not be confused with the true resident set size or the amount of physical memory used by the cgroup. rss + file_mapped will give you the resident set size of cgroup. It does not include memory that is swapped out. It does include memory from shared libraries as long as the pages from those libraries are actually in memory. It does include all stack and heap memory.
So the cgroup's total resident set size is rss + file_mapped. How can this value be less than container_memory_working_set_bytes for a container that is running in the given cgroup?
This makes me feel that I'm misreading something in these stats.
Following are the PROMQL used to build the above graph
process_resident_memory_bytes{container="sftp-downloader"}
container_memory_working_set_bytes{container="sftp-downloader"}
go_memstats_heap_alloc_bytes{container="sftp-downloader"}
container_memory_mapped_file{container="sftp-downloader"} + container_memory_rss{container="sftp-downloader"}
So the relationship seems to be like this:
container_memory_working_set_bytes = container_memory_usage_bytes - total_inactive_file
container_memory_usage_bytes, as its name implies, is the total memory used by the container. Since it also includes file cache (i.e. inactive_file, which the OS can release under memory pressure), subtracting inactive_file gives container_memory_working_set_bytes.
The relationship between container_memory_rss and container_memory_working_set_bytes can be summed up using the following expression:
container_memory_usage_bytes = container_memory_cache + container_memory_rss
cache reflects data stored on disk that is currently cached in memory. It contains active + inactive file (mentioned above).
This explains why container_memory_working_set_bytes was higher.
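To make the two expressions above concrete, here is a small worked example. All byte values are invented for illustration (they are not taken from the graph in the question), and the class and method names are hypothetical. It assumes, as stated above, that cache is the sum of active and inactive file pages.

```java
// Hypothetical worked example of the two relationships above; the numbers are made up.
public class WorkingSetExample {

    /** usage = cache + rss, as stated in the answer. */
    public static long usageBytes(long rss, long cache) {
        return rss + cache;
    }

    /** working set = usage - inactive_file, floored at zero. */
    public static long workingSetBytes(long usage, long inactiveFile) {
        return Math.max(0, usage - inactiveFile);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;

        long rss = 20 * mib;           // anonymous memory (heap, stacks)
        long mappedFile = 5 * mib;     // file pages that are mmap'ed
        long activeFile = 150 * mib;   // recently used page cache
        long inactiveFile = 30 * mib;  // page cache that is cheap to reclaim

        long cache = activeFile + inactiveFile;                   // 180 MiB
        long usage = usageBytes(rss, cache);                      // 200 MiB
        long workingSet = workingSetBytes(usage, inactiveFile);   // 170 MiB

        System.out.println("rss + mapped_file = " + (rss + mappedFile) / mib + " MiB");
        System.out.println("working set       = " + workingSet / mib + " MiB");
    }
}
```

With these numbers the working set (170 MiB) is far larger than rss + mapped_file (25 MiB), simply because active page cache counts toward the working set but not toward the resident set size. A process that reads or writes many files, which an SFTP downloader plausibly does, could show this pattern.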
Ref #1
Ref #2
Not really an answer, but still two assorted points.
Does this help to make sense of the chart?
Here at my $dayjob, we have faced various issues with how different tools external to the Go runtime count and display the memory usage of a process executing a program written in Go.
On top of that, Go's GC on Linux does not actually release freed memory pages to the kernel but merely marks them with madvise(2) as MADV_FREE, so a GC cycle that has freed quite a hefty amount of memory does not result in any noticeable change in the "process RSS" readings taken by the external tooling (usually cgroup stats).
Hence we're exporting our own metrics, obtained by periodically calling runtime.ReadMemStats (and runtime/debug.ReadGCStats), in every major service written in Go, with the help of a simple package written specifically for that. These readings reflect the Go runtime's own view of the memory under its control.
By the way, the NextGC field of the memory stats is super useful to watch if you have memory limits set for your containers because once that reading reaches or surpasses your memory limit, the process in the container is surely doomed to be eventually shot down by the oom_killer.

Ignite uses more memory than expected

I am using Ignite to build a framework for data calculation. One big problem is that memory usage is higher than expected: data that takes 1G of memory outside Ignite takes more than 1.5G in an Ignite cache.
I have already turned off backups and copyOnRead. I don't use the query feature, so there is no extra index space. I have also counted the extra space used for each cache and cache entry. The total memory usage still doesn't add up.
The value of each cache entry is a big map containing lists of primitive arrays. Each entry is about 120MB.
What can be the problem? The data structure or the configuration?
Ignite does introduce some overhead to your data, and half a GB doesn't sound too bad to me. I would recommend you refer to this guide for more details: https://apacheignite.readme.io/docs/capacity-planning
The difference between expected and real memory usage arises from 2 main points:
Each entry carries a constant overhead, consisting of the objects that support processing entries in a distributed computing environment.
E.g. you can declare a local integer variable, which takes 4 bytes on the stack, but it's hard to make that variable long-lived and accessible from other places in the program. So you have to create a new Integer object, which consumes at least 16 bytes (300% overhead, isn't it?). Going further, if you want to make this object mutable and safely accessible by multiple threads, you have to create a new AtomicReference and store your object inside. Total memory consumption will be at least 32 bytes... and so on. Every time we extend an object's functionality we get additional overhead; there is no other way.
Each entry is stored inside a cache in a special serialized format, so the actual memory footprint of an entry depends on the format that is used. By default Ignite uses BinaryMarshaller to convert an object to a byte array, and this array is stored inside a BinaryObject.
The reason is simple: distributed computing systems continuously exchange entries between nodes, and every entry in a cache should be ready to be transferred as a byte array.
Please read the article; it was recently updated. You could estimate entry overhead for small entries by hand, but for big entries you should inspect the actual entry stored in the cache as a byte array. Look at the withKeepBinary method.

How to measure the performance of the Erlang Garbage Collector?

I have started programming in Erlang recently and there are a few things I want to understand regarding garbage collection (GC). As far as I understand, there is a generational GC for the private heap of each process and a reference counting GC for the global shared heap.
What I would like to know is if there is any way to get:
How many collection cycles have run?
How many bytes are allocated and deallocated, on a global level or process level?
What are the private heap and shared heap sizes? And can we define these as GC parameters?
How long does it take to collect garbage? The % of time needed?
Is there a way to run a program without GC?
Is there a way to get this kind of information, either with code or using some commands when I run an Erlang program?
Thanks.
To get information for a single process, you can call erlang:process_info(Pid). This will yield (as of Erlang 18.0) the following fields:
> erlang:process_info(self()).
[{current_function,{erl_eval,do_apply,6}},
 {initial_call,{erlang,apply,2}},
 {status,running},
 {message_queue_len,0},
 {messages,[]},
 {links,[<0.27.0>]},
 {dictionary,[]},
 {trap_exit,false},
 {error_handler,error_handler},
 {priority,normal},
 {group_leader,<0.26.0>},
 {total_heap_size,4184},
 {heap_size,2586},
 {stack_size,24},
 {reductions,3707},
 {garbage_collection,[{min_bin_vheap_size,46422},
                      {min_heap_size,233},
                      {fullsweep_after,65535},
                      {minor_gcs,7}]},
 {suspending,[]}]
The number of collection cycles for the process is available in the field minor_gcs under the section garbage_collection.
Per Process
The current heap size for the process is available in the field heap_size from the results above (in words, 4 bytes on a 32-bit VM and 8 bytes on a 64-bit VM). The total memory consumption of the process can be obtained by calling erlang:process_info(Pid, memory) which returns for example {memory,34312} for the above process. This includes call stack, heap and internal structures.
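Because the heap figures are reported in words, a quick conversion helps when comparing them with the byte figure returned by erlang:process_info(Pid, memory). The arithmetic is sketched below in Java only to keep it runnable outside the Erlang shell; the class and method names are made up. On a running node the word size can be read with erlang:system_info(wordsize).

```java
// Hypothetical helper: converts the word counts from process_info into bytes.
public class ErlangWords {

    /** Converts a size in machine words to bytes for the given word size. */
    public static long wordsToBytes(long words, int wordSizeBytes) {
        return words * wordSizeBytes;
    }

    public static void main(String[] args) {
        long heapSizeWords = 2586;       // heap_size from the output above
        long totalHeapSizeWords = 4184;  // total_heap_size from the output above

        for (int wordSize : new int[] {4, 8}) {
            System.out.println(wordSize * 8 + "-bit VM: heap_size = "
                    + wordsToBytes(heapSizeWords, wordSize) + " bytes, total_heap_size = "
                    + wordsToBytes(totalHeapSizeWords, wordSize) + " bytes");
        }
    }
}
```

With 8-byte words, total_heap_size comes to 33472 bytes, just under the {memory,34312} figure quoted above. That would be consistent with the sample having been taken on a 64-bit VM, with the remainder going to internal structures, though this is an inference rather than something the output states.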
Deallocations (and allocations) can be traced using erlang:trace/3. If the trace flag is garbage_collection you will receive messages of the form {trace, Pid, gc_start, Info} and {trace, Pid, gc_end, Info}. The Info field of the gc_start message contains such things as heap_size and old_heap_size.
Per System
Top level statistics of the system can be obtained by erlang:memory/0:
> erlang:memory().
[{total,15023008},
 {processes,4215272},
 {processes_used,4215048},
 {system,10807736},
 {atom,202481},
 {atom_used,187597},
 {binary,325816},
 {code,4575293},
 {ets,234816}]
Garbage collection statistics can be obtained via erlang:statistics(garbage_collection) which yields:
> statistics(garbage_collection).
{85,23961,0}
Where (as of Erlang 18.0) the first field is the total number of garbage collections performed by the VM and the second field is the total number of words reclaimed.
The heap sizes for a process are available under the fields total_heap_size (all heap fragments and stack) and heap_size (the size of the youngest heap generation) from the process info above.
They can be controlled via spawn options, specifically min_heap_size which sets the initial heap size for a process.
To set it for all processes, erlang:system_flag(min_heap_size, MinHeapSize) can be called.
You can also control global VM memory allocation via the +M... options to the Erlang VM. The flags are described here. However, this requires extensive knowledge about the internals of the Erlang VM and its allocators and using them should not be taken lightly.
The time spent collecting garbage can be obtained via the tracing described above (erlang:trace/3 with the garbage_collection flag). If you use the option timestamp when tracing, you will receive a timestamp with each trace message that can be used to calculate the total GC time.
As for running a program without GC, the short answer is no.
Long answer: Maybe. You can control the initial heap size (via min_heap_size), which will affect when garbage collection occurs for the first time. You can also control when a full sweep will be performed with the fullsweep_after option.
More information can be found in the Academic and Historical Questions and Processes section of the Efficiency Guide.
The most practical way of introspecting Erlang memory usage at runtime is via the Recon library, as Steve Vinoski mentioned.

What is Peak Working Set in windows task manager

I'm confused about the Windows Task Manager memory overview.
In the general memory overview it shows "In use" 7.9 GB (in my sample).
I've used process explorer to sum up the used memory and it shows me the following:
Since this is the nearest number to the 7.9 GB in Task Manager, I guess this is the value shown there.
Now my question:
What is the Peak Working Set?
If I hover over the column in Task Manager, it says:
and the Microsoft help says: Maximum amount of working set memory used by the process.
Is it the memory effectively in use by all processes right now, or is it the maximum amount of memory that was ever used by all processes?
The number you refer to is "Memory used by processes, drivers and the operating system" [source].
This is an easy but somewhat vague description. A somewhat similar description would be the total amount of memory that is not free, or part of the buffer cache, or part of the standby list.
It is not the maximum memory used at some time ("peak"), it's a coincidence that you have roughly the same number there. It is the presently used amount (used by "everyone", that is all programs and the OS).
The peak working set is a different thing. The working set is the amount of memory in a process (or, if you consider several processes, in all these processes) that is currently in physical memory. The peak working set is, consequently, the maximum value so far seen.
A process may allocate more memory than it actually ever commits ("uses"), and most processes will commit more memory than they have in their working set at one time. This is perfectly normal. Pages are moved in and out of working sets (and into the standby list) to assure that the computer, which has only a finite amount of memory, always has enough reserves to satisfy any memory needs.
The memory figures in question aren't actually a reliable indicator of how much memory a process is using.
A brief explanation of each of the memory relationships:
Private Bytes are what the process has allocated, including memory that has been paged out to the pagefile.
Working Set is the non-paged Private Bytes plus memory-mapped files.
Virtual Bytes are the Working Set plus paged Private Bytes and the standby list.
In answer to your question the peak working set is the maximum amount of physical RAM that was assigned to the process in question.
~ Update ~
Available memory is defined as the sum of the standby list plus free memory. There is far more to total memory usage than the sum of all process working sets. Because of this, and due to memory sharing, this value is not generally very useful.
The virtual size of a process is the portion of a process's virtual address space that has been allocated for use. There is no relationship between this and physical memory usage.
Private bytes is the portion of a process's virtual address space that has been allocated for private use. It does not include shared memory or memory used for code. There is no relationship between this value and physical memory usage either.
Working set is the amount of physical memory in use by a process. Due to memory sharing there will be some double counting in this value.
The terms mentioned above aren't really going to mean very much until you understand the basic concepts in Windows memory management. Have a look HERE for some further reading.

Resources