What does VIRTUAL_MEMORY_BYTES task counter mean in Hadoop? - hadoop

The following excerpt from The Definitive Guide provides high level details as shown below but
what exactly is virtual memory is referring to in this task counter?
How to interpret it? How is it related to PHYSICAL_MEMORY_BYTES?
Following is an example extract from one of the jobs. Physical is 214 GB approx. and virtual is 611 GB approx.

1.What exactly is virtual memory is referring to in this task counter?
Virtual Memory here is used to prevent Out of Memory errors of a task,if data size doesn't fits in RAM(physical mem).
in RAM.So a portion of memory of size what didn't fit in RAM will be used as Virtual Memory.
So,while setting up hadoop cluster one is advised to have the value of vm.swappiness =1 to achieve better performance. On linux systems, vm.swappiness is set to 60 by default.
Higher the value more aggresive swapping of memory pages.
https://community.hortonworks.com/articles/33522/swappiness-setting-recommendation.html
2. How to interpret it? How is it related to PHYSICAL_MEMORY_BYTES?
swapping of memory pages from physical memory to virtual memory on disk when not enough phy mem
This is the relation between PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES.

Related

setting up heap memory in jmeter for more than one concurrent script execution

Below is my scenario.
I have 2 test scripts :- one might use 5GB to 15GB of heap memory and other script might use from 5GB to 12GB.
If i have a machine of 32 GB memory,
While executing for the first script can i assign XMS 1GB XMX 22GB(though my script needs 15GB) and for the second script can i assign XMS 1GB and XMX 12GB
As sum of maximum goes beyong 32GB(total memory)
In the second case i assign like this--->
for script 1:XMS 22GB XMX 22GB
for script 2:XMS 12GB and XMX 12GB
Sum of Max 34GB.
Does it by any chance work like below----- >
If 12GB is assigned for first script,is this memory blocked for that process/script ? and can i not use the unused memory for other processes ?
or
If 12GB is assigned for the first script ,it uses only as much as requuired by it and any other process can use the rest memory ? IF it works in this way-i don't have to specifically assign heap for two scripts separately.
If you set the minimum heap memory via Xms parameter the JVM will reserve this memory and it will not be available for other processes.
If you're able to allocate more JVM Heap than you have total physical RAM it means that your OS will go for swapping - using hard drive to dump memory pages which extends your computer memory at cost of speed because memory operations are fast and disk operations are very slow.
For example look at my laptop which has 8 GB of total physical RAM:
It has 8 GB of physical memory of which 1.2 GB are free. It means that I can safely allocate 1 GB of RAM to Java
However when I give 8 GB to Java as:
java -Xms8G
it still works
15 GB - still works
and when I try to allocate 20 GB it fails because it doesn't fit into physical + virtual memory.
You must avoid swapping because it means that JMeter will not be able to send requests fast enough even if the system under tests supports it so make sure to measure how much available physical RAM you have and your test must not exceed this. If you cannot achieve it on one machine - you will have to go for distributed testing
Also "concurrently" running 2 different scripts is not something you should be doing because it's not only about the memory, a single CPU core can execute only one command at a time, other commands are waiting in the queue and being served by context switching which is kind of expensive and slow operation.
And last but not the least, allocating the maximum HEAP is not the best idea because this way garbage collections will be less frequent but will last much longer resulting in throughput dropdowns, keep heap usage between 30% and 80% like in Optimal Heap Size article

statement_mem seems to limit the node memory instead of the segment memory

According to the GreenPlum documentation, GUCs such as statement_mem, gp_vmem_protect_limit should work at segment level. Same thing should happen with a resource queue memory allowance.
On our system we have 8 primary segments per node. So if I set the statement_mem of a query to 2GB I would expect the query to consume (if needed) up to 2GB x 8 = 16GBs of RAM. But it seems that it would only use 2GBs total per node before starting to write into disk (that's it 2GB/8 per segment). I tried with different statement_values and same thing.
max_statement_mem or gp_vmem_protect_limit limits are never reached. RAM usage on nodes have been monitored using various tools (from GP command center to top, free, all the way across Pivotal suggested session_level_memory_consumption view).
EDITED FROM HERE
ADDED two documentation sources where statement_mem is defined per segment and not per host. (#Jon Roberts)
On the GP best practices guide, beginning of page 32, it clearly says that if the statement_mem is 125MB and we have 8 segments on the server, each query will get 1GB allocated per server.
https://www.google.es/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0ahUKEwi6sOTx8O3KAhVBKg4KHTwICX0QFggmMAE&url=http%3A%2F%2Fgpdb.docs.pivotal.io%2F4300%2Fpdf%2FGPDB43BestPractices.pdf&usg=AFQjCNGkTqa6143fvJUztYISWAiVyj62dA&sig2=D2ZcJwLDqN0qBzU73NjXNg&bvm=bv.113943164,d.ZWU&cad=rja
On the https://support.pivotal.io/hc/en-us/articles/201947018-Pivotal-Greenplum-GPDB-Memory-Configuration it seems to use statement_mem as segment memory and not host memory. It keeps interrelating statement_mem with the memory limit of the resource queues as well as with the gp_vmem_protect_limit (both parameters defined per segment basis).
This is why I'm getting confused about how to properly manage the memory resources.
Thanks
I incorrectly stated that statement_mem is on a per host and that is not the case. This link is talking about the memory on a segment level:
http://gpdb.docs.pivotal.io/4370/guc_config-statement_mem.html#statement_mem
With the default of "eager_free" gp_resqueue_memory_policy, memory gets re-used so the aggregate amount of memory used may look low for a particular query execution. If you change it to "auto" where the memory isn't re-used, the memory usage is more noticeable.
Run an "explain analyze" of your query and see the slices that are used. With eager_free, the memory gets re-used so you may only have a single slice wanting more memory than available such as this one:
(slice18) * Executor memory: 10399K bytes avg x 2 workers, 10399K bytes max (seg0). Work_mem: 8192K bytes max, 13088K bytes wanted.
And for your question on how to manage the resources, most people don't change the default values. A query that spills to disk is usually an indication that the query needs to be revised or the data model needs some work.

How much virtual memory does a 30.5Gb heap(256 Gb memory in total) for Elasticsearch support?

Assume I have a machine with 256gb memory and 12TB SSD. Indexed document size is 100TB. I assign 30.5 GB to Elasticsearch heap. The remaining is for Lucene and OS.
My question is, how much virtual memory does Elasticsearch support? To put it in another way, how many indexed documents can I put into the virtual memory for each machine?
Thanks
The amount of virtual memory ES can use is defined by the value of the vm.max_map_count setting in /etc/sysctl.conf. By default it is set at 262144, but you can change this value using:
sysctl -w vm.max_map_count=262144
From the linux documentation:
This file contains the maximum number of memory map areas a process
may have. Memory map areas are used as a side-effect of calling
malloc, directly by mmap and mprotect, and also when loading shared
libraries.
While most applications need less than a thousand maps, certain
programs, particularly malloc debuggers, may consume lots of them,
e.g., up to one or two maps per allocation.
The default value is 65536.
So this setting doesn't impose a specific size available to ES/Lucene, but a number of discrete memory areas that a given process can use. How much memory is used exactly will depend on the size of the memory chunks being allocated by ES/Lucene. By default, Lucene uses
1<<30 = 1,073,741,824 ~= 1GB chunks on a 64 bit JRE and
1<<28 = 268,435,456 ~= 256MB chunks on 32 bit JRE
So if you do the math, the default value of vm.max_map_count is probably good enough for your case, if not you can tune it and monitor your virtual memory usage.

zone_NORMAL and ZONE_HIGHMEM on 32 and 64 bit kernels

I trying to to make the linux memory management a little bit more clear for tuning and performances purposes.
By reading this very interesting redbook "Linux Performance and Tuning Guidelines" found on the IBM website I came across something I don't fully understand.
On 32-bit architectures such as the IA-32, the Linux kernel can directly address only the first gigabyte of physical memory (896 MB when considering the reserved range). Memory above the so-called ZONE_NORMAL must be mapped into the lower 1 GB. This mapping is completely transparent to applications, but allocating a memory page in ZONE_HIGHMEM causes a small performance degradation.
why the memory above 896 MB has to be mapped into the lower 1GB ?
Why there is an impact on performances by allocating a memory page in ZONE_HIGHMEM ?
what is the ZONE_HIGHMEM used for then ?
why a kernel that is able to recognize up to 4gb ( CONFIG_HIGHMEM=y ) can just use the first gigabyte ?
Thanks in advance
When a user process traps in to the kernel, the page tables are not changed. This means that one linear address space must be able to cover both the memory addresses available to the user process, and the memory addresses available to the kernel.
On IA-32, which allows a 4GB linear address space, usually the first 3GB of the linear address space are allocated to the user process, and the last 1GB of the linear address space is allocated to the kernel.
The kernel must use its 1GB range of addresses to be able to address any part of physical memory it needs to. Memory above 896MB is not "mapped into the low 1GB" - what happens is that physical memory below 896MB is assigned a permanent linear address in the kernel's part of the linear address space, whereas as memory above that limit must be assigned a temporary mapping in the remaining part of the linear address space.
There is no impact on performance when mapping a ZONE_HIGHMEM page into a userspace process - for a userspace process, all physical memory pages are equal. The impact on performance exists when the kernel needs to access a non-user page in ZONE_HIGHMEM - to do so, it must map it into the linear address space if it is not already mapped.

How does the linux kernel manage less than 1GB physical memory?

I'm learning the linux kernel internals and while reading "Understanding Linux Kernel", quite a few memory related questions struck me. One of them is, how the Linux kernel handles the memory mapping if the physical memory of say only 512 MB is installed on my system.
As I read, kernel maps 0(or 16) MB-896MB physical RAM into 0xC0000000 linear address and can directly address it. So, in the above described case where I only have 512 MB:
How can the kernel map 896 MB from only 512 MB ? In the scheme described, the kernel set things up so that every process's page tables mapped virtual addresses from 0xC0000000 to 0xFFFFFFFF (1GB) directly to physical addresses from 0x00000000 to 0x3FFFFFFF (1GB). But when I have only 512 MB physical RAM, how can I map, virtual addresses from 0xC0000000-0xFFFFFFFF to physical 0x00000000-0x3FFFFFFF ? Point is I have a physical range of only 0x00000000-0x20000000.
What about user mode processes in this situation?
Every article explains only the situation, when you've installed 4 GB of memory and the kernel maps the 1 GB into kernel space and user processes uses the remaining amount of RAM.
I would appreciate any help in improving my understanding.
Thanks..!
Not all virtual (linear) addresses must be mapped to anything. If the code accesses unmapped page, the page fault is risen.
The physical page can be mapped to several virtual addresses simultaneously.
In the 4 GB virtual memory there are 2 sections: 0x0... 0xbfffffff - is process virtual memory and 0xc0000000 .. 0xffffffff is a kernel virtual memory.
How can the kernel map 896 MB from only 512 MB ?
It maps up to 896 MB. So, if you have only 512, there will be only 512 MB mapped.
If your physical memory is in 0x00000000 to 0x20000000, it will be mapped for direct kernel access to virtual addresses 0xC0000000 to 0xE0000000 (linear mapping).
What about user mode processes in this situation?
Phys memory for user processes will be mapped (not sequentially but rather random page-to-page mapping) to virtual addresses 0x0 .... 0xc0000000. This mapping will be the second mapping for pages from 0..896MB. The pages will be taken from free page lists.
Where are user mode processes in phys RAM?
Anywhere.
Every article explains only the situation, when you've installed 4 GB of memory and the
No. Every article explains how 4 Gb of virtual address space is mapped. The size of virtual memory is always 4 GB (for 32-bit machine without memory extensions like PAE/PSE/etc for x86)
As stated in 8.1.3. Memory Zones of the book Linux Kernel Development by Robert Love (I use third edition), there are several zones of physical memory:
ZONE_DMA - Contains page frames of memory below 16 MB
ZONE_NORMAL - Contains page frames of memory at and above 16 MB and below 896 MB
ZONE_HIGHMEM - Contains page frames of memory at and above 896 MB
So, if you have 512 MB, your ZONE_HIGHMEM will be empty, and ZONE_NORMAL will have 496 MB of physical memory mapped.
Also, take a look to 2.5.5.2. Final kernel Page Table when RAM size is less than 896 MB section of the book. It is about case, when you have less memory than 896 MB.
Also, for ARM there is some description of virtual memory layout: http://www.mjmwired.net/kernel/Documentation/arm/memory.txt
The line 63 PAGE_OFFSET high_memory-1 is the direct mapped part of memory
The hardware provides a Memory Management Unit. It is a piece of circuitry which is able to intercept and alter any memory access. Whenever the processor accesses the RAM, e.g. to read the next instruction to execute, or as a data access triggered by an instruction, it does so at some address which is, roughly speaking, a 32-bit value. A 32-bit word can have a bit more than 4 billions distinct values, so there is an address space of 4 GB: that's the number of bytes which could have a unique address.
So the processor sends out the request to its memory subsystem, as "fetch the byte at address x and give it back to me". The request goes through the MMU, which decides what to do with the request. The MMU virtually splits the 4 GB space into pages; page size depends on the hardware you use, but typical sizes are 4 and 8 kB. The MMU uses tables which tell it what to do with accesses for each page: either the access is granted with a rewritten address (the page entry says: "yes, the page containing address x exists, it is in physical RAM at address y") or rejected, at which point the kernel is invoked to handle things further. The kernel may decide to kill the offending process, or to do some work and alter the MMU tables so that the access may be tried again, this time successfully.
This is the basis for virtual memory: from the point of view, the process has some RAM, but the kernel has moved it to the hard disk, in "swap space". The corresponding table is marked as "absent" in the MMU tables. When the process accesses his data, the MMU invokes the kernel, which fetches the data from the swap, puts it back at some free space in physical RAM, and alters the MMU tables to point at that space. The kernel then jumps back to the process code, right at the instruction which triggered the whole thing. The process code sees nothing of the whole business, except that the memory access took quite some time.
The MMU also handles access rights, which prevents a process from reading or writing data which belongs to other processes, or to the kernel. Each process has its own set of MMU tables, and the kernel manage those tables. Thus, each process has its own address space, as if it was alone on a machine with 4 GB of RAM -- except that the process had better not access memory that it did not allocate rightfully from the kernel, because the corresponding pages are marked as absent or forbidden.
When the kernel is invoked through a system call from some process, the kernel code must run within the address space of the process; so the kernel code must be somewhere in the address space of each process (but protected: the MMU tables prevent access to the kernel memory from unprivileged user code). Since code can contain hardcoded addresses, the kernel had better be at the same address for all processes; conventionally, in Linux, that address is 0xC0000000. The MMU tables for each process map that part of the address space to whatever physical RAM blocks the kernel was actually loaded upon boot. Note that the kernel memory is never swapped out (if the code which can read back data from swap space was itself swapped out, things would turn sour quite fast).
On a PC, things can be a bit more complicated, because there are 32-bit and 64-bit modes, and segment registers, and PAE (which acts as a kind of second-level MMU with huge pages). The basic concept remains the same: each process gets its own view of a virtual 4 GB address space, and the kernel uses the MMU to map each virtual page to an appropriate physical position in RAM, or nowhere at all.
osgx has an excellent answer, but I see a comment where someone still doesn't understand.
Every article explains only the situation, when you've installed 4 GB
of memory and the kernel maps the 1 GB into kernel space and user
processes uses the remaining amount of RAM.
Here is much of the confusion. There is virtual memory and there is physical memory. Every 32bit CPU has 4GB of virtual memory. The Linux kernel's traditional split was 3G/1G for user memory and kernel memory, but newer options allow different partitioning.
Why distinguish between the kernel and user space? - my own question
When a task swaps, the MMU must be updated. The kernel MMU space should remain the same for all processes. The kernel must handle interrupts and fault requests at any time.
How does virtual to physical mapping work? - my own question.
There are many permutations of virtual memory.
a single private mapping to a physical RAM page.
a duplicate virtual mapping to a single physical page.
a mapping that throws a SIGBUS or other error.
a mapping backed by disk/swap.
From the above list, it is easy to see why you may have more virtual address space than physical memory. In fact, the fault handler will typically inspect process memory information to see if a page is mapped (I mean allocated for the process), but not in memory. In this case the fault handler will call the I/O sub-system to read in the page. When the page has been read and the MMU tables updated to point the virtual address to a new physical address, the process that caused the fault resumes.
If you understand the above, it becomes clear why you would like to have a larger virtual mapping than physical memory. It is how memory swapping is supported.
There are other uses. For instance two processes may use the same code library. It is possible that they are at different virtual addresses in the process space due to linking. You may map the different virtual addresses to the same physical page in this case in order to save physical memory. This is quite common for new allocations; they all point to a physical 'zero page'. When you touch/write the memory the zero page is copied and a new physical page allocated (COW or copy on write).
It is also sometimes useful to have the virtual pages aliased with one as cached and another as non-cached. The two pages can be examined to see what data is cached and what is not.
Mainly virtual and physical are not the same! Easily stated, but often confusing when looking at the Linux VMM code.
-
Hi, actually, I don't work on x86 hardware platform, so there may exist some technical errors in my post.
To my knowledge, the range between 0(or 16)MB - 896MB is listed specially while you have more RAM than that number, say, you have 1GB physical RAM on your board, which is called "low-memory". If you have more physical RAM than 896MB on your board, then, rest of the physical RAM is called highmem.
Speaking of your question, there are 512MiBytes physical RAM on your board, so actually, there is no 896, no highmem.
The total RAM kernel can see and also can map is 512MB.
'Cause there is 1-to-1 mapping between physical memory and kernel virtual address, so there is 512MiBytes virtual address space for kernel. I'm really not sure whether or not the prior sentence is right, but it's what in my mind.
What I mean is if there is 512MBytes, then the amount of physical RAM the kernel can manage is also 512MiBytes, further, the kernel cannot create such big address space like beyond 512MBytes.
Refer to user space, there is one different point, pages of user's application can be swapped out to harddisk, but pages of the kernel cannot.
So, for user space, with the help of page tables and other related modules, it seems there is still 4GBytes address space.
Of course, this is virtual address space, not physical RAM space.
This is what I understand.
Thanks.
If the physical memory is less than 896 MB then the linux kernel maps upto that physical address lineraly.
For details see this.. http://learnlinuxconcepts.blogspot.in/2014/02/linux-addressing.html

Resources