Confused between Kbps and KBps - performance

I recently learned from my syllabus that Kb refers to kilobits whereas KB refers to kilobytes. I also learned that both Kbps and KBps express speed. So, according to what I've studied, downloading a 1 MB file at a speed of 1 Mbps should take 8 seconds, since 1 MB equals 8 Mb. But I can download that file in just 1 second at a speed of 1 Mbps. How is that possible?

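You are correct up to the last statement.
You can download a 1 MB file in 8 sec at 1 Mb/s or in 1 sec at 1 MB/s, because 8 Mb = 1 MB.
If the download finished in 1 second, the speed shown was most likely 1 MB/s (megabytes per second), not 1 Mb/s.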
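The conversion above can be checked with a small sketch. The function name and the variable names are made up for illustration; the only fact used is that 1 byte is 8 bits.

```python
BITS_PER_BYTE = 8


def download_seconds(file_size_megabytes: float, speed_megabits_per_second: float) -> float:
    """Time to transfer a file, given its size in MB and the link speed in Mb/s."""
    file_size_megabits = file_size_megabytes * BITS_PER_BYTE
    return file_size_megabits / speed_megabits_per_second


print(download_seconds(1, 1))  # 1 MB at 1 Mb/s -> 8.0
print(download_seconds(1, 8))  # 1 MB at 8 Mb/s (which is 1 MB/s) -> 1.0
```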

Related

Children vs parent output in iozone

I'm failing to understand what the iozone benchmark outputs.
Here I'm launching a basic read test with 16 processes, each of them reading a 2048 KiB file, all at once.
I've aggressively disabled caching with echo 3 > /proc/sys/vm/drop_caches.
Results are the following:
Run began: Thu Apr 21 22:12:42 2022
File size set to 2048 kB
Record Size 2048 kB
Include close in write timing
Include fsync in write timing
Command line used: iozone -t 16 -s 2048 -r 2048 -ce -i 1
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 16 processes
Each process writes a 2048 kByte file in 2048 kByte records
Children see throughput for 16 readers = 1057899.00 kB/sec
Parent sees throughput for 16 readers = 559102.01 kB/sec
Min throughput per process = 0.00 kB/sec
Max throughput per process = 1057899.00 kB/sec
Avg throughput per process = 66118.69 kB/sec
Min xfer = 0.00 kB
Children see throughput for 16 re-readers = 948555.56 kB/sec
Parent sees throughput for 16 re-readers = 584476.30 kB/sec
Min throughput per process = 0.00 kB/sec
Max throughput per process = 948555.56 kB/sec
Avg throughput per process = 59284.72 kB/sec
Min xfer = 0.00 kB
I don't get why the 'children' bandwidth differs so much from the 'parent' bandwidth, nor why it seems that only one process has been used (Min throughput per process is 0.00 kB/sec and Avg throughput per process is "Children see throughput for 16 readers" / 16).
This SO question is roughly the same but the only answer is a bit vague.
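The "average is children / 16" observation can be verified against the figures in the output above. This is only an arithmetic check of the reported numbers, not a description of how iozone computes them internally; the function name is made up for illustration.

```python
PROCESSES = 16

# Figures copied from the iozone output above (kB/sec).
runs = {
    "readers": {"children": 1057899.00, "avg_per_process": 66118.69},
    "re-readers": {"children": 948555.56, "avg_per_process": 59284.72},
}


def average_per_process(children_throughput: float, processes: int = PROCESSES) -> float:
    """Children throughput spread evenly over all processes."""
    return children_throughput / processes


for name, figures in runs.items():
    computed = average_per_process(figures["children"])
    print(f"{name}: computed {computed:.2f} kB/sec, reported {figures['avg_per_process']:.2f} kB/sec")
```

In both runs the computed value agrees with the reported average to two decimals, which is consistent with one process accounting for the whole "children" figure while the others report 0.00 kB/sec.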

Spark Scratch Space

I have a cluster of 13 machines, each with 4 physical CPUs and 24 GB of RAM.
I started a Spark cluster with one driver and 12 slaves.
I set the number of cores per slave to 12, meaning I have the following cluster:
Alive Workers: 12
Cores in use: 144 Total, 110 Used
Memory in use: 263.9 GB Total, 187.0 GB Used
I started an application with the following configuration:
[('spark.driver.cores', '4'),
('spark.executor.memory', '15G'),
('spark.executor.id', 'driver'),
('spark.driver.memory', '5G'),
('spark.python.worker.memory', '1042M'),
('spark.cores.max', '96'),
('spark.rdd.compress', 'True'),
('spark.serializer.objectStreamReset', '100'),
('spark.executor.cores', '8'),
('spark.default.parallelism', '48')]
I understand there are 15 GB of RAM per executor with 8 task slots, and a parallelism of 48 (48 = 6 task slots * 12 slaves).
Then I have two big files on HDFS, 6 GB each (from a directory of 12 files of 5 blocks of 128 MB each), with a 3x replication factor.
I union these two files, so I expect one dataframe of 12 GB, but I see 37 GB of input read in the UI:
That could be the first question: why 37 GB?
Then, as the execution time is too long for me, I try to cache the data so that I can go faster. But the caching never seems to finish; here you can see it is already at 45 minutes before the end (vs. 6 min not cached!):
So I try to understand why, and I look at the Memory/Disk usage in the Storage section of the UI:
So some parts of the RDD are staying on disk.
Furthermore, I see the executors may still have free memory:
And I notice on the same "Storage" page that the size of the RDD has jumped:
Storage Level: Disk Serialized 1x Replicated
Cached Partitions: 72
Total Partitions: 72
Memory Size: 42.7 GB
Disk Size: 73.3 GB
=> I understand: Memory Size: 42.7 GB + Disk Size: 73.3 GB = 116 GB!
=> So my 6 GB file has turned into 37 GB and then into 116 GB???
But I try to understand why there is still some memory left on my executor, so I go to the "stderr" dump of one, and I see:
18/02/08 11:04:08 INFO MemoryStore: Will not store rdd_50_46
18/02/08 11:04:09 WARN MemoryStore: Not enough space to cache rdd_50_46 in memory! (computed 1134.1 MB so far)
18/02/08 11:04:09 INFO MemoryStore: Memory use = 1641.6 KB (blocks) + 7.7 GB (scratch space shared across 6 tasks(s)) = 7.7 GB. Storage limit = 7.8 GB.
18/02/08 11:04:09 WARN BlockManager: Persisting block rdd_50_46 to disk instead.
And here I see that the executor wants to cache a 1641.6 KB block (only about 1.6 MB!) and it can't, because there is a "scratch space" of 7.7 GB "shared across 6 tasks".
=> What is a "scratch space"?
=> The 6 tasks come from the parallelism of 48 / 12 = 6
And then I come back to the app information, and I see that the count that lasted 48 min read only 37 GB of data! (The 48 min are clearly also spent caching the data.)
When I do a count on the cached dataframe, I have 116 GB of input read:
And at the end of the day, the time saved by the cached count is not that impressive; here are the 3 durations:
4.8 min: count on cached df
48 min: count while caching
5.8 min: count on not cached df (read directly from HDFS)
So why is it so?
Because the cached df is not that much cached:
Meaning more or less 40 GB in memory and 60 GB on disk.
I am surprised because at 15 GB / executor * 12 slaves => 180 GB of memory, and I can cache only 40 GB... But in fact I remember that the memory is split:
30% for spark
54% for storage
16% for shuffle
So I understand that I do have 54% * 15 GB for storage, i.e. 8.1 GB per executor, meaning that out of my 180 GB, I only have 97 GB for storage. Why do I have 90 - 40 = 50 GB not used then?
Oops... This is a long post!
Plenty of questions... Sorry...
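The storage arithmetic in the post can be written out as a quick sketch. It only reproduces the poster's own estimate: the 54% share is the figure recalled in the post, and one executor per slave is an assumption taken from the post, not something read from Spark itself. The function name is made up for illustration.

```python
EXECUTOR_MEMORY_GB = 15   # spark.executor.memory from the configuration above
EXECUTORS = 12            # assumption from the post: one executor per slave
STORAGE_FRACTION = 0.54   # storage share as recalled in the post
CACHED_IN_MEMORY_GB = 42.7  # "Memory Size" shown on the Storage page


def storage_budget_gb(executor_memory_gb, executors, storage_fraction):
    """Return (per-executor, cluster-wide) storage memory under the post's estimate."""
    per_executor = executor_memory_gb * storage_fraction
    return per_executor, per_executor * executors


per_executor, cluster_wide = storage_budget_gb(EXECUTOR_MEMORY_GB, EXECUTORS, STORAGE_FRACTION)
print(f"{per_executor:.1f} GB per executor, {cluster_wide:.1f} GB cluster-wide")
print(f"{cluster_wide - CACHED_IN_MEMORY_GB:.1f} GB of that estimate is not holding cached blocks")
```

Note that the executor log above reports a storage limit of 7.8 GB, slightly below the 8.1 GB estimate, so the real cluster-wide figure is probably a little lower than 97 GB.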

Impala - out of memory exception. Slow queries

Can someone help me? I'm running a cluster of 5 Impala nodes for my API. Now I get a lot of 'out of memory' exceptions when I run queries:
Failed to get minimum memory reservation of 3.94 MB on daemon r5c3s4.colo.vm:22000 for query 924d155863398f6b:c4a3470300000000 because it would exceed an applicable memory limit. Memory is likely oversubscribed. Reducing query concurrency or configuring admission control may help avoid this error. Memory usage:
, node=[r4c3s2]
Process: Limit=55.00 GB Total=49.79 GB Peak=49.92 GB, node=[r4c3s2]
Buffer Pool: Free Buffers: Total=0, node=[r4c3s2]
Buffer Pool: Clean Pages: Total=4.21 GB, node=[r4c3s2]
Buffer Pool: Unused Reservation: Total=-4.21 GB, node=[r4c3s2]
Free Disk IO Buffers: Total=1.19 GB Peak=1.52 GB, node=[r4c3s2]
However, the memory page says that just 23.83 GB of 150.00 GB are used. The queries also became really slow. This problem occurred out of nowhere. Does anyone have an explanation for that?
Here is all the memory information I got from the "/memz?detailed=true" page of one node:
Memory Usage
Memory consumption / limit: 23.83 GB / 150.00 GB
Breakdown
Process: Limit=150.00 GB Total=23.83 GB Peak=58.75 GB
Buffer Pool: Free Buffers: Total=72.69 MB
Buffer Pool: Clean Pages: Total=0
Buffer Pool: Unused Reservation: Total=-71.94 MB
Free Disk IO Buffers: Total=1.61 GB Peak=1.67 GB
RequestPool=root.default: Total=20.77 GB Peak=59.92 GB
Query(2647a4f63d37fdaa:690ad3b500000000): Reservation=20.67 GB ReservationLimit=120.00 GB OtherMemory=101.21 MB Total=20.77 GB Peak=20.77 GB
Unclaimed reservations: Reservation=71.94 MB OtherMemory=0 Total=71.94 MB Peak=139.94 MB
Fragment 2647a4f63d37fdaa:690ad3b50000001c: Reservation=0 OtherMemory=114.48 KB Total=114.48 KB Peak=855.48 KB
AGGREGATION_NODE (id=9): Total=102.12 KB Peak=102.12 KB
Exprs: Total=102.12 KB Peak=102.12 KB
EXCHANGE_NODE (id=8): Total=0 Peak=0
DataStreamRecvr: Total=0 Peak=0
DataStreamSender (dst_id=10): Total=872.00 B Peak=872.00 B
CodeGen: Total=3.50 KB Peak=744.50 KB
Fragment 2647a4f63d37fdaa:690ad3b500000014: Reservation=0 OtherMemory=243.31 KB Total=243.31 KB Peak=1.57 MB
AGGREGATION_NODE (id=3): Total=102.12 KB Peak=102.12 KB
Exprs: Total=102.12 KB Peak=102.12 KB
AGGREGATION_NODE (id=7): Total=119.12 KB Peak=119.12 KB
Exprs: Total=119.12 KB Peak=119.12 KB
EXCHANGE_NODE (id=6): Total=0 Peak=0
DataStreamRecvr: Total=0 Peak=0
DataStreamSender (dst_id=8): Total=6.81 KB Peak=6.81 KB
CodeGen: Total=7.25 KB Peak=1.34 MB
Fragment 2647a4f63d37fdaa:690ad3b50000000c: Reservation=2.32 GB OtherMemory=349.48 KB Total=2.32 GB Peak=2.32 GB
AGGREGATION_NODE (id=2): Total=119.12 KB Peak=119.12 KB
Exprs: Total=119.12 KB Peak=119.12 KB
AGGREGATION_NODE (id=5): Reservation=2.32 GB OtherMemory=199.74 KB Total=2.32 GB Peak=2.32 GB
Exprs: Total=120.12 KB Peak=120.12 KB
EXCHANGE_NODE (id=4): Total=0 Peak=0
DataStreamRecvr: Total=336.00 B Peak=549.14 KB
DataStreamSender (dst_id=6): Total=6.44 KB Peak=6.44 KB
CodeGen: Total=15.85 KB Peak=3.10 MB
Fragment 2647a4f63d37fdaa:690ad3b500000004: Reservation=18.29 GB OtherMemory=100.52 MB Total=18.38 GB Peak=18.38 GB
AGGREGATION_NODE (id=1): Reservation=18.29 GB OtherMemory=334.12 KB Total=18.29 GB Peak=18.29 GB
Exprs: Total=148.12 KB Peak=148.12 KB
HDFS_SCAN_NODE (id=0): Total=100.17 MB Peak=178.15 MB
Exprs: Total=4.00 KB Peak=4.00 KB
DataStreamSender (dst_id=4): Total=6.75 KB Peak=6.75 KB
CodeGen: Total=9.72 KB Peak=2.92 MB
RequestPool=fe-eval-exprs: Total=0 Peak=12.00 KB
Untracked Memory: Total=1.44 GB
tcmalloc
------------------------------------------------
MALLOC: 24646559936 (23504.8 MiB) Bytes in use by application
MALLOC: + 0 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 725840992 ( 692.2 MiB) Bytes in central cache freelist
MALLOC: + 4726720 ( 4.5 MiB) Bytes in transfer cache freelist
MALLOC: + 208077600 ( 198.4 MiB) Bytes in thread cache freelists
MALLOC: + 105918656 ( 101.0 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 25691123904 (24501.0 MiB) Actual memory used (physical + swap)
MALLOC: + 53904392192 (51407.2 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 79595516096 (75908.2 MiB) Virtual address space used
MALLOC:
MALLOC: 133041 Spans in use
MALLOC: 842 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
System
Physical Memory: 252.41 GB
Transparent Huge Pages Config:
enabled: always [madvise] never
defrag: [always] madvise never
khugepaged defrag: 1
Process and system memory metrics
Name Value Description
memory.anon-huge-page-bytes 19.01 GB Total bytes of anonymous (a.k.a. transparent) huge pages used by this process.
memory.mapped-bytes 113.09 GB Total bytes of memory mappings in this process (the virtual memory size).
memory.num-maps 18092 Total number of memory mappings in this process.
memory.rss 24.51 GB Resident set size (RSS) of this process, including TCMalloc, buffer pool and Jvm.
memory.thp.defrag [always] madvise never The system-wide 'defrag' setting for Transparent Huge Pages.
memory.thp.enabled always [madvise] never The system-wide 'enabled' setting for Transparent Huge Pages.
memory.thp.khugepaged-defrag 1 The system-wide 'defrag' setting for khugepaged.
memory.total-used 23.83 GB Total memory currently used by TCMalloc and buffer pool.
Buffer pool memory metrics
Name Value Description
buffer-pool.clean-page-bytes 0 Total bytes of clean page memory cached in the buffer pool.
buffer-pool.clean-pages 0 Total number of clean pages cached in the buffer pool.
buffer-pool.clean-pages-limit 12.00 GB Limit on number of clean pages cached in the buffer pool.
buffer-pool.free-buffer-bytes 72.69 MB Total bytes of free buffer memory cached in the buffer pool.
buffer-pool.free-buffers 177 Total number of free buffers cached in the buffer pool.
buffer-pool.limit 120.00 GB Maximum allowed bytes allocated by the buffer pool.
buffer-pool.reserved 20.67 GB Total bytes of buffers reserved by Impala subsystems
buffer-pool.system-allocated 20.67 GB Total buffer memory currently allocated by the buffer pool.
buffer-pool.unused-reservation-bytes 71.94 MB Total bytes of buffer reservations by Impala subsystems that are currently unused
JVM aggregate memory metrics
Name Value Description
jvm.total.committed-usage-bytes 1.45 GB Jvm total Committed Usage Bytes
jvm.total.current-usage-bytes 903.10 MB Jvm total Current Usage Bytes
jvm.total.init-usage-bytes 1.92 GB Jvm total Init Usage Bytes
jvm.total.max-usage-bytes 31.23 GB Jvm total Max Usage Bytes
jvm.total.peak-committed-usage-bytes 2.09 GB Jvm total Peak Committed Usage Bytes
jvm.total.peak-current-usage-bytes 1.48 GB Jvm total Peak Current Usage Bytes
jvm.total.peak-init-usage-bytes 1.92 GB Jvm total Peak Init Usage Bytes
jvm.total.peak-max-usage-bytes 31.41 GB Jvm total Peak Max Usage Bytes
JVM heap memory metrics
Name Value Description
jvm.heap.committed-usage-bytes 1.37 GB Jvm heap Committed Usage Bytes
jvm.heap.current-usage-bytes 827.25 MB Jvm heap Current Usage Bytes
jvm.heap.init-usage-bytes 2.00 GB Jvm heap Init Usage Bytes
jvm.heap.max-usage-bytes 26.67 GB Jvm heap Max Usage Bytes
jvm.heap.peak-committed-usage-bytes 0 Jvm heap Peak Committed Usage Bytes
jvm.heap.peak-current-usage-bytes 0 Jvm heap Peak Current Usage Bytes
jvm.heap.peak-init-usage-bytes 0 Jvm heap Peak Init Usage Bytes
jvm.heap.peak-max-usage-bytes 0 Jvm heap Peak Max Usage Bytes
JVM non-heap memory metrics
Name Value Description
jvm.non-heap.committed-usage-bytes 76.90 MB Jvm non-heap Committed Usage Bytes
jvm.non-heap.current-usage-bytes 75.68 MB Jvm non-heap Current Usage Bytes
jvm.non-heap.init-usage-bytes 2.44 MB Jvm non-heap Init Usage Bytes
jvm.non-heap.max-usage-bytes -1.00 B Jvm non-heap Max Usage Bytes
jvm.non-heap.peak-committed-usage-bytes 0 Jvm non-heap Peak Committed Usage Bytes
jvm.non-heap.peak-current-usage-bytes 0 Jvm non-heap Peak Current Usage Bytes
jvm.non-heap.peak-init-usage-bytes 0 Jvm non-heap Peak Init Usage Bytes
jvm.non-heap.peak-max-usage-bytes 0 Jvm non-heap Peak Max Usage Bytes
Process: Limit=150.00 GB Total=23.83 GB Peak=58.75 GB
This is caused by the memory limit being exceeded ("Memory limit exceeded").
Change the settings memory.soft_limit_in_bytes, memory.limit_in_bytes, mem_limit and default_pool_mem_limit to 0 or -1.
-1 or 0 represents unlimited.

What does KB/sec mean?

We have got the following results for TOTAL:
Label: 10
Average: 1288
Median: 1278
90%: 1525
95%: 1525
99%: 1546
Min: 887
Max: 1546
Throughput: 6.406149903907751
KB/sec: 39.21264413837284
What does KB/sec mean? Please help me understand it.
According to the Glossary:
KB/s (Aggregate Report)
Throughput is measured in bytes and represents the amount of data that the virtual users received from the server. The throughput KPI is measured in kilobytes (KB) per second.
So basically it is the average amount of data received by JMeter from the application under test per second.
KB/sec is the speed of a connection.
KB means kilobytes and sec means per second.
Faster speeds are expressed in MB/sec (megabytes per second) and, faster still, in GB/sec (gigabytes per second).
1000 KB = 1 MB
1000 MB = 1 GB
Hope this helps :)
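To relate the two figures in the question, here is a minimal sketch that derives the average response size from Throughput and KB/sec. It assumes that the report counts 1 KB as 1024 bytes; that divisor is an assumption on my side (the answer above uses 1000), and the function name is made up for illustration.

```python
# Figures copied from the TOTAL row in the question.
THROUGHPUT_PER_SEC = 6.406149903907751  # samples (requests) per second
KB_PER_SEC = 39.21264413837284          # received data rate

BYTES_PER_KB = 1024  # assumption: the report uses 1 KB = 1024 bytes


def average_response_bytes(kb_per_sec, throughput_per_sec, bytes_per_kb=BYTES_PER_KB):
    """Average bytes received per sample, derived from the two rates."""
    return kb_per_sec * bytes_per_kb / throughput_per_sec


print(round(average_response_bytes(KB_PER_SEC, THROUGHPUT_PER_SEC)))
```

Under the 1024-byte assumption this comes out at about 6268 bytes per response, so KB/sec here is simply the request rate multiplied by the average response size.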

Calculating CPU Performance in MIPS

I was taking an exam earlier and I memorized the questions that I didn't know how to answer but somehow got correct. (The online exam, taken through an electronic classroom (eclass), was multiple choice. The exam was coded so that each of us was given random questions in random order, with the answers shuffled among the choices, so yeah.)
Anyway, back to my questions.
1.)
There is a CPU with a clock frequency of 1 GHz. When the instructions consist of two
types as shown in the table below, what is the performance in MIPS of the CPU?
               Execution time (clocks)   Frequency of appearance (%)
Instruction 1            10                          60
Instruction 2            15                          40
Answer: 125
2.)
There is a hard disk drive with specifications shown below. When a record of 15
Kbytes is processed, which of the following is the average access time in milliseconds?
Here, the record is stored in one track.
[Specifications]
Capacity: 25 Kbytes/track
Rotation speed: 2,400 revolutions/minute
Average seek time: 10 milliseconds
Answer: 37.5
3.)
Assume a magnetic disk has a rotational speed of 5,000 rpm, and an average seek time of 20 ms. The recording capacity of one track on this disk is 15,000 bytes. What is the average access time (in milliseconds) required in order to transfer one 4,000-byte block of data?
Answer: 29.2
4.)
When a color image is stored in video memory at a tonal resolution of 24 bits per pixel,
approximately how many megabytes (MB) are required to display the image on the
screen with a resolution of 1024 x 768 pixels? Here, 1 MB is 10^6 bytes.
Answer:18.9
5.)
When a microprocessor works at a clock speed of 200 MHz and the average CPI
(“cycles per instruction” or “clocks per instruction”) is 4, how long does it take to
execute one instruction on average?
Answer: 20 nanoseconds
I don't expect someone to answer everything. The answers are already given, but I want to know how they were arrived at. Knowing the answer is not enough for me. I've tried solving them myself by trial and error to arrive at those numbers, but that takes minutes to hours, so I need some professional help.
1.)
n = 1/f = 1 / 1 GHz = 1 ns per clock.
n*10 * 0.6 + n*15 * 0.4 = 12 ns (= average instruction time), and 1 / 12 ns = 83.3 MIPS.
2.)3.)
I don't get these, honestly.
4.)
Here, 1 MB is 10^6 bytes.
3 Bytes * 1024 * 768 = 2359296 Bytes = 2.36 MB
But often these 24 bits are packed into 32 bits b/c of the memory layout (word width), so often it will be 4 Bytes*1024*768 = 3145728 Bytes = 3.15 MB.
5)
CPI / f = 4 / 200 MHz = 20 ns.
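For anyone who wants to check the numbers, here is a minimal sketch of all five calculations. The function names are made up for illustration. For 2.) and 3.) it assumes the usual textbook model, average access time = seek time + half a rotation + transfer time, which is not stated in the questions but does reproduce the listed answers of 37.5 and 29.2. For 1.) it gives 83.3 with the table as posted, not the listed 125. For 4.) it gives 2.36 MB; the same image is about 18.9 million bits, which is where the listed 18.9 may come from.

```python
def mips(clock_hz, instruction_mix):
    """instruction_mix is a list of (clocks, fraction of appearance) pairs."""
    average_clocks = sum(clocks * fraction for clocks, fraction in instruction_mix)
    return clock_hz / average_clocks / 1e6


def disk_access_ms(rpm, seek_ms, track_bytes, record_bytes):
    """Assumed model: seek + average rotational latency + transfer time."""
    rotation_ms = 60_000 / rpm                 # time for one full revolution
    rotational_latency_ms = rotation_ms / 2    # on average, half a revolution
    transfer_ms = rotation_ms * record_bytes / track_bytes
    return seek_ms + rotational_latency_ms + transfer_ms


def frame_megabytes(width, height, bits_per_pixel):
    """Size of one frame, with 1 MB = 10^6 bytes."""
    return width * height * bits_per_pixel / 8 / 1e6


def instruction_time_ns(clock_hz, cpi):
    """Average time per instruction in nanoseconds."""
    return cpi / clock_hz * 1e9


print(round(mips(1e9, [(10, 0.6), (15, 0.4)]), 1))          # 1.) 83.3
print(round(disk_access_ms(2400, 10, 25_000, 15_000), 1))   # 2.) 37.5
print(round(disk_access_ms(5000, 20, 15_000, 4_000), 1))    # 3.) 29.2
print(round(frame_megabytes(1024, 768, 24), 2))             # 4.) 2.36
print(round(instruction_time_ns(200e6, 4), 1))              # 5.) 20.0
```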
