Getting a number of context switches for a process / thread - parallel-processing

Out of curiosity, I want to know how many times my program was context-switched by the OS: all the registers were saved and control was passed to another process or thread, and then after some time everything was restored and execution continued as if nothing had happened.
Does the system maintain such a counter somewhere, or is there some sort of hack to get it?
I am on Linux in particular, but I am interested in other systems as well.

Well, let's examine the case. Linux keeps these counters for every process, and Python makes it convenient both to inspect them ad hoc and to build a small monitor that reports anomalies (the former suits the out-of-curiosity case, the latter is handy for re-use in systematic work):
A "Monitor" example for both { voluntary | involuntary } context switches:
Python serves here both for its educational value and for the ease of extending the monitor further.
Once signal.signal( signal.SIGALRM, SIG_ALRM_handler_A ) has been registered and the interval timer set, the process reports both voluntary and involuntary (forced) context switches on every tick. To demonstrate it, a "FAT" blocking computation was run that spends its time in C code rather than in GIL-stepped Python bytecode; while it runs, the voluntary counter stands still and only the involuntary counter keeps growing, as the log embedded in the handler's comments shows:
len(str([np.math.factorial(2**f) for f in range(20)][-1]))
Passing any other PID to psutil.Process() lets the same trivial monitoring mechanism serve other purposes:
########################################################################
### SIGALRM_handler_
###
import psutil, resource, os, time, signal                   # signal is needed for the timer set-up shown at the end
SIG_ALRM_last_ctx_switch_VOLUNTARY = -1
SIG_ALRM_last_ctx_switch_FORCED = -1
def SIG_ALRM_handler_A( aSigNUM, aFrame ):                  # SIG_ALRM fired evenly even during [ np.math.factorial( 2**f ) for f in range( 20 ) ] C-based processing
    # onEntry_ROTATE_SigHandlers() -- MAY set another sub-sampled SIG_ALRM_handler_B() ... { last: 0, 0: handler_A, 1: handler_B, 2: handler_C }
    #
    # onEntry_SEQ of calls of regular, hierarchically timed MONITORS ( just the SNAPSHOT-DATA ACQUISITION Code-SPRINTs, handle later due to possible TimeDOMAIN overlaps )
    #
    # print( time.ctime() )
    # print( formatExtMemoryUsed( getExtMemoryUsed() ) )
    # print( 60 * "=", psutil.Process( os.getpid() ).num_ctx_switches(), "~~~", aProcess.cpu_percent( interval = 0 ) )
    # ??? # WHY CPU 0.0% -- a fresh Process object has no previous sample, so its first non-blocking cpu_percent() call returns 0.0
    aProcess         = psutil.Process( os.getpid() )
    aProcessCpuPCT   = aProcess.cpu_percent( interval = 0 )     # EVENLY-TIME-STEPPED
    aCtxSwitchNUMs   = aProcess.num_ctx_switches()              # THIS PROCESS ( may inspect other per-incident later ... on anomaly )
    aVolCtxSwitchCNT = aCtxSwitchNUMs.voluntary
    aForcedSwitchCNT = aCtxSwitchNUMs.involuntary
    global SIG_ALRM_last_ctx_switch_VOLUNTARY
    global SIG_ALRM_last_ctx_switch_FORCED
    if ( SIG_ALRM_last_ctx_switch_VOLUNTARY != -1 ):            # .INIT VALUE ALREADY REPLACED, i.e. not the first tick
        #----------
        # .ON_TICK: must process delta(s)
        if ( SIG_ALRM_last_ctx_switch_VOLUNTARY == aVolCtxSwitchCNT ):
            #
            # AN INDIRECT INDICATION OF A LONG-RUNNING WORKLOAD OUTSIDE GIL-STEPPING ( regex / C-lib / FORTRAN / numpy-block et al )
            #                                                                  |||||                  vvv
            # SIG_: Wed Oct 19 12:24:32 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=315) ~~~ 0.0
            # SIG_: Wed Oct 19 12:24:37 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=323) ~~~ 0.0
            # SIG_: Wed Oct 19 12:24:42 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=331) ~~~ 0.0
            # SIG_: Wed Oct 19 12:24:47 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=338) ~~~ 0.0
            # SIG_: Wed Oct 19 12:24:52 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=346) ~~~ 0.0
            # SIG_: Wed Oct 19 12:24:57 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=353) ~~~ 0.0
            # ...                                                              |||||                  ^^^
            # ...000000000000000000000000000000000000000000000000]             ( tail of the REPL output once the blocking computation returned )
            # >>>                                                              |||||                  |||
            #                                                                  vvvvv                  |||
            # SIG_: Wed Oct 19 12:26:17 2016 ------------------------------ pctxsw(voluntary=49983, involuntary=502) ~~~ 0.0
            # SIG_: Wed Oct 19 12:26:22 2016 ------------------------------ pctxsw(voluntary=49984, involuntary=502) ~~~ 0.0
            # SIG_: Wed Oct 19 12:26:27 2016 ------------------------------ pctxsw(voluntary=49985, involuntary=502) ~~~ 0.0
            # SIG_: Wed Oct 19 12:26:32 2016 ------------------------------ pctxsw(voluntary=49986, involuntary=502) ~~~ 0.0
            # SIG_: Wed Oct 19 12:26:37 2016 ------------------------------ pctxsw(voluntary=49987, involuntary=502) ~~~ 0.0
            # SIG_: Wed Oct 19 12:26:42 2016 ------------------------------ pctxsw(voluntary=49988, involuntary=502) ~~~ 0.0
            #rint( "SIG_ALRM_handler_A(): A SUSPECT CPU-LOAD:: ", time.ctime(), 10 * "-", aProcess.num_ctx_switches(), "{0: > 8.2f} CPU_CORE_LOAD [%]".format( aProcessCpuPCT ), " INSPECT processes ... ev. add a Stateful-self-Introspection" )
            print( "SIG_ALRM_handler_A(): A SUSPECT CPU-LOAD:: ", time.ctime(), 10 * "-", aProcess.num_ctx_switches(), "{0:_>60s}".format( str( aProcess.threads() ) ), " INSPECT processes ... ev. add a Stateful-self-Introspection" )
            #rint( "SIG_ALRM_handler_A(): A SUSPECT CPU-LOAD:: ", str( resource.getrusage( resource.RUSAGE_SELF ) )[22:] )
    else:
        #----------
        # .ON_INIT: may report .INIT()
        #rint( "SIG_ALRM_handler_A(): A SUSPECT CPU-LOAD:: ", time.ctime(), ...
        print( "SIG_ALRM_handler_A(): activated ", time.ctime(), 30 * "-", aProcess.num_ctx_switches() )
    ##########
    # FINALLY:
    SIG_ALRM_last_ctx_switch_VOLUNTARY = aVolCtxSwitchCNT       # .STO ACTUALs
    SIG_ALRM_last_ctx_switch_FORCED    = aForcedSwitchCNT       # .STO ACTUALs
    #rint( "SIG_: ", time.ctime(), 30 * "-", aProcess.num_ctx_switches(), " ~~~ ", aProcess.cpu_percent( interval = 0 ), " % -?- ", aProcess.threads() )
#____________________________________________________________________
# SIG_ALRM_handler_A( aSigNUM, aFrame ): DEFINED
#####################################################################
##########
# FINALLY:
#
# > signal.signal( signal.SIGALRM, SIG_ALRM_handler_A ) # .ASSOC { SIGALRM: thisHandler }
# > signal.setitimer( signal.ITIMER_REAL, 10, 5 ) # .SET #5 [sec] interval, after first run, starting after 10[sec] initial-delay
# > signal.setitimer( signal.ITIMER_REAL, 0, 5 ) # .UNSET
# > SIG_ALRM_last_ctx_switch_VOLUNTARY = -1 # .RESET .INIT() the global { signalling | state }-variable
# > len(str([np.math.factorial(2**f) for f in range(20)][-1])) # .RUN A "FAT"-BLOCKING CHUNK OF A regex/numpy/C/FORTRAN-calculus
Also the Thread-level CtxSwitch details
While this was not elaborated to a similar depth, the same as above applies at the thread level:
>>> psutil.Process( 18263 ).cpu_percent()
0.0
>>> psutil.Process( 18263 ).ppid()
18054
>>> psutil.Process( 18054 ).cpu_percent()
0.0
=== ( 18054 ).threads(): [ 17679, 17680, 17681, 18054, 18265, 18266, 18267, ]
In the samples below the voluntary count stays at 4 for each ID while the involuntary count keeps growing:
>>> [ psutil.Process( p ).num_ctx_switches() for p in ( 18259, 18260, 18261 ) ]
[pctxsw(voluntary=4, involuntary=267), pctxsw(voluntary=4, involuntary=1909), pctxsw(voluntary=4, involuntary=444)]
>>> [ psutil.Process( p ).num_ctx_switches() for p in ( 18259, 18260, 18261 ) ]
[pctxsw(voluntary=4, involuntary=273), pctxsw(voluntary=4, involuntary=1915), pctxsw(voluntary=4, involuntary=445)]
>>> [ psutil.Process( p ).num_ctx_switches() for p in ( 18259, 18260, 18261 ) ]
[pctxsw(voluntary=4, involuntary=275), pctxsw(voluntary=4, involuntary=1917), pctxsw(voluntary=4, involuntary=445)]
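On Linux the same two counters can also be read without psutil, from the voluntary_ctxt_switches and nonvoluntary_ctxt_switches fields of /proc/<pid>/status. Below is a minimal sketch in Ruby, under the assumption of a Linux /proc filesystem; the helper names parse_ctx_switches and ctx_switches are hypothetical and not part of any library. As far as I know, the per-thread values live in the files of the same format under /proc/<pid>/task/<tid>/status, so the parser can be reused for those.

```ruby
# Sketch: read the context-switch counters from a Linux /proc status file.
# parse_ctx_switches and ctx_switches are hypothetical helper names.

# Extract the two counters from the text of a /proc/<pid>/status file.
def parse_ctx_switches(status_text)
  counters = {}
  status_text.each_line do |line|
    key, value = line.split(":", 2)
    next if value.nil?                      # skip lines without "key: value"
    case key.strip
    when "voluntary_ctxt_switches"
      counters[:voluntary] = value.strip.to_i
    when "nonvoluntary_ctxt_switches"
      counters[:involuntary] = value.strip.to_i
    end
  end
  counters
end

# Counters as reported in the status file of the given pid (default: this process).
def ctx_switches(pid = Process.pid)
  parse_ctx_switches(File.read("/proc/#{pid}/status"))
end

# Only attempt the read where /proc is available (assumption: Linux).
puts ctx_switches.inspect if File.readable?("/proc/self/status")
```

Sampling ctx_switches twice and subtracting gives the same kind of delta the SIGALRM handler above computes.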

Related

console output of the current calendar month in Ruby

I need to print the calendar of the current month to the console in Ruby. The result should look similar to ncal on UNIX-like systems. I found a solution for C++ but can't adapt it to Ruby. So far, I have only realized that I need nested loops to cover the height and width. Which direction should I move in?
require 'date'
days = %w[Mon Tue Wed Thu Fri Sat Sun]
puts " #{Date::MONTHNAMES[Date.today.month]} #{Date.today.year}"
i = 0
start_month = (Date.today - Date.today.mday + 1).strftime("%a")
while i < days.size
  print days[i]
  j = 1
  while j <= 31
    if days[i] == start_month
      print " #{j}"
    end
    j += 7
  end
  i += 1
  puts
end
I'll take your solution so far, and try to give some specific pointers for how to progress with it - but of course, there are many different ways to approach this problem in general, so this is by no means the only approach!
The first critical issue (as you're aware!) is that you're only printing things for the row starting on the 1st of the month, due to this line:
if days[i] == start_month
Sticking with the current overall design, we know we'll need to print something for every line, so clearly a conditional like this isn't going to work. Let's try removing it.
Firstly, it will be more convenient to know which day of the week the month started on as a number, not a string, so we can easily calculate offsets against another day. Let's do that with:
# e.g. for 1st July 2021 this was a Thursday, so we get `4`.
start_of_month_weekday = (Date.today - Date.today.mday + 1).cwday
Next (and this is the crucial step!), we can use the above information to find out "which day of the month is it, on this day of the week?"
Here is a first version of that calculation, incorporated into your solution so far:
require 'date'
days = %w[Mon Tue Wed Thu Fri Sat Sun]
puts " #{Date::MONTHNAMES[Date.today.month]} #{Date.today.year}"
i = 0
# e.g. for 1st July 2021 this was a Thursday, so we get `4`.
start_of_month_weekday = (Date.today - Date.today.mday + 1).cwday
while i < days.size
  print days[i]
  day_of_month = i - start_of_month_weekday + 2 # !!!
  while day_of_month <= 31
    print " #{day_of_month}"
    day_of_month += 7
  end
  i += 1
  puts
end
This outputs:
July 2021
Mon -2 5 12 19 26
Tue -1 6 13 20 27
Wed 0 7 14 21 28
Thu 1 8 15 22 29
Fri 2 9 16 23 30
Sat 3 10 17 24 31
Sun 4 11 18 25
Not bad! Now we're getting somewhere!
I'll leave you to figure out the rest 😉 .... But here are some clues, for what I'd tackle next:
This code, print " #{day_of_month}", needs to print a "blank space" if the day number is less than 1. This could be done with a simple if statement.
Similarly, since you want this calendar to line up neatly in a grid, you need this code to always print something two characters wide. sprintf is your friend here! Check out the "Examples of width", about halfway down the page.
You've hardcoded 31 for the number of days in the month. This should be fixed, of course. (Use the Date library!)
It's funny how you used strftime("%a") in one place, yet constructed the calendar title awkwardly in the line above! 😄 Take a look at the documentation for formatting dates; it's extremely flexible. I think you can use: Date.today.strftime("%B %Y").
If you'd like to add some colour (or background colour?) to the current day of the month, consider doing something like this, or use a library to assist.
Using while loops works OK, but is quite un-rubyish. In 99% of cases, Ruby has even better tools for the job; it's a very expressive language - iterators are king! (I'm guessing you first learned another language, before Ruby? Seeing while loops, and/or for loops, is a dead giveaway that you're more familiar with a different language.) Instead of the outer while loop (while i < days.size), you could use days.each_with_index. And instead of the inner while loop (while j <= 31), you could use day_of_month.step(31, 7) (how cool is that!!).
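To make the clues above concrete, here is one possible way they could fit together; it is only a sketch, not the only solution. The method names ncal_rows and ncal are made up for this illustration. It uses each_with_index and Integer#step in place of the two while loops, format("%2d", ...) for the fixed-width cells, and Date.new(year, month, -1) for the real month length.

```ruby
require 'date'

# Rows of an ncal-style calendar: one row per weekday, one column per week.
# ncal_rows / ncal are hypothetical names used only in this sketch.
def ncal_rows(year, month)
  first    = Date.new(year, month, 1)
  last_day = Date.new(year, month, -1).day       # number of days in the month
  days     = %w[Mon Tue Wed Thu Fri Sat Sun]

  days.each_with_index.map do |name, i|
    # Day of the month shown in the first column of this row;
    # zero or negative means that cell belongs to the previous month.
    start = i - first.cwday + 2
    cells = start.step(last_day, 7).map do |d|
      d < 1 ? "  " : format("%2d", d)            # blank or two-character day number
    end
    ([name] + cells).join(" ")
  end
end

def ncal(year, month)
  title = Date.new(year, month, 1).strftime("%B %Y")
  (["    #{title}"] + ncal_rows(year, month)).join("\n")
end

puts ncal(Date.today.year, Date.today.month)
```

Returning the rows as strings rather than printing inside the loop keeps the layout logic easy to check for a fixed month such as July 2021.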
This is one way:
Construct a one-dimensional array, beginning with the daynames (Mon Tue ...).
Figure out a way to determine with how many "blanks" the month starts (these are days from the previous month. wday might help). Attach that amount of empty strings to the array.
Determine how many days the month has (hint: Date.new(2021,7,-1)), and attach all these day numbers to the array.
Attach empty strings to the array until the size of the array is divisible by 7 (or better, calculate). Skip this if you're skipping the last bullet.
Convert all elements of this array to right-adjusted strings of size 3 or some-such.
Use each_slice(7) to slice the array into weeks.
If desired, transpose this array of week-slices to mimic the ncal output.
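The steps above can be sketched roughly as follows. This is an illustration under my own assumptions rather than the answerer's code: month_grid is a hypothetical name, and cwday (Monday = 1) is used instead of wday so that a month starting on a Sunday gets six leading blanks.

```ruby
require 'date'

# Build the calendar as an array of week-slices; month_grid is a hypothetical name.
def month_grid(year, month)
  first         = Date.new(year, month, 1)
  days_in_month = Date.new(year, month, -1).day

  cells  = %w[Mon Tue Wed Thu Fri Sat Sun]
  cells += [""] * (first.cwday - 1)            # blanks for the previous month's days
  cells += (1..days_in_month).to_a             # the day numbers
  cells += [""] * ((-cells.size) % 7)          # pad the last week to 7 cells

  padded = cells.map { |c| c.to_s.rjust(3) }   # right-adjust every cell to width 3
  padded.each_slice(7).to_a                    # one sub-array per week
end

weeks = month_grid(2021, 7)
weeks.each { |week| puts week.join }           # one row per week
puts
weeks.transpose.each { |row| puts row.join }   # ncal-style: one row per weekday
```

Padding the last week matters here because transpose raises an error unless every slice has the same length.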
Thank you for your help, literally 10 hours and I figured it out thanks to you. I apologize once again for the initially incorrectly posed question.
With the help of hints, I assembled such a solution.
require 'date'
days = %w[Mon Tue Wed Thu Fri Sat Sun]
p days
# cwday is 1 (Monday) .. 7 (Sunday), so this also works for a month starting on a Sunday
blanks = Date.new(2021, 7, 1).cwday - 1
blanks.times do
  days.push(' ')
end
days_in_month = Date.new(2021, 7, -1).day
day = 1
while day <= days_in_month
  days.push(day)
  day += 1
end
# pad the last week until it is complete
until (days.size % 7) == 0
  days.push(' ')
end
new_arr = days.each_slice(7).to_a
puts "Array of days: #{new_arr}"
for i in 0...7
  for j in 0...new_arr.size
    print " #{new_arr[j][i]}"
  end
  puts
end
require 'date'
# init
DAYS_ORDER = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
today = Date.today
month = today.month
year = today.year
first_day = Date.new(year, month, 1)
last_day = Date.new(year, month, -1)
hash_days = {}
# get all current months days and add to hash_days
first_day.upto(last_day) { |day| hash_days[day.day] = day.strftime('%a') }
# group by wday
grouped_hash = hash_days.group_by { |day| day.pop }.transform_values { |days| days.flatten }
# sort by wday from DAYS_ORDER
sorted_arr = grouped_hash.sort_by { |k, v| DAYS_ORDER.index(k) }
# rendering current month's calendar with mark current day
## title
print "\x1b[4m#{today.strftime("%B %Y")}\x1b[0m\n"
## calendar
indent = true
sorted_arr.each do |wday, days|
  print wday
  if days[0] != 1 && indent == true
    print " "
  else
    indent = false
  end
  days.each do |value|
    spaces = " " * (value > 9 ? 1 : 2)
    str_day = spaces + value.to_s
    current_day = "\x1b[1;31m#{str_day}\x1b[0m"
    print value == today.day ? current_day : str_day
  end
  puts
end

Redis HSET keys expire after few minutes

I am trying to connect to a remote redis server and set keys using HSET command like below
hset ABCD:1105 balance 1000
I am able to see the key using KEYS *
But after approximately 1 minute, KEYS * returns (empty list or set), whereas TTL on the key returns -1.
This is the memory configuration in the redis server
1) "masterauth"
2) ""
3) "maxmemory"
4) "0"
5) "maxmemory-samples"
6) "5"
7) "maxclients"
8) "10000"
9) "min-slaves-to-write"
10) "0"
11) "min-replicas-to-write"
12) "0"
13) "min-slaves-max-lag"
14) "10"
15) "min-replicas-max-lag"
16) "10"
17) "maxmemory-policy"
18) "noeviction"
The maxmemory-policy here is also noeviction, so why are the keys expiring?
Update: logs of the Redis server pod
> 1:C 09 Jan 2021 17:02:04.495 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
> 1:C 09 Jan 2021 17:02:04.495 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=1, just started
> 1:C 09 Jan 2021 17:02:04.495 # Configuration loaded
> 1:M 09 Jan 2021 17:02:04.496 * Running mode=standalone, port=6379.
> 1:M 09 Jan 2021 17:02:04.496 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
> 1:M 09 Jan 2021 17:02:04.496 # Server initialized
> 1:M 09 Jan 2021 17:02:04.496 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
> 1:M 09 Jan 2021 17:02:04.897 * DB loaded from append only file: 0.400 seconds
> 1:M 09 Jan 2021 17:02:04.897 * Ready to accept connections
Update 2: Memory Info
used_memory:999576
used_memory_human:976.15K
used_memory_rss:5341184
used_memory_rss_human:5.09M
used_memory_peak:1679456
used_memory_peak_human:1.60M
used_memory_peak_perc:59.52%
used_memory_overhead:958562
used_memory_startup:790256
used_memory_dataset:41014
used_memory_dataset_perc:19.59%
allocator_allocated:1104272
allocator_active:1642496
allocator_resident:5189632
total_system_memory:29508444160
total_system_memory_human:27.48G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.49
allocator_frag_bytes:538224
allocator_rss_ratio:3.16
allocator_rss_bytes:3547136
rss_overhead_ratio:1.03

Garbage collector in Ruby 2.2 provokes unexpected CoW

How do I prevent the GC from provoking copy-on-write, when I fork my process ? I have recently been analyzing the garbage collector's behavior in Ruby, due to some memory issues that I encountered in my program (I run out of memory on my 60core 0.5Tb machine even for fairly small tasks). For me this really limits the usefulness of ruby for running programs on multicore servers. I would like to present my experiments and results here.
The issue arises when the garbage collector runs during forking. I have investigated three cases that illustrate the issue.
Case 1: We allocate a lot of objects (strings no longer than 20 bytes) in the memory using an array. The strings are created using a random number and string formatting. When the process forks and we force the GC to run in the child, all the shared memory goes private, causing a duplication of the initial memory.
Case 2: We allocate a lot of objects (strings) in the memory using an array, but the string is created using the rand.to_s function, hence we remove the formatting of the data compared to the previous case. We end up with a smaller amount of memory being used, presumably due to less garbage. When the process forks and we force the GC to run in the child, only part of the memory goes private. We have a duplication of the initial memory, but to a smaller extent.
Case 3: We allocate fewer objects compared to before, but the objects are bigger, such that the amount of memory allocated stays the same as in the previous cases. When the process forks and we force the GC to run in the child all the memory stays shared, i.e. no memory duplication.
Here I paste the Ruby code that has been used for these experiments. To switch between cases you only need to change the “option” value in the memory_object function. The code was tested using Ruby 2.2.2, 2.2.1, 2.1.3, 2.1.5 and 1.9.3 on an Ubuntu 14.04 machine.
Sample output for case 1:
ruby version 2.2.2
proces pid log priv_dirty shared_dirty
Parent 3897 post alloc 38 0
Parent 3897 4 fork 0 37
Child 3937 4 initial 0 37
Child 3937 8 empty GC 35 5
The exact same code has been written in Python and in all cases the CoW works perfectly fine.
Sample output for case 1:
python version 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2]
proces pid log priv_dirty shared_dirty
Parent 4308 post alloc 35 0
Parent 4308 4 fork 0 35
Child 4309 4 initial 0 35
Child 4309 10 empty GC 1 34
Ruby code
$start_time=Time.new
# Monitor use of Resident and Virtual memory.
class Memory
  shared_dirty = '.+?Shared_Dirty:\s+(\d+)'
  priv_dirty = '.+?Private_Dirty:\s+(\d+)'
  MEM_REGEXP = /#{shared_dirty}#{priv_dirty}/m
  # get memory usage
  def self.get_memory_map( pids)
    memory_map = {}
    memory_map[ :pids_found] = {}
    memory_map[ :shared_dirty] = 0
    memory_map[ :priv_dirty] = 0
    pids.each do |pid|
      begin
        lines = nil
        lines = File.read( "/proc/#{pid}/smaps")
      rescue
        lines = nil
      end
      if lines
        lines.scan(MEM_REGEXP) do |shared_dirty, priv_dirty|
          memory_map[ :pids_found][pid] = true
          memory_map[ :shared_dirty] += shared_dirty.to_i
          memory_map[ :priv_dirty] += priv_dirty.to_i
        end
      end
    end
    memory_map[ :pids_found] = memory_map[ :pids_found].keys
    return memory_map
  end
  # get the processes and get the value of the memory usage
  def self.memory_usage( )
    pids = [ $$]
    result = self.get_memory_map( pids)
    result[ :pids] = pids
    return result
  end
  # print the values of the private and shared memories
  def self.log( process_name='', log_tag="")
    if process_name == "header"
      puts " %-6s %5s %-12s %10s %10s\n" % ["proces", "pid", "log", "priv_dirty", "shared_dirty"]
    else
      time = Time.new - $start_time
      mem = Memory.memory_usage( )
      puts " %-6s %5d %-12s %10d %10d\n" % [process_name, $$, log_tag, mem[:priv_dirty]/1000, mem[:shared_dirty]/1000]
    end
  end
end
# function to delay the processes a bit
def time_step( n)
  while Time.new - $start_time < n
    sleep( 0.01)
  end
end
# create an object of specified size. The option argument can be changed from 0 to 2 to visualize the behavior of the GC in various cases
#
# case 0 (default) : we make a huge array of small objects by formatting a string
# case 1 : we make a huge array of small objects without formatting a string (we use the to_s function)
# case 2 : we make a smaller array of big objects
def memory_object( size, option=1)
  result = []
  count = size/20
  if option > 3 or option < 1
    count.times do
      result << "%20.18f" % rand
    end
  elsif option == 1
    count.times do
      result << rand.to_s
    end
  elsif option == 2
    count = count/10
    count.times do
      result << ("%20.18f" % rand)*30
    end
  end
  return result
end
##### main #####
puts "ruby version #{RUBY_VERSION}"
GC.disable
# print the column headers and first line
Memory.log( "header")
# Allocation of memory
big_memory = memory_object( 1000 * 1000 * 10)
Memory.log( "Parent", "post alloc")
lab_time = Time.new - $start_time
if lab_time < 3.9
  lab_time = 0
end
# start the forking
pid = fork do
  time = 4
  time_step( time + lab_time)
  Memory.log( "Child", "#{time} initial")
  # force GC when nothing happened
  GC.enable; GC.start; GC.disable
  time = 8
  time_step( time + lab_time)
  Memory.log( "Child", "#{time} empty GC")
  sleep( 1)
  STDOUT.flush
  exit!
end
time = 4
time_step( time + lab_time)
Memory.log( "Parent", "#{time} fork")
# wait for the child to finish
Process.wait( pid)
Python code
import re
import time
import os
import random
import sys
import gc
start_time=time.time()
# Monitor use of Resident and Virtual memory.
class Memory:
    def __init__(self):
        self.shared_dirty = '.+?Shared_Dirty:\s+(\d+)'
        self.priv_dirty = '.+?Private_Dirty:\s+(\d+)'
        self.MEM_REGEXP = re.compile("{shared_dirty}{priv_dirty}".format(shared_dirty=self.shared_dirty, priv_dirty=self.priv_dirty), re.DOTALL)
    # get memory usage
    def get_memory_map(self, pids):
        memory_map = {}
        memory_map[ "pids_found" ] = {}
        memory_map[ "shared_dirty" ] = 0
        memory_map[ "priv_dirty" ] = 0
        for pid in pids:
            try:
                lines = None
                with open( "/proc/{pid}/smaps".format(pid=pid), "r" ) as infile:
                    lines = infile.read()
            except:
                lines = None
            if lines:
                for shared_dirty, priv_dirty in re.findall( self.MEM_REGEXP, lines ):
                    memory_map[ "pids_found" ][pid] = True
                    memory_map[ "shared_dirty" ] += int( shared_dirty )
                    memory_map[ "priv_dirty" ] += int( priv_dirty )
        memory_map[ "pids_found" ] = memory_map[ "pids_found" ].keys()
        return memory_map
    # get the processes and get the value of the memory usage
    def memory_usage( self):
        pids = [ os.getpid() ]
        result = self.get_memory_map( pids)
        result[ "pids" ] = pids
        return result
    # print the values of the private and shared memories
    def log( self, process_name='', log_tag=""):
        if process_name == "header":
            print " %-6s %5s %-12s %10s %10s" % ("proces", "pid", "log", "priv_dirty", "shared_dirty")
        else:
            global start_time
            Time = time.time() - start_time
            mem = self.memory_usage( )
            print " %-6s %5d %-12s %10d %10d" % (process_name, os.getpid(), log_tag, mem["priv_dirty"]/1000, mem["shared_dirty"]/1000)
# function to delay the processes a bit
def time_step( n):
    global start_time
    while (time.time() - start_time) < n:
        time.sleep( 0.01)
# create an object of specified size. The option argument can be changed from 0 to 2 to visualize the behavior of the GC in various cases
#
# case 0 (default) : we make a huge array of small objects by formatting a string
# case 1 : we make a huge array of small objects without formatting a string (we use the to_s function)
# case 2 : we make a smaller array of big objects
def memory_object( size, option=2):
    count = size/20
    if option > 3 or option < 1:
        result = [ "%20.18f"% random.random() for i in xrange(count) ]
    elif option == 1:
        result = [ str( random.random() ) for i in xrange(count) ]
    elif option == 2:
        count = count/10
        result = [ ("%20.18f"% random.random())*30 for i in xrange(count) ]
    return result
##### main #####
print "python version {version}".format(version=sys.version)
memory = Memory()
gc.disable()
# print the column headers and first line
memory.log( "header") # Print the headers of the columns
# Allocation of memory
big_memory = memory_object( 1000 * 1000 * 10) # Allocate memory
memory.log( "Parent", "post alloc")
lab_time = time.time() - start_time
if lab_time < 3.9:
    lab_time = 0
# start the forking
pid = os.fork() # fork the process
if pid == 0:
    Time = 4
    time_step( Time + lab_time)
    memory.log( "Child", "{time} initial".format(time=Time))
    # force GC when nothing happened
    gc.enable(); gc.collect(); gc.disable();
    Time = 10
    time_step( Time + lab_time)
    memory.log( "Child", "{time} empty GC".format(time=Time))
    time.sleep( 1)
    sys.exit(0)
Time = 4
time_step( Time + lab_time)
memory.log( "Parent", "{time} fork".format(time=Time))
# Wait for child process to finish
os.waitpid( pid, 0)
EDIT
Indeed, calling the GC several times before forking the process solves the issue and I am quite surprised. I have also run the code using Ruby 2.0.0 and the issue doesn't even appear, so it must be related to this generational GC just like you mentioned.
However, if I call the memory_object function without assigning the output to any variables (I am only creating garbage), then the memory is duplicated. The amount of memory that is copied depends on the amount of garbage that I create - the more garbage, the more memory becomes private.
Any ideas how I can prevent this ?
Here are some results
Running the GC in 2.0.0
ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 3664 post alloc 67 0
Parent 3664 4 fork 1 69
Child 3700 4 initial 1 69
Child 3700 8 empty GC 6 65
Calling memory_object( 1000*1000) in the child
ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 3703 post alloc 67 0
Parent 3703 4 fork 1 70
Child 3739 4 initial 1 70
Child 3739 8 empty GC 15 56
Calling memory_object( 1000*1000*10)
ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 3743 post alloc 67 0
Parent 3743 4 fork 1 69
Child 3779 4 initial 1 69
Child 3779 8 empty GC 89 5
UPD2
I figured out why all the memory goes private when you format the string: formatting generates garbage while the GC is disabled; when the GC is then enabled, the released objects leave holes scattered through your generated data. After the fork, new garbage starts to occupy these holes, and the more garbage there is, the more pages become private.
So I added a cleanup function that runs the GC every 2000 iterations (just enabling lazy GC didn't help):
count.times do |i|
  cleanup(i)
  result << "%20.18f" % rand
end
#......snip........#
def cleanup(i)
  if ((i%2000).zero?)
    GC.enable; GC.start; GC.disable
  end
end
##### main #####
##### main #####
This resulted in (with memory_object( 1000 * 1000 * 10) generated after the fork):
RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 0
ruby version 2.2.0
proces pid log priv_dirty shared_dirty
Parent 2501 post alloc 35 0
Parent 2501 4 fork 0 35
Child 2503 4 initial 0 35
Child 2503 8 empty GC 28 22
Yes, it affects performance, but only before forking, i.e. it increases the load time in your case.
UPD1
I just found the criterion by which Ruby 2.2 sets the old-object bits: an object has to survive 3 GCs. So if you add the following before forking:
GC.enable; 3.times {GC.start}; GC.disable
# start the forking
you will get (with option 1 on the command line):
$ RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 1
ruby version 2.2.0
proces pid log priv_dirty shared_dirty
Parent 2368 post alloc 31 0
Parent 2368 4 fork 1 34
Child 2370 4 initial 1 34
Child 2370 8 empty GC 2 32
But this needs further testing of how such objects behave in later GCs; at least after 100 GCs :old_objects remains constant, so I suppose it should be OK.
Log with GC.stat is here
By the way, there is also the RGENGC_OLD_NEWOBJ_CHECK option to create old objects from the beginning. I doubt it's a good idea in general, but it may be useful in a particular case.
First answer
My proposition in the comment above was wrong, actually bitmap tables are the savior.
(option = 1)
ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 14807 post alloc 27 0
Parent 14807 4 fork 0 27
Child 14809 4 initial 0 27
Child 14809 8 empty GC 6 25 # << almost everything stays shared <<
I also had Ruby Enterprise Edition at hand and tested it; it is only about half as bad as the worst cases.
ruby version 1.8.7
proces pid log priv_dirty shared_dirty
Parent 15064 post alloc 86 0
Parent 15064 4 fork 2 84
Child 15065 4 initial 2 84
Child 15065 8 empty GC 40 46
(I made the script run strictly 1 GC, by increasing RUBY_GC_HEAP_INIT_SLOTS to 600k)

Rails 3.2.8 - How do I get the week number from Rails?

I would like to know how to get the current week number from Rails and how do I manipulate it:
Translate the week number into date.
Make an interval based on week number.
Thanks.
Use strftime:
%U - Week number of the year. The week starts with Sunday. (00..53)
%W - Week number of the year. The week starts with Monday. (00..53)
Time.now.strftime("%U").to_i # 43
# Or...
Date.today.strftime("%U").to_i # 43
If you want to add 43 weeks (or days,years,minutes, etc...) to a date, you can use 43.weeks, provided by ActiveSupport:
irb(main):001:0> 43.weeks
=> 301 days
irb(main):002:0> Date.today + 43.weeks
=> Thu, 22 Aug 2013
irb(main):003:0> Date.today + 10.days
=> Sun, 04 Nov 2012
irb(main):004:0> Date.today + 1.years # or 1.year
=> Fri, 25 Oct 2013
irb(main):005:0> Date.today + 5.months
=> Mon, 25 Mar 2013
You are going to want to stay away from strftime("%U") and "%W".
Instead, use Date#cweek.
The problem is, if you ever want to take a week number and convert it back to a date, strftime won't give you a value that you can pass to Date.commercial.
Date.commercial expects week numbers that are 1-based.
strftime("%U") and strftime("%W") return values that are 0-based. You would think you could just +1 it and it would be fine. The problem will hit you at the end of a year when there are 53 weeks. (Like what just happened...)
For example, let's look at the end of Dec 2015 and the results from your two options for getting a week number:
Date.parse("2015-12-31").strftime("%W") # => "52"
Date.parse("2015-12-31").cweek # => 53
Now, let's look at converting that week number to a date...
Date.commercial(2015, 52, 1) # => Mon, 21 Dec 2015
Date.commercial(2015, 53, 1) # => Mon, 28 Dec 2015
If you blindly just +1 the value you pass to Date.commercial, you'll end up with an invalid date in other situations:
For example, December 2014:
Date.commercial(2014, 53, 1) # => ArgumentError: invalid date
If you ever have to convert that week number back to a date, the only surefire way is to use Date#cweek.
date.commercial([cwyear=-4712[, cweek=1[, cwday=1[, start=Date::ITALY]]]]) → date
Creates a date object denoting the given week date.
The week and the day of week should be a negative
or a positive number (as a relative week/day from the end of year/week when negative).
They should not be zero.
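A minimal sketch of that round trip, using only Date#cwyear, Date#cweek and Date.commercial from the standard date library. The helper names iso_week and week_range are made up for this illustration. Keeping cwyear together with cweek is my own addition: around New Year the ISO week-year can differ from the calendar year, so passing date.year to Date.commercial could select the wrong week.

```ruby
require 'date'

# iso_week and week_range are hypothetical helper names for this sketch.

# ISO-8601 [week-year, week number] for a date.
def iso_week(date)
  [date.cwyear, date.cweek]
end

# Monday..Sunday range covered by the given ISO week.
def week_range(cwyear, cweek)
  Date.commercial(cwyear, cweek, 1)..Date.commercial(cwyear, cweek, 7)
end

year, week = iso_week(Date.new(2015, 12, 31))
range = week_range(year, week)
puts "#{year}-W#{week}: #{range.first} .. #{range.last}"
# prints "2015-W53: 2015-12-28 .. 2016-01-03"
```

Note that 1 Jan 2016 also maps to week 53 of 2015 here, which is why the pair, not the bare week number, is what converts back safely.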
For the interval
require 'date'
def week_dates( week_num )
year = Time.now.year
week_start = Date.commercial( year, week_num, 1 )
week_end = Date.commercial( year, week_num, 7 )
week_start.strftime( "%m/%d/%y" ) + ' - ' + week_end.strftime("%m/%d/%y" )
end
puts week_dates(22)
E.g. Input (Week Number): 22
Output (when run in 2008): 05/26/08 - 06/01/08
credit: Siep Korteling http://www.ruby-forum.com/topic/125140
Date#cweek seems to get the ISO-8601 week number (a Monday-based week) like %V in strftime (mentioned by @Robban in a comment).
For example, the Monday and the Sunday of the week I'm writing this:
[ Date.new(2015, 7, 13), Date.new(2015, 7, 19) ].map { |date|
date.strftime("U: %U - W: %W - V: %V - cweek: #{date.cweek}")
}
# => ["U: 28 - W: 28 - V: 29 - cweek: 29", "U: 29 - W: 28 - V: 29 - cweek: 29"]

Lighttpd slow downloads

I have a dedicated server with a dedicated 1 Gbps port, 4 GB of RAM and 4 CPUs. I serve static files for download (300 MB to 900 MB each). I have been testing Apache, Nginx and Lighttpd.
Apache spawns too many threads, and after 200 connections the load goes very high, so Apache is a no-go.
With Nginx the load goes very high after 100 connections, so it is a no-go as well.
Lighttpd has been very good so far, as it is a single-threaded server. With 500 concurrent connections the load stays at 0.90 - 1.10 (very good), but I'm facing a download speed problem: downloads are slow even though I have a dedicated 1 Gbps port. In iptraf, with 500 concurrent connections, throughput never exceeds about 250000 kbit/s. With Apache and Nginx the server's upstream sometimes reached 700000 kbit/s. I switched between sendfile and writev in the config, with the same result.
I'm not using PHP or FastCGI, just direct downloads of the file itself, for example: http://www.myserver.com/file.zip and it downloads the file.
I'm attaching some info below to help figure it out.
Kernel 2.6
lighttpd.conf
# lighttpd configuration file
#
# use it as a base for lighttpd 1.0.0 and above
#
# $Id: lighttpd.conf,v 1.7 2004/11/03 22:26:05 weigon Exp $
############ Options you really have to take care of ####################
## modules to load
# at least mod_access and mod_accesslog should be loaded
# all other module should only be loaded if really neccesary
# - saves some time
# - saves memory
server.modules = (
# "mod_rewrite",
# "mod_redirect",
# "mod_alias",
"mod_access",
# "mod_cml",
# "mod_trigger_b4_dl",
# "mod_auth",
# "mod_status",
# "mod_setenv",
# "mod_proxy_core",
# "mod_proxy_backend_http",
# "mod_proxy_backend_fastcgi",
# "mod_proxy_backend_scgi",
# "mod_proxy_backend_ajp13",
# "mod_simple_vhost",
# "mod_evhost",
# "mod_userdir",
# "mod_cgi",
# "mod_compress",
# "mod_ssi",
# "mod_usertrack",
# "mod_expire",
# "mod_secdownload",
# "mod_rrdtool",
"mod_accesslog" )
## a static document-root, for virtual-hosting take look at the
## server.virtual-* options
server.document-root = "/usr/share/nginx/html/"
## where to send error-messages to
server.errorlog = "/www/logs/lighttpd.error.log"
# files to check for if .../ is requested
index-file.names = ( "index.php", "index.html",
"index.htm", "default.htm" )
## set the event-handler (read the performance section in the manual)
# server.event-handler = "freebsd-kqueue" # needed on OS X
server.event-handler = "linux-sysepoll"
#server.network-backend = "linux-sendfile"
server.network-backend = "writev"
# mimetype mapping
mimetype.assign = (
".pdf" => "application/pdf",
".sig" => "application/pgp-signature",
".spl" => "application/futuresplash",
".class" => "application/octet-stream",
".ps" => "application/postscript",
".torrent" => "application/x-bittorrent",
".dvi" => "application/x-dvi",
".gz" => "application/x-gzip",
".pac" => "application/x-ns-proxy-autoconfig",
".swf" => "application/x-shockwave-flash",
".tar.gz" => "application/x-tgz",
".tgz" => "application/x-tgz",
".tar" => "application/x-tar",
".zip" => "application/zip",
".mp3" => "audio/mpeg",
".m3u" => "audio/x-mpegurl",
".wma" => "audio/x-ms-wma",
".wax" => "audio/x-ms-wax",
".ogg" => "application/ogg",
".wav" => "audio/x-wav",
".gif" => "image/gif",
".jpg" => "image/jpeg",
".jpeg" => "image/jpeg",
".png" => "image/png",
".xbm" => "image/x-xbitmap",
".xpm" => "image/x-xpixmap",
".xwd" => "image/x-xwindowdump",
".css" => "text/css",
".html" => "text/html",
".htm" => "text/html",
".js" => "text/javascript",
".asc" => "text/plain",
".c" => "text/plain",
".cpp" => "text/plain",
".log" => "text/plain",
".conf" => "text/plain",
".text" => "text/plain",
".txt" => "text/plain",
".dtd" => "text/xml",
".xml" => "text/xml",
".mpeg" => "video/mpeg",
".mpg" => "video/mpeg",
".mov" => "video/quicktime",
".qt" => "video/quicktime",
".avi" => "video/x-msvideo",
".asf" => "video/x-ms-asf",
".asx" => "video/x-ms-asf",
".wmv" => "video/x-ms-wmv",
".bz2" => "application/x-bzip",
".tbz" => "application/x-bzip-compressed-tar",
".tar.bz2" => "application/x-bzip-compressed-tar"
)
# Use the "Content-Type" extended attribute to obtain mime type if possible
#mimetype.use-xattr = "enable"
## send a different Server: header
## be nice and keep it at lighttpd
# server.tag = "lighttpd"
#### accesslog module
accesslog.filename = "/www/logs/access.log"
## deny access the file-extensions
#
# ~ is for backupfiles from vi, emacs, joe, ...
# .inc is often used for code includes which should in general not be part
# of the document-root
url.access-deny = ( "~", ".inc" )
$HTTP["url"] =~ "\.pdf$" {
server.range-requests = "disable"
}
##
# which extensions should not be handle via static-file transfer
#
# .php, .pl, .fcgi are most often handled by mod_fastcgi or mod_cgi
static-file.exclude-extensions = ( ".php", ".pl", ".fcgi" )
######### Options that are good to be but not neccesary to be changed #######
## bind to port (default: 80)
#server.port = 81
## bind to localhost (default: all interfaces)
#server.bind = "grisu.home.kneschke.de"
## error-handler for status 404
#server.error-handler-404 = "/error-handler.html"
#server.error-handler-404 = "/error-handler.php"
## to help the rc.scripts
#server.pid-file = "/var/run/lighttpd.pid"
###### virtual hosts
##
## If you want name-based virtual hosting add the next three settings and load
## mod_simple_vhost
##
## document-root =
## virtual-server-root + virtual-server-default-host + virtual-server-docroot
## or
## virtual-server-root + http-host + virtual-server-docroot
##
#simple-vhost.server-root = "/home/weigon/wwwroot/servers/"
#simple-vhost.default-host = "grisu.home.kneschke.de"
#simple-vhost.document-root = "/pages/"
##
## Format: <errorfile-prefix><status-code>.html
## -> ..../status-404.html for 'File not found'
#server.errorfile-prefix = "/home/weigon/projects/lighttpd/doc/status-"
## virtual directory listings
#dir-listing.activate = "enable"
## enable debugging
#debug.log-request-header = "enable"
#debug.log-response-header = "enable"
#debug.log-request-handling = "enable"
#debug.log-file-not-found = "enable"
#debug.log-condition-handling = "enable"
### only root can use these options
#
# chroot() to directory (default: no chroot() )
#server.chroot = "/"
## change uid to <uid> (default: don't care)
#server.username = "wwwrun"
## change uid to <uid> (default: don't care)
#server.groupname = "wwwrun"
#### compress module
#compress.cache-dir = "/tmp/lighttpd/cache/compress/"
#compress.filetype = ("text/plain", "text/html")
#### proxy module
## read proxy.txt for more info
#$HTTP["url"] =~ "\.php$" {
# proxy-core.balancer = "round-robin"
# proxy-core.allow-x-sendfile = "enable"
# proxy-core.protocol = "http"
# proxy-core.backends = ( "192.168.0.101:80" )
# proxy-core.max-pool-size = 16
#}
#### fastcgi module
## read fastcgi.txt for more info
## for PHP don't forget to set cgi.fix_pathinfo = 1 in the php.ini
#$HTTP["url"] =~ "\.php$" {
# proxy-core.balancer = "round-robin"
# proxy-core.allow-x-sendfile = "enable"
# proxy-core.check-local = "enable"
# proxy-core.protocol = "fastcgi"
# proxy-core.backends = ( "unix:/tmp/php-fastcgi.sock" )
# proxy-core.max-pool-size = 16
#}
#### CGI module
#cgi.assign = ( ".pl" => "/usr/bin/perl",
# ".cgi" => "/usr/bin/perl" )
#
#### SSL engine
#ssl.engine = "enable"
#ssl.pemfile = "server.pem"
#### status module
#status.status-url = "/server-status"
#status.config-url = "/server-config"
#### auth module
## read authentication.txt for more info
#auth.backend = "plain"
#auth.backend.plain.userfile = "lighttpd.user"
#auth.backend.plain.groupfile = "lighttpd.group"
#auth.backend.ldap.hostname = "localhost"
#auth.backend.ldap.base-dn = "dc=my-domain,dc=com"
#auth.backend.ldap.filter = "(uid=$)"
#auth.require = ( "/server-status" =>
# (
# "method" => "digest",
# "realm" => "download archiv",
# "require" => "user=jan"
# ),
# "/server-config" =>
# (
# "method" => "digest",
# "realm" => "download archiv",
# "require" => "valid-user"
# )
# )
#### url handling modules (rewrite, redirect, access)
#url.rewrite = ( "^/$" => "/server-status" )
#url.redirect = ( "^/wishlist/(.+)" => "http://www.123.org/$1" )
#### both rewrite/redirect support back reference to regex conditional using %n
#$HTTP["host"] =~ "^www\.(.*)" {
# url.redirect = ( "^/(.*)" => "http://%1/$1" )
#}
#
# define a pattern for the host url finding
# %% => % sign
# %0 => domain name + tld
# %1 => tld
# %2 => domain name without tld
# %3 => subdomain 1 name
# %4 => subdomain 2 name
#
#evhost.path-pattern = "/home/storage/dev/www/%3/htdocs/"
#### expire module
#expire.url = ( "/buggy/" => "access 2 hours", "/asdhas/" => "access plus 1 seconds 2 minutes")
#### ssi
#ssi.extension = ( ".shtml" )
#### rrdtool
#rrdtool.binary = "/usr/bin/rrdtool"
#rrdtool.db-name = "/var/www/lighttpd.rrd"
#### setenv
#setenv.add-request-header = ( "TRAV_ENV" => "mysql://user@host/db" )
#setenv.add-response-header = ( "X-Secret-Message" => "42" )
## for mod_trigger_b4_dl
# trigger-before-download.gdbm-filename = "/home/weigon/testbase/trigger.db"
# trigger-before-download.memcache-hosts = ( "127.0.0.1:11211" )
# trigger-before-download.trigger-url = "^/trigger/"
# trigger-before-download.download-url = "^/download/"
# trigger-before-download.deny-url = "http://127.0.0.1/index.html"
# trigger-before-download.trigger-timeout = 10
## for mod_cml
## don't forget to add index.cml to server.indexfiles
# cml.extension = ".cml"
# cml.memcache-hosts = ( "127.0.0.1:11211" )
#### variable usage:
## variable name without "." is auto prefixed by "var." and becomes "var.bar"
#bar = 1
#var.mystring = "foo"
## integer add
#bar += 1
## string concat, with integer cast as string, result: "www.foo1.com"
#server.name = "www." + mystring + var.bar + ".com"
## array merge
#index-file.names = (foo + ".php") + index-file.names
#index-file.names += (foo + ".php")
#### include
#include /etc/lighttpd/lighttpd-inc.conf
## same as above if you run: "lighttpd -f /etc/lighttpd/lighttpd.conf"
#include "lighttpd-inc.conf"
#### include_shell
#include_shell "echo var.a=1"
## the above is same as:
#var.a=1
sysctl.conf
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Controls the maximum size of a message, in bytes
kernel.msgmnb = 65536
# Controls the default maxmimum size of a mesage queue
kernel.msgmax = 65536
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
# These ensure that TIME_WAIT ports either get reused or closed fast.
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_tw_recycle = 1
# TCP memory
net.core.rmem_max = 16777216
net.core.rmem_default = 16777216
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 262144
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
# For Large File Hosting Servers
net.core.wmem_max = 1048576
#net.ipv4.tcp_wmem = 4096 87380 524288
net.ipv4.tcp_wmem = 4096 524288 16777216
Actual top command
top - 16:15:57 up 6 days, 19:30, 2 users, load average: 1.05, 0.85, 0.83
Tasks: 143 total, 1 running, 142 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.6%us, 2.8%sy, 0.0%ni, 64.7%id, 30.8%wa, 0.0%hi, 1.1%si, 0.0%st
Mem: 3914664k total, 3729404k used, 185260k free, 1676k buffers
Swap: 8388600k total, 9984k used, 8378616k free, 3340832k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28590 root 20 0 518m 75m 71m D 13.1 2.0 1:12.24 lighttpd
28660 root 20 0 15016 1104 812 R 1.9 0.0 0:00.02 top
1 root 20 0 19328 620 396 S 0.0 0.0 0:03.74 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:00.14 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:00.12 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
6 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
7 root RT 0 0 0 0 S 0.0 0.0 0:00.32 migration/1
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
9 root 20 0 0 0 0 S 0.0 0.0 0:01.96 ksoftirqd/1
10 root RT 0 0 0 0 S 0.0 0.0 0:00.19 watchdog/1
11 root RT 0 0 0 0 S 0.0 0.0 0:01.00 migration/2
12 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/2
13 root 20 0 0 0 0 S 0.0 0.0 5:04.44 ksoftirqd/2
14 root RT 0 0 0 0 S 0.0 0.0 0:00.23 watchdog/2
15 root RT 0 0 0 0 S 0.0 0.0 0:00.50 migration/3
16 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/3
17 root 20 0 0 0 0 S 0.0 0.0 0:01.84 ksoftirqd/3
18 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/3
iostat
Linux 2.6.32-220.7.1.el6.x86_64 (zlin) 05/01/2012 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.57 0.00 3.95 30.76 0.00 64.72
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 109.58 38551.74 149.33 22695425748 87908220
netstat -an |grep :80 |wc -l
259
iptraf
247270.0 kbits/sec
What should I change to make downloads faster for the clients? They say it sometimes downloads at less than 10 KB/s.
So it appears disk I/O is your problem from looking at top and iostat. You've used up all the disk cache and the system is waiting for data to be read in from disk before it can send it out the NIC.
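As a rough sanity check on that reading, the figures posted in the question can be put on a common scale. This is only a sketch: it assumes iostat's blocks are 512-byte sectors (the usual case on 2.4 and later kernels), and the single iostat report shown is an average since boot rather than a live rate. The netstat count also includes sockets that are not actively transferring, so the per-connection figure is approximate.

```ruby
# Figures copied from the iostat, iptraf and netstat output in the question.
block_bytes      = 512          # assumed sector size used by iostat
blk_read_per_sec = 38_551.74    # sda Blk_read/s
net_kbit_per_sec = 247_270.0    # iptraf
connections      = 259          # netstat -an | grep :80 | wc -l

disk_mbyte_per_sec = blk_read_per_sec * block_bytes / 1_000_000.0
disk_mbit_per_sec  = disk_mbyte_per_sec * 8
net_mbit_per_sec   = net_kbit_per_sec / 1000.0
per_conn_kbyte     = net_kbit_per_sec / 8.0 / connections

puts format('disk reads : %.1f MB/s (%.0f Mbit/s)', disk_mbyte_per_sec, disk_mbit_per_sec)
puts format('network out: %.0f Mbit/s', net_mbit_per_sec)
puts format('per connection: about %.0f KB/s', per_conn_kbyte)
```

The disk has averaged roughly 20 MB/s of reads while the NIC sends about 247 Mbit/s, far below the 1 Gbps port. Together with 30.8% iowait and lighttpd sitting in state D in top, that is consistent with the process waiting on disk reads rather than on the network.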
The first thing I would try is to change to:
server.network-backend = "linux-sendfile"
as that will improve buffer usage (a bit, not as much as it should).
You can do a back-of-the-envelope calculation of how much memory you would need to cache your typical workload (simplistically, just add together the sizes of your 100 most popular files). I'm guessing that it's going to be a lot more than the 4 GB of memory that you have, so the next thing to do would be to get either faster disk drives or more memory.
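That estimate could be sketched as follows. The helper names are hypothetical, and the input is an assumption: 100 files of 600 MB each, the midpoint of the 300 MB to 900 MB range from the question. In practice the sizes would come from the real files, ordered by popularity (for instance from the access log). The cache size is the "cached" figure from the top output.

```ruby
# Hypothetical helper: total size of the top_n most popular files.
# file_sizes is assumed to be sorted by popularity, most popular first.
def working_set_bytes(file_sizes, top_n)
  file_sizes.first(top_n).sum
end

# Hypothetical helper: does that working set fit in the page cache?
def fits_in_cache?(file_sizes, top_n, cache_bytes)
  working_set_bytes(file_sizes, top_n) <= cache_bytes
end

mb = 1024 * 1024
sizes = Array.new(100, 600 * mb)   # assumed: 100 files of 600 MB each
cache = 3_340_832 * 1024           # "3340832k cached" from top, in bytes

puts working_set_bytes(sizes, 100) / mb    # 60000 (MB)
puts fits_in_cache?(sizes, 100, cache)     # false
```

Under these assumptions the popular set is around 60 GB against roughly 3.2 GB of page cache, so most requests would have to be served from disk.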
This is why people use Content Delivery Networks (CDNs) to deliver large amounts of bandwidth (which often comes in the form of large files).
The problem here isn't your web server, it's that HTTP is really not designed as a file download protocol. It's the Hypertext Transfer Protocol, and many of the decisions around HTTP focus on the hypertext aspect: file sizes are expected to be small, under a few dozen KB and certainly under a MB. The web infrastructure takes advantage of this fact in a lot of its approaches to data caching, etc. Instead of using HTTP for something it really isn't designed for, I would recommend looking at a different transport mechanism.
FTP: File Transfer Protocol. FTP was designed specifically to transfer files of arbitrary size, and doesn't make the same assumptions as HTTP software. If all you are doing is static downloads, your web page HTML can link to the static files with an ftp:// link, and configuring an FTP server to allow anonymous download is usually straightforward. Check your FTP server's docs for details. Browsers since IE6/FF2 have supported basic FTP natively, so the average user's workflow will be no different than usual. This is probably not the best approach, as FTP was designed long before HTTP, and as Perry mentioned, long before we had half-gigabyte files.
CDN: Using a content delivery network like Amazon's S3 doesn't technically get around using HTTP, but it lets you not have to worry about your users overloading your server like you're seeing.
BitTorrent: If your users are a bit more tech-savvy, consider setting your server up to seed the static file indefinitely, then publish magnet links on your site. In the worst case, a single user will experience a direct download from your server, using a protocol that actually knows how to handle large files. In the best case, your hundreds of users will both leech from and seed to each other, drastically reducing your server's load. Yes, this requires your users to know how to run and configure BitTorrent, which is probably not the case, but it's an interesting paradigm for file downloads nonetheless.

Resources