How can I test a CPU scheduling algorithm (for example, Round Robin)?
As you know, an operating system includes its own processes which run on the CPU. However, I want to test in a clean environment without any other processes, just with the processes P1, P2, and P3 that I have created.
Is there any simulation environment for testing CPU scheduling algorithms?
Edit: PART 1: How does a company like Microsoft, or a university, test CPU scheduling algorithms and inspect the results? I want to see those results.
PART 2: Is there any simulation environment for doing this?
When we run an OS (Windows, Linux), there are processes which belong to the OS itself, but I want to test in a clean environment.
I don't know whether my idea is right or not; please tell me if I'm making a mistake in how I'm testing the CPU scheduling algorithm.
How can I implement it?
So far I have only worked it out on paper.
The CPU scheduler, a.k.a. the task/process scheduler, is inside the kernel on Linux systems. So one way to compare two different task schedulers is to build the same kernel twice with the two different schedulers and compare runs of the same workload or application. The default scheduler in Linux is CFS (the Completely Fair Scheduler). There are several other schedulers, for example real-time schedulers and BFS. RR (Round Robin) is just the method for choosing the next task to be scheduled after one task is preempted.
Here is more info about Tuning the Task Scheduler
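If rebuilding a kernel is overkill, the usual classroom approach to the OP's question is a user-space simulation: no OS processes interfere, and only the processes you define exist. A minimal Round-Robin sketch in Python (the burst times are invented; a single CPU and all processes arriving at time 0 are simplifying assumptions):

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate Round Robin on a single CPU.
    bursts: dict of process name -> CPU burst time.
    Returns each process's completion time."""
    queue = deque(bursts.items())
    time, completed = 0, {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)   # run for at most one quantum
        time += run
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))  # preempted: back of the queue
        else:
            completed[name] = time
    return completed

print(round_robin({"P1": 5, "P2": 3, "P3": 4}, quantum=2))
# {'P2': 9, 'P3': 11, 'P1': 12}
```

Extending this with arrival times, waiting times, or a different pick rule (SJF, priority) lets you compare algorithms on paper-style examples programmatically.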
I am interested in performing weak scaling tests on an HPC cluster. To do this, I run several small tests on 1, 2, 4, 8, 16, 32, and 64 nodes, with each simulation taking anywhere from under a minute up to 1 hour. However, the jobs sit in the queue (the 1-hour queue) for several days before the test results are available.
I have two questions:
Is there a way to prioritize jobs in the job scheduler, given that most tests take less than a minute yet I have to wait several days for them?
To what extent could such a job scheduling policy invite abuse of HPC resources? Consider a hypothetical HPC simulation on 32 nodes which is divided into several small 1-hour simulations, each of which gets prioritized because of the solution from point 1 above.
Note: the job scheduling and management system used at the HPC center is MOAB. Each cluster node is equipped with 2 Xeon 6140 CPUs @ 2.3 GHz (Skylake), 18 cores each.
Moab's fairshare scheduler may do what you want, or if it doesn't out of the box, may allow tweaking to prioritize jobs within the range you're interested in: http://docs.adaptivecomputing.com/mwm/7-1-3/help.htm#topics/fairness/6.3fairshare.html.
I don't quite understand the spark.task.cpus parameter. It seems to me that a "task" corresponds to a "thread" or a "process", if you will, within the executor. Suppose that I set spark.task.cpus to 2.
How can a thread utilize two CPUs simultaneously? Couldn't it require locks and cause synchronization problems?
I'm looking at launchTask() function in deploy/executor/Executor.scala, and I don't see any notion of "number of cpus per task" here. So where/how does Spark eventually allocate more than one cpu to a task in the standalone mode?
To the best of my knowledge, spark.task.cpus controls the parallelism of tasks in your cluster in the case where particular tasks are known to have their own internal (custom) parallelism.
In more detail:
We know that spark.cores.max defines how many threads (aka cores) your application needs. If you leave spark.task.cpus = 1, then you will have spark.cores.max concurrent Spark tasks running at the same time.
You will only want to change spark.task.cpus if you know that your tasks are themselves parallelized (maybe each of your tasks spawns two threads, interacts with external tools, etc.). By setting spark.task.cpus accordingly, you become a good "citizen". Now if you have spark.cores.max=10 and spark.task.cpus=2, Spark will only create 10/2=5 concurrent tasks. Given that your tasks need (say) 2 threads internally, the total number of executing threads will never exceed 10. This means that you never go above your initial contract (defined by spark.cores.max).
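The arithmetic in that last paragraph can be stated as a one-liner; this is just a sketch of the relationship, not Spark's actual internal code:

```python
def concurrent_tasks(cores_max, task_cpus):
    """Each task reserves task_cpus cores out of cores_max total,
    so the number of tasks running at once is cores_max // task_cpus."""
    return cores_max // task_cpus

print(concurrent_tasks(10, 2))  # 5 concurrent tasks, still 10 threads total
print(concurrent_tasks(10, 1))  # 10 concurrent tasks
```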
I am trying to create a background task scheduler for my process, which needs to schedule (compute-intensive) tasks in parallel while maintaining the responsiveness of the UI.
Currently, I compare CPU usage (as a percentage) against a threshold (~50%) to decide when the scheduler starts a new task, and it works reasonably well.
This program can run on a variety of hardware configurations (e.g. processor speed, number of cores), so a 50% limit can be too harsh or too lenient for certain configurations.
Is there a good way to take CPU configuration parameters into account, e.g. core count and clock speed, and dynamically derive a threshold from the hardware configuration?
My suggestions:
Run as many threads as there are CPUs in the system.
Set the priority of each thread to idle (lowest).
In each thread's main loop, do the smallest sleep possible, i.e. usleep(1).
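The structure of that suggestion can be sketched in Python (priority lowering is platform-specific, e.g. SetThreadPriority on Windows, and is only noted in a comment; time.sleep stands in for usleep(1); in CPython the GIL limits true parallelism for compute-bound work, so the point here is the structure, not the language):

```python
import os
import threading
import time

def start_workers(work_fn, stop_event):
    """Spawn one worker thread per CPU; each yields briefly every
    iteration so the UI thread stays responsive. Lowering each
    thread's priority to idle is platform-specific and omitted."""
    workers = []
    for _ in range(os.cpu_count() or 1):
        def loop():
            while not stop_event.is_set():
                work_fn()
                time.sleep(1e-6)  # smallest practical sleep, cf. usleep(1)
        t = threading.Thread(target=loop, daemon=True)
        t.start()
        workers.append(t)
    return workers

# usage sketch: run a tiny work item for 50 ms, then stop
stop = threading.Event()
results = []
workers = start_workers(lambda: results.append(1), stop)
time.sleep(0.05)
stop.set()
for t in workers:
    t.join()
print(len(workers))  # one worker per CPU
```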
I have studied the topic of job schedulers; there are different types, such as long-term, medium-term, and short-term schedulers, and I ended up confused.
So my question is: among these three schedulers, which type makes use of the scheduling algorithms (like FCFS, SJF, etc.)?
My understanding so far is: "The scheduling algorithm takes a job from the ready queue (which contains the list of jobs in the ready state waiting to be executed) and keeps the CPU as busy as possible."
And the long-term scheduler is the one which decides which jobs are allowed into the ready queue.
So, is the long-term scheduler the one which makes use of those scheduling algorithms?
And also, I have seen the link, https://en.wikipedia.org/wiki/Scheduling_(computing)
where I have seen that,
Note: the following line is excerpted from Wikipedia:
"Thus the short-term scheduler makes scheduling decisions much more frequently than the long-term or mid-term schedulers...."
So, do all three of these schedulers make use of the scheduling algorithms?
Finally, I got stuck at this point, confused about the difference between these types of schedulers.
Could someone kindly give a brief explanation so I can understand?
Thanks in advance.
So, whether all these 3 schedulers will make use of the scheduling algo?
Basically, all three of them use scheduling algorithms, whichever of them is active at a given point. All of them make scheduling decisions, since all of them are schedulers; it just depends on which one is executing at a given instant (the short-term scheduler executes far more frequently than the others).
Wikipedia is right in mentioning that. I hope this answers your question in short.
Description :
As mentioned in Process Scheduling page on tutorialspoint :-
Schedulers are special system software that handle process scheduling in various ways. Their main task is to select the jobs to be submitted into the system and to decide which process to run.
Long-Term Scheduler ------> selects processes from the job pool and loads them into memory for execution.
Medium-Term Scheduler -----> can swap a process out of memory and later re-introduce it, so its execution can be continued.
Short-Term Scheduler ------> selects from among the processes that are ready to execute and allocates the CPU to one of them.
The below list (click here for source) shows the function of each of the three types of schedulers (long-term, short-term, and medium-term) for each of three types of operating systems (batch, interactive, and real-time).
batch
longterm -----> job admission based on characteristics and resource needs
mediumterm -----> usually none; jobs remain in storage until done
shortterm -----> processes scheduled by priority; continue until they wait voluntarily, request service, or are terminated
interactive
longterm -----> sessions and processes normally accepted unless capacity reached
mediumterm -----> processes swapped when necessary
shortterm -----> processes scheduled on a rotating basis; continue until service is requested, the time quantum expires, or they are pre-empted
real-time
longterm -----> processes either permanent or accepted at once
mediumterm -----> processes never swapped
shortterm -----> scheduling based on strict priority with immediate preemption; may time-share processes with equal priorities
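To make the division of labor concrete: by the time the short-term scheduler runs, admission (long-term) and swapping (medium-term) have already decided what is in the ready queue; the short-term scheduler only picks the next process from it. A toy SJF pick in Python (the process data is invented):

```python
def sjf_pick(ready_queue):
    """Short-term scheduler using SJF: dispatch the ready process
    with the shortest CPU burst. Each entry is (name, burst_time)."""
    return min(ready_queue, key=lambda proc: proc[1])

ready = [("P1", 6), ("P2", 2), ("P3", 4)]
print(sjf_pick(ready))  # ('P2', 2): shortest job is dispatched first
```

Swapping `min` for a FIFO pop (FCFS) or a priority key changes the algorithm without changing which scheduler is doing the picking.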
I understand that Gang scheduling is a scheduling algorithm for parallel systems that schedules related threads or processes to run simultaneously on different processors.
Gang scheduling is used so that if two or more threads or processes communicate with each other, they will all be ready to communicate at the same time. However, how does a gang scheduling algorithm determine that the particular set of processes will be communicating among themselves and hence schedule related threads or processes to run simultaneously on different processors?
Gang scheduling is usually applied to a job, either by operating system default or because the job has been marked for gang scheduling. All tasks in the job are scheduled together without attempting to measure whether they all actively communicate.
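One classic way to realize "all tasks of the job are scheduled together" is an Ousterhout matrix: rows are time slices, columns are processors, and every task of a job is placed in the same row so they all run simultaneously. A toy sketch (job sizes invented; the first-fit packing is only for illustration, real gang schedulers also handle alternate slots and migration):

```python
def gang_schedule(jobs, processors):
    """Toy Ousterhout-matrix builder. jobs: dict of job name ->
    number of tasks (assumed <= processors). Returns a list of
    rows (time slices), each a list of `processors` slots."""
    matrix = []
    for name, ntasks in jobs.items():
        # first-fit: find a time slice with enough free processors
        for row in matrix:
            if row.count(None) >= ntasks:
                break
        else:
            row = [None] * processors  # open a new time slice
            matrix.append(row)
        placed = 0
        for i in range(processors):
            if row[i] is None and placed < ntasks:
                row[i] = name  # all of a job's tasks share one row
                placed += 1
    return matrix

print(gang_schedule({"A": 3, "B": 2, "C": 2}, 4))
# [['A', 'A', 'A', None], ['B', 'B', 'C', 'C']]
```

Because membership comes from the job definition, not from observed communication, the scheduler never has to detect which tasks talk to each other, which answers the question above.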
The following paper has an introduction and some citations that may help you get background on gang scheduling:
Papazachos, Z. C.; Karatza, H. D., "Gang scheduling in a two-cluster system with critical sporadic jobs and migrations," International Symposium on Performance Evaluation of Computer & Telecommunication Systems (SPECTS 2009), pp. 41-48, 13-16 July 2009.
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5224147&isnumber=5224098