Does fault-tolerance entail high availability?

I still confuse the principle of high availability with that of fault tolerance. I have researched the definition of each and ended up with the following answers:
Fault tolerance refers to systems that have no single point of failure.
High availability refers to systems with minimized downtime (e.g. 99.999% uptime).
Additionally, I found in the SNIA dictionary that high availability is most often achieved through failure tolerance. So, does this mean that a fault-tolerant system can definitely provide high availability?
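For concreteness, here is the back-of-the-envelope arithmetic behind figures like "99.999% uptime" (a sketch only; the levels shown are the usual "nines", nothing specific to the SNIA definitions):

    /* Downtime allowed per year for common availability levels.
     * Sketch only: the levels are the standard "nines", not system-specific. */
    #include <stdio.h>

    int main(void) {
        const double minutes_per_year = 365.25 * 24 * 60;
        const double levels[] = { 0.99, 0.999, 0.9999, 0.99999 };

        for (int i = 0; i < 4; i++) {
            double downtime = (1.0 - levels[i]) * minutes_per_year;
            printf("%.3f%% uptime -> about %.1f minutes of downtime per year\n",
                   levels[i] * 100.0, downtime);
        }
        return 0;
    }

Five nines works out to only about five minutes of downtime per year.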

Related

What is the performance impact of virtual memory relative to direct mapped memory?

Virtual memory is a convenient way to isolate memory among processes and give each process its own address space. It works by translating virtual addresses to physical addresses.
I'm already very familiar with how virtual memory works and is implemented. What I don't know about is the performance impact of virtual memory relative to direct mapped memory, which requires no overhead for translation.
Please don't say that there is no overhead. This is obviously false since traversing page tables requires several memory accesses. It is possible that TLB misses are infrequent enough that the performance impacts are negligible, however, if this is the case there should be evidence for it.
I also realize the importance of virtual memory for many of the functions a modern OS provides, so this question isn't about whether virtual memory is good or bad (it is clearly a good thing for most use cases); I'm asking purely about the performance effects of virtual memory.
The answer I'm looking for is ideally something like: virtual memory imposes an x% overhead over direct mapping and here is a paper showing that. I tried to look for papers with such results, but was unable to find any.
This question is difficult to answer definitively because virtual memory is an integral part of modern systems: the hardware is designed to support virtual memory, and most software is written and optimized on systems that use it.
However, in the early 2000s Microsoft Research developed a research OS called Singularity that, among other things, did not rely on virtual memory for process isolation. As part of this project they published a paper in which they analyzed the overhead of hardware support for process isolation. The paper is entitled Deconstructing Process Isolation (non-paywall link here). In the paper the researchers write:
Most operating systems use a CPU’s memory management hardware to
provide process isolation, using two mechanisms. First, processes are
only allowed access to certain pages of physical memory. Second,
privilege levels prevent untrusted code from manipulating the system
resources that implement processes, for example, the memory management
unit (MMU) or interrupt controllers. These mechanisms’ non-trivial
performance costs are largely hidden, since there is no widely used
alternative approach to compare them to. Mapping from virtual to
physical addresses can incur overheads up to 10–30% due to exception
handling, inline TLB lookup, TLB reloads, and maintenance of kernel
data structures such as page tables [29]. In addition, virtual memory
and privilege levels increase the cost of inter-process communication.
Later in the paper they write:
Virtual memory systems (with the exception of software-only systems
such as SPUR [46]) rely on a hardware cache of address translations to
avoid accessing page tables at every processor cache miss. Managing
TLB entries has a cost, which Jacob and Mudge estimated at 5–10% on a
simulated MIPS-like processor [29]. The virtual memory system also
brings its data, and in some systems, code as well, into a processor’s
caches, which evicts user code and data. Jacob and Mudge estimate
that, with small caches, these induced misses can increase the
overhead to 10–20%. Furthermore, they found that virtual memory
induced interrupts can increase the overhead to 10–30%. Other studies
found similar or even higher overheads, though the actual costs are
very dependent on system details and benchmarks [3, 6, 10, 26, 36, 40,
41]. In addition, TLB access is on the critical path of many processor
designs [2, 30] and so might affect processor clock speed.
Overall, I would take these results with a grain of salt, since the researchers are promoting an alternative system. But clearly there is some overhead associated with implementing virtual memory, and this paper is one attempt to quantify some of those overheads (within the context of evaluating a possible alternative). I recommend reading the paper for more detail.
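If you want a rough feel for the effect on your own machine, a microbenchmark that touches one byte per page over a small versus a large set of pages will usually show the cost of address translation (plus cache misses) quite clearly. The sketch below assumes 4 KiB pages and picks the working-set sizes and iteration count arbitrarily; it only illustrates the mechanism the paper quantifies, not its numbers.

    /* Rough microbenchmark sketch: touch one byte per 4 KiB page, in a scattered
     * order, over a small working set (easily TLB-resident) and a large one (far
     * beyond typical TLB reach). Page size, set sizes and iteration count are
     * assumptions; cache misses and prefetching also contribute, so treat the
     * result as indicative only. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    #define PAGE 4096

    static double touch_pages(unsigned char *buf, size_t npages, size_t iters) {
        struct timespec t0, t1;
        volatile unsigned char sink = 0;
        size_t idx = 0;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < iters; i++) {
            idx = (idx + 7919) % npages;   /* prime step: visits every page, defeats simple prefetch */
            sink ^= buf[idx * PAGE];       /* one load per page-sized stride */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        (void)sink;
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(void) {
        const size_t small = 16;          /* 16 pages  = 64 KiB  */
        const size_t large = 1 << 16;     /* 64K pages = 256 MiB */
        const size_t iters = 20u * 1000 * 1000;

        unsigned char *buf = malloc(large * (size_t)PAGE);
        if (!buf) { perror("malloc"); return 1; }
        memset(buf, 1, large * (size_t)PAGE);   /* fault every page in before timing */

        printf("small working set: %.3f s\n", touch_pages(buf, small, iters));
        printf("large working set: %.3f s\n", touch_pages(buf, large, iters));

        free(buf);
        return 0;
    }

Compile with optimization (e.g. gcc -O2), and expect the gap to vary with TLB size, huge pages and cache size; hardware TLB-miss counters, where available, are a more direct way to attribute the cost.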

Is there a technique to predict the performance impact of an application?

A customer is running a clustered web application server under considerable load. He wants to know if the upcoming application, which is not implemented yet, will still be manageable by his current setup.
Is there an established method to predict the performance impact of an application that is still at the concept stage, based on an existing requirements specification (or perhaps a functional design specification)?
The first priority would be to predict the impact on CPU resources.
Is it possible to get fairly exact results at all?
I'd say the canonical answer is no. You always have to benchmark the actual application being deployed on its target architecture.
Why? Software and software development are not predictable. And systems are even more unpredictable.
Even if you know the requirements now and have done a deep analysis, what happens if:
The program has a performance bug (or two...) - which might even be a bug in a third-party library
New requirements are added or requirements change
The analysis and design don't spot all the hidden inter-relationships between components
There are non-linear effects of adding load and the new load might take the hardware over a critical threshold (a threshold that is not obvious now).
These concerns are not theoretical. If they were, SW development would be trivial and projects would always be delivered on time and to budget.
However, there are some heuristics I have personally used that you can apply. First, you need a really good understanding of the current system:
Break the existing system's functions down into small, medium and large and benchmark those on your hardware
Perform a load test of these individual functions and capture throughput in transactions/sec, CPU cost, network traffic and disk I/O figures for as many of these transactions as possible, making sure you have representation of small, medium and large. This load test should take the system up to the point where additional load decreases transactions/sec
Get the figures for the max transactions/sec of the current system
Understand the rate of growth of this application and plan accordingly
Perform the analysis to get an 'average' small, medium and large 'cost' in terms of CPU, RAM, disk and network. This would be of the form:
Small transaction
CPU utilization: 10 ms
RAM overhead: 5 MB (cache)
RAM working: 100 KB (e.g. 10 concurrent threads = 1 MB, 100 threads = 10 MB)
Disk I/O: 5 KB (database)
Network app<->DB: 10 KB
Network app<->browser: 40 KB
From this analysis you should understand how much headroom you have: CPU certainly, but check that there is sufficient RAM, network and disk capacity too. E.g., the CPU required for small transactions is the number of small transactions per second multiplied by the CPU cost of a small transaction; add in the CPU cost of the medium and large transactions, and you have your CPU budget.
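As a concrete illustration of that budgeting step, here is a small sketch. Only the 10 ms small-transaction cost comes from the example above; the other costs and the peak rates are hypothetical.

    /* Sketch of the CPU-budget arithmetic described above, with made-up figures. */
    #include <stdio.h>

    struct txn_class {
        const char *name;
        double cpu_ms;        /* CPU time per transaction, from the load test */
        double rate_per_sec;  /* forecast peak transactions per second        */
    };

    int main(void) {
        struct txn_class classes[] = {
            { "small",  10.0, 200.0 },
            { "medium", 40.0,  50.0 },
            { "large", 250.0,   5.0 },
        };
        double total_ms_per_sec = 0.0;

        for (int i = 0; i < 3; i++) {
            double cost = classes[i].cpu_ms * classes[i].rate_per_sec;
            total_ms_per_sec += cost;
            printf("%-6s: %6.0f ms of CPU per second\n", classes[i].name, cost);
        }

        /* 1000 ms of CPU per second is one fully busy core. */
        printf("total : %6.0f ms/s, i.e. about %.1f cores at peak\n",
               total_ms_per_sec, total_ms_per_sec / 1000.0);
        return 0;
    }

Compare that core count against the headroom the load test showed, and repeat the exercise for RAM, disk and network.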
Make sure the DBAs are involved. They need to do the same on the DB.
Now you need to analyse your upcoming application:
Assign each feature to the same small, medium and large buckets, ensuring a like-for-like match as far as possible
Ask deep, probing questions about how many transactions/sec each feature will experience at peak
Talk about the expected rate of growth of the application
Don't forget that the system may slow as the size of the database increases
On a personal note, you are being asked to predict the unpredictable - putting your name and reputation on the line. If you say it can fit, you are owning the risk for a large software development project. If you are being pressured to say yes, you need to ensure that there are many other people's names involved along with yours - and those names should all be visible on the go/no-go decision. Not only is this more likely to ensure that all factors are considered, and that the analysis is sound, but it will also ensure that the project has many involved individuals personally aligned to its success.

Safety issue when using OneCare Safety Scanner - no description of the severe "issue" found

I used Windows OneCare Safety Scanner several times and kept getting the warning:
1 severe issue found and 1 high issue found.
No other info.... so what does it mean, and what do I do now?
There are four possible severity ratings for NIS writeups:
Critical - refers to a vulnerability whose exploitation could allow the propagation of an Internet worm without user action
Important - refers to a vulnerability whose exploitation could result in compromise of the confidentiality, integrity, or availability of user’s data, or of the integrity or availability of processing resources
Moderate - refers to a vulnerability whose exploitability is mitigated to a significant degree by factors such as default configuration, auditing, or difficulty of exploitation
Low - refers to a vulnerability whose exploitation is extremely difficult, or whose impact is minimal

Methodology/Template for calculating Application reliability five Nines/Six Nines?

Any concrete suggestions for computing application/system reliability?
For predicted reliability, you multiply the predicted reliabilities of all critical components together (under the assumption that the reliabilities are independent, which is not generally safe). With non-critical components, you've got to work out whether they group together to form a critical system, or whether there is some characteristic time they can be down for before becoming critical, or … Well, in summary, you've just got to analyze very carefully.
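As a minimal sketch of that multiplication step, with hypothetical component availabilities and assuming independent components in series:

    /* Series-reliability product for independent components
     * (hypothetical figures: e.g. load balancer, app tier, database, network). */
    #include <stdio.h>

    int main(void) {
        const double components[] = { 0.9999, 0.9995, 0.9999, 0.9998 };
        double system = 1.0;

        for (int i = 0; i < 4; i++)
            system *= components[i];

        printf("predicted system availability: %.4f%%\n", system * 100.0);
        printf("expected downtime: about %.0f minutes per year\n",
               (1.0 - system) * 365.25 * 24 * 60);
        return 0;
    }

Note that the product is always lower than the weakest component, which is why the analysis of the critical path matters so much.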
But predicted reliability is not the same as measured reliability! If you're at all serious about this (and generally 99.9999% reliability is very serious stuff) then you're going to have to measure, and you need to work out very carefully what to measure too and from what perspective. There's no point in measuring website availability from within the same cluster if the characteristic problem of the deployment is off-site networking bandwidth.

Software performance (MCPS and power consumed) in an embedded system

Assume an embedded environment which has a DSP core (or any other processor core).
If I have code for some application/functionality which is optimized to be among the best in terms of cycles consumed (MCPS), will it also be the best in terms of power consumed by that code on a real hardware system?
Can code optimized for the lowest MCPS be guaranteed to have the lowest power consumption as well?
I know there are many aspects to be considered here, like the architecture of the underlying processor and the hardware system (memory, bus, etc.).
Very difficult to tell without putting a sensitive ammeter between your board and power supply and logging the current drawn. My approach is to test assumptions for various real-world scenarios rather than go with the supporting documentation.
No, lowest cycle count will not guarantee lowest power consumption.
It's a good indication, but you didn't take into account that memory bus activity consumes quite a lot of power as well.
Your code may, for example, have a higher cycle count but lower power consumption if you move frequently needed data into internal memory (on-chip RAM). That won't increase the cycle count of your algorithms themselves, but moving the data into and out of the internal memory does add cycles.
If your system has a cache as well as internal memory, optimize for best cache utilization as well.
This isn't a direct answer, but I thought this paper (from this answer) was interesting: Real-Time Task Scheduling for Energy-Aware Embedded Systems.
As I understand it, it tries to run each task in the processor's low-power state, unless it can't meet the deadline without the high-power state. So in a scheme like that, more time-efficient code (fewer cycles) should allow the processor to spend more time throttled back.
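As a toy illustration of that idea, treating energy simply as power multiplied by time and using entirely made-up cycle counts, frequencies and power draws (real DSPs also have idle power, memory power and voltage-scaling effects that this ignores):

    /* Toy energy comparison for the scheduling idea above: energy = power * time.
     * All figures are hypothetical, not a model of any real processor. */
    #include <stdio.h>

    int main(void) {
        const double cycles = 2.0e8;                 /* work the task needs        */
        const double f_high = 1.0e9, p_high = 1.2;   /* high state: 1 GHz, 1.2 W   */
        const double f_low  = 0.5e9, p_low  = 0.5;   /* low state: 500 MHz, 0.5 W  */

        double t_high = cycles / f_high, e_high = p_high * t_high;
        double t_low  = cycles / f_low,  e_low  = p_low  * t_low;

        printf("high-power state: %.2f s, %.3f J\n", t_high, e_high);
        printf("low-power state : %.2f s, %.3f J\n", t_low,  e_low);
        return 0;
    }

With these made-up numbers the low-power state finishes later but uses less energy, which is the trade-off such a scheduler exploits when the deadline allows it; cutting the cycle count reduces energy in either state, but it is not the only factor.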
