How to calculate variance with C++? - variance

This is what I have so far. I am not able to get the right variance.
variance += ((numbers[i] - (double)mean) * (numbers[i] - (double)mean));
variance /= (double)size;
cout << "The variance is: " << variance << endl;

There are multiple problems with your code, all of which contribute to the variance being wrong.
You are calculating the variance in the loop, while updating the mean every iteration (before you have the correct mean). While this will result in the correct mean being stored in the mean variable at the end of the loop, the variance will have been computed with an incorrect mean in all but the last iteration.
You need the mean of all the numbers before you start subtracting it from the array to compute the variance. The solution would be another loop to calculate the variance after the original loop, at which point you already have the correct mean.
Array indices start at 0, but you start at 1, which means the first element of your array (numbers[0]) is never assigned by your loop, and you are calculating the mean and variance of 9 numbers instead of 10, but dividing by 10.
mean is declared as an integer, but it is meant to store a floating-point (double-precision) number. Additionally, total / size will be truncated because both operands are integers, so you should cast one of them to double.
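Written as formulas, the two-step computation described in the points above is the population variance. The symbols N (how many numbers were entered), x_i (the i-th number), \mu (the mean) and \sigma^2 (the variance) are notation introduced here for illustration and do not appear in the question:

```latex
\mu = \frac{1}{N}\sum_{i=0}^{N-1} x_i,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=0}^{N-1} \left(x_i - \mu\right)^2
```

The second sum can only be evaluated once \mu is final, which is why the variance needs its own loop after the loop that reads the input.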

Move the line
variance += ((numbers[i] - (double)mean) * (numbers[i] - (double)mean));
out of the first for loop and into a second for loop that runs after it.
The code then becomes:
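#include <iostream>
using namespace std;
int main()
{
    const int arraySize = 10;
    int numbers[arraySize];
    int input, total = 0, size = 0;
    double mean = 0, variance = 0;
    // First loop: read the values (indices start at 0) and accumulate the total.
    for (int i = 0; i < arraySize; i++) {
        cout << "Enter number " << i + 1 << " (-1 to end): ";
        cin >> input;
        if (input == -1) {
            break;
        }
        numbers[i] = input;
        total += numbers[i];
        size++;
    }
    if (size == 0) { // avoid dividing by zero when nothing was entered
        cout << "No numbers were entered." << endl;
        return 0;
    }
    mean = (double)total / size; // cast so the division is not truncated
    // Second loop: the mean is final now, so the squared deviations are correct.
    for (int i = 0; i < size; i++)
        variance += (numbers[i] - mean) * (numbers[i] - mean);
    variance /= size;
    cout << "The mean is: " << mean << endl;
    cout << "The variance is: " << variance << endl;
}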
To calculate the variance you need the mean of all the numbers in the array, not the running mean after each input.
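The same two-pass idea can be packaged as a reusable function. This is only a sketch: the name populationVariance, the use of std::vector<double>, and the choice to return 0.0 for empty input are assumptions made here, not part of the original question or answers.

```cpp
#include <vector>

// Population variance (divides by N, as in the answers above).
// Assumption: an empty input returns 0.0 instead of dividing by zero.
double populationVariance(const std::vector<double>& numbers)
{
    if (numbers.empty()) {
        return 0.0;
    }
    // Pass 1: the mean of all the numbers.
    double total = 0.0;
    for (double x : numbers) {
        total += x;
    }
    const double mean = total / numbers.size();
    // Pass 2: sum of squared deviations from the final mean.
    double sumOfSquares = 0.0;
    for (double x : numbers) {
        sumOfSquares += (x - mean) * (x - mean);
    }
    return sumOfSquares / numbers.size();
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5, the squared deviations add up to 32, and the function returns 32 / 8 = 4.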

Related

Google Kick Start 2020 Round A Allocation Problem - WA error

Problem
There are N houses for sale. The i-th house costs Ai dollars to buy. You have a budget of B dollars to spend.
What is the maximum number of houses you can buy?
Input
The first line of the input gives the number of test cases, T. T test cases follow. Each test case begins with a single line containing the two integers N and B. The second line contains N integers. The i-th integer is Ai, the cost of the i-th house.
Output
For each test case, output one line containing Case #x: y, where x is the test case number (starting from 1) and y is the maximum number of houses you can buy.
int main(){
int tests, size, b;
cin >> tests;
for(size_t j = 0; j < tests; ++j){
cin >> size >> b;
int cur;
vector<int> prices(size);
for(size_t i = 0; i < prices.size(); ++i){
cin >> cur;
prices[i] = cur;
}
int cur_total = 0;
int count = 0;
sort(prices.begin(), prices.end());
int i = 0;
while(cur_total + prices[i] <= b){
++count;
cur_total += prices[i];
++i;
}
cout << "Case #" << j+1 << ": " << count << endl;
}
}
The above is my solution and when I run it locally with a test case, it is returning a correct output. Can anyone see a problem with my code??
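One thing that stands out in the code above, offered as a likely cause and not a confirmed diagnosis because the failing test is not shown: the while loop never checks that i is still inside prices. When the budget covers every house, the loop reads prices[i] with i equal to prices.size(), which is undefined behaviour. A minimal sketch of the same greedy idea with the bound enforced follows; the function name maxHouses and the 64-bit running total are choices made here for illustration.

```cpp
#include <algorithm>
#include <vector>

// Greedy: buy the cheapest houses first until the next one no longer fits.
// The range-based for loop cannot run past the end of the vector.
int maxHouses(std::vector<int> prices, int budget)
{
    std::sort(prices.begin(), prices.end());
    int count = 0;
    long long spent = 0; // wider type so the running total cannot overflow
    for (int price : prices) {
        if (spent + price > budget) {
            break;
        }
        spent += price;
        ++count;
    }
    return count;
}
```

In the original main, the equivalent change would be to add i < prices.size() (with matching signedness) to the while condition.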

Given a set of positive integers and value X, find a subset S whose sum is >= X, such that sum(S) is the lowest of all sums of such existing subsets

Given a set of positive integers and value X, find a subset S whose sum is >= X, such that sum(S) is the lowest of all sums of such existing subsets.
Can it be done in polynomial time? What would be the solution?
Checking all subsets is 2^n.
Backtracking is a possibility for this problem.
It examines all the possibilities recursively, without needing a large amount of memory.
It stops as soon as an optimal solution is found: sum = X, up to a given tolerance (10^-10 in the programme below).
It allows a simple early-abandon (pruning) rule:
at a given point, if the current sum plus the sum of all remaining elements is lower than X, the current path can never reach X, so we can give it up without examining the remaining elements. A path is also abandoned once its sum exceeds the best sum found so far. This pruning is made more effective by sorting the input data in decreasing order.
Here is an implementation in C++. The code being quite basic, it should be easy to port to another language.
This programme tests the algorithm with random (uniformly distributed) elements and displays the number of iterations.
The complexity (i.e. the number of iterations) varies a lot with the random elements, of course, but it also depends greatly on the tolerance that we accept. With a tolerance of 10^-10 and a size of n = 100, the complexity generally stays quite acceptable. This is no longer the case with a smaller tolerance.
With n = 100 and five runs, I obtained for the number of iterations: 6102, 3672, 8479, 2235, 12926. However, it is clear that there is no guarantee of good performance in all cases. For n = 100, the number of candidates (subsets) is huge.
// Find min sum greater than a given number X
#include <iostream>
#include <iomanip>
#include <vector>
#include <algorithm>
#include <functional> // std::greater
#include <tuple>
#include <cstdlib>
#include <cmath>
#include <ctime>
std::tuple<double, std::vector<double>> min_sum_greater(std::vector<double> &a, double X) {
int n = a.size();
std::vector<bool> parti (n, false); // current partition being studied
std::vector<bool> parti_opt (n, false); // optimal partition
std::vector<double> sum_back (n, 0); // sum of remaining elements
//std::cout << "n = " << n << " \tX = " << X << "\n";
std::sort(a.begin(), a.end(), std::greater<double>());
sum_back[n-1] = a[n-1];
for (int i = n-2; i >= 0; --i) {
sum_back[i] = sum_back[i+1] + a[i];
}
double sum = 0.0; // current sum
int i = 0; // index of the element being examined
double best_sum = sum_back[0] + 1.0;
bool up_down = true;
double eps = 1.0e-10; // error tolerance
long long cpt = 0; // to check the number of iterations
while (true) { // UP
//std::cout << "Start of while loop: i = " << i << "\n";
cpt++;
if (up_down) {
bool abandon = (sum + sum_back[i] < X - eps) || (sum > best_sum);
if (abandon) { //premature abandon
parti[i] = false;
up_down = false;
i--;
continue;
}
parti[i] = true;
sum += a[i];
//std::cout << "UP, i = " << i << " \tsum = " << sum << "\n";
if (fabs(sum - X) < eps) {
best_sum = sum;
parti_opt = parti;
break;
}
if (sum >= X) {
if (sum < best_sum) {
best_sum = sum;
parti_opt = parti;
//std::cout << "i = " << i << " \tbest sum = " << best_sum << "\n";
}
parti[i] = false;
sum -= a[i];
}
if (i == (n-1)) { // leaf
up_down = false;
i--;
continue;
}
i++;
} else { // DOWN
if (i < 0) break;
if (parti[i]) {
sum -= a[i];
parti[i] = false;
i++;
up_down = true;
} else {
i--;
up_down = false;
}
}
}
std::vector<double> answer;
for (int i = 0; i < n; ++i) {
if (parti_opt[i]) answer.push_back (a[i]);
}
std::cout << "number of iterations = " << cpt << " for n = " << n << "\n";
return std::make_tuple (best_sum, answer);
}
int main () {
//std::vector<double> a = {5, 6, 2, 10, 2, 3, 4, 13, 17, 38, 42};
double X = 33.5;
srand (time(NULL));
int n = 100;
double vmax = 100;
X = vmax * n / 4;
std::vector<double> a (n);
for (int i = 0; i < n; ++i) {
a[i] = vmax * double(rand())/RAND_MAX;
}
double sum;
std::vector<double> y;
std::tie (sum, y) = min_sum_greater (a, X);
std::cout << std::setprecision(15) << "sum = " << sum << "\n";
if (n < 20) {
std::cout << "set: ";
for (auto val: y) {
std::cout << val << " ";
}
std::cout << "\n";
}
}
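Because the question is about positive integers, a different technique from the backtracking above may also be worth a look: the classic subset-sum dynamic programme over reachable sums. Its running time is proportional to n times the total of all elements, which is pseudo-polynomial (it grows with the numeric values, not only with the number of elements), so it does not by itself answer the polynomial-time question. The function name, the signature and the -1 return value below are assumptions made for this sketch.

```cpp
#include <numeric>
#include <vector>

// Smallest subset sum that is >= X, or -1 if even the whole set falls short.
// Assumes every value is a positive integer.
long long minSubsetSumAtLeast(const std::vector<int>& values, long long X)
{
    const long long total = std::accumulate(values.begin(), values.end(), 0LL);
    if (total < X) {
        return -1;
    }
    // reachable[s] is 1 when some subset sums to exactly s.
    std::vector<char> reachable(total + 1, 0);
    reachable[0] = 1; // the empty subset
    for (int v : values) {
        // Go downwards so that each element is used at most once.
        for (long long s = total; s >= v; --s) {
            if (reachable[s - v]) {
                reachable[s] = 1;
            }
        }
    }
    for (long long s = (X > 0 ? X : 0); s <= total; ++s) {
        if (reachable[s]) {
            return s;
        }
    }
    return -1;
}
```

For the set {5, 8, 12} and X = 14 the possible sums are 0, 5, 8, 12, 13, 17, 20 and 25, so the function returns 17.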

Comparison values in FOR loop are not associating to correct vector elements when loop completes. - Code Review

I am having two issues I believe. They have to do with i) initializing the vectors and ii) getting the smallest and greatest values to correspond to the correct values in the vector.
I've tried a couple of things:
When I initialize smallest and greatest I always run into the problem that both values begin with 0 as their initial value. I attempted to manually assign a value to initialize both variables (smallest = distance[i]), but that didn't really work because my vectors are empty initially.
#include "pch.h"
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <cmath>
using std::cout;
using std::cin;
using std::vector;
using std::string;
int main()
{
// Read a sequence of double values into a vector
vector <double> distance = {}; // declaring the vector named "distance"
double sum = 0;
double smallest;
double greatest;
for (double x = 0; cin >> x;) { // read into distance, to terminate putting values in vector use anything that is not of variable type of vector
distance.push_back(x); // put distance into vector
cout << '\n';
for (int i = 0; i < distance.size(); i = i + 1) { // keeping track of elements in vector by displaying them
cout << distance[i] << '\n';
}
}
for (int i = 0; i < distance.size(); i = i + 1) { // adding up all values of vector by iterating through all elements
sum = sum + distance[i];
}
cout << "The total sum of all the elements in the vecotr is: " << sum <<
'\n';
for (int i = 0; i < distance.size(); i = i + 1) { // determining the smallest value in the vector
if (smallest > distance[i]) {
smallest = distance[i];
}
}
cout << "The smallest value in the vector is: " << smallest << '\n';
for (int i = 0; i < distance.size(); i = i + 1) { // determining the greatest value in the vector
if (greatest < distance[i]) {
greatest = distance[i];
}
}
cout << "The greatest value in the vector is: " << greatest << '\n';
cout << "The mean distance between two neigbouring cities is: " << sum / distance.size() << '\n';
}
What suggestions/ hints could somebody provide me to work through this?
You can simply assign a proper value after filling the vector.
vector <double> distance = {}; // declaring the vector named "distance"
double sum = 0;
double smallest;
double greatest;
for (double x = 0; cin >> x;) { // read into distance, to terminate putting values in vector use anything that is not of variable type of vector
distance.push_back(x); // put distance into vector
cout << '\n';
for (int i = 0; i < distance.size(); i = i + 1) { // keeping track of elements in vector by displaying them
cout << distance[i] << '\n';
}
}
if(!distance.empty()){ // if distance vector is not empty
smallest = distance[0];
greatest = distance[0];
}
Another common technique is to initialize smallest and greatest with values representing +infinity and -infinity respectively.
This way, after the first iteration of the loop, each variable will in fact hold distance[0].
#include <limits>
(...)
double smallest = std::numeric_limits<double>::max();
double greatest = std::numeric_limits<double>::lowest(); // not min(): for double, min() is the smallest positive value
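A third option, sketched here as an alternative to the hand-written loops, is to let the standard library find both values in one pass with std::minmax_element. The function name smallestAndGreatest is an assumption made for illustration, and the sketch assumes the vector is not empty.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns {smallest, greatest}. Assumes `values` is not empty;
// dereferencing the iterators of an empty range would be undefined.
std::pair<double, double> smallestAndGreatest(const std::vector<double>& values)
{
    auto range = std::minmax_element(values.begin(), values.end());
    return { *range.first, *range.second };
}
```

This removes the need for any initial value, so the question of what to start smallest and greatest with does not arise.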

Having trouble with a for loop for school

I'm currently taking a C/C++ programming class at my school. I am tasked with writing a piece of code that will ask the user how many numbers they would like averaged, then averages them. The program has to contain a for loop. The problem that I am having is that after the user has entered the "n" variable, if they type a character such as "a", the program will immediately spit out an answer as my average. I would like to find a way to prevent the user from entering characters so that my for loop can finish running and average the numbers properly. Here is my code:
{
int n, i = 1, x = 1;
double sum = 0, average, value;
cout << "\nHow many numbers do you want to average?: ";
cin >> n;
while (n < 1)
{
cout << "\nYou have entered an invalid number.\n";
cout << "\nHow many numbers do you want to average?: ";
cin.clear();
while (cin.get() != '\n');
cin >> n;
}
for (n; i <= n; i++)
{
cout << "\nEnter value: ";
cin >> value;
sum = sum + value;
}
average = sum / n;
cout << "\nThe average is: " << average << endl;
system("pause");
return 0;
}

knapsack program, slow running time

I modified some code from the internet to meet my requirements, but unfortunately the program seems to run a bit slowly. I'm not sure whether it's my computer or the program itself that is slow.
int max(int a, int b)
{
return (a > b) ? a : b;
}
int knapSack(int W, int wt[], int val[], int n)
{
if (n == 0 || W == 0)
return 0;
if (wt[n - 1] > W)
return knapSack(W, wt, val, n - 1);
else
return max(val[n - 1] + knapSack(W - wt[n - 1], wt, val, n - 1),
knapSack(W, wt, val, n - 1));
}
int main()
{
char exeAgain='n';
do
{
cout << "Enter the number of items in a Knapsack : ";
int n, W;
cin >> n;
int val[n], wt[n];
for (int i = 0; i < n; i++)
{
val[i]=(rand()%100)+1;
wt[i]=(rand()%100)+1;
cout << "Item Number "<< i+1 << " value "<<val[i]<<" weight " << wt[i] << endl;
}
cout << "Enter the capacity of knapsack : ";
cin >> W;
cout << "The total profit is " << knapSack(W, wt, val, n)<< endl;
cout<<"Would you like to execute this program again? [Y/N] : ";
cin>>exeAgain;
}
while(exeAgain == 'y' || exeAgain == 'Y');
return 0;
}
Is it normal for this program to run slowly? (The input is 200 and 1500.)
Smaller inputs are fast, but when I use big numbers, the program slows down.
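The Knapsack Problem is famously NP-complete, which in simple terms means that no algorithm is known whose worst-case running time stays polynomial in the size of the input. Your recursive knapSack also does more work than necessary: each call can branch into two further calls, so with 200 items it may explore on the order of 2^n combinations in the worst case. So yes, it is completely normal for your program to run slowly at larger inputs.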
If you research around, you'll find a lot of work done on the problem, including various strategies to run faster than a first attempt, or heuristics for a good-but-not-perfect solution, which will run faster than something looking for a perfect solution.
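One of the faster strategies alluded to above is dynamic programming, which stores the best value for every capacity instead of recomputing it in each recursive branch. The sketch below is one way to do it; the function name knapSackDP and the use of std::vector are assumptions made here, and it assumes W >= 0 and non-negative weights.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// 0/1 knapsack with a one-dimensional table.
// best[w] holds the highest total value achievable with capacity w.
int knapSackDP(int W, const std::vector<int>& wt, const std::vector<int>& val)
{
    std::vector<int> best(W + 1, 0);
    for (std::size_t i = 0; i < wt.size(); ++i) {
        // Go downwards so that item i is counted at most once.
        for (int w = W; w >= wt[i]; --w) {
            best[w] = std::max(best[w], best[w - wt[i]] + val[i]);
        }
    }
    return best[W];
}
```

The work is proportional to n * W, so for the input in the question (200 items, capacity 1500) that is at most 300,000 inner steps. The running time still grows with the numeric value of W, which is why this does not contradict the problem being NP-complete.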

Resources