Detect current CPU Clock Speed Programmatically on OS X? - macos

I just bought a nifty MBA 13" Core i7. I'm told the CPU speed varies automatically, and pretty wildly, too. I'd really like to be able to monitor this with a simple app.
Are there any Cocoa or C calls to find the current clock speed, without actually affecting it?
Edit: I'm OK with answers using Terminal calls, as well as programmatic.
Thanks!

Try this tool called "Intel Power Gadget". It displays IA frequency and IA power in real time.
http://software.intel.com/sites/default/files/article/184535/intel-power-gadget-2.zip

You can query the CPU speed easily via sysctl, either by command line:
sysctl hw.cpufrequency
Or via C:
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysctl.h>
int main(void) {
    int mib[2] = { CTL_HW, HW_CPU_FREQ };
    unsigned int freq = 0;
    size_t len = sizeof(freq);
    /* sysctl returns -1 on failure, e.g. if the key is unavailable */
    if (sysctl(mib, 2, &freq, &len, NULL, 0) == -1) {
        perror("sysctl");
        return 1;
    }
    printf("%u\n", freq);
    return 0;
}

Since it's an Intel processor, you could always use RDTSC. That's an assembly instruction that returns the current value of the time-stamp counter, a 64-bit counter that increments every cycle. It'd be a little approximate, but e.g.
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
uint64_t rdtsc(void)
{
uint32_t ret0[2];
__asm__ __volatile__("rdtsc" : "=a"(ret0[0]), "=d"(ret0[1]));
return ((uint64_t)ret0[1] << 32) | ret0[0];
}
int main(int argc, const char * argv[])
{
uint64_t startCount = rdtsc();
sleep(1);
uint64_t endCount = rdtsc();
printf("Clocks per second: %llu\n", (unsigned long long)(endCount - startCount));
return 0;
}
This prints 'Clocks per second: 2002120630' on my 2 GHz MacBook Pro.

There is a kernel extension written by "flAked" which logs the cpu p-state to the kernel log.
http://www.insanelymac.com/forum/index.php?showtopic=258612
Maybe you could contact him for the code.

This seems to work correctly on OS X.
However, it doesn't work on Linux, where sysctl is deprecated and KERN_CLOCKRATE is undefined.
#include <assert.h>
#include <sys/sysctl.h>
#include <sys/time.h>
int mib[2];
size_t len;
mib[0] = CTL_KERN;
mib[1] = KERN_CLOCKRATE;
struct clockinfo clockinfo;
len = sizeof(clockinfo);
int result = sysctl(mib, 2, &clockinfo, &len, NULL, 0);
assert(result != -1);
log_trace("clockinfo.hz: %d\n", clockinfo.hz);
log_trace("clockinfo.tick: %d\n", clockinfo.tick);

Related

__seg_fs on GCC. Is it possible to emulate it just in a program?

I've just read about support for %fs and %gs segment prefixes on the Intel platforms in GCC.
It was mentioned that "The way you obtain %gs-based pointers, or control the
value of %gs itself, is out of the scope of gcc;"
I'm looking for a way to set the value of %fs manually (I'm on IA32, RH Linux) and work with it. When I just set %fs=%ds, the test below works fine, as expected. But I cannot change the test to use another value of %fs without getting a segmentation fault. I'm starting to think that changing the value of %fs is not the only thing I need to do. So I'm looking for advice on how to make %fs address a region of memory that is not the same as DS.
#include <stddef.h>
#include <stdio.h>
typedef char __seg_fs fs_ptr;
fs_ptr p[] = {'h','e','l','l','o','\0'};
void fs_puts(fs_ptr *s)
{
char buf[100];
buf[0] = s[0];
buf[1] = s[1];
buf[2] = s[2];
buf[3] = '\0';
puts(buf);
}
void __attribute__((constructor)) set_fs()
{
__asm__("mov %ds, %bx\n\t"
"add $0, %bx\n\t" //<---- if fs=ds then the program executes as expected. If not $0 here, then segmentation fault happens.
"mov %bx, %fs\n\t");
}
int main()
{
fs_puts(p);
return 0;
}
I've talked with Armin who implemented __seg_gs/__seg_fs in GCC (Thanks Armin!).
So basically I cannot use these keywords for globals. The aim of introducing __seg_gs/__seg_fs was to make it possible to dynamically allocate regions of memory that are thread-local.
With __thread we cannot declare a pointer and then allocate thread-local memory for it using malloc, but __seg_gs/__seg_fs make that possible.
The test below roughly illustrates that.
Note that arch_prctl() is used. It exists only on 64-bit.
Also note that %fs is used for __thread on 64-bit, so %gs is free.
#include <stddef.h>
#include <string.h>
#include <stdio.h>
#include <asm/ldt.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <asm/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>
typedef __seg_gs char gs_str;
void gs_puts(gs_str *ptr)
{
int i;
char buf[101];
for(i = 0; i < 100; i++)
buf[i] = ptr[i];
buf[100] = '\0'; /* the %gs buffer is not NUL-terminated */
puts(buf);
}
int main()
{
int i;
void *buffer = malloc(100 * sizeof(char));
syscall(SYS_arch_prctl, ARCH_SET_GS, buffer); /* glibc declares no arch_prctl() wrapper */
gs_str *gsobj = (gs_str *)0;
for (i = 0; i < 100; i++)
gsobj[i] = 'a'; /* in the %gs space */
gs_puts(gsobj);
return 0;
}

How to use copy_to_user

I'm trying to add a custom system call into the linux kernel. Here is a simple code:
#include <linux/mysyscall.h>
#include <linux/kernel.h>
#include <asm/uaccess.h>
#include <asm/system.h>
asmlinkage int sys_mysyscall(int *data){
int a = 3;
cli();
copy_to_user(data, &a, 1);
sti();
printk(KERN_EMERG "Called with %d\n", a);
return a;
}
I can compile a kernel with mysyscall added and when I try to access it with a user program like:
#include <linux/mysyscall.h>
int main(void){
int *data;
int r;
int a = 0;
data = &a;
r = mysyscall(data);
printf("r is %d and data is %d", r, *data);
}
*data is not 3; it is 0.
How should I use copy_to_user to fix this?
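The copy_to_user line copies only one byte of 'a'. On a big-endian system that first byte is 0 (on a little-endian system it would be the low byte, 3). Copy all 4 bytes, i.e. copy_to_user(data, &a, sizeof(a)), to get the correct result, and check the return value, which is the number of bytes that could not be copied.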

Implement a random-number generator using only getpid() and gettimeofday()?

I am using gcc to implement a random-number generator using only getpid() and gettimeofday(). Here is my code:
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>
#include <time.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
struct timeval tv;
int count;
int i;
int INPUT_MAX =10;
int NO_OF_SAMPLES =10;
gettimeofday(&tv, NULL);
printf("Enter Max: \n");
scanf("%d", &INPUT_MAX);
printf("Enter No. of samples needed: \n");
scanf("%d", &NO_OF_SAMPLES);
/*printf("%ld\n",tv.tv_usec);
printf("PID :%d\n", getpid());*/
for (count = 0; count< NO_OF_SAMPLES; count++) {
printf("%ld\n", (getpid() * tv.tv_usec) % INPUT_MAX + 1);
for (i = 0; i < 1000000; ++i)
{
/* code */
}
}
return 0;
}
I added an inner for loop as a delay, but the result I get is always the same number, like this:
./a.out
Enter Max:
10
Enter No. of samples needed:
10
1
1
1
1
1
1
1
1
1
1
Please tell me what I am doing wrong.
getpid() is constant during the program's execution, and gettimeofday() is called only once, before the loop, so you get constant values, too.
But even if you call gettimeofday() inside the loop, this likely won't help:
gcc will likely optimize away your delay loop.
Even if it's not optimized away, the delays will be very similar and your values won't be very random.
I'd suggest you look up "linear congruential generator", for a simple way to generate more random numbers.
Put gettimeofday() in the loop. Also note that if getpid() is divisible by INPUT_MAX, the product is too, so you will always get 1. Instead you can add getpid() to tv.tv_usec (although that does not make much sense either).

how to increase memory limit in Visual Studio C++

I need help. I'm stuck on a problem when running C++ code on Windows with Visual Studio.
When I run the code on Linux, there is no restriction on the memory I can allocate dynamically (up to the RAM available).
But with the VS compiler, I cannot create an array beyond a limited size.
I've tried the /F option and 20-25 Google links about increasing the memory size, but they don't seem to help much.
I am currently able to allocate only around 100 MB out of the 3 GB available.
If there is a solution for this in Windows rather than in Visual Studio's compiler, I will be glad to hear it, as I have a CUDA Tesla C2070 card which is proving to be pretty useless on Windows, where I want to run my CUDA/C++ code.
Here's my code. It fails with a stack overflow exception when LENGTH > 128 (LENGTH is the number of images; they are 640x480 PNGs, less than 0.5 MB each). I've also estimated the memory it takes by counting the data structures and types used by OpenCV and by me, and it is still much less than 2 GB. The same happens with dynamic allocation. I've already maximized the heap and stack sizes.
#include "stdafx.h"
#include <cv.h>
#include <cxcore.h>
#include <highgui.h>
#include <cuda.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#define LENGTH 100
#define SIZE1 640
#define SIZE2 480
#include <iostream>
using namespace std;
__global__ void square_array(double *img1_d, long N)
{
int idx = blockIdx.x * blockDim.x + threadIdx.x;
if (idx < N) // the last block may extend past the end of the array
img1_d[idx]= 255.0-img1_d[idx];
}
int _tmain(int argc, _TCHAR* argv[])
{
IplImage *img1[LENGTH];
// Open the file.
for(int i=0;i<LENGTH;i++)
{ img1[i] = cvLoadImage("abstract3.jpg");}
CvMat *mat1[LENGTH];
for(int i=0;i<LENGTH;i++)
{
mat1[i] = cvCreateMat(img1[i]->height,img1[i]->width,CV_32FC3 );
cvConvert( img1[i], mat1[i] );
}
double a[LENGTH][2*SIZE1][SIZE2][3];
for(int m=0;m<LENGTH;m++)
{
for(int i=0;i<SIZE1;i++)
{
for(int j=0;j<SIZE2;j++)
{
CvScalar scal = cvGet2D( mat1[m],j,i);
a[m][i][j][0] = scal.val[0];
a[m][i][j][1] = scal.val[1];
a[m][i][j][2] = scal.val[2];
a[m][i+SIZE1][j][0] = scal.val[0];
a[m][i+SIZE1][j][1] = scal.val[1];
a[m][i+SIZE1][j][2] = scal.val[2];
}
} }
//cuda
double *a_d;
int N=LENGTH*2*SIZE1*SIZE2*3;
cudaMalloc((void **) &a_d, N*sizeof(double));
cudaMemcpy(a_d, a, N*sizeof(double), cudaMemcpyHostToDevice);
int block_size = 370;
int n_blocks = N/block_size + (N%block_size == 0 ? 0:1);
cout<<n_blocks<<block_size;
square_array <<< n_blocks, block_size >>> (a_d, N);
cudaMemcpy(a, a_d, N*sizeof(double), cudaMemcpyDeviceToHost);
//cuda end
char name[]= "Image: 00000";
name[12]='\0';
int x=0,y=0;
for(int m=0;m<LENGTH;m++)
{
for (int i = 0; i < img1[m]->width*img1[m]->height*3; i+=3)
{
img1[m]->imageData[i]= a[m][x][y][0];
img1[m]->imageData[i+1]= a[m][x][y][1];
img1[m]->imageData[i+2]= a[m][x][y][2];
if(x==SIZE1)
{
x=0;
y++;
}
x++;
}
switch(name[11])
{
case '9': switch(name[10])
{
case '9':
switch(name[9])
{
case '9': name[11]='0';name[10]='0';name[9]='0';name[8]++;
break;
default : name[11]='0';
name[10]='0';
name[9]++;
}break;
default : name[11]='0'; name[10]++;break;
}
break;
default : name[11]++;break;
}
// Display the image.
cvNamedWindow(name, CV_WINDOW_AUTOSIZE);
cvShowImage(name,img1[m]);
//cvSaveImage(name ,img1);
}
// Wait for the user to press a key in the GUI window.
cvWaitKey(0);
// Free the resources.
//cvDestroyWindow(x);
//cvReleaseImage(&img1);
//cvDestroyWindow("Image:");
//cvReleaseImage(&img2);
return 0;
}
The problem is that you are allocating a huge multidimensional array on the stack in your main function (double a[..][..][..]). Do not allocate this much memory on the stack. Use malloc/new to allocate on the heap.

Page faults on OS X when reading with MMAP

I am trying to benchmark file system I/O on Mac OS X using mmap.
#include <unistd.h>
#include <fcntl.h>
#include <dirent.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <stdio.h>
#include <math.h>
char c;
int main(int argc, char ** argv)
{
if (argc != 2)
{
printf("no files\n");
exit(1);
}
int fd = open(argv[1], O_RDONLY);
fcntl(fd, F_NOCACHE, 1);
int offset=0;
int size=0x100000;
int pagesize = getpagesize();
struct stat stats;
fstat(fd, &stats);
int filesize = stats.st_size;
printf("%d byte pages\n", pagesize);
printf("file %s # %d bytes\n", argv[1], filesize);
while(offset < filesize)
{
if(offset + size > filesize)
{
int pages = ceil((filesize-offset)/(double)pagesize);
size = pages*pagesize;
}
printf("mapping offset %x with size %x\n", offset, size);
void * mem = mmap(0, size, PROT_READ, 0, fd, offset);
if(mem == -1)
return 0;
offset+=size;
int i=0;
for(; i<size; i+=pagesize)
{
c = *((char *)mem+i);
}
munmap(mem, size);
}
return 0;
}
The idea is that I'll map a file or portion of it and then cause a page fault by dereferencing it. I am slowly losing my sanity since this doesn't at all work and I've done similar things on Linux before.
Change this line
void * mem = mmap(0, size, PROT_READ, 0, fd, offset);
to
void * mem = mmap(0, size, PROT_READ, MAP_PRIVATE, fd, offset);
And, don't compare mem with -1. Use this instead:
if(mem == MAP_FAILED) { ... }
It's both more readable and more portable.
General advice: if you're on a different UNIX platform from what you're used to, it's a good idea to open the man page. For mmap on OS X, it can be found here. It says
Conforming applications must specify either MAP_PRIVATE or MAP_SHARED.
So, specifying 0 as the fourth argument is not OK on OS X. I believe this is true for BSD in general.

Resources