Xcode Server Bots not running tests - xcode

I'm trying, but failing to setup a reliable continuous integration environment using Xcode server.
I have a git repository on a headless mac mini server running the Xcode server service, the server has a separate development user account with administrator privileges that is used by Xcode.
I have setup my schemes, with testing included and shared them to the repository.
The bots run, check out code, build, analyze and archive, but only seems to run tests when it feels like it, which is almost never. I've checked the schemes and they have not changed since Xcode ran the tests and when it didn't.
On first setting them up, tests wouldn't run at all, until I added administrator privileges to the development account, then the tests ran a couple of times, before Xcode server decided to stop running them again.
I don't seem to get any reason why the tests aren't run, sometimes the bots fail to run because of some crash during the setup, and an error is reported, but mostly the bot seems to run, they just don't execute the tests, and no error is reported.
I've logged in remotely to the server, and the simulator is running, but never seems to do anything.
Here's a screenshot of an example bot, you can see the tests used to run, it sees I've reduced my warnings and got rid of an analysis issue. You can also see where no tests run, and no kind of warning or error is given as to why.
I've tried restarting the server, nope.
I've tried restarting the client, nope.
It's really frustrating and can't find any recent issues that offer a proper solution to this. The server is in constant use running backups and other tasks, so I'd rather not have a solution that involves me logging in to the server and restarting something every time there's a problem, which is always, it makes the whole point of bots useless if I'm spending more time logging in to my server trying to get them to work than they are at actually running.
Anyone have similar issues and a solution?
Edit: Noticed that my memory usage was very high on the server, memory pressure was practically always amber, so went out and got some memory today, increased the mac mini's memory from 4GB to 16GB, and now the tests have started running again. Also, the whole process is much faster (less than surprising i guess).
Could it just be low memory causing problems with the simulator? I've only just installed the memory and restarted, so I'll give it a few test runs before I confirm this solution, it's stopped working before...

Seems that this may be a memory issue, I upgraded the servers memory from 4GB to 16GB as my Activity monitor was showing significant memory pressure.
Since doing this, the bots started running tests again, and the total running time for the bot is a quarter that it was.
As per my edit, I've been running the bots for a day now, including bots that run on multiple simulators, and everything seems to be fine.
It's not very good that no obvious indication is given in xcode as to why the tests didn't run.
For reference and to see if this might fix your problems, original server specs were :
Mac Mini Server edition (late 2012)
2.3 GHz Intel Core i7
4GB memory
2x1TB drives
Replaced the 2x2GB memory sticks with 2x8GB sticks (The maximum allowed for the model)
EDIT : After a month of running with no problems, increasing the memory has solved the problem permanently.

Related

MSBuild performance problems after server migration

We are running TeamCity with 20-ish .Net build configurations. All of them uses MSBuild 4.0. We recently moved one of our build agents from a machine running Windows Server 2008 to a new physical server running Windows Server 2012, and after this migration all the builds are taking almost twice as long time to finish, compared to on the old server! The new server is more powerful than the old server, both in terms of CPU, RAM and Disk. We have been running benchmarks on both servers, all of them confirming that the new server should be more capable than the old one.
According to the build logs it seems that all the stages of the build is slower, so it's not just one of the build steps that are slower.
First thing we did was to check the CPU utilization during builds, and it is suspiciously low! Only 3-6% CPU usage on some of the cores. RAM usage was also very low. Could it be some config for the build agent that is slowing down the builds, and that we have overlooked?
Note: The new server is running as a virtual machine. At first we thought that was the reason, but then that should have been reflected on the benchmarks? This is the only virtual machine running on this physical server, and it has almost all HW resources dedicated. Would be interesting to hear if any of you have had similar bad experience with running the build server on a virtual machine. We have also tried booting "natively" from the VHD image, without any difference in the build times.
I know this could be a very tricky one to debug for "external" people, but I was hoping that someone could maybe give some good suggestion on where to look for the issue, as we are kind of stuck right now.
Edit: Tried activating the Performance Monitor tool in TeamCity, and it shows that both CPU, RAM and Disk usage is comfortably running at <10% (Disk access peaked at 35% a couple of times during the build)

Is there a way to cap the file size of slony log shipping files?

I am working with a SuSE machine (cat /etc/issue: SUSE Linux Enterprise Server 11 SP1 (i586)) running Postgresql 8.1.3 and the Slony-I replication system (slon version 1.1.5). We have a working replication setup going between two databases on this server, which is generating log shipping files to be sent to the remote machines we are tasked to maintain. As of this morning, we ran into a problem with this.
For a while now, we've had strange memory problems on this machine - the oom-killer seems to be striking even when there is plenty of free memory left. That has set the stage for our current issue to occur - we ran a massive update on our system last night, while replication was turned off. Now, as things currently stand, we cannot replicate the changes out - slony is attempting to compile all the changes into a single massive log file, and after about half an hour or so of running, it trips over the oom-killer issue, which appears to restart the replication package. Since it is constantly trying to rebuild that same package, it never gets anywhere.
My first question is this: Is there a way to cap the size of Slony log shipping files, so that it writes out no more than 'X' bytes (or K, or Meg, etc.) and after going over that size, closes the current log shipping file and starts a new one? We've been able to hit about four megs in size before the oom-killer hits with fair regularity, so if I could cap it there, I could at least start generating the smaller files and hopefully eventually get through this.
My second question, I guess, is this: Does anyone have a better solution for this issue than the one I'm asking about? It's quite possible I'm getting tunnel vision looking at the problem, and all I really need is -a- solution, not necessarily -my- solution.

Autoconf on Windows 7 dreadfully slow

I am working on a project using Google's cmockery unit testing framework. For a while, I was able to build the cmockery project with no problems. e.g. "./configure", "make && make install" etc. and it took a reasonable amount of time (1-2 minutes or so.) After working on other miscellaneous tasks on the computer and going back to re-build it, it becomes horrendously slow. (e.g. after fifteen minutes it is still checking system variables.)
I did a system restore to earlier in the day and it goes back to working properly for a time. I have been very careful about monitoring any changes I make to the system, and have not been able to find any direct correlation between something I am changing and the problem. However, the problem inevitably recurs (usually as soon as I assume I must have accidentally avoided the problem and move on). The only way I am able to fix it is to do a system restore to a time when it was working. (Sometimes restarting the machine works as well, sometimes it does not.)
I imagine that the problem is between the environment and autoconf itself rather than something specific in cmockery's configuration. Any ideas?
I am using MinGW and under Windows 7 Professional
Make sure that antivirus software is not interfering. Often, antivirus programs monitor every file access; autoconf accesses many files during its operation and is likely to be slowed down drastically.

Significant Performance Decrease when moving from Windows Server 2003 to 2008 (IIS 6 to IIS 7)

Our ASP.Net 2.0 web app was running happily along on Windows Server 2003. We were starting to see some of the limits of the environment approaching, such as memory and CPU usage spikes, and as we're getting ready to scale we decided it was time for a larger server with higher availability.
We decided to move to Windows Server 2008 to take advantage of IIS 7's shared configuration. In our development and integration environments, we reproduced the OS and app in 2008/IIS 7 and everything seemed fine. But truth be told, don't have a good way of simulating production-like loads as of yet, nor can we reproduce our prod environment accurately (we're small with limited resources). So once we rolled out to production, we were surprised to find performance significantly worse on 2008 than it was on 2003.
We've also moved from a 32-bit environment to 64-bit in the process, and we've also incorporated ASP.Net 3.5 dll's into the project.
Memory usage is through the roof, but I'm not as worried about that. We believe in part this is because of the overhead with Server 2008's memory, so throwing more RAM at it may solve that issue. The troubling thing is we're seeing processor spikes to 99% CPU Utilization, which we've never seen before in the 2003/IIS 6 environment.
Has anyone encountered these issues before and are there any suggestions for a solution/places to look? Right now we're doing the following:
1) Buying time by adding memory.
2) Buying time by setting app pool limits: shut down w3wp.exe when CPU hits 99% load. Since you don't have the option to recycle the app pools, I have a scheduled task running that recycles any stopped app pools.
3) Profiling the app pools under Classic and Integrated modes to see which may perform better.
Any other ideas are completely welcome.
Our experiance is that code runs much faster on a 64bit windows 2008 than on a 32bit windows 2003 server.
I am wondering if something else is also running on the machine. For example is SQL Server installed with a maintainence plan that could cause the CPU spike.
I would check the following:
Which process is using the CPU?
Is there a change in the code? Try installing the new code on the old machine
Is it something to do with the compile options? Is the CPU usage a recompile?
Are there any errors in the event log?
In our cases, since we have 4 processors, we then increased the "number of worker process to 4" currently working well so far as compare before.
here a snapshot:
http://pic.gd/c3661a
You can use the application pool "Recycle" option in IIS7+ to configure physical and virtual memory limits for application pools. Once these are reached the process will recycle and the resources will be released. Unfortunately the option to recycle based on CUP usage has been removed from IIS7+ (some one correct me if I'm wrong). If you have other apps on the server and want to avoid them competing for resources when this condition happens you can implement Windows System Resources Manager and it's IIS policy (here is a good tutorial http://learn.iis.net/page.aspx/449/using-wsrm-to-manage-iis-70-apppool-cpu-utilization/)
Note SRWM is only available on Enterprise and Data Center editions.

Terminating intermittently

Has anyone had and solved a problem where programs would terminate without any indication of why? I encounter this problem about every 6 months and I can get it to stop by having me (the administrator) log-in then out of the machine. After this things are back to normal for the next 6 months. I've seen this on Windows XP and Windows 2000 machines.
I've looked in the Event Viewer and monitored API calls and I cannot see anything out of the ordinary.
UPDATE: On the Windows 2000 machine, Visual Basic 6 would terminate when loading a project. On the Windows XP machine, IIS stopped working until I logged in then out.
UPDATE: Restarting the machine doesn't work.
Perhaps it's not solved by you logging in, but by the user logging out. It could be a memory leak and logging out closes the process, causing windows to reclaim the memory. I assume programs indicated multiple applications, so it could be a shared dll that's causing the problem. Is there any kind of similarities in the programs? .Net, VB6, Office, and so on, or is it everything on the computer? You may be able to narrow it down to shared libraries.
During the 6 month "no error" time frame, is the system always on and logged in? If that's the case, you may suggest the user periodically reboot, perhaps once a week, in order to reclaim leaked memory, or memory claimed by hanging programs that didn't close properly.
You need to take this issue to the software developer.
The more details you provide the more likely it will be that you will get an answer: explain what exact program was 'terminating'. A termination is usually caused by an internal unhandled error, and not all programs check for them, and log them before quitting. However I think you can install Dr Watson, and it will give you at least a stack trace when a crash happens.

Resources