Propel and Persistent Connections

I'm having issues with a large number of concurrent connections to an Amazon RDS database, using Propel as the ORM with PHP. The application runs fine during load testing with 20 to 50 connections open at a time, then seems to hit a wall: connections mushroom up to the maximum almost immediately, and everything dies.
I believe Propel is using mysql_pconnect, but I can't find where it designates that, or a simple way to turn it off. I may be chasing a red herring here, but I'm stumped, and there are enough comments on the net about pconnect causing problems with too many connections that I thought it would be worth a shot to remove it.
Anyone know how to do this? I have been searching using various phrases but can't seem to find anything.
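For what it's worth, in the Propel versions that sit on PDO (1.3 and later, if I remember right), persistence is opt-in: a connection is only persistent if PDO::ATTR_PERSISTENT is set to true, either in code or (I believe) through the <options> block of the datasource in runtime-conf.xml. A minimal sketch to verify outside the ORM, with placeholder host and credentials:

    <?php
    // Placeholder DSN and credentials -- substitute your RDS endpoint.
    $dsn = 'mysql:host=mydb.example.rds.amazonaws.com;dbname=app';

    // In PDO, persistent connections are opt-in: unless something passes
    // PDO::ATTR_PERSISTENT => true, the connection is closed at the end
    // of the request.
    $pdo = new PDO($dsn, 'app_user', 'secret', array(
        PDO::ATTR_PERSISTENT => false, // the default, stated explicitly
    ));

    // Confirm what the driver actually negotiated.
    var_dump($pdo->getAttribute(PDO::ATTR_PERSISTENT)); // bool(false)

If that reports false and your connection count still explodes, persistence probably isn't the culprit.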

As it turns out, the error was being caused by the RDS redo log, which is the same size regardless of RDS instance size. On the larger instance sizes, it's possible to fill the redo log and wrap back around to the beginning before the data has been written out to the database. At that point InnoDB does the 'furiously flushing' thing to get caught up, does not process any new requests, and they pile up like crazy, which eventually caused our app to crash. More, smaller RDS servers fixed the issue, though I'm not very happy with Amazon over this: they need to make the redo log size configurable.
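For anyone who hits the same wall: the pressure shows up as checkpoint age, i.e. how far the redo log has run ahead of the flushed data files. A rough way to watch it (placeholder host and credentials; the LSN lines are printed slightly differently across MySQL versions, so treat the regexes as a sketch):

    <?php
    // Placeholder host and credentials. Parses the LOG section of
    // SHOW ENGINE INNODB STATUS; very old MySQL versions print the LSN
    // as two numbers, so adjust the regexes accordingly.
    $pdo = new PDO('mysql:host=mydb.example.rds.amazonaws.com', 'admin', 'secret');
    $innodb = $pdo->query('SHOW ENGINE INNODB STATUS')->fetch(PDO::FETCH_ASSOC);

    preg_match('/Log sequence number\s+(\d+)/', $innodb['Status'], $lsn);
    preg_match('/Last checkpoint at\s+(\d+)/', $innodb['Status'], $ckpt);

    // Checkpoint age: bytes of redo written but not yet flushed to the
    // data files. As this approaches the total redo log capacity, InnoDB
    // starts the "furious flushing" described above and stalls writes.
    echo 'checkpoint age: ', $lsn[1] - $ckpt[1], " bytes\n";

Graphing that next to write throughput makes the wall visible well before the furious flushing kicks in.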

Related

Why are my ADODB queries not being persisted to the SQL server?

My VB6 program uses ADODB to do a lot of CRUD against SQL Server 2000.
Sometimes the network connection between the remote clients and the data center somehow "drops", making it impossible to establish new connections (so users launching the program can't use it).
The issue is the following:
Anyone who is using the program at the moment of the "drop" can continue using it with no issues whatsoever: they can perform every operation, update data, read data, and everything seems to work normally.
User then proceeds to fire up a "sum-up" report which lists everything that was done (before or after the "drop").
If we then check the database, none of the data from after the network drop is there. The user goes back into the program and everything is as it was before the network drop.
It seems like all queries were somehow performed in-memory? I'm at a loss about how to even approach the issue (I'm familiar enough with VB6 to work with the source code, but I don't know a lot about ADODB).
I haven't yet tried to replicate the behavior due to the customer's limited availability (the development environment is housed in their offices); I'll try starting up the program from the IDE and then ripping out the network cable.
Provided I can replicate the issue, how do I fix this? Is there some setting I'm not aware of?
On a side note: the issue is sporadic (it happened a handful of times during the last year, and the software is used heavily, on a daily basis, by multiple concurrent users).
After reading up on Disconnected Recordsets, it seems that's what's behind this odd behavior I'm experiencing.
This is not something that can be simply "turned off".

Amazon EC2 instance slow

The instance started to fail on MySQL... MySQL stops out of nowhere. I restarted it and it worked again.
But now it has gotten worse. The transfer rate when I download a file from another instance starts at 900 kbps and keeps going down to 20 kbps. The same happens for external downloads.
I also tested a zip job, zipping a big file... it starts quickly, then slows down and settles at a rate of 10 files zipped per second, which is too slow (other instances get 1000 files per second).
I also can't access the hosted websites through HTTP because it's too slow.
I have already rebooted and done a stop/start. I also made an image and rebuilt it as a new instance, and the problem continues. I also changed the volume used by the instance, and it is still slow.
What should I do?
I'll take a stab (based on the lack of details) that your instance is just too small. It sounds like you are running at least a web server and MySQL on it, and I'll also bet that you are trying to use a micro instance, which have notoriously variable performance and really aren't suitable for running a web server and database server with consistent performance (maybe OK for development, but not production, IMO).
Try just spinning up a larger instance and see if your problems magically go away; you can test this for a few dollars, and if it does solve the problem, you can decide whether to upgrade the instance size permanently.

Node.js suddenly getting extremely slow

We have this architecture with 2 node processes.
One polls a private API and pushes any changes to the second node.
The second node processes the data, calls a bunch of other APIs, and eventually emits a change event to the client, an HTML5 website, with socket.io.
This second node will always process the data and will always emit changes, even if no clients are connected, so in my opinion CPU and memory usage are not greatly affected by the number of connected clients. Also note that this architecture is still running on a private staging environment.
Everything ran fine and we were ready to go live, until we noticed that after a couple of days, maybe a week, the second node suddenly gets extremely slow while the first node is still fine.
It gets so bad that even the connection between the two nodes times out, and they are on the same machine talking over localhost. It also takes more than 10 seconds to browse to the socket.io/socket.io.js file.
I know it's very hard to understand the problem without seeing any code, but I'm kind of pulling my hair out because we have to go live in a couple of days, my logs are not revealing anything, and Google isn't helping either.
Have you ever experienced anything like this? What was the problem and how did you fix it?
What's a good monitor and profiler for node.js (preferably free)?
What are good practices for building a node.js app that makes a lot of outgoing API calls?
Anything or anyone that could help me in the right direction of solving or even discovering the actual problem will be greatly appreciated!
Thank you!
Never experienced anything like this, but maybe the second node is blocking the event loop by doing CPU-intensive work or waiting synchronously for some resource.
Add some logging to your code to see how much time the second node takes to process each change pushed by the first node. Maybe some type of change consumes the CPU for 10 seconds or so to complete.
You should also start monitoring memory, CPU, and network connections. When things slow down, your monitoring will provide some clue as to where the bottleneck is.
For monitoring, you can try the following three tools:
nodetime
hummingbird
node-monitor
Also read http://nodetime.com/blog/monitoring-nodejs-application-performance
It sounds like you have a memory leak somewhere in the second node, maybe from calling too many anonymous functions, etc. Do you notice your RAM usage slowly creeping up as it runs?

Causes of high network latency

I have a site that is moving incredibly slowly right now. Both Safari's inspector and Firebug are reporting that most of the load time is due to latency. The actual download is happening in less than a second. There's a lot of database activity in play (though the metrics on that indicate that it's pretty healthy), but what else can cause really high latency? Is it a purely network thing or are there changes I can make to the app to improve the latency numbers?
I'm using YSlow to help identify performance improvements, but on the whole, I don't see it reporting anything that seems crazy unreasonable. Opportunities for improvement, certainly, but nothing that seems like it would cause the huge load times I'm seeing.
Thanks.
UPDATE
Some background and metrics, in case they're useful. This is a CakePHP application, and I'm using my UsersController::login action as the benchmark. To identify how much of a factor the application code plays in this, I've printed a stack trace immediately upon entering UsersController::beforeFilter(). Here's the output:
UsersController::beforeFilter() - APP/controllers/users_controller.php, line 13
Controller::startupProcess() - CORE/cake/libs/controller/controller.php, line 522
Dispatcher::_invoke() - CORE/cake/dispatcher.php, line 187
Dispatcher::dispatch() - CORE/cake/dispatcher.php, line 171
[main] - APP/webroot/index.php, line 83
Load times, as shown by Safari's inspector, range from 11.2 seconds to 52.2 seconds. This would seem to point me away from the application code and toward something with my host, but maybe I'm completely misinterpreting or oversimplifying this?
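One cheap way to separate application time from everything else is to have the app time itself. Below is a minimal self-timing sketch in the CakePHP 1.x style of the trace above; the framework callbacks are real, but the log format is my own invention. If PHP's elapsed time comes back in milliseconds while the browser reports 11-52 seconds, the time is being lost outside your code (DNS, host, network):

    <?php
    // Minimal self-timing sketch, CakePHP 1.x style. The beforeFilter/
    // afterFilter callbacks match the framework; the log message is
    // invented for illustration.
    class UsersController extends AppController {
        var $requestStart;

        function beforeFilter() {
            $this->requestStart = microtime(true);
        }

        function afterFilter() {
            // Wall-clock time spent inside PHP for this request.
            $elapsed = microtime(true) - $this->requestStart;
            error_log(sprintf('%s took %.3fs in PHP', $this->action, $elapsed));
        }
    }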
If you cannot directly identify a slow-moving component of your application, there are a number of other steps along the way that can certainly slow your site down. Whenever I'm experiencing unusually long waits, I typically start by looking at the local DNS and then move on to my hosted DNS. Sometimes a cache refresh (on their part, not yours) can cause a lot of slow lookups until their database has caught up.
Otherwise, they might actually have a service outage and your requests are being routed to their secondary or backup server. If everything seems fine in terms of domain resolution, your hosting provider might be experiencing a service outage, which can take a number of different shapes: serving static content from their backups, or over-allocating shared resources until everything is running as it should. You can experience a ton of what they call throttling on shared cloud architectures when they have a box go down. On the plus side, you don't have a total outage in that circumstance.
One time, and this was just in a shared grid configuration, I had a processor go to hell. The bizarre part was that static content was still being served from a backup, but it was still polling against our database (which was on a different server) and causing our account to throttle because of over-allocation on the backup. It wasn't our fault, but the host started sending nasty emails about our excessive long polls. The moral of the story is: if it's not your application, and it came out of the blue, I'll bet that somewhere along the line you'll find a hardware failure or misconfiguration.
Also, now that I think of it, if you are syndicating some outside content (be it server- or browser-side), the problem might not be in your chain of responsibility at all. If you are serving ads, for example, from a subscriber service, they might be having a high-load period or an outage. These are just the steps I would take to narrow down the culprit.
This probably won't be the solution for you, but when I had a dog-slow Safari (and FF too), I simply changed my DNS servers to OpenDNS (208.67.222.222, 208.67.220.220) and all my problems were resolved.

Strange performance using JPA, am I missing something?

We have a JPA -> Hibernate -> Oracle setup, where we are only able to crank up to 22 transactions per second (two reads and one write per transaction). The CPU, disk, and network are not bottlenecking.
Is there something I am missing? I wonder if there could be some sort of Oracle-imposed limit that the DBAs have applied?
Network is not the problem: when I do raw reads on the table, I can do 2000 reads per second. The problem is clearly the writes.
CPU is not the problem on the app server; the CPU is basically idling.
Disk is not the problem on the app server; the data is completely loaded into memory before the processing starts.
Might be worth comparing performance with a different client technology (or even just a simple test using SQL*Plus) to see if you can beat this performance anyway - it may simply be an under-resourced or misconfigured database.
I'd also compare the results for SQL*Plus running directly on the d/b server with it running locally on whatever machine your Java code is running on (where it communicates over SQL*Net). This would confirm whether the problem is below your Java tier.
To be honest, there are so many layers between your JPA code and the database itself that diagnosing the cause is going to be fun... I recall one mysterious d/b performance problem that turned out to be a misconfigured network card; the DBAs were rightly insistent that the database wasn't showing any bottlenecks.
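Any thin client will do for that comparison; a throwaway PDO script can be quicker to wire up than SQL*Plus. A sketch with placeholder DSN, credentials, and statements (substitute the app's actual SELECT/UPDATE pair):

    <?php
    // Throwaway harness: time the transaction shape (two reads, one
    // write) without JPA/Hibernate in the way. DSN, credentials, and
    // statements are placeholders.
    $pdo = new PDO('oci:dbname=//dbhost:1521/ORCL', 'app_user', 'secret');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $start = microtime(true);
    $n = 100;
    for ($i = 0; $i < $n; $i++) {
        $pdo->beginTransaction();
        $pdo->query('SELECT 1 FROM dual')->fetchAll();             // stand-in read
        $pdo->query('SELECT 1 FROM dual')->fetchAll();             // stand-in read
        $pdo->exec('UPDATE t SET touched = SYSDATE WHERE id = 1'); // stand-in write
        $pdo->commit();
    }
    printf("%.1f transactions/second\n", $n / (microtime(true) - $start));

22 tx/s works out to roughly 45 ms per transaction; if the bare harness sustains far more than that against the same database, the ceiling is in the Java tier rather than Oracle.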
It sounds like the application is doing a transaction in a bit less than 0.05 seconds. If the SELECT and UPDATE statements are extracted from the app and run by themselves, using SQL*Plus or some other tool, how long do they take, and if you add up their times, do they come pretty near to 0.05? Where does the data used in the queries, and eventually in the UPDATE, come from? It's entirely possible that the slowdown is not the database but somewhere else in the app, such as the data acquisition phase. Perhaps a profiler could be used to find out where the app is spending its time.
Share and enjoy.
