How to monitor API calls on EC2? - amazon-ec2

I need to measure the response time of outgoing REST API calls made to external hosts by an application running on an Amazon EC2 instance. It would be great if I could also filter the calls by regex or by complete URL. We have been thinking of logging the calls and analyzing the data, or of using tools like Dynatrace or Nagios, so that code changes are not required.
If someone has implemented such a solution, please let us know.

This may not be a complete answer, but following the idea of starting with logs, I would recommend looking into using Cloudwatch: https://aws.amazon.com/cloudwatch/details/#log-monitoring

Depending on the level of access you have, you could use something like https://www.wireshark.org/ to monitor all the traffic and filter by URL or protocol. You might also try a log aggregator like http://papertrailapp.com/, which has filtering.
As you stated, you could also use some form of APM like dynatrace, statsd, instrumental[1], newrelic, datadog, etc. to do monitoring inside your application.
[1] I work for instrumental.

You can set up a simple proxy that measures the duration of each web request and records it as a metric.
There are some nice tools for this; take a look at http://datadoghq.com/product or http://devmetrics.io/logslib
For example, a simple Node.js proxy with the devmetrics lib:
var http = require('http');
var httpProxy = require('http-proxy');
var logger = require('devmetrics-core');

var proxy = httpProxy.createProxyServer({});
var proxyServer = http.createServer(function (req, res) {
  // record the start time before handing the request to the proxy
  var start_time = Date.now();
  proxy.web(req, res, { target: 'http://localhost:80' });
  // 'finish' fires once the response has been handed off completely
  res.on('finish', function () {
    var latency = Date.now() - start_time;
    console.log("The request was proxied in " + latency + "ms");
    logger.appGauge('web_request', latency);
  });
});
proxyServer.listen(3000);
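The proxy above times every request. The question also asks for filtering by regex, so here is a minimal sketch of that part, assuming a small code change is acceptable. The names createLatencyRecorder and time are hypothetical and not part of any library, and the wrapped call stands in for whatever HTTP client the application uses.

```javascript
// Collects latency samples for outgoing calls whose URL matches `pattern`.
// `pattern` should be a non-global RegExp (a /g flag makes test() stateful).
function createLatencyRecorder(pattern) {
  const samples = [];
  return {
    samples,
    // Runs `call` (any function returning a promise) and records how long it took.
    async time(url, call) {
      const start = process.hrtime.bigint();
      try {
        return await call();
      } finally {
        if (pattern.test(url)) {
          const ms = Number(process.hrtime.bigint() - start) / 1e6;
          samples.push({ url: url, ms: ms });
        }
      }
    },
  };
}
```

A caller would wrap each outgoing request, for example recorder.time(url, () => client.get(url)), where client is whatever HTTP library is already in use, and then ship recorder.samples to the metrics backend of choice.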

Related

Winston Force flush before ending lambda execution

I'm trying to use Winston to send logs to Datadog from an AWS Lambda. The problem with Lambdas is that once we return a response, the execution stops, which doesn't give Winston time to flush the logs.
Is there a way I can force the flush before returning? I'm trying this, but it doesn't seem to do the trick:
async function handler (event): Promise<FormattedJSONResponse> {
  const logger = getLogger()
  // do some work
  await closeLogger(logger)
  return awsResponse
}

function closeLogger (logger: Logger): Promise<any> {
  const loggerDone = new Promise((resolve, _) => {
    logger.on('finish', () => {
      resolve(logger)
    })
  })
  logger.end()
  logger.close()
  return loggerDone
}
Versions:
AWS Lambda with nodejs 12
Winston: 3.3.3
Thanks for your help
First of all, I don't understand why you would want to send your logs from within your Lambda function. If you do so, your Lambda function will run longer to process the logs, meaning you will be charged for the time it takes to send the logs to Datadog.
Instead, you could save the logs to CloudWatch. To avoid high charges for CloudWatch set the retention to a rather short time, maybe one day. On the CloudWatch log stream you can then add a subscriber which could be another lambda function. This "log-processor"-lambda-function will process, transform the logs and send them to Datadog. With this architecture your first lambda function containing the business logic won't fail if Datadog cannot be reached for instance. It makes your architecture more resilient and has better separation of concerns. Yan Cui wrote a great article on "Centralised logging for AWS Lambda"
Another approach, which still separates logging from your Lambda function's business logic to some degree, builds on Lambda extensions, namely the Lambda Logs API.
Put simply, Lambda extensions add an extra layer to your function but are not part of the function's code itself. Probably the best part for you: Datadog already offers a ready-to-use extension, which is responsible for:
Pushing real-time enhanced Lambda metrics, custom metrics, and traces from the Datadog Lambda Library to Datadog.
Forwarding logs from your Lambda function to Datadog.
For more info on Lambda extensions follow the links mentioned above or have a look at Yan Cui's post "Lambda Logs API: a new way to process Lambda logs in real-time"
After spending 4 hours on this issue, I found no other way (that works, isn't buggy and is transport agnostic) than to use an arbitrary timeout before returning a response.
This example is for NextJS but you can easily remove res: NextApiResponse.
export const gracefulExit = (response: any, res: NextApiResponse) => {
  setTimeout(() => {
    res.send({ ...response, sessionId });
  }, 400);
};
Then in all my serverless functions I don't do res.send({x}) but rather gracefulExit({x}, res)

Google Apps Script Web App long-polling and simultaneous executions limit

My Google Apps Script web app has recently started hitting qps limits. What would be a better way to improve performance?
I have about 50 active users. I use a 15,000-row Google spreadsheet as a database, and my app serves JSON data that users request from this spreadsheet. I use long-polling to keep each connection alive for 5 minutes and close it if no update happens in the spreadsheet; the client then reconnects. The web app is published to execute as me.
My polling works like this:
function doGet(e){
  var userVersion = parseInt(e.parameter.userVersion, 10)
  while (runningTime < 300001) {
    var currentServerVersion = parseInt(cache.get("currentVersion"),10)
    if(userVersion<currentServerVersion){
      var returnData = []
      for(var i = userVersion+1; i <= currentServerVersion;i++){
        var newData = cache.get(i)
        if(newData!=null){returnData.push(JSON.parse(cache.get(newData)))}
      }
      return ContentService.createTextOutput(JSON.stringify({currentServerVersion,data:returnData })).setMimeType(ContentService.MimeType.JSON);
    } else {
      Utilities.sleep(20000)
    }
    runningTime = calculateRunningTime()
  }
}
What I have tried so far:
1) I optimized requests with CacheService to reduce calls to the spreadsheet. It helped for a few months, but now I'm getting qps errors more and more often.
2) I asked the Google team about quotas. They explained that there are no published quotas or limits for simultaneous executions and that these are subject to change without notice. They advised further use of CacheService and better error handling.
I am thinking of switching from long-polling to short-polling, but that feels like a step backwards. Should I try to further optimize performance or move to another service?
Would using "execute app as user accessing the app" help? (Users should share the same database.)
Is a Google Script API Executable different from a Web App? It looks like it might fit, but I'm not sure whether they share the same qps quotas.
I'm also considering the GAE service, but I'd like to avoid going over the free quota.
Any advice will be much appreciated!
I think the following part can be improved. When retrieving data from the cache service, getAll() is more efficient than get(). I once measured the difference: getAll() was about 890 times faster than get(). If a large number of items is retrieved from the cache service, improving this part matters for performance.
Your script:
var returnData = []
for(var i = userVersion+1; i <= currentServerVersion;i++){
  var newData = cache.get(i)
  if(newData!=null){returnData.push(JSON.parse(cache.get(newData)))}
}
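Improved script:
var keys = [];
for (var i = userVersion + 1; i <= currentServerVersion; i++) {
  keys.push(String(i)); // getAll() expects an array of string keys
}
var r = cache.getAll(keys); // a single call; keys that are not cached are omitted from the result
var returnData = keys
  .filter(function (k) { return r[k] != null; })
  .map(function (k) { return r[k]; }); // apply JSON.parse here if the values are JSON strings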
Since I cannot see your data, I cannot confirm this execution. So if errors occur, please tell me.
If I misunderstand your question, I'm sorry.

Exchange data between node.js script and client's Javascript

I have the following situation, where the "headers already sent" problem occurs when sending multiple responses from the server to the client via AJAX:
It is something I expected, since I opted to go with AJAX instead of sockets. Is there another way to exchange data between the server and the client, like using browserify to translate an emitter script for the client? I suppose that I can't escape sockets, so I would welcome advice on a simpler library, as socket.io seems too complex for such a small operation.
//-------------------------
Update:
Here is the node.js code as requested.
var maxRunning = 1;
var test_de_rf = ['rennen','ausgehen'];

function callHandler(word, cb) {
  console.log("word is - " + word);
  gender.gender_function_rf( word , function (result_rf) {
    console.log(result_rf);
    res.send(result_rf);// Here I send data back to the ajax call
    setTimeout(function() { cb(null);
    }, 3000);
  });
}

async.eachLimit(test_de_rf, maxRunning, function(item, done) {
  callHandler(item, function(err) {
    if (err) throw new Error(err);
    done();
  });
}, function(err) {
  if (err) throw new Error(err);
  console.log('done');
});
res.send() sends and finishes an http response. You can only call it once per request because the request is finished and done after calling that. It is a fairly high level way of sending a response (does it all at once in one call).
If you wanted to have several different functions contributing to a response, you could use the lower level functions on the http object such as res.setHeader(), res.writeHead(), res.write() (which you can call multiple times) and res.end() (which indicates the end of the response).
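A minimal sketch of that lower-level approach with Node's built-in http module follows. The sendAll helper is a hypothetical name, and the two words are borrowed from the question's test array purely for illustration.

```javascript
const http = require('http');

// Writes one chunk per item, then ends the response exactly once.
function sendAll(res, items) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  for (const item of items) {
    res.write(item + '\n'); // write() may be called many times
  }
  res.end(); // no further writes are allowed after this
}

// The server is created but not started here; call server.listen(port) to run it.
const server = http.createServer((req, res) => {
  sendAll(res, ['rennen', 'ausgehen']);
});
```

Note that this still produces a single HTTP response. A plain AJAX call usually sees the body only once the response has ended, unless the client reads it progressively.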
You can use the standard webSocket API in the browser and get webSocket module for server-side support or you can use socket.io which offers both client and server support and a number of higher level functions (such as automatic reconnect, automatic failover to http polling if webSockets are not supported, etc...).
All that said, if what you really want is the ability to just send some data from server to client whenever you want, then a webSocket is really the better way to go. This is a persistent connection, is supported by all modern browsers and allows the server to send data unsolicited to the client at any time. I'd hardly say socket.io is complex. The doc isn't particularly great at explaining things (not uncommon in the open source world as the node.js doc isn't particularly great either). But, I've always been able to figure advanced things out by just looking at a few runtime data structures in the debugger and/or looking at the source code.

How to check application runs in AWS EC2 instance

How can I check which platform my app is running on: an AWS EC2 instance, an Azure role instance, or a non-cloud system?
Currently I do it like this:
if(isAzure())
{
    //run in Azure role instance
}
else if(isAWS())
{
    //run in AWS EC2 instance
}
else
{
    //run in the non-cloud system
}

//checked whether it runs in AWS EC2 instance or not.
bool isAWS()
{
    string url = "http://instance-data";
    try
    {
        WebRequest req = WebRequest.Create(url);
        req.GetResponse();
        return true;
    }
    catch
    {
        return false;
    }
}
But I have one problem: when my app runs on a non-cloud system, such as a local Windows machine, the isAWS() method is very slow because the call to req.GetResponse() takes a long time. How can I deal with this? Please help, and thanks in advance.
The better way to do this would be to make a request to get instance metadata.
From the AWS Documentation:
To view all categories of instance metadata from within a running
instance, use the following URI:
http://169.254.169.254/latest/meta-data/
On a Linux instance, you can use a tool such as cURL, or use the GET
command, for example:
PROMPT> GET http://169.254.169.254/latest/meta-data/
Here's an example using the Python Boto wrapper:
from boto.utils import get_instance_metadata

m = get_instance_metadata()
if len(m.keys()) > 0:
    print("Running on EC2")
else:
    print("Not running on EC2")
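The slowness described in the question comes from waiting on a host that never answers, so whichever client is used, a short timeout helps. Below is a minimal Node.js sketch of the same metadata check with a deadline. The probe function is a hypothetical helper, and the sketch assumes the instance allows unauthenticated (IMDSv1-style) GET requests; instances that enforce IMDSv2 require a session token first and would answer this request with an error status.

```javascript
const http = require('http');

// Resolves to true if `url` answers with HTTP 200 within `timeoutMs`,
// and to false on any error, non-200 status, or timeout.
function probe(url, timeoutMs) {
  return new Promise((resolve) => {
    // agent: false uses a one-off connection instead of the shared pool
    const req = http.get(url, { timeout: timeoutMs, agent: false }, (res) => {
      res.resume(); // discard the body
      resolve(res.statusCode === 200);
    });
    req.on('timeout', () => {
      req.destroy(); // give up instead of waiting for the OS-level timeout
      resolve(false);
    });
    req.on('error', () => resolve(false));
  });
}
```

On an EC2 instance the call would look like probe('http://169.254.169.254/latest/meta-data/', 500); on a machine without a metadata service it should give up after roughly the chosen timeout instead of blocking.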
I think your original idea is pretty good, but no need to make the web request. Simply try to see if the name resolves (in python):
def is_ec2():
    import socket
    try:
        socket.gethostbyname('instance-data.ec2.internal.')
        return True
    except socket.gaierror:
        return False
As you said, the WebRequest call is slow on your desktop, so you really need to check the network traffic (using Netmon) to determine what actually takes so long. The request opens a connection, connects to the target server, downloads the content and then closes the connection, so it is good to know where the time is spent.
Also, if you just want to know whether a URL (on Azure, on EC2 or on any other web server) is live and working, you can request only the headers:
string URI = "http://www.microsoft.com";
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(URI);
req.Method = WebRequestMethods.Http.Head;
var response = req.GetResponse();
int TotalSize = Int32.Parse(response.Headers["Content-Length"]);
// Now you can parse the headers for 200 OK and know that it is working.
You can also GET only a range of the data instead of the full content to speed up the call:
HttpWebRequest myHttpWebReq = (HttpWebRequest)WebRequest.Create("http://www.contoso.com");
myHttpWebReq.AddRange(0, 199); // request only the first 200 bytes
// Now you can send the request and check the response status (200 OK, or 206 Partial Content for a range)
Any of the above methods will be a faster way to find out whether your site is running.
On ec2 Ubuntu instances, the file /sys/hypervisor/uuid exists and its first three characters are 'ec2'. I like using this because it doesn't rely on external servers.
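A minimal Node.js sketch of that file-based check follows. The function name looksLikeEc2 is hypothetical, and the path parameter exists only so the check can be tried against a test file. A missing file is treated as "not EC2", which may be wrong on instance types that do not expose this file.

```javascript
const fs = require('fs');

// Returns true if the hypervisor UUID file exists and starts with "ec2".
function looksLikeEc2(uuidPath = '/sys/hypervisor/uuid') {
  try {
    const uuid = fs.readFileSync(uuidPath, 'utf8');
    return uuid.slice(0, 3).toLowerCase() === 'ec2';
  } catch (err) {
    return false; // file missing or unreadable: treat as "not EC2"
  }
}
```

Since this only reads a local file, it returns immediately on a non-cloud machine, which avoids the slow network wait from the question.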

Building an high performance node.js application with cluster and node-webworker

I'm not a node.js master, so I'd like to have more points of view about this.
I'm creating an HTTP node.js web server that must handle not only lots of concurrent connections but also long running jobs. By default node.js runs on one process, and if there's a piece of code that takes a long time to execute any subsequent connection must wait until the code ends what it's doing on the previous connection.
For example:
var http = require('http');
http.createServer(function (req, res) {
  doSomething(); // This takes a long time to execute
  // Return a response
}).listen(1337, "127.0.0.1");
So I was thinking to run all the long running jobs in separate threads using the node-webworker library:
var http = require('http');
var sys = require('sys');
var Worker = require('webworker');
http.createServer(function (req, res) {
  var w = new Worker('doSomething.js'); // This takes a long time to execute
  // Return a response
}).listen(1337, "127.0.0.1");
And to make the whole thing more performant, I thought to also use cluster to create a new node process for each CPU core.
In this way I expect to balance the client connections through different processes with cluster (let's say 4 node processes if I run it on a quad-core), and then execute the long running job on separate threads with node-webworker.
Is there something wrong with this configuration?
I see that this post is a few months old, but I wanted to provide a comment to this in the event that someone comes along.
"By default node.js runs on one process, and if there's a piece of code that takes a long time to execute any subsequent connection must wait until the code ends what it's doing on the previous connection."
^-- This is not entirely true. If doSomething(); is required to complete before you send back the response, then yes, but if it isn't, you can make use of the Asynchronous functionality available to you in the core of Node.js, and return immediately, while this item processes in the background.
A quick example of what I'm explaining can be seen by adding the following code in your server:
setTimeout(function(){
  console.log("Done with 5 second item");
}, 5000);
If you hit the server a few times, you will get an immediate response on the client side, and eventually see the console fill with the messages seconds after the response was sent.
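The timer example covers work that yields to the event loop. For CPU-bound work like the question's doSomething(), one option is to move the computation off the main thread. The sketch below uses Node's built-in worker_threads module, which is a substitute for the node-webworker library from the question, not the same API. The runJob name and the summing loop are placeholders for the real job.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline so the sketch fits in a single file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i; // stand-in for slow work
  parentPort.postMessage(sum);
`;

// Runs the job on a separate thread and resolves with its result.
function runJob(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n: n } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}
```

Inside a request handler, runJob(n).then(...) would leave the event loop free to accept other connections while the job runs; whether to combine this with cluster depends on how many cores the jobs themselves already occupy.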
Why don't you just copy and paste your code into a file and run it with JXcore, like
$ jx mt-keep:4 mysourcefile.js
and see how it performs? If you need real multithreading without leaving the safety of single threading, try JX. It's 100% Node.js 0.12+ compatible. You can spawn threads and run a whole Node.js app inside each of them separately.
You might want to check out Q-Oper8 instead as it should provide a more flexible architecture for this kind of thing. Full info at:
https://github.com/robtweed/Q-Oper8
