Write an AWS Lambda to query a Neptune DB with openCypher using aws-sdk v3

I have a Lambda with Node.js 18 runtime, in which I would like to send an openCypher query to an AWS Neptune DB.
My Lambda is using an IAM role with these policies:
NeptuneFullAccess
AWSLambdaBasicExecutionRole
AmazonSSMReadOnlyAccess
(The last policy will be used later to fetch the Neptune endpoint from the SSM Parameter Store.)
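(For context, that later lookup would be something like this sketch with the v3 SSM client; the parameter name /neptune/endpoint is just a placeholder:)
import { SSMClient, GetParameterCommand } from "@aws-sdk/client-ssm";

const ssm = new SSMClient({ region: "us-east-1" });
// hypothetical parameter name; replace with your own
const { Parameter } = await ssm.send(
  new GetParameterCommand({ Name: "/neptune/endpoint" })
);
const neptuneEndpoint = Parameter.Value;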
I'm trying to find the @aws-sdk/client-neptune method to submit the query, but I couldn't find any in the library on GitHub.
This is frustrating, as I've been struggling for days to find a simple way to use the aws-sdk v3 in a Node.js 18 Lambda for the simple task of querying the Neptune DB.
Here is my current skeleton of the Lambda:
import { NeptuneClient } from "@aws-sdk/client-neptune";

export async function handler() {
  const neptuneEndpoint = "https://<my-db-instance>.us-east-1.neptune.amazonaws.com";
  const neptune = new NeptuneClient({
    endpoint: neptuneEndpoint,
    region: "us-east-1",
  });
  const cypher = `MATCH (n) RETURN n`;
  const query = {
    Gremlin: cypher,
  };
  const command = {
    GremlinCommand: query,
  };
  const result = await neptune.send(command).promise();
  console.log(result);
  return result;
}
Can anyone please help me turn this into a working Lambda?

The client you are using only exposes control plane actions (creating/modifying cluster instances) and is not meant to be used to query Neptune. For openCypher, the recommendation is to query Neptune using the HTTPS endpoint as described here.
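A minimal sketch of that approach, assuming the Lambda runs inside the Neptune VPC, IAM database authentication is disabled (otherwise the request would also need SigV4 signing), and Node.js 18's global fetch; the endpoint and port are placeholders:
// Sketch: POST an openCypher query straight to Neptune's HTTPS endpoint,
// instead of using @aws-sdk/client-neptune (which is control plane only)
export async function handler() {
  const neptuneEndpoint = "https://<my-db-instance>.us-east-1.neptune.amazonaws.com:8182";
  const response = await fetch(`${neptuneEndpoint}/openCypher`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({ query: "MATCH (n) RETURN n LIMIT 10" }),
  });
  const result = await response.json();
  console.log(JSON.stringify(result, null, 2));
  return result;
}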

I have created a sample Neptune DB and a sample Lambda, and used Lambda layers to install the necessary gremlin dependencies. Please find the Lambda code below.
Lambda code
const gremlin = require('gremlin');
const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
const Graph = gremlin.structure.Graph;

// Replace the endpoint with your own DB cluster endpoint
const dc = new DriverRemoteConnection('wss://<db cluster name>.ap-south-1.neptune.amazonaws.com:8182/gremlin', {});
const graph = new Graph();
const g = graph.traversal().withRemote(dc);

const { t: { id } } = gremlin.process;
const { cardinality: { single } } = gremlin.process;

const createVertex = async (vertexId, vlabel) => {
  const vertex = await g.addV(vlabel)
    .property(id, vertexId)
    .property(single, 'name', 'lambda')
    .property('lastname', 'Testing') // default database cardinality
    .next();
  return vertex.value;
};

exports.handler = async (event) => {
  try {
    // create the vertex inside the handler so the write is awaited
    await createVertex('sampledata1', 'testing');
    // 'sampledata1' is the vertex id (not a label), so look it up by id,
    // and end the traversal with toList() so it actually executes
    const results = await g.V('sampledata1').properties('name').toList();
    console.log('--------', results);
    return results;
  } catch (error) {
    // error handling
    console.log('---error---', error);
    return error;
  }
};
Note:
Replace db endpoint with your own db endpoint

Related

AWS Lambda Timeout when connecting to Redis Elasticache in same VPC

Trying to publish from a Lambda function to a Redis Elasticache, but I just keep getting 502 Bad Gateway responses with the Lambda function timing out.
I have successfully connected to the Elasticache instance using an ECS task in the same VPC, which leads me to think that the VPC settings for my Lambda are not correct. I tried following this tutorial (https://docs.aws.amazon.com/lambda/latest/dg/services-elasticache-tutorial.html) and have looked at several StackOverflow threads to no avail.
The Lambda Function:
import { promisify } from 'util';
import * as redis from 'redis';
// ApiResponse, Logger, and InvalidRequestError are project-local helpers

export const publisher = redis.createClient({
  url: XXXXXX, // env var containing the URL which is also used in the ECS server to successfully connect
});

export const handler = async (
  event: AWSLambda.APIGatewayProxyWithCognitoAuthorizerEvent
): Promise<AWSLambda.APIGatewayProxyResult> => {
  try {
    if (!event.body || !event.pathParameters || !event.pathParameters.channelId)
      return ApiResponse.error(400, {}, new InvalidRequestError());
    const { action, payload } = JSON.parse(event.body) as {
      action: string;
      payload?: { [key: string]: string };
    };
    const { channelId } = event.pathParameters;
    const publishAsync = promisify(publisher.publish).bind(publisher);
    await publishAsync(
      channelId,
      JSON.stringify({
        action,
        payload: payload || {},
      })
    );
    return ApiResponse.success(204);
  } catch (e) {
    Logger.error(e);
    return ApiResponse.error();
  }
};
In my troubleshooting, I have verified the following in the Lambda functions console:
The correct role is showing in Configuration > Permissions
The lambda function has access to the VPC (Configuration > VPCs), Subnets, and the same SG as the Elasticache instance.
The SG is allowing all traffic from anywhere.
It is indeed the Redis connection: using console.log I can see the code stops at this line: await publishAsync()
I am sure it is something small, but it is racking my brain!
Update 1:
Tried adding an error handler to log any issues with the publish in addition to the main try/catch block, but it's not logging a thing.
publisher.on('error', (e) => {
  Logger.error(e, 'evses-service', 'message-publisher');
});
Also included my Elasticache setup, Elasticache subnet group, Lambda VPC settings, and the Lambda's access configuration (screenshots omitted).
Update 2:
Tried to follow the tutorial here (https://docs.aws.amazon.com/lambda/latest/dg/services-elasticache-tutorial.html) word for word, but getting the same issue. No logs, just a timeout after 30 seconds. Here is the test code:
const crypto = require('crypto');
const redis = require('redis');
const util = require('util');

const client = redis.createClient({
  url: 'rediss://clusterforlambdatest.9nxhfd.0001.use1.cache.amazonaws.com',
});

client.on('error', (e) => {
  console.log(e);
});

exports.handler = async (event) => {
  try {
    const len = 24;
    const randomString = crypto
      .randomBytes(Math.ceil(len / 2))
      .toString('hex') // convert to hexadecimal format
      .slice(0, len)
      .toUpperCase();
    const setAsync = util.promisify(client.set).bind(client);
    const getAsync = util.promisify(client.get).bind(client);
    await setAsync(randomString, 'We set this string bruh!');
    const doc = await getAsync(randomString);
    console.log(`Successfully received document ${randomString} with contents: ${doc}`);
    return;
  } catch (e) {
    console.log(e);
    return {
      statusCode: 500,
    };
  }
};
If you get a timeout, assuming the Lambda network is well configured, you should check the following:
Redis SSL configuration: check for differences between the rediss:// connection URL and the cluster configuration (in-transit encryption requires configuring the client with tls: {})
configure the client with a specific retry strategy, to avoid the Lambda timeout and to catch connection issues
check VPC ACLs and security groups
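A sketch of the first two points (node-redis v3 style, matching the question's code; the host and retry limits are placeholders):
const redis = require('redis');

const client = redis.createClient({
  url: 'rediss://<your-cluster>.cache.amazonaws.com:6379',
  tls: {}, // required when in-transit encryption is enabled on the cluster
  retry_strategy: (options) => {
    // give up after a few attempts so the error surfaces before the Lambda times out
    if (options.attempt > 3) {
      return new Error('Could not connect to Redis after 3 attempts');
    }
    return Math.min(options.attempt * 100, 3000); // back off, capped at 3s
  },
});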
I had the same issue with my Elasticache cluster; here are a few findings:
Check the number of client connections to Elasticache and the resources used
Check the VPC subnet and CIDR for the nodes' security group
Try increasing the TTL (timeout) for the Lambda and see which service is taking more time to respond, Lambda or Elasticache

How do you use a nodejs Lambda in AWS as a producer to send messages to MSK topic without creating EC2 client server?

I am trying to create a Lambda in AWS that serves as a producer to an MSK topic. All the AWS docs say to create a new EC2 instance, but as my Lambda is in the same VPC I feel like this should work. I am very new to this, and I notice the log statement in my producer.on function is never hit.
I am using nodejs and the kafka-node module. The code can be found below.
Essentially, I am just wondering if anyone knows how to do this, and why the producer.on function is never hit when I run a test through the Lambda. This is just some test code to see if I can get it to send, but if any more data is needed to help, please let me know. Thanks in advance.
exports.handler = async (event, context, callback) => {
  const kafka = require('kafka-node');
  const bp = require('body-parser');
  const kafka_topic = 'MyTopic';
  const Producer = kafka.Producer;
  var KeyedMessage = kafka.KeyedMessage;
  const Client = kafka.Client;
  const client = new kafka.KafkaClient({ kafkaHost: 'myhost:9094' });
  console.log('client :: ' + JSON.stringify(client));
  const producer = new Producer(client);
  console.log('about to hit producer code');
  producer.on('ready', function () {
    console.log('Hello there!');
    let message = 'my message';
    let keyedMessage = new KeyedMessage('keyed', 'me keyed message');
    producer.send([
      { topic: kafka_topic, partition: 0, messages: [message, keyedMessage], attributes: 0 }
    ], function (err, result) {
      console.log(err || result);
      process.exit();
    });
  });
  producer.on('error', function (err) {
    console.log('error', err);
  });
  return "success";
};
What you need is to be able to produce messages to your MSK cluster using a REST API. Why not set up a REST proxy for MSK as detailed here, and then call this API to produce your messages to MSK?
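Once such a proxy is running, the producing Lambda could look roughly like this hypothetical sketch (it assumes Confluent REST Proxy conventions, a placeholder host and topic, and a Node.js 18 runtime with global fetch):
// Hypothetical: POST records to a Kafka REST proxy over HTTP instead of
// connecting to the brokers directly; host, port, and topic are placeholders
exports.handler = async (event) => {
  const response = await fetch('http://<rest-proxy-host>:8082/topics/MyTopic', {
    method: 'POST',
    headers: { 'Content-Type': 'application/vnd.kafka.json.v2+json' },
    body: JSON.stringify({ records: [{ value: { message: 'my message' } }] }),
  });
  return response.json();
};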

AWS CDK passing API Gateway URL to static site in same Stack

I'm trying to deploy an S3 static website and API gateway/lambda in a single stack.
The javascript in the S3 static site calls the lambda to populate an HTML list but it needs to know the API Gateway URL for the lambda integration.
Currently, I generate a RestApi like so...
const handler = new lambda.Function(this, "TestHandler", {
  runtime: lambda.Runtime.NODEJS_10_X,
  code: lambda.Code.asset("build/test-service"),
  handler: "index.handler",
  environment: {
  }
});

this.api = new apigateway.RestApi(this, "test-api", {
  restApiName: "Test Service"
});

const getIntegration = new apigateway.LambdaIntegration(handler, {
  requestTemplates: { "application/json": '{ "statusCode": "200" }' }
});

const apiUrl = this.api.url;
But on cdk deploy, apiUrl =
"https://${Token[TOKEN.39]}.execute-api.${Token[AWS::Region.4]}.${Token[AWS::URLSuffix.1]}/${Token[TOKEN.45]}/"
So the URL is not resolved until deploy time, after the static site already needs the value.
How can I calculate/find/fetch the API Gateway URL and update the javascript on cdk deploy?
Or is there a better way to do this? i.e. is there a graceful way for the static javascript to retrieve a lambda api gateway url?
Thanks.
You are creating a LambdaIntegration but it isn't connected to your API.
To add it to the root of the API do: this.api.root.addMethod(...) and use this to connect your LambdaIntegration and API.
This should give you an endpoint with a URL
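Something like this minimal sketch, reusing the names from the question's code:
// Attaching the integration creates a method, and with it a deployment
// and a stage, which is what gives the API a resolvable URL
this.api.root.addMethod("GET", getIntegration);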
If you are using the s3-deployment module to deploy your website as well, I was able to hack together a solution using what is available currently (pending a better solution at https://github.com/aws/aws-cdk/issues/12903). The following pieces together allow you to deploy a config.js to your bucket (containing attributes from your stack that are only populated at deploy time) that you can then depend on elsewhere in your code at runtime.
In inline-source.ts:
// imports removed for brevity
export function inlineSource(path: string, content: string, options?: AssetOptions): ISource {
  return {
    bind: (scope: Construct, context?: DeploymentSourceContext): SourceConfig => {
      if (!context) {
        throw new Error('To use a inlineSource, context must be provided');
      }
      // Find available ID
      let id = 1;
      while (scope.node.tryFindChild(`InlineSource${id}`)) {
        id++;
      }
      const bucket = new Bucket(scope, `InlineSource${id}StagingBucket`, {
        removalPolicy: RemovalPolicy.DESTROY
      });
      const fn = new Function(scope, `InlineSource${id}Lambda`, {
        runtime: Runtime.NODEJS_12_X,
        handler: 'index.handler',
        code: Code.fromAsset('./inline-lambda')
      });
      bucket.grantReadWrite(fn);
      const myProvider = new Provider(scope, `InlineSource${id}Provider`, {
        onEventHandler: fn,
        logRetention: RetentionDays.ONE_DAY // default is INFINITE
      });
      const resource = new CustomResource(scope, `InlineSource${id}CustomResource`, {
        serviceToken: myProvider.serviceToken,
        properties: { bucket: bucket.bucketName, path, content }
      });
      context.handlerRole.node.addDependency(resource); // Sets the s3 deployment to depend on the deployed file
      bucket.grantRead(context.handlerRole);
      return {
        bucket: bucket,
        zipObjectKey: 'index.zip'
      };
    },
  };
}
In inline-lambda/index.js (also requires archiver installed into inline-lambda/node_modules):
const aws = require('aws-sdk');
const s3 = new aws.S3({ apiVersion: '2006-03-01' });
const fs = require('fs');
var archive = require('archiver')('zip');

exports.handler = async function(event, ctx) {
  await new Promise(resolve => fs.unlink('/tmp/index.zip', resolve));
  const output = fs.createWriteStream('/tmp/index.zip');
  const closed = new Promise((resolve, reject) => {
    output.on('close', resolve);
    output.on('error', reject);
  });
  archive.pipe(output);
  archive.append(event.ResourceProperties.content, { name: event.ResourceProperties.path });
  archive.finalize();
  await closed;
  await s3.upload({ Bucket: event.ResourceProperties.bucket, Key: 'index.zip', Body: fs.createReadStream('/tmp/index.zip') }).promise();
  return;
}
In your construct, use inlineSource:
export class TestConstruct extends Construct {
  constructor(scope: Construct, id: string, props: any) {
    // set up other resources
    const source = inlineSource('config.js', `exports.config = { apiEndpoint: '${ api.attrApiEndpoint }' }`);
    // use in BucketDeployment
  }
}
You can move inline-lambda elsewhere but it needs to be able to be bundled as an asset for the lambda.
This works by creating a custom resource that depends on the other resources in your stack (thereby allowing the attributes to be resolved). The resource writes your file into a zip that is stored in a staging bucket, which is then picked up and unzipped into your deployment/destination bucket. Pretty complicated, but it gets the job done with what is currently available.
The pattern I've used successfully is to put a CloudFront distribution or an API Gateway in front of the S3 bucket.
So requests to https://[api-gw]/**/* are proxied to https://[s3-bucket]/**/*.
Then I create a new proxy path in the same API Gateway for a route called /config, which is a standard Lambda-backed API endpoint where I can return all sorts of things, like branding information or API keys, to the frontend whenever it calls GET /config.
Also, this avoids issues like CORS, because both origins are the same (the API Gateway domain).
With CloudFront distribution instead of an API Gateway, it's pretty much the same, except you use the CloudFront distribution's "origin" configuration instead of paths and methods.
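For illustration, the /config endpoint's Lambda can be as small as this hypothetical sketch (the environment variable name is made up):
// Returns runtime configuration to the frontend from the same origin
// as the static site, so no CORS is involved
exports.handler = async () => ({
  statusCode: 200,
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    apiEndpoint: process.env.API_ENDPOINT, // injected by the stack at deploy time
  }),
});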

Log apollo-server GraphQL query and variables per request

When using apollo-server 2.2.1 or later, how can one log, for each request, the query and the variables?
This seems like a simple requirement and common use case, but the documentation is very vague, and the query object passed to formatResponse no longer has the queryString and variables properties.
Amit's answer works (today), but IMHO it is a bit hacky and it may not work as expected in the future, or it may not work correctly in some scenarios.
For instance, the first thing that I thought when I saw it was: "that may not work if the query is invalid". It turns out that today it does work even when the query is invalid, because with the current implementation the context is evaluated before the query is validated. However, that's an implementation detail that can change in the future. For instance, what if one day the Apollo team decides that it would be a performance win to evaluate the context only after the query has been parsed and validated? That's actually what I was expecting :-)
What I'm trying to say is that if you just want to log something quick in order to debug something in your dev environment, then Amit's solution is definitely the way to go.
However, if what you want is to register logs for a production environment, then using the context function is probably not the best idea. In that case, I would install the graphql-extensions and I would use them for logging, something like:
const { print } = require('graphql');

class BasicLogging {
  requestDidStart({ queryString, parsedQuery, variables }) {
    const query = queryString || print(parsedQuery);
    console.log(query);
    console.log(variables);
  }

  willSendResponse({ graphqlResponse }) {
    console.log(JSON.stringify(graphqlResponse, null, 2));
  }
}

const server = new ApolloServer({
  typeDefs,
  resolvers,
  extensions: [() => new BasicLogging()]
});
Edit:
As Dan pointed out, there is no need to install the graphql-extensions package because it has been integrated inside the apollo-server-core package.
With the new plugins API, you can use a very similar approach to Josep's answer, except that you structure the code a bit differently.
const BASIC_LOGGING = {
  requestDidStart(requestContext) {
    console.log("request started");
    console.log(requestContext.request.query);
    console.log(requestContext.request.variables);
    return {
      didEncounterErrors(requestContext) {
        console.log("an error happened in response to query " + requestContext.request.query);
        console.log(requestContext.errors);
      },
      // willSendResponse is a request lifecycle hook, so it belongs in the
      // object returned from requestDidStart; at the plugin's top level it
      // would never be called
      willSendResponse(requestContext) {
        console.log("response sent", requestContext.response);
      }
    };
  }
};

const server = new ApolloServer({
  schema,
  plugins: [BASIC_LOGGING]
});

server.listen(3003, '0.0.0.0').then(({ url }) => {
  console.log(`GraphQL API ready at ${url}`);
});
If I had to log the query and variables, I would probably use apollo-server-express, instead of apollo-server, so that I could add a separate express middleware before the graphql one that logged that for me:
const express = require('express')
const bodyParser = require('body-parser') // needed for app.use(bodyParser.json()) below
const { ApolloServer } = require('apollo-server-express')
const { typeDefs, resolvers } = require('./graphql')

const server = new ApolloServer({ typeDefs, resolvers })

const app = express()
app.use(bodyParser.json())
app.use('/graphql', (req, res, next) => {
  console.log(req.body.query)
  console.log(req.body.variables)
  return next()
})

server.applyMiddleware({ app })

app.listen({ port: 4000 }, () => {
  console.log(`🚀 Server ready at http://localhost:4000${server.graphqlPath}`)
})
Dan's solution mostly resolves the problem, but if you want to log it without using express, you can capture it in the context function, as shown in the sample below.
const server = new ApolloServer({
  schema,
  context: params => () => {
    console.log(params.req.body.query);
    console.log(params.req.body.variables);
  }
});
I found myself needing something like this but in a more compact form - just the query or mutation name and the ID of the user making the request. This is for logging queries in production to trace what the user was doing.
I call logGraphQlQueries(req) at the end of my context.js code:
export const logGraphQlQueries = (req) => {
  // the operation name is the first token in the first line
  const operationName = req.body.query.split(' ')[0];

  // the query name is the first token in the 2nd line
  const queryName = req.body.query
    .split('\n')[1]
    .trim()
    .split(' ')[0]
    .split('(')[0];

  // in my case the user object is attached to the request (after decoding the jwt)
  const userString = req.user?.id
    ? `for user ${req.user.id}`
    : '(unauthenticated)';

  console.log(`${operationName} ${queryName} ${userString}`);
};
This outputs lines such as:
query foo for user e0ab63d9-2513-4140-aad9-d9f2f43f7744
Apollo Server exposes a request lifecycle event called didResolveOperation, at which point the requestContext has populated the operation and operationName properties:
plugins: [
  {
    requestDidStart(requestContext) {
      return {
        didResolveOperation({ operation, operationName }) {
          const operationType = operation.operation;
          console.log(`${operationType} received: ${operationName}`);
        }
      };
    }
  }
]

// query received: ExampleQuery
// mutation received: ExampleMutation

Connect Lambda to mLab

I would like to read a collection in mLab (MongoDB) and get a result document based on the request, from an AWS Lambda function.
I wrote a Node.js code snippet, but whatever timeout I set, it results in:
Task timed out after *** seconds
Any solution, link, or thoughts will be helpful. Either Java or Node.
'use strict';
const MongoClient = require('mongodb').MongoClient;

exports.handler = (event, context, callback) => {
  console.log('=> connect to database');
  MongoClient.connect('mongodb://test:test123@ds.xyx.fleet.mlab.com:1234', function (err, client) {
    if (err) {
      console.log("ERR ", err);
      throw err;
    }
    var db = client.db('user');
    db.collection('sessions').findOne({}, function (findErr, result) {
      if (findErr) {
        console.log("findErr ", findErr);
        throw findErr;
      } else {
        console.log("#", result);
        console.log("##", result.name);
        context.succeed(result);
      }
      client.close();
    });
  });
};
P.S.: I have referred to all related Stack Overflow questions.
The Lambda function returned success after adding the db name in
MongoClient.connect('mongodb://test:test123@ds.xyx.fleet.mlab.com:1234/dbNAME')
Apart from declaring the db name in
var db = client.db('dbNAME');
it should also be added to the mLab connection URI.
