Unable to create Kinesis Client in Lambda function - aws-lambda

I have created a Lambda function which is triggered by a DynamoDB stream. I am trying to process DynamoDB events and put them into a Kinesis stream after some transformation. The Lambda has full access to both DynamoDB and the Kinesis stream.
I am using CloudWatch to check the logs and can see that the DynamoDB events are processed successfully. But when I try to create the Kinesis client (which lives in a different class), the code fails. I tried logging the error and even printing it, but that did not help. Sometimes the logs simply end with this message:
END RequestId: {some request id}
Other times, I get the following error
log4j:WARN No appenders could be found for logger (com.amazonaws.AmazonWebServiceClient).
The code fails at the point where the Kinesis client is created. I can see the log messages / print statements before the creation of the Kinesis client, but the code fails right at that line. I am not sure what the problem is. Can someone please help me out?
Here is the class in which the code fails:
private AmazonKinesis kinesisClient;
private AmazonKinesisClientBuilder clientBuilder;
private String streamName;

public TestKinesisPut(String streamName) {
    this.streamName = streamName;
    BasicAWSCredentials awsCreds = new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY");
    System.out.println("aws creds are: " + awsCreds);
    clientBuilder = AmazonKinesisClientBuilder.standard()
            .withRegion(Regions.AP_SOUTH_1)
            .withCredentials(new AWSStaticCredentialsProvider(awsCreds));
    System.out.println("Credentials are set: \n " + clientBuilder);
    try {
        System.out.println("This one is new \n About to build new kinesis client");
        // the code fails after this line
        kinesisClient = clientBuilder.build();
        System.out.println("built kinesis client");
    } catch (Exception e) {
        System.out.println("failed to initialize producer: " + e.getMessage());
        kinesisClient = null;
    }
}
Thanks

After a few days of head scratching I decided to tinker with the configuration of my Lambda function. It turns out the problem was caused by an OutOfMemoryError: I increased the memory of my Lambda function and it started working.
It seems that at the time the Kinesis client was created, the JVM was running out of metaspace. I did some research and found this Stack Overflow thread. Please refer to the link for a detailed discussion of a similar scenario.
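As a side note, the memory setting can also be raised outside the console. Below is a minimal sketch using the AWS SDK for Java v1; the function name and the 512 MB value are placeholders, so treat it as an illustration rather than the exact change I made:

import com.amazonaws.regions.Regions;
import com.amazonaws.services.lambda.AWSLambda;
import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
import com.amazonaws.services.lambda.model.UpdateFunctionConfigurationRequest;

public class RaiseLambdaMemory {
    public static void main(String[] args) {
        // Placeholder function name and memory size; adjust to your own setup.
        AWSLambda lambda = AWSLambdaClientBuilder.standard()
                .withRegion(Regions.AP_SOUTH_1)
                .build();
        lambda.updateFunctionConfiguration(new UpdateFunctionConfigurationRequest()
                .withFunctionName("my-dynamodb-to-kinesis-function")
                .withMemorySize(512));
    }
}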

Related

How to submit apache beam dataflow job to GCP through java application

I have a Dataflow job written in Apache Beam with Java. I am able to run the Dataflow job in GCP through these steps:
1) Create a Dataflow template from my code, then upload the template to Cloud Storage.
2) Create the job directly from the template option available in GCP -> Dataflow -> Jobs.
This flow is working fine.
I want to do the same through a Java app. That is, I have one API, and when someone sends a request to that API I want to start this Dataflow job from the template I have already stored in Cloud Storage.
I could see that a REST API is available to implement this approach, as below:
POST /v1b3/projects/project_id/locations/loc/templates:launch?gcsPath=template-location
But I didn't find any reference or samples for this. I tried the approach below.
In my Spring Boot project I added this dependency:
<!-- https://mvnrepository.com/artifact/com.google.apis/google-api-services-dataflow -->
<dependency>
    <groupId>com.google.apis</groupId>
    <artifactId>google-api-services-dataflow</artifactId>
    <version>v1b3-rev20210825-1.32.1</version>
</dependency>
and added the code below in a controller:
public static void createJob() throws IOException {
    GoogleCredential credential = GoogleCredential.fromStream(new FileInputStream("myCertKey.json"))
            .createScoped(java.util.Arrays.asList("https://www.googleapis.com/auth/cloud-platform"));
    try {
        Dataflow dataflow = new Dataflow.Builder(new LowLevelHttpRequest(), new JacksonFactory(),
                credential).setApplicationName("my-job").build(); // --- this gives error
        // RuntimeEnvironment
        RuntimeEnvironment env = new RuntimeEnvironment();
        env.setBypassTempDirValidation(false);
        // all my env configs added
        // parameters
        HashMap<String, String> params = new HashMap<>();
        params.put("bigtableEmulatorPort", "-1");
        params.put("gcsPath", "gs://bucket//my.json");
        // all other params
        LaunchTemplateParameters content = new LaunchTemplateParameters();
        content.setJobName("Test-job");
        content.setEnvironment(env);
        content.setParameters(params);
        dataflow.projects().locations().templates().launch("project-id", "location", content);
    } catch (Exception e) {
        log.info("error occurred", e);
    }
}
This gives {"id":null,"message":"'boolean com.google.api.client.http.HttpTransport.isMtls()'"}
error in this line itself
Dataflow dataflow = new Dataflow.Builder(new LowLevelHttpRequest(), new JacksonFactory(),
credential).setApplicationName("my-job").build();
This is because the Dataflow builder expects an HttpTransport as its first argument, but I passed a LowLevelHttpRequest().
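Based on that, I think the call should look something like the sketch below with an actual HttpTransport, but I have not been able to verify it against this exact dependency version (GoogleNetHttpTransport and JacksonFactory.getDefaultInstance() are my assumptions here):

import java.io.FileInputStream;
import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.dataflow.Dataflow;

// Sketch: build the Dataflow client with a real HttpTransport instead of a LowLevelHttpRequest.
HttpTransport transport = GoogleNetHttpTransport.newTrustedTransport();
GoogleCredential credential = GoogleCredential.fromStream(new FileInputStream("myCertKey.json"))
        .createScoped(java.util.Arrays.asList("https://www.googleapis.com/auth/cloud-platform"));
Dataflow dataflow = new Dataflow.Builder(transport, JacksonFactory.getDefaultInstance(), credential)
        .setApplicationName("my-job")
        .build();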
I am not sure whether this is the correct way to implement this. Can anyone suggest any ideas on how to implement it? Any examples or references?
Thanks a lot :)

TestContainer can't start due to error: Timed out waiting for log output matching

I got "ContainerLaunchException: Timed out waiting for log output matching" when starting testcontainer for elasticserach. How should I fix this issue?
container = new ElasticsearchContainer(ELASTICSEARCH_IMAGE)
        .withEnv("discovery.type", "single-node")
        .withExposedPorts(9200);
container.start();
12:16:50.370 [main] ERROR 🐳 [docker.elastic.co/elasticsearch/elasticsearch:7.16.3] - Could not start container
org.testcontainers.containers.ContainerLaunchException: Timed out waiting for log output matching '.("message":\s?"started".|] started
$)'
at org.testcontainers.containers.wait.strategy.LogMessageWaitStrategy.waitUntilReady(LogMessageWaitStrategy.java:49)
at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:51)
Updated:
I looked into the ElasticsearchContainer constructor:
public ElasticsearchContainer(DockerImageName dockerImageName) {
    super(dockerImageName);
    this.caCertAsBytes = Optional.empty();
    dockerImageName.assertCompatibleWith(new DockerImageName[]{DEFAULT_IMAGE_NAME, DEFAULT_OSS_IMAGE_NAME});
    this.isOss = dockerImageName.isCompatibleWith(DEFAULT_OSS_IMAGE_NAME);
    this.logger().info("Starting an elasticsearch container using [{}]", dockerImageName);
    this.withNetworkAliases(new String[]{"elasticsearch-" + Base58.randomString(6)});
    this.withEnv("discovery.type", "single-node");
    this.addExposedPorts(new int[]{9200, 9300});
    this.isAtLeastMajorVersion8 = (new ComparableVersion(dockerImageName.getVersionPart())).isGreaterThanOrEqualTo("8.0.0");
    String regex = ".*(\"message\":\\s?\"started\".*|] started\n$)";
    this.setWaitStrategy((new LogMessageWaitStrategy()).withRegEx(regex));
    if (this.isAtLeastMajorVersion8) {
        this.withPassword("changeme");
    }
}
It uses setWaitStrategy, so I updated my code as below:
container.setWaitStrategy((new LogMessageWaitStrategy()).withRegEx(regex).withTimes(1));
But I still get the same error. Here is how far the log messages go.
Updated again: I realized the above code change doesn't update any default values.
Here is the new change:
container.setWaitStrategy((new LogMessageWaitStrategy())
        .withRegEx(regex)
        .withStartupTimeout(Duration.ofSeconds(180L)));
It works with this new change, but I had to copy the regex from the ElasticsearchContainer constructor. I hope there is a better way to override just the timeout value.
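A possibly cleaner alternative, which I have not fully verified against this Testcontainers version, is GenericContainer's withStartupTimeout: it should raise the timeout of the container's default wait strategy without replacing it, so the built-in regex would not need to be copied.

// Sketch: keep ElasticsearchContainer's default wait strategy (and its regex)
// and only raise the startup timeout. Assumes withStartupTimeout is available
// on GenericContainer in the Testcontainers version being used.
container = new ElasticsearchContainer(ELASTICSEARCH_IMAGE)
        .withEnv("discovery.type", "single-node")
        .withStartupTimeout(Duration.ofSeconds(180));
container.start();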

Listener for NATS JetStream

Can someone help with how to configure a NATS JetStream subscription asynchronously in Spring Boot? For example, I am looking for an annotation equivalent to @KafkaListener, but for NATS JetStream.
I am able to pull messages using an endpoint, but when I tried to consume messages using a push subscription, the dispatcher handler is not invoked. I need to know how to make the listener active so it consumes messages immediately once they are published to the subject.
Any insights / examples regarding this will be helpful. Thanks in advance.
I don't know what your JetStream retention policy is, nor how you want to subscribe, but I have sample code for a WorkQueuePolicy push subscription; I hope it helps.
public static void subscribe(String streamName, String subjectKey,
        String queueName, IMessageHandler iMessageHandler) throws IOException,
        InterruptedException, JetStreamApiException {
    long s = System.currentTimeMillis();
    Connection nc = Nats.connect(options); // options: a pre-built io.nats.client.Options instance
    long e = System.currentTimeMillis();
    logger.info("Nats Connect in " + (e - s) + " ms");
    JetStream js = nc.jetStream();
    Dispatcher disp = nc.createDispatcher();
    MessageHandler handler = (msg) -> {
        try {
            iMessageHandler.onMessageReceived(msg);
        } catch (Exception exc) {
            msg.nak();
        }
    };
    ConsumerConfiguration cc = ConsumerConfiguration.builder()
            .durable(queueName)
            .deliverGroup(queueName)
            .maxDeliver(3)
            .ackWait(Duration.ofMinutes(2))
            .build();
    PushSubscribeOptions so = PushSubscribeOptions.builder()
            .stream(streamName)
            .configuration(cc)
            .build();
    js.subscribe(subjectKey, disp, handler, false, so);
    System.out.println("NatsUtil: " + queueName + " subscribed");
}
IMessageHandler is my custom interface to handle nats.io received messages.
First, configure the NATS connection. Here you will specify all your connection details like server address(es), authentication options, connection-level callbacks etc.
Connection natsConnection = Nats.connect(
        new Options.Builder()
                .server("nats://localhost:4222")
                .connectionListener((connection, eventType) -> {})
                .errorListener(new ErrorListener() {})
                .build());
Then construct a JetStream instance
JetStream jetStream = natsConnection.jetStream();
Now you can subscribe to subjects. Note that JetStream consumers can be durable or ephemeral, can work according to push or pull logic. Please refer to NATS documentation (https://docs.nats.io/nats-concepts/jetstream/consumers) to make the appropriate choice for your specific use case. The following example constructs a durable push consumer:
// Subscribe to a subject.
String subject = "my-subject";
// Queues are analogous to Kafka consumer groups, i.e. consumers belonging
// to the same queue (or, better to say, reading the same queue) will get
// only one instance of each message from the corresponding subject,
// and only one of those consumers will be chosen to process the message.
String queueName = "my-queue";
// Choosing the delivery policy is analogous to setting the current offset
// in a partition for a consumer or consumer group in Kafka.
DeliverPolicy deliverPolicy = DeliverPolicy.New;
PushSubscribeOptions subscribeOptions = ConsumerConfiguration.builder()
        .durable(queueName)
        .deliverGroup(queueName)
        .deliverPolicy(deliverPolicy)
        .buildPushSubscribeOptions();
Subscription subscription = jetStream.subscribe(
        subject,
        queueName,
        natsConnection.createDispatcher(),
        natsMessage -> {
            // This callback will be called for incoming messages
            // asynchronously. Every subscription configured this
            // way will be backed by its own thread, which will be
            // used to call this callback.
        },
        true, // true if you want received messages to be acknowledged
              // automatically; otherwise you will have to call
              // natsMessage.ack() manually in the above callback function
        subscribeOptions);
As for a declarative API (i.e. some form of @NatsListener annotation analogous to @KafkaListener from the Spring for Apache Kafka project), there is none available out of the box in Spring. If you feel you absolutely need it, you can write one yourself if you are familiar with Spring BeanPostProcessors or other extension mechanisms that can help do that (a rough sketch follows after the links below). Alternatively, you can look at third-party libraries; it seems a number of people (including myself) felt a bit uncomfortable when switching from Kafka to NATS, so they tried to bring the usual way of doing things with them from the Kafka world. Some examples can be found on GitHub:
https://github.com/linux-china/nats-spring-boot-starter,
https://github.com/dstrelec/nats
https://github.com/amalnev/declarative-nats-listeners
There may be others.
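If you do decide to roll your own, a rough sketch of the BeanPostProcessor route could look like the code below. The @NatsListener annotation and the NatsListenerBeanPostProcessor class are made-up names for illustration, the consumer configuration is minimal, and a Connection bean is assumed to exist in the application context; treat it as a starting point, not a finished implementation.

import java.lang.annotation.*;
import java.lang.reflect.Method;
import io.nats.client.*;
import io.nats.client.api.ConsumerConfiguration;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.BeanInitializationException;
import org.springframework.beans.factory.config.BeanPostProcessor;
import org.springframework.stereotype.Component;

// Hypothetical annotation marking a method as a JetStream push consumer.
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@interface NatsListener {
    String subject();
    String queue();
}

// Scans beans for @NatsListener methods and wires them to push subscriptions.
@Component
class NatsListenerBeanPostProcessor implements BeanPostProcessor {

    private final Connection natsConnection; // assumes a Connection bean is defined elsewhere

    NatsListenerBeanPostProcessor(Connection natsConnection) {
        this.natsConnection = natsConnection;
    }

    @Override
    public Object postProcessAfterInitialization(Object bean, String beanName) throws BeansException {
        for (Method method : bean.getClass().getMethods()) {
            NatsListener listener = method.getAnnotation(NatsListener.class);
            if (listener == null) {
                continue;
            }
            try {
                JetStream jetStream = natsConnection.jetStream();
                Dispatcher dispatcher = natsConnection.createDispatcher();
                PushSubscribeOptions options = ConsumerConfiguration.builder()
                        .durable(listener.queue())
                        .deliverGroup(listener.queue())
                        .buildPushSubscribeOptions();
                // Auto-ack for simplicity; the annotated method receives the raw Message.
                jetStream.subscribe(listener.subject(), dispatcher,
                        msg -> invokeQuietly(bean, method, msg), true, options);
            } catch (Exception e) {
                throw new BeanInitializationException("Failed to subscribe bean " + beanName, e);
            }
        }
        return bean;
    }

    private void invokeQuietly(Object bean, Method method, Message msg) {
        try {
            method.invoke(bean, msg);
        } catch (Exception e) {
            // log/handle listener invocation failures here
        }
    }
}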

Kinesis with SQS DLQ missing event data

I'm trying to set up a DLQ for a Kinesis stream.
I used SQS and set it as the Kinesis on-failure destination.
The Kinesis stream is attached to a Lambda that always throws an error, so the event goes straight to the SQS DLQ.
I can see the events in the SQS queue, but the payload of the event (the JSON I send as part of the event) is missing. In the Lambda, if I print the event before throwing the exception, I can see the base64-encoded data, but it is not in my DLQ.
Is there a way to send the event data to the DLQ as well? I want to be able to examine the cause of the error correctly and put the event back into the Kinesis stream after I have fixed the issue in the Lambda.
https://docs.aws.amazon.com/lambda/latest/dg//with-kinesis.html#services-kinesis-errors
The actual records aren't included, so you must process this record and retrieve them from the stream before they expire and are lost.
According to the above, the event payload won't be sent to the DLQ, so the "missing event data" is expected here.
Therefore, in order to get the actual records back, you might want to try something like the following.
1) Assuming we have the following Kinesis batch info:
{
    "KinesisBatchInfo": {
        "shardId": "shardId-000000000001",
        "startSequenceNumber": "49601189658422359378836298521827638475320189012309704722",
        "endSequenceNumber": "49601189658422359378836298522902373528957594348623495186",
        "approximateArrivalOfFirstRecord": "2019-11-14T00:38:04.835Z",
        "approximateArrivalOfLastRecord": "2019-11-14T00:38:05.580Z",
        "batchSize": 500,
        "streamArn": "arn:aws:kinesis:us-east-2:123456789012:stream/mystream"
    }
}
2) We can get the records back by doing something like:
import AWS from 'aws-sdk';

const kinesis = new AWS.Kinesis();

const ShardId = 'shardId-000000000001';
const ShardIteratorType = 'AT_SEQUENCE_NUMBER';
const StreamName = 'my-awesome-stream';
const StartingSequenceNumber =
  '49601189658422359378836298521827638475320189012309704722';

const { ShardIterator } = await kinesis
  .getShardIterator({
    ShardId,
    ShardIteratorType,
    StreamName,
    StartingSequenceNumber,
  })
  .promise();

const records = await kinesis
  .getRecords({
    ShardIterator,
  })
  .promise();

console.log('Records', records);
NOTE: don't forget to make sure your process has permission for 1) kinesis:GetShardIterator and 2) kinesis:GetRecords.
Hope that helps!

WebFlux/Reactive Spring RabbitMQ message is acknowledged even when the save failed

I've recently started working with Spring WebFlux and RabbitMQ, along with the Cassandra reactive repository. What I've noticed is that the message is acknowledged even if saving to Cassandra didn't succeed for some element. I propagate the exception thrown during saving, but even so the message is taken off the queue. I'm wondering what I should do to let RabbitMQ know that this message should be considered failed (I want to reject the message so it is sent to a dead letter queue).
@RabbitListener(queues = Constants.SOME_QUEUE, returnExceptions = "true")
public void receiveMessage(final List<ItemList> itemList) {
    log.info("Received message from queue: {}", Constants.SOME_QUEUE);
    itemService.saveAll(itemList)
            .subscribe(
                    item -> log.info("Saving item with {}", item.getId()),
                    error -> {
                        log.error("Error during saving item", error);
                        throw new AmqpRejectAndDontRequeueException(error.getMessage());
                    },
                    () -> log.info(Constants.SOME_QUEUE +
                            " queue - {} items saved", itemList.size())
            );
}
Reactive code is non-blocking; the message will be acked as soon as the listener thread returns to the container. You need to somehow block the listener thread (e.g. with a Future<?>) and wake it up when the Cassandra operation completes, exiting normally if it was successful, or throwing an exception on failure so the message will be redelivered.
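A minimal sketch of that blocking approach, assuming itemService.saveAll returns a reactive Flux (the .then().block() below is an illustration of the idea, not tested code):

@RabbitListener(queues = Constants.SOME_QUEUE)
public void receiveMessage(final List<ItemList> itemList) {
    try {
        // Block the listener thread until the reactive save completes; an error
        // then surfaces here as an exception instead of being swallowed by subscribe().
        itemService.saveAll(itemList).then().block();
        log.info("{} items saved", itemList.size());
    } catch (Exception e) {
        log.error("Error during saving items", e);
        // Rejecting without requeue sends the message to the dead letter queue.
        throw new AmqpRejectAndDontRequeueException("Saving to Cassandra failed", e);
    }
}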
I solved my problem by explicitly sending an acknowledge/reject to RabbitMQ. It meant I had to write a little more code, but now it works and I have full control over what is happening.
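For completeness, a hedged sketch of what that manual acknowledge/reject flow can look like (not the exact code from above): it assumes the container acknowledge mode is MANUAL, that itemService.saveAll returns a reactive type, and that Channel, AmqpHeaders and @Header come from the RabbitMQ client and Spring AMQP/messaging packages.

@RabbitListener(queues = Constants.SOME_QUEUE, ackMode = "MANUAL")
public void receiveMessage(final List<ItemList> itemList,
                           Channel channel,
                           @Header(AmqpHeaders.DELIVERY_TAG) long deliveryTag) {
    itemService.saveAll(itemList)
            .then()
            .subscribe(
                    unused -> { },
                    error -> reject(channel, deliveryTag, error),
                    () -> ack(channel, deliveryTag));
}

private void ack(Channel channel, long deliveryTag) {
    try {
        channel.basicAck(deliveryTag, false);
    } catch (IOException e) {
        log.error("Failed to ack message", e);
    }
}

private void reject(Channel channel, long deliveryTag, Throwable error) {
    log.error("Error during saving items", error);
    try {
        // requeue = false routes the message to the configured dead letter queue
        channel.basicReject(deliveryTag, false);
    } catch (IOException e) {
        log.error("Failed to reject message", e);
    }
}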
