Writing tests with local dynamodb using mocha - mocha.js

I am using DynamoDB for an API service I am writing. I started writing tests and found that there is no command (or query) that destroys all the items in a table. I am using vogels to access DynamoDB.
I usually clean the table before every test. How do I do that, given that there is no single command (or query) that deletes all the items in a table?
If I delete the items one by one, the tests start executing before all the items are deleted.

CRUD operations in DynamoDB are atomic per item. There is no API available that deletes all the items in a DynamoDB table in one call.
Solution 1:
The recommended solution is to delete the table and recreate it.
Solution 2:
Use batchWriteItem with DeleteRequest to delete multiple items in one go. A single batch write request can contain at most 25 items, so you delete in batches until the table is empty. A sketch of this is shown below.
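As a sketch of Solution 2, assuming the AWS SDK for JavaScript (v2) and a hypothetical Users table with a hash key named id (adjust to your own schema), scanning and deleting in batches of 25 could look like this:

var AWS = require('aws-sdk');
var docClient = new AWS.DynamoDB.DocumentClient();

// Hypothetical table and key name -- replace with your own schema.
var TABLE = 'Users';

function clearTable(callback) {
  // Fetch only the key attribute; that is all a DeleteRequest needs.
  docClient.scan({ TableName: TABLE, ProjectionExpression: 'id' }, function (err, data) {
    if (err) return callback(err);
    if (data.Items.length === 0) return callback(); // table is empty, done

    // A single batchWrite call accepts at most 25 requests.
    var requests = data.Items.slice(0, 25).map(function (item) {
      return { DeleteRequest: { Key: { id: item.id } } };
    });

    var params = { RequestItems: {} };
    params.RequestItems[TABLE] = requests;

    docClient.batchWrite(params, function (err) {
      if (err) return callback(err);
      clearTable(callback); // repeat until the scan comes back empty
    });
  });
}

Unprocessed items are not retried explicitly here; the recursive re-scan simply picks them up on the next pass.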
waitFor:
After deleting the table, wait until the resource no longer exists. Similarly, after creating the table, wait until the resource is available.
var params = {
  TableName: 'STRING_VALUE' /* required */
};
dynamodb.waitFor('tableNotExists', params, function(err, data) {
  if (err) console.log(err, err.stack); // an error occurred
  else console.log(data);               // successful response
});
Waits for the tableNotExists state by periodically calling the
underlying DynamoDB.describeTable() operation every 20 seconds (at
most 25 times).
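Tying this back to mocha: the delete/create/wait sequence can run in a beforeEach hook so the tests only start once the table really exists again. A rough sketch, assuming local DynamoDB on port 8000 and a minimal Users table definition (both are assumptions, not taken from the question):

var AWS = require('aws-sdk');
var dynamodb = new AWS.DynamoDB({ endpoint: 'http://localhost:8000', region: 'local' });

// Assumed table definition -- replace with the real one (or your vogels model's schema).
var createParams = {
  TableName: 'Users',
  AttributeDefinitions: [{ AttributeName: 'id', AttributeType: 'S' }],
  KeySchema: [{ AttributeName: 'id', KeyType: 'HASH' }],
  ProvisionedThroughput: { ReadCapacityUnits: 1, WriteCapacityUnits: 1 }
};

beforeEach(function (done) {
  this.timeout(0); // waiting for table state changes can exceed mocha's default timeout
  dynamodb.deleteTable({ TableName: 'Users' }, function () {
    // deleteTable errors are ignored here, in case the table does not exist yet
    dynamodb.waitFor('tableNotExists', { TableName: 'Users' }, function () {
      dynamodb.createTable(createParams, function (err) {
        if (err) return done(err);
        dynamodb.waitFor('tableExists', { TableName: 'Users' }, done);
      });
    });
  });
});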

Related

Analytics Microservices

I'm implementing a microservice responsible for generating analytics, retrieving data asynchronously from other microservices through RabbitMQ.
I'm trying to understand whether, every time there is an event on domain data, it should be sent to RabbitMQ so that the analytics database (MongoDB) is updated.
This approach would update the same document (retrieved from the database) every time there is an event that touches that document.
-- Example:
{
  "date": "2022-06-15",
  "day": "Monday",
  "restaurantId": 2,
  "totalSpent": 250,
  "nOfLogin": 84,
  "categories": [
    { "category": "wine", "total": 100 },
    { "category": "burgers", "total": 150 }
  ],
  "payment": [
    { "method": "POS", "total": 180 },
    { "method": "Online", "total": 20 },
    { "method": "Cash", "total": 50 }
  ],
  ...
}
So when an event with some data arrives, it should update the related data and save it to MongoDB. For example, an event like
{
  "category": "wine",
  "total": 2
}
should add its total to the matching category and save the document.
--End Example
The part I'm struggling with is that if there are a lot of events on the same document, it would be retrieved twice (or more, depending on the events) from the database, generating a concurrency error.
At first I thought the best approach would be Spring Batch (retrieving data from the different databases, transforming it and sending it to RabbitMQ), but it's not real-time and it would have to be scheduled with Quartz.
To give you an idea, the kinds of data involved are:
quantity of products ordered (real-time and from the database)
quantity of customers logged in (daily and subdivided into hours, also real-time)
This is not all of the data, but these are the values that would be sent many times during the day.
I don't want to flood RabbitMQ, but I'm struggling to understand which approach is best (and which design pattern to use for this kind of situation).
Thanks in advance
I see several possible solutions to this.
Optimistic locking.
What this strategy does is it maintains the version of your document.
On each document read the version attribute is fetched alongside other attributes.
Document update is performed as usual, but what's different in this approach is that the update query must check whether the version has changed (i.e., the document was updated by another concurrent event) since the read operation.
In case the version did change, you would have to handle an optimistic locking exception.
How you do that largely depends on your needs, i.e., log & discard the event, retry, etc.
Otherwise, the query increments the version and updates the rest of the attributes.
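A minimal sketch of the optimistic locking idea with the MongoDB Node.js driver (the collection name, field names and the retry policy are assumptions for illustration):

const { MongoClient } = require('mongodb');

// Adds `amount` to one category's total, but only if the document's version
// has not changed since we read it.
async function addCategoryTotal(analytics, docId, category, amount) {
  const doc = await analytics.findOne({ _id: docId });

  const categories = doc.categories.map((c) =>
    c.category === category ? { ...c, total: c.total + amount } : c
  );

  const result = await analytics.updateOne(
    { _id: docId, version: doc.version },          // fails if another event won the race
    { $set: { categories }, $inc: { version: 1 } } // bump the version on success
  );

  if (result.matchedCount === 0) {
    // Optimistic locking conflict: retry, or log & discard the event.
    throw new Error('Concurrent update detected, retry the event');
  }
}

// Usage sketch:
// const client = await MongoClient.connect('mongodb://localhost:27017');
// await addCategoryTotal(client.db('analytics').collection('daily'), someId, 'wine', 2);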
Different events update different attributes. In this case all you need to do is update the individual attributes. This approach is simpler and more efficient, since there's no need for the read operation, and no extra effort of maintaining and checking a version attribute on each update. A sketch follows below.
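For this second approach, MongoDB's atomic update operators remove the need for the read (and the version field) entirely; a sketch with the positional $ operator, reusing the assumed schema from the sketch above:

// Atomically add an event's total to the matching category -- no prior read,
// no version attribute. `analytics` is the same collection as above.
async function applyCategoryEvent(analytics, docId, category, amount) {
  await analytics.updateOne(
    { _id: docId, 'categories.category': category },
    { $inc: { 'categories.$.total': amount, totalSpent: amount } }
  );
}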
Message ordering. For this to work properly all of the events must come from the same domain object (aggregate) and the message broker has to support this mechanism. Kafka, for example, does this through topic partitioning, where each message is assigned to a partition based on a hash of its key; the key might be the domain object's id or some other identifier.
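If the message-ordering route is taken with Kafka, the core of it in kafkajs terms looks roughly like this (the topic name, broker address and the choice of restaurantId as the key are illustrative assumptions):

const { Kafka } = require('kafkajs');

const kafka = new Kafka({ clientId: 'analytics', brokers: ['localhost:9092'] });
const producer = kafka.producer();

async function publishEvent(event) {
  await producer.connect();
  // Messages with the same key always land on the same partition, so all
  // events for one restaurant are consumed in order.
  await producer.send({
    topic: 'restaurant-events',
    messages: [{ key: String(event.restaurantId), value: JSON.stringify(event) }]
  });
}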

How can I upload a big file into MySQL DB Laravel

I have a huge table from which I get the users; then I make some calculations and store the new information in another table. The problem is that it never finishes, because the table is too big and I think the server times out and kills the process. I have read that there is chunk() or something like that, or that I can use paginate, but what I don't understand is this:
My query is:
$user = Users::all(); // <- this throws an error
But if I want to do it in batches of 25 rows:
$user = Users::paginate(25);
The thing is that the code above will not work for my purpose, because it will only return 25 rows. What I wonder is:
How can I make it return 25 rows, finish processing those 25 rows, and then start again with the next 25 rows?
Thanks
You're looking for either the chunk feature (which will process batches of records in chunks of a size you select) or the lazy feature (which will fetch one at a time).
You may still run into timeouts on the server side if this is a web request. If so, you may need to do this in an Artisan command, or split it up into queue jobs.

Oracle wait for table update in procedure

I have two databases (let's call them A and B) with a one-way dblink: a security restriction on database A doesn't allow connections from outside.
I need to put some data from B into A, process it and return a response. This task was done in the following way:
On database B I made two tables, REQUESTS_IN (req_id number, data clob) and REQUESTS_OUT (req_id number, data clob). When I need to send data to database A, I put it into REQUESTS_IN and start a job that checks for new rows in REQUESTS_OUT.
On database A there is a job which checks for new rows in REQUESTS_IN over the dblink, gets the data, processes it and puts the answer into REQUESTS_OUT.
Depending on the business logic of the data processing and the job delays, it can take up to a minute to get the response in REQUESTS_OUT. That's fine while the application is async and can wait for the response.
Now I need to make a synchronous version of this solution. On database B the application will call some function to send data to database A, and it needs to return the response in the same call.
I tried to find a solution among Oracle's built-in packages, and the only thing that comes to mind is dbms_pipe, i.e., on database B use dbms_pipe.receive_message to wait for a message and on database A use dbms_pipe.send_message. But I'm not sure whether this is the correct solution.

How to avoid concurrent requests to a lambda

I have a ReportGeneration lambda that takes a request from the client and adds the following entries to a DDB table.
Customer ID <hash key>
ReportGenerationRequestID(UUID) <sort key>
ExecutionStartTime
ReportExecutionStatus < workflow status>
I have enabled a DDB stream trigger on this table, and creating an entry in this table triggers the report generation workflow. This is a multi-step workflow that takes a while to complete.
Where ReportExecutionStatus is the status of the report processing workflow.
I am supposed to maintain the history of all report generation requests that a customer has initiated.
What I am trying to do now is avoid concurrent processing requests by the same customer, so if a report for a customer is already being generated, don't create another record in DDB.
Option considered:
Query DDB for the customer ID (consistent read):
- From the list, see if any entry is either InProgress or Scheduled
- If not, then create a new one (consistent write)
- Otherwise, return the already existing one
Issue: if the customer clicks twice within a split second to generate a report, two lambdas can be triggered, causing two entries in DDB, and two parallel workflows can be initiated, something that I don't want.
Can someone recommend the best approach to ensure that there are no concurrent executions (two workflows) for the same report from the same customer?
In short, when one execution is in progress another one should not start.
You can use a ConditionExpression to only create the entry if it doesn't already exist; if you need to check different items, then you can use DynamoDB Transactions to check whether another item already exists and, if not, create your item.
Those would be the ways to do it with DynamoDB, giving you stronger consistency.
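A sketch of the conditional-write idea with the DocumentClient. Because the question's table uses a composite key (Customer ID plus a per-request UUID), the condition is shown here against a hypothetical per-customer lock item in a separate ReportLocks table; the table and attribute names are assumptions:

var AWS = require('aws-sdk');
var docClient = new AWS.DynamoDB.DocumentClient();

// The put succeeds only if no lock item exists yet for this customer.
function acquireReportLock(customerId, callback) {
  docClient.put({
    TableName: 'ReportLocks', // hypothetical table keyed solely by CustomerId
    Item: { CustomerId: customerId, ReportExecutionStatus: 'InProgress' },
    ConditionExpression: 'attribute_not_exists(CustomerId)'
  }, function (err) {
    if (err && err.code === 'ConditionalCheckFailedException') {
      return callback(null, false); // a report is already being generated
    }
    if (err) return callback(err);
    callback(null, true);
  });
}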
Another option would be to use SQS FIFO queues. You can group them by the customer ID, then you wouldn't have concurrent processing of messages for the same customer. Additionally with this SQS solution you get all the advantages of using SQS - like automated retry mechanisms or a dead letter queue.
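With the SQS FIFO option, the customer ID becomes the message group, e.g. (the queue URL and IDs below are placeholders):

var AWS = require('aws-sdk');
var sqs = new AWS.SQS();

// Messages within the same group are delivered one at a time, in order.
sqs.sendMessage({
  QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123456789012/report-requests.fifo', // placeholder
  MessageBody: JSON.stringify({ customerId: 'C-42', requestId: 'some-uuid' }),
  MessageGroupId: 'C-42',              // one group per customer
  MessageDeduplicationId: 'some-uuid'  // or enable content-based deduplication on the queue
}, function (err, data) {
  if (err) console.log(err, err.stack);
  else console.log(data.MessageId);
});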
Limiting the number of concurrent Lambda executions is not possible as far as I know. That is the whole point of AWS Lambda, to easily scale and run multiple Lambdas concurrently.
That said, there is probably a better solution for your problem using a DynamoDB feature called "Strongly Consistent Reads"
By default, reads from DynamoDB (if you use the AWS SDK) are eventually consistent, causing the behaviour you observed: two writes to the same table are made, but your Lambda was only able to notice one of those writes.
If you use Strongly consistent reads, the documentation states:
When you request a strongly consistent read, DynamoDB returns a response with the most up-to-date data, reflecting the updates from all prior write operations that were successful.
So your Lambda needs to do a strongly consistent read to your table to check if the customer already has a job running. If there is already a job running the Lambda does not create a new job.
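For reference, such a strongly consistent check could look like this with the DocumentClient (the table name and the FilterExpression on the status values are assumptions based on the question's attributes):

var AWS = require('aws-sdk');
var docClient = new AWS.DynamoDB.DocumentClient();

// Strongly consistent query for any report of this customer that is still running.
docClient.query({
  TableName: 'ReportRequests', // hypothetical name for the table described in the question
  ConsistentRead: true,
  KeyConditionExpression: 'CustomerId = :c',
  FilterExpression: 'ReportExecutionStatus IN (:inProgress, :scheduled)',
  ExpressionAttributeValues: {
    ':c': 'C-42',
    ':inProgress': 'InProgress',
    ':scheduled': 'Scheduled'
  }
}, function (err, data) {
  if (err) console.log(err, err.stack);
  else if (data.Count > 0) console.log('A report is already running, skip');
  else console.log('No running report, safe to create a new one');
});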

Spring @Transactional + Isolation.REPEATABLE_READ for Rate Limiting

We are trying to rate limit the total number of JSON records requested from an API to 10,000 per month.
We store the total count of records in a table against client_id and a timestamp (which is the primary key).
For each request we fetch the record for that client whose timestamp falls within that month.
From this record we get the current count, then increment it by the number of records in the current request and update the DB.
Using a Spring transaction, the pseudocode is as below:
@Transactional(propagation=Propagation.REQUIRES_NEW, isolation=Isolation.REPEATABLE_READ)
public void updateLimitData(String clientId, int currentRecordCount) {
    // step 1: month boundaries
    startOfMonthTimestamp = getStartOfMonth();
    endOfMonthTimestamp = getEndOfMonth();
    // step 2: read the current count from the DB
    latestLimitDetails = fetchFromDB(startOfMonthTimestamp, endOfMonthTimestamp, clientId);
    latestLimitDetails.setCount(latestLimitDetails.getCount() + currentRecordCount);
    // step 3: write the updated count back
    saveToDB(latestLimitDetails);
}
We want to make sure that when multiple threads access the updateLimitData() method, each thread gets the updated data for a clientId for a month and does not overwrite the count wrongly.
In the above scenario, if multiple threads access updateLimitData() and reach step 3, the first thread will update the count in the DB, then the second thread will update the count in the DB, possibly without the latest value.
I understand from Isolation.REPEATABLE_READ that a write lock is placed on the rows only when the update is issued at step 3 (by which time the other thread may already have stale data). How can I ensure that threads always get the latest count from the table in a multithreaded scenario?
One solution that came to my mind is synchronizing this block, but that will not work well in a multi-server scenario.
Please suggest a solution.
A transaction would not help you unless you lock the table/row whilst doing this operation (don't do that as it will affect performance).
You can migrate this to the database, doing this increment within the database using a stored procedure or function call. This will ensure ACID and transactional safety as this is built into the database.
I recommend doing this using the standard Spring Actuator to produce a count of API calls; however, this will mean rewriting your service to use the Actuator endpoint and not the database. You can link this to your gateway/firewall/load balancer to deny access to the API once the quota is reached. This keeps your API endpoint pure, with this logic removed from the API call, and all new APIs you develop will automatically get this functionality.
