Sometimes a delay is required for queries to execute successfully in CockroachDB - cockroachdb

I'm trying to migrate our PostgreSQL database to CockroachDB. I'm running our mocha tests after setting up a single-node server locally (MacBook Pro). There were some issues, which I fixed one by one, except for the last one: tests that used to pass started to fail with the following error message:
Error: current transaction is aborted, commands ignored until end of transaction block
This is my running node configuration (which is run in insecure mode):
CockroachDB node starting at 2019-01-27 13:11:46.17270052 +0000 UTC (took 0.9s)
build: CCL v2.1.1 @ 2018/11/19 18:20:16 (go1.10.3)
webui: http://localhost:8080
sql: postgresql://root@localhost:26257?sslmode=disable
client flags: cockroach <client cmd> --host=localhost:26257 --insecure
I tried running the SQL queries with a 1-second delay after inserting the records, and they started to work.
I would appreciate it if somebody could explain why this kind of delay may be required, and suggest a solution.

Thanks to Ron. The problem was in the code where I tried to handle error code 40001 (retryable errors). The proper way to handle this situation is to roll back the transaction before retrying, which I didn't do. The correct code should look like this:
.catch((e) => {
  // Roll back first; then, if the error is a retryable error (40001),
  // retry the whole batch, otherwise re-throw the exception.
  return client.query('rollback').then(() => {
    if (e.code === '40001') {
      return batch(list)
    } else {
      throw e
    }
  })
})
but I had originally done it wrong, like this:
// *** THIS IS THE WRONG CODE.
.catch((e) => {
  // check whether the error is a "retryable error" (40001) or not; if yes,
  // retry the whole batch, if not roll back and then re-throw the exception
  if (e.code == 40001) {
    return batch(list)
  } else {
    return client.query('rollback').then(() => {
      throw e
    })
  }
})
So the connection that held the aborted transaction was not rolled back before being returned to the connection pool, which caused the error the next time it was used.
As for why the 1-second delay helps to overcome the situation, my guess is that the connection was killed by the connection pool manager after a second of being idle.
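For readers hitting the same error from JDBC rather than node-postgres, here is a minimal sketch of the same rollback-before-retry handling; the class name, table, statement and retry limit are made up for the example:
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class RetryExample {
    // Run a batch inside an explicit transaction; on SQLSTATE 40001 always
    // ROLLBACK before retrying, and on any other error roll back and rethrow,
    // so an aborted transaction never goes back into the connection pool.
    static void runBatchWithRetry(Connection conn, int maxAttempts) throws SQLException {
        conn.setAutoCommit(false);
        for (int attempt = 1; ; attempt++) {
            try (Statement stmt = conn.createStatement()) {
                // illustrative statement standing in for the real batch
                stmt.executeUpdate("INSERT INTO accounts (id, balance) VALUES (1, 100)");
                conn.commit();
                return;
            } catch (SQLException e) {
                conn.rollback(); // roll back the aborted transaction first
                if (!"40001".equals(e.getSQLState()) || attempt >= maxAttempts) {
                    throw e; // not retryable, or out of attempts: rethrow after rolling back
                }
                // retryable: loop and run the whole batch again
            }
        }
    }
}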

The error you are seeing is the result of trying to issue another command in the context of a transaction that has already been aborted.
But without seeing what was happening inside the transaction, I wouldn't be able to tell why it aborted.
Could you share the DML, DDL, and schema that you're using?
Thanks,
Ron

Related

How to deal with FATAL: terminating connection due to idle-in-transaction timeout

I have one method in which some DB operations happen together with an API call. This is a scenario with Spring and a Postgres DB. We also have this property set for idle transactions in the Postgres DB:
idle_in_transaction_session_timeout = 10min
Now the issue is that I sometimes get this exception:
org.postgresql.util.PSQLException: This connection has been closed. Root Cause is FATAL: terminating connection due to idle-in-transaction timeout
For example, my code looks like this:
@Transactional(value = "transactionManagerDC")
public void execute()
{
    // 1. select from DB - takes 2 min
    // 2. call another API - takes 10 min <- this is when Postgres closes my connection
    // 3. select from DB - throws exception
}
What could be the correct design for this? The output of the select in step 1 is used in the API call, and the output of that API call is used in the select in step 3, so these three steps are interdependent.
Ten minutes is a very long time to hold a transaction open. Your RDBMS server automatically disconnects your session, rolling back the transaction, because it cannot tell whether the transaction was started by an interactive (command-line) user who then went out to lunch without committing it, or by a long-running task like yours.
Open transactions can block other users of your RDBMS, so it's best to COMMIT them quickly.
Your best solution is to refactor your application code so it can begin, and then quickly commit, the transaction after the ten-minute response from that other API call.
It's hard to give you specific advice without knowing more. But you could set some sort of status = "API call in progress" column on a row before you call that slow API, and clear that status within your transaction after the API call completes.
Alternatively, you can set the transaction timeout for just that connection with something like this, to reduce the out-to-lunch risk on your system.
SET idle_in_transaction_session_timeout = '15m';
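As a rough illustration of that refactoring (all names here are made up for the example), the idea is to split the work into two short transactions and keep the slow API call outside of any transaction:
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Hypothetical collaborators, named only for illustration.
interface ReportDao {
    @Transactional(value = "transactionManagerDC")
    String loadInput();                // step 1: short select, commits right away

    @Transactional(value = "transactionManagerDC")
    void storeResult(String result);   // step 3: short select/update, commits right away
}

interface ExternalApiClient {
    String call(String input);         // step 2: the slow (~10 min) API call
}

@Service
class ReportService {
    private final ReportDao dao;
    private final ExternalApiClient apiClient;

    ReportService(ReportDao dao, ExternalApiClient apiClient) {
        this.dao = dao;
        this.apiClient = apiClient;
    }

    // Deliberately NOT @Transactional: no DB transaction stays open while
    // the slow API call is in flight, so idle_in_transaction_session_timeout
    // never fires.
    public void execute() {
        String input = dao.loadInput();
        String result = apiClient.call(input);
        dao.storeResult(result);
    }
}
Because the transactional methods live on a separate bean, the Spring proxy is actually applied; calling them as private methods of the same class would bypass @Transactional.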

How to send an exception to Sentry from Laravel Job only on final fail?

Configuration
I'm using Laravel 8 with sentry/sentry-laravel plugin.
There is a Job that works just fine 99% of time. It retries N times in case of any problems due to:
public $backoff = 120;

public function retryUntil()
{
    return now()->addHours(6);
}
And it simply calls some service:
public function handle()
{
    // Service calls some external API
    $service->doSomeWork(...);
}
Method doSomeWork sometimes throws an exception due to network problems, like Curl error: Operation timed out after 15001 milliseconds with 0 bytes received. This is fine because of the automatic retries; in most cases the next retry will succeed.
Problem
Every curl error is sent to Sentry. As an administrator I must check every alert, because this job is pretty important and I can't afford to miss an actually failed job. For example:
There is some network problem that is not resolved for an hour.
The application queues a Job.
Every 2 minutes the application sends a similar message to Sentry.
After the network problem is resolved, the job succeeds, so no attention is required.
But we are seeing dozens of errors that could theoretically be ignored. What if there is an actual problem in that pile and I miss it?
Question
How can I make it so that only the "final" job failure sends a message to Sentry? I mean, only after 6 hours of failed retries would I like to receive a single alert.
What I tried
There is one workaround that kind of "works". We can replace the Exception with a SomeCustomException and add it to the \App\Exceptions\Handler::$dontReport array. In that case no "intermediate" messages are sent to Sentry.
But when the job finally fails, Laravel sends the standard ... job has been attempted too many times or run too long message without details of the actual error.

hibernate repeat query

I have a web app (I use Hibernate as the ORM) that populates data from an Oracle 11 DB.
For a short period of time some Oracle packages become invalid and then become valid again (it's a legacy data load, and during this process the user can use the rest of the UI).
When the data load finishes and the user performs any query against those packages, I get an error:
ORA-04068: existing state of packages has been discarded
ORA-04061: existing state of package "sche.pck" has been invalidated
ORA-04065: not executed, altered or dropped package "sche.pck"
ORA-06508: PL/SQL: could not find program unit being called: "sche.pck"
If the user presses F5 (on the error message screen), the query executes successfully. Is there any way to repeat the user's query when such errors happen?
Yes: try/catch the exception, inspect the exception message looking for ORA-04068, and if it is found, rerun the query.
Ideally, you should allow only a limited number of retries. Something like:
for (int i = 0; i < 3; i++) {
    try {
        executeQuery();
        break; // successful, stop retrying
    } catch (SQLException ex) {
        // only retry on the invalidated-package error; otherwise rethrow
        if (!ex.getMessage().contains("ORA-06508")) {
            throw ex;
        }
    }
}
This looks a bit hacky, and I'd suggest trying to fix the original problem instead.
Update:
It seems you have to do that in many places, so the above will be tedious. If you really cannot fix the underlying Oracle problem, you can try wrapping your DataSource, Connection and Statement objects in your own implementations that simply delegate to the underlying object but, in the case of executeQuery(), perform the retry.
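As a rough sketch of that delegating-wrapper idea (class name, retry count and matched error codes are illustrative; a real version would implement the full java.sql.Statement interface and wrap Connection/DataSource the same way):
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class RetryingStatement {
    private static final int MAX_ATTEMPTS = 3;
    private final Statement delegate;

    RetryingStatement(Statement delegate) {
        this.delegate = delegate;
    }

    // Forward to the real Statement, retrying when the "package state
    // discarded/invalidated" family of errors shows up.
    ResultSet executeQuery(String sql) throws SQLException {
        SQLException last = null;
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            try {
                return delegate.executeQuery(sql);
            } catch (SQLException ex) {
                String msg = ex.getMessage();
                if (msg == null || !(msg.contains("ORA-04068") || msg.contains("ORA-06508"))) {
                    throw ex; // not the package-state error: don't retry
                }
                last = ex;
            }
        }
        throw last;
    }
}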

Persistable Workflow with MVC - Run throwing exception on completed

I'm running a persistable workflow in an MVC 3 application, which is working out well, but when the workflow is completed, a WorkflowApplicationCompletedException is thrown. The workflow completes successfully: the last actions are done and the instance is deleted from the database.
I've had no luck searching for an answer so far, so any ideas about what is causing the exception would be appreciated. My current workaround is catching the exception and doing the OnCompleted work there.
I'm simply creating a WorkflowApplication, loading it and resuming the bookmark.
Any hints or suggestions appreciated.
Thanks
application.Load(new Guid(basket.StandardFields.Settings));
application.ResumeBookmark(application.GetBookmarks().First().BookmarkName, WorkflowInputs);
application.PersistableIdle = delegate(WorkflowApplicationIdleEventArgs e)
{
    if (e.Bookmarks != null && e.Bookmarks.Count > 0)
    {
        _viewName = e.Bookmarks[0].BookmarkName;
    }
    syncContext.OperationCompleted();
    return PersistableIdleAction.Unload;
};
application.Completed = delegate(WorkflowApplicationCompletedEventArgs e)
{
    CompleteWorkflow(syncContext);
};
application.SynchronizationContext.OperationStarted();
try
{
    application.Run();
}
catch (WorkflowApplicationCompletedException)
{
    CompleteWorkflow(syncContext);
}
Edit
It seems that application.ResumeBookmark(bookmark, WorkflowInputs) starts the workflow and completes the activities; then, when I call Run, it complains that it's already completed. But if I don't call Run when ResumeBookmark is called, the browser never gets any information, and I think it waits endlessly, since not even a refresh can knock it out of the waiting state.
It seems that with ResumeBookmark there is no need to call Run afterwards. I think I was doing it in the wrong place before, so the workflow got messed up, but it seems to be working fine now.
if (hasWorkflow)
    application.ResumeBookmark(application.GetBookmarks().First().BookmarkName, WorkflowInputs);
else
    application.Run();
MSDN:
Represents the exception that is thrown when an operation on a workflow instance is not valid because the instance has completed.
The code you show appears valid. However, somewhere you are attempting to resume a workflow that has entered the completed state. You should be checking the Completed property of any workflow you are attempting to resume; throw an InvalidOperationException when it is already completed and you'll see where this is happening.
If this doesn't identify where the problem is, your workflow may not be bookmarking properly. That code is in the activity that is creating the bookmark, so I can't tell if it is being done correctly...

What is JDBC's Connection.isClosed() good for, and why is Snaq DBPool misbehaving on close?

I have the following code in Java:
if (!conn.isClosed())
{
    conn.close();
}
Instead of working, I am awarded with:
java.sql.SQLException: Connection already closed
My connection object is a Snaq.db.CacheConnection
I checked the JavaDocs for isClosed, and they state that:
This method generally cannot be called to determine whether a connection to a database is valid or invalid. A typical client can determine that a connection is invalid by catching any exceptions that might be thrown when an operation is attempted.
So my questions are:
1) What good is JDBC's isClosed() anyway? Since when do we use Exceptions in Java to check for validity?
2) What is the correct pattern for closing a database connection? Should I just close it and swallow exceptions?
3) Any idea why would SnaqDB be closing the connection? (My backend is a Postgres 8.3)
I'll answer your questions with corresponding numbers:
1) I agree with you; it seems strange that isClosed provides the closed state only on a best-effort basis, and that your code still has to be prepared to catch the exception when closing the connection. I think the reason is that the connection may be closed at any time by the database, so any status returned by a query-state method like isClosed is intrinsically stale information: the state may change between checking isClosed and calling close on the Connection.
2) Calling close has no effect on your data or on previous queries. JDBC operations execute with synchronous results, so all useful execution has either succeeded or failed by the time close is called. (This is true with both autoCommit and explicit transaction boundaries.) If your application is a single user accessing a local database, then perhaps showing the error to the user might help them diagnose problems. In other environments, logging the exception and swallowing it is probably the best course of action. Either way, swallowing the exception is safe, as it has no bearing on the state of the database.
3) Looking at the source for SnaqDB CacheConnection, the isClosed method delegates to the underlying connection. So the problem is not there; it lies with the defined contract for isClosed() and Connection.close() throwing an exception.
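For what it's worth, a minimal version of the "close and swallow" pattern from 2) might look like this; the helper name and the logging are illustrative, so use whatever logger your project already has:
import java.sql.Connection;
import java.sql.SQLException;

final class Jdbc {
    // Close the connection unconditionally and swallow (but log) any SQLException.
    // By this point every useful operation has already succeeded or failed, so the
    // exception has no bearing on your data.
    static void closeQuietly(Connection conn) {
        if (conn == null) {
            return;
        }
        try {
            conn.close();
        } catch (SQLException e) {
            System.err.println("Ignoring exception on close: " + e.getMessage());
        }
    }
}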
