Weblogic "Abandoning transaction" warning - jms

We randomly get warnings such as the one below on our WebLogic server. We'd like to better understand what exactly these warnings are and what we can do to avoid them.
Abandoning transaction after 86,606 seconds:
Xid=BEA1-52CE4A8A9B5CD2587CA9(14534444), Status=Committing,
numRepliesOwedMe=0, numRepliesOwedOthers=0,
seconds since begin=86605, seconds left=0,
XAServerResourceInfo[JMS_goJDBCStore]=(ServerResourceInfo[JMS_goJDBCStore]=(state=committed,assigned=go_server),xar=JMS_goJDBCStore,re-Registered=true),
XAServerResourceInfo[weblogic.jdbc.wrapper.JTSXAResourceImpl]=(ServerResourceInfo[weblogic.jdbc.wrapper.JTSXAResourceImpl]=(state=new,assigned=none),xar=weblogic.jdbc.wrapper.JTSXAResourceImpl#1a8fb80,re-Registered=true),
SCInfo[go+go_server]=(state=committed),
properties=({weblogic.jdbc=t3://10.6.202.37:18080}),
local properties=({weblogic.transaction.recoveredTransaction=true}),
OwnerTransactionManager=ServerTM[ServerCoordinatorDescriptor=(CoordinatorURL=go_server+10.6.202.37:18080+go+t3+,XAResources={JMS_goJDBCStore,weblogic.jdbc.wrapper.JTSXAResourceImpl},NonXAResources={})],
CoordinatorURL=go_server+10.6.202.37:18080+go+t3+)
I do understand the BEA explanation:
Error: Abandoning transaction after secs seconds: tx
Description: When a transaction is abandoned, knowledge of the transaction is removed from the transaction manager that was attempting to drive the transaction to completion. The JTA configuration attribute AbandonTimeoutSeconds determines how long the transaction manager should persist in trying to commit or roll back the transaction.
Cause: A resource or participating server may have been unavailable for the duration of the AbandonTimeoutSeconds period.
Action: Check participating resources for heuristic completions and correct any data inconsistencies.
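Two details in the trace are worth noting. First, local properties=({weblogic.transaction.recoveredTransaction=true}) suggests this transaction was replayed from the *.tlog at startup, which would explain why deleting the tlog files makes the warnings disappear. Second, the timeout itself is tunable: the default AbandonTimeoutSeconds is 86400 (24 hours), which lines up with the ~86,606 seconds in the warning. On WebLogic releases that ship WLST, a Jython sketch like the one below would adjust it (domain name, credentials, and URL are placeholders); on older releases such as WLS 7 the equivalent is the AbandonTimeoutSeconds attribute of the JTA element in config.xml.

# WLST (Jython) sketch, run via: java weblogic.WLST set_abandon_timeout.py
connect('weblogic', 'welcome1', 't3://10.6.202.37:18080')  # placeholder credentials
edit()
startEdit()
cd('/JTA/mydomain')                  # 'mydomain' is an assumed domain name
cmo.setAbandonTimeoutSeconds(86400)  # default is 86400 (24h); tune to taste
save()
activate()
disconnect()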
We have observed that you can get rid of these warnings by deleting the *.tlog files, but that doesn't seem like the right way to deal with them.
The warnings refer to JMS and our JMS store. We do use JMS; we just don't understand why transactions are left hanging around, or why they would be "abandoned".

I know it's not very satisfying, but we do delete *.tlog files before startup in our app hosted on WLS 7.
Our app is an event-processing back-end, largely driven by JMS. We aren't interested in preserving transactions across WLS restarts: if a transaction didn't complete before the shutdown, it tends not to complete after a restart. So the *.tlog cleanup just eliminates some warnings and potential flaky behavior.
I don't think JMS is fundamental to any of this, by the way. At least not as far as I know.
Separately, we moved from a JDBC JMS store to local files. That was said to perform better, and we didn't need the location independence we'd get from using JDBC. If that describes your situation too, maybe moving to local files would eliminate the root cause for you (see the sketch below)?
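For reference, in that generation of WebLogic the switch amounts to a JMSFileStore element in config.xml, referenced from the JMSServer's Store attribute. A rough sketch, with all names made up for illustration (check the config.xml reference for your release):

<JMSFileStore Name="goFileStore" Directory="jmsstore"/>
<JMSServer Name="goJMSServer" Store="goFileStore" Targets="go_server"/>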

Related

Lots of retries and transaction expired on NEAR localnet

We created a test chain that runs locally on a computer and launches a network with 4 validators (quite similar to localnet); from there, we deploy a smart contract testing various aspects of the chain (failed transactions, async receipts, args encoding, logs, and such).
Everything can be run/seen here: https://github.com/streamingfast/battlefield-near (it's a bunch of scripts that facilitate running this network and the transactions).
When I deploy my contract, it always takes 2 to 3 retries before the transaction passes correctly. On top of that, I'd say in 33% of cases I reach the retry limit and get a Transaction Expired error.
It seems weird to me that this many retries are needed to deploy a contract, given that everything runs locally on my computer. When deploying the contract, it's the only transaction going in, so there should be no congestion involved (there should actually be no traffic at all).
How can the contract deployment pass right away, without retries and without the transaction ever expiring?
It is possible that the network is too fast given that, as you mentioned, it is a local network running on one machine. This may cause the transaction to expire quickly, especially since the default expiration value on localnet is quite small, I believe. Check transaction_validity_period in your genesis.json and see if setting it to a large number helps; a sketch follows.
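A minimal sketch of that edit, assuming your scripts write a genesis.json you can modify before boot (the path and the value are placeholders):

import json

GENESIS = 'node0/genesis.json'  # placeholder; use wherever battlefield-near writes it

with open(GENESIS) as f:
    genesis = json.load(f)

# Illustrative value: give transactions many more blocks to be
# included before they are considered expired.
genesis['transaction_validity_period'] = 10000

with open(GENESIS, 'w') as f:
    json.dump(genesis, f, indent=2)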

How to crash JBoss based on some condition

I am using JBoss 7.x and have the following use case.
I am going to do load testing of messaging queues with JBoss. The queues are external to JBoss.
I will push a lot of messages into the queue, around 1000. When around 100+ messages have been pushed, I want to crash JBoss. Later I want to restart JBoss to verify the message processing.
I had earlier made use of Byteman to crash the JVM using the following
JAVA_OPTS="-javaagent:/BYTEMAN_HOME/lib/byteman.jar=script:/QUICKSTART_HOME/jta-crash-rec/src/main/scripts/xa.btm ${JAVA_OPTS}"
Details are here: https://github.com/Naresh-Chaurasia/jboss-eap-quickstarts/tree/7.3.x/jta-crash-rec
In the above case, Byteman crashes the JVM whenever an XA transaction happens, but I want to crash the JVM/JBoss only after, say, 100+ messages - i.e., not on each transaction but after processing some number of messages.
I have also tried a few examples from here to get ideas on how to achieve it, but did not succeed: https://developer.jboss.org/docs/DOC-17213#top
Question: How can I crash JBoss / the running JVM using Byteman or some other way?
See the Programmer's Guide that comes bundled with the distribution.
The sections headed "CountDowns" and "Aborting Execution" provide what's necessary. These are built-in features of the Rule Language; a sketch follows.
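For illustration, a rule script along these lines combines the two features. The class and method names are placeholders for whatever consumes your messages; flag/flagged, createCountDown, countDown, traceln and killJVM are Rule Language built-ins:

RULE create countdown on first message
CLASS com.example.MessageProcessor
METHOD onMessage
AT ENTRY
IF NOT flagged("init")
DO flag("init"),
   createCountDown("msgs", 100)
ENDRULE

RULE crash JVM once the countdown trips
CLASS com.example.MessageProcessor
METHOD onMessage
AT EXIT
IF countDown("msgs")
DO traceln("100+ messages processed, halting JVM"),
   killJVM()
ENDRULE

Load it the same way as your xa.btm above, via the -javaagent ...=script: argument.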

How resilient is reporting to Trains server?

How would Trains go about sending any missing data to the server in the following scenarios?
Internet connection breaks temporarily while running an experiment
Internet connection breaks and doesn't come back before the experiment ends (any manual way to send all the data that was missed?)
The machine running Trains server resets in the middle of an experiment
Disclaimer: I'm part of the allegro.ai Trains team
Trains will automatically retry sending logs, basically forever. The logs/metrics are sent on a background thread, so it should not interfere with execution. You can control the retry frequency by adjusting the sdk.network.iteration.retry_backoff_factor_sec parameter in your ~/trains.conf file (see example here).
The experiment will try to flush all metrics to the backend when it ends, i.e. the process will wait at_exit until all metrics are sent. This means that if the connection dropped, it will retry until the connection is up again. If the experiment was aborted manually, there is no way to capture/resend those lost metric reports. That said, with the new 0.16 version, offline mode was introduced: you can run the entire experiment offline, then later report all logs/metrics/artifacts, as sketched below.
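A minimal sketch of that offline flow (API names as of trains 0.16; treat them as assumptions if you're on a different version):

from trains import Task

# Run the whole experiment offline: nothing is sent to the server,
# everything is recorded in a local session folder instead.
Task.set_offline(True)
task = Task.init(project_name='examples', task_name='offline run')
# ... train and report scalars/artifacts as usual ...

# Later, with connectivity restored, replay the stored session to the server
# (the path is illustrative; use the folder/zip printed by the offline run):
# Task.import_offline_session('/path/to/offline_session.zip')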
The Trains-Server machine itself is fully stateless (state is stored in the databases on the machine), so from the experiment's perspective the connection was simply dropped for a few minutes and then became available again. To your question: if the Trains-Server restarts, it is transparent to all experiments; they continue as usual and no reports are lost.

To close or not to close an Oracle Connection?

My application has performance issues, so I started to investigate from the root: the connection to the database.
Best practice says "open a connection, use it, and close it as soon as possible", but I don't know the overhead this causes, so my questions are:
1 - Is "open, use, close connections as soon as possible" the best approach when using ODP.NET?
2 - Is there a way to use connection pooling with ODP.NET, and how?
I am thinking about creating a List to store some connection strings and writing logic to choose the "best" connection each time one is needed. Is this the best way to do it?
Here is a slide deck containing Oracle's recommended best practices:
http://www.oracle.com/technetwork/topics/dotnet/ow2011-bp-performance-deploy-dotnet-518050.pdf
You automatically get a connection pool when you create an OracleConnection. For most middle-tier applications you will want to take advantage of that. You will also want to tune your pool for a realistic workload by turning on Performance Counters in the registry.
Please see the ODP.NET online help for details on connection pooling. Pool settings are added to the connection string.
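For example, pooling attributes ride along in the connection string. A sketch with illustrative values (credentials, data source, and sizes are placeholders, not recommendations):

User Id=scott;Password=tiger;Data Source=orcl;Pooling=true;Min Pool Size=1;Max Pool Size=20;Incr Pool Size=5;Decr Pool Size=2;Connection Lifetime=120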
Another issue people run into a lot with OracleConnections is that the garbage collector does not realize how truly resource intensive they are and does not clean them up promptly. This is compounded by the fact that ODP.NET is not fully managed and so some resources are hidden from the garbage collector. Hence the best practice is to Close() AND Dispose() all Oracle ODP.NET objects (including OracleConnection) to force them to be cleaned up.
This particular issue will be mitigated in Oracle's fully managed provider (a beta will be out shortly)
(EDIT: ODP.NET, Managed Driver is now available.)
Christian Shay
Oracle
ODP.NET is a data provider for ADO.NET.
The best practice for ADO.NET is: open, get the data (into memory), close, then use the in-memory data.
For example, use an OracleDataReader to load data into a DataTable in memory, then close the connection.
Regards.
For a single transaction this is best, but for multiple statements where you commit at the end, it might not be the best solution: you need to keep the connection open until the transaction is either committed or rolled back. How do you manage that, and how do you check that the connection still exists in that case (e.g. after a network failure)? There is a ConnectionState.Broken property, but it does not work at this point.

WebSphereMQ PCFMessageAgent / PCFAgent - Is it Thread Safe?

I am implementing a monitoring and administrative MQ API using the WebSphere MQ Java PCF (Programmable Command Format) library. What I would like to know is whether the PCFAgent and/or PCFMessageAgent classes are thread-safe. The documentation does not make it clear [to me].
If not, then I have 2 choices:
Create a pool of agents
Create (and disconnect) agents on demand.
Any insight into this issue is appreciated.
Cheers.
The important information you seek is probably on this page:
http://publib.boulder.ibm.com/infocenter/wmqv7/v7r0/index.jsp?topic=%2Fcom.ibm.mq.csqzaw.doc%2Fja11160_.htm
The main issue you will see is that the MQQueueManager object (which you either pass in, or which is created for you) cannot really do two things at once on a single connection.
So if you have one agent sitting in a get-with-wait for the response to a big query (say, retrieving full details for thousands of queues), nothing else can be done on that connection until the reply comes back.
Connect/disconnect is the biggest overhead when talking to MQ, so if you need multi-threaded access I would go with option 1; otherwise you'll pay a big performance penalty waiting for a connect each time.
