Establishing a write lock on a row in hbase - hadoop

I am trying to test a workflow in which my change reordered the deletes, and to check how it cleans up the other indices in HBase.
There are three different indices being deleted. The logic roughly resembles this operation:
try {
    try {
        hTable.delete(firstIndexDeletes);
    } catch (IOException ie) {
        // clean up and exception handling for first index
    }
    // more processing logic for second index
    try {
        hTable.delete(secondIndexDeletes);
    } catch (IOException ie) {
        // clean up and exception handling for second index
    }
    // more processing logic
    hTable.delete(thirdIndex);
} catch (IOException ie) {
    // clean up and exception handling for third index
}
I am trying to test the exception-handling part via integration tests (I was able to test it thoroughly via unit tests). To make the delete throw an exception, I decided to take a lock on a specific index row, so that a delete on that row would fail:
hTable.lockRow(Bytes.toBytes(firstIndexKey));
Ideally, I expected the delete of that row, as part of firstIndexDeletes, to throw an exception, but the lock makes no difference in my tests: execution never reaches the exception handling as I wanted. Is there something elementary I am missing?

To my knowledge (from routine, close examination of the source), explicit row locks are being retired from HBase. That said, I've never tried to use them.
In my opinion, I would expect thorough unit test coverage (where you can exploit mocking) to be sufficient.
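One way to exercise the catch blocks without relying on row locks is to inject a table double that fails for a chosen row. The sketch below is not HBase API: `IndexTable`, `FailingTable`, and `IndexCleaner` are hypothetical stand-ins for the table client and the code under test, and only the pattern (throwing an `IOException` from the delete for one key) is meant to carry over.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the part of the table client used by the code under test.
interface IndexTable {
    void delete(List<String> rowKeys) throws IOException;
}

// Test double: fails any batch that contains the "poisoned" row key.
class FailingTable implements IndexTable {
    private final String poisonedKey;
    final List<String> deleted = new ArrayList<>();

    FailingTable(String poisonedKey) {
        this.poisonedKey = poisonedKey;
    }

    @Override
    public void delete(List<String> rowKeys) throws IOException {
        if (rowKeys.contains(poisonedKey)) {
            throw new IOException("simulated failure deleting " + poisonedKey);
        }
        deleted.addAll(rowKeys);
    }
}

// Hypothetical code under test, shaped like the try/catch blocks in the question.
class IndexCleaner {
    final List<String> failedIndices = new ArrayList<>();

    void cleanUp(IndexTable table, List<String> first, List<String> second) {
        try {
            table.delete(first);
        } catch (IOException e) {
            failedIndices.add("first"); // clean-up for the first index would go here
        }
        try {
            table.delete(second);
        } catch (IOException e) {
            failedIndices.add("second"); // clean-up for the second index would go here
        }
    }
}
```

A test can then pass `new FailingTable("row-1")` to `cleanUp` and check that only the first index is recorded as failed while the second delete still goes through.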

Related

Multithreaded Use of Spring Pulsar

I am working on a project to read from our existing ElasticSearch instance and produce messages in Pulsar. If I do this in a highly multithreaded way without any explicit synchronization, I get many occurrences of the following log line:
Message with sequence id X might be a duplicate but cannot be determined at this time.
That is produced from this line of code in the Pulsar Java client:
https://github.com/apache/pulsar/blob/a4c3034f52f857ae0f4daf5d366ea9e578133bc2/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java#L653
When I add a synchronized block to my method, synchronizing on the pulsar template, the error disappears, but my publish rate drops substantially.
Here is the current working implementation of my method that sends Protobuf messages to Pulsar:
public <T extends GeneratedMessageV3> CompletableFuture<MessageId> persist(T o) {
    var descriptor = o.getDescriptorForType();
    PulsarPersistTopicSettings settings = pulsarPersistConfig.getSettings(descriptor);
    MessageBuilder<T> messageBuilder = Optional.ofNullable(pulsarPersistConfig.getMessageBuilder(descriptor))
            .orElse(DefaultMessageBuilder.DEFAULT_MESSAGE_BUILDER);
    Optional<ProducerBuilderCustomizer<T>> producerBuilderCustomizerOpt =
            Optional.ofNullable(pulsarPersistConfig.getProducerBuilder(descriptor));
    PulsarOperations.SendMessageBuilder<T> sendMessageBuilder;
    sendMessageBuilder = pulsarTemplate.newMessage(o)
            .withSchema(Schema.PROTOBUF_NATIVE(o.getClass()))
            .withTopic(settings.getTopic());
    producerBuilderCustomizerOpt.ifPresent(sendMessageBuilder::withProducerCustomizer);
    sendMessageBuilder.withMessageCustomizer(mb -> messageBuilder.applyMessageBuilderKeys(o, mb));
    synchronized (pulsarTemplate) {
        try {
            return sendMessageBuilder.sendAsync();
        } catch (PulsarClientException re) {
            throw new PulsarPersistException(re);
        }
    }
}
The original version of the above method did not have the synchronized(pulsarTemplate) { ... } block. It performed faster, but generated a lot of logs about duplicate messages, which I knew to be incorrect. Adding the synchronized block got rid of the log messages, but slowed down publishing.
What are the best practices for multithreaded access to the PulsarTemplate? Is there a better way to achieve very high throughput message publishing?
Should I look at using the reactive client instead?
EDIT: I've updated the code block to show the minimum synchronization necessary to avoid the log lines, which is just synchronizing during the .sendAsync(...) call.
Your usage without the synchronized block should work. I will look into it, though, to see whether anything else is going on. In the meantime, it would be great if you could give the reactive client a try.
This issue was initially tracked here, and the final resolution was that the underlying problem has been fixed in Pulsar 2.11.
Please try updating to Pulsar 2.11.

NHibernate ArgumentOutOfRangeException

I recently ran into an instance where I wanted to hit the database from a Task I have running periodically within a web application. I refactored the code to use the ThreadStaticSessionContext so that I could get a session without an HttpContext. This works fine for reads, but when I try to flush an update from the Task, I get the "Index was out of range. Must be non-negative and less than the size of the collection." error. Normally what I see for this error has to do with using a column name twice in the mapping, but that doesn't seem to be the issue here, as I'm able to update that table if the session is associated with a request (and I looked and I'm not seeing any duplicates). It's only when the Task tries to flush that I get the exception.
Does anyone know why it would work fine from a request, but not from a call from a Task?
Could it be because the Task is asynchronous?
Call Stack:
at System.ThrowHelper.ThrowArgumentOutOfRangeException()
at System.Collections.Generic.List`1.System.Collections.IList.get_Item(Int32 index)
at NHibernate.Engine.ActionQueue.ExecuteActions(IList list)
at NHibernate.Engine.ActionQueue.ExecuteActions()
at NHibernate.Event.Default.AbstractFlushingEventListener.PerformExecutions(IEventSource session)
at NHibernate.Event.Default.DefaultFlushEventListener.OnFlush(FlushEvent event)
at NHibernate.Impl.SessionImpl.Flush()
Session Generation:
internal static ISession CurrentSession {
    get {
        if(HasSession) return Initializer.SessionFactory.GetCurrentSession();
        ISession session = Initializer.SessionFactory.OpenSession();
        session.BeginTransaction();
        CurrentSessionContext.Bind(session);
        return session;
    }
}
private static bool HasSession {
    get { return CurrentSessionContext.HasBind(Initializer.SessionFactory); }
}
Task that I want to access the database from:
_maid = Task.Factory.StartNew(async () => {
    while(true) {
        if(CleaningSession != null) CleaningSession(Instance, new CleaningSessionEventArgs { Session = UnitOfWorkProvider.CurrentSession });
        UnitOfWorkProvider.TransactionManager.Commit();
        await Task.Delay(AppSettings.TempPollingInterval, _paycheck.Token);
    }
    //I know this function never returns, I'm using the cancellation token for that
    // ReSharper disable once FunctionNeverReturns
}, _paycheck.Token);
_maid.GetAwaiter().OnCompleted(() => _maid.Dispose());
Edit: Quick clarification about some of the types above. CleaningSession is an event that is fired to run the various things that need to be done, and _paycheck is the CancellationTokenSource for the Task.
Edit 2: Oh yeah, and this is using NHibernate version 4.0.0.4000
Edit 3: I have since attempted this using a Timer, with the same results.
Edit 4: From what I can see of the source, it's doing a foreach loop on an IList. Questions pertaining to an IndexOutOfRangeException in a foreach loop tend to suggest a concurrency issue. I still don't see how that would be an issue, unless I misunderstand the purpose of ThreadStaticSessionContext.
Edit 5: I thought it might be because of requests bouncing around between threads, so I tried creating a new SessionContext that combines the logic of the WebSessionContext and ThreadStaticSessionContext. Still getting the issue, though...
Edit 6: It seems this has something to do with a listener I have set up to update some audit fields on entities just before they're saved. If I don't run it, the commit occurs properly. Would it be better to do this through an event than OnPreInsert, or use an interceptor instead?
After muddling through, I found out exactly where the problem was. Basically, the PreUpdate event handler in my listener ran a query to load the current user record.
I came across two solutions to this. I could cache the user in memory, avoiding the query, but having possibly stale data (not that anything other than the id matters here). Alternatively, I could open a temporary stateless session and use that to look up the user in question.

Is it possible to add additional information for crashes handled by Xamarin.Insights analytics framework

I have a Xamarin.Android app with Xamarin.Insights integrated.
Right now, every time I handle an error manually (try/catch), I add information about the environment (staging/production):
try
{
    ExceptionThrowingFunction();
}
catch (Exception exception)
{
    exception.Data["Environment"] = "staging";
    throw;
}
But this information is missing when the error is handled by Xamarin.Insights itself (that is, in the case of a crash).
Is it possible to add additional exception data in the case of a crash?
docs reference I used
From reading the docs page you referenced, I still get the impression that you have to call the .Report method as well, as in:
Insights.Report(exception, new Dictionary<string, string> {
    {"Some additional info", "foobar"}
});
What I believe they are saying in this example:
try {
    ExceptionThrowingFunction();
}
catch (Exception exception) {
    exception.Data["AccountType"] = "standard";
    throw;
}
is that whenever an Exception is encountered, you can package additional information to send to the Insights server later, since the Data property of the Exception is just a key/value dictionary.
So if an Exception occurs several layers deep, you can re-throw it with additional information attached, and send that information to the Insights server later.
At a higher level, you can then take the Exception that was thrown deeper down the call hierarchy and call Insights.Report with:
Insights.Report(
    {the rethrown exception in your higher up try..catch block},
    {rethrown exception}.Data
);
That will then send all the additional key/value information previously captured.
From the last part of your question, though, it looks like you want Insights to handle and send this additional .Data automatically when there is an unhandled exception.
If it is not currently being sent, then perhaps suggest to them that it be sent as well? It sounds like a feasible request for this data to be sent automatically in case of an unhandled exception.
Update 1:
Yes, I now understand the unhandled exception scenario that you are referring to.
I have not dealt with this component directly, so there may already be hooks or event handlers that you can tap into to execute some custom code just before the report is sent.
If this is not available, then perhaps suggest that they include it, as it's a beta product?
Alternatively, you could still achieve this yourself by capturing the unhandled exceptions just before they fall through. You would have to code this on each platform, however.
For instance, on Windows Phone the App class has Application_UnhandledException(object sender, ApplicationUnhandledExceptionEventArgs e), in which you could supplement the thrown Exception with this extra .Data.
For Android, you could take a look at this post that describes how to catch uncaughtException, which will help you capture the unhandled exceptions.
Whether supplementing the Exception in these handlers is enough depends on how they have written their own hook: how well it behaves, and whether your handler is executed before theirs.
You will have to try it and see. If it does not let you supplement extra data before the automatic call to Insights, you have a fallback: make the .Report call manually within these unhandled exception handlers and supply the extra .Data yourself.
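The Android route mentioned above relies on Java's Thread.UncaughtExceptionHandler. A minimal sketch of that mechanism in plain Java follows. `EnrichingCrashHandler` and the `reporter` callback are hypothetical names: the callback only stands in for a manual report call, and nothing here is Xamarin.Insights API. A Xamarin app would need the C# equivalent for its platform.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Reports a crash together with extra key/value data, then lets the
// previously installed handler run.
class EnrichingCrashHandler implements Thread.UncaughtExceptionHandler {
    private final Map<String, String> extraData;
    // Hypothetical reporting hook; stands in for a manual "report" call.
    private final BiConsumer<Throwable, Map<String, String>> reporter;
    // Handler that was installed before this one; may be null.
    private final Thread.UncaughtExceptionHandler previous;

    EnrichingCrashHandler(Map<String, String> extraData,
                          BiConsumer<Throwable, Map<String, String>> reporter,
                          Thread.UncaughtExceptionHandler previous) {
        this.extraData = new LinkedHashMap<>(extraData);
        this.reporter = reporter;
        this.previous = previous;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable error) {
        try {
            reporter.accept(error, extraData);
        } finally {
            // Always chain, so the default crash behaviour is preserved.
            if (previous != null) {
                previous.uncaughtException(thread, error);
            }
        }
    }

    // Installs the handler process-wide, chaining to the current default handler.
    static void install(Map<String, String> extraData,
                        BiConsumer<Throwable, Map<String, String>> reporter) {
        Thread.setDefaultUncaughtExceptionHandler(
                new EnrichingCrashHandler(extraData, reporter,
                        Thread.getDefaultUncaughtExceptionHandler()));
    }
}
```

Whether the previous handler or the reporter should run first is exactly the ordering question raised above, so the chaining order here is a choice, not a requirement.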

How to manually manage Hibernate sessions in @PostConstruct methods?

My problem is straightforward. I want to access some data from the database when the application loads on Tomcat. To do something at that point in time I use @PostConstruct (which does its job properly).
However, in that method I make 2 separate connections to the DB: one for bringing a list of entities and another for adding them into a common library. The second step implies some behind-the-scenes queries for resolving some lazy-loading associations. Here is the code snippet:
@Override
@PostConstruct
public void populateLibrary() {
    // query for the Book Descriptors - 1st query works!!!
    List<BookDescriptor> bookDescriptors = bookDescriptorService.list();
    Session session = sessionFactory.openSession();
    Transaction transaction = null;
    try {
        transaction = session.beginTransaction();
        // resolving some lazy-loading associations - 2nd query fails!!!
        for (BookDescriptor book : bookDescriptors) {
            library.addEntry(book);
        }
        transaction.commit();
    } catch (HibernateException e) {
        // transaction is still null if beginTransaction() itself failed
        if (transaction != null) {
            transaction.rollback();
        }
        e.printStackTrace();
    } finally {
        session.close();
    }
}
1st query works while the 2nd fails, as I wrote in the comments. The failure gives:
org.hibernate.LazyInitializationException: could not initialize proxy - no Session
at org.hibernate.proxy.AbstractLazyInitializer.initialize(AbstractLazyInitializer.java:86)
at org.hibernate.proxy.AbstractLazyInitializer.getImplementation(AbstractLazyInitializer.java:140)
at org.hibernate.proxy.pojo.javassist.JavassistLazyInitializer.invoke(JavassistLazyInitializer.java:190)
at com.freightgate.domain.SecurityFiling_$$_javassist_7.getSfSubmissionType(SecurityFiling_$$_javassist_7.java)
at com.freightgate.dao.SecurityFilingTest.test(SecurityFilingTest.java:73)
Which is very odd since I explicitly opened and closed a transaction. However, if I inspect some details of how the 1st query works it seems like behind the scenes the session is bound to AbstractLazyInitializer class.
I resolved my problem by moving the functionality from the for loop into a separate service class that is annotated with @Transactional(readOnly = true). Still, I'm puzzled as to why the approach that I posted here fails.
If anyone has some hints, I'd be very happy to hear them.
You load entities in a first session, then close this session, then open a new session, and try to lazy-load collections of the entities. That can't work.
For lazy-loading to work, the entity must be attached to an open session. Just opening another session doesn't attach any entity you loaded before to this new session. In the meantime, some other transaction could have radically changed the database, and the entity might not even exist anymore.
The best solution is what you have done: encapsulate everything in a single transactional service. You could also have opened the transaction before calling the first service, but why handle transactions programmatically, since Spring does it for you declaratively?
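Why a second session does not help can be seen in a toy model. The classes below are illustrative stand-ins, not Hibernate types: `ToySession` and `LazyRef` are hypothetical names, and the only behaviour modelled is that a lazy reference remembers the session that created it.

```java
import java.util.function.Supplier;

// Toy stand-in for a persistence session; not Hibernate's Session.
class ToySession {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    void close() {
        open = false;
    }

    // Returns a lazy reference tied to THIS session only.
    <T> LazyRef<T> lazy(Supplier<T> loader) {
        return new LazyRef<>(this, loader);
    }
}

// Toy stand-in for a lazy proxy: it remembers the session that created it.
class LazyRef<T> {
    private final ToySession owner;
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyRef(ToySession owner, Supplier<T> loader) {
        this.owner = owner;
        this.loader = loader;
    }

    T get() {
        if (!loaded) {
            if (!owner.isOpen()) {
                // Mirrors the "no Session" failure: the owning session is gone,
                // no matter how many other sessions are open right now.
                throw new IllegalStateException("could not initialize proxy - no Session");
            }
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}
```

In this model, a reference created by one session and first read after that session is closed fails even if a new session has been opened in between, while a reference read before its session closes keeps its value afterwards.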

hibernate repeat query

I have a web app (using Hibernate as the ORM) that populates data from an Oracle 11 DB.
For a short period of time, some Oracle packages become invalid and then become valid again (this is a legacy data load, and during this process the user can use other parts of the UI).
When the data load finishes and the user performs any query against those packages, I get an error:
ORA-04068: existing state of packages has been discarded
ORA-04061: existing state of package "sche.pck" has been invalidated
ORA-04065: not executed, altered or dropped package "sche.pck"
ORA-06508: PL/SQL: could not find program unit being called: "sche.pck"
If the user presses F5 (on the error message screen), the query executes successfully. Is there any way to repeat the user's query when such errors happen?
Yes - try/catch the exception, inspect the exception message, looking for ORA-04068, and if it is found, rerun the query.
Ideally, you should have a number of retries. Something like:
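final int maxAttempts = 3;
for (int attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
        executeQuery();
        break; // success, stop retrying
    } catch (RuntimeException ex) { // Hibernate's exceptions are unchecked
        // getMessage() may be null; the ORA code may also sit in a nested cause
        String message = String.valueOf(ex.getMessage());
        if (!message.contains("ORA-04068") || attempt == maxAttempts) {
            throw ex; // unrelated error, or out of retries
        }
    }
}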
This looks a bit hacky, and I'd suggest trying to fix the original problem instead.
Update:
It seems you have to do that in many places, so the above will be tedious. If you really cannot fix the underlying Oracle problem, you can try wrapping your DataSource, Connection and Statement objects in your own implementations that simply delegate to the underlying object but, in the case of executeQuery(), perform the retry.
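A rough sketch of the delegating wrapper, using a JDK dynamic proxy instead of a hand-written class, could look like this. `RetryingStatementHandler` is a hypothetical name, only the Statement level is shown (the DataSource and Connection wrappers that hand out such statements are left out), and matching on the message text simply mirrors the check used above.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.SQLException;
import java.sql.Statement;

// Delegates every Statement call to the real statement, but retries
// executeQuery when the failure message mentions ORA-04068.
class RetryingStatementHandler implements InvocationHandler {
    private final Statement delegate;
    private final int maxAttempts;

    RetryingStatementHandler(Statement delegate, int maxAttempts) {
        this.delegate = delegate;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // Only executeQuery is retried; every other method gets one attempt.
        int attempts = "executeQuery".equals(method.getName()) ? maxAttempts : 1;
        for (int attempt = 1; ; attempt++) {
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause(); // what the delegate actually threw
                boolean retryable = cause instanceof SQLException
                        && String.valueOf(cause.getMessage()).contains("ORA-04068");
                if (!retryable || attempt >= attempts) {
                    throw cause;
                }
            }
        }
    }

    // Wraps a statement so that callers keep using the plain Statement interface.
    static Statement wrap(Statement delegate, int maxAttempts) {
        return (Statement) Proxy.newProxyInstance(
                RetryingStatementHandler.class.getClassLoader(),
                new Class<?>[] {Statement.class},
                new RetryingStatementHandler(delegate, maxAttempts));
    }
}
```

A Connection wrapper built the same way would return `RetryingStatementHandler.wrap(realStatement, 3)` from `createStatement()`. PreparedStatement would need its own proxy interface, since its `executeQuery()` takes no arguments and lives on a different type.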
