Can a TPL Dataflow ActionBlock be Reset after a Fault?

I have a TPL Dataflow ActionBlock that I'm using to receive trigger messages for a camera and then do some processing. If the processing task throws an exception, the ActionBlock enters the faulted state. I would like to send a fault message to my UI and a reset message to the ActionBlock so it can continue processing incoming trigger messages. Is there a way to return the ActionBlock to a ready state (clear the fault)?
Code for the curious:
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

namespace Anonymous
{
    /// <summary>
    /// Provides a messaging system between objects that inherit from Actor
    /// </summary>
    public abstract class Actor
    {
        // The Actor uses an ActionBlock from the Dataflow library. An ActionBlock has an input queue you can
        // post messages to and an action that will be invoked for each received message.
        // The ActionBlock handles all of the threading issues internally so that we don't need to deal with
        // threads or tasks. Thread-safety comes from the fact that ActionBlocks are serialized by default:
        // if you send two messages to it at the same time, it will buffer the second message until the first
        // has been processed.
        private ActionBlock<Message> _action;

        // ...Properties omitted for brevity...

        public Actor(string name, int id)
        {
            _name = name;
            _id = id;
            CreateActionBlock();
        }

        private void CreateActionBlock()
        {
            // We create an action that will convert the actor and the message to dynamic objects
            // and then call the HandleMessage method. This means that the runtime will look up
            // a method called 'HandleMessage' with a parameter of the message type and call it.
            // In TPL Dataflow, if an exception goes unhandled during the processing of a message
            // (HandleMessage), the exception will fault the block's Completion task.
            // Dynamic objects expose members such as properties and methods at run time, instead
            // of at compile time. This enables you to create objects to work with structures that
            // do not match a static type or format.
            _action = new ActionBlock<Message>(message =>
            {
                dynamic self = this;
                dynamic msg = message;
                self.HandleMessage(msg); // implement HandleMessage in the derived class
            }, new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = 1 // causes the dataflow block to process messages serially
            });
        }

        /// <summary>
        /// Send a message to an internal ActionBlock for processing
        /// </summary>
        /// <param name="message"></param>
        public async void SendMessage(Message message)
        {
            if (message.Source == null)
                throw new Exception("Message source cannot be null.");
            try
            {
                _action.Post(message);
                await _action.Completion;
                message = null;
                // In TPL Dataflow, if an exception goes unhandled during the processing of a message,
                // the exception will fault the block's Completion task.
            }
            catch (Exception ex)
            {
                _action.Completion.Dispose();
                //throw new Exception("ActionBlock for " + _name + " failed.", ex);
                Trace.WriteLine("ActionBlock for " + _name + " failed." + ExceptionExtensions.GetFullMessage(ex));
                if (_action.Completion.IsFaulted)
                {
                    _isFaulted = true;
                    _faultReason = _name + " ActionBlock encountered an exception while processing task: " + ex.ToString();
                    FaultMessage msg = new FaultMessage { Source = _name, FaultReason = _faultReason, IsFaulted = _isFaulted };
                    OnFaulted(msg);
                    CreateActionBlock();
                }
            }
        }

        public event EventHandler<FaultMessageEventArgs> Faulted;

        public void OnFaulted(FaultMessage message)
        {
            Faulted?.Invoke(this, new FaultMessageEventArgs { Message = message.Copy() });
            message = null;
        }

        /// <summary>
        /// Use to await the message processing result
        /// </summary>
        public Task Completion
        {
            get
            {
                _action.Complete();
                return _action.Completion;
            }
        }
    }
}

An unhandled exception in an ActionBlock is like an unhandled exception in an application. Don't do this. Handle the exception appropriately.
In the simplest case, log it or do something inside the block's delegate. In more complex scenarios you can use a TransformBlock instead of an ActionBlock and send Success or Failure messages to downstream blocks.
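For illustration, here is a minimal sketch of that Success/Failure pattern (MyMessage, Result, and the Process step are assumptions for this sketch, not part of the question's code):

using System;
using System.Threading.Tasks.Dataflow;

// Hypothetical processing step; throws on even ids to simulate a failure.
void Process(MyMessage message)
{
    if (message.Id % 2 == 0) throw new InvalidOperationException("boom");
}

var worker = new TransformBlock<MyMessage, Result>(message =>
{
    try
    {
        Process(message);
        return new Result(message, null);  // success travels downstream as data
    }
    catch (Exception exc)
    {
        return new Result(message, exc);   // so does failure; the block never faults
    }
});

var reporter = new ActionBlock<Result>(result =>
{
    if (!result.Succeeded)
        Console.WriteLine($"{result.Message.Id} failed: {result.Error.Message}");
});

worker.LinkTo(reporter, new DataflowLinkOptions { PropagateCompletion = true });

for (int i = 0; i < 10; i++)
    await worker.SendAsync(new MyMessage(i));
worker.Complete();
await reporter.Completion;

public record MyMessage(int Id);
public record Result(MyMessage Message, Exception Error)
{
    public bool Succeeded => Error is null;
}

Because failures are caught inside the delegate and forwarded as data, the block stays healthy and keeps accepting messages; downstream blocks decide what a failure means.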
The code you posted, though, has some critical issues. Dataflow blocks aren't agents and agents aren't dataflow blocks. You can use one to build the other, of course, but they represent different paradigms. In this case your Actor emulates ActionBlock's own API, with several bugs.
For example, you don't need to create a SendAsync; blocks already have one. You should not complete the block once you send a message, or you won't be able to handle any other messages; only call Complete() when you really don't want to use the ActionBlock any more. You don't need to set a DOP of 1, that's the default value.
You can set bounds on a dataflow block so that it accepts only, e.g., 10 messages at a time; otherwise all messages would be buffered until the block found the chance to process them.
You could replace all of this code with the following:
void MyMethod(MyMessage message)
{
    try
    {
        //...
    }
    catch (Exception exc)
    {
        // ToString() logs the *complete* exception; no need for anything more
        _log.Error(exc.ToString());
    }
}

var blockOptions = new ExecutionDataflowBlockOptions
{
    BoundedCapacity = 10,
    NameFormat = "Block for MyMessage {0} {1}"
};
var block = new ActionBlock<MyMessage>(MyMethod, blockOptions);

for (int i = 0; i < 10000; i++)
{
    // Will await if more than 10 messages are waiting
    await block.SendAsync(new MyMessage(i));
}

block.Complete();
// Await until all leftover messages are processed
await block.Completion;
Notice the call to Exception.ToString(). This will generate a string containing all exception information, including the call stack.
NameFormat allows you to specify a name template for a block that can be filled by the runtime with the block's internal name and task ID.

Related

Immediately return first emitted value from two Monos while continuing to process the other asynchronously

I have two data sources, each returning a Mono:
class CacheCustomerClient {
    Mono<Entity> createCustomer(Customer customer)
}

class MasterCustomerClient {
    Mono<Entity> createCustomer(Customer customer)
}
Callers to my application are hitting a Spring WebFlux controller:
@PostMapping
@ResponseStatus(HttpStatus.CREATED)
public Flux<Entity> createCustomer(@RequestBody Customer customer) {
    return customerService.createNewCustomer(customer);
}
As long as either data source successfully completes its create operation, I want to immediately return a success response to the caller; however, I still want my service to continue processing the result of the other Mono stream so that, if an error was encountered, it can be logged.
The problem seems to be that as soon as a value is returned to the controller, a cancel signal is propagated back through the stream by Spring WebFlux and, thus, no information is logged about a failure.
Here's one attempt:
public Flux<Entity> createCustomer(final Customer customer) {
    var cacheCreate = cacheClient
        .createCustomer(customer)
        .doOnError(WebClientResponseException.class,
            err -> log.error("Customer creation failed in cache"));
    var masterCreate = masterClient
        .createCustomer(customer)
        .doOnError(WebClientResponseException.class,
            err -> log.error("Customer creation failed in master"));
    return Flux.firstWithValue(cacheCreate, masterCreate)
        .onErrorMap((err) -> new Exception("Customer creation failed in cache and master"));
}
Flux.firstWithValue() is great for emitting the first non-error value, but then whichever source is lagging behind is cancelled, meaning that any error is never logged out. I've also tried scheduling these two sources on their own Schedulers and that didn't seem to help either.
How can I perform these two calls asynchronously, and emit the first value to the caller, while continuing to listen for emissions on the slower source?
You can achieve that by transforming your operators into "hot" publishers using the share() operator:
The first subscriber launches the upstream operator, and additional subscribers get back the result cached from the first subscriber:
Further Subscriber will share [...] the same result.
Once a second subscription has been done, the publisher is not cancellable:
It's worth noting this is an un-cancellable Subscription.
So, to achieve your requirement:
Apply share() on each of your operators
Launch a subscription on the shared publishers to trigger processing
Use the shared operators in your pipeline (here, firstWithValue)
A sample:
import java.time.Duration;
import reactor.core.publisher.Mono;

public class TestUncancellableMono {

    // Mock a mono succeeding quickly
    static Mono<String> quickSuccess() {
        return Mono.delay(Duration.ofMillis(200)).thenReturn("SUCCESS !");
    }

    // Mock a mono taking more time and ending in error.
    static Mono<String> longError() {
        return Mono.delay(Duration.ofSeconds(1))
                   .<String>then(Mono.error(new Exception("ERROR !")))
                   .doOnCancel(() -> System.out.println("CANCELLED"))
                   .doOnError(err -> System.out.println(err.getMessage()));
    }

    public static void main(String[] args) throws Exception {
        // Transform to hot publishers
        var sharedQuick = quickSuccess().share();
        var sharedLong = longError().share();

        // Trigger launch
        sharedQuick.subscribe();
        sharedLong.subscribe();

        // Subscribe back to get the cached result
        Mono.firstWithValue(sharedQuick, sharedLong)
            .subscribe(System.out::println, err -> System.out.println(err.getMessage()));

        // Wait for the subscriptions to end.
        Thread.sleep(2000);
    }
}
The output of the sample is:
SUCCESS !
ERROR !
We can see that the error message has been propagated properly, and that the upstream publisher has not been cancelled.

MassTransit Mediator MessageNotConsumedException

I noticed a weird issue in one of our applications: from time to time, we get MessageNotConsumedException errors on API requests that we route via MT's Mediator.
As you will notice below, we have configured a custom LogFilter<T> (implementing IFilter<ConsumeContext<T>>) which ensures that we log each mediator message before and after consuming, or a 'ConsumeFailed' entry in case an exception is thrown in any consumer.
When the error manifests itself, in the logs we see the following sequence of events:
T+0:   PreConsume logged
T+5ms: PostConsume logged
T+6ms: R-FAULT logged (I believe this logging is made by MT's internals?)
T+9ms: API request 500 response logged, with `MessageNotConsumedException` as the internal error
In the production environment, we see these errors with various timings, it happens in requests taking as 'little' as 9ms, over several seconds up to 30+ seconds.
I've been trying to reproduce this problem in my local development environment, and did manage to produce the same sequence of events, but only by adding a delay of 35 seconds inside the consumer (see the GetSomethingByIdHandler class below for the consumer body).
If I reduce the delay to 30s or less, the response will be fine.
Since the production errors are happening with very low handling times in the consumer, I suspect what I'm able to reproduce is not exactly the same.
However I'd still like to understand why I'm getting the MessageNotConsumedException, since while debugging I can easily step through my entire consumer (after the delay has elapsed) and happily reach the context.RespondAsync() call without any problems. Also while stepping through the consumer, the context.CancellationToken has not been cancelled.
I also came across this question, which sounds exactly like the issue I'm having; however, I did add the HttpContext scope as documented. To be fair, I haven't tried this change in production yet, but my local issue with the 35s delay remains unchanged.
I have the MassTransit mediator configured as follows:
services.AddHttpContextAccessor();
services.AddMediator(x =>
{
    x.AddConsumer<GetSomethingByIdHandler>();
    x.ConfigureMediator((context, cfg) =>
    {
        // The order of using the middleware matters, so don't change this
        cfg.UseHttpContextScopeFilter(context); // Extension method & friends copy/pasted from https://masstransit-project.com/usage/mediator.html#http-context-scope
        cfg.UseConsumeFilter(typeof(LogFilter<>), context);
    });
});
The LogFilter which is configured is the following class:
public class LogFilter<T> : IFilter<ConsumeContext<T>> where T : class
{
    private readonly ILogger<LogFilter<T>> _logger;

    public LogFilter(ILogger<LogFilter<T>> logger)
    {
        _logger = logger;
    }

    public void Probe(ProbeContext context) => context.CreateScope("log-filter");

    public async Task Send(ConsumeContext<T> context, IPipe<ConsumeContext<T>> next)
    {
        LogPreConsume(context);
        try
        {
            await next.Send(context);
        }
        catch (Exception exception)
        {
            LogConsumeException(context, exception);
            throw;
        }
        LogPostConsume(context);
    }

    private void LogPreConsume(ConsumeContext context) => _logger.LogInformation(
        "{MessageType}:{EventType} correlated by {CorrelationId} on {Address}"
        + " with send time {SentTime:dd/MM/yyyy HH:mm:ss:ffff}",
        typeof(T).Name,
        "PreConsume",
        context.CorrelationId,
        context.ReceiveContext.InputAddress,
        context.SentTime?.ToUniversalTime());

    private void LogPostConsume(ConsumeContext context) => _logger.LogInformation(
        "{MessageType}:{EventType} correlated by {CorrelationId} on {Address}"
        + " with send time {SentTime:dd/MM/yyyy HH:mm:ss:ffff}"
        + " and elapsed time {ElapsedTime}",
        typeof(T).Name,
        "PostConsume",
        context.CorrelationId,
        context.ReceiveContext.InputAddress,
        context.SentTime?.ToUniversalTime(),
        context.ReceiveContext.ElapsedTime);

    private void LogConsumeException(ConsumeContext<T> context, Exception exception) => _logger.LogError(exception,
        "{MessageType}:{EventType} correlated by {CorrelationId} on {Address}"
        + " with sent time {SentTime:dd/MM/yyyy HH:mm:ss:ffff}"
        + " and elapsed time {ElapsedTime}"
        + " and message {@message}",
        typeof(T).Name,
        "ConsumeFailure",
        context.CorrelationId,
        context.ReceiveContext.InputAddress,
        context.SentTime?.ToUniversalTime(),
        context.ReceiveContext.ElapsedTime,
        context.Message);
}
I then have a controller method which looks like this:
[Route("[controller]")]
[ApiController]
public class SomethingController : ControllerBase
{
    private readonly IMediator _mediator;

    public SomethingController(IMediator mediator)
    {
        _mediator = mediator;
    }

    [HttpGet("{somethingId}")]
    public async Task<IActionResult> GetSomething([FromRoute] int somethingId, CancellationToken ct)
    {
        var query = new GetSomethingByIdQuery(somethingId);
        var response = await _mediator
            .CreateRequestClient<GetSomethingByIdQuery>()
            .GetResponse<Something>(query, ct);
        return Ok(response.Message);
    }
}
The consumer which handles this request is as follows:
public record GetSomethingByIdQuery(int SomethingId);

public class GetSomethingByIdHandler : IConsumer<GetSomethingByIdQuery>
{
    public async Task Consume(ConsumeContext<GetSomethingByIdQuery> context)
    {
        await Task.Delay(35000, context.CancellationToken);
        await context.RespondAsync(new Something { Name = "Something cool" });
    }
}
MessageNotConsumedException is thrown when a message is sent using mediator and that message is not consumed by a consumer. That wouldn't typically be a transient error since one would expect that the consumer remains configured/connected to the mediator for the lifetime of the application.
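One detail worth noting (my observation, not part of the answer above): the 35-second repro lines up with the request client's default timeout, which is 30 seconds in MassTransit. If the consumer legitimately needs longer, an explicit timeout can be passed when creating the request client, for example:

// Sketch only (not from the original question): raises the request timeout
// above MassTransit's 30-second default, which happens to match the boundary
// observed in the 35-second repro above.
var response = await _mediator
    .CreateRequestClient<GetSomethingByIdQuery>(RequestTimeout.After(m: 1))
    .GetResponse<Something>(query, ct);

That would not explain the production errors occurring after only a few milliseconds, though, so it is at best a partial lead.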

Elasticsearch : AssertionError while getting index name from alias

We have been using the Elasticsearch plugin in our project. While getting the index name from an alias, we get the error below.
Error
{
    "error": "AssertionError[Expected current thread[Thread[elasticsearch[Seth][http_server_worker][T#2]{New I/O worker #20},5,main]] to not be a transport thread. Reason: [Blocking operation]]",
    "status": 500
}
Code
String realIndex = client.admin().cluster().prepareState()
        .execute().actionGet().getState().getMetaData()
        .aliases().get(aliasName).iterator()
        .next().key;
What causes this issue? Googling it didn't turn up any help.
From the look of the error, this operation is not allowed on a transport thread because it would block that thread until the result comes back. You need to execute it asynchronously, off the transport thread:
public String getIndexName() throws InterruptedException {
    // Holds the index name; a final holder instance is needed inside the anonymous listener.
    final IndexNameHolder result = new IndexNameHolder();
    // execute(listener) is asynchronous, so wait for the response before returning.
    // Note: if this method itself runs on a transport thread, waiting here would still
    // block it; in that case, continue the work inside onResponse instead.
    final CountDownLatch latch = new CountDownLatch(1);
    getTransportClient().admin().cluster().prepareState().execute(new ActionListener<ClusterStateResponse>() {
        @Override
        public void onResponse(ClusterStateResponse response) {
            result.indexName = response.getState().getMetaData().aliases().get("alias").iterator().next().key;
            latch.countDown();
        }

        @Override
        public void onFailure(Throwable e) {
            // Handle failures
            latch.countDown();
        }
    });
    latch.await();
    return result.indexName;
}
There is another overload of execute(), one which takes a listener; you need to implement your own listener. In the code above, I use an anonymous implementation of the listener.
I hope it helps.

Which one takes priority, ExceptionFilter or ExceptionHandler in ASP.NET Web Api 2.0?

I have a global ExceptionHandler in my web api 2.0, which handles all unhandled exceptions in order to return a friendly error message to the api caller.
I also have a global ExceptionFilter, which handles a very specific exception in my web api and returns a specific response. The ExceptionFilter is added dynamically by a plugin to my web api so I cannot do what it does in my ExceptionHandler.
I am wondering if I have both the ExceptionHandler and the ExceptionFilter registered globally, which one will take priority and be executed first? Right now I can see that the ExceptionFilter is being executed before the ExceptionHandler. And I can also see that in my ExceptionFilter if I create a response the ExceptionHandler is not being executed.
Will it be safe to assume that:
ExceptionFilters are executed before ExceptionHandlers.
If the ExceptionFilter creates a response, the ExceptionHandler will not be executed.
I had to debug through the System.Web.Http in order to find the answer to my question. So the answer is:
It is safe to assume that ExceptionFilters will be executed before ExceptionHandlers
If the ExceptionFilter creates a response the ExceptionHandler would not be executed.
Why this is so:
When you have an ExceptionFilter registered to execute globally or for your controller action, the ApiController base class from which all the api Controllers inherit will wrap the result in an ExceptionFilterResult and call its ExecuteAsync method. This is the code in the ApiController, which does this:
if (exceptionFilters.Length > 0)
{
    IExceptionLogger exceptionLogger = ExceptionServices.GetLogger(controllerServices);
    IExceptionHandler exceptionHandler = ExceptionServices.GetHandler(controllerServices);
    result = new ExceptionFilterResult(ActionContext, exceptionFilters, exceptionLogger, exceptionHandler,
        result);
}
return result.ExecuteAsync(cancellationToken);
Looking at the ExceptionFilterResult.ExecuteAsync method:
try
{
    return await _innerResult.ExecuteAsync(cancellationToken);
}
catch (Exception e)
{
    exceptionInfo = ExceptionDispatchInfo.Capture(e);
}

// This code path only runs if the task is faulted with an exception
Exception exception = exceptionInfo.SourceException;
Debug.Assert(exception != null);

bool isCancellationException = exception is OperationCanceledException;

ExceptionContext exceptionContext = new ExceptionContext(
    exception,
    ExceptionCatchBlocks.IExceptionFilter,
    _context);

if (!isCancellationException)
{
    // We don't log cancellation exceptions because it doesn't represent an error.
    await _exceptionLogger.LogAsync(exceptionContext, cancellationToken);
}

HttpActionExecutedContext executedContext = new HttpActionExecutedContext(_context, exception);

// Note: exception filters need to be scheduled in the reverse order so that
// the more specific filter (e.g. Action) executes before the less specific ones (e.g. Global)
for (int i = _filters.Length - 1; i >= 0; i--)
{
    IExceptionFilter exceptionFilter = _filters[i];
    await exceptionFilter.ExecuteExceptionFilterAsync(executedContext, cancellationToken);
}

if (executedContext.Response == null && !isCancellationException)
{
    // We don't log cancellation exceptions because it doesn't represent an error.
    executedContext.Response = await _exceptionHandler.HandleAsync(exceptionContext, cancellationToken);
}
You can see that the ExceptionLogger is executed first, then all ExceptionFilters are executed, and then, if executedContext.Response == null, the ExceptionHandler is executed.
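For completeness, a minimal registration sketch showing how both would typically be registered globally (MySpecificExceptionFilter and MyGlobalExceptionHandler are hypothetical names, not from the question); given the behavior above, the filter runs first, and the handler only runs if no filter has set a response:

public static class WebApiConfig
{
    public static void Register(HttpConfiguration config)
    {
        // Global exception filter: runs first when an action throws.
        config.Filters.Add(new MySpecificExceptionFilter());

        // Global exception handler: there is exactly one, registered as a
        // service rather than a filter; it only runs if executedContext.Response
        // is still null after all filters have executed.
        config.Services.Replace(typeof(IExceptionHandler), new MyGlobalExceptionHandler());
    }
}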
I hope this is useful!

Using PLINQ / TPL in a custom workflow activity

I have a workflow defined to execute when a field on a custom entity is changed.
The workflow calls into a custom activity which in turn uses PLINQ to process a bunch of records.
The code that the custom activity calls into looks like so:
protected override void Execute(CodeActivityContext executionContext)
{
    // Get the context service.
    IWorkflowContext context = executionContext.GetExtension<IWorkflowContext>();
    IOrganizationServiceFactory serviceFactory =
        executionContext.GetExtension<IOrganizationServiceFactory>();

    // Use the context service to create an instance of IOrganizationService.
    IOrganizationService _orgService = serviceFactory.CreateOrganizationService
        (context.InitiatingUserId);

    int pagesize = 2000;

    // Use FetchXML aggregate functions to get the total count of records to process.
    // Reference: http://msdn.microsoft.com/en-us/library/gg309565.aspx
    int totalcount = GetTotalCount();
    int totalPages = (int)Math.Ceiling((double)totalcount / (double)pagesize);

    try
    {
        Parallel.For(1,
            totalPages + 1,
            () => new MyOrgserviceContext(_orgService),
            (pageIndex, state, ctx) =>
            {
                var items = ctx.myEntitySet.Skip((pageIndex - 1) * pagesize).Take(pagesize);
                foreach (var item in items)
                {
                    // process item as needed
                    ctx.SaveChanges();
                }
                return ctx;
            },
            ctx => ctx.Dispose()
        );
    }
    catch (AggregateException ex)
    {
        // handle as needed
    }
}
I'm noticing the following error(s) as an aggregate exception (multiple occurrences of the same error in the InnerExceptions):
"Encountered disposed CrmDbConnection when it should not be disposed"
From what I've read:
CRM 2011 Workflow "Invalid Pointer" error
this can happen when you have class-level variables, since the workflow runtime can end up sharing the same class instance across multiple workflow invocations. That is clearly not the case here, and I also don't have multiple instances of this workflow running on multiple records; there is just one instance of this workflow running at any point in time.
The code above works fine when extracted and hosted outside the workflow host (CRMAsyncService).
This is using CRM 2011 Rollup 10.
Any insights much appreciated.
I'm not certain, but this might just be because you are disposing of your connection at ctx.Dispose().
As each new MyOrgserviceContext(_orgService) object uses the same IOrganizationService, I would suspect that when the first MyOrgserviceContext is disposed, all the other MyOrgserviceContext objects then have a disposed connection, meaning their service calls will fail and an exception is thrown.
I would suggest removing the dispose to see if this resolves the problem.
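If per-partition cleanup is still wanted, one variation (a sketch only, assuming serviceFactory and context from the Execute method above are in scope, and that disposing the context tears down its connection) is to give each partition its own service instance, so disposing one context cannot invalidate a connection shared with the others:

Parallel.For(1,
    totalPages + 1,
    // Each partition gets its own service instance instead of sharing _orgService.
    () => new MyOrgserviceContext(
        serviceFactory.CreateOrganizationService(context.InitiatingUserId)),
    (pageIndex, state, ctx) =>
    {
        var items = ctx.myEntitySet.Skip((pageIndex - 1) * pagesize).Take(pagesize);
        foreach (var item in items)
        {
            // process item as needed
            ctx.SaveChanges();
        }
        return ctx;
    },
    ctx => ctx.Dispose()); // now each context owns the connection it disposes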
