Is there a DSL or declarative system for TPL Dataflow? - task-parallel-library

Is there any DSL or other fully- or partially-declarative mechanism for declaring TPL Dataflow flows? Or is the best (only) practice to just wire them up in code?
Failing that, is there any DSL or other fully- or partially-declarative mechanism for using any dataflow library that I could use as a model and/or source of ideas?
(I've searched without success so maybe one doesn't exist ... or maybe I didn't find it.)
Update: To answer @svick's question below as to why I want this and what I gain by it:
First, I just like a sparser syntax that more clearly shows the flow rather than the details. I think
downloadString => createWordList => filterWordList => findPalindromes => printPalindrome;
is preferable to
downloadString.LinkTo(createWordList);
createWordList.LinkTo(filterWordList);
filterWordList.LinkTo(findPalindromes);
findPalindromes.LinkTo(printPalindrome);
with its repeated names and extra punctuation. Similar to the way you'd rather use the dot DSL to describe a DAG than a bunch of calls to the Visio DOM API. You can imagine a syntax for network flows, as well as pipelines, such that network flows in particular would be very clear. That may not seem compelling, of course, but I like it.
Second, I think that with a DSL you might be able to persist the DSL description, e.g., as a field in a row in a database, and then instantiate it later. Though perhaps that's a different capability entirely.

Let's start with the relevant facts and work from there:
There isn't anything like this for TPL Dataflow yet.
There isn't a good way of embedding a DSL into C#. The common compilers are not extensible and it would be hard to access local variables from a string-based DSL.
There are several limitations on operators in C#, but the most significant here is that operators can't be generic. This means that the sparser syntax either wouldn't be type-safe (which is unacceptable to me), or it can't use overloaded operators.
The IDisposable returned from LinkTo() that can be used to break the created link isn't used that often, so it doesn't have to be supported. (Or maybe the expression that sets up the flow could return a single IDisposable that breaks the whole flow?)
Because of this, I think the best that can be done is something like:
downloadString.Link(createWordList).Link(filterWordList).Link(findPalindromes);
This avoids the repetition of LinkTo(), but is not much better.
The implementation of the simple form of this is mostly trivial:
public static class DataflowLinkExtensions
{
    public static ISourceBlock<TTarget> Link<TSource, TTarget>(
        this ISourceBlock<TSource> source,
        IPropagatorBlock<TSource, TTarget> target)
    {
        source.LinkTo(
            target,
            new DataflowLinkOptions { PropagateCompletion = true });
        return target;
    }

    public static void Link<TSource>(
        this ISourceBlock<TSource> source, ITargetBlock<TSource> target)
    {
        source.LinkTo(
            target,
            new DataflowLinkOptions { PropagateCompletion = true });
    }
}
I chose to set PropagateCompletion to true, because I think that makes the most sense here. But it could also be an option of Link().
I think most of the alternative linking operators of Axum are not relevant to TPL Dataflow, but linking multiple blocks to or from the same block could be done by taking a collection or array as one of the parameters of Link():
new[] { source1, source2 }.Link(target);
source.Link(target1, target2);
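One way those overloads might look is sketched below. This is only a rough idea, not an established API: with several sources, completion propagation gets tricky (the first source to complete would complete the target), and a non-broadcast source offers each message to only one linked target.
public static class DataflowLinkFanExtensions
{
    // Fan-in: link several sources to one target.
    public static void Link<TSource>(
        this IEnumerable<ISourceBlock<TSource>> sources,
        ITargetBlock<TSource> target)
    {
        // Completion propagation is deliberately left out here: with several
        // sources, the first one to complete would otherwise complete the target
        // too early. A real implementation could wait for all of them, e.g. with
        // Task.WhenAll over the sources' Completion tasks.
        foreach (var source in sources)
        {
            source.LinkTo(target);
        }
    }

    // Fan-out: link one source to several targets.
    public static void Link<TSource>(
        this ISourceBlock<TSource> source,
        params ITargetBlock<TSource>[] targets)
    {
        // Note: unless the source is a BroadcastBlock, each message is offered
        // to the linked targets in order and consumed by only one of them.
        foreach (var target in targets)
        {
            source.LinkTo(
                target,
                new DataflowLinkOptions { PropagateCompletion = true });
        }
    }
}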
If Link() actually returned something that represents the whole flow (similar to Encapsulate()), you could combine this to create more complicated flows, like:
source.Link(propagator1.Link(target1), target2);
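Picking up the single-IDisposable idea from the facts above, Link() could also return a small handle that remembers every link it created, so the whole flow can be broken with one Dispose() call. A minimal sketch; the FlowLink type and BeginFlow() name are invented here and are not part of TPL Dataflow:
// Requires using System, System.Collections.Generic, System.Linq
// and System.Threading.Tasks.Dataflow.
public sealed class FlowLink<T> : IDisposable
{
    private readonly List<IDisposable> links;

    public ISourceBlock<T> Source { get; }

    public FlowLink(ISourceBlock<T> source, IEnumerable<IDisposable> links)
    {
        Source = source;
        this.links = links.ToList();
    }

    // Links the current tail of the flow to the next block and returns a new
    // handle whose Dispose() breaks every link created so far.
    public FlowLink<TNext> Link<TNext>(IPropagatorBlock<T, TNext> target)
    {
        var link = Source.LinkTo(
            target,
            new DataflowLinkOptions { PropagateCompletion = true });
        return new FlowLink<TNext>(target, links.Concat(new[] { link }));
    }

    public void Dispose()
    {
        foreach (var link in links)
            link.Dispose();
    }
}

public static class FlowLinkExtensions
{
    public static FlowLink<T> BeginFlow<T>(this ISourceBlock<T> source) =>
        new FlowLink<T>(source, Enumerable.Empty<IDisposable>());
}

// Usage: disposing 'flow' unlinks the whole pipeline in one call.
// var flow = downloadString.BeginFlow()
//     .Link(createWordList)
//     .Link(filterWordList)
//     .Link(findPalindromes);
// ...
// flow.Dispose();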

Related

What's the better coding style in oop: one method with one parameter VS two methods without parameters?

What's the better way for clean code from the oop point of view? Having two related methods with different names or one common method with an extra parameter?
(Simplified) Example:
1.) public void LogError() { ... }
public void LogWarning() { ... }
VS
2.) public void Log(LogType logType) { ... } //LogType.Error vs LogType.Warning
Both are good choices. Maybe a few examples can make it clearer. Usually, I try to think about who is going to use the library (me or someone else) and what programming language I am using.
For example:
If I use a strongly typed language like Java, C#, etc., then I prefer choice 2.
If I use something else like PHP or Python, then I prefer choice 1.
If I want to make a simplified interface for other developers who are going to use my library, for example, then I prefer choice 1 too.
When you have a LogType enum, for example, it really doesn't matter. Just try to think about how to describe the intent and make it clear.
Watch out for boolean parameters, which can often be confusing. For example:
public void SaveProduct(bool cache) { ... }
In those situations, choice 1 is usually better because it can be very hard to tell at the call site what a boolean value does (how it changes the behavior). It also usually indicates that the method is doing two different things, so there is probably a way to refactor it, for example by splitting it into two methods so the caller does not need to know about the implementation details, as sketched below.
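A small C# illustration of that last point (the class and method names here are invented for the example):
// Boolean parameter - unclear at the call site:
public class ProductServiceWithFlag
{
    public void SaveProduct(bool cache) { /* ... */ }
}
// service.SaveProduct(true);   // what does 'true' mean here?

// Choice 1 - two intention-revealing methods, each doing one thing:
public class ProductService
{
    public void SaveProduct() { /* ... */ }
    public void SaveAndCacheProduct() { /* ... */ }
}
// service.SaveAndCacheProduct();   // the intent is obvious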

Confused about the Interface and Class coding guidelines for TypeScript

I read through the TypeScript Coding guidelines
And I found this statement rather puzzling:
Do not use "I" as a prefix for interface names
I mean something like this wouldn't make a lot of sense without the "I" prefix
class Engine implements IEngine
Am I missing something obvious?
Another thing I didn't quite understand was this:
Classes
For consistency, do not use classes in the core compiler pipeline. Use
function closures instead.
Does that state that I shouldn't use classes at all?
Hope someone can clear it up for me :)
When a team/company ships a framework/compiler/tool-set they already have some experience, set of best practices. They share it as guidelines. Guidelines are recommendations. If you don't like any you can disregard them.
Compiler still will compile your code.
Though when in Rome...
This is my vision why TypeScript team recommends not I-prefixing interfaces.
Reason #1 The times of the Hungarian notation have passed
The main argument from I-prefix-for-interface supporters is that prefixing is helpful for immediately grokking (peeking) whether a type is an interface. The claim that a prefix helps with immediate grokking (peeking) is an appeal to Hungarian notation: I prefix for an interface name, C for a class, A for an abstract class, s for a string variable, c for a const variable, i for an integer variable. I agree that such name decoration can give you type information without hovering the mouse over the identifier or navigating to the type definition via a hot-key. This tiny benefit is outweighed by the disadvantages of Hungarian notation and the other reasons mentioned below. Hungarian notation is not used in contemporary frameworks. C# has the I prefix (and it is the only prefix in C#) for interfaces due to historical reasons (COM). In retrospect, one of the .NET architects (Brad Abrams) thinks it would have been better not to use the I prefix. TypeScript is COM-legacy-free, and thereby it has no I-prefix-for-interface rule.
Reason #2 I-prefix violates encapsulation principle
Let's assume you get some black box. You get some type reference that allows you to interact with that box. You should not care whether it is an interface or a class. You just use its interface part. Demanding to know what it is (an interface, a specific implementation, or an abstract class) is a violation of encapsulation.
Example: let's assume you need to fix the "API Design Myth: Interface as Contract" problem in your code, e.g. delete the ICar interface and use a Car base class instead. Then you need to perform that replacement in all consumers. The I prefix leads to an implicit dependency of consumers on black-box implementation details.
Reason #3 Protection from bad naming
Developers are often too lazy to think properly about names. Naming is one of the Two Hard Things in Computer Science. When a developer needs to extract an interface, it is easy to just add the letter I to the class name and get an interface name. Disallowing the I prefix for interfaces forces developers to strain their brains to choose appropriate names for interfaces. The chosen names should differ not only in a prefix but should emphasize the difference in intent.
Abstraction case: you should not define an ICar interface and an associated Car class. Car is an abstraction and it should be the one used for the contract. Implementations should have descriptive, distinctive names, e.g. SportsCar, SuvCar, HollowCar.
Good example: WpfeServerAutosuggestManager implements AutosuggestManager, FileBasedAutosuggestManager implements AutosuggestManager.
Bad example: AutosuggestManager implements IAutosuggestManager.
Reason #4 Properly chosen names vaccinate you against API Design Myth: Interface as Contract.
In my practice, I have met a lot of people who thoughtlessly duplicated the interface part of a class in a separate interface, following the Car implements ICar naming scheme. Duplicating the interface part of a class in a separate interface type does not magically convert it into an abstraction. You still get a concrete implementation, just with its interface part duplicated. If your abstraction is not good, duplicating the interface part will not improve it in any way. Extracting an abstraction is hard work.
NOTE: In TS you don't need a separate interface for mocking classes or overloading functionality.
Instead of creating a separate interface that describes the public members of a class, you can use TypeScript utility types. E.g. Required<T> constructs a type consisting of all the public members of type T.
export class SecurityPrincipalStub implements Required<SecurityPrincipal> {
    public isFeatureEnabled(entitlement: Entitlement): boolean {
        return true;
    }
    public isWidgetEnabled(kind: string): boolean {
        return true;
    }
    public areAdminToolsEnabled(): boolean {
        return true;
    }
}
If you want to construct a type excluding some public members, then you can use a combination of Omit and Exclude.
Clarification regarding the link that you reference:
This is documentation of the coding style for the TypeScript codebase itself, not a style guideline for how to implement your project.
If using the I prefix makes sense to you and your team, use it (I do).
If not, and you prefer the Java style of SomeThing (interface) with SomeThingImpl (implementation), then by all means use that.
I find @stanislav-berkov's answer to the OP's question a pretty good one. I would only share my 2 cents, adding that, in the end, it is up to your Team/Department/Company/Whatever to reach a common understanding and set its own rules/guidelines to follow.
Sticking to standards and/or conventions, whenever possible and desirable, is a good practice and it keeps things easier to understand. On the other side, I do like to think we are still free to choose the way how we write our code.
Thinking a bit on the emotional side of it, the way we write code, or our coding style, reflects our personality and in some cases even our mood. This is what keeps us humans and not just coding machines following rules. I believe coding can be a craft not just an industrialized process.
I personally quite like the idea of turning a noun into an adjective by adding the -able suffix. It sounds very improper, but I love it!
interface Walletable {
    inPocket: boolean;
    cash: number;
}

export class Wallet implements Walletable {
    //...
}
The guidelines that are suggested in the TypeScript documentation aren't for the people who use TypeScript but rather for the people who are contributing to the TypeScript project. If you read the details at the beginning of the page, it clearly defines who should use that guideline. Here is a link to the guidelines.
Typescript guidelines
In conclusion, as a developer you can name your interfaces the way you see fit.
I'm trying out this pattern similar to other answers, but exporting a function that instantiates the concrete class as the interface type, like this:
export interface Engine {
    rpm: number;
}

class EngineImpl implements Engine {
    rpm: number;

    constructor() {
        this.rpm = 0;
    }
}
export const createEngine = (): Engine => new EngineImpl();
In this case the concrete implementation is never exported.
I do like to add a Props suffix.
interface FormProps {
    some: string;
}

const Form: VFC<FormProps> = (props) => {
    ...
}
The type being an interface is an implementation detail. Implementation details should be hidden in APIs. That is why you should avoid I.
You should avoid both prefix and suffix. These are both wrong:
ICar
CarInterface
What you should do is make a pretty name visible in the API and keep the implementation detail hidden in the implementation. That is why I propose:
Car - An interface that is exposed in the API.
CarImpl - An implementation of that API, that is hidden from the consumer.

When to use Encapsulate Collection?

Regarding the Data Class smell, as Martin Fowler describes it in Refactoring, he suggests that if I have a collection field in my class I should encapsulate it.
The Encapsulate Collection (208) pattern says we should add the following methods:
get_unmodified_collection
add_item
remove_item
and remove these:
get_collection
set_collection
This makes sure any changes to this collection go through the class.
Should I refactor every class which has a collection field with this pattern? Or it depends on some other reasons like frequency of usage?
I use C++ in my project now.
Any suggestion would be helpful. Thanks.
These are well formulated questions and my answer is:
Should I refactor every class which has a collection field with this
pattern?
No, you should not refactor every class which has a collection field. Every fundamentalism is a way to hell. Use common sense and do not make your design too good, just good enough.
Or it depends on some other reasons like frequency of usage?
The second question comes from a common mistake. The reason why we refactor or use design patterns is not primarily the frequency of use. We do it to make the code clearer, more maintainable, more expandable, more understandable, and sometimes (but not always!) more effective. Everything which adds to these goals is good. Everything which does not is bad.
You might have expected a yes/no answer, but such an answer is not possible here. As said, use your common sense and measure your solution against the above-mentioned viewpoints.
I generally like the idea of encapsulating collections. Also encapsulating plain Strings into named business classes. I do it almost always when the classes are meaningful in the business domain.
I would always prefer
public class People {
    private final Collection<Man> people;
    ... // useful methods
}
over the plain Collection<Man> when Man is a business class (a domain object). Or I would sometimes do it in this way:
public class People implements List<Man> {
    private final List<Man> people;
    ... // delegate methods, such as

    @Override
    public int size() {
        return people.size();
    }

    @Override
    public Man get(int index) {
        // Here might also be some manipulation with the returned data etc.
        return people.get(index);
    }

    @Override
    public boolean add(Man man) {
        // Decoration - added some validation
        if (/* man does not match some criteria */) {
            return false;
        }
        return people.add(man);
    }

    ... // useful methods
}
Or similarly I prefer
public class StreetAddress {
    private final String value;

    public String getTextValue() { return value; }

    ...
    // later I may add more business logic, such as parsing the street address
    // to street name and house number etc.
}
over just using plain String streetAddress - thus I keep the door opened to any future change of the underlying logic and to adding any useful methods.
However, I try not to overkill my design when it is not needed so I am as well as happy with plain collections and plain Strings when it is more suited.
I think it depends on the language you are developing with, since there are already interfaces that do just that in C# and Java, for example. In C# we have ICollection, IEnumerable, IList. In Java, Collection, List, etc.
If your language doesn't have an interface to refer to a collection regardless of its inner implementation, and you need your own abstraction of that class, then it's probably a good idea to do so. And yes, you should not let the collection be modified directly, since that completely defeats the purpose.
It would really help if you tell us which language are you developing with. Granted, it is kind of a language-agnostic question, but people knowledgeable in that language might recommend you the best practices in it and if there's already a way to achieve what you need.
The motivation behind Encapsulate Collection is to reduce the coupling of the collection's owning class to its clients.
Every refactoring tries to improve the maintainability of the code, so future changes are easier. In this case, changing the collection class from vector to list, for example, changes all the clients' uses of the class. If you encapsulate it with this refactoring, you can change the collection without changes to the clients. This follows one of the SOLID principles, the dependency inversion principle: depend upon abstractions, do not depend upon concretions.
You have to decide for your own code base whether this is relevant for you, meaning whether your code base is still being changed and has to be maintained (then yes, do it for every class) or not (then no, leave the code be).
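For illustration, here is what an encapsulated collection might look like in C# (the Order/OrderLine names are invented for the example; the same shape carries over to other languages with whatever read-only view they offer):
// Requires using System and System.Collections.Generic.
public class OrderLine { }

public class Order
{
    private readonly List<OrderLine> lines = new List<OrderLine>();

    // Clients get a read-only view instead of the mutable list itself.
    public IReadOnlyCollection<OrderLine> Lines => lines.AsReadOnly();

    // All modifications go through the owning class, so invariants
    // (validation, events, logging, ...) can be enforced in one place.
    public void AddLine(OrderLine line)
    {
        if (line == null) throw new ArgumentNullException(nameof(line));
        lines.Add(line);
    }

    public bool RemoveLine(OrderLine line) => lines.Remove(line);
}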

Is there a good way to use polymorphism to remove this switch statement?

I've been reading on refactoring and replacing conditional statements with polymorphism. The trouble I have is that it only seems to make sense to me when you have a more complex case where, without polymorphism, you would have to repeat the same switch statements or if-elses many times. I don't see how it makes sense if you're only doing it once - you have to have that conditional somewhere, right?
As an example, I recently wrote the following class, which is responsible for reading a XML file and converting its data into the program's objects. There are 2 possible formats for the file that we are supporting, so I simply wrote a method in the class for handling each one, and used a case-switch to determine which one to use:
public class ComponentXmlReader
{
    public IEnumerable<UserComponent> ImportComponentsFromXml(string path)
    {
        var xmlFile = XElement.Load(path);
        switch (xmlFile.Name.LocalName)
        {
            case "CaseDefinition":
                return ImportComponentsFromA(xmlFile);
            case "Root":
                return ImportComponentsFromB(xmlFile);
            default:
                // Needed so that all code paths return a value.
                throw new NotSupportedException(
                    $"Unknown root element '{xmlFile.Name.LocalName}'.");
        }
    }

    private IEnumerable<UserComponent> ImportComponentsFromA(XContainer file)
    {
        //do stuff
    }

    private IEnumerable<UserComponent> ImportComponentsFromB(XContainer file)
    {
        //do stuff
    }
}
As far as I can tell, I could write a class hierarchy for this to do the parsing, but I don't see the advantage here - I'd still have to use a case-switch to determine which class to instantiate. It looks to me like it would be extra complexity for no benefit. If I was going to keep these classes around and do more things with them that depended on the file type, then it would eliminate doing the same switch in multiple places, but this is single-use. Is this right, or is there some reason or technique I'm not seeing that makes it a good idea to use a polymorphic class hierarchy to do this?
If you had, say, an abstract ComponentImporter class, with concrete subclasses FromA and FromB, you could instantiate one of each, and put it in a Map. Then you could call componentImporterMap.get(xmlFile.Name.LocalName).importComponents() and avoid the switch.
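In C#, roughly the same idea can be expressed with a dictionary keyed by the root element name. This is only a sketch built on the types from the question; a dictionary of importer objects would work just as well as these delegates:
// Requires using System, System.Collections.Generic and System.Xml.Linq.
public class ComponentXmlReader
{
    // Maps the root element name of the file to the handler for that format.
    private static readonly Dictionary<string, Func<XElement, IEnumerable<UserComponent>>> importers =
        new Dictionary<string, Func<XElement, IEnumerable<UserComponent>>>
        {
            ["CaseDefinition"] = ImportComponentsFromA,
            ["Root"] = ImportComponentsFromB,
        };

    public IEnumerable<UserComponent> ImportComponentsFromXml(string path)
    {
        var xmlFile = XElement.Load(path);
        return importers[xmlFile.Name.LocalName](xmlFile);
    }

    private static IEnumerable<UserComponent> ImportComponentsFromA(XElement file)
    {
        // do stuff, as in the original question
        throw new NotImplementedException();
    }

    private static IEnumerable<UserComponent> ImportComponentsFromB(XElement file)
    {
        // do stuff, as in the original question
        throw new NotImplementedException();
    }
}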
As with all design choices, context is key. In this case, you have what seems to be a fairly simple class handling two very similar tasks. If the two Import methods contained very little duplicate code, then including them in a single class is perhaps the best choice since, as you say, it reduces complexity.
However, it's possible you'll use this class in the future, and even add new types of imports. In that case, the class would be more reusable if it was polymorphic.
Additionally, since these methods sound very similar, you're likely to have a bunch of duplicate code, which you could keep in a base class and only put import-specific code in the child classes.
Plus, as Carl mentions, there are numbers of ways to implement this logic without using a case statement.

Executing a certain action for all elements in an Enumerable<T>

I have an Enumerable<T> and am looking for a method that allows me to execute an action for each element, kind of like Select but then for side-effects. Something like:
string[] Names = ...;
Names.each(s => Console.WriteLine(s));
or
Names.each(s => GenHTMLOutput(s));
// (where GenHTMLOutput cannot for some reason receive the enumerable itself as a parameter)
I did try Select(s=> { Console.WriteLine(s); return s; }), but it wasn't printing anything.
A quick-and-easy way to get this is:
Names.ToList().ForEach(e => ...);
You are looking for the ever-elusive ForEach that currently only exists on the List generic collection. There are many discussions online about whether Microsoft should or should not add this as a LINQ method. Currently, you have to roll your own:
public static void ForEach<T>(this IEnumerable<T> value, Action<T> action)
{
    foreach (T item in value)
    {
        action(item);
    }
}
While the All() method provides similar abilities, its use case is for performing a predicate test on every item rather than an action. Of course, it can be persuaded to perform other tasks, but this somewhat changes the semantics and would make it harder for others to interpret your code (i.e. is this use of All() for a predicate test or an action?).
Disclaimer: This post no longer resembles my original answer, but rather incorporates some seven years of experience I've gained since. I made the edit because this is a highly-viewed question and none of the existing answers really covered all the angles. If you want to see my original answer, it's available in the revision history for this post.
The first thing to understand here is that C# LINQ operations like Select(), All(), Where(), etc., have their roots in functional programming. The idea was to bring some of the more useful and approachable parts of functional programming to the .NET world. This is important, because a key tenet of functional programming is for operations to be free of side effects. It's hard to overstate this. However, in the case of ForEach()/each(), side effects are the entire purpose of the operation. Adding each() or ForEach() is not just outside the functional programming scope of the other LINQ operators, but in direct opposition to them.
But I understand this feels unsatisfying. It may help explain why ForEach() was omitted from the framework, but fails to address the real issue at hand. You have a real problem you need to solve. Why should all this ivory tower philosophy get in the way of something that might be genuinely useful?
Eric Lippert, who was on the C# design team at the time, can help us out here. He recommends using a traditional foreach loop:
[ForEach()] adds zero new representational power to the language. Doing this lets you rewrite this perfectly clear code:
foreach(Foo foo in foos){ statement involving foo; }
into this code:
foos.ForEach(foo=>{ statement involving foo; });
His point is, when you look closely at your syntax options, you don't gain anything new from a ForEach() extension versus a traditional foreach loop. I partially disagree. Imagine you have this:
foreach (var item in Some.Long(and => possibly)
                         .Complicated(set => ofLINQ)
                         .Expression(to => evaluate))
{
    // now do something
}
This code obfuscates meaning, because it separates the foreach keyword from the operations in the loop. It also lists the loop command prior to the operations that define the sequence on which the loop operates. It feels much more natural to want to have those operations come first, and then have the loop command at the end of the query definition. Also, the code is just ugly. It seems like it would be much nicer to be able to write this:
Some.Long(and => possibly)
    .Complicated(set => ofLINQ)
    .Expression(to => evaluate)
    .ForEach(item =>
    {
        // now do something
    });
However, even here, I eventually came around to Eric's point of view. I realized code like you see above is calling out for an additional variable. If you have a complicated set of LINQ expressions like that, you can add valuable information to your code by first assigning the result of the LINQ expression to a new variable:
var queryForSomeThing = Some.Long(and => possibly)
                            .Complicated(set => ofLINQ)
                            .Expressions(to => evaluate);

foreach (var item in queryForSomeThing)
{
    // now do something
}
This code feels more natural. It puts the foreach keyword back next to the rest of the loop, and after the query definition. Most of all, the variable name can add new information that will be helpful to future programmers trying to understand the purpose of the LINQ query. Again, we see the desired ForEach() operator really added no new expressive power to the language.
However, we are still missing two features of a hypothetical ForEach() extension method:
It's not composable. I can't add a further .Where() or GroupBy() or OrderBy() after a foreach loop inline with the rest of the code, without creating a new statement.
It's not lazy. These operations happen immediately. It doesn't allow me to, say, have a form where a user chooses an operation as one field in a larger screen that is not acted on until the user presses a command button. This form might allow the user to change their mind before executing the command. This is perfectly normal (easy even) with a LINQ query, but not as simple with a foreach.
(FWIW, most naive .ForEach() implementations also have these issues. But it's possible to craft one without them.)
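For what it's worth, a lazy, composable variant is easy to sketch. I'll call it Do() here, after the Rx operator of the same name; the name and the extension class are my own invention, not a built-in LINQ method:
public static class EnumerableDoExtensions
{
    // Sketch: like Select(), but passes each item through unchanged after
    // invoking the action. It stays deferred and composable like other LINQ operators.
    public static IEnumerable<T> Do<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (var item in source)
        {
            action(item);
            yield return item; // nothing executes until the sequence is enumerated
        }
    }
}

// Usage: still composable, still lazy; the terminal ToList() (or a foreach)
// is what actually triggers execution.
// names.Where(n => n.Length > 3).Do(Console.WriteLine).OrderBy(n => n).ToList();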
You could, of course, make your own ForEach() extension method. Several other answers have implementations of this method already; it's not all that complicated. However, I feel like it's unnecessary. There's already an existing method that fits what we want to do from both semantic and operational standpoints. Both of the missing features above can be addressed by use of the existing Select() operation.
Select() fits the kind of transformation or projection described by both of the examples above. Keep in mind, though, that I would still avoid creating side effects. The call to Select() should return either new objects or projections from the originals. This can sometimes be aided through the use of an anonymous type or dynamic object (if and only if necessary). If you need the results to persist in, say, an original list variable, you can always call .ToList() and assign it back to your original variable. I'll add here that I prefer working with IEnumerable<T> variables as much as possible over more concrete types.
myList = myList.Select(item => new SomeType(item.value1, item.value2 *4)).ToList();
In summary:
Just stick with foreach most of the time.
When foreach really won't do (which probably isn't as often as you think), use Select()
When you need to use Select(), you can still generally avoid (program-visible) side effects, possibly by projecting to an anonymous type.
Avoid the crutch of calling ToList(). You don't need it as much as you might think, and it can have significant negative consequences for performance and memory use.
Unfortunately there is no built-in way to do this in the current version of LINQ. The framework team neglected to add a .ForEach extension method. There's a good discussion about this going on right now on the following blog.
http://blogs.msdn.com/kirillosenkov/archive/2009/01/31/foreach.aspx
It's rather easy to add one though.
public static void ForEach<T>(this IEnumerable<T> enumerable, Action<T> action) {
    foreach (var cur in enumerable) {
        action(cur);
    }
}
You cannot do this right away with LINQ and IEnumerable - you need to either implement your own extension method, or cast your enumeration to an array with LINQ and then call Array.ForEach():
Array.ForEach(MyCollection.ToArray(), x => x.YourMethod());
Please note that because of the way value types and structs work, if the collection is of a value type and you modify the elements of the collection this way, it will have no effect on the elements of the original collection.
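A tiny example of that caveat (the MutablePoint struct is invented for the demonstration):
using System;
using System.Collections.Generic;

struct MutablePoint { public int X; }

class Program
{
    static void Main()
    {
        var points = new List<MutablePoint> { new MutablePoint { X = 1 } };

        // ToArray() copies the value-type elements, and the lambda then receives
        // its own copy of each array element, so the original list never changes.
        Array.ForEach(points.ToArray(), p => p.X = 42);

        Console.WriteLine(points[0].X); // prints 1, not 42
    }
}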
Because LINQ is designed to be a query feature and not an update feature, you will not find an extension which executes methods on an IEnumerable<T>, because that would allow you to execute a method (potentially with side effects). In this case you may as well just stick with
foreach(string name in Names)
Console.WriteLine(name);
Using Parallel Linq:
Names.AsParallel().ForAll(name => ...)
Well, you can also use the standard foreach keyword, just format it into a one-liner:
foreach(var n in Names.Where(blahblah)) DoStuff(n);
Sorry, thought this option deserves to be here :)
There is a ForEach method off of List. You could convert the Enumerable to List by calling the .ToList() method, and then call the ForEach method off of that.
Alternatively, I've heard of people defining their own ForEach method for IEnumerable. This can be accomplished by essentially calling the List ForEach method, but wrapping it in an extension method:
public static class IEnumerableExtensions
{
    public static IEnumerable<T> ForEach<T>(this IEnumerable<T> _this, Action<T> del)
    {
        List<T> list = _this.ToList();
        list.ForEach(del);
        return list;
    }
}
As mentioned before, a ForEach extension will do the trick.
My tip for the current question is about how to actually execute the iterator:
[I did try Select(s=> { Console.WriteLine(s); return s; }), but it wasn't printing anything.]
Select() is lazy, so nothing runs until something enumerates the result. Forcing the enumeration, e.g. with Count(), makes the side effect happen. Check this:
_ = Names.Select(s => { Console.WriteLine(s); return 0; }).Count();
Try it!

Resources