Is it right to use a subflow with only one JavaComputeNode?

I have a JavaComputeNode with a Java class that I use in several subflows. I would like to know whether it is better to wrap this node in a single subflow instead of creating it separately in each place and pointing each copy at the same Java class.
Is it right to create a subflow with only one node?

If the subflow is in the same application and contains only that one node, with the node's terminals wired directly to the subflow's input and output, then I wouldn't create a subflow, because it adds nothing.
To justify a subflow, it would need to add something to the node, such as error-handling logic, logging, or even just rewiring of terminals.
It might also make sense to put the node in a subflow if you plan to put that subflow in a library, for example because you want to version it separately, and especially if you plan to put it in a shared library.

How to use Gremlin to get both node properties and edge names in one query?

I have been thrown into a pool of Golang / Gremlin / Neptune, and am able to get some things to work. Life is good enough, but I am hoping there is a simple answer (which I have not been able to find) to what seems like a simple question.
I have 'obs' nodes with some properties, two of which are ('type','domain') and ('value','whitehouse.com').
Another set of nodes is 'attack', with ('type','group') and ('value','Emotet'), along with other properties.
An observation node can have an edge pointing to one or more attack nodes (and, actually, to other types of nodes as well). These edges have a time-based property: when the observation was seen manifesting a certain type of attack.
I'm working in Go, using gremson to communicate with a Neptune db. In this environment you construct your query as a string and send it down the wire to Neptune, and get something called graphson back.
Thus, I construct this, and send it...
fmt.Sprintf("g.V().hasLabel('obs').has('value','%s').limit(1)", domain)
And I get back the properties for a vertex, in graphson. Were I using the console, all I would get back would be the id. Go figure.
Then I construct this, and send it...
fmt.Sprintf("g.V().hasLabel('obs').has('value','%s').limit(1).out()", domain)
and I get back the properties of the connected nodes, in graphson. Again, using the console I would only get back ids. No sweat.
What I would LIKE to do is to combine these two queries somehow so that I am not doing what seems to be like two almost identical lookups.
Console-wise, assume both queries also have valueMap() or elementMap() tacked on the end. Is there any way to do them as one query?
There are many ways you could write this query. Here are a couple of options:
g.V().hasLabel('obs').
has('value','%s').
limit(1).as('a').
out().as('b').
select('a','b')
or, using project:
g.V().hasLabel('obs').
has('value','%s').
limit(1).
project('a','b').
by().
by(out().fold())
My preference is for the project example as you will get the connected vertices back in a list.
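For instance, here is a minimal Go sketch building the project variant in the same string-based style as the question. The domain value is a hypothetical example, and the final print stands in for whatever client call you already use to send the query string to Neptune:

package main

import "fmt"

func main() {
	// Hypothetical example value; in real code this comes from the caller.
	domain := "whitehouse.com"

	// One combined query: the matched vertex under 'a', and its
	// connected vertices folded into a list under 'b'.
	query := fmt.Sprintf(
		"g.V().hasLabel('obs').has('value','%s').limit(1)."+
			"project('a','b').by(valueMap()).by(out().valueMap().fold())",
		domain)

	// Send this down the wire exactly as before.
	fmt.Println(query)
}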

(Golang) Clean Architecture - Who should do the orchestration?

I am trying to understand which of the following two options is the right approach and why.
Say we have a GetHotelInfo(hotel_id) API that is invoked from the web layer down to the controller.
The logic of the GetHotelInfo is:
Invoke GetHotelPropertyData() (Location, facilities…)
Invoke GetHotelPrice(hotel_id, dates…)
Invoke GetHotelReviews(hotel_id)
Once all results come back, process and merge the data and return one object that contains all the relevant data of the hotel.
Option 1:
Create 3 different repositories (HotelPropertyRepo, HotelPriceRepo, HotelReviewsRepo)
Create a GetHotelInfo usecase that will use these 3 repositories and return the final result.
Option 2:
Create 3 different repositories (HotelPropertyRepo, HotelPriceRepo, HotelReviewsRepo)
Create 3 different usecases (GetHotelPropertyDataUseCase, GetHotelPriceUseCase, GetHotelReviewsUseCase)
Create a GetHotelInfoUseCase that will orchestrate the previous 3 usecases. (It can also be a controller, but that's a different topic.)
Let's say that right now only GetHotelInfo is exposed to the web, but maybe in the future I will expose some of the inner requests as well.
And would the answer be different if the actual logic of GetHotelInfo is not a combination of 3 endpoints but rather 10?
You can see a similar method (called Get()) in "Clean Architecture with GO" by Manato Kuroda.
Manato points out that:
following the Acyclic Dependencies Principle (ADP), the dependencies only point inward in the circle, never outward, and form no cycles;
the Controller and Presenter depend on the Use Case Input Port and Output Port, which are defined as interfaces, not as specific logic (the details). This is possible (without the outer layer knowing the details) thanks to the Dependency Inversion Principle (DIP).
That is why, in the example repository manakuro/golang-clean-architecture, Manato creates three directories for the use-case layer:
repository,
presenter: in charge of the Output Port,
interactor: in charge of the Input Port, with a set of methods implementing specific application business rules, depending on the repository and presenter interfaces.
You can use that example to adapt your case, with GetHotelInfo declared first in the hotel_interactor.go file, depending on specific business methods declared in hotel_repository, and with responses defined in hotel_presenter.
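For illustration, a minimal sketch of what such an interactor could look like for this hotel case (all type and method names below are hypothetical, loosely following the interactor/repository layout above, not taken from Manato's repository):

package hotel

import "context"

// Domain types returned by the repositories (simplified placeholders).
type Property struct{ Location, Facilities string }
type Price struct{ Amount float64 }
type Review struct{ Text string }

// Repository interfaces; the concrete implementations (database,
// HTTP clients, etc.) live in the outer layers, per DIP.
type PropertyRepo interface {
	GetProperty(ctx context.Context, hotelID string) (Property, error)
}
type PriceRepo interface {
	GetPrice(ctx context.Context, hotelID string, dates ...string) (Price, error)
}
type ReviewsRepo interface {
	GetReviews(ctx context.Context, hotelID string) ([]Review, error)
}

// HotelInfo is the single merged object returned to the controller.
type HotelInfo struct {
	Property Property
	Price    Price
	Reviews  []Review
}

// Interactor implements the GetHotelInfo use case by orchestrating
// the three repositories directly (Option 1).
type Interactor struct {
	Properties PropertyRepo
	Prices     PriceRepo
	Reviews    ReviewsRepo
}

func (i *Interactor) GetHotelInfo(ctx context.Context, hotelID string, dates ...string) (HotelInfo, error) {
	prop, err := i.Properties.GetProperty(ctx, hotelID)
	if err != nil {
		return HotelInfo{}, err
	}
	price, err := i.Prices.GetPrice(ctx, hotelID, dates...)
	if err != nil {
		return HotelInfo{}, err
	}
	reviews, err := i.Reviews.GetReviews(ctx, hotelID)
	if err != nil {
		return HotelInfo{}, err
	}
	return HotelInfo{Property: prop, Price: price, Reviews: reviews}, nil
}

The controller only sees GetHotelInfo; whether the three calls stay inline like this or later move into their own use cases (Option 2) is invisible to it.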
Interactors (use-case classes) are expected to call other interactors, so both approaches follow Clean Architecture principles.
But the "maybe in the future" phrase goes against good design and architecture practices.
We can and should think in the most abstract way so that we favor reuse, while always keeping things simple and avoiding unnecessary complexity.
And would the answer be different if the actual logic of GetHotelInfo is not a combination of 3 endpoints but rather 10?
No, it would be the same. However, as you are designing APIs, if you ever need to combine dozens of endpoints, you should start considering putting a GraphQL layer in front instead of adding complexity to the project.
"Clean" is not a well-defined term. Rather, you should be aiming to minimise the impact of change (adding or removing a service). And by "impact" I mean not only the cost and time factors but also the risk of introducing a regression (breaking a different part of the system that you're not meant to be touching).
To minimise the "impact of change" you would split these into separate services/bounded contexts and allow interaction only through events. The 'controller' would raise an event (on a shared bus) like 'hotel info request', and each separate service (property, price, and reviews) would respond independently and asynchronously (perhaps on the same bus), leaving the controller to aggregate the results and return them to the client, possibly after some period of time. If you code the result aggregator appropriately, it becomes possible to add new 'features' or remove existing ones completely independently of the others.
To improve on this you would then separate the read and write functionality of each context into its own context, each responding to appropriate events. This will allow you to optimise and scale the write function independently of the read function. We call this CQRS.
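As a rough sketch of that shape, here is a toy Go program that uses channels as a stand-in for the shared bus (all names are hypothetical; a real system would use a message broker and proper correlation IDs):

package main

import (
	"fmt"
	"time"
)

// Event is what the controller publishes on the bus; each context
// replies with a keyed fragment of the hotel info.
type Event struct{ HotelID string }
type Fragment struct {
	Key   string
	Value string
}

// service is an independent bounded context subscribed to requests.
func service(name string, requests <-chan Event, replies chan<- Fragment) {
	for ev := range requests {
		replies <- Fragment{Key: name, Value: "data for " + ev.HotelID}
	}
}

func main() {
	names := []string{"property", "price", "reviews"}
	replies := make(chan Fragment)

	// Each context gets its own subscription; publishing fans out.
	var buses []chan Event
	for _, name := range names {
		bus := make(chan Event, 1)
		buses = append(buses, bus)
		go service(name, bus, replies)
	}

	// The controller raises one event on the bus.
	for _, bus := range buses {
		bus <- Event{HotelID: "h-42"}
	}

	// The aggregator collects whatever arrives before its deadline,
	// so a slow or removed service never blocks the response.
	result := map[string]string{}
	deadline := time.After(time.Second)
loop:
	for i := 0; i < len(names); i++ {
		select {
		case f := <-replies:
			result[f.Key] = f.Value
		case <-deadline:
			break loop
		}
	}
	fmt.Println(result)
}

Adding a fourth 'feature' is then just another subscriber on the bus; the aggregator and the existing services are untouched.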

Multiple flows with NiFi

We have multiple (50+) NiFi flows that all do basically the same thing: pull some data out of a DB, append some columns, convert to Parquet, and upload to HDFS. They differ only in details such as the SQL query to run or the location in HDFS where they land.
The question is how to factor out the common parts of these NiFi flows so that any change made to the common flow automatically applies to all derived flows. E.g. if I want to add an extra step to also publish the data to Kafka, I want to make this change once and have it automatically apply to all 50 flows.
We've tried to get this working with NiFi Registry; however, it seems like an imperfect fit. Essentially, the issue is that NiFi Registry works well for updating a flow in one environment (say UAT) and then automatically updating it in another environment (say prod). It seems less suited to updating multiple flows in the same environment, one specific example being that it resets the name of each flow to the template name every time we redeploy, meaning that all flows end up with the same name!
Does anyone know how one is supposed to manage a situation like ours, as I guess it must be pretty common?
Apache NiFi has Process Groups. As the name suggests, a process group is there to group together a set of processors and their pipeline that perform a similar task.
So in your case, you can refactor the flow by moving the common part, which can be reused by different pipelines, into a separate process group with an input port. Connect any outside flow that depends on this reusable flow to the input port of the reusable process group. Depending on your requirements, you can also create an output port in this process group and connect it to the outside flow.
Attaching a sample:
For the sake of explanation, I have made a mock flow, so ignore the processor types used and look instead at the names I have given the processors.
The following screenshots show that I read from two different sources and individually connect them to two different processors that make the source-specific changes.
Then I connect these two flows to the input port of a process group that has the reusable flow inside. So ultimately the two different flows shown in the above screenshot get to work with a common reusable flow.
Showing what's inside the reusable flow:
Finally, the output port output to outside connects the reusable flow to the outside component Write to somewhere.
I hope this helps you with refactoring your complex flows. Feel free to get back if you have any queries.

Alt data dependency between actions, not stores

I have a React app where I'm using alt for the Flux-architecture side of things.
I have a situation where I have two stores which are fed by AJAX calls in their corresponding actions.
The alt getting-started page on data dependencies mentions dependencies between stores using waitFor (http://alt.js.org/guide/wait-for/), but I don't see a way to use this kind of approach when one of my store's actions is dependent on another store's action (both of which are async).
If I were doing this inside a single action handler, I might return or chain some promises, but I'm not sure how to implement this across action handlers. Has anyone achieved this, or am I going about my usage of AJAX in React the wrong way?
EDIT: More detail.
In my example I have a list of nodes defined in a local JSON config file; my node-store makes an AJAX request to get the node detail.
Once that's complete, a different component (with a different action handler and store) wants to use the node collection to make an AJAX query against the different endpoints a node may expose.
The nodes are re-used across many different components so I don't want to roll their functionality into several different stores/action handlers if possible.

Reuse states from Spring Web Flow definitions?

I have, for example, 2 flows that should end in the same transitions to the same states, e.g.,
Flow 1 ends in either go to state A or B.
Flow 2 ends in either go to state A or B.
Right now, I seem to need to define the same end-states A and B in both flow1.xml and flow2.xml.
Is there any way they can all share the same states, A and B?
I've tried creating something like flowState, defining two end states in it, and referring to them in flows 1 and 2 like
flowState#stateA and flowState#stateB
but no luck. Any help?
Refactor the common states into a subflow, and call the subflow from the different main flows where you want to reuse them.
You can even pass parameters to the subflow to configure it, using the Spring Expression Language if needed.
