NiFi - Remote Process Group - PeerSelector

I have built a simple process group. It generates a FlowFile with some random content in it and sends it to a NiFi Remote Process Group.
This Remote Process Group is configured to send the FlowFile to localhost, or in this case to my own hostname (I have tried localhost as well).
After this, the FlowFile should appear at the "From MiNiFi" input port and is sent to LogAttribute. Nothing special.
I configured it to use RAW, but it doesn't work with HTTP either.
I am using the apache/nifi Docker image and didn't change anything in nifi.properties or authorizers.xml, but of course I provide you both:
nifi.properties
authorizers.xml
The error occurring is this:
WARNING org.apache.nifi.remote.client.PeerSelector#40081613 Unable to refresh Remote Group's peers due to Unable to communicate with remote NiFi cluster in order to determine which nodes exist in the remote cluster
I hope you can help me. I have wasted too much time on this problem. XD

In nifi.properties you have nifi.web.http.host=f4f40c87b65f, which means the hostname NiFi is listening for requests on is f4f40c87b65f, so the URL of your RPG must be http://f4f40c87b65f:8080/nifi.
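Spelled out, the two settings involved (values from the question's nifi.properties; 8080 is the default HTTP port) and the matching URL are:
nifi.web.http.host=f4f40c87b65f
nifi.web.http.port=8080
Remote Process Group URL: http://f4f40c87b65f:8080/nifi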

Related

Can one NiFi node have multiple host names?

Problem:
Unable to allow multiple host names for a single NiFi node.
Description:
I have an internal NiFi server with internal computer name 'nifi-1'. nifi.properties has the following:
nifi.web.https.host=0.0.0.0
nifi.web.https.port=9443
This works fine when I hit "https://nifi-1:9443/nifi/" internally.
I have another DNS name, "nifi-1.company.com" (both names must be supported), that routes to the same NiFi node. The node rejects the request with the following error message when I hit "https://nifi-1.company.com:9443/nifi/":
System Error
The request contained an invalid host header [nifi-1.company.com:9443] in the request [/nifi]. Check for request manipulation or third-party intercept.
Valid host headers are [empty] or:
127.0.0.1
127.0.0.1:9443
localhost
localhost:9443
[::1]
[::1]:9443
nifi-1
nifi-1:9443
10.0.1.82
10.0.1.82:9443
0.0.0.0
0.0.0.0:9443
Question:
How to resolve this problem? Any solutions? (Thanks!)
Another way to phrase the question: how can I add more host names to the list of "valid host headers" shown above?
This issue was addressed in NiFi 1.5 (NIFI-4761). To resolve it, whitelist the hostname used to access NiFi with the following parameter in the nifi.properties configuration file:
nifi.web.proxy.host=host:port
It's a comma-separated list of allowed HTTP Host header values to consider when NiFi is running securely and will be receiving requests to a different host[:port]. For example, when running in a Docker container or behind a proxy (e.g. localhost:18443, proxyhost:443). By default, this value is blank, meaning NiFi should allow only requests sent to the host[:port] that NiFi is bound to.
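For the setup in this question, that would look something like this in nifi.properties (hostname and port taken from the error message above):
nifi.web.proxy.host=nifi-1.company.com:9443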
original answer source: how to use nifi.web.proxy.host and nifi.web.proxy.context.path?

Can I access to Nifi Rest-API using localhost instead of actual node-ip address in Nifi cluster?

For example, I have 3 NiFi nodes in the cluster. Example addresses of these nodes:
192.168.12.50:8080 (primary)
192.168.54.60:8080
192.168.95.70:8080
I know that I can access the NiFi REST API from all NiFi nodes. I have a GetHTTP processor that gets the cluster summary from the REST API, and this processor runs only on the primary node. I set the "URL" property of this processor to 192.168.12.50:8080/nifi-api/controller/cluster.
But if the primary node goes down, a new primary node will be elected, and I will not be able to reach 192.168.12.50:8080 from the new primary node, because that node is down. So I will not be able to get the cluster summary from the REST API.
In this case, can I use "localhost:8080/nifi-api/controller/cluster" instead of "192.168.12.50:8080/nifi-api/controller/cluster" on each node in the NiFi cluster?
It depends on a few things... if you are running securely, then you have certificates generated for each node specific to its hostname, and the host in the web request needs to match the host in the certificate, so you can't use localhost in that case.
It also depends on how NiFi's web server is configured. If nifi.web.http.host or nifi.web.https.host has a specific hostname specified, then the web server is bound only to that hostname and may not accept connections with a different hostname. In a default unsecured setup, if you leave nifi.web.http.host blank, it binds to all interfaces.
You may be able to use the hostname() expression language function to obtain the hostname of the current node. So you could make the URL something like "http://${hostname()}/nifi-api/controller/cluster".
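As a sketch, the GetHTTP configuration this implies would be (the 8080 port is assumed from the question's node addresses):
URL: http://${hostname()}:8080/nifi-api/controller/cluster
Because ${hostname()} is evaluated on whichever node the processor happens to run on, a newly elected primary node would simply query itself.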

"not in dispatcher" - issues connecting a validator peer to genesis validator

I have been banging my head against this one for a while.
So, I have successfully (maybe) created a running Sawtooth validator with a settings-tp and poet-validator-registry (all containers built from scratch).
I created it with a config-genesis.batch, then ran "proposal create" with PoET and a public key PEM etc. for a config.batch, then "poet registration create" for a poet.batch, then "proposal create" again with the additional PoET settings, which gives a poet-settings.batch.
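(For reference, that sequence looks roughly like this, modeled on the PoET default docker-compose file; the key path and setting values are illustrative, not the asker's exact commands:)
sawset genesis -k /etc/sawtooth/keys/validator.priv -o config-genesis.batch
sawset proposal create -k /etc/sawtooth/keys/validator.priv \
    sawtooth.consensus.algorithm=poet \
    sawtooth.poet.report_public_key_pem="$(cat /etc/sawtooth/simulator_rk_pub.pem)" \
    -o config.batch
poet registration create -k /etc/sawtooth/keys/validator.priv -o poet.batch
sawset proposal create -k /etc/sawtooth/keys/validator.priv \
    sawtooth.poet.target_wait_time=5 sawtooth.poet.initial_wait_time=25 \
    -o poet-settings.batch
sawadm genesis config-genesis.batch config.batch poet.batch poet-settings.batch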
Basically, I am copying for the most part the docker-compose file for the PoET default, but now rolled with my own containers from scratch (I want to know how everything pieces together in detail).
Anyway, one of those details concerns keys and auth... it's finally running, the settings-tp and poet-validator-registry are happy with it and communicating normally, and then it creates a genesis block as it should.
However, I then try to connect another validator to it as a peer...
"No chain head and not the genesis node: starting in peering mode" - GREAT!
However, when it tries to connect:
[2018-05-10 10:30:10.542 INFO dispatch] Can't send message PING_RESPONSE back to ee58844c071426276de533cadfafbd3c2448604e59fd81f4758edc07b5beea89476a6252e0a2144d43f14e06bf90c57dd2613562221954e3b2eddc6d2fcd9ef6 because connection OutboundConnectionThread-tcp://192.168.1.200:8800 not in dispatcher
[2018-05-10 10:30:10.542 INFO dispatch] Can't send last message AUTHORIZATION_VIOLATION back to ee58844c071426276de533cadfafbd3c2448604e59fd81f4758edc07b5beea89476a6252e0a2144d43f14e06bf90c57dd2613562221954e3b2eddc6d2fcd9ef6 because connection OutboundConnectionThread-tcp://192.168.1.200:8800 not in dispatcher
It's so hard to find explanations for this; the only places I can find anything are the original references in the source code, and I'm not going to reverse engineer that anytime soon.
My settings for the validators on startup are:
The usual binds to 0.0.0.0
peering dynamic
scheduler serial
network trust
Any help would be so soooo appreciated!
Many thanks in advance :)
Aaron.
The usual problem behind
Can't send message PING_RESPONSE back to . . . because connection ... not in dispatcher
is misconfigured peer endpoints:
1) If you are using Ubuntu directly instead of Docker, use the Validator's hostname or IP address instead of the default ("validator"), which only works with Docker, or "localhost", which may not be routable
2) If you are using Docker, make sure the Docker ports are mapped to the Ubuntu OS, and that the OS IP address/port is routable between the two machines. Check the expose: and ports: entries in your docker-compose.yaml file or similar file.
3) Verify network connectivity to the remote machine with ping
4) Verify port connectivity with telnet aremotehostname 8800 (replace aremotehostname with the remote peer's hostname or IP address)
5) Check the peer configuration in your /etc/sawtooth/validator.toml files: check the peering and endpoint lines, and the seeds line (for dynamic peering) or peers line (for static peering), as in the sketch after this list
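A minimal validator.toml sketch illustrating those lines (the addresses are placeholders based on the question's log output; adjust them to your hosts):
# /etc/sawtooth/validator.toml
bind = [
  "network:tcp://0.0.0.0:8800",       # the usual bind to all interfaces
  "component:tcp://0.0.0.0:4004"
]
endpoint = "tcp://192.168.1.200:8800" # must be an address peers can route to, never 0.0.0.0
peering = "dynamic"
seeds = ["tcp://192.168.1.201:8800"]  # for static peering, use peers = [...] instead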

Nifi 1.5.0 Cluster configuration

Does anyone know how to cluster NiFi 1.5.0? I want to use dataflow.mydomain.com, but I get this error when I try to hit the load balancer:
"The request contained an invalid host header [dataflow.mydomain.com] in the request [/nifi/]. Check for request manipulation or third-party intercept."
According to one post that I read, the problem was that the value of nifi.web.http.host had to match the value of the URL.
If that's true, I don't understand how a cluster would be possible.
Thanks!
(I'm using a 3-host setup in AWS; the hosts will individually respond if I set nifi.web.http.host to their private IP and access them at http://[ip]/nifi/, but not if I put a load balancer in front of the cluster.)
It is not really an issue of clustering NiFi; it is an issue of accessing it through a load balancer. A cluster does not imply a load balancer.
In the next version of NiFi there will be a new property (nifi.web.proxy.host) where you could put dataflow.mydomain.com and it would be let through.
For now, I think you'd have to strip the Host header off each request at your load balancer so that it doesn't get passed on to the NiFi nodes; that header is what is triggering the rejection. NiFi inspects the headers of the incoming request and sees that the Host header has a value that is not the host of NiFi.
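Once that property is available, the whitelist entry for this setup would presumably look like this in nifi.properties (hostname taken from the question):
nifi.web.proxy.host=dataflow.mydomain.com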

Starting multiple remote servers with Akka

I'm running into some deployment issues using Akka remoting to implement a small search application.
I want to deploy my ActorSystem on a set of local cluster machines to use them as workers, but I'm a bit confused about what to put into my application.conf to make this happen. For example, I can use:
akka.remote {
  transport = "akka.remote.netty.NettyRemoteTransport"
  netty {
    hostname = "0.0.0.0"
    port = 2552
  }
}
Each worker just runs the ActorSystem at startup.
This allows my worker machines to bind to their address when they start up, but then they refuse to listen to messages:
beaker-24: [ERROR] ... dropping message DaemonMsgWatch for non-local recipient akka://SearchService#beaker-24:2552/remote at akka://SearchService#0.0.0.0:2552
The documentation I've found for this so far only discusses deployment on my localhost, which is not so useful :). I'm hoping there is a way to do this without generating a separate configuration for each host.
Update:
Using an empty string as the hostname allows for contacting the host via the normal IP address. Addressing using the hostname itself doesn't work at the moment.
Setting “0.0.0.0” as the host name will currently basically disable remoting, because that is not a legal IP to send to. Background: actor references get the configured IP (or host name) inserted into their address part when they leave the local system, and that is exactly their “pointer home” for other systems to send messages back to.
There has been an effort by Scott which would enable a system to receive replies at a different address, but that is not included yet, and we may well choose a different solution to this problem.
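Combining that with the questioner's update, a per-host workaround sketch (untested; same application.conf on every worker): leave the hostname empty so each machine binds to and advertises its own default address:
akka.remote {
  transport = "akka.remote.netty.NettyRemoteTransport"
  netty {
    hostname = ""   # empty: Akka picks the machine's own address and inserts it into remote actor refs
    port = 2552
  }
}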
