Apache NiFi GetTwitter processor giving error 403 Forbidden - apache-nifi

I just started using Apache NiFi to create a flow with the GetTwitter processor, and when I run the processor it keeps giving me error 403 Forbidden, even after I entered the correct keys and secrets. I tried watching multiple videos, but none of them seem to help. I generated new key values and retried, but it still gives me the same error.

I would verify that the Twitter API works with your keys outside of NiFi first, then work with NiFi. If the Twitter API does work outside of NiFi, then it's a NiFi issue. If this works, then please feel free to accept this as the answer.
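One way to run that check from Java, as a minimal sketch: the snippet below uses the Twitter4J library (my choice for illustration; any OAuth-capable HTTP client would do) to call verify_credentials with the same four values entered in the GetTwitter processor. A 403 from this program means the keys or the Twitter app are at fault, not NiFi.

import twitter4j.Twitter;
import twitter4j.TwitterException;
import twitter4j.TwitterFactory;
import twitter4j.conf.ConfigurationBuilder;

public class CredentialCheck {
    public static void main(String[] args) {
        // Use exactly the same values configured on the GetTwitter processor
        ConfigurationBuilder cb = new ConfigurationBuilder()
                .setOAuthConsumerKey("YOUR_CONSUMER_KEY")
                .setOAuthConsumerSecret("YOUR_CONSUMER_SECRET")
                .setOAuthAccessToken("YOUR_ACCESS_TOKEN")
                .setOAuthAccessTokenSecret("YOUR_ACCESS_TOKEN_SECRET");
        Twitter twitter = new TwitterFactory(cb.build()).getInstance();
        try {
            // verify_credentials returns the authenticated account when the keys are valid
            System.out.println("Authenticated as: " + twitter.verifyCredentials().getScreenName());
        } catch (TwitterException e) {
            // A 403 here points at the keys/app, not at NiFi
            System.err.println("Twitter API error " + e.getStatusCode() + ": " + e.getErrorMessage());
        }
    }
}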

Related

What is the purpose of OpenDistro/Elasticsearch cluster permission cluster:monitor/main?

On the Permissions page of the OpenDistro documentation, the cluster permission cluster:monitor/main is mentioned. However, I've been unable to find any documentation or information regarding what this permission actually gives access to.
The minimum amount of info I've been able to find is that it gives access to the root endpoint of the cluster endpoint, and that the endpoint can display things like cluster version and other general stats. At least according to the following post: Discuss Elasticsearch.
The reason that I'm interested in that specific permission is that I'm experiencing issues when using Serilog to log to multiple nodes in a cluster using C#. The application that logs using Serilog receives Unauthorized exceptions, with the Elasticsearch cluster displaying the message: "No permissions for [cluster:monitor/main]". Granting the internal Elasticsearch user that is used with Serilog the cluster:monitor/main permission fixes the issue. But I don't know what the permission opens up, or why the user doing the logging actually even needs that permission to begin with.
So the question boils down to:
What does the permission cluster:monitor/main do, and why is it required for the user doing the logging when using a multi-node cluster with Serilog?
You are right when saying "it gives access to the root endpoint of the cluster endpoint, and that the endpoint can display things like cluster version and other general stats".
To further clarify what the action is performing, look at the logic for this action:
https://github.com/elastic/elasticsearch/blob/7.9/server/src/main/java/org/elasticsearch/action/main/TransportMainAction.java#L49
It is merely getting the cluster state.
ClusterState clusterState = clusterService.state();
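You can reproduce the check yourself with a small sketch like the one below (host, port, and credentials are placeholders; use the internal user configured for Serilog). If the user lacks cluster:monitor/main, the call to the root endpoint fails with the same "no permissions" message. This also suggests why the logging user needs the permission: the Elasticsearch sink typically calls the root endpoint on startup, e.g. to detect the server version.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class RootEndpointCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials -- substitute the internal user used by Serilog
        String auth = Base64.getEncoder()
                .encodeToString("serilog-user:password".getBytes());
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/")) // use https if your cluster enforces TLS
                .header("Authorization", "Basic " + auth)
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // 200 -> the user holds cluster:monitor/main
        // 403 -> "no permissions for [cluster:monitor/main]"
        System.out.println(response.statusCode() + "\n" + response.body());
    }
}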

How to send updated data from a Java program to NiFi?

I have micro-services running, and when a web user updates data in the DB through the micro-service endpoints, I want to send the updated data to NiFi as well. This data contains an updated list of names, deleted names, edited names, etc. How do I do it? Which processor should I use on the NiFi side?
I am new to NiFi and have yet to try anything myself. I am reading documentation found through Google to guide me.
No source code is written yet. I want to get started, and I will share the code here once I write it.
The expected result is that NiFi gets the updated list of names and refers to that list when generating the required alerts/triggers, etc.
You can actually do it in lots of ways: MQ, Kafka, or HTTP (using ListenHTTP). Just deploy whichever one is relevant to you and configure it; you can even listen to a directory (using ListFile & FetchFile).
You can connect NiFi to pretty much anything, so just choose how you want to connect your micro-services to NiFi. The HTTP route is sketched below.
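A minimal sketch of the micro-service side for the HTTP option, assuming a ListenHTTP processor configured with listening port 8081 and the default base path contentListener (both are assumptions; match whatever you configure in NiFi):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class NiFiNotifier {
    // Assumed endpoint: ListenHTTP on port 8081 with base path "contentListener"
    private static final String NIFI_URL = "http://nifi-host:8081/contentListener";

    public static void notifyNiFi(String updatedNamesJson) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(NIFI_URL))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(updatedNamesJson))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // ListenHTTP answers 200 when the payload is accepted as a FlowFile
        System.out.println("NiFi responded: " + response.statusCode());
    }

    public static void main(String[] args) throws Exception {
        // Call this from your micro-service after each DB update
        notifyNiFi("{\"updated\":[\"alice\"],\"deleted\":[\"bob\"],\"edited\":[]}");
    }
}

On the NiFi side, ListenHTTP emits each POST body as a FlowFile, which you can then route to whatever generates your alerts/triggers.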

CDH cluster installation stuck at download

I am trying to set up a cluster on 3 nodes on a cloud server with Cloudera Manager. But at the cluster installation step, it gets stuck at 64%. Please guide me on how to proceed and where to find the logs for this.
[Image of the installation screen]
Some cloud companies have policies whereby, if lots of data requests are coming in, they remove the IP from public hosting for some time. This is done to prevent DDoS attacks.
A solution can be to ask them to raise the data transfer limit.

Spring XD stream using HDFS-Dataset to save Avro data unable to renew Kerberos ticket

I have created a Spring XD stream: source (JMS queue) -> transform (custom Java processor, XML to Avro) -> sink (HDFS-Dataset).
The stream works perfectly fine, but after 24 hours, since it is a continuous connection, it is unable to renew the Kerberos authentication ticket and stops writing to HDFS. We restart the container where the stream is deployed, but we still face problems and lose messages, as they are not even sent to the Redis error queue.
I need help with:
1. Can we renew the Kerberos ticket for the stream? Do I need to update the sink code and create a custom sink?
2. I can't find any sink in the Spring XD documentation that is similar to HDFS-Dataset but writes to the local file system, where I wouldn't need to go through Kerberos authentication.
Appreciate your help here.
Thanks,
This is a well-known problem in Spring XD that is not documented :). Something pretty similar happens to batch jobs that are deployed for a long time and run later. Why? Because the hadoopConfiguration object forces singleton scope and is instantiated once, when you deploy your stream/job in Spring XD. In our case we created a listener for the Spring Batch jobs that renews the ticket before each job execution. You could do something similar in your streams; take this as a guide:
https://github.com/spring-projects/spring-hadoop/blob/master/spring-hadoop-core/src/main/java/org/springframework/data/hadoop/configuration/ConfigurationFactoryBean.java
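A sketch of the listener approach described above, applied to a Spring Batch job. It assumes Hadoop's UserGroupInformation is on the classpath; the principal and keytab path are placeholders.

import org.apache.hadoop.security.UserGroupInformation;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;

// Re-login from the keytab before each job execution so the TGT is always fresh.
public class KerberosRenewalListener implements JobExecutionListener {

    // Placeholders -- substitute your own principal and keytab
    private static final String PRINCIPAL = "xd-user@EXAMPLE.COM";
    private static final String KEYTAB = "/etc/security/keytabs/xd-user.keytab";

    @Override
    public void beforeJob(JobExecution jobExecution) {
        try {
            // Re-authenticate against the KDC before the job touches HDFS
            UserGroupInformation.loginUserFromKeytab(PRINCIPAL, KEYTAB);
        } catch (java.io.IOException e) {
            throw new IllegalStateException("Kerberos re-login failed", e);
        }
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // Nothing to do after the job
    }
}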
I hope it helps.

Getting started with Fabric8 and AWS using Stackpoint

I have historically used a lot of manual chaining to get a CI pipeline in place for microservice development, so I am excited to try Fabric8, as it seems it will make life a lot easier. I am running into some early issues, though.
I did manage to get Fabric8 running locally, but I want to get things running on AWS so I can present a more real-world flow to stakeholders. Following the notes on this page, Fabric8 on AWS, I was able to get a 3-server cluster running using Stackpoint. But I cannot connect to that cluster to start administering the services. The page references this link (http://fabric8.default.replace.me.io), but it is not working for me. I tried hitting each of the AWS instances by public IP, but that failed as well. What would be my next steps here?
Yeah, the getting started guides don't really explain this in great detail. There's a similar issue on the fabric8 issue tracker where we've tried to explain how to access the console.
TL;DR: using the AWS load balancer can add expense, so we deploy an NGINX reverse proxy so you can set up a wildcard DNS. We use and recommend Cloudflare for that, as it's free for this type of use and fast to set up.
We also wrote a blog post explaining the different options for accessing apps on Kubernetes.
Hope that helps!
