I want to upload a CSV file to SharePoint via NiFi. Is it possible? I can't find a processor that can ingest a CSV file into SharePoint via NiFi. Thank you for your help.
I tried searching for a PutSharePointList or PutSharePointOnline processor in https://repo1.maven.org/maven2/org/apache/nifi/ but I can't find one.
I am writing a NiFi workflow and am stuck in a scenario where I have to fetch a few files with different formats, but all of the files should be routed to the next processor only once the JSON file has arrived.
Thanks
I have multiple folders in an S3 bucket, and each folder contains one JSON Lines file.
I want to do two things with this data:
1. Apply some transformations to get tabular data and save it to some database.
2. Save these JSON objects as-is to an Elasticsearch cluster for full-text search.
I am using AWS Glue for this task and I know how to do 1, but I can't find any resources that talk about getting data from S3 and storing it in Elasticsearch using AWS Glue.
Is there a way to do this?
If anyone is looking for an answer to this: I used Logstash to load the files into Elasticsearch.
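For anyone taking the same route, here is a minimal sketch of such a Logstash pipeline, assuming the logstash-input-s3 plugin is installed; the bucket, region, host, and index names are placeholders, and credentials/IAM setup is omitted.

```
input {
  s3 {
    bucket => "my-bucket"            # placeholder: bucket containing the JSON Lines files
    region => "us-east-1"            # placeholder region
    codec  => "json_lines"           # one JSON document per line
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]   # placeholder Elasticsearch endpoint
    index => "my-json-docs"              # placeholder index name
  }
}
```

The json_lines codec matches the one-object-per-line layout of the files, so each line becomes its own document in the index.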
I'm a newbie to NiFi.
I'm trying to get data from a database and put it into Hadoop.
It seems I succeeded in connecting to Hadoop from NiFi using the PutHDFS processor.
After running PutHDFS, a file is created successfully.
But the problem is that the file is empty; it has no contents.
I tried getting a file from the local NiFi server using GetFile, but the result is the same, so the problem isn't the source.
I have no idea why NiFi fails to write the contents into the Hadoop file. No error occurred, either.
Please help me.
I have a requirement to read a huge CSV file from a Kafka topic into Cassandra. I configured Apache NiFi to achieve this.
Flow:
The user does not have control over the NiFi setup. He only specifies the URL where the CSV is located. The web application writes the URL to a Kafka topic. NiFi fetches the file and inserts the rows into Cassandra.
How will I know that NiFi has inserted all the rows from the CSV file into Cassandra? I need to let the user know that the insert is done.
Any help would be appreciated.
I found the solution.
Using the MergeContent processor, all FlowFiles with the same value for "fragment.identifier" will be grouped together. Once MergeContent has defragmented them, we can notify the user.
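For reference, here is a sketch of the MergeContent settings this relies on, assuming the CSV is split upstream by a processor such as SplitText or SplitRecord that writes the fragment.* attributes; the Max Bin Age value is just an illustrative safeguard, not something from the original flow.

```
Merge Strategy     = Defragment                   # groups FlowFiles by fragment.identifier
Merge Format       = Binary Concatenation
Attribute Strategy = Keep Only Common Attributes
Max Bin Age        = 10 min                       # flush incomplete bins eventually (illustrative)
```

Once a group is complete, the merged relationship emits a single FlowFile, which can then be routed to a notification step (for example InvokeHTTP or PutEmail) to tell the user the insert is done.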
I have installed Hadoop and Hive. I can process and query xls and tsv files using Hive. I want to process other files such as docx, pdf, and ppt. How can I do this? Is there a separate procedure to process these files in AWS? Please help me.
There isn't any difference in consuming those files compared with any other Hadoop platform. For easy access and durable storage, you may put those files in S3.
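As a concrete illustration (the commands are standard AWS CLI and Hadoop shell, but the bucket and paths are placeholders), copying the documents into S3 and then listing them from a Hadoop/EMR cluster could look like this:

```
# Copy the office documents to S3 (placeholder bucket and paths)
aws s3 cp ./documents/ s3://my-doc-bucket/raw/ --recursive

# On an EMR/Hadoop cluster with S3 support, the same objects are visible
# through the Hadoop filesystem layer
hadoop fs -ls s3://my-doc-bucket/raw/
```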