NiFi PutFile processor doesn't save file to a directory - apache-nifi

In my NiFi workflow I need to download a .zip file from a SOAP web server, save it on the machine (optionally), and unpack its content to a sub-folder. Everything works on my local Windows 10 machine, but issues occur when I try to move to a remote Linux server. Here is the part of my flow where the error happens:
So we have a FlowFile entering UpdateAttribute, where the filename attribute is set to the required name with a .zip extension. The file is correct, as can be seen in the queue after starting the processor.
Problems start to happen when I pass the FlowFile to the PutFile processor. I tried different scenarios based on the selected directory:
Relative to the NiFi main folder, ./out:
12:30:01 MSK ERROR
PutFile[id=05788ae5-64e5-32af-bb40-88d50d4c886c] Penalizing StandardFlowFileRecord[uuid=3e0c5e38-76f8-4ce3-b911-90f6901c35a4,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1586337594333-49, container=default, section=49], offset=2335, length=375628606],offset=0,name=,size=375628606] and transferring to failure due to /opt/nifi/nifi-1.11.4/./out: java.nio.file.DirectoryNotEmptyException: /opt/nifi/nifi-1.11.4/./out
12:30:01 MSK ERROR
PutFile[id=05788ae5-64e5-32af-bb40-88d50d4c886c] Unable to remove temporary file /opt/nifi/nifi-1.11.4/./out/. due to /opt/nifi/nifi-1.11.4/./out/.: Invalid argument: java.nio.file.FileSystemException: /opt/nifi/nifi-1.11.4/./out/.: Invalid argument
Full path /opt/nifi/nifi-1.11.4/out/file/:
12:32:45 MSK ERROR
PutFile[id=0171102b-c82d-149d-c9ae-ea4da99b1750] Penalizing StandardFlowFileRecord[uuid=0573803f-8407-46e4-93f0-e52a5fc35a07,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1586337594333-49, container=default, section=49], offset=2335, length=375628606],offset=0,name=,size=375628606] and transferring to failure due to Failed to export StandardFlowFileRecord[uuid=0573803f-8407-46e4-93f0-e52a5fc35a07,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1586337594333-49, container=default, section=49], offset=2335, length=375628606],offset=0,name=,size=375628606] to /opt/nifi/nifi-1.11.4/out/file/. due to java.io.FileNotFoundException: /opt/nifi/nifi-1.11.4/out/file/. (No such file or directory): org.apache.nifi.processor.exception.FlowFileAccessException: Failed to export StandardFlowFileRecord[uuid=0573803f-8407-46e4-93f0-e52a5fc35a07,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1586337594333-49, container=default, section=49], offset=2335, length=375628606],offset=0,name=,size=375628606] to /opt/nifi/nifi-1.11.4/out/file/. due to java.io.FileNotFoundException: /opt/nifi/nifi-1.11.4/out/file/. (No such file or directory)
So it appends a dot ('.') to the path, which causes the exception. All folders are created and permissions are granted. I tried to run a simple test flow with a 42 B file and the same path (GenerateFlowFile -> PutFile) and everything is OK.
What am I doing wrong?

The problem was with Linux filesystem permissions: the flow was executed by the 'nifi' user, while a different user owned the target directories on the Linux filesystem.
Assigning rwxrwxrwx to the directories in use solved the issue.
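For illustration, a minimal sketch of the commands involved, assuming the output directory is /opt/nifi/nifi-1.11.4/out and the NiFi service runs as the nifi user (adjust names and paths to your setup):
# open the directory up completely, as described above
sudo chmod -R 777 /opt/nifi/nifi-1.11.4/out
# or, a tighter alternative: hand ownership to the service user instead
sudo chown -R nifi:nifi /opt/nifi/nifi-1.11.4/out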

Related

Reg: database is not starting up, getting an error

Getting the below error while starting the database:
startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/mis/PARAMETERFILE/spfile.276.967375255'
ORA-17503: ksfdopn:10 Failed to open file +DATA/mis/PARAMETERFILE/spfile.276.967375255
ORA-04031: unable to allocate 56 bytes of shared memory ("shared pool","unknown object","KKSSP^24","kglseshtSegs")
Your database cannot find the SPFILE (the newer replacement for init.ora) within ASM that holds the actual system parameters, or it has no permission to access it.
Either your Grid Infrastructure stack or the dbs/spfile.ora pointer file is pointing to the wrong file.
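As an illustration (a sketch only; the file name and value below are placeholders), the pointer file under $ORACLE_HOME/dbs, typically init<SID>.ora, usually contains a single line referencing the real SPFILE in ASM:
SPFILE='+DATA/<dbname>/PARAMETERFILE/spfile.269.1066152225'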
To find out what the Grid Infrastructure stack is using, run srvctl, which should display the parameter file name the database should be using:
srvctl config database -d <dbname>
...
Spfile: +DATA/<dbname>/PARAMETERFILE/spfile.269.1066152225
...
Then check (as the grid user, using asmcmd) whether the file is indeed there:
asmcmd
ASMCMD> ls +DATA/<dbname>/PARAMETERFILE/
spfile.269.1066152225
If the name is different, then you have found the issue, and you have to point the database to the correct file, e.g. as sketched below.
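A sketch of re-pointing the database, using the file name found with asmcmd (syntax shown with the -d/-p options used above; recent releases spell these -db and -spfile):
srvctl modify database -d <dbname> -p +DATA/<dbname>/PARAMETERFILE/spfile.269.1066152225
srvctl config database -d <dbname>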
If the name is correct, then the cause could be wrong permissions on the Oracle executable(s) (check My Oracle Support):
RAC Database Can't Start: ORA-01565, ORA-17503: ksfdopn:10 Failed to open file +DATA/BPBL/spfileBPBL.ora (Doc ID 2316088.1)

Azure Data Factory - How to filter out specific files in multiple .zip files?

I have set up an ADF pipeline that gets a set of .zip files from Azure Storage and iterates through each zip file's folders and files to land them in an output container with preserved hierarchy.
(Screenshots of the Get Metadata and For Each activities omitted.)
Issue:
The issue is that a specific .PDF file (ASC_NTS.pdf) with the same name is embedded within each .zip file:
It is causing this error when trying to run the pipeline:
Error
Operation on target ForEach1 failed: Activity failed because an inner activity failed; Inner activity name: Copy data1, Error: ErrorCode=AdlsGen2OperationFailedConcurrentWrite,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error occurred when trying to upload a file. It's possible because you have multiple concurrent copy activities runs writing to the same file 'FAERS_output/ascii/ASC_NTS.pdf'. Check your ADF configuration.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ADLS Gen2 operation failed for: Operation returned an invalid status code 'PreconditionFailed'. Account: 'asastgssuaefdbhdg2dbc4'. FileSystem: 'curated'. Path: 'FAERS_output/ascii/ASC_NTS.pdf'. ErrorCode: 'LeaseIdMissing'. Message: 'There is currently a lease on the resource and no lease ID was specified in the request.'. RequestId: 'b21022a6-b01f-0031-641a-453ab6000000'. TimeStamp: 'Thu, 31 Mar 2022 16:15:56 GMT'..,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.Azure.Storage.Data.Models.ErrorSchemaException,Message=Operation returned an invalid status code 'PreconditionFailed',Source=Microsoft.DataTransfer.ClientLibrary,'
Is there a workaround for this pipeline setup that allows me to filter within the For Each loop? I just need the .TXT files; the .PDF files can be discarded.
This was the closest reference I could find, but it does not address my use case:
Filter out file using wildcard path azure data factory
Have you tried using an If Condition activity? You can set the expression to check for the correct file extension.
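For example, a sketch assuming the For Each iterates over the child items returned by Get Metadata, so that each item exposes a name property; the If Condition expression could be:
@endswith(toLower(item().name), '.txt')
Placing the existing Copy activity inside the True branch then copies only the .TXT files and skips the duplicate ASC_NTS.pdf, which avoids the concurrent writes to the same output path.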

Oracle Data Integrator SQL to HDFS IKM returns error

I am using ODI (12.1.3.0.0). I created a topology for the Oracle DB, which is OK, and I created a topology for HDFS using the File technology, which is where I think the problem is.
In the Data Server for HDFS, I left the JDBC driver empty and filled the JDBC URL with hdfs://remotehostname:port.
In the Physical Schema for HDFS, I filled both Schema and Work Schema with /my/path.
Then I created a Logical Schema and a Model. After that I created a Datastore under the model with these definitions:
Name: TestName
Resource Name: TESTFILE.txt
File Format: Fixed
After all this, I created a project and a mapping under the project.
Finally, when I run the mapping, I see these errors:
ODI-1217: Session Oracle2HDFSMapping_Physical_SESS (15) fails with return code ODI-1298.
ODI-1226: Step Physical_STEP fails after 1 attempt(s).
ODI-1240: Flow Physical_STEP fails while performing a Add execute to Sqoop script-IKM SQL to HDFS File (Sqoop)- operation. This flow loads target table null.
ODI-1298: Serial task "SERIAL-MAP_MAIN- (10)" failed because child task "SERIAL-EU-GGUSER_UNIT (20)" is in error.
ODI-1298: Serial task "SERIAL-EU-GGUSER_UNIT (20)" failed because child task "Add execute to Sqoop script-IKM SQL to HDFS File (Sqoop)- (40)" is in error.
Caused By: java.io.IOException: Cannot run program "chmod": CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at java.lang.Runtime.exec(Runtime.java:617)
at java.lang.Runtime.exec(Runtime.java:450)
at java.lang.Runtime.exec(Runtime.java:347)
at oracle.odi.runtime.agent.execution.cmd.OSCommandExecutor.execute(OSCommandExecutor.java:54)
at oracle.odi.runtime.agent.execution.cmd.OSCommandExecutor.execute(OSCommandExecutor.java:29)
at oracle.odi.runtime.agent.execution.TaskExecutionHandler.handleTask(TaskExecutionHandler.java:52)
at oracle.odi.runtime.agent.execution.SessionTask.processTask(SessionTask.java:203)
at oracle.odi.runtime.agent.execution.SessionTask.doExecuteTask(SessionTask.java:114)
at oracle.odi.runtime.agent.execution.AbstractSessionTask.execute(AbstractSessionTask.java:886)
at oracle.odi.runtime.agent.execution.SessionExecutor$SerialTrain.runTasks(SessionExecutor.java:2198)
at oracle.odi.runtime.agent.execution.SessionExecutor.executeSession(SessionExecutor.java:591)
at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor$1.doAction(TaskExecutorAgentRequestProcessor.java:718)
at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor$1.doAction(TaskExecutorAgentRequestProcessor.java:611)
at oracle.odi.core.persistence.dwgobject.DwgObjectTemplate.execute(DwgObjectTemplate.java:203)
at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor.doProcessStartAgentTask(TaskExecutorAgentRequestProcessor.java:800)
at oracle.odi.runtime.agent.processor.impl.StartSessRequestProcessor.access$1400(StartSessRequestProcessor.java:74)
at oracle.odi.runtime.agent.processor.impl.StartSessRequestProcessor$StartSessTask.doExecute(StartSessRequestProcessor.java:702)
at oracle.odi.runtime.agent.processor.task.AgentTask.execute(AgentTask.java:180)
at oracle.odi.runtime.agent.support.DefaultAgentTaskExecutor$2.run(DefaultAgentTaskExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessImpl.create(Native Method)
at java.lang.ProcessImpl.<init>(ProcessImpl.java:385)
at java.lang.ProcessImpl.start(ProcessImpl.java:136)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
... 20 more
I wonder where I went wrong?
For a file Datastore, you need to define the attributes (columns) by opening the Datastore and going to the Attributes tab. If the file already exists, you can reverse-engineer the attributes, rename them, and change the datatypes if needed.
The error message you received for the second task mentions that the file (generated in the first task) does not exist. So there might be a problem with the first task, probably due to the missing attributes in your Datastore.
Here is a detailed article about the IKM SQL to HDFS File (Sqoop) written by the ODI A-Team: http://www.ateam-oracle.com/importing-data-from-sql-databases-into-hadoop-with-sqoop-and-oracle-data-integrator-odi/

Specify CLI path for mTurk for Mac

I am setting up the CLI on a Mac (OS X 10.9) and I believe I've set up the MTURK_CMD_HOME and JAVA_HOME paths correctly.
But I'm still getting an error that a file can't be found when I run getBalance.sh. My code is as follows:
/users/USER/Desktop/aws-mturk-clt-1.3.1/
-bash: /users/USER/Desktop/aws-mturk-clt-1.3.1/: is a directory
/System/Library/Frameworks/JavaVM.framework/Home
-bash: /System/Library/Frameworks/JavaVM.framework/Home: is a directory
export MTURK_CMD_HOME=/users/USER/Desktop/aws-mturk-clt-1.3.1/
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Home
export PATH=$PATH:/users/USER/Desktop/aws-mturk-clt-1.3.1/bin
/users/USER/Desktop/aws-mturk-clt-1.3.1/bin/getBalance.sh
Returns the following error:
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: ../log/aws-mturk-clt.log (No such file or directory)
at java.io.FileOutputStream.openAppend(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:192)
at java.io.FileOutputStream.<init>(FileOutputStream.java:116)
at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:194)
at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:97)
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:689)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:647)
at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:544)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:440)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:476)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:471)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:125)
at org.apache.log4j.Logger.getLogger(Logger.java:118)
at com.amazonaws.mturk.cmd.AbstractCmd.<clinit>(AbstractCmd.java:51)
There was a problem reading your properties file from mturk.properties
The exception was java.io.FileNotFoundException: mturk.properties (No such file or directory)
Exception in thread "main" java.lang.RuntimeException: Cannot load configuration properties file from mturk.properties
at com.amazonaws.mturk.util.PropertiesClientConfig.<init>(PropertiesClientConfig.java:99)
at com.amazonaws.mturk.util.PropertiesClientConfig.<init>(PropertiesClientConfig.java:72)
at com.amazonaws.mturk.cmd.AbstractCmd.<init>(AbstractCmd.java:61)
at com.amazonaws.mturk.cmd.GetBalance.<init>(GetBalance.java:24)
at com.amazonaws.mturk.cmd.GetBalance.main(GetBalance.java:27)
Caused by: java.io.FileNotFoundException: mturk.properties (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at com.amazonaws.mturk.util.PropertiesClientConfig.<init> (PropertiesClientConfig.java:95)
... 4 more
It seems the directory is specified correctly (and bin contains getBalance.sh). I've double checked that my access keys are correct and the file path is correct. What do I do?
This works for me (without needing to change any scripts). I did set the default access key and secret in the mturk.properties file and changed the URL to the sandbox https address.
$ cd /Users/me/Downloads/aws-mturk-clt-1.3.1/bin
$ ./getBalance.sh
An error occurred while fetching your balance: Error #1 for RequestId: 75edd876-61eb-4525-8c5a-5c984e1e31f3 - AWS.NotAuthorized: The identity contained in the request is not authorized to use this AWSAccessKeyId (1424124881922 s)
com.amazonaws.mturk.service.exception.AccessKeyException: Error #1 for RequestId: 75edd876-61eb-4525-8c5a-5c984e1e31f3 - AWS.NotAuthorized: The identity contained in the request is not authorized to use this AWSAccessKeyId (1424124881922 s)
at com.amazonaws.mturk.filter.ErrorProcessingFilter.processErrors(ErrorProcessingFilter.java:91)
at com.amazonaws.mturk.filter.ErrorProcessingFilter.execute(ErrorProcessingFilter.java:48)
at com.amazonaws.mturk.filter.Filter.passMessage(Filter.java:56)
at com.amazonaws.mturk.filter.RetryFilter.execute(RetryFilter.java:115)
at com.amazonaws.mturk.filter.Filter.passMessage(Filter.java:56)
at com.amazonaws.mturk.util.CLTExceptionFilter.sendMessage(CLTExceptionFilter.java:77)
at com.amazonaws.mturk.util.CLTExceptionFilter.execute(CLTExceptionFilter.java:62)
at com.amazonaws.mturk.service.axis.FilteredAWSService.executeRequests(FilteredAWSService.java:172)
at com.amazonaws.mturk.service.axis.FilteredAWSService.executeRequest(FilteredAWSService.java:152)
at com.amazonaws.mturk.service.axis.FilteredAWSService.executeRequest(FilteredAWSService.java:116)
at com.amazonaws.mturk.service.axis.RequesterServiceRaw.getAccountBalance(RequesterServiceRaw.java:1193)
at com.amazonaws.mturk.service.axis.RequesterService.getAccountBalance(RequesterService.java:922)
at com.amazonaws.mturk.cmd.GetBalance.getBalance(GetBalance.java:50)
at com.amazonaws.mturk.cmd.GetBalance.runCommand(GetBalance.java:41)
at com.amazonaws.mturk.cmd.AbstractCmd.run(AbstractCmd.java:148)
at com.amazonaws.mturk.cmd.GetBalance.main(GetBalance.java:28)
Have you tried running it from the bin directory? Actually cd into the directory rather than invoking the script via its fully qualified path.
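For example, using the paths from the question (the scripts resolve mturk.properties and ../log relative to the current working directory, which is why both FileNotFoundExceptions above show relative paths):
cd /users/USER/Desktop/aws-mturk-clt-1.3.1/bin
./getBalance.sh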

Error while working with Informatica Design SDK for creating mapping

I have found the mapping SDK code samples, but without any documentation whatsoever.
Currently working with 9.0, I am looking for more info.
For one of the current issues: when I try to save the mapping to the repository (with pcconfig.properties lying in the same folder where the XML file is being generated), I am getting the following error:
Written the file..
Caught an exception in run() method
java.io.IOException: Cannot run program ""C:\Informatica\pmrep"" (in directory "C:\Informatica"): CreateProcess error=2, The system cannot find the file specified
java.io.IOException: Cannot run program ""C:\Informatica\pmrep"" (in directory "C:\Informatica"): CreateProcess error=2, The system cannot find the file specified
com.informatica.powercenter.sdk.mapfwk.exception.MapFwkOutputException: Error saving to repository : Failed to connect to repository
at com.informatica.powercenter.sdk.mapfwk.xml.XMLWriter.save(Unknown Source)
at com.informatica.powercenter.sdk.mapfwk.repository.Repository.save(Unknown Source)
at TestRaghavExample.generateOutput(TestRaghavExample.java:259)
at TestRaghavExample.create(TestRaghavExample.java:64)
at TestRaghavExample.main(TestRaghavExample.java:272)
Caught an exception in run() method
This is my initial example; I am still trying to find my way through the API.
The path to the pmrep utility doesn't look correct. It is normally found at the path C:\Informatica\<version>\server\bin on Windows.
Check that the value of PC_CLIENT_INSTALL_PATH within pcconfig.properties is correct.
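For illustration only (the version folder below is a placeholder; point the property at whichever directory actually contains pmrep on your machine):
# in pcconfig.properties
PC_CLIENT_INSTALL_PATH=C:\Informatica\9.0\server\bin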
