I have a problem with NiFi: at each step of my flow I get this error:
11:44:07 CEST ERROR
Failed to index Provenance Events. See logs for more information.
I looked in nifi-app.log; however, there is no error there, and my NiFi flow is working well.
I have no error on the processors, but the error is displayed at the top right.
The nifi.properties looks good.
I am trying to execute my Databricks notebook from ADF, with the linked service configured as an instance pool type of connection. I have also uploaded the wheel-format (.whl) library via the Append libraries option in ADF, but I am unable to execute our notebook via ADF and am getting the error below.
Run result unavailable: job failed with error message Library installation failed for library due to user error for whl:
"dbfs:/FileStore/jars/xxxxxxxxxxxxxxxxxxxx/prophet-1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
. Error messages: Library installation attempted on the driver node of
cluster 1129-161441-xwjfzl6k and failed. Please refer to the following
error message to fix the library or contact Databricks support. Error
Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message:
org.apache.spark.SparkException: Process List(bash,
/local_disk0/.ephemeral_nfs/cluster_libraries/python/python_start_clusterwide.sh,
/local_disk0/.ephemeral_nfs/cluster_libraries/python/bin/pip, install,
--upgrade, --find-links=/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages,
/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/prophet-1.1-cp38-cp38-manylinux_2_17_x86_6
... *WARNING: message truncated. Skipped 195 bytes of output
Kindly help us. Also, in the linked service there are three options under Select cluster:
1. New job cluster
2. Existing interactive cluster
3. Existing instance pool
Which of these is best from a production perspective? We do not have any job created in Databricks; the plan is for the notebook to be triggered from ADF and execute successfully. Please advise.
Make sure you install the wheel onto the interactive cluster (option 2). This has nothing to do with Azure Data Factory; the library installation is failing on the Databricks side.
Installing local .whl files on Databricks cluster
See the above article for details.
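For example, once the wheel is on DBFS you can install it onto the interactive cluster directly from a notebook cell. A minimal sketch, assuming a Databricks Runtime recent enough to support the %pip magic; the path is the masked one from your error message:

# Databricks notebook cell: installs the wheel into this cluster's
# Python environment via the DBFS FUSE mount (/dbfs/...).
%pip install /dbfs/FileStore/jars/xxxxxxxxxxxxxxxxxxxx/prophet-1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

Note also that the wheel is tagged cp38 (CPython 3.8) while the pip paths in your error mention python3.9; pip refuses wheels built for a different Python minor version, so check that the wheel tag matches the cluster runtime's Python.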
Karthik, from the error it is complaining about the library. This is what I would do:
Cross-check and make sure that ADF is pointing at the correct cluster.
If the cluster is correct, go to the cluster, open the notebook you are trying to reference from ADF, and try to execute it there.
If the notebook works fine, stop the cluster, restart it, and run the notebook again.
My guess is that once the cluster goes into idle mode and shuts down, when ADF starts the cluster again it is not able to find the library it needs.
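If the library is indeed disappearing after an automatic restart, one option is to attach the wheel as a cluster library, which Databricks reinstalls on every cluster start. A minimal sketch against the Databricks Libraries REST API; the workspace URL and token are placeholders, and the cluster ID is the one from the error message:

import requests

# Placeholders: substitute your workspace URL and a personal access token.
HOST = "https://<your-workspace>.azuredatabricks.net"
TOKEN = "<personal-access-token>"

# Attach the wheel as a cluster library; cluster libraries are
# reinstalled automatically whenever the cluster restarts.
resp = requests.post(
    f"{HOST}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "cluster_id": "1129-161441-xwjfzl6k",
        "libraries": [{"whl": "dbfs:/FileStore/jars/xxxxxxxxxxxxxxxxxxxx/prophet-1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"}],
    },
)
resp.raise_for_status()

The same attachment can be made in the cluster UI under Libraries > Install new.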
I just started using Apache NiFi to create a flow with the GetTwitter processor, and when I run the processor it keeps giving me a 403 Forbidden error, even after I supplied the correct keys and secrets. I tried watching multiple videos, but none of them seemed to help. I generated new key values and retried, but it still gives me the same error.
I would first verify the Twitter API credentials outside of NiFi, then work on the NiFi side. If the Twitter API works outside of NiFi, then it's a NiFi issue. If this works, then please feel free to accept this as the answer.
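For example, a minimal sketch for testing the same keys and secrets outside NiFi, assuming the requests-oauthlib package is installed; all four credential strings are placeholders:

import requests
from requests_oauthlib import OAuth1  # pip install requests-oauthlib

# Placeholders: the same four credentials configured in GetTwitter.
auth = OAuth1(
    "<consumer-key>",
    "<consumer-secret>",
    "<access-token>",
    "<access-token-secret>",
)

# verify_credentials returns 200 when the keys/secrets are valid.
r = requests.get(
    "https://api.twitter.com/1.1/account/verify_credentials.json",
    auth=auth,
)
print(r.status_code, r.text[:200])

A 401 or 403 from this call means the problem is the credentials or the app's access level on the Twitter side, not the NiFi flow.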
I have run the Elasticsearch service for quite a long time, but suddenly encountered the following:
Caused by: org.elasticsearch.index.translog.TranslogCorruptedException: translog from source [d:\elasticsearch-7.1.0\data\nodes\0\indices\A2CcAAE-R3KkQh6jSoaEUA\2\translog\translog-1.tlog] is corrupted, expected shard UUID [.......] but got: [...........] this translog file belongs to a different translog.
I executed GET /_cat/shards?v and most of the indices are in UNASSIGNED state.
Please help!
I went through the log files and saw the error message "Failed to update shard information for ClusterInfoUpdateJob within 15s timeout". Could this error message cause most of the shards to turn UNASSIGNED?
You can try to recover using the elasticsearch-translog tool as explained in the documentation (on Elasticsearch 7.x this functionality is provided by the elasticsearch-shard tool).
Elasticsearch should be stopped while running this tool.
If you don't have a replica from which the data can be recovered, you may lose some data by using the tool.
The reason mentioned in the message is either a drive error or a user error.
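Before truncating anything with the tool, it can also help to ask the cluster why the shards are unassigned. A minimal sketch, assuming a node reachable on the default local port:

import requests

# Placeholder: adjust the host/port for your node.
ES = "http://localhost:9200"

# With no request body, the allocation-explain API describes an
# arbitrary unassigned shard and the reason it cannot be allocated.
r = requests.get(f"{ES}/_cluster/allocation/explain")
print(r.json())

The explanation usually confirms whether the translog corruption is what is blocking allocation before you resort to the destructive recovery tool.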
I am running a Spark job via PySpark, which consistently returns an error:
Diagnostics: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-908041201-10.122.103.38-1485808002236:blk_1073741831_1007 file=/hdp/apps/2.5.3.0-37/spark/spark-hdp-assembly.jar
The error is always on the same block, namely BP-908041201-10.122.103.38-1485808002236:blk_1073741831_1007.
When I look at the Hadoop tracking URL, the message reads:
Application application_1505726128034_2371 failed 2 times due to AM Container
for appattempt_1505726128034_2371_000002 exited with exitCode: -1000
I can only assume from this that there is some corrupted data. How can I view the data/block via the Hadoop command line and see exactly which data is on this potentially corrupted block?
Unfortunately there don't appear to be more detailed logs for the specific failing nodes when looking in the web-based logs.
Also, is there a way in PySpark to ignore any 'corrupted' blocks and simply skip any files/blocks it cannot fully read?
Thanks
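On the first question, hdfs fsck /hdp/apps/2.5.3.0-37/spark/spark-hdp-assembly.jar -files -blocks -locations will list that file's blocks and the datanodes that should hold them; note the missing block belongs to the Spark assembly jar that YARN localizes for the AM container, so the job cannot start until that file is repaired or re-uploaded. On the second question, Spark does expose a switch to skip unreadable files during DataFrame reads; a minimal sketch, assuming Spark 2.1+ (where the setting exists) and a placeholder input path:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("skip-corrupt-files").getOrCreate()

# Skip files that throw read errors instead of failing the whole job
# (applies to file-based data sources; available from Spark 2.1).
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")

# Placeholder path: corrupt files are logged and skipped.
df = spark.read.parquet("/path/to/input")
print(df.count())

This only skips corrupt application data, though; it cannot help when the unreadable block is part of the runtime itself, as in your traceback.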
I am new to Kylo.
I manually deployed Kylo on a test cluster of Hortonworks HDP 2.5 and reused the NiFi instance I had before Kylo.
I made a sample feed by following the ingest tutorial (User Signups) and it was successful.
However, when I drop the sample data file in /var/dropzone/, the file is removed (I assume it is fetched and read by NiFi), but the operational dashboard does not show any job running. No status is populated against the feed job.
I looked at the generated NiFi process flow and there are two red processors, both of them ReleaseHighWaterMark processors.
Also, upon checking nifi-app.log, I found the following exception:
2017-05-25 16:42:51,939 ERROR [Timer-Driven Process Thread-1] c.t.n.p.ProvenanceEventCollector ERROR PROCESSING EVENT! ProvenanceEventRecordDTO{eventId=759716, processorName=null, componentId=01a84157-0b38-14f2-d63d-c41fbd9c38a3, flowFile=ab93de46-e659-4c41-9812-94bbe2f90cfc, previous=null, eventType=CREATE, eventDetails=null, isEndOfJob=false, isBatch=true, isStream=false, feed=null}. ERROR: null
java.lang.NullPointerException: null
It seems there is a configuration issue, and there is hardly any good troubleshooting guide available.
Any ideas?
Please check that the KyloProvenanceEventReportingTask is running in NiFi: http://kylo.readthedocs.io/en/latest/how-to-guides/NiFiKyloProvenanceReportingTask.html
If that doesn't resolve the issue, please post the stack trace that accompanies the error message.
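For a quick check without clicking through the UI, the reporting tasks and their states can also be listed over the NiFi REST API. A minimal sketch, assuming an unsecured NiFi on the default port; adjust the base URL (and add authentication) for your install:

import requests

# Placeholder base URL for the NiFi REST API.
NIFI = "http://localhost:8080/nifi-api"

# List every reporting task with its run state, to confirm the Kylo
# provenance reporting task is present and RUNNING.
resp = requests.get(f"{NIFI}/flow/reporting-tasks")
resp.raise_for_status()
for task in resp.json().get("reportingTasks", []):
    component = task.get("component", {})
    print(component.get("name"), "->", component.get("state"))

If the task is missing or stopped, provenance events never reach Kylo, which would match an operational dashboard that shows no running jobs.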