DistributeLoad cloning FlowFile and sending duplicates - apache-nifi

I have a NiFi DistributeLoad processor that sends to an ExecuteStreamCommand processor. The issue I am seeing is that the DistributeLoad processor is creating a clone and sending both the original and the clone to the ExecuteStreamCommand processor. This is not happening for all the files I send through. Has anyone else seen this issue?

It's version 0.7.1. I figured out that it must have something to do with how the DistributeLoad processor handles the removal of relationships. I used to have 8 relationships going out of that processor, changed it to 4, and I think that's when I started seeing the clones. I had to recreate the processor and it now seems to be working fine.

Confirmed - I didn't even change the number of outbound relationships, just rewired them, and started getting duplicate entries.
I replaced the processor with a new one and the duplicates have disappeared. Will raise a bug.

Related

Vertex pipeline model training component stuck running forever because of metadata issue

I'm attempting to run a Vertex pipeline (custom model training) which I was able to run successfully in a different project. As far as I'm aware, all the pieces of infrastructure (service accounts, buckets, etc.) are identical.
The error appears in a gray box in the pipeline UI when I click on the model training component and reads the following:
Retryable error reported. System is retrying.
com.google.cloud.ai.platform.common.errors.AiPlatformException: code=ABORTED, message=Specified Execution `etag`: `1662555654045` does not match server `etag`: `1662555533339`, cause=null System is retrying.
I've looked into the Log Explorer and found that the error logs are audit logs with the following associated fields:
protoPayload.methodName="google.cloud.aiplatform.internal.MetadataService.RefreshLineageSubgraph"
protoPayload.resourceName="projects/724306335858/locations/europe-west4/metadataStores/default
This leads me to think that there's an issue with the Vertex metadata store or the way my pipeline is using it. The audit logs are automatic though, so I'm not sure.
I've tried purging the metadata store as well as deleting it completely. I've also tried running a different model training pipeline that previously worked in another project, but with no luck.
[screenshot of the pipeline UI]
The retryable error you were getting was a temporary issue, and it has now been resolved.
You can rerun the pipeline now and it is not expected to enter the infinite retry loop.
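For reference, purging a Vertex ML Metadata store (which the asker attempted) can also be scripted rather than done through the UI. Below is a minimal sketch assuming the google-cloud-aiplatform client library; the project ID, regional endpoint, and filter are placeholders, not values from the question:

    # Sketch: purge executions from a Vertex ML Metadata store.
    # Assumes the google-cloud-aiplatform package; project, endpoint and
    # filter below are placeholders.
    from google.cloud import aiplatform_v1

    client = aiplatform_v1.MetadataServiceClient(
        client_options={"api_endpoint": "europe-west4-aiplatform.googleapis.com"}
    )

    parent = "projects/YOUR_PROJECT/locations/europe-west4/metadataStores/default"

    # force=False is a dry run that only reports how many executions match;
    # set force=True to actually delete them.
    operation = client.purge_executions(
        request=aiplatform_v1.PurgeExecutionsRequest(
            parent=parent,
            filter='create_time < "2022-09-01T00:00:00Z"',
            force=False,
        )
    )
    print(operation.result())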

NiFi custom processor error - cannot upload template

Good day,
I'm trying to upload a NiFi template and getting this kind of error:
org.apache.nifi.processors.kite.ConvertAvroSchema is not known to this NiFi instance
I suppose my NiFi instance is missing some component, but where can I get it, and where on the file system do I need to add the missing file?
I'm a little bit confused.
Thanks!
As stated here, since 1.10.0 the kite NAR was taken out of the default binaries because of space limitations.
You can get it here.
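If you want to script that step, here is a minimal sketch. It assumes the NAR is still published on Maven Central under the nifi-kite-nar coordinates, that 1.9.2 (the last release before it was dropped) is the version you want, and that NiFi is installed under /opt/nifi; restart NiFi afterwards so it picks up the new NAR.

    # Sketch: download the kite NAR and drop it into NiFi's lib directory.
    # Assumptions: version 1.9.2, Maven Central URL layout, NiFi installed
    # at /opt/nifi. Adjust to your own version and install path.
    import shutil
    import urllib.request

    VERSION = "1.9.2"
    URL = (
        "https://repo1.maven.org/maven2/org/apache/nifi/"
        f"nifi-kite-nar/{VERSION}/nifi-kite-nar-{VERSION}.nar"
    )
    DEST = f"/opt/nifi/lib/nifi-kite-nar-{VERSION}.nar"

    with urllib.request.urlopen(URL) as resp, open(DEST, "wb") as out:
        shutil.copyfileobj(resp, out)
    print(f"Wrote {DEST}")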

ClickHouse server failed to restart because of LowCardinality setting

I tried to play with the LowCardinality type and got a message saying that this is an experimental feature and that I have to SET allow_experimental_low_cardinality_type = 1 in order to use it.
I executed this command inside clickhouse-client and then I restarted the server. But I got
clickhouse-server.service: Unit entered failed state
Now I am trying to find out how to disable this setting and make my clickhouse-server start again.
Can you help with this, please?
PS: The version I use is 18.12.17, installed on Ubuntu 16.04.
ClickHouse has different layers of settings. If you used SET <setting> = <value>, then you set it only for the current session; you don't need to restart ClickHouse for it to take effect. Please take a look here.
I suspect you ran into a different problem while starting your server; there are a bunch of possible reasons. So, first try to recall what has been changed in the configs since the last restart (because restarting the server is what applied those changes).
Digging into the logs is also a good idea. Don't hesitate to check other similar issues on github.com, for example this one.
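To illustrate the session/query layers, here is a minimal sketch assuming the third-party clickhouse-driver package and a local server; the table name is made up. If you want the setting to persist across sessions, putting it into a settings profile in users.xml is the usual approach.

    # Sketch: the experimental setting only applies to the session or query
    # it is set for; it is not something the server needs at startup.
    # Assumes the clickhouse-driver package; table name is made up.
    from clickhouse_driver import Client

    client = Client("localhost")

    # Per-query: send the setting alongside the DDL that needs it.
    client.execute(
        """
        CREATE TABLE IF NOT EXISTS lc_demo (
            id UInt32,
            s LowCardinality(String)
        ) ENGINE = MergeTree() ORDER BY id
        """,
        settings={"allow_experimental_low_cardinality_type": 1},
    )

    # Per-session: a plain SET also works, but only for this connection,
    # and it is gone once the session ends.
    client.execute("SET allow_experimental_low_cardinality_type = 1")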

Clear the cache of FetchDistributedMapCache processor

How do I clear the cache of the FetchDistributedMapCache processor in Apache NiFi?
I tried deleting the persisted directory and also tried pointing it at a new directory altogether, but it still fetches old data. Thanks for your help.
You should be able to stop the DistributedMapCacheClient and DistributedMapCacheServer, then delete the existing DistributedMapCacheServer and create a new one with the same port as the previous one, then start them back up.
Inside NiFi, you could create a new DistributedMapCacheServer and point your processor at that instead. Outside of NiFi, I've written a Groovy script where you can interact with the DistributedMapCacheServer from the command line. The API only allows you to remove entries you know about; in the upcoming NiFi 1.2.0 release, you will be able to remove entries using a regular expression for the keys (implemented in NIFI-3627). At that point I will update the Groovy script to enable that feature.

Slony cluster successfully created and sync'd, but no further replication

Debian Squeeze, Postgres 8.4, Slony 1.2.21
I've created a master > slave cluster which has done the initial copy. However, I can't get any more data to replicate. I've always been a bit baffled by what commands should be run to start the required processes/daemons at each end.
Can anyone enlighten me?
Cheers.
OK, got the little bugger sorted. Potential Slony users should note that (currently) Slony doesn't replicate TRUNCATE statements; use DELETE FROM instead.
This meant my original data was still in the slave, and the PK clashed with the new data, so it wasn't updated.
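For illustration, here is a minimal sketch of clearing a replicated table on the master with DELETE instead of TRUNCATE, assuming psycopg2 and a hypothetical table and connection string:

    # Sketch: clear a Slony-replicated table with DELETE (replicated) rather
    # than TRUNCATE (not replicated by this Slony version).
    # The connection string and table name are hypothetical.
    import psycopg2

    conn = psycopg2.connect("dbname=mydb host=master.example.com user=app")
    with conn, conn.cursor() as cur:
        cur.execute("DELETE FROM my_replicated_table")  # not TRUNCATE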
