HPCC/HDFS Connector - hadoop

Does anyone know about the HPCC/HDFS connector? We are using both HPCC and Hadoop. There is a utility (the HPCC/HDFS connector) developed by HPCC that allows an HPCC cluster to access HDFS data.
I have installed the connector, but when I run the program to access data from HDFS it fails with an error saying that libhdfs.so.0 doesn't exist.
I tried to build libhdfs.so using the command
ant compile-libhdfs -Dlibhdfs=1
It gives me this error:
target "compile-libhdfs" does not exist in the project "hadoop"
I tried one more command:
ant compile-c++-libhdfs -Dlibhdfs=1
It gives this error:
[get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
[get] To: /home/hadoop/hadoop-
[get] Error getting http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
to /home/hadoop/hadoop-
BUILD FAILED java.net.ConnectException: Connection timed out
Any suggestion would be a great help.

Chhaya, you might not need to build libhdfs.so. Depending on how you installed Hadoop, you might already have it.
Check in HADOOP_LOCATION/c++/Linux-<arch>/lib/libhdfs.so, where HADOOP_LOCATION is your hadoop install location, and arch is the machine’s architecture (i386-32 or amd64-64).
Once you locate the lib, make sure the H2H connector is configured correctly (see page 4 here).
It's just a matter of updating the HADOOP_LOCATION variable in the connector's config file.
Good luck.
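
If you want to check for the library before rebuilding anything, here is a minimal Python sketch of the lookup described above. It only searches HADOOP_LOCATION/c++/Linux-<arch>/lib for files starting with libhdfs.so. The function name find_libhdfs is made up for illustration, and reading HADOOP_LOCATION from the environment is an assumption of this sketch (the connector itself takes it from its config file).

```python
import glob
import os


def find_libhdfs(hadoop_location):
    """Return sorted paths of libhdfs.so* files under <hadoop_location>/c++/Linux-*/lib."""
    pattern = os.path.join(hadoop_location, "c++", "Linux-*", "lib", "libhdfs.so*")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Assumption: HADOOP_LOCATION is exported in the shell; falls back to the
    # current directory when it is not set.
    location = os.environ.get("HADOOP_LOCATION", ".")
    matches = find_libhdfs(location)
    if matches:
        for path in matches:
            print("found:", path)
    else:
        print("no libhdfs.so* under", os.path.join(location, "c++"))
```

If it prints nothing but the "no libhdfs.so*" line, the library was probably not shipped with your install and building it (or installing a package that includes it) is the next step.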


Ambari Server fails to start because of missing stack definitions

I successfully built Apache Ambari from the git repository, then installed and configured ambari-server. But it just won't start. The log contains the following error:
Error injecting constructor, org.apache.ambari.server.AmbariException: Unable to
find stack definitions under stackRoot = /var/lib/ambari-server/resources/stacks
at org.apache.ambari.server.stack.StackManager.<init>(StackManager.java:149)
while locating org.apache.ambari.server.stack.StackManager annotated with @com.google.inject.internal.UniqueAnnotations$Internal(value=1)
at org.apache.ambari.server.api.services.AmbariMetaInfo.init(AmbariMetaInfo.java:272)
at org.apache.ambari.server.api.services.AmbariMetaInfo.class(AmbariMetaInfo.java:131)
while locating org.apache.ambari.server.api.services.AmbariMetaInfo
for field at org.apache.ambari.server.controller.AmbariServer.ambariMetaInfo(AmbariServer.java:180)
at org.apache.ambari.server.controller.AmbariServer.class(AmbariServer.java:180)
while locating org.apache.ambari.server.controller.AmbariServer
What could be the problem?
I've got ambari-server itself running now, but not the stack.
Thank you for the hint.
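
For anyone hitting the same message: the exception names stackRoot = /var/lib/ambari-server/resources/stacks, so a first sanity check is whether that directory exists and has anything in it. A minimal Python sketch follows. The helper name list_stack_dirs is hypothetical, and the script only lists sub-directories. It does not validate what Ambari expects inside each stack.

```python
import os


def list_stack_dirs(stack_root):
    """Return sorted names of sub-directories directly under stack_root.

    Returns None when stack_root is missing or is not a directory.
    """
    if not os.path.isdir(stack_root):
        return None
    return sorted(
        name for name in os.listdir(stack_root)
        if os.path.isdir(os.path.join(stack_root, name))
    )


if __name__ == "__main__":
    # Path taken from the error message in the question.
    root = "/var/lib/ambari-server/resources/stacks"
    dirs = list_stack_dirs(root)
    if dirs is None:
        print(root, "does not exist or is not a directory")
    elif not dirs:
        print(root, "exists but contains no sub-directories")
    else:
        print("sub-directories under stackRoot:", ", ".join(dirs))
```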

Beam / DataFlow unexpected error ProtocolMessageEnum not implemented when using DataFlowRunner

When I run my Beam pipeline locally it all works as expected, but when I try to run it on the DataflowRunner I suddenly get the error below. Honestly, I don't even know where to start investigating, because the DataflowRunner seems to be a black box.
Jan 14, 2019 11:26:51 AM org.apache.beam.runners.dataflow.DataflowRunner fromOptions
INFO: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 165 files. Enable logging at DEBUG level to see which files will be staged.
Exception in thread "main" java.lang.IncompatibleClassChangeError: Class org.apache.beam.model.pipeline.v1.RunnerApi$StandardPTransforms$Primitives does not implement the requested interface com.google.protobuf.ProtocolMessageEnum
at org.apache.beam.runners.core.construction.BeamUrns.getUrn(BeamUrns.java:27)
at org.apache.beam.runners.core.construction.PTransformTranslation.<clinit>(PTransformTranslation.java:58)
at org.apache.beam.runners.core.construction.UnconsumedReads$1.visitValue(UnconsumedReads.java:49)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:666)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:649)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:649)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:649)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:311)
at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:245)
at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:458)
at org.apache.beam.runners.core.construction.UnconsumedReads.ensureAllReadsConsumed(UnconsumedReads.java:40)
at org.apache.beam.runners.dataflow.DataflowRunner.replaceTransforms(DataflowRunner.java:868)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:660)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:173)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
at (my code: pipe.run().waitUntilFinish();)
Check the versions of Beam and its related libraries, and upgrade your dependencies where possible.
I had the same error, and after seeing that you got it too, I figured it must be a dependency conflict, since it had not occurred before.
I'm using scio to deploy to Dataflow and just referenced the versions they're using: https://github.com/spotify/scio/blob/v0.7.1/build.sbt
I also updated guava and protobuf.
I know you're using Java, but try updating Beam to 2.9.0, and maybe guava and protobuf as well.
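
One way to look for such a conflict is to check whether the same artifact shows up on the classpath in more than one version. Below is a minimal Python sketch under a few assumptions: you already have the classpath entries as a list of paths (for example from your build tool), jars follow the conventional <artifact>-<version>.jar naming, and the function name conflicting_jars is made up for illustration. Jars that don't follow that naming are skipped or may be split incorrectly, so treat the result as a hint and not as proof.

```python
import os
import re
from collections import defaultdict

# Splits "<artifact>-<version>.jar" at the first "-" that is followed by a digit.
# This is a naming heuristic, not something Beam or Dataflow defines.
_JAR_RE = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def conflicting_jars(classpath_entries):
    """Return {artifact: [versions]} for artifacts present in more than one version."""
    versions = defaultdict(set)
    for entry in classpath_entries:
        match = _JAR_RE.match(os.path.basename(entry))
        if match:
            versions[match.group("artifact")].add(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


if __name__ == "__main__":
    # Assumption: the JVM classpath is available in the CLASSPATH variable;
    # if it is not set, nothing is reported.
    entries = os.environ.get("CLASSPATH", "").split(os.pathsep)
    for name, found in sorted(conflicting_jars(entries).items()):
        print(name, "->", ", ".join(found))
```

If protobuf or guava appear in the output, that would support the dependency-conflict theory.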

Getting error file not found for ProcessCenter_CaseManagerConfig.properties while running BPMGenerateUpgradeSchemaScripts.bat command

I am upgrading IBM BPM 8.6.0 to IBM Business Automation Workflow. After applying the fix pack for IBAW, I get an error when I run the command below.
BPMGenerateUpgradeSchemaScripts.bat -profileName Node1Profile -de ProcessCenter
Running the above command produces the following error.
Unable to find the response file
Unable to find the file C:\IBM\BPM\v8.6\profiles\Node1Profile\config\cells\PCCell1\ProcessCenter_CaseManagerConfig.properties, please run the command 'BPMConfig -update -profile deployment_manager_profile -de deployment_environment_name -caseConfigure' to collect the configuration information for the content data sources, please read the knowledge center for details.
java.io.FileNotFoundException: C:\IBM\BPM\v8.6\profiles\Node1Profile\config\cells\PCCell1\ProcessCenter_CaseManagerConfig.properties (The system cannot find the file specified.)
CWMCO6007E: The BPMGenerateUpgradeSchemaScripts command could not complete successfully. The following exception occurred :
Faild to initialize the CommonInfo. java.io.FileNotFoundException: C:\IBM\BPM\v8.6\profiles\Node1Profile\config\cells\PCCell1\ProcessCenter_CaseManagerConfig.properties (The system cannot find the file specified.)
The command that the error asks me to run first appears at point 11 of the upgrade guide. Can someone please suggest what is wrong here?

Error on installing Titan DB on Windows

I am following the official Titan DB guide here and trying to run the command:
graph = TitanFactory.open('conf/titan-cassandra-es.properties')
I got this error:
Backend shorthand unknown: conf/titan-cassandra-es.properties
Obviously, the reason is the incorrect path to the file. So I changed it to:
graph = TitanFactory.open('../conf/titan-cassandra-es.properties')
and got this error:
Encountered unregistered class ID: 141.
The error happens in the following version:
On titan-1.0.0-hadoop2 instead of this error message I get this one:
Invalid import definition: 'com.thinkaurelius.titan.hadoop.MapReduceIndexManagement'; reason: startup failed: script14747941661821834264593.groovy: 1: unable to resolve class com.thinkaurelius.titan.hadoop.MapReduceIndexManagement @ line 1, column 1. import com.thinkaurelius.titan.hadoop.MapReduceIndexManagement ^
1 error
And on titan-1.0.0-hadoop2 I get this one:
The input line is too long.
The syntax of the command is incorrect.
Does anyone know how to handle this issue?
It seems like you have not even managed to get Titan 1 to start up yet.
I do not believe Titan 1 was released with Windows support out of the box, i.e. the downloadable package will not just work on Windows.
That said, I have managed to get Titan DB 1 to work on Windows. To do so, all you have to do is install Cassandra 2.x on Windows. This guide may help you out. Start Cassandra and enable thrift connections.
With that done, you should be able to get Titan doing basic operations on Windows. From there you may find your current errors easier to deal with.
Side note: Windows support for Titan 0.5.x may be more substantial, so you could look into that as well.

Spring-xd strange too many open files error

I upgraded from spring-xd 1.2.1 to 1.3.0, and have both under /opt on my system. After starting xd in single node (but configured to use Zookeeper), I tried to create another stream (e.g. "time | log"), and spring-xd throws the following exception:
java.io.FileNotFoundException: /opt/spring-xd-1.2.1.RELEASE/xd/config/modules/modules.yml (Too many open files)
I raised the limit with ulimit -n 60000, but it didn't solve the problem. The strange thing is: why does it still point to spring-xd-1.2.1.RELEASE? I started both xd-singlenode and xd-shell under /opt/spring-xd-1.3.0.RELEASE.
EDIT: adding the xd-singlenode process output just to show it's pointing to 1.3.0:
/usr/java/default/bin/java -Dspring.application.name=admin
/xd-singlenode-logback.groovy -Dxd.home=/opt/spring-xd-1.3.0.RELEASE/xd
-Dxd.module.config.name=modules -classpath
Have you updated your environment variables? Specifically XD_CONFIG_LOCATION, based on the error shown above.
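
A quick way to check is to list every environment variable whose value still mentions the old install. Here is a minimal Python sketch. The function name env_vars_mentioning is made up for illustration, and the script only sees the environment of the shell you run it from, so run it in the same shell that launches xd-singlenode.

```python
import os


def env_vars_mentioning(fragment, environ=None):
    """Return {name: value} for environment variables whose value contains fragment."""
    if environ is None:
        environ = os.environ
    return {name: value for name, value in sorted(environ.items()) if fragment in value}


if __name__ == "__main__":
    # "spring-xd-1.2.1" is the old install directory named in the exception.
    stale = env_vars_mentioning("spring-xd-1.2.1")
    if stale:
        for name, value in stale.items():
            print(name, "=", value)
    else:
        print("no environment variable mentions spring-xd-1.2.1")
```

If XD_CONFIG_LOCATION (or anything else) shows up, point it at the new install or unset it, then restart xd-singlenode.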
