DRPC Server error in storm - hadoop

I am trying to execute the code below and I am getting the following error. Not sure if I am missing something here. Also, where would I see the output?
Error
java.lang.RuntimeException: No DRPC servers configured for topology
at backtype.storm.drpc.DRPCSpout.open(DRPCSpout.java:79)
at storm.trident.spout.RichSpoutBatchTriggerer.open(RichSpoutBatchTriggerer.java:58)
at backtype.storm.daemon.executor$fn__5802$fn__5817.invoke(executor.clj:519)
at backtype.storm.util$async_loop$fn__442.invoke(util.clj:434)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:744)
Code:
----
package com.**.trident.storm;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import storm.kafka.*;
import storm.kafka.trident.*;               // TridentKafkaConfig, OpaqueTridentKafkaSpout
import storm.trident.*;
import storm.trident.operation.builtin.*;   // Count, Sum, MapGet, FilterNull
import storm.trident.testing.MemoryMapState;

import backtype.storm.*;
import backtype.storm.generated.StormTopology;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.tuple.Fields;
import backtype.storm.utils.DRPCClient;

// SplitMac and XScheme are the user's own classes (assumed to be in this package or imported separately)
public class EventTridentDrpcTopology
{
    private static final String KAFKA_SPOUT_ID = "kafkaSpout";
    private static final Logger log = LoggerFactory.getLogger(EventTridentDrpcTopology.class);

    public static StormTopology buildTopology(OpaqueTridentKafkaSpout spout) throws Exception
    {
        TridentTopology tridentTopology = new TridentTopology();

        TridentState ts = tridentTopology.newStream("event_spout", spout)
            .name(KAFKA_SPOUT_ID)
            .each(new Fields("mac_address"), new SplitMac(), new Fields("mac"))
            .groupBy(new Fields("mac"))
            .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("maccount"))
            .parallelismHint(4);

        tridentTopology
            .newDRPCStream("mac_count")
            .each(new Fields("args"), new SplitMac(), new Fields("mac"))
            .stateQuery(ts, new Fields("mac"), new MapGet(), new Fields("maccount"))
            .each(new Fields("maccount"), new FilterNull())
            .aggregate(new Fields("maccount"), new Sum(), new Fields("sum"));

        return tridentTopology.build();
    }

    public static void main(String[] str) throws Exception
    {
        Config conf = new Config();
        BrokerHosts hosts = new ZkHosts("xxxx:2181,xxxx:2181,xxxx:2181");
        String topic = "event";
        //String zkRoot = topologyConfig.getProperty("kafka.zkRoot");
        String consumerGroupId = "StormSpout";

        DRPCClient drpc = new DRPCClient("xxxx", 3772);   // created but never used in this main method

        TridentKafkaConfig tridentKafkaConfig = new TridentKafkaConfig(hosts, topic, consumerGroupId);
        tridentKafkaConfig.scheme = new SchemeAsMultiScheme(new XScheme());
        OpaqueTridentKafkaSpout opaqueTridentKafkaSpout = new OpaqueTridentKafkaSpout(tridentKafkaConfig);

        StormSubmitter.submitTopology("event_trident", conf, buildTopology(opaqueTridentKafkaSpout));
    }
}

You have to launch the DRPC server(s) and configure their locations.
See "Remote mode DRPC" at http://storm.apache.org/releases/0.10.0/Distributed-RPC.html. The steps are:
Launch DRPC server(s)
Configure the locations of the DRPC servers
Submit DRPC topologies to Storm cluster
Launching a DRPC server can be done with the storm script and is just like launching Nimbus or the UI:
bin/storm drpc
Next, you need to configure your Storm cluster to know the locations of the DRPC server(s). This is how DRPCSpout knows from where to read function invocations. This can be done through the storm.yaml file or the topology configurations. Configuring this through the storm.yaml looks something like this:
drpc.servers:
- "drpc1.foo.com"
- "drpc2.foo.com"

Related

Client is not connected to any Elasticsearch nodes in Flink

I am using Flink 1.1.2 and have added the Elasticsearch dependency in Maven as follows:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch2_2.10</artifactId>
<version>1.2.0</version>
</dependency>
My program contains the following code, which reads data from Kafka and inserts it into Elasticsearch:
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink;
import org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSinkFunction;
import org.apache.flink.streaming.connectors.elasticsearch2.RequestIndexer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Requests;

// JoinedStreamEvent and JoinSchema are user-defined classes from the same project
public class ReadFromKafka {

    public static void main(String[] args) throws Exception {
        // create execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties properties = new Properties();
        properties.setProperty("bootstrap.servers", "localhost:9092");
        properties.setProperty("zookeeper.connect", "localhost:2181");
        properties.setProperty("group.id", "test");

        DataStream<JoinedStreamEvent> message = env.addSource(
                new FlinkKafkaConsumer09<JoinedStreamEvent>("test", new JoinSchema(), properties));
        System.out.println("reading from kafka");
        message.print();

        Map<String, String> config = new HashMap<>();
        config.put("bulk.flush.max.actions", "1");        // flush inserts after every event
        config.put("cluster.name", "elasticsearch_amar"); // must match the running cluster's name

        List<InetSocketAddress> transports = new ArrayList<>();
        // set default connection details (Elasticsearch transport port)
        transports.add(new InetSocketAddress(InetAddress.getByName("127.0.0.1"), 9300));

        message.addSink(new ElasticsearchSink<>(config, transports, new ElasticInserter()));
        env.execute();
    } // main

    public static class ElasticInserter implements ElasticsearchSinkFunction<JoinedStreamEvent> {
        @Override
        public void process(JoinedStreamEvent record, RuntimeContext runtimeContext, RequestIndexer requestIndexer) {
            Map<String, Integer> json = new HashMap<>();
            json.put("Time", record.getPatient_id());
            json.put("heart Rate", record.getHeartRate());
            json.put("resp rate", record.getRespirationRate());

            IndexRequest rqst = Requests.indexRequest()
                    .index("nyc-places")        // index name
                    .type("popular-locations")  // mapping name
                    .source(json);
            requestIndexer.add(rqst);
        } // process
    } // ElasticInserter
} // ReadFromKafka
I have installed Elasticsearch using Homebrew and then started it with the elasticsearch command.
However, when I start my program I get an error saying the client is not connected to any Elasticsearch nodes.
My reputation is below 50, so I cannot comment, but I have a few suggestions:
First, check whether ES is actually up; see "Can't Connect to Elasticsearch (through Curl)".
It is also convenient to start ES in a Docker container, e.g. docker run -d --name es -p 9200:9200 elasticsearch:2 -Des.network.host=0.0.0.0
Alternatively, you can try setting the network.host value to 0.0.0.0 in the ES config file elasticsearch.yml:
network.host: 0.0.0.0
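If the node is running but the sink still cannot connect, one way to narrow the problem down is to open a plain transport connection with the same cluster name and address that are passed to the Flink sink. This is only a diagnostic sketch, assuming the Elasticsearch 2.x transport client API; the cluster name elasticsearch_amar and port 9300 are taken from the code in the question:

import java.net.InetAddress;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class EsConnectivityCheck {
    public static void main(String[] args) throws Exception {
        // Same settings the ElasticsearchSink would use: cluster.name must match the running node.
        Settings settings = Settings.settingsBuilder()
                .put("cluster.name", "elasticsearch_amar")
                .build();

        TransportClient client = TransportClient.builder()
                .settings(settings)
                .build()
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));

        // If this list is empty, the Flink sink will fail with "not connected to any Elasticsearch nodes" too.
        System.out.println("Connected nodes: " + client.connectedNodes());
        client.close();
    }
}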

spark job fails in yarn cluster

I have a Spark job which runs without any issue in spark-shell. I am currently trying to submit this job to YARN using Spark's API.
I am using the class below to submit the Spark job:
import java.util.ResourceBundle;

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;

public class SubmitSparkJobToYARNFromJavaCode {

    public static void main(String[] arguments) throws Exception {
        ResourceBundle bundle = ResourceBundle.getBundle("device_compare");
        String accessKey = bundle.getString("accessKey");
        String secretKey = bundle.getString("secretKey");

        String[] args = new String[] {
            // path to my application's JAR file
            // required in yarn-cluster mode
            "--jar", "my_s3_path_to_jar",

            // name of my application's main class (required)
            "--class", "com.abc.SampleIdCount",

            // comma separated list of local jars that want
            // SparkContext.addJar to work with
            // "--addJars", arguments[1]
        };

        // create a Hadoop Configuration object
        Configuration config = new Configuration();

        // identify that I will be using Spark as YARN mode
        System.setProperty("SPARK_YARN_MODE", "true");
        System.setProperty("spark.local.dir", "/tmp");

        // create an instance of SparkConf object
        SparkConf sparkConf = new SparkConf();
        sparkConf.set("fs.s3n.awsAccessKeyId", accessKey);
        sparkConf.set("fs.s3n.awsSecretAccessKey", secretKey);
        sparkConf.set("spark.local.dir", "/tmp");

        // create ClientArguments, which will be passed to Client
        ClientArguments cArgs = new ClientArguments(args);

        // create an instance of yarn Client client
        Client client = new Client(cArgs, config, sparkConf);

        // submit Spark job to YARN
        client.run();
    }
}
This is the spark job I am trying to run
package com.abc;

import java.util.ResourceBundle;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SampleIdCount {

    private static String accessKey;
    private static String secretKey;

    public SampleIdCount() {
        ResourceBundle bundle = ResourceBundle.getBundle("device_compare");
        accessKey = bundle.getString("accessKey");
        secretKey = bundle.getString("secretKey");
    }

    public static void main(String[] args) {
        System.out.println("Started execution");
        SampleIdCount sample = new SampleIdCount();

        System.setProperty("SPARK_YARN_MODE", "true");
        System.setProperty("spark.local.dir", "/tmp");

        SparkConf conf = new SparkConf().setAppName("SampleIdCount").setMaster("yarn-cluster");

        JavaSparkContext sc = new JavaSparkContext(conf);
        sc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", accessKey);
        sc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", secretKey);

        JavaRDD<String> installedDeviceIdsRDD = sc.emptyRDD();
        installedDeviceIdsRDD = sc.textFile("my_s3_input_path");
        installedDeviceIdsRDD.saveAsTextFile("my_s3_output_path");

        sc.close();
    }
}
When I run my Java code, the Spark job is submitted to YARN, but I face the error below:
Diagnostics: File file:/mnt/tmp/spark-1b86d806-5c8f-4ae6-a486-7b68d46c759a/__spark_libs__8257948728364304288.zip does not exist
java.io.FileNotFoundException: File file:/mnt/tmp/spark-1b86d806-5c8f-4ae6-a486-7b68d46c759a/__spark_libs__8257948728364304288.zip does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:616)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:829)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:606)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:431)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I thought the problem was that the folder /mnt was not available on the slave nodes, so I tried to change the Spark local directory to /tmp by doing the following:
I set the spark.local.dir system property to "/tmp" in my Java code
I set an environment variable by doing export LOCAL_DIRS=/tmp
I set an environment variable by doing export SPARK_LOCAL_DIRS=/tmp
None of these had any effect and I still face the very same error. None of the suggestions in other links helped me either. I am really stuck on this. Any help would be much appreciated. Thanks in advance. Cheers!

Storm Program not running

So I was trying to learn Apache Storm and was using the TutorialsPoint guide as a reference point for working with my first Storm program (https://www.tutorialspoint.com/apache_storm/apache_storm_quick_guide.htm).
I do not get the call log count output as expected. My ZooKeeper, however, shuts down.
My topology is:
public class logAnalyserStorm {

    public static void main(String[] args) throws InterruptedException {
        Config config = new Config();
        config.setDebug(true);

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("call-log-reader-spout", new FakeCallLogGeneratorSpout(), 100);
        builder.setBolt("call-log-creator-bolt", new callLogCreatorBolt()).shuffleGrouping("call-log-reader-spout");
        builder.setBolt("call-log-counter-bolt", new callLogCounterBolt()).fieldsGrouping("call-log-creator-bolt", new Fields("call"));

        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("logAnalyserStorm", config, builder.createTopology());
        Thread.sleep(10000);
        cluster.killTopology("logAnalyserStorm");
        cluster.shutdown();
    }
}
The error is:
20680 [Thread-10] INFO o.a.s.event - Event manager interrupted
20683 [main] INFO o.a.s.testing - Shutting down in process zookeeper
20683 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxnFactory - NIOServerCnxn factory exited run method
I realized that my nimbus was not running. Ugh. Thank You
Change Thread.sleep(10000); to Thread.sleep(60000);
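For clarity, here is a minimal sketch of the adjusted run sequence, assuming the same LocalCluster setup as in the question; the longer sleep simply gives the in-process topology time to emit and count tuples before it is killed, after which the embedded ZooKeeper shutdown messages are expected:

LocalCluster cluster = new LocalCluster();
cluster.submitTopology("logAnalyserStorm", config, builder.createTopology());
Thread.sleep(60000);                        // let the topology run for 60s instead of 10s
cluster.killTopology("logAnalyserStorm");
cluster.shutdown();                         // the in-process ZooKeeper stops here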

integrating word count topology of storm with kafka

I am trying to integrate the Storm word count program with Kafka. My producer is working fine: it reads a text file and sends each line as a message, and I can see those messages in the simple consumer console.
Now, to integrate it with Storm, i.e. to send those messages/lines to the consumer spout, I have just replaced the previous Storm spout of the word count program with the KafkaSpout from the storm-kafka integration dependency; the rest of the program is the same. I am trying to run it in Eclipse but it does not execute. I don't know what the problem is, or even whether I am doing it the right way. Here is my main class:
package com.spnotes.storm;

import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.TopologyBuilder;
import com.spnotes.storm.bolts.WordCounterBolt;
import com.spnotes.storm.bolts.WordSpitterBolt;

public class WordCount {

    public static void main(String[] args) throws Exception {
        Config config = new Config();
        config.setDebug(true);
        config.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 1);

        BrokerHosts hosts = new ZkHosts("localhost:9092");
        SpoutConfig spoutConfig = new SpoutConfig(hosts, "test", "localhost:2181", "id1");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
        KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("line-reader-spout", kafkaSpout);
        builder.setBolt("word-spitter", new WordSpitterBolt()).shuffleGrouping("line-reader-spout");
        builder.setBolt("word-counter", new WordCounterBolt()).shuffleGrouping("word-spitter");

        LocalCluster cluster = new LocalCluster();
        System.out.println("submit topology");
        Thread.sleep(10000);
        //StormSubmitter.submitTopology("HelloStorm5", config, builder.createTopology());
        cluster.submitTopology("HelloStorm5", config, builder.createTopology());
        cluster.shutdown();
    }
}
There are two bolts, WordSpitterBolt() and WordCounterBolt(): WordSpitterBolt breaks each line/message into tokens/words and WordCounterBolt counts each word. Can anybody tell me if I am doing anything the wrong way? Do I need to create my own spout instead of using the predefined KafkaSpout? And is my main class correct?
Change the code:
BrokerHosts hosts = new ZkHosts(zkConnect);
zkConnect is the ZooKeeper hostname and port, not Kafka's; change it to localhost:2181.
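Put differently, the spout configuration from the question would then look roughly like this (a sketch only: /kafka-spout is a hypothetical ZooKeeper root path under which storm-kafka stores its offsets, while the topic and consumer id are the ones from the question):

BrokerHosts hosts = new ZkHosts("localhost:2181");                                // ZooKeeper, not the Kafka broker port 9092
SpoutConfig spoutConfig = new SpoutConfig(hosts, "test", "/kafka-spout", "id1");  // topic, zkRoot, consumer id
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);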
As discussed in chat, the remaining issue was with the code's Maven dependencies: include all of the required dependencies in the POM.xml.

How to purge/delete message from weblogic JMS queue

Is there any way (apart from consuming the messages) to purge/delete messages programmatically from a JMS queue? Even if it is only possible with the WLST command line tool, it would be of much help.
Here is an example in WLST for a Managed Server running on port 7005:
connect('weblogic', 'weblogic', 't3://localhost:7005')
serverRuntime()
cd('/JMSRuntime/ManagedSrv1.jms/JMSServers/MyAppJMSServer/Destinations/MyAppJMSModule!QueueNameToClear')
cmo.deleteMessages('')
The last command should return the number of messages it deleted.
You can use JMX to purge the queue, either from Java or from WLST (Python). You can find the MBean definitions for WLS 10.0 on http://download.oracle.com/docs/cd/E11035_01/wls100/wlsmbeanref/core/index.html.
Here is a basic Java example (don't forget to put weblogic.jar in the CLASSPATH):
import java.util.Hashtable;

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import javax.naming.Context;

import weblogic.management.mbeanservers.runtime.RuntimeServiceMBean;

public class PurgeWLSQueue {

    private static final String WLS_USERNAME = "weblogic";
    private static final String WLS_PASSWORD = "weblogic";
    private static final String WLS_HOST = "localhost";
    private static final int WLS_PORT = 7001;

    private static final String JMS_SERVER = "wlsbJMSServer";
    private static final String JMS_DESTINATION = "test.q";

    private static JMXConnector getMBeanServerConnector(String jndiName) throws Exception {
        Hashtable<String,String> h = new Hashtable<String,String>();
        JMXServiceURL serviceURL = new JMXServiceURL("t3", WLS_HOST, WLS_PORT, jndiName);
        h.put(Context.SECURITY_PRINCIPAL, WLS_USERNAME);
        h.put(Context.SECURITY_CREDENTIALS, WLS_PASSWORD);
        h.put(JMXConnectorFactory.PROTOCOL_PROVIDER_PACKAGES, "weblogic.management.remote");
        JMXConnector connector = JMXConnectorFactory.connect(serviceURL, h);
        return connector;
    }

    public static void main(String[] args) {
        try {
            JMXConnector connector =
                getMBeanServerConnector("/jndi/"+RuntimeServiceMBean.MBEANSERVER_JNDI_NAME);
            MBeanServerConnection mbeanServerConnection =
                connector.getMBeanServerConnection();

            ObjectName service = new ObjectName("com.bea:Name=RuntimeService,Type=weblogic.management.mbeanservers.runtime.RuntimeServiceMBean");
            ObjectName serverRuntime = (ObjectName) mbeanServerConnection.getAttribute(service, "ServerRuntime");
            ObjectName jmsRuntime = (ObjectName) mbeanServerConnection.getAttribute(serverRuntime, "JMSRuntime");
            ObjectName[] jmsServers = (ObjectName[]) mbeanServerConnection.getAttribute(jmsRuntime, "JMSServers");

            for (ObjectName jmsServer: jmsServers) {
                if (JMS_SERVER.equals(jmsServer.getKeyProperty("Name"))) {
                    ObjectName[] destinations = (ObjectName[]) mbeanServerConnection.getAttribute(jmsServer, "Destinations");
                    for (ObjectName destination: destinations) {
                        if (destination.getKeyProperty("Name").endsWith("!"+JMS_DESTINATION)) {
                            Object o = mbeanServerConnection.invoke(
                                    destination,
                                    "deleteMessages",
                                    new Object[] {""}, // selector expression
                                    new String[] {"java.lang.String"});
                            System.out.println("Result: "+o);
                            break;
                        }
                    }
                    break;
                }
            }
            connector.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
This works great in a single-node environment, but what happens if you are in a clustered environment with ONE migratable JMSServer (currently on node #1) and this code is executing on node #2? Then there will be no JMSServer available and no messages will be deleted.
This is the problem I'm facing right now...
Is there a way to connect to the JMSQueue without having the JMSServer available?
[edit]
Found a solution: Use the domain runtime service instead:
ObjectName service = new ObjectName("com.bea:Name=DomainRuntimeService,Type=weblogic.management.mbeanservers.domainruntime.DomainRuntimeServiceMBean");
and be sure to access the admin port on the WLS-cluster.
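For illustration, here is a rough sketch of how the Java example above changes when it goes through the domain runtime MBean server instead; this is an untested adaptation that assumes the standard DomainRuntimeServiceMBean JNDI name and its ServerRuntimes attribute, and it must connect to the admin server:

// Connect to the admin server's domain runtime MBean server instead of the per-server runtime
// (DomainRuntimeServiceMBean.MBEANSERVER_JNDI_NAME is "weblogic.management.mbeanservers.domainruntime").
JMXConnector connector =
    getMBeanServerConnector("/jndi/" + weblogic.management.mbeanservers.domainruntime.DomainRuntimeServiceMBean.MBEANSERVER_JNDI_NAME);
MBeanServerConnection conn = connector.getMBeanServerConnection();

ObjectName service = new ObjectName(
    "com.bea:Name=DomainRuntimeService,Type=weblogic.management.mbeanservers.domainruntime.DomainRuntimeServiceMBean");

// The domain runtime service exposes the runtimes of every running server in the domain,
// so the JMSServer can be found no matter which node it has been migrated to.
ObjectName[] serverRuntimes = (ObjectName[]) conn.getAttribute(service, "ServerRuntimes");
for (ObjectName serverRuntime : serverRuntimes) {
    ObjectName jmsRuntime = (ObjectName) conn.getAttribute(serverRuntime, "JMSRuntime");
    ObjectName[] jmsServers = (ObjectName[]) conn.getAttribute(jmsRuntime, "JMSServers");
    // ... from here on, the JMSServer/destination lookup and the deleteMessages invocation
    // are the same as in the single-server example above.
}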
If this is a one-time task, the easiest way would be to do it through the console...
The program at the link below helps you clear only the pending messages from a queue, based on the redelivered-message parameter:
http://techytalks.blogspot.in/2016/02/deletepurge-pending-messages-from-jms.html
