IndexOutOfBoundsException when trying Sphinx4 - Maven

I have created all the models required by Sphinx4 (language model, dictionary and acoustic model). I created a Maven project in Eclipse and all the libraries were downloaded, but when I run the program as shown on the official website (http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4), an IndexOutOfBoundsException is thrown:
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 768, Size: 768
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Pool.get(Pool.java:55)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.createSenonePool(Sphinx3Loader.java:403)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.loadModelFiles(Sphinx3Loader.java:341)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.load(Sphinx3Loader.java:278)
at edu.cmu.sphinx.frontend.AutoCepstrum.newProperties(AutoCepstrum.java:118)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:508)
at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:165)
at edu.cmu.sphinx.util.props.PropertySheet.getComponentList(PropertySheet.java:422)
at edu.cmu.sphinx.frontend.FrontEnd.newProperties(FrontEnd.java:160)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:508)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:290)
at edu.cmu.sphinx.decoder.scorer.SimpleAcousticScorer.newProperties(SimpleAcousticScorer.java:46)
at edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer.newProperties(ThreadedAcousticScorer.java:130)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:508)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:290)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.newProperties(WordPruningBreadthFirstSearchManager.java:201)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:508)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:290)
at edu.cmu.sphinx.decoder.AbstractDecoder.newProperties(AbstractDecoder.java:70)
at edu.cmu.sphinx.decoder.Decoder.newProperties(Decoder.java:37)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:508)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:290)
at edu.cmu.sphinx.recognizer.Recognizer.newProperties(Recognizer.java:89)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:508)
at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:165)
at edu.cmu.sphinx.api.Context.<init>(Context.java:73)
at edu.cmu.sphinx.api.Context.<init>(Context.java:44)
at edu.cmu.sphinx.api.AbstractSpeechRecognizer.<init>(AbstractSpeechRecognizer.java:37)
at edu.cmu.sphinx.api.LiveSpeechRecognizer.<init>(LiveSpeechRecognizer.java:33)
at Main.main(Main.java:26)
The source code which I am running is as follows:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;
import edu.cmu.sphinx.api.SpeechResult;
import edu.cmu.sphinx.api.StreamSpeechRecognizer;
public class Main {
    public static void main(String[] args) {
        Configuration configuration = new Configuration();
        configuration.setAcousticModelPath("Alphabets/acoustic");
        configuration.setDictionaryPath("Alphabets/alphabets.dic");
        configuration.setLanguageModelPath("Alphabets/alphabets.lm.dmp");

        LiveSpeechRecognizer recognizer = null;
        try {
            recognizer = new LiveSpeechRecognizer(configuration);
        } catch (IOException e) {
            e.printStackTrace();
        }

        recognizer.startRecognition(true);
        SpeechResult result = recognizer.getResult();
        recognizer.stopRecognition();

        System.out.println(result.getHypothesis());
        result.getLattice().dumpDot("lattice.dot", "lattice");
    }
}
I highly appreciate the assistance.

You are trying to use Sphinx4 with a semi-continuous acoustic model. You need to train a continuous model to use with Sphinx4; see http://cmusphinx.sourceforge.net/wiki/tutorialam for details.
In your training configuration you need to set
$CFG_HMM_TYPE = '.cont.'; # Sphinx 4, Pocketsphinx
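As a quick sanity check of the Maven setup and code path while you retrain, you can point the same code at the continuous US English model bundled with the sphinx4-data artifact (a minimal sketch; the resource paths below are the ones used in the sphinx4-5prealpha tutorial and may differ for other versions):

Configuration configuration = new Configuration();
// Bundled continuous acoustic model, dictionary and language model from sphinx4-data
configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
recognizer.startRecognition(true);
System.out.println(recognizer.getResult().getHypothesis());
recognizer.stopRecognition();

If that runs, the remaining problem is isolated to the trained model files rather than the project setup.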

Related

Pig fails to run the store command when called from Java code (embedded mode)

I am learning Hadoop. I tried running my Pig script from Java, but it seems to skip the STORE command written in the script and does not produce the output data file at the specified location.
When I run the Pig script from the command line, however, it produces the output data file as desired.
At first I thought Java might have a permission issue that prevented it from creating the file, but when I tried creating a file at the exact same location from Java, it created an empty file without any problem. So it does not seem to be a permission issue.
Can anybody tell me why the Pig script runs successfully from the command line but fails in embedded mode?
Java Code:
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.pig.PigServer;
import java.io.IOException;
public class storePig {
    public static void main(String args[]) throws Exception {
        try {
            PigServer pigServer = new PigServer("local");
            runQuery(pigServer);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void runQuery(PigServer pigServer) throws IOException {
        pigServer.registerScript("/home/anusharma/Desktop/stackoverflow/sampleScript.pig");
    }
}
Pig-Script:
Employee = LOAD '/home/anusharma/Desktop/Hadoop/Pig/record.txt' using PigStorage(',') as (id:int, firstName:chararray, lastName:chararray, age:int, contact:chararray, city:chararray);
Employe = ORDER Employee BY age desc;
limitedEmployee = LIMIT Employe 4;
STORE limitedEmployee into '/home/anusharma/Desktop/stackoverflow/output' using
PigStorage('|');
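One possible cause worth checking (an assumption, since nothing else in the script looks wrong): in embedded mode PigServer does not execute queued STORE statements unless batch mode is turned on and the batch is explicitly run. A minimal sketch of the embedded call with explicit batch handling, using PigServer.setBatchOn() and PigServer.executeBatch():

public static void runQuery(PigServer pigServer) throws IOException {
    // Queue the statements in the script, including STORE, as a batch
    pigServer.setBatchOn();
    pigServer.registerScript("/home/anusharma/Desktop/stackoverflow/sampleScript.pig");
    // Run everything that was queued; without this the STORE may never execute
    pigServer.executeBatch();
}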

Writing data from JavaDStream&lt;String&gt; in Apache Spark to Elasticsearch

I am working on a program that processes data from Apache Kafka into Elasticsearch, using Apache Spark. I have gone through many links but have been unable to find an example of writing data from a JavaDStream in Apache Spark to Elasticsearch.
Below is the sample Spark code that gets data from Kafka and prints it.
import org.apache.log4j.Logger;
import org.apache.log4j.Level;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import scala.Tuple2;
import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.*;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka.KafkaUtils;
import org.apache.spark.streaming.Durations;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;
import com.google.common.collect.ImmutableMap;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import java.util.List;
public class SparkStream {
    public static JavaSparkContext sc;
    public static List<Map<String, ?>> alldocs;

    public static void main(String args[]) {
        if (args.length != 2) {
            System.out.println("SparkStream <broker1-host:port,broker2-host:port> <topic1,topic2,...>");
            System.exit(1);
        }
        Logger.getLogger("org").setLevel(Level.OFF);
        Logger.getLogger("akka").setLevel(Level.OFF);

        SparkConf sparkConf = new SparkConf().setAppName("Data Streaming");
        sparkConf.setMaster("local[2]");
        sparkConf.set("es.index.auto.create", "true");
        sparkConf.set("es.nodes", "localhost");
        sparkConf.set("es.port", "9200");

        JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, Durations.seconds(2));

        Set<String> topicsSet = new HashSet<>(Arrays.asList(args[1].split(",")));
        Map<String, String> kafkaParams = new HashMap<>();
        String brokers = args[0];
        kafkaParams.put("metadata.broker.list", brokers);
        kafkaParams.put("auto.offset.reset", "largest");
        kafkaParams.put("offsets.storage", "zookeeper");

        JavaPairDStream<String, String> messages = KafkaUtils.createDirectStream(
                jssc,
                String.class,
                String.class,
                StringDecoder.class,
                StringDecoder.class,
                kafkaParams,
                topicsSet
        );

        // Extract the message value from each (key, value) pair
        JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
            @Override
            public String call(Tuple2<String, String> tuple2) {
                return tuple2._2();
            }
        });

        lines.print();

        jssc.start();
        jssc.awaitTermination();
    }
}
One method to save to Elasticsearch is to use the saveToEs method inside a foreachRDD function. Any other method you wish to use would still require the foreachRDD call on your DStream.
For example:
lines.foreachRDD(lambda rdd: rdd.saveToEs("ESresource"))
See here for more
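For the Java API used in the question, a roughly equivalent sketch might look like this (a sketch only: the exact foreachRDD signature depends on the Spark version, the Kafka messages are assumed to be JSON documents, and "sparkstream/docs" is just a placeholder index/type for JavaEsSpark, which the code above already imports):

lines.foreachRDD(new VoidFunction<JavaRDD<String>>() {
    @Override
    public void call(JavaRDD<String> rdd) {
        if (!rdd.isEmpty()) {
            // Writes each string element as one JSON document into the given index/type
            JavaEsSpark.saveJsonToEs(rdd, "sparkstream/docs");
        }
    }
});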
dstream.foreachRDD { rdd =>
  val es = sqlContext.createDataFrame(rdd).toDF("use headings suitable for your dataset")
  import org.elasticsearch.spark.sql._
  es.saveToEs("wordcount/testing")
  es.show()
}
In this code block, "dstream" is the data stream that receives data from a source such as Kafka. Inside the brackets of "toDF()" you have to pass headings suitable for your dataset. In "saveToEs()" you have to pass the Elasticsearch index. Before this you have to create the SQLContext:
val sqlContext = SQLContext.getOrCreate(SparkContext.getOrCreate())
If you are using Kafka to send the data, you have to add the dependency mentioned below:
libraryDependencies += "org.apache.kafka" % "kafka-clients" % "0.10.2.1"
Get the dependency
To see the full example, see
In that example, first you have to create the Kafka producer "test", then start Elasticsearch, and then run the program. You can see the full sbt build and code at the above URL.

Why does job submission from Java fail?

I submit a Spark job from Java as a RESTful service. I keep getting the following error:
Application application_1446816503326_0098 failed 2 times due to AM Container for appattempt_1446816503326_0098_000002 exited with exitCode: -1000
For more detailed output, check application tracking page: http://ip-172-31-34-108.us-west-2.compute.internal:8088/proxy/application_1446816503326_0098/ Then, click on links to logs of each attempt.
Diagnostics: java.io.FileNotFoundException: File file:/opt/apache-tomcat-8.0.28/webapps/RESTfulExample/WEB-INF/lib/spark-yarn_2.10-1.3.0.jar does not exist
Failing this attempt. Failing the application.
The spark-yarn_2.10-1.3.0.jar file is present in that lib folder.
Here is my program.
package SparkSubmitJava;

import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import java.io.IOException;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Response;

@Path("/spark")
public class JavaRestService {

    @GET
    @Path("/{param}/{param2}/{param3}")
    public Response getMsg(@PathParam("param") String bedroom,
                           @PathParam("param2") String bathroom,
                           @PathParam("param3") String area) throws IOException {
        String[] args = new String[] {
            "--name",
            "JavaRestService",
            "--driver-memory",
            "1000M",
            "--jar",
            "/opt/apache-tomcat-8.0.28/webapps/scalatest-0.0.1-SNAPSHOT.jar",
            "--class",
            "ScalaTest.ScalaTest.ScalaTest",
            "--arg",
            bedroom,
            "--arg",
            bathroom,
            "--arg",
            area,
            "--arg",
            "yarn-cluster",
        };
        Configuration config = new Configuration();
        System.setProperty("SPARK_YARN_MODE", "true");
        SparkConf sparkConf = new SparkConf();
        ClientArguments cArgs = new ClientArguments(args, sparkConf);
        Client client = new Client(cArgs, config, sparkConf);
        client.run();
        return Response.status(200).entity(client).build();
    }
}
Any help will be appreciated.
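A hedged guess at the cause (an assumption; the diagnostics above only show the missing file): the YARN containers are being told to fetch spark-yarn_2.10-1.3.0.jar from the Tomcat webapp's local WEB-INF/lib path, which does not exist on the cluster nodes. One common workaround is to put the Spark assembly on HDFS and point the client at it before building the ClientArguments; the property value below (HDFS path and jar name) is only a placeholder for your setup:

SparkConf sparkConf = new SparkConf();
// Assumption: the Spark assembly jar has been copied to this HDFS location, so
// YARN containers fetch it from HDFS instead of the webapp's local lib folder.
sparkConf.set("spark.yarn.jar", "hdfs:///user/spark/share/lib/spark-assembly-1.3.0-hadoop2.4.0.jar");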

Load Data into HBase from outside the Client Node

Thanks in advance.
We are loading data into HBase using Java. It is pretty straightforward and works fine when we run the program on the client node (edge node). But we want to run this program remotely (outside the Hadoop cluster), from within our network, to load the data.
Is there anything required in terms of security on the Hadoop cluster to do this? When I run the program outside the cluster it hangs.
Please advise. I greatly appreciate your help.
Thanks
Here is the code:
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import com.dev.stp.cvsLoadEventConfig;
import com.google.protobuf.ServiceException;
public class LoadData {

    static String ZKHost;
    static String ZKPort;
    private static Configuration config = null;
    private static String tableName;

    public LoadData() {
        // Set application config
        LoadDataConfig conn = new LoadDataConfig();
        ZKHost = conn.getZKHost();
        ZKPort = conn.getZKPort();

        config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum", ZKHost);
        config.set("hbase.zookeeper.property.clientPort", ZKPort);
        config.set("zookeeper.znode.parent", "/hbase-unsecure");
        tableName = "E_DATA";
    }

    // Insert record; eventId is assumed to be supplied by the caller
    public void insertRecord(String eventId) {
        try {
            HTable table = new HTable(config, tableName);
            Put put = new Put(Bytes.toBytes(eventId));
            put.add(Bytes.toBytes("E_DETAILS"), Bytes.toBytes("E_NAME"), Bytes.toBytes("test data 1"));
            put.add(Bytes.toBytes("E_DETAILS"), Bytes.toBytes("E_TIMESTAMP"), Bytes.toBytes("test data 2"));
            table.put(put);
            table.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
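If it hangs only when running outside the cluster, two things are commonly needed on the remote client (assumptions, since the cluster setup is not shown): the client must be able to resolve and reach the ZooKeeper quorum and the region servers by the hostnames the cluster advertises, and on a Kerberos-secured cluster the client has to authenticate before any HBase call. A minimal sketch of the Kerberos part, with a placeholder principal and keytab (with "/hbase-unsecure" as the znode parent the cluster is probably unsecured, in which case hostname resolution and firewall access to ZooKeeper and the region servers are the usual culprits):

import org.apache.hadoop.security.UserGroupInformation;

// Only needed if the cluster is Kerberos-secured; principal and keytab are placeholders
config.set("hadoop.security.authentication", "kerberos");
config.set("hbase.security.authentication", "kerberos");
UserGroupInformation.setConfiguration(config);
UserGroupInformation.loginUserFromKeytab("loaduser@EXAMPLE.COM", "/etc/security/keytabs/loaduser.keytab");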

How to include an image in a .jar file for exporting to .exe

I made this code:
import java.awt.Image;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import javax.imageio.ImageIO;
import javax.swing.ImageIcon;
import javax.swing.JFrame;
import javax.swing.JLabel;
public class ExplaneImage {

    public static void main(String[] args) {
        Image image = null;
        try {
            File sourceimage = new File("C:\\...\\ExplaneImage\\im1.jpg");
            image = ImageIO.read(sourceimage);
        } catch (IOException e) {
            // Print the error instead of silently swallowing it
            e.printStackTrace();
        }

        JFrame frame = new JFrame();
        frame.setSize(496, 325);
        JLabel label = new JLabel(new ImageIcon(image));
        frame.add(label);
        frame.setVisible(true);
    }
}
I need to export this first to a .jar file and then to an .exe file (I have found a program to convert it), but it doesn't work, because the export was made from a class and not from a package, and the picture is not included in the class.
I would like to make a simple image viewer exported to .exe format (to use on computers that do not have the JDK or JRE), self-contained, with the images included.
Thanks a lot for your help
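A minimal sketch of the usual approach (assumptions: the image is copied into the project's resource folder, e.g. src/main/resources, so it gets packaged into the .jar; the class name here is only illustrative): load the image from the classpath instead of an absolute file path, so the .jar, and any .exe wrapped around it, carries the image with it.

import java.awt.image.BufferedImage;
import java.io.IOException;
import javax.imageio.ImageIO;
import javax.swing.ImageIcon;
import javax.swing.JFrame;
import javax.swing.JLabel;

public class ExplaneImageFromJar {
    public static void main(String[] args) throws IOException {
        // "/im1.jpg" must be on the classpath root, i.e. packaged at the top level of the jar
        BufferedImage image = ImageIO.read(ExplaneImageFromJar.class.getResource("/im1.jpg"));

        JFrame frame = new JFrame();
        frame.setSize(496, 325);
        frame.add(new JLabel(new ImageIcon(image)));
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }
}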
