I have a problem with multiple sequence alignment. I have the following two sequences, and when I try to align them using BioJava methods I get the error shown below. I have no idea what is wrong. I know the sequences are not the same length, but that should not matter.
GSKTGTKITFYEDKNFQGRRYDCDCDCADFHTYLSRCNSIKVEGGTWAVYERPNFAGYMYILPQGEYPEYQRWMGLNDRLSSCRAVHLPSGGQYKIQIFEKGDFSGQMYETTEDCPSIMEQFHMREIHSCKVLEGVWIFYELPNYRGRQYLLDKKEYRKPIDWGAASPAVQSFRRIVE
SMSAGPWKMVVWDEDGFQGRRHEFTAECPSVLELGFETVRSLKVLSGAWVGFEHAGFQGQQYILERGEYPSWDAWGGNTAYPAERLTSFRPAACANHRDSRLTIFEQENFLGKKGELSDDYPSLQAMGWEGNEVGSFHVHSGAWVCSQFPGYRGFQYVLECDHHSGDYKHFREWGSHAPTFQVQSIRRIQQ
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at org.forester.evoinference.distance.NeighborJoining.getValueFromD(NeighborJoining.java:150)
    at org.forester.evoinference.distance.NeighborJoining.execute(NeighborJoining.java:123)
    at org.biojava3.alignment.GuideTree.<init>(GuideTree.java:88)
    at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:183)
    at Fasta.main(Fasta.java:41)
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;

import org.biojava3.alignment.Alignments;
import org.biojava3.alignment.template.Profile;
import org.biojava3.core.sequence.ProteinSequence;
import org.biojava3.core.sequence.compound.AminoAcidCompound;
import org.biojava3.core.sequence.io.FastaReaderHelper;
import org.biojava3.core.util.ConcurrencyTools;

public class Fasta {
    public static void main(String[] args) throws Exception {
        ArrayList<String> fileName = new ArrayList<String>();
        fileName.add("2M3T.fasta.txt");
        fileName.add("3LWK.fasta.txt");

        ArrayList<ProteinSequence> al = new ArrayList<ProteinSequence>();
        //ArrayList<ProteinSequence> all = new ArrayList<ProteinSequence>();
        for (String fn : fileName)
        {
            al = getProteinSequenceFromFasta(fn);
            //all.add(al.get(0));
            for (ProteinSequence s : al)
            {
                System.out.println(s);
            }
        }

        Profile<ProteinSequence, AminoAcidCompound> profile = Alignments.getMultipleSequenceAlignment(al);
        System.out.printf("Clustalw:%n%s%n", profile);
        ConcurrencyTools.shutdown();
    }

    //for (int i=0;i<sequence.size();i++)
    //    System.out.println(sequence);

    public static ArrayList<ProteinSequence> getProteinSequenceFromFasta(String file) throws Exception {
        LinkedHashMap<String, ProteinSequence> a = FastaReaderHelper.readFastaProteinSequence(new File(file));
        // artificial
        ArrayList<ProteinSequence> sequence = new ArrayList<ProteinSequence>(a.values());
        return sequence;
    }
}
My guess is that the problem is here:
for (String fn : fileName)
{
al = getProteinSequenceFromFasta(fn);
...
}
You are overwriting the contents of al for each file. (I assume you want to add all the FASTA records into al.) If each of your FASTA files has only one record, then it cannot do a multiple alignment on a single record.
You probably want
for (String fn : fileName)
{
al.addAll(getProteinSequenceFromFasta(fn) );
...
}
Granted, the library you are using should probably have checked first to make sure there is more than one sequence...
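Putting that together, a minimal sketch of the corrected loading code, using the same helper method from the question:

ArrayList<ProteinSequence> al = new ArrayList<ProteinSequence>();
for (String fn : fileName) {
    // accumulate records from every file instead of replacing the list
    al.addAll(getProteinSequenceFromFasta(fn));
}
// al now contains both sequences, so a guide tree can actually be built
Profile<ProteinSequence, AminoAcidCompound> profile =
        Alignments.getMultipleSequenceAlignment(al);
System.out.printf("Clustalw:%n%s%n", profile);
ConcurrencyTools.shutdown();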
I am very new to programming, as you can probably tell from my code! I am creating a program to edit uploaded pictures by creating a BufferedImage which is then modified, but I am having issues. My code is:
private Scanner sc;
private String inputPath = null;
public String getInputPath() throws Exception {
System.out.println("Please enter the path to the picture you wish to edit.");
sc = new Scanner(System.in);
inputPath = sc.next();
getFile(inputPath);
return inputPath;
}
private File getFile(String inputPath) throws Exception{
File f = new File(inputPath);
while (true) {
if (f.exists()){
bufferedImage(); //line 27
}else{
throw new Exception("Invalid file name!");
}
}
}
public void bufferedImage() throws Exception {
File f = getFile(inputPath); //line 35
BufferedImage bi = ImageIO.read(f);
System.out.println(bi);
return;
}
My error message is:
Exception in thread "main" java.lang.StackOverflowError:
at ie.gmit.dip.ImageWriter.getFile(ImageWriter.java:27)
at ie.gmit.dip.ImageWriter.bufferedImage(ImageWriter.java:35)
I have marked the lines in my code above. Any help on this would be massively appreciated!
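For what it's worth, the StackOverflowError in the trace comes from the cycle in the code above: getFile (line 27) calls bufferedImage, and bufferedImage (line 35) calls getFile again, with no way out. A rough sketch of one way to break the cycle, keeping the method names from the question, could look like this:

private File getFile(String inputPath) throws Exception {
    File f = new File(inputPath);
    // only validate and return the file; do not call bufferedImage() from here
    if (!f.exists()) {
        throw new Exception("Invalid file name!");
    }
    return f;
}

public void bufferedImage() throws Exception {
    File f = getFile(inputPath);        // obtain the file once
    BufferedImage bi = ImageIO.read(f); // then read it
    System.out.println(bi);
}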
I am trying to extract the precondition's expression as SWRL in order to do IOPE matchmaking of OWL-S web services.
Here's my code:
final OWLIndividualList<Condition> cs = service.getProcess().getConditions();
final ArrayList<ArrayList<URI>> conditions = new ArrayList<ArrayList<URI>>();
for (final Condition<?> c : cs){
if (c.canCastTo(Condition.SWRL.class)){
final Condition.SWRL sc = c.castTo(Condition.SWRL.class);
for (final Atom a : sc.getBody()){
a.accept(new AtomVisitor() {
public void visit(final IndividualPropertyAtom atom){
URI aux = null;
final ArrayList<URI> uris = new ArrayList<URI>();
URI a1 = aux.create((atom.getArgument1().getNamespace().toString()
+atom.getArgument1().toString()));
URI a2 = aux.create((atom.getArgument2().getNamespace().toString()
+atom.getArgument2().toString()));
URI p = aux.create(atom.getPropertyPredicate().toString());
uris.add(p);
uris.add(a1);
uris.add(a2);
conditions.add(uris);
}
public void visit(final DataPropertyAtom atom) { }
public void visit(final SameIndividualAtom atom) { }
public void visit(final DifferentIndividualsAtom atom) { }
public void visit(final ClassAtom atom) { }
public void visit(final BuiltinAtom atom) { }
});
}
}
}
I am getting a java.lang.NullPointerException on "final Atom a : sc.getBody()".
The OWL-S precondition statement:
<expr:SWRL-Condition rdf:ID="DifferentLocations">
<expr:expressionLanguage rdf:resource="http://www.daml.org/services/owl-s/1.2/generic/Expression.owl#SWRL"/>
<expr:expressionBody rdf:parseType="Literal">
<swrl:AtomList>
<rdf:first>
<swrl:DifferentIndividualsAtom>
<swrl:argument1 rdf:resource="#_GEOPOLITICAL-ENTITY"/>
<swrl:argument2 rdf:resource="#_GEOPOLITICAL-ENTITY1"/>
</swrl:DifferentIndividualsAtom>
</rdf:first>
<rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
</swrl:AtomList>
</expr:expressionBody>
</expr:SWRL-Condition>
Please I need help
This issue is not linked to the Java code but to the OWL-S file syntax. You can resolve it by replacing:
<expr:expressionBody rdf:parseType="Literal">
which holds the SWRL precondition (or possibly the result), with:
<expr:expressionObject>
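Applied to the precondition from the question, the corrected expression would then look roughly like this (assuming the closing tag is updated to match):

<expr:SWRL-Condition rdf:ID="DifferentLocations">
  <expr:expressionLanguage rdf:resource="http://www.daml.org/services/owl-s/1.2/generic/Expression.owl#SWRL"/>
  <expr:expressionObject>
    <swrl:AtomList>
      <rdf:first>
        <swrl:DifferentIndividualsAtom>
          <swrl:argument1 rdf:resource="#_GEOPOLITICAL-ENTITY"/>
          <swrl:argument2 rdf:resource="#_GEOPOLITICAL-ENTITY1"/>
        </swrl:DifferentIndividualsAtom>
      </rdf:first>
      <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
    </swrl:AtomList>
  </expr:expressionObject>
</expr:SWRL-Condition>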
I have configured a job as follows: it reads from the database and writes to files, partitioning the data on the basis of a sequence column.
//Job Config
@Bean
public Job job(JobBuilderFactory jobBuilderFactory) throws Exception {
Flow masterFlow1 = (Flow) new FlowBuilder<Object>("masterFlow1").start(masterStep()).build();
return (jobBuilderFactory.get("Partition-Job")
.incrementer(new RunIdIncrementer())
.start(masterFlow1)
.build()).build();
}
@Bean
public Step masterStep() throws Exception
{
return stepBuilderFactory.get(MASTERPPREPAREDATA)
//.listener(customSEL)
.partitioner(STEPPREPAREDATA,new DBPartitioner())
.step(prepareDataForS1())
.gridSize(gridSize)
.taskExecutor(new SimpleAsyncTaskExecutor("Thread"))
.build();
}
@Bean
public Step prepareDataForS1() throws Exception
{
return stepBuilderFactory.get(STEPPREPAREDATA)
//.listener(customSEL)
.<InputData,InputData>chunk(chunkSize)
.reader(JDBCItemReader(0,0))
.writer(writer(null))
.build();
}
@Bean(destroyMethod="")
@StepScope
public JdbcCursorItemReader<InputData> JDBCItemReader(@Value("#{stepExecutionContext[startingIndex]}") int startingIndex,
@Value("#{stepExecutionContext[endingIndex]}") int endingIndex)
{
JdbcCursorItemReader<InputData> ir = new JdbcCursorItemReader<>();
ir.setDataSource(batchDataSource);
ir.setMaxItemCount(DBPartitioner.partitionSize);
ir.setSaveState(false);
ir.setRowMapper(new InputDataRowMapper());
ir.setSql("SELECT * FROM FIF_INPUT fi WHERE fi.SEQ > ? AND fi.SEQ < ?");
ir.setPreparedStatementSetter(new PreparedStatementSetter() {
@Override
public void setValues(PreparedStatement ps) throws SQLException {
ps.setInt(1, startingIndex);
ps.setInt(2, endingIndex);
}
});
return ir;
}
@Bean
@StepScope
public FlatFileItemWriter<InputData> writer(@Value("#{stepExecutionContext[index]}") String index)
{
System.out.println("writer initialized!!!!!!!!!!!!!"+index);
//Create writer instance
FlatFileItemWriter<InputData> writer = new FlatFileItemWriter<>();
//Set output file location
writer.setResource(new FileSystemResource(batchDirectory+relativeInputDirectory+index+inputFileForS1));
//All job repetitions should "append" to same output file
writer.setAppendAllowed(false);
//Name field values sequence based on object properties
writer.setLineAggregator(customLineAggregator);
return writer;
}
The Partitioner used to partition the database is written separately in another file, as follows:
//PartitionDb.java
public class DBPartitioner implements Partitioner{
public static int partitionSize;
private static Log log = LogFactory.getLog(DBPartitioner.class);
@SuppressWarnings("unchecked")
@Override
public Map<String, ExecutionContext> partition(int gridSize) {
log.debug("START: Partition"+"grid size:"+gridSize);
@SuppressWarnings("rawtypes")
Map partitionMap = new HashMap<>();
int startingIndex = -1;
int endSize = partitionSize+1;
for(int i=0; i< gridSize; i++){
ExecutionContext ctxMap = new ExecutionContext();
ctxMap.putInt("startingIndex",startingIndex);
ctxMap.putInt("endingIndex", endSize);
ctxMap.put("index", i);
startingIndex = endSize-1;
endSize += partitionSize;
partitionMap.put("Thread:-"+i, ctxMap);
}
log.debug("END: Created Partitions of size: "+ partitionMap.size());
return partitionMap;
}
}
This executes properly, but the problem is that even after partitioning on the basis of the sequence, I am getting the same rows in multiple files, which is not right since I am providing a different set of data for each partition. Can anyone tell me what's wrong? I am using HikariCP for DB connection pooling and Spring Batch 4.
This executes properly, but the problem is that even after partitioning on the basis of the sequence, I am getting the same rows in multiple files, which is not right since I am providing a different set of data for each partition.
I'm not sure your partitioner is working properly. A quick test shows that it is not providing different sets of data as you are claiming:
DBPartitioner dbPartitioner = new DBPartitioner();
Map<String, ExecutionContext> partition = dbPartitioner.partition(5);
for (String s : partition.keySet()) {
System.out.println(s + " : " + partition.get(s));
}
This prints:
Thread:-0 : {endingIndex=1, index=0, startingIndex=-1}
Thread:-1 : {endingIndex=1, index=1, startingIndex=0}
Thread:-2 : {endingIndex=1, index=2, startingIndex=0}
Thread:-3 : {endingIndex=1, index=3, startingIndex=0}
Thread:-4 : {endingIndex=1, index=4, startingIndex=0}
As you can see, almost all partitions will have the same startingIndex and endingIndex.
I recommend you unit test your partitioner before using it in a partitioned step.
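For illustration, here is a minimal partitioner sketch that produces non-overlapping ranges. The class name RangeDBPartitioner and the totalRows parameter are assumptions made for the example, and each range is widened by one on both ends because the reader's query uses exclusive bounds (SEQ > ? AND SEQ < ?):

import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

public class RangeDBPartitioner implements Partitioner {

    // total number of rows to split across partitions; a real job would
    // typically look this up from the table before running the step
    private final int totalRows;

    public RangeDBPartitioner(int totalRows) {
        this.totalRows = totalRows;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitionMap = new HashMap<>();
        int partitionSize = (int) Math.ceil((double) totalRows / gridSize);
        for (int i = 0; i < gridSize; i++) {
            ExecutionContext ctx = new ExecutionContext();
            // SEQ > startingIndex AND SEQ < endingIndex covers rows
            // i*partitionSize+1 .. (i+1)*partitionSize, with no overlap between partitions
            ctx.putInt("startingIndex", i * partitionSize);
            ctx.putInt("endingIndex", (i + 1) * partitionSize + 1);
            ctx.putInt("index", i);
            partitionMap.put("partition-" + i, ctx);
        }
        return partitionMap;
    }
}

A quick check like the one above (printing the map for a given gridSize) is still worth doing before wiring any partitioner into the step.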
I recently asked the question "How can I pass a proper method reference in so Nashorn can execute it?" and got an answer that helped me get much further along with my project, but I discovered a limitation around providing a custom JSObject implementation that I don't know how to resolve.
Given this simple working JSObject that can handle most of the methods JS would invoke on it, such as map:
import javax.script.*;
import jdk.nashorn.api.scripting.*;
import java.util.*;
import java.util.function.*;
public class scratch_6 {
public static void main(String[] args) throws Exception {
ScriptEngineManager m = new ScriptEngineManager();
ScriptEngine e = m.getEngineByName("nashorn");
// The following JSObject wraps this list
List<Object> l = new ArrayList<>();
l.add("hello");
l.add("world");
l.add(true);
l.add(1);
JSObject jsObj = new AbstractJSObject() {
@Override
public Object getMember(String name) {
if (name.equals("map")) {
// return a functional interface object - nashorn will treat it like
// script function!
final Function<JSObject, Object> jsObjectObjectFunction = callback -> {
List<Object> res = new ArrayList<>();
for (Object obj : l) {
// call callback on each object and add the result to new list
res.add(callback.call(null, obj));
}
// return fresh list as result of map (or this could be another wrapper)
return res;
};
return jsObjectObjectFunction;
} else {
// unknown property
return null;
}
}
};
e.put("obj", jsObj);
// map each String to it's uppercase and print result of map
e.eval("print(obj.map(function(x) '\"'+x.toString()+'\"'))");
//PROBLEM
//e.eval("print(Object.keys(obj))");
}
}
If you uncomment the last line where Object.keys(obj) is called, it will fail with the error ... is not an Object.
This appears to be because Object.keys() [ NativeObject.java:376 ] only checks whether the object is an instance of ScriptObject or of ScriptObjectMirror. If it is neither of those things, it throws the notAnObject error. :(
Ideally, user-implemented JSObject objects should be exactly equivalent to script objects. But user-implemented JSObjects are almost script objects, not quite. This is documented here: https://wiki.openjdk.java.net/display/Nashorn/Nashorn+jsr223+engine+notes
Object.keys is one such case where it breaks. However, if you just want for..in JavaScript iteration support for your objects, you can implement JSObject.keySet in your class.
Example code:
import javax.script.*;
import jdk.nashorn.api.scripting.*;
import java.util.*;
public class Main {
public static void main(String[] args) throws Exception {
ScriptEngineManager m = new ScriptEngineManager();
ScriptEngine e = m.getEngineByName("nashorn");
// This JSObject wraps the following Properties object
Properties props = System.getProperties();
JSObject jsObj = new AbstractJSObject() {
@Override
public Set<String> keySet() {
return props.stringPropertyNames();
}
@Override
public Object getMember(String name) {
return props.getProperty(name);
}
};
e.put("obj", jsObj);
e.eval("for (i in obj) print(i, ' = ', obj[i])");
}
}
I'm learning Spark for distributed systems. I ran this code and it worked. I know that it counts words in the input files, but I have trouble understanding how the methods are written and what the use of JavaRDD is.
public class JavaWordCount {
public static void main(String[] args) throws Exception {
System.out.print("le programme commence");
//String inputFile = "/mapr/demo.mapr.com/TestMapr/Input/alice.txt";
String inputFile = args[0];
String outputFile = args[1];
// Create a Java Spark Context.
System.out.print("le programme cree un java spark contect");
SparkConf conf = new SparkConf().setAppName("JavaWordCount");
JavaSparkContext sc = new JavaSparkContext(conf);
// Load our input data.
System.out.print("Context créeS");
JavaRDD<String> input = sc.textFile(inputFile);
// map/split each line to multiple words
System.out.print("le programme divise le document en multiple line");
JavaRDD<String> words = input.flatMap(
new FlatMapFunction<String, String>() {
@Override
public Iterable<String> call(String x) {
return Arrays.asList(x.split(" "));
}
}
);
System.out.print("Turn the words into (word, 1) pairse");
// Turn the words into (word, 1) pairs
JavaPairRDD<String, Integer> wordOnePairs = words.mapToPair(
new PairFunction<String, String, Integer>() {
@Override
public Tuple2<String, Integer> call(String x) {
return new Tuple2(x, 1);
}
}
);
System.out.print(" // reduce add the pairs by key to produce counts");
// reduce add the pairs by key to produce counts
JavaPairRDD<String, Integer> counts = wordOnePairs.reduceByKey(
new Function2<Integer, Integer, Integer>() {
@Override
public Integer call(Integer x, Integer y) {
return x + y;
}
}
);
System.out.print(" Save the word count back out to a text file, causing evaluation.");
// Save the word count back out to a text file, causing evaluation.
counts.saveAsTextFile(outputFile);
System.out.println(counts.collect());
sc.close();
}
}
As mentioned by PinoSan, this question is probably too generic, and you should be able to find your answer in any Spark getting-started guide or tutorial.
Let me point you to some interesting content:
Spark Quick Start Guide
Getting Started with Apache Spark, ebook
Introduction to Apache Spark with Examples and Use Cases
Disclaimer: I work for MapR, which is why the Spark resources above come from the MapR site.
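As a side note, seeing the same pipeline written with Java 8 lambdas may make the roles of JavaRDD and JavaPairRDD in the code above easier to follow. This is only a sketch against the Spark 1.x Java API used in the question (where flatMap expects an Iterable); the class name is made up for the example:

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class JavaWordCountLambda {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("JavaWordCountLambda");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // JavaRDD<String>: a distributed collection of the lines in the input file
        JavaRDD<String> lines = sc.textFile(args[0]);

        // flatMap: each line becomes zero or more words (same as the anonymous FlatMapFunction)
        JavaRDD<String> words = lines.flatMap(line -> Arrays.asList(line.split(" ")));

        // mapToPair: each word becomes a (word, 1) tuple, giving a JavaPairRDD
        JavaPairRDD<String, Integer> pairs = words.mapToPair(word -> new Tuple2<>(word, 1));

        // reduceByKey: sums the 1s that share the same word, producing the counts
        JavaPairRDD<String, Integer> counts = pairs.reduceByKey((a, b) -> a + b);

        // saving triggers the actual computation
        counts.saveAsTextFile(args[1]);
        sc.close();
    }
}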