Trying to align reference genome to STAR - rna-seq

I am trying to set up a reference genome on STAR 2.7.3a so that I can map the reads from my RNA-Seq to it.
Here's the code I use:
STAR --runThreadN 4 --runMode genomeGenerate --genomeDir /media/bigData/Valentin/Software/referenceGenomeforSTAR --genomeDir /media/bigData/Valentin/ReferenceGenome --genomeFastaFiles /media/bigData/Valentin/ReferenceGenome/Homo_sapiens.GRCh38.ncrna.fa /media/bigData/Valentin/ReferenceGenome/Homo_sapiens.GRCh38.dna.primary_assembly.fa --sjdbGTFfile /media/bigData/Valentin/ReferenceGenome --sjbdOverhang 100
EXITING because of fatal input
ERROR: could not open readFilesIn=Read1
Jan 24 15:44:39 ...... FATAL ERROR, exiting
Does anyone know what readFilesIn=Read1 means?
Thank you
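One possible reading of the error, offered as an assumption rather than a confirmed diagnosis: Read1 is STAR's default value for --readFilesIn, so the message suggests STAR started in its default read-alignment mode and never received --runMode genomeGenerate as a separate argument (for example, when a command is split across lines without a trailing backslash, or when options run together). The Python sketch below builds the command as a list of separate tokens. The annotation file name is hypothetical; --sjdbGTFfile expects a GTF file rather than a directory, and the overhang option is spelled --sjdbOverhang.

```python
import shlex


def star_genome_generate_cmd(genome_dir, fasta_files, gtf_file, overhang=100, threads=4):
    """Return the STAR genomeGenerate command as a list of separate tokens."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,          # output directory for the index
        "--genomeFastaFiles", *fasta_files,
        "--sjdbGTFfile", gtf_file,          # must be a file, not a directory
        "--sjdbOverhang", str(overhang),
    ]


cmd = star_genome_generate_cmd(
    genome_dir="/media/bigData/Valentin/Software/referenceGenomeforSTAR",
    fasta_files=[
        "/media/bigData/Valentin/ReferenceGenome/Homo_sapiens.GRCh38.dna.primary_assembly.fa",
    ],
    # Hypothetical file name: the original post does not name the GTF file.
    gtf_file="/media/bigData/Valentin/ReferenceGenome/annotation.gtf",
)

# Print a copy-pasteable shell command; run it with subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```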

How to generate flamegraphs from macOS process samples?

Does anyone have a clean process for converting samples on macOS to FlameGraphs?
After a bit of fiddling I tried a tool called flamegraph-sample, but it gives me the error below, so I wonder whether there are more up-to-date options that I'm missing:
$ sudo sample PID -file ~/tmp/sample.txt -fullPaths 1
Sampling process 198 for 1 second with 1 millisecond of run time between samples
Sampling completed, processing symbols...
Sample analysis of process 35264 written to file ~/tmp/sample.txt
$ python stackcollapse-sample.py ~/tmp/sample.txt > ~/tmp/sample_collapsed.txt
$ flamegraph.pl ~/tmp/sample_collapsed.txt > ~/tmp/sample_collapsed_flamegraph.svg
Ignored 2335 lines with invalid format
ERROR: No stack counts found
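For context, flamegraph.pl reads "folded" input: one line per stack, with frames separated by semicolons, followed by whitespace and a sample count. The "Ignored ... lines with invalid format" message suggests the collapsed file does not have that shape. The Python sketch below counts which lines would qualify; the regular expression is my approximation of the accepted format, not a copy of the script's own check, and the sample lines are made up.

```python
import re

# Approximation of a folded-stack line: "frame;frame;frame <count>".
FOLDED_LINE = re.compile(r"^(.+)\s+(\d+(?:\.\d*)?)$")


def check_folded(lines):
    """Return (number_of_valid_lines, [(line_number, text), ...] for invalid ones)."""
    valid = 0
    invalid = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if FOLDED_LINE.match(line):
            valid += 1
        else:
            invalid.append((number, line))
    return valid, invalid


# Made-up sample: one folded line and two lines that lack a trailing count.
sample = ["main;run_loop;do_work 42\n", "Call graph:\n", "main;run_loop\n"]
valid, invalid = check_folded(sample)
print(valid, [number for number, _ in invalid])
```

Running this over sample_collapsed.txt would show whether the collapse step produced any usable lines at all, which helps separate a problem in the collapsing script from a problem in flamegraph.pl.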

H-PoPG Haplotyper NullPointerException Error at algorithms.HBOP2Builder

Hello, I am building a genome assembly method, and a critical step of my pipeline is phasing. I've been looking at different methods and recently discovered H-PoPG, which looks promising for polyploid haplotyping. I tried to run it on my data but got the error below, and I couldn't find any help or forum for it on the web.
This is the command I am using:
java -jar H-PoPG.jar -p 3 -w 0.9 -f fragment_matrix_Chrm1 -vcf PilonChrm1.vcf -o output_phased_Chrm1
This is the error message:
Exception in thread "main" java.lang.NullPointerException
at algorithms.HBOP2Builder.<init>(HBOP2Builder.java:59)
at algorithms.HBOP2Builder.<init>(HBOP2Builder.java:25)
at algorithms.HPBOP2Alg.buildHaplotype(HPBOP2Alg.java:24)
at main.PolyPlotyping.Polyphasing(PolyPlotyping.java:224)
at main.PolyPlotyping.go(PolyPlotyping.java:159)
at main.PolyPlotyping.main(PolyPlotyping.java:280)
srun: error: neumann: task 0: Exited with exit code 1
Could anyone point me in the right direction by explaining where this error could come from?
Many Thanks
I have run your data and found it is OK to run the command without the vcf file.
The error messages occur when it is run with the vcf file.
The vcf file contains many overlaps such as:
Chromosome_1_Reference 16 . A . 1486 LowCov DP=39;TD=43;BQ=38;MQ=57;QD=38;BC=39,0,0,0;QP=100,0,0,0;PC=119;IC=0;DC=0;XC=0;AC=0;AF=0.00 GT 0/0
Chromosome_1_Reference 16 . AAACCC A . Amb;LowCov DP=56;TD=60;BQ=39;MQ=57;QD=25;BC=19,21,0,0;QP=48,52,0,0;PC=119;IC=0;DC=16;XC=1;AC=1;AF=0.29 GT 0/1
Chromosome_1_Reference 17 . A C 1018 Amb;LowCov DP=56;TD=60;BQ=39;MQ=57;QD=25;BC=19,21,0,0;QP=48,52,0,0;PC=119;IC=0;DC=16;XC=1;AC=1;AF=0.52 GT 0/1
Please check the vcf file and ensure that every SNP position is covered by only one line,
and that the last column of each line is a triploid genotype such as 0/0/1 or 0/1/1 when the ploidy is 3 (-p 3).
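The two checks described above can be sketched in Python. This is a minimal illustration, not part of H-PoPG: it assumes a single-sample VCF whose last column starts with the GT value, and the sample lines are the ones quoted above with the INFO column shortened.

```python
from collections import Counter


def check_vcf_lines(lines, ploidy=3):
    """Return (positions listed more than once, genotypes with the wrong allele count)."""
    seen = Counter()
    bad_genotypes = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.split()
        chrom, pos = fields[0], int(fields[1])
        genotype = fields[-1].split(":")[0]  # GT is assumed to come first
        seen[(chrom, pos)] += 1
        alleles = genotype.replace("|", "/").split("/")
        if len(alleles) != ploidy:
            bad_genotypes.append((chrom, pos, genotype))
    duplicated = sorted(key for key, count in seen.items() if count > 1)
    return duplicated, bad_genotypes


# Lines from the post, with the INFO column shortened for readability.
sample_lines = [
    "Chromosome_1_Reference 16 . A . 1486 LowCov DP=39 GT 0/0",
    "Chromosome_1_Reference 16 . AAACCC A . Amb;LowCov DP=56 GT 0/1",
    "Chromosome_1_Reference 17 . A C 1018 Amb;LowCov DP=56 GT 0/1",
]
duplicated, bad_genotypes = check_vcf_lines(sample_lines, ploidy=3)
print(duplicated)
print(bad_genotypes)
```

On these three lines the sketch reports position 16 as duplicated and flags all three genotypes, since each has two alleles rather than three. It does not detect a deletion that spans a later position (such as AAACCC at 16 overlapping 17); that would need an extra check on the REF length.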

Error creating LMDB database file in MATLAB for Caffe

I am trying to convert a dataset of images to LMDB format for use with Caffe, and I need to call the convert_imageset tool from inside Matlab to perform this conversion.
I am using Linux, and I have created a shell (.sh) script with the parameters needed to run the conversion. Here is an example of what my shell file looks like:
GLOG_logtostderr=1 /usr/local/caffe-master2/build/tools/convert_imageset -resize_height=256 -resize_width=256 images_folder data_split/train.txt data_split/dataCNN_train_lmdb
When I simply run my script from the terminal like this:
./example_shell.sh
it works without any problem.
But when I try to do it from Matlab using the system() function:
system('./example_shell.sh')
it seems unable to open or find my files, raising the following error for each image in train.txt:
I0917 18:15:13.637830 8605 convert_imageset.cpp:82] A total of 68175 images.
I0917 18:15:13.638947 8605 db.cpp:34] Opened lmdb data_split/dataCNN_train_lmdb
E0917 18:15:13.639143 8605 io.cpp:77] Could not open or find file ...
E0917 18:15:13.639143 8605 io.cpp:77] Could not open or find file ...
E0917 18:15:13.639143 8605 io.cpp:77] Could not open or find file ...
Here are some sample lines from train.txt file (do not mind the 0s, they are just dummy labels):
/media/user/HDD_2TB/Food_101_Dataset/images/beef_carpaccio/970563.jpg 0
/media/user/HDD_2TB/Food_101_Dataset/images/chocolate_mousse/1908117.jpg 0
/media/user/HDD_2TB/Food_101_Dataset/images/cup_cakes/632892.jpg 0
/media/user/HDD_2TB/Food_101_Dataset/images/garlic_bread/1498092.jpg 0
/media/user/HDD_2TB/Food_101_Dataset/images/ceviche/3115634.jpg 0
They are absolute paths, so there should be no problem.
Any ideas about what could be happening would be very helpful!
Thank you,
Marc
I have not been able to solve the specific problem with Matlab, but I have managed to do the following (weird) workaround by using .txt files for communication:
Call the main Matlab program from Python.
Check when Matlab needs to call the ./example_shell.sh script.
Python does the conversion calling ./example_shell.sh.
Matlab execution continues.
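The Python side of such a workaround might look roughly like the sketch below. Everything here is an assumption about how the .txt handshake could be arranged: the file names are hypothetical, Matlab is assumed to create the request file and then wait for the done file, and how Python launches Matlab is left out.

```python
import subprocess
import time
from pathlib import Path


def serve_request(request_file, done_file, command, poll_seconds=0.5, timeout=60.0):
    """Wait for request_file to appear, run command, then write done_file.

    Returns True if the command was run, False if the timeout expired first.
    """
    request_file, done_file = Path(request_file), Path(done_file)
    deadline = time.monotonic() + timeout
    while not request_file.exists():
        if time.monotonic() > deadline:
            return False
        time.sleep(poll_seconds)
    result = subprocess.run(command)               # the conversion runs outside Matlab
    request_file.unlink()                          # consume the request
    done_file.write_text(str(result.returncode))   # signal Matlab to continue
    return True


# Hypothetical usage, with file names chosen only for illustration:
# serve_request("need_conversion.txt", "conversion_done.txt", ["./example_shell.sh"])
```

Writing the exit code into the done file lets the Matlab side tell a failed conversion from a successful one before it continues.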

Read Matrix from a file to Octave

I'm trying to read a matrix from a file into Octave and then apply an SVD. As an example, take the following matrix in a text file called "teste.txt":
1 3 -2 3
3 5 1 5
-2 1 4 2
I'm trying to execute the following script in octave:
data = dlmread ("teste.txt", "\t",0,0);
svd(data)
However, I'm getting the following error, and I don't know exactly why:
/home/thiago/Documents/svd.oct: invalid ELF header
error: called from:
error: /home/thiago/Documents/svd.oct at line 2, column 1
Does anybody have any clue? I'm executing it on Ubuntu 14.04 and the file separator is a tab(\t).
Thank you very much in advance,
Thiago.
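A possible explanation, stated as an assumption since the thread has no answer: the error path shows the script saved as svd.oct. Octave uses the .oct extension for compiled, dynamically loaded functions, while plain-text scripts use .m, so a text file named svd.oct would be handed to the binary loader, which fits "invalid ELF header". The name also collides with the built-in svd function. The Python sketch below only illustrates the ELF check: a compiled Linux binary starts with the four magic bytes shown, and a text script does not.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"  # first four bytes of any ELF binary


def looks_like_elf(path):
    """Return True if the file at path starts with the ELF magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(4) == ELF_MAGIC


# Demonstration with a temporary text file holding the script from the post.
with tempfile.NamedTemporaryFile("w", suffix=".oct", delete=False) as handle:
    handle.write('data = dlmread ("teste.txt", "\\t",0,0);\nsvd(data)\n')
    script_path = handle.name

is_binary = looks_like_elf(script_path)
os.remove(script_path)
print(is_binary)
```

If this reading is right, saving the script under a different name with the .m extension (for example run_svd.m, a hypothetical name) would avoid both the loader error and the name clash.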

Ghostscript 'offending input'

When searching for an occurrence of text in a PostScript file, I receive the following error:
gsapi_run_string_continue returns -21
The API documentation specifies that return codes < 0 are "Error" but doesn't describe them any more specifically. The full console error output is below; the error occurs twice identically, so only one occurrence is shown here.
GPL Ghostscript 9.15 (2014-09-22)
Copyright (C) 2014 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Displaying DSC file C:/Users/c-toothm/Desktop/PRDFlow12_30_2014_050307/1230ouptut.ps
Displaying page 1
%%[ ProductName: GPL Ghostscript ]%%
%%[ LastPage ]%%
Extracting text using pstotext...
Ghostscript returns error code -21
--- Begin offending input ---
evice /pop , d
initmatrix [1 0 0 1 0 0] concat colspSet`
0.00 43.32 +
0.94 0.95 +S
(XSFT2200041.img) run
EPSFILE2200041 restore
;
0 0 0 sco 5 Lw N 4950 4742 M 4800 4742 I K
0 0 0 sco 5 Lw N 4950 4752 M 4800 4752 I K
0 0 0 sco 5 Lw N 4950 4762 M 4800 476
--- End offending input ---
gsapi_run_string_continue returns -21
[duplicate error redacted]
Our production output creates a giant .ps file every day and this error occurs in many, but not all, .ps files when searching for text. Randomly selected .ps files from the web do not throw the error, so this GS build seems OK - definitely a problem with my file.
What "offending input" is being referred to here and what can I do to address it?
I'd need to see the PostScript file to tell you exactly what is wrong, but 'evice' is not a PostScript operator and so that is likely the problem. Also, from ghostpdl/gs/psi/ierrors.h error code -21 is e_undefined which means the interpreter has encountered an undefined token, which is some confirmation that this is the problem.
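As a small illustration of reading such return codes, here is a Python lookup for a few of the codes around -21. The table is partial and transcribed from ierrors.h as I recall it, so treat the names as something to verify against the header shipped with your Ghostscript version.

```python
# Partial table of Ghostscript interpreter error codes (verify against ierrors.h).
GS_ERROR_NAMES = {
    -18: "syntaxerror",
    -19: "timeout",
    -20: "typecheck",
    -21: "undefined",
    -22: "undefinedfilename",
}


def describe_gs_error(code):
    """Return a short description for a negative Ghostscript return code."""
    name = GS_ERROR_NAMES.get(code)
    if name is None:
        return f"{code}: not in this partial table"
    return f"{code}: {name}"


print(describe_gs_error(-21))
```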
This could be because the file contains a 'typo' like that (perhaps it should be setpagedevice or something), or it could be because a filter is improperly terminated, or has insufficient data, and consumes extra bytes from the input stream, chewing up your program.
You should start by reproducing the error with the Ghostscript executable (you might also try the display device, to see whether the problem is related to pstotext). That will give you a command line which other people can then duplicate. With that and a copy of the offending file I can tell you exactly what's wrong; without them, there is not much hope.
Bear in mind that PostScript is an interpreted programming language, so it's pretty much impossible to tell you what's wrong with your program without seeing the code.
FWIW you might like to try the Ghostscript txtwrite device instead of pstotext; the device doesn't rely on tinkering with the language like pstotext does. pstotext is also really old (the last release is coming up on its 11th birthday) and unsupported.
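A minimal sketch of a txtwrite invocation, built in Python so it can be passed to subprocess. The executable name is an assumption (gs on Linux and macOS; on Windows the console build is typically gswin64c), and the input and output file names are placeholders.

```python
def txtwrite_cmd(ps_file, out_file, gs_binary="gs"):
    """Return a Ghostscript command that extracts text with the txtwrite device."""
    return [
        gs_binary,
        "-q",                        # suppress startup banner
        "-dNOPAUSE",                 # do not pause between pages
        "-dBATCH",                   # exit after processing the input
        "-sDEVICE=txtwrite",         # text extraction device
        f"-sOutputFile={out_file}",
        ps_file,
    ]


# Placeholder file names; run with subprocess.run(command, check=True).
command = txtwrite_cmd("input.ps", "output.txt")
print(" ".join(command))
```

Running the same file through the executable this way also gives a command line that others can reproduce, which is what the advice above asks for.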
