How to extract CpG.bed from bismark alignment bam files - bam

I am doing WGBS analysis and want to get a CpG.bed from the alignment bam files.
I am able to convert bam to bed using bedtools. However, the bismark bam files contain alignments for all contexts (CpG, CpH, ...). How can I extract only the CpG alignments?

Related

convert WAV to TETRA format

I am trying to convert a wav file into a TETRA encoded file (https://en.wikipedia.org/wiki/Terrestrial_Trunked_Radio). TETRA is used by fire-fighters; it gives a radio-like voice.
I am trying to use the official TETRA codec source available at (https://www.etsi.org/deliver/etsi_en/300300_300399/30039502/01.03.01_60/), which compiles easily using the scripts available at https://github.com/sq5bpf/install-tetra-codec.
However, I am not able to figure out how to convert a wav file into tetra codec files using these files. I tried going through the documentation of the compiled files (ccoder, cdecoder, scoder, sdecoder).
I tried the following command -
tetra/bin/scoder input.wav serial_file synth_file
here serial_file and synth_file are the output files and have following documentation in the scoder.c file:
INPUT       : - Description : speech file to be analyzed
              - Format : binary file 16 bit-samples
                         240 samples per frame
serial_file : - Description : serial stream output file
              - Format : binary file 16 bit-samples
                         each 16 bit-sample represents one encoded bit
                         138 (= 1 + 137) bits per frame
synth_file  : - Description : local synthesis output file
              - Format : binary file 16 bit-samples
For an input file of size 13M, I obtained a serial_file of 8.0M and a synth_file of 16M. However, since the wav file is being converted into a walkie-talkie type signal, I expected the output file to be a lot smaller.
I want to clarify:
1. Did I use the correct command to convert an input wav file into a TETRA format output file?
2. What are serial_file and synth_file?
Thanks,
Ashish
synth_file holds internally computed output (the encoder's local synthesis); you can ignore it.
serial_file is the actual encoded data; you can feed it to sdecoder.
Traditionally, ETSI code uses a very inefficient output format: instead of packed bits, it writes a textual or expanded representation of those bits. That is, instead of a single bit it writes a 16-bit word in which every bit is 0 except the last one, which carries the information. I presume they do this to interface with their transmission corruption test tools.
So the real payload of the serial file is 16 times smaller than its size on disk: 8.0M / 16 = 0.5M. You have effectively converted 13 MB to about 0.5 MB, which is OK.
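The expanded format described above can be collapsed back into packed bits. The following Python sketch is only an illustration and is not part of the ETSI code: the function name pack_serial_words is hypothetical, and it assumes little-endian 16-bit words with the information in the least significant bit, which should be checked against a real serial_file before relying on it.

```python
import struct


def pack_serial_words(raw: bytes, byteorder: str = "<") -> bytes:
    """Pack data stored as one bit per 16-bit word into real bits.

    Assumptions (not confirmed by the ETSI documentation quoted above):
    - words are little-endian unless byteorder is changed to ">"
    - only the least significant bit of each word carries information
    Bits are packed most significant bit first; the final byte is
    zero-padded if the number of words is not a multiple of 8.
    """
    n_words = len(raw) // 2
    words = struct.unpack(f"{byteorder}{n_words}H", raw[: n_words * 2])
    out = bytearray((n_words + 7) // 8)
    for i, word in enumerate(words):
        if word & 1:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


if __name__ == "__main__":
    # Eight expanded words become a single packed byte.
    demo = struct.pack("<8H", 1, 0, 1, 0, 0, 0, 0, 1)
    print(pack_serial_words(demo).hex())
```

Under these assumptions, one frame of 138 words (276 bytes on disk) packs into 18 bytes, which matches the factor of roughly 16 mentioned above.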

unable to merge content in NIFI using merge content processor

I have several text files and I need to merge them into one single file using NiFi.
When I use the MergeContent processor, the data is simply appended with no separator between the files.
Example: if my text files contain data like this
file1: 2201
file2: 2202
then after using MergeContent in NiFi I get the output 22012202.
But I need the output as below:
2201
2202
2203
Can anyone help me with this? I need to do this in NiFi only.
Set Delimiter Strategy to Text and set the Demarcator to a newline by pressing Shift+Enter in the Demarcator value field.

How to print a pre formatted text in java

I have a string like this:
31/12/2002
HINDUSTHAN MOTORS LIMITED
THREE LACS SEVENTY SEVEN THOUSAND THREE HUNDRED
EIGHTY SEVEN ONLY
**377387.00
I want to print it on a pre-printed cheque.
How can I print this pre-formatted text in Java?
I have a text file that contains the pre-formatted text above.
My problem is that the file has an unusual extension (.kris).
When I open the file in a text editor, the formatted text becomes one single line.
Printing it as one single line is not acceptable.
I want to send the text above to the printer in the fixed format shown.
1. Download a blank cheque template in Word format (this is the toughest task, I would say, if you are searching for a free template).
2. Replace the labels in it with blank spaces (make sure the blank spaces occupy the same width as the labels).
3. In place of the values, put identifiers. For example, in your case, put **$amount in place of **377387.00 and $date in place of the date.
4. Save the Word document as a PDF.
5. Now, with iText in Java, replace the $ identifiers with your actual values (iText is a Java library for handling PDFs).
Print it and check whether it lines up properly on a blank cheque; if not, make minor changes to realign the fields.

read file name from text file in ksh and compress

I want to write a KornShell (ksh) script that reads file name prefixes from a text file (containing a list of prefixes) and, looping over them, compresses any matching file found in the directory.
Example: I will keep the prefix 'abcd' in the text file.
After reading it, I want to compress the files in the directory whose names match the pattern abcd###.YYYYMMDDXXXXXX.txt
I do not want to touch files with the same prefix but a different extension or pattern, such as abcd###.YYYYMMDDXXXXXX.dat or abcd###.YYYYMMDDXXXXXX.txt.Z. I want to compress only files matching abcd###.YYYYMMDDXXXXXX.txt.
How can I implement this in ksh?
Superficially, this should do:
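: ${COMPRESS:=xz}
while read -r prefix
do
    # Skip blank lines in the prefix list
    [ -n "$prefix" ] || continue
    for f in "$prefix"[0-9][0-9][0-9].[12][09][0-9][0-9][01][0-9][0-3][0-9]??????.txt
    do
        # An unmatched pattern stays literal, so compress only real files
        [ -f "$f" ] && $COMPRESS "$f"
    done
done < file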
Obviously, I'm having to make some guesses: that # means a digit, that YYYYMMDD is a date, and that X is any character (that's ? in the answer). If X is meant to be any upper-case alphabetic, or something else, adjust accordingly. The year rules will accept 19xx and 20xx (also 10xx and 29xx, but you're unlikely to have files dated like that); the month rules accept 00..19; the day rules accept 00..39. If you have to validate more, then you can't readily use a simple glob pattern.
I used xz as the compress program. I would not use the compress program for compression as it simply doesn't compare with gzip, let alone bzip2 or xz, etc.
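If stricter validation than the glob pattern is needed, one option is to check each candidate name separately. The sketch below uses Python rather than ksh, purely to illustrate the idea; make_matcher is a hypothetical helper, and it carries over the same guesses as above (# is a digit, YYYYMMDD is a real calendar date, X is any character).

```python
import re
from datetime import date


def make_matcher(prefix: str):
    """Return a predicate for names like <prefix>###.YYYYMMDDXXXXXX.txt.

    Assumptions: ### is three digits, YYYYMMDD must be a valid calendar
    date, and XXXXXX is any six characters.
    """
    pattern = re.compile(re.escape(prefix) + r"\d{3}\.(\d{8}).{6}\.txt")

    def matches(name: str) -> bool:
        m = pattern.fullmatch(name)
        if m is None:
            return False
        stamp = m.group(1)
        try:
            # date() rejects impossible values such as month 13 or 30 February
            date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:]))
        except ValueError:
            return False
        return True

    return matches


if __name__ == "__main__":
    is_match = make_matcher("abcd")
    for name in ("abcd123.20230115ABCDEF.txt", "abcd123.20230230ABCDEF.txt"):
        print(name, is_match(name))
```

Because fullmatch must consume the whole name, variants such as .dat or .txt.Z are rejected without any extra checks.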

How to read title and id from Blu-ray disc?

Is it somehow possible to fetch a Blu-ray Disc's id and title programmatically on Windows 7 and later?
If you can programmatically open the following files you'll probably get what you need:
/AACS/mcmf.xml - This file is the Managed Copy manifest file and will contain a 'contentID' attribute (in the mcmfManifest tag) that can be used to identify the disc. Typically it is a 32 hexadecimal digit string.
Sometimes there is also a /CERTIFICATE/id.bdmv file, which contains a 4-byte disc organization id (at byte offset 40) followed by a 16-byte disc id.
Sometimes there is metadata in the /BDMV/META/DL directory, in the XML file bdmt_eng.xml (replace eng with another 3-letter language code for other languages). For example, on the supplementary disc of The Dark Knight, this file contains:
<di:title><di:name>The Dark Knight Bonus Disc</di:name></di:title>
For .NET, the BDInfo library will parse the relevant disc structure.
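The three lookups described above can be sketched in a few lines. This is a minimal Python illustration, not a tested disc reader: the function names are hypothetical, the byte offsets and tag names are taken from the description above, and elements are matched by local name because the exact XML namespaces are not given here.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def local_name(tag: str) -> str:
    """Strip any '{namespace}' part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def parse_id_bdmv(data: bytes) -> Tuple[str, str]:
    """Return (organization id, disc id) as hex strings.

    Layout assumed from the description: 4 bytes at offset 40,
    immediately followed by 16 bytes.
    """
    if len(data) < 60:
        raise ValueError("id.bdmv data is shorter than 60 bytes")
    return data[40:44].hex(), data[44:60].hex()


def content_id(mcmf_xml: str) -> Optional[str]:
    """Return the contentID attribute of the mcmfManifest element, if any."""
    for elem in ET.fromstring(mcmf_xml).iter():
        if local_name(elem.tag) == "mcmfManifest":
            return elem.get("contentID")
    return None


def disc_title(bdmt_xml: str) -> Optional[str]:
    """Return the text of the first name element nested inside a title element."""
    for elem in ET.fromstring(bdmt_xml).iter():
        if local_name(elem.tag) == "title":
            for child in elem.iter():
                if local_name(child.tag) == "name" and child.text:
                    return child.text.strip()
    return None
```

On Windows the inputs would come from the disc's drive letter, for example reading D:\CERTIFICATE\id.bdmv in binary mode and D:\BDMV\META\DL\bdmt_eng.xml as text (the drive letter D: is only an example). Any of the three files may be absent on a given disc, so callers should handle missing files.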

Resources