How to generate a custom bed file to use for bedtools intersect? - bioinformatics

I have a custom reference genome, gene.fa and 18 bed files. I want to generate a bed file that contains a region of interest, 5100-5600 bp, as a single entry that I can use for intersection using bedtools intersect on my 18 bed files.
I was thinking of copying the region-of-interest sequence from the reference genome and aligning it to generate my bed file. The problem is that my reference genome is a trimer, so this sequence is repeated three times and there would be errors in the alignment.
Is there a better way to do this? Can you use bedtools intersect with a text file?
I am new to bioinformatics and sequencing so I may be overthinking this problem.

BED files are text files, so if you only have a small number of regions of interest and you know their coordinates, you can write the file with a text editor. See the BED file specification.
If you only have the sequence of your ROI, you can get the coordinates by aligning it to the genome e.g. by BLAST. If the sequence appears in the genome multiple times, it should not result in errors, but you need to know which alignment is to your true ROI or include them all in the BED file as separate entries.
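As a minimal sketch of writing such an entry by hand: BED uses a 0-based start and an exclusive end, so if 5100-5600 is meant as 1-based inclusive coordinates (an assumption), the start becomes 5099. The sequence name "gene" and the label "ROI" are placeholders; the first column has to match the header name in gene.fa.

```python
def bed_line(chrom, start_1based, end_1based, name="ROI"):
    """Format one BED entry from 1-based inclusive coordinates."""
    # BED start is 0-based, BED end is exclusive, so only the start shifts by one.
    return f"{chrom}\t{start_1based - 1}\t{end_1based}\t{name}"


# "gene" is a hypothetical sequence name; use the one from your FASTA header.
print(bed_line("gene", 5100, 5600))
```

Saving the printed line as roi.bed would let you run something like `bedtools intersect -a sample.bed -b roi.bed` once per sample file. For a repeated region, one line per copy can be added in the same way.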

Related

XSLT create complex SVG visualisation minimizing line crossings etc

This is no actual single coding problem, rather a problem of the right approach to a complex issue.
So, I have built a rather complex svg visualisation of my XML data using xslt. It looks like this:
(source: erksst.de)
This is just a small sample of the whole data. There are two or three rows. Each row could contain up to 160 yellow boxes.
(The yellow boxes are letter collections, the blue/grey boxes single letters, the lines represent their way of dissemination.)
It works well so far but I want to optimize it:
(1) minimize the number of line crossings
(2) minimize the number of lines crossing a blue/grey box
(3) minimize the number of lines running too close to another line.
To achieve this there are things to vary:
(a) The broadest row (in the sample it is the third) is fixed. It can't be moved. But the other two can be moved within the width of the broadest row, i.e. in my example the yellow box of the second row could be moved some 160 pixels to the right.
(b) Furthermore, in the two smaller rows the margin between the yellow boxes could be varied. In my example there is just one per line. But of course there could be more than one yellow box in the two smaller rows.
(c) The order of the yellow boxes within a row could be altered.
So there are many possible ways to realize this visualisation.
The problem is the performance time.
I have started with the line crossing problem by using a function which kind of pre-builds the visualisation and calculates the number of crossings.
The variation with the smallest number of crossings is actually built in the output.
The problem is the time it needs. The transformation with just 100 possibilities and my whole XML data took 90 seconds. That doesn't sound like much, but 100 variations are just a very small part of all theoretically possible options, and the visualisation should at some point in the future be built on the fly on a server for a user's selection of the data, so 90 seconds is simply far too much.
I have already reduced the visualisation template used by the line-crossing function to what is strictly necessary, leaving aside all captions and so on. That did help, but not as much as expected.
The lines are drawn as follows: First, all boxes are drawn keeping their id from the original data. Then I go back to my data, look where connections are and build the lines.
You could transform your XML into the DOT language (plain TXT format) by XSLT and process it by GraphViz. I solved some similar issue (although not so huge as yours seems to be) this way.
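To illustrate what the DOT output could look like, here is a small sketch (in Python rather than XSLT, purely for readability) that turns boxes and connections into DOT text. The ids, labels and the function name are hypothetical; the real ones would come from the XML data.

```python
def to_dot(boxes, links):
    """Build DOT text from {id: label} boxes and (source_id, target_id) links."""
    lines = ["digraph letters {", "  node [shape=box];"]
    for box_id, label in boxes.items():
        lines.append(f'  "{box_id}" [label="{label}"];')
    for source, target in links:
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample data: one collection and two letters.
boxes = {"c1": "Collection 1", "l1": "Letter 1", "l2": "Letter 2"}
links = [("l1", "c1"), ("l2", "c1")]
print(to_dot(boxes, links))
```

The resulting file can be rendered with `dot -Tsvg graph.dot -o graph.svg`. The `dot` layout engine tries to reduce edge crossings heuristically, so the search over arrangements would no longer have to happen inside the XSLT.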

Camera calibration patterns

I would like to know if there is a process to generate camera calibration patterns.
We can use Paint or any other graphics tool and set precise measurements, but then we need to hard-code the point positions or create a txt/xml file.
Is there software that exports the data to a file that we can load into our own software?
What about 3D targets like boxes and/or cubes? Is there a method to generate the correct data points?
Cheers.
For 2D targets such as checkerboards, I used to do it the way user469049 describes, which was quite time consuming. In the end I gave up and created a web tool that does all of the legwork:
https://calib.io/pages/camera-calibration-pattern-generator
I'm using inkscape:
http://dominoc925.blogspot.co.uk/2012/06/create-camera-calibration-chess-board.html
I usually create a PDF file for printing and also save the file as LaTeX with PSTricks extensions.
The tex file contains paths, so for a square it has a \moveto command to set the starting point and \lineto commands to set the following points.
In the dominoc925 example they define black and white squares, but I define only the black squares to avoid repeated points.
I have a simple file loader in my code to get the points: it just searches for the \moveto and \lineto commands and works out the points from there.
For 3D targets I treat each pattern as one view, because I don't have the tools to build a precise 3D target.
So instead of having different views of one pattern, as in the Matlab toolbox, I treat each detected pattern as a view.
In other words, if you have a 3D object, the target on each face is treated as an independent view.
There is probably a more professional way to do the job but this is my process :)
I hope this helps.
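For a plain checkerboard, the point positions can also be computed rather than read back from a drawing. A minimal sketch, assuming a planar board with equally sized squares and the origin at the first inner corner (the function name and the millimetre unit are illustrative choices):

```python
def checkerboard_corners(cols, rows, square_mm):
    """Inner-corner coordinates (x, y, z) of a flat checkerboard, row by row."""
    # cols and rows count inner corners, not squares; z is 0 for a planar target.
    return [(c * square_mm, r * square_mm, 0.0)
            for r in range(rows)
            for c in range(cols)]


# Example: a board with 9 x 6 inner corners and 25 mm squares.
points = checkerboard_corners(9, 6, 25.0)
for x, y, z in points[:3]:
    print(f"{x} {y} {z}")
```

Writing these lines to a text file gives the kind of point list a calibration program can load, as long as the ordering matches the order in which the corners are detected.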

2D Tile Map Generation from Tileset

I have been looking all over and have seen a lot of questions of this nature.
My problem is much simpler than generating a 3D world with heights and such.
I'd like to generate a 2D map in a limited space (15x15, 20x20 ...) based on a tileset.
Here is a random example of what a simple result could look like:
Is anyone aware of an algorithm which is capable of executing such a task?
First, you should create a spritesheet that will have all the tiles you need.
Then you should create a class 'Tile' that is able to render itself (tiles are of fixed size, so this will not be hard).
Then, you should create a level. I mean, you should create a description of your level. You can do it in text format, xml or you can generate it randomly. For example, your level:
GGGGGGGG
GGGGGGGG
GGGBBGGG
GGGGGGGG
Where one letter means one tile (G is for grass, B is for bridge).
Then, when you've done that, you should iterate through your level description and render it.
UPD. Sorry for misunderstanding your question. I use an excellent piece of software: "Tiled". It's open source and great! You can create levels by drag and drop, and when you are done you can export your level to xml, txt, json and other formats.
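The text-based level description can be sketched roughly as follows. This is only an illustration: the tile names, the uniform random choice and the function names are assumptions, and actual rendering is left out.

```python
import random

# Letter-to-tile mapping taken from the example level (G = grass, B = bridge).
TILES = {"G": "grass", "B": "bridge"}


def parse_level(text):
    """Turn a level description into a grid (list of rows) of tile names."""
    return [[TILES[letter] for letter in row] for row in text.split()]


def random_level(width, height, seed=None):
    """Generate a level description by picking a tile letter for every cell."""
    rng = random.Random(seed)
    letters = "".join(TILES)
    return ["".join(rng.choice(letters) for _ in range(width))
            for _ in range(height)]


level = "GGGGGGGG\nGGGGGGGG\nGGGBBGGG\nGGGGGGGG"
grid = parse_level(level)
print(grid[2][3])  # the tile at row 2, column 3 of the example level
```

A purely uniform choice will look noisy; constraining which tiles may sit next to each other is the usual next step, but that goes beyond this sketch.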

How to create Frustum in DXF format for Autocad?

I am trying to create a frustum in a DXF file using my text editor, but I have been unable to find a good way to do it. If anyone knows how and could point me to a solution, a DXF file or an example, that would be a great help. Thank you.
A frustum is a 3-dimensional figure. While the DXF file format does contain provisions for coordinates in the Z-direction (3rd dimension), you cannot use it to define a true solid model. There are 3DFACE and 3DSOLID entities available, but they do not give you solid model objects like the IGES or STEP file formats would.
That being said, you have a tremendously large task ahead of you to create a valid DXF file in a text editor. There is a huge amount of header information required in a DXF file for even the simplest shapes.
If you are serious about it, your best bet will be to start with a blank (no entities drawn) DXF file that has been created in a CAD package and use that as your base. Then you can add entities to it as needed. You will need the DXF file specification. I use the R2000 spec since it has all the needed functionality while still being backwards compatible with the largest number of systems.
If you are doing a conic frustum, you will need to investigate the ELLIPSE entity, and if you are doing a pyramidal frustum, you will need the LINE entity. Depending on the level of complexity you want and the version of the DXF spec that you work from, you can also use the SURFACE entity, but in a DXF file, this will just be represented as a series of lines forming a grid (polyface mesh).
You can also use the 3DFACE and 3DSOLID entities, but if you really want solid geometry, you're better off using a 3D file format as mentioned above.
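As a rough sketch of the LINE approach for a pyramidal frustum, the following generates the twelve edges of a square frustum and formats each as a LINE entity using the standard group codes (8 for layer, 10/20/30 for the start point, 11/21/31 for the end point). The function names and dimensions are illustrative. The output is only the entity text meant to be pasted into the ENTITIES section of a base file; newer DXF versions may additionally expect handles and subclass markers, which are omitted here.

```python
def line_entity(start, end, layer="0"):
    """Format one DXF LINE entity as alternating group-code/value lines."""
    codes = [(0, "LINE"), (8, layer),
             (10, start[0]), (20, start[1]), (30, start[2]),
             (11, end[0]), (21, end[1]), (31, end[2])]
    return "".join(f"{code}\n{value}\n" for code, value in codes)


def pyramidal_frustum_edges(base_side, top_side, height):
    """Return the 12 edges of a square frustum centred on the Z axis."""
    def square(side, z):
        h = side / 2
        return [(-h, -h, z), (h, -h, z), (h, h, z), (-h, h, z)]

    bottom, top = square(base_side, 0.0), square(top_side, height)
    edges = []
    for i in range(4):
        j = (i + 1) % 4
        edges.append((bottom[i], bottom[j]))  # base outline
        edges.append((top[i], top[j]))        # top outline
        edges.append((bottom[i], top[i]))     # slanted side edge
    return edges


entities = "".join(line_entity(a, b)
                   for a, b in pyramidal_frustum_edges(4.0, 2.0, 3.0))
print(entities)
```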

How to visualize the contents of any file as an image?

I would like to print (on paper) the contents of any file, so that someone can recreate the original file from the scanned image.
Think of it as storing a file on paper.
One solution is to make a 2D barcode by printing the binary contents of the file (1 as black squares, 0 as white squares).
I don't want to reinvent the wheel. If there is any (open) standard for this, I would be grateful to hear about it.
What if you get the content of the file and then do a base64 encode on it. Then, the resulting code can be used to print the contents of the file on paper. Finally you can scan the paper, do some OCR on the scanned image, reverse the base64 encoding and you will end up with the binary form of the file.
I'd look into QR Codes. Unfortunately, they max out at about 3 kilobytes each, but you could simply print a page with many of them, in order of how your file is appended. I'd imagine that you could fit maybe 20 kilobytes on a page if you had a good printer and scanner. I'd also suggest compressing the data first to save space.
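The compress-then-split step can be sketched with the standard library alone. The chunk size of 2000 bytes is an assumption chosen to stay below the roughly 3 kilobyte limit mentioned above, and turning each chunk into an actual QR code would need a separate library that is not shown here.

```python
import zlib


def to_chunks(data, chunk_size=2000):
    """Compress data and split it into pieces small enough for one code each."""
    packed = zlib.compress(data, 9)
    return [packed[i:i + chunk_size] for i in range(0, len(packed), chunk_size)]


def from_chunks(chunks):
    """Rejoin the pieces in their original order and decompress them."""
    return zlib.decompress(b"".join(chunks))


original = b"example file contents " * 500
chunks = to_chunks(original)
print(len(chunks), from_chunks(chunks) == original)
```

Because the chunks must be rejoined in order, printing a sequence number next to each code makes the scanned pages easier to reassemble.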
