At the moment I am working on some performance measurements in Vulkan. I want to measure the difference between uncompressed formats such as VK_FORMAT_R32_SFLOAT and compressed formats such as VK_FORMAT_BC6H_UFLOAT_BLOCK. Is there a built-in feature in Vulkan that allows switching between formats at runtime?
Since the data is created at runtime, it is unfortunately not an option to compress the data offline. I also know that I could implement the compression myself, but BC6 is so complex that I would like to avoid it if possible.
If Vulkan does not support this feature, is there some C++ lib that I could use instead?
Vulkan does not have built-in on-the-fly image compression. According to a quick Google search, the DirectXTex library seems like it should do what you want.
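For a rough idea of what to expect before measuring anything, the memory footprint of both formats can be worked out by hand. The sketch below assumes a single mip level and no row padding; the function names are made up for illustration. VK_FORMAT_R32_SFLOAT uses 4 bytes per texel, while BC6H stores each 4x4 block of texels in 16 bytes. Note that the two formats do not hold the same data (one 32-bit channel versus three half-float channels), so this compares sizes only, not quality.

```python
import math

def uncompressed_size(width, height, bytes_per_texel):
    """Size in bytes of one tightly packed mip level."""
    return width * height * bytes_per_texel

def bc_block_size(width, height, bytes_per_block=16, block_dim=4):
    """Size in bytes of one mip level of a 4x4 block-compressed format.

    Dimensions are rounded up to whole blocks, so a 5x5 image needs 2x2 blocks.
    """
    blocks_x = math.ceil(width / block_dim)
    blocks_y = math.ceil(height / block_dim)
    return blocks_x * blocks_y * bytes_per_block

if __name__ == "__main__":
    w, h = 1024, 1024
    r32 = uncompressed_size(w, h, 4)   # VK_FORMAT_R32_SFLOAT: 4 bytes per texel
    bc6h = bc_block_size(w, h)         # VK_FORMAT_BC6H_UFLOAT_BLOCK: 16 bytes per block
    print(f"R32_SFLOAT: {r32} bytes, BC6H: {bc6h} bytes, ratio: {r32 / bc6h:.1f}x")
```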
Could anybody give me pointers on how to process Switchboard dataset for training with RETURNN? I did see BlissDataset class that seems to be designed for switchboard, but it's not clear to me what I should include in the paths given in the example:
Example:
./tools/dump-dataset.py "
{'class':'BlissDataset',
'path': '/u/tuske/work/ASR/switchboard/corpus/xml/train.corpus.gz',
'bpe_file': '/u/zeyer/setups/switchboard/subwords/swb-bpe-codes',
'vocab_file': '/u/zeyer/setups/switchboard/subwords/swb-vocab'}"
The switchboard dataset has several folders with audios, i.e. swb1_d2/data/*.sph and transcripts swb1_LDC97S62/swb_ms98_transcriptions/**/*
I'm not quite sure how to proceed with this to get a dataset that can be used to train RETURNN.
At our group (RWTH Aachen University), we use the config as it was published on GitHub. As you can see, this one uses ExternSprintDataset.
The implementation uses Sprint (publicly called RWTH ASR (RASR), see here) as an external tool (run in a subprocess) to handle the data (feature extraction, etc.). Sprint gets a Bliss XML file which describes all the segments with the path to the audio, the audio offsets, and the transcriptions, and it also gets further configs for the feature extraction and maybe other things. There is an open source version of RASR which should work, but it might be a bit involved to get it running.
The BlissDataset was planned to be a simpler replacement for that. However, the implementation is incomplete. Also, you still would need to generate the Bliss XML by yourself in some way (we have used some own internal scripts to prepare that based on the official LDC data).
So, unfortunately, there is no simple way yet. Actually, I think the easiest way would be to come up with yet another custom format, which might be similar to the LibriSpeechDataset implementation, or maybe just the same, and then you could just reuse LibriSpeechDataset, or at least parts of it. That dataset implementation takes the data in a zip format which contains the transcripts in txt files and the audio in ogg or wav files. It uses librosa to do MFCC feature extraction (or other feature types). I planned to implement that for Switchboard and then reproduce the results; however, I have not had time yet and I am not sure when I will get to it. But if you want to try that on your own, I will be happy to help you however I can. The starting point would be to look at LibriSpeechDataset and understand what its format looks like.
It seems that CouchDB automatically compresses all its _attachments when they are requested with the correct header. Unfortunately, this doesn't happen for views, shows, or lists.
Is there any way to achieve a compression before returning the result to the client?
Is using a third party library like deflatejs (didn't test it yet) a bad approach?
Thanks
You can certainly use js-deflate in show and list functions, but you cannot do it in view functions. I also suspect it would be inefficient (just a guess, test it if you want numbers).
Until CouchDB supports gzip encoding, the easiest solution is to put a reverse proxy in front of CouchDB to do the compression. For example, you can use nginx with the HttpGzipModule.
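If you want numbers before setting up a proxy, you can estimate the gain offline. The following is only a sketch: the rows are made up to resemble a view response, and the default gzip level 6 is an assumption about what a proxy would use. Dump one of your real view results to a file and feed it in instead for a meaningful figure.

```python
import gzip
import json

def gzip_savings(payload: bytes, level: int = 6):
    """Return (raw_size, compressed_size) for a response body."""
    compressed = gzip.compress(payload, compresslevel=level)
    return len(payload), len(compressed)

# Hypothetical view result; replace with a real response body.
rows = [
    {"id": f"doc-{i}", "key": ["user", i % 10], "value": {"count": i}}
    for i in range(1000)
]
body = json.dumps({"total_rows": len(rows), "offset": 0, "rows": rows}).encode("utf-8")

raw, packed = gzip_savings(body)
print(f"raw: {raw} bytes, gzipped: {packed} bytes ({100 * packed / raw:.1f}% of original)")
```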
The Couchbase distribution of CouchDB (Couchbase Single Server) supports Google's snappy compression for the JSON files on disk. I believe the same goes for the views, but I'll have to defer to someone better qualified.
On one of the applications that I am writing, I was asked to provide the feature for "pencil and eraser" to allow the user to doodle randomly on a document (for proofreading, note-taking, etc.)
What would be the best way to store such data?
I was thinking of using an image with transparency for each doodle (so that I can also support multiple colors of "doodles") but it seems like it will very quickly make any saved project with doodles grow large in file size.
I am wondering whether there is a better (existing) alternative (e.g. is there a DoodleXML spec out there?), or just looking for any suggestions.
I think the "DoodleXML" spec you're looking for might just be SVG. Simply save the doodles as a series of lines. You don't need a full SVG engine as long as you're only supporting the subset that you generate in the first place.
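As a rough sketch of what that could look like: assume each doodle stroke is recorded as a list of (x, y) points plus a color and a pen width (this stroke structure and the function name are my own invention, not part of any spec). Each stroke then becomes one SVG polyline element:

```python
from xml.sax.saxutils import quoteattr

def strokes_to_svg(strokes, width, height):
    """Serialize doodle strokes as a minimal SVG document.

    Each stroke is a dict with "points" (list of (x, y) tuples),
    "color" (any SVG color string) and "width" (pen width in pixels).
    """
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for stroke in strokes:
        points = " ".join(f"{x},{y}" for x, y in stroke["points"])
        parts.append(
            f'<polyline points="{points}" fill="none" '
            f'stroke={quoteattr(stroke["color"])} stroke-width="{stroke["width"]}"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)

if __name__ == "__main__":
    doodle = [
        {"points": [(10, 10), (40, 25), (80, 20)], "color": "#ff0000", "width": 2},
        {"points": [(15, 60), (70, 65)], "color": "blue", "width": 4},
    ]
    print(strokes_to_svg(doodle, 100, 100))
```

Erasing can then be handled by removing or splitting strokes rather than by painting over pixels, and the file size grows with the number of points instead of the canvas size.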
I am in the process of selecting an image format that will be used as the storage format for all in-house textures.
The format will be used as a source format from which compressed textures for different platforms and configurations will be generated, and so it needs to cover all possible texture types (2D, cube, volumetric, varying number of mip-maps, floating point pixel formats, etc.) and be completely lossless.
In addition the format has to be able to keep a bit of metadata.
Currently a custom format is used for this, but a commonly available format would be easier for the artists to work with, since it's viewable in most image editors.
I have thought of using DDS, but this format does not support metadata as far as I can see.
All suggestions appreciated!
With your requirements you should stay with your self-made format. I don't know of any image format besides DDS that supports volumetric and cube textures. Unfortunately, DDS does not support metadata.
The closest thing you can find is TIFF. It does not directly support cube-maps or volumetric textures, but it supports any number of sub-images. That way you could re-use the sub-images as slices or cube-sides.
TIFF also has very good support for custom metadata. The libtiff image reading/writing library works pretty well. It looks a bit archaic if you come from an OO background, but it gets its job done.
Nils
When peeking inside various games' resources, I found out that most of them store textures (I don't know whether they're compressed or not) in TGA.
TIFF would probably be your closest bet for a format which supports arbitrary meta-data and multiple frames, but I think you are better off keeping the assets (in this case, images) separate from how they are converted and utilized in your engine.
Keep images in 32 bit PNG format, and put type- and meta information in XML. That keeps your data human viewable, readable and editable. Obscure custom formats are for engines, not people.
Stick with whatever your artists work with.
If you are a Windows/Mac shop and use Photoshop, stick with .psd.
If you are a Unix shop and use GIMP, stick with .xcf.
These formats will store layers and all the stuff your artists need and are used to.
Since your artists will be creating loads of assets, make their life as easy as possible, even if it means writing some extra code.
Put the meta data (whatever it may be) somewhere "along" the images if the native format (psd/xcf) doesn't support it.
For stuff like cube maps and mipmaps (if not generated by the converter), stick to naming guidelines or guidelines on how to put them into one file.
Depending on what tool you use to create the volumetric stuff, just stick with that tool's native format.
While writing custom formats for the target is usually a good idea,
writing custom formats for artists results in mayhem...
My experience with DDS is that it is a poorly documented and difficult format to work with and offers few advantages. It is generally simpler to just store a master file for each image class that has references to the source images that make it up (i.e. 6 faces for a cube map, an arbitrary number of slices for a volume texture) as well as any other useful metadata. It's always going to be a good idea to keep the metadata in a separate file (or in a database), as you do not want to be loading large numbers of images when carrying out searches, populating browsers, etc. It also makes sense to separate your source image format (tiff, tga, jpeg, dds ...) from your "meta-format" (cube, volume ...), since you may well find that you need to use lossy compression to support HDR formats or very large source volume data.
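To make the "master file" idea concrete, here is a minimal sketch of such a file for a cube map, written as JSON. The field names, the face keys, and the file paths are all made up for illustration; use whatever layout fits your pipeline.

```python
import json

# Face order used in this sketch; purely a convention chosen here.
CUBE_FACES = ["+x", "-x", "+y", "-y", "+z", "-z"]

def make_cube_manifest(name, faces, metadata=None):
    """Build a master-file record that references six source images.

    `faces` maps each face key in CUBE_FACES to the path of a source image.
    The images themselves are never opened, so searching or browsing
    manifests stays cheap.
    """
    missing = [face for face in CUBE_FACES if face not in faces]
    if missing:
        raise ValueError(f"missing cube faces: {missing}")
    return {
        "name": name,
        "type": "cube",
        "faces": {face: faces[face] for face in CUBE_FACES},
        "metadata": metadata or {},
    }

if __name__ == "__main__":
    # Hypothetical source files, one per face.
    faces = {face: f"sky/sky_{i}.tif" for i, face in enumerate(CUBE_FACES)}
    manifest = make_cube_manifest("sky", faces, {"author": "artist", "hdr": True})
    print(json.dumps(manifest, indent=2))
```

A volume texture would look the same with a "slices" list in place of "faces".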
Have you tried PNG? http://java.sun.com/javase/6/docs/api/javax/imageio/metadata/doc-files/png_metadata.html
As an alternative solution, maybe spend some time writing a plugin for a free image editor for your file format? I've never done it before, so I don't know the work involved, but there are boatloads of example code out there for you.
I'm looking for an OSX (or Linux?) application that can receive data from a webcam/video input and let you do some image processing on the pixels in something similar to C, Python, or Perl. I'm not that bothered about the processing language.
I was considering throwing one together but figured I'd try and find one that exists already first before I start re-inventing the wheel.
Wanting to do some experiments with object detection and reading of dials and numbers.
If you're willing to do a little coding, you want to take a look at QTKit, the QuickTime framework for Cocoa. QTKit will let you easily set up an input source from the webcam (intro here). You can also apply Core Image filters to the stream (demo code here). If you want to use OpenGL to render or apply filters to the movie, check out Core Video (examples here).
Using the MyMovieFilter demo should get you up and running very quickly.
I found a cross-platform tool called 'Processing'. I actually ran the Windows version to avoid further complications getting the webcams to work.
I had to install QuickTime and something called gVid to get it to work, but after that initial hurdle the coding feels like C (I think it gets "compiled" into Java), and it runs quite fast, even when scanning pixels from the webcam in real time.
I still have to get it working on OSX.
Depending on what processing you want to do (i.e. if it's a filter that's available in Apple's Core Image filter library), the built-in Photo Booth app may be all you need. There's a commercial set of add-on filters available from the Apple store as well (http://www.apple.com/downloads/macosx/imaging_3d/composerfxeffectsforphotobooth.html).