FFmpeg export specific pixels - ffmpeg

I'm looking for a way to extract a set of specific pixels from a video (and possibly store them in JSON).
So far I have found FFmpeg, which looks like it should do the heavy lifting if I can find the correct commands.
Alternatively, I could take the FFmpeg source and build my own project that just uses it to extract frame data, but I think/hope that's unnecessary.
So if it's possible, what commands could accomplish this?
Or perhaps there is a whole different approach I could take. Any help would be great!

After a solid day of research into various tools, I stumbled across Accord.NET, which is a machine learning SDK, and one of the big parts of machine learning is image processing.
Simply adding the 'Accord.Video.FFMPEG' NuGet package gives me access to their 'VideoFileReader.ReadVideoFrame', and combined with the standard .NET 'Bitmap.GetPixel' I now have a super simple little CLI that can output the pixel colour at a position of my choice.
http://accord-framework.net/
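For anyone who would rather stay with the plain ffmpeg command line, here is a minimal Python sketch of the same idea. It assumes `ffmpeg` is on the PATH; the function names and the JSON field names are my own, not part of any library. It seeks to a timestamp, converts the frame to RGB, crops it down to a single pixel, and reads the three raw bytes from stdout.

```python
import json
import subprocess


def build_ffmpeg_command(video_path, time_s, x, y):
    """Build an ffmpeg command that writes one RGB pixel to stdout."""
    return [
        "ffmpeg", "-v", "error",
        "-ss", str(time_s),        # seek to the timestamp (seconds)
        "-i", video_path,
        "-frames:v", "1",          # output a single frame
        # Convert to RGB first so the crop is not rounded to the chroma grid,
        # then keep a 1x1 region at (x, y): crop=width:height:x:y
        "-vf", f"format=rgb24,crop=1:1:{x}:{y}",
        "-f", "rawvideo",
        "-pix_fmt", "rgb24",       # 3 bytes per pixel: R, G, B
        "pipe:1",
    ]


def pixel_record(raw, time_s, x, y):
    """Turn the 3 raw bytes from ffmpeg into a JSON-friendly dict."""
    if len(raw) != 3:
        raise ValueError(f"expected 3 bytes, got {len(raw)}")
    r, g, b = raw
    return {"time": time_s, "x": x, "y": y, "r": r, "g": g, "b": b}


def read_pixel(video_path, time_s, x, y):
    """Run ffmpeg and return the pixel at (x, y) as a dict."""
    cmd = build_ffmpeg_command(video_path, time_s, x, y)
    raw = subprocess.run(cmd, check=True, capture_output=True).stdout
    return pixel_record(raw, time_s, x, y)
```

Calling `json.dumps(read_pixel("input.mp4", 1.5, 100, 50))` (with `input.mp4` standing in for a real file) would give one JSON object per sampled pixel. Spawning ffmpeg once per pixel is slow, so for many pixels it is probably better to dump whole frames and index into them.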


Returnn Switchboard data processing

Could anybody give me pointers on how to process Switchboard dataset for training with RETURNN? I did see BlissDataset class that seems to be designed for switchboard, but it's not clear to me what I should include in the paths given in the example:
Example:
./tools/dump-dataset.py "
{'class':'BlissDataset',
'path': '/u/tuske/work/ASR/switchboard/corpus/xml/train.corpus.gz',
'bpe_file': '/u/zeyer/setups/switchboard/subwords/swb-bpe-codes',
'vocab_file': '/u/zeyer/setups/switchboard/subwords/swb-vocab'}"
The switchboard dataset has several folders with audios, i.e. swb1_d2/data/*.sph and transcripts swb1_LDC97S62/swb_ms98_transcriptions/**/*
I'm not quite sure how to proceed with this to get a dataset that can be used to train RETURNN.
At our group (RWTH Aachen University), we use the config as it was published on GitHub. As you can see, that config uses ExternSprintDataset.
That dataset uses Sprint (publicly called RWTH ASR (RASR)) as an external tool, run in a subprocess, to handle the data (feature extraction, etc.). Sprint gets a Bliss XML file which describes all the segments with the path to the audio, the audio offsets, and the transcriptions; it also gets further configs for the feature extraction and possibly other things. There is an open source version of RASR which should work, but it might be a bit involved to set up.
The BlissDataset was planned to be a simpler replacement for that. However, the implementation is incomplete. Also, you would still need to generate the Bliss XML yourself in some way (we used our own internal scripts to prepare it from the official LDC data).
So, unfortunately, there is no simple way yet. Actually, I think the easiest way would be to come up with yet another custom format, which might be similar to what the LibriSpeechDataset implementation reads, or maybe just the same, and then you could reuse LibriSpeechDataset, or at least parts of it. That dataset implementation takes the data in a zip format which contains the transcripts in txt files and the audio in ogg or wav files. It uses librosa to do MFCC feature extraction (or other feature types). I planned to implement that for Switchboard and then reproduce the results, but I have not had time yet and am not sure when I will get to it. If you want to try that on your own, I will be happy to help however I can. The starting point would be to look at LibriSpeechDataset and understand what its format looks like.
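To make the "zip with txt transcripts and audio files" idea a bit more concrete, here is a rough sketch that packs utterances into a zip using the directory layout of the public LibriSpeech corpus (one `*.trans.txt` per directory, each line being an utterance id followed by the transcript, next to the audio files). Whether LibriSpeechDataset accepts exactly this layout is an assumption that should be checked against its source; the function name, the `subset` default and the id scheme are made up for illustration. It also assumes the Switchboard audio has already been cut into per-utterance files and converted from .sph.

```python
import zipfile
from collections import defaultdict


def write_librispeech_style_zip(zip_path, utterances, subset="train"):
    """Pack utterances into a zip with a LibriSpeech-like layout.

    utterances: iterable of (speaker, session, index, transcript, audio_path).
    The layout and id scheme are assumptions, not a documented RETURNN format.
    """
    by_dir = defaultdict(list)
    for speaker, session, index, transcript, audio_path in utterances:
        by_dir[(speaker, session)].append((index, transcript, audio_path))

    with zipfile.ZipFile(zip_path, "w") as zf:
        for (speaker, session), items in sorted(by_dir.items()):
            base = f"{subset}/{speaker}/{session}"
            lines = []
            for index, transcript, audio_path in sorted(items):
                utt_id = f"{speaker}-{session}-{index:04d}"
                ext = audio_path.rsplit(".", 1)[-1]
                # Store the audio under its utterance id, keeping the extension.
                zf.write(audio_path, f"{base}/{utt_id}.{ext}")
                lines.append(f"{utt_id} {transcript}")
            # One transcript file per directory: "<utt_id> <transcript>" per line.
            zf.writestr(f"{base}/{speaker}-{session}.trans.txt",
                        "\n".join(lines) + "\n")
```

The hard part that this sketch skips is segmenting the Switchboard conversations into utterances from the transcription files, which is what the Bliss XML otherwise encodes.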

Censor Plugin or Extension for VLC Media Player

I have an idea for a censor plugin/extension for VLC Player.
Problem scenario:
A one-minute adult scene in an otherwise nice movie makes it unwatchable with family.
My solution:
Create a plugin/extension which does the following:
Reads time positions from a file similar to a subtitle file
Skips these time positions (which are adult or inappropriate) during playback
Help I need:
I searched Google and the VideoLAN website, but couldn't find an exact solution.
Are there similar plugins available already?
Where should I start?
Please help if you can. Thanks.
I'm also looking for (or looking to develop) exactly this solution. This might be helpful to you:
http://code.google.com/p/movie-content-editor/
A similar thing is also available on github:
https://github.com/rdp/sensible-cinema
You may also want to read this discussion thread:
https://forum.videolan.org/viewtopic.php?t=89466
I found a great answer to a similar question:
If you chop random bytes out the movie is likely not playable. The player might crash or fail to resynchronize the stream – the video might just stop. Plus, you're gonna have a hard time figuring out where the "adult" bytes are, so to speak.
If you already know where the parts are that you want to cut out, I would edit the file in any of the numerous video editors. Even Windows Movie Maker or iMovie would do the job, and those are easily available on both major OSes.
This is a requested feature for VLC. Not really anything user-friendly out there. Still, VLC offers the possibility to create playlists in a certain format that would mute or skip parts of a file. This is called XSPF. You might be able to figure out the proper format for this.
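As a rough illustration of the playlist approach, the sketch below turns a list of time ranges to skip into an XSPF playlist that plays only the remaining segments of one file. The per-track `vlc:option` entries with `start-time` and `stop-time` are how I understand VLC's XSPF extension to work; treat the exact element names and namespaces as assumptions and compare them against a playlist saved from VLC itself. The function names are made up.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"
VLC_NS = "http://www.videolan.org/vlc/playlist/ns/0/"
VLC_APP = "http://www.videolan.org/vlc/playlist/0"


def segments_to_keep(skips, duration):
    """Invert (start, end) skip ranges into the ranges that should play."""
    keep, cursor = [], 0.0
    for start, end in sorted(skips):
        if cursor < start and cursor < duration:
            keep.append((cursor, min(start, duration)))
        cursor = max(cursor, end)
    if cursor < duration:
        keep.append((cursor, duration))
    return keep


def build_xspf(location, skips, duration):
    """Return XSPF text with one track per segment that is kept.

    location is a URI such as file:///path/to/movie.mp4; times are seconds.
    """
    ET.register_namespace("", XSPF_NS)
    ET.register_namespace("vlc", VLC_NS)
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for start, stop in segments_to_keep(skips, duration):
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        ET.SubElement(track, f"{{{XSPF_NS}}}location").text = location
        ext = ET.SubElement(track, f"{{{XSPF_NS}}}extension",
                            {"application": VLC_APP})
        # Each track plays the same file, limited to one kept segment.
        ET.SubElement(ext, f"{{{VLC_NS}}}option").text = f"start-time={start:g}"
        ET.SubElement(ext, f"{{{VLC_NS}}}option").text = f"stop-time={stop:g}"
    return ET.tostring(playlist, encoding="unicode")
```

For a 300-second file with one scene to skip between 60 s and 120 s, `build_xspf("file:///movies/film.mp4", [(60, 120)], 300)` yields two tracks, 0 to 60 and 120 to 300. Expect a short gap at each jump, since the player reopens the file for every track.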
Also, there's movie-content-editor:
A VLC based editor built in python that allows users to create and use custom filter files to make movies more family friendly. Allows users to have the player automatically mute specific words or skip certain scenes based on the content of those scenes.
And sensible-cinema:
Clean Editing Movie Player allows you watch edited movies by applying delete lists (EDL's) (i.e. "mute out" or "cut out" scenes) to DVD's/files, with preliminary support for also applying them to arbitrary web/internet based players like netflix instant, hulu/hulu plus etc
See also these threads on The VideoLAN Forums:
auto skip unwanted parts of a video
Clearplay-like (content filter) module exists?

easy, programmable data plotting

I spend most of my time plotting data, but unfortunately I haven't found a decent solution for my plotting needs. At the moment, the most powerful and pleasant plotting library I have found is matplotlib. The results are stunning, but I mostly spend my time fighting with the library when trying to do simple things like drawing an arrow the way I want. Similar programs like R and gnuplot produce visually less appealing results, and they are not GUI based.
On the other hand, programs like xmgrace (or better) allow direct manipulation of the plotted objects and direct feedback, but they fail on two important points:
if my dataset (normally stored in csv files) changes for some reason, I have to reimport it and perform the manipulations again, by hand
once I obtain a nice plot setup, the only way I have to recreate the plot is to use a graphical, interactive program. I would like to have the possibility to run a command line utility on my csv files and get the .pdf as a result, with no human intervention.
I have yet to find something that gives me the best of both worlds at an affordable price. Ideally, I would need an interactive GUI program (a la Origin) that generates matplotlib-based python scripts.
Does anyone have any hints on software that could address my needs on OSX (preferably) or Linux ?
You may want to check out Igor Pro. It's quite old, and quirky but it provides the most advanced plotting system I've found yet on the Mac. You can modify anything graphically, at a command line or in script files. The most powerful feature (IMO) is the ability to automatically generate a script to recreate a figure or to use a figure to create a script that generates figures like (in style etc.) a particular figure. I use Igor for all publication figures I produce.
Data is stored in "waves" (translation: vectors) which encapsulate data and information about the delta between data points (e.g. time step). Figures reference waves as their data source. When you update a wave (e.g. by re-importing a CSV file and specifying that the data overwrite specific waves), all figures that reference that wave are automatically updated.
You can create "layouts" which are page-layouts containing multiple graphs. These layouts are also automatically updated whenever any of the figures in the layout are updated (see above). You can add drawing/text/annotations to either graphs or layouts.
Be warned: Igor Pro's scripting language is something like the bastard child of VB and Matlab. It makes my eyes bleed. It makes me pray to whatever God that the pain just end. But the entire system is so powerful that it's worth it.
I have always used Matlab or R for this sort of thing. While you may not like how the generic plots look, I find that once I familiarize myself with the libraries I can make them as fancy as I want them to be.
R being free, I would try to stick it out with that. It is extremely powerful and perfectly suited to what you need (generate charts on the fly directly from datafiles). I bet that the more you get comfortable with it, you'll find yourself using R for a wide range of tasks outside of plotting data.
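The same batch workflow (data file in, PDF out, nobody clicking anything) can also be done with matplotlib, which the question already uses. Below is a minimal sketch, assuming a CSV with a header row and two numeric columns; the function name and the choice to annotate the maximum are just for illustration. It also shows one way to place an arrow with `annotate`.

```python
import csv

import matplotlib
matplotlib.use("Agg")  # render without a display, so it runs from a shell or cron
import matplotlib.pyplot as plt


def plot_csv_to_pdf(csv_path, pdf_path):
    """Plot the first two columns of a CSV file and save the figure as PDF."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        x_label, y_label = next(reader)[:2]  # header row gives the axis labels
        rows = [(float(r[0]), float(r[1])) for r in reader if r]
    xs, ys = zip(*rows)

    fig, ax = plt.subplots()
    ax.plot(xs, ys, marker="o")
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)

    # Arrow from a label in the top-left corner to the largest y value.
    i = ys.index(max(ys))
    ax.annotate("max", xy=(xs[i], ys[i]),
                xytext=(0.1, 0.9), textcoords="axes fraction",
                arrowprops={"arrowstyle": "->"})

    fig.savefig(pdf_path)  # output format is taken from the file extension
    plt.close(fig)
```

Re-running `plot_csv_to_pdf("data.csv", "figure.pdf")` after the CSV changes regenerates the figure with the same styling, which covers the first complaint about the interactive tools.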
MathGL is a cross-platform GPL library which meets all your criteria. It can produce nice graphics, it can read csv files, it has a window for displaying graphics (you don't need to know any widget libraries), and it can plot from the console (no window or X needed at all). You can use C/C++/Fortran/Python/... for your own code, or MGL scripts for simplicity (see the UDAV front-end for the latter).
Finally, it can produce bitmap (PNG/JPEG/GIF/...) or vector (EPS/SVG) output, which can then be converted to PDF easily. Or you can create a PDF with U3D directly; you'll need the HPDF and U3D libraries in that case.

Best image format for uncompressed textures?

I am in the process of selecting an image format that will be used as the storage format for all in-house textures.
The format will be used as a source format from which compressed textures for different platforms and configurations will be generated, and so needs to cover all possible texture types (2D, cube, volumetric, varying number of mip-maps, floating point pixel formats, etc.) and be completely lossless.
In addition the format has to be able to keep a bit of metadata.
Currently a custom format is used for this, but a commonly available format would be easier for the artists to work with, since it would be viewable in most image editors.
I have thought of using DDS, but this format does not support metadata as far as I can see.
All suggestions appreciated!
With your requirements you should stay with your home-made format. I don't know of any image format besides DDS that supports volumetric and cube textures. Unfortunately DDS does not support metadata.
The closest thing you can find is TIFF. It does not directly support cube-maps or volumetric textures, but it supports any number of sub-images. That way you could re-use the sub-images as slices or cube-sides.
TIFF also has very good support for custom metadata. The libtiff image reading/writing library works pretty well. It looks a bit archaic if you come from an OO background, but it gets its job done.
Nils
When peeking inside various games' resources I found out that most of them store textures (I don't know whether they're compressed or not) in TGA
TIFF would probably be your closest bet for a format which supports arbitrary meta-data and multiple frames, but I think you are better off keeping the assets (in this case, images) separate from how they are converted and utilized in your engine.
Keep images in 32 bit PNG format, and put type- and meta information in XML. That keeps your data human viewable, readable and editable. Obscure custom formats are for engines, not people.
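As a sketch of what such an XML sidecar could look like for a cube map, the snippet below writes one `<texture>` element that references six PNG faces and carries free-form metadata. The element names, attribute names and face order are invented for the example, not a standard.

```python
import xml.etree.ElementTree as ET

# Face order is an arbitrary convention chosen for this example.
CUBE_FACES = ("+x", "-x", "+y", "-y", "+z", "-z")


def cube_map_manifest(name, face_files, metadata):
    """Return XML text describing a cube map built from six image files."""
    if len(face_files) != len(CUBE_FACES):
        raise ValueError("a cube map needs exactly six face images")
    root = ET.Element("texture", {"name": name, "type": "cube"})
    for face, filename in zip(CUBE_FACES, face_files):
        ET.SubElement(root, "image", {"face": face, "file": filename})
    meta = ET.SubElement(root, "metadata")
    for key, value in sorted(metadata.items()):
        # Values are stored as text; the converter decides how to parse them.
        ET.SubElement(meta, "entry", {"key": key}).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

The artists keep editing ordinary PNG files, and the converter reads only the XML to find out which images belong together and how to treat them. Note that 8-bit-per-channel PNG will not cover the floating point pixel formats from the question.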
Stick with whatever your artists work with.
If you are a windows/mac shop and use
photoshop stick with .psd
If you are a unix shop and use gimp
stick with .xcf
These formats will store layers and all the stuff your artists need and are used to.
Since your artists will be creating loads of assets make their life as easy as possible,
even if it means to write some extra code.
Put the meta data (whatever it may be) somewhere "along" the images if the native format (psd/xcf) doesn't support it.
For stuff like cube maps and mipmaps (if not generated by the converter), stick to naming guidelines or guidelines on how to put them into one file.
Depending on what tool you use to create the volumetric stuff, just stick with that tools native format.
While writing custom formats for the target is usually a good idea,
writing custom formats for artists results in mayhem...
My experience with DDS is that it is a poorly documented and difficult format to work with and offers few advantages. It is generally simpler to just store a master file for each image class that has references to the source images that make it up (i.e. 6 faces for a cube map, an arbitrary number of slices for a volume texture) as well as any other useful metadata. It's always going to be a good idea to keep the metadata in a separate file (or in a database), as you do not want to be loading large numbers of images when carrying out searches, populating browsers, etc. It also makes sense to separate your source image format (tiff, tga, jpeg, dds ...) from your "meta-format" (cube, volume ...), since you may well find that you need to use lossy compression to support HDR formats or very large source volume data.
Have you tried PNG? http://java.sun.com/javase/6/docs/api/javax/imageio/metadata/doc-files/png_metadata.html
As an alternative solution, maybe spend some time writing a plugin for a free image editor for your file format? I've never done it before, so I don't know the work involved, but there are boatloads of example code out there for you.

Looking for an OSX application that can do image processing using a webcam

I'm looking for an OSX (or Linux?) application that can receive data from a webcam/video input and let you do some image processing on the pixels in something similar to C, Python or Perl. I'm not that bothered about the processing language.
I was considering throwing one together but figured I'd try and find one that exists already first before I start re-inventing the wheel.
Wanting to do some experiments with object detection and reading of dials and numbers.
If you're willing to do a little coding, you want to take a look at QTKit, the QuickTime framework for Cocoa. QTKit will let you easily set up an input source from the webcam (intro here). You can also apply Core Image filters to the stream (demo code here). If you want to use OpenGL to render or apply filters to the movie, check out Core Video (examples here).
Using the MyMovieFilter demo should get you up and running very quickly.
I found a cross-platform tool called 'Processing'. I actually ran the Windows version to avoid further complications getting the webcams to work.
I had to install QuickTime and something called gVid to get it to work, but after the initial hurdle the coding feels like C (I think it gets "compiled" into Java), and it runs quite fast, even scanning pixels from the webcam in real time.
I still have to get it working on OSX.
Depending on what processing you want to do (i.e. if it's a filter that's available in Apple's Core Image filter library), the built-in Photo Booth app may be all you need. There's a commercial set of add-on filters available from the Apple store as well (http://www.apple.com/downloads/macosx/imaging_3d/composerfxeffectsforphotobooth.html).
