I'm really interested in image and video compression, but it's hard for me to find a primary source from which to start implementing the major algorithms.
What I want is a source of information from which to begin implementing my own codec. I want to implement it from scratch (for example, for JPEG, implement my own Huffman coding, discrete cosine transform, and so on). All I need is a little step-by-step guide showing which steps are involved in each algorithm.
I'm mainly interested in image compression algorithms (for now, JPEG) and video compression algorithms (MPEG-4, M-JPEG, and maybe the AVI and MP4 container formats).
Can anyone suggest an online source with a little more information than Wikipedia? (I checked it, but the information is not really comprehensive.)
Thank you so much :)
Start with JPEG. You'll need the JPEG standard. It will take a while to go through, but that's the only way to have a shot at writing something compatible. Even then, the standard won't help much with deciding on how and how much you quantize the coefficients, which requires experimentation with images.
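To make that quantization step concrete, here is a minimal C sketch (my own illustration, not code from the standard): the encoder divides each 8x8 block of DCT coefficients element-wise by a quantization table and rounds. The quality-scaling formula shown is the common libjpeg convention, not something the standard mandates, and the function name is just for illustration.

    #include <math.h>
    #include <stdint.h>

    /* Quantize one 8x8 block of DCT coefficients.  base_table is an 8x8
       quantization table (e.g. the example luminance table from Annex K of
       the JPEG standard); quality runs from 1 (worst) to 100 (best). */
    void quantize_block(const double dct[64], const uint8_t base_table[64],
                        int quality, int16_t out[64])
    {
        /* libjpeg-style scaling: 50 leaves the table as-is, higher quality
           shrinks the divisors, lower quality grows them. */
        int scale = (quality < 50) ? 5000 / quality : 200 - 2 * quality;

        for (int i = 0; i < 64; i++) {
            int q = (base_table[i] * scale + 50) / 100;
            if (q < 1)   q = 1;          /* keep divisors in the legal range */
            if (q > 255) q = 255;
            out[i] = (int16_t)lround(dct[i] / q);   /* divide and round */
        }
    }

How aggressive the table is (and how you tune it per image) is exactly the part the standard leaves to experimentation.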
Once you get that working, then get the H.264 standard and read that.
The ImpulseAdventure site has a fantastic series of articles about the basics of JPEG encoding.
I'm working on an experimental JPEG encoder that's partly designed to be readable and easy to change (rather than obfuscated by performance optimizations).
A little background: a coworker was creating some "glitch art" using this link. He deleted some bytes from a JPEG image and produced this result:
http://jmelvnsn.com/prince_fielder.jpg
The thing that's blowing my mind here is that Chrome renders this image differently on each refresh. I'm not sure I understand how the image-rendering code can be non-deterministic. What's going on?
EDIT: I really wish Stack Overflow would stop redirecting my URL to their imgur URL.
Actually, it's interesting to know that the JPEG standard is not a standard about imaging techniques or imaging algorithms; it's more like a standard for a container.
As far as I know, if you respect the JPEG standard you can decode/encode a JPEG with any number of different techniques and algorithms. That's why it's hard to support JPEG/JPG: from a programmer's perspective, a JPG can be a million things, and it's really hard to handle that kind of fragmentation. Often you are forced to simply jump on the train offered by some library and hope that your users won't run into trouble with it.
There is no single standard way to encode or decode a JPEG image/file (including the algorithms used in the process), so the apparently "weird" result produced by your browser is entirely normal.
Disclaimer: Forgive my ignorance of audio/sound processing; my background is web and mobile development, and this is a bespoke requirement for one of my clients!
I have a requirement to concatenate 4 audio files, with a background track playing behind all 4 of them. The source audio files can be created in any format, or have any treatment applied to them, if that improves processing time, but output quality is still important. For clarity, the input files could be named as follows (.wav is only an example format):
background.wav
segment-a.wav
segment-b.wav
segment-c.wav
segment-d.wav
And would need to be structured something like this:
[------------------------------background.wav------------------------------]
[--segment-a.wav--][--segment-b.wav--][--segment-c.wav--][--segment-d.wav--]
I have managed to use the SoX tool to achieve the concatenation portion of the above using MP3 files, but on a reasonably fast computer I am getting roughly an hour's worth of concatenated audio per minute of processing, which isn't fast enough for my requirements, and I haven't applied the background track or any 'nice to haves' such as trimming/fading yet.
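The concatenation itself is a single SoX invocation along these lines (a sketch using the file names above; the exact options are omitted):

    sox segment-a.wav segment-b.wav segment-c.wav segment-d.wav segments.wav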
My questions are:
Is SoX the best/only tool for this kind of operation?
Is there any way to make the process faster without sacrificing (too much) quality?
Would changing the input file format result in improved performance? If so, which format is best?
Any suggestions from this excellent community would be much appreciated!
SoX may not be the best tool, but I doubt you will find anything much better without hand-coding.
I would venture to guess that you are doing pretty well to process that much audio in that time. You might do better, but you'll have to experiment. You are right that probably the main way to improve speed is to change the file format.
MP3 and OGG will probably give you similar performance, so first identify how MP3 compares to uncompressed audio, such as wav or aiff. If MP3/OGG is better, try different compression ratios and sample rates to see which goes faster. With wav files, you can try lowering the sample rate (you can do this with MP3/OGG as well). If this is speech, you can probably go as low as 8kHz, which should speed things up considerably. For music, I would say 32kHz, but it depends on the requirements. Also, try mono instead of stereo, which should also speed things up.
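As a rough sketch of what that could look like with SoX (file names taken from the question; the 8 kHz mono settings only make sense for speech, as noted above):

    # one-off pre-conversion of each source file to 8 kHz mono
    sox background.wav background-8k.wav rate 8000 channels 1
    sox segment-a.wav segment-a-8k.wav rate 8000 channels 1
    # ...repeat for segments b, c and d
    # concatenate the segments, then mix the background underneath
    sox segment-a-8k.wav segment-b-8k.wav segment-c-8k.wav segment-d-8k.wav segments-8k.wav
    sox -m background-8k.wav segments-8k.wav final.wav

Doing the rate/channel reduction once, up front, means every later pass works on far less data.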
I am working on a web script that handles image processing using ImageMagick. It takes relevant parameters, executes an ImageMagick command at the command line or shell depending on OS, and passes the raw image data back to the script. The language of the web script is obviously not pertinent.
Simple use cases include:
convert -resize 750 H:/221136.png - (the trailing - writes the raw image data to stdout), which just resizes the input image to a width of 750 and outputs it to the console. More complex use cases involve rotating, resizing, cropping/panning, and drawing.
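A slightly more involved command combining several of those operations might look like the following (purely illustrative, not the exact production command; the geometry values are placeholders):

    convert H:/221136.png -rotate 90 -resize 750 -crop 600x600+20+20 +repage -fill none -stroke red -draw "rectangle 20,20 200,200" png:-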
The script works great and is quite fast for PNG, GIF, and JPEG inputs, even at fairly large resolutions (4000x5000). Unfortunately, my input data also includes JPEG-2000. A 10-15 megabyte JPEG-2000 takes a truly insane amount of time for ImageMagick to process, on the order of 10-15 seconds. It is not suitable for live, on-the-fly conversion.
I know quick conversion of JPEG-2000 to JPEG for web output is possible, because a piece of enterprise software I work with does it more or less on the fly. I'm not sure which library they use; the DLL/.so they use is DL80JP2KLib.dll/.so. Looking it up, it seems that a company called DataLogic makes this, but they don't seem to have any obviously relevant programs on their site.
Ideally I'm looking for a solution (plug-in?) that would either enable ImageMagick to convert these high-resolution JPEG-2000 images on the fly, as it does with PNG, GIF, or JPEG, or a separate command-line utility that I can run before ImageMagick to convert the JPEG-2000 to an intermediate format that ImageMagick can process quickly.
The servers that will run this script have 32 GB of RAM and beefy processors. Assume that speed of conversion is more important than resource-usage efficiency. Assume also that while I need some semblance of quality, lossiness is not a major concern. Licensing requirements and/or price are not important, except that I need to be able to test it myself for speed on a few sample files before we buy. The ideal solution would also be (relatively) OS-independent.
I tried an application from Kakadu Software and it's fairly quick, on the order of 3-4 seconds, but that's still not fast enough. If it's not possible to get below, say, one second, I will look at batch-converting the files in advance.
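If batch conversion in advance turns out to be the way to go, the pipeline would presumably be two steps: decode the JPEG-2000 to a cheap intermediate format, then let ImageMagick do the rest. Sketching it with Kakadu's decoder (I'm assuming the binary in question is kdu_expand; the file names and the PPM intermediate are placeholders):

    kdu_expand -i input.jp2 -o intermediate.ppm
    convert -resize 750 intermediate.ppm png:-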
I have uploaded a representative file (JPEG-2000, ~8MB) to MediaFire:
http://www.mediafire.com/?yxv0j6vdwx0k996
I found ExactImage to be much faster in the past.
http://www.exactcode.de/site/open_source/exactimage/
Mark Tyler (the original author of mtPaint) once split out the excellent graphics-handling parts into a separate library, mtpixel (since abandoned as a separate project, but included in mtcelledit at its Google Code home).
Can anybody compare popular video codecs by encoding speed? I understand that better compression usually requires more processing time, but it's also possible that some codecs still provide comparably good compression with fast encoding. Any comparison links?
Thanks for your help.
[EDIT]: Codecs can be compared by the algorithms they use, regardless of the particular implementation, hardware, or video source, something like big-O notation for mathematical algorithms.
When comparing VP8 and x264, VP8 also shows 5-25 times lower encoding speed, with 20-30% lower quality on average. For example, the x264 High-Speed preset is faster and has higher quality than any of the VP8 presets on average.
It's tough to compare feature sets vs. speed/quality.
See a quality comparison at http://www.compression.ru/video/codec_comparison/h264_2012/
The following paragraph is from VP9 encoding/decoding performance vs. HEVC/H.264 by Ronald S. Bultje:
x264 is an incredibly well-optimized encoder, and many people still use it. It's not that they don't want better bitrate/quality ratios, but rather, they complain that when they try to switch, it turns out these new codecs have much slower encoders, and when you increase their speed settings (which lowers their quality), the gains go away. Let's measure that! So, I picked a target bitrate of 4000kbps for each encoder, using otherwise the same settings as earlier, but instead of using the slow presets, I used variable-speed presets (x265/x264: --preset=placebo-ultrafast; libvpx: --cpu-used=0-7).
This is one of those topics where Your Mileage May Vary widely. If I were in your position, I'd start off with a bit of research on Wikipedia, and then gather the tools to do some testing and benchmarking. The source video format will probably affect overall encoding speed, so you should test with video that you intend to use on the Production system.
Video encoding time can vary widely depending on the hardware used, whether you use an accelerator card, and so on. It's difficult for us to make any hard and fast recommendations without explicit knowledge of your particular setup.
The only way to make decisions like this is to test these things yourself. I've done the same thing when comparing virtualisation tools. It's fun, too!
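If it helps, one concrete way to run such a test is to push the same clip through a couple of encoders at a fixed bitrate and time the runs, for example with ffmpeg (this assumes a build with libx264 and libvpx; the file names and speed settings are just placeholders to vary):

    time ffmpeg -i source.y4m -c:v libx264 -preset veryfast -b:v 4000k -an out-x264.mp4
    time ffmpeg -i source.y4m -c:v libvpx-vp9 -cpu-used 5 -b:v 4000k -an out-vp9.webm

Comparing the wall-clock times (and then the visual quality of the outputs) on your own source material tells you far more than any generic benchmark.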
What is the minimum source length (in bytes) for LZ77? Can anyone suggest a small and fast real-time compression technique (preferably with C source)? I need it to store compressed text and retrieve it quickly for excerpt generation in my search engine.
Thanks for all the responses. I'm using the D language for this project, so it's kind of hard to port LZO to D code; I'm going with either LZ77 or Predictor. Thanks again :)
I long ago had need for a simple, fast compression algorithm, and found Predictor.
While it may not be the best in terms of compression ratio, Predictor is certainly fast (very fast), easy to implement, and has a good worst-case performance. You also don't need a license to implement it, which is goodness.
You can find a description and C source for Predictor in Internet RFC 1978: PPP Predictor Compression Protocol.
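To give a feel for how simple it is, here is a minimal sketch of the compression side, paraphrased from the RFC's sample code (see the RFC for the exact framing, the decompressor, and how the state is managed per frame; the function name and state handling here are my own simplification):

    #include <stddef.h>
    #include <stdint.h>

    /* The table maps a rolling 16-bit hash of recent input to a guessed
       next byte; a correctly guessed byte costs only one flag bit. */
    static uint8_t  guess_table[65536];
    static uint16_t hash;

    /* Compress src[0..len) into dst; returns the number of bytes written.
       Worst case, dst needs room for len + (len + 7) / 8 bytes. */
    size_t predictor_compress(const uint8_t *src, size_t len, uint8_t *dst)
    {
        size_t out = 0;
        while (len > 0) {
            uint8_t flags = 0;
            size_t flag_pos = out++;                 /* reserve the flag byte */
            for (int i = 0; i < 8 && len > 0; i++, len--) {
                uint8_t c = *src++;
                if (guess_table[hash] == c) {
                    flags |= (uint8_t)(1 << i);      /* predicted: 1 bit only */
                } else {
                    guess_table[hash] = c;           /* miss: emit a literal */
                    dst[out++] = c;
                }
                hash = (uint16_t)((hash << 4) ^ c);  /* rolling hash from the RFC */
            }
            dst[flag_pos] = flags;
        }
        return out;
    }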
The lzo compressor is noted for its smallness and high speed, making it suitable for real-time use. Decompression, which uses almost zero memory, is extremely fast and can even exceed memory-to-memory copy on modern CPUs due to the reduced number of memory reads. lzop is an open-source implementation; versions for several other languages are available.
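If you do end up calling LZO from C (for example via the bundled miniLZO), the API is tiny; a rough sketch of a compress/decompress round trip looks like this (buffer sizing uses the worst-case bound from the LZO documentation):

    #include <stdio.h>
    #include <string.h>
    #include "minilzo.h"

    int main(void)
    {
        static unsigned char in[4096];
        static unsigned char out[4096 + 4096 / 16 + 64 + 3]; /* worst-case output size */
        static unsigned char back[4096];
        static unsigned char wrkmem[LZO1X_1_MEM_COMPRESS];   /* compressor work memory */
        lzo_uint in_len = sizeof in, out_len = 0, back_len = sizeof back;

        memset(in, 'x', sizeof in);                 /* some sample data */
        if (lzo_init() != LZO_E_OK) return 1;       /* must be called once */

        lzo1x_1_compress(in, in_len, out, &out_len, wrkmem);
        lzo1x_decompress_safe(out, out_len, back, &back_len, NULL);

        printf("%lu bytes -> %lu bytes\n", (unsigned long)in_len, (unsigned long)out_len);
        return 0;
    }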
If you're looking for something more well known, LZMA (the 7-Zip encoder) is about the best compressor you'll get in terms of general compression. http://www.7-zip.org/sdk.html
There's also LZJB:
https://hg.java.net/hg/solaris~on-src/file/tip/usr/src/uts/common/os/compress.c
It's pretty simple, based on LZRW1, and is used as the basic compression algorithm for ZFS.