What's the VOBsub subtitle format? - subtitle

Where can I find documentation/sample code of the VOBsub subtitles format? The one that's an .idx and a .sub file.
I need to create a program that generates those subtitles. I've been looking on Google but only found how to rip them from DVD.
Thanks

VOBsub extracts the raw subtitle PES packets from a DVD and dumps them to a .sub file. It also creates an .idx index file with the time and byte offset of every subtitle. The format supports multiple tracks and can also be embedded in MP4 (by Nero) and Matroska files.
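As a rough illustration of the "times and byte offsets" stored in the .idx file, here is a minimal Python sketch that writes and reads one index entry. The line layout (`timestamp: hh:mm:ss:ms, filepos: <hex offset>`) is an assumption based on commonly seen .idx files and should be checked against the specs linked in this answer; the function names are hypothetical.

```python
import re

# Assumed entry layout: "timestamp: 00:01:01:234, filepos: 000001800"
# (offset into the .sub file written as hexadecimal).
_ENTRY = re.compile(
    r"timestamp:\s*(\d+):(\d+):(\d+):(\d+),\s*filepos:\s*([0-9a-fA-F]+)"
)


def format_idx_entry(start_ms: int, file_offset: int) -> str:
    """Build one index line for a subtitle that starts at start_ms and whose
    packet begins at byte file_offset in the .sub file."""
    hours, rem = divmod(start_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return (
        f"timestamp: {hours:02d}:{minutes:02d}:{seconds:02d}:{millis:03d}, "
        f"filepos: {file_offset:09x}"
    )


def parse_idx_entry(line: str):
    """Return (start_ms, file_offset) for an index line, or None if the
    line is not an entry (comments, settings, blank lines)."""
    match = _ENTRY.match(line.strip())
    if match is None:
        return None
    hours, minutes, seconds, millis = (int(g) for g in match.groups()[:4])
    start_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
    return start_ms, int(match.group(5), 16)
```

A generator would emit one such line per subtitle, after writing the matching packet to the .sub file and recording where it started.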
Technical specs vobsub
Technical specs from Matroska.org
Example files: Specs_and_idx-sub_files.rar
Have a look at these open source implementations:
BDSup2Sub (Java)
Subtitle Edit (C#)
guliverkli by gabest (C++): the original implementation; check out VSFilter and VSRip
Son2VobSub.rar (C++)
And then there are the media players like VLC. You can also check out these threads on doom9:
http://forum.doom9.org/archive/index.php/t-87171.html
http://forum.doom9.org/archive/index.php/t-99815.html

I think your best approach would be to have a look at the source code of some of the open-source media players. Some of them will have the code to interpret an .idx or .sub subtitle file.
This might be a useful starting point:
http://sourceforge.net/projects/guliverkli/

Related

End Part of .WAV file is not converted to .MP3 file

As you can see in the above image, the end of the .wav file is not represented in the .mp3 file. I am using the avcodec_decode_audio4() API to decode each packet and the lame_encode_buffer() API to encode it to MP3. I see this issue for mono streams (1.wav -> 1.mp3). Why does this occur, even though I am providing all of the .wav file content? I suspect some caching is happening that keeps part of the data from reaching the MP3 file. Any help would be appreciated.
Add a section of silence at the end of the WAV file, then reconvert to MP3.
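The padding workaround can be sketched with Python's standard `wave` module. This is only an illustration of appending silence to a PCM WAV file; the function name and the half-second default are assumptions, not values from the answer. It also does not touch the encoder side: if the missing tail comes from samples still buffered in the encoder, LAME's `lame_encode_flush()` is the call intended to write them out, which may make padding unnecessary.

```python
import wave


def append_silence(src_path, dst_path, seconds=0.5):
    """Copy a PCM WAV file and append `seconds` of silence to the end."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    # One frame holds one sample for every channel.
    frame_size = params.nchannels * params.sampwidth
    # 8-bit PCM is unsigned (silence is 0x80); wider samples are signed.
    silent_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    padding = silent_byte * (frame_size * int(params.framerate * seconds))

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # writeframes() fixes up the frame count in the header on close.
        dst.writeframes(frames + padding)
```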

How does Flixtools work?

I have been using an app named flixtools from http://flixtools.com/, and it got me really curious about how it searches for the movie name. I renamed the movie file with some random text and yet it found the correct name of the movie when looking for subtitles. From a developer's point of view, I want to understand how it is able to find the real name of the movie file.
Simply put: the movie name is in the file's metadata.
Most media formats (JPEG, MP3, MP4, etc.) are structured so that, along with the image/music/video data, there is some metadata about the photo/music track/TV show/movie that the file contains.

ffprobe fake file safety/mime content checking

Anyone know if ffprobe is safe to use as a method to check the content of a video? I want to determine if someone renames an exe to mp4 or some other video mime and I run ffprobe on it, would the file execute or fail safely without executing the content?
ffprobe should not execute content unless a DirectDraw filter would do it itself (which is a rather weird thing to think of).
Yet, AFAIK, ffprobe doesn't produce nice MIME information, especially for non-multimedia files. format_name/format_long_name are not very good.
For what you're looking for, the best approach is content sniffing, as described at https://mimesniff.spec.whatwg.org/
I have found that link in this SO question: MIME type for transcoded stream
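To give an idea of what content sniffing looks like, here is a minimal Python sketch that inspects the first bytes of a file instead of trusting its extension. It is a simplified illustration, not the algorithm from the spec above: it checks only a handful of well-known signatures, and the function names and returned labels are hypothetical.

```python
def sniff_container(header: bytes) -> str:
    """Guess a coarse file family from the leading bytes of a file."""
    if header[:2] == b"MZ":
        return "windows-executable"   # DOS/PE executables start with "MZ"
    if header[:4] == b"\x7fELF":
        return "elf-executable"
    if header[4:8] == b"ftyp":
        return "mp4-family"           # ISO base media: "ftyp" box at offset 4
    if header[:4] == b"\x1a\x45\xdf\xa3":
        return "matroska-or-webm"     # EBML header ID
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "avi"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first bytes of the file; nothing is executed."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(16))
```

An .exe renamed to .mp4 would come back as "windows-executable" here, so an upload check could reject anything that is not in an allow-list of expected families.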

Using the Linux 'file' command to determine type (i.e. image, audio, or video)

The word file here refers to the shell file command, and not actual files. I want to determine whether a file is a, for example, video file (.mpg, .mkv, .avi). file is pretty good at returning image for image files, video for video files, and audio for audio files (and application/x-empty for some reason for text). My question is how reliable this is for identifying types. If I did a simple
file -ib deliverance.avi | grep video
would that work for all of the main video files outlined here?
The results from file are less than perfect, and it has more problems with some types of files than others. File basically just looks for particular pieces of binary data in predictable patterns to figure out filetypes.
Unfortunately, in particular, some of the filetypes often used for video fall into this "problematic" category. The newer container formats like .mp4 and .mkv usually have several different MIME types that should properly depend on what type of data is being contained. For example, an .mp4 could properly be identified as video/mp4, audio/mp4, or application/mp4 depending on the content.
In practice, file often makes guesses that simply conform with common usage, and it may work perfectly well for you. For example, while I mentioned some theoretical difficulties with identifying Matroska files correctly, file basically just assumes that any Matroska file is a video. On the other hand, the usage of the Ogg container is more evenly split between audio and video, and I believe the current version of file just splits the difference, and identifies Ogg files as application/ogg, which wouldn't fall into any of your categories.
The one thing I can say with certainty is that you want the most up-to-date version of file you can get your hands on. The "magic" files that contain the patterns to match against and the MIME types that will result from a match are updated fairly often to include newer filetypes like WebM, or just to improve accuracy for older types.
file works by checking the header of the file against a "magic number" file. I suspect the best way to see how robust file is would be to check your local magic number file (possibly /usr/share/magic, but see man file for details) for the file types from your referenced list.
It seems like it should work for most video/audio/image files. But if it doesn't, there's actually a file that contains the relations between an extension and its type:
The information identifying these files is read from the compiled magic file /usr/share/magic.mgc, or /usr/share/magic if the compiled file does not exist.
see:
http://linux.about.com/library/cmd/blcmdl1_file.htm
Hope this helps!
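For use from a script, the `file -ib ... | grep video` check from the question can be sketched in Python as below. This assumes the `file` command is installed and prints output of the form `type/subtype; charset=...`; the function names are hypothetical. Types such as application/ogg, mentioned above, fall into "other", so a caller may want to handle them separately.

```python
import subprocess


def media_category(mime_output: str) -> str:
    """Map the output of `file -ib` to 'image', 'audio', 'video' or 'other'."""
    # Drop the "; charset=..." suffix, then keep the top-level type.
    mime_type = mime_output.split(";", 1)[0].strip().lower()
    top_level = mime_type.split("/", 1)[0]
    if top_level in ("image", "audio", "video"):
        return top_level
    return "other"


def classify(path: str) -> str:
    """Run `file -ib` on path and classify its result."""
    result = subprocess.run(
        ["file", "-ib", "--", path],  # "--" guards against names starting with "-"
        capture_output=True,
        text=True,
        check=True,
    )
    return media_category(result.stdout)
```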

Changing Title attributes of a bunch of music files by a script

I have quite a lot of music files, but their title attributes have the track numbers in front of them, like 01.TrackName, 02.TrackName.
What is the best way to strip off integers from the file attributes?
Edit: I am using windows and all music files are MP3. Any solution as batch files, c++ or .net etc will be appreciated.
I'd use Flexible File Renamer to rename the files per your needs. It does require some familiarity with patterns and, in some advanced scenarios, regular expressions.
There are several other similar tools on the market, each with their own scope as to what they can do and how easily it can be done.
See http://hp.vector.co.jp/authors/VA014830/english/FlexRena/. The website shows a screenshot of a user batch-renaming a set of MP3 files using not only elements within the filename but also ID3 tag-related information such as track title and artist.
I use Flash Renamer. Great util, and it has support for renaming MP3s.
You can use an ID3 tag editor. Some have support for batch tagging using regexes. Another option would be to use Python with the ID3 tag library (I don't remember its name...)
Edit: For example http://www.mp3tag.de/en/
Mutagen: ID3 tag library for Python.
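A minimal sketch of the Python route, assuming Mutagen is installed and the files already carry ID3 tags (`EasyID3` raises an error for files without one). The helper names and the exact regular expression are illustrative choices, not part of any of the tools above.

```python
import re

# Leading digits plus an optional separator: "01.", "02 - ", "3) ", "04_".
_LEADING_NUMBER = re.compile(r"^\s*\d+\s*[.\-_)]*\s*")


def strip_track_number(title: str) -> str:
    """Remove a leading track number; keep titles that are only digits."""
    stripped = _LEADING_NUMBER.sub("", title, count=1)
    return stripped or title


def retitle_mp3(path: str) -> None:
    """Rewrite the title tag of one MP3 file in place."""
    # Imported here so the pure helper above works without Mutagen installed.
    from mutagen.easyid3 import EasyID3

    tags = EasyID3(path)
    titles = tags.get("title", [])
    cleaned = [strip_track_number(t) for t in titles]
    if cleaned != titles:
        tags["title"] = cleaned
        tags.save()
```

Calling `retitle_mp3` for every .mp3 path in a folder would process the whole collection. Note that a title that legitimately begins with a number (for example "99 Problems") would also be shortened, so a dry run that only prints the proposed titles is worth doing first.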

Resources