Difference between VAST, VPAID and VMAP

For some reason I need to know the difference between VAST, VPAID and VMAP.
I know all three are video ad delivery standards following the IAB specifications, but I need to know the clear difference between these three.
Any help is appreciated.

VAST, VMAP and VPAID solve different challenges when it comes to showing advertisements in a video player.
Short answer
VAST describes ads and how a video player should handle them. (more or less)
VPAID (deprecated, see update below) describes what "public" communication (methods, properties and events) an executable ad unit should at least implement/expose, so the video player can communicate with the ad unit in a uniform way and control it.
VMAP describes when an ad should be played.
In more detail
VAST (Video Ad Serving Template) is used to describe ads and how a video player should handle these. Note that the concrete implementation is up to the video player itself. There are three types of ads:
A linear ad is an advertisement video rendered inside the video player.
A non-linear ad is an advertisement overlaying the video player. It is mostly a banner image, but it could also be HTML or an iFrame.
A companion ad is an advertisement rendered outside the video player. It is mostly rendered alongside a linear or a non-linear ad, as they can complement each other (hence the name).
More examples of cool stuff VAST describes:
When an ad is allowed to be skipped (for linear ads)
What URIs should be pinged for tracking
Sequence of ads (ad pods) that should be played together
Different resolutions / codecs for same advertisement
VMAP (Video Multiple Ad Playlist) is an optional addition allowing you to specify when an ad must be played. Via VMAP you can indicate whether an ad is a pre-roll (ad before the content), a mid-roll (ad somewhere in the content) or a post-roll (ad after the content). VMAP can also refer to multiple VAST files to be played at different times.
VPAID (Video Player Ad Interface Definition) is a specification describing what an executable ad unit (= interactive ad) should at least implement and expose for public communication/control. This allows the player to delegate instructions to the ad and yet keep control over it (e.g. starting, pausing, finishing it...). That way, a player can give instructions (methods) and request information (properties). The ad itself can also dispatch events indicating that a certain action has happened (e.g. volume has changed, ad has been skipped, ad has been clicked...).
It is interesting to note that VPAID has two versions: version 1 is Flash-based, while version 2 added a JavaScript variant.
How these three connect with each other
VMAP refers to a VAST, but never to another VMAP.
VAST can contain its ad data internally (Inline) or refer to another VAST (Wrapper), but never to a VMAP. VAST describes ads. Some ads can be executable (interactive).
If an ad is executable then it must implement VPAID so the player can cooperate with it.
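To make the relationship concrete, here is a minimal hand-written sketch (not taken from any live ad server; all URLs are placeholders): a VMAP document scheduling a single pre-roll break whose ad source points at a VAST tag, followed by the kind of VAST response that tag might return, describing one linear ad.

```xml
<!-- VMAP: says WHEN an ad break happens (a single pre-roll here) and points at a VAST tag -->
<vmap:VMAP xmlns:vmap="http://www.iab.net/videosuite/vmap" version="1.0">
  <vmap:AdBreak timeOffset="start" breakType="linear" breakId="preroll">
    <vmap:AdSource id="preroll-ad" allowMultipleAds="false" followRedirects="true">
      <vmap:AdTagURI templateType="vast3">
        <![CDATA[https://ads.example.com/vast.xml]]>
      </vmap:AdTagURI>
    </vmap:AdSource>
  </vmap:AdBreak>
</vmap:VMAP>

<!-- VAST (served from the URI above): says WHAT the ad is and how to track it -->
<VAST version="3.0">
  <Ad id="example-ad">
    <InLine>
      <AdSystem>ExampleAdServer</AdSystem>
      <AdTitle>Example linear ad</AdTitle>
      <Impression><![CDATA[https://ads.example.com/impression]]></Impression>
      <Creatives>
        <Creative>
          <Linear>
            <Duration>00:00:15</Duration>
            <MediaFiles>
              <MediaFile delivery="progressive" type="video/mp4" width="1280" height="720">
                <![CDATA[https://cdn.example.com/ad-720p.mp4]]>
              </MediaFile>
            </MediaFiles>
          </Linear>
        </Creative>
      </Creatives>
    </InLine>
  </Ad>
</VAST>
```

The player first fetches the VMAP to learn when to break, then fetches the referenced VAST to learn what to play and which tracking URIs to ping.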
Update June 2019
Quite a few things have changed since this answer was submitted. In VAST 4.1, the IAB deprecated the VPAID specification in favor of upcoming specifications. VAST 4.2 (currently in the public comment phase) formalized the successors of VPAID:
for ad verification, the Open Measurement SDK should be used
for interactivity, the SIMID (Secure Interactive Media Interface) specification should be implemented.

IAB Digital Video Suite
VAST (Digital Video Ad Serving Template) is an XML document with a <VAST> root element; its main part is the MediaFile tag, which contains a URL to the video file. (IAB)
VPAID (Digital Video Player-Ad Interface Definition) is an extension of VAST in which the MediaFile tag carries type="application/javascript" and apiFramework="VPAID" attributes, allowing a JavaScript ad source to be loaded and controlled by the player (see the snippet after this list). (SpotXChange, Innovid)
VMAP (Digital Video Multiple Ad Playlist) is an XML document with a <vmap:VMAP> root element, used to describe a schedule for VAST files (pre-roll, mid-roll, post-roll).
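For illustration only (the URL is a placeholder), the VPAID-style MediaFile described above sits inside a VAST creative roughly like this:

```xml
<MediaFiles>
  <MediaFile delivery="progressive" type="application/javascript"
             apiFramework="VPAID" width="640" height="360">
    <![CDATA[https://cdn.example.com/vpaid-creative.js]]>
  </MediaFile>
</MediaFiles>
```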
See also: Google IMA examples, MRAID (Mobile Rich Media Ad Interface Definitions).

Related

Multiple XAudio2 instances needed for AUDIO_STREAM_CATEGORY?

In the newer XAudio2 APIs for Windows 8 and 10, an AUDIO_STREAM_CATEGORY is passed to IXAudio2::CreateMasteringVoice.
The documentation goes on to say how these categories should be used for different types of audio. However, an IXAudio2 instance is only allowed one mastering voice. To use several categories, are completely separate IXAudio2 instances (along with all associated interfaces) required, or can categories be specified elsewhere in the audio graph by some means?
Games should categorize their music streams as AudioCategory_GameMedia so that game music mutes automatically if another application plays music in the background. Music or video applications should categorize their streams as AudioCategory_Media or AudioCategory_Movie so they will take priority over AudioCategory_GameMedia streams. Game audio for in-game cinematics or cutscenes, when the audio is premixed or for creative reasons should take priority over background audio, should also be categorized as Media or Movie.
You can create more than one IXAudio2 instance in a process, so each will have its own mastering voice. If you want to output more than one category of audio from a process, you need to create more than one IXAudio2 instance.
Generally you can get away with just one and always use AudioCategory_GameMedia.
I know this design is a bit of a kludge, but the category is set on the WASAPI output voice, which is where XAudio2 sends its mastering voice output. Any other design would have required annotating category data within the internal XAudio2 audio graph, which would have been quite complicated to implement for not a lot of value. We chose instead to just let applications have more than one audio graph active at once, each with its own mastering voice and therefore its own category.
How you choose to support the audio category feature of WASAPI is up to you, and of course the best user experience depends on what exactly your application actually does.
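To illustrate the answer, here is a minimal sketch (untested, error handling trimmed): one IXAudio2 engine and mastering voice per stream category. It assumes XAudio2 2.8+ (Windows 8 or later), a COM-initialized thread, and the Windows 10 SDK category names (AudioCategory_Movie); the function name is just for the example.

```cpp
// Minimal sketch: one IXAudio2 engine (and mastering voice) per stream category.
// Untested; assumes XAudio2 2.8+ and that CoInitializeEx has already been called.
#include <xaudio2.h>
#pragma comment(lib, "xaudio2.lib")

HRESULT CreateCategorizedEngines(IXAudio2 **gameEngine, IXAudio2MasteringVoice **gameMaster,
                                 IXAudio2 **movieEngine, IXAudio2MasteringVoice **movieMaster)
{
    // Engine #1: regular in-game audio; mutes automatically under background music.
    HRESULT hr = XAudio2Create(gameEngine, 0, XAUDIO2_DEFAULT_PROCESSOR);
    if (FAILED(hr)) return hr;
    hr = (*gameEngine)->CreateMasteringVoice(gameMaster,
                                             XAUDIO2_DEFAULT_CHANNELS, XAUDIO2_DEFAULT_SAMPLERATE,
                                             0, nullptr, nullptr,
                                             AudioCategory_GameMedia);
    if (FAILED(hr)) return hr;

    // Engine #2: premixed cutscene/cinematic audio that should keep priority
    // over other applications' background music.
    hr = XAudio2Create(movieEngine, 0, XAUDIO2_DEFAULT_PROCESSOR);
    if (FAILED(hr)) return hr;
    return (*movieEngine)->CreateMasteringVoice(movieMaster,
                                                XAUDIO2_DEFAULT_CHANNELS, XAUDIO2_DEFAULT_SAMPLERATE,
                                                0, nullptr, nullptr,
                                                AudioCategory_Movie);  // Windows 10 SDK name
}
```

Everything submitted through a given engine's graph then ends up in that engine's categorized WASAPI output stream.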

Mozilla location service vs open cell id

What is the difference between OpenCellID and Mozilla Location Service?
Generally speaking, both services collect the CDMA, GSM, UMTS and LTE cells, Wi-Fi hotspots and Bluetooth beacons visible to a device at a particular latitude and longitude. That position is where the GPS receiver was located at the moment of scanning, not the exact location of the base station or its antenna sector. When multiple measurements from different surrounding places are available, the coordinates can be averaged, and it is this estimate that gets published (see below).
Things are more complicated for cellular networks
Most cell towers carry multiple pieces of telecommunication equipment: 2G (GSM, GPRS, EDGE), 3G (WCDMA, UMTS, HSDPA), and so on.
The equipment divides the surrounding area into sectors and uses directional antennas. When you move around a base station (e.g. walk a closed circle), the phone connects to different sectors/antennas, which have different Cell IDs / UTRAN IDs. At the moment, MLS and OCI cannot aggregate these measurements into one base station. However, for geolocation purposes, more sectors means higher accuracy.
Meanwhile, the databases do contain exact positions of some base stations (or sectors?); check the changeable==0 column in the CSV dump.
Mozilla Location Service (MLS)
Collects cell and Wi-Fi measurements with the libstumbler library, which is incorporated into Firefox for Mobile (collection is disabled by default) and Mozilla Stumbler. Bluetooth beacons are collected some other way. The geolocation backend is called Ichnaea (it is responsible for the data exchange between MLS and OCI).
It looks like when a user requests a geoposition through the API or the Android MozillaNlpBackend, MLS queries its own database of collected data, its own copy of the OpenCellID database, GeoIP, and the SkyHook partner service. The collected Wi-Fi data is sensitive and is only used for online geopositioning.
Published data: public domain license. Daily CSV dumps of estimated cell locations only (because of privacy: no raw measurement data, no Wi-Fi, no Bluetooth beacons).
Opencellid (OCI)
Collects only cells (with third-party software).
Published data: CC-BY-SA 3.0 license. CSV dumps of estimated cell locations (updated roughly weekly) and raw measurement data. A free API key is required.
No officially averaged MLS+OCI data is published (I would like to be wrong here). The projects can't merge their data because of licensing and privacy (Mozilla won't publish raw measurements). One can download the CSV dumps and use them for offline geolocation. There is at least one successful project for Android: LocalGsmNlpBackend for µg UnifiedNlp.
According to the Mozilla website:
The service incorporates aggregated cell data from our partner the
OpenCellID project. The OpenCellID data is provided under the CC-BY-SA
3.0 license and can be acquired from the OpenCellID downloads section.
The OpenCellID project puts a stronger emphasis on public data
compared to possible privacy risks, whereas this project has a
stronger emphasis on privacy. Please consider contributing to the
OpenCellID project if you do not agree with the privacy choices made
by this project.

Outputting Sound to Multiple Audio Devices Simultaneously

OK, the first issue. I am trying to write a virtual soundboard that will output to multiple devices at once. I would prefer OpenAL for this, but if I have to switch over to MS libs (I'm writing this initially on Windows 7) I will.
Anyway, the idea is that you have a bunch of sound files loaded up and ready to play. You're on Skype, and someone fails in a major way, so you hit the play button on the Price is Right fail ditty. Both you and your friends hear this sound at the same time, and have a good laugh about it.
I've gotten OAL to the point where I can play on the default device, and selecting a device at this point seems rather trivial. However, from what I understand, each OAL device needs its context to be current in order for the buffer to populate/propagate properly. Which means, in a standard program, the sound would play on one device, and then the device would be switched and the sound buffered then played on the second device.
Is this possible at all, with any audio library? Would threads be involved, and would those be safe?
Then, the next problem is, in order for it to integrate seamlessly with end-user setups, it would need to be able to either output to the default recording device, or intercept the recording device, mix it with the sound, and output it as another playback device. Is either of these possible, and if both are, which is more feasible? I think it would be preferable to be able to output to the recording device itself, as then the program wouldn't have to be running in order to have the microphone still work for calls.
If I understood correctly, there are mainly two questions here.
Is it possible to play a sound on two or more audio output devices simultaneously, and how to achieve this?
Is it possible to loop data back through an audio input (recording) device so that it is played on the respective monitor, i.e., in your case, sent through the audio stream of Skype to your partner?
Answer to 1: This is absolutely feasible; all independent audio outputs of your system can play sounds simultaneously. For example, some professional audio interfaces (for music production) have 8, 16 or 64 independent outputs, all of which can play sound simultaneously. That means that each output device maintains its own buffer that it consumes independently (apart from concurrency on eventual shared memory used to feed the buffers).
How?
Most audio frameworks/systems provide functions to get a "device handle", which will require you to pass a callback for feeding the buffer with samples (so does OpenAL, for example). This callback will be invoked independently and asynchronously by the framework/system (ultimately by the audio device driver(s)).
Since this all works asynchronously, you don't necessarily need multi-threading here. All you need to do in principle is maintain two (or more) audio output device handles, each with a separate buffer-consuming callback, to feed the two (or more) separate devices.
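For the OpenAL route the question mentions, a rough single-threaded sketch (untested, error handling omitted, device name and PCM data are placeholders) would open each device, give it its own context, and make that context current before creating or starting that device's buffer and source:

```cpp
// Rough sketch (untested): play one mono 16-bit PCM clip on two OpenAL devices
// from a single thread by switching the current context between them.
#include <AL/al.h>
#include <AL/alc.h>
#include <vector>

struct Output {
    ALCdevice  *device  = nullptr;
    ALCcontext *context = nullptr;
    ALuint      buffer  = 0;
    ALuint      source  = 0;
};

static Output openOutput(const char *deviceName, const std::vector<short> &pcm, ALsizei freq)
{
    Output out;
    out.device  = alcOpenDevice(deviceName);              // nullptr = default device
    out.context = alcCreateContext(out.device, nullptr);

    // Buffers and sources belong to whichever context is current when they are created.
    alcMakeContextCurrent(out.context);
    alGenBuffers(1, &out.buffer);
    alBufferData(out.buffer, AL_FORMAT_MONO16, pcm.data(),
                 static_cast<ALsizei>(pcm.size() * sizeof(short)), freq);
    alGenSources(1, &out.source);
    alSourcei(out.source, AL_BUFFER, out.buffer);
    return out;
}

void playOnTwoDevices(const std::vector<short> &pcm, ALsizei freq)
{
    Output a = openOutput(nullptr, pcm, freq);                    // default output
    Output b = openOutput("Some other output device", pcm, freq); // placeholder device name

    // Make each device's context current just long enough to start its source;
    // playback then continues asynchronously on both devices.
    alcMakeContextCurrent(a.context);
    alSourcePlay(a.source);
    alcMakeContextCurrent(b.context);
    alSourcePlay(b.source);
}
```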
Note: You can also play several sounds on one single device. Most devices/systems allow this kind of resource sharing. Actually, that is one of the purposes sound cards are made for: mixing together all the sounds produced by the various programs (and hence taking that heavy burden off the CPU). When you use one (physical) device to play several sounds, the concept is the same as with multiple devices: for each sound you get a logical device handle, only those handles refer to several "channels" of one physical device.
What should you use?
OpenAL seems a little like using heavy artillery for this simple task, I would say (since you don't need that much portability, and probably don't plan to implement your own codecs and effects ;) ).
I would recommend using Qt here. It is highly portable (Windows/Mac/Linux) and it has a very handy class that will do the job for you: http://qt-project.org/doc/qt-5.0/qtmultimedia/qaudiooutput.html
Check the example in the documentation to see how to play a WAV file with a couple of lines of code. To play several WAV files simultaneously, you simply have to open several QAudioOutput instances (basically put the code from the example into a function and call it as often as you want). Note that you have to close/stop the QAudioOutput in order for the sound to stop playing.
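As a hedged sketch of that approach (untested; Qt 5 with the multimedia module assumed, file name and audio format are placeholders), one QAudioOutput per output device plays the same PCM data on two devices at once. Note that QAudioOutput consumes raw PCM, so a canonical WAV file's 44-byte header would also be fed to the device unless you skip or parse it.

```cpp
// Minimal sketch: play the same raw PCM file on two output devices at once by
// creating one QAudioOutput per device. Requires QT += multimedia in the .pro file.
#include <QCoreApplication>
#include <QAudioOutput>
#include <QAudioDeviceInfo>
#include <QFile>

static QAudioOutput* playOn(const QAudioDeviceInfo &device, const QString &path, QObject *parent)
{
    // The format must match the file's actual PCM data (QAudioOutput does not parse WAV headers).
    QAudioFormat format;
    format.setSampleRate(44100);
    format.setChannelCount(2);
    format.setSampleSize(16);
    format.setCodec("audio/pcm");
    format.setByteOrder(QAudioFormat::LittleEndian);
    format.setSampleType(QAudioFormat::SignedInt);

    QFile *file = new QFile(path, parent);
    file->open(QIODevice::ReadOnly);              // error handling omitted for brevity

    QAudioOutput *out = new QAudioOutput(device, format, parent);
    out->start(file);   // asynchronous: returns immediately, audio plays in the background
    return out;         // keep the pointer; call out->stop() to end playback
}

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    // Pick two distinct output devices (e.g. speakers and a virtual cable).
    const QList<QAudioDeviceInfo> devices = QAudioDeviceInfo::availableDevices(QAudio::AudioOutput);
    if (devices.size() < 2)
        return 1;

    playOn(devices[0], "fail_ditty.pcm", &app);   // placeholder file name
    playOn(devices[1], "fail_ditty.pcm", &app);

    return app.exec();  // keep the event loop running while the audio plays
}
```

Each QAudioOutput instance maintains its own buffer and plays asynchronously, which is exactly the "one handle per output device" idea described above.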
Answer to 2: What you want to do is called a loopback. Only a very limited number of sound cards, i.e. audio devices, provide a so-called loopback input device, which would, for example, permit recording what is currently output by the main output mix of the sound card. However, even if such a device is provided, it will not let you loop anything back into the microphone input device. The microphone input device only takes data from the microphone's A/D converter. This is deep in the hardware; you cannot mix anything in at your level there.
That said, it will be very, very hard (IMHO practically impossible) to have Skype send your sound to your conversation partner with a standard setup. The only thing I can think of would be to have an audio device with loopback capabilities (or simply a physical cable connecting a monitor line-out to a recording line-in), and then set Skype up to use this looped-back device as its input. However, Skype will then no longer pick up your microphone, hence you won't have a conversation ;)
Note: When saying "simultaneous" playback here, we are talking about synchronizing the playback of two sounds as far as real-time perception is concerned (in the range of 10-20 ms). We are not looking at actual synchronization on a sample level, and the related clock jitter and phase-shifting issues that come into play when sending sound to two physical devices with two independent (free-running) clocks. Thus, when the application demands in-phase signal generation on independent devices, clock recovery mechanisms are necessary, which may be provided by the drivers or the OS.
Note: Virtual audio device software such as Virtual Audio Cable will provide virtual devices to achieve loopback functionality on Windows. Frameworks such as JACK Audio may achieve the same in a Unix environment.
There is a very easy way to output audio on two devices at the same time:
For Realtek devices you can use the audio-mixer "trick" (but this will give you a delay / echo);
For everything else (and without echo) you can use Voicemeeter (which is totally free).
I have explained BOTH solutions in this video: https://youtu.be/lpvae_2WOSQ
Best Regards

Flex 4 > spark.components.VideoPlayer > How to switch bit rate?

The VideoPlayer component (possibly VideoDisplay also) is capable of somehow automatically picking the best-quality video from the list it's given. An example is here:
http://help.adobe.com/en_US/FlashPlatform/beta/reference/actionscript/3/spark/components/mediaClasses/DynamicStreamingVideoItem.html#includeExamplesSummary
I cannot find the answers to the questions below.
Assuming that the server streaming the recorded videos is capable of switching between versions of the same video at different bit rates and streaming them from any point within their timelines:
Is the bandwidth test/calculation within this component only done before the video starts playing, at which point it picks the best video source and never uses the other ones? Or does it continuously or periodically execute its bandwidth tests and accordingly switch between video sources during playback?
Does it support setting the video source through code, and can its automatic switching between video sources be turned off (in case I want to provide this functionality to the user in the form of a button/dropdown or similar)? I know that the preferred video source can be set, but this only means that that video source will be tested/attempted first.
What other media servers can be used with this component, besides the one provided by Adobe, to achieve automated and manual switching between different qualities of the same video?
Obviously, I'd like to create a player that is smart enough to automatically switch between different qualities of a video, and that will support manual instructions about which source to play - both without interrupting the playback, or at least without restarting it (minor interruptions are acceptable). Also, the playback needs to be able to start at any given point within the video, after enough data has been buffered (of course), but most importantly, I want to be able to start the playback beyond what's buffered. A note or two about fast-forwarding wouldn't hurt, if anyone knows anything.
Thank you for your time.

Use NSSpeechRecognizer or alternative with audio file instead of microphone input?

Is it possible to use NSSpeechRecognizer with a pre-recorded audio file instead of direct microphone input?
Or is there any other speech-to-text framework for Objective-C/Cocoa available?
Added:
Rather than using voice input at the machine that is running the application, external devices (e.g. an iPhone) could be used to send just a recorded audio stream to that desktop application. The desktop Cocoa app would then process it and do whatever it's supposed to do using the assigned commands.
Thanks.
I don't see any obvious way to switch the input programmatically, though the "Speech" companion guide's first paragraph in the "Recognizing Speech" section seems to imply other inputs can be used. I think this is meant to be set via System Preferences, though. I'm guessing it uses the primary audio input device selected there.
I suspect, though, you're looking for open-ended speech recognition, which NSSpeechRecognizer is not. If you're looking to transform any pre-recorded audio into text (ie, make a transcript of a recording), you're completely out of luck with NSSpeechRecognizer, as you must give it an array of "commands" to listen for.
Theoretically, you could feed it the whole dictionary, but I don't think that would work since you usually have to give it clear, distinct commands. Its performance would suffer, I would guess, if you gave it a bunch of stuff to analyze for (in real time).
Your best bet is to look at third-party open source solutions. There are a few generalized packages out there (none specifically for Cocoa/Objective-C), but this poses another question: what kind of recognition are you looking for? There are two main forms of speech recognition: 'trained' is more accurate but less flexible for different voices and recording environments, whereas 'open' is generally much less accurate.
It'd probably be best if you stated exactly what you're trying to accomplish.
