NTFS + CreateFile: specify stream offset in pathname? - winapi

Is it possible to specify a stream offset in the pathname when opening an NTFS file data stream via CreateFile?
What if the pathname starts with \\?\?
E.g. abcd.txt::$DATA specifies offset 0 in the unnamed stream*; is it possible to specify a different offset within the pathname**?
*technically, this also means an offset equal to the stream length when WriteFile is called in append mode
**without ever making use of SetFilePointer

There is no syntax that lets you specify a stream offset in the pathname. See MSDN for the supported syntax:
File Streams
You must seek to the desired offset after opening the stream.
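For illustration, here is a minimal C sketch of the required two-step approach: open the stream (the unnamed one, named explicitly here), then seek. The file name and the offset of 4096 are made-up examples.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* "abcd.txt::$DATA" names the unnamed data stream explicitly;
       no pathname syntax can also encode an offset. */
    HANDLE h = CreateFileW(L"abcd.txt::$DATA", GENERIC_READ,
                           FILE_SHARE_READ, NULL, OPEN_EXISTING,
                           FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateFileW failed: %lu\n", GetLastError());
        return 1;
    }

    /* The offset must be set on the handle after opening. */
    LARGE_INTEGER off;
    off.QuadPart = 4096; /* example offset */
    if (!SetFilePointerEx(h, off, NULL, FILE_BEGIN)) {
        fprintf(stderr, "SetFilePointerEx failed: %lu\n", GetLastError());
        CloseHandle(h);
        return 1;
    }

    /* ReadFile/WriteFile now operate from byte 4096 onward. */
    CloseHandle(h);
    return 0;
}

If the goal is merely to avoid SetFilePointer itself, note that ReadFile and WriteFile also accept a per-call offset through the OVERLAPPED structure; the offset still cannot be expressed in the pathname, though.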

Related

Faster way of getting number of key frames than "show_frames" in ffprobe?

I'm making a little in-house utility using ffmpeg and ffprobe. It works fine and does what is needed: give a count of the number of key frames in a video file, plus some other details.
Alas, with the large video files this will be used on, it can take many seconds for show_frames to return, and I then have to parse the JSON dump of frame data and keep a running count of the total key frames.
Is there a faster way? Perhaps it is listed in the "stream" or "format" data dumps and I am not recognizing what it is being called? I've been through the ffmpeg and ffprobe docs and didn't find anything else.
For MP4 and MOV files, you can get this info by reading the contents of the STSS box.
You can use a tool like MP4parser, which will generate a log file with an entry like this:
/moov/trak/mdia/minf/stbl/stss # 0x1d7218e
Box size: 0x74 version: 0x0 flags: 0x0
entry_count: 0x19
sample_number:
0x1 0x86 0x180 0x27a ....
That entry_count value (in hex) is the number of key frames you want.
Alternatively, AtomicParsley will also tell you the location of the STSS box within the file, and you can then read it directly.
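If you would rather not depend on an external tool, the stss payload is simple to read yourself: after the 8-byte box header come a 1-byte version, 3 bytes of flags, and a big-endian 32-bit entry_count, which is the key-frame count. Below is a quick-and-dirty C sketch that scans for the stss fourcc and prints the count. A raw byte scan can in principle hit a false positive, so a robust parser should instead walk the box hierarchy (moov/trak/mdia/minf/stbl).

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s file.mp4\n", argv[0]); return 1; }
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    uint8_t win[4] = {0}; /* 4-byte sliding window over the file */
    long pos = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        win[0] = win[1]; win[1] = win[2]; win[2] = win[3]; win[3] = (uint8_t)c;
        pos++;
        if (pos >= 4 && memcmp(win, "stss", 4) == 0) {
            uint8_t hdr[8]; /* version(1) + flags(3) + entry_count(4) */
            if (fread(hdr, 1, 8, f) != 8) break;
            uint32_t entries = (uint32_t)hdr[4] << 24 | (uint32_t)hdr[5] << 16
                             | (uint32_t)hdr[6] << 8  | (uint32_t)hdr[7];
            printf("stss box at offset %ld, key frames: %u\n", pos - 8, entries);
            fclose(f);
            return 0;
        }
    }
    fprintf(stderr, "no stss box found\n");
    fclose(f);
    return 1;
}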

FFmpeg - Is it difficult to use?

I am trying to use ffmpeg and have been experimenting with it for the last month.
I have not been able to get through. Is it really difficult to use FFmpeg?
My requirement is simple, as below.
Can you please guide me on whether ffmpeg is a suitable choice, or whether I have to implement this on my own (using the available codec libraries)?
I have a webm file (containing VP8 and OPUS frames).
I will read the encoded data and send it to a remote guy.
The remote guy will read the encoded data from the socket.
The remote guy will write it to a file (can we avoid decoding?).
Then the remote guy should be able to play the file using ffplay or any player.
Now I will take a specific example.
Say I have a file small.webm, containing VP8 and OPUS frames.
I am reading only the audio frames (OPUS) using the av_read_frame API (then checking the stream index and keeping only audio packets).
So now I have the encoded data buffer in packet.data and the encoded data size in packet.size (please correct me if I am wrong).
Here is my first doubt: the audio packet size is not the same every time. Why the difference? Sometimes the packet size is as low as 54 bytes and sometimes it is 420 bytes. Does the OPUS frame size vary from frame to frame?
Next, say I somehow extract a single frame (I really do not know how to extract a single frame) from the packet and send it to the remote guy.
Now the remote guy needs to write the buffer to a file. To write the file, we can use the av_interleaved_write_frame or av_write_frame API. Both of them take an AVPacket as an argument. I can create an AVPacket and set its data and size members, then call av_write_frame. But that does not work. The reason may be that one should set the other members of the packet, like dts and pts, but I do not have that information to set.
Can somebody help me figure out whether FFmpeg is the right choice, or whether I should write custom logic, e.g. parse an OPUS file and extract it frame by frame?
Now the remote guy needs to write the buffer to a file. To write the file, we can use the av_interleaved_write_frame or av_write_frame API. Both of them take an AVPacket as an argument. I can create an AVPacket and set its data and size members, then call av_write_frame. But that does not work. The reason may be that one should set the other members of the packet, like dts and pts, but I do not have that information to set.
Yes, you do. They were in the original packet you received from the demuxer on the sender side. You need to serialize all the information in this packet and set each value accordingly in the receiver.
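As a rough illustration of that, here is a minimal C sketch. The send_bytes/recv_bytes helpers are hypothetical stand-ins for your socket code, and this naive layout assumes both ends share the same endianness and the same FFmpeg build (so struct field sizes match).

#include <libavformat/avformat.h>

/* Hypothetical transport helpers -- replace with your socket code. */
void send_bytes(const void *buf, size_t len);
void recv_bytes(void *buf, size_t len);

/* Sender: ship the fields the muxer needs along with the payload. */
void send_packet(const AVPacket *pkt)
{
    send_bytes(&pkt->pts, sizeof pkt->pts);
    send_bytes(&pkt->dts, sizeof pkt->dts);
    send_bytes(&pkt->duration, sizeof pkt->duration);
    send_bytes(&pkt->stream_index, sizeof pkt->stream_index);
    send_bytes(&pkt->flags, sizeof pkt->flags);
    send_bytes(&pkt->size, sizeof pkt->size);
    send_bytes(pkt->data, (size_t)pkt->size);
}

/* Receiver: rebuild the packet and hand it to the muxer. */
int recv_and_write_packet(AVFormatContext *out)
{
    int64_t pts, dts, duration;
    int stream_index, flags, size;

    recv_bytes(&pts, sizeof pts);
    recv_bytes(&dts, sizeof dts);
    recv_bytes(&duration, sizeof duration);
    recv_bytes(&stream_index, sizeof stream_index);
    recv_bytes(&flags, sizeof flags);
    recv_bytes(&size, sizeof size);

    AVPacket *pkt = av_packet_alloc();
    if (!pkt || av_new_packet(pkt, size) < 0) {
        av_packet_free(&pkt);
        return -1;
    }
    recv_bytes(pkt->data, (size_t)size);
    pkt->pts = pts;
    pkt->dts = dts;
    pkt->duration = duration;
    pkt->stream_index = stream_index;
    pkt->flags = flags;

    /* If the output stream's time base differs from the sender's,
       rescale with av_packet_rescale_ts() before writing. */
    int ret = av_interleaved_write_frame(out, pkt);
    av_packet_free(&pkt);
    return ret;
}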

Does Apache Pig handle bz2 files natively?

I can see that Pig can read .bz2 files natively, but I am not sure whether it runs an explicit job to split the bz2 into multiple input splits. Can anyone confirm this? If Pig is running a job to create input splits, is there a way to avoid that? I mean a way to have the MapReduce framework split bz2 files into multiple input splits at the framework level.
Splittable input formats are not implemented in Hadoop (or in Pig, which just runs MR jobs for you) such that a file is split by one job and the splits are then processed by a second job.
The input format defines an isSplitable method, which declares whether the file format can in principle be split. In addition to this, most text-based formats will check whether the file is using a known compression codec (for example gzip or bzip2) and whether the codec supports splits (gzip doesn't, in principle, but bz2 does).
If the input format / codec does allow splitting of the file, then splits are defined at fixed (and configurable) points in the compressed file (say every 64 MB). When a map task is created to process a split, it asks the input format to create a record reader for the file, passing in the split information for where the reader should start (the 64 MB block offset). The reader is then told to seek to the start offset of the split. At this point the underlying codec seeks to that point in the compressed file and scans forward until it finds the next compressed block header (in the case of bz2). Reads then continue as normal on the uncompressed stream returned by the codec, until the split end point has been passed in the uncompressed stream.
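To make the last step concrete, here is a small C sketch of the scan-forward idea (an illustration of the mechanism only, not Hadoop's actual reader). Each bzip2 block starts with the 48-bit magic 0x314159265359, and because blocks are bit-aligned rather than byte-aligned, the scan advances one bit at a time.

#include <stdio.h>
#include <stdint.h>

#define BZ2_BLOCK_MAGIC 0x314159265359ULL /* 48-bit block header magic */

/* Return the bit offset of the next block header at or after the given
   byte offset, or -1 if none is found. (Use fseeko/_fseeki64 for files
   larger than 2 GB.) */
long long find_next_bz2_block(FILE *f, long offset)
{
    if (fseek(f, offset, SEEK_SET) != 0)
        return -1;

    uint64_t window = 0; /* last 48 bits seen */
    long long bits = 0;  /* bits consumed since `offset` */
    int c;
    while ((c = fgetc(f)) != EOF) {
        for (int i = 7; i >= 0; i--) {
            window = ((window << 1) | (uint64_t)((c >> i) & 1)) & 0xFFFFFFFFFFFFULL;
            bits++;
            if (bits >= 48 && window == BZ2_BLOCK_MAGIC)
                return (long long)offset * 8 + bits - 48;
        }
    }
    return -1;
}

A record reader built on this would start delivering records from the first block found after its split start and stop once it passes the first block after its split end, which is how neighbouring splits avoid duplicating records.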

How to pass Java InputStream to JRuby io that supports pos/seek

I need to call a (J)Ruby script from the Java runtime, and I want to pass an input stream as a parameter.
On the Ruby side, I'm using to_io to convert the input stream:
io = my_stream.to_io
I'm getting these errors:
org.jruby.exceptions.RaiseException: (Errno::ESPIPE) Illegal seek
at org.jruby.RubyIO.pos(org/jruby/RubyIO.java:1602) ~[jruby-core-1.7.4.jar:na]
The question is: are there better options for converting an input stream to an IO that supports pos and seek?
Your stream is coming in as a pipe, and a pipe is not seekable. Since seek and pos won't work on a pipe, you have to read from it instead:
io_stream.read(number_of_bytes)

Failing to generate a correct wav file from a raw stream

I captured a raw audio data stream together with its WAVEFORMATEXTENSIBLE struct.
WAVEFORMATEXTENSIBLE is shown in the figure below:
Following the standard of wav file, I tried to write the raw bits into a wav file.
What I do is:
write "RIFF".
write a DWORD. (filesize - 8, i.e. the size of everything after "RIFF" and this DWORD).
=== WaveFormat Chunk ===
write "WAVEfmt "
write a DWORD. (size of the WAVEFORMATEXTENSIBLE struct)
write the WAVEFORMATEXTENSIBLE struct.
=== Fact Chunk ===
write "fact"
write a DWORD. ( 4 )
write a DWORD. ( num of samples in the stream, which should be sizeof(rawdata)*8/wBitsPerSample ).
=== Data Chunk ===
write "data"
write a DWORD (size of rawdata)
write the raw data.
After getting the wav file from the above steps, I played it with a media player and there is no sound. Playing it with Audacity gives a distorted sound: I can hear that it is the correct audio I want, but it is distorted with noise.
The raw data can be found here.
The wav file I generated is here.
It is very confusing to me, because when I use the same method to convert IEEE-float data to a wav file, it works just fine.
I figured this out. It seems the GetBuffer/ReleaseBuffer cycle in IAudioRenderClient produces raw data in the same format as the one passed into the Initialize method of the IAudioClient.
GetMixFormat in IAudioClient returns, in my case, a format different from the one passed into Initialize; I think GetMixFormat gets the format that the device supports.
IAudioClient must be doing the conversion from the initialized format to the mix format. I intercepted the Initialize method, got the format from there, and it works like a charm.
I'm intercepting WASAPI to access the audio data and faced the exact same issue: the audio file generated from the data sounds like the correct content but is very noisy, even though the frame rate, sample width, number of channels etc. are set properly.
The SubFormat field of WAVEFORMATEXTENSIBLE shows that the data is actually KSDATAFORMAT_SUBTYPE_IEEE_FLOAT, while I originally treated it as integers. According to this page, KSDATAFORMAT_SUBTYPE_IEEE_FLOAT is equivalent to WAVE_FORMAT_IEEE_FLOAT in WAVEFORMATEX. Hence, setting the "audio format" field in the wav file's fmt chunk (normally starting at byte offset 20) to WAVE_FORMAT_IEEE_FLOAT (which is 3) solved the problem. Remember to write it in little-endian order.
(Figures: the audio format field's original value, and the value after modification.)
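To illustrate the fix, here is a minimal C sketch that writes a float-PCM header with that format tag. It emits the plain 44-byte header; strictly speaking, non-PCM formats should also carry a fact chunk like the one in the steps above, and the helpers assume a little-endian host.

#include <stdio.h>
#include <stdint.h>

static void write_u32(FILE *f, uint32_t v) { fwrite(&v, 4, 1, f); } /* little-endian host assumed */
static void write_u16(FILE *f, uint16_t v) { fwrite(&v, 2, 1, f); }

/* Minimal 44-byte header for 32-bit IEEE-float samples. The crucial
   field is the format tag of 3 (WAVE_FORMAT_IEEE_FLOAT), not 1. */
void write_float_wav_header(FILE *f, uint32_t sample_rate,
                            uint16_t channels, uint32_t data_bytes)
{
    const uint16_t bits = 32;
    fwrite("RIFF", 1, 4, f);
    write_u32(f, 36 + data_bytes);                      /* file size - 8 */
    fwrite("WAVE", 1, 4, f);
    fwrite("fmt ", 1, 4, f);
    write_u32(f, 16);                                   /* fmt chunk size */
    write_u16(f, 3);                                    /* WAVE_FORMAT_IEEE_FLOAT */
    write_u16(f, channels);
    write_u32(f, sample_rate);
    write_u32(f, sample_rate * channels * (bits / 8));  /* byte rate */
    write_u16(f, channels * (bits / 8));                /* block align */
    write_u16(f, bits);
    fwrite("data", 1, 4, f);
    write_u32(f, data_bytes);
}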
