I'm recording speech using the MediaRecorder API; why are the chunks smaller than expected?

I have an issue with MediaRecorder API (https://www.w3.org/TR/mediastream-recording/#mediarecorder-api).
I'm using it to record speech from the web page in Chrome and save it as chunks. I need to be able to play it both while it is being recorded and afterwards, so it's important to keep those chunks.
Here is the code which is recording data:
navigator.mediaDevices.getUserMedia({ audio: true, video: false }).then(function(stream) {
  recorder = new MediaRecorder(stream, { mimeType: 'audio/webm; codecs="opus"' })
  var previous_timecode = null
  recorder.ondataavailable = function(e) {
    var duration = previous_timecode ? e.timecode - previous_timecode : null
    previous_timecode = e.timecode
    // Read blob from `e.data`, base64-encode it and send it to the server;
    // additionally send the duration calculated from the timecodes.
    // The duration of the first chunk is calculated in a different way.
  }
  recorder.start(1000)
})
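(For reference, the body of that handler might look roughly like the sketch below; the /chunks endpoint and the payload shape are assumptions for illustration, not part of the original code.)
recorder.ondataavailable = function(e) {
  var duration = previous_timecode ? e.timecode - previous_timecode : null
  previous_timecode = e.timecode
  var reader = new FileReader()
  reader.onloadend = function() {
    // reader.result is a data URL; strip the prefix to keep only the base64 payload
    var base64 = reader.result.split(',')[1]
    fetch('/chunks', { // hypothetical endpoint
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ data: base64, duration: duration })
    })
  }
  reader.readAsDataURL(e.data)
}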
The issue actually happened only once, but it is still quite a scary one. During 7 minutes of recording I got only 5 minutes of audio. Analyzing the chunks gave me the following picture: at some point the chunks became much smaller than expected - the data was still emitted every second, but the duration of each chunk was only around 400-700 ms.
The audio itself was correct and had no gaps, it just arrived with a growing delay. At some points the duration of a chunk grew a bit - up to 4.8 s in a single chunk - but the total delay still grew to about 2 minutes.
In the CSV attached (https://transfer.sh/stgnW/1.csv) you can see the duration of each chunk calculated with ffmpeg (the duration of an audio file containing the first n chunks minus the duration of a file containing the first n-1 chunks) and also the durations calculated from the e.timecode values.
It looks like some throttling issue - is there something like that in Chrome? How could I fix my code to make sure it's not throttled that way?

It is a long shot, but try disabling any browser-level processing that might be applied to the audio by passing these constraints:
echoCancellation: false
autoGainControl: false
noiseSuppression: false
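For example, the constraints could be passed when requesting the stream; this is just a sketch built on the questioner's code above:
navigator.mediaDevices.getUserMedia({
  audio: {
    echoCancellation: false,
    autoGainControl: false,
    noiseSuppression: false
  },
  video: false
}).then(function(stream) {
  recorder = new MediaRecorder(stream, { mimeType: 'audio/webm; codecs="opus"' })
  // ... same recording code as above
})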

Related

Nifi Group Content by Given Attributes

I am trying to run a script or a custom processor to group data by given attributes every hour. Queue size is up to 30-40k on a single run and it might go up to 200k depending on the case.
MergeContent does not fit since there is no limit on min-max counts.
RouteOnAttribute does not fit since there are too many combinations.
Solution 1: Consume all flow files, group them by attributes, create the new flow files and push them. Not ideal, but I gave it a try.
While running this I had 33k flow files waiting in the queue.
session.getQueueSize().getObjectCount()
This call returned 10k all the time, even though I increased the queue threshold numbers on the output connections.
Solution 2: A better approach is to consume one flow file and then filter the flow files matching the provided attributes:
final List<FlowFile> flowFiles = session.get(file -> {
    if (correlationId.equals(Arrays.stream(keys).map(file::getAttribute).collect(Collectors.joining(":"))))
        return FlowFileFilter.FlowFileFilterResult.ACCEPT_AND_CONTINUE;
    return FlowFileFilter.FlowFileFilterResult.REJECT_AND_CONTINUE;
});
Again, with 33k flow files waiting in the queue I was expecting around 200 new grouped flow files, but 320 were created. It looks like the same issue as above: the filter does not scan all waiting flow files.
Problems/Questions:
Is there a parameter to change so that getObjectCount can report up to 300k?
Is there a way to make the filter see all waiting flow files, by changing a parameter or by changing the processor?
I tried setting the default queue threshold to 300k in nifi.properties, but it didn't help.
In nifi.properties there is a parameter that affects this batching behavior:
nifi.queue.swap.threshold=20000
Here is my test flow:
1. GenerateFlowFile with "batch size = 50K"
2. ExecuteGroovyScript with the script below
3. LogAttribute (disabled) - just to have a queue after the Groovy processor
groovy script:
def ffList = session.get(100000) // get batch with maximum 100K files from incoming queue
if(!ffList)return
def ff = session.create() // create new empty file
ff.batch_size = ffList.size() // set attribute to real batch size
session.remove(ffList) // drop all incoming batch files
REL_SUCCESS << ff // transfer new file to success
With the parameters above, 4 files are generated in the output:
1. batch_size = 20000
2. batch_size = 10000
3. batch_size = 10000
4. batch_size = 10000
According to the documentation:
There is also the notion of "swapping" FlowFiles. This occurs when the number of FlowFiles in a connection queue exceeds the value set in the nifi.queue.swap.threshold property. The FlowFiles with the lowest priority in the connection queue are serialized and written to disk in a "swap file" in batches of 10,000.
This explains that of the 50K incoming files, 20K are kept in memory and the rest are swapped to disk in batches of 10K.
I don't know how increasing the nifi.queue.swap.threshold property will affect your system's performance and memory consumption, but I set it to 100K on my local NiFi 1.16.3 and it looks fine with multiple small files, and the first batch increased to 100K as a result.
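Concretely, the only change I tested was raising the swap threshold in nifi.properties and restarting NiFi; whether 100K is safe for your heap is something you would need to verify in your environment:
# nifi.properties
nifi.queue.swap.threshold=100000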

How to do synchronized playback using the AVAudioPlayer API?

There is an example here on using AVAudioPlayer. In the description it says it's able to:
Play multiple sounds at the same time with optional synchronization.
I don't see how to do that in the example.
Apple API that says the same thing:
Play multiple sounds simultaneously by synchronizing the playback of multiple players
https://developer.apple.com/documentation/avfaudio/avaudioplayer?language=objc
Example:
https://github.com/xamarin/docs-archive/tree/master/Recipes/ios/media/sound/avaudioplayer
Note: The repository is archived and does not allow adding issues.
Use the playAtTime() method on all the players you want and pass the same time value to each so they start playing at the same moment.
I read about the playAtTime() method and thought it meant "play from this position in time within the sound", because the parameter is named time, not date; but it actually takes a point on the audio output device's timeline, i.e. a moment slightly in the future at which playback should start.
So if you are only looking at the autocomplete API and it says playAtTime(time), you don't get the details you do when looking at the documentation. Note that there is a separate property on the player, currentTime, which is a plain number giving the playback position within the sound.
Documentation:
Plays audio asynchronously, starting at a specified point in the audio
output device’s timeline.
func startSynchronizedPlayback() {
    // Create a time offset relative to the current device time.
    let timeOffset = playerOne.deviceCurrentTime + 0.01
    // Start playback of both players at the same time.
    playerOne.play(atTime: timeOffset)
    playerTwo.play(atTime: timeOffset)
}
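For context, here is a minimal sketch of how the two players might be created before calling the function above; the file names are assumptions, and prepareToPlay() is used so both buffers are already loaded when playback is scheduled:
import AVFoundation

// Hypothetical resource names; substitute your own bundled audio files.
let urlOne = Bundle.main.url(forResource: "drums", withExtension: "caf")!
let urlTwo = Bundle.main.url(forResource: "bass", withExtension: "caf")!

let playerOne = try! AVAudioPlayer(contentsOf: urlOne)
let playerTwo = try! AVAudioPlayer(contentsOf: urlTwo)

// Preloading keeps the actual start times as close together as possible.
playerOne.prepareToPlay()
playerTwo.prepareToPlay()

startSynchronizedPlayback()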

THREE.js DefaultLoadingManager onProgress function returns the wrong number of total items at first

I used this code to calculate the percentage of loading:
THREE.DefaultLoadingManager.onProgress = (item, loaded, total) => {
  console.error(loaded / total * 100);
};
It reaches about 80% and then drops back to 60%, then it reaches 90% and drops back to 80%.
After some debugging, I found that the total number of items is 25 at first, then increases to about 35, and finally reaches 52.
This increase is caused by JSONLoader: I load some objects whose materials reference images, so the onProgress function adds these images to the total number of items to be loaded.
I want to know how to get the real number of items to be loaded (52) at the start. If that is not possible, how do I avoid the progress going back from 80% to 60%?
A couple of things you could do:
You can run your load once, record the final item count, and hardcode it for the next run.
or..
Use a format like GLTF with all the assets embedded. Then you'll get one item per model.
or..
Fire off all your loads in parallel, don't respond to the first few onProgress calls, and hopefully capture the complete item count before you start displaying progress.
or.. make a progress bar that always advances by some percentage of the remaining time, and maybe adjust that percentage to roughly match the load time on your machine, or adjust it dynamically as you get more information about remaining loads (see the sketch below).
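For the specific problem of the percentage going backwards, one simple option (a sketch, not tied to any particular three.js version) is to never let the reported value decrease, since loaded / total only drops because total grows mid-load:
let lastPercent = 0;

THREE.DefaultLoadingManager.onProgress = (item, loaded, total) => {
  // total can grow while loading (e.g. JSONLoader discovering textures),
  // so clamp the reported value so it never goes backwards.
  const percent = loaded / total * 100;
  lastPercent = Math.max(lastPercent, percent);
  updateProgressBar(lastPercent); // hypothetical UI update function
};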

How to get all the videos of a YouTube channel with the Yt gem?

I want to use the Yt gem to get all the videos of a channel. I configured the gem with my YouTube Data API key.
Unfortunately, when I use it, it returns a maximum of ~1000 videos, even for channels that have more than 1000 videos. Yt::Channel#video_count returns the correct number of videos.
channel = Yt::Channel.new id: "UCGwuxdEeCf0TIA2RbPOj-8g"
channel.video_count # => 1845
channel.videos.map(&:id).size # => 949
The YouTube API can't be set to return more than 50 items per request, so I guess Yt automatically performs several requests, going through each next page of results, to be able to return more than 50 results.
For some reason, though, it does not go through all the result pages. I don't see a way in Yt to control how it goes through the pages of results. In particular, I could not find a way to force it to fetch a single page of results, read the returned nextPageToken, and perform a new request with this value.
Any idea?
Looking into the gem's /spec folder, you can see a test for your code:
describe 'when the channel has more than 500 videos' do
  let(:id) { 'UC0v-tlzsn0QZwJnkiaUSJVQ' }
  specify 'the estimated and actual number of videos can be retrieved' do
    # @note: in principle, the following three counters should match, but
    #   in reality +video_count+ and +size+ are only approximations.
    expect(channel.video_count).to be > 500
    expect(channel.videos.size).to be > 500
  end
end
I did some tests, and what I noticed is that video_count is the number displayed on YouTube next to the channel's name. This value is not accurate; I'm not really sure what it represents.
If you do channel.videos.size, the number is not accurate either, because the videos collection can contain some empty(?) records.
If you do channel.videos.map(&:id).size, the returned value should be correct. By correct I mean it should equal the number of videos listed at:
https://www.youtube.com/channel/:channel_id/videos
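If you want to work around the gem entirely, the underlying YouTube Data API v3 exposes nextPageToken directly. Below is a rough sketch of paging through a channel's uploads playlist; the API key is a placeholder, and the playlist ID assumes the common convention that the uploads playlist is the channel ID with the leading "UC" replaced by "UU":
require 'net/http'
require 'json'

API_KEY     = 'YOUR_API_KEY'             # placeholder: your YouTube Data API key
PLAYLIST_ID = 'UUGwuxdEeCf0TIA2RbPOj-8g' # assumed uploads playlist for the channel above

def fetch_all_video_ids(playlist_id, api_key)
  ids = []
  page_token = nil
  loop do
    params = { part: 'contentDetails', playlistId: playlist_id, maxResults: 50, key: api_key }
    params[:pageToken] = page_token if page_token
    uri = URI('https://www.googleapis.com/youtube/v3/playlistItems')
    uri.query = URI.encode_www_form(params)
    body = JSON.parse(Net::HTTP.get(uri))
    ids += body['items'].map { |item| item['contentDetails']['videoId'] }
    page_token = body['nextPageToken']
    break unless page_token
  end
  ids
end

puts fetch_all_video_ids(PLAYLIST_ID, API_KEY).size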

Hard Disk scheduling simulator algorithm (track to track timing) Perl

I am trying to get to grips with Perl. I am trying to write a few scripts as a scheduling simulator: FCFS, SSTF, SCAN and LOOK.
I have one array with a list of block requests and another to act as the buffer. First I copy over the first request, then I need to work out the time it takes to get from the first to the second block.
The buffer reads in blocks at 1 per ms; seek, search and access time are all 1 ms to make the calculations a bit easier, and the simulator always starts on block 1, track 1.
http://postimg.org/image/d9osb8tkj/
So if the first block is 5, the search time will be 3 ms to traverse to the start of the 5th block, the seek time will be zero since it is on the same track, and the access time to read the block will always be 1 ms. This means that the time for this request is 4 ms, so the simulator will read the next 4 requests into the buffer. In first come first served this is just the order in which the requests arrive.
So if the next request to serve is 12, the arm is at the end of the 5th block, so it will take 2 ms to get to the right track, then 1 ms to get to the start of the 12th block and another 1 ms to access it.
I was just wondering if anyone could give me some idea of how I could express this as an algorithm. Just some pointers would be much appreciated.
Write a class HardDiskSim::Abstract with three subs: seek_time(), spin_time() and read_time().
Write a subclass of HardDiskSim::Abstract for each different set of values/logic for the three methods.
For example:
package HardDiskSim::Simple;
use base qw(HardDiskSim::Abstract);

our $SECTORS_PER_TRACK = 5;
our $SEEK_TIME_PER_TRACK = 1;

sub read_time { return 1 }

sub seek_time {
    my ($self, $block) = @_;
    my $tracks_to_seek = int($block / $SECTORS_PER_TRACK);
    return $tracks_to_seek * $SEEK_TIME_PER_TRACK;
}

sub spin_time {
    # compute head position at end of seek using seek time and RPM of disk
    # compute number of sectors to spin past using computed head position
    # return number_of_sectors_to_spin_past * time_per_sector
}
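To turn that into the first-come-first-served simulation you describe, a rough driver might look like the sketch below. Note that new() and the way the three times are combined are assumptions layered on top of the skeleton above, and the current head position is not modelled here:
use strict;
use warnings;

my $disk = HardDiskSim::Simple->new;   # assumes the base class provides a constructor
my @requests = (5, 12, 3, 20);         # block numbers, served in arrival order (FCFS)
my $clock = 0;

for my $block (@requests) {
    my $time = $disk->seek_time($block)
             + ($disk->spin_time($block) // 0)   # spin_time is still a stub above
             + $disk->read_time($block);
    $clock += $time;
    printf "served block %d at t=%d ms\n", $block, $clock;
}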
I had the fun of writing this kind of code in Fortran, for a class, back in 1985.
