Please note: I'm looking for a pure Go implementation, no ffmpeg wrappers or cgo.
Is there a pure Go implementation that transcodes AAC to Opus?
I've written a streaming-to-WebRTC gateway app that converts AV streams from streaming devices to WebRTC via pion. Now there's a tricky problem: the audio encoding these media devices provide is usually AAC, which WebRTC does not support. I can't find a library that implements AAC -> Opus (or PCM -> Opus) in pure Go, only libraries based on cgo (like this one). The cgo-based libraries have some limitations, e.g. they can't be self-contained. So, is there a pure Go implementation that transcodes AAC to Opus?
The code snippet below is my current implementation, using Glimesh's fdk-aac and hraban's opus:
...
// Audio-related configuration
// https://github.com/Glimesh/go-fdkaac
aacDecoder := fdkaac.NewAacDecoder()
defer func() {
	_ = aacDecoder.Close()
}()
aacDecoderInitDone := false
var opusEncoder *hrabanopus.Encoder
minAudioSampleRate := 16000 // skip Opus encoding below this sample rate
var opusAudioBuffer []byte
opusBlockSize := 960   // PCM frames per Opus packet
opusBufferSize := 1000 // output buffer size for encoded Opus data
opusFramesSize := 120  // hard-coded sample duration in milliseconds
...
// Initialize the AAC decoder with the AAC sequence header (metadata)
// https://github.com/winlinvip/go-fdkaac
if tag.AACPacketType == flvio.AAC_SEQHDR {
	if !aacDecoderInitDone {
		if err := aacDecoder.InitRaw(tagData); err != nil {
			return errors.Wrapf(err, "failed to initialize AAC decoder from the %s.%s tag (audio metadata %s)", flowControlGroup, streamKey, hex.EncodeToString(tagData))
		}
		aacDecoderInitDone = true
		logrus.Infof("initialized AAC decoder from the %s.%s tag (audio metadata %s): %p", flowControlGroup, streamKey, hex.EncodeToString(tagData), aacDecoder)
	}
} else {
	tagDataString := hex.EncodeToString(tagData)
	logrus.Tracef("using initialized AAC decoder %p to decode audio data from %s.%s: %s", aacDecoder, flowControlGroup, streamKey, tagDataString)
	// Decode AAC into PCM
	decodeResult, err := aacDecoder.Decode(tagData)
	if err != nil {
		return errors.Wrapf(err, "failed to decode PCM data from the %s.%s tag", flowControlGroup, streamKey)
	}
	rate := aacDecoder.SampleRate()
	channels := aacDecoder.NumChannels()
	if rate < minAudioSampleRate {
		logrus.Tracef("PCM decoded from the %s.%s tag has sample rate %d, below the required minimum (%d); skipping Opus encoding", flowControlGroup, streamKey, rate, minAudioSampleRate)
	} else {
		if opusEncoder == nil {
			oEncoder, err := hrabanopus.NewEncoder(rate, channels, hrabanopus.AppAudio)
			if err != nil {
				return err
			}
			opusEncoder = oEncoder
		}
		// https://github.com/Glimesh/waveguide/blob/a7e7745be31d0a112aa6adb6437df03960c4a5c5/internal/inputs/rtmp/rtmp.go#L289
		// https://github.com/xiangxud/rtmp-to-webrtc/blob/07d7da9197cedc3756a1c87389806c3670b9c909/rtmp.go#L168
		// Each Opus block consumes opusBlockSize stereo frames of 16-bit PCM, i.e. opusBlockSize*4 bytes.
		for opusAudioBuffer = append(opusAudioBuffer, decodeResult...); len(opusAudioBuffer) >= opusBlockSize*4; opusAudioBuffer = opusAudioBuffer[opusBlockSize*4:] {
			// Reassemble the little-endian byte stream into interleaved int16 samples.
			pcm16 := make([]int16, opusBlockSize*2)
			pcm16len := len(pcm16)
			for i := 0; i < pcm16len; i++ {
				pcm16[i] = int16(binary.LittleEndian.Uint16(opusAudioBuffer[i*2:]))
			}
			opusData := make([]byte, opusBufferSize)
			n, err := opusEncoder.Encode(pcm16, opusData)
			if err != nil {
				return err
			}
			opusOutput := opusData[:n]
			// https://datatracker.ietf.org/doc/html/rfc6716#section-2.1.4
			// Opus can encode frames of 2.5, 5, 10, 20, 40, or 60 ms. It can also
			// combine multiple frames into packets of up to 120 ms. For real-time
			// applications, sending fewer packets per second reduces the bitrate,
			// since it reduces the overhead from IP, UDP, and RTP headers.
			// However, it increases latency and sensitivity to packet losses, as
			// losing one packet constitutes a loss of a bigger chunk of audio.
			// Increasing the frame duration also slightly improves coding
			// efficiency, but the gain becomes small for frame sizes above 20 ms.
			// For this reason, 20 ms frames are a good choice for most
			// applications.
			sampleDuration := time.Duration(opusFramesSize) * time.Millisecond
			sample := media.Sample{
				Data:     opusOutput,
				Duration: sampleDuration,
			}
			if err := audioTrack.WriteSample(sample); err != nil {
				return err
			}
		}
	}
}
...
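One caveat worth noting in the snippet above: opusFramesSize is hard-coded to 120 ms, but a 960-sample block is only 20 ms at 48 kHz (or 60 ms at 16 kHz), so the advertised sample duration can drift from what is actually encoded. A minimal sketch of deriving the duration from the decoder-reported sample rate instead, assuming opusBlockSize counts PCM frames per channel:
// Sketch: compute the wall-clock duration of one Opus block from the
// decoder-reported sample rate rather than a hard-coded constant.
// 960 frames at 48000 Hz -> 20ms; at 16000 Hz -> 60ms.
sampleDuration := time.Duration(opusBlockSize) * time.Second / time.Duration(rate)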
Also, is there a pure-Go ffmpeg alternative (not a wrapper)?
Related
I am trying to use the backend to serve video from storage. I am using Go + Gin. It works, but I need to implement video requests with start and end parameters. For example, I have a video with a 10-minute duration and I want to request a fragment from minute 2 to minute 3. Is this possible, and are there examples somewhere?
This is what I have now:
accessKeyID := ""
secretAccessKey := ""
useSSL := false
ctx := context.Background()
endpoint := "127.0.0.1:9000"
bucketName := "mybucket"
// Initialize minio client object.
minioClient, err := minio.New(endpoint, &minio.Options{
	Creds:  credentials.NewStaticV4(accessKeyID, secretAccessKey, ""),
	Secure: useSSL,
})
if err != nil {
	log.Fatalln(err)
}
// Get file
object, err := minioClient.GetObject(ctx, bucketName, "1.mp4", minio.GetObjectOptions{})
if err != nil {
	fmt.Println(err)
	return
}
objInfo, err := object.Stat()
if err != nil {
	return
}
// io.ReadFull reads exactly objInfo.Size bytes; a bare object.Read may return early.
buffer := make([]byte, objInfo.Size)
if _, err := io.ReadFull(object, buffer); err != nil {
	return
}
c.Writer.Header().Set("Content-Length", fmt.Sprintf("%d", objInfo.Size))
c.Writer.Header().Set("Content-Type", "video/mp4")
c.Writer.Header().Set("Connection", "keep-alive")
// The end of a Content-Range is inclusive, so it is Size-1, not Size.
c.Writer.Header().Set("Content-Range", fmt.Sprintf("bytes 0-%d/%d", objInfo.Size-1, objInfo.Size))
// c.Writer.Write(buffer)
c.DataFromReader(200, objInfo.Size, "video/mp4", bytes.NewReader(buffer), nil)
This will require your program to at least demux the media stream to get timing information out of it (if you're using a container that supports that), or to actually decode the video stream if it doesn't; in general, you can't know how many bytes you need to seek into a video file to reach a specific time¹.
As the output again needs to be a valid media container so that whoever requested it can deal with it, there's also going to be remuxing into an output container.
So pick a library that can do that and read its documentation. FFmpeg / libav is the classical choice there, but I have positively no idea whether someone has already written Go bindings for it. If not, doing so would be worthwhile.
¹ There are cases where you can, which would probably apply to MPEG transport streams with a fixed mux bitrate. But unless you're streaming video for actual TV towers or actual TV satellites that need a constant-rate data stream, you will not likely be dealing with these.
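If what the client actually sends are HTTP Range requests (byte ranges rather than start/end times), a minimal sketch, assuming a Gin handler and relying on the fact that *minio.Object implements io.ReadSeeker, is to delegate range handling to http.ServeContent:
func serveVideo(c *gin.Context, minioClient *minio.Client) {
	object, err := minioClient.GetObject(c.Request.Context(), "mybucket", "1.mp4", minio.GetObjectOptions{})
	if err != nil {
		c.AbortWithStatus(http.StatusInternalServerError)
		return
	}
	defer object.Close()
	objInfo, err := object.Stat()
	if err != nil {
		c.AbortWithStatus(http.StatusNotFound)
		return
	}
	// *minio.Object is seekable, so ServeContent can satisfy
	// "Range: bytes=..." requests without buffering the whole file.
	http.ServeContent(c.Writer, c.Request, objInfo.Key, objInfo.LastModified, object)
}
For true time-based clipping (minute 2 to minute 3) you would still need the demuxing approach described above.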
I am trying to implement forwarding a live stream to WebRTC for playing in the browser.
The work is based on the project https://github.com/aler9/rtsp-simple-server, and for WebRTC it uses the library pion/webrtc.
Below is my code for writing H.264 NALU samples to WebRTC; it
pulls H.264 data from a ring buffer, then
writes the H.264 NALUs to the WebRTC video track.
func (c *webrtcSession) runRead(ctx context.Context) error {
	c.ringBuffer, _ = ringbuffer.New(uint64(c.readBufferCount))
	h264FrameDuration := 33 * time.Millisecond
	for {
		item, ok := c.ringBuffer.Pull()
		if !ok {
			return nil
		}
		data := item.(*data)
		if c.videoTrack != nil && data.trackID == c.videoTrackID {
			if data.h264NALUs == nil {
				continue
			}
			// Prepend each NALU with an Annex-B start code.
			outBuf := []byte{}
			for _, nalu := range data.h264NALUs {
				outBuf = append(outBuf, []byte{0x00, 0x00, 0x00, 0x01}...)
				outBuf = append(outBuf, nalu...)
			}
			if err := c.videoTrack.WriteSample(media.Sample{Data: outBuf, Duration: h264FrameDuration}); err != nil {
				return err
			}
		}
	}
}
Below is a screen recording of the problem when I play it in the browser: https://youtu.be/twyiU9rF8AY
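One likely culprit (an assumption, not something confirmed in the question): the hard-coded 33 ms duration only matches a constant 30 fps source. If the demuxer exposes a presentation timestamp per access unit, deriving each sample's duration from the PTS delta keeps playback speed correct; a rough sketch, where data.pts and c.lastPTS are hypothetical time.Duration fields:
// Hypothetical sketch: derive each sample's duration from consecutive
// presentation timestamps instead of assuming a fixed 30 fps.
duration := data.pts - c.lastPTS
c.lastPTS = data.pts
if duration <= 0 {
	duration = h264FrameDuration // fall back to the 33 ms default
}
if err := c.videoTrack.WriteSample(media.Sample{Data: outBuf, Duration: duration}); err != nil {
	return err
}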
I have a stream of image data coming in from a ROS topic subscription. I am converting that data into a Go image format (image.RGBA). I want to encode this data and write it as a sample to a pion WebRTC video track.
I tried using the gen2brain x264 codec, but it failed due to high memory allocation.
Please guide me toward the correct way of doing this. Also, do I have to encode the image before writing the sample, or does pion encode the data automatically? I don't really know.
Here is my implementation
sub, err := h.Subscribe(cam.TopicName, func(msg *sensor_msgs.Image) {
	throttler(func() {
		width := int(msg.Width)
		height := int(msg.Height)
		data := msg.Data
		rect := image.Rect(0, 0, width, height)
		newImage := image.NewRGBA(rect)
		// The ROS image is packed RGB (3 bytes per pixel); expand to RGBA.
		for i := 0; i < (3 * width * height); i += 3 {
			modifiedColor := color.RGBA{
				R: data[i+0],
				G: data[i+1],
				B: data[i+2],
				A: 255,
			}
			newImage.Set((i/3)%width, (i/3)/width, modifiedColor)
		}
		if err = videoTrack.WriteSample(media.Sample{Data: <Data Here>, Duration: time.Second}); err != nil && err != io.ErrClosedPipe {
			panic(err)
		}
	})
})
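For what it's worth, pion itself does not encode: a TrackLocalStaticSample expects frames already encoded in the track's codec, so the image.RGBA must go through a VP8/H.264 encoder first. A sketch of the shape this takes, where encodeFrame is a hypothetical stand-in for whatever encoder is used:
// Hypothetical: encodeFrame compresses the RGBA frame into the track's
// codec (e.g. VP8 or H.264); pion will not do this for you.
payload, err := encodeFrame(newImage) // hypothetical helper
if err != nil {
	panic(err)
}
frameDuration := time.Second / 30 // assuming a 30 fps source
if err := videoTrack.WriteSample(media.Sample{Data: payload, Duration: frameDuration}); err != nil && err != io.ErrClosedPipe {
	panic(err)
}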
I have a video directly from the HTTP body in []byte format:
// Parsing video
videoData, err := ioutil.ReadAll(r.Body)
if err != nil {
	w.WriteHeader(UPLOAD_ERROR)
	w.Write([]byte("Error uploading the file"))
	return
}
and I need a single frame of the video, converted to a PNG. This is how someone would do it with a static, encoded file using ffmpeg:
filename := "test.mp4"
width := 640
height := 360
cmd := exec.Command("ffmpeg", "-i", filename, "-vframes", "1", "-s", fmt.Sprintf("%dx%d", width, height), "-f", "singlejpeg", "-")
var buffer bytes.Buffer
cmd.Stdout = &buffer
if cmd.Run() != nil {
	panic("could not generate frame")
}
How can I achieve the same with a raw video?
A user from Reddit told me that I might achieve this with https://ffmpeg.org/ffmpeg-protocols.html#pipe, but I was unable to find any resources.
Any help is appreciated, thanks.
EDIT: I tried to pipe the []byte slice to ffmpeg now, but ffmpeg does not fill my buffer:
width := 640
height := 360
log.Print("Size of the video: ", len(videoData))
cmd := exec.Command("ffmpeg", "-i", "pipe:0", "-vframes", "1", "-s", fmt.Sprintf("%dx%d", width, height), "-f", "singlejpeg", "-")
cmd.Stdin = bytes.NewReader(videoData)
var imageBuffer bytes.Buffer
cmd.Stdout = &imageBuffer
err := cmd.Run()
if err != nil {
	log.Panic("ERROR")
}
imageBytes := imageBuffer.Bytes()
log.Print("Size of the image: ", len(imageBytes))
But I get the following error:
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7ff05d002600] stream 0, offset 0x5ded: partial file
pipe:0: Invalid data found when processing input
Finishing stream 0:0 without any data written to it.
frame=    0 fps=0.0 q=0.0 Lsize=       0kB time=-577014:32:22.77 bitrate=  -0.0kbits/s speed=N/A
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: unknown
Output file is empty, nothing was encoded (check -ss / -t / -frames parameters if used)
I need a single frame of the video and convert it to a png. This is
how someone would do it with ffmpeg.
There is a popular Go library made for exactly what you're looking for:
https://github.com/bakape/thumbnailer
thumbnailDimensions := thumbnailer.Dims{Width: 320, Height: 130}
thumbnailOptions := thumbnailer.Options{JPEGQuality: 100, MaxSourceDims: thumbnailer.Dims{}, ThumbDims: thumbnailDimensions, AcceptedMimeTypes: nil}
sourceData, thumbnail, err := thumbnailer.ProcessBuffer(videoData, thumbnailOptions)
if err != nil {
	// handle the error
}
imageBytes := thumbnail.Image.Data
It uses ffmpeg under the hood, but handles that abstraction for you.
Please check this out. I wrote it to downsample mp3 files to a 128k bitrate, and it should work for you; change the command to suit your needs:
package main
import (
	"bytes"
	"io/ioutil"
	"log"
	"os"
	"os/exec"
)

func check(err error) {
	if err != nil {
		log.Fatalln(err)
	}
}
func main() {
	file, err := os.Open("test.mp3") // open file
	check(err)
	defer file.Close()
	buf, err := ioutil.ReadAll(file)
	check(err)
	cmd := exec.Command("ffmpeg", "-y", // Yes to all
		//"-hide_banner", "-loglevel", "panic", // Hide all logs
		"-i", "pipe:0", // take stdin as input
		"-map_metadata", "-1", // strip out all (mostly) metadata
		"-c:a", "libmp3lame", // use mp3 lame codec
		"-vsync", "2", // suppress "Frame rate very high for a muxer not efficiently supporting it"
		"-b:a", "128k", // downsample audio bitrate to 128k
		"-f", "mp3", // use mp3 muxer (IMPORTANT: output to a pipe requires manual muxer selection)
		"pipe:1", // output to stdout
	)
	// Pre-allocate 5MiB of capacity with zero length; a non-zero length
	// would prefix the output with that many zero bytes.
	resultBuffer := bytes.NewBuffer(make([]byte, 0, 5*1024*1024))
	cmd.Stderr = os.Stderr        // bind log stream to stderr
	cmd.Stdout = resultBuffer     // stdout result will be written here
	stdin, err := cmd.StdinPipe() // open stdin pipe
	check(err)
	err = cmd.Start() // start the process on another goroutine
	check(err)
	_, err = stdin.Write(buf) // pump audio data into the stdin pipe
	check(err)
	err = stdin.Close() // close stdin, or ffmpeg will wait forever
	check(err)
	err = cmd.Wait() // wait until ffmpeg finishes
	check(err)
	outputFile, err := os.Create("out.mp3") // create new file
	check(err)
	defer outputFile.Close()
	_, err = outputFile.Write(resultBuffer.Bytes()) // write result buffer to file
	check(err)
}
Reference: https://gist.github.com/aperture147/ad0f5b965912537d03b0e851bb95bd38
I'm trying to play audio in Go, asynchronously, using PortAudio. As far as I'm aware, PortAudio handles its own threading, so I don't need to use any of Go's built-in concurrency features. I'm using libsndfile to load the file (also via Go bindings). Here is my code:
type Track struct {
	stream   *portaudio.Stream
	playhead int
	buffer   []int32
}
func LoadTrackFilesize(filename string, loop bool, bytes int) *Track {
	// Load file
	var info sndfile.Info
	soundFile, err := sndfile.Open(filename, sndfile.Read, &info)
	if err != nil {
		fmt.Printf("Could not open file: %s\n", filename)
		panic(err)
	}
	buffer := make([]int32, bytes)
	numRead, err := soundFile.ReadItems(buffer)
	if err != nil {
		fmt.Printf("Error reading from file: %s\n", filename)
		panic(err)
	}
	defer soundFile.Close()
	// Create track
	track := Track{
		buffer: buffer[:numRead],
	}
	// Create stream
	stream, err := portaudio.OpenDefaultStream(
		0, 2, float64(44100), portaudio.FramesPerBufferUnspecified, track.playCallback,
	)
	if err != nil {
		fmt.Printf("Couldn't get stream for file: %s\n", filename)
	}
	track.stream = stream
	return &track
}
func (t *Track) playCallback(out []int32) {
	for i := range out {
		out[i] = t.buffer[(t.playhead+i)%len(t.buffer)]
	}
	t.playhead += len(out) % len(t.buffer)
}

func (t *Track) Play() {
	t.stream.Start()
}
Using these functions, after initialising PortAudio and all the rest, plays the audio track I supply, but only just: it's very laggy, and it slows down the rest of my application (a game loop).
However, if I change the frames-per-buffer value from FramesPerBufferUnspecified to something high, say 1024, the audio plays fine and doesn't interfere with the rest of my application.
Why is this? The PortAudio documentation suggests that the unspecified value will 'choose a value for optimum latency', but I'm definitely not seeing that.
Additionally, when playing with this very high value, I notice some tiny artefacts, little 'popping' noises, in the audio.
Is there something wrong with my callback function, or anything else, that could be causing one or both of these problems?
I'm using OS X 10.10.5, with Go 1.3.3 and libsndfile and PortAudio from Homebrew.
Thanks.
Moving the comment to an answer:
Always test with the latest version of Go.
Also, @Joel figured out that you need to use float32 instead of int32.
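A minimal sketch of that change, assuming the sndfile bindings can read into a []float32 the same way they read into []int32 (untested). Note too that the playhead update in the original callback likely meant (t.playhead + len(out)) % len(t.buffer); the sketch applies that fix as well, which may also explain the 'popping':
// Track holding float32 samples, as suggested in the answer above.
type Track struct {
	stream   *portaudio.Stream
	playhead int
	buffer   []float32
}

func (t *Track) playCallback(out []float32) {
	for i := range out {
		out[i] = t.buffer[(t.playhead+i)%len(t.buffer)]
	}
	// Advance and wrap the playhead; the original `+= len(out) % len(buffer)`
	// barely moves it when len(out) < len(buffer).
	t.playhead = (t.playhead + len(out)) % len(t.buffer)
}

// Opening the stream is unchanged: PortAudio infers the sample format
// from the callback's slice type.
// stream, err := portaudio.OpenDefaultStream(0, 2, 44100, 1024, track.playCallback)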