How to transcribe audio without using external APIs?

I would prefer not to use Amazon, Google, etc., so how can I use my own computer (macOS) to get a time-stamped transcription of MP3s and videos? Preferably on the command line, so that I could run something like this:
transcribe -o oliver_twist.srt oliver_twist.mp3
to create an SRT subtitle file from an MP3.

For Linux there's a package called voice2json: http://voice2json.org/commands.html#transcribe-wav
If you have an audio file named sample.wav, you simply run
voice2json transcribe-wav < sample.wav
and you get the output
{"text": "sample voice recording", "transcribe_seconds": 0.123, "wav_seconds": 1.23}
I believe you can install this Linux package on macOS. To do that, see: https://apple.stackexchange.com/questions/53096/is-it-possible-to-install-linux-packages-on-os-x
EDIT:
To save the text to a file, you need a package called jq, which you can install the same way. Say the output of the previous command is saved as output.json. Then run:
jq -r .text output.json > subtitles.srt
and the transcript text will be saved as subtitles.srt. Note that this writes only the recognized text; it does not add the cue numbers and timestamps that a full SRT file contains.
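To show what an SRT cue needs beyond the bare text, here is a minimal sketch that wraps the whole transcript in a single cue spanning the file. It assumes the "text" and "wav_seconds" fields shown in the sample output; the function names srt_timestamp and single_cue_srt are hypothetical helpers, not part of voice2json or jq. It does not give real per-phrase timing.

```shell
# Convert a duration in seconds (may be fractional) to an SRT timestamp HH:MM:SS,mmm.
srt_timestamp() {
    awk -v s="$1" 'BEGIN {
        ms = int(s * 1000 + 0.5)   # round to whole milliseconds
        printf "%02d:%02d:%02d,%03d\n", ms / 3600000, (ms % 3600000) / 60000, (ms % 60000) / 1000, ms % 1000
    }'
}

# Print one SRT cue that starts at zero and ends at the given duration.
# $1 = transcript text, $2 = audio length in seconds
single_cue_srt() {
    printf '1\n00:00:00,000 --> %s\n%s\n' "$(srt_timestamp "$2")" "$1"
}
```

With jq installed, this could be used as: single_cue_srt "$(jq -r .text output.json)" "$(jq -r .wav_seconds output.json)" > subtitles.srt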

Kdenlive is able to generate SRT files from an audio file: see Kdenlive. It is also available for macOS.
Once Kdenlive is installed, you can install the Kdenlive command line tools to operate Kdenlive from the command line: see Kdenlive command line.

Related

Download video+audio in a single file from YouTube

I tried to download a video with audio from YouTube by using youtube-dl:
youtube-dl https://www.youtube.com/watch?v=7wfUUZvybPY
I got a video file (.webm) without audio. I'm looking for a way to download video+audio in a single file by using the command line (cmd) in Windows 10. Do you have any suggestion?

youtube-dl.exe --format mp4 https://www.youtube.com/watch?v=7wfUUZvybPY
However, youtube-dl relies on ffmpeg for many format conversions. Install ffmpeg for Windows and look at youtube-dl docs here

I found a solution; the following command merges the audio and video into a single .mkv file (the --write-sub and --sub-lang en options additionally download English subtitles):
youtube-dl.exe --sub-lang en --write-sub https://www.youtube.com/watch?v=7wfUUZvybPY

Unable to add silence to end of mp3 using cat

I have a mp3 file with silence, s.mp3, which I'm trying to add to the end of an mp3 using:
cat file1.mp3 s.mp3 > file1.mp3
This worked fine for some mp3 files I downloaded from the net but not the files I created myself using lame.
I'm on Mac OS X.
Since I'm making the MP3 files myself with lame, maybe I can do something that will allow them to work with cat.
How might I diagnose the problem?

Most file formats are too complicated for plain concatenation to work. If you know how to program, look at a class called AVAsset in the Mac OS X SDK documentation. If you do not want to program, you can probably find an app that concatenates audio files.
(Although, to my surprise I do see a "MacHint" that claims that you can cat mp3s together).
Also, see https://superuser.com/questions/78912/free-mp3-merge-for-mac-os-x
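Separately from the MP3 format itself, the command in the question reads from and redirects into the same file, and the shell truncates the output file before cat starts reading it. A minimal sketch of a safer append, assuming a hypothetical helper named append_file, writes to a temporary file first:

```shell
# Hypothetical helper: append the bytes of file $2 to file $1 without
# reading and writing the same file in a single redirection.
append_file() {
    tmp=$(mktemp) || return 1
    # Build the combined file elsewhere, then replace the original.
    cat "$1" "$2" > "$tmp" && mv "$tmp" "$1"
}
```

For example: append_file file1.mp3 s.mp3. The shorter form cat s.mp3 >> file1.mp3 has the same effect. This only addresses the shell-level problem; whether a player accepts the concatenated bytes still depends on the headers and tags in the files.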

Using avconv without an output file specified

I'm using avconv in the following way in order to grab ID3 data from audio files on remote servers:
avconv -i http://myserver.com/my_music.mp3
This command will output all the info I need, which I then parse.
The problem is, it always exits with a non-zero exit status, due to the fact that no output file is specified (since I don't want to actually download the full audio file and convert it in any way).
Is there any way I can run avconv so that it
outputs the audio metadata of the remote file
doesn't download the remote file in full
returns an exit status of 0 if it was able to get this far

How about downloading the file as a temporary copy to work on, and then deleting it automatically once the work is done?
avconv -i http://myserver.com/my_music.mp3 -y /temp/temp.mp3 -f ffmetadata meta.ini
# delete temp file after it's been worked on
wait
echo "Done."
rm /temp/temp.mp3
Keep in mind that I wrote all of the above off the top of my head, so it may contain some errors.
To extract the metadata of the audio file, you could also use a Python script:
>>> from pydub.utils import mediainfo
>>> mediainfo("/temp/temp.mp3")
and wrap it in some bash snippets.
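If the non-zero exit status is the only obstacle, another option is to ignore it and decide success from the captured output instead. The sketch below is an assumption-laden illustration: probe_metadata is a hypothetical helper, and it assumes that the tool prints a "Duration:" line once it has read the file's headers. It does not change how much of the remote file is downloaded.

```shell
# Hypothetical helper: run the given command, print everything it wrote,
# and succeed only if the output contains a "Duration:" line.
probe_metadata() (
    # The stream info is expected on stderr, so merge it into stdout.
    # The command's own exit status is deliberately ignored.
    output=$("$@" 2>&1) || true
    printf '%s\n' "$output"
    # Assumption: a "Duration:" line means the metadata was read.
    printf '%s\n' "$output" | grep -q 'Duration:'
)
```

It could be called as: probe_metadata avconv -i http://myserver.com/my_music.mp3 > info.txt && echo "metadata read"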

How to decode HEVC files to YUV?

I would like to decode HEVC encoded files to YUV files.
Is there any simple way to do this yet? An executable would be nice but I would make do with source code that is easily compilable.

It's as simple as this (the guide assumes Linux; tweak it to your needs).
Clone the official reference codec. The authoritative source is an SVN repo at https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/trunk/, but the BBC provides a read-only Git repo that is kept in sync with the SVN and is, IMHO, much easier to work with:
git clone git://hevc.kw.bbc.co.uk/git/jctvc-hm.git
To create the executables:
cd jctvc-hm/build/linux && make -f makefile
Binaries are now placed in
jctvc-hm/bin
Now, to decode an HEVC-encoded binary file into YCbCr, run
./TAppDecoderStatic -b encoded_file.bin -o reconstructed.yuv
If you are not on a Linux system, just go to the build folder, where you will hopefully find something you can use for your system:
$ cd jctvc-hm/build && ls
HM_vc10.sln HM_vc8.sln HM_vc9.sln linux/ vc10/ vc8/ vc9/

Follow the instructions in https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-9.2-dev/doc/software-manual.pdf. The source code can be downloaded from https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/trunk/ using any Subversion client.
You can build it on both Windows and Linux. After you have built the software, you can run the executables as described in the software manual.

Alternatively, you can use libde265 as a much faster decoder.
Get the latest version from its GitHub release page.
Configure with ./configure --disable-sherlock265
Compile: make
Generate the YUV file with
./dec265/dec265 hevc-file.bin -o output.yuv -t4
The option -t4 is for multi-threaded decoding. You can also do more things like input NAL-unit streams, dump the headers, directly display the video, or check the SEI hashes.

You can download the ffmpeg Windows build and use it to decode an HEVC bitstream:
ffmpeg.exe -i xxx.bin out.yuv
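A raw YUV file has no header, so a quick sanity check on any of the decoders above is to compare the output size with the expected frame size. The sketch below assumes 8-bit 4:2:0 output with even width and height; the function names and the 1920x1080 figures are hypothetical examples, not values taken from the answers.

```shell
# Bytes per frame for 8-bit 4:2:0: one full-size luma plane plus two
# chroma planes at a quarter of the luma size each (w*h*3/2 in total).
# $1 = width, $2 = height
yuv420_frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

# Number of whole frames in a raw file.
# $1 = file size in bytes, $2 = width, $3 = height
yuv_frame_count() {
    echo $(( $1 / $(yuv420_frame_bytes "$2" "$3") ))
}
```

For instance, yuv_frame_count "$(wc -c < out.yuv)" 1920 1080 gives the frame count for a 1080p file under these assumptions; a size that is not a multiple of the frame size suggests a different bit depth or chroma format.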

How to stream all videos in a folder?

Hi, I want to stream videos over the web using ffserver. I got this link as a reference.
What I can't figure out is how to pass a folder (which contains all the videos I want to stream) as input so that all of its videos are streamed. I also want to add more videos to this folder dynamically from time to time, and streaming should keep working (like how it works in Darwin). I can't use Darwin because it doesn't support iOS.
Please give me a suggestion.
Is there any other open source tool with which I can do this?

I wrote a bash script for this; it works on Ubuntu 16.
Hopefully someone else can write it up in a less terrible language.
Here's the script:
#!/bin/bash
# Global ffserver settings plus a status page served at /stat.html
echo -e "HTTPPort 8090\nHTTPBindAddress 0.0.0.0\nMaxHTTPConnections 2000\nMaxClients 1000\nMaxBandwidth 1000\nCustomLog -\n<Stream stat.html>\nFormat status\n</Stream>"
# One <Stream> entry per mp3 file in the current directory
for i in *.mp3; do
echo -e "<Stream \"$(urlencode "$i")\">\nFile \"$(pwd)/$i\"\nFormat mp2\nAudioCodec libmp3lame\nAudioBitRate 64\nAudioChannels 1\nAudioSampleRate 44100\nNoVideo\n</Stream>"
done
Save this as a bash script in the folder you want to serve; I'll refer to it as:
./gen_ffserver_conf.sh
It's hard-coded for MP3; you'd have to work through my echo statements to get it to handle another format.
Run the server with:
ffserver -f <(bash -e ./gen_ffserver_conf.sh)
I had to install a package for the URL encoding:
sudo apt install gridsite-clients
(and of course you need ffserver as well, which is in the ffmpeg package)
I stream the files by going to:
http://<ip address of streaming server>:8090/stat.html
and clicking on the URL-encoded values (using Chromium). This will open the stream and start playing.
Explanation:
ffserver doesn't like wildcards, or at least I never figured that out, so I'm just creating an entry for each file in the server. The urlencoding is annoying but necessary for the stat page links to work properly.
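The script depends on the urlencode command from gridsite-clients. Where that package is not available, a small shell function could stand in for it. The sketch below is a hypothetical replacement named urlencode_sketch; it assumes ASCII file names only and may not match the gridsite tool's output character for character.

```shell
# Hypothetical stand-in for urlencode: percent-encode every character
# except ASCII letters, digits, and . ~ _ -
# The body runs in a subshell so its variables do not leak to the caller.
urlencode_sketch() (
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}       # everything after the first character
        c=${s%"$rest"}    # the first character itself
        s=$rest
        case $c in
            [a-zA-Z0-9.~_-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;   # "'c" yields the character code
        esac
    done
    printf '%s\n' "$out"
)
```

In the generator script, $(urlencode "$i") would then become $(urlencode_sketch "$i").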
