Librosa get_samplerate(): No such file or directory - librosa

I'm trying to stream an mp3 file from a zipfile to make a mel spectrogram. The error appears when I try to get the sample rate.
import zipfile as zf
import librosa

with zf.ZipFile('fma_small.zip') as myzip:
    with myzip.open('fma_small/155/155066.mp3') as myfile:
        stream = librosa.stream(myfile.name,
                                block_length=256,
                                frame_length=2048,
                                hop_length=2048)
        sr = librosa.get_samplerate(myfile.name)
        for y_block in stream:
            m_block = librosa.feature.melspectrogram(y=y_block, sr=sr,
                                                     n_fft=2048,
                                                     hop_length=2048,
                                                     center=False)
I tried it with the unzipped folder and it worked; however, I need to do it with the zipped dataset.
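One possible workaround, sketched here rather than taken from the thread: myfile.name is only the member's path inside the archive, not a file on disk, so librosa cannot open it. Extracting the member to a temporary file first gives librosa a real path (this assumes your librosa install can decode mp3 at all, e.g. via audioread or a recent soundfile):

import os
import tempfile
import zipfile
import librosa

with zipfile.ZipFile('fma_small.zip') as myzip:
    with myzip.open('fma_small/155/155066.mp3') as myfile:
        # Write the archive member to a real file so librosa gets an on-disk path.
        with tempfile.NamedTemporaryFile(suffix='.mp3', delete=False) as tmp:
            tmp.write(myfile.read())
            tmp_path = tmp.name

sr = librosa.get_samplerate(tmp_path)
stream = librosa.stream(tmp_path,
                        block_length=256,
                        frame_length=2048,
                        hop_length=2048)
for y_block in stream:
    m_block = librosa.feature.melspectrogram(y=y_block, sr=sr,
                                             n_fft=2048,
                                             hop_length=2048,
                                             center=False)
os.remove(tmp_path)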

Related

Twitch Stream as input for ffmpeg

My objective is to take a Twitch video stream and generate an image sequence from it without having to create an intermediary file. I found out that ffmpeg can turn a video into an image sequence. The ffmpeg website says that its input option can take network streams, although I really can't find any clear documentation for it. I've searched through Stack Overflow and I haven't found any answers either.
I've tried adding the link to the stream:
ffmpeg -i www.twitch.tv/channelName
But the program either reported the error "No such file or directory" or caused a segmentation fault when I added https to the link.
I'm also using streamlink and used that with ffmpeg in a python script to try the streaming url:
import streamlink
import subprocess
streams = streamlink.streams("http://twitch.tv/channelName")
stream = streams["worst"]
fd = stream.open()
url = fd.writer.stream.url
fd.close()
subprocess.run(['/path/to/ffmpeg', '-i', url], shell=True)
But that produces the same error as the website URL. I'm pretty new to ffmpeg and streamlink, so I'm not sure what I'm doing wrong. Is there a way for me to feed a Twitch stream to ffmpeg as input?
I've figured it out. ffmpeg won't pull the online files for you; you have to fetch them yourself. A GET request on the stream URL returns a playlist listing the addresses of the .ts segment files, and curl can download those segments to your drive. Combined with my image-sequencing goal, the process looks like this in Python:
import streamlink
import subprocess
import requests

if __name__ == "__main__":
    # Resolve the HLS playlist URL for the lowest-quality stream
    streams = streamlink.streams("http://twitch.tv/twitchplayspokemon")
    stream = streams["worst"]
    fd = stream.open()
    url = fd.writer.stream.url
    fd.close()

    # The playlist is plain text; keep only the lines that are segment URLs
    res = requests.get(url)
    tsFiles = list(filter(lambda line: line.startswith('http'), res.text.splitlines()))
    print(tsFiles)

    # Download each .ts segment and turn it into a one-frame-per-second PNG sequence
    for i, ts in enumerate(tsFiles):
        vid = 'vid{}.ts'.format(i)
        process = subprocess.run(['curl', ts, '-o', vid])
        process = subprocess.run(['ffmpeg', '-i', vid, '-vf', 'fps=1', 'out{}_%d.png'.format(i)])
It's not a perfect answer: you still have to create the intermediate video files, which I was hoping to avoid. Maybe there's a better and faster answer, but this will suffice.
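As a possible refinement, not part of the original answer: ffmpeg can read from standard input via pipe:0, so each segment could in principle be streamed straight into it without writing a vid{i}.ts file first. A minimal sketch, assuming the tsFiles list built in the script above:

import subprocess
import requests

def segment_to_frames(ts_url, index):
    # Pipe one .ts segment into ffmpeg and emit roughly one PNG per second of video.
    proc = subprocess.Popen(
        ['ffmpeg', '-i', 'pipe:0', '-vf', 'fps=1', 'out{}_%d.png'.format(index)],
        stdin=subprocess.PIPE)
    res = requests.get(ts_url, stream=True)
    for chunk in res.iter_content(chunk_size=1 << 16):
        proc.stdin.write(chunk)
    proc.stdin.close()
    proc.wait()

# e.g. segment_to_frames(tsFiles[0], 0)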

Flume to stream gz files

I have a folder that contains a lot of gzip files, and each gzip file contains an XML file. I used Flume to stream the files into HDFS. Below is my configuration file:
agent1.sources = src
agent1.channels = ch
agent1.sinks = sink
agent1.sources.src.type = spooldir
agent1.sources.src.spoolDir = /home/tester/datafiles
agent1.sources.src.channels = ch
agent1.sources.src.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
agent1.channels.ch.type = memory
agent1.channels.ch.capacity = 1000
agent1.channels.ch.transactionCapacity = 1000
agent1.sinks.sink.type = hdfs
agent1.sinks.sink.channel = ch
agent1.sinks.sink.hdfs.path = /user/tester/datafiles
agent1.sinks.sink.hdfs.fileType = CompressedStream
agent1.sinks.sink.hdfs.codeC = gzip
agent1.sinks.sink.hdfs.fileSuffix = .gz
agent1.sinks.sink.hdfs.rollInterval = 0
agent1.sinks.sink.hdfs.rollSize = 122000000
agent1.sinks.sink.hdfs.rollCount = 0
agent1.sinks.sink.hdfs.idleTimeout = 1
agent1.sinks.sink.hdfs.batchSize = 1000
After streaming the files into HDFS, I use Spark to read them with the following code:
df = sparkSession.read.format('com.databricks.spark.xml').options(rowTag='Panel', compression='gzip').load('/user/tester/datafiles')
But I am having an issue reading it. If I manually upload one gzip file into the HDFS folder and re-run the above Spark code, it reads it without any problem, so I am not sure whether the issue is due to Flume.
I tried downloading a file streamed by Flume and unzipping it; when I viewed the contents, it no longer showed the XML format, just unreadable characters. Could anyone shed some light on this? Thanks.
I think you are doing it wrong. Why?
Your source files are already gzip-compressed, and gzip is not splittable: you cannot read it partially, record by record, without decompressing it first, so the BlobDeserializer in your Flume source hands over the raw gzip bytes of each file.
The sink then writes those already-gzipped bytes back out with fileType = CompressedStream and codeC = gzip, compressing them a second time.
So what ends up in HDFS is a gzip stream wrapped inside another gzip stream, which is why a single decompression still shows unreadable characters. :)
I suggest scheduling a cron script that copies the files from local to HDFS directly; that will solve your problem.
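A minimal sketch of that cron-driven copy in Python, assuming the local directory and HDFS target from the question (hdfs dfs -put is the standard upload command; everything else here is illustrative):

import glob
import os
import subprocess

LOCAL_DIR = '/home/tester/datafiles'   # spoolDir from the question
HDFS_DIR = '/user/tester/datafiles'    # HDFS path from the question

def copy_gz_to_hdfs():
    # Copy each local .gz file to HDFS unchanged, so it stays a single plain gzip of the XML.
    for path in sorted(glob.glob(os.path.join(LOCAL_DIR, '*.gz'))):
        # -f overwrites an existing copy; drop it if re-uploads should fail instead.
        subprocess.run(['hdfs', 'dfs', '-put', '-f', path, HDFS_DIR], check=True)

if __name__ == '__main__':
    copy_gz_to_hdfs()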

rclone to Gdrive error files

Good day,
I am using rclone to upload large files to Gdrive, and I always end up with some files causing errors. For example, with 9975 files to upload to Gdrive, the output is:
Errors: 75
Checks: 0
Transferred: 9975
Does that mean that all the files have been uploaded correctly, or do I lose the 75 files that caused an error?
Thank you !
It turned out that this means 75 files did not get transferred to Gdrive and need to be uploaded again.

Flume-ng: source path and type for copying log file from local to HDFS

I am trying to copy some log files from local to HDFS using flume-ng. The source is /home/cloudera/flume/weblogs/ and the sink is hdfs://localhost:8020/flume/dump/. A cron job copies the logs from the tomcat server to /home/cloudera/flume/weblogs/, and I want the log files to be copied to HDFS as soon as they become available in /home/cloudera/flume/weblogs/, using flume-ng. Below is the conf file I created:
agent1.sources= local
agent1.channels= MemChannel
agent1.sinks=HDFS
agent1.sources.local.type = ???
agent1.sources.local.channels=MemChannel
agent1.sinks.HDFS.channel=MemChannel
agent1.sinks.HDFS.type=hdfs
agent1.sinks.HDFS.hdfs.path=hdfs://localhost:8020/flume/dump/
agent1.sinks.HDFS.hdfs.fileType=DataStream
agent1.sinks.HDFS.hdfs.writeformat=Text
agent1.sinks.HDFS.hdfs.batchSize=1000
agent1.sinks.HDFS.hdfs.rollSize=0
agent1.sinks.HDFS.hdfs.rollCount=10000
agent1.sinks.HDFS.hdfs.rollInterval=600
agent1.channels.MemChannel.type=memory
agent1.channels.MemChannel.capacity=10000
agent1.channels.MemChannel.transactionCapacity=100
I am not able to understand:
1) What should the value of agent1.sources.local.type = ??? be?
2) Where do I specify the source path /home/cloudera/flume/weblogs/ in the above conf file?
3) Is there anything I am missing in the above conf file?
Please let me know.
You can use either:
an Exec Source with a command (e.g. cat or tail on GNU/Linux) run over your files,
or a Spooling Directory Source to read all the files placed in a directory (see the sketch below).
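For the spooling-directory option, a sketch of the two lines in question, using the standard Flume 1.x property names and the path from the question (the rest of the posted configuration stays as-is):

agent1.sources.local.type = spooldir
agent1.sources.local.spoolDir = /home/cloudera/flume/weblogs/

The spooling directory source expects each file to be complete and unchanging once it appears in the directory, which fits a cron job that drops finished log files there.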

SNAP: Simulation and Neuroscience Application Platform

Is there any documentation or help manual on how to use SNAP (Simulation and Neuroscience Application Platform)?
I wanted to run the Motor Imagery sample scenario with a .avi file for the stimulus instead of the image. How can that be done?
The following error is obtained when using the AlphaCalibration scenario, which includes code to play an avi file. Any help appreciated.
:movies:ffmpeg(warning): parser not found for codec indeo4, packets or times may be invalid.
:movies:ffmpeg(warning): max_analyze_duration 5000000 reached at 5000000
:movies(error): Could not open /e/BCI_Feb2014/SNAP-master/src/studies/SampleStudy/bird.avi
:audio(error): Cannot open file: /e/BCI_Feb2014/SNAP-master/src/studies/SampleStudy/bird.avi
:audio(error): Could not open audio /e/BCI_Feb2014/SNAP-master/src/studies/SampleStudy/bird.avi
:movies:ffmpeg(warning): parser not found for codec indeo4, packets or times may be invalid.
:movies:ffmpeg(warning): max_analyze_duration 5000000 reached at 5000000
:movies(error): Could not open /e/BCI_Feb2014/SNAP-master/src/studies/SampleStudy/bird.avi
:gobj(error): Texture "/e/BCI_Feb2014/SNAP-master/src/studies/SampleStudy/bird.avi" exists but cannot be read.
Exception during run():
Could not load texture: bird.avi
Traceback (most recent call last):
  File "E:\BCI_Feb2014\SNAP-master\src\framework\latentmodule.py", line 458, in _run_wrap
    self.run()
  File "modules\BCI\AlphaCalibration.py", line 30, in run
    m = self.movie(self.moviefile, block=False, scale=[0.7,0.4],aspect=1.125,contentoffset=[0,0],volume=0.3,timeoffset=self.begintime+t*self.awake_duration,looping=True)
  File "E:\BCI_Feb2014\SNAP-master\src\framework\basicstimuli.py", line 348, in movie
    tex = self._engine.base.loader.loadTexture(filename)
  File "E:\BCI_Feb2014\Panda3D-1.8.0\direct\showbase\Loader.py", line 554, in loadTexture
    raise IOError, message
IOError: Could not load texture: bird.avi
The video files are not included in the github repository due to the file size. However, your use case should work if you use any other video file that you put on the search path (e.g., next to the other media files; see also Panda3D's notion of content search paths).
There is a SNAP release including media files at: ftp://sccn.ucsd.edu/pub/software/LSE-SDK/ which might include the files you're looking for.
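For reference, one way to add a media directory to Panda3D's content search path from Python (a sketch; the directory path is just an example, and SNAP may already configure its own search paths):

from panda3d.core import Filename, getModelPath

# Example directory; Panda3D's loader searches the model path for textures and movies.
media_dir = Filename.fromOsSpecific(r'E:\BCI_Feb2014\media')
getModelPath().appendDirectory(media_dir)

After this, loader.loadTexture('bird.avi') can resolve the file by name alone, without a full path.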
