Calling the ffmpeg API from Oracle

I have installed ffmpeg and ffmpeg-devel packages on Linux.
Oracle 11g is installed and running.
The database stores media files, and for better streaming we need to convert them to AVI format.
For ease of integration, we would like to do this conversion in the database.
Now, the simplest option is to write a wrapper around the ffmpeg command-line utility and let a PL/SQL procedure call it.
However, this would require the following steps:
1. Read the video BLOB.
2. Write it to an OS file.
3. Call the ffmpeg wrapper, passing the file name from (2) and an output file name.
4. Load the output file from (3) into a BLOB in PL/SQL.
I would like, if possible, to write a C routine (using the Oracle external library feature) that accepts the input as a BLOB (an OCILobLocator), calls the appropriate libavformat functions against that LOB, and writes the result to another LOB (again an OCILobLocator), which the PL/SQL layer then uses as the AVI file.
The other advantage of this is that it avoids the undesirable impact of issuing an OS command from within Oracle.
The problem I have is that the examples given for ffmpeg show the processing of data from files, whereas I need the libraries to process the LOBs.
The alternative is to see if the OrdVideo data type in Oracle does this kind of conversion by using setformat and process.

Interesting challenge. So it sounds like you would prefer not to call the 'ffmpeg' command-line utility, but rather use the libavformat and libavcodec libraries through native calls within the database. Do I have that right?
I trust that these LOBs/OCILobLocator handles expose a C API for reading and writing? If that's the case, then perhaps you can create a new URLProtocol based on that API. URLProtocols are how FFmpeg deals with I/O. Run 'ffmpeg -protocols' to see all the ones implemented. Examine the source of libavformat/file.c for a simple example of what a URLProtocol entails -- open, read, write, seek, close, and a few other functions.
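To make that concrete, here is a rough sketch of what the LOB-backed input side could look like. One caveat: current FFmpeg releases keep URLProtocol private to the library, so out-of-tree code normally supplies custom I/O through an AVIOContext created with avio_alloc_context() instead; the idea (read/seek callbacks over your own byte source) is the same. The lob_src struct and the lob_read_at() helper below are hypothetical stand-ins for the real OCILobRead2()/OCILobGetLength2() plumbing.

/* Rough sketch only: custom FFmpeg I/O over an OCI LOB.
 * lob_src and lob_read_at() are hypothetical stand-ins for the
 * OCILobRead2()/OCILobGetLength2() calls you would actually make. */
#include <libavformat/avformat.h>
#include <libavutil/mem.h>

struct lob_src {
    void   *svchp, *errhp, *lob;   /* OCI service/error handles + OCILobLocator */
    int64_t pos, len;              /* current offset, total LOB length */
};

/* hypothetical helper: read up to `size` bytes starting at `pos` via OCILobRead2() */
extern int lob_read_at(struct lob_src *s, int64_t pos, uint8_t *buf, int size);

static int lob_read_packet(void *opaque, uint8_t *buf, int buf_size)
{
    struct lob_src *s = opaque;
    int n = lob_read_at(s, s->pos, buf, buf_size);
    if (n <= 0)
        return AVERROR_EOF;
    s->pos += n;
    return n;
}

static int64_t lob_seek(void *opaque, int64_t offset, int whence)
{
    struct lob_src *s = opaque;
    if (whence == AVSEEK_SIZE)          /* FFmpeg asks for the total size */
        return s->len;
    if (whence == SEEK_SET)      s->pos = offset;
    else if (whence == SEEK_CUR) s->pos += offset;
    else if (whence == SEEK_END) s->pos = s->len + offset;
    return s->pos;
}

int open_lob_as_input(struct lob_src *s, AVFormatContext **fmt)
{
    const int bufsz = 64 * 1024;
    uint8_t *buf = av_malloc(bufsz);
    AVIOContext *pb = avio_alloc_context(buf, bufsz, 0 /* read-only */,
                                         s, lob_read_packet, NULL, lob_seek);
    *fmt = avformat_alloc_context();
    (*fmt)->pb = pb;                    /* demux straight from the LOB */
    return avformat_open_input(fmt, NULL, NULL, NULL);
}

The output side would be the mirror image: a write callback that appends to the destination LOB (e.g. via OCILobWrite2()), attached in the same way to the format context of whichever AVI muxer you open.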

Related

Can I use some native Parquet file tools in an Azure Logic App?

Hi there, I need to convert a Parquet file to CSV using only native Logic App tools. Is that even possible?
I researched similar issues and found how to use Azure Functions to do the conversion, but that's not a native Logic App tool.
There's a custom connector that will transform Parquet to JSON for you.
It will also allow you to perform filtering and sorting operations on the data before it is returned.
Documentation can be found here: https://www.statesolutions.com.au/parquet-to-json/

How can I use named pipes to stream a GCP Cloud Storage object to an executable that wants input files?

I have a third-party executable that takes a directory path as an argument and in turn looks there for a collection of .db files. I have said collection of files stored in a Google Cloud Storage bucket and would like to stream the content of those files into some local named pipes that can be used as input to the executable.
I'm writing an application to perform the above in Go and am using the "cloud.google.com/go/storage" package to work with cloud storage objects.
As a note, I need all pipes/files to be available for reading at the time I run the executable.
What is the best way to go about this? I'm essentially looking to use the named pipes as a proxy of sorts to make remote files look local to this executable. Is that possible?
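One way to get the shape of this, sketched in C to show the bare OS mechanics (the same calls are available in Go as syscall.Mkfifo and os.OpenFile): create every FIFO up front so the directory looks fully populated when the executable starts, then let one writer per pipe stream the object bytes in. fetch_object_to_fd() is a hypothetical stand-in for whatever actually reads the bucket object (the Go storage client's NewReader, gsutil cat, and so on), and the tool name and paths are placeholders.

/* Sketch only (POSIX).  fetch_object_to_fd() is a hypothetical helper that
 * copies a Cloud Storage object's bytes to the given file descriptor. */
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern void fetch_object_to_fd(const char *object, int fd);  /* hypothetical */

int main(void)
{
    const char *names[] = { "one.db", "two.db" };
    char path[256];

    mkdir("/tmp/dbdir", 0700);

    /* 1. Create every pipe first, so all the "files" exist before the tool runs. */
    for (int i = 0; i < 2; i++) {
        snprintf(path, sizeof path, "/tmp/dbdir/%s", names[i]);
        mkfifo(path, 0600);
    }

    /* 2. One writer per pipe; open(O_WRONLY) blocks until the tool opens the
     *    read end, then the object bytes flow straight through. */
    for (int i = 0; i < 2; i++) {
        if (fork() == 0) {
            snprintf(path, sizeof path, "/tmp/dbdir/%s", names[i]);
            int fd = open(path, O_WRONLY);
            fetch_object_to_fd(names[i], fd);
            close(fd);
            _exit(0);
        }
    }

    /* 3. Point the third-party executable (placeholder name) at the directory of pipes. */
    system("third_party_tool /tmp/dbdir");
    while (wait(NULL) > 0)
        ;                              /* reap the writer processes */
    return 0;
}

The big caveat: a named pipe is read-once and not seekable, so this only helps if the executable reads each .db file sequentially from start to finish; if it seeks around within the files, you would need real temporary files (or something like gcsfuse) instead.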

How to write a stream to Google Cloud Storage?

I want to write a file to GCS from a stream object, but I've only found the "create_file" function, which creates a new file object given a path to a local file to upload and the path to store it under in the bucket.
Is there any function to create a file in GCS from a stream?
FUSE over GCS
You could try gcsfuse, which layers a user-space file system over a bucket, but it is only beta software at present. There's a nice section on limitations which you should read first.
I use fog to access GCS, but that is a thin layer which doesn't try to impose any additional semantics onto the bucket/object model.
Warning: if your problem really requires a standard file system underneath any possible solution, then GCS is not a good fit.
The ability to provide an IO object instead of a File object has only recently become possible. It was added in PR 1335 and will be included in the next release.
Until then, the quickest way is to write the stream to a tempfile and upload that. For more, see Issue 305.

SAX parsing a large file from S3

I have a very large XML file on S3 (50 GB). I would like to stream this file to a SAX XML parser for further processing using Ruby. How would I do that in an environment where I cannot download the whole file locally, but can only stream it over TCP from S3?
I'm thinking about using https://github.com/ohler55/ox for the parsing itself, and https://github.com/aws/aws-sdk-ruby for accessing the file on S3. I'm just unsure how to connect the pieces using a streaming approach.
The easiest way is to use mc. mc implements a cat command which can be used for exactly this.
For example, as shown below: mc cat streams your object, and its output is piped to your XML parser, which reads from standard input.
$ mc cat s3.amazonaws.com/<yourbucket>/<yourobject> | <your_xml_parser>
This way you can avoid downloading the file locally.
Additionally, mc provides more tools to work with Amazon S3-compatible cloud storage and filesystems. It has features like resumable uploads, a progress bar, and parallel copy. mc is written in Go and released under the Apache License v2. mc is supported on OS X, Linux, and Windows.
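For the parser side of that pipeline, anything that consumes standard input in a streaming, SAX-style fashion will do (in Ruby, Ox's sax_parse can be pointed at an IO). Purely for illustration, here is the same shape in C using expat, reading the piped stream in fixed-size chunks so the 50 GB document never has to exist on disk or fully in memory; the element counting is just a placeholder for real handling.

/* Illustration only: a SAX-style consumer reading XML from stdin in chunks. */
#include <expat.h>
#include <stdio.h>

static unsigned long element_count;

static void XMLCALL on_start(void *userdata, const XML_Char *name, const XML_Char **attrs)
{
    (void)userdata; (void)name; (void)attrs;
    element_count++;                  /* real per-element processing goes here */
}

static void XMLCALL on_end(void *userdata, const XML_Char *name)
{
    (void)userdata; (void)name;
}

int main(void)
{
    char buf[64 * 1024];
    size_t n;
    XML_Parser p = XML_ParserCreate(NULL);
    XML_SetElementHandler(p, on_start, on_end);

    /* feed the stream chunk by chunk as it arrives on stdin */
    while ((n = fread(buf, 1, sizeof buf, stdin)) > 0) {
        if (XML_Parse(p, buf, (int)n, 0) == XML_STATUS_ERROR) {
            fprintf(stderr, "parse error at line %lu\n",
                    (unsigned long)XML_GetCurrentLineNumber(p));
            return 1;
        }
    }
    XML_Parse(p, buf, 0, 1);          /* signal end of document */
    printf("%lu elements seen\n", element_count);
    XML_ParserFree(p);
    return 0;
}

Compile with 'cc -o count_elements count_elements.c -lexpat' and run it as: mc cat s3.amazonaws.com/<yourbucket>/<yourobject> | ./count_elements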

DBMS_Scheduler get/put file alternative

I have a side project I'm working on currently that requires me to copy a .csv file over from a remote FTP server and save it locally. I figured I would use DBMS_SCHEDULER.GET_FILE, but I do not have permission. When I asked my manager, he said that I won't be able to get privileges to do this and should look up other ways.
After researching for a couple of days I keep coming back to DBMS_SCHEDULER; am I out of luck, or are my searching skills terrible?
Thanks
I'm not certain you'd want to use DBMS_SCHEDULER for this; from what I understand from the documentation (I've never used this myself), the FTP site would have to be completely open to all. There is a parameter destination_permissions, but it's only "Reserved for future use", i.e. there's no way of specifying any permissions at the moment.
If I'm right about this then I agree with your manager, though not necessarily for the same reasons (it seems like you'll never get permission to use DBMS_SCHEDULER, which I hope is incorrect).
There are other methods of doing this:
UTL_TCP; this is simply a method of interacting over TCP/IP. Oracle Base has an article which includes an FTP package based on UTL_TCP and instructions on how to use it. This also requires the use of the UTL_FILE package, which can write OS files.
UTL_HTTP; I'm 99% certain it's possible to connect to an FTP server using this; it's certainly possible to connect to an SFTP/any server. It'll require a little more work, but it would be worth it in the longer run. It would also require the use of UTL_FILE.
A Java stored procedure to FTP directly; this is probably the best approach; create one using one of the many Java FTP libraries.
A Java stored procedure to call OS commands; this is the easiest method but the least extensible. Oracle released a white paper on calling OS commands from within PL/SQL back in 2008, but there's plenty of other stuff out there (including Oracle Base again).
Lastly, you could question whether this is actually what you want to do...
What scheduler do you use? Does it have event-driven scheduling? If so, there's no need to FTP from within Oracle; use UTL_FILE to write a file to the OS and then run OS commands from there.
Was the other file originally in a database? If that's the case, you don't need to extract it. You could use DBMS_FILE_TRANSFER to collect it straight from the database, or even create a JDBC connection or (more simply) a database link to SELECT the data directly.
