Following the ClickHouse documentation, I want to define an Executable User Defined Function.
How do I connect the above script to ClickHouse? In which directory should I copy the Python and XML files?
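For context, a minimal executable-UDF pair along the lines of the docs example; the function name, file names, and paths here are just placeholders:

    <!-- e.g. /etc/clickhouse-server/test_function.xml
         (any file matching the user_defined_executable_functions_config pattern) -->
    <functions>
        <function>
            <type>executable</type>
            <name>test_function_python</name>
            <return_type>String</return_type>
            <argument>
                <type>UInt64</type>
                <name>value</name>
            </argument>
            <format>TabSeparated</format>
            <command>test_function.py</command>
        </function>
    </functions>

    #!/usr/bin/python3
    # test_function.py: expected to live in the server's user_scripts_path
    # (default /var/lib/clickhouse/user_scripts/) and to be executable.
    # Reads TabSeparated rows from stdin and writes one result per row to stdout.
    import sys

    if __name__ == '__main__':
        for line in sys.stdin:
            print("Value " + line, end='')
            sys.stdout.flush()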
Create a stored procedure that will read a .csv file from an Oracle server path using a file-read operation, query the data in some table X, and write the output to a .csv file.
After reading the .csv file, I need to compare the .csv data with the table data and update a few columns in the .csv file.
Oracle works best with data in the database. UPDATE is one of the most frequently used commands.
But modifying a file that resides in some directory seems somewhat out of scope. There are other programming languages you should use, I believe. However, if a hammer is the only tool you have, every problem looks like a nail.
I can think of two options.
One is to load the file into the database. Use SQL*Loader to do that if the file resides on your PC, or - if you have access to the database server and the DBA granted you read/write privileges on a directory (an Oracle object that points to a filesystem directory) - use it as an external table. Once you load the data, modify it and export it back (i.e. create a new CSV file) using spool.
Another option is the UTL_FILE package. It also requires access to a directory on the database server. Using the A(ppend) option you can add rows to the original file, but I don't think you can edit it, so this option - in the end - finishes like the previous one: by creating a new file (but this time using UTL_FILE).
Conclusion? Don't use a database management system to modify files. Use another tool.
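If you do reach for another tool, a minimal Python sketch of the read/compare/rewrite flow might look like the following. It assumes the python-oracledb driver and made-up connection details, table, column, and file names (ID and STATUS are placeholders) - adjust to your schema:

    import csv
    import oracledb  # assumed driver; install with: pip install oracledb

    # Hypothetical connection details and table/column names.
    conn = oracledb.connect(user="scott", password="tiger", dsn="dbhost/orclpdb1")
    cur = conn.cursor()

    # Load the lookup data from table X into a dict keyed by ID.
    cur.execute("SELECT id, status FROM x_table")
    lookup = {str(row[0]): row[1] for row in cur}

    # Read the original CSV, update the STATUS column where the table disagrees,
    # and write a new CSV (the original file is left untouched).
    with open("input.csv", newline="") as src, open("output.csv", "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            table_status = lookup.get(row["ID"])
            if table_status is not None and table_status != row["STATUS"]:
                row["STATUS"] = table_status
            writer.writerow(row)

    cur.close()
    conn.close()

Note that this still produces a new file rather than editing the original in place, which matches the conclusion above.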
I'm copying data from an Oracle DB to ADLS using a copy activity of Azure Data Factory.
The result of this copy is a Parquet file that contains the same data as the table I copied, but the name of the resulting Parquet file looks like this:
data_32ecaf24-00fd-42d4-9bcb-8bb6780ae152_7742c97c-4a89-4133-93ea-af2eb7b7083f.parquet
And I need the file to be named like this:
TableName-Timestamp.parquet
How can I do that with Azure Data Factory?
Another question: is there a way to add hierarchy when this file is being written? For example, I use the same pipeline for writing several tables and I want to create a new folder for each table. I can do that if I create a new Dataset for each table to write, but I want to know if there is a way to do it automatically (using dynamic content).
Thanks in advance.
You could set a pipeline parameter to achieve it.
Here's an example where I tried copying data from an Azure SQL database to ADLS; it should also work for Oracle to ADLS.
Set a pipeline parameter: the Azure SQL/Oracle table name that needs to be copied to ADLS:
Source dataset:
Add dynamic content to set table name:
Source:
Add dynamic content: set table name with pipeline parameter:
Sink dataset:
Add dynamic content to set Parquet file name:
Sink:
Add dynamic content to set Parquet file name with pipeline parameter:
Format: TableName-Timestamp.parquet:
@concat(pipeline().parameters.tablename,'-',utcnow())
Then execute the pipeline and you will get a Parquet file named like TableName-Timestamp.parquet:
About your other question:
You could add dynamic content to set the folder name for each table; a rough sketch follows below.
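As a sketch (the parameter name foldername is just an example): give the sink dataset a folder parameter, use it in the dataset's directory via dynamic content, and fill it from the pipeline parameter in the copy activity's Sink tab:

    directory (sink dataset):        @dataset().foldername
    foldername (copy activity Sink): @pipeline().parameters.tablename

The file name can be handled the same way, as shown above.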
For example, if we copy the table "test", the result we get is:
container/test/test-2020-04-20T02:01:36.3679489Z.parquet
Hope this helps.
I want to access one or more Hive variables, set with set var=XXX, in my Hive UDF's evaluate() function/class.
As per this answer, I can pass these using ${hiveconf:var}, but can I access them without passing them as arguments to the UDF?
If the above is not possible, I am open to any other means by which I can access, from within the UDF, a specific set of properties that can be passed in externally.
I'd like to access a file that is stored in the Binary table of an MSI installer from a custom action (VBScript, immediate execution).
My understanding is that the files from the binary table are extracted to some safe location and cleaned up after the installation. So the basic question would probably be: Can I determine from a custom action the path of this safe location, so that I can access the extracted files?
I found an alternative approach here. It reads the database from inside the CA and writes a temporary file itself, but does no cleanup. I have the impression that there must be a better solution.
I am working on Oracle Data Integrator 11g
I have to create an ODI package, where I need to process an incoming file. The file name is not a constant string, as it has a timestamp entry appended to it, something like this: FILTER_DATA_011413.TXT
Due to the MMDDYY part, I can't hardcode the filename in my package. The way we're handling it right now is: a shell script lists the files in the directory and loads the filename into a table (using a control file). This table is then queried to get the filename, which is passed to the variable that stores the filename for processing.
I am looking for another way, so I can avoid having this temporary table to store the file name.
Can someone suggest an alternative?
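One direction that avoids the staging table (just a sketch): ODI ships with Jython, so a small script step in an ODI procedure - or a replacement for the existing shell script - could pick the newest FILTER_DATA_*.TXT and copy it to a constant name that the package can reference. The directory path and the fixed target name below are assumptions:

    # Sketch only: find the newest FILTER_DATA_*.TXT and copy it to a fixed name,
    # so the package can use a hardcoded filename and no staging table is needed.
    import glob, os, shutil

    src_dir = '/data/incoming'  # hypothetical inbound directory
    candidates = glob.glob(os.path.join(src_dir, 'FILTER_DATA_*.TXT'))

    # Pick the most recently modified file (written without key= on max(),
    # so it also runs on the old Jython bundled with ODI).
    newest = None
    for f in candidates:
        if newest is None or os.path.getmtime(f) > os.path.getmtime(newest):
            newest = f

    if newest is None:
        raise Exception('No FILTER_DATA_*.TXT file found in ' + src_dir)

    # Copy to a constant name that the ODI package can reference.
    shutil.copy(newest, os.path.join(src_dir, 'FILTER_DATA.TXT'))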