I'm trying to collect log files from the log servers and push them into Oracle. Is there a way I could implement this using Flume (without an HDFS setup) to push the log files into Oracle? Any help would be greatly appreciated.
You will need to create a custom sink in order to persist data into Oracle. This is relatively easy: you only have to extend the AbstractSink class and implement the process() method (basically, take an event from the channel and use the Oracle API to persist the data).
I have to pull some data from Oracle and update the data in Snowflake. The size of the data is about 5 GB.
Is there any procedure to connect to an Oracle database from Snowflake? Or
do I need to connect them using a programming language such as Python?
You'll need to unload the data from Oracle and load into Snowflake, as there are no "direct connect" options I've ever heard about.
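I'd unload the tables to flat files (for example with SQL*Plus spool; note that SQL*Loader only loads data into Oracle, it does not unload it), push the files to AWS S3 (or your cloud vendor's storage), and issue Snowflake COPY INTO commands for each table. It should be fairly straightforward.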
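The unload-and-copy flow above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the table name `customers`, the external stage `oracle_stage` (assumed to point at the S3 bucket) and the sample rows are all hypothetical, and the rows would in practice come from an Oracle cursor.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize rows to CSV text with a header line, quoting where needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def copy_into_sql(table, stage_path):
    """Build a Snowflake COPY INTO statement for a CSV file with one header row."""
    return (
        f"COPY INTO {table} FROM @{stage_path} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 "
        "FIELD_OPTIONALLY_ENCLOSED_BY = '\"')"
    )


# Hypothetical rows, standing in for what an Oracle cursor would return.
csv_text = rows_to_csv(["ID", "NAME"], [(1, "Ada"), (2, "O'Neil, Pat")])

# Hypothetical table and stage names; the stage is assumed to map to S3.
statement = copy_into_sql("customers", "oracle_stage/customers.csv")
```

The CSV text would be written to a file and uploaded to the bucket, and the generated statement would then be run from any Snowflake client.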
There is no equivalent to Oracle database links in Snowflake. You would need an external process to move the data from Oracle to S3. Then you can configure a Snowpipe task to load from S3 into Snowflake. See Loading Continuously Using Snowpipe for more information.
I would suggest using Python to extract and load data from Oracle to Snowflake. Since your Oracle table is updated daily, write a Python program that generates a MERGE statement dynamically to load your incremental data from Oracle into Snowflake.
Snowflake supports JavaScript-based stored procedures, so you can also use a stored procedure to generate the MERGE statement dynamically, passing the table name as a parameter, and call it from Python.
The initial load from Oracle to Snowflake may take some time, as you have 5 GB of data in your source system.
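As a rough sketch of the dynamically generated MERGE mentioned above, assuming the incremental rows have already been loaded into a staging table: the table names `customers` and `customers_stg` and the column names are hypothetical, and identifiers are interpolated as-is, so they should come only from trusted metadata.

```python
def build_merge(target, source, key_cols, data_cols):
    """Return a MERGE statement that upserts rows from source into target."""
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in data_cols)
    all_cols = list(key_cols) + list(data_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} t USING {source} s ON {on_clause} "
        f"WHEN MATCHED THEN UPDATE SET {set_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


# Hypothetical target table, staging table, and columns.
sql = build_merge("customers", "customers_stg", ["id"], ["name", "email"])
```

The same function can be reused for every table by passing that table's key and data columns.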
I’m trying to learn about streaming services and am reading the Kafka docs:
https://kafka.apache.org/quickstart
https://kafka.apache.org/24/documentation/streams/quickstart
To take a simple example I’m attempting to refactor a Spring web services GET request which accepts an ID parameter and returns a list of attributes associated with that ID. The DB backend is Oracle.
What is the approach for loading a single Oracle DB table so that it can be served by Kafka? The above docs don't contain information on this. Do I need to replicate the Oracle DB to a NoSQL DB such as MongoDB? (Why we require Apache Kafka with NoSQL databases?)
Kafka is an event streaming platform. It is not a database. Instead of thinking about "loading a single Oracle DB table which can be served by Kafka", you need to think in terms of which events you are looking for that will trigger processing.
Change Data Capture (CDC) products like Oracle GoldenGate (there are other products too) will detect changes to rows and send messages into Kafka each time a row changes.
Alternatively, you could configure a Kafka Connect JDBC source connector to execute a query and pull data into Kafka.
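A JDBC source connector is normally registered by POSTing a JSON document to the Kafka Connect REST API (the /connectors endpoint). The sketch below only builds that document. It assumes the Confluent JDBC connector is installed on the Connect workers; the connector name, JDBC URL, credentials, table, and column names are hypothetical.

```python
import json


def jdbc_source_config(name, jdbc_url, user, password, table, id_column):
    """Build the JSON body for registering a JDBC source connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": jdbc_url,
            "connection.user": user,
            "connection.password": password,
            "table.whitelist": table,
            # "incrementing" only picks up new rows via a strictly increasing column.
            "mode": "incrementing",
            "incrementing.column.name": id_column,
            # Records are written to the topic <topic.prefix><table name>.
            "topic.prefix": "oracle-",
            "poll.interval.ms": "5000",
        },
    }


# All values below are hypothetical placeholders.
payload = json.dumps(
    jdbc_source_config(
        name="oracle-attributes-source",
        jdbc_url="jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",
        user="kafka_reader",
        password="change-me",
        table="ATTRIBUTES",
        id_column="ID",
    ),
    indent=2,
)
```

Incrementing mode does not detect updates to existing rows; the connector's timestamp-based modes or a CDC product are the usual choices when updates matter.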
I have a parallel job that writes to an Oracle table. I want to manually write warnings to DataStage's log if certain events occur. For example, if a certain value is inserted into a certain column, I want to track this information in the log. Can this be achieved somehow?
To write custom messages into the log for a particular job's data stream, you can use a combination of a Copy stage, a Transformer, and a Peek stage. The Peek stage is the one that writes to the log. I like to set the Peek stage to run in sequential mode, so that your messages are kept together in single entries in the log instead of being spread across nodes.
Also, you can peek the rejects of the Oracle stage, and maybe combine this with the above option (using a Funnel stage and a standard column schema).
Lastly, if you'd like to query the logs themselves and write them out somewhere else or use them in a job (amongst all the other data kept about jobs in the repository), you can directly query the DSODB schema in the XMETA database, i.e. the DataStage repository (DB2 by default).
You would need to have the DataStage Operations Console up and running for that (I'm not sure which version of DataStage you're running). If DataStage is running on a single tier and using the default DB2 database, you can simply catalog the DSODB database so that it's available as a connection in the DB2 connector. Otherwise, you'd need to install a DB2 client on the DataStage engine tier and catalog the database there.
All the best!
Twitter: #InforgeAcademy
DataStage tips and Tricks: https://www.inforgeacademy.com/blog/
Is there any utility to sync data between Oracle and Neo4j databases? I want to use Neo4j in read-only mode, and all writes will happen in the Oracle DB.
I think this depends on how often you want to have the data synced. Are you looking for a periodic sync/ETL process (say hourly or daily), or are you looking for live updates into Neo4j?
I'm not aware of tools designed for this, but it's not terribly difficult to script yourself.
A periodic sync is obviously easiest. You can do that directly using the Java API and connecting via JDBC to Oracle. You could also just dump the data from Oracle as a CSV and import it into Neo4j. This would be done similarly to how data is imported from PostgreSQL in this article: http://neo4j.com/developer/guide-importing-data-and-etl/
There is a SO response for exporting data from Oracle using sqlplus/spool:
How do I spool to a CSV formatted file using SQLPLUS?
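For the CSV route, a minimal sketch of turning a spooled file into parameters for a batched Cypher upsert might look like the following. The `Customer` label, the property names, and the `ID`/`NAME` columns are hypothetical; note that every value read from the CSV arrives as a string.

```python
import csv
import io

# Hypothetical label and property names; adjust to your own graph model.
CYPHER = (
    "UNWIND $rows AS row "
    "MERGE (c:Customer {id: row.ID}) "
    "SET c.name = row.NAME"
)


def csv_to_rows(csv_text):
    """Parse CSV text with a header line into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Sample CSV text, standing in for the file spooled from Oracle.
rows = csv_to_rows("ID,NAME\n1,Ada\n2,Grace\n")
```

With the Neo4j Python driver, the query would then be run along the lines of `session.run(CYPHER, rows=rows)`; using MERGE rather than CREATE keeps a repeated periodic sync from duplicating nodes.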
If you're looking for live syncing, you'd probably do this either through monitoring the transaction log or by adding triggers onto your tables, depending on the complexity of your data.
Instead of placing triggers on tables everywhere in an Oracle database, is there a Java API that I can use to read transactions off the Oracle transaction log?
My purpose is to be able to detect transactions going into a proprietary(vendor) database and react accordingly. We can't modify the database so that we do not void our maintenance contract.
Please help!
There is LogMiner which is SQL based (and so you could access through JDBC).
http://download.oracle.com/docs/cd/B19306_01/server.102/b14215/logminer.htm#sthref1875
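Because LogMiner is driven entirely through SQL and PL/SQL, the same statements work from JDBC or any other client. The sketch below uses a generic Python DB-API connection (for example one from the python-oracledb driver, which is an assumption here) purely to show the sequence of calls. The redo log path and schema owner are hypothetical, and the session needs the privileges required to run DBMS_LOGMNR and read V$LOGMNR_CONTENTS.

```python
def mine_redo(conn, logfile, owner):
    """Run a LogMiner session over one redo log file and return change rows."""
    cur = conn.cursor()
    # Register the redo log file to analyze, starting a new list of files.
    cur.execute(
        "BEGIN DBMS_LOGMNR.ADD_LOGFILE("
        "LOGFILENAME => :f, OPTIONS => DBMS_LOGMNR.NEW); END;",
        {"f": logfile},
    )
    # Start mining, resolving object names through the online catalog.
    cur.execute(
        "BEGIN DBMS_LOGMNR.START_LOGMNR("
        "OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG); END;"
    )
    try:
        # Each row describes one change, including the SQL that would redo it.
        cur.execute(
            "SELECT scn, operation, table_name, sql_redo "
            "FROM v$logmnr_contents "
            "WHERE seg_owner = :o "
            "AND operation IN ('INSERT', 'UPDATE', 'DELETE')",
            {"o": owner},
        )
        return cur.fetchall()
    finally:
        # Always close the LogMiner session, even if the query fails.
        cur.execute("BEGIN DBMS_LOGMNR.END_LOGMNR; END;")
```

Since this only reads the redo logs, no triggers or schema changes are added to the vendor's tables.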
Or you can look at Oracle Streams which reads the logs and generates 'logical change messages' into a queue from the log contents.
http://download.oracle.com/docs/cd/B19306_01/server.102/b14229/strms_over.htm#i1006309
If you are running on *nix, there is a Perl module that you could use to tail the file and then break down the lines yourself.