Databricks (SQL Warehouse): how to read data from Postgres (JDBC)

I am trying to connect to a Postgres database from a Databricks SQL Warehouse, but the problem I am facing is that it tells me the data source is not supported.

There is no source type postgresql in Databricks. The only supported types are TEXT, AVRO, CSV, JSON, JDBC, PARQUET, ORC, DELTA, and LIBSVM (https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-table-using.html).
Your case is typical external data sourcing from a remote server or data location. For Postgres, you can use the JDBC type; the implementation is covered in the relevant documentation here: https://docs.databricks.com/external-data/jdbc.html#language-sql
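For example, a minimal sketch following that documentation (the host, database, table, and credentials below are placeholders you would substitute with your own):

    DROP TABLE IF EXISTS postgres_table;
    CREATE TABLE postgres_table
    USING JDBC
    OPTIONS (
      url 'jdbc:postgresql://<host>:5432/<database>',
      dbtable '<schema>.<table>',
      user '<username>',
      password '<password>'
    );

    -- The table can then be queried like any other:
    SELECT * FROM postgres_table LIMIT 10;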

Related

I'm currently trying to migrate a large table from Cassandra to Oracle SQL and can't find many solutions

I've been researching and looking for ideas, but the only thing close to a solution I've found is someone using PySpark to move an Oracle table into HDFS and then from HDFS into Cassandra. I was hoping there was another, clearer solution for this data migration.
The title suggests Cassandra > Oracle, while the message text says Oracle > HDFS > Cassandra (i.e. the opposite direction). What exactly are you trying to do?
Supposing it is the title that is correct: if there's no tool which would do the migration for you, then from my (developer's) point of view, creating a database link in my Oracle schema which points to Cassandra might be a good option. Then I'd just write some SQL code to migrate the data I need. Here's how: Access Cassandra Data as a Remote Oracle Database.
In short (see the sketch after this list):
connect to Cassandra as an ODBC data source
set connection properties for compatibility with Oracle
configure the ODBC gateway, Oracle Net and Oracle database
write queries
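Once the gateway is configured, the migration itself is plain SQL over the database link. A minimal sketch; the link name, gateway TNS entry, credentials, and table/column names below are all hypothetical:

    -- Create a database link pointing at the ODBC gateway entry for Cassandra
    CREATE DATABASE LINK cassandra_link
      CONNECT TO "odbc_user" IDENTIFIED BY "odbc_password"
      USING 'cassandra_gw';

    -- Copy the data with ordinary SQL
    INSERT INTO my_oracle_table (id, col1, col2)
      SELECT id, col1, col2
      FROM my_cassandra_table@cassandra_link;

    COMMIT;

For a large table, you would likely run the INSERT in batches keyed on the primary key rather than in one statement.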

Issue in copying HEX data from ORACLE db (source) to AZURE SQL (sink) through ADF

Scenario:
I am trying to copy data from a source Oracle database to a sink Azure SQL database using ADF.
I have created an Oracle 11gR2 database on my local system (Windows 10) and installed the self-hosted integration runtime. On adding a dataset in ADF, I can preview tables from my local Oracle database.
The target is Azure SQL and the copy activity is a like-for-like copy, so I have created the table in Azure SQL keeping all column attributes the same, except for one RAW column in the source.
Problem:
In the source table there is a column of type RAW(2000), and it contains zlib-compressed data in hex format.
For this, as per the mapping spec detailed in https://learn.microsoft.com/en-us/sql/relational-databases/replication/non-sql/data-type-mapping-for-oracle-publishers?view=sql-server-ver15,
I have changed the type of the same field in Azure SQL to varbinary(2000) (I also tried binary(2000)).
Source column data in Oracle is as below:
COMPRESS_DATA
--------------------------------------------------------------------------------
076CE1315D719C6A86A13B8E863F4ACA982C3D72CA234B8F2F67C7996896AD39866639FDD699A3B1F8A3A272FB6BA3DAC8C08E3B19BBC3BEAD431BCB050665F5F2946BA3CFB58BBC42431C98FD2B2ABB7DE2DAE84F344EBB6F52EC1FBD677A682BB46EFB54F3A2DBEDFAD0FA6A4AFCD556581F5FEB1D68DA64E4E084F5CF18CCD2C49BDDE31D7DF80E460E3D9C080B9CF2EE6839A6B6F90EECBB6CF24004ACFDB92BC52FB6ECB1DEEA5F5096FDF2628E9F68EBB361BF4F3BE2A38A39CAE5194FA9E7100BF51CBAD30677B8DE2CEFE255216779975602DC7BD1661BF99FAAD6175EAC45CB625A7B5A3C51DCFD1375C94C9B6D5A97AF9F15BD583B574A5F2F8BC1FD0ADA91EC917E9C765E252B7AB92BEC5D1A657984D364453F51475D4331681DBD12F779947F613ACED82E3B2788F2C9CE1D99DD209EC876CEBCF537DC5A85EB84B7ADB75A3DE0E60376D754D9BF0ABD35041E32318E97986ABEEADA54A15AD7B62E51E2288920518AC37C1F27417FBE3F960873B713E6CB037E12B347B4C980EE0ACE454B7381AD6967E7678E4BA7AA364D5F726AA21AE2EAB635657E038992A52A9F4AF92E2504BBB5C9EC5449454EF002D972B3BEB29ACD12878FED78E55601594CBA93FAA6406FF51C4AA4BD7DAFBF408414F5386ECA36BD6AF7BC4E813577DD9A6814D6527A6E2895FB0DC1D8D3658A5BE21E76D3A11536CEB17BAD0FB3261FDE4326B5BC5FC67BB585B2EECF78A4B9069EB8B6AD1BC6E7BCA6FB338E4FF69CE3ABD1E43FBDA1636A7DA3D11A9A3F00F9EB9131ECC78C5A5C66E5D5650FA66DE0AAA34DB80DF3CBC7AE1FF891B7FA94865FC368F9354E90EEA3704E5604AAC681D290448E14121C607DBF4DBD02F7DBD6D49D5C4E29A173897386F1474812C208A6B073FC097F747C8868488ED00E4139C179BC1802ED9D38BF5C463EE49DD14ECF7140132B11938088D2233EC3F6764E7AB9D9924923EFF20C677C3423B08608BBC1F24F1238B5CF19ADD35164496384162328BCC8EB1C27AAA2DA1119257F0C1A52C6772AE013F86062BA72AE6CBAAA33378C3BB240B27980EFC82EEDC9AA382B34A980D6AE183B1F8A5CC6C63C761D64369EA7E70D0F0C0A9647FD601E46B10AE7BB8323155F9DCA050D13FE5598648AAA827834833199DC92E1573EBE55D58124AA8C2251D42A699DD48F0EABB4A2B2A83D1C8C8A0E122105155DDD49B9281FAA4BA1D5005A7132A4ADD420B564496519B27EA46942FF1D7B488D120EE525D8213921FB7F5EB4800F3F969516834643A1592EC320A74767E42C24FC14974C9C6CA78743F686641DD1229D0946DEFC9BF775172D6597321B6B459E5015EF5D8071A0534B1F5DE37AD0B2AC99EC906F76E1E0DE61643A667849A2C6B157CCCCE0E167D91803D9A2207B872B4BA72B9129BB056ECB2B19F161D4F492F9DD9159105AB1796A6144B6DBA638DC91C96ABFFED0D6EBE5D720EEC99EF6DCC1F45500100E03C0335C358B57826AE45294170E8A6048720A06474E9933A6439D1F8648941525512E1E9243C6CCEA366753FD027364C1F70C732CA9F9198E74ABA775750B0FF57871AEF29924940FEFFDE0168C15FC24170F0E0A9B630E6955E2D7F6833B2FFA169B8E209EF12A1B5F859FE186D9FE4FC21B61ADD11EEC7488AA4E5216C545D5D2C2B38C600DAF472EABD9D79C5494AF3D688D7A886FAB579A3A313BECE82267127ADE1F8B9CC0206662B8654D94F02B92DFC9BB275349E23DDD167553C4E93869E381192BEE197AAB3D5C476CF06AA64FFFCD362823FA98F4CFB03FA3AB6BE649ED8FD6EB5BB53FFACD01FF36FB4B128C38397E75C323AF7218B9ED3DF8C9258AA2500E6D5369385E12CB929DC824C87657B13F2BC49BE566E5D7764CE59E887C81CC5273D2A4847A36E2DC99D13EED88A32DF92D8E381EE8D1D114491FD2E88E21AFED1874ED135647EDEA9511589BD090381E42ACF684AC71C0375DB2D9A70A0D4E0272D0206E2833AAD2501411F1BB6FAC983DAB221EDB279F88C9C217AE4289B967A92F2AFA6A5E5B9AC119DAD544D334647A1B3F5A7A2BA48F4DCD9920DB31724EC4B6462729A7E4647AC537734E7AF17B6AF032C090E7DF42FEA38F87AED3EBBA48C152C293CCA0A164B4E8FA752A712998030801AEA669A20A7B7C36EBFB969156B3235EE41BAC00A132744FF65802B2A16F212BE11ED23E469E9866C99709CC1EFFB14CB110B914A918E48ADF96F374451C5A7CDD970855C8B4BBD2F2FE8A34C3AA70D60270EFF2A2461F55C9DEF7D3F66F9681BF4055A56ECF4C788F2C201400874BEEB249B356BD4AF6828E448649FE052C00A0715E3539F1BF2FFDBF7079C75364629CC5DD7AD19ABCF3882FBDF3882C9B44F761B83A59C1A87AC6B067AC8A59A1C182AD300D89E7596833D10D8E5341B5024AB6098F2298278E0C2F10C75148257930AFD2D24086CD6C66DBB941D9F7FE7C79F3902EC29F91379565AF9049D64950DDBBA3E
ADF1195DBFD53A7A7BA01548F4C75B050B8640A1946AA43A6CD768BC8807F5C6C577E762A2096E6ED035219841601516840E43A402EC7F22407A4B4154C06D06B81118F1A8EC12CDBE09486A83658861504351E35A44AA81A8A3AE48AA386A7470D5707C5C0350B50EC6A6EFA0129A60B17DE0A77C8CEC7DFA4CC03E60000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Now, when I create the copy step in ADF and choose "Import schema" in the Mapping tab, it shows the source COMPRESS_DATA field type as byte[].
Naturally, when I run the pipeline, the above column value arrives in Azure SQL in a different format. Data as shown in the ADF data preview and in Azure SQL (post copy):
B2x4nO1dW2/buLYG91sL9D/o5QApkHhE3RUcHGxZF9uJJTuS7MR5GWhspTHGkTOy05n8+7MoybrQkuOkzm6SXTVtY5FcXOvjx8VFSqR9rcPChf7FsazKqjyLMCfKqKN7HvJMx/hdc/Vub9xzOr933MFo+LvX7Q1t0/ERXC78dVD7vEM+YLHFom/T1eokiKe38+/z6Bv18V8+VIahsmvjqqvq3oU5MvsaFB2Q4mgQz8KY6QbRbEHKckgPomm4WIQz5o9HRn9YrZd3YYx4knEeBfEj48dBtLpfxmsmKYvABuWElU84nsG4hZUWL7Z4QVAlEU1XMcsKVA6OheSWjDGP+f1zYGR7/YPUhSbE7vZy+ScYzOjL6GYe3wXr+TJC2fV0LcXVw0hMGw4jfZL+kwBr3gXzRUlWUgizKIFvtpwyvej7cj4loCOi0w7FUdLiJJPJC6U2HHlGST7DAhfY9CdhF6gmw7+mj2zNHEH5y3D+7XYNoiYpKUCb4aVKkaKAxg6i4Ft4F0brXYjsB9jTOUAblUXmPwWzClZmWdzwMVwdM/rtfL1kPCSrWER2EMarP5n+PApJ/uUcobHT1RxgC06t5Fika7pbtdL3r53yHcSWDHce7v4AYiMkQr/hsKTyB7Gw2ngJGzDybuf3BOJUV/AEnDC06RbBMsp7YnptijHDIF4/MtpsFs4OomR60XR80pxMfYx4naPUZ5PuwYw0n9E1r8vcBivmjzCMmIAozcAnbbFYTpMeyAz+jsK4dQhLckAly3CqGnHvCdAz8dx5LqDg1NZxMF0/BIvchR8W1L7QH1a14t8TqLg9Gj4XVOId4Lf242GR1AecS/d3QUDDOB8b4CKfQkYPFtOHRdpRjGAdMvMV83A/g9/eBqpWX3FpVBs0n94G0TdA8yZe3n35/L/tRRD9+X9fPn/5/ET+9fLL50I9UsB4COEus/Hd0zgsipH0w/qS8aSnm8RCK28suG0spw+E5+UoogiTZllqwalEye1GU1uYa6kSK0lSg2o1ORoarSZnfaN5ikiZBDHKbcjcLMEr/00gXT9tyemXz/4QUGCi5WnjoPkS83LkDfbqbAt5Hpl/PWQuBgKFMAOjuEnCghVo+z0sdeY3AbvTk892w/6EFYT061vwAKvMyZ4StueFThmB/c1wJ8cMBEAsy4w8pt/2fmXYO8NB2dsz5O4WewWYa8XflmXmFjdWRR+7W35/I6Ttnem1vmKaq11oHYcL4hkITVPX4CxbTO4bDoyvozrnVcW4Hfj2K4q+HZ/gt8fnu31Co/713iDJfsqsw9X6mIF6BRYfM/7yESZR34I7MpfC7HGWDC2TzBzzDnDgJrpW+0M+MW5yaW66AIQHy9nDdJ1B54bTcA5kZ+LlAxn0l7NwuwaYF2cTXMO0tFHf3ydLY9NsZ61vG7c/otRnkZtrecrAZPsgmuZ4TbrXFmtW40JORcVcaRNj3wcQMMXhKoy/p5HPTTBf1DEa6hIUiVWa4p6aHLtgq+asR83stztVI9hGhQnj6D92sJ7eEuKnLnEZQZwRB7OQ6QcRAd3xTZcxR+5gaGoOcwS//QZ/v3757IFwCCRPmYHXHzCe6Y7BcTFHPcuExHG4WoWL38bLx+AbZNGgvKsxR/hy/PU3Fqz68rk0I4U+FS3XoC8MGDFQ88gyra+twyiXJv5maZBP83xGM2XmSFLYbR1N29YYWzNd75w5kiWJKMqqXz47y/LsOfHAN8uHaHYgBS2W31Il08IdePrAspgjlfWINiJ+Qpv0z9Efi6zP62lIb34Hb/X1EGzNu06H1YaXVNehXU32CQbX2fxm/ka6izMxKMXTAWC4IF1mecMkDvI+jb/TKVEygzpNsX0q63oJGfXlfRjdQktGW0WMcAHeN37cR3xd3kT+OFiE0XQe5CUyoIt8s0JoU4ZE0kE5YVxr3rvkhGXoHs2JrH8yNgw9VVA/Mb8xBNjaHIDqJ30COfR91Nkf27bB6tfvEdt2W6YUz/rbizlrnOvDrm8zRS87Zqx4DnO3KIw+wdDxCUaZbMT59IkMNp+gkNl2x1DIgTn2LeOFAeOH8d08ChZMOw7vwphM/aJP9tj7BK6/PBYQEawKIkxP67sgQlt8C6fzGNQ6Yex+LgdqHnwCR005b1JcxEnxsd7zdcYO4vl6fhcwm14cBZ8+HZQrysWZS3Plx5e7fgp7rrhzl2bPXstd1QWs56x4CaUVL9M3YMxekeYi8yx/fpc55vsgXj/E4df6EP4H2k4zTK3fIQaPrP1mscUsBaKAbPFoq/Ve9kCwvvXqcjasexk+ZczWXPZvUB76H5l+Jy2XT2R3zGNfZEyO8Kgz8s8phKXskWxm8ZsAzxWvKD1hNhTehDE4jZDM9E8ZSRp4eZYvnyEcmd/Pk/WWykL7MaN3e/6Acc2J6R0Xi+zHTPIA9ZTRPRf0+HfquFr6wD4w5B47HnqJng6Xk7p+Sdf2+kzdc2PGgDjoqBNGYRwsvmaT9EE8/5a48Pt4OSURdA3tBY4oparNam/laGy5rZz1LTcYi5S5ha2FAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
Expected outcome:
The content should be copied to Azure SQL as-is from the source (hex).
Please help/advise on how this can be achieved; we are planning to move 10 TB of Oracle data into Azure SQL, and this is the base issue blocking it.
Please try creating the table with the nvarchar(max) data type in the Azure SQL database.
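For reference, a minimal sketch of the suggested target DDL; the table name and ID column are hypothetical, only the COMPRESS_DATA column comes from the question:

    -- Azure SQL target table; the RAW(2000) hex content lands as text
    CREATE TABLE dbo.COMPRESS_TARGET (
      ID INT,
      COMPRESS_DATA NVARCHAR(MAX)
    );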
I'm glad to hear that "Changing it to NVARCHAR(MAX) while creating the Azure SQL table solved the problem".
It's my pleasure to help you!

How to connect to an Oracle database from Snowflake?

I have to pull some data from Oracle and update that data in Snowflake; the size of the data is about 5 GB.
Is there any procedure to connect to an Oracle database from Snowflake? Or
do I need to connect them using a programming language such as Python?
You'll need to unload the data from Oracle and load into Snowflake, as there are no "direct connect" options I've ever heard about.
I'd unload with SQL*Plus, spooling to flat files (note that SQL*Loader only loads data into Oracle; it cannot unload), push the files to AWS S3 (or your cloud vendor's storage), and issue Snowflake COPY INTO <table> commands. It should be fairly straightforward.
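A sketch of the Snowflake side, assuming the unloaded files are CSVs already sitting in an S3 bucket; the stage, bucket, credentials, and table names are made up:

    CREATE OR REPLACE STAGE oracle_extract
      URL = 's3://my-bucket/oracle-extract/'
      CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>');

    COPY INTO my_schema.my_table
      FROM @oracle_extract
      FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1);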
There is no equivalent to Oracle database links in Snowflake. You would need an external process to move the data from Oracle to S3. Then you can configure a Snowpipe task to load from S3 into Snowflake. See Loading Continuously Using Snowpipe for more information.
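A minimal Snowpipe definition might look something like this (the pipe and table names are assumptions, and the stage is the same kind of S3 stage as in the previous answer):

    CREATE PIPE oracle_pipe AUTO_INGEST = TRUE AS
      COPY INTO my_schema.my_table
      FROM @oracle_extract
      FILE_FORMAT = (TYPE = CSV);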
I would suggest using Python to extract and load the data from Oracle to Snowflake. Since your Oracle table is updated daily, write a Python program that generates a MERGE statement dynamically to load your incremental data from Oracle into Snowflake.
Snowflake supports JavaScript-based stored procedures, so you can use a stored procedure to generate the MERGE statement dynamically by passing the table name as a parameter, and call it via Python.
The initial load from Oracle to Snowflake may take time, as you have 5 GB of data in your source system.
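For illustration, here is the kind of MERGE such a procedure would generate, assuming the daily extract is loaded into a staging table first (all table and column names are hypothetical):

    MERGE INTO target_tbl t
    USING staging_tbl s
      ON t.id = s.id
    WHEN MATCHED THEN
      UPDATE SET t.col1 = s.col1, t.updated_at = s.updated_at
    WHEN NOT MATCHED THEN
      INSERT (id, col1, updated_at)
      VALUES (s.id, s.col1, s.updated_at);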

Transfer data from an Oracle view to a Greenplum DB table

I have an Oracle view containing a very large amount of data, and I want to migrate this data into a table in a Greenplum database. Is there any way I can write a query in PostgreSQL to fetch that Oracle view's data?
If that is not possible with a query in PostgreSQL, kindly suggest a way to access the Oracle view from a Linux server, so that I can create a data file from the view on my Linux server and load that file via gpfdist into a Greenplum table.
NOTE: the Oracle view belongs to a third party; I only have access to view the data (I have all the connection info), and I can access the view via SQL Developer.
NOTE: exporting the data from SQL Developer to my local machine is not feasible here, as the data is very large.
Thanks,
Sunny
The last time I used Greenplum (3 years ago) I don't think there were any untrusted languages like plperlu, so fetching directly from Oracle from within Greenplum might not be possible. If the data has a primary key, are you able to fetch in batches, compress it, then ship it to Greenplum?
Do you have a Greenplum support contract? If so, you could also try them if you haven't already: https://sso.emc.com/sso/login.htm
I recall that gpfdist can be configured to fetch from remote servers with a bit of fiddling, so if you are able to copy the Oracle data out to disk, you could fetch it using gpfdist without any intermediary steps. For example:
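Once the Oracle view has been dumped to delimited files, the Greenplum side is just an external table. A sketch; the host, port, columns, delimiter, and file pattern below are all assumptions:

    CREATE EXTERNAL TABLE ext_oracle_view (
      id   integer,
      name text
    )
    LOCATION ('gpfdist://etl-host:8081/oracle_view_*.txt')
    FORMAT 'TEXT' (DELIMITER '|');

    INSERT INTO target_table SELECT * FROM ext_oracle_view;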

What is Oracle SQL Loader?

What is Oracle SQL Loader and what is it used for?
SQL*Loader is a utility provided by Oracle that enables us to load data from flat files into database tables. It is well covered in the documentation (check the Utilities Guide). The key thing is that SQL*Loader is an external OS program.
External tables were introduced in Oracle 9i, allowing us to define tables whose data is supplied from flat files. These provide most of the functionality of SQL*Loader with a lot more convenience. For instance, we can manipulate and re-format the data using SQL functions, which is simpler than using SQL*Loader's syntax. It also means that we can pull the data from inside the database rather than pushing it from the OS.
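For instance, a minimal external-table sketch; the directory path, file name, and columns are assumptions:

    CREATE DIRECTORY data_dir AS '/data/loads';

    CREATE TABLE emp_ext (
      empno NUMBER,
      ename VARCHAR2(30)
    )
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_LOADER
      DEFAULT DIRECTORY data_dir
      ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        FIELDS TERMINATED BY ','
      )
      LOCATION ('emp.csv')
    );

    -- The data can then be re-formatted with plain SQL while loading:
    INSERT INTO emp SELECT empno, UPPER(ename) FROM emp_ext;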
However, for loading huge volumes of data in ultra-quick time, a well-tuned SQL*Loader control file will beat external tables for performance. Also, if there is a complicated OS process associated with the data files - e.g. ftp, gunzip, pre-processing with sed or awk - it can be more convenient to call SQL*Loader from inside the shell script rather than attempting to hook it up with a database job. So SQL*Loader is still useful in certain scenarios, but it is not necessarily the automatic first choice.
It is one of Oracle's bulk data-loading tools.
You use it to load data from flat files (such as CSV) into the database.
For details, please check the documentation (or this FAQ).
To transfer data from one Oracle database to another, we use Oracle Data Pump (in Oracle versions prior to 10g, Oracle export/import). But if you want to transfer data from a non-Oracle database to an Oracle database, you create a flat file of the data in the non-Oracle database and load it into the Oracle database using SQL*Loader.
The following is the procedure to load data from a third-party database into Oracle using SQL*Loader (a sample control file follows the list):
1. Convert the data into a flat file using a third-party database command.
2. Create the table structure in the Oracle database using appropriate datatypes.
3. Write a control file describing how to interpret the flat file and the options to load the data.
4. Execute the SQL*Loader utility, specifying the control file as a command-line argument.
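A minimal control file for step 3 might look like this (the file, table, and column names are hypothetical), invoked as in step 4 with something like sqlldr user/password control=emp.ctl:

    LOAD DATA
    INFILE 'emp.csv'
    INTO TABLE emp
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    (empno, ename)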
