I'm trying to export an R data frame to an Oracle database.
I read this post: how to export data frame (R) into Oracle table. In short,
dbWriteTable(jdbcConnection, "TABLE_NAME", data.frame.name, rownames=FALSE, overwrite = TRUE, append = FALSE)
However, I do not know what 'jdbcConnection' is or how to declare it.
By the way, I'm able to connect to Oracle from RStudio using the RODBC package.
The accepted answer in that link cites the RJDBC package, which connects to the SQL database using the Java JDBC driver, in this case, the driver for Oracle. If you poke around the documentation, you will find some boilerplate code for how to do this:
drv <- JDBC("oracle.jdbc.driver.OracleDriver", "/path/to/ojdbc6.jar", " ")
conn <- dbConnect(drv, "jdbc:oracle:thin:@localhost:1521:orclt")
dbWriteTable(conn, "TABLE_NAME", data.frame.name, rownames=FALSE, overwrite = TRUE, append = FALSE)
Note that to make the above work, you will need the ojdbc6.jar JAR file for the Oracle JDBC driver locally. You can download it directly from the Oracle site if you don't already have it. The second parameter in the call to dbConnect above is the JDBC URL for your Oracle instance. Refer to any number of posts on Stack Overflow to learn how to form the appropriate URL for your Oracle instance.
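For reference, the two common thin-driver URL shapes look like this (hostname, port, SID, and service name are placeholders to replace with your own values):

```
jdbc:oracle:thin:@hostname:1521:SID
jdbc:oracle:thin:@//hostname:1521/service_name
```

The first form connects by SID, the second by service name; which one you need depends on how your Oracle instance is configured.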
Here's another example based on this doc:
# Load RJDBC library
library(RJDBC)
# Create connection driver and open connection
jdbcDriver <- JDBC(driverClass="oracle.jdbc.OracleDriver", classPath="lib/ojdbc6.jar")
jdbcConnection <- dbConnect(jdbcDriver, "jdbc:oracle:thin:@//database.hostname.com:port/service_name_or_sid", "username", "password")
# Write to table
dbWriteTable(jdbcConnection,"TABLE_NAME",data.frame.name, rownames=FALSE, overwrite = TRUE, append = FALSE)
I've written some Java code that uses JDBC to copy the contents of a table from one DB to another (it requires that the table exist in both DBs; it does not check the target table for the existence of any of the data it's copying over).
It uses PreparedStatements and copies in blocks of 10,000 rows. I would like to add the ability to disable all indexes / foreign key constraints, and then re-enable them when the table has been completely copied over.
Is there a way to do this using pure JDBC, i.e. not just firing over some vendor specific code?
There is nothing in JDBC itself that provides such functionality. You will need to use database-specific functionality to do this.
Short answer: no, there isn't.
Longer answer:
Given a java.sql.Connection theConnection:
DatabaseMetaData metaData = theConnection.getMetaData();
String dbType = metaData.getDatabaseProductName();
String command;
For "MySQL":
command = "ALTER TABLE <tableName> [ENABLE | DISABLE] KEYS";
For "Microsoft SQL Server":
command = "ALTER INDEX ALL ON <tableName> [REBUILD | DISABLE]";
(Feel encouraged to add commands for other databases.) Once you have command, execute it:
Statement stmt = theConnection.createStatement();
stmt.execute(command);
I am completely new to Python and pandas. I want to load some tables and SQL query results from Oracle and Teradata into pandas DataFrames and analyse them.
I know we have to create connection strings for Oracle and Teradata in pandas. Can you please suggest them, and also add sample code to read both a table and a SQL query?
Thanks in advance.
I don't have an Oracle server, so I'll take Teradata as an example.
This is not the only way to do that, just one approach:
Make sure you have installed the Teradata ODBC driver; refer to the Teradata official website for the steps. I assume you use Windows (since it is easy to use SQL Assistant to run queries against Teradata, and that is Windows-only). You can check the driver in the ODBC Data Source Administrator.
Install pyodbc with the command pip install pyodbc. Here is the official website.
The connection string is db_conn_str = "DRIVER=Teradata;DBCNAME={url};UID={username};PWD={pwd}"
Get a connection object conn = pyodbc.connect(db_conn_str)
Read data from a SQL query to a DataFrame df = pd.read_sql(sql="select * from tb", con=conn)
The approach is similar for Oracle: you need the driver and the corresponding ODBC connection string format. I know there is a Python module from Teradata that supports the connection too, but I just prefer ODBC as it is more general-purpose.
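Since pd.read_sql only needs a DB-API connection object, the same pattern can be tried against any driver. Here is a self-contained sketch using the stdlib sqlite3 module as a stand-in for the pyodbc connection, so it runs without a Teradata server (the table and data are made up):

```python
import sqlite3

import pandas as pd

# Stand-in for conn = pyodbc.connect(db_conn_str); any DB-API connection works the same way
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO tb VALUES (?, ?)", [(1, "a"), (2, "b")])

# Same read_sql call as with the Teradata connection above
df = pd.read_sql(sql="select * from tb", con=conn)
print(df.shape)  # (2, 2)
```

Swapping the sqlite3 connection for the pyodbc one should leave the read_sql call unchanged.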
Here is an Oracle example:
import pandas as pd
import cx_Oracle  # pip install cx_Oracle
from sqlalchemy import create_engine

engine = create_engine('oracle+cx_oracle://scott:tiger@host:1521/?service_name=hr')
df = pd.read_sql('select * from table_name', engine)
One way to query an Oracle DB is with a function like this one:
import pandas as pd
import cx_Oracle

def query(sql: str) -> pd.DataFrame:
    try:
        with cx_Oracle.connect(username, password, database, encoding='UTF-8') as connection:
            dataframe = pd.read_sql(sql, con=connection)
            return dataframe
    except cx_Oracle.Error as error:
        print(error)
    finally:
        print("Fetch end")
Here, sql corresponds to the query you want to run. Since it's a string, it also supports line breaks, in case you are reading the query from a .sql file,
e.g.:
"SELECT * FROM TABLE\nWHERE <condition>\nGROUP BY <COL_NAME>"
or anything you need. It could also be an f-string in case you are using variables.
This function returns a pandas DataFrame with the results of the sql string.
It also keeps the column names in the DataFrame.
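For instance, an f-string version of the sql argument might look like this (the table and column names are hypothetical):

```python
# Hypothetical identifiers interpolated into the query string
table_name = "EMPLOYEES"
col_name = "DEPT"

sql = f"SELECT {col_name}, COUNT(*) FROM {table_name}\nGROUP BY {col_name}"
print(sql)
# SELECT DEPT, COUNT(*) FROM EMPLOYEES
# GROUP BY DEPT
```

The resulting string can then be passed straight to the query function above.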
I just started to use Spark SQL to load data from an H2 database. Here is what I did, following the Spark SQL documentation:
>>> sqlContext = SQLContext(sc)
>>> df = sqlContext.load(source="jdbc",driver="org.h2.Driver", url="jdbc:h2:~/test", dbtable="RAWVECTOR")
But it didn't work and gave errors. I think the problem is that the username and password are not specified in the call.
These are the parameters from the Spark SQL 1.3.1 documentation:
url
The JDBC URL to connect to.
dbtable
The JDBC table that should be read. Note that anything that
is valid in a FROM clause of a SQL query can be used. For example,
instead of a full table you could also use a subquery in
parentheses.
driver
The class name of the JDBC driver needed to connect to this
URL. This class will be loaded on the master and workers before
running any JDBC commands, to allow the driver to register itself with
the JDBC subsystem.
partitionColumn, lowerBound, upperBound, numPartitions
These options must all be specified if any of them is specified. They describe how to partition the table when reading in parallel from multiple workers. partitionColumn must be a numeric column from the table in question.
But I didn't find any clue about how to pass the database username and password to the sqlContext.load function.
Any one has similar case or clues?
Thanks.
I figured it out. Just do
df = sqlContext.load(
source="jdbc",driver="org.h2.Driver",
url="jdbc:h2:tcp://localhost/~/test?user=sa&password=1234",
dbtable="RAWVECTOR"
)
And when you create the database, use the same pattern:
conn = DriverManager.getConnection(
"jdbc:h2:tcp://localhost/~/"+dbName+"?user=sa&password=1234", null, null
);
And, here is a blog about how to use the API.
According to the CF9 cfquery documentation, I should be able to return the Oracle ROWID in the cfquery result.
I've failed on all counts; it simply does not return any identity or generated keys.
I am using the JDBC Oracle thin client. Can anyone point me in the right direction here?
If you were using one of the Oracle drivers that ships with ColdFusion, you should be able to access GENERATEDKEY from the RESULT struct within the ColdFusion query object. Since you are using the JDBC Oracle thin client driver, where you set up a data source using "Add a new data source > Other" and then enter the JDBC configuration, you don't have access to the RESULT struct described in the documentation.
I ran into the same issue when we used the MS JDBC driver with CF8. After converting to CF9 with the built-in SQL Driver, we were able to update our code to correctly reference the RESULT struct.
You will have to write your INSERT statements to also SELECT the value of ROWID, which you should be able to retrieve from the final query object.
I have a problem when I try to insert some data into an Informix TEXT column
via JDBC. With ODBC I can simply run SQL like this:
INSERT INTO test_table (text_column) VALUES ('insert')
but this does not work via JDBC, and I get the error:
617: A blob data type must be supplied within this context.
I searched for this problem and found messages from 2003:
http://groups.google.com/group/comp.databases.informix/browse_thread/thread/4dab38472e521269?ie=UTF-8&oe=utf-8&q=Informix+jdbc+%22A+blob+data+type+must+be+supplied+within+this%22
I changed my code to use a PreparedStatement. Now it works with JDBC,
but with ODBC, when I try using a PreparedStatement, I get the error:
Error: [Informix][Informix ODBC Driver][Informix]
Illegal attempt to convert Text/Byte blob type.
[SQLCode: -608], [SQLState: S1000]
Test table was created with:
CREATE TABLE _text_test (id serial PRIMARY KEY, txt TEXT)
Jython code to test both drivers:
# for Jython 2.5 invoke with --verify
# because of bug: http://bugs.jython.org/issue1127
import traceback
import sys
from com.ziclix.python.sql import zxJDBC
def test_text(driver, db_url, usr, passwd):
    arr = db_url.split(':', 2)
    dbname = arr[1]
    if dbname == 'odbc':
        dbname = db_url
    print "\n\n%s\n--------------" % (dbname)
    try:
        connection = zxJDBC.connect(db_url, usr, passwd, driver)
    except:
        ex = sys.exc_info()
        s = 'Exception: %s: %s\n%s' % (ex[0], ex[1], db_url)
        print s
        return
    Errors = []
    try:
        cursor = connection.cursor()
        cursor.execute("DELETE FROM _text_test")
        try:
            cursor.execute("INSERT INTO _text_test (txt) VALUES (?)", ['prepared', ])
            print "prepared insert ok"
        except:
            ex = sys.exc_info()
            s = 'Exception in prepared insert: %s: %s\n%s\n' % (ex[0], ex[1], traceback.format_exc())
            Errors.append(s)
        try:
            cursor.execute("INSERT INTO _text_test (txt) VALUES ('normal')")
            print "insert ok"
        except:
            ex = sys.exc_info()
            s = 'Exception in insert: %s: %s\n%s' % (ex[0], ex[1], traceback.format_exc())
            Errors.append(s)
        cursor.execute("SELECT id, txt FROM _text_test")
        print "\nData:"
        for row in cursor.fetchall():
            print '[%s]\t[%s]' % (row[0], row[1])
        if Errors:
            print "\nErrors:"
            print "\n".join(Errors)
    finally:
        cursor.close()
        connection.commit()
        connection.close()
#test_varchar(driver, db_url, usr, passwd)
test_text("sun.jdbc.odbc.JdbcOdbcDriver", 'jdbc:odbc:test_db', 'usr', 'passwd')
test_text("com.informix.jdbc.IfxDriver", 'jdbc:informix-sqli://169.0.1.225:9088/test_db:informixserver=ol_225;DB_LOCALE=pl_PL.CP1250;CLIENT_LOCALE=pl_PL.CP1250;charSet=CP1250', 'usr', 'passwd')
Is there any setting in JDBC or ODBC that would let me have one version of the code
for both drivers?
Version info:
Server: IBM Informix Dynamic Server Version 11.50.TC2DE
Client:
ODBC driver 3.50.TC3DE
IBM Informix JDBC Driver for IBM Informix Dynamic Server 3.50.JC3DE
First off, are you really sure you want to use an Informix TEXT type? The type is a nuisance to use, in part because of the problems you are facing. It pre-dates anything in any SQL standard with respect to large objects (TEXT still isn't in SQL-2003 - though approximately equivalent structures, CLOB and BLOB, are). And the functionality of BYTE and TEXT blobs has not been changed since - oh, let's say 1996, though I suspect there's a case for choosing an earlier date, like 1991.
In particular, how much data are you planning to store in the TEXT columns? Your example shows the string 'insert'; that is, I presume, much much smaller than you would really use. You should be aware that a BYTE or TEXT column uses a 56-byte descriptor in the table plus a separate page (or set of pages) to store the actual data. So, for tiny strings like that, it is a waste of space and bandwidth (because the data for the BYTE or TEXT objects is shipped between client and server separately from the rest of the row). If your size won't get above about 32 KB, then you should look at using LVARCHAR instead of TEXT. If you will be using data sizes above that, then BYTE or TEXT or BLOB or CLOB are sensible alternatives, but you should look at configuring either blob spaces (for BYTE or TEXT) or smart blob spaces (for BLOB or CLOB). You can, and are, using TEXT IN TABLE rather than in a blob space; be aware that doing so impacts your logical logs, whereas using a blob space does not affect them anything like as much.
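For example, the test table from the question could be declared with LVARCHAR instead (the 4096 size here is arbitrary; check your server's documentation for the LVARCHAR limits that apply to your version):

```sql
CREATE TABLE _text_test (id SERIAL PRIMARY KEY, txt LVARCHAR(4096))
```

With LVARCHAR, ordinary string literals and placeholders work identically in both JDBC and ODBC, which sidesteps the blob-handling difference entirely.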
One of the features I've been campaigning for a decade or so is the ability to pass string literals in SQL statements as TEXT literals (or BYTE literals). That is in part because of the experience of people like you. I haven't yet been successful in getting it prioritized ahead of other changes that need to be made. Of course, you need to be aware that the maximum size of an SQL statement is 64 KB text, so you could create too big an SQL statement if you aren't careful; placeholders (question marks) in the SQL normally prevent that being a problem - and increasing the size of an SQL statement is another feature request which I've been campaigning for, but a little less ardently.
OK, assuming that you have sound reasons for using TEXT... what next? I'm not clear what Java (the JDBC driver) is doing behind the scenes - apart from too much - but it is a fair bet that it is noticing that a TEXT 'locator' structure is needed and is shipping the parameter in the correct format. It appears that the ODBC driver is not indulging you with similar shenanigans.
In ESQL/C, where I normally work, the code has to deal with BYTE and TEXT differently from everything else (and BLOB and CLOB have to be dealt with differently again). But you can create and populate a locator structure (loc_t or ifx_loc_t from locator.h - which may not be in the ODBC directory; it is in $INFORMIXDIR/incl/esql by default) and pass that to the ESQL/C code as the host variable for the relevant placeholder in the SQL statement. In principle, there is probably a parallel method available for ODBC. You may have to look at the Informix ODBC driver manual to find it, though.