How do I install snowflake.sqlalchemy in Anaconda?

I'm trying to connect to Snowflake in Python. So far I have been unsuccessful. I have read forums on using the engine approach, i.e.:
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine
import pandas as pd

# Build the Snowflake connection URL (credentials redacted)
url = URL(
    account='xxxx',
    user='xxxx',
    password='xxxx',
    database='xxx',
    schema='xxxx',
    warehouse='xxx',
    role='xxxxx',
    authenticator='https://xxxxx.okta.com',
)
engine = create_engine(url)
connection = engine.connect()

query = '''
select * from MYDB.MYSCHEMA.MYTABLE
LIMIT 10;
'''
df = pd.read_sql(query, connection)
but I get the error:
ModuleNotFoundError: No module named 'snowflake.sqlalchemy'
How do I install this module in Anaconda? I cannot find a way around this; nothing else I have read works.

This worked for me:
conda install -c conda-forge snowflake-sqlalchemy

Don't use import snowflake together with from snowflake.sqlalchemy. Use only:
from snowflake.sqlalchemy import URL

Even in an Anaconda environment you can use pip. Did you try pip install snowflake-sqlalchemy?
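As a quick sanity check (a sketch, not from the original answers), you can confirm the package is visible to the interpreter you are actually running; a common pitfall is installing into a different conda environment than the one your notebook or script uses:

import sys
from importlib.metadata import version  # standard library, Python 3.8+

print(sys.executable)                   # which Python interpreter is running
print(version("snowflake-sqlalchemy"))  # raises PackageNotFoundError if the package is missing
from snowflake.sqlalchemy import URL    # should now import without ModuleNotFoundError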

Related

Oracle ODBC Issues with CentOS 7

Installation of Oracle ODBC on CentOS 7 was successful. However, the error [IM004] [unixODBC] [Driver Manager] Driver's SQLAllocHandle on SQL_HANDLE_HENV failed was encountered, and I have not found a solution.
These are the installation files:
unixODBC-2.3.11.tar.gz
instantclient-basic-linux.x64-19.18.0.0.0dbru.zip
instantclient-odbc-linux.x64-19.18.0.0.0dbru.zip
First, this is the installation process.
tar -xvzf ./unixODBC-2.3.11.tar.gz -> ./configure -> make -> make install
unzip ./instantclient-basic-linux.x64-19.18.0.0.0dbru.zip
unzip ./instantclient-odbc-linux.x64-19.18.0.0.0dbru.zip
cd instantclient_19_18/
./odbc_update_ini.sh /usr/local /usr/lib64 Oracle
The location and contents of each important configuration file:
/usr/local/etc/odbc.ini
[Oracle226]
Application Attributes = T
Attributes = W
BatchAutocommitMode = IfAllSuccessful
CloseCursor = F
DisableDPM = F
DisableMTS = T
Driver = Oracle
EXECSchemaOpt =
EXECSyntax = T
Failover = T
FailoverDelay = 10
FailoverRetryCount = 10
FetchBufferSize = 2000
ForceWCHAR = F
Lobs = F
Longs = T
MetadataIdDefault = F
QueryTimeout = T
ResultSets = T
ServerName = //IP:PORT/DB
SQLGetData extensions = F
Translation DLL =
Translation Option = 0
UserID = DB_ID
Password = DB_PW
Port = PORT
DatabaseCharacterSet = AL32UTF8
#DatabaseCharacterSet = euckr
/usr/local/etc/odbcinst.ini
[Oracle]
Description = Oracle ODBC driver for Oracle 19
Driver = /usr/local/libsqora.so.19.1
Setup =
FileUsage =
CPTimeout =
CPReuse =
/etc/profile
export ODBCSYSINI=/usr/local/etc
export ODBCINI=/usr/local/etc/odbc.ini
export LD_LIBRARY_PATH=/usr/lib64:/usr/local/etc
odbcinst -j
unixODBC 2.3.11
DRIVERS............: /usr/local/etc/odbcinst.ini
SYSTEM DATA SOURCES: /usr/local/etc/odbc.ini
FILE DATA SOURCES..: /usr/local/etc/ODBCDataSources
USER DATA SOURCES..: /usr/local/etc/odbc.ini
SQLULEN Size.......: 8
SQLLEN Size........: 8
SQLSETPOSIROW Size.: 8
Along the way, commands such as source /etc/profile and ldconfig were executed. It's hard to use yum because this is a closed (offline) network.
Please help me.
The additional DB I am trying to connect to is an Oracle DB at a different IP address.
I tried Googling, but nothing I found worked.
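If pyodbc can be installed on the box (which may be awkward on a closed network), a minimal check like the one below can show whether the DSN itself resolves, independently of any application code; the DSN name and credentials are the placeholders from the odbc.ini above:

import pyodbc

print(pyodbc.drivers())  # should list the 'Oracle' driver registered in odbcinst.ini
conn = pyodbc.connect("DSN=Oracle226;UID=DB_ID;PWD=DB_PW")
print(conn.getinfo(pyodbc.SQL_DBMS_NAME))  # e.g. 'Oracle' if the handshake succeeds
conn.close()

If this fails with the same IM004 error, the problem is usually in the driver manager/driver environment (for example LD_LIBRARY_PATH not visible to the calling process) rather than in the client code.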

How to generate a SAS token using the Python legacy SDK (2.1) without account_key or connection_string

I am using Python 3.6 and azure-storage-blob (version 1.5.0) and trying to use a user-assigned managed identity to connect to my Azure Storage blob from an Azure VM. The problem I am facing is that I want to generate a SAS token to form a downloadable URL.
I am using blob_service = BlockBlobService(account_name, token_credential) to authenticate, but I am not able to find any method which lets me generate a SAS token without supplying the account key.
I also do not see any way of using the user delegation key, as is available in the new azure-storage-blob (versions >= 12.0.0). Is there any workaround, or will I need to upgrade the Azure Storage library in the end?
I tried to reproduce this in my environment and was able to generate a SAS token without an account key or connection string.
Code:
import datetime as dt

from azure.identity import DefaultAzureCredential
from azure.storage.blob import (
    BlobSasPermissions,
    BlobServiceClient,
    generate_blob_sas,
)

credential = DefaultAzureCredential(exclude_shared_token_cache_credential=True)

storage_acct_name = "Accountname"
container_name = "containername"
blob_name = "Filename"

account_url = f"https://{storage_acct_name}.blob.core.windows.net"
blob_service_client = BlobServiceClient(account_url, credential=credential)

# Request a user delegation key with Azure AD credentials (no account key needed)
udk = blob_service_client.get_user_delegation_key(
    key_start_time=dt.datetime.utcnow() - dt.timedelta(hours=1),
    key_expiry_time=dt.datetime.utcnow() + dt.timedelta(hours=1),
)

# Sign a blob-level SAS with the user delegation key
sas = generate_blob_sas(
    account_name=storage_acct_name,
    container_name=container_name,
    blob_name=blob_name,
    user_delegation_key=udk,
    permission=BlobSasPermissions(read=True),
    start=dt.datetime.utcnow() - dt.timedelta(minutes=15),
    expiry=dt.datetime.utcnow() + dt.timedelta(hours=2),
)

sas_url = (
    f'https://{storage_acct_name}.blob.core.windows.net/'
    f'{container_name}/{blob_name}?{sas}'
)
print(sas_url)
Output:
Make sure the identity has the Storage Blob Data Contributor role assigned on the storage account.
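As an optional follow-up (a sketch appended to the script above, still assuming azure-storage-blob >= 12), you can open the SAS URL with BlobClient to confirm the token actually grants read access:

from azure.storage.blob import BlobClient

blob_client = BlobClient.from_blob_url(sas_url)  # sas_url built above
data = blob_client.download_blob().readall()     # fails with an authorization error if the SAS is not valid
print(len(data), "bytes downloaded")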

Very slow connection to Snowflake from Databricks

I am trying to connect to Snowflake using R in Databricks. My connection works and I can run queries and retrieve data successfully; however, it can take more than 25 minutes simply to connect, although once connected all my queries are quick thereafter.
I am using the sparklyr function 'spark_read_source', which looks like this:
query <- spark_read_source(
  sc = sc,
  name = "query_tbl",
  memory = FALSE,
  overwrite = TRUE,
  source = "snowflake",
  options = append(sf_options, client_Q)
)
where 'sf_options' is a list of connection parameters which looks similar to this:
sf_options <- list(
  sfUrl = "https://<my_account>.snowflakecomputing.com",
  sfUser = "<my_user>",
  sfPassword = "<my_pass>",
  sfDatabase = "<my_database>",
  sfSchema = "<my_schema>",
  sfWarehouse = "<my_warehouse>",
  sfRole = "<my_role>"
)
and my query is a string appended to the 'options' argument, e.g.
client_Q <- 'SELECT * FROM <my_database>.<my_schema>.<my_table>'
I can't understand why it is taking so long; if I run the same query from RStudio using a local Spark instance and 'dbGetQuery', it is instant.
Is spark_read_source the problem? Is it an issue between Snowflake and Databricks? Or something else? Any help would be great. Thanks.
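For comparison only (not part of the original question), the same read from a Databricks Python notebook with the Snowflake Spark connector attached would look roughly like this, reusing the placeholder options from the R code above; if this is also slow to start, the bottleneck is more likely the cluster or connector than sparklyr itself:

sf_options = {
    "sfUrl": "https://<my_account>.snowflakecomputing.com",
    "sfUser": "<my_user>",
    "sfPassword": "<my_pass>",
    "sfDatabase": "<my_database>",
    "sfSchema": "<my_schema>",
    "sfWarehouse": "<my_warehouse>",
    "sfRole": "<my_role>",
}

df = (spark.read                  # 'spark' is the session Databricks provides
      .format("snowflake")        # short name registered by the Snowflake connector
      .options(**sf_options)
      .option("query", "SELECT * FROM <my_database>.<my_schema>.<my_table>")
      .load())
df.show(10)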

How to create EC2 instance through boto python code

requests = [conn.request_spot_instances(price=0.0034, image_id='ami-6989a659', count=1, type='one-time', instance_type='m1.micro')]
I used the above code, but it is not working.
Use the following code to create an instance from the Python command line.
import boto.ec2

# Connect to the region with explicit credentials
conn = boto.ec2.connect_to_region(
    "us-west-2",
    aws_access_key_id="<aws access key>",
    aws_secret_access_key="<aws secret key>",
)

# Launch an on-demand instance
conn.run_instances(
    "<ami-image-id>",
    key_name="myKey",
    instance_type="t2.micro",
    security_groups=["your-security-group-here"],
)
To create an EC2 instance using Python on AWS, you need to have "aws_access_key_id_value" and "aws_secret_access_key_value".
You can store these variables in config.properties and write your code in the create-ec2-instance.py file.
Create a config.properties and save the following code in it.
aws_access_key_id_value='YOUR-ACCESS-KEY-OF-THE-AWS-ACCOUNT'
aws_secret_access_key_value='YOUR-SECRETE-KEY-OF-THE-AWS-ACCOUNT'
region_name_value='region'
ImageId_value = 'ami-id'
MinCount_value = 1
MaxCount_value = 1
InstanceType_value = 't2.micro'
KeyName_value = 'name-of-ssh-key'
Create create-ec2-instance.py and save the following code in it.
import boto3

def getVarFromFile(filename):
    # Load the key/value pairs from config.properties as a module named 'data'
    import imp
    f = open(filename)
    global data
    data = imp.load_source('data', '', f)
    f.close()

getVarFromFile('config.properties')

ec2 = boto3.resource(
    'ec2',
    aws_access_key_id=data.aws_access_key_id_value,
    aws_secret_access_key=data.aws_secret_access_key_value,
    region_name=data.region_name_value
)

instance = ec2.create_instances(
    ImageId=data.ImageId_value,
    MinCount=data.MinCount_value,
    MaxCount=data.MaxCount_value,
    InstanceType=data.InstanceType_value,
    KeyName=data.KeyName_value,
)
print(instance[0].id)
Use the following command to execute the python code.
python create-ec2-instance.py
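As a hypothetical follow-up (not in the original answer), you can wait for the new instance to reach the running state and print its public IP using the same boto3 resource API; the instance id below is a placeholder for instance[0].id from the script above:

import boto3

ec2 = boto3.resource("ec2")                 # assumes region/credentials are already configured
inst = ec2.Instance("i-0123456789abcdef0")  # placeholder: use instance[0].id from above
inst.wait_until_running()                   # blocks until the instance state is 'running'
inst.reload()                               # refresh attributes such as the public IP
print(inst.state["Name"], inst.public_ip_address)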

unable to use RODM to connect to Oracle database from R

I am trying to connect to an Oracle database from R.
I used RODM_open_dbms_connection(dsn, uid = "", pwd = ""), but it doesn't work, and I am not sure what kind of error it is.
Here is the error screen from R.
> library(RODM)
Loading required package: RODBC
> DB <- RODM_open_dbms_connection(dsn="****", uid="****", pwd="****")
Error in typesR2DBMS[[driver]] <<- value[c("double", "integer", "character",  :
  cannot change value of locked binding for 'typesR2DBMS'
Have you tried ROracle? After you get the instant client installed on your machine, connecting and fetching records from R looks like this:
library(ROracle)
library(data.table)

con <- dbConnect(dbDriver("Oracle"), username="username", password="password", dbname="dbname")
res <- dbSendQuery(con, "select * from schema.table")
dt <- data.table(fetch(res, n=-1))
I explored RODM_open_dbms_connection and commented out the call to setSqlTypeInfo(). After that I didn't receive that error.
Install the RODM package from source; only then can you edit the package.
