How can I solve this Anaconda seaborn library issue?

I am trying to do data visualization. I am using Anaconda JupyterLab, but I always get an error message.
For example,
sns.lineplot(x = "timepoint", y = "signal", data = df);
AttributeError: module 'seaborn' has no attribute 'lineplot'
The same thing happens with sns.scatterplot and sns.catplot.
I am using seaborn==0.10.1. Can you help me?
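Not part of the original question, but a quick way to narrow this down: sns.lineplot, sns.scatterplot, and sns.catplot were all added in seaborn 0.9.0, so it is worth confirming which seaborn installation the notebook kernel is actually importing (the one JupyterLab uses may differ from the environment where seaborn==0.10.1 was installed):
import seaborn as sns

# If this prints a version older than 0.9.0, or a path outside the Anaconda
# environment, the kernel is picking up a different seaborn installation.
print(sns.__version__)
print(sns.__file__)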

Related

TypeError: from_pretrained() got an unexpected keyword argument 'file_name'

I'm trying to quantize a seq2seq model (M2M100) using the optimum library provided by Hugging Face. As per this guide, I'm trying to quantize the encoder and decoder one by one, but that requires me to overwrite the model name. Following the documentation in the guide, I used the code:
encoder_quantizer = ORTQuantizer.from_pretrained(model_dir, file_name="encoder_model.onnx")
This code is throwing the following error:
TypeError: from_pretrained() got an unexpected keyword argument 'file_name'
I tried examining ORTQuantizer.from_pretrained and got the following:
<function optimum.onnxruntime.quantization.ORTQuantizer.from_pretrained(model_name_or_path: Union[str, os.PathLike], feature: str, opset: Optional[int] = None) -> 'ORTQuantizer'>
Clearly, from_pretrained here doesn't have a file_name parameter as has been indicated in the guide. Can someone please help me debug this error? Thanks!
Posting the solution I found while exploring the optimum GitHub repo. The problem is that installing optimum via pip downloads v1.3, which did not have the fix for quantizing seq2seq models. Instead, install the package directly from GitHub using the command below. It worked fine afterwards.
python -m pip install git+https://github.com/huggingface/optimum.git
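After switching to the GitHub install, one quick sanity check (a sketch using the standard inspect module, not from the original answer) is to look at the signature again and confirm that a file_name parameter is now accepted:
import inspect
from optimum.onnxruntime import ORTQuantizer

# With the fix present, the printed signature should include file_name,
# matching the call shown in the guide.
print(inspect.signature(ORTQuantizer.from_pretrained))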

Running databrew.describe_job_run() from inside a Lambda does not work

I have a Lambda that polls the status of a DataBrew job using a boto3 client. The code - as written here - works fine in my local environment. When I put it into a Lambda function, I get the error:
[ERROR] AttributeError: 'GlueDataBrew' object has no attribute 'describe_job_run'
This is the syntax found in the Boto3 documentation:
client.describe_job_run(
    Name='string',
    RunId='string'
)
This is my code:
import boto3

def get_brewjob_status(jobName, jobRunId):
    brew = boto3.client('databrew')
    try:
        jobResponse = brew.describe_job_run(Name=jobName, RunId=jobRunId)
        status = jobResponse['State']
    except Exception as e:
        status = 'FAILED'
        print('Unable to get job status')
        raise(e)
    return {
        'jobStatus': status
    }

def lambda_handler(event, context):
    jobName = event['jobName']
    jobRunId = event['jobRunId']
    response = get_brewjob_status(jobName, jobRunId)
    return response
I am using the Lambda runtime version of boto3. The jobName and jobRunId variables are strings passed from a Step Function, but I've also tried hard-coding them into the Lambda to check the error, and I get the same result. I have tried running it on both the Python 3.7 and Python 3.8 runtimes. I'm also confident (and have double-checked) that the IAM permissions allow the Lambda access to DataBrew. Thanks for any ideas!
Fixed my own problem. There must be some kind of conflict between the Lambda runtime's boto3 and DataBrew - maybe the bundled SDK hasn't been updated to include DataBrew yet? I created a .zip deployment package and it worked fine. Should have done that two days ago...
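For anyone who wants to confirm that diagnosis before repackaging, a small check like this (a sketch, not from the original thread) prints what the Lambda runtime's bundled SDK actually supports:
import boto3
import botocore

def lambda_handler(event, context):
    # An outdated bundled SDK can know the 'databrew' service but still be
    # missing newer operations such as DescribeJobRun.
    print(boto3.__version__, botocore.__version__)
    client = boto3.client('databrew')
    print('DescribeJobRun' in client.meta.service_model.operation_names)
    return {}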

Data Explorer: ImportError No module named Kqlmagic

I'm following this tutorial:
https://learn.microsoft.com/en-us/azure/data-explorer/kqlmagic
I have a Databricks cluster so I decided to use the notebook that is available on there.
When I get to step 2 and run:
reload_ext Kqlmagic
I get the error message:
ImportError: No module named Kqlmagic
Kqlmagic doesn't work with Databricks notebooks. It might be supported in a future version.
Instead of steps 1 and 2, please try running:
%load_ext Kqlmagic
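If the ImportError is simply a missing package rather than the platform limitation mentioned above, installing Kqlmagic into the notebook's own environment first may also be worth trying (a sketch; it assumes the %pip magic is available in your notebook):
%pip install Kqlmagic
%load_ext Kqlmagic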

PySpark SparkContext error: "An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext."

I know this question has been posted before, but I tried the suggested solutions and none worked for me. I installed Spark for Jupyter Notebook using this tutorial:
https://medium.com/@GalarnykMichael/install-spark-on-mac-pyspark-453f395f240b#.be80dcqat
I installed the latest version of Apache Spark on my Mac.
When I try to run the following code in Jupyter
wordcounts = sc.textFile('words.txt')
I get the following error:
name 'sc' is not defined
When I try adding the code:
from pyspark import SparkContext, SparkConf
sc = SparkContext()
I get the following error:
An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.util.StringUtils
    at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:611)
I added the path in bash:
export SPARK_PATH=~/spark-2.2.1-bin-hadoop2.7
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
#For python 3, You have to add the line below or you will get an error
# export PYSPARK_PYTHON=python3
alias snotebook='$SPARK_PATH/bin/pyspark --master local[2]'
Please help me resolve this.
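Not from the original post, but one common way to get a usable sc inside a plain Jupyter kernel is the findspark package, assuming it is installed and pointed at the same Spark directory as SPARK_PATH above (this only addresses the missing sc; the NoClassDefFoundError can also indicate a Java/Hadoop mismatch). The appName below is arbitrary:
import os
import findspark

# Point findspark at the Spark installation referenced by SPARK_PATH above.
findspark.init(os.path.expanduser('~/spark-2.2.1-bin-hadoop2.7'))

from pyspark import SparkContext

sc = SparkContext(master='local[2]', appName='wordcount-test')
wordcounts = sc.textFile('words.txt')
print(wordcounts.count())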

h2o.ai H2OResponseError: Server error water.exceptions.H2ONotFoundArgumentException: Error: File does not exist

I am using h2o with Python in a Jupyter notebook and getting this error message:
...
/home/mapr/anaconda2/lib/python2.7/site-packages/h2o/backend/connection.pyc in _process_response(response, save_to)
723 # Client errors (400 = "Bad Request", 404 = "Not Found", 412 = "Precondition Failed")
724 if status_code in {400, 404, 412} and isinstance(data, (H2OErrorV3, H2OModelBuilderErrorV3)):
--> 725 raise H2OResponseError(data)
726
727 # Server errors (notably 500 = "Server Error")
H2OResponseError: Server error water.exceptions.H2ONotFoundArgumentException:
Error: File <path to data file I'm trying to import> does not exist.
when trying to import data with
train = h2o.import_file(path = os.path.realpath("relative path to data file"))
Yet the file does in fact exist on the specified path. Why would this be happening?
Details
I am following an h2o deep learning example for accessing the h2o service from Python code in a Jupyter notebook. Everything works fine up until the part where I need to import .csv data, e.g.
spiral = h2o.import_file(path = os.path.realpath("../data/spiral.csv"))
At that point, the error above is raised. The source code comments:
# In this case, the cluster is running on our laptops. Data files are imported by their relative locations to this notebook.
Yet, when running
os.path.exists(os.path.realpath("./data/<my data csv file>"))
in the notebook, the result is True. So it seems the relative path is recognized by the Python os package*, but there is some problem with the h2o.import_file() method.
What could be going on here? Thanks.
Note: I am using port forwarding from the machine actually running the h2o and jupyter-notebook services, with something like:
remote machine:
$jupyter-notebook --no-browser --port=8889
local machine:
$ssh -N -L localhost:8888:localhost:8889 myuser@mnode01
* The directory structure is:
bin
data
|
|_____ mydata.csv
include
lib
remote-h2o.ipynb
UPDATE
I think I have found the problem. The h2o Python docs specify that:
The path to the data must be a valid path for each node in the H2O cluster. If some node in the H2O cluster cannot see the file, then an exception will be thrown by the H2O cluster.
This raises the question: does this mean that all of the cluster nodes need to have the same virtualenv (with the same absolute path) that I am running the Jupyter notebook from and that holds data/mydata.csv?
I had the same problem and solved it by changing the port in h2o.init(port='XXXXX').
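If the root cause is the cluster-path requirement quoted in the update above, another option worth trying (a sketch, not from this thread) is h2o.upload_file, which streams the file from the Python client to the cluster, so only the notebook's machine needs to be able to resolve the path:
import os
import h2o

h2o.init()  # or h2o.init(port=...) as in the answer above
# upload_file reads the file on the client side and pushes it to the cluster,
# so the cluster nodes never need to see this relative path themselves.
spiral = h2o.upload_file(path=os.path.realpath("../data/spiral.csv"))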
