Error when importing sklearn in pipeline component - google-cloud-vertex-ai

When I run this simple pipeline (in GCP's Vertex AI Workbench) I get an error:
ModuleNotFoundError: No module named 'sklearn'
Here is my code:
from kfp.v2 import compiler
from kfp.v2.dsl import pipeline, component
from google.cloud import aiplatform

@component(
    packages_to_install=["sklearn"],
    base_image="python:3.9",
)
def test_sklearn():
    import sklearn

@pipeline(
    pipeline_root=PIPELINE_ROOT,
    name="sklearn-pipeline",
)
def pipeline():
    test_sklearn()

compiler.Compiler().compile(pipeline_func=pipeline, package_path="sklearn_pipeline.json")
job = aiplatform.PipelineJob(
    display_name=PIPELINE_DISPLAY_NAME,
    template_path="sklearn_pipeline.json",
    pipeline_root=PIPELINE_ROOT,
    location=REGION
)
job.run(service_account=SERVICE_ACCOUNT)
What am I doing wrong? :)

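It seems that the package name sklearn no longer works with pip after a version upgrade: the library is published on PyPI as scikit-learn, and the sklearn name there is only a deprecated alias. You need to change the value of packages_to_install from "sklearn" to "scikit-learn" in the @component block. The import inside the component stays import sklearn.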
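The mismatch between the import name and the pip name is not unique to scikit-learn. As a minimal sketch, a small lookup table can translate import names into the names that packages_to_install expects. The helper pip_names and the table IMPORT_TO_PIP are hypothetical names introduced here for illustration; they are not part of kfp.

```python
# Hand-maintained examples of import names that differ from their pip names.
# Both the table and the helper are hypothetical, for illustration only.
IMPORT_TO_PIP = {
    "sklearn": "scikit-learn",
    "cv2": "opencv-python",
    "yaml": "PyYAML",
    "PIL": "Pillow",
}


def pip_names(import_names):
    """Translate import names into names suitable for packages_to_install."""
    return [IMPORT_TO_PIP.get(name, name) for name in import_names]


packages_to_install = pip_names(["sklearn", "pandas"])
print(packages_to_install)  # ['scikit-learn', 'pandas']
```

The resulting list could then be passed as packages_to_install in the @component decorator.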

Related

python-telegram-bot for (v20.x) TypeError: bad operand type for unary ~: 'type'

I am trying to build a Telegram bot. I have just reproduced this code:
from telegram.ext import MessageHandler
from telegram.ext import filters
from telegram.ext import Application
from telegram import Update
from telegram.ext import ContextTypes
from decouple import config

def testFunc(update: Update, context: ContextTypes.DEFAULT_TYPE):
    print('Hi')

def main():
    BOT_TOKEN = config('TELE_BOT_API_KEY_2')
    application = Application.builder().token(BOT_TOKEN).build()
    application.add_handler(MessageHandler(filters.Text & ~filters.Command, testFunc))
    application.run_polling()

if __name__ == '__main__':
    main()
The error this code shows is:
Bot\AsyncAdvanceBot\test3.py", line 16, in main
application.add_handler(MessageHandler(filters.Text & ~filters.Command, testFunc))
TypeError: bad operand type for unary ~: 'type'
I am using python-telegram-bot v20.x.
I know this might be a naive mistake that I am overlooking.
Thanks!
I tried restructuring the code in different ways, but it didn't work.
I got it! As I said, it was a naive error. I was not thinking straight😅
I had seen the documentation earlier, but I focused only on changing filters.Command to filters.COMMAND and forgot to change filters.Text to filters.TEXT.
I just replaced
filters.Text & ~filters.Command
with
filters.TEXT & ~filters.COMMAND
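The reason the fix works is that ~ calls __invert__ on its operand. In v20, filters.Text and filters.Command are classes, while filters.TEXT and filters.COMMAND are ready-made instances of them, and only instances support ~ and &. The sketch below reproduces the behaviour without the library; BaseFilter, Command, and Text here are hypothetical stand-ins, not the library's code.

```python
class BaseFilter:
    """Hypothetical stand-in for a filter base class (not the library's code)."""

    def __init__(self, name):
        self.name = name

    def __invert__(self):
        # Called for ~instance
        return BaseFilter(f"not {self.name}")

    def __and__(self, other):
        # Called for instance & other
        return BaseFilter(f"({self.name} and {other.name})")


class Command(BaseFilter):
    def __init__(self):
        super().__init__("command")


class Text(BaseFilter):
    def __init__(self):
        super().__init__("text")


# Ready-made instances, mirroring the upper-case names in the fix above.
COMMAND = Command()
TEXT = Text()

combined = TEXT & ~COMMAND
print(combined.name)  # (text and not command)

try:
    ~Command  # the class itself: type objects do not define __invert__
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # bad operand type for unary ~: 'type'
```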

Is it possible to make the internal dependencies inside a python module user selectable?

After testing my logger library locally, I uploaded it to PyPI.
Afterwards, when I installed it with pip install, I got an error saying that a module inside the library could not be found.
As a temporary measure, I imported every .py module in the package's __init__.py, and I want to improve on this, because users now have to install the dependencies for features they may never use.
What improvements can I make in this situation?
If possible, I would like to know how to lazily load only the modules a user actually uses, instead of importing every .py module in __init__.py.
Or is there something structural that I'm overlooking?
Here is the project structure I used:
project_name
  - pacakge_name
    - __init__.py         <- all loggers were registered
    - file_logger.py
    - console_logger.py
    - ...
    - fluent_logger.py    <- used external library
    - scribe_logger.py    <- used external library
__init__.py
"""
Description for Package
"""
from .composite_logger import CompositeLogger
from .console_logger import ConsoleLogger
from .file_logger import FileLogger
from .fluent_logger import FluentLogger
from .jandi_logger import JandiLogger
from .line_logger import LineLogger
from .logger_impl import LoggerImpl
from .logger_interface import LoggerInterface
from .logger import Logger
from .memory_logger import MemoryLogger
from .null_logger import NullLogger
from .scribe_logger import ScribeLogger
from .telegram_logger import TelegramLogger
from .retry import Retry
__all__ = [
    'CompositeLogger',
    'ConsoleLogger',
    'FileLogger',
    'FluentLogger',
    'JandiLogger',
    'LineLogger',
    'LoggerImpl',
    'LoggerInterface',
    'Logger',
    'MemoryLogger',
    'NullLogger',
    'ScribeLogger',
    'TelegramLogger',
    'Retry',
]
setup.py
import setuptools

with open("README.md", "r", encoding="utf-8") as f:
    long_description = f.read()

setuptools.setup(
    name = 'project_name',
    version = '0.0.1',
    description = 'python logger library',
    long_description = long_description,
    long_description_content_type = "text/markdown",
    author = 'abc',
    author_email = 'abc@gmail.com',
    url = "https://github.com/##/##",
    packages = ["pacakge_name"],
    install_requires=[  # <- contains too many external libraries
        'requests>=2.0.0',
        'thrift>=0.16.0',
        'facebook-scribe>=2.0.post1',
        'fluent-logger>=0.10.0'
    ],
    keywords = ['logger'],
    python_requires = '>=3.7',
    classifiers = [
        'Programming Language :: Python :: 3.7',
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent"
    ],
)
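The thread has no answer, so what follows is only one possible direction, not a confirmed fix. A module-level __getattr__ (PEP 562, available from Python 3.7) lets __init__.py import a submodule the first time one of its names is accessed, so a logger's external dependency is needed only when that logger is used. The helper make_lazy_getattr, the mapping format, and the extra names are all hypothetical; the demo uses the standard-library json package as a stand-in so that it runs anywhere.

```python
import importlib


def make_lazy_getattr(package, lazy_attrs):
    """Build a module-level __getattr__ that imports submodules on first use.

    lazy_attrs maps a public name to (relative_module, extra_name).
    extra_name is only used in the error hint and is an assumption.
    """

    def __getattr__(name):
        if name not in lazy_attrs:
            raise AttributeError(f"module {package!r} has no attribute {name!r}")
        relative_module, extra = lazy_attrs[name]
        try:
            module = importlib.import_module(relative_module, package)
        except ImportError as exc:
            # The submodule or one of its optional dependencies is missing.
            raise ImportError(
                f"{name} needs optional dependencies; "
                f"try: pip install {package}[{extra}]"
            ) from exc
        return getattr(module, name)

    return __getattr__


# In the package's __init__.py this could look like (hypothetical):
# __getattr__ = make_lazy_getattr(__name__, {
#     "FluentLogger": (".fluent_logger", "fluent"),
#     "ScribeLogger": (".scribe_logger", "scribe"),
# })

# Runnable demo with the standard library standing in for the logger package.
lazy = make_lazy_getattr("json", {
    "JSONDecoder": (".decoder", "demo"),
    "Missing": (".no_such_submodule", "demo"),
})
decoder_class = lazy("JSONDecoder")
print(decoder_class.__name__)  # JSONDecoder
```

The optional libraries could then move from install_requires to setuptools' extras_require, so that a user installs them only for the loggers they want, for example with pip install project_name[fluent] (the extra name is again an assumption).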

How to get rid of positional argument error while using map function in tensorflow dataset

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import tensorflow as tf
from tensorflow import keras
import tensorflow_datasets as tfds

(ds_train, ds_test), ds_info = tfds.load("mnist", split=['train', "test"],
                                         shuffle_files=True, with_info=True)

def normalize(image, label):
    image = tf.cast(image, tf.float32)
    return image / 255.0, label

AUTOTUNE = tf.data.experimental.AUTOTUNE
new_ds = ds_train.map(normalize, num_parallel_calls=AUTOTUNE)
When I execute ds_train.map, it shows the error below:
TypeError: in user code:
TypeError: tf__normalize() missing 1 required positional argument: 'label'
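The thread has no answer. A likely cause, stated here as an assumption, is that tfds.load without as_supervised=True yields each example as a dict with "image" and "label" keys, so map passes a single argument to normalize. Passing as_supervised=True to tfds.load should yield (image, label) tuples instead. The alternative is a wrapper that unpacks the dict, sketched below in plain Python: a float stands in for the image tensor, the tf.cast step is dropped, and normalize_features is a hypothetical name.

```python
def normalize(image, label):
    # Plain-number stand-in for the tf.cast(...) / 255.0 step.
    return image / 255.0, label


def normalize_features(features):
    # Unpack the dict-shaped element before delegating to normalize.
    return normalize(features["image"], features["label"])


# One element shaped like a dict example (assumed shape).
element = {"image": 255.0, "label": 7}

try:
    normalize(element)  # a single positional argument, as with a dict element
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

scaled = normalize_features(element)
print(scaled)  # (1.0, 7)
```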

Streamlit Unhashable TypeError when I use st.cache

When I use the st.cache decorator to cache a Hugging Face transformer model, I get an
Unhashable TypeError.
This is the code:
from transformers import pipeline
import streamlit as st
from io import StringIO

@st.cache(hash_funcs={StringIO: StringIO.getvalue})
def model():
    return pipeline("sentiment-analysis", model='akhooli/xlm-r-large-arabic-sent')
After searching the issues section of the Streamlit repo,
I found that the hashing argument is not required; I just needed to pass this argument:
allow_output_mutation=True
This worked for me:
from transformers import pipeline
import tokenizers
import streamlit as st
import copy

@st.cache(hash_funcs={tokenizers.Tokenizer: lambda _: None, tokenizers.AddedToken: lambda _: None})
def get_model():
    return pipeline("sentiment-analysis", model='akhooli/xlm-r-large-arabic-sent')

input = st.text_input('Text')
bt = st.button("Get Sentiment Analysis")
if bt and input:
    model = copy.deepcopy(get_model())
    st.write(model(input))
Note 1:
Calling the pipeline with input, as in model(input), changes the model, and we shouldn't change a cached value, so we need to copy the model and run it on the copy.
Note 2:
The first run loads the model using the get_model function; later runs use the cache.
Note 3:
You can read more about advanced caching in Streamlit in their documentation.
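The copy-before-use pattern from Note 1 can be sketched without Streamlit. In this illustration functools.lru_cache stands in for st.cache and a plain dict stands in for the loaded model; both are assumptions made so the example runs on its own, and run is a hypothetical function.

```python
import copy
from functools import lru_cache


@lru_cache(maxsize=None)
def get_model():
    # A mutable dict stands in for the loaded pipeline object (assumption).
    return {"calls": 0}


def run(model, text):
    # Simulates a call that mutates the object it runs on.
    model["calls"] += 1
    return f"scored: {text}"


# Work on a deep copy so the cached object is never modified.
working_copy = copy.deepcopy(get_model())
result = run(working_copy, "hello")

print(get_model()["calls"])   # 0, the cached object is untouched
print(working_copy["calls"])  # 1, only the copy changed
```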

How to remove the unused generated require when using protobuf annotations

package usegogo.api.v1;
import "google/protobuf/empty.proto";
import "google/protobuf/timestamp.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/field_mask.proto";
import "gogoproto/gogo.proto";
option (gogoproto.marshaler_all) = false;
I use gogoproto to generate Go code.
But when I generate Node.js code, the output contains var gogoproto_gogo_pb = require('../../../gogoproto/gogo_pb.js');
This is generated because I use import "gogoproto/gogo.proto";
Is there any way to make protoc ignore the import "gogoproto/gogo.proto"; since I don't use it when I generate Node.js code?
Protoc will actually generate gogo_pb.js if you point it to gogo.proto, as you do with your other proto files, so the generated require has a file to resolve to.
