I am using Spring Boot for a Java application and I want to include a Python module, my_module.py, in the app. I am trying to import the module like this:
interpretor.exec("import my_module")
But I am getting the error ImportError: No module named my_module. When I check the current working directory using
interpretor.exec("import os\nprint os.getcwd()")
it prints the path /my_project/, and my module is located at /my_project/my_module.py, so the location looks correct. I expected the module to be picked up, since this is the current working directory.
Can someone please tell me where to put the Python module so that it is picked up by Jython?
You need to set the Python module path so that Jython can find your module, like this:
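import java.util.Properties;
import org.python.util.PythonInterpreter;

// "python.path" seeds Jython's sys.path; MODULE_PATH is assumed to be defined elsewhere
Properties pyProperties = new Properties();
pyProperties.put("python.path", System.getProperty("user.dir") + MODULE_PATH);
PythonInterpreter.initialize(System.getProperties(), pyProperties, new String[0]);
PythonInterpreter pyInterpreter = new PythonInterpreter();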
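The underlying rule is that imports are resolved against sys.path, not against the current working directory as such, and the python.path property is what feeds sys.path in an embedded interpreter. A minimal sketch of that rule in plain Python follows; the temporary directory is a hypothetical stand-in for the folder that holds my_module.py, and the GREETING constant is invented for the illustration.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for the directory that holds my_module.py.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "my_module.py"), "w") as handle:
    handle.write("GREETING = 'hello from my_module'\n")

# The directory is not on sys.path yet, so the import fails.
try:
    importlib.import_module("my_module")
    found_before = True
except ImportError:
    found_before = False

# Once the directory is on sys.path, the same import succeeds.
sys.path.append(module_dir)
importlib.invalidate_caches()
my_module = importlib.import_module("my_module")

print(found_before)
print(my_module.GREETING)
```

Inside the interpreter created above, printing sys.path is a quick way to check whether the directory set through python.path actually arrived there.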
I am attempting to set up a PyFlink stream that reads from ADLS, and I am currently trying to read a JSON file using the StreamExecutionEnvironment.from_source() method.
Here is what the code looks like:
from flink.plan.Environment import get_environment
from pyflink.datastream.functions import SourceFunction
from pyflink.datastream import StreamExecutionEnvironment, RuntimeExecutionMode
from pyflink.datastream.connectors.file_system import (FileSource, StreamFormat, FileSink,
                                                       OutputFileConfig, RollingPolicy, BucketAssigner)
from pyflink.common import WatermarkStrategy, Encoder, Types
from azure.storage.filedatalake import FileSystemClient
file_system = FileSystemClient.from_connection_string(connection_str, file_system_name="my_fs")
# setting the stream environment object
env = StreamExecutionEnvironment.get_execution_environment()
env.set_runtime_mode(RuntimeExecutionMode.STREAMING)
env.add_jars("file:///opt/flink/plugins/azure/flink-azure-fs-hadoop-1.16.0.jar")
env.add_classpaths("file:///opt/flink/plugins/azure/flink-azure-fs-hadoop-1.16.0.jar")
file_client = file_system.get_file_client(my_file)
input_path = 'abfss://' + file_client.url[8:]
print('URL is ===== >>>>',file_client.url)
# Source
ds = env.from_source(
    source=FileSource.for_record_stream_format(StreamFormat.text_line_format(),
                                               input_path)
    .process_static_file_set().build(),
    watermark_strategy=WatermarkStrategy.for_monotonous_timestamps(),
    source_name="file_source"
)
ds.sink_to(
    sink=FileSink.for_row_format(
        base_path=output_path,
        encoder=Encoder.simple_string_encoder())
    .with_bucket_assigner(BucketAssigner.base_path_bucket_assigner())
    .build())
ds.print()
env.execute()
I am getting the error below:
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'abfs'. The scheme is directly supported by Flink through the following plugin(s): flink-fs-azure-hadoop. Please ensure that each plugin resides within its own subfolder within the plugins directory. See https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/ for more information. If you want to use a Hadoop file system for that scheme, please add the scheme to the configuration fs.allowed-fallback-filesystems. For a full list of supported file systems, please see https://nightlies.apache.org/flink/flink-docs-stable/ops/filesystems/.
The JAR file has already been added to the plugins folder as described in the documentation:
https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/
The storage account key has also been added to the config.yaml file.
Alternatively, I tried adding the source as a DataStream using:
ds = env.read_text_file(input_path)
Following the Graphene-Django basic tutorial verbatim results in this helpful situation.
Maybe there is a dir problem or something? As soon as we add the installed app, it is not found?
Tried everything here
Open the "cookbook/ingredients/" folder and change name in IngredientsConfig (in apps.py) to cookbook.ingredients:
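from django.apps import AppConfig

class IngredientsConfig(AppConfig):
    default_auto_field = 'django.db.models.BigAutoField'
    # Full dotted path, because the app lives inside the cookbook package
    name = 'cookbook.ingredients'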
I am able to run my pipelines using the kedro run command without issue. For some reason, though, I can no longer access my context and catalog from Jupyter Notebook. When I run kedro jupyter notebook and start a new (or existing) notebook, selecting my project name under "New", I get the following errors:
context
NameError: name 'context' is not defined
catalog.list()
NameError: name 'catalog' is not defined
EDIT:
After running the magic command %kedro_reload I can see that my ProjectContext init_spark_session is looking for files in project_name/notebooks instead of project_name/src. I tried changing the working directory in my Jupyter Notebook session with %cd ../src and os.chdir('../src'), but kedro still looks in the notebooks folder:
%kedro_reload
java.io.FileNotFoundException: File file:/Users/user_name/Documents/app_name/kedro/notebooks/dist/project_name-0.1-py3.8.egg does not exist
_spark_session.sparkContext.addPyFile() is looking in the wrong place. When I comment out this line from my ProjectContext this error goes away but I receive another one about not being able to find my Oracle driver when trying to load a dataset from the catalog:
df = catalog.load('dataset')
java.lang.ClassNotFoundException: oracle.jdbc.driver.OracleDriver
EDIT 2:
For reference:
kedro/src/project_name/context.py
def init_spark_session(self) -> None:
    """Initialises a SparkSession using the config defined in project's conf folder."""
    # Load the spark configuration in spark.yaml using the config loader
    parameters = self.config_loader.get("spark*", "spark*/**")
    spark_conf = SparkConf().setAll(parameters.items())
    # Initialise the spark session
    spark_session_conf = (
        SparkSession.builder.appName(self.package_name)
        .enableHiveSupport()
        .config(conf=spark_conf)
    )
    _spark_session = spark_session_conf.getOrCreate()
    _spark_session.sparkContext.setLogLevel("WARN")
    _spark_session.sparkContext.addPyFile(f'src/dist/project_name-{__version__}-py3.8.egg')
kedro/conf/base/spark.yml:
# You can define spark specific configuration here.
spark.driver.maxResultSize: 8g
spark.hadoop.fs.s3a.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
spark.sql.execution.arrow.pyspark.enabled: true
# https://kedro.readthedocs.io/en/stable/11_tools_integration/01_pyspark.html#tips-for-maximising-concurrency-using-threadrunner
spark.scheduler.mode: FAIR
# JDBC driver
spark.jars: drivers/ojdbc8-21.1.0.0.jar
I think a combination of the following might help you:
Generally, avoid manually changing the current working directory: remove os.chdir from your notebook and construct absolute paths where possible.
In your init_spark_session, pass an absolute path to addPyFile. self.project_path points to the root directory of your Kedro project, so you can use it to construct the path to your PyFile accordingly, e.g. _spark_session.sparkContext.addPyFile(f'{self.project_path}/src/dist/project_name-{__version__}-py3.8.egg')
I am not sure why you need to add the PyFile at all, but maybe you have a specific reason.
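As a rough sketch of the absolute-path suggestion, the egg location can be built with pathlib so that it no longer depends on the directory the notebook was started from. The helper name egg_path and its python_tag parameter are hypothetical, introduced only for this illustration; in the real context class the first argument would be self.project_path.

```python
from pathlib import Path


def egg_path(project_path, package_name, version, python_tag="py3.8"):
    """Return the absolute path of the built egg under <project>/src/dist."""
    egg_name = f"{package_name}-{version}-{python_tag}.egg"
    # resolve() turns a relative project path into an absolute one.
    return Path(project_path).resolve() / "src" / "dist" / egg_name


# Example call with the project location from the error message in the question.
print(egg_path("/Users/user_name/Documents/app_name/kedro", "project_name", "0.1"))
```

The result can be passed to addPyFile as str(egg_path(...)), and it stays the same whether the session is started from the project root or from the notebooks folder.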
I tried the following:
1. D:\TestWorkSpace\protoc.exe is where the .exe file resides.
Method 1
1. PROTO_HOME = "D:\TestWorkSpace\"
2. Path = "%PROTO_HOME%\protoc.exe"
Method 2
1. PROTO_HOME = "D:\TestWorkSpace\"
2. Path = "%PROTO_HOME%\protoc"
but I am still unable to add protoc.exe to the classpath.
I am working on a project in which a Maven project has to be imported before development, and a prerequisite is to add protoc.exe to the classpath.
Kindly provide your valuable suggestions. Thank you.
Do you mean you want to add protoc to the PATH, not to the classpath?
Use:
PROTO_HOME = "D:\TestWorkSpace\"
Path = "%PROTO_HOME%"
Then you can call protoc.exe without specifying its path.
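The reason the directory works where the full file name did not is that each PATH entry is treated as a directory to search. A small sketch using Python's shutil.which, which performs a PATH-style lookup, illustrates the difference. The temporary directory is a hypothetical stand-in for D:\TestWorkSpace, and the empty file stands in for the real executable.

```python
import os
import shutil
import stat
import tempfile

# Hypothetical stand-in for D:\TestWorkSpace: a directory holding the tool.
tool_dir = tempfile.mkdtemp()
tool_name = "protoc.exe" if os.name == "nt" else "protoc"
tool_file = os.path.join(tool_dir, tool_name)
with open(tool_file, "w") as handle:
    handle.write("")
# Mark the placeholder as executable so the lookup accepts it on POSIX systems.
os.chmod(tool_file, os.stat(tool_file).st_mode | stat.S_IXUSR)

# Search path entry is the directory: the lookup finds the tool.
found_via_dir = shutil.which("protoc", path=tool_dir)

# Search path entry is the executable itself: nothing is found.
found_via_file = shutil.which("protoc", path=tool_file)

print(found_via_dir is not None)
print(found_via_file is None)
```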
I'd like to export my Plone session configuration to my portal product.
The session configuration is set via the ZMI -> acl_users -> session -> properties
I have tried creating a snapshot of the site but can't locate the session configuration within the snapshot XML.
Indeed, there is no GenericSetup configuration support included in plone.session; there is currently nothing that'll export it for you, nor anything to then import the settings.
You'd have to write a setup step for it instead, and configure the session plugin manually through that.
Add an import step to your configure.zcml configuration file:
<?xml version="1.0"?>
<configure
    xmlns="http://namespaces.zope.org/zope"
    xmlns:genericsetup="http://namespaces.zope.org/genericsetup">
  <genericsetup:importStep
      name="yourpackage.a_unique_id_for_your_step"
      title="Configures the plone.session plugin"
      description="Perhaps an optional description"
      handler="your.package.setuphandlers.setupPloneSession"
      />
</configure>
Next, add an empty 'sentinel' text file named youpackage.setup-plonesession.txt to the same profile directory.
Then add a setuphandlers.py module to your package (this is what handler points to in the above example):
def setupPloneSession(context):
    if context.readDataFile('youpackage.setup-plonesession.txt') is None:
        return
    portal = context.getSite()
    plugin = portal.acl_users.session
    # Configure the plugin manually
    plugin.path = '/'
    plugin.cookie_name = '__ac'
    plugin.cookie_domain = ''
    # Set up a shared auth_tkt secret
    plugin._shared_secret = 'YourSharedSecretKey'
    plugin.mod_auth_tkt = True
Note that we first test if the sentinel file is present; if you reuse your package setup elsewhere the setup step could be run multiple times if you don't do this.
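The effect of the sentinel check can be seen in isolation with a small stand-in for the setup context. This is only a sketch: FakeContext and guarded_step are hypothetical names invented here, and the fake only mimics the one behaviour the handler relies on, namely that readDataFile returns None when the file is not part of the profile being applied.

```python
class FakeContext:
    """Hypothetical stand-in for a GenericSetup import context."""

    def __init__(self, files):
        self.files = files

    def readDataFile(self, filename):
        # Mimic the behaviour the handler relies on: None for a missing file.
        return self.files.get(filename)


calls = []


def guarded_step(context):
    # Same guard as in setupPloneSession: skip profiles without the sentinel.
    if context.readDataFile('youpackage.setup-plonesession.txt') is None:
        return
    calls.append('configured')


# A profile without the sentinel file: the step does nothing.
guarded_step(FakeContext({}))
# A profile that ships the (empty) sentinel file: the step runs.
guarded_step(FakeContext({'youpackage.setup-plonesession.txt': b''}))

print(calls)  # ['configured']
```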
You'll need to refer to the plugin source to get an idea of what you can configure, I'm afraid.