How to solve "not found: value spark" error on Spark on Windows?

<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.0
      /_/
Using Scala version 2.12.15 (Java HotSpot(TM) 64-Bit Server VM, Java 17.0.2)
Type in expressions to have them evaluated.
Type :help for more information.
I'm getting a Py4JJavaError when I try to run code in a Jupyter notebook. The error lines mostly point to Spark, and the spark-shell output above is what I get when I start Spark, so I suspect Spark itself is the problem. How do I solve it? Most of the fixes I can find are for Linux machines; can someone tell me how to solve it on a Windows machine? If this isn't the issue, what else could it be?
The Py4JJavaError is:
Py4JJavaError: An error occurred while calling o118.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 1.0 failed 1 times, most recent failure: Lost task 2.0 in stage 1.0 (TID 10) (AQKT255-5420.AQ.Local executor driver): java.net.SocketException: Connection reset
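One thing worth checking first (an assumption based on the banner above, not something stated in this thread): Spark 3.2.0 officially supports Java 8 and 11, while the banner shows it running on Java 17.0.2. An unsupported JVM can make SparkSession initialisation fail, which is what leaves spark undefined in the shell. A minimal sketch for pointing spark-shell at a JDK 11 from a Windows command prompt (the JDK path is hypothetical; adjust it to your install):
rem assumption: a JDK 11 installed at this path
set JAVA_HOME=C:\Program Files\Java\jdk-11
set PATH=%JAVA_HOME%\bin;%PATH%
spark-shell
If Java is not the culprit, the other usual Windows suspect is a missing HADOOP_HOME pointing at a folder that contains winutils.exe.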

Related

How do I resolve "AttributeError: 'NoneType' object has no attribute 'origin'" when attempting to run pyspark on macOS

I have installed pyspark on macOS using brew, but I get this error when I type pyspark in zsh:
Traceback (most recent call last):
File "/opt/homebrew/bin/find_spark_home.py", line 86, in <module>
print(_find_spark_home())
File "/opt/homebrew/bin/find_spark_home.py", line 52, in _find_spark_home
module_home = os.path.dirname(find_spec("pyspark").origin)
AttributeError: 'NoneType' object has no attribute 'origin'
I've tried setting the path inside the pyspark script but then got
/opt//homebrew/Cellar/apache-spark/3.3.1/bin/load-spark-env.sh: line 2: /opt/homebrew/Cellar/apache-spark/3.3.1/libexec/bin/load-spark-env.sh: Permission denied
/opt//homebrew/Cellar/apache-spark/3.3.1/bin/load-spark-env.sh: line 2: exec: /opt/homebrew/Cellar/apache-spark/3.3.1/libexec/bin/load-spark-env.sh: cannot execute: Undefined error: 0
How do I resolve this error?
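Note that the traceback shows find_spec("pyspark") returning None, i.e. the Python interpreter that runs find_spark_home.py cannot import the pyspark module at all. You can confirm that directly (assuming python3 is the interpreter the Homebrew scripts pick up):
python3 -c 'from importlib.util import find_spec; print(find_spec("pyspark"))'
If that prints None, the interpreter and the brew-installed Spark are out of sync, which is what setting SPARK_HOME in the steps below works around.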
I first had to locate and copy the apache-spark directory to /usr/local:
sudo cp -r /opt/homebrew/Cellar/apache-spark /usr/local/Cellar/
I found the spark directory with sudo find /opt/ -name find_spark_home.py
then I set the environment variables:
export SPARK_HOME=/usr/local/Cellar/apache-spark/3.3.1/libexec
export PATH=/usr/local/Cellar/apache-spark/3.3.1/bin:$PATH
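To make those variables survive new shell sessions, one option (a sketch, assuming zsh as in the question) is to append them to ~/.zshrc:
echo 'export SPARK_HOME=/usr/local/Cellar/apache-spark/3.3.1/libexec' >> ~/.zshrc
echo 'export PATH=/usr/local/Cellar/apache-spark/3.3.1/bin:$PATH' >> ~/.zshrc
source ~/.zshrc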
After that, typing pyspark gives:
Python 3.9.6 (default, Oct 18 2022, 12:41:40)
[Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
23/01/23 13:31:11 WARN Utils: Your hostname, Reggies-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.0.20 instead (on interface en0)
23/01/23 13:31:11 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/01/23 13:31:12 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /__ / .__/\_,_/_/ /_/\_\   version 3.3.1
      /_/
Using Python version 3.9.6 (default, Oct 18 2022 12:41:40)
Spark context Web UI available at http://192.168.0.20:4040
Spark context available as 'sc' (master = local[*], app id = local-1674498672860).
SparkSession available as 'spark'.

How do I display the commit SHA1 as my application version in Quarkus?

I have a Quarkus application and I'd like to display the application version as the Git commit SHA1. How do I do that?
The buildnumber-maven-plugin exposes the current Git commit of the application as a Maven property.
Make sure your pom.xml includes it and that the src/main/resources/application.properties file is filtered by Maven, like the following:
<build>
  <resources>
    <resource>
      <directory>src/main/resources</directory>
      <filtering>true</filtering>
    </resource>
  </resources>
</build>
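For reference, a minimal declaration of the plugin itself might look like the following (the version number is an assumption; pin whatever is current for your build):
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>buildnumber-maven-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>validate</phase>
      <goals>
        <goal>create</goal>
      </goals>
    </execution>
  </executions>
</plugin>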
Add the following to your src/main/resources/application.properties:
quarkus.application.version=${buildNumber}
The buildNumber property will be resolved and replaced during the Maven build. The quarkus.application.version is a build-time property, resolved during the Quarkus build (performed by the quarkus-maven-plugin).
If all steps were performed correctly, you should see output like the following when running the application:
__  ____  __  _____   ___  __ ____  ______
 --/ __ \/ / / / _ | / _ \/ //_/ / / / __/
 -/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \
--\___\_\____/_/ |_/_/|_|_/_/|_|\____/___/
2021-07-22 11:35:44,283 INFO [org.fly.cor.int.lic.VersionPrinter] (main) Flyway Community Edition 7.11.2 by Redgate
2021-07-22 11:35:44,284 INFO [org.fly.cor.int.dat.bas.BaseDatabaseType] (main) Database: jdbc:postgresql://localhost:5432/postgres (PostgreSQL 12.7)
2021-07-22 11:35:44,321 INFO [org.fly.cor.int.com.DbMigrate] (main) Current version of schema "public": 3
2021-07-22 11:35:44,322 INFO [org.fly.cor.int.com.DbMigrate] (main) Schema "public" is up to date. No migration necessary.
2021-07-22 11:35:45,080 INFO [io.quarkus] (main) quarkus-registry afd82c886d3d6fa60d1f29df642bf6565135ccef on JVM (powered by Quarkus 2.1.0.Final) started in 2.327s. Listening on: http://0.0.0.0:8080
HINT: you can shorten the commit SHA by adding the maven.buildNumber.shortRevisionLength Maven property to your pom.xml:
<maven.buildNumber.shortRevisionLength>7</maven.buildNumber.shortRevisionLength>

Spark - How to colourise terminal output from spark-submit

When I run spark-submit it works successfully, but the output is not colourised.
(/Users/me/bai/conda-envs/spark-mllib-kmeans) me@my-mbp spark-mllib-kmeans % spark-submit spark-helloWorld.py
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/Cellar/apache-spark/3.0.1/libexec/jars/spark-unsafe_2.12-3.0.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20/12/22 12:18:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/12/22 12:18:34 INFO SparkContext: Running Spark version 3.0.1
20/12/22 12:18:34 INFO ResourceUtils: ==============================================================
20/12/22 12:18:34 INFO ResourceUtils: Resources for spark.driver:
20/12/22 12:18:34 INFO ResourceUtils: ==============================================================
20/12/22 12:18:34 INFO SparkContext: Submitted application: Simple App
...
I am using Spark version 3.0.1:
(base) me@my-mbp spark-mllib-kmeans % spark-shell --version
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/Cellar/apache-spark/3.0.1/libexec/jars/spark-unsafe_2.12-3.0.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.1
      /_/
Using Scala version 2.12.10, OpenJDK 64-Bit Server VM, 14.0.1
Branch HEAD
Compiled by user ubuntu on 2020-08-28T08:58:35Z
Revision 2b147c4cd50da32fe2b4167f97c8142102a0510d
Url https://gitbox.apache.org/repos/asf/spark.git
Type --help for more information.
I am using the default Mac terminal program on the latest macOS:
% uname -a
Darwin my-mbp.lan 20.2.0 Darwin Kernel Version 20.2.0: Wed Dec 2 20:39:59 PST 2020; root:xnu-7195.60.75~1/RELEASE_X86_64 x86_64
I would like to see the different log levels (WARN/INFO/ERROR) in different colours. Perhaps colour could also be used to differentiate the Spark framework's output from my own application's output.
Given there is so much framework-level output, plus the noise of WARNINGs caused by framework issues, I was hoping better use of colour could help me scan my output more quickly.
Is there a simple solution for this?
I see this behaviour in both the native Mac Terminal and the VS Code integrated terminal.
I saw the output line:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
I see I can create and then edit the log4j config file here, so perhaps I just need the right log4j configuration to colourise all the output.
% cd $SPARK_HOME/conf
% cp log4j.properties.template log4j.properties
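For what it's worth, the log4j 1.x that ships with Spark 3.0.1 has no built-in colour layout, so another pragmatic option is to colourise at the shell level instead of in log4j. A sketch, assuming an ANSI-capable terminal (the script name is the one from the question):
spark-submit spark-helloWorld.py 2>&1 | awk '
  /ERROR/ {print "\033[31m" $0 "\033[0m"; next}   # errors in red
  /WARN/  {print "\033[33m" $0 "\033[0m"; next}   # warnings in yellow
  /INFO/  {print "\033[32m" $0 "\033[0m"; next}   # info in green
  {print}                                         # everything else unchanged
'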
I got the same warnings, but it works properly without errors.

How can I install the PostgreSQL JDBC driver to work in Karaf OSGi?

I want to install org.postgresql/postgresql/9.4-1201-jdbc41 in Karaf but I get errors. How can I resolve these errors? Strangely, on Windows my Karaf has no errors with this Postgres JDBC driver, but on Ubuntu it does. Any clues appreciated.
Install Kar feature social_importer.kar/1.0-SNAPSHOT
java.lang.Exception: Could not start bundle mvn:org.postgresql/postgresql/9.4-1201-jdbc41 in feature(s) social_importer-1.0-SNAPSHOT: Unresolved constraint in bundle org.postgresql.jdbc41 [127]: Unable to resolve 127.0: missing requirement [127.0] osgi.wiring.package; (osgi.wiring.package=javax.transaction.xa)
Caused by: org.osgi.framework.BundleException: Unresolved constraint in bundle org.postgresql.jdbc41 [127]: Unable to resolve 127.0: missing requirement [127.0] osgi.wiring.package; (osgi.wiring.package=javax.transaction.xa)
This might be related: Apache Felix not able to access Postgres JDBC
karaf@root()> install -s wrap:mvn:postgresql/postgresql/9.4-1201-jdbc41
Bundle IDs:
Error executing command: Error installing bundles:
Unable to install bundle wrap:mvn:postgresql/postgresql/9.4-1201-jdbc41
karaf@root()> install -s mvn:postgresql/postgresql/9.4-1201-jdbc41
Bundle IDs:
Error executing command: Error installing bundles:
Unable to install bundle mvn:postgresql/postgresql/9.4-1201-jdbc41
karaf@root()>
I looked in the Karaf logs with the log level set to INFO.
Caused by: java.lang.NoClassDefFoundError: org/osgi/service/jdbc/DataSourceFactory
at org.postgresql.osgi.PGBundleActivator.start(PGBundleActivator.java:32)
at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:645)
at org.apache.felix.framework.Felix.activateBundle(Felix.java:2154)
... 11 more
Caused by: java.lang.ClassNotFoundException: org.osgi.service.jdbc.DataSourceFactory not found by org.postgresql.jdbc41 [5328]
at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1556)[org.apache.felix.framework-4.4.1.jar:]
at org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:77)[org.apache.felix.framework-4.4.1.jar:]
at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1993)[org.apache.felix.framework-4.4.1.jar:]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)[:1.8.0_25]
Provisioning Postgresql JDBC Driver to Karaf 4.0.1
        __ __                  ____
       / //_/____ _____ ____ _/ __/
      / ,<  / __ `/ ___/ __ `/ /_
     / /| |/ /_/ / /  / /_/ / __/
    /_/ |_|\__,_/_/   \__,_/_/
Apache Karaf (4.0.1)
Hit '<tab>' for a list of available commands
and '[cmd] --help' for help on a specific command.
Hit '<ctrl-d>' or type 'system:shutdown' or 'logout' to shutdown Karaf.
karaf@root()> feature:repo-add mvn:org.ops4j.pax.jdbc/pax-jdbc-features/0.7.0/xml/features
karaf@root()> feature:install pax-jdbc-spec
karaf@root()> feature:install transaction
karaf@root()> bundle:install -s mvn:org.postgresql/postgresql/9.4-1200-jdbc41
karaf@root()> service:list org.osgi.service.jdbc.DataSourceFactory
[org.osgi.service.jdbc.DataSourceFactory]
-----------------------------------------
osgi.jdbc.driver.class = org.postgresql.Driver
osgi.jdbc.driver.name = PostgreSQL JDBC Driver
osgi.jdbc.driver.version = PostgreSQL 9.4 JDBC4.1 (build 1200)
service.bundleid = 52
service.id = 113
service.scope = singleton
Provided by :
PostgreSQL JDBC Driver JDBC41 (52)
Defining a Postgres pool datasource in Karaf 4.0.1
Theory at: https://ops4j1.jira.com/wiki/display/PAXJDBC/Create+DataSource+from+config
karaf@root()> feature:install pax-jdbc-config
karaf@root()> feature:install pax-jdbc-pool-dbcp2
Create a file under KARAF_HOME/etc/org.ops4j.datasource-companymanager.cfg, where companymanager is the datasource name:
osgi.jdbc.driver.name=PostgreSQL JDBC Driver-pool-xa
serverName=localhost
databaseName=companymanager
portNumber=5432
user=postgres
password=admin
dataSourceName=companymanager
Voilà, you are done; your datasource is exposed to the OSGi registry, ready to be used at will:
karaf@root()> service:list javax.sql.DataSource
[javax.sql.DataSource]
----------------------
databaseName = companymanager
dataSourceName = companymanager
felix.fileinstall.filename = file:/C:/apache-karaf-4.0.1/etc/org.ops4j.datasource-companymanager.cfg
osgi.jdbc.driver.name = PostgreSQL JDBC Driver-pool-xa
osgi.jndi.service.name = companymanager
password = admin
portNumber = 5432
serverName = localhost
service.bundleid = 64
service.factoryPid = org.ops4j.datasource
service.id = 119
service.pid = org.ops4j.datasource.3cad9abf-49be-4868-8940-1623481b1363
service.scope = singleton
user = postgres
Provided by :
OPS4J Pax JDBC Config (64)
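One way to consume the datasource from another bundle is to look it up by the dataSourceName property shown above; a sketch assuming Blueprint (the id companymanagerDs is hypothetical):
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
  <!-- pick the pooled DataSource out of the service registry by its property -->
  <reference id="companymanagerDs"
             interface="javax.sql.DataSource"
             filter="(dataSourceName=companymanager)"/>
</blueprint>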
The next step would perhaps be to set up JPA. If you are interested, you can keep reading and get the full example code from:
https://github.com/antoniomaria/karaf4-eclipselink-jpa
Just tested with Karaf 4.0.0.M2:
OSGi compendium exports org.osgi.service.jdbc
        __ __                  ____
       / //_/____ _____ ____ _/ __/
      / ,<  / __ `/ ___/ __ `/ /_
     / /| |/ /_/ / /  / /_/ / __/
    /_/ |_|\__,_/_/   \__,_/_/
Apache Karaf (4.0.0.M2)
Hit '<tab>' for a list of available commands
and '[cmd] --help' for help on a specific command.
Hit '<ctrl-d>' or type 'system:shutdown' or 'logout' to shutdown Karaf.
karaf@root()> feature:install transaction
karaf@root()> install -s mvn:org.osgi/org.osgi.compendium/5.0.0
Bundle ID: 51
karaf@root()> install -s wrap:mvn:org.postgresql/postgresql/9.4-1201-jdbc41
Bundle ID: 52
karaf@root()> list
START LEVEL 100 , List Threshold: 50
ID | State | Lvl | Version | Name
----------------------------------------------------------------------
51 | Active | 80 | 5.0.0.201305092017 | osgi.cmpn
52 | Active | 80 | 9.4.0.build-1201 | PostgreSQL JDBC Driver JDBC41
karaf@root()>
Apache Karaf DataSources (JDBC) is an optional enterprise feature. To install the PostgreSQL driver, use the following statements:
karaf@root()> feature:install jdbc
karaf@root()> install -s mvn:org.postgresql/postgresql/9.4-1203-jdbc42
The above solution was tested on Karaf 4.0.1.

Unable to install hawtio feature in apache-karaf

I use the following two commands to install the hawtio feature in Apache Karaf:
features:addurl mvn:io.hawt/hawtio-karaf/1.4.17/xml/features
features:install hawtio
When I run "features:install hawtio" I get the following error:
"Error executing command: Could not start bundle mvn:io.hawt/hawtio-osgi-jmx/1.4.17 in feature(s) hawtio-core-1.4.17: Activator start error in bundle io.hawt.hawtio-osgi-jmx [286]"
Could you help me fix this?
Works fine for me
davsclaus:/opt/apache-karaf-2.3.7/$ bin/karaf
        __ __                  ____
       / //_/____ _____ ____ _/ __/
      / ,<  / __ `/ ___/ __ `/ /_
     / /| |/ /_/ / /  / /_/ / __/
    /_/ |_|\__,_/_/   \__,_/_/
Apache Karaf (2.3.7)
Hit '<tab>' for a list of available commands
and '[cmd] --help' for help on a specific command.
Hit '<ctrl-d>' or type 'osgi:shutdown' or 'logout' to shutdown Karaf.
karaf@root> features:addurl mvn:io.hawt/hawtio-karaf/1.4.17/xml/features
karaf@root> features:install hawtio
karaf@root> web:list
ID State Web-State Level Web-ContextPath Name
[ 86] [Active ] [Deployed ] [ 80] [/hawtio ] hawtio :: hawtio-web (1.4.17)
[ 88] [Active ] [Deployed ] [ 80] [/hawtio-karaf-terminal ] hawtio :: Karaf terminal plugin (1.4.17)
karaf@root>
And by the way, there is a shortcut to add hawtio:
features:chooseurl hawtio 1.4.17
So you need to check the logs to see what is failing for you; you can use log:display or check the data/logs directory of Apache Karaf.
I was using the wrong Java version: I was on 1.6.x, and the compatible version for hawtio 1.4.17 is Java 1.7.x.
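A quick sanity check before starting Karaf (a sketch; the JDK path is hypothetical, adjust it to your machine):
java -version                                  # hawtio 1.4.17 needs 1.7.x here
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk   # assumption: your JDK 7 location
bin/karaf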
