I have JAVA_HOME set on Linux, and in the bin/karaf script one entry is:
JAVA_EXT_DIRS="${JAVA_HOME}/jre/lib/ext:${JAVA_HOME}/lib/ext:${JAVA_HOME}/jre/lib:${KARAF_HOME}/lib/ext"
When my Karaf instance is up and running and a flow is tested, the error below is thrown:
Caused by: java.lang.ClassNotFoundException: javax.xml.transform.TransformerFactoryConfigurationError not found..
But this class should be provided by rt.jar, and rt.jar is present in "${JAVA_HOME}/jre/lib", which is why I added that directory to the JAVA_EXT_DIRS entry.
But the same error persists.
I expected to get the Java libraries from Karaf.
Help me understand the cause.
Your bundle needs to import the package javax.xml.transform in its Import-Package statement.
In general you need to import every package that you actually use, with the sole exception of packages beginning with java., e.g. java.lang and java.util; this exception does not cover javax.*.
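For example, if the bundle is built with the maven-bundle-plugin, you could name the package explicitly (a sketch only; by default the plugin derives Import-Package from the code it sees, so an explicit instruction like this is mainly needed when the package was excluded or the manifest is written by hand):

<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <!-- ensure javax.xml.transform ends up in the bundle's Import-Package header -->
      <Import-Package>javax.xml.transform, *</Import-Package>
    </instructions>
  </configuration>
</plugin>

The resulting MANIFEST.MF then contains Import-Package: javax.xml.transform, ..., which lets the framework wire the package from the system bundle instead of relying on ext directories.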
Related
When packaging our app with mvn package everything works fine. Then, when we start the app with java -jar target\quarkus-app\quarkus-run.jar, it silently crashes. While debugging we found that it crashes while parsing an XML InputStream, during the initialisation of some classes.
This is the stacktrace that we had to dig out ourselves:
Exception occurred in target VM: Provider for javax.xml.parsers.DocumentBuilderFactory cannot be found
javax.xml.parsers.FactoryConfigurationError: Provider for javax.xml.parsers.DocumentBuilderFactory cannot be found
at javax.xml.parsers.DocumentBuilderFactory.newInstance(Unknown Source)
at org.optaplanner.core.impl.io.jaxb.GenericJaxbIO.parseXml(GenericJaxbIO.java:209)
at org.optaplanner.core.impl.io.jaxb.SolverConfigIO.read(SolverConfigIO.java:15)
at org.optaplanner.core.config.solver.SolverConfig.createFromXmlReader(SolverConfig.java:199)
at org.optaplanner.core.config.solver.SolverConfig.createFromXmlInputStream(SolverConfig.java:173)
at org.optaplanner.core.config.solver.SolverConfig.createFromXmlInputStream(SolverConfig.java:160)
When packaging the app as an uberjar this problem does not occur. The same is true when running in dev mode.
We use graalvm-ce-java17-22.2.0, together with Quarkus 2.11.2.Final and OptaPlanner 8.29.0.Final.
We tried to verify that there are no XML exclusions in the dependencies. We also checked that Quarkus and the quarkus maven-compiler-plugin are of the same version, and we looked into the compiled JAR files to confirm that the XML we want to read is present; if it weren't, the code would crash even earlier. The class javax.xml.parsers.DocumentBuilderFactory is not listed in quarkus-app-dependencies.txt.
Adding the quarkus-optaplanner extension helped identify the logger issue, so the problem with the silent crash is resolved. Adding quarkus-jaxp to the dependencies gets rid of the FactoryConfigurationError, and everything works as expected.
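For reference, the extra dependency is just the following (a sketch assuming Maven with the Quarkus BOM managing the version; adapt it for Gradle if needed):

<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-jaxp</artifactId>
</dependency>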
I'm trying to run a simple Spark-to-S3 app from a server, but I keep getting the error below because the server has Hadoop 2.7.3 installed, which does not include the StorageStatistics class. I have Hadoop 2.8.x defined in my pom.xml file, but I am trying to test the app by running it locally.
How can I make it stop looking for that class, or what workarounds are there to include it if I have to stay with Hadoop 2.7.3?
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StorageStatistics
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.sql.execution.datasources.DataSource.hasMetadata(DataSource.scala:301)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:344)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:441)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:425)
at com.ibm.cos.jdbc2DF$.main(jdbc2DF.scala:153)
at com.ibm.cos.jdbc2DF.main(jdbc2DF.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.StorageStatistics
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 28 more
You can't mix bits of Hadoop and expect things to work. It's not just the close coupling between internal classes in hadoop-common and hadoop-aws; it's things like the specific version of the AWS SDK that the hadoop-aws module was built with.
If you get ClassNotFoundException or NoSuchMethodError stack traces when trying to work with s3a:// URLs, a JAR version mismatch is the likely cause.
Using the RFC 2119 MUST/SHOULD/MAY terminology, here are the rules to avoid this situation:
The s3a connector is in the hadoop-aws JAR; it depends on hadoop-common and the shaded AWS SDK JAR.
All of these JARs MUST be on the classpath.
All versions of the hadoop-* JARs on your classpath MUST be exactly the same version, e.g. 3.3.1 everywhere, or 3.2.2. Otherwise: stack trace. Always.
And they MUST be exclusively of that version; there MUST NOT be multiple versions of hadoop-common, hadoop-aws etc. on the classpath. Otherwise: stack trace. Always. Usually a ClassNotFoundException indicating a mismatch between hadoop-common and hadoop-aws.
The exact missing class varies across Hadoop releases: it's the first class depended on by org.apache.hadoop.fs.s3a.S3AFileSystem which the classloader can't find; the exact class depends on which JARs are mismatched.
The AWS SDK JAR SHOULD be the huge aws-java-sdk-bundle JAR, unless you know exactly which bits of the AWS SDK stack you need and are confident that all transitive dependencies (jackson, httpclient, ...) are in your Spark distribution and compatible. Otherwise: missing classes or odd runtime issues.
There MUST NOT be any other AWS SDK jars on your classpath. Otherwise: duplicate classes and general classpath problems.
The AWS SDK version SHOULD be the one shipped. Otherwise: maybe a stack trace, maybe not. Either way, you are in self-support mode, or have opted to join a QE team for version testing.
The specific version of the AWS SDK you need can be determined from the Maven Repository.
Changing the AWS SDK versions MAY work. You get to test, and if there are compatibility problems: you get to fix. See Qualifying an AWS SDK Update for the least you should be doing.
You SHOULD use the most recent version of Hadoop you can, or the one Spark is tested with. Non-critical bug fixes do not get backported to old Hadoop releases, and the S3A and ABFS connectors are rapidly evolving. New releases will, generally, be better, stronger, faster.
If none of this works, a bug report filed on the ASF JIRA server will get closed as WORKSFORME; configuration issues aren't treated as code bugs.
Finally, the ASF documentation: The S3A Connector.
Note: that link is to the latest release. If you are using an older release it will lack features. Upgrade before complaining that the s3a connector doesn't do what the documentation says it does.
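To make the version rules above concrete, one low-risk approach is to let Spark resolve the connector and its SDK transitively, pinning hadoop-aws to the same version as the hadoop-common JARs already in your Spark distribution (a sketch only; 3.3.1 and my_job.py are placeholders, not recommendations):

spark-submit \
  --packages org.apache.hadoop:hadoop-aws:3.3.1 \
  my_job.py

Resolving via --packages should also pull in the aws-java-sdk-bundle version that hadoop-aws was built against, which avoids hand-picking SDK versions.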
I found stevel's answer above to be extremely helpful. His information inspired my write-up here. I will copy the relevant parts below. My answer is tailored to a Python/Windows context, but I suspect most points are still relevant in a JVM/Linux context.
Dependencies
This answer is intended for Python developers, so it assumes we will install Apache Spark indirectly via pip. When pip installs PySpark, it collects most dependencies automatically, as seen in .venv/Lib/site-packages/pyspark/jars. However, to enable the S3A Connector, we must track down the following dependencies manually:
JAR file: hadoop-aws
JAR file: aws-java-sdk-bundle
Executable: winutils.exe (and hadoop.dll) <-- Only needed in Windows
Constraints
Assuming we're installing Spark via pip, we can't pick the Hadoop version directly. We can only pick the PySpark version, e.g. pip install pyspark==3.1.3, which will indirectly determine the Hadoop version. For example, PySpark 3.1.3 maps to Hadoop 3.2.0.
All Hadoop JARs must have the exact same version, e.g. 3.2.0. Verify this with cd pyspark/jars && ls -l | grep hadoop. Notice that pip install pyspark automatically included some Hadoop JARs. Thus, if these Hadoop JARs are 3.2.0, then we should download hadoop-aws:3.2.0 to match.
winutils.exe must have the exact same version as Hadoop, e.g. 3.2.0. Beware, winutils releases are scarce. Thus, we must carefully pick our PySpark/Hadoop version such that a matching winutils version exists. Some PySpark/Hadoop versions do not have a corresponding winutils release, thus they cannot be used on Windows.
aws-java-sdk-bundle must be compatible with our hadoop-aws choice above. For example, hadoop-aws:3.2.0 depends on aws-java-sdk-bundle:1.11.375, which can be verified here.
Instructions
With the above constraints in mind, here is a reliable algorithm for installing PySpark with S3A support on Windows:
Find the latest available version of winutils.exe here. At the time of writing, it is 3.2.0. Place it at C:/hadoop/bin. Set the environment variable HADOOP_HOME to C:/hadoop and (important!) add %HADOOP_HOME%/bin to PATH.
Find the latest available version of PySpark that uses the Hadoop version from the previous step, e.g. 3.2.0. This can be determined by browsing PySpark's pom.xml file across each release tag. At the time of writing, it is 3.1.3.
Find the version of aws-java-sdk-bundle that hadoop-aws requires. For example, if we're using hadoop-aws:3.2.0, then we can use this page. At the time of writing, it is 1.11.375.
Create a venv and install the PySpark version from step 2.
python -m venv .venv
source .venv/Scripts/activate
pip install pyspark==3.1.3
Download the AWS JARs into PySpark's JAR directory:
cd .venv/Lib/site-packages/pyspark/jars
ls -l | grep hadoop
curl -O https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.2.0/hadoop-aws-3.2.0.jar
curl -O https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.375/aws-java-sdk-bundle-1.11.375.jar
Download winutils:
cd C:/hadoop/bin
curl -O https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.2.0/bin/winutils.exe
curl -O https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.2.0/bin/hadoop.dll
Testing
To verify your setup, try running the following script.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
    .appName('my_app')
    .master('local[*]')
    .config('spark.hadoop.fs.s3a.access.key', 'secret')
    .config('spark.hadoop.fs.s3a.secret.key', 'secret')
    .getOrCreate())
# Test reading from S3.
df = spark.read.csv('s3a://my-bucket/path/to/input/file.csv')
print(df.head(3))
# Test writing to S3.
df.write.csv('s3a://my-bucket/path/to/output')
You'll need to substitute your AWS keys and S3 paths accordingly.
If you recently updated your OS environment variables, e.g. HADOOP_HOME and PATH, you might need to close and re-open VSCode to reflect that.
I am building a Karaf OSGi application which uses the javax.smartcardio library. When I deploy it, I get the following error:
Unable to resolve 249.0: missing requirement [249.0] osgi.wiring.package; (osgi.wiring.package=javax.smartcardio)
at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeatures(FeaturesServiceImpl.java:488)[26:org.apache.karaf.features.core:2.3.2]
at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeature(FeaturesServiceImpl.java:402)[26:org.apache.karaf.features.core:2.3.2]
at Proxy532dee57_5493_41af_a3a1_bf689277fb5b.installFeature(Unknown Source)[:]
at org.apache.karaf.deployer.kar.KarArtifactInstaller.installFeatures(KarArtifactInstaller.java:189)[24:org.apache.karaf.deployer.kar:2.3.2]
at org.apache.karaf.deployer.kar.KarArtifactInstaller.install(KarArtifactInstaller.java:134)[24:org.apache.karaf.deployer.kar:2.3.2]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.install(DirectoryWatcher.java:929)[6:org.apache.felix.fileinstall:3.2.6]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.install(DirectoryWatcher.java:857)[6:org.apache.felix.fileinstall:3.2.6]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.process(DirectoryWatcher.java:483)[6:org.apache.felix.fileinstall:3.2.6]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.run(DirectoryWatcher.java:291)[6:org.apache.felix.fileinstall:3.2.6]
Caused by: org.osgi.framework.BundleException: Unresolved constraint in bundle com.isirona.drivers.neuroptics-npi200.core [249]: Unable to resolve 249.0: missing requirement [249.0] osgi.wiring.package; (osgi.wiring.package=javax.smartcardio)
at org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3826)[org.apache.felix.framework-4.0.3.jar:]
at org.apache.felix.framework.Felix.startBundle(Felix.java:1868)[org.apache.felix.framework-4.0.3.jar:]
at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:944)[org.apache.felix.framework-4.0.3.jar:]
at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:931)[org.apache.felix.framework-4.0.3.jar:]
at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeatures(FeaturesServiceImpl.java:485)[26:org.apache.karaf.features.core:2.3.2]
I want to package the javax.smartcardio library as a bundle. There are links on how to build a bundle from a .jar, but I cannot find the javax.smartcardio library in my JDK.
Is it inside the JDK? Do I have to do anything special to get access to it? Thank you.
If it's part of the JDK, make sure you export it as a system package.
To do this, edit the etc/jre.properties file.
You'll need to add the corresponding package to the section for the JVM version you use. It is usually also best to export it with a correct version, if one is known.
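A sketch of what the edit might look like (only a fragment of etc/jre.properties is shown; the real file lists many more packages, and the exact key depends on the JRE version you run on):

jre-1.7= \
 javax.accessibility, \
 javax.activation;version="1.1", \
 javax.smartcardio, \
 javax.sql

After saving the file, restart Karaf so the system bundle exports the new package.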
I am having some difficulty running a simple Pig script to import data into HBase using HBaseStorage.
The error I have encountered is given by:
Caused by: <file demo.pig, line 14, column 0> pig script failed to validate: java.lang.RuntimeException: could not instantiate 'org.apache.pig.backend.hadoop.hbase.HBaseStorage' with arguments '[rdf:predicate rdf:object]'
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.Scan.setCacheBlocks(Z)V
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.initScan(HBaseStorage.java:427)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.<init>(HBaseStorage.java:368)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.<init>(HBaseStorage.java:239)
... 29 more
According to other questions and threads, the main answer to this issue is to register the appropriate jars required for the HBaseStorage references. What stumps me is how I am supposed to identify the required JAR for a given Pig function.
I even tried to open the various jar files under the hbase and pig folders to ensure the appropriate classes are registered in the pig script.
For example, since java.lang.NoSuchMethodError was caused by org.apache.hadoop.hbase.client.Scan.setCacheBlocks(Z)V
I imported specifically the jar that contains org.apache.hadoop.hbase.client.Scan, to no avail.
Pig's documentation does not provide any obvious links or help that I can refer to.
I am using Hadoop 2.7.0, HBase 1.0.1.1., Pig 0.15.0.
If you need any other clarification, feel free to ask me again. Would really appreciate it if someone could help me out with this issue.
Also, is it better to install Hadoop and the related software from scratch, or to get one of the available Hadoop bundles directly?
There is something wrong with the released jar hbase-client-1.0.1.1.jar.
You can test it with this code; the error will show up:
import org.apache.hadoop.hbase.client.Scan;

Scan scan = new Scan();
scan.setCacheBlocks(true);
I've tried other setter functions, like setCaching; they throw the same error. Yet when I checked the source code, those functions exist. Maybe just compile hbase-client-1.0.1.1.jar manually; I'm still looking for a better solution...
============
Update to the above: I found that the root cause is an incompatibility between hbase-client-1.0.1.1.jar and older versions.
https://issues.apache.org/jira/browse/HBASE-10841
https://issues.apache.org/jira/browse/HBASE-10460
The return type of the setter methods changed, so jars compiled against the old version won't work with the current one (see the sketch below).
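A quick way to see which signature the hbase-client jar on your classpath actually exposes (a diagnostic sketch; the class name CheckScanSignature is just an example):

import java.lang.reflect.Method;
import org.apache.hadoop.hbase.client.Scan;

public class CheckScanSignature {
    public static void main(String[] args) throws Exception {
        // Pre-1.0 clients declare "void setCacheBlocks(boolean)", i.e. descriptor (Z)V;
        // 1.0+ clients return Scan instead (HBASE-10841), so callers compiled against
        // the old API fail at runtime with NoSuchMethodError ...setCacheBlocks(Z)V.
        Method m = Scan.class.getMethod("setCacheBlocks", boolean.class);
        System.out.println(m); // prints the full method signature, including the return type
    }
}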
For your question: you can modify the Pig launcher script $PIG_HOME/bin/pig and set debug=true; it will then print the runtime information.
Did you register the required jars?
The most important jars are hbase, zookeeper and guava.
I solved a similar kind of issue by registering the zookeeper jar in my Pig script, as in the sketch below.
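The REGISTER statements could look like this (paths and version numbers are examples only; point them at the jars of your own HBase/ZooKeeper installation):

REGISTER /usr/local/hbase/lib/hbase-client-1.0.1.1.jar;
REGISTER /usr/local/hbase/lib/hbase-common-1.0.1.1.jar;
REGISTER /usr/local/hbase/lib/zookeeper-3.4.6.jar;
REGISTER /usr/local/hbase/lib/guava-12.0.1.jar;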
Following the instructions at http://wiki.hl7.org/index.php?title=FHIR_Build_Process, my FHIR build is failing. I modified publish.bat to ensure it uses the correct JDK. I am running it on a Windows 7 64-bit machine with JDK 1.6 (I also tried JDK 1.7), and both fail with the same error.
Looks like some Saxon JAR hell somewhere. Any ideas?
...validate v2-tables 441sec 755MB
...validate v3-codesystems 443sec 889MB
Reference Platform Validation. 447sec 1067MB
...test adversereaction-example 447sec 1067MB
Exception in thread "main" java.lang.NoSuchMethodError: net.sf.saxon.Configuration.newConfiguration()Lnet/sf/saxon/Configuration;
at net.sf.saxon.xpath.XPathFactoryImpl.<init>(XPathFactoryImpl.java:33)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at java.lang.Class.newInstance0(Class.java:355)
at java.lang.Class.newInstance(Class.java:308)
at javax.xml.xpath.XPathFactoryFinder.loadFromService(XPathFactoryFinder.java:401)
at javax.xml.xpath.XPathFactoryFinder._newFactory(XPathFactoryFinder.java:222)
at javax.xml.xpath.XPathFactoryFinder.newFactory(XPathFactoryFinder.java:143)
at javax.xml.xpath.XPathFactory.newInstance(XPathFactory.java:185)
at javax.xml.xpath.XPathFactory.newInstance(XPathFactory.java:99)
at org.hl7.fhir.tools.publisher.Publisher.testSearchParameters(Publisher.java:2796)
at org.hl7.fhir.tools.publisher.Publisher.testSearchParameters(Publisher.java:2785)
at org.hl7.fhir.tools.publisher.Publisher.validateRoundTrip(Publisher.java:2759)
at org.hl7.fhir.tools.publisher.Publisher.validateXml(Publisher.java:2656)
at org.hl7.fhir.tools.publisher.Publisher.execute(Publisher.java:378)
at org.hl7.fhir.tools.publisher.Publisher.main(Publisher.java:281)
A workaround... do a fresh build of the publisher tool jar from source.
Following the instructions in build/buildhowto.txt I was able to build the tool jar inside Eclipse, run the Publisher successfully from inside Eclipse, and then export it as a fresh tool jar, overwriting the one I pulled from SVN. The freshly built jar then ran to completion from the command line.
It could be that there's just a problem with the version of the tools jar in SVN at the moment.
For the record I am working with Version 0.12-1953.
You have two copies of the class net.sf.saxon.Configuration on your classpath: one containing the method newConfiguration() and one not.
The method is probably called from Saxon-HE 9.x code, but the class net.sf.saxon.Configuration is being loaded from Saxon 8.x, whereas it should have been loaded from Saxon-HE 9.x, where it also exists and does have this method.
So check your dependencies to see whether Saxon 8.x is pulled in, and try replacing it with Saxon-HE 9.x; then your problem should be solved.
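One quick way to see which JAR the class is actually being loaded from (a diagnostic sketch, not part of the FHIR build itself; run it with the same classpath as the publisher, and the class name WhichSaxon is just an example):

import net.sf.saxon.Configuration;

public class WhichSaxon {
    public static void main(String[] args) {
        // Prints the location (JAR) the JVM resolved net.sf.saxon.Configuration from.
        System.out.println(Configuration.class.getProtectionDomain()
                .getCodeSource().getLocation());
    }
}

If this points at a Saxon 8.x jar, that jar is shadowing Saxon-HE 9.x.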