I have a simple script that takes coordinates and turns them into points.
df.withColumn("geometry", expr("ST_Point(CAST(home_top_1_longitude AS Decimal(24,20)), CAST(home_top_1_latitude AS Decimal(24,20)))"))
When I run it on a cluster, it works and I get a df with POINTS.
+--------------------+
| geometry|
+--------------------+
|POINT (37.77244 5...|
|POINT (37.707264 ...|
+--------------------+
When I try to test it locally, I get this error.
An exception or error caused a run to abort: org.apache.spark.sql.catalyst.analysis.FunctionRegistry.createOrReplaceTempFunction(Ljava/lang/String;Lscala/Function1;)V
java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.analysis.FunctionRegistry.createOrReplaceTempFunction(Ljava/lang/String;Lscala/Function1;)V
This is what my test looks like.
class TestApplication extends FunSuite with DataFrameSuiteBase {

  private implicit var config: Configuration = _
  private val pathSources = "src/test/resources/"
  val tb: String = pathSources + "tb.csv"

  override def beforeAll(): Unit = {
    super.beforeAll()
    config = new Configuration(TestParams.args)
    GeoSparkSQLRegistrator.registerAll(spark)
    loadTestTables(spark, config)
  }

  def loadTestTables(implicit spark: SparkSession, conf: Configuration): Unit = {
    createTable(tb, "tb")
  }

  def createTable(path: String, tableName: String)(implicit spark: SparkSession): Unit = {
    spark.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .option("delimiter", ";")
      .option("nullValue", "")
      .load(path)
      .createOrReplaceTempView(tableName)
  }

  test("tb") {
    import spark.implicits._
    implicit val sparkSession: SparkSession = spark
    spark.table("tb").withColumn("geometry", expr("ST_Point(CAST(long AS Decimal(24,20)), CAST(lat AS Decimal(24,20)))"))
  }
}
I thought there was a conflict in my dependencies, but it wouldn't have worked on the cluster if that were the case. These are the dependencies I use.
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.11</artifactId>
    <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>com.holdenkarau</groupId>
    <artifactId>spark-testing-base_2.11</artifactId>
    <version>2.2.0_0.11.0</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.datasyslab</groupId>
    <artifactId>geospark-sql_2.3</artifactId>
    <version>1.3.1</version>
</dependency>
<dependency>
    <groupId>org.datasyslab</groupId>
    <artifactId>geospark</artifactId>
    <version>1.3.1</version>
</dependency>
I have a scenario where I have Spring Boot integrated with OTEL and shipping to Honeycomb.io. I am trying to add an environment tag to each trace. I have created a class:
@Component
public class EnvironmentSpanProcessor implements SpanProcessor {

    @Value("${ENVIRONMENT}")
    private String environment;

    Queue<SpanData> spans = new LinkedBlockingQueue<>(50);

    @Override
    public void onStart(Context context, ReadWriteSpan readWriteSpan) {
        readWriteSpan.setAttribute("env", environment);
    }

    @Override
    public boolean isStartRequired() {
        return false;
    }

    @Override
    public void onEnd(ReadableSpan readableSpan) {
        this.spans.add(readableSpan.toSpanData());
    }

    @Override
    public boolean isEndRequired() {
        return true;
    }
}
I have set breakpoints in this class, and they never hit on startup, even though the bean can be seen in Actuator. I have also put breakpoints on:
SdkTracerProvider otelTracerProvider(SpanLimits spanLimits, ObjectProvider<List<SpanProcessor>> spanProcessors,
        SpanExporterCustomizer spanExporterCustomizer, ObjectProvider<List<SpanExporter>> spanExporters,
        Sampler sampler, Resource resource, SpanProcessorProvider spanProcessorProvider) {
    SdkTracerProviderBuilder sdkTracerProviderBuilder = SdkTracerProvider.builder().setResource(resource)
            .setSampler(sampler).setSpanLimits(spanLimits);
    List<SpanProcessor> processors = spanProcessors.getIfAvailable(ArrayList::new);
    processors.addAll(spanExporters.getIfAvailable(ArrayList::new).stream()
            .map(e -> spanProcessorProvider.toSpanProcessor(spanExporterCustomizer.customize(e)))
            .collect(Collectors.toList()));
    processors.forEach(sdkTracerProviderBuilder::addSpanProcessor);
    return sdkTracerProviderBuilder.build();
}
in OtelAutoConfiguration, and I am not seeing them fire on startup either.
My pom.xml relevant section is:
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-sleuth-brave</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-otel-autoconfigure</artifactId>
</dependency>
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-otlp</artifactId>
</dependency>
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-extension-trace-propagators</artifactId>
</dependency>
<dependency>
    <groupId>io.grpc</groupId>
    <artifactId>grpc-netty-shaded</artifactId>
    <version>1.47.0</version>
</dependency>
And my configuration from application.yaml
sleuth:
  enabled: true
  web:
    additional-skip-pattern: /readiness|/liveness
    client.skip-pattern: /readiness
  sampler:
    probability: 1.0
    rate: 100
  propagation:
    type: OT_TRACER
  otel:
    config:
      trace-id-ratio-based: 1.0
    log.exporter.enabled: true
    exporter:
      otlp:
        endpoint: https://api.honeycomb.io
        headers:
          x-honeycomb-team: ${TELEMETRY_API_KEY}
          x-honeycomb-dataset: app-telemetry
      sleuth-span-filter:
        enabled: true
    resource:
      enabled: true
I am getting traces, so the system itself appears to be working; however, I cannot get my env tag added.
A preemptive thank you to @marcingrzejszczak for the help so far on my gist: https://gist.github.com/fpmoles/b880ccfdef2d2138169ed398e87ec396
I'm unsure why your span processor is not being picked up by Spring and being added to your list of processors being registered with the tracer provider.
An alternative way to set process-consistent values, like environment, would be to set it as a resource attribute. This is more desirable because it is set once and delivered once per batch of spans sent to the configured backend (e.g. Honeycomb). Using a span processor adds the same attribute to every span.
This can be done in a few different ways. If you are using autoconfiguration, you can set it via a system property or environment variable. Alternatively, set it directly on the resource in your otelTracerProvider method; Resource is immutable in the OpenTelemetry Java SDK, so this is a merge rather than a setter:
Resource withEnv = resource.merge(
        Resource.create(Attributes.of(AttributeKey.stringKey("environment"), environment)));
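For completeness, here is a minimal sketch of both routes. It assumes the standard OpenTelemetry autoconfigure property names, the sampler and spanLimits arguments from the otelTracerProvider snippet in the question, and an environment value injected from configuration; it is an illustration, not the exact Sleuth/OTel autoconfiguration code.

import io.opentelemetry.sdk.trace.SdkTracerProvider;

// Route 1: no code needed. With the standard OTel autoconfigure support, a resource
// attribute can be supplied externally, e.g.
//   OTEL_RESOURCE_ATTRIBUTES=environment=production
// or -Dotel.resource.attributes=environment=production on the JVM command line.

// Route 2: build the provider with the merged resource from above.
SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
        .setResource(withEnv)      // Resource carrying the "environment" attribute
        .setSampler(sampler)
        .setSpanLimits(spanLimits)
        .build();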
FYI, Honeycomb provides OpenTelemetry Java SDK and agent distros that simplify sending data by reducing the required configuration and setting sensible defaults.
I'm trying to upload dynamic objects to an S3 bucket in my web application.
But I am struggling with a NoSuchMethodError while initializing the AmazonS3 client.
The input is a multipart image that is first saved to the local machine and then uploaded to the S3 bucket.
During the upload, the NoSuchMethodError occurs, as shown in the first image.
Exception snapshot during client initialization
The Amazon dependency used in the pom is specified in the second picture.
pom-Amazon_dependency
The following is the code used to initialize the S3 client in the application.
Map<String, String> s3Credentials = ((FSRepositoryServiceImpl) fsRepositoryService).getS3Credentials();
AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withRegion(Regions.AP_SOUTH_1)
        .withCredentials(new AWSStaticCredentialsProvider(
                new BasicAWSCredentials(s3Credentials.get("accessKeyId"), s3Credentials.get("secretAccessKey"))))
        .build();
Here are some more details about the issue
Pom dependency
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.4.3</version>
</dependency>
<dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-core-asl</artifactId>
    <version>1.9.13</version>
</dependency>
<dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-mapper-asl</artifactId>
    <version>1.9.13</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>2.13.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3 -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-s3</artifactId>
    <version>1.12.150</version>
</dependency>
Code to upload object into s3 bucket:
// Save temporary document in file format for uploading to S3
Map<String, String> s3ObjectDetails = fsRepositoryService.saveTempDocumentforS3Reference(
        basePatientDocumentsRepoPath, docContents, ext, originalImgFileName);

// Credentials for the AWS account
Map<String, String> s3Credentials = ((FSRepositoryServiceImpl) fsRepositoryService).getS3Credentials();
AmazonS3 s3 = AmazonS3ClientBuilder.standard().withRegion(Regions.AP_SOUTH_1).withCredentials(
        new AWSStaticCredentialsProvider(new BasicAWSCredentials(s3Credentials.get(<accessKeyId>),
                s3Credentials.get(<secretAccessKey>))))
        .build();

// Bucket details on the AWS cloud
Map<String, String> bucketDetails = ((FSRepositoryServiceImpl) fsRepositoryService).getBucketDetails(
        /* used this for dynamic key name for bucket */ s3ObjectDetails.get(<document Name>));
String uploadingPath = bucketDetails.get(<filePath>) + "/" + patientId + "/";
uploadingPath = uploadingPath.replace("//", "/");
String fileToUpload = s3ObjectDetails.get(<Pathofthedocument>);

PutObjectRequest objectRequest = new PutObjectRequest(bucketDetails.get(bucket),
        uploadingPath + bucketDetails.get(<key>), <fileToUpload>);
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType("plain/text");
metadata.addUserMetadata(<key Name>, bucketDetails.get(key));
objectRequest.setMetadata(metadata);
PutObjectResult uploadImagetoS3 = s3.putObject(objectRequest);
Error statement:
02:54,190 INFO :
Saving Doc 5f5ab725-0403-4554-b62c-674130c9a8f2-1643628704942 under: /data/ihealwell/patients/ImagePrescription/20808/
Jan 31, 2022 5:07:27 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet [api-dispatcher] in context with path [/IHW] threw exception [Handler processing failed; nested exception is java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.enable([Lcom/fasterxml/jackson/core/JsonParser$Feature;)Lcom/fasterxml/jackson/databind/ObjectMapper;] with root cause
java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.enable([Lcom/fasterxml/jackson/core/JsonParser$Feature;)Lcom/fasterxml/jackson/databind/ObjectMapper;
at com.amazonaws.partitions.PartitionsLoader.<init>(PartitionsLoader.java:54)
at com.amazonaws.regions.RegionMetadataFactory.create(RegionMetadataFactory.java:30)
at com.amazonaws.regions.RegionUtils.initialize(RegionUtils.java:64)
at com.amazonaws.regions.RegionUtils.getRegionMetadata(RegionUtils.java:52)
at com.amazonaws.regions.RegionUtils.getRegion(RegionUtils.java:106)
at com.amazonaws.client.builder.AwsClientBuilder.getRegionObject(AwsClientBuilder.java:256)
at com.amazonaws.client.builder.AwsClientBuilder.withRegion(AwsClientBuilder.java:245)
at com.amazonaws.client.builder.AwsClientBuilder.withRegion(AwsClientBuilder.java:232)
at com.indohealth.ihealwell.web.rest.controller.PatientResourceController.saveOnS3Resource(PatientResourceController.java:1932)
at com.indohealth.ihealwell.web.rest.controller.PatientResourceController.saveDocumentOnGlobalResource(PatientResourceController.java:1891)
at com.indohealth.ihealwell.web.rest.controller.DocumentResourceController.updateDocumentbyPatient(DocumentResourceController.java:288)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
I've got a strange problem and I hope you can help me solve it.
I am trying to pass a list of objects, where each object contains a LocalDate field (Joda-Time library), from a test service to my controller.
This is the method from my service. It returns a list of objects. Look at the dates printed out in the loop.
@RequestMapping("/getListaRecept")
@ResponseBody
public ListaRecept sendAnswer() {
    ListaRecept listaReceptFiltered = prescriptionCreator.createListaRecept();
    for (Recepta r : listaReceptFiltered.getListaRecept()) {
        System.out.println(r.toString());
    }
    return listaReceptFiltered;
}
The dates are correct:
Recepta{id=3, nazwa='nurofen', status=NOT_REALIZED, date=2017-07-27}
Recepta{id=1, nazwa='ibuprom', status=ANNULED, date=2014-12-25}
Recepta{id=2, nazwa='apap', status=REALIZED, date=2016-08-18}
Now I'm invoking this method from my Spring Boot app using RestTemplate, and the received list is printed out:
private final RestTemplate restTemplate;

public SgrService2(RestTemplateBuilder restTemplateBuilder) {
    this.restTemplate = restTemplateBuilder.build();
    this.restTemplate.getMessageConverters()
            .add(0, new StringHttpMessageConverter(Charset.forName("UTF-16")));
}

public ListaRecept getList() {
    for (Recepta r : this.restTemplate.getForObject("http://localhost:8090/getListaRecept",
            ListaRecept.class).getListaRecept()) {
        System.out.println(r.toString());
    }
    return this.restTemplate.getForObject("http://localhost:8090/getListaRecept",
            ListaRecept.class);
}
As you can see, all the dates were replaced with the current date :/
Recepta{id=3, nazwa='nurofen', status=NOT_REALIZED, date=2017-09-30}
Recepta{id=1, nazwa='ibuprom', status=ANNULED, date=2017-09-30}
Recepta{id=2, nazwa='apap', status=REALIZED, date=2017-09-30}
I have no idea what is going on...
Here are the pom dependencies:
<dependency>
    <groupId>joda-time</groupId>
    <artifactId>joda-time</artifactId>
    <version>2.9.9</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.datatype</groupId>
    <artifactId>jackson-datatype-jsr310</artifactId>
    <version>2.9.0</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>2.9.1</version>
</dependency>
Thank you in advance for your help
It seems to me that you are using the wrong Jackson module: jsr310 is for the Java 8 date types, so try the artifact jackson-datatype-joda instead and register the JodaModule.
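For reference, a minimal sketch of that change against a plain ObjectMapper; in a Spring Boot app, exposing the JodaModule as a bean should have the same effect, since Boot auto-registers Jackson Module beans (the artifact is com.fasterxml.jackson.datatype:jackson-datatype-joda).

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.datatype.joda.JodaModule;

ObjectMapper mapper = new ObjectMapper();
// Register Joda-Time support so LocalDate is handled as a date value instead of a
// plain bean (treating it as a bean is what produces the wrong, current-date values).
mapper.registerModule(new JodaModule());
// Optional: write dates as ISO-8601 strings (e.g. 2017-07-27) rather than timestamps.
mapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);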
I am getting the following exception when I try to execute a query to create a relationship between nodes:
java.lang.NoSuchMethodError:
org.neo4j.ogm.session.Session.execute(Ljava/lang/String;)V at
org.springframework.data.neo4j.template.Neo4jTemplate.execute(Neo4jTemplate.java:183)
I am using the following dependencies in pom.xml
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-ogm</artifactId>
    <version>1.1.0</version>
</dependency>
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-neo4j</artifactId>
    <version>4.0.0.M1</version>
</dependency>
My code in the service POJO looks something like this:
private void createServiceRelationships(XXX yyyy, Neo4JExtension template) throws Exception {
    Boolean zzz = yyyy.getKKKKK();
    String queryBegin = "MATCH (n:XXX {pk:'" + XXX.getPk() + "' })";
    for (String LLLL : XXXX.getLLLL()) {
        String cipherQuery = ",(m:" + Name + "ROOT" + "{pk:'"
                + Name + "ROOT"
                + "' }) CREATE (m)-[:ROOT]->(n)";
        template.execute(queryBegin + cipherQuery);
    }
}
Kindly help
There's no need to include the neo4j-ogm dependency. In fact, with SDN 4.0.0.M1 you shouldn't.
Please upgrade to SDN 4.0.0.RC1 - it contains many fixes and features since M1.
Elasticsearch/Spark serialization does not appear to play well with nested types.
For example:
public class Foo implements Serializable {

    private List<Bar> bars = new ArrayList<Bar>();

    // getters and setters

    public static class Bar implements Serializable {
    }
}
List<Foo> foos = new ArrayList<Foo>();
foos.add(new Foo());
// Note: Foo object does not contain nested Bar instances

SparkConf sc = new SparkConf();
sc.setMaster("local");
sc.setAppName("spark.app.name");
sc.set("spark.serializer", KryoSerializer.class.getName());
JavaSparkContext jsc = new JavaSparkContext(sc);

JavaRDD javaRDD = jsc.parallelize(ImmutableList.copyOf(foos));
JavaEsSpark.saveToEs(javaRDD, INDEX_NAME + "/" + TYPE_NAME);
The above code works, and documents of type Foo will be indexed within Elasticsearch.
The issue arises when the bars list in a Foo object is not empty, for instance:
Foo foo = new Foo();
Foo.Bar bar = new Foo.Bar();
foo.getBars().add(bar);
Then, when indexing to Elasticsearch, the following exception is thrown:
org.elasticsearch.hadoop.serialization.EsHadoopSerializationException:
Cannot handle type [Bar] within type [class Foo], instance [Bar ...]]
within instance [Foo@1cf628a]
using writer [org.elasticsearch.spark.serialization.ScalaValueWriter@4e635d]
at org.elasticsearch.hadoop.serialization.builder.ContentBuilder.value(ContentBuilder.java:63)
at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.doWriteObject(TemplatedBulk.java:71)
at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.write(TemplatedBulk.java:58)
at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:148)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:47)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:68)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:68)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
These are the relevant Maven dependencies
<dependency>
    <groupId>com.sksamuel.elastic4s</groupId>
    <artifactId>elastic4s_2.11</artifactId>
    <version>1.5.5</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>1.3.1</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-hadoop-cascading</artifactId>
    <version>2.1.0.Beta4</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.1.3</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-spark_2.10</artifactId>
    <version>2.1.0.Beta4</version>
</dependency>
<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-xml</artifactId>
    <version>2.11.0-M4</version>
</dependency>
What is the correct way to index when using nested types with ElasticSearch and Spark?
Thanks
A solution could be to build a JSON string from the object you're trying to save, using for example Json4s.
In this case your JavaEsSpark RDD would be an RDD of strings.
Then you simply have to call
JavaEsSpark.saveJsonToEs...
instead of
JavaEsSpark.saveToEs...
This workaround saved me countless hours of trying to figure out a way to serialize nested maps.
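To make that concrete for the Java code in the question, here is a rough sketch that uses Gson instead of Json4s (the JSON library choice is an assumption; any will do) and reuses the foos, jsc, INDEX_NAME and TYPE_NAME names from the question's snippet:

import com.google.gson.Gson;
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.api.java.JavaRDD;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

Gson gson = new Gson();
List<String> jsonDocs = new ArrayList<>();
for (Foo foo : foos) {
    // Serialize each Foo (including its nested Bar list) to a JSON string up front,
    // so es-hadoop only ships strings and never has to reflect over Bar.
    jsonDocs.add(gson.toJson(foo));
}
JavaRDD<String> jsonRDD = jsc.parallelize(jsonDocs);
JavaEsSpark.saveJsonToEs(jsonRDD, INDEX_NAME + "/" + TYPE_NAME);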
Looking at the ScalaValueWriter and JdkValueWriter code, we can see that only certain types are directly supported. Most likely the inner class is not a JavaBean or another supported type.
One day ScalaValueWriter and JdkValueWriter may support user-defined types (like Bar in our example) rather than just Java types like String, int, etc.
In the meantime, there is the following workaround. Instead of having Foo expose a List of Bar objects, internally transform the List to a Map<String, Object> and expose that.
Something like this:
private List<Map<String, Object>> bars = new ArrayList<Map<String, Object>>();

public List<Map<String, Object>> getBars() {
    return bars;
}

public void setBars(List<Bar> bars) {
    for (Bar bar : bars) {
        this.bars.add(bar.getAsMap());
    }
}
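The getAsMap() helper is not shown above; a minimal sketch of what it might look like on Bar, with purely hypothetical field names:

import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public static class Bar implements Serializable {
    // Hypothetical fields, for illustration only.
    private String name;
    private int value;

    // Flatten this Bar into the plain Map form that the ES writer can serialize.
    public Map<String, Object> getAsMap() {
        Map<String, Object> map = new HashMap<>();
        map.put("name", name);
        map.put("value", value);
        return map;
    }
}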
I suggest working with com.google.gson.Gson:
String foosJson = new Gson().toJson(foos);
Then:
Map map = new HashMap<>();
...
...
JavaRDD<Map<String, ?>> javaRDD = jsc.parallelize(ImmutableList.of(map));
JavaEsSpark.saveToEs(javaRDD, INDEX_NAME + "/" + TYPE_NAME);
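Spelling out the elided lines, one possibility (an assumption, not the original author's exact code) is to round-trip each Foo through Gson into a plain Map and save the resulting RDD of maps:

import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.spark.api.java.JavaRDD;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

Gson gson = new Gson();
Type mapType = new TypeToken<Map<String, Object>>() {}.getType();

List<Map<String, Object>> docs = new ArrayList<>();
for (Foo foo : foos) {
    // Round-trip each Foo through JSON so nested Bars become plain nested maps.
    Map<String, Object> doc = gson.fromJson(gson.toJson(foo), mapType);
    docs.add(doc);
}
JavaRDD<Map<String, Object>> javaRDD = jsc.parallelize(docs);
JavaEsSpark.saveToEs(javaRDD, INDEX_NAME + "/" + TYPE_NAME);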