ParDo did not have a ParDoPayload - go

I've written a Beam pipeline in Go that runs successfully on my local machine, but when I add --runner=dataflow to run it on Google Cloud Dataflow, it fails during setup with a vague error saying a ParDo is missing a ParDoPayload. The stack trace is entirely Java, so I'm not sure how to map this back to my Go code to figure out what I'm missing.
I've gone through and used beam.RegisterFunction() for all of my functions that emit and also used beam.RegisterType() for the top-level struct I'm passing around.
Any ideas how this error connects to the code I've written / how I can debug?
java.lang.RuntimeException: ParDo did not have a ParDoPayload
at org.apache.beam.runners.dataflow.worker.graph.RegisterNodeFunction.apply(RegisterNodeFunction.java:327)
at org.apache.beam.runners.dataflow.worker.graph.RegisterNodeFunction.apply(RegisterNodeFunction.java:97)
at java.util.function.Function.lambda$andThen$1(Function.java:88)
at org.apache.beam.runners.dataflow.worker.graph.CreateRegisterFnOperationFunction.apply(CreateRegisterFnOperationFunction.java:207)
at org.apache.beam.runners.dataflow.worker.graph.CreateRegisterFnOperationFunction.apply(CreateRegisterFnOperationFunction.java:74)
at java.util.function.Function.lambda$andThen$1(Function.java:88)
at java.util.function.Function.lambda$andThen$1(Function.java:88)
at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:346)
at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:305)
at org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:195)
at org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:123)
Caused by: org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.InvalidProtocolBufferException: Protocol message had invalid UTF-8.
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.InvalidProtocolBufferException.invalidUtf8(InvalidProtocolBufferException.java:141)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Utf8$DecodeUtil.handleTwoBytes(Utf8.java:1909)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Utf8$DecodeUtil.access$700(Utf8.java:1883)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Utf8$UnsafeProcessor.decodeUtf8(Utf8.java:1411)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Utf8.decodeUtf8(Utf8.java:340)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.CodedInputStream$ArrayDecoder.readStringRequireUtf8(CodedInputStream.java:804)
at org.apache.beam.model.pipeline.v1.RunnerApi$FunctionSpec.<init>(RunnerApi.java:55936)
at org.apache.beam.model.pipeline.v1.RunnerApi$FunctionSpec.<init>(RunnerApi.java:55897)
at org.apache.beam.model.pipeline.v1.RunnerApi$FunctionSpec$1.parsePartialFrom(RunnerApi.java:56565)
at org.apache.beam.model.pipeline.v1.RunnerApi$FunctionSpec$1.parsePartialFrom(RunnerApi.java:56559)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.CodedInputStream$ArrayDecoder.readMessage(CodedInputStream.java:883)
at org.apache.beam.model.pipeline.v1.RunnerApi$ParDoPayload.<init>(RunnerApi.java:10363)
at org.apache.beam.model.pipeline.v1.RunnerApi$ParDoPayload.<init>(RunnerApi.java:10320)
at org.apache.beam.model.pipeline.v1.RunnerApi$ParDoPayload$1.parsePartialFrom(RunnerApi.java:12633)
at org.apache.beam.model.pipeline.v1.RunnerApi$ParDoPayload$1.parsePartialFrom(RunnerApi.java:12627)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:100)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:120)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:125)
at org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:48)
at org.apache.beam.model.pipeline.v1.RunnerApi$ParDoPayload.parseFrom(RunnerApi.java:11130)
at org.apache.beam.runners.dataflow.worker.graph.RegisterNodeFunction.apply(RegisterNodeFunction.java:325)
... 10 more
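For reference, the registration pattern I described above looks roughly like this in my code (a minimal sketch; the struct and function names are placeholders, and the import path assumes the pre-v2 Go SDK layout):

package main

import (
    "context"
    "reflect"

    "github.com/apache/beam/sdks/go/pkg/beam"
    "github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
)

// MyRecord stands in for the top-level struct passed between transforms.
type MyRecord struct {
    ID   string
    Body string
}

// extractFn stands in for one of the emitting functions.
func extractFn(line string, emit func(MyRecord)) {
    emit(MyRecord{ID: line, Body: line})
}

func init() {
    // Register every emitting function and the struct that flows through the
    // pipeline so the Go SDK can serialize them for remote runners like Dataflow.
    beam.RegisterFunction(extractFn)
    beam.RegisterType(reflect.TypeOf((*MyRecord)(nil)).Elem())
}

func main() {
    beam.Init()
    p := beam.NewPipeline()
    s := p.Root()
    lines := beam.Create(s, "example line")
    beam.ParDo(s, extractFn, lines)
    if err := beamx.Run(context.Background(), p); err != nil {
        panic(err)
    }
}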

Related

How to read tomcat error log

Here is a screenshot of my Tomcat Spring Boot error log. How do I read this?
Specifically,
1.) Where do I look when it says "... 63 more"?
2.) Is the top-most error the most generic, or the most specific? I'm thinking most generic, because it says "Caused by" and then another error.
Thanks.
First, find the destination of the log (console, file, database, ...) by looking in:
application.properties (for example: spring.log.**** = ...)
or log4j.properties/log4j.yml/log4j.yaml (if using log4j)
or log4j2.properties/log4j2.yml/log4j2.yaml (if using log4j2)
or logback.properties/logback.yml (if using logback)
and then you can read the "... 63 more" frames there.
It's the most specific:
most specific error <== root cause is here
^ caused by: generic error
^ caused by: generic error
...
^ caused by: most generic error
The first line contains the exact reason for the error, such as this:
java.lang.NullPointerException
at abc.def.ghi.handleRequest(MyController.java:30)
...
The more you go down, the more information you get about the root cause. Usually you can find the cause at the bottom of the log message containing the stack trace. In your case, the cause for the exception is "No converter found capable of converting from type ...".
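To make the "Caused by" layout concrete, here is a toy example (not from either post) of how a wrapped exception prints; the "... N more" line means the remaining frames of the cause are identical to frames already printed for the enclosing exception:

public class CausedByDemo {
    public static void main(String[] args) {
        try {
            doWork();
        } catch (RuntimeException e) {
            // Prints the wrapper first, then "Caused by: java.lang.NullPointerException: root cause",
            // followed by "... N more" for the frames shared with the enclosing trace.
            e.printStackTrace();
        }
    }

    static void doWork() {
        try {
            throw new NullPointerException("root cause");
        } catch (NullPointerException e) {
            throw new RuntimeException("something went wrong while handling the request", e);
        }
    }
}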

org.apache.hive.jdbc.HiveDriver: HiveBaseResultSet has not implemented absolute()?

I just started using the driver org.apache.hive.jdbc.HiveDriver (version 1.2.1 for Spark 2) with a Spark Thrift Server (STS) (reference here).
java.sql.ResultSet defines the method absolute() (JavaDoc here), but HiveBaseResultSet seems to have chosen not to implement the method (source code here).
So now my application (built on top of SmartGWT) was doing a simple operation and I got the following error message:
=== 2017-05-13 18:06:16,980 [3-47] WARN RequestContext - dsRequest.execute() failed:
java.sql.SQLException: Method not supported
at org.apache.hive.jdbc.HiveBaseResultSet.absolute(HiveBaseResultSet.java:70)
at org.apache.commons.dbcp.DelegatingResultSet.absolute(DelegatingResultSet.java:373)
at com.isomorphic.sql.SQLDataSource.executeWindowedSelect(SQLDataSource.java:2970)
at com.isomorphic.sql.SQLDataSource.SQLExecute(SQLDataSource.java:2024)
What is the reason that the driver chose not to implement absolute()?
Is there any workaround for the limitation?
Thanks to the hint from Mark Rotteveel, I now understand this better, so let me post an answer to my own question.
Implementation of absolute() is optional
As specified in the interface documentation for ResultSet#absolute() (link), implementing absolute() is optional -- especially when the result set type is TYPE_FORWARD_ONLY.
Workaround
In my case, the result set comes from a Spark Thrift Server (STS), so I guess it is indeed forward-only. The question then became how to instruct my application to NOT make a call to absolute(), which is basically used for cursor movement.
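Outside of SmartGWT, a defensive pattern in plain JDBC could look like the sketch below; the connection URL, credentials, query and table names are placeholders, and it assumes the driver reports its result set type through ResultSet.getType():

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ForwardOnlySafeRead {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://host:port/default", "someuser", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM some_table")) {
            if (rs.getType() == ResultSet.TYPE_FORWARD_ONLY) {
                // absolute() is optional for forward-only result sets, so only move the cursor with next().
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            } else {
                rs.absolute(10); // only safe on scrollable result sets
            }
        }
    }
}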
SmartGWT-specific answer
For SmartGWT, this is controlled by a property called sqlPaging, which we can specify for an OperationBinding. The right value to use seems to be dropAtServer (more reference here). So I set my SmartGWT DataSource XML file to something like this:
<operationBindings>
  <operationBinding operationType="fetch" progressiveLoading="false"
                    sqlPaging="dropAtServer">
  </operationBinding>
</operationBindings>
After that I saw another error, which is now related to HiveConnection#commit():
java.sql.SQLException: Method not supported
at org.apache.hive.jdbc.HiveConnection.commit(HiveConnection.java:742)
at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)
at com.isomorphic.sql.SQLTransaction.commitTransaction(SQLTransaction.java:307)
at com.isomorphic.sql.SQLDataSource.commit(SQLDataSource.java:4673)
After more digging, I realized that the right property for SmartGWT to control the commit behavior is autoJoinTransactions, and I should set it to false (more reference here). After these two changes, I could get my application to talk to STS via jdbc.HiveDriver.
For anyone out there who is also trying this, here are my full settings for the driver in SmartGWT's server.properties (more reference here):
sql.defaultDatabase: perf2 # this name is picked by me, but it can be any name
sql.perf2.driver.networkProtocol: tcp
sql.perf2.driver: org.apache.hive.jdbc.HiveDriver # important
sql.perf2.database.type: generic # important
sql.perf2.autoJoinTransactions: false # important
sql.perf2.interface.type: driverManager # important
sql.perf2.driver.url: jdbc:hive2://host:port # important -- pick your host:port
sql.perf2.driver.user: someuser # important -- pick your username
sql.perf2.interface.credentialsInURL: true
sql.perf2.driver.databaseName: someDb
sql.perf2.driver.context:

Drupal, entity_metadata_wrappers and debugging

For handling entities in Drupal I'm using Entity Metadata Wrappers (the "Drupal way").
It's really easy to start coding and see all the advantages it has... except when you get a fatal error and it's not clear where it comes from.
This is what the database log shows:
EntityMetadataWrapperException: Unknown data property field_whatever. at EntityStructureWrapper->getPropertyInfo() (line 335 of /var/www/html/sites/all/modules/entity/includes/entity.wrapper.inc).
Sadly, many times that "field_whatever" is "nid", "uid" or some other very common property, so its name is spread all over my code, which makes it difficult for me to get to the origin of the error.
I'm currently doing this:
Writing a tiny piece of code and then running it to see if something fails.
Using getPropertyInfo() when handling entities with "not so common" fields.
Losing hair.
What is worse is that sometimes the error does not appear while you are coding, but a week later. So it could be anywhere...
Is there any way of handling entity metadata wrapper errors better? Can I get better information in the database log and not just a line? A backtrace maybe?
Thanks.
Well, with the devel module active (just to see the nice krumo message), we can do something like this inside our module:
<?php
// Register a custom exception handler so the next uncaught exception
// is dumped together with its full backtrace.
set_exception_handler('exception_with_trace');

function exception_with_trace($e)
{
  // dpm() comes from the devel module and pretty-prints the trace array.
  dpm($e->getTrace());
}
That will print the backtrace of the exception thrown by the entity metadata handler on the next page load (some page on your site where everything else is running fine).
You can also set the exception handler more selectively and elegantly: only for some pages, for users with a certain role, when some parameter in the URL is present, or when your Drupal site is in a certain state (e.g. when a boolean persistent variable 'exception_with_trace' is true). Under certain conditions, and with some control, you can even use it in production.
If the site does not work at all, you can include it in your settings.php file; but instead of printing the trace, write it to a file and inspect it in a different context (not Drupal, but some plain PHP file).
If the traces are too long and are causing memory problems, getting the trace as a string is also possible; see http://php.net/manual/es/exception.gettraceasstring.php
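For the settings.php case, a minimal sketch could look like this (the log file path is just an example):

<?php
// In settings.php: write the trace to a file instead of printing it,
// since Drupal may not be able to render anything at this point.
set_exception_handler('exception_trace_to_file');

function exception_trace_to_file($e)
{
  // getTraceAsString() keeps memory usage lower than getTrace() for deep traces.
  file_put_contents('/tmp/drupal_exception_trace.log',
    $e->getMessage() . PHP_EOL . $e->getTraceAsString() . PHP_EOL,
    FILE_APPEND);
}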
Hope that helps.

How to put data to Hbase without using java

Is there any way to read data from a file and put it into an HBase table without using any Java? I tried to store data from a Pig script by using
sample = LOAD '/mapr/user/username/sample.txt' AS (all:chararray);
STORE sample INTO 'hbase://sampledata' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('mysampletable:intdata');
but this gave this error message:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable
ERROR org.apache.pig.tools.grunt.Grunt - java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArrayComparable
Pig seems like a good way to import data into HBase. Check what Armon suggested about setting $PIG_CLASSPATH.
Another possibility for bulk loading data into HBase is to use dedicated tools like ImportTsv (Tab Separated Values) and CompleteBulkLoad.
http://hbase.apache.org/book/ops_mgt.html#importtsv
Well, there's the Stargate REST interface, which is usable from any language. It's not perfect, but it's worth a look.
You just need to make sure that $PIG_CLASSPATH also points at hbase.jar.
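For example (the jar path is a placeholder, and the table/column names below are taken from the question, so adjust them to your schema and HDFS paths):

# Make sure Pig can see the HBase classes before running the script
export PIG_CLASSPATH=$PIG_CLASSPATH:/path/to/hbase/hbase.jar

# Alternative without Pig: bulk load a tab-separated file with ImportTsv
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,mysampletable:intdata \
  sampledata /user/username/sample.txt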

Dojo SyntaxError: missing ) in parenthetical

I have had this same error before and posted a question to that effect, but unfortunately there was no answer. The previous error occurred under different circumstances (i.e. it was triggered when I used the Dojo Toolkit SDK, size 19M).
This time I am retrieving data from a couple of tables which have a one-to-many relationship. I am using Dojo, Doctrine and Zend Framework for my project. I have posted quite an extensive error message at [1] (the link to pastie.org) with the code and error details, along with the PHP and JavaScript that I identified as being the code involved.
When you look towards the end of the pastie, in the FIREBUG ERROR MESSAGE section, you will see a piece of JSON like
[{"id":"000001",
"name":"Adam",
"area_id":null,
"registration_date":"2011-03-08",
"loan_cycle":"0","credit_score":"100",
"created_by":null,
"borrowers":[{"id":"00000001",
"first_name":"Test",
"surname":"User",
"dob":"2006-12-09",
"personalid_no":"100000",
"gender":"Male",
"marital_status":"Marrie",
"home_number":"09866678",
"mobile_number":"09877655",
"accomodation_type":"owner",
"current_loan_cycle":"1",
"status":"Active",
"date_created":null,
"date_registered":"2009-12-11",
"created_by":null,
"Groups_id":"000001"}]
}]
It's clear that the data gets pulled from the tables. However, the code fails and I get the error message SyntaxError: missing ) in parenthetical. I have battled with this for a long time now and am at a point where I either have to abandon the application or restart the whole project. Thanks for your help.

Resources