Tarantool - creating primary index

I am following the Tarantool documentation but I am getting an error while creating the index.
I would like to understand why this is failing, since I am following the tutorial exactly.
$ tarantool
tarantool: version 1.6.7-591-g7d4dbbb
type 'help' for interactive help
tarantool> box.cfg{listen = 3301}
2017-12-06 20:57:18.684 [15168] main/101/interactive C> version 1.6.7-591-g7d4dbbb
2017-12-06 20:57:18.684 [15168] main/101/interactive C> log level 5
2017-12-06 20:57:18.684 [15168] main/101/interactive I> mapping 1073741824 bytes for tuple arena...
2017-12-06 20:57:18.705 [15168] main/101/interactive I> initializing an empty data directory
2017-12-06 20:57:18.710 [15168] snapshot/101/main I> creating `./00000000000000000000.snap.inprogress'
2017-12-06 20:57:18.710 [15168] snapshot/101/main I> saving snapshot `./00000000000000000000.snap.inprogress'
2017-12-06 20:57:18.710 [15168] snapshot/101/main I> done
2017-12-06 20:57:18.713 [15168] iproto I> binary: started
2017-12-06 20:57:18.713 [15168] iproto I> binary: bound to 0.0.0.0:3301
2017-12-06 20:57:18.713 [15168] main/101/interactive I> ready to accept requests
---
...
tarantool> s = box.schema.space.create('tester')
2017-12-06 20:57:32.803 [15168] wal/101/main I> creating `./00000000000000000000.xlog.inprogress'
---
...
tarantool> s:create_index('primary', {
> type = 'hash',
> parts = {1, 'unsigned'}
> })
---
- error: 'Can''t create or modify index ''primary'' in space ''tester'': unknown field
type'
...

You are using Tarantool 1.6, so I believe you should have
parts = {1, 'NUM'}
in the tutorial example. The field type you show ('unsigned') is for 1.7, so one option is to upgrade your version of Tarantool. Also, in the docs you can switch between the Tarantool 1.6, 1.7 and 1.8 versions in the upper-right-hand corner.
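For reference, the 1.6-compatible version of the call from the question would look like this (same index, only the field type name changes):
s:create_index('primary', {
type = 'hash',
parts = {1, 'NUM'}
})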

Related

TransientDataAccessResourceException - R2DBC pgdb connection remains in idle in transaction

I have a Spring Boot application using WebFlux and r2dbc-postgres. I have discovered a strange issue when trying to do some DB operations in a flatMap().
Code example:
@Transactional
public Mono<Void> insertDummyFooBars() {
return Flux.fromIterable(IntStream.rangeClosed(1, 260).boxed().collect(Collectors.toList()))
.log()
.flatMap(i -> this.repository.save(FooBar.builder().foo("test-" + i).build()))
.log()
.concatMap(i -> this.repository.findAll())
.then();
}
It seems like flatMap can process at most 256 elements at a time (the default value of Queues.SMALL_BUFFER_SIZE is 256). So when I ran the code above (with 260 elements), I got a TransientDataAccessResourceException with the following message:
Cannot exchange messages because the request queue limit is exceeded; nested exception is io.r2dbc.postgresql.client.ReactorNettyClient$RequestQueueException
There is no 'Releasing R2DBC Connection' after this exception. The pgdb connection/session remains in the 'idle in transaction' state, and the app can no longer run properly once the pool's max size is reached and all of the connections are stuck in 'idle in transaction'. I think the connection should be released whether or not an exception happened.
If I use concatMap instead of flatMap, it works as expected - no exception, and the connection is released. It is also fine with flatMap when there are 256 or fewer elements.
Is it possible to force the pgdb connection to close? What should I do if I have a lot of DB operations in a flatMap like this? Should I replace all of them with concatMap? Is there a global solution for this?
Versions:
Postgres: 12.6, Spring-boot: 2.7.6
Demo project
LOG:
2022-12-08 16:32:13.092 INFO 17932 --- [actor-tcp-nio-1] reactor.Flux.Iterable.1 : | onNext(256)
2022-12-08 16:32:13.092 DEBUG 17932 --- [actor-tcp-nio-1] o.s.r2dbc.core.DefaultDatabaseClient : Executing SQL statement [INSERT INTO foo_bar (foo) VALUES ($1)]
2022-12-08 16:32:13.114 INFO 17932 --- [actor-tcp-nio-1] reactor.Flux.FlatMap.2 : onNext(FooBar(id=258, foo=test-1))
2022-12-08 16:32:13.143 DEBUG 17932 --- [actor-tcp-nio-1] o.s.r2dbc.core.DefaultDatabaseClient : Executing SQL statement [SELECT foo_bar.* FROM foo_bar]
2022-12-08 16:32:13.143 INFO 17932 --- [actor-tcp-nio-1] reactor.Flux.Iterable.1 : | request(1)
2022-12-08 16:32:13.143 INFO 17932 --- [actor-tcp-nio-1] reactor.Flux.Iterable.1 : | onNext(257)
2022-12-08 16:32:13.144 DEBUG 17932 --- [actor-tcp-nio-1] o.s.r2dbc.core.DefaultDatabaseClient : Executing SQL statement [INSERT INTO foo_bar (foo) VALUES ($1)]
2022-12-08 16:32:13.149 INFO 17932 --- [actor-tcp-nio-1] reactor.Flux.Iterable.1 : | onComplete()
2022-12-08 16:32:13.149 INFO 17932 --- [actor-tcp-nio-1] reactor.Flux.Iterable.1 : | cancel()
2022-12-08 16:32:13.160 ERROR 17932 --- [actor-tcp-nio-1] reactor.Flux.FlatMap.2 : onError(org.springframework.dao.TransientDataAccessResourceException: executeMany; SQL [INSERT INTO foo_bar (foo) VALUES ($1)]; Cannot exchange messages because the request queue limit is exceeded; nested exception is io.r2dbc.postgresql.client.ReactorNettyClient$RequestQueueException: [08006] Cannot exchange messages because the request queue limit is exceeded)
2022-12-08 16:32:13.167 ERROR 17932 --- [actor-tcp-nio-1] reactor.Flux.FlatMap.2 :
org.springframework.dao.TransientDataAccessResourceException: executeMany; SQL [INSERT INTO foo_bar (foo) VALUES ($1)]; Cannot exchange messages because the request queue limit is exceeded; nested exception is io.r2dbc.postgresql.client.ReactorNettyClient$RequestQueueException: [08006] Cannot exchange messages because the request queue limit is exceeded
at org.springframework.r2dbc.connection.ConnectionFactoryUtils.convertR2dbcException(ConnectionFactoryUtils.java:215) ~[spring-r2dbc-5.3.24.jar:5.3.24]
at org.springframework.r2dbc.core.DefaultDatabaseClient.lambda$inConnectionMany$8(DefaultDatabaseClient.java:147) ~[spring-r2dbc-5.3.24.jar:5.3.24]
at reactor.core.publisher.Flux.lambda$onErrorMap$29(Flux.java:7105) ~[reactor-core-3.4.25.jar:3.4.25]
at reactor.core.publisher.Flux.lambda$onErrorResume$30(Flux.java:7158) ~[reactor-core-3.4.25.jar:3.4.25]
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94) ~[reactor-core-3.4.25.jar:3.4.25]
I have tried changing Queues.SMALL_BUFFER_SIZE and also tried adding a concurrency value to the flatMap. It works when I reduce the value to 255, but I don't think that is a good solution.
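For what it's worth, here is a minimal sketch of the two workarounds mentioned above (bounding the flatMap concurrency, or switching to concatMap), assuming a hypothetical reactive FooBarRepository like the one used in the question:
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.springframework.transaction.annotation.Transactional;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class FooBarService {

    private final FooBarRepository repository; // hypothetical reactive repository from the question

    public FooBarService(FooBarRepository repository) {
        this.repository = repository;
    }

    @Transactional
    public Mono<Void> insertWithBoundedConcurrency() {
        return Flux.fromIterable(IntStream.rangeClosed(1, 260).boxed().collect(Collectors.toList()))
                // cap the number of in-flight inner subscriptions so the driver's request queue is not overrun
                .flatMap(i -> repository.save(FooBar.builder().foo("test-" + i).build()), 128)
                .then();
    }

    @Transactional
    public Mono<Void> insertSequentially() {
        return Flux.fromIterable(IntStream.rangeClosed(1, 260).boxed().collect(Collectors.toList()))
                // concatMap subscribes to one inner publisher at a time, so there is no queue pressure
                .concatMap(i -> repository.save(FooBar.builder().foo("test-" + i).build()))
                .then();
    }
}
concatMap processes the saves strictly one at a time, while a bounded flatMap still allows some parallelism without exceeding the request queue.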

Kafka - ERROR Stopping after connector error java.lang.IllegalArgumentException: Number of groups must be positive

Working on setting up Kafka running from our RDS Postgres 9.6 to Redshift. Using the guidelines at https://blog.insightdatascience.com/from-postgresql-to-redshift-with-kafka-connect-111c44954a6a, we have all of the infrastructure set up, and I am working on fully setting up Confluent. I'm getting the error java.lang.IllegalArgumentException: Number of groups must be positive. when trying to set stuff up. Here's my config file:
name=source-postgres
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=16
connection.url= ((correct url and information here))
mode=timestamp+incrementing
timestamp.column.name=updated_at
incrementing.column.name=id
topic.prefix=postgres_
Full error:
/usr/local/confluent$ /usr/local/confluent/bin/connect-standalone
/usr/local/confluent/etc/schema-registry/connect-avro-standalone.properties
/usr/local/confluent/etc/kafka-connect-jdbc/source-postgres.properties
SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found
binding in
[jar:file:/usr/local/confluent/share/java/kafka-serde-tools/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/usr/local/confluent/share/java/kafka-connect-elasticsearch/slf4j-simple-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/usr/local/confluent/share/java/kafka-connect-hdfs/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/usr/local/confluent/share/java/kafka/slf4j-log4j12-1.7.21.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation. SLF4J: Actual binding is of type
[org.slf4j.impl.Log4jLoggerFactory] [2018-01-29 16:49:49,820] INFO
StandaloneConfig values:
access.control.allow.methods =
access.control.allow.origin =
bootstrap.servers = [localhost:9092]
internal.key.converter = class org.apache.kafka.connect.json.JsonConverter
internal.value.converter = class org.apache.kafka.connect.json.JsonConverter
key.converter = class io.confluent.connect.avro.AvroConverter
offset.flush.interval.ms = 60000
offset.flush.timeout.ms = 5000
offset.storage.file.filename = /tmp/connect.offsets
rest.advertised.host.name = null
rest.advertised.port = null
rest.host.name = null
rest.port = 8083
task.shutdown.graceful.timeout.ms = 5000
value.converter = class io.confluent.connect.avro.AvroConverter
(org.apache.kafka.connect.runtime.standalone.StandaloneConfig:180)
[2018-01-29 16:49:49,942] INFO Logging initialized #549ms
(org.eclipse.jetty.util.log:186) [2018-01-29 16:49:50,301] INFO Kafka
Connect starting (org.apache.kafka.connect.runtime.Connect:52)
[2018-01-29 16:49:50,302] INFO Herder starting
(org.apache.kafka.connect.runtime.standalone.StandaloneHerder:70)
[2018-01-29 16:49:50,302] INFO Worker starting
(org.apache.kafka.connect.runtime.Worker:113) [2018-01-29
16:49:50,302] INFO Starting FileOffsetBackingStore with file
/tmp/connect.offsets
(org.apache.kafka.connect.storage.FileOffsetBackingStore:60)
[2018-01-29 16:49:50,304] INFO Worker started
(org.apache.kafka.connect.runtime.Worker:118) [2018-01-29
16:49:50,305] INFO Herder started
(org.apache.kafka.connect.runtime.standalone.StandaloneHerder:72)
[2018-01-29 16:49:50,305] INFO Starting REST server
(org.apache.kafka.connect.runtime.rest.RestServer:98) [2018-01-29
16:49:50,434] INFO jetty-9.2.15.v20160210
(org.eclipse.jetty.server.Server:327) Jan 29, 2018 4:49:51 PM
org.glassfish.jersey.internal.Errors logErrors WARNING: The following
warnings have been detected: WARNING: The (sub)resource method
listConnectors in
org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource
contains empty path annotation. WARNING: The (sub)resource method
createConnector in
org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource
contains empty path annotation. WARNING: The (sub)resource method
listConnectorPlugins in
org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource
contains empty path annotation. WARNING: The (sub)resource method
serverInfo in
org.apache.kafka.connect.runtime.rest.resources.RootResource contains
empty path annotation. [2018-01-29 16:49:51,385] INFO Started
o.e.j.s.ServletContextHandler#5aabbb29{/,null,AVAILABLE}
(org.eclipse.jetty.server.handler.ContextHandler:744) [2018-01-29
16:49:51,409] INFO Started
ServerConnector#54dab9ac{HTTP/1.1}{0.0.0.0:8083}
(org.eclipse.jetty.server.ServerConnector:266) [2018-01-29
16:49:51,409] INFO Started #2019ms
(org.eclipse.jetty.server.Server:379) [2018-01-29 16:49:51,410] INFO
REST server listening at http://127.0.0.1:8083/, advertising URL
http://127.0.0.1:8083/
(org.apache.kafka.connect.runtime.rest.RestServer:150) [2018-01-29
16:49:51,410] INFO Kafka Connect started
(org.apache.kafka.connect.runtime.Connect:58) [2018-01-29
16:49:51,412] INFO ConnectorConfig values:
connector.class = io.confluent.connect.jdbc.JdbcSourceConnector
key.converter = null
name = source-postgres
tasks.max = 16
value.converter = null (org.apache.kafka.connect.runtime.ConnectorConfig:180) [2018-01-29
16:49:51,413] INFO Creating connector source-postgres of type
io.confluent.connect.jdbc.JdbcSourceConnector
(org.apache.kafka.connect.runtime.Worker:159) [2018-01-29
16:49:51,416] INFO Instantiated connector source-postgres with version
3.1.2 of type class io.confluent.connect.jdbc.JdbcSourceConnector (org.apache.kafka.connect.runtime.Worker:162) [2018-01-29
16:49:51,419] INFO JdbcSourceConnectorConfig values:
batch.max.rows = 100
connection.url =
incrementing.column.name = id
mode = timestamp+incrementing
poll.interval.ms = 5000
query =
schema.pattern = null
table.blacklist = []
table.poll.interval.ms = 60000
table.types = [TABLE]
table.whitelist = []
timestamp.column.name = updated_at
timestamp.delay.interval.ms = 0
topic.prefix = postgres_
validate.non.null = true (io.confluent.connect.jdbc.source.JdbcSourceConnectorConfig:180)
[2018-01-29 16:49:52,129] INFO Finished creating connector
source-postgres (org.apache.kafka.connect.runtime.Worker:173)
[2018-01-29 16:49:52,130] INFO SourceConnectorConfig values:
connector.class = io.confluent.connect.jdbc.JdbcSourceConnector
key.converter = null
name = source-postgres
tasks.max = 16
value.converter = null (org.apache.kafka.connect.runtime.SourceConnectorConfig:180)
[2018-01-29 16:49:52,209] ERROR Stopping after connector error
(org.apache.kafka.connect.cli.ConnectStandalone:102)
java.lang.IllegalArgumentException: Number of groups must be positive.
at org.apache.kafka.connect.util.ConnectorUtils.groupPartitions(ConnectorUtils.java:45)
at io.confluent.connect.jdbc.JdbcSourceConnector.taskConfigs(JdbcSourceConnector.java:123)
at org.apache.kafka.connect.runtime.Worker.connectorTaskConfigs(Worker.java:193)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.recomputeTaskConfigs(StandaloneHerder.java:251)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.updateConnectorTasks(StandaloneHerder.java:281)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:163)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:96)
[2018-01-29 16:49:52,210] INFO Kafka Connect stopping
(org.apache.kafka.connect.runtime.Connect:68) [2018-01-29
16:49:52,210] INFO Stopping REST server
(org.apache.kafka.connect.runtime.rest.RestServer:154) [2018-01-29
16:49:52,213] INFO Stopped
ServerConnector#54dab9ac{HTTP/1.1}{0.0.0.0:8083}
(org.eclipse.jetty.server.ServerConnector:306) [2018-01-29
16:49:52,218] INFO Stopped
o.e.j.s.ServletContextHandler#5aabbb29{/,null,UNAVAILABLE}
(org.eclipse.jetty.server.handler.ContextHandler:865) [2018-01-29
16:49:52,224] INFO REST server stopped
(org.apache.kafka.connect.runtime.rest.RestServer:165) [2018-01-29
16:49:52,224] INFO Herder stopping
(org.apache.kafka.connect.runtime.standalone.StandaloneHerder:76)
[2018-01-29 16:49:52,224] INFO Stopping connector source-postgres
(org.apache.kafka.connect.runtime.Worker:218) [2018-01-29
16:49:52,225] INFO Stopping table monitoring thread
(io.confluent.connect.jdbc.JdbcSourceConnector:137) [2018-01-29
16:49:52,225] INFO Stopped connector source-postgres
(org.apache.kafka.connect.runtime.Worker:229) [2018-01-29
16:49:52,225] INFO Worker stopping
(org.apache.kafka.connect.runtime.Worker:122) [2018-01-29
16:49:52,225] INFO Stopped FileOffsetBackingStore
(org.apache.kafka.connect.storage.FileOffsetBackingStore:68)
[2018-01-29 16:49:52,225] INFO Worker stopped
(org.apache.kafka.connect.runtime.Worker:142) [2018-01-29
16:49:57,334] INFO Reflections took 6952 ms to scan 263 urls,
producing 12036 keys and 80097 values
(org.reflections.Reflections:229) [2018-01-29 16:49:57,346] INFO
Herder stopped
(org.apache.kafka.connect.runtime.standalone.StandaloneHerder:86)
[2018-01-29 16:49:57,346] INFO Kafka Connect stopped
(org.apache.kafka.connect.runtime.Connect:73)
We were using DMS between our RDS Postgres (9.6) and Redshift. It has been failing, has been simply miserable, and at this point is becoming almost unmanageably expensive, so we are moving to this as a possible solution. I am kind of at a wall here and would really like to get some help on this.
I'm working on a very similar issue to this, and what I found is that if the connector doesn't have configuration telling it what to pull, it will simply error. Try adding the following to your connector configuration:
table.whitelist=
and then specify a list of tables to grab.
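For example (the table names here are only placeholders):
table.whitelist=my_first_table,my_second_table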
I had this error with a JDBC Source Connector job. The issue was that the table.whitelist setting was case sensitive, even though the underlying DB wasn't (the RDBMS was MS Sql Server).
So my table was tableName, and I had "table.whitelist": "tablename". This failed, and I got the above error. Changing it to "table.whitelist": "tableName" fixed the error.
This is despite the fact that SELECT * FROM tablename and SELECT * FROM tableName both work in MS Sql Manager.

Unable to perform sum operation in pig

I am trying to perform a SUM operation on my data in Pig, but it is not accepting the explicit type cast. I have tried replacing (int) with (double) while performing the sum.
Code
drivers = LOAD '/sachin/drivers.csv' USING PigStorage(',');
time = LOAD '/sachin/timesheet.csv' USING PigStorage(',');
drivdata = FILTER drivers BY $0>1;
timedata = filter time by $0>0;
drivgrp = group timedata by $0;
drivinfo = foreach drivgrp generate group as id , SUM(timedata.$2) as totalhr , SUM(timedata.$3) as totmillogged;
drivfinal = foreach drivdata generate $0 as id , $1 as name;
result = join drivfinal by id , drivinfo by id;
finalres = foreach result generate $0 as id, $1 as name, $3 as hrslogged, $4 as mileslogged;
summile = foreach finalres generate (int)SUM(mileslogged);
DUMP summile;
Error Message
grunt> exec /home/sachin/sec.pig
2017-12-13 21:57:58,812 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_INT 1 time(s).
2017-12-13 21:57:58,854 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_INT 2 time(s).
2017-12-13 21:57:58,996 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_INT 2 time(s).
2017-12-13 21:57:59,036 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_INT 2 time(s).
2017-12-13 21:57:59,080 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_INT 2 time(s).
2017-12-13 21:57:59,121 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_INT 2 time(s).
2017-12-13 21:57:59,192 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_INT 2 time(s).
2017-12-13 21:57:59,246 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: <line 10, column 41> Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast.
Details at logfile: /home/sachin/pig_1513175202309.log
grunt>
I am actually trying, for each driver in the top 5 list, to find the miles logged and the percentage of miles logged by that driver over the total miles logged, and to store the result in HDFS.
Link for the dataset: https://raw.githubusercontent.com/hortonworks/data-tutorials/master/tutorials/hdp/how-to-process-data-with-apache-pig/assets/driver_data.zip
Can anyone help me solve this problem or help me understand what is going wrong here?
You have to cast mileslogged and then call the SUM function:
finalres = foreach result generate $0 as id, $1 as name, $3 as hrslogged, (int)$4 as mileslogged;
summile = foreach finalres generate SUM(mileslogged);
Also, I noticed that you are not specifying the datatypes in the load statement. The default datatype is bytearray, and I doubt you will get the correct result if you don't explicitly cast the fields in the subsequent steps.
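As a sketch, a schema could be declared at load time instead (the column names here are illustrative, matching the positions used above: $0 = driver id, $2 = hours, $3 = miles):
time = LOAD '/sachin/timesheet.csv' USING PigStorage(',')
AS (driverId:int, week:int, hours_logged:int, miles_logged:int);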
From
http://pig.apache.org/docs/r0.17.0/func.html#sum
SUM is defined as
Computes the sum of the numeric values in a single-column bag. SUM requires a preceding GROUP ALL statement for global sums and a GROUP BY statement for group sums.
Your code is passing a double, whereas SUM requires a BAG containing doubles. There is no need to typecast, but you do need to group before calling the SUM function.
allres = group finalres ALL;
summile = foreach allres generate SUM(finalres.mileslogged);
DUMP summile;

Multiple Hive joins failing with Execution Error, return code 2

I'm trying to execute a query in which a table is left outer joined with two other tables. The query is given below:
SELECT T.Rdate, c.Specialty_Cruises, b.Specialty_Cruises from arunf.PASSENGER_HISTORY_FACT T
LEFT OUTER JOIN arunf.RPT_WEB_COURTESY_HOLD_TEMP C on (unix_timestamp(T.RDATE,'yyyy-MM-dd')=unix_timestamp(c.rdate,'yyyy-MM-dd') AND T.book_num = c.Courtesy_Hold_Booking_Num)
LEFT OUTER JOIN arunf.RPT_WEB_BOOKING_NUM_TEMP b ON (unix_timestamp(T.RDATE,'yyyy-MM-dd')=unix_timestamp(b.rdate,'yyyy-MM-dd') AND T.book_num = B.Online_Booking_Number);
This query fails with the notification:
: exec.Task (SessionState.java:printError(922)) - /tmp/arunf/hive.log
: mr.MapredLocalTask (MapredLocalTask.java:executeInChildVM(308)) - Execution failed with exit status: 2
: ql.Driver (SessionState.java:printError(922)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
The error logs contain the following:
2015-12-01 10:25:16,077 INFO [main]: mr.ExecDriver (SessionState.java:printInfo(913)) - Execution log at: /tmp/arunf/arunf_20151201102525_914a2eab-652b-440c-9fdc-a473b4caa026.log
2015-12-01 10:25:16,278 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(118)) - <PERFLOG method=deserializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
2015-12-01 10:25:16,278 INFO [main]: exec.Utilities (Utilities.java:deserializePlan(953)) - Deserializing MapredLocalWork via kryo
2015-12-01 10:25:16,421 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(158)) - </PERFLOG method=deserializePlan start=1448983516278 end=1448983516421 duration=143 from=org.apache.hadoop.hive.ql.exec.Utilities>
2015-12-01 10:25:16,429 INFO [main]: mr.MapredLocalTask (SessionState.java:printInfo(913)) - 2015-12-01 10:25:16 Starting to launch local task to process map join; maximum memory = 1029701632
2015-12-01 10:25:16,498 INFO [main]: mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(441)) - fetchoperator for c created
2015-12-01 10:25:16,500 INFO [main]: mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(441)) - fetchoperator for b created
2015-12-01 10:25:16,500 INFO [main]: exec.TableScanOperator (Operator.java:initialize(346)) - Initializing Self TS[2]
2015-12-01 10:25:16,500 INFO [main]: exec.TableScanOperator (Operator.java:initializeChildren(419)) - Operator 2 TS initialized
2015-12-01 10:25:16,500 INFO [main]: exec.TableScanOperator (Operator.java:initializeChildren(423)) - Initializing children of 2 TS
2015-12-01 10:25:16,500 INFO [main]: exec.HashTableSinkOperator (Operator.java:initialize(458)) - Initializing child 1 HASHTABLESINK
2015-12-01 10:25:16,500 INFO [main]: exec.TableScanOperator (Operator.java:initialize(394)) - Initialization Done 2 TS
2015-12-01 10:25:16,500 INFO [main]: mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(461)) - fetchoperator for b initialized
2015-12-01 10:25:16,500 INFO [main]: exec.TableScanOperator (Operator.java:initialize(346)) - Initializing Self TS[0]
2015-12-01 10:25:16,501 INFO [main]: exec.TableScanOperator (Operator.java:initializeChildren(419)) - Operator 0 TS initialized
2015-12-01 10:25:16,501 INFO [main]: exec.TableScanOperator (Operator.java:initializeChildren(423)) - Initializing children of 0 TS
2015-12-01 10:25:16,502 INFO [main]: exec.HashTableSinkOperator (Operator.java:initialize(458)) - Initializing child 1 HASHTABLESINK
2015-12-01 10:25:16,503 INFO [main]: exec.HashTableSinkOperator (Operator.java:initialize(346)) - Initializing Self HASHTABLESINK[1]
2015-12-01 10:25:16,503 INFO [main]: mapjoin.MapJoinMemoryExhaustionHandler (MapJoinMemoryExhaustionHandler.java:<init>(61)) - JVM Max Heap Size: 1029701632
2015-12-01 10:25:16,533 ERROR [main]: mr.MapredLocalTask (MapredLocalTask.java:executeInProcess(357)) - Hive Runtime Error: Map local work failed
java.lang.RuntimeException: cannot find field courtesy_hold_booking_num from [0:rdate, 1:online_booking_number, 2:pages, 3:mobile_device_type, 4:specialty_cruises]
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:410)
at org.apache.hadoop.hive.serde2.BaseStructObjectInspector.getStructFieldRef(BaseStructObjectInspector.java:133)
at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:68)
at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:138)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:460)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:366)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:346)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:743)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Please note that when the main table is left outer joined with each of these tables separately, the queries succeed.
For example, the queries below succeed:
SELECT T.Rdate from arunf.PASSENGER_HISTORY_FACT T
LEFT OUTER JOIN arunf.RPT_WEB_COURTESY_HOLD_TEMP C on (unix_timestamp(T.RDATE,'yyyy-MM-dd')=unix_timestamp(c.rdate,'yyyy-MM-dd') AND T.book_num = c.Courtesy_Hold_Booking_Num);
SELECT T.Rdate from arunf.PASSENGER_HISTORY_FACT T
LEFT OUTER JOIN arunf.RPT_WEB_BOOKING_NUM_TEMP b ON (unix_timestamp(T.RDATE,'yyyy-MM-dd')=unix_timestamp(b.rdate,'yyyy-MM-dd') AND T.book_num = B.Online_Booking_Number);
I'm also able to do a left outer join of this main table with two other tables in the same combined manner. I'm facing this issue only when I try to left join the main table with these two secondary tables.
Kindly provide your insights on this issue.
Hive bugs come and go. It may depend on Hive version (?) and the table format (text? AVRO? Sequence? ORC? Parquet?).
Now, if each query appears to work, why don't you try a workaround based on the divide-and-conquer approach (or: if Hive is not smart enough to design an execution plan, then let's design it ourselves) e.g.
SELECT TC.RDate, TC.Specialty_Cruises, B.Specialty_Cruises
FROM
(SELECT T.Rdate, T.book_num, C.Specialty_Cruises
FROM arunf.PASSENGER_HISTORY_FACT T
LEFT JOIN arunf.RPT_WEB_COURTESY_HOLD_TEMP C
ON unix_timestamp(T.RDate,'yyyy-MM-dd')=unix_timestamp(C.RDate,'yyyy-MM-dd')
AND T.book_num = C.Courtesy_Hold_Booking_Num
) TC
LEFT JOIN arunf.RPT_WEB_BOOKING_NUM_TEMP B
ON unix_timestamp(TC.RDate,'yyyy-MM-dd')=unix_timestamp(B.RDate,'yyyy-MM-dd')
AND TC.book_num = B.Online_Booking_Number
;

How to enlarge data column in SonarQube?

I'm trying to check my source code with cppcheck and SonarQube.
When I run sonar-runner, I get the error below:
SonarQube Runner 2.4
Java 1.6.0_33 Sun Microsystems Inc. (64-bit)
Linux 3.11.0-26-generic amd64
INFO: Error stacktraces are turned on.
INFO: Runner configuration file: /var/lib/jenkins/sonarqube/sonar-runner-2.4/conf/sonar-runner.properties
INFO: Project configuration file: /var/lib/jenkins/jobs/MIP35.KT.Centrex.Branch/workspace/hudson_mvmw/sonar-project.properties
INFO: Default locale: "ko_KR", source code encoding: "UTF-8"
INFO: Work directory: /data/jenkins/jobs/MIP35.KT.Centrex.Branch/workspace/hudson_mvmw/./.sonar
INFO: SonarQube Server 4.5
16:23:56.070 INFO - Load global referentials...
16:23:56.152 INFO - Load global referentials done: 84 ms
16:23:56.158 INFO - User cache: /var/lib/jenkins/.sonar/cache
16:23:56.164 INFO - Install plugins
16:23:56.273 INFO - Install JDBC driver
16:23:56.278 INFO - Create JDBC datasource for jdbc:mysql://localhost:3306/sonar?useUnicode=true&characterEncoding=utf8
16:23:57.156 INFO - Initializing Hibernate
16:23:57.990 INFO - Load project referentials...
16:23:58.522 INFO - Load project referentials done: 532 ms
16:23:58.522 INFO - Load project settings
16:23:58.788 INFO - Loading technical debt model...
16:23:58.809 INFO - Loading technical debt model done: 21 ms
16:23:58.811 INFO - Apply project exclusions
16:23:58.962 INFO - ------------- Scan mvmw for KT centrex at branch
16:23:58.968 INFO - Load module settings
16:23:59.939 INFO - Language is forced to c++
16:23:59.940 INFO - Loading rules...
16:24:00.558 INFO - Loading rules done: 618 ms
16:24:00.576 INFO - Configure Maven plugins
16:24:00.660 INFO - No quality gate is configured.
16:24:00.759 INFO - Base dir: /data/jenkins/jobs/MIP35.KT.Centrex.Branch/workspace/hudson_mvmw/.
16:24:00.759 INFO - Working dir: /data/jenkins/jobs/MIP35.KT.Centrex.Branch/workspace/hudson_mvmw/./.sonar
16:24:00.760 INFO - Source paths: moimstone
16:24:00.760 INFO - Source encoding: UTF-8, default locale: ko_KR
16:24:00.760 INFO - Index files
16:24:20.825 INFO - 13185 files indexed
16:26:35.895 WARN - SQL Error: 1406, SQLState: 22001
16:26:35.895 ERROR - Data truncation: Data too long for column 'data' at row 1
INFO: ------------------------------------------------------------------------
INFO: EXECUTION FAILURE
INFO: ------------------------------------------------------------------------
Total time: 2:40.236s
Final Memory: 27M/1765M
INFO: ------------------------------------------------------------------------
ERROR: Error during Sonar runner execution
org.sonar.runner.impl.RunnerException: Unable to execute Sonar
at org.sonar.runner.impl.BatchLauncher$1.delegateExecution(BatchLauncher.java:91)
at org.sonar.runner.impl.BatchLauncher$1.run(BatchLauncher.java:75)
at java.security.AccessController.doPrivileged(Native Method)
at org.sonar.runner.impl.BatchLauncher.doExecute(BatchLauncher.java:69)
at org.sonar.runner.impl.BatchLauncher.execute(BatchLauncher.java:50)
at org.sonar.runner.api.EmbeddedRunner.doExecute(EmbeddedRunner.java:102)
at org.sonar.runner.api.Runner.execute(Runner.java:100)
at org.sonar.runner.Main.executeTask(Main.java:70)
at org.sonar.runner.Main.execute(Main.java:59)
at org.sonar.runner.Main.main(Main.java:53)
Caused by: org.sonar.api.utils.SonarException: Unable to read and import the source file : '/data/jenkins/jobs/MIP35.KT.Centrex.Branch/workspace/hudson_mvmw/moimstone/mgrs/mUIMgr/gui/resource/wideBasicStyle/320Wx240H/imageMerged.c' with the charset : 'UTF-8'.
at org.sonar.batch.scan.filesystem.ComponentIndexer.importSources(ComponentIndexer.java:96)
at org.sonar.batch.scan.filesystem.ComponentIndexer.execute(ComponentIndexer.java:79)
at org.sonar.batch.scan.filesystem.DefaultModuleFileSystem.index(DefaultModuleFileSystem.java:245)
at org.sonar.batch.phases.PhaseExecutor.execute(PhaseExecutor.java:111)
at org.sonar.batch.scan.ModuleScanContainer.doAfterStart(ModuleScanContainer.java:194)
at org.sonar.api.platform.ComponentContainer.startComponents(ComponentContainer.java:92)
at org.sonar.api.platform.ComponentContainer.execute(ComponentContainer.java:77)
at org.sonar.batch.scan.ProjectScanContainer.scan(ProjectScanContainer.java:233)
at org.sonar.batch.scan.ProjectScanContainer.scanRecursively(ProjectScanContainer.java:228)
at org.sonar.batch.scan.ProjectScanContainer.doAfterStart(ProjectScanContainer.java:221)
at org.sonar.api.platform.ComponentContainer.startComponents(ComponentContainer.java:92)
at org.sonar.api.platform.ComponentContainer.execute(ComponentContainer.java:77)
at org.sonar.batch.scan.ScanTask.scan(ScanTask.java:64)
at org.sonar.batch.scan.ScanTask.execute(ScanTask.java:51)
at org.sonar.batch.bootstrap.TaskContainer.doAfterStart(TaskContainer.java:125)
at org.sonar.api.platform.ComponentContainer.startComponents(ComponentContainer.java:92)
at org.sonar.api.platform.ComponentContainer.execute(ComponentContainer.java:77)
at org.sonar.batch.bootstrap.BootstrapContainer.executeTask(BootstrapContainer.java:173)
at org.sonar.batch.bootstrapper.Batch.executeTask(Batch.java:95)
at org.sonar.batch.bootstrapper.Batch.execute(Batch.java:67)
at org.sonar.runner.batch.IsolatedLauncher.execute(IsolatedLauncher.java:48)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.sonar.runner.impl.BatchLauncher$1.delegateExecution(BatchLauncher.java:87)
... 9 more
Caused by: javax.persistence.PersistenceException: Unable to persist : SnapshotSource[snapshot_id=53035,data=#if defined(__cplusplus)
#pragma hdrstop
#endif
#include "Prj_pcx2_resource.h"
#if defined(__cplusplus)
#pragma package(smart_init)
#endif
const rgb24_type Prj_Bg_call_ColorTable[59] PCX2_SEGMENT =
{
{0xFF,0xFF,0xFF}, {0xFE,0xFE,0xFE}, {0xE7,0xE7,0xE7}, {0xC7,0xC7,0xC7}, {0x9B,0x9B,0x9B}, {0xFD,0xFD,0xFD}, {0xCF,0xCF,0xCF}, {0xA8,0xA8,0xA8}, {0xBC,0xBC,0xBC}, {0xD6,0xD6,0xD6},
{0xDC,0xDC,0xDC}, {0xCE,0xCE,0xCE}, {0xB5,0xB5,0xB5}, {0xD0,0xD0,0xD0}, {0xE1,0xE1,0xE1}, {0xA7,0xA7,0xA7}, {0xFA,0xFA,0xFA}, {0xBE,0xBE,0xBE}, {0xBB,0xBB,0xBB}, {0xF3,0xF3,0xF3},
{0x9A,0x9A,0x9A}, {0xEC,0xEC,0xEC}, {0xE9,0xE9,0xE9}, {0x99,0x99,0x99}, {0x98,0x98,0x98}, {0x97,0x97,0x97}, {0x96,0x96,0x96}, {0x95,0x95,0x95}, {0x94,0x94,0x94}, {0x93,0x93,0x93},
{0x92,0x92,0x92}, {0x91,0x91,0x91}, {0x90,0x90,0x90}, {0x8F,0x8F,0x8F}, {0x8E,0x8E,0x8E}, {0x8D,0x8D,0x8D}, {0x8C,0x8C,0x8C}, {0x8B,0x8B,0x8B}, {0x8A,0x8A,0x8A}, {0x89,0x89,0x89},
{0x88,0x88,0x88}, {0x87,0x87,0x87...]
at org.sonar.jpa.session.JpaDatabaseSession.internalSave(JpaDatabaseSession.java:136)
at org.sonar.jpa.session.JpaDatabaseSession.save(JpaDatabaseSession.java:103)
at org.sonar.batch.index.SourcePersister.saveSource(SourcePersister.java:47)
at org.sonar.batch.index.DefaultPersistenceManager.setSource(DefaultPersistenceManager.java:68)
at org.sonar.batch.index.DefaultIndex.setSource(DefaultIndex.java:467)
at org.sonar.batch.scan.filesystem.ComponentIndexer.importSources(ComponentIndexer.java:93)
... 34 more
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.DataException: could not insert: [org.sonar.api.database.model.SnapshotSource]
at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:614)
at org.hibernate.ejb.AbstractEntityManagerImpl.persist(AbstractEntityManagerImpl.java:226)
at org.sonar.jpa.session.JpaDatabaseSession.internalSave(JpaDatabaseSession.java:130)
... 39 more
Caused by: org.hibernate.exception.DataException: could not insert: [org.sonar.api.database.model.SnapshotSource]
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:100)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:64)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2176)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2656)
at org.hibernate.action.EntityIdentityInsertAction.execute(EntityIdentityInsertAction.java:71)
at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:279)
at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:321)
at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:204)
at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:130)
at org.hibernate.ejb.event.EJB3PersistEventListener.saveWithGeneratedId(EJB3PersistEventListener.java:49)
at org.hibernate.event.def.DefaultPersistEventListener.entityIsTransient(DefaultPersistEventListener.java:154)
at org.hibernate.event.def.DefaultPersistEventListener.onPersist(DefaultPersistEventListener.java:110)
at org.hibernate.event.def.DefaultPersistEventListener.onPersist(DefaultPersistEventListener.java:61)
at org.hibernate.impl.SessionImpl.firePersist(SessionImpl.java:646)
at org.hibernate.impl.SessionImpl.persist(SessionImpl.java:620)
at org.hibernate.impl.SessionImpl.persist(SessionImpl.java:624)
at org.hibernate.ejb.AbstractEntityManagerImpl.persist(AbstractEntityManagerImpl.java:220)
... 40 more
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'data' at row 1
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4235)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4169)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2617)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2778)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2825)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2156)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2459)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2376)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2360)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
at org.hibernate.id.IdentityGenerator$GetGeneratedKeysDelegate.executeAndExtract(IdentityGenerator.java:94)
at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:57)
... 55 more
ERROR:
ERROR: Re-run SonarQube Runner using the -X switch to enable full debug logging.
I have a huge source file, which is an image data file; it's over 100 megabytes.
How can I enlarge the data column? Is there a setting for it?
There's no point in analyzing such a file: SonarQube won't give you useful information about it. And this is true for any other file like this one.
The solution is to exclude those image data files using the standard exclusion mechanism provided by SonarQube.
For instance, I would do something like:
sonar.exclusions=**/*imageMerged*
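For example, in the sonar-project.properties file referenced in the log above, the exclusion would sit alongside the existing source settings (values other than the exclusion pattern are illustrative):
sonar.sources=moimstone
sonar.exclusions=**/*imageMerged*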
