Can I run pig0.12 with hadoop-2.2.0 distribution? [closed] - hadoop

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I installed hadoop-2.2.0 and can run MR jobs. Configured pig0.12 and trying to use the interactie grunt shell. But when I try to create data into records from input using
records = LOAD '/temp/temp.txt' AS (year:chararray, temperature:int, quality:int);
I get the following. Did not see this when I used pig0.12 earlier with hadoop-1.2.1 distribution.
2013-11-27 11:11:37,225 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.jobtracker.maxtasks.per.job is deprecated. Instead, use mapreduce.jobtracker.maxtasks.perjob
2013-11-27 11:11:37,225 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.system.dir is deprecated. Instead, use mapreduce.jobtracker.system.dir
2013-11-27 11:11:37,226 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.max.attempts is deprecated. Instead, use mapreduce.reduce.maxattempts
2013-11-27 11:11:37,226 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.map.tasks.maximum is deprecated. Instead, use mapreduce.tasktracker.map.tasks.maximum
2013-11-27 11:11:37,226 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.local.dir.minspacekill is deprecated. Instead, use mapreduce.tasktracker.local.dir.minspacekill
2013-11-27 11:11:37,227 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.jobtracker.job.history.block.size is deprecated. Instead, use mapreduce.jobtracker.jobhistory.block.size
2013-11-27 11:11:37,227 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.backup.address is deprecated. Instead, use dfs.namenode.backup.address
2013-11-27 11:11:37,227 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.name.edits.dir is deprecated. Instead, use dfs.namenode.edits.dir
2013-11-27 11:11:37,227 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.timeout is deprecated. Instead, use mapreduce.task.timeout
2013-11-27 11:11:37,227 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.tracker.task-controller is deprecated. Instead, use mapreduce.tasktracker.taskcontroller
2013-11-27 11:11:37,228 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.local.dir.minspacestart is deprecated. Instead, use mapreduce.tasktracker.local.dir.minspacestart
2013-11-27 11:11:37,228 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - jobclient.progress.monitor.poll.interval is deprecated. Instead, use mapreduce.client.progressmonitor.pollinterval
2013-11-27 11:11:37,228 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.shuffle.merge.percent is deprecated. Instead, use mapreduce.reduce.shuffle.merge.percent
2013-11-27 11:11:37,228 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.block.size is deprecated. Instead, use dfs.blocksize
2013-11-27 11:11:37,228 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.df.interval is deprecated. Instead, use fs.df.interval
2013-11-27 11:11:37,229 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.skip.reduce.max.skip.groups is deprecated. Instead, use mapreduce.reduce.skip.maxgroups
2013-11-27 11:11:37,229 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
2013-11-27 11:11:37,229 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.profile is deprecated. Instead, use mapreduce.task.profile
2013-11-27 11:11:37,229 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapreduce.job.counters.limit is deprecated. Instead, use mapreduce.job.counters.max
2013-11-27 11:11:37,229 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.child.tmp is deprecated. Instead, use mapreduce.task.tmp.dir
2013-11-27 11:11:37,230 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.replication.min is deprecated. Instead, use dfs.namenode.replication.min
2013-11-27 11:11:37,230 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.safemode.threshold.pct is deprecated. Instead, use dfs.namenode.safemode.threshold-pct
2013-11-27 11:11:37,230 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.https.client.keystore.resource is deprecated. Instead, use dfs.client.https.keystore.resource
2013-11-27 11:11:37,230 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.reduce.tasks.maximum is deprecated. Instead, use mapreduce.tasktracker.reduce.tasks.maximum
2013-11-27 11:11:37,231 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.userlog.limit.kb is deprecated. Instead, use mapreduce.task.userlog.limit.kb
2013-11-27 11:11:37,231 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
2013-11-27 11:11:37,231 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.parallel.copies is deprecated. Instead, use mapreduce.reduce.shuffle.parallelcopies
2013-11-27 11:11:37,231 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2013-11-27 11:11:37,231 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.skip.attempts.to.start.skipping is deprecated. Instead, use mapreduce.task.skip.start.attempts
2013-11-27 11:11:37,232 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
2013-11-27 11:11:37,232 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - jobclient.completion.poll.interval is deprecated. Instead, use mapreduce.client.completion.pollinterval
2013-11-27 11:11:37,232 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - jobclient.output.filter is deprecated. Instead, use mapreduce.client.output.filter
2013-11-27 11:11:37,232 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
2013-11-27 11:11:37,232 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.max.objects is deprecated. Instead, use dfs.namenode.max.objects
2013-11-27 11:11:37,233 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2013-11-27 11:11:37,233 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2013-11-27 11:11:37,233 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.expiry.interval is deprecated. Instead, use mapreduce.jobtracker.expire.trackers.interval
2013-11-27 11:11:37,233 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.datanode.max.xcievers is deprecated. Instead, use dfs.datanode.max.transfer.threads
2013-11-27 11:11:37,233 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.submit.replication is deprecated. Instead, use mapreduce.client.submit.file.replication
2013-11-27 11:11:37,234 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.jobtracker.taskScheduler is deprecated. Instead, use mapreduce.jobtracker.taskscheduler
2013-11-27 11:11:37,238 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.temp.dir is deprecated. Instead, use mapreduce.cluster.temp.dir
2013-11-27 11:11:37,239 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.taskmemorymanager.monitoring-interval is deprecated. Instead, use mapreduce.tasktracker.taskmemorymanager.monitoringinterval
2013-11-27 11:11:37,239 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2013-11-27 11:11:37,239 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compression.codec is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.codec
2013-11-27 11:11:37,239 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.userlog.retain.hours is deprecated. Instead, use mapreduce.job.userlog.retain.hours
2013-11-27 11:11:37,239 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.healthChecker.interval is deprecated. Instead, use mapreduce.tasktracker.healthchecker.interval
2013-11-27 11:11:37,240 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.retiredjobs.cache.size is deprecated. Instead, use mapreduce.jobtracker.retiredjobs.cache.size
2013-11-27 11:11:37,240 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.max.tracker.failures is deprecated. Instead, use mapreduce.job.maxtaskfailures.per.tracker
2013-11-27 11:11:37,240 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.replication.considerLoad is deprecated. Instead, use dfs.namenode.replication.considerLoad
2013-11-27 11:11:37,240 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2013-11-27 11:11:37,242 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.acls.enabled is deprecated. Instead, use mapreduce.cluster.acls.enabled
2013-11-27 11:11:37,242 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.slowstart.completed.maps is deprecated. Instead, use mapreduce.job.reduce.slowstart.completedmaps
2013-11-27 11:11:37,242 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.handler.count is deprecated. Instead, use mapreduce.jobtracker.handler.count
2013-11-27 11:11:37,242 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - job.end.retry.attempts is deprecated. Instead, use mapreduce.job.end-notification.retry.attempts
2013-11-27 11:11:37,242 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.http.address is deprecated. Instead, use dfs.namenode.http-address
2013-11-27 11:11:37,242 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.persist.jobstatus.dir is deprecated. Instead, use mapreduce.jobtracker.persist.jobstatus.dir
2013-11-27 11:11:37,246 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.name.dir.restore is deprecated. Instead, use dfs.namenode.name.dir.restore
2013-11-27 11:11:37,246 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
2013-11-27 11:11:37,247 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.healthChecker.script.timeout is deprecated. Instead, use mapreduce.tasktracker.healthchecker.script.timeout
2013-11-27 11:11:37,247 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.shuffle.connect.timeout is deprecated. Instead, use mapreduce.reduce.shuffle.connect.timeout
2013-11-27 11:11:37,247 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.backup.http.address is deprecated. Instead, use dfs.namenode.backup.http-address
2013-11-27 11:11:37,247 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.secondary.http.address is deprecated. Instead, use dfs.namenode.secondary.http-address
2013-11-27 11:11:37,247 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.umaskmode is deprecated. Instead, use fs.permissions.umask-mode
2013-11-27 11:11:37,248 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
2013-11-27 11:11:37,248 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.replication.interval is deprecated. Instead, use dfs.namenode.replication.interval
2013-11-27 11:11:37,248 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.name.dir is deprecated. Instead, use dfs.namenode.name.dir
2013-11-27 11:11:37,248 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.indexcache.mb is deprecated. Instead, use mapreduce.tasktracker.indexcache.mb
2013-11-27 11:11:37,248 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - keep.failed.task.files is deprecated. Instead, use mapreduce.task.files.preserve.failedtasks
2013-11-27 11:11:37,249 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.heartbeats.in.second is deprecated. Instead, use mapreduce.jobtracker.heartbeats.in.second
2013-11-27 11:11:37,249 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.permissions is deprecated. Instead, use dfs.permissions.enabled
2013-11-27 11:11:37,249 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.speculative.execution.slowTaskThreshold is deprecated. Instead, use mapreduce.job.speculative.slowtaskthreshold
2013-11-27 11:11:37,249 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.checkpoint.dir is deprecated. Instead, use dfs.namenode.checkpoint.dir
2013-11-27 11:11:37,249 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.dns.interface is deprecated. Instead, use mapreduce.tasktracker.dns.interface
2013-11-27 11:11:37,249 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.speculative.execution.slowNodeThreshold is deprecated. Instead, use mapreduce.job.speculative.slownodethreshold
2013-11-27 11:11:37,250 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
2013-11-27 11:11:37,250 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.https.need.client.auth is deprecated. Instead, use dfs.client.https.need-auth
2013-11-27 11:11:37,250 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.checkpoint.edits.dir is deprecated. Instead, use dfs.namenode.checkpoint.edits.dir
2013-11-27 11:11:37,250 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.persist.jobstatus.hours is deprecated. Instead, use mapreduce.jobtracker.persist.jobstatus.hours
2013-11-27 11:11:37,250 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reuse.jvm.num.tasks is deprecated. Instead, use mapreduce.job.jvm.numtasks
2013-11-27 11:11:37,250 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - topology.node.switch.mapping.impl is deprecated. Instead, use net.topology.node.switch.mapping.impl
2013-11-27 11:11:37,254 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.cache.levels is deprecated. Instead, use mapreduce.jobtracker.taskcache.levels
2013-11-27 11:11:37,254 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.instrumentation is deprecated. Instead, use mapreduce.tasktracker.instrumentation
2013-11-27 11:11:37,255 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.access.time.precision is deprecated. Instead, use dfs.namenode.accesstime.precision
2013-11-27 11:11:37,255 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.queue.name is deprecated. Instead, use mapreduce.job.queuename
2013-11-27 11:11:37,255 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.child.log.level is deprecated. Instead, use mapreduce.reduce.log.level
2013-11-27 11:11:37,255 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.balance.bandwidthPerSec is deprecated. Instead, use dfs.datanode.balance.bandwidthPerSec
2013-11-27 11:11:37,255 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.persist.jobstatus.active is deprecated. Instead, use mapreduce.jobtracker.persist.jobstatus.active
2013-11-27 11:11:37,255 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.output.compression.codec is deprecated. Instead, use mapreduce.map.output.compress.codec
2013-11-27 11:11:37,255 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.tracker.http.address is deprecated. Instead, use mapreduce.tasktracker.http.address
2013-11-27 11:11:37,256 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapreduce.jobtracker.split.metainfo.maxsize is deprecated. Instead, use mapreduce.job.split.metainfo.maxsize
2013-11-27 11:11:37,256 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.profile.reduces is deprecated. Instead, use mapreduce.task.profile.reduces
2013-11-27 11:11:37,256 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.inmem.merge.threshold is deprecated. Instead, use mapreduce.reduce.merge.inmem.threshold
2013-11-27 11:11:37,256 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.safemode.extension is deprecated. Instead, use dfs.namenode.safemode.extension
2013-11-27 11:11:37,256 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.checkpoint.period is deprecated. Instead, use dfs.namenode.checkpoint.period
2013-11-27 11:11:37,256 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2013-11-27 11:11:37,257 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.data.dir is deprecated. Instead, use dfs.datanode.data.dir
2013-11-27 11:11:37,257 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir
2013-11-27 11:11:37,257 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - job.end.retry.interval is deprecated. Instead, use mapreduce.job.end-notification.retry.interval
2013-11-27 11:11:37,257 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compression.type is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.type
2013-11-27 11:11:37,257 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.sort.spill.percent is deprecated. Instead, use mapreduce.map.sort.spill.percent
2013-11-27 11:11:37,257 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.permissions.supergroup is deprecated. Instead, use dfs.permissions.superusergroup
2013-11-27 11:11:37,258 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2013-11-27 11:11:37,258 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - tasktracker.http.threads is deprecated. Instead, use mapreduce.tasktracker.http.threads
2013-11-27 11:11:37,258 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
2013-11-27 11:11:37,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.input.buffer.percent is deprecated. Instead, use mapreduce.reduce.input.buffer.percent
2013-11-27 11:11:37,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.tasks.sleeptime-before-sigkill is deprecated. Instead, use mapreduce.tasktracker.tasks.sleeptimebeforesigkill
2013-11-27 11:11:37,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.tasktracker.dns.nameserver is deprecated. Instead, use mapreduce.tasktracker.dns.nameserver
2013-11-27 11:11:37,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.shuffle.read.timeout is deprecated. Instead, use mapreduce.reduce.shuffle.read.timeout
2013-11-27 11:11:37,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.max.tracker.blacklists is deprecated. Instead, use mapreduce.jobtracker.tasktracker.maxblacklists
2013-11-27 11:11:37,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - topology.script.number.args is deprecated. Instead, use net.topology.script.number.args
2013-11-27 11:11:37,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.shuffle.input.buffer.percent is deprecated. Instead, use mapreduce.reduce.shuffle.input.buffer.percent
2013-11-27 11:11:37,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.merge.recordsBeforeProgress is deprecated. Instead, use mapreduce.task.merge.progress.records
2013-11-27 11:11:37,262 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - dfs.write.packet.size is deprecated. Instead, use dfs.client-write-packet-size
2013-11-27 11:11:37,262 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.jobtracker.restart.recover is deprecated. Instead, use mapreduce.jobtracker.restart.recover
2013-11-27 11:11:37,262 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2013-11-27 11:11:37,262 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.jobhistory.lru.cache.size is deprecated. Instead, use mapreduce.jobtracker.jobhistory.lru.cache.size
2013-11-27 11:11:37,262 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.child.log.level is deprecated. Instead, use mapreduce.map.log.level
2013-11-27 11:11:37,262 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.tracker.report.address is deprecated. Instead, use mapreduce.tasktracker.report.address
2013-11-27 11:11:37,262 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.speculative.execution.speculativeCap is deprecated. Instead, use mapreduce.job.speculative.speculativecap
2013-11-27 11:11:37,263 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.skip.map.max.skip.records is deprecated. Instead, use mapreduce.map.skip.maxrecords
2013-11-27 11:11:37,263 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.profile.maps is deprecated. Instead, use mapreduce.task.profile.maps
2013-11-27 11:11:37,263 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.jobtracker.instrumentation is deprecated. Instead, use mapreduce.jobtracker.instrumentation
But when I do DUMP records.
Backend error message during job submission
-------------------------------------------
Unexpected System Error Occured: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:225)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:186)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:456)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
at java.lang.Thread.run(Thread.java:744)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:270)
Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias records
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias records
at org.apache.pig.PigServer.openIterator(PigServer.java:880)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:541)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
at org.apache.pig.PigServer.openIterator(PigServer.java:872)
... 12 more
================================================================================
Content of /temp/temp.txt file is:
1950 0 1
1950 22 1
1950 -11 1
1950 111 1
What could be wrong ? When I issue pig command I don't see any error, there is one warning saying couldn't load native library from platform.
Thanks!

Related

Error While executing pig script which is saved in HDFS path

I have placed .pig file and txt file in HDFS path
Trying to execute .pig from Grunt
getting below error
students.txt
001,Rajiv,Hyderabad
002,siddarth,Kolkata
003,Rajesh,Delhi
Script.pig
student = LOAD 'hdfs://localhost:8020/pig_data/students.txt' USING PigStorage(',')
as (id:int,name:chararray,city:chararray);
Dump student;
grunt> exec /user/cloudera/pig_data/script.pig
2019-07-24 12:13:35,303 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2019-07-24 12:13:35,304 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-07-24 12:13:35,376 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2019-07-24 12:13:35,378 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-07-24 12:13:35,384 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. File not found: /user/cloudera/pig_data/script.pig
Details at logfile: /home/cloudera/pig_1563995548146.log
grunt>

Failed to read data from "hdfs://192.168.1.195:9000/vivek/flume_data/flume.1520589885576"

test = LOAD 'hdfs://192.168.1.195:9000/vivek/flume_data/flume.1520589885576' USING TextLoader AS (line:chararray);
log = FOREACH test GENERATE FLATTEN(REGEX_EXTRACT_ALL(line,'^(\\S+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] "(.+?)" (\\S+) (\\S+) "([^"]*)" "([^"]*)"')) AS (address_ip: chararray, logname: chararray, user: chararray, timestamp: chararray, req_line: chararray, status: int, bytes: int, referer: chararray, userAgent: chararray);
STORE log INTO 'hbase://Access_Logs' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:address_ip, cf:logname, cf:user, cf:timestamp, cf:req_line, cf:status, cf:bytes, cf:referer, cf:userAgent');
2018-03-10 10:52:03,636 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2018-03-10 10:52:03,840 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2018-03-10 10:52:03,840 [main] INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for Access_Logs
2018-03-10 10:52:03,843 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2018-03-10 10:52:03,859 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2018-03-10 10:52:03,860 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2018-03-10 10:52:03,860 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2018-03-10 10:52:03,890 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2018-03-10 10:52:03,890 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2018-03-10 10:52:03,891 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2018-03-10 10:52:03,891 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2018-03-10 10:52:03,897 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2018-03-10 10:52:03,898 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /192.168.1.195:8050
2018-03-10 10:52:03,899 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2018-03-10 10:52:03,899 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2018-03-10 10:52:03,900 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2018-03-10 10:52:03,981 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/pig-0.15.0-core-h2.jar to DistributedCache through /tmp/temp1710369540/tmp1282565307/pig-0.15.0-core-h2.jar
2018-03-10 10:52:04,013 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hbase/lib/htrace-core-2.04.jar to DistributedCache through /tmp/temp1710369540/tmp520067094/htrace-core-2.04.jar
2018-03-10 10:52:04,067 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp1710369540/tmp946538428/guava-11.0.2.jar
2018-03-10 10:52:04,123 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hbase/lib/hbase-common-0.98.8-hadoop2.jar to DistributedCache through /tmp/temp1710369540/tmp468949353/hbase-common-0.98.8-hadoop2.jar
2018-03-10 10:52:04,144 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hbase/lib/hbase-hadoop-compat-0.98.8-hadoop2.jar to DistributedCache through /tmp/temp1710369540/tmp113887319/hbase-hadoop-compat-0.98.8-hadoop2.jar
2018-03-10 10:52:04,200 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hbase/lib/hbase-server-0.98.8-hadoop2.jar to DistributedCache through /tmp/temp1710369540/tmp682998180/hbase-server-0.98.8-hadoop2.jar
2018-03-10 10:52:04,256 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hbase/lib/hbase-client-0.98.8-hadoop2.jar to DistributedCache through /tmp/temp1710369540/tmp-1958170360/hbase-client-0.98.8-hadoop2.jar
2018-03-10 10:52:04,317 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hbase/lib/hbase-protocol-0.98.8-hadoop2.jar to DistributedCache through /tmp/temp1710369540/tmp-892814021/hbase-protocol-0.98.8-hadoop2.jar
2018-03-10 10:52:04,363 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/zookeeper-3.4.5.jar to DistributedCache through /tmp/temp1710369540/tmp-830858682/zookeeper-3.4.5.jar
2018-03-10 10:52:04,396 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/protobuf-java-2.5.0.jar to DistributedCache through /tmp/temp1710369540/tmp-420530468/protobuf-java-2.5.0.jar
2018-03-10 10:52:04,432 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hbase/lib/high-scale-lib-1.1.1.jar to DistributedCache through /tmp/temp1710369540/tmp-1046507224/high-scale-lib-1.1.1.jar
2018-03-10 10:52:04,474 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/netty-3.6.2.Final.jar to DistributedCache through /tmp/temp1710369540/tmp309001480/netty-3.6.2.Final.jar
2018-03-10 10:52:04,489 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1710369540/tmp964502237/automaton-1.11-8.jar
2018-03-10 10:52:04,507 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1710369540/tmp-1680308848/antlr-runtime-3.4.jar
2018-03-10 10:52:04,528 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/joda-time-2.5.jar to DistributedCache through /tmp/temp1710369540/tmp813805284/joda-time-2.5.jar
2018-03-10 10:52:04,539 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2018-03-10 10:52:04,544 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2018-03-10 10:52:04,544 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2018-03-10 10:52:04,544 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2018-03-10 10:52:04,555 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2018-03-10 10:52:04,557 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /192.168.1.195:8050
2018-03-10 10:52:04,596 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2018-03-10 10:52:04,597 [JobControl] INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for Access_Logs
2018-03-10 10:52:04,625 [JobControl] WARN org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2018-03-10 10:52:04,742 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2018-03-10 10:52:04,742 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2018-03-10 10:52:04,748 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2018-03-10 10:52:04,786 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2018-03-10 10:52:04,828 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1519576410629_0017
2018-03-10 10:52:04,832 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2018-03-10 10:52:04,908 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1519576410629_0017
2018-03-10 10:52:04,910 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://server2.linux.com:8088/proxy/application_1519576410629_0017/
2018-03-10 10:52:05,056 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1519576410629_0017
2018-03-10 10:52:05,056 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases log,test
2018-03-10 10:52:05,056 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: test[16,7],log[-1,-1] C: R:
2018-03-10 10:52:05,064 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2018-03-10 10:52:05,064 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1519576410629_0017]
2018-03-10 10:53:02,248 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2018-03-10 10:53:02,249 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1519576410629_0017]
2018-03-10 10:53:05,259 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2018-03-10 10:53:05,259 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1519576410629_0017 has failed! Stop running all dependent jobs
2018-03-10 10:53:05,259 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2018-03-10 10:53:05,260 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /192.168.1.195:8050
2018-03-10 10:53:05,274 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2018-03-10 10:53:05,789 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
2018-03-10 10:53:05,789 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2018-03-10 10:53:05,789 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.4.1 0.15.0 hadoop 2018-03-10 10:52:03 2018-03-10 10:53:05 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1519576410629_0017 log,test MAP_ONLY Message: Job failed! hbase://Access_Logs,
Input(s):
Failed to read data from "hdfs://192.168.1.195:9000/vivek/flume_data/flume.1520589885576"
Output(s):
Failed to produce result in "hbase://Access_Logs"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1519576410629_0017
2018-03-10 10:53:05,789 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
But runs fine when doing
dump log
The PIG script is not able to load data possibly for columns status: int, bytes: int.
The Error says
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
It means, the REGEX parser is bringing String data, when PIG expect it to be Integer.
To debug, try changing datatype in the PIG command and try just printing the output. Once all set, you may try to save into hbase.

Pig jobs not running on hadoop 2.6

When I execute the below dump command, it stop in the 0% complete and not continuing for next step. How to track the issue .
grunt> a = load '/user/hduser1/file1' using PigStorage(',') as (usernames:chararray, password:chararray, price:int);
2015-12-25 10:34:26,471 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> dump a;
2015-12-25 10:34:33,862 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2015-12-25 10:34:33,904 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-12-25 10:34:33,910 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-12-25 10:34:33,951 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2015-12-25 10:34:34,069 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2015-12-25 10:34:34,088 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2015-12-25 10:34:34,088 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2015-12-25 10:34:34,158 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-12-25 10:34:34,197 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2015-12-25 10:34:34,425 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-12-25 10:34:34,432 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2015-12-25 10:34:34,432 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-12-25 10:34:34,432 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2015-12-25 10:34:34,435 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2015-12-25 10:34:35,464 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/pig-0.14.0-core-h2.jar to DistributedCache through /tmp/temp-1328911758/tmp304593952/pig-0.14.0-core-h2.jar
2015-12-25 10:34:35,519 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1328911758/tmp1133010181/automaton-1.11-8.jar
2015-12-25 10:34:35,575 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1328911758/tmp-1601875197/antlr-runtime-3.4.jar
2015-12-25 10:34:35,641 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp-1328911758/tmp1137978977/guava-11.0.2.jar
2015-12-25 10:34:36,218 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/joda-time-2.1.jar to DistributedCache through /tmp/temp-1328911758/tmp325144073/joda-time-2.1.jar
2015-12-25 10:34:36,259 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-12-25 10:34:36,267 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-12-25 10:34:36,267 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-12-25 10:34:36,267 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2015-12-25 10:34:36,304 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-12-25 10:34:36,305 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2015-12-25 10:34:36,311 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2015-12-25 10:34:36,344 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-12-25 10:34:36,706 [JobControl] WARN org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2015-12-25 10:34:36,782 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-12-25 10:34:36,782 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-12-25 10:34:36,820 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2015-12-25 10:34:36,972 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2015-12-25 10:34:37,483 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1451017152672_0001
2015-12-25 10:34:37,664 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2015-12-25 10:34:38,078 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1451017152672_0001
2015-12-25 10:34:38,221 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://vijee-Lenovo-IdeaPad-S510p:8088/proxy/application_1451017152672_0001/
2015-12-25 10:34:38,221 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1451017152672_0001
2015-12-25 10:34:38,222 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a
2015-12-25 10:34:38,222 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[1,4],a[-1,-1] C: R:
2015-12-25 10:34:38,243 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-12-25 10:34:38,243 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1451017152672_0001]
Find the Image for the application.
logs

Flag -useHCatalog not working

I installed CDH5.4 in single node following the instructions here, also, I put the hive-metastore in localmode using these instructions and everything works perfectly, except when I tried to connect pig with the metastore:
➜ ~ pig -useHCatalog
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
2015-05-01 15:45:08,657 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.0-cdh5.4.0 (rUnversioned directory) compiled Apr 21 2015, 12:19:15
2015-05-01 15:45:08,658 [main] INFO org.apache.pig.Main - Logging error messages to: /home/itam/pig_1430495108571.log
2015-05-01 15:45:09,035 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:09,035 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:09,035 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
2015-05-01 15:45:09,940 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:09,941 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:8021
2015-05-01 15:45:09,941 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:09,999 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,001 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,088 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,089 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,125 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,126 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,160 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,162 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,194 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,195 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,227 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,228 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,262 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 15:45:10,295 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 15:45:10,296 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
and when I tried to access the table:
grunt> a = load 'ufos' using org.apache.hcatalog.pig.HCatLoader();
2015-05-01 15:46:11,656 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /home/itam/pig_1430495108571.log
grunt>
Hadoop version
➜ ~ hadoop version
Hadoop 2.6.0-cdh5.4.0
Subversion http://github.com/cloudera/hadoop -r c788a14a5de9ecd968d1e2666e8765c5f018c271
Compiled by jenkins on 2015-04-21T19:16Z
Compiled with protoc 2.5.0
From source with checksum cd78f139c66c13ab5cee96e15a629025
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar
UPDATE: I just tried with Impala, and It neither sees anything:
➜ ~ impala-shell
/usr/lib/python2.7/dist-packages/pkg_resources.py:1049: UserWarning: /home/itam/.python-eggs is writable by group/others and vulnerable to attack when used with get_resource_filename. Consider a more secure location (set with .set_extracti
on_path or the PYTHON_EGG_CACHE environment variable).
warnings.warn(msg, UserWarning)
Starting Impala Shell without Kerberos authentication
Connected to 6b512e41337d:21000
Server version: impalad version 2.2.0-cdh5 RELEASE (build 2ffd73a4255cefd521362ffe1cfb37463f67f75c)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.
Copyright (c) 2012 Cloudera, Inc. All rights reserved.
(Shell build version: Impala Shell v2.2.0-cdh5 (2ffd73a) built on Tue Apr 21 12:09:21 PDT 2015)
[6b512e41337d:21000] > invalidate metadata;
Query: invalidate metadata
[6b512e41337d:21000] > show tables;
Query: show tables
Fetched 0 row(s) in 0.00s
but from beeline:
~ beeline -u jdbc:hive2://
scan complete in 2ms
Connecting to jdbc:hive2://
Connected to: Apache Hive (version 1.1.0-cdh5.4.0)
Driver: Hive JDBC (version 1.1.0-cdh5.4.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.4.0 by Apache Hive
0: jdbc:hive2://> show tables;
OK
+-----------+--+
| tab_name |
+-----------+--+
| ufos |
+-----------+--+
1 row selected (0.701 seconds)
It worked... What is happening?
UPDATE: I am running hcatalog too
➜ ~ sudo service hive-webhcat-server status
* WEBHCat server is running
➜ ~ hcat -e "desc ufos"
OK
timestamp string from deserializer
city string from deserializer
state string from deserializer
shape string from deserializer
duration string from deserializer
summary string from deserializer
posted string from deserializer
Time taken: 1.314 seconds
UPDATE: The problem with impala was due that I didn't copy hive-site.xml to /etc/impala/conf, once this is done, impala-shell worked properly.
The loader you are using is deprecated. Instead of using org.apache.hcatalog.pig.HCatLoader, you need to use org.apache.hive.hcatalog.pig.HCatLoader.
From org.apache.hcatalog.pig.HCatLoader:
Deprecated.
Use/modify HCatLoader instead
I was facing the issue in HDP 2.3 and Pig 0.15 .
Package name for HCatLoader() class is different in Hortonworks distribution.
The following worked for me
USING org.apache.hive.hcatalog.pig.HCatLoader()
instead of
USING org.apache.hcatalog.pig.HCatLoader();
like you started to see the issues you have is with hive-site.xmlfile - you need to place it in the classpath
As mention here:
A workflow action interacting with HCatalog requires the following
jars in the classpath: hcatalog-core.jar, webhcat-java-client.jar,
hive-common.jar, hive-exec.jar, hive-metastore.jar, hive-serde.jar and
libfb303.jar. hive-site.xml which has the configuration to talk to the
HCatalog server also needs to be in the classpath. The correct version
of HCatalog and hive jars should be placed in classpath based on the
version of HCatalog installed on the cluster.
The jars can be added to the classpath of the action using one of the
below ways.
You can place the jars and hive-site.xml in the system shared library.
The shared library for a pig, hive or java action can be overridden to
include hcatalog shared libraries along with the action's shared
library. Refer to Shared Libraries for more information. The
oozie-sharelib-[version].tar.gz in the oozie distribution bundles the
required HCatalog jars in a hcatalog sharelib. If using a different
version of HCatalog than the one bundled in the sharelib, copy the
required HCatalog jars from such version into the sharelib.
You can
place the jars and hive-site.xml in the workflow application lib/
path.
You can specify the location of the jar files in archive tag and
the hive-site.xml in file tag in the corresponding pig, hive or java
action.
If you are going to use Oozie coordinator, upload them to HDFS coordinator path

hbase warning about deprecated native.lib

hadoop conf file: opt/hadoop/etc/hadoop/core-site.xml
when set
<name>hadoop.native.lib</name>
and then start hbase shell, there will be four lines of warning:
2015-02-10 11:07:46,956 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2015-02-10 11:07:47,005 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2015-02-10 11:07:47,046 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2015-02-10 11:07:47,081 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2015-02-10 11:07:47,169 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
but when set
<name>io.native.lib.available</name>
and then start hbase shell, there will be one line of warning:
2015-02-10 11:07:46,956 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
How can I set to make it doesn't show any of this warning?
I'm on hadoop 2.5.2 and hbase 0.98.8 #ubuntu x64.

Resources