How to extract exceptions from java log file? - bash

I have a huge log file with a lot of exceptions and I need to extract full stack traces and few lines before and after it. It will be perfect if tool for this is bash script.
example:
$16.02.2012 16:04:34 *INFO * [main] InitialContextInitializer: Reference bound: rmirepository (InitialContextInitializer.java, line 203)
16.02.2012 16:04:34 *ERROR* [main] StandaloneContainerInitializedListener: Error of StandaloneContainer initialization (StandaloneContainerInitializedListener.java, line 109)
java.lang.RuntimeException: Cannot instantiate component key=org.exoplatform.services.jcr.ext.script.groovy.GroovyScript2RestLoader type=org.exoplatform.services.jcr.ext.script.groovy.GroovyScript2RestLoader found at file:/home/roman/reports/backup/1.14.7-5636/rdbms/single/exo-tomcat_1.14.7-5636/exo-configuration.xml
at org.exoplatform.container.jmx.MX4JComponentAdapter.getComponentInstance(MX4JComponentAdapter.java:134)
at org.exoplatform.container.management.ManageableComponentAdapter.getComponentInstance(ManageableComponentAdapter.java:68)
at org.exoplatform.container.ConcurrentPicoContainer.getInstance(ConcurrentPicoContainer.java:468)
at org.exoplatform.container.ConcurrentPicoContainer.getComponentInstancesOfType(ConcurrentPicoContainer.java:366)
at org.exoplatform.container.CachingContainer.getComponentInstancesOfType(CachingContainer.java:111)
at org.exoplatform.container.LifecycleVisitor.visitContainer(LifecycleVisitor.java:151)
at org.exoplatform.container.ConcurrentPicoContainer.accept(ConcurrentPicoContainer.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.picocontainer.defaults.AbstractPicoVisitor.traverse(AbstractPicoVisitor.java:32)
at org.exoplatform.container.LifecycleVisitor.traverse(LifecycleVisitor.java:90)
at org.exoplatform.container.LifecycleVisitor.start(LifecycleVisitor.java:170)
at org.exoplatform.container.ConcurrentPicoContainer.start(ConcurrentPicoContainer.java:554)
at org.exoplatform.container.ExoContainer.start(ExoContainer.java:266)
at org.exoplatform.container.StandaloneContainer$3.run(StandaloneContainer.java:178)
at org.exoplatform.container.StandaloneContainer$3.run(StandaloneContainer.java:175)
at org.exoplatform.commons.utils.SecurityHelper.doPrivilegedAction(SecurityHelper.java:291)
at org.exoplatform.container.StandaloneContainer.getInstance(StandaloneContainer.java:174)
at org.exoplatform.container.StandaloneContainer.getInstance(StandaloneContainer.java:129)
at org.exoplatform.ws.frameworks.servlet.StandaloneContainerInitializedListener.contextInitialized(StandaloneContainerInitializedListener.java:104)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4205)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4704)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:799)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:601)
at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:675)
at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:601)
at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:502)
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1315)
at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:324)
at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1061)
at org.apache.catalina.core.StandardHost.start(StandardHost.java:840)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:463)
at org.apache.catalina.core.StandardService.start(StandardService.java:525)
at org.apache.catalina.core.StandardServer.start(StandardServer.java:754)
at org.apache.catalina.startup.Catalina.start(Catalina.java:595)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
Caused by: java.lang.RuntimeException: Cannot instantiate component key=org.exoplatform.services.jcr.RepositoryService type=org.exoplatform.services.jcr.impl.RepositoryServiceImpl found at file:/home/roman/reports/backup/1.14.7-5636/rdbms/single/exo-tomcat_1.14.7-5636/exo-configuration.xml
at org.exoplatform.container.jmx.MX4JComponentAdapter.getComponentInstance(MX4JComponentAdapter.java:134)
at org.exoplatform.container.management.ManageableComponentAdapter.getComponentInstance(ManageableComponentAdapter.java:68)
at org.exoplatform.container.ConcurrentPicoContainer.getInstance(ConcurrentPicoContainer.java:468)
at org.exoplatform.container.ConcurrentPicoContainer.getComponentInstanceOfType(ConcurrentPicoContainer.java:422)
at org.exoplatform.container.CachingContainer.getComponentInstanceOfType(CachingContainer.java:139)
at org.exoplatform.container.ExoContainer.createComponent(ExoContainer.java:407)
at org.exoplatform.container.jmx.MX4JComponentAdapter.getComponentInstance(MX4JComponentAdapter.java:96)
... 45 more

Use awk:
BEGIN {
previous = "";
}
/^\tat / {
if( previous != "" ) {
print previous;
previous = "";
}
print;
next;
}
{ previous = $0; }
should do the trick. In a nutshell, look for the pattern \tat (tab, at, blank) which almost always is used in a stack trace.
If you have many exceptions, then you can use maps (associative arrays in AWK's lingo) to save part of the exception message and then do statistics (like which exception happens the most).

Most of line in Java exception block are start with TAB delimiter.
So to get all exceptions from a huge log file you can use grep TAB delimiter.
Use following command to grep all exceptions from a file:
**$ grep -P -A 15 -B 15 "^\t" inputLogFile**

You can use below command
grep -B 10 -A 100 "Caused By" app.log -->this will give all the java exception logged in application log file. You can change string if you want to find specific exceptions.
https://www.techiedelight.com/java-program-search-exceptions-huge-log-file-on-server/

You can try something like this,
=======
grep -B20 -A20 ^'$16.02.2012 16:04:34' <path/to/log/file>
=======
Where
-B20 -- will display 20 lines before the line where the match was found.
-A20 -- will display 20 lines after the line where the match was found.

Related

spring.data.ldap with multiple spring.ldap.urls in application.properties

We use an LdapRepository<MyUser> within a Spring Boot project. The repo is auto-generated with a few additional query methods. We now have a second Active Directory domain server, and would like to add that to our Spring configuration. Before the .properties file looked like this:
spring.ldap.base=cn=Users,dc=company,dc=local
spring.ldap.password=really_save_password
spring.ldap.username=ldapuser#company.local
spring.ldap.urls=ldap://172.16.36.82:389
So I changed the last line to:
spring.ldap.urls=ldap://172.16.36.80:389 ldap://172.16.36.82:389
Because from other Stackoverflow questions I understood that multiple urls should be supplied with separating spaces.
But using userRepo.findAll() (or any query method) then leads to an exception:
Exception in thread "main" java.lang.IllegalStateException: Failed to execute CommandLineRunner
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:803)
at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:784)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:338)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1258)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1246)
at com.company.product.web.WebApp.main(WebApp.java:22)
Caused by: org.springframework.ldap.NameNotFoundException: [LDAP: error code 32 - 0000208D: NameErr: DSID-0310021B, problem 2001 (NO_OBJECT), data 0, best match of:
''
]; nested exception is javax.naming.NameNotFoundException: [LDAP: error code 32 - 0000208D: NameErr: DSID-0310021B, problem 2001 (NO_OBJECT), data 0, best match of:
''
]; remaining name '/'
at org.springframework.ldap.support.LdapUtils.convertLdapException(LdapUtils.java:183)
at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:376)
at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:309)
at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:642)
at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:578)
at org.springframework.ldap.core.LdapTemplate.find(LdapTemplate.java:1840)
at org.springframework.ldap.core.LdapTemplate.findAll(LdapTemplate.java:1806)
at org.springframework.ldap.core.LdapTemplate.findAll(LdapTemplate.java:1814)
at org.springframework.data.ldap.repository.support.SimpleLdapRepository.findAll(SimpleLdapRepository.java:183)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at org.springframework.data.repository.core.support.RepositoryComposition$RepositoryFragments.invoke(RepositoryComposition.java:377)
at org.springframework.data.repository.core.support.RepositoryComposition.invoke(RepositoryComposition.java:200)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$ImplementationMethodExecutionInterceptor.invoke(RepositoryFactorySupport.java:629)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:593)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:578)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:59)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.data.repository.core.support.SurroundingTransactionDetectorMethodInterceptor.invoke(SurroundingTransactionDetectorMethodInterceptor.java:61)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
at com.sun.proxy.$Proxy173.findAll(Unknown Source)
at com.company.product.web.Foo.run(Foo.java:22)
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:800)
... 5 more
Caused by: javax.naming.NameNotFoundException: [LDAP: error code 32 - 0000208D: NameErr: DSID-0310021B, problem 2001 (NO_OBJECT), data 0, best match of:
''
]; remaining name '/'
at java.naming/com.sun.jndi.ldap.LdapCtx.mapErrorCode(LdapCtx.java:3284)
at java.naming/com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:3205)
at java.naming/com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:2996)
at java.naming/com.sun.jndi.ldap.LdapCtx.searchAux(LdapCtx.java:1875)
at java.naming/com.sun.jndi.ldap.LdapCtx.c_search(LdapCtx.java:1798)
at java.naming/com.sun.jndi.toolkit.ctx.ComponentDirContext.p_search(ComponentDirContext.java:392)
at java.naming/com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.search(PartialCompositeDirContext.java:358)
at java.naming/javax.naming.directory.InitialDirContext.search(InitialDirContext.java:305)
at org.springframework.ldap.core.LdapTemplate$3.executeSearch(LdapTemplate.java:303)
at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:363)
... 33 more
It does work when I use just the second domain server, so either server works when used by its own in the .properties file.
What am I doing wrong?
If you want to use multiple ldap urls, you must declare them as a string array according to official spring-ldap document:
It is possible to configure multiple alternate LDAP servers using the urls property. In this case, supply all server urls in a String array to the urls property.
In your case try:
spring.ldap.urls=ldap://172.16.36.80:389,ldap://172.16.36.82:389

How to use alias with space in hive 0.13

I am on HDP 2.1 hortonwork, hive 0.13.
I want to use an alias with space in the query but it is giving error, is it possible to have a space in alias name? (it works fine without space alias).
Tried with single quotes, no quotes still fails.
*hive> select ot_vdot "alias test" from test4 where ot_vdot<>'null';
NoViableAltException(301#[146:1: selectExpression : ( expression | tableAllColumns );])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:144)
at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectExpression(HiveParser_SelectClauseParser.java:4142)
at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:3038)
at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1307)
at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1070)
at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:40193)
at org.apache.hadoop.hive.ql.parse.HiveParser.singleSelectStatement(HiveParser.java:38048)
at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:37754)
at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:37691)
at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:36898)
at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:36774)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1338)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1036)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:408)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:976)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1041)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:912)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 1:15 cannot recognize input near 'ot_vdot' '"alias test"' 'from' in select expression
hive>
*
Backticks are hive's way to give things names with weird characters or reserved words. For example:
select `my field with spaces`, `select`, `where` from `_my_table`

"Could not get input splits" Error, with Hive-Cassandra-CqlStorageHandler

Im trying to read data from cassandra using Hive with CqlStorageHandler.
The versions:
Hive 0.11.0
Hadoop 1.2.1
Cassandra 1.2.6
Im able to create EXTERNAL table with the following HIVE Query
CREATE EXTERNAL TABLE input(number string,name string,address string) STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH SERDEPROPERTIES ("cassandra.columns.mapping" = ":key, name, address", "cassandra.ks.name" ="cassandradb", "cassandra.host" = "localhost" ,"cassandra.port" = "9160") TBLPROPERTIES ("cassandra.input.split.size" = "64000","cassandra.range.size" = "1000","cassandra.slice.predicate.size" = "1000");
(The table "input" is already existing and containing some data in cassandra created with CQL3)
However, When I try to read data with the following query
select * from input where number="1";
Im facing the folowing issue:
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: Could not get input splits
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:189)
at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:213)
at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:169)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:297)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "143514173170822869679056708180186660043"
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:185)
... 31 more
Caused by: java.lang.NumberFormatException: For input string: "143514173170822869679056708180186660043"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:444)
at java.lang.Long.valueOf(Long.java:540)
at org.apache.cassandra.dht.Murmur3Partitioner$1.fromString(Murmur3Partitioner.java:188)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:239)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:207)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Job Submission failed with exception 'java.io.IOException(Could not get input splits)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Am I missing anything? Kindly advise.

Hadoop error while processing file with brackets

I have a lot of different files *.doc, *.pdf and so on. I wanted to process them with mapReduce.
I put them in HDFS and then started java MapReduce program using Hue.
If files are well formated and doesn't have brackets "(){}[]" in their name all goes fine.
But if there is a file OPN_last_[age.PDF
I get this errors:
Failing Oozie Launcher, Main class [distr.fors.ru.Index], main() threw exception, Illegal file pattern: Unclosed character class near index 17
OPN_last_[age.PDF
^
java.io.IOException: Illegal file pattern: Unclosed character class near index 17
OPN_last_[age.PDF
^
at org.apache.hadoop.fs.GlobFilter.init(GlobFilter.java:70)
at org.apache.hadoop.fs.GlobFilter.<init>(GlobFilter.java:49)
at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1670)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1627)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:211)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at distr.fors.ru.Index.run(Index.java:78)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at distr.fors.ru.Index.main(Index.java:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.util.regex.PatternSyntaxException: Unclosed character class near index 17
OPN_last_[age.PDF
^
at org.apache.hadoop.fs.GlobPattern.error(GlobPattern.java:167)
at org.apache.hadoop.fs.GlobPattern.set(GlobPattern.java:151)
at org.apache.hadoop.fs.GlobPattern.<init>(GlobPattern.java:42)
at org.apache.hadoop.fs.GlobFilter.init(GlobFilter.java:66)
... 32 more
If there is a file like this: {2011-01-27} (3769330).pdf
I get such error:
Input Pattern hdfs://fd-bigdata.distr.fors.ru:8020/{2011-01-27} (3769330).pdf matches 0 files
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
t org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at distr.fors.ru.Index.run(Index.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at distr.fors.ru.Index.main(Index.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
I realy need to process such files. What can I make to solve such problems?
P.S. I am using the latest CDH 4.4.0.
To deal with special characters in Java you should escape them with double backslash '\':
'[' => '\\['
'}' => '\\}'
This works for me in Java, in Pig and in Oozie. Hope it will also solve your problem.

BDB JE exception

com.sleepycat.je.DatabaseException: (JE 3.2.76) fetchTarget of 0x45/0x27aa63 parent IN=3316846 lastFullVersion=0x45/0x47b147 parent.getDirty()=false state=0 com.sleepycat.je.log.DbChecksumException: (JE 3.2.76) Read invalid log entry type: 97
at com.sleepycat.je.tree.IN.fetchTarget(IN.java:989)
at com.sleepycat.je.tree.Tree.getNextBinInternal(Tree.java:1441)
at com.sleepycat.je.tree.Tree.getNextBin(Tree.java:1306)
at com.sleepycat.je.dbi.CursorImpl.getNextWithKeyChangeStatus(CursorImpl.java:1498)
at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1368)
at com.sleepycat.je.Cursor.retrieveNextAllowPhantoms(Cursor.java:1587)
at com.sleepycat.je.Cursor.retrieveNext(Cursor.java:1397)
at com.sleepycat.je.Cursor.getNext(Cursor.java:456)
at browse.bdb.dao.JENodeDao.load(JENodeDao.java:76)
.................my code..................
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: com.sleepycat.je.log.DbChecksumException: (JE 3.2.76) Read invalid log entry type: 97
at com.sleepycat.je.log.LogEntryHeader.<init>(LogEntryHeader.java:69)
at com.sleepycat.je.log.LogManager.getLogEntryFromLogSource(LogManager.java:631)
at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:597)
at com.sleepycat.je.tree.IN.fetchTarget(IN.java:958)
... 16 more
what is wrong ?
it's likely some log file is corrupted, try to recover environment when opening.
This exception appear because I've tried to read record which was not flushed to the file system yet.
For JE it's: environment.sync();
For JNI it's: environment.syncCache(null);

Resources