Split ~200 MB log4j log file by day - bash

I have a log file formatted as follows and I want to split it into multiple files by day (e.g. log-2017-10-2, log-2017-10-3, etc.). I've seen people do it with awk, but I'm not sure how to handle stack traces, because lines such as java.io.IOException and the following at ... frames start on new lines without a date prefix. Is there any convenient way to achieve this?
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX
Final file contents will be:
log-2017-10-2:
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
log-2017-10-3:
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
log-2017-10-4:
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
log-2017-10-5:
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX

awk to the rescue!
$ awk --posix 'BEGIN{f="log-header"}
$1~/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/{f="log-"$1} {print > f}' log
If there are too many dates (and therefore too many open output files) you may need to close files along the way; for a few hundred it should work as is.
The initial output file (log-header) is set as a fallback in case your log doesn't start with a line matching the date regex.
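If you do hit the open-file limit, here is a minimal sketch of the close-as-you-go variant mentioned above (assuming the log is in chronological order and an awk that supports {n} interval expressions, e.g. gawk):

awk '$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/ {        # line starts with a date: a new log entry begins
       if (f != "" && ("log-" $1) != f) close(f)  # close the previous day's file before switching
       f = "log-" $1                              # output file for the current day
     }
     f != "" { print > f }' log                   # stack-trace lines fall through to the current file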

awk solution:
awk '/^[0-9]{4}-[0-9]{2}-[0-9]{2} /{
if (fn && !a[$1]++) close(fn);
fn="log-"$1
}{ print > fn }' logfile
/^[0-9]{4}-[0-9]{2}-[0-9]{2} / - on encountering a line that starts with a date string
if (fn && !a[$1]++) close(fn) - close the file descriptor opened for the previous date (done only the first time a new date is seen)
fn="log-"$1 - construct the output filename
Viewing results:
$ head log-*
==> log-2017-10-02 <==
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
==> log-2017-10-03 <==
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
==> log-2017-10-04 <==
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
==> log-2017-10-05 <==
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX

Related

Why is the Spring Cloud Data Flow tutorial not showing the expected result

I'm trying to complete the first Spring Cloud Data Flow tutorial and I'm not getting the result shown in the tutorial.
https://dataflow.spring.io/docs/stream-developer-guides/streams/
The tutorial has me use curl to post to an http source and then see the result in the log sink by tailing its stdout log file.
I do not see the result; I only see the startup messages in the log.
I tail the log:
docker exec -it skipper tail -f /path/from/stdout/textbox/in/dashboard
I enter
curl http://localhost:20100 -H "Content-type: text/plain" -d "Happy streaming"
all I see is
2020-10-05 16:30:03.315 INFO 110 --- [ main] o.a.kafka.common.utils.AppInfoParser : Kafka version : 2.0.1
2020-10-05 16:30:03.316 INFO 110 --- [ main] o.a.kafka.common.utils.AppInfoParser : Kafka commitId : fa14705e51bd2ce5
2020-10-05 16:30:03.322 INFO 110 --- [ main] o.s.s.c.ThreadPoolTaskScheduler : Initializing ExecutorService
2020-10-05 16:30:03.338 INFO 110 --- [ main] s.i.k.i.KafkaMessageDrivenChannelAdapter : started org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter#106faf11
2020-10-05 16:30:03.364 INFO 110 --- [container-0-C-1] org.apache.kafka.clients.Metadata : Cluster ID: 2J0QTxzQQmm2bLxFKgRwmA
2020-10-05 16:30:03.574 INFO 110 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 20041 (http) with context path ''
2020-10-05 16:30:03.584 INFO 110 --- [ main] o.s.c.s.a.l.s.k.LogSinkKafkaApplication : Started LogSinkKafkaApplication in 38.086 seconds (JVM running for 40.251)
2020-10-05 16:30:05.852 INFO 110 --- [container-0-C-1] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-3, groupId=http-ingest] Discovered group coordinator kafka-broker:9092 (id: 2147482646 rack: null)
2020-10-05 16:30:05.857 INFO 110 --- [container-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-3, groupId=http-ingest] Revoking previously assigned partitions []
2020-10-05 16:30:05.858 INFO 110 --- [container-0-C-1] o.s.c.s.b.k.KafkaMessageChannelBinder$1 : partitions revoked: []
2020-10-05 16:30:05.858 INFO 110 --- [container-0-C-1] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-3, groupId=http-ingest] (Re-)joining group
2020-10-05 16:30:08.943 INFO 110 --- [container-0-C-1] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-3, groupId=http-ingest] Successfully joined group with generation 1
2020-10-05 16:30:08.945 INFO 110 --- [container-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-3, groupId=http-ingest] Setting newly assigned partitions [http-ingest.http-0]
2020-10-05 16:30:08.964 INFO 110 --- [container-0-C-1] o.a.k.c.consumer.internals.Fetcher : [Consumer clientId=consumer-3, groupId=http-ingest] Resetting offset for partition http-ingest.http-0 to offset 0.
2020-10-05 16:30:08.981 INFO 110 --- [container-0-C-1] o.s.c.s.b.k.KafkaMessageChannelBinder$1 : partitions assigned: [http-ingest.http-0]
There is no "Happy streaming".
Any suggestions?
Thank you for trying out the developer guides!
From what I can tell, it appears the http | log stream definition in SCDF is submitted without an explicit port. When that is the case, a port gets randomly assigned by Spring Boot when the http-source and log-sink applications start.
If you navigate to your http-source application logs, you will see the application port listed, and that is the port you'd use in the curl command.
There's a note about this in the guide for your reference:
If you use the local Data Flow Server, add the following deployment property to set the port to avoid a port collision.
Alternatively, you can deploy the stream with an explicit port in the definition, for example: http --server.port=9004 | log. With that, your curl would then be:
curl http://localhost:9004 -H "Content-type: text/plain" -d "Happy streaming"
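For what it's worth, a hedged shell sketch of reading that port back out of the http source's log (the log path below is a hypothetical placeholder; the real path is shown in the SCDF dashboard, just as for the log sink above):

# hypothetical path -- copy the http source's stdout path from the dashboard
LOG=/path/to/http-source/stdout.log
docker exec -it skipper grep "Tomcat started on port" "$LOG"
# the port printed there is the one to curl, e.g. if it shows 9004:
# curl http://localhost:9004 -H "Content-type: text/plain" -d "Happy streaming"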

Kerberos problem: GSSException: No valid credentials provided

My application is sending data to Kafka, and Kerberos is used for authentication. Everything works fine for around 20 days; then I get the following exception:
2020-01-07 22:22:08.481 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient : Initiating connection to node mkav2.dc.ex.com:9092 (id: 101 rack: null)
2020-01-07 22:22:08.481 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.security.authenticator.SaslClientAuthenticator : Set SASL client state to SEND_HANDSHAKE_REQUEST
2020-01-07 22:22:08.481 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.security.authenticator.SaslClientAuthenticator : Creating SaslClient: client=lpa/appX.dc.ex.com@DC.EX.COM;service=kafka;serviceHostname=mkav2.dc.ex.com;mechs=[GSSAPI]
2020-01-07 22:22:08.482 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.network.Selector : Created socket with SO_RCVBUF = 32768, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to node 101
2020-01-07 22:22:08.482 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.security.authenticator.SaslClientAuthenticator : Set SASL client state to RECEIVE_HANDSHAKE_RESPONSE
2020-01-07 22:22:08.482 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient : Completed connection to node 101. Fetching API versions.
2020-01-07 22:22:08.484 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.security.authenticator.SaslClientAuthenticator : Set SASL client state to INITIAL
2020-01-07 22:22:08.484 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.network.Selector : Connection with mkav2.dc.ex.com/172.10.15.44 disconnected
javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]) occurred when evaluating SASL token received from the Kafka Broker. Kafka Client will go to AUTH_FAILED state.
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslToken(SaslClientAuthenticator.java:298)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.sendSaslToken(SaslClientAuthenticator.java:215)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.authenticate(SaslClientAuthenticator.java:183)
at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:76)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:376)
at org.apache.kafka.common.network.Selector.poll(Selector.java:326)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:433)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:224)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.security.sasl.SaslException: GSS initiate failed
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator$2.run(SaslClientAuthenticator.java:280)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator$2.run(SaslClientAuthenticator.java:278)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslToken(SaslClientAuthenticator.java:278)
... 9 common frames omitted
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 14 common frames omitted
2020-01-07 22:22:08.484 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient : Node 101 disconnected.
2020-01-07 22:22:08.484 WARN 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient : Connection to node 101 terminated during authentication. This may indicate that authentication failed due to invalid credentials.
After restarting the application, everything works fine for another 20 days or so, and then I get the same exception again. These are the ticket properties in the krb5.conf file:
ticket_lifetime = 86400
renew_lifetime = 604800
Any ideas on why this could be happening?
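For what it's worth, the ticket cache can be inspected from the shell around the time of the failure; a hedged sketch assuming the MIT Kerberos client tools and the same cache the JVM is configured to use:

klist      # shows "Valid starting / Expires / renew until" for the cached TGT
kinit -R   # renew the TGT (only possible while still within renew_lifetime)
# With a keytab, a fresh ticket can be obtained instead (keytab path is a placeholder; principal taken from the log above):
# kinit -kt /path/to/app.keytab lpa/appX.dc.ex.com@DC.EX.COM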

Liquibase yaml inserts

We are using a stack of Spring Boot, Hibernate and Liquibase. I have a SQL file with 24000 inserts. When I converted it into YAML (for versioning purposes), I got a huge YAML file, which I split into 16 files. Insertion via the master file using the Liquibase command line is pretty quick, but with Spring and Hibernate it gets stuck. Up to 5 files is fine; anything more than that doesn't work.
I tried with 4 files at a time from the 16 and each batch works too, so it is not an issue with malformed YAML files. I also tried the following properties in my application.yml:
spring:
  datasource:
    hikari:
      maximum-pool-size: 100
  properties:
    hibernate:
      jdbc:
        batch_size: 200
Basically, it gets stuck at the changelog lock. This is what the log shows:
2019-10-17 18:10:26.375 INFO 1 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2019-10-17 18:10:26.387 WARN 1 --- [ main] com.zaxxer.hikari.util.DriverDataSource : Registered driver with driverClassName=com.mysql.jdbc.Driver was not found, trying direct instantiation.
2019-10-17 18:10:26.927 INFO 1 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
2019-10-17 18:10:27.760 INFO 1 --- [ main] liquibase.executor.jvm.JdbcExecutor : SELECT COUNT(*) FROM pbr.DATABASECHANGELOGLOCK
2019-10-17 18:10:27.794 INFO 1 --- [ main] liquibase.executor.jvm.JdbcExecutor : SELECT COUNT(*) FROM pbr.DATABASECHANGELOGLOCK
2019-10-17 18:10:27.804 INFO 1 --- [ main] liquibase.executor.jvm.JdbcExecutor : SELECT `LOCKED` FROM pbr.DATABASECHANGELOGLOCK WHERE ID=1
2019-10-17 18:10:27.812 INFO 1 --- [ main] l.lockservice.StandardLockService : Waiting for changelog lock....
2019-10-17 18:10:37.816 INFO 1 --- [ main] liquibase.executor.jvm.JdbcExecutor : SELECT `LOCKED` FROM pbr.DATABASECHANGELOGLOCK WHERE ID=1
2019-10-17 18:10:37.821 INFO 1 --- [ main] l.lockservice.StandardLockService : Waiting for changelog lock..
It did not work. Please help me.
The inserts are like this:
databaseChangeLog:
  - changeSet:
      id: 15706644546-4
      author: pbr-admin
      changes:
        - insert:
            columns:
              - column:
                  name: model_id
                  value: xxxxx
              - column:
                  name: category_id
                  value: ALL_TRANSACTIONS
              - column:
                  name: afpr_indexed
                  valueBoolean: false
              - column:
                  name: score
                  valueNumeric: xxx
The changelog is Liquibase-specific.
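As a side note on the lock itself: if an earlier run died while holding it, the lock can be released manually. A hedged sketch, using the lock table and schema (pbr) already visible in the log above; the MySQL credentials are placeholders:

mysql -u root -p -e 'UPDATE pbr.DATABASECHANGELOGLOCK SET `LOCKED`=0, LOCKGRANTED=NULL, LOCKEDBY=NULL WHERE ID=1;'
# or, if the Liquibase CLI is at hand (with the usual --url/--username/--password options):
# liquibase releaseLocks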

Apache Atlas quickstart - kafka error

Environment: no Kerberos, no Ranger, no HDFS; EC2 with SSL.
Getting this error after running $ATLAS_HOME/bin/quick_start.py https://$componentPrivateDNSRecord:21443 with the correct user/pass:
Creating sample types:
Created type [DB]
Created type [Table]
Created type [StorageDesc]
Created type [Column]
Created type [LoadProcess]
Created type [View]
Created type [JdbcAccess]
Created type [ETL]
Created type [Metric]
Created type [PII]
Created type [Fact]
Created type [Dimension]
Created type [Log Data]
Creating sample entities:
Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:105)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:634)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:334)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:311)
at org.apache.atlas.AtlasBaseClient.callAPI(AtlasBaseClient.java:199)
at org.apache.atlas.AtlasClientV2.createEntity(AtlasClientV2.java:277)
at org.apache.atlas.examples.QuickStartV2.createInstance(QuickStartV2.java:339)
at org.apache.atlas.examples.QuickStartV2.createDatabase(QuickStartV2.java:362)
at org.apache.atlas.examples.QuickStartV2.createEntities(QuickStartV2.java:268)
at org.apache.atlas.examples.QuickStartV2.runQuickstart(QuickStartV2.java:150)
at org.apache.atlas.examples.QuickStartV2.main(QuickStartV2.java:132)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:347)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 14 more
No sample data added to Apache Atlas Server.
Relevant code:
https://github.com/apache/incubator-atlas/blob/master/webapp/src/main/java/org/apache/atlas/examples/QuickStartV2.java
// This works
quickStartV2.createTypes();
// This errors
quickStartV2.createEntities();
At first I thought Atlas -> Kafka connectivity was the issue, but then I see:
[ec2-user@ip-10-160-187-181 logs]$ cat atlas_kafka_setup.log
2018-07-25 00:06:14,923 INFO - [main:] ~ Looking for atlas-application.properties in classpath (ApplicationProperties:78)
2018-07-25 00:06:14,926 INFO - [main:] ~ Loading atlas-application.properties from file:/home/ec2-user/atlas/distro/target/apache-atlas-1.0.0-SNAPSHOT-bin/apache-atlas-1.0.0-SNAPSHOT/conf/atlas-application.properties (ApplicationProperties:91)
2018-07-25 00:06:16,512 WARN - [main:] ~ Attempting to create topic ATLAS_HOOK (AtlasTopicCreator:72)
2018-07-25 00:06:17,004 WARN - [main:] ~ Created topic ATLAS_HOOK with partitions 1 and replicas 1 (AtlasTopicCreator:119)
2018-07-25 00:06:17,004 WARN - [main:] ~ Attempting to create topic ATLAS_ENTITIES (AtlasTopicCreator:72)
2018-07-25 00:06:17,024 WARN - [main:] ~ Created topic ATLAS_ENTITIES with partitions 1 and replicas 1 (AtlasTopicCreator:119)
2018-07-25 01:49:45,147 DEBUG - [main:] ~ Calling API [ GET : api/atlas/v2/types/typedefs ] (AtlasBaseClient:319)
2018-07-25 01:49:45,147 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,166 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:45,269 DEBUG - [main:] ~ API https://mydns:21443/api/atlas/v2/types/typedefs?name=Dimension returned status 200 (AtlasBaseClient:337)
2018-07-25 01:49:45,270 DEBUG - [main:] ~ Calling API [ GET : api/atlas/v2/types/typedefs ] (AtlasBaseClient:319)
2018-07-25 01:49:45,271 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,291 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:45,450 DEBUG - [main:] ~ API https://mydns:21443/api/atlas/v2/types/typedefs?name=Log+Data returned status 200 (AtlasBaseClient:337)
2018-07-25 01:49:45,455 DEBUG - [main:] ~ Calling API [ POST : api/atlas/v2/entity ] <== AtlasEntityWithExtInfo{entity=AtlasEntity{AtlasStruct{typeName='DB', attributes=[owner:John ETL, createTime:1532483385453, name:Sales, description:sales database, locationuri:hdfs://host:8000/apps/warehouse/sales]}guid='-6466195619848', status=null, createdBy='null', updatedBy='null', createTime=null, updateTime=null, version=0, relationshipAttributes=[], classifications=[], },AtlasEntityExtInfo{referredEntities={}}} (AtlasBaseClient:319)
2018-07-25 01:49:45,455 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,474 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:33,256 Audit: myuser/10.160.189.35-10.160.189.35 performed request POST https://mydns:21443/api/atlas/v2/types/typedefs (10.160.187.181) at time 2018-07-25T01:49Z
2018-07-25 01:49:45,445 Audit: myuser/10.160.189.35-10.160.189.35 performed request GET https://mydns:21443/api/atlas/v2/types/typedefs?name=Log+Data (10.160.187.181) at time 2018-07-25T01:49Z
2018-07-25 01:49:45,678 Audit: myuser/10.160.189.35-10.160.189.35 performed request POST https://mydns:21443/api/atlas/v2/entity (10.160.187.181) at time 2018-07-25T01:49Z
The two topics are returned by this:
$KAFKA_HOME/bin/kafka-topics.sh --list --zookeeper localhost:2181
Atlas's application.log does have this, though I'm not sure why:
2018-07-25 02:18:14,991 DEBUG - [NotificationHookConsumer thread-0:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:134)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:183)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:973)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:937)
at org.apache.atlas.kafka.AtlasKafkaConsumer.receive(AtlasKafkaConsumer.java:63)
at org.apache.atlas.kafka.AtlasKafkaConsumer.receive(AtlasKafkaConsumer.java:55)
at org.apache.atlas.notification.NotificationHookConsumer$HookConsumer.doWork(NotificationHookConsumer.java:305)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
This fixed it!
sed -i 's/atlas.kafka.bootstrap.servers=localhost:9027/atlas.kafka.bootstrap.servers=localhost:9092/' $ATLAS_HOME/conf/atlas-application.properties
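A quick sanity check from the shell after the change (assuming a local broker and the standard config path used above):

# the broker should be listening on 9092, and nothing on the old 9027
ss -ltn | grep -E '9092|9027'
# and Atlas should now point at 9092
grep 'atlas.kafka.bootstrap.servers' $ATLAS_HOME/conf/atlas-application.properties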

Extract logs between a time frame with non-specific timestamps using bash

I want to fetch the log between two timestamps, but I do not have the exact timestamps with me. If an exact timestamp is present in the log, I can fetch with sed using the following command:
sed -rne '/$StartTime/,/$EndTime/'p <filename>
My problem is that the exact StartTime and EndTime I fetch from my DB might not be present in the log file, so I have to fetch the log between times close to the StartTime and EndTime I provide, using >= and <= comparisons. I tried the following command, but it does not work:
awk '$0>=st && $0<=et' st=$StartTime et=$EndTime <filename>
Sample input and output
Input
Time retrieved from DB
StartTime - 2017-11-02 10:20:00
EndTime - 2017-11-02 11:20:00
The time present in log
T1 - 2017-11-02 10:17:44
T2 - 2017-11-02 11:19:32
Output: Entire Log text between T1 & T2
Sample Log
2017-03-03 10:43:18,736 [main] WARN - ORACLE_HOSTNAME=xxxxxxxxxx[OVERRIDES:
xxxxxxxxxxxxxxxx]
2017-03-03 10:43:18,736 [main] WARN - NLS_DATE_FORMAT=DD-MON-YYYY
HH24:MI:SS [OVERRIDES: DD-MON-YYYY HH24:MI:SS]
2017-03-03 10:43:18,736 [main] WARN - xxxxUsername=MDMPIUSER [OVERRIDES: MDMPIUSER]
2017-03-03 10:43:18,736 [main] WARN - BUNDLE_GEMFILE=uri:classloader://installer/Gemfile [OVERRIDES: uri:classloader://installer/Gemfile]
2017-03-03 10:43:18,736 [main] WARN - TIMEOUT=900 [OVERRIDES: 900]
2017-03-03 10:43:18,736 [main] WARN - SHLVL=4 [OVERRIDES: 4]
2017-03-03 10:43:18,736 [main] WARN - HISTSIZE=1000 [OVERRIDES: 1000]
2017-03-03 10:43:18,736 [main] WARN - JAVA_HOME=/usr/java/jdk1.8.0_60/jre [OVERRIDES: /usr/java/jdk1.8.0_60/jre]
2017-03-03 10:43:20,156 [main] WARN - APP_PROPS=/home/xxx/conf/appProperties [OVERRIDES: /home/xxx/conf/appProperties]
You can try
awk -v start="$StartTime" -v end="$EndTime" '
function fonct(date)
{
gsub(/-|,| |:/,"",date)
return date
}
BEGIN{
start=fonct(start)
end=fonct(end)
}
{
a=fonct($1$2)
if (a>=start && a<=end)print $0
}' infile
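A usage sketch with the times from the question, assuming the awk program between the quotes above has been saved as between.awk:

StartTime="2017-11-02 10:20:00"
EndTime="2017-11-02 11:20:00"
awk -v start="$StartTime" -v end="$EndTime" -f between.awk infile > extracted.log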
