How to configure the OTel Collector with a Spring app using the Micrometer library and the Java instrumentation agent to get metrics

Issue
The OTel Collector does not correctly forward the Prometheus metrics exported by the Java instrumentation agent from the Spring app.
Local reproduction steps
Simple Java Spring Boot app built with Gradle:
dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.boot:spring-boot-starter-actuator'
}
OTel Collector running as a Docker container with the following files:
docker-compose.yml
otel-collector:
  image: otel/opentelemetry-collector
  command: ["--config=/etc/otel-collector-config.yaml"]
  volumes:
    - ./config.yaml:/etc/otel-collector-config.yaml
  ports:
    - "1888:1888"   # pprof extension
    - "8888:8888"   # Prometheus metrics exposed by the collector
    - "8889:8889"   # Prometheus exporter metrics
    - "13133:13133" # health_check extension
    - "4317:4317"   # OTLP gRPC receiver
    - "4318:4318"   # OTLP http receiver
    - "55679:55679" # zpages extension
config.yml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: "Book"
          scrape_interval: 5s
          static_configs:
            - targets: ["localhost:9080"]
processors:
  batch:
exporters:
  prometheus:
    endpoint: "localhost:9090"
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [prometheus]
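Two details in this config are worth double-checking. Because the collector runs inside a Docker container, the scrape target localhost:9080 refers to the collector container itself, not the host machine where the Java agent exposes its Prometheus endpoint (the related answer further down makes the same point about localhost inside containers). Likewise, the prometheus exporter's endpoint is the address the collector itself listens on for scrapes, and the compose file maps 8889 as the "Prometheus exporter metrics" port. A possible corrected config, assuming Docker Desktop so that host.docker.internal resolves to the host:

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: "Book"
          scrape_interval: 5s
          static_configs:
            # host.docker.internal is an assumption (Docker Desktop); on plain
            # Linux use the host's IP or run the collector with host networking.
            - targets: ["host.docker.internal:9080"]
processors:
  batch:
exporters:
  prometheus:
    # Bind to all interfaces on the port published as 8889 in docker-compose.yml.
    endpoint: "0.0.0.0:8889"
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [prometheus]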
Prometheus running as a Docker container:
docker-compose.yml
prometheus:
  image: prom/prometheus:v2.17.1
  container_name: prometheus
  volumes:
    - ./prometheus:/etc/prometheus
    - prometheus_data:/prometheus
  command:
    - '--config.file=/etc/prometheus/prometheus.yml'
    - '--storage.tsdb.path=/prometheus'
    - '--web.console.libraries=/etc/prometheus/console_libraries'
    - '--web.console.templates=/etc/prometheus/consoles'
    - '--storage.tsdb.retention.time=200h'
    - '--web.enable-lifecycle'
  restart: unless-stopped
  expose:
    - 9090
prometheus.yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
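Note that this prometheus.yaml contains only the global block and no scrape_configs, so this Prometheus instance is not configured to pull anything from the collector's Prometheus exporter port (8889 in the compose file above). A minimal sketch of such a scrape job, assuming both containers run in the same Compose project so the collector is reachable by its service name otel-collector (the hostname is an assumption):

scrape_configs:
  - job_name: 'otel-collector'
    scrape_interval: 5s
    static_configs:
      # 'otel-collector' is assumed to be the Compose service name; if the two
      # containers are not on the same Docker network, use a reachable host/IP.
      - targets: ['otel-collector:8889']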
Scenario
I start the OTel Collector and Prometheus using Docker.
I run the following command to auto-instrument my Spring Java app:
java -javaagent:./opentelemetry-javaagent.jar \
-Dotel.javaagent.extensions=./opentelemetry-micrometer-1.5-1.13.0-alpha.jar \
-Dotel.instrumentation.spring-boot-actuator-autoconfigure.enabled=false \
-Dotel.javaagent.debug=true \
-Dotel.metrics.exporter=prometheus \
-Dotel.traces.exporter=none \
-Dotel.exporter.prometheus.metrics.endpoint=http://127.0.0.1:4317 \
-Dotel.exporter.prometheus.port=9080 \
-Dotel.resource.attributes="service.name=helloapp" \
-Dotel.instrumentation.micrometer.base-time-unit=s \
-Dotel.instrumentation.micrometer.prometheus-mode.enabled=true \
-jar build/libs/hello-0.0.1-SNAPSHOT.jar
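One note on this command: the agent's Prometheus metrics exporter is pull-based, so -Dotel.exporter.prometheus.port=9080 makes the agent serve a scrape endpoint on port 9080 that the collector's prometheus receiver has to scrape; as far as I can tell, the ...prometheus.metrics.endpoint property pointing at 4317 (the collector's OTLP gRPC port) does not push anything there. For reference, the core settings can also be supplied through the SDK's standard environment variables, e.g. in a Compose service definition; a sketch only, with the surrounding service assumed:

environment:
  # Standard OpenTelemetry env-var equivalents of the -Dotel.* flags above.
  OTEL_METRICS_EXPORTER: prometheus
  OTEL_EXPORTER_PROMETHEUS_PORT: "9080"
  OTEL_TRACES_EXPORTER: none
  OTEL_RESOURCE_ATTRIBUTES: service.name=helloapp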
Issue:
The Prometheus metrics exported by the instrumentation agent differ from the metrics received by the collector and subsequently forwarded to the Prometheus Docker app.
Expected prometheus metrics
# TYPE process_runtime_jvm_system_cpu_utilization gauge
# HELP process_runtime_jvm_system_cpu_utilization Recent cpu utilization for the whole system
process_runtime_jvm_system_cpu_utilization 0.0 1660037708660
# TYPE process_runtime_jvm_memory_usage gauge
# HELP process_runtime_jvm_memory_usage Measure of memory used process_runtime_jvm_memory_usage{pool="Metaspace",type="non_heap"}
4.289408E7 1660037708660 process_runtime_jvm_memory_usage{pool="G1 Eden Space",type="heap"} 6291456.0 1660037708660
process_runtime_jvm_memory_usage{pool="G1 Old Gen",type="heap"}
2.0279808E7 1660037708660 process_runtime_jvm_memory_usage{pool="Compressed Class
Space",type="non_heap"} 5806728.0 1660037708660
process_runtime_jvm_memory_usage{pool="CodeHeap 'profiled
nmethods'",type="non_heap"} 1.2102912E7 1660037708660
process_runtime_jvm_memory_usage{pool="CodeHeap
'non-nmethods'",type="non_heap"} 1271552.0 1660037708660
process_runtime_jvm_memory_usage{pool="G1 Survivor Space",type="heap"}
5671744.0 1660037708660 process_runtime_jvm_memory_usage{pool="CodeHeap 'non-profiled
nmethods'",type="non_heap"} 3989888.0 1660037708660
# TYPE process_runtime_jvm_threads_count gauge
# HELP process_runtime_jvm_threads_count Number of executing threads process_runtime_jvm_threads_count 12.0 1660037708660
# TYPE process_runtime_jvm_memory_limit gauge
# HELP process_runtime_jvm_memory_limit Measure of max obtainable memory process_runtime_jvm_memory_limit{pool="G1 Old Gen",type="heap"}
4.294967296E9 1660037708660 process_runtime_jvm_memory_limit{pool="Compressed Class
Space",type="non_heap"} 1.073741824E9 1660037708660
process_runtime_jvm_memory_limit{pool="CodeHeap 'profiled
nmethods'",type="non_heap"} 1.22908672E8 1660037708660
process_runtime_jvm_memory_limit{pool="CodeHeap
'non-nmethods'",type="non_heap"} 5840896.0 1660037708660
process_runtime_jvm_memory_limit{pool="CodeHeap 'non-profiled
nmethods'",type="non_heap"} 1.22908672E8 1660037708660
# TYPE process_runtime_jvm_memory_init gauge
# HELP process_runtime_jvm_memory_init Measure of initial memory requested
process_runtime_jvm_memory_init{pool="Metaspace",type="non_heap"} 0.0
1660037708660 process_runtime_jvm_memory_init{pool="G1 Eden
Space",type="heap"} 2.5165824E7 1660037708660
process_runtime_jvm_memory_init{pool="G1 Old Gen",type="heap"}
2.43269632E8 1660037708660 process_runtime_jvm_memory_init{pool="Compressed Class
Space",type="non_heap"} 0.0 1660037708660
process_runtime_jvm_memory_init{pool="CodeHeap 'profiled
nmethods'",type="non_heap"} 2555904.0 1660037708660
process_runtime_jvm_memory_init{pool="CodeHeap
'non-nmethods'",type="non_heap"} 2555904.0 1660037708660
process_runtime_jvm_memory_init{pool="G1 Survivor Space",type="heap"}
0.0 1660037708660 process_runtime_jvm_memory_init{pool="CodeHeap 'non-profiled nmethods'",type="non_heap"} 2555904.0 1660037708660
# TYPE process_runtime_jvm_classes_loaded_total counter
# HELP process_runtime_jvm_classes_loaded_total Number of classes loaded since JVM start process_runtime_jvm_classes_loaded_total 9284.0
1660037708660
# TYPE process_runtime_jvm_memory_committed gauge
# HELP process_runtime_jvm_memory_committed Measure of memory committed
process_runtime_jvm_memory_committed{pool="Metaspace",type="non_heap"}
4.3384832E7 1660037708660 process_runtime_jvm_memory_committed{pool="G1 Eden Space",type="heap"}
5.24288E7 1660037708660 process_runtime_jvm_memory_committed{pool="G1 Old Gen",type="heap"} 3.7748736E7 1660037708660
process_runtime_jvm_memory_committed{pool="Compressed Class
Space",type="non_heap"} 6029312.0 1660037708660
process_runtime_jvm_memory_committed{pool="CodeHeap 'profiled
nmethods'",type="non_heap"} 1.212416E7 1660037708660
process_runtime_jvm_memory_committed{pool="CodeHeap
'non-nmethods'",type="non_heap"} 2555904.0 1660037708660
process_runtime_jvm_memory_committed{pool="G1 Survivor
Space",type="heap"} 6291456.0 1660037708660
process_runtime_jvm_memory_committed{pool="CodeHeap 'non-profiled
nmethods'",type="non_heap"} 3997696.0 1660037708660
# TYPE process_runtime_jvm_classes_current_loaded gauge
# HELP process_runtime_jvm_classes_current_loaded Number of classes currently loaded process_runtime_jvm_classes_current_loaded 9310.0
1660037708660
# TYPE process_runtime_jvm_cpu_utilization gauge
# HELP process_runtime_jvm_cpu_utilization Recent cpu utilization for the process process_runtime_jvm_cpu_utilization 0.0 1660037708660
# TYPE process_runtime_jvm_classes_unloaded_total counter
# HELP process_runtime_jvm_classes_unloaded_total Number of classes unloaded since JVM start process_runtime_jvm_classes_unloaded_total
1.0 1660037708660
# TYPE process_runtime_jvm_system_cpu_load_1m gauge
# HELP process_runtime_jvm_system_cpu_load_1m Average CPU load of the whole system for the last minute
process_runtime_jvm_system_cpu_load_1m 2.15087890625 1660037708660
# TYPE system_cpu_usage gauge
# HELP system_cpu_usage The "recent cpu usage" of the system the application is running in system_cpu_usage 0.4666666666666667
1660037708660
# TYPE jvm_threads_states_threads gauge
# HELP jvm_threads_states_threads The current number of threads having NEW state jvm_threads_states_threads{state="runnable"} 6.0
1660037708660 jvm_threads_states_threads{state="timed-waiting"} 4.0
1660037708660 jvm_threads_states_threads{state="terminated"} 0.0
1660037708660 jvm_threads_states_threads{state="new"} 0.0
1660037708660 jvm_threads_states_threads{state="blocked"} 0.0
1660037708660 jvm_threads_states_threads{state="waiting"} 2.0
1660037708660
# TYPE jvm_gc_max_data_size_bytes gauge
# HELP jvm_gc_max_data_size_bytes Max size of long-lived heap memory pool jvm_gc_max_data_size_bytes 4.294967296E9 1660037708660
# TYPE system_cpu_count gauge
# HELP system_cpu_count The number of processors available to the Java virtual machine system_cpu_count 8.0 1660037708660
# TYPE jvm_threads_live_threads gauge
# HELP jvm_threads_live_threads The current number of live threads including both daemon and non-daemon threads jvm_threads_live_threads
12.0 1660037708660
# TYPE jvm_threads_daemon_threads gauge
# HELP jvm_threads_daemon_threads The current number of live daemon threads jvm_threads_daemon_threads 8.0 1660037708660
# TYPE jvm_memory_usage_after_gc_percent gauge
# HELP jvm_memory_usage_after_gc_percent The percentage of long-lived heap pool used after the last GC event, in the range [0..1]
jvm_memory_usage_after_gc_percent{area="heap",pool="long-lived"}
0.0047217607498168945 1660037708660
# TYPE jvm_gc_memory_allocated_bytes_total counter
# HELP jvm_gc_memory_allocated_bytes_total Incremented for an increase in the size of the (young) heap memory pool after one GC to before the
next jvm_gc_memory_allocated_bytes_total 5.0331648E7 1660037708660
# TYPE jvm_gc_overhead_percent gauge
# HELP jvm_gc_overhead_percent An approximation of the percent of CPU time used by GC activities over the last lookback period or since
monitoring began, whichever is shorter, in the range [0..1]
jvm_gc_overhead_percent 0.0034007331860589907 1660037708660
# TYPE disk_free_bytes gauge
# HELP disk_free_bytes Usable space for path disk_free_bytes{path="/Users/anuragk/workspace/ts/demo/otel/java/hello/."} 1.89101027328E11 1660037708660
# TYPE jvm_memory_committed_bytes gauge
# HELP jvm_memory_committed_bytes The amount of memory in bytes that is committed for the Java virtual machine to use
jvm_memory_committed_bytes{area="nonheap",id="Compressed Class Space"}
6029312.0 1660037708660 jvm_memory_committed_bytes{area="nonheap",id="Metaspace"} 4.358144E7
1660037708660 jvm_memory_committed_bytes{area="nonheap",id="CodeHeap
'profiled nmethods'"} 1.212416E7 1660037708660
jvm_memory_committed_bytes{area="heap",id="G1 Survivor Space"}
6291456.0 1660037708660 jvm_memory_committed_bytes{area="heap",id="G1 Eden Space"} 5.24288E7 1660037708660
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap
'non-nmethods'"} 2555904.0 1660037708660
jvm_memory_committed_bytes{area="heap",id="G1 Old Gen"} 3.7748736E7
1660037708660 jvm_memory_committed_bytes{area="nonheap",id="CodeHeap
'non-profiled nmethods'"} 4063232.0 1660037708660
# TYPE jvm_gc_live_data_size_bytes gauge
# HELP jvm_gc_live_data_size_bytes Size of long-lived heap memory pool after reclamation jvm_gc_live_data_size_bytes 0.0 1660037708660
# TYPE system_load_average_1m gauge
# HELP system_load_average_1m The sum of the number of runnable entities queued to available processors and the number of runnable
entities running on the available processors averaged over a period of
time system_load_average_1m 2.15087890625 1660037708660
# TYPE process_uptime_seconds gauge
# HELP process_uptime_seconds The uptime of the Java virtual machine process_uptime_seconds 11.542 1660037708660
# TYPE jvm_buffer_memory_used_bytes gauge
# HELP jvm_buffer_memory_used_bytes An estimate of the memory that the Java virtual machine is using for this buffer pool
jvm_buffer_memory_used_bytes{id="mapped - 'non-volatile memory'"} 0.0
1660037708660 jvm_buffer_memory_used_bytes{id="mapped"} 0.0
1660037708660 jvm_buffer_memory_used_bytes{id="direct"} 16384.0
1660037708660
# TYPE jvm_buffer_count_buffers gauge
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool jvm_buffer_count_buffers{id="mapped - 'non-volatile
memory'"} 0.0 1660037708660 jvm_buffer_count_buffers{id="mapped"} 0.0
1660037708660 jvm_buffer_count_buffers{id="direct"} 2.0 1660037708660
# TYPE process_cpu_usage gauge
# HELP process_cpu_usage The "recent cpu usage" for the Java Virtual Machine process process_cpu_usage 0.3913746438746439 1660037708660
# TYPE jvm_classes_unloaded_classes_total counter
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
jvm_classes_unloaded_classes_total 1.0 1660037708660
# TYPE process_files_max_files gauge
# HELP process_files_max_files The maximum file descriptor count process_files_max_files 10240.0 1660037708660
# TYPE process_files_open_files gauge
# HELP process_files_open_files The open file descriptor count process_files_open_files 19.0 1660037708660
# TYPE jvm_threads_peak_threads gauge
# HELP jvm_threads_peak_threads The peak live thread count since the Java virtual machine started or peak was reset
jvm_threads_peak_threads 12.0 1660037708660
# TYPE process_start_time_seconds gauge
# HELP process_start_time_seconds Start time of the process since unix epoch. process_start_time_seconds 1.66003769719E9 1660037708660
# TYPE jvm_memory_max_bytes gauge
# HELP jvm_memory_max_bytes The maximum amount of memory in bytes that can be used for memory management
jvm_memory_max_bytes{area="nonheap",id="Compressed Class Space"}
1.073741824E9 1660037708660 jvm_memory_max_bytes{area="nonheap",id="Metaspace"} -1.0 1660037708660
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'"}
1.22908672E8 1660037708660 jvm_memory_max_bytes{area="heap",id="G1 Survivor Space"} -1.0 1660037708660
jvm_memory_max_bytes{area="heap",id="G1 Eden Space"} -1.0
1660037708660 jvm_memory_max_bytes{area="nonheap",id="CodeHeap
'non-nmethods'"} 5840896.0 1660037708660
jvm_memory_max_bytes{area="heap",id="G1 Old Gen"} 4.294967296E9
1660037708660 jvm_memory_max_bytes{area="nonheap",id="CodeHeap
'non-profiled nmethods'"} 1.22908672E8 1660037708660
# TYPE jvm_gc_pause_seconds histogram
# HELP jvm_gc_pause_seconds Time spent in GC pause jvm_gc_pause_seconds_count{action="end of minor GC",cause="G1
Evacuation Pause"} 1.0 1660037708660
jvm_gc_pause_seconds_sum{action="end of minor GC",cause="G1 Evacuation
Pause"} 0.003 1660037708660 jvm_gc_pause_seconds_bucket{action="end of
minor GC",cause="G1 Evacuation Pause",le="5.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="10.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="25.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="50.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="75.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="100.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="250.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="500.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="750.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="1000.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="2500.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="5000.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="7500.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="10000.0"} 1.0 1660037708660
jvm_gc_pause_seconds_bucket{action="end of minor GC",cause="G1
Evacuation Pause",le="+Inf"} 1.0 1660037708660
# TYPE jvm_classes_loaded_classes gauge
# HELP jvm_classes_loaded_classes The number of classes that are currently loaded in the Java virtual machine
jvm_classes_loaded_classes 9409.0 1660037708660
# TYPE jvm_memory_used_bytes gauge
# HELP jvm_memory_used_bytes The amount of used memory jvm_memory_used_bytes{area="nonheap",id="Compressed Class Space"}
5853584.0 1660037708660 jvm_memory_used_bytes{area="nonheap",id="Metaspace"} 4.315976E7
1660037708660 jvm_memory_used_bytes{area="nonheap",id="CodeHeap
'profiled nmethods'"} 1.2106752E7 1660037708660
jvm_memory_used_bytes{area="heap",id="G1 Survivor Space"} 5671744.0
1660037708660 jvm_memory_used_bytes{area="heap",id="G1 Eden Space"}
8388608.0 1660037708660 jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-nmethods'"}
1277696.0 1660037708660 jvm_memory_used_bytes{area="heap",id="G1 Old Gen"} 2.0279808E7 1660037708660
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-profiled
nmethods'"} 4001920.0 1660037708660
# TYPE disk_total_bytes gauge
# HELP disk_total_bytes Total space for path disk_total_bytes{path="/Users/anuragk/workspace/ts/demo/otel/java/hello/."}
4.94384795648E11 1660037708660
# TYPE jvm_buffer_total_capacity_bytes gauge
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
jvm_buffer_total_capacity_bytes{id="mapped - 'non-volatile memory'"}
0.0 1660037708660 jvm_buffer_total_capacity_bytes{id="mapped"} 0.0 1660037708660 jvm_buffer_total_capacity_bytes{id="direct"} 16384.0
1660037708660
Received metrics
# HELP otelcol_exporter_enqueue_failed_log_records Number of log records failed to be added to the sending queue.
# TYPE otelcol_exporter_enqueue_failed_log_records counter otelcol_exporter_enqueue_failed_log_records{exporter="prometheus",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
0
# HELP otelcol_exporter_enqueue_failed_metric_points Number of metric points failed to be added to the sending queue.
# TYPE otelcol_exporter_enqueue_failed_metric_points counter otelcol_exporter_enqueue_failed_metric_points{exporter="prometheus",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
0
# HELP otelcol_exporter_enqueue_failed_spans Number of spans failed to be added to the sending queue.
# TYPE otelcol_exporter_enqueue_failed_spans counter otelcol_exporter_enqueue_failed_spans{exporter="prometheus",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
0
# HELP otelcol_exporter_sent_metric_points Number of metric points successfully sent to destination.
# TYPE otelcol_exporter_sent_metric_points counter otelcol_exporter_sent_metric_points{exporter="prometheus",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
5576
# HELP otelcol_process_cpu_seconds Total CPU user and system time in seconds
# TYPE otelcol_process_cpu_seconds counter otelcol_process_cpu_seconds{service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
2.17
# HELP otelcol_process_memory_rss Total physical memory (resident set size)
# TYPE otelcol_process_memory_rss gauge otelcol_process_memory_rss{service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
6.5077248e+07
# HELP otelcol_process_runtime_heap_alloc_bytes Bytes of allocated heap objects (see 'go doc runtime.MemStats.HeapAlloc')
# TYPE otelcol_process_runtime_heap_alloc_bytes gauge otelcol_process_runtime_heap_alloc_bytes{service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
9.784448e+06
# HELP otelcol_process_runtime_total_alloc_bytes Cumulative bytes allocated for heap objects (see 'go doc runtime.MemStats.TotalAlloc')
# TYPE otelcol_process_runtime_total_alloc_bytes counter otelcol_process_runtime_total_alloc_bytes{service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
3.9931936e+07
# HELP otelcol_process_runtime_total_sys_memory_bytes Total bytes of memory obtained from the OS (see 'go doc runtime.MemStats.Sys')
# TYPE otelcol_process_runtime_total_sys_memory_bytes gauge otelcol_process_runtime_total_sys_memory_bytes{service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
2.944308e+07
# HELP otelcol_process_uptime Uptime of the process
# TYPE otelcol_process_uptime counter otelcol_process_uptime{service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
388.883801875
# HELP otelcol_processor_batch_batch_send_size Number of units in the batch
# TYPE otelcol_processor_batch_batch_send_size histogram otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="10"}
26
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="25"}
28
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="50"}
32
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="75"}
32
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="100"}
32
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="250"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="500"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="750"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="1000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="2000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="3000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="4000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="5000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="6000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="7000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="8000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="9000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="10000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="20000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="30000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="50000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="100000"}
75
otelcol_processor_batch_batch_send_size_bucket{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",le="+Inf"}
75
otelcol_processor_batch_batch_send_size_sum{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
5576.000000000002 otelcol_processor_batch_batch_send_size_count{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
75
# HELP otelcol_processor_batch_timeout_trigger_send Number of times the batch was sent due to a timeout trigger
# TYPE otelcol_processor_batch_timeout_trigger_send counter otelcol_processor_batch_timeout_trigger_send{processor="batch",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0"}
75
# HELP otelcol_receiver_accepted_metric_points Number of metric points successfully pushed into the pipeline.
# TYPE otelcol_receiver_accepted_metric_points counter otelcol_receiver_accepted_metric_points{receiver="prometheus",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",transport="http"}
5576
# HELP otelcol_receiver_refused_metric_points Number of metric points that could not be pushed into the pipeline.
# TYPE otelcol_receiver_refused_metric_points counter otelcol_receiver_refused_metric_points{receiver="prometheus",service_instance_id="57f37a9a-5825-48b3-a8a2-240ee4c52656",service_version="0.56.0",transport="http"}
0

I also added io.opentelemetry.instrumentation:opentelemetry-runtime-metrics as a runtime library in my application. Then I get some metrics like:
process_runtime_jvm_memory_committed{otel_scope_name="io.opentelemetry.runtime-metrics",otel_scope_version="1.22.1-alpha",pool="CodeHeap 'non-nmethods'",type="non_heap"} 2555904.0 1675781958779
...
process_runtime_jvm_memory_init{otel_scope_name="io.opentelemetry.runtime-metrics",otel_scope_version="1.22.1-alpha",pool="CodeHeap 'non-nmethods'",type="non_heap"} 2555904.0 1675781958779
...
process_runtime_jvm_memory_limit{otel_scope_name="io.opentelemetry.runtime-metrics",otel_scope_version="1.22.1-alpha",pool="CodeHeap 'non-nmethods'",type="non_heap"} 5836800.0 1675781958779
...
process_runtime_jvm_system_cpu_load_1m{otel_scope_name="io.opentelemetry.runtime-metrics",otel_scope_version="1.22.1-alpha"} 0.63 1675781958779
process_runtime_jvm_system_cpu_utilization{otel_scope_name="io.opentelemetry.runtime-metrics",otel_scope_version="1.22.1-alpha"} 0.009925090776389015 1675781958779
process_runtime_jvm_threads_count{otel_scope_name="io.opentelemetry.runtime-metrics",otel_scope_version="1.22.1-alpha",daemon="false"} 1.0 1675781958779
process_runtime_jvm_threads_count{otel_scope_name="io.opentelemetry.runtime-metrics",otel_scope_version="1.22.1-alpha",daemon="true"} 21.0 1675781958779
Metrics like jvm_memory_used_bytes are still missing, though.

Related

Prometheus - connect: no route to host, even though the service is up

When I access the Prometheus targets, one of them shows state "down", even though I can access it in my browser (). I am probably missing something. I'd like to know what it is, because I have searched for some hours and I can't find the reason. I even disabled the Windows firewall for a while, but it doesn't work.
I am using Docker and Spring. The REST services I created are working fine. (myIp is the localhost in the code and in the image.)
My Docker setup is configured as shown below.
docker-compose.yml
# Use root/example as user/password credentials
version: '3.8'
services:
  mysql:
    image: mysql:8.0.22
    container_name: mysql
    ports:
      - 3306:3306
    environment:
      MYSQL_USER: atendente
      MYSQL_PASSWORD: atendente
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: plano_saude_db
  jaeger:
    image: jaegertracing/all-in-one:1.21
    container_name: jaeger
    ports:
      - 5775:5775/udp
      - 6831:6831/udp
      - 5778:5778
      - 16686:16686
      - 14268:14268
      - 14250:14250
      - 9411:9411
  prometheus:
    image: prom/prometheus:v2.24.1
    container_name: prometheus
    ports:
      - 9090:9090
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
prometheus.yml:
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'planosaude-sys'
    scrape_interval: 5s
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['<myIp>:8080']
Accessing http://myIp:8080/actuator/prometheus I get the following:
# HELP jdbc_connections_idle Number of established but idle connections.
# TYPE jdbc_connections_idle gauge jdbc_connections_idle{name="dataSource",} 10.0
# HELP tomcat_sessions_active_max_sessions
# TYPE tomcat_sessions_active_max_sessions gauge tomcat_sessions_active_max_sessions 0.0
# HELP jvm_memory_max_bytes The maximum amount of memory in bytes that can be used for memory management
# TYPE jvm_memory_max_bytes gauge jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'profiled
nmethods'",} 1.2288E8 jvm_memory_max_bytes{area="heap",id="G1 Survivor
Space",} -1.0 jvm_memory_max_bytes{area="heap",id="G1 Old Gen",}
1.579155456E9 jvm_memory_max_bytes{area="nonheap",id="Metaspace",} -1.0 jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 5898240.0 jvm_memory_max_bytes{area="heap",id="G1
Eden Space",} -1.0 jvm_memory_max_bytes{area="nonheap",id="Compressed
Class Space",} 1.073741824E9
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-profiled
nmethods'",} 1.2288E8
# HELP jvm_threads_peak_threads The peak live thread count since the Java virtual machine started or peak was reset
# TYPE jvm_threads_peak_threads gauge jvm_threads_peak_threads 21.0
# HELP jvm_threads_states_threads The current number of threads having NEW state
# TYPE jvm_threads_states_threads gauge jvm_threads_states_threads{state="runnable",} 9.0
jvm_threads_states_threads{state="blocked",} 0.0
jvm_threads_states_threads{state="waiting",} 5.0
jvm_threads_states_threads{state="timed-waiting",} 6.0
jvm_threads_states_threads{state="new",} 0.0
jvm_threads_states_threads{state="terminated",} 0.0
# HELP jdbc_connections_max Maximum number of active connections that can be allocated at the same time.
# TYPE jdbc_connections_max gauge jdbc_connections_max{name="dataSource",} 10.0
# HELP process_uptime_seconds The uptime of the Java virtual machine
# TYPE process_uptime_seconds gauge process_uptime_seconds 2484.248
# HELP logback_events_total Number of error level events that made it to the logs
# TYPE logback_events_total counter logback_events_total{level="warn",} 1.0
logback_events_total{level="debug",} 0.0
logback_events_total{level="error",} 0.0
logback_events_total{level="trace",} 0.0
logback_events_total{level="info",} 15.0
# HELP jvm_classes_loaded_classes The number of classes that are currently loaded in the Java virtual machine
# TYPE jvm_classes_loaded_classes gauge jvm_classes_loaded_classes 12647.0
# HELP jvm_threads_live_threads The current number of live threads including both daemon and non-daemon threads
# TYPE jvm_threads_live_threads gauge jvm_threads_live_threads 20.0
# HELP hikaricp_connections_usage_seconds Connection usage time
# TYPE hikaricp_connections_usage_seconds summary hikaricp_connections_usage_seconds_count{pool="HikariPool-1",} 0.0
hikaricp_connections_usage_seconds_sum{pool="HikariPool-1",} 0.0
# HELP hikaricp_connections_usage_seconds_max Connection usage time
# TYPE hikaricp_connections_usage_seconds_max gauge hikaricp_connections_usage_seconds_max{pool="HikariPool-1",} 0.0
# HELP jvm_gc_max_data_size_bytes Max size of long-lived heap memory pool
# TYPE jvm_gc_max_data_size_bytes gauge jvm_gc_max_data_size_bytes 1.579155456E9
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge jvm_buffer_total_capacity_bytes{id="mapped - 'non-volatile memory'",}
0.0 jvm_buffer_total_capacity_bytes{id="mapped",} 0.0 jvm_buffer_total_capacity_bytes{id="direct",} 32768.0
# HELP jvm_memory_committed_bytes The amount of memory in bytes that is committed for the Java virtual machine to use
# TYPE jvm_memory_committed_bytes gauge jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'profiled
nmethods'",} 1.1337728E7 jvm_memory_committed_bytes{area="heap",id="G1
Survivor Space",} 2097152.0
jvm_memory_committed_bytes{area="heap",id="G1 Old Gen",} 3.7748736E7
jvm_memory_committed_bytes{area="nonheap",id="Metaspace",} 7.0582272E7
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap
'non-nmethods'",} 2555904.0
jvm_memory_committed_bytes{area="heap",id="G1 Eden Space",} 2.62144E7
jvm_memory_committed_bytes{area="nonheap",id="Compressed Class
Space",} 9109504.0
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-profiled
nmethods'",} 4325376.0
# HELP hikaricp_connections_pending Pending threads
# TYPE hikaricp_connections_pending gauge hikaricp_connections_pending{pool="HikariPool-1",} 0.0
# HELP tomcat_sessions_alive_max_seconds
# TYPE tomcat_sessions_alive_max_seconds gauge tomcat_sessions_alive_max_seconds 0.0
# HELP tomcat_sessions_active_current_sessions
# TYPE tomcat_sessions_active_current_sessions gauge tomcat_sessions_active_current_sessions 0.0
# HELP jdbc_connections_min Minimum number of idle connections in the pool.
# TYPE jdbc_connections_min gauge jdbc_connections_min{name="dataSource",} 10.0
# HELP jdbc_connections_active Current number of active connections that have been allocated from the data source.
# TYPE jdbc_connections_active gauge jdbc_connections_active{name="dataSource",} 0.0
# HELP jvm_threads_daemon_threads The current number of live daemon threads
# TYPE jvm_threads_daemon_threads gauge jvm_threads_daemon_threads 16.0
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'profiled
nmethods'",} 1.1297792E7 jvm_memory_used_bytes{area="heap",id="G1
Survivor Space",} 1651760.0 jvm_memory_used_bytes{area="heap",id="G1
Old Gen",} 2.7873792E7
jvm_memory_used_bytes{area="nonheap",id="Metaspace",} 7.0095672E7
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",}
1357312.0 jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 1.2582912E7 jvm_memory_used_bytes{area="nonheap",id="Compressed Class Space",} 8883800.0 jvm_memory_used_bytes{area="nonheap",id="CodeHeap
'non-profiled nmethods'",} 4293632.0
# HELP tomcat_sessions_rejected_sessions_total
# TYPE tomcat_sessions_rejected_sessions_total counter tomcat_sessions_rejected_sessions_total 0.0
# HELP process_cpu_usage The "recent cpu usage" for the Java Virtual Machine process
# TYPE process_cpu_usage gauge process_cpu_usage 0.0013668563296602018
# HELP hikaricp_connections_active Active connections
# TYPE hikaricp_connections_active gauge hikaricp_connections_active{pool="HikariPool-1",} 0.0
# HELP tomcat_sessions_expired_sessions_total
# TYPE tomcat_sessions_expired_sessions_total counter tomcat_sessions_expired_sessions_total 0.0
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge jvm_buffer_count_buffers{id="mapped - 'non-volatile memory'",} 0.0
jvm_buffer_count_buffers{id="mapped",} 0.0
jvm_buffer_count_buffers{id="direct",} 4.0
# HELP hikaricp_connections_creation_seconds_max Connection creation time
# TYPE hikaricp_connections_creation_seconds_max gauge hikaricp_connections_creation_seconds_max{pool="HikariPool-1",} 0.0
# HELP hikaricp_connections_creation_seconds Connection creation time
# TYPE hikaricp_connections_creation_seconds summary hikaricp_connections_creation_seconds_count{pool="HikariPool-1",} 10.0
hikaricp_connections_creation_seconds_sum{pool="HikariPool-1",} 0.943
# HELP jvm_gc_memory_promoted_bytes_total Count of positive increases in the size of the old generation memory pool before GC to after GC
# TYPE jvm_gc_memory_promoted_bytes_total counter jvm_gc_memory_promoted_bytes_total 1.356288E7
# HELP system_cpu_usage The "recent cpu usage" for the whole system
# TYPE system_cpu_usage gauge system_cpu_usage 0.07962681468188304
# HELP http_server_requests_seconds
# TYPE http_server_requests_seconds summary http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",}
1.0 http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",}
2.7254424
# HELP http_server_requests_seconds_max
# TYPE http_server_requests_seconds_max gauge http_server_requests_seconds_max{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",}
0.0
# HELP hikaricp_connections_timeout_total Connection timeout total count
# TYPE hikaricp_connections_timeout_total counter hikaricp_connections_timeout_total{pool="HikariPool-1",} 0.0
# HELP hikaricp_connections_max Max connections
# TYPE hikaricp_connections_max gauge hikaricp_connections_max{pool="HikariPool-1",} 10.0
# HELP jvm_gc_live_data_size_bytes Size of long-lived heap memory pool after reclamation
# TYPE jvm_gc_live_data_size_bytes gauge jvm_gc_live_data_size_bytes 0.0
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
# TYPE jvm_classes_unloaded_classes_total counter jvm_classes_unloaded_classes_total 0.0
# HELP hikaricp_connections_idle Idle connections
# TYPE hikaricp_connections_idle gauge hikaricp_connections_idle{pool="HikariPool-1",} 10.0
# HELP hikaricp_connections Total connections
# TYPE hikaricp_connections gauge hikaricp_connections{pool="HikariPool-1",} 10.0
# HELP jvm_gc_pause_seconds Time spent in GC pause
# TYPE jvm_gc_pause_seconds summary jvm_gc_pause_seconds_count{action="end of minor GC",cause="Metadata GC
Threshold",} 1.0 jvm_gc_pause_seconds_sum{action="end of minor
GC",cause="Metadata GC Threshold",} 0.004
jvm_gc_pause_seconds_count{action="end of minor GC",cause="G1
Evacuation Pause",} 9.0 jvm_gc_pause_seconds_sum{action="end of minor
GC",cause="G1 Evacuation Pause",} 0.038
# HELP jvm_gc_pause_seconds_max Time spent in GC pause
# TYPE jvm_gc_pause_seconds_max gauge jvm_gc_pause_seconds_max{action="end of minor GC",cause="Metadata GC
Threshold",} 0.0 jvm_gc_pause_seconds_max{action="end of minor
GC",cause="G1 Evacuation Pause",} 0.0
# HELP jvm_gc_memory_allocated_bytes_total Incremented for an increase in the size of the (young) heap memory pool after one GC to before the
next
# TYPE jvm_gc_memory_allocated_bytes_total counter jvm_gc_memory_allocated_bytes_total 2.42221056E8
# HELP hikaricp_connections_acquire_seconds Connection acquire time
# TYPE hikaricp_connections_acquire_seconds summary hikaricp_connections_acquire_seconds_count{pool="HikariPool-1",} 0.0
hikaricp_connections_acquire_seconds_sum{pool="HikariPool-1",} 0.0
# HELP hikaricp_connections_acquire_seconds_max Connection acquire time
# TYPE hikaricp_connections_acquire_seconds_max gauge hikaricp_connections_acquire_seconds_max{pool="HikariPool-1",} 0.0
# HELP process_start_time_seconds Start time of the process since unix epoch.
# TYPE process_start_time_seconds gauge process_start_time_seconds 1.631847251003E9
# HELP tomcat_sessions_created_sessions_total
# TYPE tomcat_sessions_created_sessions_total counter tomcat_sessions_created_sessions_total 0.0
# HELP jvm_buffer_memory_used_bytes An estimate of the memory that the Java virtual machine is using for this buffer pool
# TYPE jvm_buffer_memory_used_bytes gauge jvm_buffer_memory_used_bytes{id="mapped - 'non-volatile memory'",} 0.0
jvm_buffer_memory_used_bytes{id="mapped",} 0.0
jvm_buffer_memory_used_bytes{id="direct",} 32768.0
# HELP system_cpu_count The number of processors available to the Java virtual machine
# TYPE system_cpu_count gauge system_cpu_count 8.0
# HELP hikaricp_connections_min Min connections
# TYPE hikaricp_connections_min gauge hikaricp_connections_min{pool="HikariPool-1",} 10.0
pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.5.4</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<groupId>com.planosaude</groupId>
<artifactId>planosaude-sys</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>planosaude-sys</name>
<description>Demo project for Spring Boot</description>
<properties>
<java.version>11</java.version>
</properties>
<dependencies>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<scope>runtime</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
If I use docker.for.win.localhost, I get the error:
dial tcp: lookup docker.for.win.localhost on 127.0.0.11:53: no such host
I am trying to solve this new error now.
Solved
It works if I use host.docker.internal instead.
In the <myIp> field you should use mysql.
As stated in Networking in Compose:
"Each container for a service joins the default network and is both reachable by other containers on that network, and discoverable by them at a hostname identical to the container name."
If you use localhost, that refers to the container itself; inside the Prometheus container only Prometheus is running, not the MySQL instance.
The 'prometheus' job works because Prometheus itself is running on localhost.
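Applied to the prometheus.yml above, a possible corrected scrape target, based on the asker's note that host.docker.internal worked (the Spring app runs on the host rather than as a Compose service):

scrape_configs:
  - job_name: 'planosaude-sys'
    scrape_interval: 5s
    metrics_path: '/actuator/prometheus'
    static_configs:
      # host.docker.internal resolves to the host from inside a container on
      # Docker Desktop; use the Compose service name instead if the app itself
      # runs as a service on the same Docker network.
      - targets: ['host.docker.internal:8080']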

Elasticsearch indexing of a doc with file size 42 MB consumed over 2 GB of heap memory

I'm trying to index a 42 MB doc, and I saw that over 2 GB of memory was used on the heap.
>> ls -l data.json
-rw-r--r--# 1 user Network\Domain Users 42798016 Oct 29 11:03 data.json
>> curl -XPOST http://localhost:9200/my-index/my-type -d data.json
curl: (52) Empty reply from server
Below is the error from Elasticsearch:
[2018-10-29T12:02:42,583][WARN ][o.e.m.j.JvmGcMonitorService] [Cfr4bRu] [gc][1665] overhead, spent [3.3s] collecting in the last [3.6s]
[2018-10-29T12:02:46,386][WARN ][o.e.m.j.JvmGcMonitorService] [Cfr4bRu] [gc][1666] overhead, spent [3.2s] collecting in the last [3.8s]
[2018-10-29T12:02:50,038][WARN ][o.e.m.j.JvmGcMonitorService] [Cfr4bRu] [gc][1667] overhead, spent [3.2s] collecting in the last [3.6s]
[2018-10-29T12:02:53,110][WARN ][o.e.m.j.JvmGcMonitorService] [Cfr4bRu] [gc][1668] overhead, spent [3s] collecting in the last [3s]
[2018-10-29T12:02:56,683][WARN ][o.e.m.j.JvmGcMonitorService] [Cfr4bRu] [gc][1669] overhead, spent [3.5s] collecting in the last [3.5s]
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid93445.hprof ...
Heap dump file created [2900298449 bytes in 37.716 secs]
How should I fix this?

SonarQube upgrade to 6.7.1 LTS: Unrecoverable Indexation Failures

I have successfully upgraded SonarQube to version 6.5, including the database upgrade, and I am currently trying to upgrade SonarQube to version 6.7.1 LTS. The new SonarQube version is being installed on a 64-bit Linux system and is connected to a Microsoft SQL Server 2014 database. Every time I try to launch the 6.7.1 version of SonarQube, it fails with the error "Background initialization failed". If I run the new SonarQube against an empty Microsoft SQL database, it starts up fine with no issues. The "Background initialization failed" issue only occurs when I connect the new SonarQube to the upgraded database. I have tried adding memory to the Elasticsearch heap and reducing the number of issues being processed. Any help to resolve this issue would be greatly appreciated.
Web log:
web[][o.s.p.ProcessEntryPoint] Starting web
web[][o.a.t.u.n.NioSelectorPool] Using a shared selector for servlet write/read
web[][o.e.p.PluginsService] no modules loaded
web[][o.e.p.PluginsService] loaded plugin [org.elasticsearch.index.reindex.ReindexPlugin]
web[][o.e.p.PluginsService] loaded plugin [org.elasticsearch.join.ParentJoinPlugin]
web[][o.e.p.PluginsService] loaded plugin [org.elasticsearch.percolator.PercolatorPlugin]
web[][o.e.p.PluginsService] loaded plugin [org.elasticsearch.transport.Netty4Plugin]
web[][i.n.c.MultithreadEventLoopGroup] -Dio.netty.eventLoopThreads: 64
web[][i.n.u.i.PlatformDependent0] -Dio.netty.noUnsafe: false
web[][i.n.u.i.PlatformDependent0] Java version: 8
web[][i.n.u.i.PlatformDependent0] sun.misc.Unsafe.theUnsafe: available
web[][i.n.u.i.PlatformDependent0] sun.misc.Unsafe.copyMemory: available
web[][i.n.u.i.PlatformDependent0] java.nio.Buffer.address: available
web[][i.n.u.i.PlatformDependent0] direct buffer constructor: available
web[][i.n.u.i.PlatformDependent0] java.nio.Bits.unaligned: available, true
web[][i.n.u.i.PlatformDependent0] jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable prior to Java9
web[][i.n.u.i.PlatformDependent0] java.nio.DirectByteBuffer.<init>(long, int): available
web[][i.n.u.i.PlatformDependent] sun.misc.Unsafe: available
web[][i.n.u.i.PlatformDependent] -Dio.netty.tmpdir: /../../sonarqube-6.7.1/temp (java.io.tmpdir)
web[][i.n.u.i.PlatformDependent] -Dio.netty.bitMode: 64 (sun.arch.data.model)
web[][i.n.u.i.PlatformDependent] -Dio.netty.noPreferDirect: false
web[][i.n.u.i.PlatformDependent] -Dio.netty.maxDirectMemory: 4772593664 bytes
web[][i.n.u.i.PlatformDependent] -Dio.netty.uninitializedArrayAllocationThreshold: -1
web[][i.n.u.i.CleanerJava6] java.nio.ByteBuffer.cleaner(): available
web[][i.n.c.n.NioEventLoop] -Dio.netty.noKeySetOptimization: false
web[][i.n.c.n.NioEventLoop] -Dio.netty.selectorAutoRebuildThreshold: 512
web[][i.n.u.i.PlatformDependent] org.jctools-core.MpscChunkedArrayQueue: available
web[][i.n.c.DefaultChannelId] -Dio.netty.processId: ***** (auto-detected)
web[][i.netty.util.NetUtil] -Djava.net.preferIPv4Stack: true
web[][i.netty.util.NetUtil] -Djava.net.preferIPv6Addresses: false
web[][i.netty.util.NetUtil] Loopback interface: lo (lo, 127.0.0.1)
web[][i.netty.util.NetUtil] /proc/sys/net/core/somaxconn: 128
web[][i.n.c.DefaultChannelId] -Dio.netty.machineId: ***** (auto-detected)
web[][i.n.u.ResourceLeakDetector] -Dio.netty.leakDetection.level: simple
web[][i.n.u.ResourceLeakDetector] -Dio.netty.leakDetection.maxRecords: 4
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.numHeapArenas: 47
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.numDirectArenas: 47
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.pageSize: 8192
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.maxOrder: 11
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.chunkSize: 16777216
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.tinyCacheSize: 512
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.smallCacheSize: 256
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.normalCacheSize: 64
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.maxCachedBufferCapacity: 32768
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.cacheTrimInterval: 8192
web[][i.n.b.PooledByteBufAllocator] -Dio.netty.allocator.useCacheForAllThreads: true
web[][i.n.b.ByteBufUtil] -Dio.netty.allocator.type: pooled
web[][i.n.b.ByteBufUtil] -Dio.netty.threadLocalDirectBufferSize: 65536
web[][i.n.b.ByteBufUtil] -Dio.netty.maxThreadLocalCharBufferSize: 16384
web[][i.n.b.AbstractByteBuf] -Dio.netty.buffer.bytebuf.checkAccessible: true
web[][i.n.u.ResourceLeakDetectorFactory] Loaded default ResourceLeakDetector: io.netty.util.ResourceLeakDetector#6c6be5c2
web[][i.n.util.Recycler] -Dio.netty.recycler.maxCapacityPerThread: 32768
web[][i.n.util.Recycler] -Dio.netty.recycler.maxSharedCapacityFactor: 2
web[][i.n.util.Recycler] -Dio.netty.recycler.linkCapacity: 16
web[][i.n.util.Recycler] -Dio.netty.recycler.ratio: 8
web[][o.s.s.e.EsClientProvider] Connected to local Elasticsearch: [127.0.0.1:*****]
web[][o.s.s.p.LogServerVersion] SonarQube Server / 6.7.1.35068 / 426519346f51f7b980a76f9050f983110550509d
web[][o.sonar.db.Database] Create JDBC data source for jdbc:sqlserver:*****
web[][o.s.s.p.ServerFileSystemImpl] SonarQube home: /../../sonarqube-6.7.1
web[][o.s.s.u.SystemPasscodeImpl] System authentication by passcode is disabled
web[][o.s.c.i.DefaultI18n] Loaded 2094 properties from l10n bundles
web[][o.s.s.p.d.m.c.MssqlCharsetHandler] Verify that database collation is case-sensitive and accent-sensitive
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.ws.WebServiceFilter#7e977d45 [pattern=UrlPattern{inclusions=[/api/system/migrate_db/*, ...], exclusions=[/api/properties*, ...]}]
web[][o.s.s.a.TomcatAccessLog] Tomcat is started
web[][o.s.s.a.EmbeddedTomcat] HTTP connector enabled on port ****
web[][o.s.s.p.UpdateCenterClient] Update center:https://update.sonarsource.org/update-center.properties (no proxy)
web[][o.s.a.r.Languages] No language available
web[][o.s.s.e.RecoveryIndexer] Elasticsearch recovery - sonar.search.recovery.minAgeInMs=300000
web[][o.s.s.e.RecoveryIndexer] Elasticsearch recovery - sonar.search.recovery.loopLimit=10000
web[][o.s.s.s.LogServerId] Server ID: *****
web[][o.s.s.e.RecoveryIndexer] Elasticsearch recovery - sonar.search.recovery.delayInMs=300000
web[][o.s.s.e.RecoveryIndexer] Elasticsearch recovery - sonar.search.recovery.initialDelayInMs=26327
web[][o.s.s.t.TelemetryDaemon] Sharing of SonarQube statistics is enabled.
web[][o.s.s.n.NotificationDaemon] Notification service started (delay 60 sec.)
web[][o.s.s.s.GeneratePluginIndex] Generate scanner plugin index
web[][o.s.s.s.GeneratePluginIndex] Generate scanner plugin index (done) | time=1ms
web[][o.s.s.s.RegisterPlugins] Register plugins
web[][o.s.s.s.RegisterPlugins] Register plugins (done) | time=167ms
web[][o.s.s.s.RegisterMetrics] Register metrics
web[][o.s.s.s.RegisterMetrics] Register metrics (done) | time=2734ms
web[][o.s.s.r.RegisterRules] Register rules
web[][o.s.s.r.RegisterRules] Register rules (done) | time=685ms
web[][o.s.s.q.BuiltInQProfileRepositoryImpl] Load quality profiles
web[][o.s.s.q.BuiltInQProfileRepositoryImpl] Load quality profiles (done) | time=2ms
web[][o.s.s.s.RegisterPermissionTemplates] Register permission templates
web[][o.s.s.s.RegisterPermissionTemplates] Register permission templates (done) | time=153ms
web[][o.s.s.s.RenameDeprecatedPropertyKeys] Rename deprecated property keys
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.ws.WebServiceFilter#3a6e54b [pattern=UrlPattern{inclusions=[/api/measures/component/*, ...], exclusions=[/api/properties*, ...]}]
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.ws.DeprecatedPropertiesWsFilter#3b2c45f3 [pattern=UrlPattern{inclusions=[/api/properties/*], exclusions=[]}]
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.ws.WebServiceReroutingFilter#42ffe60e [pattern=UrlPattern{inclusions=[/api/components/bulk_update_key, ...], exclusions=[]}]
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.authentication.InitFilter#3bc1cd0f [pattern=UrlPattern{inclusions=[/sessions/init/*], exclusions=[]}]
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.authentication.OAuth2CallbackFilter#533fe992 [pattern=UrlPattern{inclusions=[/oauth2/callback/*], exclusions=[]}]
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.authentication.ws.LoginAction#54370dcd [pattern=UrlPattern{inclusions=[/api/authentication/login], exclusions=[]}]
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.authentication.ws.LogoutAction#7bc801b4 [pattern=UrlPattern{inclusions=[/api/authentication/logout], exclusions=[]}]
web[][o.s.s.p.w.MasterServletFilter] Initializing servlet filter org.sonar.server.authentication.ws.ValidateAction#2e0576fc [pattern=UrlPattern{inclusions=[/api/authentication/validate], exclusions=[]}]
web[][o.s.s.e.IndexerStartupTask] Indexing of type [issues/issue] ...
web[][o.s.s.es.BulkIndexer] 1387134 requests processed (23118 items/sec)
web[][o.s.s.es.BulkIndexer] 2715226 requests processed (22134 items/sec)
web[][o.s.s.es.BulkIndexer] 3944404 requests processed (20486 items/sec)
web[][o.s.s.es.BulkIndexer] 5319447 requests processed (22917 items/sec)
web[][o.s.s.es.BulkIndexer] 6871423 requests processed (25866 items/sec)
web[][o.s.s.es.BulkIndexer] 7814247 requests processed (15713 items/sec)
web[][o.s.s.es.BulkIndexer] 7814247 requests processed (0 items/sec)
web[][o.s.s.es.BulkIndexer] 7814247 requests processed (0 items/sec)
web[][o.s.s.p.Platform] Background initialization failed. Stopping SonarQube
java.lang.IllegalStateException: Unrecoverable indexation failures
at org.sonar.server.es.IndexingListener$1.onFinish(IndexingListener.java:39)
at org.sonar.server.es.BulkIndexer.stop(BulkIndexer.java:117)
at org.sonar.server.issue.index.IssueIndexer.doIndex(IssueIndexer.java:247)
at org.sonar.server.issue.index.IssueIndexer.indexOnStartup(IssueIndexer.java:95)
at org.sonar.server.es.IndexerStartupTask.indexUninitializedTypes(IndexerStartupTask.java:68)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at org.sonar.server.es.IndexerStartupTask.execute(IndexerStartupTask.java:55)
at java.util.Optional.ifPresent(Optional.java:159)
at org.sonar.server.platform.platformlevel.PlatformLevelStartup$1.doPrivileged(PlatformLevelStartup.java:84)
at org.sonar.server.user.DoPrivileged.execute(DoPrivileged.java:45)
at org.sonar.server.platform.platformlevel.PlatformLevelStartup.start(PlatformLevelStartup.java:80)
at org.sonar.server.platform.Platform.executeStartupTasks(Platform.java:196)
at org.sonar.server.platform.Platform.access$400(Platform.java:46)
at org.sonar.server.platform.Platform$1.lambda$doRun$1(Platform.java:121)
at org.sonar.server.platform.Platform$AutoStarterRunnable.runIfNotAborted(Platform.java:371)
at org.sonar.server.platform.Platform$1.doRun(Platform.java:121)
at org.sonar.server.platform.Platform$AutoStarterRunnable.run(Platform.java:355)
at java.lang.Thread.run(Thread.java:748)
web[][o.s.s.p.Platform] Background initialization of SonarQube done
web[][o.s.p.StopWatcher] Stopping process
===========================================================================
Edit: I had already referenced the linked post prior to my initial post. That post mentioned "free space", which I assumed to mean disk space; here are the disk space values where SonarQube 6.7.1 is installed:
1K-blocks      Used  Available  Use%  Mounted on
251531268  16204576  235326692    7%  /prod/appl
Also, here is a portion of my Elasticsearch log from where the error in web.log occurs. SonarQube 6.7.1 uses Elasticsearch 5.
Elasticsearch log:
es[][o.e.i.IndexingMemoryController] write indexing buffer to disk for shard [[issues][0]] to free up its [29.8mb] indexing buffer
es[][o.e.i.s.IndexShard] add [29.8mb] writing bytes for shard [[issues][0]]
es[][o.e.i.e.Engine] use refresh to write indexing buffer (heap size=[23.5mb]), to also clear version map (heap size=[6.3mb])
es[][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] global-ordinals [_parent#authorization][49] took [462.3micros]
es[][o.e.i.s.IndexShard] remove [29.8mb] writing bytes for shard [[issues][0]]
es[][o.e.i.IndexingMemoryController] now write some indexing buffers: total indexing heap bytes used [104.3mb] vs indices.memory.index_buffer_size [98.9mb], currently writing bytes [0b], [5] shards with non-zero indexing buffer
es[][o.e.i.IndexingMemoryController] write indexing buffer to disk for shard [[issues][1]] to free up its [54.8mb] indexing buffer
es[][o.e.i.s.IndexShard] add [54.8mb] writing bytes for shard [[issues][1]]
es[][o.e.i.e.Engine] use IndexWriter.flush to write indexing buffer (heap size=[51.1mb]) since version map is small (heap size=[3.6mb])
es[][o.e.i.s.IndexShard] remove [54.8mb] writing bytes for shard [[issues][1]]
es[][o.e.i.IndexingMemoryController] now write some indexing buffers: total indexing heap bytes used [104.2mb] vs indices.memory.index_buffer_size [98.9mb], currently writing bytes [0b], [5] shards with non-zero indexing buffer
es[][o.e.i.IndexingMemoryController] write indexing buffer to disk for shard [[issues][1]] to free up its [50.7mb] indexing buffer
es[][o.e.i.s.IndexShard] add [50.7mb] writing bytes for shard [[issues][1]]
es[][o.e.i.e.Engine] use IndexWriter.flush to write indexing buffer (heap size=[43.9mb]) since version map is small (heap size=[6.7mb])
es[][o.e.i.s.IndexShard] remove [50.7mb] writing bytes for shard [[issues][1]]
es[][o.e.i.IndexingMemoryController] now write some indexing buffers: total indexing heap bytes used [100.1mb] vs indices.memory.index_buffer_size [98.9mb], currently writing bytes [0b], [5] shards with non-zero indexing buffer
es[][o.e.i.IndexingMemoryController] write indexing buffer to disk for shard [[issues][1]] to free up its [31.5mb] indexing buffer
es[][o.e.i.s.IndexShard] add [31.5mb] writing bytes for shard [[issues][1]]
es[][o.e.i.e.Engine] use refresh to write indexing buffer (heap size=[23.3mb]), to also clear version map (heap size=[8.2mb])
es[][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] global-ordinals [_parent#authorization][46] took [988.8micros]
es[][o.e.i.s.IndexShard] remove [31.5mb] writing bytes for shard [[issues][1]]
es[][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] global-ordinals [_parent#authorization][46] took [880.6micros]
es[][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] global-ordinals [_parent#authorization][57] took [510.7micros]
es[][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] global-ordinals [_parent#authorization][49] took [829.3micros]
es[][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] global-ordinals [_parent#authorization][47] took [412.9micros]
es[][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] global-ordinals [_parent#authorization][43] took [277.4micros]
es[][o.e.i.e.InternalEngine$EngineMergeScheduler] merge segment [_kh] done: took [30.9s], [343.7 MB], [3,159,200 docs], [0s stopped], [1.5s throttled], [169.4 MB written], [Infinity MB/sec throttle]
es[][o.e.i.e.InternalEngine$EngineMergeScheduler] merge segment [_oc] done: took [28.9s], [290.9 MB], [2,593,116 docs], [0s stopped], [0s throttled], [232.1 MB written], [Infinity MB/sec throttle]
es[][o.e.i.e.InternalEngine$EngineMergeScheduler] merge segment [_pz] done: took [30.6s], [341.3 MB], [2,573,716 docs], [0s stopped], [0s throttled], [266.1 MB written], [Infinity MB/sec throttle]
es[][o.e.i.e.InternalEngine$EngineMergeScheduler] merge segment [_th] done: took [35.2s], [346.3 MB], [3,102,397 docs], [0s stopped], [0s throttled], [262.0 MB written], [Infinity MB/sec throttle]
es[][o.e.c.s.ClusterService] processing [update-settings]: execute
es[][o.e.i.IndicesQueryCache] using [node] query cache with size [98.9mb] max filter count [10000]
es[][o.e.i.IndicesService] creating Index [[issues/WmTjz_-ITtyPeqpDlqPeFg]], shards [5]/[0] - reason [metadata verification]
es[][o.e.i.s.IndexStore] using index.store.throttle.type [NONE], with index.store.throttle.max_bytes_per_sec [null]
es[][o.e.i.m.MapperService] using dynamic[false]
es[][o.e.i.c.b.BitsetFilterCache] clearing all bitsets because [close]
es[][o.e.i.c.q.IndexQueryCache] full cache clear, reason [close]
es[][o.e.i.c.b.BitsetFilterCache] clearing all bitsets because [close]
es[][o.e.c.s.ClusterService] cluster state updated, version [17], source [update-settings]
es[][o.e.c.s.ClusterService] publishing cluster state version [17]
es[][o.e.c.s.ClusterService] applying cluster state version 17
es[][o.e.c.s.ClusterService] set local cluster state to version 17
es[][o.e.c.s.ClusterService] processing [update-settings]: took [19ms] done applying updated cluster_state (version: 17, uuid: dkhQacKBQGS5YsyMqp1kmQ)
es[][o.e.n.Node] stopping ...

Spark + Parquet + S3n : Seems to read parquet file many times

I have the Parquet files laid out in Hive-style partitions on an S3n bucket. No separate metadata files are created; the Parquet footers are stored in the files themselves.
When I run a sample Spark job in local mode (v1.6.0) that reads a single 5.2 MB file:
import org.apache.hadoop.fs.Path
import org.apache.spark.{SparkConf, SparkContext}

val filePath = "s3n://bucket/trackingPackage/dpYear=2016/dpMonth=5/dpDay=10/part-r-00004-1c86d6b0-4f6f-4770-a930-c42d77e3c729-1462833064172.gz.parquet"
val path: Path = new Path(filePath)
val conf = new SparkConf().setMaster("local[2]").set("spark.app.name", "parquet-reader-s3n").set("spark.eventLog.enabled", "true")
val sc = new SparkContext(conf)
val sqlc = new org.apache.spark.sql.SQLContext(sc)
val df = sqlc.read.parquet(filePath).select("referenceCode")
Thread.sleep(1000 * 10) // intentionally added, to separate the reads in the log
println(df.schema)
val output = df.collect
The log generated is:
..
[22:21:56.505][main][INFO][BlockManagerMaster:58] Registered BlockManager
[22:21:56.909][main][INFO][EventLoggingListener:58] Logging events to file:/tmp/spark-events/local-1463676716372
[22:21:57.307][main][INFO][ParquetRelation:58] Listing s3n://bucket//trackingPackage/dpYear=2016/dpMonth=5/dpDay=10/part-r-00004-1c86d6b0-4f6f-4770-a930-c42d77e3c729-1462833064172.gz.parquet on driver
[22:21:59.927][main][INFO][SparkContext:58] Starting job: parquet at InspectInputSplits.scala:30
[22:21:59.942][dag-scheduler-event-loop][INFO][DAGScheduler:58] Got job 0 (parquet at InspectInputSplits.scala:30) with 2 output partitions
[22:21:59.942][dag-scheduler-event-loop][INFO][DAGScheduler:58] Final stage: ResultStage 0 (parquet at InspectInputSplits.scala:30)
[22:21:59.943][dag-scheduler-event-loop][INFO][DAGScheduler:58] Parents of final stage: List()
[22:21:59.944][dag-scheduler-event-loop][INFO][DAGScheduler:58] Missing parents: List()
[22:21:59.954][dag-scheduler-event-loop][INFO][DAGScheduler:58] Submitting ResultStage 0 (MapPartitionsRDD[1] at parquet at InspectInputSplits.scala:30), which has no missing parents
[22:22:00.218][dag-scheduler-event-loop][INFO][MemoryStore:58] Block broadcast_0 stored as values in memory (estimated size 64.5 KB, free 64.5 KB)
[22:22:00.226][dag-scheduler-event-loop][INFO][MemoryStore:58] Block broadcast_0_piece0 stored as bytes in memory (estimated size 21.7 KB, free 86.2 KB)
[22:22:00.229][dispatcher-event-loop-0][INFO][BlockManagerInfo:58] Added broadcast_0_piece0 in memory on localhost:54419 (size: 21.7 KB, free: 1088.2 MB)
[22:22:00.231][dag-scheduler-event-loop][INFO][SparkContext:58] Created broadcast 0 from broadcast at DAGScheduler.scala:1006
[22:22:00.234][dag-scheduler-event-loop][INFO][DAGScheduler:58] Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at parquet at InspectInputSplits.scala:30)
[22:22:00.235][dag-scheduler-event-loop][INFO][TaskSchedulerImpl:58] Adding task set 0.0 with 2 tasks
[22:22:00.278][dispatcher-event-loop-1][INFO][TaskSetManager:58] Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 2076 bytes)
[22:22:00.281][dispatcher-event-loop-1][INFO][TaskSetManager:58] Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 2395 bytes)
[22:22:00.290][Executor task launch worker-0][INFO][Executor:58] Running task 0.0 in stage 0.0 (TID 0)
[22:22:00.291][Executor task launch worker-1][INFO][Executor:58] Running task 1.0 in stage 0.0 (TID 1)
[22:22:00.425][Executor task launch worker-1][INFO][ParquetFileReader:151] Initiating action with parallelism: 5
[22:22:00.447][Executor task launch worker-0][INFO][ParquetFileReader:151] Initiating action with parallelism: 5
[22:22:00.463][Executor task launch worker-0][INFO][Executor:58] Finished task 0.0 in stage 0.0 (TID 0). 936 bytes result sent to driver
[22:22:00.471][task-result-getter-0][INFO][TaskSetManager:58] Finished task 0.0 in stage 0.0 (TID 0) in 213 ms on localhost (1/2)
[22:22:00.586][pool-20-thread-1][INFO][NativeS3FileSystem:619] Opening 's3n://bucket//trackingPackage/dpYear=2016/dpMonth=5/dpDay=10/part-r-00004-1c86d6b0-4f6f-4770-a930-c42d77e3c729-1462833064172.gz.parquet' for reading
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[22:22:25.890][Executor task launch worker-1][INFO][Executor:58] Finished task 1.0 in stage 0.0 (TID 1). 4067 bytes result sent to driver
[22:22:25.898][task-result-getter-1][INFO][TaskSetManager:58] Finished task 1.0 in stage 0.0 (TID 1) in 25617 ms on localhost (2/2)
[22:22:25.898][dag-scheduler-event-loop][INFO][DAGScheduler:58] ResultStage 0 (parquet at InspectInputSplits.scala:30) finished in 25.656 s
[22:22:25.899][task-result-getter-1][INFO][TaskSchedulerImpl:58] Removed TaskSet 0.0, whose tasks have all completed, from pool
[22:22:25.905][main][INFO][DAGScheduler:58] Job 0 finished: parquet at InspectInputSplits.scala:30, took 25.977801 s
StructType(StructField(referenceCode,StringType,true))
[22:22:36.271][main][INFO][DataSourceStrategy:58] Selected 1 partitions out of 1, pruned 0.0% partitions.
[22:22:36.325][main][INFO][MemoryStore:58] Block broadcast_1 stored as values in memory (estimated size 89.3 KB, free 175.5 KB)
[22:22:36.389][main][INFO][MemoryStore:58] Block broadcast_1_piece0 stored as bytes in memory (estimated size 20.2 KB, free 195.7 KB)
[22:22:36.389][dispatcher-event-loop-0][INFO][BlockManagerInfo:58] Added broadcast_1_piece0 in memory on localhost:54419 (size: 20.2 KB, free: 1088.2 MB)
[22:22:36.391][main][INFO][SparkContext:58] Created broadcast 1 from collect at InspectInputSplits.scala:34
[22:22:36.520][main][INFO][deprecation:1174] mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
[22:22:36.522][main][INFO][ParquetRelation:58] Reading Parquet file(s) from s3n://bucket//trackingPackage/dpYear=2016/dpMonth=5/dpDay=10/part-r-00004-1c86d6b0-4f6f-4770-a930-c42d77e3c729-1462833064172.gz.parquet
[22:22:36.554][main][INFO][SparkContext:58] Starting job: collect at InspectInputSplits.scala:34
[22:22:36.556][dag-scheduler-event-loop][INFO][DAGScheduler:58] Got job 1 (collect at InspectInputSplits.scala:34) with 1 output partitions
[22:22:36.556][dag-scheduler-event-loop][INFO][DAGScheduler:58] Final stage: ResultStage 1 (collect at InspectInputSplits.scala:34)
[22:22:36.556][dag-scheduler-event-loop][INFO][DAGScheduler:58] Parents of final stage: List()
[22:22:36.557][dag-scheduler-event-loop][INFO][DAGScheduler:58] Missing parents: List()
[22:22:36.557][dag-scheduler-event-loop][INFO][DAGScheduler:58] Submitting ResultStage 1 (MapPartitionsRDD[4] at collect at InspectInputSplits.scala:34), which has no missing parents
[22:22:36.571][dag-scheduler-event-loop][INFO][MemoryStore:58] Block broadcast_2 stored as values in memory (estimated size 7.6 KB, free 203.3 KB)
[22:22:36.575][dag-scheduler-event-loop][INFO][MemoryStore:58] Block broadcast_2_piece0 stored as bytes in memory (estimated size 4.0 KB, free 207.3 KB)
[22:22:36.576][dispatcher-event-loop-1][INFO][BlockManagerInfo:58] Added broadcast_2_piece0 in memory on localhost:54419 (size: 4.0 KB, free: 1088.2 MB)
[22:22:36.577][dag-scheduler-event-loop][INFO][SparkContext:58] Created broadcast 2 from broadcast at DAGScheduler.scala:1006
[22:22:36.577][dag-scheduler-event-loop][INFO][DAGScheduler:58] Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[4] at collect at InspectInputSplits.scala:34)
[22:22:36.577][dag-scheduler-event-loop][INFO][TaskSchedulerImpl:58] Adding task set 1.0 with 1 tasks
[22:22:36.585][dispatcher-event-loop-3][INFO][TaskSetManager:58] Starting task 0.0 in stage 1.0 (TID 2, localhost, partition 0,PROCESS_LOCAL, 2481 bytes)
[22:22:36.586][Executor task launch worker-1][INFO][Executor:58] Running task 0.0 in stage 1.0 (TID 2)
[22:22:36.605][Executor task launch worker-1][INFO][ParquetRelation$$anonfun$buildInternalScan$1$$anon$1:58] Input split: ParquetInputSplit{part: s3n://bucket//trackingPackage/dpYear=2016/dpMonth=5/dpDay=10/part-r-00004-1c86d6b0-4f6f-4770-a930-c42d77e3c729-1462833064172.gz.parquet start: 0 end: 5364897 length: 5364897 hosts: []}
[22:22:38.253][Executor task launch worker-1][INFO][NativeS3FileSystem:619] Opening 's3n://bucket//trackingPackage/dpYear=2016/dpMonth=5/dpDay=10/part-r-00004-1c86d6b0-4f6f-4770-a930-c42d77e3c729-1462833064172.gz.parquet' for reading
[22:23:04.249][Executor task launch worker-1][INFO][NativeS3FileSystem:619] Opening 's3n://bucket//trackingPackage/dpYear=2016/dpMonth=5/dpDay=10/part-r-00004-1c86d6b0-4f6f-4770-a930-c42d77e3c729-1462833064172.gz.parquet' for reading
[22:23:28.337][Executor task launch worker-1][INFO][CodecPool:181] Got brand-new decompressor [.gz]
[22:23:28.400][dispatcher-event-loop-1][INFO][BlockManagerInfo:58] Removed broadcast_0_piece0 on localhost:54419 in memory (size: 21.7 KB, free: 1088.2 MB)
[22:23:28.408][Spark Context Cleaner][INFO][ContextCleaner:58] Cleaned accumulator 1
[22:23:49.993][Executor task launch worker-1][INFO][Executor:58] Finished task 0.0 in stage 1.0 (TID 2). 9376344 bytes result sent to driver
[22:23:50.191][task-result-getter-2][INFO][TaskSetManager:58] Finished task 0.0 in stage 1.0 (TID 2) in 73612 ms on localhost (1/1)
[22:23:50.191][task-result-getter-2][INFO][TaskSchedulerImpl:58] Removed TaskSet 1.0, whose tasks have all completed, from pool
[22:23:50.191][dag-scheduler-event-loop][INFO][DAGScheduler:58] ResultStage 1 (collect at InspectInputSplits.scala:34) finished in 73.612 s
[22:23:50.195][main][INFO][DAGScheduler:58] Job 1 finished: collect at InspectInputSplits.scala:34, took 73.640193 s
The Spark UI snapshot: [screenshot omitted]
Questions:
In the logs, I can see that the Parquet file is read a total of 3 times: once by the [pool-21-thread-1] thread (on the driver) and twice by the [Executor task launch worker-1] thread, which I assume is a worker thread. While debugging, I could see that before the first read, two S3n requests were made specifically for the footer (they carried the Content-Range HTTP header): first to get the size of the footer and then to get the footer itself. My question is: once we had the footer information, why did the [pool-21-thread-1] thread still have to read the entire file? And why did the executor thread make 2 requests to read the S3 file?
The Spark UI shows that only 670 KB is taken as input. Since I was not sure this was true, I looked at the network activity, and it seems 20+ MB was received. The attached snapshot shows nearly 5+ MB of data received in the first read and 15+ MB for the 2 reads after Thread.sleep(1000*10). I could not reach the debug point for the last 2 reads by the [pool-21-thread-1] thread due to IDE issues, so I am not sure whether only the particular column ("referenceCode") is being read or the entire file. I understand that there is overhead from packets at the TCP/UDP layers, but 20+ MB seems like quite a lot for just one column.
After debugging into the application, it turned out that S3N still uses the jets3t library, whereas S3A has a new implementation based on the AWS SDK (HADOOP-10400).
Hadoop's NativeS3FileSystem implementation does not support seek (partial content reads) on S3 files; it downloads the whole file first.
EDIT: The scenario was not seen on EMR. On EMR, Amazon provides a highly optimized S3 connector, EMRFS, for all schemes, which overrides the connector provided by Hadoop.
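As a rough sketch of the workaround this implies (not from the original answer): read the same file through the s3a:// connector, whose S3AFileSystem is built on the AWS SDK and supports seek(), so the footer and the projected column chunks can be fetched with ranged GETs instead of downloading the whole object on every open. The credential environment variable names and the presence of the hadoop-aws module on the classpath are assumptions made for illustration.
import org.apache.spark.{SparkConf, SparkContext}

// Assumes the hadoop-aws module (and its AWS SDK dependency) is on the classpath.
val conf = new SparkConf().setMaster("local[2]").setAppName("parquet-reader-s3a")
val sc = new SparkContext(conf)

// Credentials for the s3a connector; the environment variable names are assumptions.
sc.hadoopConfiguration.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
sc.hadoopConfiguration.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))

val sqlc = new org.apache.spark.sql.SQLContext(sc)

// Same file as in the question, read via s3a instead of s3n.
val df = sqlc.read
  .parquet("s3a://bucket/trackingPackage/dpYear=2016/dpMonth=5/dpDay=10/part-r-00004-1c86d6b0-4f6f-4770-a930-c42d77e3c729-1462833064172.gz.parquet")
  .select("referenceCode")
println(df.schema)
val output = df.collect()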

OpenJDK on FreeBSD: "Given reserved space must have been reserved already"

I'm trying to get Artifactory up and running on a FreeBSD machine. I installed /usr/ports/devel/artifactory, seemingly without problem, and then ran "/usr/local/etc/rc.d/artifactory start". It said Artifactory was starting, and didn't give any obvious signs of error, but when the script ended, Artifactory was not running. I found that every time I do this, the following is appended to /usr/local/artifactory/logs/boot.log:
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (g1PageBasedVirtualSpace.cpp:54), pid=87801, tid=100176
# guarantee(rs.is_reserved()) failed: Given reserved space must have been reserved already.
Googling "Given reserved space must have been reserved already" reveals no information that is particularly useful to me. It seems to be a message from within OpenJDK.
The log file also mentions that another file was created with more detailed error information. That file has a stack trace and various other info:
--------------- T H R E A D ---------------
Current thread (0x29cb0800): JavaThread "Unknown thread" [_thread_in_vm, id=100176, stack(0xbf9be000,0xbf9fe000)]
Stack: [0xbf9be000,0xbf9fe000], sp=0xbf9fd528, free space=253k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x8234ed] JVM_handle_bsd_signal+0x166bbd
V [libjvm.so+0x36ef3f] SUNWprivate_1.1+0x36ef3f
V [libjvm.so+0x40df4c] AsyncGetCallTrace+0x2c0bc
V [libjvm.so+0x40def4] AsyncGetCallTrace+0x2c064
V [libjvm.so+0x40ff0b] AsyncGetCallTrace+0x2e07b
V [libjvm.so+0x41017e] AsyncGetCallTrace+0x2e2ee
V [libjvm.so+0x3f5874] AsyncGetCallTrace+0x139e4
V [libjvm.so+0x7ea485] JVM_handle_bsd_signal+0x12db55
V [libjvm.so+0x7ea105] JVM_handle_bsd_signal+0x12d7d5
V [libjvm.so+0x471291] AsyncGetCallTrace+0x8f401
V [libjvm.so+0x7cccd3] JVM_handle_bsd_signal+0x1103a3
V [libjvm.so+0x4d0eeb] JNI_CreateJavaVM+0x6b
C [java+0x3c35] JavaMain+0x1d5
C [libthr.so.3+0x76dc] operator->+0x81c
C 0x00000000
--------------- P R O C E S S ---------------
Java Threads: ( => current thread )
Other Threads:
=>0x29cb0800 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=100176, stack(0xbf9be000,0xbf9fe000)]
VM state:not at safepoint (not fully initialized)
VM Mutex/Monitor currently owned by a thread: ([mutex/lock_event])
[0x29c48640] Heap_lock - owner thread: 0x29cb0800
GC Heap History (0 events):
No events
Deoptimization events (0 events):
No events
Internal exceptions (0 events):
No events
Events (0 events):
No events
Dynamic libraries:
0x08048000 /usr/local/openjdk8/bin/java
0x2807d000 /lib/libz.so.6
0x28091000 /lib/libthr.so.3
0x280b3000 /lib/libc.so.7
0x28c00000 /usr/local/openjdk8/jre/lib/i386/server/libjvm.so
0x28237000 /lib/libm.so.5
0x2825d000 /usr/lib/libc++.so.1
0x2830c000 /lib/libcxxrt.so.1
0x28325000 /lib/libgcc_s.so.1
0x28331000 /usr/local/openjdk8/jre/lib/i386/libverify.so
0x2833d000 /usr/local/openjdk8/jre/lib/i386/libjava.so
0x2836a000 /usr/local/openjdk8/jre/lib/i386/libzip.so
0x28054000 /libexec/ld-elf.so.1
VM Arguments:
jvm_args: -Djava.util.logging.config.file=/usr/local/artifactory/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Xms512m -Xmx2g -Xss256k -XX:+UseG1GC -Djruby.compile.invokedynamic=false -Dfile.encoding=UTF8 -Dartdist=zip -Dartifactory.home=/usr/local/artifactory -Dfile.encoding=UTF8 -Djruby.compile.invokedynamic=false -Djava.endorsed.dirs=/usr/local/artifactory/tomcat/endorsed -Dcatalina.base=/usr/local/artifactory/tomcat -Dcatalina.home=/usr/local/artifactory/tomcat -Djava.io.tmpdir=/usr/local/artifactory/tomcat/temp
java_command: org.apache.catalina.startup.Bootstrap start
java_class_path (initial): /usr/local/artifactory/tomcat/bin/bootstrap.jar:/usr/local/artifactory/tomcat/bin/tomcat-juli.jar
Launcher Type: SUN_STANDARD
Environment Variables:
JAVA_HOME=/usr/local/openjdk8
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/nonexistent/bin
SHELL=/bin/csh
HOSTTYPE=FreeBSD
OSTYPE=FreeBSD
MACHTYPE=i386
Signal Handlers:
SIGSEGV: [libjvm.so+0x824280], sa_mask[0]=11111111111111111111111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGBUS: [libjvm.so+0x824280], sa_mask[0]=11111111111111111111111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGFPE: [libjvm.so+0x6b92f0], sa_mask[0]=11111111111111111111111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGPIPE: [libjvm.so+0x6b92f0], sa_mask[0]=11111111111111111111111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGXFSZ: [libjvm.so+0x6b92f0], sa_mask[0]=11111111111111111111111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGILL: [libjvm.so+0x6b92f0], sa_mask[0]=11111111111111111111111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGUSR1: SIG_DFL, sa_mask[0]=11111111011111110111111111111111, sa_flags=none
SIGUSR2: [libjvm.so+0x6b9fe0], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO
SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
--------------- S Y S T E M ---------------
OS:BSD
uname:FreeBSD 10.1-RELEASE FreeBSD 10.1-RELEASE #0 r274401: Tue Nov 11 22:51:51 UTC 2014 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC i386
rlimit: STACK 65536k, CORE infinity, NPROC 5547, NOFILE 94860, AS infinity
load average:0.26 0.19 0.80
CPU:total 8 (4 cores per cpu, 1 threads per core) family 6 model 26 stepping 5, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, tsc, tscinvbit, tscinv
Memory: 4k page, physical 3372996k(2931288k free), swap 13807988388243963904k(13807988392538586972k free)
vm_info: OpenJDK Server VM (25.60-b23) for bsd-x86 JRE (1.8.0_60-b24), built on Nov 14 2015 17:53:51 by "bob" with gcc 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final 208032)
time: Sat Nov 14 18:21:52 2015
elapsed time: 0 seconds (0d 0h 0m 0s)
All packages are up to date and compiled from source. All Java-related stuff is newly installed (along with Artifactory) and with unchanged, default configuration.
Any ideas? Thanks.
This seems to be an issue with memory allocation.
The maximum heap size defined in the Artifactory startup script is 2g, which is more than the maximum heap the JVM can allocate on a 32-bit FreeBSD machine.
The solution in this case would be decreasing the maximum heap size to ~1.5g.
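A minimal sketch of that change, keeping the heap-related flags from the jvm_args shown in the crash log above and lowering only -Xmx (where exactly the startup script sets these is not shown in the post):
-Xms512m -Xmx1536m -Xss256k -XX:+UseG1GC
With -Xmx2g, G1 apparently cannot reserve the requested heap within the 32-bit address space, which is what trips the guarantee(rs.is_reserved()) check shown in boot.log; ~1.5g leaves room for the rest of the process.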

Resources