Determine whether a matrix is sparse? - algorithm

I have a matrix and I want to know whether it is sparse or not. Is there any function in MATLAB to evaluate that property? I tried the issparse function, but it always returns 0 (not sparse). For example, my matrix (27 by 27) is
A=
[ 1 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0
1 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0
0 1 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0
0 0 1 1 1 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0
0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0
0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0
250 243 247 245 244 244 244 122 61 144 72 36 18 9 4 2 1 1 0 0 0 0 0 0 0 0 0
151 197 236 118 181 212 106 53 26 13 136 68 34 17 8 4 2 0 1 0 0 0 0 0 0 0 0
24 12 6 3 143 201 234 117 180 90 45 152 76 38 19 9 4 0 0 1 0 0 0 0 0 0 0
18 9 138 69 172 86 165 220 224 112 56 28 128 64 32 16 8 0 0 0 1 0 0 0 0 0 0
27 131 207 103 189 94 47 153 194 239 119 59 29 128 64 32 16 0 0 0 0 1 0 0 0 0 0
44 22 133 204 232 116 58 147 199 237 248 124 62 31 129 64 32 0 0 0 0 0 1 0 0 0 0
238 119 181 90 45 152 76 38 19 135 205 232 116 58 29 128 64 0 0 0 0 0 0 1 0 0 0
48 24 12 6 3 143 201 100 50 25 130 207 233 116 58 29 128 0 0 0 0 0 0 0 1 0 0
168 84 42 21 132 66 33 158 79 39 19 135 205 232 116 58 29 0 0 0 0 0 0 0 0 1 0
235 117 58 29 128 64 32 16 8 4 2 1 142 201 234 117 58 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
0 1 1 0 0 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0
0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 1
0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0
0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0
0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0]
(Figure: a plot of the matrix above.)

This seemingly easy question is quite difficult to answer: there is no standard criterion that determines whether a matrix is sparse or full. Note that issparse only tells you whether a matrix is stored in MATLAB's sparse storage format, not whether it contains many zeros, which is why it returns 0 for your matrix.
However, the most common measure I know of is the matrix's sparsity: the fraction of zero elements over the total number of elements. If this exceeds some sensible threshold, then you could say that the matrix is sparse.
If you're given the matrix A, perhaps something like this:
sparsity = (numel(A) - nnz(A)) / numel(A);
numel determines the total number of elements in the matrix A and nnz determines the total number of non-zero elements. Therefore, numel(A) - nnz(A) should give you the total number of zero elements.
So, going with the threshold idea, this is what I was talking about:
is_sparse = sparsity > tol;
tol would be a fraction in [0,1], so something like 0.75 could work. This would mean that if more than 75% of your matrix consists of zeroes, you would call it sparse. It's all heuristic though; choose a threshold that you think makes the most sense.
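Putting it together, a small sketch along these lines (0.75 is just the example threshold from above):
tol = 0.75;
sparsity = (numel(A) - nnz(A)) / numel(A);   % fraction of zero entries
is_sparse = sparsity > tol;
if is_sparse
    S = sparse(A);   % convert to sparse storage; issparse(S) then returns 1
end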

Related

Clickhouse is using only one core after upgrading to version 22.3.2.1

I am using ClickHouse version 22.3.2.1 and I want it to utilise multiple cores.
This is my profile configuration:
<?xml version="1.0"?>
<yandex>
<profiles>
<default>
<max_insert_threads>12</max_insert_threads>
<max_threads>12</max_threads>
<min_insert_block_size_bytes>536870912</min_insert_block_size_bytes>
<min_insert_block_size_rows>1000000</min_insert_block_size_rows>
</default>
</profiles>
</yandex>
I had the same configuration with version v21.12 and it was working fine, but after upgrading ClickHouse to the latest version it is not using multiple cores.
This is my settings file:
min_compress_block_size 65536
max_compress_block_size 1048576
max_block_size 65505
max_insert_block_size 1048545
min_insert_block_size_rows 1000000
min_insert_block_size_bytes 536870912
min_insert_block_size_rows_for_materialized_views 0
min_insert_block_size_bytes_for_materialized_views 0
max_joined_block_size_rows 65505
max_insert_threads 12
max_final_threads 16
max_threads 12
max_read_buffer_size 1048576
max_distributed_connections 1024
max_query_size 262144
interactive_delay 100000
connect_timeout 10
connect_timeout_with_failover_ms 50
connect_timeout_with_failover_secure_ms 100
receive_timeout 300
send_timeout 300
drain_timeout 3
tcp_keep_alive_timeout 290
hedged_connection_timeout_ms 100
receive_data_timeout_ms 2000
use_hedged_requests 1
allow_changing_replica_until_first_data_packet 0
queue_max_wait_ms 0
connection_pool_max_wait_ms 0
replace_running_query_max_wait_ms 5000
kafka_max_wait_ms 5000
rabbitmq_max_wait_ms 5000
poll_interval 10
idle_connection_timeout 3600
distributed_connections_pool_size 1024
connections_with_failover_max_tries 3
s3_min_upload_part_size 16777216
s3_upload_part_size_multiply_factor 2
s3_upload_part_size_multiply_parts_count_threshold 1000
s3_max_single_part_upload_size 33554432
s3_max_single_read_retries 4
s3_max_redirects 10
s3_max_connections 1024
s3_truncate_on_insert 0
s3_create_new_file_on_insert 0
hdfs_replication 0
hdfs_truncate_on_insert 0
hdfs_create_new_file_on_insert 0
hsts_max_age 0
extremes 0
use_uncompressed_cache 0
replace_running_query 0
background_buffer_flush_schedule_pool_size 16
background_pool_size 16
background_merges_mutations_concurrency_ratio 2
background_move_pool_size 8
background_fetches_pool_size 8
background_common_pool_size 8
background_schedule_pool_size 128
background_message_broker_schedule_pool_size 16
background_distributed_schedule_pool_size 16
max_replicated_fetches_network_bandwidth_for_server 0
max_replicated_sends_network_bandwidth_for_server 0
stream_like_engine_allow_direct_select 0
distributed_directory_monitor_sleep_time_ms 100
distributed_directory_monitor_max_sleep_time_ms 30000
distributed_directory_monitor_batch_inserts 0
distributed_directory_monitor_split_batch_on_failure 0
optimize_move_to_prewhere 1
optimize_move_to_prewhere_if_final 0
replication_alter_partitions_sync 1
replication_wait_for_inactive_replica_timeout 120
load_balancing random
load_balancing_first_offset 0
totals_mode after_having_exclusive
totals_auto_threshold 0.5
allow_suspicious_low_cardinality_types 0
compile_expressions 1
min_count_to_compile_expression 3
compile_aggregate_expressions 1
min_count_to_compile_aggregate_expression 3
group_by_two_level_threshold 100000
group_by_two_level_threshold_bytes 50000000
distributed_aggregation_memory_efficient 1
aggregation_memory_efficient_merge_threads 0
enable_positional_arguments 0
max_parallel_replicas 1
parallel_replicas_count 0
parallel_replica_offset 0
allow_experimental_parallel_reading_from_replicas 0
skip_unavailable_shards 0
parallel_distributed_insert_select 0
distributed_group_by_no_merge 0
distributed_push_down_limit 1
optimize_distributed_group_by_sharding_key 1
optimize_skip_unused_shards_limit 1000
optimize_skip_unused_shards 0
optimize_skip_unused_shards_rewrite_in 1
allow_nondeterministic_optimize_skip_unused_shards 0
force_optimize_skip_unused_shards 0
optimize_skip_unused_shards_nesting 0
force_optimize_skip_unused_shards_nesting 0
input_format_parallel_parsing 1
min_chunk_bytes_for_parallel_parsing 10485760
output_format_parallel_formatting 1
merge_tree_min_rows_for_concurrent_read 163840
merge_tree_min_bytes_for_concurrent_read 251658240
merge_tree_min_rows_for_seek 0
merge_tree_min_bytes_for_seek 0
merge_tree_coarse_index_granularity 8
merge_tree_max_rows_to_use_cache 1048576
merge_tree_max_bytes_to_use_cache 2013265920
do_not_merge_across_partitions_select_final 0
mysql_max_rows_to_insert 65536
optimize_min_equality_disjunction_chain_length 3
min_bytes_to_use_direct_io 0
min_bytes_to_use_mmap_io 0
checksum_on_read 1
force_index_by_date 0
force_primary_key 0
use_skip_indexes 1
use_skip_indexes_if_final 0
force_data_skipping_indices
max_streams_to_max_threads_ratio 1
max_streams_multiplier_for_merge_tables 5
network_compression_method LZ4
network_zstd_compression_level 1
priority 0
os_thread_priority 0
log_queries 1
log_formatted_queries 0
log_queries_min_type QUERY_START
log_queries_min_query_duration_ms 0
log_queries_cut_to_length 100000
log_queries_probability 1
distributed_product_mode deny
max_concurrent_queries_for_all_users 0
max_concurrent_queries_for_user 0
insert_deduplicate 1
insert_quorum 0
insert_quorum_timeout 600000
insert_quorum_parallel 1
select_sequential_consistency 0
table_function_remote_max_addresses 1000
read_backoff_min_latency_ms 1000
read_backoff_max_throughput 1048576
read_backoff_min_interval_between_events_ms 1000
read_backoff_min_events 2
read_backoff_min_concurrency 1
memory_tracker_fault_probability 0
enable_http_compression 0
http_zlib_compression_level 3
http_native_compression_disable_checksumming_on_decompress 0
count_distinct_implementation uniqExact
add_http_cors_header 0
max_http_get_redirects 0
use_client_time_zone 0
send_progress_in_http_headers 0
http_headers_progress_interval_ms 100
fsync_metadata 1
join_use_nulls 0
join_default_strictness ALL
any_join_distinct_right_table_keys 0
preferred_block_size_bytes 1000000
max_replica_delay_for_distributed_queries 300
fallback_to_stale_replicas_for_distributed_queries 1
preferred_max_column_in_block_size_bytes 0
insert_distributed_sync 0
insert_distributed_timeout 0
distributed_ddl_task_timeout 180
stream_flush_interval_ms 7500
stream_poll_timeout_ms 500
sleep_in_send_tables_status_ms 0
sleep_in_send_data_ms 0
unknown_packet_in_send_data 0
sleep_in_receive_cancel_ms 0
insert_allow_materialized_columns 0
http_connection_timeout 1
http_send_timeout 180
http_receive_timeout 180
http_max_uri_size 1048576
http_max_fields 1000000
http_max_field_name_size 1048576
http_max_field_value_size 1048576
http_skip_not_found_url_for_globs 1
optimize_throw_if_noop 0
use_index_for_in_with_subqueries 1
joined_subquery_requires_alias 1
empty_result_for_aggregation_by_empty_set 0
empty_result_for_aggregation_by_constant_keys_on_empty_set 1
allow_distributed_ddl 1
allow_suspicious_codecs 0
allow_experimental_codecs 0
query_profiler_real_time_period_ns 1000000000
query_profiler_cpu_time_period_ns 1000000000
metrics_perf_events_enabled 0
metrics_perf_events_list
opentelemetry_start_trace_probability 0
prefer_column_name_to_alias 0
prefer_global_in_and_join 0
max_rows_to_read 0
max_bytes_to_read 0
read_overflow_mode throw
max_rows_to_read_leaf 0
max_bytes_to_read_leaf 0
read_overflow_mode_leaf throw
max_rows_to_group_by 0
group_by_overflow_mode throw
max_bytes_before_external_group_by 0
max_rows_to_sort 0
max_bytes_to_sort 0
sort_overflow_mode throw
max_bytes_before_external_sort 0
max_bytes_before_remerge_sort 1000000000
remerge_sort_lowered_memory_bytes_ratio 2
max_result_rows 0
max_result_bytes 0
result_overflow_mode throw
max_execution_time 0
timeout_overflow_mode throw
min_execution_speed 0
max_execution_speed 0
min_execution_speed_bytes 0
max_execution_speed_bytes 0
timeout_before_checking_execution_speed 10
max_columns_to_read 0
max_temporary_columns 0
max_temporary_non_const_columns 0
max_subquery_depth 100
max_pipeline_depth 1000
max_ast_depth 1000
max_ast_elements 50000
max_expanded_ast_elements 500000
readonly 0
max_rows_in_set 0
max_bytes_in_set 0
set_overflow_mode throw
max_rows_in_join 0
max_bytes_in_join 0
join_overflow_mode throw
join_any_take_last_row 0
join_algorithm hash
default_max_bytes_in_join 1000000000
partial_merge_join_left_table_buffer_bytes 0
partial_merge_join_rows_in_right_blocks 65536
join_on_disk_max_files_to_merge 64
temporary_files_codec LZ4
max_rows_to_transfer 0
max_bytes_to_transfer 0
transfer_overflow_mode throw
max_rows_in_distinct 0
max_bytes_in_distinct 0
distinct_overflow_mode throw
max_memory_usage 28000000000
max_guaranteed_memory_usage 0
max_memory_usage_for_user 0
max_guaranteed_memory_usage_for_user 0
max_untracked_memory 4194304
memory_profiler_step 4194304
memory_profiler_sample_probability 0
memory_usage_overcommit_max_wait_microseconds 0
max_network_bandwidth 0
max_network_bytes 0
max_network_bandwidth_for_user 0
max_network_bandwidth_for_all_users 0
max_backup_threads 0
log_profile_events 1
log_query_settings 1
log_query_threads 1
log_query_views 1
log_comment
send_logs_level fatal
enable_optimize_predicate_expression 1
enable_optimize_predicate_expression_to_final_subquery 1
allow_push_predicate_when_subquery_contains_with 1
low_cardinality_max_dictionary_size 8192
low_cardinality_use_single_dictionary_for_part 0
decimal_check_overflow 1
prefer_localhost_replica 1
max_fetch_partition_retries_count 5
http_max_multipart_form_data_size 1073741824
calculate_text_stack_trace 1
allow_ddl 1
parallel_view_processing 0
enable_unaligned_array_join 0
optimize_read_in_order 1
optimize_aggregation_in_order 0
aggregation_in_order_max_block_bytes 50000000
read_in_order_two_level_merge_threshold 100
low_cardinality_allow_in_native_format 1
cancel_http_readonly_queries_on_client_close 0
external_table_functions_use_nulls 1
external_table_strict_query 0
allow_hyperscan 1
max_hyperscan_regexp_length 0
max_hyperscan_regexp_total_length 0
allow_simdjson 1
allow_introspection_functions 0
max_partitions_per_insert_block 100
max_partitions_to_read -1
check_query_single_value_result 1
allow_drop_detached 0
postgresql_connection_pool_size 16
postgresql_connection_pool_wait_timeout 5000
glob_expansion_max_elements 1000
odbc_bridge_connection_pool_size 16
distributed_replica_error_half_life 60
distributed_replica_error_cap 1000
distributed_replica_max_ignored_errors 0
allow_experimental_live_view 0
live_view_heartbeat_interval 15
max_live_view_insert_blocks_before_refresh 64
allow_experimental_window_view 0
window_view_clean_interval 5
window_view_heartbeat_interval 15
min_free_disk_space_for_temporary_data 0
default_database_engine Atomic
default_table_engine None
show_table_uuid_in_table_create_query_if_not_nil 0
database_atomic_wait_for_drop_and_detach_synchronously 0
enable_scalar_subquery_optimization 1
optimize_trivial_count_query 1
optimize_respect_aliases 1
mutations_sync 0
optimize_move_functions_out_of_any 0
optimize_normalize_count_variants 1
optimize_injective_functions_inside_uniq 1
convert_query_to_cnf 0
optimize_arithmetic_operations_in_aggregate_functions 1
optimize_duplicate_order_by_and_distinct 1
optimize_redundant_functions_in_order_by 1
optimize_if_chain_to_multiif 0
optimize_if_transform_strings_to_enum 0
optimize_monotonous_functions_in_order_by 1
optimize_functions_to_subcolumns 0
optimize_using_constraints 0
optimize_substitute_columns 0
optimize_append_index 0
normalize_function_names 1
allow_experimental_alter_materialized_view_structure 0
enable_early_constant_folding 1
deduplicate_blocks_in_dependent_materialized_views 0
use_compact_format_in_distributed_parts_names 1
validate_polygons 1
max_parser_depth 1000
temporary_live_view_timeout 5
periodic_live_view_refresh 60
transform_null_in 0
allow_nondeterministic_mutations 0
lock_acquire_timeout 120
materialize_ttl_after_modify 1
function_implementation
allow_experimental_geo_types 0
data_type_default_nullable 0
cast_keep_nullable 0
cast_ipv4_ipv6_default_on_conversion_error 0
alter_partition_verbose_result 0
allow_experimental_database_materialized_mysql 0
allow_experimental_database_materialized_postgresql 0
system_events_show_zero_values 0
mysql_datatypes_support_level
optimize_trivial_insert_select 1
allow_non_metadata_alters 1
enable_global_with_statement 1
aggregate_functions_null_for_empty 0
optimize_syntax_fuse_functions 0
optimize_fuse_sum_count_avg 0
flatten_nested 1
asterisk_include_materialized_columns 0
asterisk_include_alias_columns 0
optimize_skip_merged_partitions 0
optimize_on_insert 1
force_optimize_projection 0
async_socket_for_remote 1
insert_null_as_default 1
describe_extend_object_types 0
describe_include_subcolumns 0
optimize_rewrite_sum_if_to_count_if 1
insert_shard_id 0
allow_experimental_query_deduplication 0
engine_file_empty_if_not_exists 0
engine_file_truncate_on_insert 0
engine_file_allow_create_multiple_files 0
allow_experimental_database_replicated 0
database_replicated_initial_query_timeout_sec 300
max_distributed_depth 5
database_replicated_always_detach_permanently 0
database_replicated_allow_only_replicated_engine 0
distributed_ddl_output_mode throw
distributed_ddl_entry_format_version 1
external_storage_max_read_rows 0
external_storage_max_read_bytes 0
external_storage_connect_timeout_sec 10
external_storage_rw_timeout_sec 300
union_default_mode
optimize_aggregators_of_group_by_keys 1
optimize_group_by_function_keys 1
legacy_column_name_of_tuple_literal 0
query_plan_enable_optimizations 1
query_plan_max_optimizations_to_apply 10000
query_plan_filter_push_down 1
regexp_max_matches_per_row 1000
limit 0
offset 0
function_range_max_elements_in_block 500000000
short_circuit_function_evaluation enable
local_filesystem_read_method pread
remote_filesystem_read_method threadpool
local_filesystem_read_prefetch 0
remote_filesystem_read_prefetch 1
read_priority 0
merge_tree_min_rows_for_concurrent_read_for_remote_filesystem 163840
merge_tree_min_bytes_for_concurrent_read_for_remote_filesystem 251658240
remote_read_min_bytes_for_seek 4194304
async_insert_threads 16
async_insert 0
wait_for_async_insert 1
wait_for_async_insert_timeout 120
async_insert_max_data_size 100000
async_insert_busy_timeout_ms 200
async_insert_stale_timeout_ms 0
remote_fs_read_max_backoff_ms 10000
remote_fs_read_backoff_max_tries 5
remote_fs_enable_cache 1
remote_fs_cache_max_wait_sec 5
http_max_tries 10
http_retry_initial_backoff_ms 100
http_retry_max_backoff_ms 10000
force_remove_data_recursively_on_drop 0
check_table_dependencies 1
use_local_cache_for_remote_storage 1
allow_unrestricted_reads_from_keeper 0
allow_experimental_funnel_functions 0
allow_experimental_nlp_functions 0
allow_experimental_object_type 0
insert_deduplication_token
max_memory_usage_for_all_queries 0
multiple_joins_rewriter_version 0
enable_debug_queries 0
allow_experimental_database_atomic 1
allow_experimental_bigint_types 1
allow_experimental_window_functions 1
handle_kafka_error_mode default
database_replicated_ddl_output 1
replication_alter_columns_timeout 60
odbc_max_field_size 0
allow_experimental_map_type 1
merge_tree_clear_old_temporary_directories_interval_seconds 60
merge_tree_clear_old_parts_interval_seconds 1
partial_merge_join_optimizations 0
max_alter_threads \'auto(12)\'
allow_experimental_projection_optimization 1
format_csv_delimiter ,
format_csv_allow_single_quotes 1
format_csv_allow_double_quotes 1
output_format_csv_crlf_end_of_line 0
input_format_csv_enum_as_number 0
input_format_csv_arrays_as_nested_csv 0
input_format_skip_unknown_fields 0
input_format_with_names_use_header 1
input_format_with_types_use_header 1
input_format_import_nested_json 0
input_format_defaults_for_omitted_fields 1
input_format_csv_empty_as_default 1
input_format_tsv_empty_as_default 0
input_format_tsv_enum_as_number 0
input_format_null_as_default 1
input_format_use_lowercase_column_name 0
input_format_arrow_import_nested 0
input_format_orc_import_nested 0
input_format_orc_row_batch_size 100000
input_format_parquet_import_nested 0
input_format_allow_seeks 1
input_format_orc_allow_missing_columns 0
input_format_parquet_allow_missing_columns 0
input_format_arrow_allow_missing_columns 0
input_format_hive_text_fields_delimiter
input_format_hive_text_collection_items_delimiter
input_format_hive_text_map_keys_delimiter
input_format_msgpack_number_of_columns 0
output_format_msgpack_uuid_representation ext
input_format_max_rows_to_read_for_schema_inference 100
date_time_input_format basic
date_time_output_format simple
bool_true_representation true
bool_false_representation false
input_format_values_interpret_expressions 1
input_format_values_deduce_templates_of_expressions 1
input_format_values_accurate_types_of_literals 1
input_format_avro_allow_missing_fields 0
format_avro_schema_registry_url
output_format_json_quote_64bit_integers 1
output_format_json_quote_denormals 0
output_format_json_escape_forward_slashes 1
output_format_json_named_tuples_as_objects 0
output_format_json_array_of_rows 0
output_format_pretty_max_rows 10000
output_format_pretty_max_column_pad_width 250
output_format_pretty_max_value_width 10000
output_format_pretty_color 1
output_format_pretty_grid_charset UTF-8
output_format_parquet_row_group_size 1000000
output_format_avro_codec
output_format_avro_sync_interval 16384
output_format_avro_string_column_pattern
output_format_avro_rows_in_file 1
output_format_tsv_crlf_end_of_line 0
format_csv_null_representation \\N
format_tsv_null_representation \\N
output_format_decimal_trailing_zeros 0
input_format_allow_errors_num 0
input_format_allow_errors_ratio 0
format_schema
format_template_resultset
format_template_row
format_template_rows_between_delimiter \n
format_custom_escaping_rule Escaped
format_custom_field_delimiter \t
format_custom_row_before_delimiter
format_custom_row_after_delimiter \n
format_custom_row_between_delimiter
format_custom_result_before_delimiter
format_custom_result_after_delimiter
format_regexp
format_regexp_escaping_rule Raw
format_regexp_skip_unmatched 0
output_format_enable_streaming 0
output_format_write_statistics 1
output_format_pretty_row_numbers 0
insert_distributed_one_random_shard 0
cross_to_inner_join_rewrite 1
output_format_arrow_low_cardinality_as_dictionary 0
format_capn_proto_enum_comparising_mode by_values
Any help would be appreciated, thanks.
It looks like you run ClickHouse inside Docker; the issue is related to the cgroups limits calculation and is fixed in the next 22.3.x releases. See the details here:
https://github.com/ClickHouse/ClickHouse/pull/35815
After your comments, it looks like you need to increase max_insert_threads for INSERT ... SELECT ...:
https://clickhouse.com/docs/en/operations/settings/settings/#settings-max-insert-threads
and check EXPLAIN for the SELECT:
https://clickhouse.com/docs/en/sql-reference/statements/explain/
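For example, a quick way to verify the effective settings and the degree of parallelism of a query (a rough sketch; my_table is just a placeholder name):
SELECT name, value FROM system.settings WHERE name IN ('max_threads', 'max_insert_threads');
SET max_insert_threads = 12;
EXPLAIN PIPELINE SELECT * FROM my_table;   -- the number of parallel processors/streams shows up in the pipeline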

Generating and plotting an event window relative to a shock (Repost)

Repost from: https://www.statalist.org/forums/forum/general-stata-discussion/general/1648042-generating-and-plotting-of-an-event-window-relative-to-a-shock
Dear all,
I am (still) struggling with the generation of an event_window variable (relative to the time of the event). The esplot package (by Dylan Balla-Elliott) defines event_window as follows:
event_indicator = <current_time> == <time of event>
event_time = <current_time> - <time of event>
Here is a data example, with a time variable, a continuous variable, and a set of event indicator dummies (which are basically random shocks).
* Example generated by -dataex-. For more info, type help dataex
clear
input str7 modate float epeu_lvl byte(cop_shock unpri_reg_shock eu_reg_shock) float tid
"2004/1" 75.34063 0 0 0 1
"2004/2" 76.99823 0 0 0 2
"2004/3" 125.02164 0 0 0 3
"2004/4" 109.83804 0 0 0 4
"2004/5" 114.84982 0 0 0 5
"2004/6" 99.84531 0 0 0 6
"2004/7" 115.9254 0 0 0 7
"2004/8" 77.3424 0 0 0 8
"2004/9" 89.59677 0 0 0 9
"2004/10" 120.00146 0 0 0 10
"2004/11" 127.93832 0 0 0 11
"2004/12" 83.33497 1 0 1 12
"2005/1" 58.94662 0 0 0 13
"2005/2" 74.97708 0 0 0 14
"2005/3" 81.45479 0 0 0 15
"2005/4" 89.07868 0 0 0 16
"2005/5" 99.44091 0 0 0 17
"2005/6" 99.41497 0 0 0 18
"2005/7" 85.08384 0 0 0 19
"2005/8" 82.83349 0 0 0 20
"2005/9" 160.47383 0 0 0 21
"2005/10" 71.51886 0 0 0 22
"2005/11" 95.44765 0 0 0 23
"2005/12" 61.47662 1 0 1 24
"2006/1" 83.96114 0 0 0 25
"2006/2" 60.63415 0 0 0 26
"2006/3" 79.82993 0 0 0 27
"2006/4" 89.04356 0 0 0 28
"2006/5" 82.44514 0 0 0 29
"2006/6" 89.85152 0 0 0 30
"2006/7" 82.00437 0 0 0 31
"2006/8" 58.86663 0 0 0 32
"2006/9" 76.82971 0 0 0 33
"2006/10" 71.2218 0 0 0 34
"2006/11" 73.84509 1 0 0 35
"2006/12" 74.91799 0 0 0 36
"2007/1" 62.33881 0 0 0 37
"2007/2" 58.51786 0 0 0 38
"2007/3" 71.11645 0 0 0 39
"2007/4" 65.16531 0 0 0 40
"2007/5" 54.99327 0 0 0 41
"2007/6" 60.84606 0 0 0 42
"2007/7" 47.69234 0 0 0 43
"2007/8" 94.66286 0 0 0 44
"2007/9" 166.7332 0 0 0 45
"2007/10" 96.88046 0 0 0 46
"2007/11" 97.73734 0 0 0 47
"2007/12" 98.01473 1 0 1 48
"2008/1" 160.25905 0 0 1 49
"2008/2" 128.78455 0 0 0 50
"2008/3" 139.87073 0 0 0 51
"2008/4" 96.74758 0 0 0 52
"2008/5" 76.82344 0 0 0 53
"2008/6" 106.42784 0 0 0 54
"2008/7" 87.93302 0 0 0 55
"2008/8" 92.29639 0 0 0 56
"2008/9" 156.0435 0 0 0 57
"2008/10" 216.5918 0 0 0 58
"2008/11" 156.77446 1 0 0 59
"2008/12" 136.78456 0 0 0 60
"2009/1" 159.99384 0 0 0 61
"2009/2" 139.69698 0 0 0 62
"2009/3" 133.46071 0 0 0 63
"2009/4" 119.9992 0 0 1 64
"2009/5" 122.9601 0 0 0 65
"2009/6" 113.23891 0 0 0 66
"2009/7" 95.94823 0 0 0 67
"2009/8" 91.37744 0 0 0 68
"2009/9" 104.3236 0 0 0 69
"2009/10" 105.04014 0 0 0 70
"2009/11" 133.00749 1 0 1 71
"2009/12" 115.2626 0 0 1 72
"2010/1" 142.00356 0 0 0 73
"2010/2" 136.73906 0 0 0 74
"2010/3" 137.8383 0 0 0 75
"2010/4" 152.78447 0 0 0 76
"2010/5" 203.30525 0 0 0 77
"2010/6" 171.40266 0 0 1 78
"2010/7" 186.55524 0 0 0 79
"2010/8" 172.81606 0 0 0 80
"2010/9" 161.69014 0 0 0 81
"2010/10" 186.1411 0 1 0 82
"2010/11" 172.68817 1 0 0 83
"2010/12" 183.076 0 0 0 84
"2011/1" 143.03174 0 0 0 85
"2011/2" 122.44579 0 0 0 86
"2011/3" 154.4015 0 0 0 87
"2011/4" 145.5086 0 0 0 88
"2011/5" 134.21507 0 0 1 89
"2011/6" 168.2959 0 0 0 90
"2011/7" 183.40234 0 0 0 91
"2011/8" 230.29893 0 0 0 92
"2011/9" 280.05814 0 0 0 93
"2011/10" 241.75185 0 0 0 94
"2011/11" 304.60022 1 0 0 95
"2011/12" 228.8716 0 0 0 96
"2012/1" 216.73445 0 0 0 97
"2012/2" 193.44435 0 0 0 98
"2012/3" 177.4927 0 0 0 99
"2012/4" 216.99586 0 0 0 100
end
At first glance I thought of creating a loop that generates event_window. But some questions arise about how to handle the variable when there are two sequential shocks (e.g. in 2009/11 and 2009/12 for eu_reg_shock), or when two or more shocks fall within the same time window. If the window is too large, this will be problematic, I assume.
My main goal is to analyze whether these shocks affect the continuous variable before and after. Ideally, I need to normalize the continuous variable (to a mean of one) before the shock. Here is the study and the plot that I wish to replicate, from Scott R. Baker, Nicholas Bloom, and Stephen J. Terry (2022).
I thought about the following plot. But I have no idea about the normalization part.
graph bar (mean) epeu_lvl, over(event_time)
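For reference, this is the kind of construction I had in mind for a single shock variable (a rough sketch using cop_shock only, covering just the post-shock side, and assuming the data are sorted by tid):
sort tid
gen last_shock = tid if cop_shock == 1
replace last_shock = last_shock[_n-1] if missing(last_shock)  // carry the most recent shock period forward
gen event_time = tid - last_shock  // periods elapsed since the most recent shock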
References:
Scott R. Baker, Nicholas Bloom, and Stephen J. Terry (2022). https://www.nber.org/papers/w27167
Dylan Balla-Elliott. https://dballaelliott.github.io/esplot/index.html

AWK Formatting Using First Row as a Header and Iterating by column

I'm struggling trying to format a collectd plotted file so I can later import it into an InfluxDB instance.
This is what the file looks like:
#Date Time [CPU]User% [CPU]Nice% [CPU]Sys% [CPU]Wait% [CPU]Irq% [CPU]Soft% [CPU]Steal% [CPU]Idle% [CPU]Totl% [CPU]Intrpt/sec [CPU]Ctx/sec [CPU]Proc/sec [CPU]ProcQue [CPU]ProcRun [CPU]L-Avg1 [CPU]L-Avg5 [CPU]L-Avg15 [CPU]RunTot [CPU]BlkTot [MEM]Tot [MEM]Used [MEM]Free [MEM]Shared [MEM]Buf [MEM]Cached [MEM]Slab [MEM]Map [MEM]Anon [MEM]Commit [MEM]Locked [MEM]SwapTot [MEM]SwapUsed [MEM]SwapFree [MEM]SwapIn [MEM]SwapOut [MEM]Dirty [MEM]Clean [MEM]Laundry [MEM]Inactive [MEM]PageIn [MEM]PageOut [MEM]PageFaults [MEM]PageMajFaults [MEM]HugeTotal [MEM]HugeFree [MEM]HugeRsvd [MEM]SUnreclaim [SOCK]Used [SOCK]Tcp [SOCK]Orph [SOCK]Tw [SOCK]Alloc [SOCK]Mem [SOCK]Udp [SOCK]Raw [SOCK]Frag [SOCK]FragMem [NET]RxPktTot [NET]TxPktTot [NET]RxKBTot [NET]TxKBTot [NET]RxCmpTot [NET]RxMltTot [NET]TxCmpTot [NET]RxErrsTot [NET]TxErrsTot [DSK]ReadTot [DSK]WriteTot [DSK]OpsTot [DSK]ReadKBTot [DSK]WriteKBTot [DSK]KbTot [DSK]ReadMrgTot [DSK]WriteMrgTot [DSK]MrgTot [INODE]NumDentry [INODE]openFiles [INODE]MaxFile% [INODE]used [NFS]ReadsS [NFS]WritesS [NFS]MetaS [NFS]CommitS [NFS]Udp [NFS]Tcp [NFS]TcpConn [NFS]BadAuth [NFS]BadClient [NFS]ReadsC [NFS]WritesC [NFS]MetaC [NFS]CommitC [NFS]Retrans [NFS]AuthRef [TCP]IpErr [TCP]TcpErr [TCP]UdpErr [TCP]IcmpErr [TCP]Loss [TCP]FTrans [BUD]1Page [BUD]2Pages [BUD]4Pages [BUD]8Pages [BUD]16Pages [BUD]32Pages [BUD]64Pages [BUD]128Pages [BUD]256Pages [BUD]512Pages [BUD]1024Pages
20190228 00:01:00 12 0 3 0 0 1 0 84 16 26957 20219 14 2991 3 0.05 0.18 0.13 1 0 198339428 197144012 1195416 0 817844 34053472 1960600 76668 158641184 201414800 0 17825788 0 17825788 0 0 224 0 0 19111168 3 110 4088 0 0 0 0 94716 2885 44 0 5 1982 1808 0 0 0 0 9739 9767 30385 17320 0 0 0 0 0 0 12 13 3 110 113 0 16 16 635592 7488 0 476716 0 0 0 0 0 0 0 0 0 0 0 8 0 0 22 0 1 0 0 0 0 48963 10707 10980 1226 496 282 142 43 19 6 132
20190228 00:02:00 11 0 3 0 0 1 0 85 15 26062 18226 5 2988 3 0.02 0.14 0.12 2 0 198339428 197138128 1201300 0 817856 34054692 1960244 75468 158636064 201398036 0 17825788 0 17825788 0 0 220 0 0 19111524 0 81 960 0 0 0 0 94420 2867 42 0 7 1973 1842 0 0 0 0 9391 9405 28934 16605 0 0 0 0 0 0 9 9 0 81 81 0 11 11 635446 7232 0 476576 0 0 0 0 0 0 0 0 0 0 0 3 0 0 8 0 1 0 0 0 0 49798 10849 10995 1241 499 282 142 43 19 6 132
20190228 00:03:00 11 0 3 0 0 1 0 85 15 25750 17963 4 2980 0 0.00 0.11 0.10 2 0 198339428 197137468 1201960 0 817856 34056400 1960312 75468 158633880 201397832 0 17825788 0 17825788 0 0 320 0 0 19111712 0 75 668 0 0 0 0 94488 2869 42 0 5 1975 1916 0 0 0 0 9230 9242 28411 16243 0 0 0 0 0 0 9 9 0 75 75 0 10 10 635434 7232 0 476564 0 0 0 0 0 0 0 0 0 0 0 2 0 0 6 0 1 0 0 0 0 50029 10817 10998 1243 501 282 142 43 19 6 132
20190228 00:04:00 11 0 3 0 0 1 0 84 16 25755 17871 10 2981 5 0.08 0.11 0.10 3 0 198339428 197140864 1198564 0 817856 34058072 1960320 75468 158634508 201398088 0 17825788 0 17825788 0 0 232 0 0 19111980 0 79 2740 0 0 0 0 94488 2867 4 0 2 1973 1899 0 0 0 0 9191 9197 28247 16183 0 0 0 0 0 0 9 9 0 79 79 0 10 10 635433 7264 0 476563 0 0 0 0 0 0 0 0 0 0 0 5 0 0 12 0 1 0 0 0 0 49243 10842 10985 1245 501 282 142 43 19 6 132
20190228 00:05:00 12 0 4 0 0 1 0 83 17 26243 18319 76 2985 3 0.06 0.10 0.09 2 0 198339428 197148040 1191388 0 817856 34059808 1961420 75492 158637636 201405208 0 17825788 0 17825788 0 0 252 0 0 19112012 0 85 18686 0 0 0 0 95556 2884 43 0 6 1984 1945 0 0 0 0 9176 9173 28153 16029 0 0 0 0 0 0 10 10 0 85 85 0 12 12 635473 7328 0 476603 0 0 0 0 0 0 0 0 0 0 0 3 0 0 7 0 1 0 0 0 0 47625 10801 10979 1253 505 282 142 43 19 6 132
What I'm trying to do is to get it into a format that looks like this:
cpu_value,host=mxspacr1,instance=5,type=cpu,type_instance=softirq value=180599 1551128614916131663
cpu_value,host=mxspacr1,instance=2,type=cpu,type_instance=interrupt value=752 1551128614916112943
cpu_value,host=mxspacr1,instance=4,type=cpu,type_instance=softirq value=205697 1551128614916128446
cpu_value,host=mxspacr1,instance=7,type=cpu,type_instance=nice value=19250943 1551128614916111618
cpu_value,host=mxspacr1,instance=2,type=cpu,type_instance=softirq value=160513 1551128614916127690
cpu_value,host=mxspacr1,instance=1,type=cpu,type_instance=softirq value=178677 1551128614916127265
cpu_value,host=mxspacr1,instance=0,type=cpu,type_instance=softirq value=212274 1551128614916126586
cpu_value,host=mxspacr1,instance=6,type=cpu,type_instance=interrupt value=673 1551128614916116661
cpu_value,host=mxspacr1,instance=4,type=cpu,type_instance=interrupt value=701 1551128614916115893
cpu_value,host=mxspacr1,instance=3,type=cpu,type_instance=interrupt value=723 1551128614916115492
cpu_value,host=mxspacr1,instance=1,type=cpu,type_instance=interrupt value=756 1551128614916112550
cpu_value,host=mxspacr1,instance=6,type=cpu,type_instance=nice value=21661921 1551128614916111032
cpu_value,host=mxspacr1,instance=3,type=cpu,type_instance=nice value=18494760 1551128614916098304
cpu_value,host=mxspacr1,instance=0,type=cpu,type_instance=interrupt value=552 1551
What I have managed to do so far is just to convert the date string into EPOCH format.
I was thinking of somehow using the first value "[CPU]" as the measurement and "User%" as the type; the host I can take from the system where the script will run.
I would really appreciate your help, because I have only very basic knowledge of text editing.
Thanks.
EDIT: this is what I would expect to get for the second line, using the first row as a header:
cpu_value,host=mxspacr1,type=cpu,type_instance=user% value=0 1551128614916131663
EDIT: This is what I have so far, and I'm stuck here.
awk -v HOSTNAME="$HOSTNAME" 'BEGIN { FS="[][]"; getline; NR==1; f1=$2; f2=$3 } { RS=" "; printf f1"_measurement,host="HOSTNAME",type="f2"value="$3" ", system("date +%s -d \""$1" "$2"\"") }' mxmcaim01-20190228.tab
And this is what I get, but only for one column; now I don't know how to process the remaining columns such as Nice, Sys, Wait and so on.
CPU_measurement,host=mxmcamon05,type=User% value= 1552014000
CPU_measurement,host=mxmcamon05,type=User% value= 1551960000
CPU_measurement,host=mxmcamon05,type=User% value= 1551343500
CPU_measurement,host=mxmcamon05,type=User% value= 1551997620
CPU_measurement,host=mxmcamon05,type=User% value= 1551985200
CPU_measurement,host=mxmcamon05,type=User% value= 1551938400
CPU_measurement,host=mxmcamon05,type=User% value= 1551949200
CPU_measurement,host=mxmcamon05,type=User% value= 1551938400
CPU_measurement,host=mxmcamon05,type=User% value= 1551938400
CPU_measurement,host=mxmcamon05,type=User% value= 1551945600
CPU_measurement,host=mxmcamon05,type=User% value= 1551938400
Please help.
EDIT: First of all, thanks for your help.
Taking advantage of your knowledge in text editing, I was expecting to use this for 3 separate files, but unfortunately (and I don't know why) the format is different, like this:
#Date Time SlabName ObjInUse ObjInUseB ObjAll ObjAllB SlabInUse SlabInUseB SlabAll SlabAllB SlabChg SlabPct
20190228 00:01:00 nfsd_drc 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfsd4_delegations 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfsd4_stateids 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfsd4_files 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfsd4_stateowners 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfs_direct_cache 0 0 0 0 0 0 0 0 0 0
So I don't know how to handle the arrays in a way that I can use nfsd_drc as the type, then iterate through ObjInUse ObjInUseB ObjAll ObjAllB SlabInUse SlabInUseB SlabAll SlabAllB SlabChg SlabPct as the type_instance, and finally take the value (in this case ObjInUse will be 0, ObjInUseB = 0, ObjAll = 0, and so on), making something like this:
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=ObjectInUse value=0 1551128614916131663
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=ObjInuseB value=0 1551128614916131663
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=ObjAll value=0 1551128614916112943
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=ObjAllB value=0 1551128614916128446
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabInUse value=0 1551128614916111618
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabInUseB value=0 1551128614916127690
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabAll value=0 1551128614916127265
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabAllB value=0 1551128614916126586
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabChg value=0 1551128614916116661
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabPct value=0 1551128614916115893
slab_value is a hard-coded value.
Thanks.
It is not clear where instance and type_instance=interrupt come from in your final desired format. Otherwise, the awk code below should work.
Note: it doesn't strip % from tag values, and it prints the timestamp at the end of the line in seconds (append extra zeros if you want nanoseconds).
gawk -v HOSTNAME="$HOSTNAME" '
NR==1 {
    # split the header on spaces, tabs and brackets, then lowercase the tokens
    split($0, h, /[ \t\[\]]+/, s)
    for (i = 0; i < length(h); i++) h[i] = tolower(h[i])
}
NR>1 {
    # one output line per data column; Date/Time are converted to epoch seconds via mktime()
    for (j = 2; j < NF; j++) {
        k = 2*j
        printf("%s_value,host=%s,type=%s,type_instance=%s value=%s %s\n",
            h[k], HOSTNAME, h[k], h[k+1], $(j+1),
            mktime(substr($1,1,4) " " substr($1,5,2) " " substr($1,7,2) " " substr($2,1,2) " " substr($2,4,2) " " substr($2,7,2)))
    }
}' mxmcaim01-20190228.tab

What image format are MNIST images?

I've unpacked the first image from the MNIST training set and I can access the (28,28) matrix.
[[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 3 18 18 18 126 136
175 26 166 255 247 127 0 0 0 0]
[ 0 0 0 0 0 0 0 0 30 36 94 154 170 253 253 253 253 253
225 172 253 242 195 64 0 0 0 0]
[ 0 0 0 0 0 0 0 49 238 253 253 253 253 253 253 253 253 251
93 82 82 56 39 0 0 0 0 0]
[ 0 0 0 0 0 0 0 18 219 253 253 253 253 253 198 182 247 241
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 80 156 107 253 253 205 11 0 43 154
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 14 1 154 253 90 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 139 253 190 2 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 11 190 253 70 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 35 241 225 160 108 1
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 81 240 253 253 119
25 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 45 186 253 253
150 27 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16 93 252
253 187 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 249
253 249 64 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 46 130 183 253
253 207 2 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 39 148 229 253 253 253
250 182 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 24 114 221 253 253 253 253 201
78 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 23 66 213 253 253 253 253 198 81 2
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 18 171 219 253 253 253 253 195 80 9 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 55 172 226 253 253 253 253 244 133 11 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 136 253 253 253 212 135 132 16 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]]
I want to do some image processing on it, like converting to grayscale and then binarizing it (for machine learning), however I'm confused as to what kind of image format I'm dealing with. If this were (28, 28, 3), it would obviously be an RGB image with 3 channels. However, this is a (28, 28) image with each pixel taking on a value in the discrete range [0, 255], which is rather odd. Is this image already in grayscale, and do I just have to normalize the pixel values? What exactly does normalization entail? Do I multiply the flattened vector by the scalar 1/(sum of all energy values)?
Thanks!
The images are 28x28 pixel grey-scale images with 8-bit quantization (hence the range [0, 255]). The images were apparently binary black/white images, but anti-aliasing during resizing caused them to have additional grey-scale values. See the MNIST website for additional details.
Normally, you would normalize by dividing all values by 255 (not the sum of all pixel values).
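For instance, in Python/NumPy (a minimal sketch; img here is just a placeholder for the (28, 28) array shown above):
import numpy as np

img = np.zeros((28, 28), dtype=np.uint8)        # placeholder for the (28, 28) uint8 array above
img_norm = img.astype(np.float32) / 255.0       # grey-scale values scaled to [0, 1]
img_bin = (img_norm > 0.5).astype(np.uint8)     # simple fixed-threshold binarization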

How to compare the values in two different rows with awk?

Given this file:
Variable_name Value
Aborted_clients 0
Aborted_connects 4
Binlog_cache_disk_use 0
Binlog_cache_use 0
Binlog_stmt_cache_disk_use 0
Binlog_stmt_cache_use 0
Bytes_received 141
Bytes_sent 177
Com_admin_commands 0
Com_assign_to_keycache 0
Com_alter_db 0
Com_alter_db_upgrade 0
Com_alter_event 0
Com_alter_function 0
Com_alter_procedure 0
Com_alter_server 0
Com_alter_table 0
Com_alter_tablespace 0
Com_analyze 0
Com_begin 0
Com_binlog 0
Com_call_procedure 0
Com_change_db 0
Com_change_master 0
Com_check 0
Com_checksum 0
Com_commit 0
Com_create_db 0
Com_create_event 0
Com_create_function 0
Com_create_index 0
Com_create_procedure 0
Com_create_server 0
Com_create_table 0
Com_create_trigger 0
Com_create_udf 0
Com_create_user 0
Com_create_view 0
Com_dealloc_sql 0
Com_delete 0
Com_delete_multi 0
Com_do 0
Com_drop_db 0
Com_drop_event 0
Com_drop_function 0
Com_drop_index 0
Com_drop_procedure 0
Com_drop_server 0
Com_drop_table 0
Com_drop_trigger 0
Com_drop_user 0
Com_drop_view 0
Com_empty_query 0
Com_execute_sql 0
Com_flush 0
Com_grant 0
Com_ha_close 0
Com_ha_open 0
Com_ha_read 0
Com_help 0
Com_insert 0
Com_insert_select 0
Com_install_plugin 0
Com_kill 0
Com_load 0
Com_lock_tables 0
Com_optimize 0
Com_preload_keys 0
Com_prepare_sql 0
Com_purge 0
Com_purge_before_date 0
Com_release_savepoint 0
Com_rename_table 0
Com_rename_user 0
Com_repair 0
Com_replace 0
Com_replace_select 0
Com_reset 0
Com_resignal 0
Com_revoke 0
Com_revoke_all 0
Com_rollback 0
Com_rollback_to_savepoint 0
Com_savepoint 0
Com_select 1
Com_set_option 0
Com_signal 0
Com_show_authors 0
Com_show_binlog_events 0
Com_show_binlogs 0
Com_show_charsets 0
Com_show_collations 0
Com_show_contributors 0
Com_show_create_db 0
Com_show_create_event 0
Com_show_create_func 0
Com_show_create_proc 0
Com_show_create_table 0
Com_show_create_trigger 0
Com_show_databases 0
Com_show_engine_logs 0
Com_show_engine_mutex 0
Com_show_engine_status 0
Com_show_events 0
Com_show_errors 0
Com_show_fields 0
Com_show_function_status 0
Com_show_grants 0
Com_show_keys 0
Com_show_master_status 0
Com_show_open_tables 0
Com_show_plugins 0
Com_show_privileges 0
Com_show_procedure_status 0
Com_show_processlist 0
Com_show_profile 0
Com_show_profiles 0
Com_show_relaylog_events 0
Com_show_slave_hosts 0
Com_show_slave_status 0
Com_show_status 1
Com_show_storage_engines 0
Com_show_table_status 0
Com_show_tables 0
Com_show_triggers 0
Com_show_variables 0
Com_show_warnings 0
Com_slave_start 0
Com_slave_stop 0
Com_stmt_close 0
Com_stmt_execute 0
Com_stmt_fetch 0
Com_stmt_prepare 0
Com_stmt_reprepare 0
Com_stmt_reset 0
Com_stmt_send_long_data 0
Com_truncate 0
Com_uninstall_plugin 0
Com_unlock_tables 0
Com_update 0
Com_update_multi 0
Com_xa_commit 0
Com_xa_end 0
Com_xa_prepare 0
Com_xa_recover 0
Com_xa_rollback 0
Com_xa_start 0
Compression OFF
Connections 375
Created_tmp_disk_tables 0
Created_tmp_files 6
Created_tmp_tables 0
Delayed_errors 0
Delayed_insert_threads 0
Delayed_writes 0
Flush_commands 1
Handler_commit 0
Handler_delete 0
Handler_discover 0
Handler_prepare 0
Handler_read_first 0
Handler_read_key 0
Handler_read_last 0
Handler_read_next 0
Handler_read_prev 0
Handler_read_rnd 0
Handler_read_rnd_next 0
Handler_rollback 0
Handler_savepoint 0
Handler_savepoint_rollback 0
Handler_update 0
Handler_write 0
Innodb_buffer_pool_pages_data 584
Innodb_buffer_pool_bytes_data 9568256
Innodb_buffer_pool_pages_dirty 0
Innodb_buffer_pool_bytes_dirty 0
Innodb_buffer_pool_pages_flushed 120
Innodb_buffer_pool_pages_free 7607
Innodb_buffer_pool_pages_misc 0
Innodb_buffer_pool_pages_total 8191
Innodb_buffer_pool_read_ahead_rnd 0
Innodb_buffer_pool_read_ahead 0
Innodb_buffer_pool_read_ahead_evicted 0
Innodb_buffer_pool_read_requests 14912
Innodb_buffer_pool_reads 584
Innodb_buffer_pool_wait_free 0
Innodb_buffer_pool_write_requests 203
Innodb_data_fsyncs 163
Innodb_data_pending_fsyncs 0
Innodb_data_pending_reads 0
Innodb_data_pending_writes 0
Innodb_data_read 11751424
Innodb_data_reads 594
Innodb_data_writes 243
Innodb_data_written 3988480
Innodb_dblwr_pages_written 120
Innodb_dblwr_writes 40
Innodb_have_atomic_builtins ON
Innodb_log_waits 0
Innodb_log_write_requests 28
Innodb_log_writes 41
Innodb_os_log_fsyncs 83
Innodb_os_log_pending_fsyncs 0
Innodb_os_log_pending_writes 0
Innodb_os_log_written 34816
Innodb_page_size 16384
Innodb_pages_created 1
Innodb_pages_read 583
Innodb_pages_written 120
Innodb_row_lock_current_waits 0
Innodb_row_lock_time 0
Innodb_row_lock_time_avg 0
Innodb_row_lock_time_max 0
Innodb_row_lock_waits 0
Innodb_rows_deleted 0
Innodb_rows_inserted 0
Innodb_rows_read 40
Innodb_rows_updated 39
Innodb_truncated_status_writes 0
Key_blocks_not_flushed 0
Key_blocks_unused 13396
Key_blocks_used 0
Key_read_requests 0
Key_reads 0
Key_write_requests 0
Key_writes 0
Last_query_cost 0.000000
Max_used_connections 3
Not_flushed_delayed_rows 0
Open_files 86
Open_streams 0
Open_table_definitions 109
Open_tables 109
Opened_files 439
Opened_table_definitions 0
Opened_tables 0
Performance_schema_cond_classes_lost 0
Performance_schema_cond_instances_lost 0
Performance_schema_file_classes_lost 0
Performance_schema_file_handles_lost 0
Performance_schema_file_instances_lost 0
Performance_schema_locker_lost 0
Performance_schema_mutex_classes_lost 0
Performance_schema_mutex_instances_lost 0
Performance_schema_rwlock_classes_lost 0
Performance_schema_rwlock_instances_lost 0
Performance_schema_table_handles_lost 0
Performance_schema_table_instances_lost 0
Performance_schema_thread_classes_lost 0
Performance_schema_thread_instances_lost 0
Prepared_stmt_count 0
Qcache_free_blocks 1
Qcache_free_memory 16758160
Qcache_hits 0
Qcache_inserts 1
Qcache_lowmem_prunes 0
Qcache_not_cached 419
Qcache_queries_in_cache 1
Qcache_total_blocks 4
Queries 1146
Questions 2
Rpl_status AUTH_MASTER
Select_full_join 0
Select_full_range_join 0
Select_range 0
Select_range_check 0
Select_scan 0
Slave_heartbeat_period 0.000
Slave_open_temp_tables 0
Slave_received_heartbeats 0
Slave_retried_transactions 0
Slave_running OFF
Slow_launch_threads 0
Slow_queries 0
Sort_merge_passes 0
Sort_range 0
Sort_rows 0
Sort_scan 0
Ssl_accept_renegotiates 0
Ssl_accepts 0
Ssl_callback_cache_hits 0
Ssl_cipher
Ssl_cipher_list
Ssl_client_connects 0
Ssl_connect_renegotiates 0
Ssl_ctx_verify_depth 0
Ssl_ctx_verify_mode 0
Ssl_default_timeout 0
Ssl_finished_accepts 0
Ssl_finished_connects 0
Ssl_session_cache_hits 0
Ssl_session_cache_misses 0
Ssl_session_cache_mode NONE
Ssl_session_cache_overflows 0
Ssl_session_cache_size 0
Ssl_session_cache_timeouts 0
Ssl_sessions_reused 0
Ssl_used_session_cache_entries 0
Ssl_verify_depth 0
Ssl_verify_mode 0
Ssl_version
Table_locks_immediate 123
Table_locks_waited 0
Tc_log_max_pages_used 0
Tc_log_page_size 0
Tc_log_page_waits 0
Threads_cached 1
Threads_connected 2
Threads_created 3
Threads_running 1
Uptime 2389
Uptime_since_flush_status 2389
How would one use awk to make this calculation of Queries per second (Queries/Uptime):
1146/2389
And print the result?
I'm grepping 2 results from a list of results and need to calculate items per second, where 302 is the total item count and 503 the total uptime count.
At this moment I'm doing
grep -Ew "Queries|Uptime" | awk '{print $2}'
to print out:
302
503
But here I got stuck.
You can use something like:
$ awk '/Queries/ {q=$2} /Uptime/ {print q/$2}' file
0.600398
That is: when the line contains the string "Queries", store its value. When it contains "Uptime", print the result of dividing its value by the stored Queries value.
This assumes that the string "Queries" appears before the string "Uptime".
Given your updated input, I see that we need to check if the first field is exactly "Uptime" or "Queries" so that it does not match other lines with this content:
$ awk '$1 == "Queries" {q=$2} $1=="Uptime" {print q/$2}' file
0.479699
I think the following awk one-liner will help you:
kent$ cat f
Queries 302
Uptime 503
LsyHP 13:42:57 /tmp/test
kent$ awk '{a[NR]=$NF}END{printf "%.2f\n",a[NR-1]/a[NR]}' f
0.60
If you want to do it together with the "grep"-style matching:
kent$ awk '/Queries/{a=$NF}/Uptime/{b=$NF}END{printf "%.2f\n",a/b}' f
0.60
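If you prefer to keep the grep step from the question, its two-line output can also be piped straight into awk (a small sketch, assuming Queries is printed before Uptime, as in the file above, and using file as a placeholder name):
grep -Ew "Queries|Uptime" file | awk 'NR==1 {q=$2} NR==2 {print q/$2}'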
