With the following configuration in pyproject.toml:
[tool.black]
# How many characters per line to allow.
line-length = 120
# When processing Jupyter Notebooks, add the given magic to the list of known
# python-magics (timeit, prun, capture, pypy, python3, python, time).
# Useful for formatting cells with custom python magics.
# python-cell-magics =
# Require a specific version of Black to be running
# (useful for unifying results across many environments e.g. with a pyproject.toml file).
# It can be either a major version number or an exact version.
# required-version =
# A regular expression that matches files and directories that should be
# included on recursive searches. An empty value means all files are included
# regardless of the name. Use forward slashes for directories on all platforms (Windows, too).
# Exclusions are calculated first, inclusions later.
# include = "(\.pyi?|\.ipynb)$"
# A regular expression that matches files and directories that should be
# excluded on recursive searches. An empty value means no paths are excluded.
# Use forward slashes for directories on all platforms (Windows, too).
# Exclusions are calculated first, inclusions later.
# exclude = "/(\.direnv|\.eggs|\.git|\.hg|\.mypy_cache|\.nox|\.tox|\.venv|venv|\.svn|\.ipynb_checkpoints|_build|buck-out|build|dist|__pypackages__)/"
# Like 'exclude', but adds additional files and directories on top of the excluded ones.
# (Useful if you simply want to add to the default).
# extend-exclude =
# Like 'exclude', but files and directories matching this regex will be excluded
# even when they are passed explicitly as arguments.
# force-exclude =
# The name of the file when passing it through stdin.
# Useful to make sure Black will respect 'force-exclude' option on some editors that rely on using stdin.
# stdin-filename =
# Number of parallel workers.
# Can be a number or a range.
# workers =
and this command line:
black --config "pyproject.toml" --target-version py39 --check --diff .
the following line of code is flagged:
ave_quantity = self.exec_math(math_iterable["mean"], "mean", []) # execute the "mean" fxn on the dataset # cspell: disable-line # fmt: skip
--- properties/datasets/models.py 2022-11-30 00:01:16.590743 +0000
+++ properties/datasets/models.py 2022-11-30 00:01:18.692767 +0000
@@ -746,11 +746,13 @@
calculate the mean value of all the dataset points
return: numerical value of this function when all variables are zero
rtype: float
"""
- ave_quantity = self.exec_math(math_iterable["mean"], "mean", []) # execute the "mean" fxn on the dataset # fmt:skip
+ ave_quantity = self.exec_math(
+ math_iterable["mean"], "mean", []
+ ) # execute the "mean" fxn on the dataset # fmt:skip
return getattr(ave_quantity, "magnitude", 0.0)
def serialize(self, flat=False):
return {
"type": "dataset",
would reformat properties/datasets/models.py
Oh no! 💥 💔 💥
1 file would be reformatted, 102 files would be left unchanged.
What am I missing here?
Using black v22.10.0
Also asked here --> https://github.com/psf/black/issues/451#issuecomment-1331478945
I have to rotate (like "clockwise") a single entry of a tuple in a list of tuples. I use this single entry as a focus.
When the focus has reached the end of the list of tuples, it should wrap around to the beginning again.
x = []
x.append(('search','https://www.search.com','1')) # '1' is the focus
x.append(('financials','https://www.stock-exchange.com','0'))
x.append(('fastfood','https://www.burgers.com','0'))
x.append(('tv','https://www.tv.com','0'))
print x
[('search','https://www.search.com','1'),('financials','https://www.stock-exchange.com','0'),('fastfood','https://www.burgers.com','0'),('tv','https://www.tv.com','0')]
What I need is this (displayed abstractly)...
- - *
- -
- -
- -
Then I must switch the focus to the "next line"...
- -
- - *
- -
- -
...and later...
- -
- -
- - *
- -
...then...
- -
- -
- -
- - *
...and then this again...
- - *
- -
- -
- -
...and so on.
Sometimes I need to find the focus in my code and extract the data from the first and second entries of the tuple where the focus (third entry) is currently placed.
With this "one-liner", I can find the focus and store all three entries in a, b and c:
a,b,c = list(x[[index for (index, a_tuple) in enumerate(x) if a_tuple[2]=='1'][0]])
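(Aside: the same lookup can be written with next(), which stops at the first match instead of building the whole index list — a sketch, using the same x as above:)
a, b, c = x[next(index for (index, a_tuple) in enumerate(x) if a_tuple[2] == '1')]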
I like those "one-liners". But changing the focus to the "next line" does not seem to be so easy.
newX = []                                    # create a new list for x
a = 0                                        # helper: index of the current focus (kept outside the if so the del below always works)
if '1' in x[len(x)-1][2]:                    # is the focus currently on the last entry? If yes, move it to the beginning
    for b in range(0, len(x), 1):            # parse
        if b == 0:                           # set the focus on the first entry
            newX.append((x[b][0], x[b][1], '1'))
        else:                                # no focus for all other entries
            newX.append((x[b][0], x[b][1], '0'))
else:                                        # the focus was not on the end. Where is it?
    for b in range(0, len(x), 1):            # parse again
        if '1' in x[b][2]:                   # focus found
            a = b                            # remember the tuple number
            break                            # already found, stop searching
    for b in range(0, len(x), 1):            # parse again
        if b == a + 1:                       # set the new focus on the next entry
            newX.append((x[b][0], x[b][1], '1'))
        else:                                # no focus for all other entries
            newX.append((x[b][0], x[b][1], '0'))
x = newX                                     # replace x with the new list
del newX, a, b                               # save some memory
I managed to do this, but I don't like my code. This task looks so simple that I think there has to be a "one-liner" for it, something built into Python that is intended for exactly this. I have v2.7. Any ideas?
If I understand you correctly, you want the modulo (%) operator.
Remove the third item from each tuple and keep a separate variable named selected that holds the index of the selected item:
lst = [
    ("search", "https://www.search.com"),
    ("financials", "https://www.stock-exchange.com"),
    ("fastfood", "https://www.burgers.com"),
    ("tv", "https://www.tv.com"),
]

selected = 0

def next_selected(lst, cur_selected):
    return (cur_selected + 1) % len(lst)

selected = next_selected(lst, selected)  # will wrap around
Then getting the selected item is simply:
a, b = lst[selected]
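For example, stepping the selection through the list twice shows the wrap-around (a quick demo):
# Demo: advance the selection 2 * len(lst) times; it wraps after the last entry.
selected = 0
for _ in range(2 * len(lst)):
    name, url = lst[selected]
    print("%d %s %s" % (selected, name, url))
    selected = next_selected(lst, selected)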
I think I've got it:
lst = {
    0: ("0", "search", "https://www.search.com"),
    1: ("0", "financials", "https://www.stock-exchange.com"),
    2: ("*", "fastfood", "https://www.burgers.com"),
    3: ("0", "tv", "https://www.tv.com"),
}

def next_selected(lst, cur_selected):
    return (cur_selected + 1) % len(lst)

for index in range(len(lst)):                    # parse (the keys are 0..n-1, so walk them in order)
    if lst[index][0] == "*":                     # found the focus
        break
lst[index] = "0", lst[index][1], lst[index][2]   # delete the old focus
selected = index                                 # use the index found above
selected = next_selected(lst, selected)          # will wrap around
lst[selected] = "*", lst[selected][1], lst[selected][2]  # store the focus on the selected entry
a, b = lst[selected][1], lst[selected][2]        # get the values of the selected entry
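For completeness, the same rotate-and-wrap idea can also be compressed while keeping the original list-of-tuples x from the question (a sketch, assuming exactly one entry carries the '1' flag):
i = [j for j, t in enumerate(x) if t[2] == '1'][0]       # find the current focus
x = [(name, url, '1' if j == (i + 1) % len(x) else '0')  # move it one step, wrapping around
     for j, (name, url, flag) in enumerate(x)]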
I am brand new to PostgreSQL and I am having trouble returning a simple SELECT * FROM TABLE; query.
The problem I am encountering is that the client times out and I receive out-of-memory (OOM) errors when trying to return a large number of rows, in this case roughly 16 million.
I have tested the query on the server by executing it through psql on the command line, and it returns the full result set in about 10 minutes.
I have also tested this using the pgAdmin client on a MacBook, and I was able to return the full result set in about the same amount of time as on the server.
However, when using a Windows client (Jetbrains/Datagrip, and I have tried MySQL Lite & pgAdmin with the same results), I am unable to return the full table, whether I query for it using SELECT * FROM TABLE_NAME; or try to load the table/datagrid in the client itself.
This leads me to believe it is a client-related issue, but if it is, I hope someone can provide some insight, because at this point I am stumped as to why I cannot return this ~16 million row table/datagrid.
I am also having difficulty logging the issue, since the query times out or runs out of memory.
Any suggestions, insight, and/or guidance is greatly appreciated.
SPECS
Connecting Oracle 12c to PostgreSQL 9.3.5 using the Oracle Foreign Data Wrapper (oracle_fdw v1.0)
VMWare
Debian GNU/Linux 7
Psql (9.3.5)
16GB RAM
4 CPUs
Here is my PostgreSQL config:
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
# name = value
#
# (The "=" is optional.) Whitespace may be used. Comments are introduced with
# "#" anywhere on a line. The complete list of parameter names and allowed
# values can be found in the PostgreSQL documentation.
#
# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal. If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pg_ctl reload". Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#
# Any parameter can also be given as a command-line option to the server, e.g.,
# "postgres -c log_connections=on". Some parameters can be changed at run time
# with the "SET" SQL command.
#
# Memory units: kB = kilobytes Time units: ms = milliseconds
# MB = megabytes s = seconds
# GB = gigabytes min = minutes
# h = hours
# d = days
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------
# The default values of these variables are driven from the -D command-line
# option or PGDATA environment variable, represented here as ConfigDir.
data_directory = '/var/lib/postgresql/9.3/main' # use data in another directory
# (change requires restart)
hba_file = '/etc/postgresql/9.3/main/pg_hba.conf' # host-based authentication file
# (change requires restart)
ident_file = '/etc/postgresql/9.3/main/pg_ident.conf' # ident configuration file
# (change requires restart)
# If external_pid_file is not explicitly set, no extra PID file is written.
external_pid_file = '/var/run/postgresql/9.3-main.pid' # write an extra PID file
# (change requires restart)
#------------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#------------------------------------------------------------------------------
# - Connection Settings -
listen_addresses = '*' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
port = 5432 # (change requires restart)
max_connections = 100 # (change requires restart)
# Note: Increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).
#superuser_reserved_connections = 3 # (change requires restart)
unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories
# (change requires restart)
#unix_socket_group = '' # (change requires restart)
#unix_socket_permissions = 0777 # begin with 0 to use octal notation
# (change requires restart)
#bonjour = off # advertise server via Bonjour
# (change requires restart)
#bonjour_name = '' # defaults to the computer name
# (change requires restart)
# - Security and Authentication -
#authentication_timeout = 1min # 1s-600s
ssl = true # (change requires restart)
#ssl_ciphers = 'DEFAULT:!LOW:!EXP:!MD5:#STRENGTH' # allowed SSL ciphers
# (change requires restart)
#ssl_renegotiation_limit = 512MB # amount of data between renegotiations
ssl_cert_file = '/etc/ssl/certs/ssl-cert-snakeoil.pem' # (change requires restart)
ssl_key_file = '/etc/ssl/private/ssl-cert-snakeoil.key' # (change requires restart)
#ssl_ca_file = '' # (change requires restart)
#ssl_crl_file = '' # (change requires restart)
#password_encryption = on
#db_user_namespace = off
# Kerberos and GSSAPI
#krb_server_keyfile = ''
#krb_srvname = 'postgres' # (Kerberos only)
#krb_caseins_users = off
# - TCP Keepalives -
# see "man 7 tcp" for details
#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0 # TCP_KEEPCNT;
# 0 selects the system default
#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------
# - Memory -
shared_buffers = 128MB # min 128kB
# (change requires restart)
#temp_buffers = 8MB # min 800kB
#max_prepared_transactions = 0 # zero disables the feature
# (change requires restart)
# Note: Increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
# It is not advisable to set max_prepared_transactions nonzero unless you
# actively intend to use prepared transactions.
#work_mem = 1MB # min 64kB
#maintenance_work_mem = 16MB # min 1MB
#max_stack_depth = 2MB # min 100kB
# - Disk -
#temp_file_limit = -1 # limits per-session temp file space
# in kB, or -1 for no limit
# - Kernel Resource Usage -
#max_files_per_process = 1000 # min 25
# (change requires restart)
#shared_preload_libraries = '' # (change requires restart)
# - Cost-Based Vacuum Delay -
#vacuum_cost_delay = 0 # 0-100 milliseconds
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 1-10000 credits
# - Background Writer -
#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100 # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0 # 0-10.0 multipler on buffers scanned/round
# - Asynchronous Behavior -
#effective_io_concurrency = 1 # 1-1000; 0 disables prefetching
#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------
# - Settings -
#wal_level = minimal # minimal, archive, or hot_standby
# (change requires restart)
#fsync = on # turns forced synchronization on or off
#synchronous_commit = on # synchronization level;
# off, local, remote_write, or on
#wal_sync_method = fsync # the default is the first option
# supported by the operating system:
# open_datasync
# fdatasync (default on Linux)
# fsync
# fsync_writethrough
# open_sync
#full_page_writes = on # recover from partial page writes
#wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
# (change requires restart)
#wal_writer_delay = 200ms # 1-10000 milliseconds
#commit_delay = 0 # range 0-100000, in microseconds
#commit_siblings = 5 # range 1-1000
# - Checkpoints -
#checkpoint_segments = 3 # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min # range 30s-1h
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
# - Archiving -
#archive_mode = off # allows archiving to be done
# (change requires restart)
#archive_command = '' # command to use to archive a logfile segment
# placeholders: %p = path of file to archive
# %f = file name only
# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0 # force a logfile segment switch after this
# number of seconds; 0 disables
#------------------------------------------------------------------------------
# REPLICATION
#------------------------------------------------------------------------------
# - Sending Server(s) -
# Set these on the master and on any standby that will send replication data.
#max_wal_senders = 0 # max number of walsender processes
# (change requires restart)
#wal_keep_segments = 0 # in logfile segments, 16MB each; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
# - Master Server -
# These settings are ignored on a standby server.
#synchronous_standby_names = '' # standby servers that provide sync rep
# comma-separated list of application_name
# from standby(s); '*' = all
#vacuum_defer_cleanup_age = 0 # number of xacts by which cleanup is delayed
# - Standby Servers -
# These settings are ignored on a master server.
#hot_standby = off # "on" allows queries during recovery
# (change requires restart)
#max_standby_archive_delay = 30s # max delay before canceling queries
# when reading WAL from archive;
# -1 allows indefinite delay
#max_standby_streaming_delay = 30s # max delay before canceling queries
# when reading streaming WAL;
# -1 allows indefinite delay
#wal_receiver_status_interval = 10s # send replies at least this often
# 0 disables
#hot_standby_feedback = off # send info from standby to prevent
# query conflicts
#wal_receiver_timeout = 60s # time that receiver waits for
# communication from master
# in milliseconds; 0 disables
#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------
# - Planner Method Configuration -
#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_indexonlyscan = on
#enable_material = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on
# - Planner Cost Constants -
#seq_page_cost = 1.0 # measured on an arbitrary scale
#random_page_cost = 4.0 # same scale as above
#cpu_tuple_cost = 0.01 # same scale as above
#cpu_index_tuple_cost = 0.005 # same scale as above
#cpu_operator_cost = 0.0025 # same scale as above
#effective_cache_size = 128MB
# - Genetic Query Optimizer -
#geqo = on
#geqo_threshold = 12
#geqo_effort = 5 # range 1-10
#geqo_pool_size = 0 # selects default based on effort
#geqo_generations = 0 # selects default based on effort
#geqo_selection_bias = 2.0 # range 1.5-2.0
#geqo_seed = 0.0 # range 0.0-1.0
# - Other Planner Options -
#default_statistics_target = 100 # range 1-10000
#constraint_exclusion = partition # on, off, or partition
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
#from_collapse_limit = 8
#join_collapse_limit = 8 # 1 disables collapsing of explicit
# JOIN clauses
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------
# - Where to Log -
#log_destination = 'stderr' # Valid values are combinations of
# stderr, csvlog, syslog, and eventlog,
# depending on platform. csvlog
# requires logging_collector to be on.
# This is used when logging to stderr:
#logging_collector = off # Enable capturing of stderr and csvlog
# into log files. Required to be on for
# csvlogs.
# (change requires restart)
# These are only used if logging_collector is on:
#log_directory = 'pg_log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_file_mode = 0600 # creation mode for log files,
# begin with 0 to use octal notation
#log_truncate_on_rotation = off # If on, an existing log file with the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 disables.
#log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 disables.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
# This is only relevant when logging to eventlog (win32):
#event_source = 'PostgreSQL'
# - When to Log -
#client_min_messages = notice # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# log
# notice
# warning
# error
#log_min_messages = warning # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic
#log_min_error_statement = error # values in order of decreasing detail:
log_min_error_statement = error # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic (effectively off)
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this number
# of milliseconds
# - What to Log -
#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_checkpoints = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_error_verbosity = default # terse, default, or verbose messages
#log_hostname = off
log_line_prefix = '%t ' # special values:
# %a = application name
# %u = user name
# %d = database name
# %r = remote host and port
# %h = remote host
# %p = process ID
# %t = timestamp without milliseconds
# %m = timestamp with milliseconds
# %i = command tag
# %e = SQL state
# %c = session ID
# %l = session line number
# %s = session start timestamp
# %v = virtual transaction ID
# %x = transaction ID (0 if none)
# %q = stop here in non-session
# processes
# %% = '%'
# e.g. '<%u%%%d> '
#log_lock_waits = off # log lock waits >= deadlock_timeout
#log_statement = 'none' # none, ddl, mod, all
log_statement = 'all' # none, ddl, mod, all
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
# -1 disables, 0 logs all temp files
log_timezone = 'UTC'
#------------------------------------------------------------------------------
# RUNTIME STATISTICS
#------------------------------------------------------------------------------
# - Query/Index Statistics Collector -
#track_activities = on
#track_counts = on
#track_io_timing = off
#track_functions = none # none, pl, all
#track_activity_query_size = 1024 # (change requires restart)
#update_process_title = on
#stats_temp_directory = 'pg_stat_tmp'
# - Statistics Monitoring -
#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off
#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
#log_autovacuum_min_duration = -1 # -1 disables, 0 logs all actions and
# their durations, > 0 logs only
# actions running at least this number
# of milliseconds.
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
# (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
#autovacuum_analyze_threshold = 50 # min number of row updates before
# analyze
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
# (change requires restart)
#autovacuum_multixact_freeze_max_age = 400000000 # maximum Multixact age
# before forced vacuum
# (change requires restart)
#autovacuum_vacuum_cost_delay = 20ms # default vacuum cost delay for
# autovacuum, in milliseconds;
# -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
# autovacuum, -1 means use
# vacuum_cost_limit
#------------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#------------------------------------------------------------------------------
# - Statement Behavior -
#search_path = '"$user",public' # schema names
#default_tablespace = '' # a tablespace name, '' uses the default
#temp_tablespaces = '' # a list of tablespace names, '' uses
# only default tablespace
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#default_transaction_deferrable = off
#session_replication_role = 'origin'
#statement_timeout = 0 # in milliseconds, 0 is disabled
#lock_timeout = 0 # in milliseconds, 0 is disabled
#vacuum_freeze_min_age = 50000000
#vacuum_freeze_table_age = 150000000
#vacuum_multixact_freeze_min_age = 5000000
#vacuum_multixact_freeze_table_age = 150000000
#bytea_output = 'hex' # hex, escape
#xmlbinary = 'base64'
#xmloption = 'content'
# - Locale and Formatting -
datestyle = 'iso, mdy'
#intervalstyle = 'postgres'
timezone = 'UTC'
#timezone_abbreviations = 'Default' # Select the set of available time zone
# abbreviations. Currently, there are
# Default
# Australia
# India
# You can create your own file in
# share/timezonesets/.
#extra_float_digits = 0 # min -15, max 3
#client_encoding = sql_ascii # actually, defaults to database
# encoding
# These settings are initialized by initdb, but they can be changed.
lc_messages = 'C' # locale for system error message
# strings
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
# default configuration for text search
default_text_search_config = 'pg_catalog.english'
# - Other Defaults -
#dynamic_library_path = '$libdir'
#local_preload_libraries = ''
#------------------------------------------------------------------------------
# LOCK MANAGEMENT
#------------------------------------------------------------------------------
#deadlock_timeout = 1s
#max_locks_per_transaction = 64 # min 10
# (change requires restart)
# Note: Each lock table slot uses ~270 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.
#max_pred_locks_per_transaction = 64 # min 10
# (change requires restart)
#------------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#------------------------------------------------------------------------------
# - Previous PostgreSQL Versions -
#array_nulls = on
#backslash_quote = safe_encoding # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#lo_compat_privileges = off
#quote_all_identifiers = off
#sql_inheritance = on
#standard_conforming_strings = on
#synchronize_seqscans = on
# - Other Platforms and Clients -
#transform_null_equals = off
#------------------------------------------------------------------------------
# ERROR HANDLING
#------------------------------------------------------------------------------
#exit_on_error = off # terminate session on any error?
#restart_after_crash = on # reinitialize after backend crash?
#------------------------------------------------------------------------------
# CONFIG FILE INCLUDES
#------------------------------------------------------------------------------
# These options allow settings to be loaded from files other than the
# default postgresql.conf.
#include_dir = 'conf.d' # include files ending in '.conf' from
# directory 'conf.d'
#include_if_exists = 'exists.conf' # include file only if it exists
#include = 'special.conf' # include file
#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------
# Add settings for extensions here
Thank you for any help that you can provide.
I am struggling to find a text-comparison tool or algorithm that can compare an expected text against the current state of the text being typed.
I will have a test subject typewrite a text that they have in front of their eyes. My idea is to compare the current state of the text against the expected text whenever something is typed. That way I want to find out when and what the subject does wrong (I also want to find errors that are not in the resulting text but were in the intermediate text for some time).
Can someone point me in a direction?
Update #1
I have access to the typing data in a csv format:
This is example output data of me typing "foOBar". Every line has the form (timestamp, Key, Press/Release)
17293398.576653,F,P
17293398.6885,F,R
17293399.135282,LeftShift,P
17293399.626881,LeftShift,R
17293401.313254,O,P
17293401.391732,O,R
17293401.827314,LeftShift,P
17293402.073046,O,P
17293402.184859,O,R
17293403.178612,B,P
17293403.301748,B,R
17293403.458137,LeftShift,R
17293404.966193,A,P
17293405.077869,A,R
17293405.725405,R,P
17293405.815159,R,R
In Python
Given your input csv file (I called it keyboard_records.csv)
17293398.576653,F,P
17293398.6885,F,R
17293399.135282,LeftShift,P
17293399.626881,LeftShift,R
17293401.313254,O,P
17293401.391732,O,R
17293401.827314,LeftShift,P
17293402.073046,O,P
17293402.184859,O,R
17293403.178612,B,P
17293403.301748,B,R
17293403.458137,LeftShift,R
17293404.966193,A,P
17293405.077869,A,R
17293405.725405,R,P
17293405.815159,R,R
The following code does this:
Reads the file's content and stores it in a list named steps
For each step in steps, recognizes what happened, and:
If it was a Shift press or release, sets a flag (shift_on) accordingly
If it was an arrow press, moves the cursor (the index in current where we insert characters); if the cursor is at the start or the end of the string it should not move, hence the min() and max()
If it was a letter/number/symbol, inserts it into current at the cursor position and increments cursor
Here it is:
import csv

steps = []          # list of all actions performed by the user
expected = "Hello"

with open("keyboard_records.csv") as csvfile:
    for row in csv.reader(csvfile, delimiter=','):
        steps.append((float(row[0]), row[1], row[2]))

# Now we parse the information
current = []        # text written by the user
shift_on = False    # is shift pressed?
cursor = 0          # where the cursor is in the current text

for step in steps:
    time, key, action = step
    if key == 'LeftShift':
        if action == 'P':
            shift_on = True
        else:
            shift_on = False
        continue
    if key == 'LeftArrow' and action == 'P':
        cursor = max(0, cursor - 1)
        continue
    if key == 'RightArrow' and action == 'P':
        cursor = min(len(current), cursor + 1)
        continue
    if action == 'P':
        if shift_on is True:
            current.insert(cursor, key.upper())
        else:
            current.insert(cursor, key.lower())
        cursor += 1
        # Now you can join current into a string
        # and compare current with expected
        print(''.join(current))  # printing current (just to see what's happening)
    else:
        # What to do when a key is released?
        # Depends on your needs...
        continue
To compare current and expected have a look here.
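For a quick start on that comparison, Python's standard difflib module can report the differing spans between two strings (a minimal sketch, independent of the code above):
import difflib

expected = "foobar"
current = "foOBar"

# SequenceMatcher yields equal/replace/insert/delete opcodes between the two strings
matcher = difflib.SequenceMatcher(None, expected, current)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag != 'equal':
        print(tag, expected[i1:i2], '->', current[j1:j2])
# prints: replace ob -> OB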
Note: by playing around with the code above and a few more flags you can make it recognize symbols as well. This will depend on your keyboard. On mine, Shift + 6 = &, AltGr + E = €, and Ctrl + Shift + AltGr + è = {. I think this is a good starting point.
Update
Comparing 2 texts isn't a difficult task and you can find tons of pages on the web about it.
Anyway, I wanted to present an object-oriented approach to the problem, so I added the compare part that I previously omitted from the first solution.
This is still rough code, without sanity checks on the input. But, as you asked, it points you in a direction.
class UserText:
    # Initialize UserText:
    # - empty text
    # - cursor at beginning
    # - shift off
    def __init__(self, expected):
        self.expected = expected
        self.letters = []
        self.cursor = 0
        self.shift = False

    # Compares a and b and returns a
    # list containing the indices of
    # mismatches between a and b
    # (static: it is called as UserText.compare)
    @staticmethod
    def compare(a, b):
        err = []
        for i in range(min(len(a), len(b))):
            if a[i] != b[i]:
                err.append(i)
        return err

    # Parse a command given in the
    # form (time, key, action)
    def parse(self, command):
        time, key, action = command
        output = ""
        if action == 'P':
            if key == 'LeftShift':
                self.shift = True
            elif key == 'LeftArrow':
                self.cursor = max(0, self.cursor - 1)
            elif key == 'RightArrow':
                self.cursor = min(len(self.letters), self.cursor + 1)
            else:
                # Else, a letter/number was pressed. Let's
                # add it to self.letters at the cursor position
                if self.shift is True:
                    self.letters.insert(self.cursor, key.upper())
                else:
                    self.letters.insert(self.cursor, key.lower())
                self.cursor += 1

                ########## COMPARE WITH EXPECTED ##########
                output += "Expected: \t" + self.expected + "\n"
                output += "Current: \t" + str(self) + "\n"
                errors = UserText.compare(str(self), self.expected[:len(str(self))])
                output += "\t\t"
                i = 0
                for e in errors:
                    while i != e:
                        output += " "
                        i += 1
                    output += "^"
                    i += 1
                output += "\n[{} errors at time {}]".format(len(errors), time)
                return output
        else:
            if key == 'LeftShift':
                self.shift = False
        return output

    def __str__(self):
        return "".join(self.letters)

import csv

steps = []  # list of all actions performed by the user
expected = "foobar"

with open("keyboard_records.csv") as csvfile:
    for row in csv.reader(csvfile, delimiter=','):
        steps.append((float(row[0]), row[1], row[2]))

# Now we parse the information
ut = UserText(expected)
for step in steps:
    print(ut.parse(step))
The output for the csv file above was:
Expected:       foobar
Current:        f
[0 errors at time 17293398.576653]
Expected:       foobar
Current:        fo
[0 errors at time 17293401.313254]
Expected:       foobar
Current:        foO
                  ^
[1 errors at time 17293402.073046]
Expected:       foobar
Current:        foOB
                  ^^
[2 errors at time 17293403.178612]
Expected:       foobar
Current:        foOBa
                  ^^
[2 errors at time 17293404.966193]
Expected:       foobar
Current:        foOBar
                  ^^
[2 errors at time 17293405.725405]
I found the solution to my own question around a year ago. Now I have time to share it with you:
In their 2003 paper 'Metrics for text entry research: An evaluation of MSD and KSPC, and a new unified error metric', R. William Soukoreff and I. Scott MacKenzie propose three major new metrics: 'total error rate', 'corrected error rate' and 'not corrected error rate'. These metrics have become well established since the publication of the paper. They are exactly the metrics I was looking for.
If you are trying to do something similar to what I did, e.g. compare typing performance on different input devices, this is the way to go.
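For reference, here is a minimal sketch of those three rates as I understand them from the paper (C = correct keystrokes, INF = incorrect characters left in the final text, IF = incorrect characters that were entered but later corrected; treat the paper's definitions as authoritative):
def error_rates(C, INF, IF):
    # C   : correct characters in the transcribed text
    # INF : incorrect and not fixed (errors left in the final text)
    # IF  : incorrect but fixed (errors corrected during entry)
    total = float(C + INF + IF)
    return {
        'total_error_rate': (INF + IF) / total,
        'not_corrected_error_rate': INF / total,
        'corrected_error_rate': IF / total,
    }

# e.g. typing "foOBar" against "foobar": 4 correct, 2 uncorrected, 0 corrected
print(error_rates(4, 2, 0))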
I would like to review a patch from a colleague. We are unable to use a review tool, so I would like to comment on the patch file he made. Is it possible to write inline comments in an (svn) patch file?
I couldn't find any information in the svn red book on it. I was even unable to find the patch file grammar to figure it out myself.
The diff format is just the unified diff format. If you wanted, you could put some text after the range info. Consider this diff, produced with the command svn diff -c 1544711 https://svn.apache.org/repos/asf/subversion/trunk:
Index: subversion/mod_dav_svn/mod_dav_svn.c
===================================================================
--- subversion/mod_dav_svn/mod_dav_svn.c (revision 1544710)
+++ subversion/mod_dav_svn/mod_dav_svn.c (revision 1544711)
@@ -1097,7 +1097,8 @@
/* Fill the filename on the request with a bogus path since we aren't serving
* a file off the disk. This means that <Directory> blocks will not match and
- * that %f in logging formats will show as "svn:/path/to/repo/path/in/repo". */
+ * %f in logging formats will show as "dav_svn:/path/to/repo/path/in/repo".
+ */
static int dav_svn__translate_name(request_rec *r)
{
const char *fs_path, *repos_basename, *repos_path;
@@ -1146,7 +1147,7 @@
if (repos_path && '/' == repos_path[0] && '\0' == repos_path[1])
repos_path = NULL;
- /* Combine 'svn:', fs_path and repos_path to produce the bogus path we're
+ /* Combine 'dav_svn:', fs_path and repos_path to produce the bogus path we're
* placing in r->filename. We can't use our standard join helpers such
* as svn_dirent_join. fs_path is a dirent and repos_path is a fspath
* (that can be trivially converted to a relpath by skipping the leading
@@ -1154,7 +1155,7 @@
* repository is 'trunk/c:hi' this results in a non canonical dirent on
* Windows. Instead we just cat them together. */
r->filename = apr_pstrcat(r->pool,
- "svn:", fs_path, repos_path, SVN_VA_NULL);
+ "dav_svn:", fs_path, repos_path, SVN_VA_NULL);
/* Leave a note to ourselves so that we know not to decline in the
* map_to_storage hook. */
If you add the option -x-p to that command you'll get:
Index: subversion/mod_dav_svn/mod_dav_svn.c
===================================================================
--- subversion/mod_dav_svn/mod_dav_svn.c (revision 1544710)
+++ subversion/mod_dav_svn/mod_dav_svn.c (revision 1544711)
@@ -1097,7 +1097,8 @@ static int dav_svn__handler(request_rec *r)
/* Fill the filename on the request with a bogus path since we aren't serving
* a file off the disk. This means that <Directory> blocks will not match and
- * that %f in logging formats will show as "svn:/path/to/repo/path/in/repo". */
+ * %f in logging formats will show as "dav_svn:/path/to/repo/path/in/repo".
+ */
static int dav_svn__translate_name(request_rec *r)
{
const char *fs_path, *repos_basename, *repos_path;
@@ -1146,7 +1147,7 @@ static int dav_svn__translate_name(request_rec *r)
if (repos_path && '/' == repos_path[0] && '\0' == repos_path[1])
repos_path = NULL;
- /* Combine 'svn:', fs_path and repos_path to produce the bogus path we're
+ /* Combine 'dav_svn:', fs_path and repos_path to produce the bogus path we're
* placing in r->filename. We can't use our standard join helpers such
* as svn_dirent_join. fs_path is a dirent and repos_path is a fspath
* (that can be trivially converted to a relpath by skipping the leading
@@ -1154,7 +1155,7 @@ static int dav_svn__translate_name(request_rec *r)
* repository is 'trunk/c:hi' this results in a non canonical dirent on
* Windows. Instead we just cat them together. */
r->filename = apr_pstrcat(r->pool,
- "svn:", fs_path, repos_path, SVN_VA_NULL);
+ "dav_svn:", fs_path, repos_path, SVN_VA_NULL);
/* Leave a note to ourselves so that we know not to decline in the
* map_to_storage hook. */
Note how the function name is added after the @@ on the range lines. This portion of the line is ignored by any software processing the diff, so you're free to put whatever you want there. You could put your comments there.
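For example, a review note tacked onto the first hunk header above could look like this (the REVIEW text is just an illustration):
@@ -1097,7 +1097,8 @@ REVIEW: wouldn't this change also affect log parsers that expect the "svn:" prefix?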
Unidiff hunks start each line with ' ' (space) to mean context (an unchanged line), '+' to mean an added line, or '-' to mean a removed line. A lot of parsers (including Subversion's svn patch command) will discard lines that start with some other character. So you might be able to simply insert a line that starts with some other character, but that's not guaranteed to be as portable as the method above.
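For example (illustrative only; as noted, not every parser will tolerate such lines):
@@ -1146,7 +1147,7 @@
#REVIEW: please double-check third-party tooling that greps for the "svn:" prefix
 if (repos_path && '/' == repos_path[0] && '\0' == repos_path[1])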