HAProxy Prioritized Connections

I am using HAProxy and I have been trying to set it up to work a certain way.
I want connections from 11.111.11.110 to always hit ABC_server01 unless that server is offline.
This is how I currently have it written, using weights:
acl the_workstation src 11.111.11.110
use_backend ABC if the_workstation
backend ABC
server ABC_server01 22.222.22.220:443 weight 255 maxconn 512 check
server ABC_server02 33.333.33.333:443 weight 1 maxconn 512 check
server ABC_server03 44.444.44.444:443 weight 1 maxconn 512 check
With the configuration above, I believe that out of 257 connection attempts, 2 will not use ABC_server01.
I looked into if-loops and timeouts, but I was not able to come up with a working solution.
https://www.haproxy.org/coding-style.html
http://www.haproxy.org/download/1.5/doc/configuration.txt
Does anyone know a simple way to make it prioritize one server and only use the remaining servers if the connection fails?
This is the version of HAProxy I am using: "HA-Proxy version 1.5.18 2016/05/10".

We found the solution; we altered the configuration to look like this:
acl the_workstation src 11.111.11.110
use_backend ABC if the_workstation
backend ABC
server ABC_server01 22.222.22.220:443 weight 255 maxconn 512 check
server ABC_server02 33.333.33.333:443 weight 1 maxconn 512 check backup
server ABC_server03 44.444.44.444:443 weight 1 maxconn 512 check backup
By adding backup, those servers will only be hit if the first one is offline.
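One thing worth noting: by default HAProxy only sends traffic to the first operational backup server; the other backups stay idle unless that one also fails. If you ever want traffic spread across both backups once ABC_server01 is down, option allbackups does that. A minimal sketch with the same addresses as above (the weight on ABC_server01 no longer matters once the others are marked backup):
backend ABC
option allbackups
server ABC_server01 22.222.22.220:443 maxconn 512 check
server ABC_server02 33.333.33.333:443 maxconn 512 check backup
server ABC_server03 44.444.44.444:443 maxconn 512 check backup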

Related

How to make a circuit-breaker in Istio?

I am trying to configure a circuit breaker in Istio. This is the YAML:
trafficPolicy:
  connectionPool:
    http:
      http1MaxPendingRequests: 1
      maxRequestsPerConnection: 1
    tcp:
      maxConnections: 1
  outlierDetection:
    baseEjectionTime: 1m
    consecutive5xxErrors: 1
    interval: 1s
I have a list of thread groups in JMeter that will be continuously hitting the service associated with the above circuit breaker. Upon receiving an error response, it should make the service unavailable for 1 minute. But that is not happening.
Am I misunderstanding how it works? Is there any way to achieve that?
I think you are confusing outlier detection with the circuit breaking that the connectionPool settings configure.
The settings you are applying in the connectionPool configure a circuit breaker: if any of the limits are breached, the circuit is tripped and new requests get an immediate 503 response from the Istio proxy, i.e. the new requests are not sent to the application.
However, the proxy will accept new requests again as soon as it can do so without breaching the limits.
There is no such thing as circuit breaking for 1 minute in this context.
Outlier detection is different. It works by ejecting a particular error-prone pod from the load-balancing pool.
Suppose you have 4 replica pods running for your deployment, and one of them is returning 5xx errors (the 503s sent by the proxy in the connection-pool-breach case are not counted here; this count is of your application's own errors). Istio waits for consecutive5xxErrors (1 in your case), and once that is breached it removes the pod from load balancing for baseEjectionTime the first time.
That is, it waits for baseEjectionTime (1m in your case); until then no new request is sent to the error-prone pod. After 1 minute it adds the pod back to the load-balancing pool. But if the pod breaches consecutive5xxErrors again, Istio removes it from load balancing for 2 x baseEjectionTime, which would be 2 minutes in your case.
This keeps going until your pod goes back to returning non-5xx responses.
With the information you provided, I think the problem might be the parameter maxEjectionPercent not being set in your DestinationRule:
maxEjectionPercent - Maximum % of hosts in the load balancing pool for the upstream service that can be ejected. Defaults to 10%.
Since it defaults to 10%, only 10% of your deployment will be ejected by outlier detection. For testing purposes you might try setting this to 100%, similar to the documentation example:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100  # 👈
I have tested the example in the docs and it works fine for me.
Another possible issue might be sidecar injection. Please verify that your pod actually has one (you should see 2 out of 2 containers ready inside the pod):
$ kgp
NAME READY STATUS RESTARTS AGE
fortio-deploy-576dbdfbc4-9crcf 2/2 Running 0 46m
httpbin-74fb669cc6-mg9rh 2/2 Running 0 48m
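If the sidecars are there (kgp above presumably being an alias for kubectl get pods), you can also drive load from that fortio pod instead of JMeter to confirm the breaker trips. This is a sketch along the lines of the Istio circuit-breaking task; the app=fortio label and the httpbin:8000/get URL are assumptions taken from that sample setup:
export FORTIO_POD=$(kubectl get pods -l app=fortio -o 'jsonpath={.items[0].metadata.name}')
# 2 concurrent connections against maxConnections/http1MaxPendingRequests of 1 should produce some 503s
kubectl exec "$FORTIO_POD" -c fortio -- /usr/bin/fortio load -c 2 -qps 0 -n 20 -loglevel Warning http://httpbin:8000/get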

Apache 2.4 Event MPM - Unable to override MaxRequestWorkers and ThreadsPerChild default configuration

We are using Apache 2.4 and we are trying to configure MaxRequestWorkers and ThreadLimit for the Event MPM. Below is the configuration I have in Apache's httpd.conf, but it doesn't seem to take effect. It still continues to use the default values (400 MaxRequestWorkers and 25 ThreadsPerChild). Not sure if I am missing anything in my configuration.
I want to configure my server to use 1024 MaxRequestWorkers and 64 ThreadsPerChild.
We have roughly 2 GB of RAM and 2 GB of swap, Apache 2.4 (Event MPM), and Red Hat Linux.
Any help would be appreciated. Thank you so much!
Httpd.conf
------------
# Event MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of Event threads which are kept spare
# MaxSpareThreads: maximum number of Event threads which are kept spare
# ThreadsPerChild: constant number of Event threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule event.c>
ServerLimit 16
StartServers 8
MaxRequestWorkers 1024
MinSpareThreads 75
MaxSpareThreads 250
ThreadsPerChild 64
ThreadLimit 64
MaxConnectionsPerChild 0
</IfModule>
I realise that this is an old post. Just in case anyone else comes across this again.
Check the exact module name. If you check /etc/httpd/conf.modules.d/00-mpm.conf (or equivalent location, this was on RHEL 7/CentOS 7) for the line that loads the events module:
LoadModule mpm_event_module
Copy this module name 'mpm_event_module'.
Rather than specifying this at the end of httpd.conf, it's better practice to create a file in /etc/httpd/conf.d/ called mpm_event.conf and load it there.
In this instance, I believe changing:
<IfModule event.c>
to
<IfModule mpm_event_module>
and then restarting httpd, would have fixed it.
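For reference, a minimal /etc/httpd/conf.d/mpm_event.conf along those lines might look like this (values copied from the question; adjust ServerLimit x ThreadsPerChild to your memory budget):
# /etc/httpd/conf.d/mpm_event.conf
<IfModule mpm_event_module>
ServerLimit 16
ThreadLimit 64
ThreadsPerChild 64
StartServers 8
MinSpareThreads 75
MaxSpareThreads 250
MaxRequestWorkers 1024
MaxConnectionsPerChild 0
</IfModule>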
Kind Regards,
Will

Hector is unable to read Cassandra data when nodes reboot or terminate

We are trying to run a Cassandra cluster on AWS/EC2 within a standard VPC footprint (Cassandra nodes on private subnets). Because this is AWS, there is always a chance that an EC2 instance will terminate or reboot with no warning. I have been simulating this case on a test cluster, and I am seeing things that I thought a cluster was supposed to prevent. Specifically, if a node reboots, some data goes temporarily missing until the node completes its reboot. If a node terminates, it appears that some data is lost forever.
For my test I just did a bunch of writes (using QUORUM consistency) to some keyspaces, then interrogated the contents of those keyspaces as I brought down nodes (either through reboot or terminate). I'm just using cqlsh SELECT to do the keyspace/column family interrogation of the cluster, at consistency level ONE.
Note that even though I am performing no writes to the cluster while I am doing the SELECTs, rows temporarily disappear when rebooting and can permanently go missing during termination.
I thought Netflix Priam might be able to help, but sadly it doesn't work in a VPC the last time I checked.
Also, because we are using ephemeral storage instances there is no equivalent of 'shutdown' so I cannot run any scripts during reboot/terminate of an instance to perform a nodetool decommission or nodetool removenode before an instance goes away. Terminate is the equivalent of kicking the plug out of the wall.
Since I am using a replication factor of 3 and QUORUM writes, that should mean that all data is written to at least 2 nodes. So, unless I am totally misunderstanding things (which is possible), losing one node should not mean that I lose any data for any period of time when I am using consistency level ONE for the read.
Questions
Why wouldn't a 6 node cluster with a replication factor of 3 work?
Do I need to run something like a 12 node cluster with a replication factor of 7? Don't bother telling me that will fix the problem, because it doesn't.
Do I need to use consistency level of ALL on the writes then use ONE or QUORUM on the reads?
Is there something not quite right with virtual nodes? (Unlikely.)
Are there nodetool commands besides removenode that I need to run when a node terminates to recover missing data? As mentioned earlier, when a reboot occurs, eventually the missing data reappears.
Is there some cassandra savant who can look at my cassandra.yaml file below and send me on the path to salvation?
More Info added 7/19
I don't think QUORUM vs ONE vs ALL is the issue. The test I set up performs no writes to the keyspaces after the initial population of the column families, so the data has had plenty of time (hours) to make it to all the nodes as required by the replication factor. Plus, the test dataset is REALLY small (2 column families with about 300-1000 values each). In other words, the data is completely static.
The behavior I am seeing seems to be tied to the fact that the EC2 instance is no longer on the network. The reason I say this is that if I log on to a node and just do a cassandra stop, I see no loss of data. But if I reboot or terminate, I start getting the following in a stack trace.
CassandraHostRetryService - Downed Host Retry service started with queue size -1 and retry delay 10s
CassandraHostRetryService - Downed Host retry shutdown complete
CassandraHostRetryService - Downed Host retry shutdown hook called
Caused by: TimedOutException()
Caused by: TimedOutException()
So it seems to be more of a network communication issue, in that the cluster expects, for example, 10.0.12.74 to be on the network after it has joined the cluster. If that IP is suddenly unreachable, either due to reboot or termination, the timeouts start happening.
When I do a nodetool status under all three scenarios (cassandra stop, reboot, or terminate), the status of the node shows up as DN, which is what you would expect. Eventually nodetool status returns to UN after a cassandra start or reboot, but obviously a termination always stays DN.
Details of my Configuration
Here are some details of my configuration (cassandra.yaml is at the bottom of this posting):
Nodes are running in private subnets of a VPC.
Cassandra 1.2.5 with num_tokens: 256 (virtual nodes) and initial_token: (blank). I am really hoping this works, because all of our nodes run in autoscaling groups, so the thought that redistribution could be handled dynamically is appealing.
EC2 m1.large one seed and one non-seed node in each availability zone. (so 6 total nodes in the cluster).
Ephemeral storage, not EBS.
Ec2Snitch with NetworkTopologyStrategy and all keyspaces have replication factor of 3.
Non-seed nodes are auto_bootstrapped; seed nodes are not.
sample cassandra.yaml file
cluster_name: 'TestCluster'
num_tokens: 256
initial_token:
hinted_handoff_enabled: true
max_hint_window_in_ms: 10800000
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
authorizer: org.apache.cassandra.auth.AllowAllAuthorizer
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
disk_failure_policy: stop
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
row_cache_provider: SerializingCacheProvider
saved_caches_directory: /opt/company/dbserver/caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "SEED_IP_LIST"
flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
concurrent_reads: 32
concurrent_writes: 8
memtable_flush_queue_size: 4
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: LISTEN_ADDRESS
start_native_transport: false
native_transport_port: 9042
start_rpc: true
rpc_address: 0.0.0.0
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16
incremental_backups: true
snapshot_before_compaction: false
auto_bootstrap: AUTO_BOOTSTRAP
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 64
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
compaction_preheat_key_cache: true
read_request_timeout_in_ms: 10000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 10000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: Ec2Snitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
server_encryption_options:
  internode_encryption: none
  keystore: conf/.keystore
  keystore_password: cassandra
  truststore: conf/.truststore
  truststore_password: cassandra
client_encryption_options:
  enabled: false
  keystore: conf/.keystore
  keystore_password: cassandra
internode_compression: all
I think http://www.datastax.com/documentation/cassandra/1.2/cassandra/dml/dml_config_consistency_c.html will clear up a lot of this. In particular, QUORUM/ONE is not guaranteed to return the most recent data. QUORUM/QUORUM is. So is ALL/ONE, but that will be intolerant to failure on write.
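The rule of thumb behind that: a read is guaranteed to see the latest write when the number of replicas that must acknowledge the read plus the number that must acknowledge the write exceeds the replication factor (R + W > RF). With RF = 3 here:
W = QUORUM (2), R = ONE (1):    2 + 1 = 3, not > 3, so a read can land on the one replica that missed the write
W = QUORUM (2), R = QUORUM (2): 2 + 2 = 4 > 3, every read quorum overlaps the write quorum
W = ALL (3),    R = ONE (1):    3 + 1 = 4 > 3, consistent, but any single down replica fails the write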
Edit to go with the new information:
CassandraHostRetryService is part of Hector. I assumed you were testing with cqlsh like a sane person would. Lessons:
Use cqlsh for testing
Use the DataStax Java Driver for building your application, which is faster, easier to use, and has more insight into the cluster state than Hector thanks to the native protocol it's built on.
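On the cqlsh point: current cqlsh versions let you raise the read consistency interactively, which makes this kind of failure testing easy to reproduce (the CONSISTENCY shell command may not exist in the 1.2-era cqlsh, and my_keyspace.my_table is just a placeholder):
cqlsh> CONSISTENCY QUORUM;
cqlsh> SELECT * FROM my_keyspace.my_table;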

Getting AltQ working in pf.conf (limiting inbound Tor traffic)

I'm trying to learn the ropes on packet queuing, so I thought I'd set up a limitation on traffic coming into port 80 from known Tor Exit nodes. This is on FreeBSD 9, so OpenBSD-specific solutions might not apply (syntax/etc).
# Snipped to mainly the relevant parts
table <torlist> persist file "/var/db/torlist"
# ...
set block-policy return
scrub in all
scrub out on $ext_if all
# What I *want* to do is create a queue for known tor exit nodes
# no single one IP should be able to do more than 56k/sec
# but the combined bandwidth of all tor visitors should not
# exceed 512k/sec, basically limiting Tor visitors to something
# like dialup
altq on $ext_if cbq bandwidth 512k queue { qin-tor }
queue qin-tor bandwidth 56Kb cbq ( default rio )
# ...
block in log all
antispoof for { $ext_if, $tun_if }
antispoof quick for $int_if inet
### inbound web rules
# Main Jail ($IP4_PUB3 is my webserver IP)
pass in on $ext_if inet proto tcp from <torlist> to $IP4_PUB3 port www synproxy state queue qin-tor
pass in on $ext_if inet proto tcp to $IP4_PUB3 port www synproxy state
The problem is, when the altq, queue, and pass lines specific to torlist are enabled, all connections are extremely slow. I've even tested my own IP against pfctl -t torlist -T test and got back "0/1 addresses match", and if I test one from the list it's "1/1 addresses match".
So I'm not really sure what exactly I'm doing wrong. I was assuming the pass in line with <torlist> in it would only apply to the IPs listed in that table, so my own IP wouldn't match that rule and would fall through to the next one.
Getting it working isn't urgent, but any help in understanding where I'm failing would be greatly appreciated.
Turns out I didn't quite understand how altq works. When I created an altq on my external interface with only one queue, that queue became the default for all connections. As a result I had to define my top speed and create a default queue for everything else.
For example if my system has 100Mb top
altq on $ext_if cbq bandwidth 100Mb queue { qin-www, qin-tor }
queue qin-www bandwidth 98Mb priority 1 cbq ( default borrow )
queue qin-tor bandwidth 56Kb priority 7 cbq ( rio )
...
pass in on $ext_if inet proto tcp to $IP4_PUB3 port www synproxy state
pass in on $ext_if inet proto tcp from <torlist> to $IP4_PUB3 port www synproxy state queue qin-tor
(doesn't need to be on top since pf parses all the rules unless you use 'quick')
In this way only those IPs matching in gets throttled down to the qin-tor queue, everything else not defined defaults to the qin-www queue.
The OpenBSD pf FAQ didn't make this clear to me until I thought about why there would be an error about a missing "default"; then I figured the queue definition applies to the whole interface, so you need a default queue for traffic not assigned to a specific queue.
So there it is... the solution to my 'simple' problem. Hopefully anyone else who has this problem comes across this.
This is the FAQ I was going by for packet queueing: http://www.openbsd.org/faq/pf/queueing.html
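As a side note, when the Tor exit list changes you can refresh the table in place without reloading the whole ruleset. A quick sketch using the /var/db/torlist file from the ruleset above (198.51.100.7 is just a placeholder address to test against):
# reload the table contents from the file
pfctl -t torlist -T replace -f /var/db/torlist
# sanity checks: how many entries loaded, and does a given address match?
pfctl -t torlist -T show | wc -l
pfctl -t torlist -T test 198.51.100.7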

NTP working modes

I am new to the NTP protocol. I read RFC 1305 and have some questions about NTP.
My questions are related to NTP working modes.
According to RFC 1305 there are 8 modes:
| 0 | reserved
| 1 | symmetric active
| 2 | symmetric passive
| 3 | client
| 4 | server
| 5 | broadcast
| 6 | NTP control message
| 7 | reserved for private use
My questions:
1- What are the differences between a symmetric passive device and a symmetric active one?
2- Two symmetric active devices can sync with each other, and two symmetric passive devices can sync with each other too, but can a symmetric passive device be synced by a symmetric active one, and vice versa?
3- When a symmetric passive device is connected to a symmetric active one, which one sends the NTP packet first?
4- What happens in broadcast mode? Does the client send any NTP packet, or does only the broadcaster do that?
5- "In order to sync some clients that have class D IPs, the server fills the 3 timestamp fields (the receive timestamp is null), sets the mode to 5, and sends the packet to 224.0.1.1; the clients get that packet and send nothing in this procedure." Is this true?
6- Who sends the NTP control message, client or broadcaster? What is it for? What is the appropriate answer to it? Is it always 12 bytes long?
7- "A stratum 1 NTP server (GPS connected) acts like this: it answers mode 1 requests with mode 2, mode 3 with mode 4, and mode 6 with 7." Is this true?
I can only reply to a few questions:
4. Only the server (broadcaster) sends NTP packets in this mode.
Clients only listen on the interface, parse the received packet, and set their clock accordingly; no reply is sent.
A client may still send an NTP request, but the server should not reply to it.
5. Right. No answer is supposed to be sent by these clients.
6. Mode 6 is used by the ntpq program. It can, for example, query "a list of the peers known to the server as well as a summary of their state" (from the man page).
This has recently been exploited for DDoS reflection attacks, because it can be triggered with a spoofed IP address and the reply is larger than the query.
For this reason mode 6 and 7 queries should be blocked from outside sources.
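With the reference ntpd, that usually means restrict lines in ntp.conf; the noquery flag is what refuses mode 6/7 queries. A sketch, assuming you still want full access from localhost:
# refuse mode 6/7 (ntpq/ntpdc) queries and runtime configuration from everywhere
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
# but keep localhost unrestricted
restrict 127.0.0.1
restrict -6 ::1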
