Successive DNS caching

I can't find anyone discussing this, but isn't it correct to say that if my authoritative DNS server sets a record's TTL to 1 minute, and a client on a home computer resolves that record through a local recursive DNS server, then a change to this DNS record may take up to 2 minutes to propagate, because the caches of the local DNS server and of the operating system add up?
If client A tries to resolve mysite.com through local DNS server D right before a DNS record update, then D will have a fresh copy in its cache with a TTL of 1 minute, pointing to the old value. If client B then tries to resolve mysite.com 59 seconds later through D, client B's operating system will cache that record for another minute, so it effectively takes at least 2 minutes for client B to see the correct value.
Is my assumption correct? And if so, what other caches are possible along the way, and how can I reliably track all of the caches that a specific machine sits behind for a specific hostname resolution?

Your assumption is not correct. When sending a response to the client, a caching DNS server sends only the time left until cache expiry as the TTL value. The client will not even be aware of (and does not really care about) the "original" TTL value.
Here is an example: I have created
dsmoraes.bajic.nl A 127.0.0.254
with TTL=60.
It is a new record so it is safe to assume that it is not cached anywhere. When I query Google DNS for the first time, I will get TTL of 60 (or 59) seconds:
$ dig dsmoraes.bajic.nl @8.8.8.8
; <<>> DiG 9.11.3-1ubuntu1.8-Ubuntu <<>> dsmoraes.bajic.nl @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7656
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;dsmoraes.bajic.nl. IN A
;; ANSWER SECTION:
dsmoraes.bajic.nl. 59 IN A 127.0.0.254
;; Query time: 14 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Aug 02 09:49:49 CEST 2019
;; MSG SIZE rcvd: 62
But with every subsequent request, the TTL sent to the client gets lower (only the remaining time):
$ dig +noall +answer dsmoraes.bajic.nl @8.8.8.8
dsmoraes.bajic.nl. 49 IN A 127.0.0.254
$ dig +noall +answer dsmoraes.bajic.nl @8.8.8.8
dsmoraes.bajic.nl. 43 IN A 127.0.0.254
(Google DNS is distributed, so you might hit a different caching server in subsequent requests.)
This way, as long as the caching servers are well behaved, there is no chance of a stale record being served past the original TTL (in exceptional cases one might deliberately configure their own DNS server to behave differently, but that is another story).
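To the follow-up question about tracking the caches a machine sits behind: you can query each resolver in the chain directly and compare the TTLs they report. A rough sketch, where ns1.mysite.com and 192.168.1.1 stand in for your authoritative server and your local recursive resolver:
dig +noall +answer mysite.com @ns1.mysite.com   # authoritative: always reports the full TTL
dig +noall +answer mysite.com @192.168.1.1      # local recursive resolver: remaining TTL, counting down
dig +noall +answer mysite.com                   # whatever /etc/resolv.conf points at, e.g. an OS-level cache such as systemd-resolved
Every well-behaved cache in front of you should report a TTL no larger than the value the authoritative server hands out.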
Further reading: https://www.rfc-editor.org/rfc/rfc1034#section-6

Related

AWS Neptune Performance

I'm working on transferring data from our database, which is an RDF store, to AWS Neptune, and I'm facing some performance issues.
I have a db.r4.large Neptune instance and an EC2 instance in the same VPC as Neptune.
Basically, I'm trying to ingest data into Neptune via HTTP requests to the SPARQL endpoint <myinstance>:8182/sparql.
I send the HTTP requests from my EC2 instance, and Neptune's processing time seems slow. In addition, Neptune's processing does not appear to be parallel.
Below are my tests & results:
I sent the following request to Neptune:
time curl -X POST -d @/tmp/my_file_32m.txt http://myneptune-poc.c0zm6uyrnnwp.us-east-1.neptune.amazonaws.com:8182/sparql
/tmp/my_file_32m.txt contains SPARQL INSERT commands. The time for this request is 34.037s, while Neptune claims it took 21.846s:
{
"type" : "Commit",
"totalElapsedMillis" : 21846
}
real 0m34.037s
user 0m0.044s
sys 0m0.062s
A tcpdump clearly proves that the response from Neptune was received after a delay of 34 seconds.
When I sent 100 MB of data it took more than 1 minute.
When I sent the same 32 MB file twice in parallel, the time roughly doubled:
time xargs -I % -P 8 curl -vX POST -d @/tmp/my_file_32m.txt "http://myneptune-poc.c0zm6uyrnnwp.us-east-1.neptune.amazonaws.com:8182/sparql" < <(printf '%s\n' {1..2})
{
"type" : "Commit",
"totalElapsedMillis" : 29797
}
{
"type" : "Commit",
"totalElapsedMillis" : 30362
}
real 0m57.752s
user 0m0.137s
sys 0m0.101s
I took a tcpdump and can clearly see in Wireshark that the requests were sent in parallel, but there is a delay of ~1 minute until Neptune returns 200 OK for both requests.
Again, it seems that Neptune's processing is not concurrent.
The requests were sent at time 12 and the 200 OK for both requests arrived at time 69, which is exactly 57 seconds of delay.
I tried increasing my Neptune instance size to db.r4.xlarge and then to db.r4.2xlarge, but I got the same performance.
I tried sending the data gzip-compressed to improve the times, but it seems that Neptune doesn't support it (Wireshark shows the request was sent correctly).
I would like to hear your opinion about my tests and the results:
Why is performance slow for a single HTTP request?
Why is Neptune's processing not parallel?
You are comparing the output of time (client-side round-trip time) with the server-reported totalElapsedMillis. The former includes your network transmission time, whereas the latter is just the time the DB took to compute the query from the moment it accepted the request. Do you have any metrics on the time it took to transmit your 100MB file?
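One way to get those numbers is curl's --write-out timing variables, which split the round trip into connect, upload, and server wait time. A sketch reusing the file and endpoint from the question:
curl -s -o /dev/null -X POST -d @/tmp/my_file_32m.txt \
  -w 'connect=%{time_connect}s pretransfer=%{time_pretransfer}s first_byte=%{time_starttransfer}s total=%{time_total}s uploaded=%{size_upload} bytes\n' \
  http://myneptune-poc.c0zm6uyrnnwp.us-east-1.neptune.amazonaws.com:8182/sparql
The gap between first_byte and pretransfer roughly covers the upload plus server-side processing, which you can set against totalElapsedMillis; total minus first_byte is the time spent receiving the response.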
Neptune does process queries in parallel (in fact, the amount of parallelism scales with your instance type). If your queries are really small compared to the time they spend on the wire, it may appear as though the results completed one after the other. I would like to see more granular details of your experiments to see if there is an issue with your setup.
For starters, what is the network lag between your client and the DB endpoint? (i.e., how long does it take to make a request to the /status API, for example?)
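For example, a quick check (a sketch; /status is served on the same port as the SPARQL endpoint):
time curl -s http://myneptune-poc.c0zm6uyrnnwp.us-east-1.neptune.amazonaws.com:8182/status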

Compare the time difference between two time servers

I'm trying to determine if there is any difference in time between two time servers in Windows. For example, I have time.windows.com and time.nist.gov. Is there a simple way to compare the time difference?
From Windows you can compare both time sources to your own machine and approximate the difference:
w32tm /monitor /computers:time.windows.com,time.nist.gov
time.windows.com[52.179.17.38:123]:
ICMP: error IP_REQ_TIMED_OUT - no response in 1000ms
NTP: -0.0528936s offset from local clock
RefID: utcnist2.colorado.edu [128.138.141.172]
Stratum: 2
time.nist.gov[132.163.97.1:123]:
ICMP: error IP_REQ_TIMED_OUT - no response in 1000ms
NTP: -0.0476330s offset from local clock
RefID: 'NIST' [0x5453494E]
Stratum: 1
Warning:
Reverse name resolution is best effort. It may not be
correct since RefID field in time packets differs across
NTP implementations and may not be using IP addresses.
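If you want to watch one of the sources over time rather than take a single sample, w32tm can also chart the offset repeatedly (the sample count below is arbitrary):
w32tm /stripchart /computer:time.nist.gov /samples:5 /dataonly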
Good Luck!
Shane

Bash script that checks website every 10 seconds

The following script checks a site's content every 10 seconds to see whether anything has changed. It's for a very time-sensitive application: if something on the site changes, I have only seconds to do something else. It will then start a new download-and-compare cycle and wait for the next change. The "do something else" part has yet to be scripted and is not relevant to the question.
The question: will it be a problem for a public website to have a script downloading a single page every 10-15 seconds? If so, is there any other way to monitor a site unattended?
#!/bin/bash
Domain="example.com"
Ocontent=$(curl -L "$Domain")
Ncontent="$Ocontent"
until [ "$Ocontent" != "$Ncontent" ]; do
Ocontent=$(curl -L "$Domain")
#CONTENT CHANGED TRUE
#if [ "$Ocontent" == "$Ncontent ]; then
# Ocontent=$(curl -L "$Domain")
#fi
echo "$Ocontent"
sleep 10
done
The problems you're going to run into:
If the site notices and has a problem with it, you may end up on a banned IP list. Using an IP pool or other distributed resource can mitigate this.
Hitting the website at precise every-x-seconds intervals is unlikely to happen; network latency will cause a great deal of variance.
If you get a network partition, your code should know how to cope. (What if your connection goes down? What should happen? See the sketch after this list.)
Note that getting the immediate response is only part of downloading a web page. There may be changes to referenced files, such as CSS, JavaScript or images, that are not apparent from the original HTTP response alone.
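As a minimal sketch of the failure-handling point, assuming the same example.com placeholder: skip the comparison whenever the download fails, so a transient network error is not mistaken for a content change.
#!/bin/bash
Domain="example.com"
Ocontent=$(curl -sfL "$Domain")

while true; do
    sleep 10
    # -f makes curl fail on HTTP errors; skip this round if the fetch fails
    if ! Ncontent=$(curl -sfL "$Domain"); then
        continue
    fi
    if [ "$Ncontent" != "$Ocontent" ]; then
        echo "Content changed at $(date)"
        break
    fi
done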

DisMan Monitoring - traps not being generated

In order to set up self-monitoring of a Linux OS (CentOS) so that it sends traps when a condition occurs, I have configured the following lines:
com2sec notConfigUser default Public0
group notConfigGroup v1 notConfigUser
group notConfigGroup v2c notConfigUser
view systemview included .1
access notConfigGroup "" any noauth exact systemview systemview none
for disk query
disk / 100000000
trap2sink 10.10.64.132
authorization for self monitoring
rouser admin
iquerySecName admin
Define which OIDs to monitor and their threshold values:
monitor -r 10 DiskAlmostFull dskPercent < 90
monitor -r 10 machineTooBusy hrProcessorLoad < 90
But the traps are generated only when I restart the snmpd daemon.
I have tried to troubleshoot this issue without success.
Any help will be appreciated.
Thanks in advance
Having had the same problem, I discovered the following explanation in "man snmpd.conf".
Section "monitor [OPTIONS] NAME EXPRESSION" states:
"Note that the event will only be triggered once, when the expression first matches. This monitor entry will not fire again until the monitored condition first becomes false, and then matches again."
You may not like the answer, but the monitor command behaves as advertised.
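On top of that, note the direction of the comparisons in the question: dskPercent < 90 is already true when snmpd starts, so the event fires once at startup (matching the "only when I restart snmpd" symptom) and never again, because the condition never becomes false. If the intent is to trap when the disk is almost full or the CPU is too busy, the expressions presumably need to be inverted, for example:
monitor -r 10 DiskAlmostFull dskPercent > 90
monitor -r 10 machineTooBusy hrProcessorLoad > 90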

Getting AltQ working in pf.conf (limiting inbound Tor traffic)

I'm trying to learn the ropes of packet queuing, so I thought I'd set up a limit on traffic coming into port 80 from known Tor exit nodes. This is on FreeBSD 9, so OpenBSD-specific solutions might not apply (syntax, etc.).
# Snipped to mainly the relevant parts
table <torlist> persist file "/var/db/torlist"
# ...
set block-policy return
scrub in all
scrub out on $ext_if all
# What I *want* to do is create a queue for known tor exit nodes
# no single one IP should be able to do more than 56k/sec
# but the combined bandwidth of all tor visitors should not
# exceed 512k/sec, basically limiting Tor visitors to something
# like dialup
altq on $ext_if cbq bandwidth 512k queue { qin-tor }
queue qin-tor bandwidth 56Kb cbq ( default rio )
# ...
block in log all
antispoof for { $ext_if, $tun_if }
antispoof quick for $int_if inet
### inbound web rules
# Main Jail ($IP4_PUB3 is my webserver IP)
pass in on $ext_if inet proto tcp from <torlist> to $IP4_PUB3 port www synproxy state queue qin-tor
pass in on $ext_if inet proto tcp to $IP4_PUB3 port www synproxy state
The problem is that when the altq, queue, and pass lines specific to torlist are enabled, all connections are extremely slow. I've even tested my own IP with pfctl -t torlist -T test and got back "0/1 addresses match"; if I test one from the list, it's "1/1 addresses match".
So I'm not really sure what exactly I'm doing wrong. I was assuming the pass in line with <torlist> in it would apply only to the IPs listed in that table, so my own IP wouldn't match that rule and would fall through to the next one.
Getting it working isn't urgent, but any help in understanding where I'm failing would be greatly appreciated.
It turns out that I didn't quite understand how altq works. When I created only one queue on my external interface and marked it as the default, it became the default for all connections. As a result I had to declare my top speed and create a default queue for everything else.
For example, if my interface tops out at 100Mb:
altq on $ext_if cbq bandwidth 100Mb queue { qin-www, qin-tor }
queue qin-www bandwidth 98Mb priority 1 cbq ( default borrow )
queue qin-tor bandwidth 56Kb priority 7 cbq ( rio )
...
pass in on $ext_if inet proto tcp to $IP4_PUB3 port www synproxy state
pass in on $ext_if inet proto tcp from <torlist> to $IP4_PUB3 port www synproxy state queue qin-tor
(This rule doesn't need to be on top, since pf evaluates all the rules and the last match wins unless you use 'quick'.)
This way only the IPs matching <torlist> get throttled down into the qin-tor queue; everything else not explicitly assigned defaults to the qin-www queue.
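To confirm that traffic is actually landing in the intended queue, the per-queue counters can be inspected on the FreeBSD host (a sketch):
# Show queue definitions together with live packet/byte counters
pfctl -vvs queue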
The OpenBSD pf FAQ didn't make this clear to me until I thought about why pf would complain about needing a "default" queue; then I figured it must apply to the whole interface, so you need to define a default queue for any traffic not assigned to a specific one.
So there it is: the solution to my 'simple' problem. Hopefully anyone else who has this problem comes across this.
This is the FAQ I was going by for packet queueing: http://www.openbsd.org/faq/pf/queueing.html
