Is my VPS under DDoS attack?

When I run this command on my VPS:
netstat -n|grep :80|cut -c 45-|cut -f 1 -d ':'|sort|uniq -c|sort -nr|more
I get this result:
207 222.73.144.194
89 191.96.249.54
58 191.96.249.53
21 2400
15 51.255.64.23
6 143.137.103.251
3 103.27.72.36
1 89.180.150.168
1 66.102.7.137
1 5.189.170.167
1 191.181.39.208
1 183.2.246.218
I think this command is showing the number of connections per IP to port 80.
Is this a DDoS attack?

Have you checked other aspects (e.g. CPU load, or network throughput to the specific IPs using iftop or iptraf)? If those look normal, it may just be a web/HTTP scanner hitting your site.
If you use nginx, you can use the limit_conn module to cap concurrent connections per IP and reject (or queue) clients that exceed your policy.
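A quick sketch of those checks (the interface name is an assumption; adapt to your host). The ss pipeline is a more modern equivalent of the netstat one above:
# overall load first: an idle CPU argues against an effective (D)DoS
uptime
# live per-host throughput on the interface serving the site (eth0 assumed)
iftop -n -i eth0
# established connections to port 80, counted per remote IP
ss -tn 'sport = :80' | tail -n +2 | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr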

JMeter port issue

I need to run JMeter in distributed mode behind a firewall. As far as I know, JMeter requires 3 ports:
1099 by default
server.rmi.localport ( 50000 in my case )
client.rmi.localport ( 60000 in my case )
Here is how I run my servers
jmeter-server -Dserver.rmi.localport=50000 -Djava.rmi.server.hostname=<PUBLIC IP> -Jserver.rmi.ssl.disable=true
How I run the client
jmeter -n -t sample-test.jmx -l result.log -e -o /var/www/html/jmeter -Djava.rmi.server.hostname=<PUBLIC IP> -Dclient.rmi.localport=60000 -Jserver.rmi.ssl.disable=true -R<SERVER IPS>
When I run the client command, the test gets executed on the servers fine. But the servers seem unable to send any data back to the client; the client gets stuck WITHOUT printing the summary lines you normally see when it works:
summary + 2 in 00:00:03 = 0.7/s Avg: 282 Min: 278 Max: 286 Err: 0 (0.00%) Active: 42 Started: 42 Finished: 0
summary + 20400 in 00:00:30 = 688.7/s Avg: 105 Min: 100 Max: 292 Err: 0 (0.00%) Active: 500 Started: 500 Finished: 0
summary = 20402 in 00:00:32 = 631.0/s Avg: 105 Min: 100 Max: 292 Err: 0 (0.00%)
summary + 34429 in 00:00:29 = 1192.6/s Avg: 104 Min: 99 Max: 271 Err: 0 (0.00%) Active: 0 Started: 500 Finished: 500
summary = 54831 in 00:01:01 = 895.8/s Avg: 105 Min: 99 Max: 292 Err: 0 (0.00%)
Tidying up remote # Tue Jan 25 21:34:28 UTC 2022 (1643146468961)
... end of run
On the client side, I ran
lsof -i -P -n | grep LISTEN
which gives me
java 1267 root 131u IPv4 78926 0t0 TCP *:60002 (LISTEN)
java 1267 root 132u IPv4 78927 0t0 TCP *:60001 (LISTEN)
This really surprised me: I was expecting to see only port 60000; I wasn't expecting at all to see two ports, with neither of them being 60000.
So, when I open ports 60001 and 60002 on my firewall, everything works well. But I really don't understand this behavior, because everything I read about JMeter just says to open (in my case) port 60000 (1099 and 50000 too, but those work fine); it never mentions opening ports 60001 and 60002... I'm kind of lost.
Thanks.
As per the Remote hosts and RMI configuration chapter of the JMeter Properties Reference:
client.rmi.localport - Parameter that controls the RMI ports used by RemoteSampleListenerImpl and RemoteThreadsListenerImpl (The Controller)
Default value is 0, which means ports are randomly assigned. If this is non-zero, it will be used as the base for local port numbers for the client engine. At the moment JMeter will open up to three ports beginning with the port defined in this property.
You may need to open corresponding ports in the firewall on the Controller machine.
You may find the JMeter Distributed Testing with Docker article useful, as it explains all the RMI networking-related details.
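In practice that means opening the base port plus the next two on the client (Controller) machine. A minimal sketch, assuming client.rmi.localport=60000 and a firewalld-based host (adapt to your firewall):
# JMeter may bind up to three consecutive ports starting at the configured base
firewall-cmd --permanent --add-port=60000-60002/tcp
firewall-cmd --reload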
The only place I could find referring to my issue is:
https://bz.apache.org/bugzilla/show_bug.cgi?id=65028
This reporter is right: tutorials don't talk about that, and the documentation does not highlight this specific point.
This section says: "If this is non-zero, it will be used as the base for local port numbers for the client engine. At the moment JMeter will open up to three ports beginning with the port defined in client.rmi.localport. If there are any firewalls or other network filters between JMeter client and server, you will need to make sure that they are set up to allow the connections through. If necessary, use monitoring software to show what traffic is being generated."
https://jmeter.apache.org/usermanual/remote-test.html#tips
I read it too quickly.
Hope my problem can help someone else! :)

Peculiar behaviour with Mellanox ConnectX-5 and DPDK in rxonly mode

Recently I observed a peculiar behaviour with a Mellanox ConnectX-5 100 Gbps NIC while working on 100 Gbps capture using DPDK's rxonly mode. I was able to receive 142 Mpps using 12 queues, but with 11 queues it was only 96 Mpps, with 10 queues 94 Mpps, and with 9 queues 92 Mpps. Can anyone explain why there is a sudden/abrupt jump in capture performance from 11 queues to 12 queues?
The details of the setup are given below.
I have connected two servers back to back. One of them (server-1) is used for traffic generation and the other (server-2) is used for traffic reception. In both the servers I am using Mellanox ConnectX-5 NIC.
The performance tuning parameters mentioned in section 3 of https://fast.dpdk.org/doc/perf/DPDK_19_08_Mellanox_NIC_performance_report.pdf [pages 11-12] have been followed.
Both servers have the same configuration.
Server configuration
Processor: Intel Xeon Scalable processor 6148, 20 cores with HT, 2.4 GHz, 27.5 MB L3 cache
Number of processors: 4
RAM: 256 GB, 2666 MHz
The DPDK version used is dpdk-19.11 and the OS is RHEL-8.0.
For traffic generation, testpmd with --forward=txonly and --txonly-multi-flow is used. The command is below.
Packet generation testpmd command in server-1
./testpmd -l 4,5,6,7,8,9,10,11,12,13,14,15,16 -n 6 -w 17:00.0,mprq_en=1,rxq_pkt_pad_en=1 --socket-mem=4096,0,0,0 -- --socket-num=0 --burst=64 --txd=4096 --rxd=4096 --mbcache=512 --rxq=12 --txq=12 --nb-cores=12 -i -a --rss-ip --no-numa --forward=txonly --txonly-multi-flow
testpmd> set txpkts 64
It was able to generate 64-byte packets at a sustained rate of 142.2 Mpps. This is used as input to the second server, which works in rxonly mode. The command for reception is below.
Packet Reception command with 12 cores in server-2
./testpmd -l 4,5,6,7,8,9,10,11,12,13,14,15,16 -n 6 -w 17:00.0,mprq_en=1,rxq_pkt_pad_en=1 --socket-mem=4096,0,0,0 -- --socket-num=0 --burst=64 --txd=4096 --rxd=4096 --mbcache=512 --rxq=12 --txq=12 --nb-cores=12 -i -a --rss-ip --no-numa
testpmd> set fwd rxonly
testpmd> show port stats all
######################## NIC statistics for port 0 ########################
RX-packets: 1363328297 RX-missed: 0 RX-bytes: 87253027549
RX-errors: 0
RX-nombuf: 0
TX-packets: 19 TX-errors: 0 TX-bytes: 3493
Throughput (since last show)
Rx-pps: 142235725 Rx-bps: 20719963768
Tx-pps: 0 Tx-bps: 0
############################################################################
Packet Reception command with 11 cores in server-2
./testpmd -l 4,5,6,7,8,9,10,11,12,13,14,15 -n 6 -w 17:00.0,mprq_en=1,rxq_pkt_pad_en=1 --socket-mem=4096,0,0,0 -- --socket-num=0 --burst=64 --txd=4096 --rxd=4096 --mbcache=512 --rxq=11 --txq=11 --nb-cores=11 -i -a --rss-ip --no-numa
testpmd> set fwd rxonly
testpmd> show port stats all
######################## NIC statistics for port 0 ########################
RX-packets: 1507398174 RX-missed: 112937160 RX-bytes: 96473484013
RX-errors: 0
RX-nombuf: 0
TX-packets: 867061720 TX-errors: 0 TX-bytes: 55491950935
Throughput (since last show)
Rx-pps: 96718960 Rx-bps: 49520107600
Tx-pps: 0 Tx-bps: 0
############################################################################
As you can see, there is a sudden jump in Rx-pps from 11 cores to 12 cores. This variation was not observed at other steps, e.g. from 8 to 9, 9 to 10, or 10 to 11 cores.
Can anyone explain the reason for this sudden jump in performance?
The same experiment was conducted, this time using 11 cores for traffic generation.
./testpmd -l 4,5,6,7,8,9,10,11,12,13,14,15 -n 6 -w 17:00.0,mprq_en=1,rxq_pkt_pad_en=1 --socket-mem=4096,0,0,0 -- --socket-num=0 --burst=64 --txd=4096 --rxd=4096 --mbcache=512 --rxq=11 --txq=11 --nb-cores=11 -i -a --rss-ip --no-numa --forward=txonly --txonly-multi-flow
testpmd> show port stats all
######################## NIC statistics for port 0 ########################
RX-packets: 0 RX-missed: 0 RX-bytes: 0
RX-errors: 0
RX-nombuf: 0
TX-packets: 2473087484 TX-errors: 0 TX-bytes: 158277600384
Throughput (since last show)
Rx-pps: 0 Rx-bps: 0
Tx-pps: 142227777 Tx-bps: 72820621904
############################################################################
On the capture side with 11 cores
./testpmd -l 1,2,3,4,5,6,10,11,12,13,14,15 -n 6 -w 17:00.0,mprq_en=1,rxq_pkt_pad_en=1 --socket-mem=4096,0,0,0 -- --socket-num=0 --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=11 --txq=11 --nb-cores=11 -i -a --rss-ip --no-numa
testpmd> set fwd rxonly
testpmd> show port stats all
######################## NIC statistics for port 0 ########################
RX-packets: 8411445440 RX-missed: 9685 RX-bytes: 538332508206
RX-errors: 0
RX-nombuf: 0
TX-packets: 0 TX-errors: 0 TX-bytes: 0
Throughput (since last show)
Rx-pps: 97597509 Rx-bps: 234643872
Tx-pps: 0 Tx-bps: 0
############################################################################
On the capture side with 12 cores
./testpmd -l 1,2,3,4,5,6,10,11,12,13,14,15,16 -n 6 -w 17:00.0,mprq_en=1,rxq_pkt_pad_en=1 --socket-mem=4096,0,0,0 -- --socket-num=0 --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=12 --txq=12 --nb-cores=12 -i -a --rss-ip --no-numa
testpmd> set fwd rxonly
testpmd> show port stats all
######################## NIC statistics for port 0 ########################
RX-packets: 9370629638 RX-missed: 6124 RX-bytes: 554429504128
RX-errors: 0
RX-nombuf: 0
TX-packets: 0 TX-errors: 0 TX-bytes: 0
Throughput (since last show)
Rx-pps: 140664658 Rx-bps: 123982640
Tx-pps: 0 Tx-bps: 0
############################################################################
The sudden jump in performance from 11 to 12 cores remains.
With the DPDK LTS releases 19.11, 20.11, and 21.11, running just in vector mode (the default mode) on Mellanox CX-5 and CX-6 does not reproduce the problem mentioned above.
[EDIT-1] Retested with rxqs_min_mprq=1 at 2 * 100 Gbps for 64B packets: for 16 RXTX on 16T16C this resulted in a degradation of 9~10 Mpps, and for all RX queue counts from 1 to 7 there is a degradation of 6 Mpps with rxqs_min_mprq=1.
A capture of the RXTX-to-core scaling was attached here (image not reproduced).
Investigating the MPRQ claim, the following are some unique observations:
For both MLX CX-5 and CX-6, the maximum that each RX queue can attain is around 36 to 38 Mpps.
A single core can achieve up to 90 Mpps (64B) with 3 RXTX in IO mode using AMD EPYC Milan, on both CX-5 and CX-6.
100 Gbps at 64B can be achieved with 14 logical cores (7 physical cores) with testpmd in IO mode.
For both CX-5 and CX-6, 2 * 100 Gbps at 64B requires MPRQ and the compression technique to allow more packets into and out of the system.
A multitude of configuration tuning is required to achieve these high numbers; please refer to the Stack Overflow question and the DPDK MLX tuning parameters for more information.
PCIe gen4 bandwidth is not the limiting factor; it is the NIC ASIC with its internal embedded switch that produces the behaviour mentioned above. Hence, to overcome this limitation one needs PMD arguments to activate the hardware features, which in turn increases the CPU overhead of PMD processing. There is thus a barrier (more CPU is needed) to process the compressed and inlined multi-packets and convert them into single DPDK mbufs. This is the reason more threads are required when the PMD arguments are used.
note:
Test application: testpmd
EAL Args: --in-memory --no-telemetry --no-shconf --single-file-segments --file-prefix=2 -l 7,8-31
PMD args vector: none
PMD args for 2 * 100Gbps line rate: txq_inline_mpw=204,txqs_min_inline=1,mprq_en=1,rxqs_min_mprq=1,mprq_log_stride_num=12,rxq_pkt_pad_en=1,rxq_cqe_comp_en=4
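For reference, a sketch of the full invocation those notes imply (PCI address 17:00.0 reused from the question; the queue and core counts are assumptions, and -a is the DPDK 20.11+ spelling of the older -w device flag):
# EAL args from the note, with the 2 * 100Gbps PMD args attached to the device
./dpdk-testpmd --in-memory --no-telemetry --no-shconf --single-file-segments --file-prefix=2 -l 7,8-31 \
  -a 17:00.0,txq_inline_mpw=204,txqs_min_inline=1,mprq_en=1,rxqs_min_mprq=1,mprq_log_stride_num=12,rxq_pkt_pad_en=1,rxq_cqe_comp_en=4 \
  -- --nb-cores=16 --rxq=16 --txq=16 --burst=64 --mbcache=512 -i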

Large result is slow anywhere but local

I have a fairly large query running on ClickHouse. The problem is that when run on localhost from the command line it takes about 0.7 sec to complete, and this is consistently fast. The issue is when querying from C# / HTTP / Postman: there it takes about 10 times as long to return the data (the result is about 3-4 MB, so I don't think it's a size issue).
I have tried monitoring network latency, but there is nothing of note there.
On the host it works like a charm, but outside it does not :( ... what to do?
I expect the latency to be a few hundred ms, but it turns out to be 7 sec :/
Check timings with curl (see https://clickhouse.yandex/docs/en/interfaces/http/ and https://stackoverflow.com/a/22625150) and compare local vs remote.
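For example, a minimal timing sketch (host and query are placeholders; run it once against localhost and once against the remote host and compare the fields):
# split the request into name lookup / connect / first byte / total
curl -o /dev/null -s -w 'dns=%{time_namelookup} connect=%{time_connect} ttfb=%{time_starttransfer} total=%{time_total}\n' \
  'http://localhost:8123/?query=SELECT%201'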
CH HTTP usually provides almost the same performance as TCP, and HTTP can be faster for small result sets (like 10 rows).
Again. The problem is not the HTTP.
Example:
time clickhouse-client -q "select number, arrayMap(x->sipHash64(number,x), range(10)) from numbers(10000)" >native.out
real 0m0.034s
time curl -S -o http.out 'http://localhost:8123/?query=select%20number%2C%20arrayMap(x-%3EsipHash64(number%2Cx)%2C%20range(10))%20from%20numbers(10000)'
real 0m0.017s
ls -l http.out native.out
2108707 Oct 1 16:17 http.out
2108707 Oct 1 16:17 native.out
10 000 rows - 2 MB
HTTP is faster: 0.017 s vs 0.034 s.
Canada -> Germany (openvpn)
time curl -S -o http.out 'http://user:xxx@cl.host.x:8123/?query=select%20number%2C%20arrayMap(x-%3EsipHash64(number%2Cx)%2C%20range(10))%20from%20numbers(10000)'
real 0m1.619s
ping cl.host.x
PING cl.host.x (10.253.52.6): 56 data bytes
64 bytes from 10.253.52.6: icmp_seq=0 ttl=61 time=131.710 ms
64 bytes from 10.253.52.6: icmp_seq=1 ttl=61 time=133.711 ms

HikariCP: What database level timeouts should be considered to set maxLifetime for Oracle 11g

In the documentation for HikariCP, it is mentioned that
We strongly recommend setting this value, and it should be at least 30 seconds less than any database-level connection timeout.
What are those database-level connection timeouts that should be taken into account for an Oracle 11.2 database? And how could I find those timeouts (queries to execute)?
Short answer: none (by default).
For the record (to include details here in case the link changes), we're talking about property maxLifetime of HikariCP:
This property controls the maximum lifetime of a connection in the pool. An in-use connection will never be retired, only when it is closed will it then be removed. We strongly recommend setting this value, and it should be at least 30 seconds less than any database or infrastructure imposed connection time limit. A value of 0 indicates no maximum lifetime (infinite lifetime), subject of course to the idleTimeout setting. Default: 1800000 (30 minutes)
In my experience, it's a good thing that HikariCP does that. As far as I can tell, by default Oracle does not enforce a max lifetime for connections (neither on the JDBC driver side (1) nor on the server side (2)). So in this respect the "infrastructure-imposed connection time limit" is +infinity, and that was a problem for us, as we did observe issues with long-lived connections. It also means any value is "at least 30 seconds less", including the default :)
I suppose the connection layer does not do anything about this because it counts on the pool layer above it to handle such things. It was not possible with the (now deprecated) implicit connection pool, and I don't know if UCP (the replacement) does it, but if you use HikariCP you don't use those.
Now, after 30 minutes of a given connection's life (usually after many reuses for various purposes), HikariCP closes it and creates a fresh one. That has a very minor cost, and it fixed our issues with long-lived connections. We're happy with that default, but we still make it configurable, just in case (see (2) below).
(1) OracleDataSource does not offer any configuration point (property or system property) to control that, and I observed infinite lifetime.
(2) For server-side limits, see the profile parameter IDLE_TIME. Quoting this answer:
Oracle by default will not close a connection due to inactivity. You can configure a profile with an IDLE_TIME to cause Oracle to close inactive connections.
To verify the value of IDLE_TIME for your user, combining answers from this Q&A:
select p.limit
from dba_profiles p, dba_users u
where p.resource_name = 'IDLE_TIME' and p.profile = u.profile and u.username = '...'
;
Default value is UNLIMITED.
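If you prefer to run the check from a shell, a minimal sketch using sqlplus (the connect string and username are placeholders):
# UNLIMITED in the output means no server-side idle limit for that user
sqlplus -s system/password@//dbhost:1521/ORCL <<'EOF'
select p.limit
from dba_profiles p, dba_users u
where p.resource_name = 'IDLE_TIME' and p.profile = u.profile and u.username = 'MYUSER';
exit
EOF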
Please note there can be other limits enforced elsewhere (a firewall... yes, I've been bitten by that, although most DB systems have a keep-alive mechanism) that might interfere. So you'd better make it configurable, in case such issues are discovered when you deploy your product.
On Linux, you can verify the max lifetime of physical connections by monitoring the TCP sockets connected to your database. I've been running the script below on my server (from the DB's point of view, that's the client host); it takes one argument, the ip:port of your Oracle node as it appears in the output of netstat -tan (or a pattern, if you have several nodes).
#!/bin/bash
# Usage: pass the ip:port of your Oracle node, as it appears in netstat -tan
target="$1"
dir=$(mktemp -d)    # holds one marker file per local port; its ctime = when we first saw the socket
while sleep 10
do
    echo "------------ $(date)"
    now=$(date +%s)
    netstat -tan | grep " $target " | awk '{print $4}' | cut -f2 -d: | while read port
    do
        file="$dir/p_$port"
        [ ! -e "$file" ] && touch "$file"
        ftime=$(stat -c %Z "$file")
        echo -e "$port :\t $(( now - ftime ))"    # age of the socket, in seconds
    done
done
# reached when the loop is interrupted (e.g. ctrl-c during the sleep)
\rm "$dir"/p_*
\rmdir "$dir"
If you run that and stop it with ctrl-c during the sleep, it should exit the loop and clean up the temp directory, but this is not 100% foolproof.
In the results, none of the ports should show a value that exceeds 1800 seconds (i.e. 30 minutes), give or take a minute. See the example output below: the first sample shows 2 sockets above 1800 s; they're gone 10 s later.
------------ Thu Jul 6 16:09:00 CEST 2017
49806 : 1197
49701 : 1569
49772 : 1348
49782 : 1317
49897 : 835
49731 : 1448
49620 : 1830
49700 : 1569
49986 : 523
49722 : 1498
49715 : 1509
49711 : 1539
49629 : 1820
49732 : 1448
50026 : 332
49849 : 1036
49858 : 1016
------------ Thu Jul 6 16:09:10 CEST 2017
49806 : 1207
49701 : 1579
49772 : 1358
49782 : 1327
49897 : 845
49731 : 1458
49700 : 1579
49986 : 533
49722 : 1508
49715 : 1519
49711 : 1549
49732 : 1458
50026 : 342
49849 : 1046
49858 : 1026
You'll need to run the script for more than 30 minutes to see that, because it does not know the age of sockets that existed before it started.

net-snmp snmptrap sending samples

I'm new to SNMP. I just configured the agent and the manager, and I'm able to receive the traps sent by the agent. But I noticed that the traps arrive at the manager about 10 seconds apart, and I need to receive each trap as soon as I generate it, not 10 seconds later.
I'll show you my script, which is intended to capture the average signal power that a client has with an access point. Samples are taken every second, and I need each trap to reach the manager in less than a second.
while :
do
    valor=$(iw dev wlan0 station dump \
        | grep 'signal avg:' | awk '{print $3}')
    snmptrap -v 1 -c public 192.168.1.25 '1.2.3.4.5.6' \
        '192.168.1.1' 6 99 '55' 1.11.12.13.14.15 s "$valor"
    echo "$valor" >> muestras.txt
    sleep 1
done
But surprisingly, the traps seem to be generated every 10 seconds, or maybe the manager receives them with a 10-second delay. I don't know where the problem is, in the agent or in the manager, but I'm sure the agent generates samples every second, because muestras.txt shows that.
Hope you can help me!
Greetings!
I found the answer. The problem was in the server that runs snmptrapd: I simply passed the argument -n to snmptrapd and that solved everything! (-n tells snmptrapd not to resolve the source address of each incoming trap to a hostname, so traps are no longer held up by slow reverse-DNS lookups.)
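For anyone hitting the same delay, a minimal sketch of running the receiver with that flag (the logging options are assumptions, adjust to taste):
# -n: don't reverse-resolve trap source addresses (the DNS wait caused the ~10 s lag)
# -f: stay in the foreground; -Lo: log received traps to stdout
snmptrapd -n -f -Lo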
