Docker container slow network access (Windows Server 2016 containers)

Notes:
"Slow network performance in Docker container" does not address either of my issues.
"Very slow network performance of Docker containers with host's network" might be related, but the one response there is definitely not the problem.
I am running the Windows Server 2016 integrated version of Docker with the microsoft/mssql-server-windows-developer image (Windows containers; Linux is not an option for the ultimate purpose). My goal is to use this image as a temporary SQL Server for repeated acceptance test runs.
As of now, everything works as I need it to, except for performance. As a measure of performance I have a set of scripts (invoked by PowerShell) that set up a database with tables, schema, roles, etc. and a small amount of initial data.
When I share a drive with the host system, I can connect to the container and run this PowerShell script inside the container. It takes 30 seconds to complete, with no errors, and when I inspect the database with SSMS it is all correct.
When I run the script from the host machine (via the exposed port 1433), it takes about 6000 percent longer, i.e. about 30 minutes. However, it also runs correctly and produces correct results.
The above measurements were made using the default "nat" network, with the container run with -p 1433:1433. My main question is: how can I get remotely reasonable performance when running my script from the host system? (Ultimately, running anything under test from within the container is not an option, and this same performance issue must be resolved for our container deployment plans to be realistic.)
Thanks!
What I have tried so far.
First, there are no internal CPU or memory performance issues within the container. I have already experimented with the --cpus related options and the -m option and given the container far more resources than it really needs; the internal performance does not change and remains very fast regardless of these settings.
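For reference, the kind of run command involved looks something like this (the container name and resource values are illustrative; the image and port mapping are the ones described above):
docker run -d --name sqltest --cpus 4 -m 8g -p 1433:1433 microsoft/mssql-server-windows-developer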
I have also investigated creating a "transparent" network. Using the PowerShell cmdlet New-ContainerNetwork, I created a transparent network and started the container with the --net Trans switch. I got a valid DHCP address from the external network and had connectivity to the internet and to other domain machines on the intranet. Using netstat -a (and the PowerShell Get-WmiObject win32_service), I was able to determine that the MSSQLSERVER instance was running and listening on port 1433. I installed telnet inside the container and could connect to that port using "telnet [ipaddressfromipconfig] 1433".
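For reference, a sketch of the transparent-network setup described above (docker network create -d transparent is the Docker CLI equivalent of the New-ContainerNetwork cmdlet; the container name is illustrative):
docker network create -d transparent Trans
docker run -d --net Trans --name sqltest microsoft/mssql-server-windows-developer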
From a host command prompt I could ping the container's IP address and get replies, but the telnet command above would not connect from the host, so naturally SSMS would not connect either. The -P and -p 1433:1433 port mapping options are not supported with a transparent network, but I have been imagining that they should not be necessary when accessing from the host machine.
Suspecting that a firewall was somehow blocking the connection, I verified that the firewall service in the container is not even running. I also turned off the firewall completely on the host; however, nothing changed and I still cannot connect. I tried both the --expose 1433 parameter on docker run and rebuilding the image with an EXPOSE 1433 line in the Dockerfile. No change in conditions.
I have no idea if a transparent network will even solve the issue, but I would like advice on this.
It would be okay for the performance to be somewhat slower, within reason, but a 6000 percent degradation is a problem for my intended purpose.

It turns out that I did not know enough to supply all the relevant information on this issue.
The PowerShell code we are using to create and populate the test database sends a series of 328 script.sql files to the SQL Server. It uses Windows authentication. This works with the container because we are using a gMSA credential spec, which is documented here:
https://learn.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/manage-serviceaccounts
That method of authentication may or may not be relevant. Using Wireshark to monitor the container adapter, I noticed that a connection took just under 4 seconds to authenticate, but I have no other method to provide as a comparison, so I cannot say whether that method of authentication is significantly slower than some other method. What is definitely significant is that when our main PowerShell code sends a particular script.sql file, it does not use Invoke-Sqlcmd. Rather, it invokes sqlcmd via Invoke-Expression, similar to:
$args = "-v DBName = $dbName JobOwnerName = $jobOwnerName -E -i $fileName -S $server -V 1 -f 65001 -I";
if (!$skipDBFlag)
{
$args += " -d $dbName";
}
Invoke-Expression "& sqlcmd --% $args";
In this case, sqlcmd reconnects to the database in the container, runs a script.sql file, and then disconnects. It does not cache the connection the way that Invoke-Sqlcmd does.
So, because of the lack of connection pooling, the authentication was happening 328 times, once for each script.sql file: 4 seconds * 328 / 60 = ~21 minutes. That is where the source of the above issue was, not in any container networking issue.
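For comparison, a minimal sketch of the pooled alternative mentioned above, using Invoke-Sqlcmd from the SqlServer module with the same variables as in the snippet; the exact parameter set our scripts would need is an assumption:
Invoke-Sqlcmd -ServerInstance $server -Database $dbName -InputFile $fileName -Variable "DBName=$dbName","JobOwnerName=$jobOwnerName"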
I apologize for not being able to supply all of the relevant information initially. I hope this answer helps anyone who runs into a similar issue when using containers this way, given how long authentication with SQL Server takes in this configuration.


Localhost refused to connect on WSL2 when accessed via https://localhost:8000/ but works when using internal WSL IP address

What I'm Trying to Achieve
To access localhost from my local machine during the development of a Symfony web app.
My Environment
WSL2 running on Windows 10
Linux, Apache2, MySQL, PHP 7.4 stack (with Xdebug 3 installed)
Debian 10
Symfony 5.4 (although not sure if relevant to this problem)
Steps I've Taken
Set up WSL2 according to this Microsoft WSL2 tutorial
Set up LAMP stack according to this Digital Ocean tutorial
Set up Symfony according to this Symfony tutorial
Run the following bash script on startup to start my services and set the host to the virtual WSL IP in my xdebug.ini file
#!/bin/sh
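# In WSL2, the nameserver entry in /etc/resolv.conf is the Windows host's address
# on the virtual network, so this points Xdebug's client_host back at Windows.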
REMOTEIP=`cat /etc/resolv.conf | grep nameserver | sed 's/nameserver\s//'`
sed -i -E "s/client_host=[0-9\.]+/client_host=$REMOTEIP/g" /etc/php/7.4/mods-available/xdebug.ini
service php7.4-fpm start
service apache2 start
service mysql start
Run my Symfony project on the development server using symfony serve -d (Symfony then tells me "The Web server is using PHP FPM 7.4.23 https://127.0.0.1:8000")
Go to https://localhost:8000/ in Chrome where the app is running
What I Expect to Happen
My Symfony web app to be running on https://localhost:8000/ when I visit the URL in my Chrome browser
What Actually Happens
I get "This site can't be reached localhost refused to connect." in the Chrome browser
What I've Tried
This used to happen less frequently; I would restart my laptop, repeat the process above, and could then connect via https://localhost:8000/. However, it now refuses to connect much more regularly (about 8 out of 10 times I start up for the day).
Connecting to https://127.0.0.1:8000 yields the same result.
Connecting to the site using the internal WSL IP address, found using hostname -I and replacing localhost with this IP (still on port 8000). This is an adequate workaround for using my app, but I am unable to interact with my database via MySQL Workbench without setting up a new connection, so a fix that lets me use localhost would be very helpful!
(Based on comments) Only ran symfony serve -d without starting the Apache and PHP services separately; this still sometimes allows connections to localhost but sometimes does not.
Conclusion
The behaviour is odd as it works sometimes but other times it doesn't when the exact same steps are carried out. I am unsure where else to look for answers and I can't seem to find anything online with this same problem. Please let me know if any config files, etc would be helpful. Thank you so much for your help! :)
When it's working normally, as you are clearly aware, the "localhost forwarding" feature of WSL2 means that you can access services running inside WSL2 using the "localhost" address of the Windows host.
Sometimes, however, that feature breaks down. This is known to happen when you either:
Hibernate
Have the Windows "Fast Startup" feature enabled (and it is the default). Fast Startup is a pseudo-hibernation which triggers the same problem.
Typically the best solution is to disable Hibernation and Fast Startup. However, if you do need these features, you can reset the WSL localhost feature by:
Exiting any WSL instances
Issuing wsl --shutdown
Restarting your instance
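From a Windows prompt that amounts to something like the following (the distribution name is an assumption):
wsl --shutdown
wsl -d Debian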
It's my experience that localhost forwarding will work after that. However, if it doesn't, thanks to #lwohlhart in the comments for mentioning that another thing to try is disabling IPv6 on WSL2, since (I believe) there's a possibility that the application is listening on IPv6 while the Windows->WSL2 localhost connection is being attempted over IPv4 (or vice versa).
You can disable IPv6 on WSL2 per this Github comment by creating or editing .wslconfig in your Windows user profile directory with the following:
[wsl2]
kernelCommandLine=ipv6.disable=1
A wsl --shutdown and restart will be necessary to complete the changes.
If you find that this works, it may be possible to solve the issue by making sure to either use the IPv4 (127.0.0.1) or IPv6 (::1) address specifically in place of localhost on the Windows side, or by configuring the service to listen on both addresses.
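One quick way to check which address family is actually involved is with curl.exe from the Windows side (shipped with recent Windows 10 builds; -k skips the local development certificate check):
curl.exe -4 -k https://127.0.0.1:8000/
curl.exe -6 -k "https://[::1]:8000/"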
Try running netstat -nltp; it shows the listening addresses and ports. Your nginx process should be listening on 0.0.0.0:8000; 0.0.0.0 means the process is reachable from any address.
If your nginx process is bound to a specific IP address instead, you should access it via that IP address, e.g. http://192.168.4.2:8000.

osx docker max connections limit

I installed Docker CE (ver 17.03.1-ce-mac12, 17661) on macOS Sierra (ver 10.12.5).
I created a container and ran a simple socket echo server, and then attempted to connect to the container's echo server from the host.
Initially, when the number of open sockets reached 370, connection failures occurred.
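A rough way to reproduce this kind of limit from the macOS host is a loop that holds many idle connections open (the published port 5000 is an assumption):
for i in $(seq 1 1000); do sleep 600 | nc localhost 5000 & done; wait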
I found the following via Google search.
https://github.com/docker/for-mac/issues/1009
To summarize, Docker for Mac has its own maximum number of connections.
I raised that maximum moderately, following that issue.
I then connected to the Docker for Mac host VM in the following way:
http://pnasrat.github.io/2016/04/27/inside-docker-for-os-x-ii
I changed the ulimit configuration of the Docker host as well, and changed the macOS and container settings accordingly.
I tried again; this time the number of sockets exceeded the 370 limit mentioned above, but connections were blocked again at about 930-940.
I keep trying to change settings like this, but it does not get better.
Note that Docker running on top of an Ubuntu server does not need any settings changed and works well without any socket restrictions.
An echo server running inside a container of Docker on Ubuntu can maintain at least 4000 sockets.
The problem only occurs with Docker for Mac.
If you are aware of this situation, can anyone suggest a solution?
Thank you.

Unable to access MongoDB within a container within a Docker Machine instance from Windows

I am running Windows 7 on my desktop at work and I am signed in to a regular user account on the VPN. To develop software, we are normally supposed to open a Dev VM and work from in there; however, recently I've been assigned a task to research Docker and MongoDB. I have very limited access to what I can install on the main machine.
Here lies my problem:
Is it possible for me to connect to a MongoDB instance inside a container inside the docker machine from Windows and make changes? I would ideally like to use a GUI tool such as Mongo Management Studio to make changes to a Mongo database within a container.
By inspecting the Mongo container, it has the ports listed as: 0.0.0.0:32768 -> 27017/tcp
and docker-machine ip (vm name) returns 192.168.99.111.
I have also commented out the 127.0.0.1 bind IP in the mongod.conf file.
From what I have researched so far, most users resolve their problem by connecting to their docker-machine IP with the port they've set with -p or been given with -P. Unfortunately for me, trying to connect with 192.168.99.111:32768 does not work.
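For reference, the checks involved look something like this from the Windows side (the machine name default and the container name mongo1 are assumptions; the addresses are the ones listed above):
docker-machine ip default
docker port mongo1 27017
mongo 192.168.99.111:32768/test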
I am pretty stumped and quite new to this environment. I am able to get inside the container with bash and manipulate the database there; however, I'm wondering whether I can do this from Windows.
Thank you if anyone can help.
After reading Smutje's advice to ping the VM IP and testing it out to no avail, I attempted to find a pingable IP which would hopefully move me closer to my goal.
By doing "ifconfig" within the Boot2Docker VM (but not inside the container), I was able to locate another IP listed under eth0. This IP looks something like 134.36.xxx.xxx to me and is pingable. With the Mongo container running I can now access the database from within Mongo Management Studio by connecting to 134.36.xxx.xxx:32768 and manipulate the data from there.
If you have the option of choosing the operating system for your dev VM, go with Ubuntu and set up Docker with all of the containers you want to test on that. Either way, you will need a VM for testing Docker on Windows, since it uses VirtualBox if I'm not mistaken. Instead, set up an Ubuntu VM and do all of your testing on that.

Remote Postgresql - extremely slow

I have set up PostgreSQL on a VPS I own - the software that accesses the database is a program called PokerTracker.
PokerTracker logs all your hands and statistics whilst playing online poker.
I wanted this accessible from several different computers, so I decided to install it on my VPS, and after a few hiccups I managed to get it connecting without errors.
However, the performance is dreadful. I have done tons of research on 'remote postgresql slow' etc. and have yet to find an answer, so I am hoping someone is able to help.
Things to note:
The query I am trying to execute is very small. Whilst connecting locally on the VPS, the query runs instantly.
While running it remotely, it takes about 1 minute and 30 seconds to run the query.
The VPS is on a 100 Mbps connection, and the computer I'm connecting from is on an 8 Mb line.
The network communication between the two is almost instant; I am able to connect remotely with no lag whatsoever, and I host several websites running MSSQL whose queries all run instantly whether connected remotely or locally, so it seems specific to PostgreSQL.
I'm running their newest version of the software and the newest compatible version of PostgreSQL with their software.
The database is new and contains hardly any data, and I've run VACUUM/ANALYZE etc., all to no avail; I see no improvements.
I don't understand how MSSQL can query almost instantly yet PostgreSQL struggles so much.
I am able to telnet to the port 5432 on the VPS IP with no problems, and as I say the query does execute it just takes an extremely long time.
What I do notice on the router while the query is running is that hardly any bandwidth is being used - but then again I wouldn't expect much for a simple query, so I'm not sure whether this is the issue. I've tried connecting remotely on 3 different networks now (including different routers) but the problem remains.
Connecting remotely via another machine via the LAN is instant.
I have also edited the postgresql.conf file to allow for more memory/buffers etc., but I don't think this is the problem - what I am asking it to do is very simple and shouldn't be intensive at all.
Thanks,
Ricky
Edit: Please note the client and server are both running Windows.
Here is information from the config files.
pg_hba.conf - currently allowing all traffic:
# TYPE DATABASE USER CIDR-ADDRESS METHOD
# IPv4 local connections:
host all all 0.0.0.0/0 md5
# IPv6 local connections:
# host all all ::1/128 md5
And postgresql.conf - I'm aware I've given a mammoth amount of buffers/memory in this config, just to test whether it was the issue - showing uncommented lines only:
listen_addresses = '*'
port = 5432
max_connections = 100
shared_buffers = 512MB
work_mem = 64MB
max_fsm_pages = 204800
shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll'
log_destination = 'stderr'
logging_collector = on
log_line_prefix = '%t '
datestyle = 'iso, mdy'
lc_messages = 'English_United States.1252'
lc_monetary = 'English_United States.1252'
lc_numeric = 'English_United States.1252'
lc_time = 'English_United States.1252'
default_text_search_config = 'pg_catalog.english'
Any other information required, please let me know. Thanks for all your help.
I enabled logging and sent the logs to the developers of the software. Their answer was that their software was originally intended to run against a local or near-local database, so running against a VPS would be expectedly slow due to network latency.
Thanks for all your help, but it looks like I'm out of ideas and it's due to the software, rather than PostgreSQL on the VPS specifically.
Thanks,
Ricky
You can run EXPLAIN ANALYZE, which will tell you the execution time of the query on the server (without the network overhead of sending the result to the client).
If the server execution time is very quick compared to the time you are seeing, then this is a network problem. If the reported time is very similar to what you observe on your side, it's a PostgreSQL problem (and then you need to post the execution plan and possibly your PostgreSQL configuration).
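A sketch of that check from psql (the query shown is only a placeholder for the real slow query):
EXPLAIN ANALYZE SELECT count(*) FROM pg_stat_activity;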
I was plagued by this issue for a while, and this question led me to the answer, so I thought I would share in case it helps.
The server had a secondary network interface (eth1) that was set up as the default route. The client performing the queries was within the same subnet as eth0, so this should not have caused any issues, but it did.
Disabling the default route made the queries return within normal time frames, but the long-term fix was to change listen_addresses from '*' to the correct IP.
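In postgresql.conf that long-term fix looks something like this (the address is an example for the eth0 interface):
listen_addresses = '192.0.2.10'		# instead of '*'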
Use network monitoring tools (I recommend Wireshark, because it can trace many protocols, including PostgreSQL's) to see whether the network connection is OK. You will see dropped/retransmitted packets if the connection is bad.
Maybe Postgres is trying to authenticate you using ident, which isn't working (for example, it is firewalled out), and has to wait for a timeout before allowing the connection by other means.
Try querying the remote server for select version() using psql - this should be instant, as it does not touch the disk.
If it isn't instant, please post your pg_hba.conf (uncommented lines).
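A sketch of that quick test (the host and user are examples):
psql -h 203.0.113.5 -U postgres -c "SELECT version();"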
Other possible causes:
authentication using reverse DNS;
antivirus on the server or client;
some other connection is blocking a table or row because it didn't end cleanly.
This is not the answer to why pg access is slow over the VPN, but a possible solution/alternative could be setting up TeamPostgreSQL to access PG through a browser. It is an AJAX webapp that includes some very convenient features for navigating your data as well as managing the database.
This would also avoid dropped connections which in my experience is common when working with pg over a VPN.
There is also phpPgAdmin for web access but I mention TeamPostgreSQL because it can be very helpful for navigating and getting an overview over the data in the database.

What could be wrong: ping works fine but tnsping works intermittently

We have Oracle 10g running on Windows Server 2003. A machine that runs an application using that database suddenly started having connectivity problems a few weeks ago. Today we ran the automatic updates for Windows Server and the problem has only gotten worse. I realize this isn't enough information for anyone to diagnose the problem, but perhaps you can point me in the right direction with the following more specific scenario:
From this machine we can ping the server with absolutely no problem and, being physically close and on an intranet the return is very fast.
However, when we run tnsping I have seen three different results within a few minutes of each other:
tnsping returns just fine and in a reasonable amount of time
tnsping returns, but only after a really long time (several seconds)
tnsping results in an ORA-12560 protocol adapter error
At the same time I can tnsping the server from my machine with no problem.
Can anyone point me in the right direction?
I'd try to check the following:
do a traceroute from the app server and from your machine and check for anything abnormal
check tnsping from various other machines and try to identify a pattern
try a TCP/IP sniffer to see what is going on at both ends of the connection
get Oracle Support involved
To help eliminate DNS issues from the equation, specify the host's IP address in the TNSNAMES.ora file for your connection instead of a hostname. Are you using DHCP?
Have you eliminated hardware as the problem - have you tried a different NIC?
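For the TNSNAMES.ora suggestion above, an entry using a literal IP address looks something like this (the alias, address, and service name are examples):
MYDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.0.2.50)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = mydb))
  )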
Before calling Oracle, I would create a trace file for a Fail case.
TNSPING.TRACE_LEVEL
Purpose
Use the parameter TNSPING.TRACE_LEVEL to turn TNSPING utility tracing on, at a specific level, or off.
Default
off
Values
* off: for no trace output
* user: for user trace information
* admin: for administration trace information
* support: for Oracle Support Services trace information
Example
TNSPING.TRACE_LEVEL=admin
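These parameters go in the client's sqlnet.ora; a minimal sketch, where the trace directory is an example and TNSPING.TRACE_DIRECTORY names where the trace file is written:
TNSPING.TRACE_LEVEL = admin
TNSPING.TRACE_DIRECTORY = C:\oracle\network\trace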
Before involving Oracle in this issue, get some help from your network administrator for the following test. First enable verbose logging on the listener on the database server. Enable logging on the client via sqlnet. Go to the machine that is having trouble with tnsping and have the network administrator run a network tool to trace TCP packets from there. Perform the tnsping and see what packets are being sent, what DNS lookups are being made, and what route is being taken. On the database side, see if the listener actually receives a ping from the client. If not, see where along the network path to the database the problem lies. Is it name server resolution? Is it a bad network cable, a bad switch port, etc.? Your network admin is your best friend for this problem. Do the same test via SQL*Plus with a simple connection and see what the client is logging.
Make sure there is no other machine on the network with the same IP address. One method is to unplug the machine from the network and see if you can still ping its address; if you can, then this is the problem.
If the server doesn't have a domain name set up at a DNS server, then add its IP address and name to the hosts file on the server; this (the server not being able to find itself in DNS) has been known to cause TNS timeouts.
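A sketch of such a hosts file entry (%SystemRoot%\system32\drivers\etc\hosts on Windows Server; the address and names are examples):
192.0.2.25    dbserver01    dbserver01.example.com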
