Max numbers of connections / threads on my TCP / IP server? - windows

I am curious whether my server would work better on Linux or Windows. From what I have read, Windows only supports around 2,000 connections/threads, while I have not seen much information about how many threads/connections Linux can handle.
Are there any advantages to using Linux over Windows, other than stability/security, for my TCP/IP server?
Thanks.

Threads and sockets are different resources. The limits for each depend not just on Linux vs. Windows but also on which version of each OS you are using. Also, if you're using a class library instead of raw socket or thread APIs, the library might impose its own limit. For example, early versions of CSocket in MFC created a hidden window for each socket, so you were effectively limited by the number of GDI resources on the system.
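To make the "threads and sockets are different resources" point concrete, here is a minimal sketch (Python, standard library only) in which a single thread services many connections through a readiness selector. The names `serve_echo` and `demo` are made up for illustration, and the selector backend the OS picks has limits of its own, so treat this as a sketch and not a capacity benchmark.

```python
import selectors
import socket
import threading


def serve_echo(listener, stop):
    """Echo bytes back on every connection, using one thread for all sockets."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    try:
        while not stop.is_set():
            for key, _events in sel.select(timeout=0.1):
                sock = key.fileobj
                if sock is listener:
                    conn, _addr = listener.accept()
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ)
                    continue
                try:
                    data = sock.recv(4096)
                except BlockingIOError:
                    continue  # spurious wakeup, try again later
                except ConnectionError:
                    data = b""
                if data:
                    # Acceptable for tiny messages; a real server would
                    # buffer and wait for EVENT_WRITE instead.
                    sock.sendall(data)
                else:
                    sel.unregister(sock)
                    sock.close()
    finally:
        socks = [key.fileobj for key in sel.get_map().values()]
        sel.close()
        for s in socks:
            s.close()


def demo(n_clients=50):
    """Open n_clients connections at once; return how many were echoed correctly."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(128)
    port = listener.getsockname()[1]
    stop = threading.Event()
    worker = threading.Thread(target=serve_echo, args=(listener, stop), daemon=True)
    worker.start()
    clients = [socket.create_connection(("127.0.0.1", port), timeout=5)
               for _ in range(n_clients)]
    ok = 0
    for i, c in enumerate(clients):
        msg = ("hello %d" % i).encode()
        c.sendall(msg)
        if c.recv(4096) == msg:
            ok += 1
    for c in clients:
        c.close()
    stop.set()
    worker.join()
    return ok


if __name__ == "__main__":
    print(demo(50), "connections served by a single server thread")
```

With this design the connection count is bounded by socket and file-descriptor limits, not by how many threads the OS will let you create.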

Either platform will be fine, and most apps will never get big enough to need more than a single server to run them anyway. Get your project done in whichever way is easier for you.

I would imagine that the primary concern when building a high-scale application is the experience of the engineers on your team, including operations engineers. By all means consider performance when selecting a platform, but the experience and preference of your development and operations engineers is probably more important - after all, they will need to maintain and operate the service respectively.
In any case, if you have a real need for a service with 2000 concurrent clients, it probably has some high availability requirement which means it can't be run on a single server anyway.

Related

ruby scrape with multiple ip addresses

I would like to know if it is possible for a Ruby program to possess multiple IP addresses? I am trying to download a lot of data from a site, but it is very slow with only 1 connection at a time.
I intend to multi-thread my program with each thread using its own IP address, but I do not know if it is possible in the first place, any help or hints would be greatly appreciated.
It is definitely possible for a machine or a program to have multiple IP addresses. You can even have multiple network adapters, and tie each of them to different physical connections.
However, it can get really hairy to maintain. The challenge for that is partly in the code, partly in the system maintenance, and partly in the networking required to make that happen.
A better approach that you can take is to design your program so that it can run distributed. As such, you can have several copies of it synchronized and doing the work in parallel. You can then scale it horizontally (build more copies) as required, and over different machines and connections if required.
EDIT: You mentioned that you cannot scale horizontally, and that you prefer to use multiple connections from the same machine.
It's very likely that for this you'll have to go a little bit lower in the network stack, developing yourself the connection through sockets in order to use specific network interfaces.
Check out an introduction to Ruby sockets.
Also, check out these related questions:
How does a socket know which network interface controller to use?
Binding to networking interfaces in ruby
Ruby: Binding a listening socket to a specific interface
Can I make ruby send network traffic over a specific iface?
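As a rough illustration of binding an outgoing connection to a specific local address, here is a minimal sketch. It is written in Python, not Ruby, purely to show the idea; Ruby's socket library exposes the same bind-before-connect pattern. The helper name `connect_from` is made up, and it assumes the chosen IP is already configured on one of the machine's interfaces.

```python
import socket


def connect_from(local_ip, remote, timeout=5.0):
    """Open a TCP connection to `remote` whose source address is `local_ip`.

    `remote` is a (host, port) tuple. Source port 0 lets the OS pick an
    ephemeral port on the chosen local address.
    """
    return socket.create_connection(
        remote, timeout=timeout, source_address=(local_ip, 0)
    )


if __name__ == "__main__":
    # Loopback demo so the sketch runs anywhere; in practice local_ip would
    # be one of the addresses assigned to your network adapters.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    conn = connect_from("127.0.0.1", server.getsockname())
    print("connected from", conn.getsockname())
    conn.close()
    server.close()
```

Each worker thread could call such a helper with a different local address. Whether traffic actually leaves through different physical links still depends on the routing configuration of the machine.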

Communication between Windows Store app and native desktop application

! For the sake of simplifying things I will refer to Windows Store applications (also known as Metro or Modern UI) as "app" and to common desktop applications as "application" !
I believe this is still one of the most unclear yet important questions concerning app-development for developers who already have established applications on the market:
How to manage communication between apps and applications on a Windows 8 system? (please let's not start a debate on principles - there're so many use cases where this is really required!)
I have read hundreds of articles in the last few days, but it still remains unclear how to get this right the first time, mainly because I found a lot of conflicting information.
With my question here I'd like to re-approach this problem from the viewpoint of the final Windows 8 possibilities.
Given situation:
App and application run on same system
1:1 communication
Application is native (written in Delphi)
Administrator or if required even system privileges are available for the application
In 90% of the use cases the app requests an action to be performed by the application and receives some textual result. The app shouldn't be left nor frozen for this!
In 10% the application performs an action (triggered by some event) and informs the app - the result might be: showing certain info on the tile or in the already running and active app or if possible running the app / bringing it to the foreground.
Now the "simple" question is, how to achieve this?
Is local webserver access actually allowed now? (I believe it wasn't for a long time but now is since the final release)
WCF? (-> apparently MS doesn't recommend that anymore)
HTTP requests on a local REST/SOAP server?
WinRT syndication API? (another form of webservice access with RSS/atom responses)
WebSockets (like MessageWebSocket)?
Some other form of TCP/IP communication?
Sharing a text file for in- and output (actually simply thinking of this hurts, but at least that's a possibility MS can't block...)
Named Pipes are not allowed, right?
There are some discussions on this topic here on SO, however most of them are not up-to-date anymore as MS changed a lot before releasing the final version of Windows 8. Instead of mixing up old and new information I'd like to find a definite and current answer to this problem for me and for all the other Windows application and app developers. Thank you!
If you are talking about an application going into the Store, communication with the local system via any mechanism is not allowed. Communication with the local system is supported in some debug scenarios to make app development easier.
You can launch desktop applications from Windows Store applications with file or protocol handlers, but there is no direct communication.
So, to reiterate the point... communication between WinRT and the desktop is not allowed for released Windows Store applications. Communication between the two environments is allowed in debug only.
The PG has posted in different places reasons for why communication is not allowed, ranging from security, to the WinRT lifecycle (i.e., your app gets suspended - how does that get handled re: resources, sockets, remote app, etc. -- lots of failure points) and the fact that Store apps cannot have a dependency on external programs (i.e., I need your local desktop app/service for the app to run, but how do I get your app/service installed? You cannot integrate into the Store app. You can provide another Store desktop app entry, but that is a bad user experience.) Those are high level summaries, of course.

multi-client inter-process communication on Windows, VB6

What is the best way for multiple client programs to
communicate with a single server program, all running
on a single Windows computer? All written in VB6.
I'd appreciate recommendations of how you might solve
this problem.
NOTE: we are working on transition to .NET, but have to
add a capability to the VB6 version before the .NET will
be ready.
The possibilities include TCP connections, named pipes,
shared memory, messages, files, and more.
A client passes the server a string as input, and the server
combines it with data known only to the server, to generate
another string which is returned to the client. Both strings
are only about 100 characters long. The server is contacted
only when a new file needs to be opened, and so it is a very
low volume of communication... probably a flurry of 10 calls
within 15 seconds, followed by an hour of idle time.
But it is possible that two clients would choose about the
same time to request information. Blocking/Locking are certainly
acceptable, as the server will be done with each request in
well under a second, and several seconds of delay is unimportant
to any of the programs.
The server's algorithm is complex, and for several reasons important
to the application should not be replicated in each helper program.
That is the reason for needing a server.
Background:
I am adding capability to a large existing legacy program.
This single program has several other legacy programs which
act as helpers and are run when the user makes certain
choices. These programs are started with a shell command,
and are not just separate threads. For instance, one helper
loads new data from a DVD drive onto the hard drive. Another
helper just displays a chart of the current positions of
the planets.
This is a LARGE commercial legacy program that happens to be
written in VB6. We are working to convert it and all the
helper programs to .NET, but must first release a new version
under vb6 with this added capability. (Please don't tell me
to not use VB6, as we are already moving elsewhere.)
We need a temporary VB6 solution.
VB6 does TCP and UDP extremely well via the standard Winsock Control component included in Pro and Enterprise Editions. A lot of shadetree coders do seem to struggle with it though. This is probably the most obvious route since the only other native IPC in VB6 would be COM/DCOM and DDE, however MSMQ provided excellent support for VB6 as well.
The downside of IP-based protocols is their limited namespace and resulting high probability of collisions (64K port numbers, many set aside for standard applications, ephemeral port ranges, etc.). They're also somewhat "heavyweight" but considering the vast resources of even the oldest PCs still in service and your light traffic requirements you can ignore that in deciding.
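To show the shape of the TCP option, here is a minimal sketch of a loopback request/response server. It is written in Python purely for illustration (the VB6 version would use the Winsock Control), and `SECRET`, `LineHandler`, `start_server` and `call` are hypothetical names. The server handles one request at a time, which matches the "blocking is acceptable" requirement in the question.

```python
import socket
import socketserver
import threading

SECRET = "server-only-data"  # hypothetical stand-in for the server's private data


class LineHandler(socketserver.StreamRequestHandler):
    """Read one newline-terminated request, send one newline-terminated reply."""

    def handle(self):
        request = self.rfile.readline().decode("utf-8").rstrip("\r\n")
        reply = request + "|" + SECRET  # placeholder for the complex algorithm
        self.wfile.write(reply.encode("utf-8") + b"\n")


def start_server():
    # TCPServer (non-threading) serves requests strictly one after another,
    # so two clients calling at the same moment are simply queued.
    # Binding to 127.0.0.1 keeps the service off the network; port 0 lets
    # the OS pick a free port, which sidesteps fixed-port collisions.
    server = socketserver.TCPServer(("127.0.0.1", 0), LineHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call(port, text):
    """Client side: send one request string and return the reply string."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as s:
        s.sendall(text.encode("utf-8") + b"\n")
        with s.makefile("rb") as f:
            return f.readline().decode("utf-8").rstrip("\r\n")


if __name__ == "__main__":
    server = start_server()
    print(call(server.server_address[1], "open file A"))
    server.shutdown()
    server.server_close()
```

If the OS picks the port, the clients need some way to learn it (a command-line argument when the helper is shelled, for example); that part is an assumption left out of the sketch.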
Another option you've considered is Named Pipes.
This offers a number of advantages in your situation. For one thing the namespace is much larger requiring only a unique name, which in the post-Win9x era can be up to 256 characters long making uniqueness fairly easy to achieve. For another, as long as your firewalls permit "File and Print Sharing" you're all set on that front.
Also, for your application you only seem to require an RPC-style mechanism rather than arbitrary bidirectional streams or messages. TransactNamedPipe() calls in your clients might be ideal. Named Pipes work over a LAN, but within one PC they are quite fast and light weight.
While VB6 doesn't come with a Named Pipe component such a thing is fairly easy to create as long as extremely high performance isn't required. You can use Timer-based polling in the server instead of trying to implement overlapped I/O to get asynchronicity. I put one together a couple of years ago and have had good luck with this approach.
I published a fairly stable rendition of this a while back at PipeRPC - RPC Over Named Pipes. There is an older and a somewhat newer version there with examples of use and documentation. As designed, clients make "calls" passing a Byte array of request parameters and receiving back a Byte array of response results. You can also shove Unicode Strings through with no changes, letting the compiler coerce the types.
Just one "drop in" UserControl for both clients and servers.
Looking back at this question:
The server's algorithm is complex, and for several reasons important
to the application should not be replicated in each helper program.
That is the reason for needing a server.
If that's really the concern why not just create a shared DLL that all programs use?
For a one-off upgrade release to an existing VB6 application being moved to a newer platform, I would stress keeping the modification as simple and straightforward as possible. As a result, I wouldn't go down any routes involving shared memory or anything relatively unusual.
A few options, none perfectly simple, but at least some ideas:
Expose a COM object in the server code that performs the translation, and can be consumed by the client apps. The clients instantiate the object from the server as an out-of-process object, and let COM handle all the marshalling, etc.
Does the server have any network awareness? VB6 doesn't do sockets/tcp natively very well, but if you've had a reason to add that in, you might be able to leverage it to perform a socket-based connection and data exchange.
The server and client could each poll a common resource folder for the presence of a specific file that constituted inbound/outbound requests for the translation service you describe. Not very elegant, but it might be the simplest.
Just a few ideas to give you some things to think about. Hope that's helpful in some way. Good luck!

Scaling Tigase XMPP server on Amazon EC2

Does anyone have any experience running clustered Tigase XMPP servers on Amazon's EC2? Primarily I wish to know about anything non-obvious that might trip me up. (For example, apparently running Ejabberd on EC2 can cause issues due to Mnesia.)
Or if you have any general advice on installing and running Tigase on Ubuntu.
Extra information:
The system I’m developing uses XMPP just to communicate (in near real-time) between a mobile app and the server(s).
The number of users will initially be small, but hopefully will grow. This is why the system needs to be scalable. Presumably for a just a few thousand users you wouldn’t need a cc1.4xlarge EC2 instance? (Otherwise this is going to be very expensive to run!)
I plan on using a MySQL database hosted in Amazon RDS for the XMPP server database.
I also plan on creating an external XMPP component written in Python, using SleekXMPP. It will be this external component that does all the ‘work’ of the server, as the application I’m making is quite different from instant messaging. For this part I have not worked out how to connect an external XMPP component written in Python to a Tigase server. The documentation seems to suggest that components are written specifically for Tigase - and not for a general XMPP server, using XEP-0114: Jabber Component Protocol, as I expected.
With this extra information, if you can think of anything else I should know about I’d be glad to know.
Thank you :)
I have lots of experience with this, and I think there are a load of non-obvious problems. For example, the only reliable instance type for running an application like Tigase is cc1.4xlarge. Others cause problems with CPU availability, and it is a lottery whether you are lucky enough to run your service on a server which is not busy with other people's work.
Also you need an instance with the highest possible I/O to make sure it can cope with network traffic. The high I/O applies especially to database instance.
Not sure if this is obvious or not, but there is this problem with hostnames on EC2, every time you start instance the hostname changes and IP address changes. Tigase cluster is quite sensitive to hostnames. There is a way to force/change the hostname for the instance, so this might be a way around the problem.
Of course I am talking about a cluster for millions of online users and really high traffic, 100k XMPP packets per second or more. Generally, for a large installation it is way cheaper and more efficient to have dedicated servers.
Generally Tigase runs very well on Amazon EC2 but you really need the latest SVN code as it has lots of optimizations added especially after tests on the cloud. If you provide some more details about your service I may have some more suggestions.
More comments:
When it comes to costs, a dedicated server is always the cheaper option for a constantly running service. Unless you plan to switch servers on and off on an hourly basis, I would recommend going for a dedicated service. Costs are lower and performance is way more predictable.
However, if you really want/need to stick to Amazon EC2 let me give you some concrete numbers, below is a list of instances and how many online users the cluster was able to reliably handle:
5*cc1.4xlarge - 1.7 million online users
1*c1.xlarge - 118k online users
2*c1.xlarge - 127k online users
2*m2.4xlarge (with 5GB RAM for Tigase) - 236k online users
2*m2.4xlarge (with 20GB RAM for Tigase) - 315k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 400k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 312k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 327k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 280k online users
A few more comments:
Why does the amount of memory matter that much? Because CPU power is very unreliable and inconsistent on all but cc1.4xlarge instances. You have 8 virtual CPUs, but if you look at the top command you often see that one CPU is working and the rest are not. This insufficient CPU power causes Tigase's internal queues to grow. When the CPU power comes back, Tigase can process the waiting packets. The more memory Tigase has, the more packets can be queued, and the better it copes with CPU shortfalls.
Why is 5*m2.4xlarge listed 4 times? Because I repeated the tests many times, on different days and at different times of day. As you can see, depending on the time and date the system could handle a different load. I guess this is because the Tigase instance shared CPU power with other services. If they were busy, Tigase suffered from a lack of CPU power.
That said I think with installation of up to 10k online users you should be fine. However, other factors like roster size greatly matter as they affect traffic, and load. Also if you have other elements which generate a significant traffic this will put load on your system.
In any case, without some tests it is impossible to tell how your system really behaves or whether it can handle the load.
And the last question regarding component:
Of course Tigase does support XEP-0114 and XEP-0225 for connecting external components, so components written in other languages should not be a problem. On the other hand, I recommend using Tigase's API for writing components. They can be deployed either as internal Tigase components or as external components, and this is transparent to the developer; you do not have to worry about it at development time. This is part of the API and framework.
Also, you can use all the goods from Tigase framework, scripting capabilities, monitoring, statistics, much easier development as you can easily deploy your code as internal component for tests.
You really do not have to worry about any XMPP specific stuff, you just fill body of processPacket(...) method and that's it.
There should be enough online documentation for all of this on the Tigase website.
Also, I would suggest reading about Python support for multi-threading and how it behaves under a very high load. It used to be not so great.

bandwidth and traffic simulator for web apps?

Can you suggest how to create a test environment to simulate various types of bandwidths and traffic in a web app?
Or maybe an open source program which does this against localhost?
I think this is a very important subject when programming web apps, but it is not a common topic. The only way I can imagine to create this kind of environment is to use some kind of proxy in a local network, but before I start looking into the Squid documentation I would like to hear your suggestions.
If you're using Apache, you may want to take a look at Apache ab.
There are two approaches to shaping network traffic to simulate a network link:
(1) Run some software on the client or server that sits somewhere in the networking stack and shapes the traffic between the app and the network interface.
(2) Run the traffic shaping software on a dedicated machine with 2 network interfaces through which your traffic is routed.
Approach (2) is a better solution if you don't want to install software on the client or server (and possibly impact performance), but requires more hardware fiddling.
Some other features you might want to think about are what shaping parameters can be simulated. Most do delay and packet loss, some do jitter and bandwidth limiting as well. Some solutions can selectively filter traffic (for instance by port number, TCP or UDP etc).
Here is a list of some of the systems I've found:
Open Source or Freeware
DummyNet is an open source BSD Unix-based system for dedicated devices. It is not clear if the software is being actively maintained.
NistNet is an open source Linux-based system for dedicated devices. The software has not been actively maintained for several years.
Commercial
Apposite Technologies sell dedicated hardware solutions for simulating WAN links, with a Web based GUI for configuring the settings and collecting traffic measurements
East Coast DataCom sell hardware dedicated simulators for simulating routers and modems
Itrinegy offer both dedicated device solutions, and solutions for running on clients or servers.
Network FX offer several dedicated device products for simulating network impairments between the client & server
NetLimiter is a client side system that allows throttling of individual applications, and includes a firewall.
Shunra Software offer a range of products, from high end enterprise WAN simulation and testing, to a simple client-resident emulator.
The closest I can think of is doing something similar with VE Desktop from Shunra.
Simulating High Latency and Low Bandwidth in Testing of Database Applications
Shunra VE Desktop Standard is a Windows-based client software solution that simulates a wide area network link so that you can test applications under a variety of current and potential network conditions – directly from your desktop.
I wrote a php script awhile back which used CURL to run a sequence of page requests against my server which represented a typical use scenario. I had it output the times that it took for the server to respond to each of the requests. I then had another script which spawned a bunch of these test case scripts simultaneously for a sustained period and correlated the results into a file which I could then look at in a spreadsheet to see average times. This way I could simulate the number of users hitting the site that I wanted. The limitations are that you need to run the test script on a different server to the web server and that the client machine can become too loaded to give meaningful results past a certain point. I've since left the job otherwise I would paste the scripts here.
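Since the original scripts are not available, here is a minimal sketch of the same idea in Python using only the standard library. It is not the author's PHP/cURL code; `run_scenario` and `simulate_users` are made-up names, and the base URL and paths in the demo are placeholders for your own server and use scenario.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Bypass any configured proxy so the timings reflect the server itself.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def run_scenario(base_url, paths):
    """Request each path in order; return the elapsed seconds per request."""
    timings = []
    for path in paths:
        start = time.perf_counter()
        with _opener.open(base_url + path, timeout=10) as resp:
            resp.read()  # include the body download in the measurement
        timings.append(time.perf_counter() - start)
    return timings


def simulate_users(base_url, paths, users):
    """Run the scenario for `users` simulated users at once; return all timings."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = pool.map(lambda _: run_scenario(base_url, paths), range(users))
        return [t for timings in results for t in timings]


if __name__ == "__main__":
    # Placeholder target: point this at the server under test.
    timings = simulate_users("http://127.0.0.1:8000", ["/", "/login", "/report"], 10)
    print("requests:", len(timings))
    print("mean seconds:", statistics.mean(timings))
    print("max seconds:", max(timings))
```

The same caveats apply as in the answer above: run it from a different machine than the web server, and watch the client's own load, since a saturated client distorts the timings.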
If you are running a Linux box as your server, Linux box as your client, or have the capability to put (perhaps a VM) a Linux router between your client and server, you can use NetEm.
NetEm is a Linux TC (Traffic Control) discipline which can delay (i.e. add latency) packets leaving a host. Although it's tricky to set up clever rules (e.g. add latency to some traffic, not to others), it's easy to add a simple "delay everything leaving the interface by 50ms" type rules and some recipes are provided.
By sticking a Linux VM between your client and server, you can simulate as much latency as you like. And you can turn it on and off dynamically. Linux has other TC disciplines which can be combined with NetEm to restrict bandwidth (but the script to set this up can be somewhat complicated). NetEm can also randomly drop packets.
I use it and it works a treat :)
Web Application Stress Tool (WAST) from Microsoft is what you need.
http://www.microsoft.com/downloads/details.aspx?familyid=e2c0585a-062a-439e-a67d-75a89aa36495&displaylang=en
I haven't used it for years (lack of need, not because I'd found anything else), but xat webspeed would be the first thing I would point toward
As other people have mentioned, Apache's ab (comes with Apache, so you probably have it already) is good.
Other good options are:
HP's LoadRunner
Apache Jakarta's JMeter
Tsung (if you want to get your erlang on)
I personally like ab and JMeter the best.
We use LoadRunner to do bandwidth and traffic simulation in our app. LoadRunner can start agents on various machines, and you can simulate one machine as running on a dialup modem vs. another on DSL vs. another on cable internet.
We also use Loadrunner to simulate various kinds of traffic conditions from 10 user run to 500 user run. We can also insert think times in the script and simulate a real user executing the http request. The best part is that it comes with a recording studio where it will plug in with Internet explorer and you can record the whole scenario/Usecase that can be as simple as hitting one page to a full blown 50-60 page script or more.
I found this little Java program that works great: sloppy.
It is not a professional solution, but it works for simple tests. I guess it uses Java streams and buffers to slow down the connection.
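For a feel of how such a slow-down proxy can work, here is a minimal Python sketch of a TCP relay that sleeps after each chunk to cap throughput. This is not how sloppy is implemented (the answer above is only guessing at that); `relay`, `handle` and `start_proxy` are made-up names, and the rate limiting is deliberately crude.

```python
import socket
import threading
import time


def relay(src, dst, bytes_per_sec, chunk=1024):
    """Copy src to dst, sleeping after each chunk to cap the throughput."""
    try:
        while True:
            data = src.recv(chunk)
            if not data:
                break
            dst.sendall(data)
            time.sleep(len(data) / bytes_per_sec)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream along
        except OSError:
            pass


def handle(client, target, bytes_per_sec):
    """Relay one client connection to the target in both directions."""
    upstream = socket.create_connection(target)
    up = threading.Thread(
        target=relay, args=(client, upstream, bytes_per_sec), daemon=True
    )
    up.start()
    relay(upstream, client, bytes_per_sec)
    up.join()
    client.close()
    upstream.close()


def start_proxy(target, bytes_per_sec):
    """Listen on a free loopback port and throttle traffic to `target`."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()

    def accept_loop():
        while True:
            try:
                client, _addr = listener.accept()
            except OSError:
                break  # listener was closed
            threading.Thread(
                target=handle, args=(client, target, bytes_per_sec), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener


if __name__ == "__main__":
    # Placeholder target: a web server assumed to listen on 127.0.0.1:8000.
    proxy = start_proxy(("127.0.0.1", 8000), bytes_per_sec=7000)
    print("point your browser at port", proxy.getsockname()[1])
    time.sleep(60)
```

Pointing the browser at the proxy port instead of the real server gives a rough modem-speed experience. It only limits bandwidth; it does not model packet loss or jitter.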
Have you looked at Tsung? It's a great utility for seeing if your website will scale in event of attack, I mean massive popularity. We use it for our web frontend, and our internal systems too.
If you're interested in performing your tests out of your browser, there is also a really great Firefox plug-in.
Do not forget about Wanulator (http://www.wanulator.de/).
The name Wanulator comes from "WAN" and "simulator". This pretty much describes what the software does: It simulates different Internet conditions such as delay or packet loss. Furthermore it simulates user access line speeds e.g. modem, ISDN or ADSL.
Wanulator is currently packaged as a Linux boot CD based on SLAX. This will give you a full out of the box experience. You can turn any PC into a test-system within a blink - just by booting the Wanulator CD. The package already includes useful client SW such as web-browser and network sniffer (Wireshark). Nevertheless if the PC has 2 network interfaces the system can run as an intermediate system between your server and your client - as a switch - without any configuration hassles.
