How to view server connections of a session or workflow

As an SQA, I need to verify that all connections associated with a set of workflows have been updated. How can I view all connections associated with a workflow? Is there a connection, or connections, assigned to individual workflows, or would I need to find the connections for each individual session? If so, how do I view the connection(s) of a session?

There may be plenty of connections used by a single session - that's very often the point of ETL and data integration. The connections used by a session (or sessions!) may be defined directly in that session. But a session can also use a parameter for the connection, which gets replaced with a value defined in a parameter file. A parameter file can be defined on the session itself, or it can be defined for the workflow. Or there can be some external utility that stores the configuration and executes the workflow with a generated parameter file that you will not find in the workflow definition at all.
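For illustration, a PowerCenter parameter file might assign connection values like this (the folder, workflow, session and connection names here are hypothetical):
[MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
$DBConnection_Source=DEV_ORACLE_SRC
$DBConnection_Target=DEV_ORACLE_TGT
At run time these values are substituted wherever the session references $DBConnection_Source or $DBConnection_Target, so the connection you see in the session properties may be just a placeholder name.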
There are some tools that would help you get all the connections for a session or all sessions in a workflow, but deriving the values for the parameters is more difficult. You need to know where they come from in your environment.
Please note that workflows and sessions hold the connection name only - where it actually connects depends on the connection definition.
In addition, the very same workflow and sessions can be executed with different connection values, e.g. to perform the same set of operations on a different set of sources and/or targets.

Related

Oracle - Get authenticated user

My MVC application connects to an Oracle database. We created a lot of triggers to save all data changed by users.
Inside the triggers, we used the code below to get the authenticated user:
UPPER(SYS_CONTEXT('USERENV', 'OS_USER'))
When I'm running my application on localhost, the database gets the correct user, but when I publish it to the server (IIS), the database always gets the application pool name as the user.
Is there some IIS configuration that I need to set to get the "Windows authentication" user? Is there another way to get this information inside an Oracle function/trigger?
You would realistically want to use a secure application context, which is basically a user-controlled context, unlike the system-controlled USERENV context. When the application code gets a connection from the pool, it would call a stored procedure that sets the application username in the new application context. Your triggers would then reference the new context rather than USERENV. Your application needs to ensure that the context is set appropriately every time a connection is acquired from the pool; if the application fails to set the context correctly, your triggers will get the wrong information.
If you don't want to create your own context, you could use the CLIENT_IDENTIFIER in USERENV, which you can set via dbms_session whenever you get a connection from the pool. Functionally, this is basically identical to creating your own context. The nice thing about creating your own context, though, is that you can seamlessly add attributes in the future as you identify the need (e.g. adding the IP address of the client browser, or a tier attribute if you have gold, silver, and bronze customers).
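As a minimal sketch of the CLIENT_IDENTIFIER approach (shown in Java/JDBC for illustration; the same pattern applies from .NET, and the pool type and the way you obtain the authenticated username are assumptions, while the dbms_session call itself is standard Oracle):

import java.sql.CallableStatement;
import java.sql.Connection;
import javax.sql.DataSource;

public class TaggedConnections {
    // Call this every time a connection is borrowed from the pool.
    public static Connection getTaggedConnection(DataSource pool, String appUser) throws Exception {
        Connection conn = pool.getConnection();
        // Stamp the database session with the real end user,
        // not the application pool's service account.
        try (CallableStatement cs = conn.prepareCall("{call dbms_session.set_identifier(?)}")) {
            cs.setString(1, appUser);
            cs.execute();
        }
        return conn;
    }
}

Your triggers would then read UPPER(SYS_CONTEXT('USERENV', 'CLIENT_IDENTIFIER')) instead of OS_USER.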
There are alternate ways to approach the problem, such as using proxy authentication. In general, though, that's not going to work as well with connection pools, particularly when you have very large numbers of users.

JDBC connection pool manager

We're in the process of rewriting a web application in Java, coming from PHP. I think, but I'm not really sure, that we might run into problems in regard to connection pooling. The application itself is multitenant, and is a combination of "Separate database" and "Separate schema".
For every Postgres database server instance, there can be more than one database (named schema_XXX), each holding more than one schema (where each schema is a tenant). On signup, one of two things can happen:
A new tenant schema is created in the highest numbered schema_XXX database.
The signup process sees that a database has been fully allocated and creates a new schema_XXX+1 database. In this new database, the tenant schema is created.
All tenants are known via a central registry (also a Postgres database). When a session is established the registry will resolve the host, database and schema of the tenant and a database session is established for that HTTP request.
Now, the problem I think I'm seeing here is twofold:
A JDBC connection pool is defined when the application starts. By that I mean that all databases (host+database) are known at startup. This conflicts with the signup process.
As I write this, we have ~20 database servers with ~1000 databases, for a total of ~100k (tenant) schemas. Given those numbers, I would need 20*1000 data sources for every instance of the application. I'm assuming that all pools are also, at one time or another, started. I'm not sure how many resources a pool allocates, but it must be a non-trivial amount for 20k pools.
So, is it feasible to even assume that a connection pool can be used for this?
For the first problem, I guess a pool with support for JMX could be used, and we would create a new data source when and if a new schema_XXX database is created. The larger issue is the huge number of pools. For this, I guess, some sort of pool manager should be used that can terminate a pool that has no open connections (and, on demand, also start a pool). I have not found anything that supports this.
What options do I have? Or should I just bite the bullet and fall back to an out of process connection pool such as PgBouncer and establish a plain JDBC connection per request, similar to how we're handling it now with PHP?
A few things:
A Connection pool need not be instantiated only at application start-up. You can create or destroy them whenever you want;
You obviously don't want to eagerly create one Connection pool per database or schema to be open at all times. You'd need to keep at least 20000 or 100000 Connections open if you did, a nonstarter even before you get to the non-Connection resources used by the DataSource;
If, as is likely, requests for Connections for a particular tenant tend to cluster, you might consider lazily, dynamically instantiating pools, and destroying them after some timeout if they've not handled a request for a while (see the sketch below).
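Here's a minimal sketch of such a lazy pool manager, assuming HikariCP; resolveJdbcUrl() is a hypothetical stand-in for a lookup against your central registry:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class LazyPoolManager {
    private static final long IDLE_EVICT_MS = 10 * 60 * 1000; // destroy pools idle > 10 min

    private final Map<String, PooledEntry> pools = new ConcurrentHashMap<>();
    private final ScheduledExecutorService reaper = Executors.newSingleThreadScheduledExecutor();

    public LazyPoolManager() {
        // Periodically close pools that haven't handed out a connection recently.
        reaper.scheduleAtFixedRate(this::evictIdlePools, 1, 1, TimeUnit.MINUTES);
    }

    public HikariDataSource poolFor(String databaseName) {
        PooledEntry entry = pools.computeIfAbsent(databaseName, db -> {
            HikariConfig cfg = new HikariConfig();
            cfg.setJdbcUrl(resolveJdbcUrl(db)); // hypothetical registry lookup
            cfg.setMaximumPoolSize(5);          // keep the per-pool footprint small
            return new PooledEntry(new HikariDataSource(cfg));
        });
        entry.lastUsed = System.currentTimeMillis();
        return entry.ds;
    }

    private void evictIdlePools() {
        // NB: a production version must guard the race between poolFor() and eviction.
        long cutoff = System.currentTimeMillis() - IDLE_EVICT_MS;
        pools.entrySet().removeIf(e -> {
            if (e.getValue().lastUsed < cutoff) {
                e.getValue().ds.close(); // releases the pool's connections
                return true;
            }
            return false;
        });
    }

    private String resolveJdbcUrl(String databaseName) {
        // Assumption: the central registry maps database name -> host, port, etc.
        return "jdbc:postgresql://registry-resolved-host/" + databaseName;
    }

    private static final class PooledEntry {
        final HikariDataSource ds;
        volatile long lastUsed = System.currentTimeMillis();
        PooledEntry(HikariDataSource ds) { this.ds = ds; }
    }
}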
Good luck!

Azure cache failing with multiple concurrent requests

Everything with my co-located cache works fine as long as there is one request at a time. But when I hit my service with several concurrent requests, my cache doesn't seem to work.
Preliminary analysis led me to this - https://azure.microsoft.com/en-us/documentation/articles/cache-dotnet-how-to-use-service/
Apparently, I would have to use maxConnectionsToServer to allow multiple concurrent connections to the cache. But the document also talks about a useLegacyProtocol parameter which has to be set to false to enable connection pooling.
I have the following questions:
My service would be getting a few hundred concurrent requests. Would this be a good setting for such a scenario:
<dataCacheClient name="default" maxConnectionsToServer="100" useLegacyProtocol="false">
This is my understanding of the behavior I would get with this configuration: each time a request comes in, an attempt is made to retrieve a connection from the pool. If there is no available connection, a new connection is created if there are currently fewer than 100 connections; otherwise the request fails. Please confirm whether this is correct.
The above documentation says that one connection is used per instance of DataCacheFactory. I have a cache manager class which manages all interactions with the cache. This is a singleton class. It creates a DataCacheFactory object and uses it to get a handle to the cache during its instantiation. My service would have 2 instances, so it looks like I would need only 2 connections to the server. Is this correct? Do I even need connection pooling?
What is the maximum value maxConnectionsToServer can accept and what would be an ideal value for the given scenario?
I also see a boolean parameter named "ConnectionPool". This looks complementary to "useLegacyProtocol". Is this not redundant? How is setting useLegacyProtocol="false" different from connectionPool="true"? I am confused as to whether, and how, to use this parameter.
Are maxConnectionsToServer and ConnectionPool parameters related in any way? What does it mean when I have maxConnectionsToServer set to 5 and ConnectionPool=true?

How does session replication across containers work?

I would be interested in some timing details. For example, I place a container in the session, which can hold different data, and I change the contents of the container frequently. How can I ensure that the container's session value gets replicated across nodes for every change?
You don't need to make sure; that's the application server's job.
The J2EE specification doesn't deal with session-information synchronization amongst distributed components.
Theoretically, all you have to do is write thread-safe code. In your example, simply make sure that access to the container is synchronized. If your application server is bug-free, then you can safely assume that the session information is properly replicated across all nodes in a seamless manner; if your application server has bugs around session synchronization... well... then nothing is really safe anymore, now is it.
Application servers use different strategies to synchronize session information between nodes. Session content can be considered dirty, and therefore in need of synchronization, when you:
put data into the session
get data from the session
Getting data from the session falls into two categories:
getting a structured object
getting a scalar or immutable object
So if session data gets modified indirectly, by modifying a structured object, then simply re-reading it from the session can ensure that the object's content gets replicated.
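To make that concrete, a short sketch of the re-put idiom in a servlet container (the attribute name and list type are just for illustration):

import java.util.List;
import javax.servlet.http.HttpSession;

public class CartUpdater {
    @SuppressWarnings("unchecked")
    public static void addItem(HttpSession session, String item) {
        List<String> cart = (List<String>) session.getAttribute("cart");
        synchronized (cart) {  // keep concurrent requests in the same session thread-safe
            cart.add(item);    // mutates the object in place; the container may not
                               // detect this change on its own
        }
        // Re-putting the attribute marks it dirty, so the server knows
        // it must replicate the new content to the other nodes.
        session.setAttribute("cart", cart);
    }
}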

How to manage session variables in a web cluster?

Session variables are normally kept in the web server's RAM.
In a cluster, each request made by a client can be handled by a different cluster node, right?!
So, in this case...
What happens with the session variables? Aren't they stored in the nodes' RAM?
How will the other nodes handle my request correctly if they don't have my session variables, or at least all of them?
Is this issue handled by the web server (Apache, IIS) or by the language runtime (PHP, ASP.NET, Ruby, JSP)?
EDIT: Is there some solution for Classic ASP?
To extend #yogman's answer.
Memcached is pure awesomeness! It's a high-performance, distributed object cache.
And even though I mentioned distributed, it's basically as simple as starting one instance on one of your spare/idle servers: you configure it with an IP, a port and how much RAM to use, and you're done.
memcached -d -u www -m 2048 -l 10.0.0.8 -p 11211
(This runs memcached in daemon mode, as user www, with 2048 MB (2 GB) of RAM, listening on IP 10.0.0.8, port 11211.)
From then on, you ask memcached for data and if the data is not yet cached you pull it from the original source and store it in memcached. I'm sure you are familiar with cache basics.
In a cluster environment you can link up your memcached instances into a cluster and replicate the cache across your nodes. Memcached runs on Linux, Unix and Windows; start it anywhere you have spare RAM and start using your resources.
APIs for memcached should be generally available. I'm saying should because I only know of Perl, Java and PHP. But I am sure that e.g. in Python people have means to leverage it as well. There is a memcached wiki, in case you need pointers, or let me know in the comments if I was raving too much. ;)
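For instance, a minimal sketch with the spymemcached Java client (the key and value are placeholders; it assumes the daemon started with the command above):

import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

public class SessionCacheDemo {
    public static void main(String[] args) throws Exception {
        // Connect to the memcached instance from the command above.
        MemcachedClient client = new MemcachedClient(new InetSocketAddress("10.0.0.8", 11211));
        client.set("session:abc123", 3600, "serialized-session-data"); // 1-hour expiry
        Object cached = client.get("session:abc123"); // any node can read it back
        System.out.println(cached);
        client.shutdown();
    }
}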
There are 3 ways to store session state in ASP.NET. The first is in process, where the variables are stored in memory. The second is to use a session state service by putting the following in your web.config file:
<sessionState
mode="StateServer"
stateConnectionString="tcpip=127.0.0.1:42424"
sqlConnectionString="data source=127.0.0.1;user id=sa;password="
cookieless="false"
timeout="20" />
As you can see in the stateConnectionString attribute, the session state service can be located on a different computer.
The third option is to use a centralized SQL database. To do that, you put the following in your web.config:
<sessionState
mode="SQLServer"
stateConnectionString="tcpip=127.0.0.1:42424"
sqlConnectionString=
"data source=SERVERHAME;user id=sa;password="
cookieless="false"
timeout="20"
/>
More details on all of these options are written up here: http://www.ondotnet.com/pub/a/dotnet/2003/03/24/sessionstate.html
Get a Linux machine and set up http://www.danga.com/memcached. Its speed is unbeatable compared to other approaches (for example, cookies, hidden form variables, databases).
As with all sorts of things, "it depends".
There are different solutions and approaches.
As mentioned, there's the concept of a centralized store for session state (database, memcached, shared file system, etc.).
There are also cluster wide caching systems available that make local data available to all of the machines in the cluster. Conceptually it's similar to the centralized session state store, but this data isn't persistent. Rather it lives within the individual nodes and is replicated using some mechanism provided by your provider.
Another method is server pinning. When a client hits the cluster the first time, some mechanism (typically a load balancer fronting the cluster) pins the client to a specific server. In a typical client lifespan, that client will spend their entire time on a single machine.
For the failover mechanism, each machine of the cluster is paired with another machine, and so any session changes are shared with the paired machine. Should the client's pinned machine encounter an issue, the client will hit another machine. At this point, perhaps due to cookies, the new machine sees that it's not the original machine for the client, so it pings both the original machine and the paired machine for the client's session data.
At that point the client may well be pinned to the new machine.
Different platforms do it in different ways, including having no session state at all.
With Hazelcast, you can either use a Hazelcast distributed map to store and share sessions across the cluster, or let Hazelcast Webapp Manager do everything for you. Please check out the docs for details. Hazelcast is a distributed/partitioned, super-light, easy and free data distribution solution for Java.
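A minimal sketch of the distributed-map option (the map name and key are placeholders):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;

public class HazelcastSessionStore {
    public static void main(String[] args) {
        // Each node in the cluster starts an instance; the instances discover
        // each other and partition the map's entries among themselves.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        Map<String, String> sessions = hz.getMap("sessions");
        sessions.put("session:abc123", "serialized-session-data"); // visible cluster-wide
        System.out.println(sessions.get("session:abc123"));
        hz.shutdown();
    }
}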
Regards,
-talip
http://www.hazelcast.com
To achieve load balancing for Classic ASP, you may store the user-specific values in the database and pass a unique reference id in the URL as follows.
Maintain a session table in the database which generates a unique id for each record. The first time you want to store session-specific data, generate a record in your session table and store the session values in it. Obtain the unique id of the new session record and rewrite all links in your web application to send the unique id as part of the query string.
In every subsequent page where you need the session data, query the session table with the unique id passed in the query string.
Example:
Consider your website to have 4 pages: Login.asp, welcome.asp, taskList.asp, newtask.asp
When the user logs in using the login.asp page, after validating the user, create a record in the session table and store the required session-specific values (let's say the user's login date/time for this example). Obtain the new session record's unique id (let's say the unique id is abcd).
Append all links in your website with the unique id as below:
welcome.asp?sessionId=abcd
tasklist.asp?sessionId=abcd
newtask.asp?sessionId=abcd
Now, if in any of the above web pages you want to show the user's login date/time, you just have to query your session table with the sessionId parameter (abcd in this case) and display it to the user.
Since the unique value identifying the session is a part of the URL, any of your web servers serving the user will be able to display the correct login date/time value.
Hope this helps.
In ASP.NET you can persist session data to an SQL Server database which is common to all web servers in the cluster.
Once configured (in the web.config for your site), the framework handles all of the persistence for you and you can access the session data as normal.
As Will said, most load-balancing approaches will use some sort of stickiness in the way they distribute forthcoming requests from the same client, meaning a unique client will hit the same server unless that actual server goes down.
That minimizes the need to distribute session data, meaning that only in the eventual failure of a server would a client lose their session. Depending on your app, this is more or less critical. In most cases, this is not a big issue.
Even the simplest way of load balancing (round-robin DNS lookups) will provide some sort of stickiness, since most browsers will cache the actual lookup and therefore keep going to the first record they received, AFAIK.
It's usually the runtime that is responsible for the session data; in PHP, for example, it's possible to define your own session handler, which can persist the data into a database for instance. By default PHP stores session data in files, and it might be possible to share these files on a SAN or equivalent in order to share session data. This was just a theory I had but never got around to testing, since we decided that losing sessions wasn't critical and we didn't want that single point of failure.
