cosign-discuss at umich.edu
general discussion of cosign development and deployment
Re: replication behind load balancer
On Thu, 2005-03-31 at 09:40, Wesley Craig wrote:
> On 30 Mar 2005, at 14:38, David Alexander wrote:
> > I want to understand more about how cosign replication works. In
> > the scenario described above, it does not seem like cosignd
> > replication has a high-availability architecture. If cosignd on
> > host b has up-to-date info that is not propagated to other
> > cosignds, and host b dies, then the information is lost.
> High-availability can mean a lot of things. In all high-
> availability systems, replicating the data is not instantaneous.
> The delay can *either* happen before the transaction completes, a la
> 3-phase commit, or *after* the transaction appears to have completed,
> in the background. CoSign does the latter, for superior end-user
> performance. However, that means there is a window during which
> state information is only on one machine, having not yet replicated
> to other hosts -- typically less than a second, but that's not
> guaranteed. During that window, should a CoSign server be lost, the
> users that were only logged in on the lost machine would all be
> logged out.
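The write-locally-then-replicate-in-the-background behavior described above can be sketched roughly as follows. This is a hypothetical illustration, not cosign's actual code: the store names and the queue-based replicator are invented for the example. The point is that the login transaction completes as soon as the local record exists, and the replication window is the time the record spends waiting in the background queue.

```python
import queue
import threading

local_store = {}        # this cosignd's cookie cache
replica_store = {}      # a peer cosignd's cache, filled asynchronously
pending = queue.Queue() # records written locally but not yet replicated

def login(cookie, user):
    local_store[cookie] = user    # the transaction completes here
    pending.put((cookie, user))   # replication happens in the background

def replicator():
    while True:
        cookie, user = pending.get()
        replica_store[cookie] = user   # the window closes for this record
        pending.task_done()

threading.Thread(target=replicator, daemon=True).start()

login("cosign=abc123", "dalexander")
pending.join()   # in practice the window is typically subsecond
```

If the machine holding `local_store` dies before the queue drains, the sessions still in `pending` exist nowhere else, which is exactly the loss window Wesley describes.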
> > Our intention was to load balance https and port 6663 traffic.
> > Communication between the cosignd processes would occur on the
> > private network and would not be load balanced. If cosignd indeed
> > has replication capabilities, it's not clear to me why this
> > wouldn't work.
> I don't think that a load balancer is necessary for CoSign at all.
> Load balancing can occur on https, but should not be used on 6663.
> The CoSign filters (and CGI) perform connection caching and
> failover. This failover allows the filters to function correctly,
> even with the potential replication delay mentioned above.
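The filter-side caching and failover described above might look something like this sketch. The class, hostnames, and `check` callback are all invented for illustration; the real filters speak the cosign protocol over TLS. The idea is that the filter keeps using the last server that answered, and walks the remaining servers when a connection fails.

```python
SERVERS = ["cosign1.example.edu", "cosign2.example.edu", "cosign3.example.edu"]

class Filter:
    def __init__(self, check):
        self.check = check   # check(server, cookie) -> bool; raises on failure
        self.cached = None   # last server that answered (connection caching)

    def validate(self, cookie):
        # Try the cached server first, then fail over to the others.
        order = ([self.cached] if self.cached else []) + \
                [s for s in SERVERS if s != self.cached]
        for server in order:
            try:
                if self.check(server, cookie):
                    self.cached = server
                    return True
            except ConnectionError:
                continue     # dead server: fall through to the next one
        return False

def check(server, cookie):
    # Simulated outage: cosign1 is down; cosign2 knows about this login.
    if server == "cosign1.example.edu":
        raise ConnectionError("connection refused")
    return server == "cosign2.example.edu"

f = Filter(check)
ok = f.validate("cosign=abc123")
```

Because every filter can reach every cosignd directly, a load balancer in front of port 6663 would only obscure which server actually holds the freshest state.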
> > Has anyone done any work to replace file read/writes with database
> > calls? It seems like this would provide a high-availability
> > architecture that would be reliable and easy to deploy.
> Brett Lomas at University of Auckland has done work to replace the
> CoSign server's cookie cache with DB calls. He can speak more about
> his experience with that.
I modified the cosign code to abstract away the data source. The
cosign daemon and monster just use an interface, and either the
filesystem or Oracle implementation is compiled in.
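The abstraction Brett describes might be sketched like this (cosign itself is C, where the backend is compiled in; here a Python class hierarchy stands in, and all names are illustrative rather than cosign's): the daemon and monster program against one interface, and either a filesystem or an Oracle backend satisfies it.

```python
from abc import ABC, abstractmethod

class CookieStore(ABC):
    """Interface used by the daemon and monster; backends are swappable."""
    @abstractmethod
    def register(self, cookie, record): ...
    @abstractmethod
    def lookup(self, cookie): ...
    @abstractmethod
    def expire(self, cutoff): ...   # what monster would call to reap old logins

class FilesystemStore(CookieStore):
    def __init__(self):
        self.records = {}   # stands in for one file per cookie in a cache dir

    def register(self, cookie, record):
        self.records[cookie] = record

    def lookup(self, cookie):
        return self.records.get(cookie)

    def expire(self, cutoff):
        self.records = {c: r for c, r in self.records.items()
                        if r["timestamp"] >= cutoff}

store = FilesystemStore()   # an OracleStore would drop in here unchanged
store.register("cosign=abc", {"user": "brett", "timestamp": 100})
store.expire(50)
```

With this shape, only the backend touches storage details, which is why the session-pooling problem Brett hit below was confined to the Oracle implementation rather than the daemon's logic.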
This worked really well until we tried to get the database code to use
Oracle session pooling (since the connections were homogeneous). Doing
that would have required a large change to the cosign daemon: I could
not verify that session pooling would work across forks, and a little
testing suggested it did not, producing some very strange Oracle errors.
Basically what I had done worked very well, but we moved away from it
because of the VERY slow disks we had internally; we now use the
filesystem backend on a RAM disk instead.
If you are sufficiently interested I can give you the code. I have tried
to get UMich to put this in, but since interest in the community was
small they felt it was not a big requirement (and rightly so).
> I would not describe any system that requires a high-availability
> database as "easy to deploy". If you've already done the work to
> create a high-availability database, then perhaps layering above it
> might be easier than building it from scratch. But that's a pretty
> big "if", and if you don't already have a high-availability database,
> it will be very costly to build. CoSign's high-availability features
> meet the UMich WebISO requirements. And these high-availability
> features work very well, as can be seen from the real-life
> availability and performance of CoSign at UMich and other sites.