Here's how Subversion Edge replication works,
why replication could be useful in your setup, and what to consider when you plan to set up
a replica.
Why replicate?
Typically, you would deploy a replica for one of these
reasons:
- Projects in remote locations with lower bandwidth or higher latency want the
performance of a "local" server.
- Your company has a number of developers clustered together at a remote
location. When you install a replica inside the LAN of these developers,
they can greatly improve their Subversion performance and keep a lot of
their traffic off the WAN. In this scenario, you would probably want a
replica in each such location. Keep in mind that a replica can only be a
proxy for one master — so if your company has more than one Subversion
master server, you may need more than one replica at each location.
- You want to reduce the load on the master server. For example, continuous
integration tools can place a lot of load on the server, and moving that load
to a separate server can improve response times for other users. In this
scenario you probably only need to add one replica, placed as close to the
master as possible so that synchronization is quick. Location can still be a
factor here: if the continuous integration server is at a remote location, you
would want to put the replica near the continuous integration server.
Rules for using a replica
When you convert a Subversion Edge server to replica mode, you provide the TeamForge
username and password to use for the replica. The replica uses these credentials when it
replicates Subversion content. This user must be added to the TeamForge project(s)
and given permissions to the repositories being replicated. Those permissions also
control what parts of the repository will be replicated. So if you have folders that
should never live on remote servers, you can set up path-based permissions and
that content will never be replicated to the replica. If you forget to set up
permissions, the replication will fail. However, there's no real harm done: once
you fix the permissions, you can run the replication again.
The replica user can be a normal user account; it does not have to be an
Admin account. However, if the replica is set up and maintained by an Operations team, they
might want to just use an Admin account so that project teams do not have to worry
about adding the user to the project or setting up permissions.
Permissions for end users accessing those repositories will follow the normal
TeamForge rules.
Architecture
All communication originates from the replica. TeamForge never contacts or pushes to
the replica; the replica always initiates the connection. When TeamForge wants the
replica to do something, such as synch a new repository or synch a new revision for an
existing replicated repository, it queues an event for that replica. When a new commit
comes in to TeamForge and a new commit object is created, the TeamForge application
server queues up an event for each replica that has this repository. If the commit is
for a repository with no replicas, then nothing happens.
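The queuing behavior described above can be pictured with a small in-memory model. This is only an illustrative sketch, not TeamForge's implementation: the class name `EventQueues`, its methods, and the event names `init_repo` and `sync_revision` are all hypothetical.

```python
from collections import defaultdict, deque


class EventQueues:
    """Toy model of per-replica event queues on the application server.

    All names here are hypothetical; they only illustrate the flow.
    """

    def __init__(self):
        # repository name -> set of replica ids that replicate it
        self.replicas_for_repo = defaultdict(set)
        # replica id -> FIFO queue of pending (kind, repo, revision) events
        self.pending = defaultdict(deque)

    def add_replica(self, replica_id, repo):
        """Register a replica for a repository and queue its initial synch."""
        self.replicas_for_repo[repo].add(replica_id)
        self.pending[replica_id].append(("init_repo", repo, None))

    def on_commit(self, repo, revision):
        """Queue one event per replica of this repository.

        If the repository has no replicas, nothing is queued.
        """
        for replica_id in sorted(self.replicas_for_repo.get(repo, ())):
            self.pending[replica_id].append(("sync_revision", repo, revision))

    def poll(self, replica_id):
        """Called by the replica: return everything queued since its last poll."""
        events = list(self.pending[replica_id])
        self.pending[replica_id].clear()
        return events
```

In this model the server never calls out to a replica. It only appends to a queue, and the replica drains that queue when it calls `poll`.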
The replica polls
the TeamForge application server for its events. TeamForge site administrators can
configure the polling interval for each replica individually. Typically the interval
would be around 60 seconds. The longer the polling interval, the longer it takes
for a commit to reach a replica. At the same time, the longer the interval, the lower
the load that the polling mechanism places on the application server.
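To get a feel for this tradeoff, the following back-of-the-envelope sketch estimates the extra delay and the polling load for a given interval. The function name and its parameters are hypothetical, the average assumes commits arrive at random moments within a polling cycle, and the figures cover only the wait until the next poll, not the time the synch itself takes.

```python
def polling_tradeoff(interval_seconds, replica_count):
    """Rough estimates for a fixed polling interval.

    Hypothetical helper for illustration; it ignores the synch time itself.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if replica_count < 0:
        raise ValueError("replica_count must not be negative")
    return {
        # A commit made just after a poll waits a full interval.
        "worst_case_pickup_delay_s": interval_seconds,
        # Assumes commits are spread evenly across the interval.
        "average_pickup_delay_s": interval_seconds / 2,
        # Each replica sends one poll request per interval.
        "poll_requests_per_hour": replica_count * 3600 / interval_seconds,
    }
```

For example, with five replicas polling every 60 seconds, a commit waits at most 60 seconds to be picked up, and the application server handles 300 poll requests per hour. Doubling the interval halves the request rate and doubles the worst-case wait.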
The replica receives all of the queued events since the last time it
polled, and splits these events into two groups:
- New repository initializations: Since these can take a long time, they are
handled separately so as to not block all replication activity.
- All other transactions: This would mainly be synching revisions for existing
replicas, but it could also be removing a repository or synching a revision
property change.
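The split into two groups amounts to a simple partition of the polled events. The sketch below is illustrative only: the `(kind, repository, revision)` tuple shape and the event name `init_repo` are assumptions, not the actual event format.

```python
def split_events(events):
    """Partition polled events into repository initializations and the rest.

    Each event is assumed to be a (kind, repository, revision) tuple,
    with the hypothetical kind "init_repo" marking a new repository.
    """
    initializations, others = [], []
    for event in events:
        kind = event[0]
        if kind == "init_repo":
            # Potentially long-running, so it is handled separately.
            initializations.append(event)
        else:
            # Revision synchs, repository removals, revision property changes.
            others.append(event)
    return initializations, others
```

Keeping the groups apart means a slow initial synch of a large repository does not hold up the small, frequent revision synchs for repositories that are already replicated.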
TeamForge site administrators can control how many jobs of each type run concurrently.
Typically, the number for the first group would be low, such as 1 or 2. For the second
group, it would be larger. The maximum number of simultaneous svnsync operations affects
the load on both the Subversion master and the replica itself, so take both servers into
account when you choose these limits. For more information on replica settings and how
to configure them, see
Edit replica settings.
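One way to picture the two concurrency limits is as two separate worker pools, each with its own maximum size. This is a minimal sketch under that assumption, not a description of how the replica is actually implemented. The function `run_jobs` and its parameters `max_init` and `max_other` are hypothetical names, and the jobs are plain Python callables standing in for real synch work.

```python
from concurrent.futures import ThreadPoolExecutor


def run_jobs(init_jobs, other_jobs, max_init=1, max_other=4):
    """Run two groups of callables in separate, independently sized pools.

    max_init and max_other are hypothetical stand-ins for the two
    per-group concurrency limits.
    """
    with ThreadPoolExecutor(max_workers=max_init) as init_pool, \
            ThreadPoolExecutor(max_workers=max_other) as other_pool:
        # Submit both groups up front so neither waits for the other.
        init_futures = [init_pool.submit(job) for job in init_jobs]
        other_futures = [other_pool.submit(job) for job in other_jobs]
        # result() blocks until the job finishes and re-raises its exception.
        init_results = [future.result() for future in init_futures]
        other_results = [future.result() for future in other_futures]
    return init_results, other_results
```

Because each group has its own pool, a long repository initialization occupies only a worker from the small pool, while revision synchs keep flowing through the larger one. Raising either limit increases the number of simultaneous requests that reach the master.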