    Add support for load balancing database queries · 623db07d
    Yorick Peterse authored
    This adds support for balancing queries amongst multiple database hosts.
    Web requests will stick to using the primary for a little while after a
    write took place, removing the need for synchronous replication. Load
    balancing is disabled for Sidekiq since using this could lead to race
    conditions, and Sidekiq mostly performs writes anyway.
    
    == Balancing
    
    Balancing is done using a simple round-robin algorithm. The first time a
    connection is needed the first host is used, then the second, third,
    etc. This logic resets to the first host once reaching the end of the
    hosts list.
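
    The round-robin selection described above can be sketched as follows.
    This is a minimal illustration, not the code from this commit: the
    class name RoundRobinHosts and the method next_host are hypothetical,
    and the sketch ignores thread safety.

```ruby
# Hypothetical sketch of round-robin host selection (names are illustrative).
class RoundRobinHosts
  def initialize(hosts)
    raise ArgumentError, 'at least one host is required' if hosts.empty?

    @hosts = hosts
    @index = 0
  end

  # Returns the current host, then advances to the next one. The index
  # wraps around to the first host after the end of the list is reached.
  def next_host
    host = @hosts[@index]
    @index = (@index + 1) % @hosts.length
    host
  end
end

hosts = RoundRobinHosts.new(%w[10.0.0.1 10.0.0.2])
picked = 3.times.map { hosts.next_host }
# picked => ["10.0.0.1", "10.0.0.2", "10.0.0.1"]
```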
    
    The code added in this commit _only_ load balances queries sent from
    models. The code does _not_ touch ActiveRecord::Base.connection. This
    means that direct use of this method will result in the queries being
    sent to the primary.
    
    == Configuration
    
    Configuration is done by adding a YAML section to config/database.yml.
    For example:
    
        production:
          load_balancing:
            hosts:
              - 10.0.0.1
              - 10.0.0.2
    
    All hosts will use the same authentication credentials.
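
    To illustrate what sharing credentials means in practice, the sketch
    below builds one connection config per host from a single section. It
    only assumes the load_balancing/hosts layout shown above; the adapter,
    username and password values are placeholders, and the variable names
    are hypothetical rather than taken from this commit.

```ruby
require 'yaml'

# Placeholder config: only the load_balancing/hosts layout comes from the
# commit message, all other keys and values are made up for this example.
config = YAML.safe_load(<<~YAML)
  production:
    adapter: postgresql
    username: gitlab
    password: secret
    load_balancing:
      hosts:
        - 10.0.0.1
        - 10.0.0.2
YAML

production = config.fetch('production')
hosts = production.fetch('load_balancing').fetch('hosts')

# Everything except the load balancing section is shared by all hosts.
shared = production.reject { |key, _| key == 'load_balancing' }

# One config per host: same credentials, different host address.
host_configs = hosts.map { |host| shared.merge('host' => host) }
```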
    
    == Sticking
    
    When a write is performed the query is sent to the primary. Any queries
    executed after this point are also sent to the primary. At the end of a
    request some session details are stored for the current user. These
    details are used to stick to the primary for as long as necessary (or
    until the data expires).
    
    This prevents the user from running into cases where they write data to
    the primary, read from the secondary, and the data isn't available yet
    (e.g. leading to an HTTP 404 error).
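
    A simplified sketch of the sticking idea follows. It covers only the
    expiry part: it remembers when a user last wrote and keeps that user
    on the primary until a timeout passes. The StickySessions class, the
    in-memory Hash, the 30 second default and the injectable clock are all
    assumptions made for this example; the message above does not say how
    or where the session details are stored.

```ruby
# Hypothetical sketch: stick a user to the primary for a while after a write.
class StickySessions
  def initialize(expire_in: 30, clock: -> { Time.now })
    @expire_in = expire_in # seconds, an assumed default
    @clock = clock         # injectable so the behaviour can be tested
    @written_at = {}
  end

  # Called at the end of a request that performed a write.
  def stick(user_id)
    @written_at[user_id] = @clock.call
  end

  # True while the user's last write is recent enough that a secondary
  # might not have the data yet.
  def use_primary?(user_id)
    written_at = @written_at[user_id]
    return false unless written_at

    if @clock.call - written_at < @expire_in
      true
    else
      @written_at.delete(user_id) # the data expired, unstick the user
      false
    end
  end
end
```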
    
    == Overhead
    
    The load balancing code has minimal overhead. Instead of parsing raw SQL
    queries it hooks into Rails-specific methods to determine which host to
    use for a query.
    
    == Transactions
    
    Transactions are always executed on the primary, even if they don't
    perform any writes. Once a transaction completes a session will stick to
    the primary. This is based on transactions almost always being used for
    writes (there's little benefit to using a transaction for only reads).
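
    The transaction rule can be pictured with the sketch below. The
    QuerySession class and its methods are hypothetical and do not reflect
    the actual implementation. The sketch also assumes that the session
    sticks to the primary even when the transaction block raises.

```ruby
# Hypothetical sketch: reads go to the primary during and after a transaction.
class QuerySession
  def initialize
    @in_transaction = false
    @use_primary = false
  end

  # Runs the block as a "transaction". Afterwards the session keeps
  # using the primary, whether or not the block wrote anything.
  def transaction
    @in_transaction = true
    yield
  ensure
    @in_transaction = false
    @use_primary = true
  end

  # Decides where a read query would be sent.
  def host_for_read
    (@in_transaction || @use_primary) ? :primary : :secondary
  end
end
```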
    
    == Prepared Statements
    
    Prepared statements don't work well when queries are being distributed
    amongst hosts. As a result GitLab will automatically disable prepared
    statements when load balancing is enabled. Disabling prepared statements
    has no impact on response timings, and may even reduce the memory usage
    of PostgreSQL.
    
    == Failovers
    
    The load balancing code is capable of dealing with database failovers.
    In the event of a secondary being unavailable the load balancer will
    mark it as offline and use the next available secondary. If no
    secondaries are available the primary is used instead.
    
    Secondaries that are marked as offline are checked again automatically,
    preventing a host from being marked as offline forever.
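
    The handling of offline secondaries can be sketched as follows. All
    names here (HostPool, mark_offline, recheck_after) are hypothetical.
    The sketch also makes a simplifying assumption: a host that has been
    offline for longer than recheck_after seconds is simply tried again,
    whereas a real implementation would more likely check the connection
    first.

```ruby
# Hypothetical sketch: skip offline secondaries, fall back to the primary.
class HostPool
  Host = Struct.new(:name, :offline_since)

  def initialize(primary:, secondaries:, recheck_after: 10, clock: -> { Time.now })
    @primary = primary
    @hosts = secondaries.map { |name| Host.new(name, nil) }
    @recheck_after = recheck_after # seconds, an assumed default
    @clock = clock
    @index = 0
  end

  def mark_offline(name)
    host = @hosts.find { |h| h.name == name }
    host.offline_since = @clock.call if host
  end

  # Returns the next usable secondary in round-robin order, or the
  # primary when no secondary is usable.
  def next_host
    @hosts.length.times do
      host = @hosts[@index]
      @index = (@index + 1) % @hosts.length
      return host.name if usable?(host)
    end
    @primary
  end

  private

  def usable?(host)
    return true if host.offline_since.nil?
    return false if @clock.call - host.offline_since < @recheck_after

    host.offline_since = nil # give the host another chance
    true
  end
end
```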
    
    In the event of a connection error when writing to the primary the
    code will suspend the caller, then retry the operation up to 3 times.
    With every retry the sleep time increases exponentially.
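
    Retrying with exponential backoff could look roughly like the sketch
    below. The with_retries helper, the ConnectionError class, the 0.1
    second base delay and the doubling factor are assumptions made for
    this example; only the limit of 3 retries comes from the description
    above.

```ruby
# Hypothetical error class standing in for a database connection error.
ConnectionError = Class.new(StandardError)

# Runs the block, retrying up to max_retries times on ConnectionError.
# The sleep time doubles with every retry: 0.1s, 0.2s, 0.4s by default.
def with_retries(max_retries: 3, base_delay: 0.1, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0

  begin
    yield
  rescue ConnectionError
    raise if attempt >= max_retries # out of retries, re-raise the error

    sleeper.call(base_delay * (2**attempt))
    attempt += 1
    retry
  end
end
```

    A caller would wrap the write in a block passed to with_retries. If
    all retries fail, the last error is raised to the caller.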
    
    All of this means that in the event of a DB restart or failover some
    requests may take a bit longer to complete, instead of the application
    immediately returning an error.