TIDStorage

This product provides a way to have consistent backups when running a
multi-storage instance (only ZEO is supported at the moment).

Doing backups of individual Data.fs files that are part of the same instance
(one mounted in another) is a problem when there are transactions involving
multiple storages: if there is a crash during transaction commit, there is no
way to tell which storage was committed and which was not (there is no TID
consistency between databases).
There is an even more tricky case. Consider the following:
 2 transactions running in parallel:
  T1: modifies storage A and B
  T2: modifies storage A
 Commit order scenario:
  T1 starts committing (takes commit lock on A and B)
  T2 starts committing (waits for commit lock on A)
  T1 commits A (commit lock released on A)
  T2 commits A (takes & releases commit lock on A)
  [crash]
  T1 commits B (commit lock released on B) <- never happens because of crash

Here, T2 was able to commit entirely, but it must not be backed up. This is
because transactions are stored in ZODB in the order they are committed.
So if T2 is in the backup, a part of T1 will be too, and the backup will be
inconsistent (T1's commit on B never happened).

TIDStorage fixes those issues by keeping track of transaction-to-tid relations
for all (ZODB, via ZEO) storages involved in any transaction, and by tracking
inter-transaction dependencies.
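
The crash scenario above can be modelled with a small sketch. This is only an
illustration of the reasoning, not the algorithm used by the TIDStorage daemon:
the function name, the data layout (per-storage commit logs and a map of
involved storages) and the fixpoint loop are all assumptions made for the
example.

```python
def coherent_cut(logs, involved):
    """Return, per storage, how many leading commits are safe to back up.

    logs:     {storage id: [transaction name, ...]} in commit order
    involved: {transaction name: set of storage ids it modifies}
    Hypothetical model: a commit is safe only if its transaction is fully
    inside the cut on every storage it involves.
    """
    cut = {storage: len(log) for storage, log in logs.items()}
    changed = True
    while changed:
        changed = False
        for storage, log in logs.items():
            for index in range(cut[storage]):
                txn = log[index]
                complete = all(
                    txn in logs[other][:cut[other]] for other in involved[txn]
                )
                if not complete:
                    # Commits are stored in order, so everything after an
                    # incomplete transaction must be left out as well.
                    cut[storage] = index
                    changed = True
                    break
    return cut


# Crash scenario: T1 reached A but never B, T2 committed on A after T1.
logs = {"A": ["T1", "T2"], "B": []}
involved = {"T1": {"A", "B"}, "T2": {"A"}}
crash_cut = coherent_cut(logs, involved)
```

In this model `crash_cut` leaves out both commits on A: T2 is complete, but it
sits after the partial T1 in A's commit order, so it cannot be kept without
also keeping half of T1.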

TIDStorage is composed of 3 parts:
 - A Zope product, which monkey-patches "ZEO" and "transaction" products.
   transaction patch:
     TIDStorage works at transaction boundaries, so we hook around
     _commitResource method to know when it happens.
     It must be configured to fit your network setup (TID_STORAGE_ADDRESS)
   ZEO patch:
     With regular ZEO, there is no way to know last committed TID at
     transaction-code level. This patch stores last committed TID on ZEO
     connection object, to be read by transaction patch.
 - A daemon
   This is TIDStorage itself, receiving TIDs from Zopes and delivering
   coherency points to backup scripts.
 - Backup scripts
   Those scripts are (mostly) wrappers for the repozo backup script, fetching
   coherency points from TIDStorage daemon and invoking repozo.
   This requires a patch to be applied to regular repozo, so that it can
   backup ZODBs only up to a given TID.

Constraints under which TIDStorage was designed:
 - Zope performance
   Protocol (see below) was designed as one-way only (Zope pushes data to
   TIDStorage, and does not expect an answer), so that TIDStorage speed does not
   limit Zope performance.
 - No added single-point-of-failure
   Even if Zope cannot connect to TIDStorage, it will still work. It will only
   emit one log line when connection is lost or at first attempt if it did not
   succeed. When connection is established, another log line is emitted.
 - Bootstrap
   As TIDStorage can be started and stopped while things still happen on
   ZODBs, it must be able to bootstrap its content before any backup can
   happen. This is done by creating artificial Zope transactions whose only
   purpose is to cause a commit to happen on each ZODB, filling TIDStorage and
   making sure there is no pending commit on any storage (since all locks
   could be taken by those transactions, it means that all transactions started
   before TIDStorage could receive their notifications have ended).
 - Restoration from Data.fs
   In addition to the ability to restore from repozo-style backups, and in
   order to provide greater backup frequency than repozo can offer on big
   databases, TIDStorage offers the possibility to restore coherent Data.fs
   from crashed ones - as long as they are not corrupted.
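
The "Zope performance" and "No added single-point-of-failure" constraints
above can be sketched as a push-only sender. This is a minimal illustration,
not the code of the Zope product: the class name, the logger name and the
injectable `connect` argument are assumptions made so the example is
self-contained.

```python
import logging
import socket

logger = logging.getLogger("tidstorage.sketch")  # hypothetical logger name


class OneWaySender:
    """Push data to TIDStorage without waiting for answers or raising."""

    def __init__(self, address, connect=socket.create_connection):
        self._address = address
        self._connect = connect  # injectable, so the sketch can be tested
        self._sock = None
        self._failure_logged = False

    def send(self, payload):
        """Send bytes; return True on success, False if TIDStorage is down."""
        try:
            if self._sock is None:
                self._sock = self._connect(self._address)
                # One log line when the connection is established.
                logger.info("connected to TIDStorage at %r", self._address)
                self._failure_logged = False
            self._sock.sendall(payload)
            return True
        except OSError as exc:
            if self._sock is not None:
                self._sock.close()
            self._sock = None
            if not self._failure_logged:
                # One log line per outage, not one per failed commit.
                logger.warning("TIDStorage unreachable: %s", exc)
                self._failure_logged = True
            return False
```

A caller would ignore the return value during commit, so a missing TIDStorage
never blocks or fails a Zope transaction; it only delays the next coherency
point.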

Limits:
 - Backup "lag"
   As TIDStorage can only offer a coherency point when interdependent
   transactions are all finished (committed or aborted), a backup started at
   time T might actually contain data from moments before. There are pathological
   cases where no coherency point can be found, so no backup can happen.
   Also, bootstrap can prevent backups from happening if daemon is
   misconfigured.

Protocol:
 All characters allowed in data, except \n and \r (0x0A & 0x0D).
 Each field ends with \n, \r is ignored.
 No escaping.
 When transferring a list, it is prepended by the number of included fields.
   Example:
     3\n
     foo\n
     bar\n
     baz\n
 When transferring a dict, it is prepended by the number of items, followed by
 keys and then values. Values must be integers represented as strings.
   Example:
     2\n
     key1\n
     key2\n
     1\n
     2\n
 Commands are case-insensitive.
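
The list and dict encodings described above can be sketched as follows. The
helper names are hypothetical and only follow the rules stated in this
section (one field per line, no escaping, count first, dict keys before
values).

```python
def encode_fields(fields):
    """Join fields, each terminated by \\n; reject characters that cannot be sent."""
    encoded = []
    for field in fields:
        text = str(field)
        if "\n" in text or "\r" in text:
            raise ValueError("fields cannot contain \\n or \\r (no escaping)")
        encoded.append(text + "\n")
    return "".join(encoded)


def encode_list(items):
    """A list is its number of fields followed by the fields themselves."""
    items = list(items)
    return encode_fields([len(items)] + items)


def encode_dict(mapping):
    """A dict is its number of items, then all keys, then all integer values."""
    keys = list(mapping)
    values = [int(mapping[key]) for key in keys]
    return encode_fields([len(keys)] + keys + values)


list_wire = encode_list(["foo", "bar", "baz"])
dict_wire = encode_dict({"key1": 1, "key2": 2})
```

With these helpers, `list_wire` and `dict_wire` match the two examples given
above, byte for byte.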

1) Start of commit command:

BEGIN\n
<commit id>\n
<list of involved storages>

 <commit id>: must be identical to the one given when commit finishes (be it ABORT or COMMIT)
 <list of involved storages>: list of storage ids involved in the transaction
 NB: final \n is part of list representation, so it's not displayed above.

Response: (nothing)

2) Transaction abort command:

ABORT\n
<commit id>\n

  <commit id>: (cf. BEGIN)

Response: (nothing)

3) Transaction finalisation command:

COMMIT\n
<commit id>\n
<dict of involved storages and committed TIDs>

 <commit id>: (cf. BEGIN)
 involved storages: (cf. BEGIN)
 committed TIDs: TIDs for each storage, as int.
 NB: final \n is part of dict representation, so it's not displayed above.

Response: (nothing)
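
As an illustration of the three one-way commands (BEGIN, ABORT, COMMIT), the
sketch below builds the exact text a client would push. The function names,
the commit id "42" and the storage ids "A" and "B" are made up for the
example; the README does not specify the format of a commit id, so it is
treated here as an opaque string.

```python
def _fields(*fields):
    """Terminate each field with \\n; the protocol has no escaping."""
    lines = []
    for field in fields:
        text = str(field)
        if "\n" in text or "\r" in text:
            raise ValueError("fields cannot contain \\n or \\r")
        lines.append(text + "\n")
    return "".join(lines)


def begin_message(commit_id, storages):
    """BEGIN, commit id, then the list of involved storage ids."""
    storages = list(storages)
    return _fields("BEGIN", commit_id, len(storages), *storages)


def abort_message(commit_id):
    """ABORT, then the commit id given to BEGIN."""
    return _fields("ABORT", commit_id)


def commit_message(commit_id, tids):
    """COMMIT, commit id, then the dict of storage id -> committed TID (int)."""
    keys = list(tids)
    values = [int(tids[key]) for key in keys]
    return _fields("COMMIT", commit_id, len(keys), *keys, *values)


begin_wire = begin_message("42", ["A", "B"])
commit_wire = commit_message("42", {"A": 100, "B": 200})
abort_wire = abort_message("42")
```

None of these commands has a response, so a client can write the message and
move on without reading from the connection.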

4) Data read command:

DUMP\n

Response:
<dict of storages and TIDs>
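
A backup script could decode the DUMP response along these lines. This is a
sketch under the protocol rules stated earlier (count, then keys, then integer
values, \r ignored); the function names and the storage ids in the example
are hypothetical.

```python
import io


def read_field(stream):
    """Read one \\n-terminated field from a text stream, ignoring any \\r."""
    line = stream.readline()
    if not line.endswith("\n"):
        raise EOFError("stream ended in the middle of a field")
    return line[:-1].replace("\r", "")


def read_dict(stream):
    """Decode a dict: item count, then all keys, then all integer values."""
    count = int(read_field(stream))
    keys = [read_field(stream) for _ in range(count)]
    values = [int(read_field(stream)) for _ in range(count)]
    return dict(zip(keys, values))


# Simulated DUMP response; a real client would read from the socket instead.
response = io.StringIO("2\nstorage_a\nstorage_b\n10\r\n20\n")
coherency_point = read_dict(response)
```

The resulting mapping gives, for each storage, the TID up to which that
storage can be backed up.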

5) Connection termination command:

QUIT\n

Response: (nothing, server closes connection)

6) Bootstrap status command:

BOOTSTRAPED\n

Response: 1 if bootstrap was completely done, 0 otherwise.
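
A client-side check of the bootstrap status might look like the sketch below.
It assumes the answer is sent as a single field terminated by \n, like every
other field in the protocol; the README does not state this explicitly. The
function name is hypothetical.

```python
import socket


def is_bootstrapped(sock):
    """Send the bootstrap status command and interpret the one-field answer."""
    # The command name is spelled exactly as in the protocol description.
    sock.sendall(b"BOOTSTRAPED\n")
    with sock.makefile("r", encoding="ascii", newline="\n") as reader:
        answer = reader.readline()
    # \r is ignored by the protocol; the trailing \n ends the field.
    return answer.replace("\r", "").strip() == "1"
```

A backup script could call this before requesting a DUMP and postpone the
backup while it returns False.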