<?xml version="1.0" encoding="iso-8859-15"?> <!-- -*-html-helper-*- -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<title>syrep @PACKAGE_VERSION@</title>
<link rel="stylesheet" type="text/css" href="style.css" />
</head>

<body>
<h1><a name="top">syrep @PACKAGE_VERSION@</a></h1>

<p><i>Copyright 2003,2004 Lennart Poettering &lt;@PACKAGE_BUGREPORT@&gt;</i></p>

<ul class="toc">
    <li><a href="#license">License</a></li>
    <li><a href="#news">News</a></li>
    <li><a href="#overview">Overview</a></li>
    <li><a href="#status">Status</a></li>
    <li><a href="#documentation">Documentation</a></li>
    <li><a href="#requirements">Requirements</a></li>
    <li><a href="#installation">Installation</a></li>
    <li><a href="#acks">Acknowledgements</a></li>
    <li><a href="#download">Download</a></li>
</ul>

<h2><a name="license">License</a></h2>

<p>This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.</p>

<p>This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.</p>

<p>You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.</p>

<h2><a name="news">News</a></h2>

<div class="news-date">Wed Sep 22 2004: </div> <p class="news-text"><a
href="@PACKAGE_URL@syrep-0.6.tar.gz">Version 0.6</a> released; Changes
include: fix an ugly bug which made snapshots where <tt>--forget</tt> was used unusable</p>

<div class="news-date">Mon Jul 19 2004: </div> <p class="news-text"><a
href="@PACKAGE_URL@syrep-0.5.tar.gz">Version 0.5</a> released; Changes
include: optionally show sizes of files on <tt>--diff</tt>, implement
new command <tt>--forget</tt>, check for extended attribute
<tt>user.syrep</tt> on <tt>--update</tt> on file systems that support
it.</p>

<div class="news-date">Mon Mar 22 2004: </div> <p class="news-text"><a
href="@PACKAGE_URL@syrep-0.4.tar.gz">Version 0.4</a> released; Changes
include: fix annoying SIGBUS failure when working on files &gt;= 100 MB, update to Berkeley DB 4.2, use <tt>madvise()</tt> to improve file copying throughput on newer kernels, other minor fixes.</p>

<div class="news-date">Sun Nov 30 2003: </div> <p class="news-text"><a
href="@PACKAGE_URL@syrep-0.3.tar.gz">Version 0.3</a> released; Changes
include: new options <tt>--sort</tt>, <tt>--check-md</tt>,
<tt>--always-copy</tt>; implemented direct bi-directory merges,
documentation updates, build system updates, assorted fixes.</p>

<div class="news-date">Tue Sep 9 2003: </div> <p class="news-text"><a
href="@PACKAGE_URL@syrep-0.2.tar.gz">Version
0.2</a> released; Fixes include: documentation update, <tt>--diff</tt> output improved, <tt>--merge</tt> output fixed.</p>

<div class="news-date">Mon Sep 8 2003: </div> <p class="news-text"><a
href="@PACKAGE_URL@syrep-0.1.tar.gz">Version
0.1</a> released.</p>

<h2><a name="overview">Overview</a></h2>

<p><tt>syrep</tt> is a generic file repository synchronization tool. It may be
used to synchronize large file hierarchies bidirectionally by
exchanging patch files. <tt>Syrep</tt> is truly peer-to-peer; no central
servers are involved. Synchronizations between more than two
repositories are supported. The patch files may be transferred via
offline media, e.g. removable hard disks or compact discs.</p>

<p>Files are tracked by their message digests, currently MD5. The
following file operations are tracked in the snapshot files: creation,
deletion, modification, creation of new hard or symbolic links, and
renaming. (The latter is nothing more than the creation of a new hard link
and the removal of the old file.) <tt>syrep</tt> doesn't distinguish between soft and
hard links. In fact, even copies of files are treated as the
same. Currently, <tt>syrep</tt> doesn't synchronize file attributes like
access modes or modification times.</p>
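
<p>The idea of tracking files by digest can be illustrated with a short Python sketch. It is not <tt>syrep</tt>'s actual code; the function names <tt>md5_of_file</tt> and <tt>group_by_digest</tt> are made up for this example. It shows why copies, hard links and renamed files all look the same: they share one MD5 digest.</p>

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(root):
    """Map each digest to the sorted relative paths that carry that content."""
    groups = defaultdict(list)
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            full = os.path.join(directory, name)
            groups[md5_of_file(full)].append(os.path.relpath(full, root))
    return {digest: sorted(paths) for digest, paths in groups.items()}
```

<p>Two paths that end up in the same group are indistinguishable under this model, no matter whether one is a copy, a link or the new name of the other.</p>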

<p><tt>syrep</tt> was written to facilitate the synchronization of two
large digital music repositories without direct network
connection. Patch files of several gigabytes are common in
this situation.</p>

<p><tt>syrep</tt> is able to cope with 64-bit file sizes (LFS).</p>

<p><tt>syrep</tt> is optimized for speed. It may make use of a message digest
cache to accelerate the calculation of digests of a whole directory
hierarchy.</p>
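
<p>How such a digest cache might work is sketched below in Python. This is an assumption for illustration only: the text above does not say how <tt>syrep</tt> decides that a cached digest is still valid, so the sketch simply revalidates by file size and modification time. The class name <tt>DigestCache</tt> is hypothetical.</p>

```python
import hashlib
import os


class DigestCache:
    """Remember MD5 digests per path, validated by size and mtime (an assumption)."""

    def __init__(self):
        self.entries = {}  # path -> ((size, mtime_ns), hex digest)
        self.hits = 0
        self.misses = 0

    def digest(self, path):
        """Return the digest of path, rehashing only if size or mtime changed."""
        info = os.stat(path)
        stamp = (info.st_size, info.st_mtime_ns)
        cached = self.entries.get(path)
        if cached is not None and cached[0] == stamp:
            self.hits += 1
            return cached[1]
        self.misses += 1
        md5 = hashlib.md5()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                md5.update(chunk)
        self.entries[path] = (stamp, md5.hexdigest())
        return self.entries[path][1]
```

<p>On a second run over an unchanged hierarchy every lookup is a cache hit, so no file contents need to be read again.</p>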

<h3>How does syrep compare with <tt>rsync</tt>, <tt>cvs</tt>, Subversion, <tt>arch</tt>/<tt>tla</tt>, BitKeeper, <tt>xdelta2</tt>, <tt>diff</tt>/<tt>patch</tt>?</h3>

<p><tt>syrep</tt> is kind of a bidirectional <tt>rsync</tt>, but
stores and makes use of a file hierarchy history. Synchronization with
<tt>syrep</tt> is based on patch files, and doesn't require a direct
connection between the synchronizing peers.</p>

<p><tt>syrep</tt> has many things in common with version control
systems like <tt>CVS</tt> or <tt>SVN</tt>: it stores a history and has operations
similar to <tt>update</tt> and <tt>commit</tt>. However, the history doesn't contain
file contents and is not line based; it stores only the MD5 digests and some
metadata. There is no central server; instead, all peers have the
same role. There is no distinction between repositories and
checkouts. In fact checkout and repository are identical.</p>

<p><tt>syrep</tt> has even more things in common with <tt>arch</tt>/<tt>tla</tt> and
BitKeeper. All three are patch based and are "peer-to-peer". However,
there are certain differences: <tt>syrep</tt> doesn't differentiate
between repositories and checkouts. <tt>syrep</tt> doesn't keep a file
contents history of any kind.</p>

<p><tt>syrep</tt> resembles <tt>diff</tt>/<tt>patch</tt> or
<tt>xdelta2</tt> in some way. While the latter work on file
contents, <tt>syrep</tt> works on file hierarchies.</p>

<p>In contrast to most of the software mentioned above,
<tt>syrep</tt> is capable of synchronizing repositories of several
hundred GB in size with only very small overhead (e.g. 4 MB of control
data for half a year of history for 100 GB of user data).</p>

<h2><a name="status">Status</a></h2>

<p>Version @PACKAGE_VERSION@ is more or less stable and fulfills its purpose.</p>
  
<h2><a name="documentation">Documentation</a></h2>

<p>Have a look at the man page <a href="@PACKAGE_URL@syrep.1.xml"><tt>syrep(1)</tt></a>. (An XSLT-capable browser is required.)</p>

<h3>Method of operation</h3>

<p><tt>Syrep</tt>'s operation relies on "snapshots" of a file
repository. A snapshot contains information about all files existing
in the hierarchy, combined with a limited history log of file
operations. Snapshots may be compared; files missing or deleted on
either side may be detected this way. Based on this knowledge, patch
files containing all missing files may be created and merged.</p>
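
<p>The snapshot comparison can be pictured with the following Python sketch. It is a deliberate simplification and not <tt>syrep</tt>'s snapshot format: a snapshot is modelled as a plain mapping from path to digest, the history log and deletions are ignored, and the function name <tt>files_missing_remotely</tt> is invented for the example.</p>

```python
def files_missing_remotely(local, remote):
    """Return local paths whose content digest the remote snapshot lacks."""
    known = set(remote.values())
    return sorted(path for path, digest in local.items() if digest not in known)


# Toy snapshots: path -> digest (shortened placeholders, not real MD5 values).
fred = {"a.mp3": "d1", "b.mp3": "d2"}
karl = {"renamed.mp3": "d1", "c.mp3": "d3"}

print(files_missing_remotely(fred, karl))  # ['b.mp3']
print(files_missing_remotely(karl, fred))  # ['c.mp3']
```

<p>Because the comparison is made on digests rather than names, <tt>a.mp3</tt> is not reported as missing on the other side even though it is stored there under a different name.</p>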

<p>To keep the file operation log in a sensible state, it is crucial to
update the snapshot frequently, preferably by adding a new <tt>cron</tt>
job.</p>

<h3>Example usage</h3>

<p>Fred and Karl want to synchronize their digital music libraries by
exchanging a USB hard disk with patch files. As a first step, both
initialize their repositories for usage with <tt>syrep</tt>:</p>

<pre>fred$ syrep -zp --update ~/mp3/
...
karl$ syrep -zp --update ~/mp3/</pre>

<p>Depending on the size of the repositories, this takes a lot of time,
since a message digest is calculated for every file. Since exact
tracking of all file operations on the repository is crucial for
effective synchronization, they both use the waiting time to add a new
entry to their <tt>crontab</tt>:</p>

<pre>1 3 * * * syrep -z --update ~/mp3/</pre>

<p>When the snapshot creation has finished, they send the newly created
snapshot files <tt>~/mp3/.syrep/current.syrep</tt> to each other. As these
snapshots are only about 400 KB in size for an 80 GB repository, they do
that via email:</p>

<pre>fred$ mutt -a ~/mp3/.syrep/current.syrep -s "The current snapshot of fred" karl
...
karl$ mutt -a ~/mp3/.syrep/current.syrep -s "The current snapshot of karl" fred</pre>

<p>When the mails arrive, they both detach the snapshot and create a
patch on their USB hard disk containing all local files not existing on
the other side:</p>

<pre>fred$ mount /mnt/usb
fred$ syrep -p -o /mnt/usb/patch-for-karl --makepatch ~/mp3/ ~/karls-current.syrep
fred$ umount /mnt/usb
...
karl$ mount /mnt/usb
karl$ syrep -p -o /mnt/usb/patch-for-fred --makepatch ~/mp3/ ~/freds-current.syrep
karl$ umount /mnt/usb
</pre>

<p>As a next step they exchange their hard disks. Back at home, they merge the newly acquired patch into their own repository:</p>

<pre>fred$ mount /mnt/usb
fred$ syrep -pT --merge /mnt/usb/patch-for-fred ~/mp3/
fred$ umount /mnt/usb
...
karl$ mount /mnt/usb
karl$ syrep -pT --merge /mnt/usb/patch-for-karl ~/mp3/
karl$ umount /mnt/usb
</pre>

<p>At this point both have the same file hierarchy. To update the
local snapshot log with the newly merged files, they both should run
<tt>--update</tt> now. This update run should be much quicker, since
the message digests of all unchanged files are read from a message
digest cache that is created and updated each time <tt>--update</tt> runs:</p>

<pre>fred$ syrep -zp --update ~/mp3/
...
karl$ syrep -zp --update ~/mp3/</pre>

<p>Some time later Fred got plenty of new music files, while Karl
didn't change anything on his repository. Thus, Fred is able to use
the old snapshot he received from Karl to generate a new patch for
him. He does it exactly the same way he did the last time, see
above.</p>

<p>And now, several iterations of the story described above
follow.</p>

<p>That's the end of the story.</p>

<p>OK, not quite. Sometimes a conflict happens, e.g. both created a
file <tt>foo.mp3</tt> with different contents at the same time. When
this happens, the local copy is always copied into the patch, and the
user may decide during the merge which file version to keep
locally. Because of that, merging is an interactive task and cannot be
automated completely.</p>
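
<p>A name collision of this kind can be sketched in Python as follows. This is a rough illustration under stated assumptions, not <tt>syrep</tt>'s merge logic: snapshots are again modelled as mappings from path to digest, and the hypothetical helper <tt>find_conflicts</tt> ignores the history log, so it cannot tell a real conflict from a one-sided modification.</p>

```python
def find_conflicts(local, remote):
    """Return paths present on both sides whose digests differ."""
    shared = set(local).intersection(remote)
    return sorted(path for path in shared if local[path] != remote[path])


# Toy snapshots: path -> digest (shortened placeholders, not real MD5 values).
fred = {"foo.mp3": "d1", "bar.mp3": "d2"}
karl = {"foo.mp3": "d9", "bar.mp3": "d2", "baz.mp3": "d3"}

print(find_conflicts(fred, karl))  # ['foo.mp3']
```

<p>Every path reported here is one for which the user would have to pick a version during the merge.</p>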

<p>The synchronization operations need not happen in such
a "symmetric" way as described above.</p>

<h2><a name="requirements">Requirements</a></h2>

<p><tt>syrep</tt> requires installed development versions of <tt><a
href="http://www.gzip.org/zlib/">zlib</a></tt> and <tt><a
href="http://www.sleepycat.com/">Berkeley DB</a></tt> 4.2. If you want
to build <tt>syrep</tt> with support for extended attributes (currently
supported on Linux only), you have to install <tt>libattr</tt> and a kernel that supports them.</p>

<p><tt>syrep</tt> was developed and tested on Debian GNU/Linux
"testing" from September 2003; it should work on most other Linux
distributions and maybe on other POSIX implementations, since it uses GNU
autoconf for source code configuration.</p>

<p>Some support for big-endian architectures is included; however, it is incomplete. You're welcome to send me patches.</p>

<p>If the <tt>syrep</tt> build system detects Oliver Kurth's <a
href="http://masqmail.cx/xml2man/"><tt>xmltoman</tt></a> the man page is
rebuilt. Otherwise the pre-compiled versions shipped with
<tt>syrep</tt> are used.</p>

<h2><a name="installation">Installation</a></h2>

<p>As this package is made with the GNU autotools you should run
<tt>./configure</tt> inside the distribution directory for configuring
the source tree. After that you should run <tt>make</tt> for
compilation and <tt>make install</tt> (as root) for installation of
<tt>syrep</tt>.</p>

<h2><a name="acks">Acknowledgements</a></h2>

<p>This software includes an implementation of the MD5 algorithm by
L. Peter Deutsch. Thanks to him for this.</p>

<h2><a name="download">Download</a></h2>

<p>The newest release is always available from <a href="@PACKAGE_URL@">@PACKAGE_URL@</a></p>

<p>The current release is <a href="@PACKAGE_URL@syrep-@PACKAGE_VERSION@.tar.gz">@PACKAGE_VERSION@</a></p>

<p>Get <tt>syrep</tt>'s development sources from the <a href="http://subversion.tigris.org/">Subversion</a> <a href="https://seth.intheinter.net:8081/svn/syrep/">repository</a>. (<a href="http://0pointer.de/cgi-bin/viewcvs.cgi/?root=syrep">viewcvs</a>)</p>

<hr/>

<address class="grey">Lennart Poettering &lt;@PACKAGE_BUGREPORT@&gt;, September 2004</address>

<div class="grey"><i>$Id$</i></div>

</body>
</html>