From: Simon
To: gentoo-user@lists.gentoo.org
Date: Wed, 22 Jul 2009 11:09:22 -0400
Message-ID: <5f14cf5e0907220809ud14a99dq81950fba1c45b495@mail.gmail.com>
Subject: [gentoo-user] File synchronisation utility (searching for/about to program it)

Hi there!

I was about to jump into programming my own sync utility when I thought: maybe I should ask if it exists first! Also, this is not really Gentoo-related: it doesn't deal with the OS or Portage... I'm rather asking the venerable community at large, so excuse me if you find this post inappropriate (but can you suggest a more appropriate audience?).

There are lots of sync utilities out there, but my search hasn't found the one utility that has all the features I require. Most lack some of these features, and some have undesirable limitations... I'm currently using Unison for all my sync needs. It's the best I've found so far, but it is very limited in some respects and it's a bit painful on my setup.

To be clear, I refuse to even consider network filesystems. The reason is that I need each computer to be fully independent of the others: I sync my important files so as to have a working backup on all my PCs (my laptop breaks? Fine, I just start my desktop and continue working transparently, well, with the last sync'ed files). Any kind of NFS could be considered for doing the file transfers, but I don't think any of them can compete with rsync, so they're out of the question.

Now, I know some of you will have the reflex to say: try such-and-such tool, it supports 4 out of your 5 requirements. Or: try such-and-such tool, it supports them all, but you'll have to bend things a bit to make it work like you want... I'm looking for the perfect solution, and if it doesn't exist, well, I'm about to code it in C or C++. I have the design ready, and the concept is very simple yet provides all my features.
I wish to publish the result as open-source software (probably under a license like BSD or maybe LGPL; maybe, but hopefully not, GPL). What I'm about to code will be compatible with Linux and Mac OS X for sure; a port to Windows will require some dumb extensions (such as Windows-path to Unix-path conversion, and file transfer support). It will have very few dependencies.

My project intends to use rsync for the transfer, so it will basically extend rsync with all my required features. Rsync does the transfer: I can't compete with how good rsync is at transferring (it works through ssh, rsh, or its own daemon, does differential transfers, transfers attributes/ownership...). But my project will be better at finding what needs to be transferred and what needs to be deleted, on as many computers as you want and in one shot.

Here are the features that I seek/require (and that I will be programming if no utility can provide them all; the list is actually longer, but I can live without the items not written here):

-Little space requirements: I could use rsync to make an incremental backup using hardlinks and basically just copy whatever is "new" on each replica, but this takes way too much space and still doesn't deal with deletes properly. For example, a file is on A and B, gets deleted on A and on B, and is then recreated on B. In reality we have a new file on B, but rsync might want to delete this new file on B, thinking it's the file that got deleted on A. Unison works admirably here: it finds that the first file effectively got deleted on both sides (nothing to do) and that a new file appeared on B which needs to be transferred to A. The space Unison uses to cache its data is about 100 MB now, and I haven't cleaned it since I started using it; I believe more than half of it could be removed, and even 100 MB still represents only about 1% of what is sync'ed.

-Server-less: I don't want to maintain a server on even a single computer.
I like Unison since it executes the server through ssh only when used: it's never listening, and it's never started at boot time. This is excellent behavior and simplifies maintenance.

-Bidirectional pair-wise sync: Meaning I can start the sync from host A or from host B; the process should be the same, should take the same amount of time, and the result should be the same. I should never have to care where the sync is initiated. (Unison doesn't support this, but it's OK to sync from both directions; it's just not optimised.)

-Star topology: Or any topology that allows syncing multiple computers at once... I'm tired of doing several pair-wise syncs, since to do a full sync of my 3 computers (called A, B and C), I first have to sync A->B and A->C. At this point A contains all the diffs and is sync'ed, but I have to do A->B and A->C once more to sync the others (i.e. so that B gets C's modifications).

-Anarchic mode: Hehe, however you call it. Using the same 3 hosts, I'd like to be able to do a pair-wise sync between A->B, A->C and also B->C, to have the sync process decentralised... This is possible with Unison, but of course I have to ssh to the remote host that I want to sync with another remote host.

-Intelligent conflict resolution: Let's face it, the sync utility wasn't gifted with artificial intelligence, so why bother? It should depend on the user's intelligence, but it should depend on it intelligently. Meaning, it should remember (if the user wants it to) the resolution of a given conflict, so as to always resolve it this way. This could effectively help in having some files mirrored from A->B, some others mirrored from B->A, some others backed up before being overwritten, and some that would always require user interaction (like my current project's files)... This is a matter of preference, and any utility that doesn't understand this works against me.
No tool I've encountered supports this. Unison could do some of these, but I would have to break the sync'ing process into multiple smaller syncs. Most tools will just spit out a list of all conflicts and ask whether to keep local, keep remote, ignore, or cancel, and this for each and every conflict (the list is long, the cancel option is tempting!).

-Friendly config/maintenance: I have the friendly user in mind (me), meaning the tool should be user-friendly! User-friendly doesn't mean a graphical interface with lots of eye candy (this makes people fat; it's hostile to me, not friendly at all!). Rather, I like to have only one config file to edit for all my needs, or a directory containing one level of files: a few files, each logically separated (think of /etc/portage), and most of all documented and intuitive.

These are the features I need most. I am tired of 'working around' limitations or missing features. I am tired of having to do multiple syncs to get my whole house up to date.

And finally, thanks to those who were interested enough in my post to read as far as here (unless you jumped straight here, but thank you still for taking the time!). I'm desperate to create a project that will be useful to me and hopefully to others too. I'm a very good C/C++/PHP/JS programmer, but I could only rarely find work in that field since I have no diploma (a high school diploma from 10 years ago, that's all). Due to some illness I've lived a terribly unstable life, and I've had an exploratory tendency in development, meaning I've started about 10K projects but finished none. I have published nothing so far... In other words, I am nobody, and for companies I am a risk. Even if I ask for half the usual salary, it's still a division by zero: salary divided by zero credibility (i.e. no diploma and no work experience). If I can build this project (on my own for a start) and publish it, I think it would help me a lot professionally.
Also, once the first version is out, I'll clearly welcome patches from the community, and working with a team will help even more.

Also, very important to note: I am currently unemployed, collecting unemployment insurance as income. I still have about 2 months of free time left to get my professional situation back on track, and these 2 months of my expertise are more than enough to get a good, stable beta version of this project. But I need to get it started; I must be convinced this is the right choice.

Thanks for reading, hoping to read your answers!

Simon
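
The delete case described under "Little space requirements" can be sketched as a three-way comparison against a record of the last synced state. This is only a minimal illustration under stated assumptions, not the design mentioned in the post and not Unison's actual implementation: the `State` mapping, the `reconcile` function and the action strings are all hypothetical names, and each replica is modelled as a simple path-to-fingerprint dictionary.

```python
from typing import Dict, List, Tuple

# Hypothetical model: a replica state maps a relative path to a content
# fingerprint (for example a hash). A missing key means the file is absent.
State = Dict[str, str]


def reconcile(last_sync: State, a: State, b: State) -> List[Tuple[str, str]]:
    """Return (path, action) pairs for a pair-wise sync of replicas A and B.

    last_sync is the state both replicas agreed on after the previous sync.
    """
    actions: List[Tuple[str, str]] = []
    for path in sorted(set(last_sync) | set(a) | set(b)):
        base = last_sync.get(path)
        in_a = a.get(path)
        in_b = b.get(path)
        if in_a == in_b:
            # Same content on both sides, or deleted on both: nothing to do.
            continue
        if in_a == base:
            # A is unchanged since the last sync, so only B changed.
            actions.append((path, "copy B->A" if in_b is not None else "delete on A"))
        elif in_b == base:
            # B is unchanged since the last sync, so only A changed.
            actions.append((path, "copy A->B" if in_a is not None else "delete on B"))
        else:
            # Both sides changed in different ways: leave it to the user.
            actions.append((path, "conflict"))
    return actions


if __name__ == "__main__":
    # The file was already gone from the last synced state, then recreated on B.
    print(reconcile({}, {}, {"notes.txt": "v2"}))
    # The file was deleted on A only; B still has the synced version.
    print(reconcile({"notes.txt": "v1"}, {}, {"notes.txt": "v1"}))
```

With the recorded state, a file that exists only on B is recognised as new when the record has no entry for it, and as deleted on A when the record still matches B's copy. A tool that compares only the two replicas cannot tell these cases apart. If the file was deleted on A and changed on B since the last sync, this sketch reports a conflict rather than guessing.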