Notes on Synchronization and Backup of $HOME using git, git-annex and mr

This is a dump of all git, git-annex and mr related configuration and scripts so I can edit them in one place. When tangled,

  • All mr configs will go into their respective folders.
  • All Bash scripts will go into ~/.bin/ folder.

Repository structure,

  • ~/annex/ - git-annex repositories that are synced between computers.
  • ~/source/ - Git repos.
  • /external - Single git-annex repo spread over multiple USB drives, working as a poor man's RAID.

~/annex/

Instead of having a single large annex folder synced across machines, I have split all files into 6 annexes,

  • documents
  • music
  • notes
  • old-code
  • photos

This scheme makes git-annex much faster. At first I was reluctant to go with it: instead of pushing and pulling a single annex, I now have to deal with 6. Then I found out about mr, which lets you run commands on a collection of repositories. Even though there are 6 repos, with mr a single command will push or pull all of them.

These annexes are shared between 3 computers (one with a full copy of all repos and two with partial copies), all behind NAT, so all clients upload data to rsync.net (GPG encrypted with separate keys) and sync changes with a repo hosted on rsync.net.

Whenever I make changes to any of the repos, I just run mr push. It iterates over the repositories with changes (new files, deletes, renames), uploads the changes, and syncs with rsync.net. Then when I switch machines I just run mr pull, which downloads all changes.
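
As a rough illustration of the idea behind mr (not how mr is actually implemented), the loop below visits every git repository under a base directory and runs the same command in each. run_in_repos is a hypothetical helper name introduced only for this sketch.

```shell
#!/bin/bash
# Sketch only: run one command in every git repository directly under a
# base directory. run_in_repos is a made-up helper, not part of mr.
run_in_repos() {
    local base="$1"; shift
    local repo
    for repo in "$base"/*/; do
        # Skip anything that does not look like a git repository.
        [ -d "$repo/.git" ] || continue
        echo "== ${repo%/}"
        # Run the command in a subshell so the caller's directory is untouched.
        ( cd "$repo" && "$@" ) || echo "failed in ${repo%/}" >&2
    done
}

# Example: short status for every annex.
# run_in_repos ~/annex git status --short
```

mr adds a lot on top of this (per-repo overrides, lazy checkouts, parallel jobs), which is why the configs below use it instead of a hand-rolled loop.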

~/.mrconfig

include = cat ~/annex/.mrconfig-annex ~/source/.mrconfig-source

~/annex/.mrconfig

[DEFAULT]
git_sync = git annex add .;git annex sync --content
git_status = git annex status
git_gc = git repack -ad; git gc

lib = 
    initAnnex() {
       git config remote.origin.annex-ignore true
       git annex init "`hostname`"
       git annex untrust here
       git annex enableremote cloud
    }
    fastFsChck() {
      git annex fsck --fast --from "$1"
    }

[annex/documents]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/annex-documents.git' 'documents'
           cd documents/
           initAnnex
skip = lazy
fsck = fastFsChck cloud
       fastFsChck storage-box

[annex/notes]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/annex-notes.git' 'notes'
           cd notes/
           initAnnex
           git annex direct
           git annex get .
fsck = fastFsChck cloud

~/.bin/git-anx-drop-unused

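Drop unused files by age. The first argument is the remote to drop from (default here); the optional second argument is the minimum age in days (default 180, 0 drops everything unused),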

#!/usr/bin/env hy

(import  [sh [grep git perl awk ErrorReturnCode]]
         [re [split]]
         [datetime [datetime date]]
         [sys])

(def remote (if (>= (len sys.argv) 2)
              (second sys.argv)
              "here"))

(def drop-age (if (= (len sys.argv) 3)
                (int (nth sys.argv 2))
                180))

(defn unused-files []
  (let [[files (try 
                (-> (.annex git "unused" "--from" remote)
                    (perl "-ne" "print if /^    [0-9]+.*/")
                    str)
                (catch [e ErrorReturnCode] ""))]
        [unused-files (->> files 
                           (split "\n")
                           (map (fn [x] 
                                  (->> (.strip x)
                                       (split " +")
                                       (take 2)
                                       (map (fn [x] (.strip x))))))
                           (filter (fn [x] 
                                     (= (len x) 2)))
                           list)]]
    (print "Unused files: " (len unused-files))
    unused-files))

(defn last-seen [file]
  (let [[key (second file)]]
    (->> (git "--no-pager" "log" "-1" "-S" key "--pretty=format:%at")
         str
         (split "\n")
         (map (fn [x] (.fromtimestamp datetime (float x))))
         first)))

(defn age [file]
  (let [[delta (- (.today datetime) (last-seen file))]]
    delta.days))

(print "Dropping " remote)

(if (= drop-age 0)
  (for [file (unused-files)]
    (let [[id (first file)]]
      (print "Id " id)
      (if (= remote "here")
        (.annex git "dropunused" "--force" (str id))
        (.annex git "dropunused" "--force" "--from" remote (str id)))))
  (for [file (unused-files)]
    (let [[id (first file)]
          [file-age (try 
                     (age file)
                     (catch [e Exception] -1))]]

      (if (>= file-age drop-age)
        (do 
         (print "Id " id " age " file-age " days...")
         (if (= remote "here")
           (.annex git "dropunused" "--force" (str id))
           (.annex git "dropunused" "--force" "--from" remote (str id))))))))
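
The two core steps of the script above, picking the numbered entries out of the git annex unused listing and turning a commit timestamp into an age in days, can be sketched in plain shell. This is only an illustration under assumptions: parse_unused and age_in_days are hypothetical names, and the sample key in the comment is made up.

```shell
#!/bin/bash
# parse_unused: read "git annex unused" output on stdin and print
# "NUMBER KEY" for every entry line (four spaces followed by a number,
# the same pattern the Hy script filters on).
parse_unused() {
    awk '/^    [0-9]+/ { print $1, $2 }'
}

# age_in_days: whole days between a Unix timestamp and "now".
# The second argument is optional and defaults to the current time.
age_in_days() {
    local seen="$1"
    local now="${2:-$(date +%s)}"
    echo $(( (now - seen) / 86400 ))
}

# Example with made-up input (not a real key):
# printf '    1       SHA256E-s10--deadbeef\n' | parse_unused
# The timestamp would come from something like:
# git --no-pager log -1 -S "$key" --pretty=format:%at
```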

~/.bin/git-fast-push

Custom push command. For repositories with no changes it simply returns true, for repositories with changes or new files,

  • If acting on a regular git repo, pushes changes to origin.
  • If acting on a git annex repo, uploads changes and syncs with rsync.net.

#!/bin/bash

updateAnnexHost() {
    echo 'Updating Remote...'
    ORIGIN=`git config --get remote.origin.url`
    HOST=`echo "$ORIGIN" | grep -oiP '//.*?\/' | cut -d/ -f3`
    DIR="/${ORIGIN#*//*/}"
    echo "$HOST $DIR"
    # Quote both values so paths with spaces survive, and only sync if cd worked.
    ssh "$HOST" "cd '$DIR' && git annex sync"
}

hasNoChanges(){
    git diff-index --quiet HEAD --
}

hasNewFiles(){
    # True when there are untracked, non-ignored files.
    [ -n "$(git ls-files --exclude-standard --others)" ]
}

isRepoAhead(){
    # True when HEAD has commits that origin's copy of the current branch lacks.
    [ -n "$(git log "origin/$(git rev-parse --abbrev-ref HEAD)..HEAD" 2>/dev/null)" ]
}

#handle direct annex repo
if [ "$(git config --get annex.direct)" = "true" ]; then
    oldHead=`git rev-parse HEAD`
    git annex add .
    git annex sync
    newHead=`git rev-parse HEAD`
    if [ "$oldHead" != "$newHead" ]; then
        if git config remote.depot.annex-uuid; then
            git annex copy --to depot --not --in depot
            git annex sync
        else
            git annex copy --to origin --not --in origin
            updateAnnexHost
        fi
    fi
    exit
fi

if ! hasNoChanges || hasNewFiles || isRepoAhead; then 
#handle indirect annex repo
    if [ -d '.git/annex/' ]; then    
        git annex add .
        git annex sync
        if git config remote.depot.annex-uuid; then
            git annex copy --to depot --not --in depot
            git annex sync
        else
            git annex copy --to origin --not --in origin
            updateAnnexHost
        fi
        exit
#handle plain git repo        
    else
        git push origin master
    fi
else
    true
fi
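
updateAnnexHost relies on grep -P, which is a GNU extension and may not be available everywhere (for example with the stock grep on OS X). As a possible alternative, the sketch below splits the remote URL using only shell parameter expansion. It assumes the URL has the form scheme://host/path; split_remote_url is a hypothetical name.

```shell
#!/bin/bash
# split_remote_url: print "HOST DIR" for a URL shaped like scheme://host/path.
# Assumption: the URL contains "://" and at least one "/" after the host.
split_remote_url() {
    local url="$1"
    local rest="${url#*://}"    # drop the "scheme://" prefix
    local host="${rest%%/*}"    # everything before the first "/"
    local dir="/${rest#*/}"     # everything after it, as an absolute path
    echo "$host $dir"
}

# Example (made-up host):
# split_remote_url "$(git config --get remote.origin.url)"
```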

Mount / Unmount EncFS Volumes

Scripts for mounting and unmounting EncFS Volumes.

#!/bin/bash
# Mount: fetch and unlock the annexed EncFS data, then mount it under /Volumes.

CUR_DIR=`pwd`
cd "$1"
DIR=$(basename "$1")
mkdir "/Volumes/$DIR"
git annex get .
git annex unlock "."
encfs "$CUR_DIR/${1}" "/Volumes/$DIR"
cd "$CUR_DIR"

#!/bin/bash
# Unmount: unmount the volume, then add and commit the changed files.

CUR_DIR=`pwd`
DIR=$(basename "$1")
if umount "/Volumes/$DIR"; then
    rm -rf "/Volumes/$DIR"
fi
cd "$1"
git annex add .
git annex add .encfs6.xml
git commit -m 'Update'
cd "$CUR_DIR"

Webapp

Create the autostart file. Relative paths don't work, so tangle one file for each OS (Linux, OS X), then mv the right one to the correct location,

/home/nakkaya/annex/notes
/home/nakkaya/annex/documents
/Users/nakkaya/annex/notes
/Users/nakkaya/annex/documents
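
Since the two lists differ only in the home directory, one possible alternative is to generate the file from $HOME at setup time. The sketch below is an assumption-laden illustration: print_autostart is a hypothetical name, and the target path in the comment is where I believe git-annex reads its autostart list, so verify it against your version.

```shell
#!/bin/bash
# print_autostart: emit one absolute repository path per line, built from
# $HOME so the same snippet covers Linux (/home/...) and OS X (/Users/...).
print_autostart() {
    local name
    for name in "$@"; do
        printf '%s\n' "$HOME/annex/$name"
    done
}

# Assumed target location, check before relying on it:
# print_autostart notes documents > ~/.config/git-annex/autostart
```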

Start assistant and webapp,

git annex assistant --autostart && nohup git annex webapp

Misc

Setup encrypted annex directory remote,

git annex initremote mobile type=directory directory=/path/to/annex/repo/ encryption=hybrid keyid=ID embedcreds=yes

Setup encrypted annex S3 remote in EU (Ireland) (eu-west-1),

export AWS_ACCESS_KEY_ID="KID"
export AWS_SECRET_ACCESS_KEY="SKEY"
git annex initremote cloud type=S3 encryption=hybrid keyid=ID embedcreds=yes datacenter=eu-west-1 chunk=250MiB
git setup-bitbucket
git config remote.origin.annex-ignore true

Setup encrypted annex S3 remote in Google Cloud Storage,

git annex initremote cloud type=S3 encryption=hybrid keyid=ID embedcreds=yes host=storage.googleapis.com port=80 chunk=250MiB

Setup encrypted annex rsync remote,

git annex initremote depot type=rsync encryption=hybrid rsyncurl=rsync:annex/repo/ keyid=ID

~/source/

~/source/.mrconfig

Git Repos,

[DEFAULT]
git_pull = git pull origin master
git_push = git fast-push
git_status = git status --short
sync = git pull && git push

[source/latte]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/latte.git' 'latte'
skip=lazy

[source/alter-ego]
checkout = git clone 'git@github.com:nakkaya/alter-ego.git' 'alter-ego'
skip=lazy

[source/ardrone]
checkout = git clone 'git@github.com:nakkaya/ardrone.git' 'ardrone'
skip=lazy

[source/clodiuno]
checkout = git clone 'git@github.com:nakkaya/clodiuno.git' 'clodiuno'
skip=lazy

[source/easy-dns]
checkout = git clone 'git@github.com:nakkaya/easy-dns.git' 'easy-dns'
skip=lazy

[source/emacs]
checkout = git clone 'git@github.com:nakkaya/emacs.git' 'emacs'
           cd emacs
           git submodule init
           git submodule update

[source/inbox-feed]
checkout = git clone 'git@github.com:nakkaya/inbox-feed.git' 'inbox-feed'
skip=lazy

[source/nakkaya.com]
checkout = git clone 'git@github.com:nakkaya/nakkaya.com.git' 'nakkaya.com'
skip=lazy

[source/net-eval]
checkout = git clone 'git@github.com:nakkaya/net-eval.git' 'net-eval'
skip=lazy

[source/neu-islanders]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/neu-islanders.git' 'neu-islanders'
skip=lazy

[source/static]
checkout = git clone 'git@github.com:nakkaya/static.git' 'static'
skip=lazy

[source/vector-2d]
checkout = git clone 'git@github.com:nakkaya/vector-2d.git' 'vector-2d'
skip=lazy

[source/doganilic.com]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/doganilic.com.git' 'doganilic.com'
skip=lazy

[source/ansible-docker-build]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/ansible-docker-build.git' 'ansible-docker-build'
skip=lazy

[source/ansible-storage]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/ansible-storage.git' 'ansible-storage'
skip=lazy

[source/ansible-base]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/ansible-base.git' 'ansible-base'
skip=lazy

[source/ansible-backup]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/ansible-backup.git' 'ansible-backup'
skip=lazy

[source/control-toolbox]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/control-toolbox.git' 'control-toolbox'
skip=lazy

[source/solarcar-tracker]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/solarcar-tracker.git' 'solarcar-tracker'
skip=lazy

[source/solarcar-turn-indicator]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/solarcar-turn-indicator.git' 'solarcar-turn-indicator'
skip=lazy

[source/ferret]
checkout = git clone 'https://git.nakkaya.com/git/nakkaya/ferret.git' 'ferret'
skip=lazy
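
Most stanzas above follow the same three-line pattern, so they could be generated rather than typed. The function below is a minimal sketch of that idea, not how this file is actually produced; mr_stanza is a hypothetical name, and repos with extra checkout steps (such as emacs) would still need hand-written entries.

```shell
#!/bin/bash
# mr_stanza: print a lazy-checkout stanza for one repository.
#   $1 - base URL without the trailing repo name
#   $2 - repository name
mr_stanza() {
    local base_url="$1" name="$2"
    printf "[source/%s]\n" "$name"
    printf "checkout = git clone '%s/%s.git' '%s'\n" "$base_url" "$name" "$name"
    printf "skip=lazy\n\n"
}

# Example: regenerate a few entries.
# for r in ferret latte; do
#     mr_stanza 'https://git.nakkaya.com/git/nakkaya' "$r"
# done
```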