Appendix III. How to Backup
Basic
See "Backup your work - Basic" in Getting Started.
Advanced: git, rsync & crontab
Tips
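Storage on the lab machines uses RAID or a similar disk-array technology, so no data is lost even if two disks fail at the same time (which is very unlikely). Based on years of experience, ordinary data generally does not need a separate backup.
For code and other important program files, we recommend syncing to GitHub with git frequently (for example, every day). You can write a script and use crontab to commit a backup to GitHub automatically each day; or update manually whenever you finish a revision; or use git hooks and third-party tools such as git-auto-sync or git-sync-on-inotify to sync changed files automatically (see below, or look up a method with an AI assistant). Any of these is fine, depending on personal preference.
For very important and large data files, you can use rsync to back up to a different storage device (for example another storage machine, a portable hard drive, or the lab's Synology NAS; the backup device should ideally be in a different building). From practical experience, we no longer recommend automatic daily or weekly backups of large data files, because they are of little practical use. Instead, back up the most important data manually, at a frequency that matches the progress of your project.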
1) git - backup code
You can find detailed instructions for using git in the official GitHub documentation.
1.1) Setup git in Linux/Unix/Mac
Setup: add a setting file: ~/.gitconfig
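The same settings can be written from the command line instead of editing the file by hand. A minimal sketch, assuming placeholder values for the name and email address (replace them with your own):

```shell
# Write the user identity into the global git configuration (typically ~/.gitconfig).
# ASSUMPTION: "Your Name" and the email address are placeholders.
git config --global user.name "Your Name"
git config --global user.email "your_email@example.com"

# Print the stored values to confirm the setup.
git config --global --get user.name
git config --global --get user.email
```

After these commands, ~/.gitconfig typically contains a [user] section with the name and email entries.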
Clone/download an existing repository on GitHub
Tips:
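Using Git: avoid typing your username and password on every commit (somewhat tedious; the Git extension in VS Code, linked below, is recommended instead)
[Recommended] Git in VS Code (after you sign in to your GitHub account in VS Code, the credentials and login state are saved automatically; once you are signed in locally, you do not need to enter a username or password even when operating on a remote git repository from VS Code, so this method is simple)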
Create a new repository
Sync local files with the GitHub repo
Pull (update):
Add:
Change:
Remove:
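The pull / add / change / remove cycle above can be sketched end to end as follows. This is only an illustration: it assumes a local bare repository standing in for the GitHub remote so that the commands run offline, and the file names and commit messages are made up.

```shell
# ASSUMPTION: a local bare repository stands in for the GitHub remote so that
# the demo runs offline; with GitHub, REMOTE would be the repository URL.
WORK=$(mktemp -d)
REMOTE="$WORK/remote.git"
git init -q --bare "$REMOTE"

# Clone the (still empty) repository and set a demo identity for its commits.
git clone -q "$REMOTE" "$WORK/repo"
cd "$WORK/repo" || exit 1
git config user.name "Demo User"
git config user.email "demo@example.com"

# First commit; -u records the upstream so plain pull/push work afterwards.
echo "# demo" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q -u origin HEAD

# Pull (update): fetch and fast-forward to the remote state.
git pull -q --ff-only

# Add: create a file, stage it, commit, push.
echo "print('hello')" > analysis.py
git add analysis.py
git commit -q -m "Add analysis.py"
git push -q

# Change: edit a tracked file; -a stages all modified tracked files.
echo "print('hello again')" >> analysis.py
git commit -q -a -m "Update analysis.py"
git push -q

# Remove: delete the file from the working tree and from the index.
git rm -q analysis.py
git commit -q -m "Remove analysis.py"
git push -q
```

With a real GitHub repository, only the clone URL changes; the add, commit, push and pull commands stay the same.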
1.2) Tips for using git
Tip 1: git-sync.sh
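A minimal sketch of what such a sync script might look like. It is not the lab's actual git-sync.sh: the function name git_sync and the commit message format are hypothetical, and it assumes the repository already has an upstream branch configured (for example after a first git push -u).

```shell
#!/bin/sh
# git-sync.sh (hypothetical sketch): commit all local changes, then sync with the remote.
# Usage inside the script: git_sync /path/to/repo
git_sync() (
    cd "$1" || exit 1
    # Stage new, modified and deleted files.
    git add -A || exit 1
    # Commit only when something is staged, so repeated runs create no empty commits.
    if ! git diff --cached --quiet; then
        git commit -q -m "Auto backup $(date '+%Y-%m-%d %H:%M:%S')" || exit 1
    fi
    # Replay local commits on top of the remote state, then publish them.
    git pull -q --rebase || exit 1
    git push -q
)

# Example call (the path is a placeholder):
# git_sync "$HOME/my-project"
```

The function body runs in a subshell, so the cd does not affect the caller, and a failing step stops the sync with a non-zero exit status that cron can report.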
Tip 2: clone a private repo
Methods in this tip were generated by AI. They have not been tested yet.
Cloning a private Git repository requires authentication to confirm that you have access. You can authenticate through GitHub Desktop, VS Code, HTTPS with a Personal Access Token (PAT), or SSH keys.
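Method 1. Git integrated in VS Code [Recommended]
Git in VS Code (after you sign in to your GitHub account in VS Code, the credentials and login state are saved automatically; once you are signed in locally, you do not need to enter a username or password even when operating on a remote git repository from VS Code, so this method is simple)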
Method 2. Using HTTPS with a Personal Access Token (PAT)
Generate a Personal Access Token (PAT):
Navigate to your Git hosting service (e.g., GitHub, GitLab) settings.
Find the "Developer settings" or "Access tokens" section.
Generate a new PAT, ensuring it has the necessary permissions (e.g., the repo scope for GitHub) to clone repositories. Copy the generated token immediately, as it usually won't be shown again.
Clone the repository:
Open your terminal or command prompt.
Navigate to the directory where you want to clone the repository.
Use the git clone command with the HTTPS URL of the repository. When prompted for a password, paste your PAT instead of your account password.
Method 3. Using SSH Keys
Generate an SSH Key Pair:
Open your terminal or command prompt.
Generate an SSH key pair using ssh-keygen.
Follow the prompts, optionally setting a passphrase for added security.
Add the Public Key to your Git Hosting Service:
Copy the content of your public key (usually ~/.ssh/id_rsa.pub).
Navigate to your Git hosting service settings (e.g., GitHub, GitLab).
Find the "SSH and GPG keys" or "SSH Keys" section.
Add a new SSH key, giving it a descriptive title and pasting the content of your public key.
Clone the repository:
Open your terminal or command prompt.
Navigate to the directory where you want to clone the repository.
Use the git clone command with the SSH URL of the repository, like git@github.com:user/repo.git
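The key-generation step of Method 3 could look like the sketch below. Like the rest of this tip it is untested against a real account. The key file name, the ed25519 key type and the email comment are assumptions chosen for illustration, and -N "" creates the key without a passphrase (drop it to be prompted for one).

```shell
# ASSUMPTION: KEY_FILE and the comment string are placeholders; pick your own.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_ed25519_github}"
mkdir -p "$(dirname "$KEY_FILE")" && chmod 700 "$(dirname "$KEY_FILE")"

# Generate the key pair only if it does not exist yet, so an existing key is never overwritten.
[ -f "$KEY_FILE" ] || ssh-keygen -q -t ed25519 -C "your_email@example.com" -f "$KEY_FILE" -N ""

# Print the public key so it can be pasted into the hosting service's SSH key settings.
cat "$KEY_FILE.pub"

# After adding the key, clone over SSH (user/repo is a placeholder):
#   GIT_SSH_COMMAND="ssh -i $KEY_FILE" git clone git@github.com:user/repo.git
```

Only the .pub file is uploaded; the private key file stays on your machine.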
Tip 3: Automatically sync with git
Purpose: Automatically sync local changes to a remote GitHub repository.
Methods:
Git Hooks: You can set up Git hooks (e.g., post-commit, post-merge) in your local repository to automatically push changes to the remote GitHub repository after commits or pulls. This requires scripting and careful configuration to avoid unintended pushes.
External Tools: Tools like git-auto-sync (as found on GitHub) can run as background daemons, monitoring local repositories for changes and automatically syncing them with the remote.
Scheduled Tasks using crontab: Set up a scheduled task to periodically execute git pull and git push commands, pulling remote changes and pushing local commits.
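As an illustration of the Git hooks method, the sketch below installs a post-commit hook that pushes after every commit. The helper name install_auto_push_hook is hypothetical, and the hook assumes the current branch already has an upstream on the remote.

```shell
# install_auto_push_hook REPO_DIR (hypothetical helper)
# Writes an executable post-commit hook into REPO_DIR/.git/hooks.
install_auto_push_hook() (
    hook="$1/.git/hooks/post-commit"
    mkdir -p "$1/.git/hooks" || exit 1
    cat > "$hook" <<'EOF'
#!/bin/sh
# Runs after every successful commit and pushes the current branch to its upstream.
git push -q || echo "post-commit: push failed, the commit is kept locally" >&2
EOF
    chmod +x "$hook"
)

# Example call (the path is a placeholder):
# install_auto_push_hook "$HOME/my-project"
```

Because every commit is published immediately, this suits a personal backup repository better than a shared branch, where unintended pushes are harder to undo.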
2) rsync - backup large data files
2.1) Set up an SSH key (only needed for remote backups)
Purpose: log in to the remote server over SSH without typing a password.
You do not need to set up an SSH key if you only back up files between local directories. In that case, go directly to step 2.2.
(a) Generate an SSH key (with ssh-keygen)
(b) Copy your public key to the target server (for example with ssh-copy-id)
2.2) Prepare a backup script with rsync
(a) First, prepare the backup directories
(b) Then, write a backup script, for example ~/backup.sh
(c) Last, make your backup.sh executable
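Parameters of rsync (use man rsync to see more details):

| Parameter | Meaning |
| --- | --- |
| -a | Archive mode: transfer files recursively and preserve symlinks, permissions, modification times, owner and group |
| --delete | Delete files on the receiving side that no longer exist on the sending side |
| -q | Quiet mode: suppress non-error output |
| -z | Compress file data during the transfer |
| -H | Preserve hard links |
| -t | Preserve modification times. By default rsync compares the size and modification time of each file on both sides and skips files that match, so preserved times let later runs skip unchanged files |
| -I | Ignore times: do not skip files that match in size and modification time, so every file is synced |
| --port=PORT | Port number to use when connecting to an rsync daemon |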
3) crontab - schedule a sync/backup task
Purpose: run scheduled jobs automatically.
You can use Crontab Generator, or edit your crontab yourself by running crontab -e and adding one line per job in the six-column format explained in the table below.
For an example of git-sync.sh, see Tip 1 in section 1.2.
This table explains the value in each column:
| Column | Meaning |
| --- | --- |
| Column 1 | Minute, 0 to 59 |
| Column 2 | Hour, 0 to 23 (0 means midnight) |
| Column 3 | Day of the month, 1 to 31 |
| Column 4 | Month, 1 to 12 |
| Column 5 | Day of the week, 0 to 7 (0 and 7 both mean Sunday) |
| Column 6 | Command to run |
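As an illustration of the six columns, the crontab entries below are a sketch with assumed script paths and log file names, not the lab's actual schedule. The first runs a sync script every day at 02:00; the second runs a backup script every Sunday at 03:30.

```text
# min hour day month weekday command
0 2 * * * $HOME/git-sync.sh >> $HOME/git-sync.log 2>&1
30 3 * * 0 $HOME/backup.sh >> $HOME/backup.log 2>&1
```

Redirecting output to a log file keeps a record of each run, which helps when a scheduled job fails silently.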
4) More Reading for advanced users
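《鸟哥的Linux私房菜-基础学习篇》 (VBird's Linux book, basics volume; recommended sections from Chapter 25)
Recommended Linux sections:
Chapter 25, Linux Backup Strategies: 25.2.2 Full backup with differential backup; 25.3 VBird's backup strategy; 25.4 Disaster recovery considerations; 25.5 Key points review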