---
layout: ../../layouts/UseCaseLayout.astro
title: "Backup GitHub Repositories with Gitea Mirror"
description: "Run a homelab-friendly playbook to mirror GitHub into self-hosted Gitea with automated schedules, health checks, and restore drills."
canonical: "https://gitea-mirror.com/use-cases/backup-github-repositories/"
---
# Backup GitHub Repositories with Gitea Mirror
## Why homelabbers care
GitHub is great—right up until an outage, SSO change, or account lockout strands your projects. Gitea Mirror keeps a self-hosted copy of everything (history, metadata, LFS) so you can keep working locally. This playbook walks through the minimal Docker setup the project ships with and shows how to prove your backups actually work.
## Requirements
- Docker Engine and Compose on the host that will run the mirror
- A GitHub personal access token (classic) with `repo`, plus the `read:org` checkbox under `admin:org` when you mirror organizations (leave the write/admin boxes unchecked)
- A self-hosted Gitea instance (can be on the same box) and admin or org owner credentials
- Open ports 4321 (web UI) and 3000 (default Gitea) inside your network
## Step-by-step
### 1. Clone the repo and start the stack
```bash
git clone https://github.com/RayLabsHQ/gitea-mirror.git
cd gitea-mirror
docker compose -f docker-compose.alt.yml up -d
```
The `alt` compose file ships with sane defaults for a single-node backup mirror. It stores data in `./data`. To use a different path, edit the volume mapping (for example `- /srv/gitea-mirror:/app/data`).
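If you would rather not edit the shipped file, one option is a small override file. The sketch below assumes the service is named `gitea-mirror` (the name used in the log command in this guide) and that Compose merges volume entries by container path; `write_override` and the output filename are hypothetical helpers for illustration.

```bash
# write_override <host-path> [output-file]
# Writes a Compose override that maps <host-path> to /app/data.
# Assumption: the service in docker-compose.alt.yml is called "gitea-mirror".
write_override() {
  host_path="$1"
  out="${2:-docker-compose.override.yml}"
  cat > "$out" <<EOF
services:
  gitea-mirror:
    volumes:
      - ${host_path}:/app/data
EOF
}

# Usage (not run here):
#   write_override /srv/gitea-mirror
#   docker compose -f docker-compose.alt.yml -f docker-compose.override.yml config
#   docker compose -f docker-compose.alt.yml -f docker-compose.override.yml up -d
```

Running `config` first prints the merged result, so you can confirm the data path before starting the stack.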
Verify the containers:
```bash
docker compose -f docker-compose.alt.yml ps
docker compose -f docker-compose.alt.yml logs -f gitea-mirror
```
Wait for "Server started" before moving on.
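To script that wait instead of watching the log by eye, a minimal sketch is a filter that stops at the first matching line. `wait_for_line` is a hypothetical helper, not part of the project.

```bash
# wait_for_line <text>
# Echoes lines from stdin and returns 0 as soon as one contains <text>.
# Returns 1 if the stream ends without a match.
wait_for_line() {
  pattern="$1"
  while IFS= read -r line; do
    printf '%s\n' "$line"
    case "$line" in
      *"$pattern"*) return 0 ;;
    esac
  done
  return 1
}

# Usage (not run here):
#   docker compose -f docker-compose.alt.yml logs -f gitea-mirror | wait_for_line "Server started"
```

Because `logs -f` keeps streaming, the left side of the pipe may only exit when it next writes a line; wrapping it in coreutils `timeout` is one way to bound the wait.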
### 2. Generate tokens and connect GitHub
1. Create a GitHub personal access token (classic) with `repo` enabled and, inside the `admin:org` section, check `read:org` so the mirror can list organization repositories—leave `write:org` and `admin:org` unchecked.
2. Log in to Gitea and create an access token for an admin/owner account with `write:repository`.
3. Visit `http://<host>:4321` and sign up—the first user becomes admin.
4. Complete the setup wizard:
   - Paste the GitHub PAT and Gitea URL/token.
   - Choose which GitHub owners (user/org) to track.
   - Leave sync interval at the default 1 hour to start.
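Before pasting the tokens into the wizard, it can help to check them from a shell. The sketch below relies on the `X-OAuth-Scopes` response header that GitHub returns for classic tokens and on Gitea's `/api/v1/user` endpoint. The function names are hypothetical, and the Gitea URL in the usage comment is an assumed example.

```bash
# github_token_scopes <token>
# Prints the scope list GitHub reports for a classic PAT, e.g. "repo, read:org".
github_token_scopes() {
  curl -fsS -o /dev/null -D - \
    -H "Authorization: Bearer $1" \
    https://api.github.com/user |
    tr -d '\r' |
    awk -F': ' 'tolower($1) == "x-oauth-scopes" { print $2 }'
}

# has_scope <scope-list> <scope>
# Returns 0 when <scope> appears in the comma-separated <scope-list>.
has_scope() {
  wanted="$2"
  old_ifs="$IFS"
  IFS=','
  for item in $1; do
    item="$(printf '%s' "$item" | tr -d '[:space:]')"
    if [ "$item" = "$wanted" ]; then
      IFS="$old_ifs"
      return 0
    fi
  done
  IFS="$old_ifs"
  return 1
}

# Usage (not run here):
#   scopes="$(github_token_scopes "$GITHUB_TOKEN")"
#   has_scope "$scopes" repo     || echo "token is missing repo"
#   has_scope "$scopes" read:org || echo "token is missing read:org"
#   curl -fsS -H "Authorization: token $GITEA_TOKEN" "http://<gitea-host>:3000/api/v1/user"
```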
### 3. Stage your first backup job
On the dashboard:
1. Click **Mirror Repository** for a small test project.
2. Open Gitea and confirm the mirror appears with the right owner/org.
3. In **Configuration → Connections**, open the **Content & Data** section to enable **Mirror metadata** and **Git LFS** if you rely on issues, wikis, or large assets.
For broader coverage, switch the organization strategy to **Preserve structure** so Gitea mirrors your GitHub org layout automatically.
### 4. Turn on automatic syncs and cleanup
Open **Configuration → Automation** in the web UI.
- Enable **Automatic syncing** and pick an interval that matches how fresh you want the mirror (start with `60 minutes`, shorten for active repos).
- Leave the scheduler enabled—auto-discovery ships with it, so new GitHub repositories and stars are pulled in on the next pass.
- If you want the mirror to tidy up when GitHub repos disappear, enable **Handle orphaned repositories** and keep the action on **Archive** so history stays intact.
<figure class="mt-8 flex flex-col items-center">
<img
src="/assets/configuration.png"
alt="Automation tab in Gitea Mirror showing the automatic syncing controls for GitHub backups."
class="w-full max-w-5xl rounded-xl border border-muted shadow-sm"
loading="lazy"
/>
<figcaption class="mt-3 text-sm text-muted-foreground text-center">
Configure the scheduler and cleanup policies from the Automation tab so GitHub mirrors stay fresh without manual cron jobs.
</figcaption>
</figure>
### 5. Prove the backup works
Treat the mirror like any other DR asset:
1. Temporarily block outbound GitHub access on your machine.
2. Clone from Gitea instead: `git clone http://<gitea-host>/<owner>/<repo>.git`.
3. Confirm commit history, tags, releases, and issues exist.
4. Remove the block and document the restore steps in your homelab wiki.
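The Git side of this drill can be scripted. The sketch below clones a mirror into a scratch directory, runs `git fsck`, and prints commit and tag counts that you can compare against GitHub. `verify_mirror` is a hypothetical helper. Issues and releases are stored by Gitea rather than in the Git data, so they still need a check in the Gitea UI.

```bash
# verify_mirror <clone-url>
# Clones the mirror, checks object integrity, and prints "commits=N tags=M".
verify_mirror() {
  url="$1"
  workdir="$(mktemp -d)" || return 1
  repo="$workdir/repo.git"
  if ! git clone --quiet --mirror "$url" "$repo"; then
    rm -rf "$workdir"
    return 1
  fi
  if ! git -C "$repo" fsck --no-progress >/dev/null 2>&1; then
    rm -rf "$workdir"
    return 1
  fi
  commits="$(git -C "$repo" rev-list --all --count)"
  tags="$(git -C "$repo" tag --list | wc -l | tr -d ' ')"
  rm -rf "$workdir"
  echo "commits=$commits tags=$tags"
}

# Usage (not run here):
#   verify_mirror "http://<gitea-host>/<owner>/<repo>.git"
```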
## Health checks & monitoring
- The container exposes `/api/health`; add it to Uptime Kuma, Healthchecks.io, or Prometheus.
- Mirror failures surface in the activity log; consider exporting them through the `/api/events` endpoint.
- Watch the `data/` volume on the host (e.g. `du -sh data/`) to make sure you have headroom for mirrored repos and LFS blobs.
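If you do not run a monitoring stack yet, a cron-friendly sketch like the one below covers the basics. It assumes the web UI answers on `http://localhost:4321` and treats any successful HTTP response from `/api/health` as healthy, since the response body format is not described here. Both function names are hypothetical.

```bash
# health_ok [base-url]
# Returns 0 when /api/health answers with a successful HTTP status.
health_ok() {
  base_url="${1:-http://localhost:4321}"
  curl -fsS --max-time 5 -o /dev/null "$base_url/api/health"
}

# disk_used_pct <path>
# Prints the used percentage (no % sign) of the filesystem holding <path>.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Usage (not run here):
#   health_ok || echo "gitea-mirror health check failed"
#   [ "$(disk_used_pct ./data)" -lt 85 ] || echo "data volume is over 85% full"
```

The 85% threshold is an arbitrary example; pick a value that leaves room for your largest repositories and LFS blobs.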
## Hardening tips
- Put the stack behind a reverse proxy (Traefik, Caddy, Nginx) and enable TLS.
- Rotate both GitHub and Gitea tokens quarterly; the UI will flag expired credentials.
- Snapshot the `data/` volume (ZFS/BTRFS) or back it up with `restic` so the mirror survives host failure.
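A `restic` backup of the data directory might look like the sketch below. It assumes `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE` are already set, that the service is named `gitea-mirror`, and that briefly stopping the container is acceptable so the SQLite file is not copied mid-write. `run` and `backup_data` are hypothetical helpers; set `DRY_RUN=1` to print the commands instead of running them.

```bash
# run <command...>
# Executes the command, or only prints it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# backup_data [data-dir]
# Stops the mirror, backs up the data directory with restic, then restarts it.
backup_data() {
  data_dir="${1:-./data}"
  run docker compose -f docker-compose.alt.yml stop gitea-mirror || return 1
  run restic backup "$data_dir"
  rc=$?
  # Restart the mirror even if the backup failed.
  run docker compose -f docker-compose.alt.yml start gitea-mirror
  return "$rc"
}

# Usage (not run here):
#   DRY_RUN=1 backup_data ./data   # preview the commands
#   backup_data ./data             # real run
```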
## Next steps
- Offer the mirror to users who only need read access and do not need a GitHub account.
- Layer on the [Helm](../deploy-with-helm-chart) or [Proxmox LXC](../proxmox-lxc-homelab) playbooks when you outgrow the single-node setup.
## FAQ
### Does Gitea Mirror copy issues, pull requests, releases, and LFS?
Yes. Enable Mirror metadata, Mirror releases, and Git LFS from **Configuration → Connections → Content & Data**. Pull requests are mirrored as enriched issues with linked branches and metadata.
### How often should I sync GitHub backups?
Most homelabs pick 30 to 120 minutes. Faster schedules improve RPO but use more GitHub API quota; adjust by org/repo if only a few projects are critical.
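To make the trade-off concrete, the sketch below converts an interval into sync passes per day. `syncs_per_day` is a hypothetical helper, and the number of API calls per pass depends on how many repositories you mirror, so it is not estimated here.

```bash
# syncs_per_day <interval-minutes>
# Prints how many sync passes fit in 24 hours (integer division).
syncs_per_day() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -gt 0 ] || return 1
  echo $(( 1440 / $1 ))
}

syncs_per_day 30    # prints 48
syncs_per_day 120   # prints 12

# To see how much quota is left for a token (not run here):
#   curl -fsS -H "Authorization: Bearer $GITHUB_TOKEN" https://api.github.com/rate_limit
```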
### Where are backups stored and how do I restore?
Repositories and the SQLite DB live under the `data/` directory (or your configured volume). Restore by cloning from Gitea or by moving the volume to a fresh deployment and signing back in.