gitea-mirror/www/src/pages/use-cases/preserve-github-history.mdx
2025-10-23 00:04:58 +05:30

---
layout: ../../layouts/UseCaseLayout.astro
title: "Preserve GitHub History Forever"
description: "Archive commits, issues, releases, and LFS assets into Gitea so hobby projects survive account removals or repo deletions."
canonical: "https://gitea-mirror.com/use-cases/preserve-github-history/"
---
# Preserve GitHub History Forever
## Keep the entire story, not just the code
GitHub accounts get banned, repos go private, and owners rage-delete history. If you care about the full timeline—issues, releases, wiki—Gitea Mirror snapshots everything on a schedule so the story survives in your homelab.
## Requirements
- Running Gitea Mirror (follow the [backup playbook](../backup-github-repositories/))
- GitHub PAT with `repo` enabled (add the `read:org` checkbox under `admin:org` when you archive organization repositories; leave write/admin unchecked)
- Destination Gitea with enough disk for cloned repos + attachments
- Optional: object storage or snapshots for long-term archiving of the mirror volume
## Step-by-step
### 1. Set archival-friendly defaults
In **Configuration → Connections**, open **Content & Data**:
- Enable **Mirror metadata** and choose the components you care about (issues, pull requests, labels, milestones, wiki).
- Enable **Mirror releases** and raise the **Latest releases** limit if you need a deeper history of release assets.
- Toggle **Git LFS (Large File Storage)** so binaries follow the repository, assuming LFS is enabled in your Gitea instance.
### 2. Create an "Archive" organization in Gitea
1. In Gitea, create an org like `github-archive` and grant read-only access to everyone who needs the history.
2. Back in Gitea Mirror under **Configuration → Connections**, pick the **Preserve structure** strategy (or set a destination organization) so repos land in that archive org.
3. Tighten permissions in Gitea—disable pushes for regular users so the archive stays immutable while the service updates it via its token.
<figure class="mt-8 flex flex-col items-center">
<img
src="/assets/repositories.png"
alt="Repositories dashboard in Gitea Mirror showing archived GitHub projects synced into Gitea."
class="w-full max-w-5xl rounded-xl border border-muted shadow-sm"
loading="lazy"
/>
<figcaption class="mt-3 text-sm text-muted-foreground text-center">
Keep every GitHub project visible in the repositories dashboard while routing mirrors into a dedicated archive organization.
</figcaption>
</figure>
### 3. Choose retention & cadence
- In **Configuration → Automation**, enable **Automatic syncing** and set the interval (`1h` keeps fast-moving repos current; `12h` is usually enough for archives).
- Turn on **Handle orphaned repositories automatically** and leave the action on **Archive** so anything deleted upstream is preserved locally but marked read-only.
- Bump the **Latest releases** limit or run an occasional manual sync from the **Repositories** table when you need older release assets.
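To get a feel for how often a given interval fires, a rough sketch like the one below converts an interval string into syncs per day. The `parse_interval` helper and its unit table are hypothetical; they only imitate the `1h` / `12h` shorthand mentioned above and are not Gitea Mirror's own parser.

```python
import re

# Hypothetical unit table: it only imitates the `1h` / `12h` shorthand above.
_UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}


def parse_interval(text: str) -> int:
    """Convert a string such as '12h' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    value, unit = match.groups()
    seconds = int(value) * _UNIT_SECONDS[unit]
    if seconds == 0:
        raise ValueError("interval must be greater than zero")
    return seconds


def syncs_per_day(text: str) -> float:
    """Return how many sync runs fit into 24 hours for the given interval."""
    return 86400 / parse_interval(text)


if __name__ == "__main__":
    for interval in ("1h", "12h"):
        print(f"{interval}: {syncs_per_day(interval):g} syncs per day")
```

Multiplying the result by the number of mirrored repositories gives a ballpark for how much work each day of syncing generates.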
### 4. Record provenance
- Add a README or label inside the archive organization that captures the upstream URL, first mirrored date, and token owner.
- Export a CSV from the **Repositories** view or hit `/api/events` quarterly so you retain a human-friendly change log.
- Store the configuration export (`/api/export`) alongside your disaster-recovery docs in case you need to rebuild the service.
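If you keep provenance in a README, a small generator keeps the notes consistent across repositories. This is a minimal sketch: the `Provenance` record, the `render_provenance` helper, and the Markdown layout are all assumptions made for illustration, not something Gitea Mirror produces.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Provenance:
    """Hypothetical record holding the three facts listed above."""
    upstream_url: str
    first_mirrored: date
    token_owner: str


def render_provenance(record: Provenance) -> str:
    """Render the record as a Markdown section for a README."""
    lines = [
        "## Provenance",
        "",
        f"- Upstream: {record.upstream_url}",
        f"- First mirrored: {record.first_mirrored.isoformat()}",
        f"- Token owner: {record.token_owner}",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values; replace them with your own repository details.
    example = Provenance(
        upstream_url="https://github.com/example/project",
        first_mirrored=date(2025, 1, 15),
        token_owner="homelab-admin",
    )
    print(render_provenance(example))
```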
### 5. Back up the backup
- Snapshots: Use ZFS/BTRFS or Proxmox backups on the mirror's data volume.
- Offsite: `restic`/`rclone` the `data/` directory to a NAS or object store.
- Test: Restore to a test Gitea instance and spot-check history every few months.
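The offsite step can be scripted. The sketch below only builds the `restic backup` and `rclone sync` command lines and prints them unless `dry_run` is turned off. The repository path, the remote name, and the helper functions are placeholders chosen for illustration. It also assumes the restic repository has already been initialized and that its password is supplied through restic's usual mechanisms.

```python
import shlex
import subprocess
from pathlib import Path


def restic_backup_cmd(repo: str, data_dir: Path) -> list[str]:
    """Build `restic -r <repo> backup <data_dir>`."""
    return ["restic", "-r", repo, "backup", str(data_dir)]


def rclone_sync_cmd(data_dir: Path, remote: str) -> list[str]:
    """Build `rclone sync <data_dir> <remote>`.

    Note: `rclone sync` makes the destination match the source, which
    includes deleting files that no longer exist in the source.
    """
    return ["rclone", "sync", str(data_dir), remote]


def run(cmd: list[str], dry_run: bool = True) -> None:
    """Print the command, and execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    data_dir = Path("data")  # the mirror's data directory
    # Placeholder destinations; substitute your own NAS path or remote.
    run(restic_backup_cmd("/mnt/nas/restic-repo", data_dir))
    run(rclone_sync_cmd(data_dir, "offsite:gitea-mirror-data"))
```

Keeping the dry run as the default makes it easy to review the exact commands before scheduling them.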
## Verify the archive
1. Pick a throwaway issue that has already been mirrored, then delete it on GitHub.
2. Wait for the next sync, then open the issue in Gitea; you should still see the original content.
3. Compare `git tag -l` in both remotes to ensure releases match.
4. Use `git lfs ls-files` to confirm large assets made it across.
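The tag comparison in step 3 can be automated without cloning either side. The following sketch parses the output of `git ls-remote --tags` and reports the differences. The URLs are placeholders and the helper names are made up for this example; private repositories will also need credentials configured for `git`.

```python
import subprocess


def parse_tags(ls_remote_output: str) -> set[str]:
    """Extract tag names from `git ls-remote --tags` output."""
    tags = set()
    for line in ls_remote_output.splitlines():
        parts = line.split("\t", 1)
        if len(parts) != 2:
            continue
        ref = parts[1].strip()
        if not ref.startswith("refs/tags/"):
            continue
        name = ref[len("refs/tags/"):]
        # Annotated tags also appear as a peeled entry ending in "^{}".
        if name.endswith("^{}"):
            name = name[:-3]
        tags.add(name)
    return tags


def remote_tags(url: str) -> set[str]:
    """Ask a remote for its tags without cloning it."""
    result = subprocess.run(
        ["git", "ls-remote", "--tags", url],
        check=True, capture_output=True, text=True,
    )
    return parse_tags(result.stdout)


def diff_tags(github: set[str], gitea: set[str]) -> dict[str, list[str]]:
    """Report tags present on only one side."""
    return {
        "missing_in_gitea": sorted(github - gitea),
        "extra_in_gitea": sorted(gitea - github),
    }


if __name__ == "__main__":
    # Placeholder URLs; replace them with your upstream and mirror.
    upstream = remote_tags("https://github.com/example/project.git")
    mirror = remote_tags("https://gitea.example.lan/github-archive/project.git")
    print(diff_tags(upstream, mirror))
```

An `extra_in_gitea` entry is not necessarily a problem: it can simply be a tag that was removed upstream after the mirror captured it.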
## Maintenance checklist
- Rotate tokens annually and document the rotation date in the repo README.
- Monitor disk growth; configure `persistence.size` if you run the Helm chart.
- Log anomalies—failed runs, conflicts—in your homelab journal to track trends.
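For the disk-growth item, a small script run from cron can log the size of the data directory and how full its filesystem is. This is a sketch using only the Python standard library; the `data` path and the 80% warning threshold are assumptions to adjust for your setup.

```python
import os
import shutil
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def usage_fraction(root: Path) -> float:
    """Fraction (0.0 to 1.0) of the filesystem holding root that is in use."""
    usage = shutil.disk_usage(root)
    return usage.used / usage.total


if __name__ == "__main__":
    data_dir = Path("data")  # assumed location of the mirror's data
    size_gib = dir_size_bytes(data_dir) / 1024**3
    used = usage_fraction(data_dir)
    print(f"{data_dir}: {size_gib:.2f} GiB, filesystem {used:.0%} full")
    if used > 0.8:  # arbitrary example threshold
        print("warning: consider growing the volume")
```

Appending the output to a dated log file gives you the trend line to review alongside failed runs and conflicts.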
## Related playbooks
- [Automate GitHub Backups](../github-backup-automation/)
- [Build a Starred Repo Collection](../starred-repos-collection/)
## FAQ
### Does this preserve issues, pull requests, and releases?
Yes. Enable **Mirror metadata** and **Mirror releases** under **Configuration → Connections → Content & Data**. Pull requests are copied as enriched issues, keeping their discussion and labels.
### What happens if a GitHub repo is deleted or goes private?
Turn on **Handle orphaned repositories automatically** and leave the action on **Archive** to keep a read-only copy locally. Choosing **Delete** instead enforces a strict mirror and removes the local repo as well.
### How much storage will I need long-term?
Plan for repo size plus attachments and LFS. Monitor the growth of the mirror's `data/` volume and consider ZFS/BTRFS snapshots or object storage for older archives.
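As a back-of-the-envelope aid, the sizing rule above can be written as a short calculation. Every parameter in this sketch, including the monthly growth figure and the 20% headroom, is a hypothetical input you would replace with your own measurements.

```python
def estimate_storage_gib(
    repo_gib: float,
    attachments_gib: float,
    lfs_gib: float,
    monthly_growth_gib: float = 0.0,
    months: int = 12,
    headroom: float = 0.2,
) -> float:
    """Estimate the volume size needed after `months` of linear growth.

    headroom is extra free space expressed as a fraction (0.2 means 20%).
    """
    current = repo_gib + attachments_gib + lfs_gib
    projected = current + monthly_growth_gib * months
    return projected * (1 + headroom)


if __name__ == "__main__":
    # Example inputs only: 10 GiB of repos, 2 GiB of attachments, 5 GiB of LFS,
    # growing by about 1 GiB per month over one year.
    needed = estimate_storage_gib(10, 2, 5, monthly_growth_gib=1, months=12)
    print(f"Plan for roughly {needed:.1f} GiB")
```

The linear growth assumption is deliberately crude; compare it against the real volume measurements every few months and adjust.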