Compare commits

...

7 Commits

Author SHA1 Message Date
Arunavo Ray
4a54cf9009 v3.5.3 2025-09-07 16:29:43 +05:30
Arunavo Ray
fab4efd93a Auto-start on boot 2025-09-07 16:29:23 +05:30
Arunavo Ray
9f21cd6b1a Addressing concerns of Issue #85 and #86 2025-09-07 15:25:48 +05:30
Arunavo Ray
9ef6017a23 v3.5.2 2025-09-07 13:55:43 +05:30
Arunavo Ray
502796371f Attempt to address #84 2025-09-07 13:55:20 +05:30
Arunavo Ray
b956b71c5f Fixed #87 where the Release Notes was missing 2025-09-07 13:14:41 +05:30
Arunavo Ray
26b82e0f65 Added AGENTS.md 2025-09-07 11:46:14 +05:30
10 changed files with 636 additions and 61 deletions

View File

@@ -95,6 +95,7 @@ DOCKER_TAG=latest
# Release and Metadata
# MIRROR_RELEASES=false # Mirror GitHub releases
# RELEASE_LIMIT=10 # Maximum number of releases to mirror per repository
# MIRROR_WIKI=false # Mirror wiki content
# Issue Tracking (requires MIRROR_METADATA=true)
@@ -110,10 +111,10 @@ DOCKER_TAG=latest
# ===========================================
# Basic Schedule Settings
# SCHEDULE_ENABLED=false
# SCHEDULE_ENABLED=false # When true, auto-imports and mirrors all repos on startup (v3.5.3+)
# SCHEDULE_INTERVAL=3600 # Interval in seconds or cron expression (e.g., "0 2 * * *")
# GITEA_MIRROR_INTERVAL=8h # Mirror sync interval (5m, 30m, 1h, 8h, 24h, 1d, 7d)
# AUTO_IMPORT_REPOS=true # Automatically discover and import new GitHub repositories
# GITEA_MIRROR_INTERVAL=8h # Mirror sync interval (5m, 30m, 1h, 8h, 24h, 1d, 7d) - also triggers auto-start
# AUTO_IMPORT_REPOS=true # Automatically discover and import new GitHub repositories during syncs
# DELAY=3600 # Legacy: same as SCHEDULE_INTERVAL, kept for backward compatibility
# Execution Settings

46
AGENTS.md Normal file
View File

@@ -0,0 +1,46 @@
# Repository Guidelines
## Project Structure & Module Organization
- `src/` app code
- `components/` (React, PascalCase files), `pages/` (Astro/API routes), `lib/` (domain + utilities, kebab-case), `hooks/`, `layouts/`, `styles/`, `tests/`, `types/`, `data/`, `content/`.
- `scripts/` operational TS scripts (DB init, recovery): e.g., `scripts/manage-db.ts`.
- `drizzle/` SQL migrations; `data/` runtime SQLite (`gitea-mirror.db`).
- `public/` static assets; `dist/` build output.
- Key config: `astro.config.mjs`, `tsconfig.json` (alias `@/* → src/*`), `bunfig.toml` (test preload), `.env(.example)`.
## Build, Test, and Development Commands
- Prereq: Bun `>= 1.2.9` (see `package.json`).
- Setup: `bun run setup` install deps and init DB.
- Dev: `bun run dev` start Astro dev server.
- Build: `bun run build` produce `dist/`.
- Preview/Start: `bun run preview` (static preview) or `bun run start` (SSR entry).
- Database: `bun run db:generate|migrate|push|studio` and `bun run manage-db init|check|fix|reset-users`.
- Tests: `bun test` | `bun run test:watch` | `bun run test:coverage`.
- Docker: see `docker-compose.yml` and variants in repo root.
## Coding Style & Naming Conventions
- Language: TypeScript, Astro, React.
- Indentation: 2 spaces; keep existing semicolon/quote style in touched files.
- Components: PascalCase `.tsx` in `src/components/` (e.g., `MainLayout.tsx`).
- Modules/utils: kebab-case in `src/lib/` (e.g., `gitea-enhanced.ts`).
- Imports: prefer alias `@/…` (configured in `tsconfig.json`).
- Do not introduce new lint/format configs; follow current patterns.
## Testing Guidelines
- Runner: Bun test (`bun:test`) with preload `src/tests/setup.bun.ts` (see `bunfig.toml`).
- Location/Names: `**/*.test.ts(x)` under `src/**` (examples in `src/lib/**`).
- Scope: add unit tests for new logic and API route tests for handlers.
- Aim for meaningful coverage on DB, auth, and mirroring paths.
## Commit & Pull Request Guidelines
- Commits: short, imperative, scoped when helpful (e.g., `lib: fix token parsing`, `ui: align buttons`).
- PRs must include:
- Summary, rationale, and testing steps/commands.
- Linked issues (e.g., `Closes #123`).
- Screenshots/gifs for UI changes.
- Notes on DB/migration or .env impacts; update `docs/`/CHANGELOG if applicable.
## Security & Configuration Tips
- Never commit secrets. Copy `.env.example``.env` and fill values; prefer `bun run startup-env-config` to validate.
- SQLite files live in `data/`; avoid committing generated DBs.
- Certificates (if used) reside in `certs/`; manage locally or via Docker secrets.

View File

@@ -216,12 +216,20 @@ Gitea Mirror provides powerful automatic synchronization features:
- **Repository cleanup**: Removes repositories that no longer exist in GitHub
- **Proper intervals**: Mirrors respect your configured sync intervals (not Gitea's default 24h)
- **Smart scheduling**: Only syncs repositories that need updating
- **Auto-start on boot** (v3.5.3+): Automatically imports and mirrors all repositories when `SCHEDULE_ENABLED=true` or `GITEA_MIRROR_INTERVAL` is set
#### Configuration via Web Interface (Recommended)
Navigate to the Configuration page and enable "Automatic Syncing" with your preferred interval.
#### Configuration via Environment Variables
**Set it and forget it!** With these environment variables, Gitea Mirror will automatically:
1. Import all your GitHub repositories on startup
2. Mirror them to Gitea immediately
3. Keep them synchronized based on your interval
4. Auto-discover new repos you create/star on GitHub
5. Clean up repos you delete from GitHub
```bash
# Enable automatic scheduling (required for auto features)
SCHEDULE_ENABLED=true
@@ -235,11 +243,22 @@ AUTO_IMPORT_REPOS=true
# Auto-cleanup orphaned repositories
CLEANUP_DELETE_IF_NOT_IN_GITHUB=true
CLEANUP_ORPHANED_REPO_ACTION=archive # or 'delete'
CLEANUP_ORPHANED_REPO_ACTION=archive # 'archive' (recommended) or 'delete'
CLEANUP_DRY_RUN=false # Set to true to test without changes
```
**Important**: The scheduler checks every minute for tasks to run. The `GITEA_MIRROR_INTERVAL` determines how often each repository is actually synced. For example, with `8h`, each repo syncs every 8 hours from its last successful sync.
**Important Notes**:
- **Auto-Start**: When `SCHEDULE_ENABLED=true` or `GITEA_MIRROR_INTERVAL` is set, the service automatically imports all GitHub repositories and mirrors them on startup. No manual "Import" or "Mirror" button clicks required!
- The scheduler checks every minute for tasks to run. The `GITEA_MIRROR_INTERVAL` determines how often each repository is actually synced. For example, with `8h`, each repo syncs every 8 hours from its last successful sync.
**🛡️ Backup Protection Features**:
- **No Accidental Deletions**: Repository cleanup is automatically skipped if GitHub is inaccessible (account deleted, banned, or API errors)
- **Archive Never Deletes Data**: The `archive` action preserves all repository data:
- Regular repositories: Made read-only using Gitea's archive feature
- Mirror repositories: Renamed with `[ARCHIVED]` prefix (Gitea API limitation prevents archiving mirrors)
- Failed operations: Repository remains fully accessible even if marking as archived fails
- **The Whole Point of Backups**: Your Gitea mirrors are preserved even when GitHub sources disappear - that's why you have backups!
- **Strongly Recommended**: Always use `CLEANUP_ORPHANED_REPO_ACTION=archive` (default) instead of `delete`
## Troubleshooting

View File

@@ -134,6 +134,7 @@ Control what content gets mirrored from GitHub to Gitea.
| Variable | Description | Default | Options |
|----------|-------------|---------|---------|
| `MIRROR_RELEASES` | Mirror GitHub releases | `false` | `true`, `false` |
| `RELEASE_LIMIT` | Maximum number of releases to mirror per repository | `10` | Number (1-100) |
| `MIRROR_WIKI` | Mirror wiki content | `false` | `true`, `false` |
| `MIRROR_METADATA` | Master toggle for metadata mirroring | `false` | `true`, `false` |
| `MIRROR_ISSUES` | Mirror issues (requires MIRROR_METADATA=true) | `false` | `true`, `false` |
@@ -149,10 +150,17 @@ Configure automatic scheduled mirroring.
| Variable | Description | Default | Options |
|----------|-------------|---------|---------|
| `SCHEDULE_ENABLED` | Enable automatic mirroring | `false` | `true`, `false` |
| `SCHEDULE_ENABLED` | Enable automatic mirroring. **When set to `true`, automatically imports and mirrors all repositories on startup** (v3.5.3+) | `false` | `true`, `false` |
| `SCHEDULE_INTERVAL` | Interval in seconds or cron expression | `3600` | Number or cron string (e.g., `"0 2 * * *"`) |
| `DELAY` | Legacy: same as SCHEDULE_INTERVAL | `3600` | Number (seconds) |
> **Note**: Setting either `SCHEDULE_ENABLED=true` or `GITEA_MIRROR_INTERVAL` triggers auto-start functionality where the service will:
> 1. Import all GitHub repositories on startup
> 2. Mirror them to Gitea immediately
> 3. Continue syncing at the configured interval
> 4. Auto-discover new repositories
> 5. Clean up deleted repositories (if configured)
### Execution Settings
| Variable | Description | Default | Options |
@@ -174,6 +182,7 @@ Configure automatic scheduled mirroring.
| Variable | Description | Default | Options |
|----------|-------------|---------|---------|
| `AUTO_IMPORT_REPOS` | Automatically discover and import new GitHub repositories during scheduled syncs | `true` | `true`, `false` |
| `SCHEDULE_ONLY_MIRROR_UPDATED` | Only mirror repos with updates | `false` | `true`, `false` |
| `SCHEDULE_UPDATE_INTERVAL` | Check for updates interval (milliseconds) | `86400000` | Number |
| `SCHEDULE_SKIP_RECENTLY_MIRRORED` | Skip recently mirrored repos | `true` | `true`, `false` |
@@ -206,10 +215,25 @@ Configure automatic cleanup of old events and data.
|----------|-------------|---------|---------|
| `CLEANUP_DELETE_FROM_GITEA` | Delete repositories from Gitea | `false` | `true`, `false` |
| `CLEANUP_DELETE_IF_NOT_IN_GITHUB` | Delete repos not found in GitHub (automatically enables cleanup) | `true` | `true`, `false` |
| `CLEANUP_ORPHANED_REPO_ACTION` | Action for orphaned repositories | `archive` | `skip`, `archive`, `delete` |
| `CLEANUP_ORPHANED_REPO_ACTION` | Action for orphaned repositories. **Note**: `archive` is recommended to preserve backups | `archive` | `skip`, `archive`, `delete` |
| `CLEANUP_DRY_RUN` | Test mode without actual deletion | `true` | `true`, `false` |
| `CLEANUP_PROTECTED_REPOS` | Comma-separated list of protected repository names | - | Comma-separated strings |
**🛡️ Safety Features (Backup Protection)**:
- **GitHub Failures Don't Delete Backups**: Cleanup is automatically skipped if GitHub API returns errors (404, 403, connection issues)
- **Archive Never Deletes**: The `archive` action ALWAYS preserves repository data, it never deletes
- **Graceful Degradation**: If marking as archived fails, the repository remains fully accessible in Gitea
- **The Purpose of Backups**: Your mirrors are preserved even when GitHub sources disappear - that's the whole point!
**Archive Behavior (Aligned with Gitea API)**:
- **Regular repositories**: Uses Gitea's native archive feature (PATCH `/repos/{owner}/{repo}` with `archived: true`)
- Makes repository read-only while preserving all data
- **Mirror repositories**: Uses rename strategy (Gitea API returns 422 for archiving mirrors)
- Renamed with `[ARCHIVED]` prefix for clear identification
- Description updated with preservation notice and timestamp
- Mirror interval set to 8760h (1 year) to minimize sync attempts
- Repository remains fully accessible and cloneable
### Execution Settings
| Variable | Description | Default | Options |

View File

@@ -1,7 +1,7 @@
{
"name": "gitea-mirror",
"type": "module",
"version": "3.5.1",
"version": "3.5.3",
"engines": {
"bun": ">=1.2.9"
},

View File

@@ -133,6 +133,7 @@ function parseEnvConfig(): EnvConfig {
mirrorLabels: process.env.MIRROR_LABELS === 'true',
mirrorMilestones: process.env.MIRROR_MILESTONES === 'true',
mirrorMetadata: process.env.MIRROR_METADATA === 'true',
releaseLimit: process.env.RELEASE_LIMIT ? parseInt(process.env.RELEASE_LIMIT, 10) : undefined,
},
schedule: {
enabled: process.env.SCHEDULE_ENABLED === 'true' ||
@@ -271,6 +272,7 @@ export async function initializeConfigFromEnv(): Promise<void> {
forkStrategy: envConfig.gitea.forkStrategy || existingConfig?.[0]?.giteaConfig?.forkStrategy || 'reference',
// Mirror metadata options
mirrorReleases: envConfig.mirror.mirrorReleases ?? existingConfig?.[0]?.giteaConfig?.mirrorReleases ?? false,
releaseLimit: envConfig.mirror.releaseLimit ?? existingConfig?.[0]?.giteaConfig?.releaseLimit ?? 10,
mirrorMetadata: envConfig.mirror.mirrorMetadata ?? (envConfig.mirror.mirrorIssues || envConfig.mirror.mirrorPullRequests || envConfig.mirror.mirrorLabels || envConfig.mirror.mirrorMilestones) ?? existingConfig?.[0]?.giteaConfig?.mirrorMetadata ?? false,
mirrorIssues: envConfig.mirror.mirrorIssues ?? existingConfig?.[0]?.giteaConfig?.mirrorIssues ?? false,
mirrorPullRequests: envConfig.mirror.mirrorPullRequests ?? existingConfig?.[0]?.giteaConfig?.mirrorPullRequests ?? false,

View File

@@ -7,7 +7,7 @@ import { membershipRoleEnum } from "@/types/organizations";
import { Octokit } from "@octokit/rest";
import type { Config } from "@/types/config";
import type { Organization, Repository } from "./db/schema";
import { httpPost, httpGet, httpDelete, httpPut } from "./http-client";
import { httpPost, httpGet, httpDelete, httpPut, httpPatch } from "./http-client";
import { createMirrorJob } from "./helpers";
import { db, organizations, repositories } from "./db";
import { eq, and } from "drizzle-orm";
@@ -1435,20 +1435,54 @@ export async function mirrorGitHubReleasesToGitea({
}
).catch(() => null);
const releaseNote = release.body || "";
if (existingReleasesResponse) {
console.log(`[Releases] Release ${release.tag_name} already exists, skipping`);
skippedCount++;
// Update existing release if the changelog/body differs
const existingRelease = existingReleasesResponse.data;
const existingNote = existingRelease.body || "";
if (existingNote !== releaseNote || existingRelease.name !== (release.name || release.tag_name)) {
console.log(`[Releases] Updating existing release ${release.tag_name} with new changelog/title`);
await httpPut(
`${config.giteaConfig.url}/api/v1/repos/${repoOwner}/${repository.name}/releases/${existingRelease.id}`,
{
tag_name: release.tag_name,
target: release.target_commitish,
title: release.name || release.tag_name,
body: releaseNote,
draft: release.draft,
prerelease: release.prerelease,
},
{
Authorization: `token ${decryptedConfig.giteaConfig.token}`,
}
);
if (releaseNote) {
console.log(`[Releases] Updated changelog for ${release.tag_name} (${releaseNote.length} characters)`);
}
mirroredCount++;
} else {
console.log(`[Releases] Release ${release.tag_name} already up-to-date, skipping`);
skippedCount++;
}
continue;
}
// Create the release
// Create new release with changelog/body content
if (releaseNote) {
console.log(`[Releases] Including changelog for ${release.tag_name} (${releaseNote.length} characters)`);
}
const createReleaseResponse = await httpPost(
`${config.giteaConfig.url}/api/v1/repos/${repoOwner}/${repository.name}/releases`,
{
tag_name: release.tag_name,
target: release.target_commitish,
title: release.name || release.tag_name,
note: release.body || "",
body: releaseNote,
draft: release.draft,
prerelease: release.prerelease,
},
@@ -1507,13 +1541,14 @@ export async function mirrorGitHubReleasesToGitea({
}
mirroredCount++;
console.log(`[Releases] Successfully mirrored release: ${release.tag_name}`);
const noteInfo = releaseNote ? ` with ${releaseNote.length} character changelog` : " without changelog";
console.log(`[Releases] Successfully mirrored release: ${release.tag_name}${noteInfo}`);
} catch (error) {
console.error(`[Releases] Failed to mirror release ${release.tag_name}: ${error instanceof Error ? error.message : String(error)}`);
}
}
console.log(`✅ Mirrored ${mirroredCount} new releases to Gitea (${skippedCount} already existed)`);
console.log(`✅ Mirrored/Updated ${mirroredCount} releases to Gitea (${skippedCount} already up-to-date)`);
}
export async function mirrorGitRepoPullRequestsToGitea({
@@ -1981,6 +2016,12 @@ export async function deleteGiteaRepo(
/**
* Archive a repository in Gitea
*
* IMPORTANT: This function NEVER deletes data. It only marks repositories as archived.
* - For regular repos: Uses Gitea's archive feature (makes read-only)
* - For mirror repos: Renames with [ARCHIVED] prefix (Gitea doesn't allow archiving mirrors)
*
* This ensures backups are preserved even when the GitHub source disappears.
*/
export async function archiveGiteaRepo(
client: { url: string; token: string },
@@ -1988,24 +2029,115 @@ export async function archiveGiteaRepo(
repo: string
): Promise<void> {
try {
const response = await httpPut(
// First, check if this is a mirror repository
const repoResponse = await httpGet(
`${client.url}/api/v1/repos/${owner}/${repo}`,
{
archived: true,
},
{
Authorization: `token ${client.token}`,
'Content-Type': 'application/json',
}
);
if (response.status >= 400) {
throw new Error(`Failed to archive repository ${owner}/${repo}: ${response.status} ${response.statusText}`);
if (!repoResponse.data) {
console.warn(`[Archive] Repository ${owner}/${repo} not found in Gitea. Skipping.`);
return;
}
console.log(`Successfully archived repository ${owner}/${repo} in Gitea`);
if (repoResponse.data?.mirror) {
console.log(`[Archive] Repository ${owner}/${repo} is a mirror. Using safe rename strategy.`);
// IMPORTANT: Gitea API doesn't allow archiving mirror repositories
// According to Gitea source code, attempting to archive a mirror returns:
// "repo is a mirror, cannot archive/un-archive" (422 Unprocessable Entity)
//
// Our solution: Rename the repo to clearly mark it as orphaned
// This preserves all data while indicating the repo is no longer actively synced
const currentName = repoResponse.data.name;
// Skip if already marked as archived
if (currentName.startsWith('[ARCHIVED]')) {
console.log(`[Archive] Repository ${owner}/${repo} already marked as archived. Skipping.`);
return;
}
const archivedName = `[ARCHIVED] ${currentName}`;
const currentDesc = repoResponse.data.description || '';
const archiveNotice = `\n\n⚠ ARCHIVED: Original GitHub repository no longer exists. Preserved as backup on ${new Date().toISOString()}`;
// Only add notice if not already present
const newDescription = currentDesc.includes('⚠️ ARCHIVED:')
? currentDesc
: currentDesc + archiveNotice;
const renameResponse = await httpPatch(
`${client.url}/api/v1/repos/${owner}/${repo}`,
{
name: archivedName,
description: newDescription,
},
{
Authorization: `token ${client.token}`,
'Content-Type': 'application/json',
}
);
if (renameResponse.status >= 400) {
// If rename fails, log but don't throw - data is still preserved
console.error(`[Archive] Failed to rename mirror repository ${owner}/${repo}: ${renameResponse.status}`);
console.log(`[Archive] Repository ${owner}/${repo} remains accessible but not marked as archived`);
return;
}
console.log(`[Archive] Successfully marked mirror repository ${owner}/${repo} as archived (renamed to ${archivedName})`);
// Also try to reduce sync frequency to prevent unnecessary API calls
// This is optional - if it fails, the repo is still preserved
try {
await httpPatch(
`${client.url}/api/v1/repos/${owner}/${archivedName}`,
{
mirror_interval: "8760h", // 1 year - minimizes sync attempts
},
{
Authorization: `token ${client.token}`,
'Content-Type': 'application/json',
}
);
console.log(`[Archive] Reduced sync frequency for ${owner}/${archivedName} to yearly`);
} catch (intervalError) {
// Non-critical - repo is still preserved even if we can't change interval
console.debug(`[Archive] Could not update mirror interval (non-critical):`, intervalError);
}
} else {
// For non-mirror repositories, use Gitea's native archive feature
// This makes the repository read-only but preserves all data
console.log(`[Archive] Archiving regular repository ${owner}/${repo}`);
const response = await httpPatch(
`${client.url}/api/v1/repos/${owner}/${repo}`,
{
archived: true,
},
{
Authorization: `token ${client.token}`,
'Content-Type': 'application/json',
}
);
if (response.status >= 400) {
// If archive fails, log but data is still preserved in Gitea
console.error(`[Archive] Failed to archive repository ${owner}/${repo}: ${response.status}`);
console.log(`[Archive] Repository ${owner}/${repo} remains accessible but not marked as archived`);
return;
}
console.log(`[Archive] Successfully archived repository ${owner}/${repo} (now read-only)`);
}
} catch (error) {
console.error(`Error archiving repository ${owner}/${repo}:`, error);
throw error;
// Even on error, the repository data is preserved in Gitea
// We just couldn't mark it as archived
console.error(`[Archive] Could not mark repository ${owner}/${repo} as archived:`, error);
console.log(`[Archive] Repository ${owner}/${repo} data is preserved but not marked as archived`);
// Don't throw - we want cleanup to continue for other repos
}
}

View File

@@ -27,15 +27,37 @@ async function identifyOrphanedRepositories(config: any): Promise<any[]> {
const decryptedToken = getDecryptedGitHubToken(config);
const octokit = createGitHubClient(decryptedToken);
// Fetch GitHub data
const [basicAndForkedRepos, starredRepos] = await Promise.all([
getGithubRepositories({ octokit, config }),
config.githubConfig?.includeStarred
? getGithubStarredRepositories({ octokit, config })
: Promise.resolve([]),
]);
let allGithubRepos = [];
let githubApiAccessible = true;
try {
// Fetch GitHub data
const [basicAndForkedRepos, starredRepos] = await Promise.all([
getGithubRepositories({ octokit, config }),
config.githubConfig?.includeStarred
? getGithubStarredRepositories({ octokit, config })
: Promise.resolve([]),
]);
allGithubRepos = [...basicAndForkedRepos, ...starredRepos];
} catch (githubError: any) {
// Handle GitHub API errors gracefully
console.warn(`[Repository Cleanup] GitHub API error for user ${userId}: ${githubError.message}`);
// Check if it's a critical error (like account deleted/banned)
if (githubError.status === 404 || githubError.status === 403) {
console.error(`[Repository Cleanup] CRITICAL: GitHub account may be deleted/banned. Skipping cleanup to prevent data loss.`);
console.error(`[Repository Cleanup] Consider using CLEANUP_ORPHANED_REPO_ACTION=archive instead of delete for safety.`);
// Return empty array to skip cleanup entirely when GitHub account is inaccessible
return [];
}
// For other errors, also skip cleanup to be safe
console.error(`[Repository Cleanup] Skipping cleanup due to GitHub API error. This prevents accidental deletion of backups.`);
return [];
}
const allGithubRepos = [...basicAndForkedRepos, ...starredRepos];
const githubRepoFullNames = new Set(allGithubRepos.map(repo => repo.fullName));
// Get all repositories from our database
@@ -44,13 +66,19 @@ async function identifyOrphanedRepositories(config: any): Promise<any[]> {
.from(repositories)
.where(eq(repositories.userId, userId));
// Identify orphaned repositories
// Only identify repositories as orphaned if we successfully accessed GitHub
// This prevents false positives when GitHub is down or account is inaccessible
const orphanedRepos = dbRepos.filter(repo => !githubRepoFullNames.has(repo.fullName));
if (orphanedRepos.length > 0) {
console.log(`[Repository Cleanup] Found ${orphanedRepos.length} orphaned repositories for user ${userId}`);
}
return orphanedRepos;
} catch (error) {
console.error(`[Repository Cleanup] Error identifying orphaned repositories for user ${userId}:`, error);
throw error;
// Return empty array on error to prevent accidental deletions
return [];
}
}

View File

@@ -5,9 +5,8 @@
*/
import { db, configs, repositories } from '@/lib/db';
import { eq, and, or, lt, gte } from 'drizzle-orm';
import { syncGiteaRepo } from '@/lib/gitea';
import { createGitHubClient } from '@/lib/github';
import { eq, and, or } from 'drizzle-orm';
import { syncGiteaRepo, mirrorGithubRepoToGitea } from '@/lib/gitea';
import { getDecryptedGitHubToken } from '@/lib/utils/config-encryption';
import { parseInterval, formatDuration } from '@/lib/utils/duration-parser';
import type { Repository } from '@/lib/db/schema';
@@ -15,6 +14,7 @@ import { repoStatusEnum, repositoryVisibilityEnum } from '@/types/Repository';
let schedulerInterval: NodeJS.Timeout | null = null;
let isSchedulerRunning = false;
let hasPerformedAutoStart = false; // Track if we've already done auto-start
/**
* Parse schedule interval with enhanced support for duration strings, cron, and numbers
@@ -41,6 +41,12 @@ async function runScheduledSync(config: any): Promise<void> {
console.log(`[Scheduler] Running scheduled sync for user ${userId}`);
try {
// Check if tokens are configured before proceeding
if (!config.githubConfig?.token || !config.giteaConfig?.token) {
console.log(`[Scheduler] Skipping sync for user ${userId}: GitHub or Gitea tokens not configured`);
return;
}
// Update lastRun timestamp
const currentTime = new Date();
const scheduleConfig = config.scheduleConfig || {};
@@ -72,7 +78,7 @@ async function runScheduledSync(config: any): Promise<void> {
if (scheduleConfig.autoImport !== false) {
console.log(`[Scheduler] Checking for new GitHub repositories for user ${userId}...`);
try {
const { getGithubRepositories, getGithubStarredRepositories, getGithubOrganizations } = await import('@/lib/github');
const { getGithubRepositories, getGithubStarredRepositories } = await import('@/lib/github');
const { v4: uuidv4 } = await import('uuid');
const { getDecryptedGitHubToken } = await import('@/lib/utils/config-encryption');
@@ -82,12 +88,11 @@ async function runScheduledSync(config: any): Promise<void> {
const octokit = new Octokit({ auth: decryptedToken });
// Fetch GitHub data
const [basicAndForkedRepos, starredRepos, gitOrgs] = await Promise.all([
const [basicAndForkedRepos, starredRepos] = await Promise.all([
getGithubRepositories({ octokit, config }),
config.githubConfig?.includeStarred
? getGithubStarredRepositories({ octokit, config })
: Promise.resolve([]),
getGithubOrganizations({ octokit, config }),
]);
const allGithubRepos = [...basicAndForkedRepos, ...starredRepos];
@@ -281,6 +286,278 @@ async function syncSingleRepository(config: any, repo: any): Promise<void> {
}
}
/**
* Check if we should auto-start based on environment configuration
*/
async function checkAutoStartConfiguration(): Promise<boolean> {
// Don't auto-start more than once
if (hasPerformedAutoStart) {
return false;
}
try {
// Check if any configuration has scheduling enabled or mirror interval set
const activeConfigs = await db
.select()
.from(configs)
.where(eq(configs.isActive, true));
for (const config of activeConfigs) {
// Check if scheduling is enabled via environment
const scheduleEnabled = config.scheduleConfig?.enabled === true;
const hasMirrorInterval = !!config.giteaConfig?.mirrorInterval;
// If either SCHEDULE_ENABLED=true or GITEA_MIRROR_INTERVAL is set, we should auto-start
if (scheduleEnabled || hasMirrorInterval) {
console.log(`[Scheduler] Auto-start conditions met for user ${config.userId} (scheduleEnabled=${scheduleEnabled}, hasMirrorInterval=${hasMirrorInterval})`);
return true;
}
}
return false;
} catch (error) {
console.error('[Scheduler] Error checking auto-start configuration:', error);
return false;
}
}
/**
* Perform initial auto-start: import repositories and trigger mirror
*/
async function performInitialAutoStart(): Promise<void> {
hasPerformedAutoStart = true;
try {
console.log('[Scheduler] Performing initial auto-start...');
// Get all active configurations
const activeConfigs = await db
.select()
.from(configs)
.where(eq(configs.isActive, true));
for (const config of activeConfigs) {
// Skip if tokens are not configured
if (!config.githubConfig?.token || !config.giteaConfig?.token) {
console.log(`[Scheduler] Skipping auto-start for user ${config.userId}: tokens not configured`);
continue;
}
const scheduleEnabled = config.scheduleConfig?.enabled === true;
const hasMirrorInterval = !!config.giteaConfig?.mirrorInterval;
// Only process configs that have scheduling or mirror interval configured
if (!scheduleEnabled && !hasMirrorInterval) {
continue;
}
console.log(`[Scheduler] Auto-starting for user ${config.userId}...`);
try {
// Step 1: Import repositories from GitHub
console.log(`[Scheduler] Step 1: Importing repositories from GitHub for user ${config.userId}...`);
const { getGithubRepositories, getGithubStarredRepositories } = await import('@/lib/github');
const { v4: uuidv4 } = await import('uuid');
// Create GitHub client
const decryptedToken = getDecryptedGitHubToken(config);
const { Octokit } = await import('@octokit/rest');
const octokit = new Octokit({ auth: decryptedToken });
// Fetch GitHub data
const [basicAndForkedRepos, starredRepos] = await Promise.all([
getGithubRepositories({ octokit, config }),
config.githubConfig?.includeStarred
? getGithubStarredRepositories({ octokit, config })
: Promise.resolve([]),
]);
const allGithubRepos = [...basicAndForkedRepos, ...starredRepos];
// Check for new repositories
const existingRepos = await db
.select({ fullName: repositories.fullName })
.from(repositories)
.where(eq(repositories.userId, config.userId));
const existingRepoNames = new Set(existingRepos.map(r => r.fullName));
const reposToImport = allGithubRepos.filter(r => !existingRepoNames.has(r.fullName));
if (reposToImport.length > 0) {
console.log(`[Scheduler] Importing ${reposToImport.length} repositories for user ${config.userId}...`);
// Insert new repositories
const reposToInsert = reposToImport.map(repo => ({
id: uuidv4(),
userId: config.userId,
configId: config.id,
name: repo.name,
fullName: repo.fullName,
url: repo.url,
cloneUrl: repo.cloneUrl,
owner: repo.owner,
organization: repo.organization,
isPrivate: repo.isPrivate,
isForked: repo.isForked,
forkedFrom: repo.forkedFrom,
hasIssues: repo.hasIssues,
isStarred: repo.isStarred,
isArchived: repo.isArchived,
size: repo.size,
hasLFS: repo.hasLFS,
hasSubmodules: repo.hasSubmodules,
defaultBranch: repo.defaultBranch,
visibility: repo.visibility,
status: 'imported',
createdAt: new Date(),
updatedAt: new Date(),
}));
await db.insert(repositories).values(reposToInsert);
console.log(`[Scheduler] Successfully imported ${reposToImport.length} repositories`);
} else {
console.log(`[Scheduler] No new repositories to import for user ${config.userId}`);
}
// Check if we already have mirrored repositories (indicating this isn't first run)
const mirroredRepos = await db
.select()
.from(repositories)
.where(
and(
eq(repositories.userId, config.userId),
or(
eq(repositories.status, 'mirrored'),
eq(repositories.status, 'synced')
)
)
)
.limit(1);
// If we already have mirrored repos, skip the initial mirror (let regular sync handle it)
if (mirroredRepos.length > 0) {
console.log(`[Scheduler] User ${config.userId} already has mirrored repositories, skipping initial mirror (let regular sync handle updates)`);
// Still update the schedule config to indicate scheduling is active
const currentTime = new Date();
const intervalSource = config.scheduleConfig?.interval ||
config.giteaConfig?.mirrorInterval ||
'8h';
const interval = parseScheduleInterval(intervalSource);
const nextRun = new Date(currentTime.getTime() + interval);
await db.update(configs).set({
scheduleConfig: {
...config.scheduleConfig,
enabled: true,
lastRun: currentTime,
nextRun: nextRun,
},
updatedAt: currentTime,
}).where(eq(configs.id, config.id));
console.log(`[Scheduler] Scheduling enabled for user ${config.userId}, next sync at ${nextRun.toISOString()}`);
continue;
}
// Step 2: Trigger mirror for all repositories that need mirroring
console.log(`[Scheduler] Step 2: Triggering mirror for repositories that need mirroring...`);
const reposNeedingMirror = await db
.select()
.from(repositories)
.where(
and(
eq(repositories.userId, config.userId),
or(
eq(repositories.status, 'imported'),
eq(repositories.status, 'pending'),
eq(repositories.status, 'failed')
)
)
);
if (reposNeedingMirror.length > 0) {
console.log(`[Scheduler] Found ${reposNeedingMirror.length} repositories that need mirroring`);
// Reuse the octokit instance from above
// (octokit was already created in the import phase)
// Process repositories in batches
const batchSize = config.scheduleConfig?.batchSize || 5;
for (let i = 0; i < reposNeedingMirror.length; i += batchSize) {
const batch = reposNeedingMirror.slice(i, Math.min(i + batchSize, reposNeedingMirror.length));
console.log(`[Scheduler] Processing batch ${Math.floor(i / batchSize) + 1} of ${Math.ceil(reposNeedingMirror.length / batchSize)} (${batch.length} repos)`);
await Promise.all(
batch.map(async (repo) => {
try {
const repository: Repository = {
...repo,
status: repoStatusEnum.parse(repo.status),
organization: repo.organization ?? undefined,
lastMirrored: repo.lastMirrored ?? undefined,
errorMessage: repo.errorMessage ?? undefined,
mirroredLocation: repo.mirroredLocation || '',
forkedFrom: repo.forkedFrom ?? undefined,
visibility: repositoryVisibilityEnum.parse(repo.visibility),
};
await mirrorGithubRepoToGitea({
octokit,
repository,
config
});
console.log(`[Scheduler] Successfully mirrored repository: ${repo.fullName}`);
} catch (error) {
console.error(`[Scheduler] Failed to mirror repository ${repo.fullName}:`, error);
}
})
);
// Pause between batches if configured
if (i + batchSize < reposNeedingMirror.length) {
const pauseTime = config.scheduleConfig?.pauseBetweenBatches || 2000;
console.log(`[Scheduler] Pausing for ${pauseTime}ms before next batch...`);
await new Promise(resolve => setTimeout(resolve, pauseTime));
}
}
console.log(`[Scheduler] Completed initial mirror for ${reposNeedingMirror.length} repositories`);
} else {
console.log(`[Scheduler] No repositories need mirroring`);
}
// Update the schedule config to indicate we've run
const currentTime = new Date();
const intervalSource = config.scheduleConfig?.interval ||
config.giteaConfig?.mirrorInterval ||
'8h';
const interval = parseScheduleInterval(intervalSource);
const nextRun = new Date(currentTime.getTime() + interval);
await db.update(configs).set({
scheduleConfig: {
...config.scheduleConfig,
enabled: true, // Ensure scheduling is enabled
lastRun: currentTime,
nextRun: nextRun,
},
updatedAt: currentTime,
}).where(eq(configs.id, config.id));
console.log(`[Scheduler] Auto-start completed for user ${config.userId}, next sync at ${nextRun.toISOString()}`);
} catch (error) {
console.error(`[Scheduler] Failed to auto-start for user ${config.userId}:`, error);
}
}
console.log('[Scheduler] Initial auto-start completed');
} catch (error) {
console.error('[Scheduler] Failed to perform initial auto-start:', error);
}
}
/**
* Main scheduler loop
*/
@@ -307,25 +584,41 @@ async function schedulerLoop(): Promise<void> {
config.scheduleConfig?.enabled === true
);
if (enabledConfigs.length === 0) {
console.log(`[Scheduler] No configurations with scheduling enabled (found ${activeConfigs.length} active configs)`);
// Further filter configs that have valid tokens
const validConfigs = enabledConfigs.filter(config => {
const hasGitHubToken = !!config.githubConfig?.token;
const hasGiteaToken = !!config.giteaConfig?.token;
// Show details about why configs are not enabled
activeConfigs.forEach(config => {
const scheduleEnabled = config.scheduleConfig?.enabled;
const mirrorInterval = config.giteaConfig?.mirrorInterval;
console.log(`[Scheduler] User ${config.userId}: scheduleEnabled=${scheduleEnabled}, mirrorInterval=${mirrorInterval}`);
});
if (!hasGitHubToken || !hasGiteaToken) {
console.log(`[Scheduler] User ${config.userId}: Scheduling enabled but tokens missing (GitHub: ${hasGitHubToken}, Gitea: ${hasGiteaToken})`);
return false;
}
return true;
});
if (validConfigs.length === 0) {
if (enabledConfigs.length > 0) {
console.log(`[Scheduler] ${enabledConfigs.length} config(s) have scheduling enabled but lack required tokens`);
} else {
console.log(`[Scheduler] No configurations with scheduling enabled (found ${activeConfigs.length} active configs)`);
// Show details about why configs are not enabled
activeConfigs.forEach(config => {
const scheduleEnabled = config.scheduleConfig?.enabled;
const mirrorInterval = config.giteaConfig?.mirrorInterval;
console.log(`[Scheduler] User ${config.userId}: scheduleEnabled=${scheduleEnabled}, mirrorInterval=${mirrorInterval}`);
});
}
return;
}
console.log(`[Scheduler] Processing ${enabledConfigs.length} configurations with scheduling enabled (out of ${activeConfigs.length} total active configs)`);
console.log(`[Scheduler] Processing ${validConfigs.length} valid configurations (out of ${enabledConfigs.length} with scheduling enabled)`);
// Check each configuration to see if it's time to run
const currentTime = new Date();
for (const config of enabledConfigs) {
for (const config of validConfigs) {
const scheduleConfig = config.scheduleConfig || {};
// Check if it's time to run based on nextRun
@@ -347,7 +640,7 @@ async function schedulerLoop(): Promise<void> {
/**
* Start the scheduler service
*/
export function startSchedulerService(): void {
export async function startSchedulerService(): Promise<void> {
if (schedulerInterval) {
console.log('[Scheduler] Scheduler service is already running');
return;
@@ -355,6 +648,14 @@ export function startSchedulerService(): void {
console.log('[Scheduler] Starting scheduler service');
// Check if we should auto-start mirroring based on environment variables
const shouldAutoStart = await checkAutoStartConfiguration();
if (shouldAutoStart) {
console.log('[Scheduler] Auto-start detected from environment variables, triggering initial import and mirror...');
await performInitialAutoStart();
}
// Run immediately on start
schedulerLoop().catch(error => {
console.error('[Scheduler] Error during initial scheduler run:', error);

View File

@@ -8,6 +8,7 @@ import { setupSignalHandlers } from './lib/signal-handlers';
import { auth } from './lib/auth';
import { isHeaderAuthEnabled, authenticateWithHeaders } from './lib/auth-header';
import { initializeConfigFromEnv } from './lib/env-config-loader';
import { db, users } from './lib/db';
// Flag to track if recovery has been initialized
let recoveryInitialized = false;
@@ -17,6 +18,7 @@ let schedulerServiceStarted = false;
let repositoryCleanupServiceStarted = false;
let shutdownManagerInitialized = false;
let envConfigInitialized = false;
let envConfigCheckCount = 0; // Track attempts to avoid excessive checking
export const onRequest = defineMiddleware(async (context, next) => {
// First, try Better Auth session (cookie-based)
@@ -79,14 +81,31 @@ export const onRequest = defineMiddleware(async (context, next) => {
}
}
// Initialize configuration from environment variables (only once)
if (!envConfigInitialized) {
envConfigInitialized = true;
try {
await initializeConfigFromEnv();
} catch (error) {
console.error('⚠️ Failed to initialize configuration from environment:', error);
// Continue anyway - environment config is optional
// Initialize configuration from environment variables
// Optimized to minimize performance impact:
// - Once initialized, no checks are performed (envConfigInitialized = true)
// - Limits checks to first 100 requests to avoid DB queries on every request if no users exist
// - After user creation, env vars load on next request and flag is set permanently
if (!envConfigInitialized && envConfigCheckCount < 100) {
envConfigCheckCount++;
// Only check every 10th request after the first 10 to reduce DB load
const shouldCheck = envConfigCheckCount <= 10 || envConfigCheckCount % 10 === 0;
if (shouldCheck) {
try {
const hasUsers = await db.select().from(users).limit(1).then(u => u.length > 0);
if (hasUsers) {
// We have users now, try to initialize config
await initializeConfigFromEnv();
envConfigInitialized = true; // This ensures we never check again
console.log('✅ Environment configuration loaded after user creation');
}
} catch (error) {
console.error('⚠️ Failed to initialize configuration from environment:', error);
// Continue anyway - environment config is optional
}
}
}
@@ -160,7 +179,10 @@ export const onRequest = defineMiddleware(async (context, next) => {
if (recoveryInitialized && !schedulerServiceStarted) {
try {
console.log('Starting automatic mirror scheduler service...');
startSchedulerService();
// Start the scheduler service (now async)
startSchedulerService().catch(error => {
console.error('Error in scheduler service startup:', error);
});
// Register scheduler service shutdown callback
registerShutdownCallback(async () => {