feat: Refactor database cleanup process by removing scripts and updating documentation to use the Activity Log for event management

Arunavo Ray
2025-05-24 17:58:37 +05:30
parent 4efe741c64
commit d7ce2a6908
10 changed files with 6 additions and 266 deletions

@@ -481,7 +481,7 @@ Try the following steps:
> docker compose up -d
> ```
>
> This setup includes automatic database maintenance that runs daily to clean up old events and mirror jobs, preventing the database from growing too large. You can customize the retention periods by setting the `EVENTS_RETENTION_DAYS` and `JOBS_RETENTION_DAYS` environment variables.
> This setup provides a complete containerized deployment for the Gitea Mirror application.
#### Database Maintenance
@@ -498,35 +498,9 @@ Try the following steps:
>
> # Reset user accounts (for development)
> bun run reset-users
>
> # Clean up old events (keeps last 7 days by default)
> bun run cleanup-events
>
> # Clean up old events with custom retention period (e.g., 30 days)
> bun run cleanup-events 30
>
> # Clean up old mirror jobs (keeps last 7 days by default)
> bun run cleanup-jobs
>
> # Clean up old mirror jobs with custom retention period (e.g., 30 days)
> bun run cleanup-jobs 30
>
> # Clean up both events and mirror jobs
> bun run cleanup-all
> ```
>
> For automated maintenance, consider setting up cron jobs to run the cleanup scripts periodically:
>
> ```bash
> # Add these to your crontab
> # Clean up events daily at 2 AM
> 0 2 * * * cd /path/to/gitea-mirror && bun run cleanup-events
>
> # Clean up mirror jobs daily at 3 AM
> 0 3 * * * cd /path/to/gitea-mirror && bun run cleanup-jobs
> ```
>
> **Note:** When using Docker, these cleanup jobs are automatically scheduled inside the container with the default retention period of 7 days. You can customize the retention periods by setting the `EVENTS_RETENTION_DAYS` and `JOBS_RETENTION_DAYS` environment variables in your docker-compose file.
> **Note:** For cleaning up old activities and events, use the cleanup button in the Activity Log page of the web interface.
> [!NOTE]

@@ -41,9 +41,6 @@ services:
- GITEA_ORGANIZATION=${GITEA_ORGANIZATION:-github-mirrors}
- GITEA_ORG_VISIBILITY=${GITEA_ORG_VISIBILITY:-public}
- DELAY=${DELAY:-3600}
# Database maintenance settings
- EVENTS_RETENTION_DAYS=${EVENTS_RETENTION_DAYS:-7}
- JOBS_RETENTION_DAYS=${JOBS_RETENTION_DAYS:-7}
healthcheck:
test: ["CMD", "wget", "--no-verbose", "--tries=3", "--spider", "http://localhost:4321/api/health"]
interval: 30s

@@ -30,24 +30,7 @@ if [ "$JWT_SECRET" = "your-secret-key-change-this-in-production" ] || [ -z "$JWT
echo "JWT_SECRET has been set to a secure random value"
fi
# Set up automatic database cleanup cron job
# Default to 7 days retention for events and mirror jobs unless specified by environment variables
EVENTS_RETENTION_DAYS=${EVENTS_RETENTION_DAYS:-7}
JOBS_RETENTION_DAYS=${JOBS_RETENTION_DAYS:-7}
# Create cron directory if it doesn't exist
mkdir -p /app/data/cron
# Create the cron job file
cat > /app/data/cron/cleanup-cron <<EOF
# Run event cleanup daily at 2 AM
0 2 * * * cd /app && bun dist/scripts/cleanup-events.js ${EVENTS_RETENTION_DAYS} >> /app/data/cleanup-events.log 2>&1
# Run mirror jobs cleanup daily at 3 AM
0 3 * * * cd /app && bun dist/scripts/cleanup-mirror-jobs.js ${JOBS_RETENTION_DAYS} >> /app/data/cleanup-mirror-jobs.log 2>&1
# Empty line at the end is required for cron to work properly
EOF
# Skip dependency installation entirely for pre-built images
# Dependencies are already installed during the Docker build process
@@ -223,34 +206,7 @@ if [ -f "package.json" ]; then
echo "Setting application version: $npm_package_version"
fi
# Set up cron if it's available
if command -v crontab >/dev/null 2>&1; then
echo "Setting up automatic database cleanup cron jobs..."
# Install cron if not already installed
if ! command -v crond >/dev/null 2>&1; then
echo "Installing cron..."
apk add --no-cache dcron
fi
# Try to install the cron job, but don't fail if it doesn't work
if crontab /app/data/cron/cleanup-cron 2>/dev/null; then
echo "✅ Cron job installed successfully"
# Start cron service (Alpine uses crond)
if command -v crond >/dev/null 2>&1; then
crond -b
echo "✅ Cron daemon started"
else
echo "⚠️ Warning: Could not start cron service. Automatic database cleanup will not run."
fi
else
echo "⚠️ Warning: Could not install cron job (permission issue). Automatic database cleanup will not be set up."
echo "Consider setting up external scheduled tasks to run cleanup scripts."
fi
else
echo "⚠️ Warning: crontab command not found. Automatic database cleanup will not be set up."
echo "Consider setting up external scheduled tasks to run cleanup scripts."
fi
# Run startup recovery to handle any interrupted jobs
echo "Running startup recovery..."

@@ -17,9 +17,7 @@
"check-db": "bun scripts/manage-db.ts check",
"fix-db": "bun scripts/manage-db.ts fix",
"reset-users": "bun scripts/manage-db.ts reset-users",
"cleanup-events": "bun scripts/cleanup-events.ts",
"cleanup-jobs": "bun scripts/cleanup-mirror-jobs.ts",
"cleanup-all": "bun scripts/cleanup-events.ts && bun scripts/cleanup-mirror-jobs.ts",
"startup-recovery": "bun scripts/startup-recovery.ts",
"startup-recovery-force": "bun scripts/startup-recovery.ts --force",
"test-recovery": "bun scripts/test-recovery.ts",

@@ -64,19 +64,7 @@ The following scripts help manage events in the SQLite database:
### Event Cleanup (cleanup-events.ts)
Removes old events and duplicate events from the database to prevent it from growing too large.
```bash
# Remove events older than 7 days (default) and duplicates
bun scripts/cleanup-events.ts
# Remove events older than X days and duplicates
bun scripts/cleanup-events.ts 14
```
This script can be scheduled to run periodically (e.g., daily) using cron or another scheduler. When using Docker, this is automatically scheduled to run daily.
### Remove Duplicate Events (remove-duplicate-events.ts)
@@ -90,19 +78,7 @@ bun scripts/remove-duplicate-events.ts
bun scripts/remove-duplicate-events.ts <userId>
```
### Mirror Jobs Cleanup (cleanup-mirror-jobs.ts)
Removes old mirror jobs from the database to prevent it from growing too large.
```bash
# Remove mirror jobs older than 7 days (default)
bun scripts/cleanup-mirror-jobs.ts
# Remove mirror jobs older than X days
bun scripts/cleanup-mirror-jobs.ts 14
```
This script can be scheduled to run periodically (e.g., daily) using cron or another scheduler. When using Docker, this is automatically scheduled to run daily.
### Fix Interrupted Jobs (fix-interrupted-jobs.ts)

@@ -1,50 +0,0 @@
#!/usr/bin/env bun
/**
* Script to clean up old events from the database
* This script should be run periodically (e.g., daily) to prevent the events table from growing too large
*
* Usage:
* bun scripts/cleanup-events.ts [days]
*
* Where [days] is the number of days to keep events (default: 7)
*/
import { cleanupOldEvents, removeDuplicateEvents } from "../src/lib/events";
// Parse command line arguments
const args = process.argv.slice(2);
const daysToKeep = args.length > 0 ? parseInt(args[0], 10) : 7;
if (isNaN(daysToKeep) || daysToKeep < 1) {
console.error("Error: Days to keep must be a positive number");
process.exit(1);
}
async function runCleanup() {
try {
console.log(`Starting event cleanup (retention: ${daysToKeep} days)...`);
// First, remove duplicate events
console.log("Step 1: Removing duplicate events...");
const duplicateResult = await removeDuplicateEvents();
console.log(`- Duplicate events removed: ${duplicateResult.duplicatesRemoved}`);
// Then, clean up old events
console.log("Step 2: Cleaning up old events...");
const result = await cleanupOldEvents(daysToKeep);
console.log(`Cleanup summary:`);
console.log(`- Duplicate events removed: ${duplicateResult.duplicatesRemoved}`);
console.log(`- Read events deleted: ${result.readEventsDeleted}`);
console.log(`- Unread events deleted: ${result.unreadEventsDeleted}`);
console.log(`- Total events deleted: ${result.readEventsDeleted + result.unreadEventsDeleted + duplicateResult.duplicatesRemoved}`);
console.log("Event cleanup completed successfully");
} catch (error) {
console.error("Error running event cleanup:", error);
process.exit(1);
}
}
// Run the cleanup
runCleanup();

@@ -1,102 +0,0 @@
#!/usr/bin/env bun
/**
* Script to clean up old mirror jobs from the database
* This script should be run periodically (e.g., daily) to prevent the mirror_jobs table from growing too large
*
* Usage:
* bun scripts/cleanup-mirror-jobs.ts [days]
*
* Where [days] is the number of days to keep mirror jobs (default: 7)
*/
import { db, mirrorJobs } from "../src/lib/db";
import { lt, and, eq } from "drizzle-orm";
// Parse command line arguments
const args = process.argv.slice(2);
const daysToKeep = args.length > 0 ? parseInt(args[0], 10) : 7;
if (isNaN(daysToKeep) || daysToKeep < 1) {
console.error("Error: Days to keep must be a positive number");
process.exit(1);
}
/**
* Cleans up old mirror jobs to prevent the database from growing too large
* Should be called periodically (e.g., daily via a cron job)
*
* @param maxAgeInDays Number of days to keep mirror jobs (default: 7)
* @returns Object containing the number of completed and in-progress jobs deleted
*/
async function cleanupOldMirrorJobs(
maxAgeInDays: number = 7
): Promise<{ completedJobsDeleted: number; inProgressJobsDeleted: number }> {
try {
console.log(`Cleaning up mirror jobs older than ${maxAgeInDays} days...`);
// Calculate the cutoff date for completed jobs
const cutoffDate = new Date();
cutoffDate.setDate(cutoffDate.getDate() - maxAgeInDays);
// Delete completed jobs older than the cutoff date
// Only delete jobs that are not in progress (inProgress = false)
const completedResult = await db
.delete(mirrorJobs)
.where(
and(
eq(mirrorJobs.inProgress, false),
lt(mirrorJobs.timestamp, cutoffDate)
)
);
const completedJobsDeleted = completedResult.changes || 0;
console.log(`Deleted ${completedJobsDeleted} completed mirror jobs`);
// Calculate a much older cutoff date for in-progress jobs (3x the retention period)
// This is to handle jobs that might have been abandoned or crashed
const inProgressCutoffDate = new Date();
inProgressCutoffDate.setDate(inProgressCutoffDate.getDate() - (maxAgeInDays * 3));
// Delete in-progress jobs that are significantly older
// This helps clean up jobs that might have been abandoned due to crashes
const inProgressResult = await db
.delete(mirrorJobs)
.where(
and(
eq(mirrorJobs.inProgress, true),
lt(mirrorJobs.timestamp, inProgressCutoffDate)
)
);
const inProgressJobsDeleted = inProgressResult.changes || 0;
console.log(`Deleted ${inProgressJobsDeleted} abandoned in-progress mirror jobs`);
return { completedJobsDeleted, inProgressJobsDeleted };
} catch (error) {
console.error("Error cleaning up old mirror jobs:", error);
return { completedJobsDeleted: 0, inProgressJobsDeleted: 0 };
}
}
// Run the cleanup
async function runCleanup() {
try {
console.log(`Starting mirror jobs cleanup (retention: ${daysToKeep} days)...`);
// Call the cleanupOldMirrorJobs function
const result = await cleanupOldMirrorJobs(daysToKeep);
console.log(`Cleanup summary:`);
console.log(`- Completed jobs deleted: ${result.completedJobsDeleted}`);
console.log(`- Abandoned in-progress jobs deleted: ${result.inProgressJobsDeleted}`);
console.log(`- Total jobs deleted: ${result.completedJobsDeleted + result.inProgressJobsDeleted}`);
console.log("Mirror jobs cleanup completed successfully");
} catch (error) {
console.error("Error running mirror jobs cleanup:", error);
process.exit(1);
}
}
// Run the cleanup
runCleanup();
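
For reviewers who want to reason about what the removed script deleted, its two cutoffs can be sketched in isolation. This is a minimal sketch, not code from the repository: `mirrorJobCutoffs`, `MirrorJobCutoffs`, and `MS_PER_DAY` are hypothetical names. It also assumes fixed 24-hour days, whereas the removed script used `Date.setDate`, which works in local calendar days.

```typescript
// Hypothetical helper, not part of the repository: computes the two cutoff
// dates that the removed script derived from a single retention period.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

interface MirrorJobCutoffs {
  // Completed jobs older than this were deleted.
  completedCutoff: Date;
  // In-progress jobs older than this were treated as abandoned and deleted.
  inProgressCutoff: Date;
}

function mirrorJobCutoffs(now: Date, maxAgeInDays: number = 7): MirrorJobCutoffs {
  if (!Number.isInteger(maxAgeInDays) || maxAgeInDays < 1) {
    throw new RangeError("maxAgeInDays must be a positive integer");
  }
  return {
    completedCutoff: new Date(now.getTime() - maxAgeInDays * MS_PER_DAY),
    // In-progress jobs get 3x the retention period before removal.
    inProgressCutoff: new Date(now.getTime() - maxAgeInDays * 3 * MS_PER_DAY),
  };
}
```

With the default 7-day retention, completed jobs were kept for 7 days and in-progress jobs for 21 days before being removed.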

@@ -150,20 +150,11 @@ Events in Gitea Mirror (such as repository mirroring operations) are stored in t
# View all events in the database
bun scripts/check-events.ts
# Clean up old events (default: older than 7 days)
bun scripts/cleanup-events.ts
# Clean up old mirror jobs (default: older than 7 days)
bun scripts/cleanup-mirror-jobs.ts
# Clean up both events and mirror jobs
bun run cleanup-all
# Mark all events as read
bun scripts/mark-events-read.ts
```
When using Docker, database cleanup is automatically scheduled to run daily. You can customize the retention periods by setting the `EVENTS_RETENTION_DAYS` and `JOBS_RETENTION_DAYS` environment variables in your docker-compose file.
For cleaning up old activities and events, use the cleanup button in the Activity Log page of the web interface.
### Health Check Endpoint

@@ -179,4 +179,4 @@ After your initial setup:
- Check out the [Configuration Guide](/configuration) for advanced settings
- Review the [Architecture Documentation](/architecture) to understand the system
- For server deployments, set up monitoring using the health check endpoint
- Consider setting up a cron job to clean up old events: `bun scripts/cleanup-events.ts`
- Use the cleanup button in the Activity Log page to manage old events and activities

@@ -214,7 +214,7 @@ export async function removeDuplicateEvents(userId?: string): Promise<{ duplicat
/**
* Cleans up old events to prevent the database from growing too large
* Should be called periodically (e.g., daily via a cron job)
* This function is used by the cleanup button in the Activity Log page
*
* @param maxAgeInDays Number of days to keep events (default: 7)
* @param cleanupUnreadAfterDays Number of days after which to clean up unread events (default: 2x maxAgeInDays)