Fixes#141
The repository metadata field was missing from the database schema, which
caused the metadata sync state (issues, PRs, releases, etc.) to not persist.
This resulted in duplicate issues being created every time a repository was
synced because the system couldn't track what had already been mirrored.
Changes:
- Added metadata text field to repositories table in schema
- Added metadata field to repositorySchema Zod validation
- Generated database migration 0008_serious_thena.sql
Root cause analysis:
1. Code tried to read/write repository.metadata to track mirrored components
2. The metadata field didn't exist in the database schema
3. On sync, metadataState.components.issues was always false
4. This triggered re-mirroring of all issues, creating duplicates
The fix ensures metadata state persists between mirrors and syncs, preventing
duplicate metadata (issues, PRs, releases) from being created in Gitea.
Since githubConfig is stored as JSON in the database (not individual columns),
no database migration is needed. However, we need to handle reading old configs
that still use the 'skipStarredIssues' field name.
Changes:
- Added skipStarredIssues as optional field in Zod schema (marked deprecated)
- Updated config mapper to check both starredCodeOnly and skipStarredIssues
- Old configs will continue to work seamlessly
- New configs will use starredCodeOnly field name
This ensures zero-downtime upgrades for existing installations.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The previous name 'skipStarredIssues' was misleading as it now skips ALL
metadata (not just issues) for starred repositories. The new name
'starredCodeOnly' better reflects the actual behavior - mirroring only
source code for starred repos.
Changes:
- Renamed skipStarredIssues → starredCodeOnly in all files
- Updated UI label from "Don't fetch issues" to "Code-only mode"
- Updated description to clarify it skips ALL metadata types:
issues, PRs, labels, milestones, wiki, and releases
- Updated database schema, types, config mapper, and all components
- Updated Helm charts, CI configs, and documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Prevent Automation UI from overriding schedule:
- mapDbScheduleToUi now parses intervals robustly (cron/duration/seconds) via parseInterval
- mapUiScheduleToDb merges with existing config and stores interval as seconds (no lossy cron conversion)
- /api/config passes existing scheduleConfig to preserve ENV-sourced values
- schedule-sync endpoint uses parseInterval for nextRun calculation
- Add AUTO_MIRROR_REPOS support and scheduled auto-mirror phase:
- scheduleConfig schema includes autoImport and autoMirror
- env-config-loader reads AUTO_MIRROR_REPOS and carries through to DB
- scheduler auto-mirrors imported/pending/failed repos when autoMirror is enabled before regular sync
- docker-compose and ENV docs updated with AUTO_MIRROR_REPOS
- Tests pass and build succeeds
- Implemented comprehensive GitHub API rate limit handling:
- Integrated @octokit/plugin-throttling for automatic retry with exponential backoff
- Added RateLimitManager service to track and enforce rate limits
- Store rate limit status in database for persistence across restarts
- Automatic pause and resume when limits are exceeded
- Proper user identification for 5000 req/hr authenticated limit (vs 60 unauthenticated)
- Improved rate limit UI/UX:
- Removed intrusive rate limit card from dashboard
- Toast notifications only at critical thresholds (80% and 100% usage)
- All rate limit events logged for debugging
- Optimized for GitHub's API constraints:
- Reduced default batch size from 10 to 5 repositories
- Added documentation about GitHub's 100 concurrent request limit
- Better handling of repositories with many issues/PRs
- Add missing database fields (language, description, mirroredLocation, destinationOrg) to repository operations
- Add missing organization fields (publicRepositoryCount, privateRepositoryCount, forkRepositoryCount) to schema
- Update GitRepo interface to include all required database fields
- Fix GitHub data fetching functions to map all fields correctly
- Update all sync endpoints (main, repository, organization, scheduler) to handle new fields
This fixes the "SQLite query expected X values, received Y" error when importing
large numbers (4.6k+) of starred repositories by ensuring all database fields
are properly mapped from GitHub API responses through to database insertion.
- Add fork tags to repository table and dashboard list components
- Display 'Fork' badge for repositories where isForked is true
- Enhance organization cards to show breakdown of public, private, and fork repositories
- Update organization API to respect user configuration filters (private repos, forks)
- Add visual indicators with colored dots for each repository type
- Ensure consistent filtering between repository and organization APIs
- Fix issue where private repositories weren't showing due to configuration filtering
- Added new fields to the mirror_jobs table for job resilience, including job_type, batch_id, total_items, completed_items, item_ids, completed_item_ids, in_progress, started_at, completed_at, and last_checkpoint.
- Implemented database migration scripts to update the mirror_jobs table schema.
- Introduced processWithResilience utility for handling item processing with checkpointing and recovery capabilities.
- Updated API routes for mirroring organizations and repositories to utilize the new resilience features.
- Created recovery system to detect and resume interrupted jobs on application startup.
- Added middleware to initialize the recovery system when the server starts.