How GitHub Trending Works
A daily pipeline that collects, scores, and summarises rising GitHub repositories — fully automated and open-source.
Data Collection
Each day the pipeline calls the GitHub Search API to discover repositories that have been actively pushed within the last 30 days and have accumulated at least 50 stars. Up to 200 repositories are collected per run.
Results are paginated and deduplicated by GitHub repository ID before being passed to the ranker. If the API returns the same repository on multiple pages (which can happen at page boundaries), only the entry with the highest star count is kept.
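The collection rules above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code: the function names `build_search_query` and `dedupe_by_id` are hypothetical, and the sketch assumes each result is a dict shaped like a GitHub Search API repository item (with `id` and `stargazers_count` fields). The `pushed:` and `stars:` qualifiers are standard GitHub search syntax.

```python
from datetime import date, timedelta


def build_search_query(today: date, pushed_within_days: int = 30, min_stars: int = 50) -> str:
    """Build a search query string for recently pushed repos above a star floor."""
    cutoff = today - timedelta(days=pushed_within_days)
    return f"pushed:>={cutoff.isoformat()} stars:>={min_stars}"


def dedupe_by_id(items: list[dict]) -> list[dict]:
    """Keep one entry per repository ID, preferring the highest star count."""
    best: dict[int, dict] = {}
    for item in items:
        current = best.get(item["id"])
        if current is None or item["stargazers_count"] > current["stargazers_count"]:
            best[item["id"]] = item
    return list(best.values())


# Example: repository 1 appears on two pages with different star counts.
pages = [
    {"id": 1, "full_name": "a/x", "stargazers_count": 120},
    {"id": 2, "full_name": "b/y", "stargazers_count": 80},
    {"id": 1, "full_name": "a/x", "stargazers_count": 123},
]
unique = dedupe_by_id(pages)
```

Keying the dictionary on the repository ID rather than the name means a renamed or transferred repository is still treated as the same entry.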
The Trending Score Formula
Repositories are ranked by a composite trending score that blends five signals. The goal is to surface repos that are genuinely gaining momentum right now — not just the all-time most-starred projects.
| Signal | Weight | What it measures |
|---|---|---|
| Star Velocity | 50% | Stars divided by the repo's age in days (capped at 365). A repo accumulating stars quickly relative to its age scores higher — this is the dominant signal. |
| Fork Ratio | 15% | Forks ÷ stars. Reflects how often people actually build on the code, not just bookmark it with a star. |
| Watcher Ratio | 10% | Watchers ÷ stars. Sustained community interest: people who opt in to receive notifications indicate deeper engagement than a passive star. |
| Recency Bonus | 20% | Linear decay from 1.0 → 0.0 over the 30 days since the repo's last push. Rewards actively maintained projects and penalises abandoned ones. |
| Issue Health | 5% | Calculated as 1 / (1 + (open_issues / stars) × 10). Penalises repos drowning in unresolved issues relative to their community size. |
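The following Python sketch shows one way the five signals could be combined into a single score. Several details are assumptions rather than facts from this page: "capped at 365" is read as capping the repo's age at 365 days, the ratios are clamped to 1.0, and the constant `VELOCITY_FOR_FULL_SCORE` (the stars-per-day rate that maps to a full velocity score) is purely hypothetical, since the page does not say how star velocity is normalised. The weights, the recency decay, and the issue-health formula come from the table.

```python
from dataclasses import dataclass

# Weights taken from the table above.
WEIGHTS = {
    "star_velocity": 0.50,
    "fork_ratio": 0.15,
    "watcher_ratio": 0.10,
    "recency": 0.20,
    "issue_health": 0.05,
}

# ASSUMPTION: stars/day that earns a velocity score of 1.0 (not stated in the text).
VELOCITY_FOR_FULL_SCORE = 100.0


@dataclass
class Repo:
    stars: int
    forks: int
    watchers: int
    open_issues: int
    age_days: int
    days_since_push: int


def signals(repo: Repo) -> dict[str, float]:
    """Compute each signal on a 0.0 to 1.0 scale."""
    stars = max(repo.stars, 1)               # guard against division by zero
    age = min(max(repo.age_days, 1), 365)    # ASSUMPTION: the cap applies to age
    return {
        "star_velocity": min((stars / age) / VELOCITY_FOR_FULL_SCORE, 1.0),
        "fork_ratio": min(repo.forks / stars, 1.0),
        "watcher_ratio": min(repo.watchers / stars, 1.0),
        "recency": max(0.0, 1.0 - repo.days_since_push / 30),
        "issue_health": 1.0 / (1.0 + (repo.open_issues / stars) * 10),
    }


def trending_score(repo: Repo) -> float:
    """Weighted sum of the five signals."""
    values = signals(repo)
    return sum(WEIGHTS[name] * values[name] for name in WEIGHTS)
```

For example, a 10-day-old repo with 1,000 stars, 100 forks, 50 watchers, and 100 open issues that was pushed today would score 0.745 under these assumptions: 0.5 from velocity, 0.015 from forks, 0.005 from watchers, 0.2 from recency, and 0.025 from issue health.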
AI-Generated Summaries
For each repository the pipeline fetches the README, the top-level file tree, key config files (e.g. package.json, pyproject.toml, Cargo.toml), and the primary entry-point source file. This context is sent to Claude (Haiku model by default) to generate:
- Summary — A concise 2–4 sentence plain-English description of what the repo does and why it is interesting.
- Topic tags — Between 1 and 5 machine-readable tags (e.g. rust, llm, cli-tool) that power the tag-cloud filter on the main page.
Summaries are cached in the database. A repository is summarised only if it was not already present from a previous run, which keeps API costs low and the daily run fast.
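The caching behaviour can be illustrated with a small Python sketch built on the standard `sqlite3` module. The table name `summaries`, its columns, and the function names are hypothetical. The `summarise` callable stands in for the call to the language model and is passed in, so the sketch makes no claims about the real API client.

```python
import sqlite3
from typing import Callable


def ensure_schema(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) cache table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS summaries ("
        "repo_id INTEGER PRIMARY KEY, summary TEXT NOT NULL, tags TEXT NOT NULL)"
    )


def get_or_create_summary(
    conn: sqlite3.Connection,
    repo_id: int,
    context: str,
    summarise: Callable[[str], tuple[str, list[str]]],
) -> tuple[str, list[str]]:
    """Return the cached summary, calling the summariser only for unseen repos."""
    row = conn.execute(
        "SELECT summary, tags FROM summaries WHERE repo_id = ?", (repo_id,)
    ).fetchone()
    if row is not None:
        return row[0], row[1].split(",") if row[1] else []

    summary, tags = summarise(context)
    tags = tags[:5]  # the page states at most 5 tags per repository
    conn.execute(
        "INSERT INTO summaries (repo_id, summary, tags) VALUES (?, ?, ?)",
        (repo_id, summary, ",".join(tags)),
    )
    conn.commit()
    return summary, tags
```

Because the lookup happens before the summariser is invoked, a repository seen on a previous run costs one indexed database read and no API call.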
Update Schedule
The pipeline runs daily at 8:00 AM via Windows Task Scheduler. Each run executes the following steps in order:
docs/index.html site from the latest database snapshot.
Data Storage
All results are stored in a SQLite database at data/trending.db. The database accumulates every daily snapshot, so historical data is preserved across runs. The database is seeded with 16 days of historical snapshots (over 1.5 million rows), giving the period filters meaningful baseline data from day one.
This enables the pipeline to compute star-growth deltas over multiple time windows by diffing successive snapshots — the data behind the 24h, 7-day, and 30-day growth stats shown on each repository card and used by the period filter thresholds.
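As a rough sketch of the snapshot-diffing idea, the Python below computes a star-growth delta for an arbitrary window. The schema is an assumption made for illustration: a table named `snapshots` with `repo_id`, `snapshot_date` (ISO date text), and `stars` columns. The real layout of data/trending.db may differ.

```python
import sqlite3
from datetime import date, timedelta
from typing import Optional


def stars_on_or_before(conn: sqlite3.Connection, repo_id: int, day: date) -> Optional[int]:
    """Star count from the most recent snapshot taken on or before `day`."""
    row = conn.execute(
        "SELECT stars FROM snapshots "
        "WHERE repo_id = ? AND snapshot_date <= ? "
        "ORDER BY snapshot_date DESC LIMIT 1",
        (repo_id, day.isoformat()),
    ).fetchone()
    return None if row is None else row[0]


def star_growth(
    conn: sqlite3.Connection, repo_id: int, today: date, window_days: int
) -> Optional[int]:
    """Stars gained over the window, or None when no baseline snapshot exists."""
    current = stars_on_or_before(conn, repo_id, today)
    baseline = stars_on_or_before(conn, repo_id, today - timedelta(days=window_days))
    if current is None or baseline is None:
        return None
    return current - baseline
```

Returning `None` rather than zero when the baseline is missing keeps "no history yet" distinguishable from "no growth", which matters for any filter that treats the two cases differently.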
Filters Explained
The main page provides several ways to narrow the list of trending repositories:
- Search — Full-text match across the repository name, GitHub description, AI-generated summary, and topic tags.
- Language — Filter by GitHub's detected primary programming language for the repository.
- Time period — Restricts results to repos that have crossed a minimum star-growth threshold within the selected window. Each period uses a percentage of the repo's total star count as the bar:
| Period | Threshold | What it means |
|---|---|---|
| 24 hours | ≥ 30% | Gained at least 30% of its total star count in a single day — a genuine viral spike. |
| 7 days | ≥ 60% | Gained at least 60% of its total star count over the past week — sustained rapid growth. |
| 30 days | ≥ 120% | Gained more stars in a single month than its entire prior history — explosive breakout growth. |
Repos with no historical data for a given window are always shown rather than hidden. The "All" option displays every repo regardless of growth rate.
- Sort — Order by Trending (composite score), Most Stars, or Most Forks.
- Min Stars — Slider to set a minimum total star count threshold, filtering out lower-signal repositories.
- Tag cloud — AI-generated topic tags appear beneath the filter bar. Click one or more tags to filter the list — multiple tags use OR logic, so any matching tag qualifies a repo.
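Two of the filters above, the time-period threshold and the tag cloud, can be expressed as small predicates. The Python below is a sketch under stated assumptions: the function names and the period keys ("24h", "7d", "30d", "all") are hypothetical, and growth is compared against the repo's total star count as the description says. Integer percentages are used to avoid floating-point edge cases at the exact threshold.

```python
from typing import Optional

# Minimum growth per window, as a percentage of the repo's total stars.
PERIOD_THRESHOLD_PERCENT = {"24h": 30, "7d": 60, "30d": 120}


def passes_period_filter(period: str, growth: Optional[int], total_stars: int) -> bool:
    """True if the repo should be shown for the selected time period."""
    if period == "all" or growth is None:
        # "All" shows everything; repos without history are shown, not hidden.
        return True
    return growth * 100 >= PERIOD_THRESHOLD_PERCENT[period] * total_stars


def passes_tag_filter(repo_tags: list[str], selected_tags: list[str]) -> bool:
    """OR logic: any one matching tag qualifies the repo."""
    if not selected_tags:
        return True  # no tags selected means no tag filtering
    return bool(set(repo_tags) & set(selected_tags))
```

With these definitions, a repo with 1,000 total stars needs at least 300 new stars in a day to pass the 24-hour filter, and a repo tagged rust and cli-tool passes a selection of llm and rust because one tag overlaps.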