Stay updated with the latest and most popular GitHub repositories. We fetch data from the official GitHub API, analyze it in-house, and update our listings every hour to bring you the freshest trends.
Finetune Llama 4, TTS, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
real time face swap and one-click video deepfake with only a single image
Agent Framework / shim to use Pydantic with LLMs
de-pixelate youtube video gaV-O6NPWrI
Tensors and Dynamic neural networks in Python with strong GPU acceleration
ACI.dev is the open source platform that connects your AI agents to 600+ tool integrations with multi-tenant auth, granular permissions, and access through direct function calling or a unified MCP server.
Run AI Agent in your browser.
Build resilient language agents as graphs.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. 接近GPT-4o表现的开源多模态对话模型
《动手学深度学习》:面向中文读者、能运行、可讨论。中英文版被70多个国家的500多所大学用于教学。
gemini轮询代理服务
Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions.
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
Convert PDF to markdown + JSON quickly with high accuracy
A specialized utility that automates discovering missing and upgrading your media collection!
Multi-Platform Live Stream Automatic Recording Tool | 多平台直播流自动录制工具 · 基于FFmpeg · 支持监控/定时/转码
AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly PyQt6 interface.
Maple Mono: Open source monospace font with round corner, ligatures and Nerd-Font for IDE and terminal, fine-grained customization options. 带连字和控制台图标的圆角等宽字体,中英文宽度完美2:1,细粒度的自定义选项
欢迎star⭐。🚀从聊天记录创造数字分身的一站式解决方案💡 使用微信聊天记录微调大语言模型,让大模型有“那味儿”。使用微信语音消息➕0.5B大模型实现高质量声音克隆,并绑定到聊天机器人,实现自己的数字分身。 数字克隆/数字分身/数字永生/声音克隆/LLM/大语言模型/微信聊天机器人/LoRA
The Memory layer for AI Agents
PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.
AWS MCP Servers — specialized MCP servers that bring AWS best practices directly to your development workflow
SGLang is a fast serving framework for large language models and vision language models.
本项目为xiaozhi-esp32提供后端服务,帮助您快速搭建ESP32设备控制服务器。Backend service for xiaozhi-esp32, helps you quickly build an ESP32 device control server.
Spark-TTS Inference Code
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A Conversational Speech Generation Model
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
✨ ml engineering agent that builds ml models in natural language
《深入JDBC安全:特殊URL构造与不出网反序列化利用技术揭秘》对应研究总结项目 "Deep Dive into JDBC Security: Special URL Construction and Non-Networked Deserialization Exploitation Techniques Revealed" - Research Summary Project
MuJoCo fruit fly body model and reinforcement learning tasks
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
The Python programming language
小红书笔记 | 评论爬虫、抖音视频 | 评论爬虫、快手视频 | 评论爬虫、B 站视频 | 评论爬虫、微博帖子 | 评论爬虫、百度贴吧帖子 | 百度贴吧评论回复爬虫 | 知乎问答文章|评论爬虫
An open and fair framework for everyone to build AI agents equipped with powerful skills. Launch your agent, improve the world, your wallet, or both!
(eBook,PDFs Translation) A multilingual eBook processing tool supporting all eBook formats. Features online and offline translation while preserving original layouts. Compatible with both scanned and digital PDFs. Elegant user interface. The world's highest-performing open-source layout-preserving eBook translator.
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge bases. It can effectively overcome the shortcomings of the traditional RAG vector similarity calculation model.
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.