GitHub Trends@githubtrending · Post #15123 · 09/06/2025, 11:30 AM
#rust #artificial_intelligence #big_data #data_engineering #distributed_computing #machine_learning #multimodal #python
Daft is a data engine for processing large-scale data in Python or SQL. It supports complex data types such as images and tensors, works interactively for quick data exploration, and scales out to large cloud clusters using Ray. Daft integrates with cloud storage and data catalogs, which makes it a fit for data engineering, analytics, and machine learning workflows, including preparing large multimodal datasets for AI models.
https://github.com/Eventual-Inc/Daft
GitHub Trends@githubtrending · Post #15350 · 12/21/2025, 11:30 AM
#rust #ai #change_data_capture #context_engineering #data #data_engineering #data_indexing #data_infrastructure #data_processing #etl #hacktoberfest #help_wanted #indexing #knowledge_graph #llm #pipeline #python #rag #real_time #semantic_search
**CocoIndex** is an open-source Python tool with a Rust core for transforming data into AI-ready formats such as vector indexes or knowledge graphs. Data flows are defined in roughly 100 lines of code from plug-and-play blocks for sources, embeddings, and targets. To get started, install it with `pip install cocoindex`, add Postgres, and run. It keeps targets in sync with fresh data, recomputing only what changed and tracking lineage. **This saves time when building scalable RAG and semantic search pipelines, and avoids complex ETL and stale data in production AI apps.**
https://github.com/cocoindex-io/cocoindex
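The "minimal recompute on changes" idea in the CocoIndex post can be illustrated with a small standard-library sketch. This is not CocoIndex's API or its actual implementation. The names `IncrementalIndex`, `content_hash`, and `toy_embed` are hypothetical, and the sketch assumes that a change is detected by comparing a content hash per source key.

```python
import hashlib
from typing import Callable, Dict, List


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a source document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def toy_embed(text: str) -> int:
    """Hypothetical stand-in for an expensive transform such as an embedding."""
    return len(text.split())


class IncrementalIndex:
    """Illustrative only: re-runs the transform just for changed sources."""

    def __init__(self, transform: Callable[[str], object]) -> None:
        self.transform = transform
        self._hashes: Dict[str, str] = {}    # source key -> last seen hash
        self.outputs: Dict[str, object] = {}  # source key -> derived value

    def sync(self, sources: Dict[str, str]) -> List[str]:
        """Bring outputs in line with sources; return the keys recomputed."""
        recomputed: List[str] = []
        for key, text in sources.items():
            digest = content_hash(text)
            if self._hashes.get(key) != digest:
                # New or modified source: recompute and remember its hash.
                self.outputs[key] = self.transform(text)
                self._hashes[key] = digest
                recomputed.append(key)
        # Drop outputs whose source no longer exists, so nothing goes stale.
        for key in list(self._hashes):
            if key not in sources:
                del self._hashes[key]
                del self.outputs[key]
        return recomputed


index = IncrementalIndex(transform=toy_embed)
docs = {"a.md": "hello world", "b.md": "vector indexes and knowledge graphs"}
first = index.sync(docs)    # both documents are new, so both are processed
docs["b.md"] = "updated text"
second = index.sync(docs)   # only b.md changed, so only b.md is processed
```

A real system would also persist the hashes and record which outputs came from which inputs. The post's mentions of Postgres and lineage tracking point in that direction, though the details here are an assumption.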