Hugging Face (Twitter)

RT @vanstriendaniel: Datasets and benchmarks drive AI progress, but finding papers that introduce new ones means digging through thousands of arXiv abstracts.

Updated the Dataset Papers on ArXiv app to surface them: 52K+ papers classified as introducing new datasets from 212K CS papers.

Semantic search, confidence filtering, updated weekly (using @huggingface Jobs!)

Powered by a fine-tuned ModernBERT classifier. Full dataset stored in @lancedb Lance format on the Hub, with vector embeddings stored with the dataset.
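To make "semantic search" plus "confidence filtering" concrete, here is a minimal sketch of how the two might combine: drop records below a classifier-confidence threshold, then rank the rest by cosine similarity to a query embedding. This is not the app's actual code. The records, the field names (`title`, `confidence`, `embedding`), and the 3-dimensional vectors are all made up for illustration; a real setup would presumably read the embeddings from the Lance dataset and embed the query with a model.

```python
import math

# Toy records standing in for classified papers. Field names and values
# are hypothetical, not the schema of the real dataset.
PAPERS = [
    {"title": "A new benchmark for code generation", "confidence": 0.97, "embedding": [0.9, 0.1, 0.0]},
    {"title": "Faster attention kernels", "confidence": 0.12, "embedding": [0.8, 0.2, 0.1]},
    {"title": "A multilingual speech corpus", "confidence": 0.88, "embedding": [0.1, 0.9, 0.2]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_embedding, papers, min_confidence=0.5, top_k=5):
    """Filter by classifier confidence, then rank by similarity to the query."""
    kept = [p for p in papers if p["confidence"] >= min_confidence]
    ranked = sorted(
        kept,
        key=lambda p: cosine(query_embedding, p["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


# Assumed query embedding; in practice this would come from an embedding model.
results = search([1.0, 0.0, 0.0], PAPERS, min_confidence=0.5, top_k=2)
titles = [p["title"] for p in results]
```

Filtering before ranking keeps low-confidence papers out of the results even when their embeddings sit close to the query, which is the case for the second toy record here.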