Post content
Hugging Face (Twitter) RT @Xianbao_QIAN: opendatalab/AICC: Markdown version of Common Crawl, extracted by MinerU. Very cool. It only has two shards for now, but someone could scale it up to the entire Common Crawl.