Language Models are Open Knowledge Graphs (video)

On July 16 I posted that it would be cool to use GPT to build knowledge graphs, and on 22 Oct someone did just that. Language Models are Open Knowledge Graphs shows how you can use the attention matrices of BERT/GPT-2 to extract the relationships between given entities in a text and build a knowledge graph from them. Yannic thoroughly discusses the paper.

Here are comments from Michael Galkin:

The paper shows impressive results, extracting facts that are already present in Wikidata as well as adding new ones unknown to Wikidata. There is still a lot of room for improvement, though.

* It's computationally expensive: 20 servers, each with 4 Tesla K80 GPUs, were running for 48 / 96 hours. GPUs go brrr ;)
* Neither BERT nor GPT has a notion of entities, so you'd need auxiliary tools and NER annotators to map the tokens "New" "York" to "New York" as one entity. From that PoV, using KG-augmented LMs trained with millions of explicit entities might be a promising move.
* The fact search strategy is key. The authors used a generative beam search strategy with some post-processing filtering, and it is arguably expensive. The whole space of facts in triple-based KGs is a Cartesian product of all entities and relations (E x R x E), and only a small fraction of those facts are correct.
* One might say that the extracted graphs are way too star-shaped, i.e., there are not many links between leaf nodes. That is a direct consequence of the fact extraction strategy.
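
To make the attention-plus-beam-search idea a bit more concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the head and tail entities are already located, uses a tiny hand-made attention matrix instead of real BERT/GPT-2 weights, and skips the post-processing filters. The function name `beam_search_relation` and all the numbers are hypothetical, chosen only for illustration.

```python
from typing import List, Optional, Tuple


def beam_search_relation(
    tokens: List[str],
    attention: List[List[float]],
    head: int,
    tail: int,
    beam_size: int = 2,
) -> Optional[Tuple[float, List[str]]]:
    """Search for the best-scoring token path from `head` to `tail`.

    Assumption: attention[j][i] is how strongly token j attends to the
    earlier token i. A path only moves forward through the sentence, and
    the tokens strictly between head and tail are read as the relation.
    """
    beams = [(0.0, [head])]  # (accumulated score, token indices so far)
    finished = []
    while beams:
        candidates = []
        for score, path in beams:
            last = path[-1]
            for nxt in range(last + 1, tail + 1):
                extended = (score + attention[nxt][last], path + [nxt])
                if nxt == tail:
                    # Keep only paths with at least one relation token.
                    if len(path) > 1:
                        finished.append(extended)
                else:
                    candidates.append(extended)
        # Keep the top `beam_size` unfinished paths for the next round.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]
    if not finished:
        return None
    best_score, best_path = max(finished, key=lambda c: c[0])
    return best_score, [tokens[i] for i in best_path[1:-1]]


# Toy input: made-up attention weights, not taken from any real model.
tokens = ["Dylan", "is", "a", "songwriter"]
attention = [
    [0.0, 0.0, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.0],  # "is" attends to "Dylan"
    [0.1, 0.2, 0.0, 0.0],  # "a" attends to "Dylan", "is"
    [0.1, 0.4, 0.1, 0.0],  # "songwriter" attends to "Dylan", "is", "a"
]
result = beam_search_relation(tokens, attention, head=0, tail=3)
if result is not None:
    score, relation = result
    print((tokens[0], " ".join(relation), tokens[3]), round(score, 3))
```

Even in this toy form the cost argument is visible: the search has to be run for every candidate (head, tail) pair in every sentence. Because each search starts from a head entity and stops at a tail, it also suggests why the resulting graphs tend to come out star-shaped.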