Boost then Convolve: Gradient Boosting Meets Graph Neural Networks

In our new work at ICLR 2021, we explore how to apply Gradient Boosted Decision Trees (GBDT) to graphs. Surprisingly, I had not come across any papers that test the performance of pure GBDT on graphs, for example for node classification.

GBDTs are usually used for heterogeneous data (e.g. in Kaggle competitions): the columns can be categorical and can differ in scale and meaning (e.g. an income column vs. an age column). Such data is quite common in the real world, but most research graph datasets have sparse, homogeneous node features (e.g. bag-of-words features or word embeddings). So we asked whether GNNs are effective on graphs with heterogeneous features.

The first insight is that you can simply pretrain a GBDT on the node features and use its predictions when training the GNN. This alone gives the GNN a boost. Second, we propose a scheme for training the GBDT and the GNN end-to-end, which improves performance further. Third, this combination of GBDT and GNN, which we call BGNN, converges much faster than a pure GNN and is therefore usually faster to train.

Some limitations.

* BGNN works well with heterogeneous features, so Cora and other datasets with homogeneous features are still better off with a plain GNN.
* The approach works for node regression and classification. We have some ideas for extending it to link prediction or graph classification, but we haven't worked them out yet.

If you are interested in continuing this line of work, let me know. The code and datasets are available here.
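To make the first insight concrete (pretrain a GBDT on node features, then hand its predictions to a graph model), here is a minimal toy sketch in plain Python. It is not the BGNN implementation from the paper: the GBDT is approximated by boosted one-split regression stumps, the GNN is replaced by a single parameter-free neighbor-averaging step, and all function names, features, and edges are hypothetical and chosen only for illustration.

```python
# Toy sketch only: boosted regression stumps stand in for a GBDT, and one
# round of neighbor averaging stands in for a GNN. All names are hypothetical.

def fit_stump(X, residuals):
    """Pick the (feature, threshold) split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return None if best is None else best[1:]

def predict_stump(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value

def fit_gbdt(X, y, n_rounds=20, lr=0.5):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no valid split exists (all rows identical)
            break
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, row) for p, row in zip(preds, X)]
    return base, stumps, lr

def predict_gbdt(model, X):
    base, stumps, lr = model
    return [base + lr * sum(predict_stump(s, row) for s in stumps) for row in X]

def mean_aggregate(values, edges):
    """Average each node's value with its neighbors' values (self included)."""
    neighbors = {i: [i] for i in range(len(values))}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return [sum(values[k] for k in neighbors[i]) / len(neighbors[i])
            for i in range(len(values))]

# Heterogeneous node features: [age, income, is_member]. Nodes 4 and 5 are unlabeled.
features = [
    [23, 31000.0, 0],
    [25, 35000.0, 0],
    [47, 82000.0, 1],
    [52, 90000.0, 1],
    [24, 33000.0, 0],
    [50, 88000.0, 1],
]
train_idx = [0, 1, 2, 3]
targets = [1.0, 1.2, 3.9, 4.1]
edges = [(0, 1), (0, 4), (1, 4), (2, 3), (2, 5), (3, 5)]

# Step 1: fit the GBDT stand-in on the labeled nodes' raw features.
model = fit_gbdt([features[i] for i in train_idx], targets)
# Step 2: predict for every node, then let the graph step mix in neighbor information.
gbdt_out = predict_gbdt(model, features)
smoothed = mean_aggregate(gbdt_out, edges)
print(gbdt_out)
print(smoothed)
```

In a realistic setup the stumps would be replaced by a proper GBDT library and the averaging step by a trained GNN that takes the GBDT predictions as node inputs. The end-to-end scheme mentioned above goes further than this two-stage pipeline and is not shown here.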