Why growing in AI should start with studying mathematics and algorithms
On the Machine Learning Podcast, the head of Yandex's School of Data Analysis explains why the fundamentals (calculus, linear algebra, probability theory, algorithms) are not dry theory but the foundation for working with AI in 2026.
You will learn how a deep understanding of mathematics helps you write efficient code, debug models, and find your way around different areas of ML. Also: why even experienced developers benefit from returning to the fundamental disciplines.
Listen to the episode
#podcast #ML
#ml
In his MinT paper, Hyndman said he confused these two quantities in his previous paper. 😂
MinT is a simple method to make forecasts with hierarchical structure coherent. Here coherent means the sum of the lower level forecasts equals the higher level forecasts.
For example, suppose our time series have a structure like sales of coca cola + sales of spirit = sales of beverages. If this relation holds for our forecasts, the forecasts are coherent.
This may sound trivial, but the problem is in fact hard. There are many simple methods, such as forecasting only the lower levels (coca cola, spirit) and then using their sum as the higher level (sales of beverages). These are usually too naive to be effective.
MinT is a reconciliation method that combines high level forecasts and the lower level forecasts to find an optimal combination/reconciliation.
https://robjhyndman.com/papers/MinT.pdf
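To make the reconciliation step concrete, here is a minimal NumPy sketch of the trace-minimizing combination, y_tilde = S (S' W^-1 S)^-1 S' W^-1 y_hat, applied to the beverages example. The base forecasts and the error covariance W below are made-up numbers for illustration, and the function name mint_reconcile is my own. In practice W has to be estimated from forecast errors, which this sketch does not attempt.

```python
import numpy as np

# Summing matrix S maps the bottom-level series (coca cola, spirit)
# to all series in the hierarchy: [beverages, coca cola, spirit].
S = np.array([
    [1.0, 1.0],  # beverages = coca cola + spirit
    [1.0, 0.0],  # coca cola
    [0.0, 1.0],  # spirit
])


def mint_reconcile(y_hat, S, W):
    """Reconcile base forecasts y_hat given the summing matrix S and
    the covariance matrix W of the base forecast errors."""
    W_inv = np.linalg.inv(W)
    # G projects all base forecasts onto reconciled bottom-level forecasts.
    G = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    # Summing the bottom level back up guarantees coherence.
    return S @ G @ y_hat


# Hypothetical base forecasts: incoherent because 60 + 40 != 105.
y_hat = np.array([105.0, 60.0, 40.0])

# Hypothetical error covariance (assumption): a diagonal matrix where the
# top-level forecast is noisier than the bottom-level forecasts.
W = np.diag([4.0, 1.0, 1.0])

y_tilde = mint_reconcile(y_hat, S, W)
print(y_tilde)
```

With W set to the identity matrix, the same function reduces to plain OLS reconciliation. The choice of W is what decides how much each level is trusted.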
#ml
You spent 10k euros on GPUs, then realized the statistical baseline model is better. 🤣
https://github.com/Nixtla/statsforecast/tree/main/experiments/m3
#ml
https://developer.nvidia.com/blog/how-optimize-data-transfers-cuda-cc/
I find this post very useful. I have always wondered what happens after my dataloader has prepared everything for the GPU. I didn’t know that CUDA has to copy pageable host data into a page-locked (pinned) buffer first, before transferring it to the GPU.
I used to set pin_memory=True in a PyTorch DataLoader and benchmark it. To be honest, I have only observed very small improvements in most of my experiments. So I stopped caring about pin_memory.
After some digging, I also realized that getting a performance gain from setting pin_memory=True in DataLoader is tricky. If we neither use multiprocessing nor reuse the page-locked memory, it is hard to expect any gain.
(some other notes: https://datumorphism.leima.is/cards/machine-learning/practice/cuda-memory/)
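For reference, a minimal sketch of the setup discussed above: a DataLoader with pin_memory enabled, plus non_blocking copies to the device. The dataset is random toy data, and the helper make_loader is a made-up name. I keep num_workers at its default of 0 so the snippet runs anywhere, which is exactly the setting where I would not expect much gain. It falls back to the CPU (without pinning) when CUDA is not available.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(pin: bool) -> DataLoader:
    # Toy data (assumption): 256 samples with 16 features and binary labels.
    data = TensorDataset(torch.randn(256, 16), torch.randint(0, 2, (256,)))
    # pin_memory=True asks the loader to place each batch in page-locked
    # host memory, so CUDA can skip its own staging copy.
    return DataLoader(data, batch_size=64, pin_memory=pin)


use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Only pin when a GPU is present; pinning is pointless on CPU-only machines.
loader = make_loader(pin=use_cuda)

n_batches = 0
for x, y in loader:
    # non_blocking=True can only overlap the copy with compute
    # when the source tensor lives in pinned memory.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    n_batches += 1

print(n_batches, x.shape, x.device)
```

To benchmark this properly, one would compare pin=True against pin=False with num_workers > 0 and time many batches, since a single pass over toy data says very little.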
#ml
Amazon has been updating their Machine Learning University website. It is getting more and more interesting.
They have added an article about linear regression recently. There is a section in this article about interpreting linear models and it is just fun.
https://mlu-explain.github.io/
( Time machine: https://t.me/amneumarkt/293 )
#ML
This is interesting.
Toy Models of Superposition. [cited 15 Sep 2022]. Available: https://transformer-circuits.pub/2022/toy_model/index.html#learning
#ml
https://ai.googleblog.com/2022/08/optformer-towards-universal.html?m=1
I find this work counterintuitive. They took some descriptions of the optimization in machine learning and trained a transformer to "guesstimate" the hyperparameters of a model.
I understand that humans develop some "feeling" for the hyperparameters after working with the data and model for a while. But it is usually hard to extrapolate such knowledge to completely new data and models.
I guess our brain is doing some statistics based on our historical experiments. And we call this intuition. My "intuition" is that there is little generalizable knowledge in this problem. 🙈 It would have been so great if they had investigated the saliency maps.
#ml
Fotios Petropoulos initiated the forecasting encyclopaedia project. They published this paper recently.
Petropoulos, Fotios, Daniele Apiletti, Vassilios Assimakopoulos, Mohamed Zied Babai, Devon K. Barrow, Souhaib Ben Taieb, Christoph Bergmeir, et al. 2022. “Forecasting: Theory and Practice.” International Journal of Forecasting 38 (3): 705–871.
https://www.sciencedirect.com/science/article/pii/S0169207021001758
Also available here: https://forecasting-encyclopedia.com/
The paper covers many recent advances in forecasting, including deep learning models. There are some important topics missing but I’m sure they will cover them in future releases.