@amneumarkt · Post #379 · 15.07.2022, 06:09

#ml The recommended readings serve as a good curriculum for transformers. https://web.stanford.edu/class/cs25/index.html#course

@amneumarkt · Post #378 · 13.07.2022, 19:28

#ml BentoML 1.0! https://modelserving.com/blog/introducing-bentoml-10

@amneumarkt · Post #375 · 12.07.2022, 19:02

#ml I was playing with dalle-mini ( https://github.com/borisdayma/dalle-mini ). So... in the eyes of Dalle-mini, 1. science == chemistry (? I guess), 2. scientists are men. Tried several times, same conclusions. It is so hard to fight against the bias in ML models. --- Update: OpenAI is fixing this. https://openai.com/blog/reducing-bias-and-improving-safety-in-dall-e-2/

@amneumarkt · Post #370 · 29.06.2022, 17:11

#ml Mitchell M, Wu S, Zaldivar A, Barnes P, Vasserman L, Hutchinson B, et al. Model cards for model reporting. Proceedings of the Conference on Fairness, Accountability, and Transparency. New York, NY, USA: ACM; 2019. doi:10.1145/3287560.3287596 https://arxiv.org/abs/1810.03993

@amneumarkt · Post #369 · 10.06.2022, 02:58

#ml This is also like one thousand years later... PyMC 4.0 Release Announcement — PyMC project website https://www.pymc.io/blog/v4_announcement.html

@amneumarkt · Post #366 · 27.05.2022, 07:06

#ml This is hilarious. Source: https://mobile.twitter.com/arankomatsuzaki/status/1529278580189908993 Paper: https://arxiv.org/abs/2205.11916

@amneumarkt · Post #364 · 21.05.2022, 16:38

#ml I have heard about DeepETA before but never thought it was a transformer. According to this blog post by Uber, they are using an encoder-decoder architecture with linear attention. The post also explains how they made the transformer fast. DeepETA: How Uber Predicts Arrival Times Using Deep Learning https://eng.uber.com/deepeta-how-uber-predicts-arrival-times/
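
For a rough idea of why linear attention is cheaper than softmax attention, here is a minimal pure-Python sketch of the generic kernelized variant. It assumes the common elu(x) + 1 feature map and is not taken from Uber's implementation. The function names are made up for illustration. The trick is to sum phi(k) v^T over all keys once, so the cost grows linearly with sequence length instead of quadratically.

```python
import math
from typing import List

Matrix = List[List[float]]


def feature_map(x: float) -> float:
    # elu(x) + 1: strictly positive, so the attention weights stay positive.
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(Q: Matrix, K: Matrix, V: Matrix) -> Matrix:
    """out_i = phi(q_i)^T S / (phi(q_i)^T z), with S = sum_j phi(k_j) v_j^T."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # key-value summary, d_k x d_v
    z = [0.0] * d_k                        # normalizer, sum_j phi(k_j)

    # One pass over keys and values: O(N * d_k * d_v).
    for k_row, v_row in zip(K, V):
        phi_k = [feature_map(x) for x in k_row]
        for a in range(d_k):
            z[a] += phi_k[a]
            for b in range(d_v):
                S[a][b] += phi_k[a] * v_row[b]

    # One pass over queries, reusing S and z instead of an N x N score matrix.
    out = []
    for q_row in Q:
        phi_q = [feature_map(x) for x in q_row]
        norm = sum(p * zz for p, zz in zip(phi_q, z))
        out.append(
            [sum(phi_q[a] * S[a][b] for a in range(d_k)) / norm for b in range(d_v)]
        )
    return out
```

Each output row is a weighted average of the rows of V, just like in ordinary attention, only with a different similarity function.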

@amneumarkt · Post #363 · 21.05.2022, 12:23

#ml Parsimony with cognitive resource limitations 🤔 https://www.nature.com/articles/s41586-022-04743-9

@amneumarkt · Post #360 · 18.05.2022, 16:27

#ml Finally... We can now utilize the real power of M1 chips. Introducing Accelerated PyTorch Training on Mac | PyTorch https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/ I have been following this issue: https://github.com/pytorch/pytorch/issues/47702#issuecomment-1130162835 There were even some fights. 😂
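
A small sketch of how one might pick the new backend, assuming PyTorch 1.12 or later, where `torch.backends.mps.is_available()` exists. The helper name `pick_device` is made up here, and the fallback to "cpu" when PyTorch is not installed is only there so that the snippet runs anywhere.

```python
import importlib.util


def pick_device() -> str:
    """Return "mps" on Apple silicon if available, else "cuda", else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed at all

    import torch

    # Older PyTorch versions have no torch.backends.mps, hence the getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(pick_device())
```

The returned string can be passed to `torch.device(...)` or `.to(...)` as usual.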

@amneumarkt · Post #351 · 05.05.2022, 19:48

#ml https://ts.gluon.ai/ Highly recommended! If you are working on deep learning for forecasting, GluonTS is a great package. It simplifies all the tedious data preprocessing, slicing, and backtesting. We can then spend our time on implementing the models themselves (there are a lot of ready-to-use models). What's even better, we can use PyTorch Lightning! See this repository for a list of transformer-based forecasting models. https://github.com/kashif/pytorch-transformer-ts
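
To show the kind of slicing boilerplate meant here, below is a plain-Python sketch of rolling-origin backtest splits. This is not the GluonTS API. The function `rolling_splits` and its parameters are hypothetical, only meant to illustrate what such a package takes off our hands.

```python
from typing import Iterator, List, Sequence, Tuple


def rolling_splits(
    series: Sequence[float], prediction_length: int, num_windows: int
) -> Iterator[Tuple[List[float], List[float]]]:
    """Yield (history, target) pairs, moving the cut-off forward each window."""
    for w in range(num_windows, 0, -1):
        # The last window ends exactly at the end of the series.
        cut = len(series) - w * prediction_length
        if cut <= 0:
            raise ValueError("series too short for the requested windows")
        yield list(series[:cut]), list(series[cut:cut + prediction_length])


if __name__ == "__main__":
    for history, target in rolling_splits(list(range(10)), 2, 3):
        print(len(history), target)
```

A model would be fitted or conditioned on each history and scored against the matching target.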

@amneumarkt · Post #350 · 05.05.2022, 05:34

#ml Came across this post this morning. I realized the reason I am not writing a lot in Julia is simply that I don't know how to write quality code in Julia. When we build a model in Python, we know all the details that make it quality code. For a new language, I'm just terrified by the number of details I need to be aware of. Ah, I'm getting older. JAX vs Julia (vs PyTorch) · Patrick Kidger https://kidger.site/thoughts/jax-vs-julia/

@amneumarkt · Post #348 · 03.05.2022, 06:10

#ml I have heard about the information bottleneck so many times but never went back and read the original papers. I spent some time on it and found it quite interesting. It is philosophically based on what was described in Vapnik's The Nature of Statistical Learning Theory, where he discussed how generalization works by enforcing parsimony. In this information bottleneck paper, the most interesting thing is the quantified generalization gap and complexity gap. With these, we know where to go on the information plane. It's a good read. Tishby N, Zaslavsky N. Deep Learning and the Information Bottleneck Principle. arXiv [cs.LG]. 2015. Available: http://arxiv.org/abs/1503.02406
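
As a toy illustration of the trade-off behind the information plane, here is a sketch of the standard information bottleneck objective I(X;T) - beta * I(T;Y) for small discrete distributions, where T is the compressed representation of X. The function names and the tiny example are my own assumptions, not code from the paper.

```python
import math
from typing import List

Matrix = List[List[float]]


def mutual_information(joint: Matrix) -> float:
    """Mutual information in bits of a joint distribution given as a table."""
    p_row = [sum(row) for row in joint]
    p_col = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (p_row[i] * p_col[j]))
    return mi


def ib_objective(p_xy: Matrix, p_t_given_x: Matrix, beta: float) -> float:
    """I(X;T) - beta * I(T;Y), assuming the Markov chain Y - X - T."""
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(p_t_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # p(x, t) = p(x) p(t|x)
    p_xt = [[p_x[i] * p_t_given_x[i][t] for t in range(n_t)] for i in range(n_x)]
    # p(t, y) = sum_x p(x, y) p(t|x)
    p_ty = [
        [sum(p_xy[i][y] * p_t_given_x[i][t] for i in range(n_x)) for y in range(n_y)]
        for t in range(n_t)
    ]
    return mutual_information(p_xt) - beta * mutual_information(p_ty)


if __name__ == "__main__":
    p_xy = [[0.5, 0.0], [0.0, 0.5]]  # toy case where Y is a copy of X
    identity = [[1.0, 0.0], [0.0, 1.0]]  # T keeps everything about X
    constant = [[1.0], [1.0]]            # T throws everything away
    print(ib_objective(p_xy, identity, beta=2.0))
    print(ib_objective(p_xy, constant, beta=2.0))
```

A lower value is better: the first term penalizes how much of X the representation keeps, and the second term rewards how much it still tells us about Y.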
