@amneumarkt · Post #238 · 29.06.2021, 10:58

#ML A Turing lecture article by the three famous DL guys. It's an overview of the history, development, and future of AI. There are two very interesting points in the outlook section:
- "From homogeneous layers to groups of neurons that represent entities." In biological brains, there are memory engrams and motifs that almost do this.
- "Multiple time scales of adaptation." This is another key idea that has been discussed numerous times. One of the craziest things about our brain is the diversity of time scales of plasticity, i.e., different mechanisms change the brain on different time scales.
Reference: Bengio Y, Lecun Y, Hinton G. Deep learning for AI. Commun ACM. 2021;64: 58–65. doi:10.1145/3448250 https://dl.acm.org/doi/10.1145/3448250

@amneumarkt · Post #237 · 21.06.2021, 08:25

#ML Geometric Deep Learning is an attempt to unify deep learning using geometry. Instead of building deep neural networks that ignore the symmetries in the data and leave them to be discovered by the network, we build the symmetries of the problem into the network. For example, instead of flattening the matrix of a cat image into some predetermined order of pixels, we note that a translated 2D image of a cat is still, without any doubt, a cat. This symmetry can be enforced in the network.
BTW, if you come from a physics background, you have most likely heard about symmetries in physical theories, e.g., through Noether's theorem. In the history of physics, there was an era of many theories, yet most of them turned out to be connected or even unified under the umbrella of geometry. Geometric deep learning is another "benevolent propaganda" based on a similar idea.
References:
1. Bronstein, Michael. “ICLR 2021 Keynote - ‘Geometric Deep Learning: The Erlangen Programme of ML’ - M Bronstein.” Video. YouTube, June 8, 2021. https://www.youtube.com/watch?v=w6Pw4MOzMuo.
2. Bronstein MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P. Geometric deep learning: going beyond Euclidean data. arXiv [cs.CV]. 2016. Available: http://arxiv.org/abs/1611.08097
3. Bronstein MM, Bruna J, Cohen T, Veličković P. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges. arXiv [cs.LG]. 2021. Available: http://arxiv.org/abs/2104.13478
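
A minimal sketch of the translation argument in plain Python, not taken from the references. A 1D circular convolution stands in for a convolutional layer, and a fixed weight vector over the flattened input stands in for a dense layer. The names `circular_conv`, `shift`, and `dense`, the toy signal, and the weights are all made up for illustration.

```python
def circular_conv(signal, kernel):
    """Slide a kernel over a 1D signal with wrap-around indexing."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, k):
    """Cyclically translate a 1D signal k steps to the right."""
    k %= len(signal)
    return signal[-k:] + signal[:-k]


def dense(signal, weights):
    """A single dense unit: one weight per position in a fixed pixel order."""
    return sum(w * s for w, s in zip(weights, signal))


# Hypothetical toy data: a small "blob" on a ring of 8 pixels.
signal = [0, 0, 3, 5, 3, 0, 0, 0]
kernel = [1, 2, 1]
dense_weights = [1, 2, 3, 4, 5, 6, 7, 8]

moved = shift(signal, 2)

# Equivariance: convolving the shifted input equals shifting the convolved output.
equivariant = circular_conv(moved, kernel) == shift(circular_conv(signal, kernel), 2)

# Invariance after global max pooling: the pooled feature ignores the shift.
invariant = max(circular_conv(moved, kernel)) == max(circular_conv(signal, kernel))

# The dense unit has no such guarantee: its output depends on where the blob sits.
dense_changes = dense(moved, dense_weights) != dense(signal, dense_weights)

print(equivariant, invariant, dense_changes)
```

The convolution shares one kernel across all positions, which is what bakes the translation symmetry into the layer. The dense unit ties each weight to a position, so it would have to learn the symmetry from data.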

@amneumarkt · Post #235 · 13.06.2021, 10:28

#ML The Bayesian hierarchical model provides a process to use Bayesian inference hierarchically to update the posteriors. What is a Bayesian model? In a Bayesian linear regression problem, we can take the posterior from the previous data points and use it as our new prior when inferring from new data. In other words, as more data come in, our belief is updated. However, this is a problem if some clusters in the dataset have small sample sizes, aka small support. If we fit the model to these few samples, we may get a huge credible interval. One simple idea to mitigate this problem is to introduce some constraints on how the priors can change. For example, we can introduce a hyperprior that is parametrized by new parameters. The model then becomes hierarchical, since we also have to model the new parameters. The referenced post, "Bayesian Hierarchical Modeling at Scale", provides some examples of coding such models using numpyro with performance in mind.
https://florianwilhelm.info/2020/10/bayesian_hierarchical_modelling_at_scale/
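
A minimal sketch of the "posterior becomes the new prior" idea, in plain Python rather than numpyro and not taken from the referenced post. It assumes the simplest possible regression, y = w * x + noise, with a single slope w, a Gaussian prior on w, and a known noise variance, so the posterior has a closed form. The function name `update`, the simulated data, and all numbers are illustrative assumptions.

```python
import math
import random


def update(prior_mean, prior_var, xs, ys, noise_var):
    """Gaussian posterior (mean, variance) of the slope w in y = w * x + noise."""
    precision = 1.0 / prior_var + sum(x * x for x in xs) / noise_var
    mean = (
        prior_mean / prior_var + sum(x * y for x, y in zip(xs, ys)) / noise_var
    ) / precision
    return mean, 1.0 / precision


# Simulated data with a hypothetical true slope of 2.
random.seed(0)
true_w, noise_var = 2.0, 1.0
xs = [random.uniform(-1, 1) for _ in range(40)]
ys = [true_w * x + random.gauss(0, math.sqrt(noise_var)) for x in xs]

prior = (0.0, 10.0)  # vague prior: mean 0, variance 10

# All 40 points at once.
batch = update(*prior, xs, ys, noise_var)

# The same 40 points in two steps: the first posterior is the second prior.
first = update(*prior, xs[:20], ys[:20], noise_var)
sequential = update(*first, xs[20:], ys[20:], noise_var)

# A cluster with only 3 points under the vague prior: wide posterior.
small = update(*prior, xs[:3], ys[:3], noise_var)

# The same 3 points under a tighter prior (variance 0.5), standing in for
# the constraint that a hyperprior shared across clusters would impose.
tight = update(0.0, 0.5, xs[:3], ys[:3], noise_var)

print(batch, sequential)
print(small[1], tight[1])
```

In this conjugate setting the two-step update should agree with the batch update up to floating-point error. The last two lines only illustrate the motivation: a hierarchical model would learn the tighter prior from the other clusters instead of hard-coding it as done here.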

@amneumarkt · Post #230 · 17.05.2021, 19:52

#ML An interesting talk:
-------------------
Dear all,
We are pleased to have Anna Golubeva speak on "Are wider nets better given the same number of parameters?" on Wednesday May 19th at 12:00 ET. You can find further details here and listen to the talk here. We hope you can join!
Best, Sven

@amneumarkt · Post #221 · 06.04.2021, 07:38

#ML Voss, et al., "Branch Specialization", Distill, 2021. https://distill.pub/2020/circuits/branch-specialization/
TLDR;
- Branch: neuron clusters that are roughly segregated locally, e.g., AlexNet branches by design.
- Branch specialization: branches specialize in specific tasks, e.g., the two AlexNet branches specialize in different detectors (color detector or black-white filter).
- Is it a coincidence? No. Branch specialization repeatedly occurs in different trainings and different models.
- Do we find the same branch specializations in different models and tasks? Yes.
- Why? The authors' proposal is that a positive feedback loop will be established between layers, and this loop enhances what the branch will do.
- Our brains have specialized regions too. Are there any connections?

@amneumarkt · Post #220 · 04.04.2021, 21:10

#ML Silla CN, Freitas AA. A survey of hierarchical classification across different application domains. Data Min Knowl Discov. 2011;22: 31–72. doi:10.1007/s10618-010-0175-9
A survey paper on hierarchical classification problems. It is a bit dated, since it does not cover classifier chains, but it summarizes most of the ideas in hierarchical classification. The authors also propose a framework for categorizing such problems along two different dimensions (ranks).

@amneumarkt · Post #210 · 18.03.2021, 07:54

#ML How do we interpret the capacity of neural nets? Naively, we would represent the capacity by the number of parameters. Even for the Hopfield network, Hopfield introduced the concept of capacity using entropy, which in turn is related to the number of parameters. But adding layers to neural nets also introduces regularization. It might be related to the capacity of the neural nets, but we do not have a clear picture. This paper introduces a new perspective using sparse approximation theory, which represents the data by encouraging parsimony. The more parameters a model has, the more accurately it represents the training data. But this causes generalization issues, as similar data points in the test data may have been pushed apart [^Murdock2021]. By mapping the neural nets to shallow "overcomplete frames", the capacity of the neural nets becomes easier to interpret.
[^Murdock2021]: Murdock C, Lucey S. Reframing Neural Networks: Deep Structure in Overcomplete Representations. arXiv [cs.LG]. 2021. Available: http://arxiv.org/abs/2103.05804
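
For readers who have not met sparse approximation before, here is a generic toy sketch in plain Python. It is not the construction from the paper: it only shows what "encouraging parsimony" in an overcomplete frame can look like, using ISTA (iterative soft thresholding) on an L1-penalized least-squares objective. The 2D signal, the four-atom frame `A`, the penalty `lam`, and all function names are assumptions made up for this example.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero; small values become exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def residual(A, y, x):
    """A x - y for a frame A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]


def objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1."""
    r = residual(A, y, x)
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xj) for xj in x)


def ista(A, y, lam, steps=1000):
    """Iterative soft thresholding, starting from the all-zero code."""
    n_atoms = len(A[0])
    # 1 / ||A||_F^2 is a conservative step size (at most 1 / Lipschitz constant).
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n_atoms
    for _ in range(steps):
        r = residual(A, y, x)
        grad = [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n_atoms)]
        x = [soft_threshold(xj - step * g, step * lam) for xj, g in zip(x, grad)]
    return x


# Overcomplete frame: 4 atoms (columns) for a 2D signal.
A = [
    [1.0, 0.0, 0.7, -0.7],
    [0.0, 1.0, 0.7, 0.7],
]
y = [1.4, 1.4]  # equals 2 * (third atom), but also 1.4 * (first) + 1.4 * (second)
lam = 0.1

x_hat = ista(A, y, lam)
print(x_hat, objective(A, y, x_hat, lam))
```

The signal can be written with two atoms or with one. The L1 penalty makes the one-atom code cheaper, so the weight should concentrate on the third atom.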

@amneumarkt · Post #208 · 10.03.2021, 15:53

#ML Simple algorithm, powerful results https://avinayak.github.io/algorithms/programming/2021/02/19/finding-mona-lisa-in-the-game-of-life.html

@amneumarkt · Post #207 · 10.03.2021, 08:56

#ML hmmm https://bair.berkeley.edu/blog/2021/03/09/maxent-robust-rl/

@amneumarkt · Post #206 · 09.03.2021, 20:28

#ML I just found an elegant decision tree visualization package for sklearn. I have been trying to explain decision tree results to many business people. It is very hard. This package makes it much easier to explain the results to a non-technical person. https://github.com/parrt/dtreeviz

@amneumarkt · Post #200 · 01.03.2021, 18:47

#ML Haha Deep Learning Activation Functions using Dance Moves https://www.reddit.com/r/learnmachinelearning/comments/lvehmi/deep_learning_activation_functions_using_dance/?utm_medium=android_app&utm_source=share

@amneumarkt · Post #198 · 25.02.2021, 08:01

#ML note2self: From ref 1
> we can take any expected utility maximization problem, and decompose it into an entropy minimization term plus a “make-the-world-look-like-this-specific-model” term.
This view should be combined with ref 2. If the utility is related to the curvature of the discrete state space, we are making a connection between entropy + KL divergence and curvature on graphs. (This idea has to be polished in depth.)
Refs:
1. Trivial proof but interesting perspective: https://www.lesswrong.com/posts/voLHQgNncnjjgAPH7/utility-maximization-description-length-minimization
2. Samal A, Pharasi HK, Ramaia SJ, Kannan H, Saucan E, Jost J, Chakraborti A. Network geometry and market instability. R Soc Open Sci. 2021;8: 201734. http://doi.org/10.1098/rsos.201734
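
A quick numerical check of the decomposition quoted from ref 1, as a sketch under my own assumptions: a finite state space, natural logarithms, and a model M(x) proportional to exp(u(x)) with normalizer Z. Under these assumptions E_P[u] = log Z - (H(P) + KL(P || M)), so maximizing expected utility over P is the same as minimizing entropy plus KL divergence to M. The utilities and the distribution `p` below are arbitrary illustrative numbers.

```python
import math


def entropy(p):
    """Shannon entropy H(P) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_divergence(p, m):
    """KL(P || M) in nats; assumes m > 0 wherever p > 0."""
    return sum(pi * math.log(pi / mi) for pi, mi in zip(p, m) if pi > 0)


def expected_utility(p, u):
    """E_P[u] over a finite state space."""
    return sum(pi * ui for pi, ui in zip(p, u))


# Hypothetical utilities over three states and the model they induce.
utilities = [1.0, 0.0, -2.0]
z = sum(math.exp(u) for u in utilities)
model = [math.exp(u) / z for u in utilities]

# An arbitrary distribution over the same states.
p = [0.5, 0.3, 0.2]

lhs = expected_utility(p, utilities)
rhs = math.log(z) - (entropy(p) + kl_divergence(p, model))

print(lhs, rhs)
```

Since log Z does not depend on P, the two sides differ from the "entropy + KL" term only by a constant and a sign. The KL term vanishes exactly when P equals the model.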
