In honor of the last week of classes, there will be two colloquia within the next three days! They say “April Showers Bring May Flowers.” For those of you who showered at least once in April, the flowers are here: Math Colloquia! To mark the occasion, I will put 6 GIFs into this email instead of the normal 3.
Here’s the information for the first colloquium, which is happening tomorrow!
Who? Professor Richard Ehrenborg
What? A theorem by Baxter
When? Wednesday, May 1st at 4:30pm
Where? Fine Hall 214
Food? Your choice of Large-Sized Boba (we’re going all out this week) or Regular-Sized Boba Fett
Abstract: We discuss a theorem by Baxter from 1966 on planar graphs.
By reformulating the theorem we make an incursion into topology
and prove analogous results in higher dimensions. A connection
with Sperner’s Lemma is also highlighted.
Joint work with Gábor Hetyei.
There will be large-sized boba.
Now let’s talk about the second colloquium, which is happening on Friday.
Who? Dr. Nadav Cohen
What? Analyzing Optimization in Deep Learning via Trajectories
When? Friday, May 3rd at 4:30pm
Where? Fine Hall 214
Food? Your choice of Large-Sized Boba (we weren’t kidding above about going all out this week) or Post-Sarlacc Pit Boba Fett
Abstract: The prominent approach for analyzing optimization in deep learning is based on the geometry of loss landscapes. While this approach has led to successful treatments of shallow (two layer) networks, it suffers from inherent limitations when facing deep (three or more layer) models. In this talk I will argue that a more refined perspective is in order, one that accounts for the specific trajectories taken by the optimizer. I will then demonstrate a manifestation of the latter approach, by analyzing the trajectories of gradient descent over arbitrarily deep linear neural networks. We will derive what is, to the best of my knowledge, the most general guarantee to date for efficient convergence to global minimum of a gradient-based algorithm training a deep network. Moreover, in stark contrast to conventional wisdom, we will see that sometimes, gradient descent can train a deep linear network faster than a classic linear model. In other words, depth can accelerate optimization, even without any gain in expressiveness, and despite introducing non-convexity to a formerly convex problem.
We hope to see you there!
Your academic chairs,
Nathan and Tristan
PS: For those of you who watched Endgame already, I was traumatized when they killed Batman.
PPS: For those of you who are all caught up on Game of Thrones, it was really sad to see Gandalf die.