Please have a look at my working paper on ResearchGate: https://www.researchgate.net/publication/313164189_A_fast_gradient_descent_algorithm_for_strictly_quasiconvex_functions. My algorithm takes a different approach to gradient descent and is very fast according to my tests. I still need to test it on real-life, high-dimensional problems; maybe you can help me with that. ]]>

The University of Washington has been one of my top choices for a long time because of the strengths you mentioned, but I'm more of a probability person (math undergrad), so I was always planning on applying to the stats and/or math department.

For background on me: I've taken the year-long graduate probability sequence at UCLA, am in the graduate real analysis sequence, and have done research in REUs on compressed sensing and excited random walks. So basically I'm split between fairly pure mathematics (probability) and statistics.

Could you possibly contrast the statistics and CS departments?

]]>Regarding the canonical controllable form question in the post, I learned it from these lecture notes:

http://control.ee.ethz.ch/~ifalst/docs/LectureNotes.pdf

In Section 9.2, it is proved that if a single-input single-output system $(A,B,C,D)$ satisfies

$[B, AB, \dots, A^{n-1}B]\in \mathbb{R}^{n\times n}$ is of full rank,

then there exists an invertible matrix $T$ such that $(TAT^{-1}, TB, CT^{-1}, D)$ is an equivalent system in the canonical controllable form mentioned in the post. Such a system is called a controllable system (for some other reason).
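
To make the statement concrete, here is a minimal sketch of one standard way to build such a $T$. It is not taken from the lecture notes. It assumes the convention in which $TAT^{-1}$ is a companion matrix with the characteristic polynomial coefficients in its last row and $TB = (0, \dots, 0, 1)^T$; the notes may order things differently. The function names (`matmul`, `inverse`, `controllability_matrix`, `canonical_transform`) are made up for illustration. The construction takes $q$ to be the last row of $[B, AB, \dots, A^{n-1}B]^{-1}$ and stacks $q, qA, \dots, qA^{n-1}$ as the rows of $T$.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse(M):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(M)
    # Augment M with the identity matrix: [M | I].
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def controllability_matrix(A, B):
    """Return [B, AB, ..., A^{n-1}B] for an n x n matrix A and an n x 1 matrix B."""
    n = len(A)
    cols = []
    v = [[Fraction(b[0])] for b in B]
    for _ in range(n):
        cols.append([row[0] for row in v])
        v = matmul(A, v)
    # cols holds the columns; transpose so the result is a list of rows.
    return [list(row) for row in zip(*cols)]


def canonical_transform(A, B):
    """Return T with rows q, qA, ..., qA^{n-1} (assumed convention, see lead-in).

    Raises ValueError when the controllability matrix is not of full rank.
    """
    n = len(A)
    q = [inverse(controllability_matrix(A, B))[-1]]  # last row, kept as 1 x n
    rows = []
    for _ in range(n):
        rows.append(q[0])
        q = matmul(q, A)
    return rows


# Small example: a 2 x 2 system with a full-rank controllability matrix.
A = [[1, 2], [3, 4]]
B = [[1], [1]]
T = canonical_transform(A, B)
A_c = matmul(matmul(T, A), inverse(T))  # T A T^{-1}
B_c = matmul(T, B)                      # T B
```

For this example the characteristic polynomial of $A$ is $s^2 - 5s - 2$, so under the assumed convention the last row of `A_c` carries the coefficients $2$ and $5$. Exact fractions are used only to keep the check free of rounding; a numerical library would work equally well.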

]]>