Comments on: Maximizing a function over orthogonal matrices
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices/
Comments on Ask MetaFilter post Maximizing a function over orthogonal matrices
Sun, 26 Oct 2008 10:59:37 -0800
Question: Maximizing a function over orthogonal matrices
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices
Maximizing a function over orthogonal matrices, or: solving polynomial order 2 systems. Help me out or point me to a better forum / listhost. <br /><br /> I have a function f(T) of a matrix that I want to maximize, which is order two in the elements of T. Easy: setting the partial derivatives to zero gives a linear system. Here's the catch: I also want T to be orthonormal (the dot product of two rows is zero if they differ and one if they are the same -- the extension of orthogonal to non-square matrices). <br>
<br>
The obvious way to maximize a function with constraints is to tack on Lagrange multipliers g(T) = f(T) + sum{lambda[ij](Ti . Tj - delta(i,j) ) }, but now g(T) is order three in parameters, and partial derivatives are order two and I don't know how to solve that system generally.<br>
<br>
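To make the degree count concrete, here is a sketch of the stationarity conditions for the Lagrangian g(T) above, writing T_i for the i-th row of T. Collecting the multipliers into a matrix Λ is an assumption added for illustration, not notation from the original post.

```latex
\begin{aligned}
g(T,\Lambda) &= f(T) + \sum_{i,j} \lambda_{ij}\,\bigl(T_i \cdot T_j - \delta_{ij}\bigr), \\
\frac{\partial g}{\partial T_{ab}} &= \frac{\partial f}{\partial T_{ab}}
  + \sum_{j} \bigl(\lambda_{aj} + \lambda_{ja}\bigr)\, T_{jb} = 0, \\
\frac{\partial g}{\partial \lambda_{ij}} &= T_i \cdot T_j - \delta_{ij} = 0 .
\end{aligned}
```

In matrix form this reads \nabla f(T) + (\Lambda + \Lambda^{\top})\,T = 0 together with T T^{\top} = I. Because f is quadratic, \nabla f is linear in T, so every equation has degree at most two in the unknowns (T, Λ), which is the order-two system described above.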
The dimension of T can be huge, tens of thousands. Let's not suggest numerics unless there's an amazing trick to deal with the dimensionality.<br>
<br>
Any ideas? Maximization subject to orthogonality seems like something that should have come up a lot and that someone should have figured out.
posted by a robot made out of meat, Sun, 26 Oct 2008 08:26:48 -0800
Tags: optimization, lagrange, polynomial, system, orthogonal, orthonormal, math, mathematics, statistics, matrix, algebra, resolved
By: vernondalhart
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1519997
Well, the point of Lagrange multipliers is that for each additional variable lambda<sub>ij</sub>, you also add an additional equation. So the system should still be, in principle, solvable.<br>
<br>
When you say that the function is order two in the elements of T, do you mean that it is essentially a polynomial of degree 2 in n<sup>2</sup> variables?<br>
<br>
Honestly, I'm not sure that there is a better way to go about this. Setting each of your partials to zero results in the intersection of n<sup>2</sup> hyperplanes in <b>R</b><sup>n<sup>2</sup></sup>. In general position, these should all intersect in a single point.<br>
<br>
Of course, this doesn't answer your question. I'm largely brainstorming, but coming up with nothing helpful.
posted by vernondalhart, Sun, 26 Oct 2008 10:59:37 -0800
By: a robot made out of meat
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1520136
Yes, it contains terms like T[1,2]*T[9,3] and T[4,7]^2.
posted by a robot made out of meat, Sun, 26 Oct 2008 13:27:30 -0800
By: metastability
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1520149
1. Are you looking for solutions over the complex numbers or real numbers?<br>
2. O(n, R) is compact, so if your function is real-valued, and continuous, (which it is since it is polynomial), then it attains a maximum. So that's good to know.<br>
3. <a href="http://www.singular.uni-kl.de/index.html">Singular</a> may be able to solve systems of polynomial equations over C. I don't remember exactly what it can do, but it's worth checking out.
posted by metastability, Sun, 26 Oct 2008 13:53:28 -0800
By: a robot made out of meat
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1520248
Real values only. Yeah, I know that it has a solution. Finding it is the trick.<br>
<br>
Note: I'm willing to accept an approximate answer. I'd have to think about exactly how much deviation I'm willing to tolerate from the peak or from my constraint.
posted by a robot made out of meat, Sun, 26 Oct 2008 15:51:22 -0800
By: a robot made out of meat
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1520271
Ok, I'm seeing the "Gröbner basis method" come up in the documentation for Singular and Maple, so maybe this is The Way. It's activating long dormant parts of my brain responsible for abstract algebra and algebraic geometry, which hurts.
posted by a robot made out of meat, Sun, 26 Oct 2008 16:12:16 -0800
By: metastability
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1520305
Real numbers are harder. You should probably abandon all hope of getting an exact solution. Numerical methods are a better bet.
posted by metastability, Sun, 26 Oct 2008 17:15:16 -0800
By: hAndrew
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1520524
You could directly parameterise the X set of all orthonormal matrices of size m x n. Then write f as a function of these parameters and maximise from there.<br>
<br>
To parameterise the set X, you could think by analogy with parameterising the set of all pairs (m=2) of orthogonal unit vectors in R^3 (n=3). I think you'll end up with a bunch of products of cos(parameter) and sin(parameter).<br>
<br>
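As a minimal sketch of the m=2, n=3 analogy above: one common way to parameterise a pair of orthonormal vectors in R^3 uses two spherical angles for the first vector and one further angle that rotates the second vector within the tangent plane. The function name and the angle names (theta, phi, psi) are hypothetical, chosen only for this illustration.

```python
import numpy as np

def orthonormal_pair(theta, phi, psi):
    """Return a 2 x 3 matrix whose rows are orthonormal, built from three angles."""
    # First row: a unit vector on the sphere in spherical coordinates.
    u = np.array([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)])
    # Two unit vectors spanning the plane orthogonal to u.
    e_theta = np.array([np.cos(theta) * np.cos(phi),
                        np.cos(theta) * np.sin(phi),
                        -np.sin(theta)])
    e_phi = np.array([-np.sin(phi), np.cos(phi), 0.0])
    # Second row: a unit vector in that plane, selected by the angle psi.
    v = np.cos(psi) * e_theta + np.sin(psi) * e_phi
    return np.vstack([u, v])

if __name__ == "__main__":
    T = orthonormal_pair(0.7, 1.9, -0.4)
    print(T @ T.T)  # should be close to the 2 x 2 identity
```

Every entry is a product of sines and cosines of the three parameters, so f written in these coordinates becomes a trigonometric polynomial with no remaining constraints.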
I suppose the set X will have dimension (n - 1) + (n - 2) + ... + (n - m) = 1/2 (2 m n - m^2 - m). You directly maximise over those parameters with no need for any constraints.
posted by hAndrew, Sun, 26 Oct 2008 22:25:09 -0800
By: hAndrew
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1520525
<small>"You could directly parameterise the X set of ..." should be "You could directly parameterise <b>the set X of</b> ..."</small>
posted by hAndrew, Sun, 26 Oct 2008 22:26:43 -0800
By: a robot made out of meat
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1521917
Thanks everyone, I'm pursuing these strategies in parallel after midterms.<br>
<br>
The Gröbner or Janet basis method is appealing, but I have some apprehension since members on the forums claim trouble with MUCH smaller ideals than the ones I'd be working with.<br>
<br>
It's true that I have an analytic gradient and Hessian, so it's not inconceivable that straight numeric optimization over g(T) could work even in high dimension. I'm skeptical, but it's worth a shot. Since I have definable tolerance for suboptimal solutions, I might have a hope.<br>
<br>
It's true that the rows of an orthonormal matrix are orthogonal elements of the hypersphere, which is defined by the right choice of orienting angles. I'll look into rewriting my matrix in angular form. This would give me advantages in numeric and statistical methods of optimization since I've plugged into the reduced space rather than the larger space implied by the Lagrange method. I'll have to see if the many frequency domain methods will give me anything analytic here. The disadvantage is the loss of computational simplicity of the function, which will now be a sum of products of trig functions rather than monomials.
posted by a robot made out of meat, Tue, 28 Oct 2008 05:51:47 -0800
By: onalark
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1525842
Is your function concave? The cost of globally maximizing a non-concave function usually scales exponentially with the dimension. <br>
<br>
It depends on your function, but it's possible that with a few assumptions/restrictions (symmetry, positive definiteness, concavity) you could solve this using a convex programming approach, which allows for the restriction that you are optimizing over the space of symmetric orthogonal matrices.
posted by onalark, Fri, 31 Oct 2008 07:54:12 -0800
By: a robot made out of meat
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1527781
So, it seems that defining orthogonal vectors on the hypersphere is less easy than I thought. However, you can optimize over skew-symmetric matrices and use the Cayley transform to map into SO(n). Tomorrow I'll see if there's an easy extension for non-square matrices.
posted by a robot made out of meat, Sun, 02 Nov 2008 17:31:29 -0800
By: a robot made out of meat
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1527786
Also, I am working to see if I can write my orthonormal matrix as the product of Householder or Givens matrices in an easy way.
posted by a robot made out of meat, Sun, 02 Nov 2008 17:40:24 -0800
By: hAndrew
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1527980
<i>So, it seems that defining orthogonal vectors on the hypersphere is less easy than I thought.</i><br>
<br>
Do you mean you couldn't find forms for them as a function of the 1/2 (2 m n - m^2 - m) parameters, or that you could but something else proved difficult after that?
posted by hAndrew, Sun, 02 Nov 2008 22:26:39 -0800
By: a robot made out of meat
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1528237
Yes, writing them as a clean function of the reduced parameter space wasn't clear. That's what I'm hoping to do with the Cayley transformation.
posted by a robot made out of meat, Mon, 03 Nov 2008 07:16:17 -0800
By: a robot made out of meat
http://ask.metafilter.com/105198/Maximizing-a-function-over-orthogonal-matrices#1530241
For anyone following,<br>
I still don't have a great proof that k-row skew-symmetric matrices inject into the space of k-row special orthonormal matrices, but I have some appealing statistical results. Namely, that out of many random 7x7 orthogonal matrices, if I trim the first k rows I seem to be able to (numerically) find k-row skew-symmetric matrices whose Cayley transforms come arbitrarily close to the target.<br>
<br>
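A minimal sketch of the construction described above, under one possible reading of "k-row skew-symmetric": the free entries of the n x n skew-symmetric matrix A sit in its first k rows (mirrored by skew-symmetry) and the remaining block is zero. That reading and the helper names are assumptions for illustration. The Cayley transform is taken here as Q = (I + A)^{-1} (I - A), which is orthogonal with determinant +1 for any real skew-symmetric A.

```python
import numpy as np

def skew_from_first_rows(params, k, n):
    """n x n skew-symmetric matrix whose free entries lie in the first k rows.

    Expects k*n - k*(k+1)//2 parameters (assumed layout: row by row,
    strictly above the diagonal).
    """
    A = np.zeros((n, n))
    idx = 0
    for i in range(k):
        for j in range(i + 1, n):
            A[i, j] = params[idx]
            A[j, i] = -params[idx]
            idx += 1
    return A

def cayley(A):
    """Cayley transform Q = (I + A)^{-1} (I - A) of a skew-symmetric A."""
    I = np.eye(A.shape[0])
    # I + A is always invertible for real skew-symmetric A.
    return np.linalg.solve(I + A, I - A)

def orthonormal_rows(params, k, n):
    """First k rows of the Cayley transform: a k x n matrix with orthonormal rows."""
    return cayley(skew_from_first_rows(params, k, n))[:k]

if __name__ == "__main__":
    n, k = 7, 3
    rng = np.random.default_rng(0)
    T = orthonormal_rows(rng.normal(size=k * n - k * (k + 1) // 2), k, n)
    print(np.round(T @ T.T, 12))  # should be close to the 3 x 3 identity
```

The parameter count k*n - k*(k+1)/2 matches the dimension of the set of k x n matrices with orthonormal rows, which is consistent with the numerical results reported here but does not by itself prove that every target can be reached.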
I may do this for the Givens rotation method of generating orthogonal matrices also; i.e., can I just use fewer angles and still inject into the desired space?
posted by a robot made out of meat, Wed, 05 Nov 2008 05:16:09 -0800
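A minimal sketch of the Givens variant mentioned above, assuming "fewer angles" means using only the rotations in planes (i, j) with i among the first k indices. The function names and the ordering of the rotations are hypothetical choices for illustration; whether this family reaches every k x n matrix with orthonormal rows is not established here.

```python
import numpy as np

def givens(n, i, j, theta):
    """n x n Givens rotation by angle theta in the (i, j) coordinate plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = c
    G[j, j] = c
    G[i, j] = -s
    G[j, i] = s
    return G

def orthonormal_rows_from_angles(angles, k, n):
    """First k rows of a product of Givens rotations.

    Uses one angle per plane (i, j) with i < k and i < j < n, so it
    expects k*n - k*(k+1)//2 angles.
    """
    Q = np.eye(n)
    idx = 0
    for i in range(k):
        for j in range(i + 1, n):
            Q = Q @ givens(n, i, j, angles[idx])
            idx += 1
    # Q is orthogonal, so any subset of its rows is orthonormal.
    return Q[:k]

if __name__ == "__main__":
    n, k = 7, 3
    rng = np.random.default_rng(1)
    angles = rng.uniform(-np.pi, np.pi, size=k * n - k * (k + 1) // 2)
    T = orthonormal_rows_from_angles(angles, k, n)
    print(np.round(T @ T.T, 12))  # should be close to the 3 x 3 identity
```

As with the angular form discussed earlier in the thread, the entries of T become sums of products of sines and cosines, so the constraint disappears at the cost of a less simple objective.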