![](http://4.bp.blogspot.com/-WZq2bkIyCu8/Tct9px6L9vI/AAAAAAAAAEc/1cAAIqo6Lbk/s320/eqsource1.png)
![](http://2.bp.blogspot.com/-Sz2pbOB-2jI/TcKKyTgsZmI/AAAAAAAAAEE/_B5_wHCXbxE/s320/eqsource4.png)
![](http://4.bp.blogspot.com/-Bh4OdLbZN_0/TcKKv8zcHZI/AAAAAAAAAD8/CXBq2l48HV4/s320/eqsource3.png)
![](http://3.bp.blogspot.com/-6mQW30hs9vE/TcKKrr9MFGI/AAAAAAAAAD0/_x8dND-knN4/s320/eqsource2.png)
The example above comes from the manual of the glpk library. That manual continues by describing how to convert this problem into the standard form of glpk (which involves introducing three new variables) and then gives the C code needed to call the library. Relating that C code to the problem above without the intermediate explanation of the manual is not easy. A common solution here is to build a high-level interface that allows a more natural way of defining the matrices and/or allows the equations to be entered symbolically. Unfortunately, such interfaces often become slow. For the benchmark below, for example, cvxopt requires 20 minutes to set up a problem that takes 9.43 seconds to solve (this seems a bit extreme, am I doing something wrong?).
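To make the "three new variables" step concrete, here is a sketch of that conversion for the problem above. The names p, q and r for the auxiliary variables follow my recollection of the glpk manual; treat the exact notation as an assumption rather than a quotation. Each inequality becomes an equality that defines an auxiliary variable, and the inequality itself turns into a simple bound on that variable:

$$
\begin{aligned}
\text{maximize}\quad & f(x,y,z) = 10x + 6y + 4z \\
\text{subject to}\quad & p = x + y + z \\
& q = 10x + 4y + 5z \\
& r = 2x + 2y + 6z \\
\text{with bounds}\quad & -\infty < p \le 100, \quad -\infty < q \le 600, \quad -\infty < r \le 300 \\
& 0 \le x < +\infty, \quad 0 \le y < +\infty, \quad 0 \le z < +\infty
\end{aligned}
$$

In this form the library only needs the coefficient matrix of the three defining equations plus a pair of bounds per variable, which is why the C code looks so different from the original problem statement.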
The high-level interface I constructed on top of the glpk library is pplp and it allows the equations to be entered symbolically. The above problem can be solved using

    lp = LinearProgram()
    x, y, z = lp.IntVar(), lp.IntVar(), lp.IntVar()
    lp.objective = 10*x + 6*y + 4*z
    lp.add_constraint( x + y + z <= 100 )
    lp.add_constraint( 10*x + 4*y + 5*z <= 600 )
    lp.add_constraint( 2*x + 2*y + 6*z <= 300 )
    lp.add_constraint( x >= 0 )
    lp.add_constraint( y >= 0 )
    lp.add_constraint( z >= 0 )
    maxval = lp.maximize()
    print maxval
    print x.value, y.value, z.value
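For readers wondering how a symbolic interface of this kind can work at all, here is a minimal sketch using operator overloading. It is not the pplp source: the names `Expr`, `Var`, `Constraint` and `to_triplets` are hypothetical and chosen only for illustration. The idea is that `10*x + 4*y` builds a dictionary of coefficients, a comparison turns it into a constraint, and the constraints are then flattened into the (row, column, value) triplets that a sparse solver interface such as glpk's `glp_load_matrix` expects (glpk itself uses 1-based indices; the sketch uses 0-based ones).

```python
class Expr(object):
    """A linear expression: {column index: coefficient} plus a constant."""

    def __init__(self, coeffs=None, const=0.0):
        self.coeffs = dict(coeffs or {})
        self.const = const

    @staticmethod
    def wrap(value):
        # Plain numbers are treated as constant expressions.
        return value if isinstance(value, Expr) else Expr(const=value)

    def __add__(self, other):
        other = Expr.wrap(other)
        coeffs = dict(self.coeffs)
        for col, c in other.coeffs.items():
            coeffs[col] = coeffs.get(col, 0.0) + c
        return Expr(coeffs, self.const + other.const)

    __radd__ = __add__

    def __mul__(self, factor):
        # Only multiplication by a plain number keeps the expression linear.
        scaled = dict((col, c * factor) for col, c in self.coeffs.items())
        return Expr(scaled, self.const * factor)

    __rmul__ = __mul__

    def __le__(self, bound):
        return Constraint(self, upper=bound)

    def __ge__(self, bound):
        return Constraint(self, lower=bound)


class Var(Expr):
    """A single variable, identified by its column index."""

    def __init__(self, index):
        Expr.__init__(self, {index: 1.0})
        self.index = index


class Constraint(object):
    """lower <= expr <= upper, with the constant moved into the bounds."""

    def __init__(self, expr, lower=None, upper=None):
        self.expr = expr
        self.lower = None if lower is None else lower - expr.const
        self.upper = None if upper is None else upper - expr.const


def to_triplets(constraints):
    """Flatten constraints into sparse (row, col, value) lists plus bounds."""
    rows, cols, vals, bounds = [], [], [], []
    for row, con in enumerate(constraints):
        for col, c in sorted(con.expr.coeffs.items()):
            rows.append(row)
            cols.append(col)
            vals.append(c)
        bounds.append((con.lower, con.upper))
    return rows, cols, vals, bounds


x, y, z = Var(0), Var(1), Var(2)
constraints = [
    x + y + z <= 100,
    10*x + 4*y + 5*z <= 600,
    2*x + 2*y + 6*z <= 300,
]
rows, cols, vals, bounds = to_triplets(constraints)
```

Every arithmetic operation here allocates a new dictionary, which hints at why such interfaces get slow on problems with hundreds of thousands of constraints and why a JIT helps.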
To benchmark the API I used it to solve a minimum-cost flow problem with 154072 nodes and 390334 arcs. The C library needs 9.43 s to solve this, and the pplp interface adds another 5.89 s under PyPy and 28.17 s under CPython. A large amount of time is still spent setting up the problem, but it's a significant improvement over the 20 minutes cvxopt requires on CPython. cvxopt is probably not designed to be fast on this kind of benchmark. I have not been able to get cvxopt to work under PyPy. The benchmark used is available here
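For reference, a minimum-cost flow problem is usually written as the linear program below. The symbols are generic textbook notation and are not taken from the benchmark script: N is the set of nodes, A the set of arcs, c_ij the cost per unit of flow on arc (i,j), u_ij its capacity, and b_i the supply (positive) or demand (negative) at node i.

$$
\begin{aligned}
\text{minimize}\quad & \sum_{(i,j)\in A} c_{ij}\, x_{ij} \\
\text{subject to}\quad & \sum_{j:(i,j)\in A} x_{ij} \;-\; \sum_{j:(j,i)\in A} x_{ji} = b_i \quad \text{for all } i \in N \\
& 0 \le x_{ij} \le u_{ij} \quad \text{for all } (i,j) \in A
\end{aligned}
$$

In this formulation each arc contributes one column with only two nonzero entries in the flow-conservation rows, so the constraint matrix is extremely sparse. The benchmark may set the problem up somewhat differently.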
8 comments:
for the first equation do you not perhaps mean f(x,y,z) = 10x+6y+4z instead of z = 10x+6y+4z ?
Yes, there is a typo there, I'll update the post. Thanx for noting.
That seems like a lot of overhead for the wrapper, what is up with that? I mean, I'd expect the wrapper to reasonably quickly pass it off to the C library.
you should try www.solverfoundation.com using ironpython too.
Winston: It is indeed. What cvxopt spends 20 min on I don't know. One guess would be that it is passing the ~2 million coefficients involved to C one by one, possibly with a bit of error checking for each of them. As for the 6 s used by pplp, it needs to convert the equations into the matrices glpk wants. That means shuffling the coefficients around a bit and some bookkeeping to keep track of which goes where.
Anonymous: OK, how would the above example look in that case?
Thanx for noting, I've fixed the post (again).
have you tried openopt[1]?
[1] openopt.org
Are you distinguishing between the time it takes to set up the optimization problem and the time it takes to actually solve it?
GLPK is a simplex solver written in C, and CVXOPT is an interior point solver written in Python/C and is not particularly optimized for sparse problems. Nevertheless, you should check that you actually formulate a large sparse problem in CVXOPT, and not a dense one.