|
Hi Y'all I am experimenting with an entropy max problem, and I run into the problem that log(0) is not allowed. In the model below, the entropy would be maxed by putting all the whatevers into Stuff[1] and zero into Stuff[2..end], but zero values for Stuff[j] make the objective function undefined. I might try an equiv formulation, maybe using Stuff2[j] = Stuff[j]+1, but there might be canonical tricks to solving this problem that I don't know about. Any ideas?
option halt_on_ampl_error 1;
param total;
param celln;
set J := 1..celln;
var Stuff{j in J} >= 0;
var E {j in J} = (Stuff[j]/total) * log(Stuff[j]/total);
var Ent = -1 * sum {j in J} E[j];
maximize _Ent: Ent;
subject to T1: sum {j in J} Stuff[j] = total;
data;
param total := 100;
param celln := 10;
After much interesting discussion, here is something that works as I expect with Ipopt: option halt_on_ampl_error 1;
set PLACE;
param total;
param smin{PLACE};
param smax{PLACE};
var Stuff{p in PLACE} >= 0;
var E {p in PLACE};
var Ent;
## minimize the negative for max entropy
minimize _Ent: -1 * Ent;
## control total
s.t. T1 : sum {p in PLACE} Stuff[p] = total;
## min and max stuff
s.t. SMin {p in PLACE} : Stuff[p] >= smin[p];
s.t. SMax {p in PLACE} : Stuff[p] <= smax[p];
## set up entropy equations
s.t. E1 {p in PLACE} : E[p] = (Stuff[p]/total) * log(Stuff[p]/total);
s.t. Ent1 : Ent = -1.0 * (sum {p in PLACE} E[p]);
data; ## DATA STARTS HERE ################
param total := 7000;
param: PLACE: smin smax :=
FF01 1000 1300
FF03 1200 1400
FF05 1000 1300
FF07 800 900
FF09 1000 1300
FF11 1000 1200 ;
|
|
There's
3
Hi, just a few friendly comments. (1) Conditional expressions only work for parameters in AMPL, not variables. (2) In my experience, the problem with logs is their extremely nonlinear behavior near 0. Adding a small constant avoids the singularity but some solvers still don't like the kind of behavior that is produced. As for log(x/y), in this case, the denominator is a constant that behaves like a scaling factor, so leaving it as log(x/y) may actually be numerically favorable. (3) adding integer variables may be a bit of an expensive approach just to get around the singularity issue. ;-)
(Dec 14 '11 at 14:25)
Gilead ♦
2
BTW, please don't take my comments the wrong way. It's just that I've spent years and years doing nonlinear modeling and have gone down the wrong path many times (I have white hair to show for it). I just want to share my experience so that others can avoid making the mistakes I did.
(Dec 14 '11 at 14:30)
Gilead ♦
|
|
If you'll allow me to restate your constraint in mathematical terms: $$ \begin{align} & E_{j} = \frac{S_{j}}{T} \log(\frac{S_{j}}{T}), &\forall j \in J \end{align} $$ You could conceivably rewrite the constraint this way: $$ \begin{align} & E_{j} = \frac{S_{j}}{T} y_{j}, &\forall j \in J \\ & \exp(y_{j}) = \frac{S_{j}}{T}, &\forall j \in J \\ & y_{L} \leq y_{j} \leq y_{U} , &\forall j \in J \end{align} $$ Reasonable bounds can usually be found for \(y_{j}\). Update: controversial statement removed because further study is required on my part. Note: your mileage may vary. This is only one possible formulation. I believe it to be a correct and robust one (for several reasons), but it may not be the best one. As others have noted, it is valid to write a log formulation and restrict its function argument to be greater than a slightly positive number. So if I'm understanding it right, his y_U will be 0, and y_L = -M. How would you go about determining a good value for it? I guess c * log(1/total)?
(Dec 14 '11 at 15:34)
Iain Dunning
In practice, I rarely specify bounds on \(y\) (I only included it for completeness). I use an IPM solver and even it seems to do the right thing without the bounds. However, it's easy to see that \(S_{j} = T\exp(y_{j})\). Suppose \(\epsilon_{m}\) is the optimizer tolerance, and \(S_{U}\) is the upper bound for \(S_{j}\). Then \(y_{L} = \log(\epsilon_{m}/T)\) and \(y_{U} = \log(S_{U}/T) \). If we consider \(\epsilon_{m} = 1 \times 10^{-6}\) and \(S_{U} = 100\) in this case, you're looking at \(y_{L} = -18.4 \) and \(y_{U} = 0 \), which is a reasonable range.
(Dec 14 '11 at 18:16)
Gilead ♦
|
|
I strongly disagree with those claiming that log() should be avoided. For instance geometric programming problems are almost always solved on dual form which is same as preferring log() to exp(). Note exp(x) is very badly behaved for even moderately large x. It is true active set based optimizers have problems with the log() and therefore you should use an interior-point based one. It automatically keeps x away from 0. I suggest you try something like MOSEK (www.mosek.com) on your problem. You can do that using NEOS for instance. Erling (Working for MOSEK) Just to respond to your comment about exp(x) being badly behaved; it is true that even modest positive values of x will give rise to very large function values. However, in my work with logs, say for y=log(x), I usually do a scaling on x such that: e <= x <= 100, where e is say, 1e-6. This means, exp(y)=x and therefore log(e) <= y <= log(100), or -13.8 <= y <= 4.6, which to me seems to be a reasonable range for both x and y.
(Dec 15 '11 at 01:54)
Gilead ♦
As for geometric programming, I have to defer to your expertise there because I have no experience at all with that class of problems. But I do see that I need to tone down my suggestion that logs should be avoided. I need to rethink this.
(Dec 15 '11 at 01:59)
Gilead ♦
Note also that the so-called barrier function in interior-point methods is sum_j ln(x_j). Hence, you even introduce the ln(x) in your method :-). Btw since ln(x)>=t is the same as x >=exp(t) then the two functions are closely related.
(Dec 15 '11 at 02:39)
Erling
|
|
Thanks to everyone, here is a new model which (1) seems to work (at least in Ipopt), (2) uses the more idiomatic method of stuffing the equalities into constraints, rather than "variables", and (3) fixes the logarithm problem. Now I will start to work on the data section to better describe the problem (allocating equally into 10 bins isn't all that exciting ....) The one thing I haven't figured out is why when I switch signs on my objective function I don't get a different allocation -- seems like with such a switch we should get a clustered distribution all in one bin. Thanks to all! Fun and instructive!
option halt_on_ampl_error 1;
param total;
param celln;
set J := 1..celln;
var Stuff{j in J} >= 0;
var E {j in J};
var Y {j in J};
var Ent;
minimize _Ent: Ent;
s.t. T1 : sum {j in J} Stuff[j] = total;
s.t. E1 {j in J} : E[j] = (Stuff[j]/total) * Y[j];
s.t. Y1 {j in J} : exp(Y[j]) = Stuff[j]/total;
s.t. Ent1 : Ent = +1.0 * (sum {j in J} Y[j]);
data; ## DATA STARTS HERE ################
param total := 100;
param celln := 10;
2
Just a note: You may want to append your updates to the original question. The way these Q&A sites work is just slightly different from forums in that each reply is really an answer. As for your model, I think the
(Dec 14 '11 at 18:45)
Gilead ♦
|

I might be misunderstanding the issue... but have you considered just adding a small constant inside the log to make sure that it doesn't happen?
I guess this is what you mean by Stuff2[j] = Stuff[j]+1, but, don't need to make a new variable or anything, or even use 1.
What is the physical meaning of log(0) in the context of your problem?
I tried the small constant and it works now, at least it compiles...
I am trying to model allocating population (or whatever) based on limited prior knowledge encoded in the constraints. So, we might know that between 10 and 20 people live in census tract #1, zero in #2, but have no idea about #3. The entropy max thing tries to spread around whatever is not constrained equally.
(I am getting weird results when I use Bonmin, but I thought it should be OK, even though there is no integer term. On NEOS' mosek it works fine though... EDIT: Works with Ipopt ...)