Of Typewriters and Permutations (IV)

Written by: Paul Rubin

Primary Source: OR in an OB World, 12/16/2018.

I’m continuing the recent theme of solving a quadratic assignment problem that lays out the 26 letters of the English alphabet on a one-dimensional “keyboard” for an 18th century typewriter. I thought this would be the last post, but something new turned up, so there will likely be one more.

In the blog post that started all this, Hardmath123 found a solution (via a heuristic) with cost 5,499,341. Using frequency data from Nate Brixius, I get a slightly higher cost (5,510,008), which is the value I will use for it (because, hey, it’s my blog). Hardmath123 suspected but could not prove that this layout is optimal. I still can’t verify optimality (but maybe in the next post I will … or maybe not).

In my previous epistles on this subject, I tried out three MIP models and a quadratic (integer) program. In five-minute runs using a beta copy of the next version of CPLEX, the best I was able to do was a solution with objective value 5,686,878.

Out of curiosity, I tried a constraint programming (CP) model (using CP Optimizer, the constraint solver in the CPLEX Studio suite). Constraint programming is best suited to models with integer variables (which we have here), although it can handle floating-point variables to some extent. It is well suited to logic constraints, and in particular is popular in scheduling, but it’s not my go-to tool in part because it tends to have very weak bounding.

Defining the constraint programming model for the keyboard problem (denoted “CP” in my code) is fairly straightforward, but rather different from any of the MIP/QP models. We use general integer variables rather than binary variables to determine the position assignments. So \(p_i\in \{0,\dots,25\}\) is the position of symbol \(i\) in the layout, \(s_j \in \{0,\dots,25\}\) is the symbol at position \(j\), and \(d_{ik}\in \{0,\dots,25\}\) is the distance between symbols \(i\) and \(k\). (Recall that we are using zero-based indexing.) As before, the objective is

\(\displaystyle \min \sum_i \sum_j f_{ij}d_{ij},\)

where \(f_{ij}\) is the frequency with which symbol \(j\) follows symbol \(i\).

Where it gets interesting is in specifying the constraints. Although CP is designed for logic constraints (“or”, “and”, “if-then”), its real power comes from what they call “global constraints”, specialized constraints that tie multiple variables together. Arguably the most common global constraint is “all-different”, meaning that in a group of variables no two of them can take the same value. It is the key to defining permutations, and in our model we add all-different constraints for both \(p\) (no two symbols have the same position) and \(s\) (no two slots have the same symbol). One or the other of those is technically redundant, but redundant constraints in a CP model can help the solver, so it’s probably worth including both.
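To make the all-different idea concrete, here is a minimal sketch in plain Python (not CP Optimizer code, and not the code used for the experiments). The helper names `all_different` and `is_permutation` and the four-symbol toy vectors are assumptions made purely for illustration. It checks the property that the global constraint enforces: with \(n\) variables drawn from \(\{0,\dots,n-1\}\), all-different is the same thing as being a permutation.

```python
def all_different(values):
    """Return True when no two entries of `values` are equal."""
    return len(set(values)) == len(values)


def is_permutation(values):
    """n values in {0, ..., n-1} that are all different form a permutation."""
    n = len(values)
    return all(0 <= v < n for v in values) and all_different(values)


# Toy 4-symbol example (hypothetical data, not the 26-letter instance).
p_ok = [2, 0, 3, 1]   # symbol i sits at position p_ok[i]
p_bad = [2, 0, 2, 1]  # symbols 0 and 2 collide at position 2

print(is_permutation(p_ok))
print(all_different(p_bad))
```

A CP solver does far more than check this after the fact: it uses the constraint to prune values from variable domains during the search. The check above only shows what a feasible assignment must satisfy.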

Another global constraint provided by CP Optimizer is “inverse”. CP models allow variables to be used to index other variables, something we cannot do directly in math programming models. The inverse constraint, applied to two vector variables \(u\) and \(v\), says that \(u_{v_i}=i\) and \(v_{u_i}=i\). In our model, we can apply the inverse constraint to \(p\) and \(s\). In effect, it says that the symbol in the position occupied by symbol \(i\) is \(i\), and the position of the symbol in position \(j\) is \(j\).
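The relationship that the inverse constraint imposes can be illustrated with a short plain-Python sketch. This is an illustration only, not the CP Optimizer API; the function names `invert` and `satisfies_inverse` and the four-symbol layout are hypothetical. Given a position vector \(p\), it builds the matching symbol vector \(s\) and checks both identities, \(u_{v_i}=i\) and \(v_{u_i}=i\).

```python
def invert(p):
    """Given p[symbol] = position, build s[position] = symbol."""
    s = [None] * len(p)
    for symbol, position in enumerate(p):
        s[position] = symbol
    return s


def satisfies_inverse(u, v):
    """Check u[v[i]] == i and v[u[i]] == i for every index i."""
    n = len(u)
    return all(u[v[i]] == i and v[u[i]] == i for i in range(n))


# Toy 4-symbol layout (hypothetical data).
p = [2, 0, 3, 1]   # symbol 0 is at position 2, symbol 1 at position 0, ...
s = invert(p)      # position 0 holds symbol 1, position 1 holds symbol 3, ...

print(s)
print(satisfies_inverse(p, s))
```

Note the variable-as-index expressions `u[v[i]]` and `v[u[i]]`: trivial in ordinary code where the values are known, but in a model both vectors are decision variables, which is exactly what a MIP formulation cannot express directly.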

Finally, we can define the distance variables using the position variables:

\(\displaystyle d_{ij} = |p_i - p_j|\quad \forall i,j.\)
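Putting the distance definition together with the objective, the cost of any candidate layout can be evaluated as in the sketch below. It is a minimal plain-Python illustration, not the solver model; the names `distances` and `layout_cost` and the 3-by-3 frequency matrix are made up for the example and are not the real letter-pair frequencies.

```python
def distances(p):
    """d[i][j] = |p[i] - p[j]|, the distance between symbols i and j."""
    n = len(p)
    return [[abs(p[i] - p[j]) for j in range(n)] for i in range(n)]


def layout_cost(p, f):
    """Sum of f[i][j] * d[i][j] over all ordered symbol pairs (i, j)."""
    d = distances(p)
    n = len(p)
    return sum(f[i][j] * d[i][j] for i in range(n) for j in range(n))


# Hypothetical 3-symbol frequencies: f[i][j] = how often j follows i.
f = [
    [0, 2, 1],
    [3, 0, 0],
    [0, 4, 0],
]

print(layout_cost([0, 1, 2], f))  # symbols laid out in index order
print(layout_cost([1, 0, 2], f))  # symbols 0 and 1 swapped
```

On this toy instance the first layout costs 11 and the second 14; the swap moves symbol 1 two slots away from symbol 2, and that pair has the largest frequency.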

How does the CP model perform? As I expected, the bounding is pretty bad. In a five-minute run, the lower bound only makes it to 447,795 (91.87% gap). The good news is that the CP model finds Hardmath123’s solution with objective value 5,510,008 … and finds it in about 22 seconds on my PC! This is using the default search strategy; I did not assign branching priorities.
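The reported gap can be reproduced from the two numbers above, assuming the usual relative-gap definition for a minimization problem (incumbent minus bound, divided by the incumbent). That formula is an assumption about how the solver reports the gap, and `relative_gap` is a hypothetical helper name.

```python
def relative_gap(incumbent, bound):
    """Relative optimality gap for a minimization problem (assumed definition)."""
    return (incumbent - bound) / incumbent


gap = relative_gap(5_510_008, 447_795)
print(f"{gap:.2%}")  # prints 91.87%
```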

The take-away here, from my perspective, is that in some problems a CP model can be a great way to generate an optimal or near-optimal incumbent, which you can use as a starting solution in an optimization model if you need to prove optimality.

I'm an apostate mathematician, retired from a business school after 33 years of teaching mostly (but not exclusively) quantitative methods courses. My academic interests lie in operations research. I also study Tae Kwon Do a bit on the side.
