Reversing Differences

Written by: Paul Rubin

Primary Source: OR in an OB World, 02/17/2020

Fellow blogger Håkan Kjellerstrand posted an interesting question on OR Stack Exchange recently. Starting from a list of integers, it is trivial to compute the list of all pairwise absolute differences, but what about going in the other direction? Given the pairwise (absolute) differences, with duplicates removed, can you recover the source list (or a source list)?

We can view the source “list” as a vector \(x\in\mathbb{Z}^n\) for some dimension \(n\) (equal to the length of the list). With duplicates removed, we can view the differences as a set \(D\subset \mathbb{Z}_+\). So the question has to do with recovering \(x\) from \(D\). Our first observation kills any chance of recovering the original list with certainty:

If \(x\) produces difference set \(D\), then for any \(t\in\mathbb{R}\) the vector \(x+t\cdot (1,\dots,1)’\) produces the same set \(D\) of differences.

Translating all components of \(x\) by a constant amount has no effect on the differences. So there will be an infinite number of solutions for a given difference set \(D\). A reasonable approach (proposed by Håkan in his question) is to look for the shortest possible list, i.e., minimize \(n\).

Next, observe that \(0\in D\) if and only if two components of \(x\) are identical. If \(0\notin D\), we can assume that all components of \(x\) are distinct. If \(0\in D\), we can solve the problem for \(D\backslash\{0\}\) and then duplicate any one component of the resulting vector \(x\) to get a minimum dimension solution to the original problem.

Combining the assumption that \(0\notin D\) and the observation about adding a constant having no effect, we can assume that the minimum element of \(x\) is 1. That in turn implies that the maximum element of \(x\) is \(1+m\) where \(m=\max(D)\).

From there, Håkan went on to solve a test problem using constraint programming (CP). Although I’m inclined to suspect that CP will be more efficient in general than an integer programming (IP) model, I went ahead and solved his test problem via an IP model (coded in Java and solved using CPLEX 12.10). CPLEX’s solution pool feature found the same four solutions to Håkan’s example that he did, in under 100 ms. How well the IP method scales is an open question, but it certainly works for modest size problems.

The IP model uses binary variables \(z_1, \dots, z_{m+1}\) to decide which of the integers \(1,\dots,m+1\) are included in the solution \(x\). It also uses variables \(w_{ij}\in [0,1]\) for all \(i,j\in \{1,\dots,m+1\}\) such that \(i \lt j\). The intent is that \(w_{ij}=1\) if both \(i\) and \(j\) are included in the solution, and \(w_{ij} = 0\) otherwise. We could declare the \(w_{ij}\) to be binary, but we do not need to; constraints will force them to be \(0\) or \(1\).

The full IP model is as follows:
\[ \begin{array}{lrlrc} \min & \sum_{i=1}^{m+1}z_{i} & & & (1)\\ \textrm{s.t.} & w_{i,j} & \le z_{i} & \forall i,j\in\left\{ 1,\dots,m+1\right\} ,i\lt j & (2)\\ & w_{i,j} & \le z_{j} & \forall i,j\in\left\{ 1,\dots,m+1\right\} ,i\lt j & (3)\\ & w_{i,j} & \ge z_{i}+z_{j}-1 & \forall i,j\in\left\{ 1,\dots,m+1\right\} ,i\lt j & (4)\\ & w_{i,j} & =0 & \forall i,j\in\left\{ 1,\dots,m+1\right\} \textrm{ s.t. }(j-i)\notin D & (5)\\ & \sum_{i,j\in\left\{ 1,\dots,m+1\right\} |j-i=d}w_{i,j} & \ge 1 & \forall d\in D & (6)\\ & z_{1} & = 1 & & (7) \end{array} \]

The objective (1) minimizes the number of integers used. Constraints (2) through (4) enforce the rule that \(w_{ij}=1\) if and only if both \(z_i\) and \(z_j\) are \(1\) (i.e., if and only if both \(i\) and \(j\) are included in the solution). Constraint (5) precludes the inclusion of any pair \(i < j\) whose difference \(j – i\) is not in \(D\), while constraint (6) says that for each difference \(d \in D\) we must include at least one pair \(i < j\) for that produces that difference (\(j – i = d\)). Finally, since we assumed that our solution starts with minimum value \(1\), constraint (7) ensures that \(1\) is in the solution. (This constraint is redundant, but appears to help the solver a little, although I can’t be sure given the short run times.)

My Java code is available from my repository (bring your own CPLEX).

The following two tabs change content below.
I'm an apostate mathematician, retired from a business school after 33 years of teaching mostly (but not exclusively) quantitative methods courses. My academic interests lie in operations research. I also study Tae Kwon Do a bit on the side.

Latest posts by Paul Rubin (see all)