Written by: Paul Rubin

Primary Source: OR in an OB World, 07/17/2018.

Someone posted an interesting question about box sizes on Mathematics Stack Exchange. He (well, his girlfriend to be precise) has a set of historical documents that need to be preserved in boxes (apparently using a separate box for each document). He wants to find a solution that minimizes the total surface area of the boxes used, so as to minimize waste. The documents are square (I’ll take his word for that) with dimensions given in millimeters.

To start, we can make a few simplifying assumptions.

- The height of a box is not given, so we’ll assume it is zero, and only consider the top and bottom surfaces of the box. For a box (really, envelope) with side \(s\), that makes the total area \(2s^2\). If the boxes have uniform height \(h\), the area changes to \(2s^2 + 4hs\), but the model and algorithm I’ll pose are unaffected.
- We’ll charitably assume that a document with side \(s\) fits in a box with side \(s\). In practice, of course, you’d like the box to be at least slightly bigger, so that the document goes in and out with reasonable effort. Again, I’ll let the user tweak the size formula while asserting that the model and algorithm work well regardless.
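
As a quick check of the two area formulas, take a hypothetical document with side \(s = 200\) mm (a made-up value, not one taken from the data file) and, for the second formula, an assumed uniform height of \(h = 30\) mm:

\[ 2s^2 = 2 \cdot 200^2 = 80{,}000 \text{ mm}^2, \qquad 2s^2 + 4hs = 80{,}000 + 4 \cdot 30 \cdot 200 = 104{,}000 \text{ mm}^2. \]

Either way the area of a box depends only on its side, which is why the choice of formula changes the cost of each box but not the structure of the problem.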

The problem also has three obvious properties.

- Only document sizes need be considered as box sizes, i.e. for every selected size at least one document should fit “snugly”.
- The number of boxes you need at each selected size equals the number of documents too large to fit in a box of the next smaller selected size but capable of fitting in a box of this size.
- You have to select the largest document size as a box size (since a box of that size is required to store the largest of the documents).
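
To make the second property concrete, here is a minimal Java sketch of how one might price a given choice of box sizes under the zero-height assumption. The class and method names (`BoxCost`, `boxArea`, `totalArea`) and the toy data are my own illustration, not anything from the original question or from the repository mentioned later.

```java
import java.util.Arrays;

/** Hypothetical helper (illustration only) that prices a given choice of box sizes. */
final class BoxCost {

    /** Area of a zero-height box with the given side: top plus bottom. */
    static long boxArea(long side) {
        return 2L * side * side;
    }

    /**
     * Total area when every document goes into the smallest selected box that holds it.
     * The selected sides must include one at least as large as the largest document.
     */
    static long totalArea(int[] docSides, int[] selectedSides) {
        int[] boxes = selectedSides.clone();
        Arrays.sort(boxes);
        long total = 0;
        for (int doc : docSides) {
            int i = 0;
            while (i < boxes.length && boxes[i] < doc) {
                i++; // skip boxes that are too small for this document
            }
            if (i == boxes.length) {
                throw new IllegalArgumentException("No selected box fits a document of side " + doc);
            }
            total += boxArea(boxes[i]);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] docs = {1, 2, 2, 3}; // made-up document sides, in mm
        System.out.println(totalArea(docs, new int[] {2, 3})); // three boxes of side 2, one of side 3
        System.out.println(totalArea(docs, new int[] {3}));    // four boxes of side 3
    }
}
```

With the toy data, selecting sides 2 and 3 costs \(3 \cdot 8 + 18 = 42\), while selecting only side 3 costs \(4 \cdot 18 = 72\).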

What interests me about this problem is that it can be a useful example of Maslow’s Hammer: if all you have is a hammer, every problem looks like a nail. As an operations researcher (and, more specifically, practitioner of discrete optimization) it is natural to hear the problem and think in terms of general integer variables (number of boxes of each size), binary variables (is each possible box size used or not), assignment variables (mapping document sizes to box sizes) and so on. OR consultant and fellow OR blogger Erwin Kalvelagen did a blog post on this problem, laying out several LP and IP formulations, including a network model. I recommend reading it and contrasting it with what follows.

The first thought that crossed my mind was the possibility of solving the problem by brute force. The author of the original question supplied a data file with document dimensions. There are 1166 documents, with 384 distinct sizes. The largest size must be selected, so if three or four box sizes are allowed, the brute force approach would be to look at all \({383 \choose 2} = 73,153\) or \({383 \choose 3} = 9,290,431\) combinations, respectively, of the remaining box sizes, calculate the number of boxes of each size and their combined areas, and then choose the combination with the lowest total. On a decent PC, I’m pretty sure cranking through even 9 million plus combinations will only need a tolerable amount of time.
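
The enumeration described above might be sketched as follows. This is only an illustration under the zero-height assumption: the class name `BruteForceBoxes`, its fields and the toy data are hypothetical. It first collapses the documents into distinct sizes with cumulative counts, then recursively picks the smaller box sizes in increasing order, with the largest size always forced in.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical brute-force sketch: try every set of k box sizes that contains the largest size. */
final class BruteForceBoxes {
    private final int[] size; // distinct document sides, ascending
    private final int[] upTo; // upTo[i] = number of documents with side <= size[i]
    long bestArea;            // smallest total area found by the last call to solve
    int[] bestSides;          // box sides achieving bestArea, ascending

    BruteForceBoxes(int[] docSides) {
        TreeMap<Integer, Integer> counts = new TreeMap<>();
        for (int d : docSides) counts.merge(d, 1, Integer::sum);
        size = new int[counts.size()];
        upTo = new int[counts.size()];
        int i = 0, running = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            running += e.getValue();
            size[i] = e.getKey();
            upTo[i] = running;
            i++;
        }
    }

    long solve(int k) {
        if (k < 1 || k > size.length) {
            throw new IllegalArgumentException("k must be between 1 and " + size.length);
        }
        bestArea = Long.MAX_VALUE;
        bestSides = new int[k];
        recurse(new int[k], 0, 0, 0, 0L);
        return bestArea;
    }

    /** chosen[0..depth-1] are picked indices; covered documents are already boxed at cost area. */
    private void recurse(int[] chosen, int depth, int start, int covered, long area) {
        int k = chosen.length, n = size.length;
        if (depth == k - 1) { // the last box size is forced: the largest document size
            long total = area + 2L * size[n - 1] * size[n - 1] * (upTo[n - 1] - covered);
            if (total < bestArea) {
                bestArea = total;
                chosen[depth] = n - 1;
                for (int j = 0; j < k; j++) bestSides[j] = size[chosen[j]];
            }
            return;
        }
        // Stop early enough to leave room for the larger sizes still to be picked.
        for (int i = start; i <= n - (k - depth); i++) {
            chosen[depth] = i;
            long cost = 2L * size[i] * size[i] * (upTo[i] - covered);
            recurse(chosen, depth + 1, i + 1, upTo[i], area + cost);
        }
    }

    public static void main(String[] args) {
        BruteForceBoxes bf = new BruteForceBoxes(new int[] {1, 2, 2, 3}); // made-up sides
        System.out.println(bf.solve(2) + " using " + Arrays.toString(bf.bestSides));
    }
}
```

With 384 distinct sizes, `solve(3)` would visit the 73,153 combinations and `solve(4)` the 9,290,431 combinations counted above. I have not timed this sketch on the real data.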

A slightly more sophisticated approach is to view the problem through the lens of a layered network. There are either three or four layers, representing progressively larger selected box sizes, plus a “layer 0” containing a start node. In the three or four layers other than “layer 0”, you put one node for each possible box size, with the following restrictions:

- the last layer contains only a single node, representing the largest possible box, since you know you are going to have to choose that size;
- the smallest node in each layer is omitted from the following layer (since layers go in increasing size order); and
- the largest node in each layer is omitted from the preceding layer (for the same reason).

Other than the last layer (and the zero-th one), the layers here will contain 381 nodes each if you allow four box sizes and 382 if you allow three box sizes. An arc connects the start node to every node in the first layer, and an arc connects every node (except the node in the last layer) to every node in the next higher layer where the head node represents a larger size box than the tail node. The cost of each arc is the surface area for a box whose size is given by the head node, multiplied by the number of documents too large to fit in a box given by the tail node but small enough to fit in a box given by the head node. For an arc leaving the start node, that count is simply every document small enough to fit in the head node’s box.
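
Here is one way the layered network could be sketched in Java. Note that this is not the Dijkstra implementation described in this post: because every arc runs from one layer to the next, the network is acyclic, so the sketch swaps in a simple forward pass over the layers (a dynamic programming recursion), which also yields a shortest path. The names `LayeredBoxes`, `arcCost` and `solve` are hypothetical, and the inputs are assumed to be the distinct sizes in ascending order with cumulative document counts.

```java
import java.util.Arrays;

/** Hypothetical sketch of the layered network, solved by a forward pass over the layers. */
final class LayeredBoxes {

    /** Cost of the arc tail -> head; tail = -1 stands for the start node. */
    static long arcCost(int[] size, int[] upTo, int tail, int head) {
        int alreadyBoxed = (tail < 0) ? 0 : upTo[tail];
        return 2L * size[head] * size[head] * (upTo[head] - alreadyBoxed);
    }

    /**
     * size: distinct document sides, ascending; upTo[i]: number of documents with side <= size[i].
     * Returns {total area, side of box 1, ..., side of box `layers`}.
     */
    static long[] solve(int[] size, int[] upTo, int layers) {
        int n = size.length;
        if (layers < 1 || layers > n) {
            throw new IllegalArgumentException("layers must be between 1 and " + n);
        }
        final long INF = Long.MAX_VALUE / 2;
        long[][] dist = new long[layers][n]; // dist[l][j]: cheapest path from start to node j in layer l
        int[][] pred = new int[layers][n];   // pred[l][j]: node in layer l-1 on that path
        for (long[] row : dist) Arrays.fill(row, INF);
        for (int l = 0; l < layers; l++) {
            boolean last = (l == layers - 1);
            int lo = last ? n - 1 : l; // the last layer holds only the largest size
            int hi = n - layers + l;   // omit sizes too large to leave room for later layers
            for (int j = lo; j <= hi; j++) {
                if (l == 0) {
                    dist[0][j] = arcCost(size, upTo, -1, j);
                    continue;
                }
                for (int i = l - 1; i < j; i++) { // tails must be smaller than the head
                    if (dist[l - 1][i] == INF) continue;
                    long d = dist[l - 1][i] + arcCost(size, upTo, i, j);
                    if (d < dist[l][j]) {
                        dist[l][j] = d;
                        pred[l][j] = i;
                    }
                }
            }
        }
        long[] result = new long[layers + 1];
        result[0] = dist[layers - 1][n - 1];
        int node = n - 1;
        for (int l = layers - 1; l >= 0; l--) { // walk the predecessors back to the first layer
            result[l + 1] = size[node];
            node = pred[l][node];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] size = {1, 2, 3}; // made-up distinct sides
        int[] upTo = {1, 3, 4}; // one document of side 1, two of side 2, one of side 3
        System.out.println(Arrays.toString(solve(size, upTo, 2)));
    }
}
```

The `lo` and `hi` bounds reproduce the node omissions listed above: with 384 sizes and four layers, each of the first three layers spans 381 indices and the last layer holds the single largest size.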

I wanted to confirm that the problem is solvable without special purpose software, so I coded it in Java 8. Although there are plenty of high quality open-source graph packages for Java, I wrote my own node, arc and network classes and my own coding of Dijkstra’s shortest path algorithm just to prove a point about not needing external software. You are welcome to grab the source code (including the file of document sizes) from my Git repository if you like.

I ran both the three and four size cases and confirmed that my solutions had the same total surface areas that Erwin got, other than a factor of two (I count both top and bottom; he apparently counts just one of them). How long does it take to solve the problem using Dijkstra’s algorithm? Including the time reading the data, the four box version takes about half a second on my decent but not workstation caliber PC. The three box version takes about 0.3 seconds, but of course gives a worse solution (since it is more tightly constrained). This is single-threaded, by the way. Both problem set up and Dijkstra’s method are amenable to parallel threading, but that would be overkill given the run times.

So is it wrong to take a fancier modeling approach, along the lines of what Erwin did? Not at all. There are just trade-offs. The modeling approach produces more maintainable code (in the form of mathematical models, using some modeling language like GAMS or AMPL) that is also more easily modified if the use case changes. The brute force and basic network approaches I tried require no extra software (so no need to pay for it, no need to learn it, …) and work pretty well for a “one-off” situation where maintainability is not critical.

Mainly, though, I just wanted to make a point that we should not overlook simple (or even brute force) solutions to problems when the problem dimensions are small enough to make them practical … especially with computers getting more and more powerful each year.
