---
title: "The story of using Gromov-Wasserstein optimal transport for the seating arrangement of a wedding party"
emoji: "🥂"
type: "tech" # tech: technical article / idea: idea
topics: ["optimaltransport", "optimal transport", "python", "wedding"]
published: true
---

My wife and I had a small wedding party this spring. Among the various preparations, I could have nitpicked the seating arrangement endlessly, so I used mathematical optimization to decide it.

*I want to come up with a seating arrangement that looks good, taking into account inter-guest relationships.*

tl;dr

Background

In short, we wanted to make the party guest-centered. Since we would have limited time to talk to each guest in person, we selected the guests and their seating so as to loosely connect the whole group through "friend of a friend" relationships, hoping that new interactions would happen among the guests, as at academic conference banquets. This led to the optimization problem of embedding the nearly connected guest-relationship graph into a continuous long table, as shown in the header image.

Straightforward formulation: quadratic programming

To let the computer solve this as an optimization problem, we need to define the goodness of seating.

This can be formulated as follows.

$$
\begin{align*}
\operatorname{maximize} \sum_{i,i'} \ &\text{Friendship } S(\text{guest } i, \text{guest } i') \\
&\times \text{Closeness } C(\text{seat of guest } i, \text{seat of guest } i')
\end{align*}
$$

Find the seat assignment $\pi: [\text{number of guests}] \to [\text{number of seats}]$ that maximizes the above. Hereafter, the number of guests $=$ the number of seats $=: N$. This problem is called the quadratic assignment problem and is NP-hard in general, so we have to settle for an approximate solution.

As the solution space, we define a "degree" $x_{i,j}\in [0,1]$ to which guest $i$ is assigned to seat $j$. Let $x=(x_{0,0},\ldots,x_{i,j},x_{i,j+1},\cdots)^\top$ be the vector of these values lined up.

![](/images/seatopt_xspace.jpg =480x)
*Representation of the guest-seat allocation.*

The optimization problem can be expressed as follows.

$$
\begin{align*}
&\operatorname{maximize}_{x \in \mathbb R^{N^2}} && x^\top (\underbrace{S\otimes C}_{=:P})\, x \\
&\operatorname{subject\ to} && x_{i,j} \geq 0 \\
&&& \sum_i x_{i,j}=\sum_j x_{i,j}=1/N
\end{align*}
$$

The second constraint is the normalization that allows one seat per person.
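The flattened objective can be sketched in a few lines of NumPy. This is a minimal illustration with toy random matrices, not the actual guest data; `qp_objective` is a name of my own for evaluating $x^\top P x$ on a hard assignment.

```python
import numpy as np

# Toy friendship matrix S and seat-closeness matrix C (random placeholders)
N = 4
rng = np.random.default_rng(0)
S = rng.random((N, N)); S = (S + S.T) / 2   # friendship between guests
C = rng.random((N, N)); C = (C + C.T) / 2   # closeness between seats

# P = S ⊗ C acts on the flattened assignment vector x ∈ R^{N^2}
P = np.kron(S, C)

def qp_objective(pi):
    """Evaluate x^T P x for the hard assignment guest i -> seat pi[i]."""
    X = np.zeros((N, N))
    X[np.arange(N), pi] = 1.0 / N   # every row and column sums to 1/N
    x = X.reshape(-1)
    return x @ P @ x
```

For a hard assignment this reduces to $\frac{1}{N^2}\sum_{i,i'} S_{i,i'} C_{\pi(i),\pi(i')}$, which matches the first formulation.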

Now that the problem is defined, all we have to do is hand it to a general-purpose solver… but that did not work in our attempts.

I decided to take a specialized approach, thinking that the problem is generally too difficult to solve with a general-purpose solver.

Reformulation: Gromov-Wasserstein optimal transport

In fact, this problem can be viewed quite naturally as an optimal transport problem.

The usual optimal transport is the problem of optimizing the assignment between two sets or probability distributions so as to minimize the transport cost based on the distances between their elements.

![optimal and non-optimal transport](/images/seatopt_optimal_and_not_optimal_transport.png =500x)
*Find the correspondence between the red point group and the blue point group that minimizes the transport cost (Figure: Introduction to Optimal Transport)*

Gromov-Wasserstein (GW) extends this framework to situations where the two sets each carry their own distance structure. Since no distance is defined between elements of different sets, the assignment is determined so as to preserve the distances among elements within each set. In our problem, a distance between a guest and a seat (what would directly measure a good allocation) is not defined, but guest-to-guest and seat-to-seat distances are. So we regard as good the allocation that minimizes the difference between the guest-to-guest distances and the seat-to-seat distances.

![seatopt_gw_matching.png](/images/seatopt_gw_matching.png =300x)
*Gromov-Wasserstein optimal transport finds the assignment that minimizes the difference between the two distance structures (Figure: GW introductory slides)*

Predefine the distance between guests (degree of acquaintance) and the distance between seats (whether they are close enough to talk). We determine the seat assignment $x \in [0,1]^{N^2}$ so that, when guest $i$ sits in seat $j$ and guest $i'$ sits in seat $j'$, acquainted guests (small guest distance $d_{i,i'}$) get nearby seats (small seat distance $\bar d_{j,j'}$), and vice versa. That is, we minimize the difference $|d_{i,i'}-\bar d_{j,j'}|$ between the guest distance and the seat distance, weighted by the guest-seat correspondences $x_{i,j}\, x_{i',j'}$, which yields the following formula.

$$
\begin{align*}
&\operatorname{minimize}_{x \in \mathbb R^{N^2}} && \sum_{i,i',j,j'} x_{i,j}\, x_{i',j'}\, |d_{i,i'}-\bar d_{j,j'}| \\
&\operatorname{subject\ to} && x_{i,j} \geq 0 \\
&&& \sum_i x_{i,j}=\sum_j x_{i,j}=1/N
\end{align*}
$$

If one of the distances $d$ and $\bar d$ is multiplied by a constant, the solution changes. It is therefore better to define the two distances so that they are distributed on roughly the same scale, or to use the percentile rank of each distance instead of the distance itself.
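The percentile trick mentioned above can be sketched as follows. The function name `to_percentile` is my own; it maps every off-diagonal distance to its percentile rank so that $d$ and $\bar d$ share the same $[0, 1]$ scale before running GW.

```python
import numpy as np

def to_percentile(D):
    # Rank each entry of D among the off-diagonal distances, then
    # normalize to [0, 1]; self-distances on the diagonal stay zero.
    off_diag = D[~np.eye(len(D), dtype=bool)]
    P = np.searchsorted(np.sort(off_diag), D, side="right") / off_diag.size
    np.fill_diagonal(P, 0.0)
    return P
```

Applying this to both distance matrices makes the GW objective insensitive to the original units (hours known vs. meters apart).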

Once the problem is defined, it can be passed to the solver in the Python-OT package and solved in a few seconds. The processing flow can be summarized as follows. A reproduction notebook is available on Google Colab.

![](/images/seatopt_gw_alg_flow.png)
*Gromov-Wasserstein optimal transport finds the assignment that minimizes the difference between the distance structures.*

The general outline is above, but several minor adjustments have been made.

Evaluating the quality of the solutions with the objective function $x^\top P x$ of the original QP gives the following: the GW solution looks fairly good in the sense of the QP objective^[The original QP is similarity maximization while GW is distance minimization, but they solve similar problems, so we assume a good solution of one is also a reasonable solution in the sense of the other.].

| Random | GW | GW + Greedy |
| --- | --- | --- |
| 0.097 $\pm$ 0.02 | 0.242 | 0.300 |
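The "GW + Greedy" column can be understood as a local-search refinement. Here is a sketch of one plausible version, assuming the simplest 2-swap scheme (the exact refinement used is not specified in the text): starting from the GW seating, keep swapping two guests whenever the swap raises the QP objective $\sum_{i,i'} S_{i,i'} C_{\pi(i),\pi(i')}$.

```python
import numpy as np

def qp_score(S, C, pi):
    # sum_{i,i'} S[i,i'] * C[pi[i], pi[i']] for a hard assignment pi
    return (S * C[np.ix_(pi, pi)]).sum()

def greedy_refine(S, C, pi):
    # Repeat 2-swaps until no swap of two guests improves the score
    pi = pi.copy()
    improved = True
    while improved:
        improved = False
        for a in range(len(pi)):
            for b in range(a + 1, len(pi)):
                cand = pi.copy()
                cand[a], cand[b] = cand[b], cand[a]
                if qp_score(S, C, cand) > qp_score(S, C, pi):
                    pi, improved = cand, True
    return pi
```

Each pass is $O(N^2)$ swap candidates, which is cheap for a wedding-sized $N$, and the result can only improve on the GW starting point.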

Participants also commented that it was so natural that they could not believe it was decided mechanically.

Discussion: Why does it work?

As mentioned above, such quadratic assignment problems are computationally hard and could not be solved with standard QP solvers. In particular, general-purpose solvers often exploit convexity, but this problem is inherently nonconvex. Because the table is long, the seating arrangement has symmetry: an optimal arrangement remains optimal after a 180-degree rotation or a mirror flip. At the midpoint $0.5 x + 0.5 x'$ of two such solutions $x, x'$, however, people who should have been placed far apart end up nearby or overlapping (with half the mass each). The intermediate solution is thus worse than either endpoint, which makes the problem nonconvex and difficult to approach generically.
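This nonconvexity can be checked numerically on a toy instance: with four guests on a path-shaped friendship graph and four seats in a row, the identity arrangement and its mirror image score equally, yet their midpoint scores strictly worse.

```python
import numpy as np

# Guests 0-1-2-3 are friends in a chain; seats 0-1-2-3 are adjacent in a row
N = 4
S = np.zeros((N, N)); C = np.zeros((N, N))
for k in range(N - 1):
    S[k, k + 1] = S[k + 1, k] = 1.0
    C[k, k + 1] = C[k + 1, k] = 1.0
P = np.kron(S, C)

def coupling(pi):
    # Flattened hard assignment with mass 1/N per guest
    X = np.zeros((N, N)); X[np.arange(N), pi] = 1.0 / N
    return X.reshape(-1)

x1 = coupling(np.arange(N))        # identity arrangement
x2 = coupling(np.arange(N)[::-1])  # mirror-flipped arrangement
x_mid = 0.5 * x1 + 0.5 * x2        # convex combination of the two optima

print(x1 @ P @ x1, x2 @ P @ x2, x_mid @ P @ x_mid)
# → 0.375 0.375 0.1875: the midpoint is strictly worse than either optimum
```

A maximization objective whose value drops between two optima cannot be concave, so convexity-based solvers have nothing to exploit here.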

The GW solver does not give a globally optimal solution either. My understanding is that by treating the nonconvex problem as nonconvex and exploiting the good structure of an assignment problem equipped with distances, it obtains a fairly good solution quickly. The specific algorithm is described in the GW introductory slides and elsewhere, but I am not sure why it is so fast. My rough impression is that problems often become tractable when a distance structure is available^[For example, it is [known](https://en.wikipedia.org/wiki/Travelling_salesman_problem#Euclidean) that the traveling salesman problem can also be solved quickly when distances are available, especially low-dimensional Euclidean distances.].

Summary

Although I spent two full weekends studying the method and implementing it, I am glad I used an optimization solver. In general, I found the following advantages to solving the problem with an optimization solver instead of by hand.

This was the first time for me to use a mathematical optimization solver by myself, but I learned the following lessons.

References

A textbook on optimal transport is coming out next month. Rumor has it that Gromov-Wasserstein is explained there in an easy-to-understand manner. Optimal transport is a hot technology: beyond pure optimization like this one, it is increasingly used in machine learning, for example in GANs and causal inference. The book is written by a young leading Japanese expert who also gave the tutorial listed below, so I think it is a good one to have.

https://www.amazon.co.jp/dp/4065305144/

Here are some study materials for those who can’t wait.

Related work