Informatica 39 (2015) 229-235 229 The Random Hypergraph Assignment Problem Ralf Borndorfer Zuse Institute Berlin, Takustr. 7, 14195 Berlin, Germany E-mail: borndoerfer@zib.de Olga Heismann Zuse Institute Berlin, Takustr. 7, 14195 Berlin, Germany E-mail: heismann@zib.de Keywords: assignment, hyperassignment, random, expected value Received: July 13, 2014 Parisi's famous (proven) conjecture states that the expected cost of an optimal assignment in a complete bipartite graph on n + n nodes with i. i. d. exponential edge costs with mean 1 isJ2"=i 1/'2, which converges to an asymptotic limit of n2/6 as n tends to infinity. We consider a generalization of this question to complete "partitioned" bipartite hypergraphs G2,n that contain edges of size two and proper hyperedges of size four. We conjecture that for i. i. d. uniform hyperedge costs on [0,1] and i. i. d. exponential hyper-edge costs with mean 1, optimal assignments expectedly contain half of the maximum possible number of proper hyperedges. We prove that under the assumption of this number of proper hyperedges the asymptotic expected minimum cost of a hyperassignment lies between 0.3718 and 1.8310 if hyperedge costs are i. i. d. exponential random variables with mean 1. We also consider an application-inspired cost function which favors proper hyperedges over edges by means of an edge penalty parameter p. We show how results for an arbitrary p can be deduced from results for p = 0. Povzetek: V članku je opisana analiza komplektnosti dvojno povezanih grafov na osnovi razširitve Parisi-jevega izreka. 1 Introduction A way to gain a better understanding of the structure of a combinatorial optimization problem is to analyze the optimal values of random instances. For the assignment problem, such results were conjectured after extensive computational experiments and then proven theoretically. In particular, the famous (proven) Conjectures of M6zard and Parisi [M6zard and Parisi, 1985] state that the expected optimal cost value of an assignment problem on a complete bipartite graph with i. i. d. uniform edge costs on [0,1] or i. i. d. exponential edge costs with mean 1 converges to 2 ^ = 1.6449 ... if the number of vertices tends to infinity. The limit is equal for both distributions since it can be proven that only the density at 0 is relevant, which coincides for both distributions [Aldous, 1992]. For a survey on the random assignment problem and several of its variants, see [Krokhmal and Pardalos, 2009]. We consider a generalization of this setting to a class of bipartite hypergraphs in terms of what we call the random hypergraph assignment problem (HAP). This problem is an idealized version of vehicle rotation planning problems in long-distance passenger rail transport, see [Borndorfer et al., 2011] for further details and [Mar6ti, 2006] for a survey on railway vehicle rotation planning. We will deal with HAPs in a special well-structured type of bipartite hypergraphs G2,n, that contain on each side n "parts" of size 2 each. In this case, the HAP is already NP-hard [Borndorfer and Heismann, 2012] and therefore interesting to analyze. The hyperedge set of such a partitioned hypergraph G2,n consists only of edges of size 2 and proper hyperedges of size 4, and it has a structure that makes it easy to view a hyperassignment as a combination of two assignments, one consisting only of edges, and the other one consisting only of proper hyperedges (that can also be viewed as edges). Despite this simple general idea, however, combining the two assignments involves a choice over an exponential number of possibilities which is quite difficult to analyze. We will explain this in more detail in Section 2 after introducing the problem. In Section 3, we conjecture that the expected number of proper hyperedges in an optimal solution of the random HAP on partitioned hypergraphs G2,2n with i. i. d. uniform random edge costs on [0,1] or i. i. d. exponential random edge costs with mean 1 is n. This conjecture is based on extensive computational results. Assuming that this conjecture holds, we can prove a lower bound of 0.3718 and an upper bound of 1.8310 for the expected value of a minimum cost hyperassignment in G2,2n for the exponential edge cost distribution and for vertex numbers tending to infinity. To achieve this, we first use a combinatorial argument to represent the bounds in terms of bounds for random assignments. Then, we compute these bounds using results 230 Informatica 39 (2015) 229-235 R. Borndörfer et al. for the random assignment problem. In hypergraph assignment problems that arise from practical applications, proper hyperedges represent unions of edges. Such hyperedges have costs that are smaller than the sum of the costs of the edges that they contain; these edges are considered to be similar and a solution with much similarity is desirable [Borndörfer et al., 2011]. We consider a setting with regularity-rewarding cost functions, in which the number of proper hyperedges in a solution and the optimal value of a random HAP in G2,n do not only depend on the number of vertices n but also on an edge penalty parameter p. We will show how the number of proper hyperedges and the value of an optimal solution for every p can be deduced from results for p = 0 in Section 4. The paper ends in Section 5 with a discussion of the results. A short conference version of this paper has already been published as [Heismann and Borndörfer, 2013]. 2 The hypergraph assignment problem We consider in this paper hypergraph assignment problems on a special type of bipartite hypergraphs. Definition 2.1. Let G2,n = (U, V, E) be the bipartite hypergraph with vertex sets with U = U Ui, Ui = {ui, ui}, v = U V Vi = {Vi, v} and hyperedge set E = Ei U E2 where Ei = {{u, v} : u € U, v € V} are the edges and E2 = {Ui U Vj : ij €{1,...,n}} are the proper hyperedges of size 4. The sets U and Vj, i € {1,..., n} are called the parts on the U- and V-side, respectively. For a visualization of such a hypergraph, see Figure 1. Note that every hyperedge in G2,n connects a part on the U- and a part on the V-side. We remark that the HAP can be formulated in the same way for more general bipartite hypergraphs, with less structure and possibly containing hyperedges which contain more than four vertices, see [Borndorfer and Heismann, 2012]. Figure 1: Visualization of the bipartite hypergraph G2,3. The thick hyperedge is the proper hyperedge U' U V2 = {u1,u'1,V2,V,2 }. Definition 2.2. For a vertex subset W C U U V we define the incident hyperedges ¿(W) := {e € E : e n W = 0, e \ W = 0} to be the set of all hyperedges having at least one vertex in both W and (U U V) \ W. A hyperassignment in G2,n is a subset H C E of pair-wise disjoint hyperedges that cover U and V, i. e., for all ei, e2 € H, ei n e2 = 0, and U H = U U V. Given a cost function cE : E ^ R, the cost of a hyperassignment is£ eeH cE (e). The hypergraph assignment problem with input (G2,n, cE) consists of finding a hyperassignment in G2,n of minimum cost w. r. t. cE. For bipartite hypergraphs G2,n, the hypergraph assignment problem can be seen as a combination of two assignment problems. Namely, observe that for every hyperassignment H and every part U and Vj, i € {1,..., n}, the set of incident hyperedges ¿(Uj) n H and ¿(Vj) n H consists either of one proper hyperedge or of two edges. If we decide for every part Uj and Vj whether the hyperassign-ment to be constructed is incident to one proper hyperedge or to two edges, we can restrict the hyperedge set of G2,n to - the set of edges connecting pairs of vertices from the parts Uj, Vj that will be incident to edges—this is the first assignment problem, and - the proper hyperedges {Uj U Vj} for Uj and Vj that will be incident to proper hyperedges—viewing Uj and Vj as composite vertices and the hyperedges as edges connecting them—this is the second assignment problem. Solving these two assignment problems independently produces the minimum cost hyperassignment subject to the fixed edge and hyperedge incidences. The HAP in G2,n can thus be solved in two steps. The first step decides which parts Uj and Vj will be incident to proper hyperedges. Of course, we must chose the same number of parts on the U- and the V-side, equal to the number of proper hyperedges in the hyperassignment to be constructed; the other parts will be incident to edges. The second steps consists of solving the resulting two assignment problems stated above. The Random Hypergraph Assignment Problem Informatica 39 (2015) 229-235 231 3 Expected optimal values for the random HAP with exponential or uniform costs Predicting the optimal value of a random hypergraph assignment problem in G2,n involves a prediction of the number of proper hyperedges in an optimal solution. This number depends on how advantageous it is to choose a proper hyperedge instead of two edges (so that one has just one number adding to the cost instead of two) compared to the disadvantage of having less freedom (there are fewer possibilities to cover a single vertex with a proper hyperedge than with an edge) when searching for a hyperassignment with the least possible cost. We conjecture that one can expect that an optimal hypergraph assignment in G2,n contains half of the possible number of proper hyperedges. Conjecture 3.1. The expected number of proper hyperedges in a minimum cost hyperassignment in G2,2n with cost function cE such that all cE(e), e G E are i. i. d. exponential random variables with mean 1 or i. i. d. uniform random variables on [0,1] is n. Table 1 backs this conjecture. It gives computational results for the random hypergraph assignment problem in the bipartite hypergraph G2,n with i. i. d. uniform random variables on [0,1] and i. i. d. exponential random variables with mean 1 as hyperedge costs. For every n, we report the mean value and the standard deviation of the optimal cost value and the number of proper hyperedges in the optimal solution for 1000 computations. The HAPs were solved as integer programs using CPLEX 12.5. The first column (n) of Table 1 shows the number of parts on the U- and V-side. Columns 2 and 6 (opt. val.) give the mean optimal values. Their standard deviations can be seen in columns 3 and 7 (s. d.) for the two cost function distributions, respectively. Columns 4 and 8 (# pr. hy.) show the number of proper hyperedges in the optimal solutions found, columns 5 and 9 (s. d.) show their standard deviations. The important finding w. r. t. Conjecture 3.1 is that the values in columns 4 and 8 are about half the values in column 1 in each row. The computational results also suggest that the expected optimal cost converges to a value around 1.05 for both distributions. Although for larger n more hyperedges are contained in a hyperassignment, the optimal value does not increase much. This can be intuitively explained by noting that for larger n there are also more possible hyperassign-ments to select from, and the chances to find a hyperassign-ment that has a low cost are therefore still good even if it will contain more hyperedges. We will now compute a lower and upper bound on the expected value of a minimum cost hyperassignment in G2,2n with n proper hyperedges for the exponential distribution. To this end, we will use the following result: For a complete bipartite graph with vertex sets of size m and n and with i. i. d. exponential random variables with mean 1 as edge costs, the expected minimum value of the sum of k pairwise disjoint edges (this is called a partial assignment) is E(m,n,k):= V --—--. ij>o (n - i)(m - j) i+j0 (2n - i)(2n - j) i+j0 (2n - i)(2n - j) ' i,j>0 (2n - i)(2n - j) i+j-2 i+j0 i+j 0, i + j < n - 4 so that they can cancel. The remainder is as follows. For the first sum, it is used that it is symmetric in j = -2 n-2 + E j = -2 (2n + 1)(2n - j) 2 (2n + 2)(2n - j) i and j. The term 77- j 4(? (4n+3)2 , , -|S2/n , -1 \2 is the sum of the values where -2 < i, j < -1. This has to be subtracted from the first term as otherwise these values would be counted twice. (4n + 3)2 4(n + 1)2(2n + 1)2 n-1 D(n) = 2 • y _1_ -2-2 (2n - i)(2n - j) i+j 0.69. Therefore, for n > 80, 2 • lim E(n) n—^to > 2 • 0.1859 = 0.3718 and for the upper bound lim (E(n) + E(2n, 2n, 2n)) = lim E(n) n—^^o n—^w + lim E(2n, 2n, 2n) n—w n2 < 0.1860+ — 6 < 1.8310. □ We remark that the upper bound computed in Theorem 3.2 is greater than the expected optimal value of the 2 random assignment problem ^ = 1.6449 .... We believe that it must be possible to reduce it, because moving from an assignment problem in a complete bipartite graph with 4n vertices on each side to a HAP in G2,2n adds more possibilities (still all assignments are feasible solutions but using hyperassignments with proper hyperedges gives additional ones). Indeed, it is clear that if we do not prescribe the number of proper hyperedges in an optimal solution, the expected optimal value of a hyperassignment in G2 2n 2 will tend to some number < ^ .As already discussed, the computational results shown in Table 1 suggest that the correct number is some value around 1.05, much smaller than n2. 4 Regularity rewarding costs Hypergraph assignment problems arising from practical applications feature costs for proper hyperedges that depend on the costs of the edges that they contain. Indeed, proper hyperedges model a "reward" for choosing combinations of edges; in this way, one can model a so-called regularity of the solution [Borndorfer et al., 2011]. More precisely, one considers partitioned bipartite hypergraphs and wants to favor the simultaneous choice of a set of edges that connects all nodes in a certain part in U to all nodes in a certain part in V. To this purpose, one introduces a proper hyperedge that represents the union of such pairwise disjoint edges and that has a cost that is smaller than the sum of the edge costs. If different edge combinations result in the same hyperedge, the cost is inferred from the edge set with the minimum cost sum. Here is a more formal statement. 2H2 n — 2Hn 230 Informatica 39 (2015) 229-235 R. Borndörfer et al. Definition 4.1. Let G = (U, V, E) be a partitioned hypergraph. For e € E, let E(e) := |E' C E1 : e1 n e2 = 0 Vex, e2 € E' with ei = e2^E' = e} be the set of all pairwise disjoint edge sets with union e. For some penalty p > 0, we call a cost function cpE : E ^ R regularity-rewarding if for all proper hyperedges e € E2, cE (e) = min E'GE (e) £ ce (e') - p •|E'|J . e'GE' J The greater p, the more irregularity is punished and regularity rewarded. We remark that the cost of a hy-peredge in a vehicle rotation planning model depends on several other parameters such as an additional irregularity penalty for hyperedges that are not inclusion-wise maximal [Borndorfer et al., 2011]. This is the reason why we call p a penalty and not a bonus or a reward. A way to define a regularity-rewarding random cost function cE is to draw a random basic cost re for each edge e € E1, e.g., from a uniform distribution on [0,1] or an exponential distribution with mean 1, and then to set re + p cE(e) := I minE'GE(e)Ee'eE' re' if e is an edge, re/ if e is a proper hyperedge. we will view Z(h) as a continuous, monotonically increasing, differentiable function on [0, n]. This will allow us to formulate our result in a much easier way than if we would have to replace the derivative by its discretization. We can require Z (h) to be monotonically increasing, because z(h, 0) is monotonically increasing with increasing h. The reason is that, as described above, using proper hyperedges in the solution cannot lead to smaller optimal values than using only edges in the case p = 0. Theorem 4.2. Consider the complete bipartite hypergraph G2,n = (U, V, E) and let re, e € E1 be random basic costs chosen from some random distribution. Denote by h^... hkd the solutions to the equation Z'(h) = 2p and let h* = arg min (Z(h) - (2n - 2h)p) hG{0,hJ,...,hk,n} Then the expected number of proper hyperedges in an optimal solution to the HAP in G2,n w. r. t. cE with basic random costs re is h* and the expected optimal value of the random HAP is Z(h*) - (2n - 2h*)p. Proof. First, observe that z(h,p) = z(h, 0) + (2n - 2h)p holds since the cost of each hyperassignment H w. r. t. cE is In the following, we will assume that cE is structured in this way with arbitrary re. For a given bipartite hypergraph G2,n = (U, V, E) and random basic costs re for the edges e € E1, we denote by z(h,p) the minimal cost value of a hyperassignment with penalty p that contains exactly 0 < h < n proper hyper-edges. Obviously, the number of proper hyperedges and the value of an optimal solution will depend on p. If p = 0, there is no reward for choosing a proper hyperedge. For every solution using proper hyperedges, we can find a solution with the same value that contains only edges by replacing each proper hyperedge jwj, uj, vj, vj } by either the two edges jwj, vj}, {uj, vj} or the two edges {u, vj}, {uj, vj} depending on which two edges have the lower cost sum. On the other hand, if p is very large, choosing edges for a solution becomes so disadvantageous that the number of proper hyperedges in an optimal solution will become very high. Fortunately, knowledge about the case p = 0 gives information about all other penalties as the following theorem shows. Thus, we do not need to analyze random HAPs for regularity-rewarding cost functions separately for each penalty p. For some random basic cost distribution, we denote by Z(h) the expected value of z(h, 0) with respect to this distribution. Although z(h, 0) is defined only for integral h, cE(H) = £ cE(e) eGE = £ cE (e)+ £ cE (e) eGEi e£E2 (r £(re + p)+ E ESiL £ re' e£E1 e£E2 ( ) e'GE' £ re + |Ei n H|p eGEi + > min > re' ^ E'GE(e) ^ e eGE2 ^ ' e'GE' £ re + (2n - 2|E2 n H|)p eGEi + min re' ^ E'GE(e) ^ e eGE2 V e'GE' £ cE(e) + (2n - 2|E2 n H|)p eGE cE(H) + (2n - 2|E2 n H|)p. Since this holds for all random basic costs, it also holds for the expected value of all random basic cost distributions and we get E(z(h,p)) = Z(h) + (2n - 2h)p. Its derivative is Z'(h) - 2p. A minimum of a differen-tiable function is attained either at the bounds or where the derivative is equal to zero, which proves the theorem. □ The Random Hypergraph Assignment Problem Informatica 39 (2015) 229-235 231 5 Discussion In this paper, we have presented results on the expected minimum cost of the random hypergraph assignment problem for two types of cost functions. For the first type, i. i. d. exponential random variables with mean 1 or i. i. d. uniform random variables on [0,1], we conjectured that the number of proper hyperedges in an optimal solution is expected to be n for the hypergraph G2,2n, and showed computational results supporting this conjecture. Assuming this number of proper hyperedges in an optimal solution, we proved bounds on the expected optimal value for a vertex number tending to infinity. A proof of our conjecture as well as convergence results and either sharper bounds or an exact limit would be a natural continuation of our work towards a generalization of M6zard and Parisi's Conjecture. A first step is to extend the proof of our bounds to fixed numbers of hyperedges other than n by altering the computation. For the second type of regularity-rewarding cost functions, we established a connection between results for different penalty values. This result could be extended by an analysis similar to that for the first cost function type in future. All our results hold for complete partitioned hypergraphs G2,n. A further line of research could try to extend these results to bipartite hypergraphs with larger part sizes or even bipartite hypergraphs that are not partitioned or/and not complete. Our results show how to approach the random HAP using results for the random assignment problem. Probably approaches using more sophisticated probability-theoretical results are needed to understand more about the problem. References [Aldous, 1992] Aldous, D. (1992). Asymptotics in the random assignment problem. Probability Theory and Related Fields, 93:507-534. [Borndorfer and Heismann, 2012] Borndorfer, R. and Heismann, O. (2012). The hypergraph assignment problem. Technical Report 12-14, ZIB. [Borndorfer et al., 2011] Borndorfer, R., Reuther, M., Schlechte, T., and Weider, S. (2011). A Hypergraph Model for Railway Vehicle Rotation Planning. In Caprara, A. and Kontogiannis, S., editors, 11th Workshop on Algorithmic Approaches for Transportation Modelling, Optimization, and Systems (ATMOS 2011), volume 20 of OpenAccess Series in Informatics (OASIcs), pages 146-155, Dagstuhl, Germany. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. ZIB Report 11-36. [Coppersmith and Sorkin, 1999] Coppersmith, D. and Sorkin, G. B. (1999). Constructive bounds and exact expectations for the random assignment problem. Random Structures & Algorithms, 15(2):113-144. [Heismann and Borndörfer, 2013] Heismann, O. and Borndörfer, R. (2013). The random hypergraph assignment problem. In Proceedings of the 16th International Multiconference INFORMATION SOCIETY-IS 2013. [Krokhmal and Pardalos, 2009] Krokhmal, P. A. and Pardalos, P. M. (2009). Random assignment problems. European Journal of Operational Research, 194(1):1-17. [Linusson and Wästlund, 2004] Linusson, S. and Wästlund, J. (2004). A proof of Parisi's conjecture on the random assignment problem. Probability Theory and Related Fields, 128(3):419-440. [Maröti, 2006] Maröti, G. (2006). Operations research models for railway rolling stock planning. PhD thesis, Technische Universiteit Eindhoven. [M6zard and Parisi, 1985] M6zard, M. and Parisi, G. (1985). Replicas and optimization. Journal de Physique Lettres, 46:771-778.