https://doi.org/10.31449/inf.v43i2.2678 Informatica 43 (2019) 177–186 177 Benchmark Problems for Exhaustive Exact Maximum Clique Search Algorithms Sándor Szabó Institute of Mathematics and Informatics, University of Pécs Ifjuság utja 6, H-7624 Pécs, Hungary E-mail: sszabo7@hotmail.com Bogdán Zaválnij Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences Reáltanoda u. 13–15, H-1053 Budapest, Hungary E-mail: bogdan@renyi.hu Keywords: clique, maximal clique, maximum clique, graph coloring, random graph, Mycielski graph Received: February 12, 2019 There are well established widely used benchmark tests to assess the performance of practical exact clique search algorithms. In this paper a family of further benchmark problems is proposed mainly to test exhaus- tive clique search procedures. Povzetek: Podanih je nekaj novih standardnih problemov za testiranje algoritmov za iskanje klik. 1 Preliminaries Let G = (V;E) be a finite simple graph. Here V is the set of vertices of G and E is the set of edges of G. The finiteness ofG means thatG has finitely many nodes and finitely many vertices, that is,jVj <1,jEj <1. The simplicity ofG means thatG does not have any loop and it does not have double edges. LetC be a subset ofV . If two distinct nodes inC are always adjacent inG, thenC is called a clique inG. When C has k elements, then we talk about a k-clique. We in- clude the casesk = 0 andk = 1 as well whenjCj = 0 and jCj = 1, respectively. Though in these cases C does not have two distinct elements. A clique is maximal in G if it is not part of any larger clique inG. In other words a clique is maximal clique of the graphG if it cannot be extended to a larger clique by adding a new node of the graphG. Ak-clique is a maxi- mum clique inG ifG does not contain any (k + 1)-clique. All the maximum cliques of a graphG have the same num- ber of nodes. We call this well defined number the clique number ofG and we denote it by!(G). A number of problems is referred as clique search prob- lems [1]. Problem 1. Given a finite simple graphG. List all maxi- mal cliques ofG without repetition. Problem 2. Given a finite simple graphG and given a pos- itive integerk. Decide ifG has ak-clique. Problem 3. Given a finite simple graph G. Determine !(G). Problem 4. Given a finite simple graphG. List all maxi- mum cliques ofG without repetition. An algorithm to solve Problem 1 can be found in [2]. The algorithm is commonly known as Bron-Kerbosch al- gorithm. Obviously, the Bron-Kerbosch algorithm can be used to solve Problems 2, 3 and 4. A more efficient algo- rithm to solve these problems was first given in [3]. The algorithm is known under the name Caraghan-Pardalos al- gorithm. The Bron-Kerbosch and Carraghan-Pardalos al- gorithms are the classical algorithms that form the base of many further clique search procedures. These algorithms are presented in [14], [26], [24], [10], [9]. But this list is not intended to be complete. The complexity theory of the algorithm tells us that Problem 2 is in the NP-complete complexity class. (See for instance [6].) Consequently, Problem 3 must be NP-hard. We color the vertices ofG such that the following con- ditions are satisfied. (1) Each vertex ofG is colored with exactly one color. (2) Vertices of G connected by an edge receive distinct colors. A coloring of the nodes ofG that satisfies conditions (1), (2) is called a legal coloring or well coloring of the nodes ofG. Suppose that the nodes ofG can be colored legally using k colors. We may use a mapf : V !f1;:::;kg to de- scribe a coloring of the nodes ofG. The numbers 1;:::;k play the roles of the colors andf(v) is the color of the ver- texv for eachv2 V . If for adjacent nodesu andv ofG 178 Informatica 43 (2019) 177–186 S. Szabo et al. the equationf(u) =f(v) impliesu =v, then the coloring defined by the mapf is a legal coloring. There is a number of the colors k such that the nodes of G can be colored legally using k colors and the nodes of G cannot be colored legally using k 1 colors. This well defined number k is called the chromatic number of the graphG and it is denoted by (G). Graph coloring is a vast subject and we cannot make jus- tice to this venerable field. In this paper we take a very narrow view. We are interested in only one particular ap- plication. Note that !(G) (G) holds for each finite simple graphG and so coloring of the nodes can be used to estimate the size of a maximum clique. However, the gap between!(G) and (G) can be arbitrarily large. J. Myciel- ski [13] exhibited a graph M (k) for which !(M (k) ) = 2 and (M (k) ) =k for each integerk 2. In order to find bounds for!(G) the following node col- oring was proposed in [21]. Let us choose an integers 2 and consider a coloring of the nodes ofG that satisfies the following conditions. (1 0 ) Each vertex ofG is colored with exactly one color. (2 0 ) If the verticesv 1 ;:::;v s are vertices of a clique inG, then all the verticesv 1 ;:::;v s cannot receive the same color. A coloring of the nodes ofG satisfying the conditions (1 0 ), (2 0 ), is called a monochrome s-clique free coloring. In short we will talk abouts-clique free coloring. Fors = 2 the monochromes-clique free coloring of the nodes gives back the legal coloring of the nodes. There is a well de- fined minimum numberk such that the nodes ofG have an s-clique free coloring usingk colors. Thisk is referred to as the thes-clique free chromatic number ofG and it is de- noted by (s) (G). The inequality!(G) (s 1) (s) (G) shows thats-clique free coloring of the nodes can be used to establish upper bound for the clique number. A number of problems is considered in connection with coloring the nodes of a graph customarily. Problem 5. Given a finite simple graphG and given a pos- itive integerk. Decide whether the nodes ofG admit a legal coloring usingk colors. Problem 6. Given a finite simple graph G. Determine (G). It is a well known result of the complexity theory of al- gorithms that Problem 5 belongs to the P complexity class for k = 2 and it belongs to the NP-complete complexity class fork 3. (See for example [15].) It follows that for k 3 Problem 6 is NP-hard. Problem 7. Given a finite simple graphG and two positive integerss,k. Decide if the nodes ofG have a legals-clique free coloring withk colors. It was established in [23] that fork = 3,s 3 Problem 7 is NP-complete. 2 A Mycielski type result As we have already mentioned the chromatic number can be a poor upper estimate of the clique number. By My- cielski’s construction there are 3-clique free graphs with arbitrarily large chromatic number. P. Erd˝ os [5] general- ized this result. Let us call the length of a shortest cordless circle in a graph the girth of the graph. Clearly, the girth of a 3-clique free graph must be at least 4. Erd˝ os has proved that for given positive integersk andl, there is a finite sim- ple graphG with girth(G) l, (G) k. Erd˝ os’s proof is not constructive and so it is not at all straight forward how the resulting graphs could be used in constructing test instances. In this section we present another extension of Myciel- ski’s result. We replace the legal coloring of the nodes of a graph by a legals-clique free coloring of the nodes of the graph. Consequently, the s-clique free chromatic number (s) (G) will play the role of the chromatic number (G). The result is motivated by the fact that one might try to construct clique search test instances by replacing the Mycielski graph by the graph emerging from the proof the generalized version. Theorem 1. Let us choose two positive integers s and k with s 3 and k 2 (s 1) =(s 1). There is a finite simple graph L (s;k) such that !(L (s;k) ) 2 (s 1) and (s) (L (s;k) ) k. The reader may notice that the graph L (2;k) is isomor- phic to the Mycielski graphM (k) . Proof. The proof will be constructive. We start with the special case s = 3. We choose an integer k for which k 2 (3 1) =(3 1) = 2. LetM (k) be the Mycielski graph with parameter k. Let u 1 ;:::;u n be the nodes of M (k) . We substitute the nodeu i ofM (k) by an isomorphic copy M (k) i of the Mycielski graph for eachi, 1 i n. Let v i;1; ;:::;v i;n be the nodes of M (k) i . We assume that the nodes v 1;1 ;:::;v 1;n ;:::;v n;1 ;:::;v n;n are pair-wise distinct. These nodes will be the nodes of the graphL (3;k) . The edges of M (k) i are going to be edges of L (3;k) for each i, 1 i n. Further, whenever the unordered pairfu i ;u j g is an edge of M (k) , then we add the edge fv i; ;v j; g toL (3;k) for each , , 1 ; n. The dedicated reader will not fail to notice that the con- struction we just presented is the so called lexicographic product of the graphsM (k) andM (k) . We claim that!(L (3;k) ) 4. In order to verify the claim we assume on the con- trary that !(L (3;k) ) 5. Let C be a 5-clique in L (3;k) . Set V i = fv i;1; ;:::;v i;n g. Note that the set V i induces M (k) i inL (3;k) as a subgraph ofL (3;k) . From the fact that !(M (k) i ) 2 it follows thatC may have at most 2 nodes inV i for eachi, 1 i n. ThereforeC has nodes in some M (k) i for at least 3 distinct values ofi. Benchmark Problems for Exhaustive Exact. . . Informatica 43 (2019) 177–186 179 Suppose that i and j are distinct numbers in the set f1;:::;ng. A node in M (k) i and a node in M (k) j can be adjacent only if the unordered pairfu i ;u j g is an edge of M (k) . This means thatM (k) contains a 3-clique. ButM (k) does not contain any 3-clique. This contradiction com- pletes the verification of the claim. We claim that (3) (L (3;k) ) k. In order to prove the claim let us assume on the contrary that (3) (L (3;k) ) k 1. Set V = V 1 [[ V n = fv 1;1 ;:::;v 1;n ;:::;v n;1 ;:::;v n;n g: Suppose that the mapf : V !f1;:::;k 1g defines a 3-clique free coloring of the nodes ofL (3;k) . The restriction of f to the set V i is a map g i : V i ! f1;:::;k 1g. Clearly, the mapg i defines a coloring of the nodes of the graphM (k) i . From the fact that (M (k) i ) k it follows that there are two distinct adjacent nodes of M (k) i such that the two nodes receive the same color c i . SetU =fu 1 ;:::;u n g. Using the colorc i we define a map h : U !f1;:::;k 1g. We seth(u i ) = c i for eachi, 1 i n. Note that the maph defines a legal coloring of the nodes of the graphM (k) . The only thing which needs verification is that ifu i andu j are distinct adjacent nodes ofM (k) , then c i 6=c j . Let us assume on the contrary thatu i andu j are distinct adjacent nodes ofM (k) andc i = c j . The graphM (k) i has two distinct adjacent nodesv i;i(1) andv i;i(2) such that f(v i;i(1) ) =f(v i;i(2) ) =c i : Similarly, the graphM (k) j has two distinct adjacent nodes v j;j(1) andv j;j(2) such that f(v j;j(1) ) =f(v j;j(2) ) =c j : Note that the nodes v i;i(1) , v i;i(2) , v j;j(1) , v j;j(2) are the nodes of a 4-clique inL (3;k) . This means that the coloring defined by the mapf is not a 3-clique free coloring of the nodes of L (3;k) . This shows that the coloring defined by the maph is a legal coloring of the nodes ofM (k) . The coloring defined byh is using at mostk 1 colors. This contradicts the fact that (M (k) ) k. Thus we may conclude that (3) (L (3;k) ) k as we claimed. Let us turn to the special cases = 4. We choose a integer k for whichk 2 (4 1) =(4 1), that is,k 3. LetM (k) be the Mycielski graph with parameterk. Letu 1 ;:::;u n be the nodes ofM (k) . We substitute the nodeu i ofM (k) by an isomorphic copyL (3;k) i of the graphL (3;k) for each i, 1 i n. Letv i;1; ;:::;v i;m be the nodes ofL (3;k) i . We assume that the nodes v 1;1 ;:::;v 1;m ;:::;v n;1 ;:::;v n;m are pair-wise distinct. These nodes will be the nodes of the graphL (4;k) . The edges ofL (3;k) i are going to be edges ofL (4;k) for each i, 1 i n. Further, whenever the unordered pairfu i ;u j g is an edge of M (k) , then we add the edge fv i; ;v j; g toL (4;k) for each , , 1 ; m. We claim that!(L (4;k) ) 8. In order to verify the claim we assume on the contrary that !(L (4;k) ) 9. Let C be a 9-clique in L (4;k) . Note that the setV i =fv i;1; ;:::;v i;m g inducesL (3;k) i inL (4;k) as a subgraph ofL (4;k) . From the fact that!(L (3;k) i ) 4 it follows thatC may have at most 4 nodes inV i for each i, 1 i n. ThereforeC has nodes in someL (3;k) i for at least 3 distinct values ofi. Suppose that i and j are distinct numbers in the set f1;:::;ng. A node in L (3;k) i and a node in L (3;k) j can be adjacent only if the unordered pairfu i ;u j g is an edge of M (k) . This means that M (k) contains a 3-clique. But M (k) does not contain any 3-clique. This contradiction completes the proof of the claim. We claim that (4) (L (4;k) ) k. In order to prove the claim let us assume on the contrary that (4) (L (4;k) ) k 1. Set V = V 1 [[ V n = fv 1;1 ;:::;v 1;n ;:::;v n;1 ;:::;v n;n g: Suppose that the mapf : V !f1;:::;k 1g defines a 4-clique free coloring of the nodes ofL (4;k) . The restriction of f to the set V i is a map g i : V i ! f1;:::;k 1g. Clearly, the map g i defines a coloring of the nodes of the graph L (3;k) i . From the fact that (3) (L (3;k) i ) k it follows that there is a 3-clique inL (3;k) i such that the 3 nodes of the clique receive the same color c i . SetU =fu 1 ;:::;u n g. Using the colorc i we define a maph :U!f1;:::;k 1g. We seth(u i ) =c i for each i, 1 i n. Note that the maph defines a legal coloring of the nodes of the graphM (k) . The only thing which needs verification is that ifu i andu j are distinct adjacent nodes ofM (k) , then c i 6=c j . Let us assume on the contrary thatc i = c j . The graph L (3;k) i has 3 distinct pair-wise adjacent nodesv i;i(1) ,v i;i(2) , v i;i(3) such that f(v i;i(1) ) =f(v i;i(2) ) =f(v i;i(2) ) =c i : Similarly, the graphL (3;k) i has 3 distinct pair-wise adjacent nodesv j;j(1) ,v j;j(2) ,v j;j(2) such that f(v j;j(1) ) =f(v j;j(2) ) =f(v j;j(3) ) =c j : Note that the nodes v i;i(1) ;v i;i(2) ;v i;i(3) ;v j;j(1) ;v j;j(2) ;v j;j(3) are the nodes of a 6-clique inL (4;k) . This means that the coloring defined by the mapf is not a 4-clique free coloring of the nodes ofL (4;k) . This shows that the coloring defined by the maph is a legal coloring of the nodes ofM (k) . 180 Informatica 43 (2019) 177–186 S. Szabo et al. In the coloring defined by h most k 1 colors occur. This contradicts the fact that (M (k) ) k. Thus we may draw the conclusion that (4) (L (4;k) ) k as we claimed. Continuing in this way we can complete the proof of the theorem. 2 3 Test problems Problems 2 and 3 are commonly referred as k-clique and maximum clique problems, respectively. As we have pointed out it is a well known result of the complexity the- ory of algorithms that the maximum clique problem is NP- hard. Loosely speaking it can be interpreted such that the maximum clique problem is computationally demanding. As at this moment there are no readily available mathe- matical tools to evaluate the performance of practical clique search algorithms, the standard procedure is to carry out numerical experiments on a battery of well selected bench- mark tests. The most widely used test instances are the Erd˝ os–Rényi random graphs, graphs from the the second DIMACS chal- lenge 1 [8], combinatorial problems of monotonic matrices [25, 22], and hard coding problems of Deletion-Correcting Codes 2 [20]. Evaluating the performances of various clique search al- gorithms is a delicate matter. On one hand one would like to reach some practically relevant conclusion about the competing algorithms. On the other hand this conclusion is based on a finite list of instances. One has to be ever cautious not to draw overly sweep- ing conclusions from these inherently limited nature ex- periments. (We intended to contrast this approach to the asymptotic techniques which are intimately tied to infinity.) The situation is of course not completely pessimistic. Af- ter all, these benchmarks were successful at shedding light on the practicality of many of the latest clique search pro- cedures. However, we should strive for enhancing the test procedures. The main purpose of this paper is to propose new benchmark instances. There are occasions when we are trying to locate a large clique in a given graph such that the clique is not necessar- ily optimal. This approach is referred as non-exact method to contrast it to the exhaustive search. For instance con- structing a large time table in this way can be practically important and useful even without a certificate of optimal- ity. The benchmark tests are of course relevant in connec- tion with non-exact procedures too. In order to avoid any unnecessary confusion we would like emphasize that in this paper we are focusing solely on the exact clique search methods. Let n be a positive integer and let p be a real number such that 0 p 1. An Erd˝ os-Rényi random graph with 1 ftp://dimacs.rutgers.edu/pub/challenge/ 2 http://neilsloane.com/doc/graphs.html parametersn,p is a graphG with vertices 1; 2;:::;n. The probability that the unordered pairfx;yg is an edge ofG is equal top for eachx,y, 1 x