Informatica 37 (2013)245-251 245 Influence of CNF Encodings of AtMost-1 Constraints on UNSAT-based PMSAT Solvers Mohamed El Bachir Menai Department of Computer Science, College of Computer and Information Sciences King Saud University, P.O.Box 51178, Riyadh 11543, Saudi Arabia E-mail: menai@ksu.edu.sa http://faculty.ksu.edu.sa/menai Tasniem Nasser Al-Yahya Department of Computer Science, College of Computer and Information Sciences King Saud University, P.O.Box 51178, Riyadh 11543, Saudi Arabia E-mail: talyahya@ccis.imamu.edu.sa Keywords: artificial intelligence, satisfiability problems, constraint satisfaction, boolean cardinality constraint, CNF encoding, AtMost-1 constraints Received: July 7, 2012 Partial maximum Boolean satisfiability (Partial MaxSAT or PMSAT) is an optimization variant of Boolean Satisability (SAT). It asks to find a variable assignment that satisfies all hard clauses and the maximum number of soft clauses in a Boolean formula. .Several exact PMSAT solvers have been developed since the introduction of the MaxSAT evaluations in 2006, based mainly on the Davis- Putnam-Logemann-Loveland (DPLL) procedure and branch and bound (B&B) algorithms. One recent approach that provides an alternative to B&B algorithms is based on unsatisfiable (UNSAT) core identification. All PMSAT algorithms based on UNSAT identification are dependent on two essential external components: (1) a cardinality constraint encoder for encoding AtMost-1 constraints into conjunctive normal form (CNF); and (2) a SAT solver. Ensuring the effectiveness of both components directly affects the performance of the PMSAT solver. Whereas great advances have been made in PMSAT algorithms based on UNSAT core identification, only a few research work has been conducted to understand the influence of CNF encoding methods on the performance of PMSAT solvers. In this paper, we investigate the influence of three CNF encoding methods for AtMost-1 constraints on an UNSAT-based PMSAT solver. We implement the solver using the pairwise, parallel, and sequential encodings, and evaluate its performance on industrial instances. The experimental results show the impact of the CNF encoding method on the performance of the PMSAT solver. Overall, the best results were obtained with the sequential encoding. Povzetek: Predstavljena je nova metoda PMSAT, t.j. optimirane variante izpolnjivosti Boolovih enacb (SAT). 1 Introduction The Boolean satisfiability (SAT) problem is the core of computationally intractable NP-complete problems [7]. Informally, SAT asks if a Boolean expression can be made True by assigning Boolean values to its variables. The maximum satisfiability (MaxSAT) [12] problem is an optimization version of SAT, which consists in finding an assignment that maximizes the number of clauses that are True in a CNF formula. In recent years there has been an increasing interest in designing and implementing MaxSAT solvers. Indeed, state-of-the-art MaxSAT solvers are able to solve large number of instances that were beyond the reach of solvers developed just few years ago. The partial maximum satisfiability (PMSAT) [5, 22] is a problem between SAT and MaxSAT that is better suited for modeling and solving over constrained problems such as packing, planning, and scheduling. PMSAT instances can be solved using either an exact or stochastic local search method. Exact methods are complete, but are limited to small problem instances. Many exact methods for PMSAT have been introduced in the 2007-2011 SAT conferences, e.g. [10, 14, 15]. Stochastic local search methods for PM-SAT can handle very large problem instances, but do not guarantee that an optimal solution will be provided, especially as the instance grows in size [5, 21]. In 2007, the MaxSAT evaluation devoted a special track for exact PMSAT solvers. Most PMSAT solvers submitted to past MaxSAT evaluations are based on B&B methods, e.g. IncWMaxSatz [14, 15] and WMaxSatz+ [13], or DPLL methods, e.g. QMaxSAT and PM2 [1]. The 20072011 MaxSAT evaluations show that PMSAT solvers based on B&B methods are superior in the random category, while PMSAT solvers based on DPLL are the best in the 248 Informatica 37 (2013) 245-251 M.B. Menai et al. industrial category. DPLL-based solvers for PMSAT are built on top of powerful DPLL-based SAT solvers. Many take advantage of the improvements to DPLL-based SAT solvers, including unsatisfiable (UNSAT) core generation. All PMSAT algorithms based on UNSAT core generation are dependent upon two essential external components: (1) a CNF encoder for Boolean cardinality constraints, which expresses constraints in CNF representation; and (2) a SAT solver, which checks the satisfiability state of the SAT instance and extracts UNSAT cores. This paper studies the influence of three CNF encoding methods for AtMost-1 constraints on UNSAT-based PM-SAT solvers and gives experimental evidence of their impact on the performance of the PMSAT solvers. The rest of the paper is structured as follows. Brief overview of the topics addressed in the paper, namely the CNF encoding problem and PMSAT algorithms using UNSAT core generation are given in Section 2. Related work to CNF encoding methods for AtMost-k constraints on SAT solvers is presented in Section 3. The proposed method and the design of a PMSAT solver with three different encodings for the AtMost-1 constraint are explained in Section 4. Experimental results are presented and analyzed in Section 5. The paper is concluded in Section 6. 2 Background Boolean cardinality constraints state that at most one Boolean variable (AtMost-1) is allowed to be True). Many problems are expressed by CNF clauses and AtMost-1 constraints, such as mixed Horn formulas, planning, and PMSAT [2]. Because it was generally believed that solving such problems through pure CNF encoding is inefficient, many authors have proposed specialized algorithms: Pseudo-Boolean solvers. However, it has been shown [3] that an appropriate CNF encoding method and a robust SAT solver can provide a competitive approach, thus allowing modern SAT techniques, e.g. clause learning, restarts, etc., to be fully functional without the necessity of adapting PM-SAT solvers within a mixed ad-hoc solver. PMSAT was first defined in 1996 by Miyazaki et al. [22], in the context of the optimization of database queries. A PMSAT instance is a CNF formula in which some clauses are soft and the rest are hard. Solving a PMSAT instance consists in finding an assignment that satisfies all the hard clauses and the maximum number of soft clauses. The following sub-sections discuss the properties of a CNF encoding method and give a quick overview on UNSAT-based PMSAT methods. 2.1 CNF encoding problem The goal of the CNF encoding problem is the correct and efficient translation of a Boolean cardinality constraint over a set of Boolean variables into a CNF formula. Although there is no general definition of a good encoding technique, such encodings are generally judged in terms of their correctness, consistency enforcing level, and encoding size. 2.1.1 Correctness Correctness is an essential property that must be preserved in CNF encoding methods. It is formally presented in Definition 1. Definition 1. Given a CNF formula F over X = {xi, x2,..., xn}, F is said to encode a Boolean cardinality constraint, e.g. (xi,x2, ...,xn) < k, correctly if and only if, for any complete truth assignment I on X, I satisfies F if and only if I satisfies (xi,x2, ...,xn) < k. 2.1.2 Consistency enforcing level Consistency enforcing methods appeared in the context of constraint satisfaction problems (CSPs) as inference methods to assist searching. It has been shown that applying consistency enforcing methods to CSP instances defined with constraints only in extension, i.e., Boolean cardinality constraints, is equivalent to applying unit propagation to a polynomial SAT encoding of the constraints [11]. Consistency enforcing methods that infer constraints based on pairs of variables are referred to as arc consistency (AC) algorithms. AC ensures that any legal value in the domain of a single variable has a legal match in the domain of any other selected variable. The strength of consistency achieved by the unit propagation rule is reflected in the DPLL performance in many problems. However, achieving stronger consistency is not always beneficial, because of the additional time and space overheads. Thus, there is a trade-off between the effort spent on consistency and that spent on subsequent DPLL search. To date, according to the present research, AC and AC+ are the only consistency enforcing methods incorporated into CNF encoding methods. The AC property is defined below: Definition 2. Given a CNF formula F over X = {xi,x2, ...,xn}, F is said to preserve the AC property when encoding a Boolean cardinality constraint, e.g. (xi,x2,...,xn) < k if and only if, for any partial truth assignment I on X, unit propagation restores AC for (xi,x2, ...,xn) < k, specifically: 1. Unit propagation produces an empty clause when more than k variables in X are assigned to be True. 2. Unit propagation assigns to False all other variables in X when k variables in X are assigned to be True. 3. Unit propagation assigns to True all other variables in X when n — k variables in X are assigned to be False. The definition of the AC+ property is given below: Definition 3. Given a CNF formula F over X = {xi,x2, ...,xn}, F is said to preserve the AC+ property when encoding a Boolean cardinality constraint, e.g. Influence of CNF encodings of AtMost-1 constraints ... Informatica 37 (2013) 245-29 249 (xi, x2,..., xn) < k, if and only if, for any complete truth assignment I on X, applying unit propagation on F assigns the same variables of X as restoring AC+ for (xi,x2, ...,xn) < k, specifically: 1. Whenever an empty clause is not generated, and 2. All the variables of X are assigned. Table 1: Summary of CNF encoding methods. The table entry n indicates: Number of variables to be encoded. The columns "Auxiliary variables" and "Auxiliary clauses" indicate the number of new variables and clauses introduced by the encoding, respectively. The "AC/AC+" column shows whether unit propagation for the encoding enforces AC, AC+, both, or neither. CNF encoding Auxiliary Auxiliary AC /AC+ method variables clauses Pairwise none n(n-1) Both Sequential n — 1 2 n + n — 4 Both Parallel 2 n — 1 < 7 n — 3 [log nj — 6 none 2.1.3 Encoding size In practice, reducing the size, i.e., the number of literals or variables, of the resultant CNF formula in an encoding does not guarantee enhanced performance, regardless of how this is measured. Nevertheless, small encoding size is worth aiming for; because computational results can be unpredictable, all else being equal, a smaller encoding is preferable. The three CNF encoding methods of AtMost-1 cardinality constraints considered in this paper are: pairwise, parallel counter, and sequential counter encoding methods [24]. Table 1 summarizes these methods and their properties. The encodings are presented in the order in which they were introduced. 2.2 UNSAT-based PMSAT solvers UNSAT-based methods consist in using iteratively SAT solvers to identify and relax UNSAT formulas in PMSAT instances. Any UNSAT subset of clauses in an UNSAT formula is called an UNSAT core. Fu and Malik [10] proposed the first UNSAT-based solver for PMSAT, that consists of the following steps: (1) Identification of UNSAT sub-formulas; (2) Relaxation of each clause in each UN-SAT sub-formula by adding a relaxation variable to each clause in an UNSAT sub-formula; and (3) Addition of new Equals-1 constraint indicating that exactly one of these relaxation variables can be assigned the value True. The 2008-2011 MaxSAT evaluations have seen many competitive PMSAT solvers based on the Fu&Malik method, including MSU1.2 [18, 19], MSU4.0 [18], MSUn-Core [16, 18-20], PM2 [1], and WPM1 [1]. 3 Related work This section provides some insights into works that compare the performance of various CNF encoding methods for AtMost-k constraints, for different MaxSAT instances. In [18, 19], sequential counters [24], sorter networks, binary decision diagram (BDD), and bitwise encodings [9] are proposed as alternative encodings to the pairwise encoding used in the Fu&Malik algorithm. The experiments conducted on some selected MaxSAT instances show that the best results were obtained with BDD and bitwise encodings. According to the study in [18], one may conclude that the bitwise encoding is the most appropriate for AtMost-1 constraints. Marques-Silva and Planes [20], compare BDD and sorting network encoding, in terms of CPU time. Their results show that sorting network encoding is less time consuming than BDD encoding, with a few outliers. The pairwise, bitwise [9], and sequential counters [24] encodings are tested on a wide range of industrial MaxSAT instances [23]. The results indicate that bitwise encoding is the best one, as already suggested in [18], whereas the performance of the sequential counters is considered as quite good. 4 Proposed method All UNSAT-based PMSAT solvers are built on top of powerful SAT solvers. Thus, the studies in [6, 8, 17] have motivated the investigation of CNF encoding methods of Boolean cardinality constraints on the performance of UNSAT-based PMSAT solvers. The most appropriate combination of: (1) Boolean cardinality constraint type; (2) CNF encoding method; and (3) SAT solver; are based on the arguments below. Marques-Silva and Planes [19] have shown that clauses introduced by Equals-1 constraint used in the original Fu&Malik algorithm [10] can be reduced by replacing it with an AtMost-1 constraint, while maintaining the correctness of the algorithm. In order to quantify the impact of CNF encoding methods on the performance of UNSAT-based PMSAT solvers, we implement and test a PMSAT solver based on the following CNF encoding methods for AtMost-1 constraint: pairwise, parallel, and sequential encodings [24]. In the literature only few competitive SAT solvers generating UNSAT cores are described. Research and discussions with some developers of SAT solvers led to the selection of PicoSAT-936 [4] as the SAT/UNSAT solver that will be used to extract UNSAT cores for our PMSAT solver. PicoSAT-936 [4] is a state-of-the-art conflict-driven clause learning (CDCL) SAT solver that comes with the PicoMUS utility to compute minimal UNSAT cores. 248 Informatica 37 (2013) 245-251 M.B. Menai et al. 1 Fu&Malik (a) Input : a is PMSAT instance Output: if a satisfiable then I Number of soft clauses falsified in a else i ro 2 Optimal = 0 3 while True do 4 5 6 7 8 9 10 11 12 13 14 15 16 17 (aSAT) = PMSATtoSAT(a) (call PMSATtoSAT to convert PMSAT instance a to SAT instance aSAT) State = PicoSAT-936(asat) (call PicoSAT-936 solver & return True if a satisfiable, otherwise False) if State == True then I return Optimal auNSAT = PicoMUS(asAT) (call PicoMUS utility & return minimal UNSAT cores auNSAT) ß = 0 foreach soft_clause G auNSAT do ß = ß U anew (anew is a new auxiliary variable) a = (a\soft_clause) U (soft_clause V anew) if ß == 0 then I return