Informatica 37 (2013) 267-278

Random Search Algorithm for the p-Median Problem

Alexander N. Antamoshkin and Lev A. Kazakovtsev
Siberian State Aerospace University, prosp. Krasnoyarskiy rabochiy, 31, Krasnoyarsk 660014, Russian Federation
E-mail: levk@ieee.org

Keywords: continuous location problem, Weber problem, random search, genetic algorithms, discrete optimization

Received: November 5, 2012

The authors investigate the p-median location problem on networks and propose a heuristic algorithm, based on the probability changing method (a special case of the genetic algorithm), for an approximate solution of the problem. The ideas of the algorithm are proposed under the assumption that, in large-scale networks with comparatively small edge lengths, the p-median problem has features similar to the Weber problem. The efficiency of the proposed algorithm and of its combinations with the known algorithms was demonstrated by experiments.

Povzetek: The authors propose a new heuristic algorithm for finding the p-median in networks.

1 Introduction

The aim of a location problem [13] is to determine the location of one or more new facilities in a set of possible locations (discrete or continuous). The main parameters of such problems are the coordinates of the facilities and the distances between them [37, 14, 15]. Examples of location problems include the location of warehouses [15], computer and communication networks, base stations of wireless networks [30] etc. They are also useful in approximation theory, statistical estimation problems [28], signal and image processing and other engineering applications. The Fermat-Weber problem (Weber problem) [35, 37] is the problem of searching for a point such that the sum of weighted distances from this point to the given points (demand points) is minimal. In the location theory, several generalizations of these problems are known [11].
The first one is the multi-Weber problem where the aim is to find optimal locations of p new facilities:

F(X_1, \dots, X_p) = \sum_{i=1}^{N} w_i \min_{j \in \{1,\dots,p\}} L(X_j, A_i) \to \min.   (1)

Here, {A_i | i = 1, ..., N} is the set of the demand points, {X_j | j = 1, ..., p} is the set of newly placed facilities, w_i is a weight coefficient of the ith demand point, and L(·) is a distance function. One of the problems of the discrete location theory, the p-median problem, can also be considered as a generalization of the Weber problem [17, 23]. The medians are searched for in a finite set of graph vertices. In general, this problem is NP-hard [24]. A polynomial-time algorithm has been developed for trees only [20]. A network-flow procedure (an algorithm for solving the p-median problem) has been adapted for location problems in a continuous space with the rectangular metric [10]. Despite the complexity of the problem, various heuristic algorithms give good results for most problems in reasonable time. One of the simplest but efficient heuristics for the p-median problem is local search [31, 32]. Rabbani [29] proposes an algorithm based on a new graph-theoretic approach for small-size problems. Lagrangian relaxation allows approximate solving of huge-scale problems [5] with up to 90000 vertices in a network. However, "good" solutions [6] were achieved by an analogous technique for problems with n = 3795 which were also considered as large-scale problems. Hosage and Goodchild [19] proposed the first genetic algorithm for the p-median problem. In [9], the authors propose a genetic algorithm providing rather precise results, but its convergence is slow. In [1], the authors propose a quick and precise genetic algorithm for the p-median problem. However, solving large-scale problems still takes too much time. In [14], a continuous problem with an arbitrary lp metric is solved. In [24] and [25], the authors prove that the unconstrained Weber problems with Euclidean or rectangular metric are NP-complete.
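As a toy illustration of objective (1), the sketch below evaluates the multi-Weber objective for given facility locations. Euclidean distance is assumed for L, and all names in the code are ours, not the paper's:

```python
import math

def multi_weber_objective(facilities, demand_points, weights):
    """Objective (1): weighted sum, over all demand points A_i, of the
    distance from A_i to its nearest facility X_j (Euclidean L assumed)."""
    total = 0.0
    for (ax, ay), w in zip(demand_points, weights):
        nearest = min(math.hypot(ax - fx, ay - fy) for fx, fy in facilities)
        total += w * nearest
    return total

# Toy instance: three demand points, two facilities placed on two of them.
A = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
w = [1.0, 1.0, 2.0]
X = [(0.0, 0.0), (4.0, 0.0)]
print(multi_weber_objective(X, A, w))  # 1*0 + 1*0 + 2*3 = 6.0
```

Minimizing this function over the facility coordinates is what distinguishes the continuous problem from the discrete p-median setting discussed below.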
For the continuous location problems with Euclidean (l2), rectangular (l1) and Chebyshev (l∞) metrics, algorithms based on the Weiszfeld procedure have been proposed [36]. However, a solution of the same problems with restricted zones [39], barriers [8] or modified distance functions is not trivial. In practically important problems, models based on the Euclidean or rectangular metric are usually rough approximations [27, 38] since they do not take into account the characteristics of the space and of the transportation means, in particular, the presence and quality of roads, barriers, relief etc. Sometimes, the distance function is given algorithmically as a solution of another optimization problem [38]. Thus, if the problem is formulated as a Weber problem, its discretization is often useful [21]. We can always consider practical problems as discrete: any scheme (or map) has finite resolution, a digital copy of any map is always discrete, and the implementation of the obtained solution of a problem is also performed by tools with finite precision. However, such a discrete problem is a large-scale one, and no algorithm is known that guarantees an exact solution in polynomial time (a polynomial-time algorithm exists for a discretized generalized Weber problem with rectangular metric only [10]). The unconstrained location problems with mixed coordinate systems (discrete and continuous) are considered in [33, 16]. Heuristic random search algorithms do not guarantee an exact solution. However, they are statistically optimal, i.e., the percentage of "near optimal" solutions increases with the growth of the problem dimension [3]. In the case of discrete location problems, genetic algorithms [30, 26], greedy search [26, 18] and other methods are implemented. The probability changing method, initially proposed for unconstrained optimization, is a random search method described by the pseudo-code below. Algorithm 1.
Basic probability changing method
1: Set k = 0; set the initial values of the components of the probability vector P_0 = {p_{0,1}, p_{0,2}, ..., p_{0,n}} (here, the value of each element at the kth step is the probability (expectation) of generating a vector X with the corresponding element equal to 1: p_{k,j} = \mu\{x_j = 1\});
2: in accordance with the distribution given by the elements of the vector P_k, generate a set of N_pop vectors X_{k,i}, i \in {1, ..., N_pop};
3: calculate the value of the objective function F(X_{k,i}) \forall i \in {1, ..., N_pop};
4: select some vectors X_{k,i} (for example, a vector with the best and the worst value of the objective function);
5: based on the results of Step 4, modify the probability vector P;
6: k = k + 1; if k < N_steps then go to Step 2 (other stop conditions are also possible);
7: STOP.

This simple method is a special variant of the genetic algorithm [12]. The modifications of this algorithm for constrained optimization problems proposed in [22] can solve pseudo-Boolean problems (multi-knapsack problem, travelling salesman problem) with dimensions up to millions of Boolean variables.

2 Problem statement, known algorithms

Let G = (V, E) be an undirected connected graph (a network), V = {v_1, ..., v_n} be the set of its vertices (Fig. 1) and E = {e_i | i = 1, ..., m} be the set of its edges, e_i = (v_j, v_k), j, k \in {1, ..., n}, without loops (e_i \ne (v_j, v_j) \forall i = 1, ..., m, j = 1, ..., n). For each edge e_i, its length l_i is defined, l_i > 0 \forall i = 1, ..., m. For an edge e_i = (v_j, v_k), let us denote l_{j,k} = l_i. A weight w_j > 0 is defined for each vertex v_j. For each pair of vertices (v_j, v_k), the distance function L(j, k) is defined as the length of the shortest path from v_j to v_k:

L(j, k) = \sum_{e_q \in P^*_{j,k}} l_q = \min_{P \in \mathcal{P}_{j,k}} \sum_{e_q \in P} l_q.   (2)

Here, P^*_{j,k} is the set of edges of the shortest path between v_j and v_k, and \mathcal{P}_{j,k} is the set of all possible paths between these vertices. We can formulate the p-median problem as

\arg\min_{m_1, \dots, m_p \in \{1,\dots,n\}} f(m_1, \dots, m_p) = \arg\min_{m_1, \dots, m_p \in \{1,\dots,n\}} \sum_{i=1}^{n} w_i \min_{j \in \{1,\dots,p\}} L(m_j, i), \quad p < n.   (3)
The aim of this problem is to select p vertices such that the sum of weighted distances from each vertex of the graph to the nearest selected vertex is minimal. Let S_i = {j | \exists e_j = (v_i, v_k), j \in {1, ..., m}, k \in {1, ..., n}} be the set of indexes of the edges incident to the ith vertex, C_i = {k | \exists e_j = (v_i, v_k), j \in {1, ..., m}, k \in {1, ..., n}} be the set of indexes of the vertices adjacent to the ith vertex, and let

l^*_i = \min_{j \in \{1,\dots,p\}} L(m_j, i)

be the length of the shortest path from the ith vertex to the nearest of the p vertices m_1, ..., m_p. For calculating the value of the objective function f(m_1, ..., m_p), we use the algorithm below.

Algorithm 2. p-median objective function calculation
Require: indexes m_1, ..., m_p of the p selected vertices.
1: l^*_i = +\infty \forall i = 1, ..., n;
2: l^*_{m_i} = 0 \forall i = 1, ..., p;
3: V^* = {m_1, ..., m_p}; V^{**} = V^*;
4: while |V^{**}| \ne 0 do
4.1: V' = \emptyset;
4.2: for i \in V^{**} do
4.2.1: for j \in C_i do
4.2.1.1: if v_j \notin V^* then V' = V' \cup {j}; l^*_j = l^*_i + l_{i,j};
4.2.1.2: else if l^*_j > l^*_i + l_{i,j} then l^*_j = l^*_i + l_{i,j};
4.2.1.3: next 4.2.1;
4.2.2: next 4.2;
4.3: V^{**} = V'; V^* = V^* \cup V';
4.4: next 4;
5: return f(m_1, ..., m_p) = \sum_{i=1}^{n} w_i l^*_i.

Figure 1: Scheme of a problem and its solutions, n = 400: a) solution for p = 32; b) solution for p = 3

For comparison, we used the local search algorithm [32] with a random order of vertex evaluation (Algorithm 3) as one of the simplest but efficient algorithms.

Algorithm 3. Local search
Require: array of indexes M = {m_1, ..., m_p} of the vertices (initial solution), value of the objective function f^* = f(m_1, ..., m_p).
1: shuffle the elements of M; r = 0;
2: for each element m of the array M do
2.1: for each vertex m^* which is adjacent to m do
2.1.1: f^{**} = f(m_1, ..., m^*, ..., m_p) (here, the vertex m is replaced by m^*);
2.1.2: if f^{**} < f^* then replace m by m^* in M; f^* = f^{**}; r = 1;
2.1.3: next 2.1;
2.2: next 2;
3: if r = 1 then go to 1;
4: return the new solution (m_1, ..., m_p).
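Algorithm 2 propagates wavefronts of tentative distances outward from the selected vertices. The same l*_i values can also be computed by a standard multi-source Dijkstra search; the sketch below is our illustrative code, not the authors' implementation:

```python
import heapq

def p_median_objective(n, adj, weights, medians):
    """f(m_1, ..., m_p): weighted sum, over all vertices, of the
    shortest-path distance l*_i to the nearest median.  adj[i] is a list
    of (neighbour, edge length) pairs.  A multi-source Dijkstra search
    (one heap seeded with all medians at distance 0) yields the same
    l*_i values that Algorithm 2 propagates."""
    INF = float("inf")
    dist = [INF] * n
    heap = []
    for m in medians:
        dist[m] = 0.0
        heap.append((0.0, m))
    heapq.heapify(heap)
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                  # stale heap entry
        for u, length in adj[v]:
            if d + length < dist[u]:
                dist[u] = d + length
                heapq.heappush(heap, (d + length, u))
    return sum(w * d for w, d in zip(weights, dist))

# Path graph v0 - v1 - v2 - v3 with unit edge lengths and unit weights:
adj = [[(1, 1.0)], [(0, 1.0), (2, 1.0)], [(1, 1.0), (3, 1.0)], [(2, 1.0)]]
print(p_median_objective(4, adj, [1.0] * 4, [0, 3]))  # l* = (0, 1, 1, 0), f = 2.0
```

This evaluation is the inner loop of every method discussed below (local search, the GA and the probability changing method all call it repeatedly), so its O((n + m) log n) cost dominates the running time.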
The genetic algorithm with greedy heuristic proposed in [1] includes a special crossover procedure (Algorithm 4). The "chromosomes" of this algorithm are sets of vertices (solutions of the problem).

Algorithm 4. Crossover procedure for the GA with greedy heuristic
Require: sets of indexes M_1 = {m_{1,1}, ..., m_{1,p}}, M_2 = {m_{2,1}, ..., m_{2,p}} of the vertices ("chromosomes").
1: M = M_1 \cup M_2; p_m = |M|;
2: while p_m > p do
2.1: f^* = +\infty;
2.2: for each vertex m^* in M do
2.2.1: M^* = M \setminus {m^*};
2.2.2: f^{**} = f(M^*);
2.2.3: if f^{**} < f^* then m^{**} = m^*; f^* = f^{**};
2.2.4: next 2.2;
2.3: M = M \setminus {m^{**}}; p_m = p_m - 1;
2.4: next 2;
3: return the new solution ("chromosome") M.

This method uses an original procedure of initial population generation [1]. It does not use any mutation procedure. The probability changing method is a pseudo-Boolean optimization method: the objective function must be a function of Boolean variables. Let us introduce new variables x_1, ..., x_n:

x_i = 1 if i \in {m_1, ..., m_p}, and x_i = 0 if i \notin {m_1, ..., m_p}.   (4)

In this case, our problem has the constraint

\sum_{i=1}^{n} x_i = p.   (5)

The transformation of the problem back into the problem with integer variables is performed as follows:

{m_1, ..., m_p} = {i | x_i = 1, i = 1, ..., n}.   (6)

The problem with the pseudo-Boolean objective function is stated as follows:

f_B(x_1, ..., x_n) = f({j | x_j = 1, j = 1, ..., n}) = \sum_{i=1}^{n} w_i \min_{j: x_j = 1} L(i, j)   (7)

with the constraint (5). In this form, our problem can be solved using many methods [3, 4, 2, 22] including the probability changing method.

3 An algorithm based on the probability changing method

Continuous location problems such as the multi-Weber problem with Euclidean, rectangular or Chebyshev metric are amenable to analysis and analytical solution methods.
The Weiszfeld procedure [36] and Trubin's procedure for the rectangular metric [34] are based on the assumption that the partial derivatives of the objective function are defined, i.e., a small change of the location of any point of a feasible solution leads to a small change of the value of the objective function. If X = (x_1, x_2) is a solution of some 2D unconstrained location problem, then

\Delta f(X) / \Delta X = \Delta f(x_1, x_2) / \sqrt{\Delta x_1^2 + \Delta x_2^2} \le const.

Let L_max be the maximum distance between two vertices,

L_max = \max_{i,j \in \{1,\dots,n\}} L(i, j),   (8)

and l_avg be the average length of an edge,

l_avg = \sum_{j=1}^{m} l_j / m.   (9)

Our algorithm is based on two hypotheses.

Hypothesis 1. If l_avg/L_max is small (l_avg/L_max \to 0) then the character of the p-median problem shows features of the continuous problem. In particular, if {m_1, ..., m_p} is a solution of the p-median problem then, having replaced any vertex m_i of this solution by any vertex j such that L(j, m_i) is small, the change of the objective function value is also small:

\exists l_\Delta, \Delta F_max > 0 : L(j, m_i) < l_\Delta \Rightarrow |f(m_1, \dots, m_i, \dots, m_p) - f(m_1, \dots, j, \dots, m_p)| < \Delta F_max.

Hypothesis 2. If the solution {m_1, ..., m_i, ..., m_j, ..., m_p} contains two vertices m_i and m_j such that L(m_i, m_j) is small then there is a high probability (expectation) that for the solution {m_1, ..., m_i, ..., k, ..., m_p}, with the vertex m_j replaced by another arbitrarily chosen vertex k, the value of the objective function is "better" than for the original solution:

\exists l_min : L(m_i, m_j) < l_min \wedge L(m_i, k) > l_min \Rightarrow \mu\{f(m_1, \dots, m_i, \dots, m_j, \dots, m_p) > f(m_1, \dots, m_i, \dots, k, \dots, m_p)\} > 0.5.

Let us prove the consistency of the hypotheses for special cases.

Lemma 1. Let L(m_i, j) = 0. Then f(m_1, ..., m_i, ..., m_p) = f(m_1, ..., j, ..., m_p).

Proof. Let us choose an arbitrary vertex i* \in {1, ..., n}. If i* \in {m_1, ..., m_i, ..., m_p} then, obviously,

\min_{i'' \in \{m_1,\dots,j,\dots,m_p\}} L(i*, i'') \le \min\{ \min_{i'' \in \{m_1,\dots,m_i,\dots,m_p\}} L(i*, i'') + L(m_i, j); L(i*, j) \} = \min\{0 + 0; L(i*, j)\} = 0 = \min_{i'' \in \{m_1,\dots,m_i,\dots,m_p\}} L(i*, i'').

If i* \notin {m_1, ..., m_i, ..., m_p}, let us introduce the notation: \mathcal{P}_{i*,j} is the set of all possible paths from the i*th vertex to the jth one, \mathcal{P}_{i*,m_i} is the set of all possible paths from the i*th vertex to the m_ith one, \mathcal{P}_{i*(m_i)j} is the set of all possible paths from the i*th vertex to the jth one through the m_ith vertex, and \mathcal{P}_{i*(\bar{m}_i)j} is the set of all possible paths from the i*th vertex to the jth one which do not include the m_ith vertex. Then

L(i*, j) = \min_{P \in \mathcal{P}_{i*,j}} \sum_{e_k \in P} l_k = \min\{ \min_{P \in \mathcal{P}_{i*(m_i)j}} \sum_{e_k \in P} l_k; \min_{P \in \mathcal{P}_{i*(\bar{m}_i)j}} \sum_{e_k \in P} l_k \} = \min\{ L(i*, m_i) + L(m_i, j); \min_{P \in \mathcal{P}_{i*(\bar{m}_i)j}} \sum_{e_k \in P} l_k \} \le L(i*, m_i) + 0 = L(i*, m_i).

By the symmetric argument with the roles of m_i and j exchanged, L(i*, m_i) \le L(i*, j). Hence

L(i*, m_i) = L(i*, j).   (10)

Since every vertex is at the same distance from m_i and from j, the two values of the objective function coincide. □

Lemma 2. Let {m_1, ..., m_i, ..., m_j, ..., m_p} be a solution of the p-median location problem on a network with n vertices, w_i > 0 \forall i = 1, ..., n, L(m_i, m_j) = 0 and \exists k \in {1, ..., n}: L(m_q, k) > 0 \forall q = 1, ..., p. Then

f(m_1, ..., m_i, ..., m_j, ..., m_p) > f(m_1, ..., m_i, ..., k, ..., m_p).   (11)

Proof. Let us introduce the notation M_0 = {m_1, ..., m_i, ..., m_j, ..., m_p}, M_1 = {m_1, ..., m_i, ..., k, ..., m_p} and define the function f_m(i', S) = \min_{i'' \in S} L(i'', i'). Its value for the sets denoted above is

f_m(i', M_0) = \min\{ f_m(i', \{m_i\}); f_m(i', \{m_j\}); f_m(i', M_0 \setminus \{m_i, m_j\}) \} \forall i' \in \{1, \dots, n\}.

Taking into account L(m_i, m_j) = 0, from Lemma 1, f_m(i', {m_i}) = f_m(i', {m_j}) \forall i' \in {1, ..., n}. Further, since M_0 \setminus {m_j} = M_1 \setminus {k},

f_m(i', M_0) = \min\{ f_m(i', \{m_j\}); f_m(i', M_0 \setminus \{m_j\}) \} = \min\{ f_m(i', \{m_i\}); f_m(i', M_1 \setminus \{k\}) \} = f_m(i', M_1 \setminus \{k\}) \ge f_m(i', M_1) \forall i' \in \{1, \dots, n\}.

Thus,

f(m_1, ..., m_i, ..., k, ..., m_p) = \sum_{i'=1}^{n} w_{i'} f_m(i', M_1) \le \sum_{i'=1}^{n} w_{i'} f_m(i', M_0) = f(m_1, ..., m_i, ..., m_j, ..., m_p).

Moreover, for i' = k we have f_m(k, M_1) = 0 while f_m(k, M_0) > 0 (since L(m_q, k) > 0 \forall q = 1, ..., p) and w_k > 0, so the inequality is strict. □

In our version of Algorithm 1, Steps 2 and 5 are modified. At Step 2 (generation of the samples of the vectors X), the constraint (5) must be taken into consideration. The solutions generated by the algorithm below are always feasible.

Algorithm 5. Step 2 of Algorithm 1
Require: probability vector P = (p_1, ..., p_n).
1: \chi = \emptyset;
2: for each i \in {1, ..., p} do
2.1: r = Random() \cdot \sum_{j=1}^{n} p_j; S = 0;
2.2: for each j \in {1, ..., n} do
2.2.1: S = S + p_j;
2.2.2: if S \ge r then \chi = \chi \cup {j}; go to 2.3;
2.2.3: next 2.2;
2.3: next 2;
3: return \chi.

The result of this algorithm is a set \chi. The corresponding vector X of Boolean variables can be calculated in accordance with (4). Here, Random() is a generator of a random value with continuous uniform distribution (Random() ~ U[0, 1)). Let L_0 be the maximum distance considered as "small" in terms of Hypothesis 1 and Hypothesis 2. In accordance with Hypothesis 2, the generation procedure should satisfy

\mu\{\exists m_i, m_j \in \chi : L(m_i, m_j) < L_0\} < \mu\{L(m_i, m_j) \ge L_0 \forall m_i, m_j \in \chi\} and \lim_{L^* \to 0} \mu\{\exists m_i, m_j \in \chi : L(m_i, m_j) < L^*\} = 0.

Let us modify Algorithm 5.

Algorithm 6. Step 2 of Algorithm 1, v. 2
Require: probability vector P = (p_1, ..., p_n).
1: \chi = \emptyset;
2: for each i \in {1, ..., p} do
2.1: P^* = P;
2.2: r = Random() \cdot \sum_{j=1}^{n} p^*_j; S = 0;
2.3: for each j \in {1, ..., n} do
2.3.1: S = S + p^*_j;
2.3.2: if S \ge r then \chi = \chi \cup {j}; j' = j; go to 2.4;
2.3.3: next 2.3;
2.4: for each j \in {k | k \in {1, ..., n} \wedge L(j', k) < L_0} do p^*_j = p^*_j \cdot L(j, j') / L_0; next 2.4;
2.5: next 2;
3: return \chi.

Step 5 of Algorithm 1 (adaptation of the probability vector P) at its kth iteration must be performed in accordance with Hypothesis 2.
We use the multiplicative adaptation

p_{k+1,i} = p_{k,i} \cdot d^b_{k,i} / d^w_{k,i},   (12)

where

d^b_{k,i} = 1 + 1/(1 + L(i, i_b(i))) if L(i, i_b(i)) < L_0, and d^b_{k,i} = 1 otherwise;   (13)
d^w_{k,i} = 1 + 1/(1 + L(i, i_w(i))) if L(i, i_w(i)) < L_0, and d^w_{k,i} = 1 otherwise.   (14)

Here,

i_b(i) = \arg\min_{i' \in \chi_b} L(i, i'),   (15)
i_w(i) = \arg\min_{i' \in \chi_w} L(i, i'),   (16)

and \chi_b and \chi_w are the best and the worst samples of the sets of vertex indexes \chi generated by Algorithm 5 or Algorithm 6. In the simplest case,

\chi_b = \arg\min_{\chi' \in X_k} f(\chi'),   (17)
\chi_w = \arg\max_{\chi' \in X_k} f(\chi').   (18)

Here, X_k is the set of all samples of the vector \chi at the kth iteration of Algorithm 1. Note that the discretized Weber problem described in [21] is a special case of the p-median problem considered in this paper.

4 First results, adjusting parameters

For testing purposes, we used the p-median problems generated automatically by the special Algorithm 7.

Algorithm 7. Sample problem generating
Require: n.
1: for i \in {1, ..., n} do
1.1: c^x_i = Random() \cdot 500; c^y_i = Random() \cdot 500; w_i = 1 + 9 \cdot Random();
1.2: if \exists j \in {1, ..., i-1} : (c^x_i - c^x_j)^2 + (c^y_i - c^y_j)^2 < 10 then go to 1.1;
1.3: next 1;
2: fill the adjacency matrix A with zero values; E = \emptyset;
3: for i \in {1, ..., n} do
3.1: D_i = 1 if i \in {[0.6n]+1, ..., n}; D_i = 2 if i \in {[0.4n]+1, ..., [0.6n]}; D_i = 3 if i \in {[0.25n]+1, ..., [0.4n]}; D_i = 4 + [Random() \cdot 4] if i \le [0.25n];
3.2: for d \in {1, ..., D_i - \sum_{j=1}^{n} A_{i,j}} do
3.2.1: j = \arg\min_{j \in \{1,\dots,n\}, A_{i,j}=0, j \ne i} (c^x_i - c^x_j)^2 + (c^y_i - c^y_j)^2; A_{i,j} = 1; A_{j,i} = 1; E = E \cup {(i, j)}; l_{i,j} = \sqrt{(c^x_i - c^x_j)^2 + (c^y_i - c^y_j)^2};
3.2.2: next 3.2;
4: return the adjacency matrix A, the edge set E, the edge lengths {l_{i,j}} \forall (i, j) \in E and the weights w_i \forall i \in {1, ..., n}.

The scheme of such a problem example is shown in Fig. 1. The lengths of the edges are proportional to the ones shown in the scheme; the diameters of the vertices show their weights. In addition, Fig. 1 shows the solutions calculated by our algorithm for p = 32 and p = 3. The vertices selected as the solution are circled. For each of the vertices, the color of the vertex shows the distance to the nearest selected vertex.
The nearest vertices are shown in dark, the farthest are shown in white. For our experiments, we used a Depo X8Sti computer (6-core Xeon X5650 CPU, 2.67 GHz, 12 Gb RAM), hyper-threading disabled, and the ifort compiler with full optimization and implicit parallelism (option -O3). The comparison of the results reached with this hardware configuration with the results of a small system with a 2-core Atom N270 CPU, 1.6 GHz, 1 Gb RAM is shown in Fig. 2 (the combined method "probability changing + GA" is explained in Section 6). Fig. 3 illustrates the change in the probability vector for p = 3. The vertices with a high expectation of being included in the generated solutions are shown in white; the vertices with the smallest expectation are shown in black. The diameter L_0 of the consistency area of Hypothesis 1 and Hypothesis 2 is an important parameter of the algorithm. For the problem with p = 12, the comparison of the algorithm efficiency with various values of L_0 is shown in Fig. 4. The results of running the algorithm with Algorithm 5 as the generating procedure are shown as "L0=0". The best results are achieved with L_0 = 90. The optimal value of L_0 depends on p: with p = 32, the best results are achieved with L_0 = 60. Nevertheless, with a wide range of parameter values L_0 \in [10, 350], the algorithm gives better results than the "classical" probability changing method (the case L_0 = 0). For the further experiments, we used L_0 = L_avg/3 where L_avg is the expectation of the average distance to the closest facility in a randomly generated solution (an arbitrarily chosen set of p vertices {m_1, ..., m_p}):

L_avg = E[ \sum_{i=1}^{n} \min_{j \in \{1,\dots,p\}} L(i, m_j) / n ].   (19)

As an estimate of L_avg, we used the average distance from 10 randomly chosen vertices to the closest vertex of 10 randomly generated sets of p vertices. The calculation of L_max used in the conditions of the Hypotheses 1 and 2 takes significant time.
Instead, we used the condition l_avg/L_avg \to 0, with L_avg estimated by (19), as the condition of applicability of our algorithm.

5 Comparison with the existing methods

For testing purposes, we used the local search method (Algorithm 3) with multistart from randomly generated initial solutions as one of the simplest methods, and the genetic algorithm (GA) [1] with the crossover procedure given by Algorithm 4 (greedy heuristic) as one of the most efficient methods.

Figure 2: Average results for the problem pmed11 (n = 300, p = 5): the probability changing method and the combined method "probability changing + GA, reduced population" on the Xeon 2.67 GHz and Atom 1.6 GHz systems

Figure 3: Probability values change, L_0 = 100, p = 3

As the testbed, we used the p-median problems from the OR Library [7]. The same library was used in [1]. Since this library contains problems with numbers of vertices only up to n = 900, we used our Algorithm 7 for generating larger problems. The results of our algorithm based on the probability changing method used standalone show its slow convergence in comparison with the GA (Fig. 5). The problems "pmed22" and "pmed39" are included in the OR Library; a problem with n = 5000 was generated by Algorithm 7. The figure shows the average values for 10 runs and the worst result. The results of the combined method ("probability changing + GA") are explained in the next section. To calculate the quantity of exemplars of the generated solutions in each population, N_pop, we used the formula

N_pop = \lceil (\sqrt{n}/100) \cdot \lceil n/p \rceil \rceil.   (20)

The GA with greedy heuristic uses the formula

N_pop = \lceil (n/100) \cdot \lceil n/p \rceil \rceil.   (21)

6 Results of combined methods

The genetic algorithm [1] uses a regular method of filling the initial population.
In the case of large-scale problems (pmed39, pmed32, generated problems with n = 2000 and n = 5000), our experiments show that a randomly filled initial population decreases the accuracy of the results and the convergence insignificantly. Our experiments with using the last generated population of the probability changing method as the initial population of the GA with greedy heuristic show a significant speed-up of such a combined algorithm in comparison with the original GA. We used two variants of the population size, the standard population (21) and the reduced population (20). Both variants show a significant speed-up for most problems.

Figure 4: Comparison of the efficiency of the algorithm with various values of L_0, p = 12

Figure 5: Comparison of the probability changing method and its combinations with the GA: pmed22 (n = 500, p = 10, average results), pmed39 (n = 900, p = 10, worst case) and a generated problem (n = 5000, p = 150, average results and worst case); the curves compare "Prob. changing", "Greedy GA" and "Prob. chang. + GA" with standard and reduced populations

The variant with the reduced population shows worse accuracy, but it can be useful for a fast search for an approximate solution of large-scale problems.
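The combined scheme described above (a few iterations of the probability changing method whose last population seeds the GA with greedy crossover) can be sketched as follows. This is a simplified stand-in for Algorithms 1, 4 and 5, not the authors' implementation: the adaptation rule, the population sizes and the toy objective are our illustrative assumptions.

```python
import random

def greedy_crossover(M1, M2, objective, p):
    """Algorithm 4: merge two parent solutions, then repeatedly drop the
    vertex whose removal increases the objective the least, until p remain."""
    M = set(M1) | set(M2)
    while len(M) > p:
        M.remove(min(M, key=lambda m: objective(M - {m})))
    return M

def hybrid(n, p, objective, pop=20, prob_steps=5, ga_steps=30, seed=1):
    """Sketch of the combined method: the probability changing stage pulls
    the probability vector toward the best sample of each step (a toy
    stand-in for the adaptation of Step 5); its last population then
    seeds a steady-state GA that uses the greedy crossover."""
    rnd = random.Random(seed)
    prob = [1.0] * n
    population = []
    for _ in range(prob_steps):
        population = []
        for _ in range(pop):
            sol = set()
            while len(sol) < p:          # roulette sampling of p distinct vertices
                sol.add(rnd.choices(range(n), weights=prob, k=1)[0])
            population.append(sol)
        for v in min(population, key=objective):
            prob[v] *= 1.5               # multiplicative adaptation toward the best
    for _ in range(ga_steps):            # GA stage seeded with the last population
        child = greedy_crossover(*rnd.sample(population, 2), objective, p)
        worst = max(range(pop), key=lambda i: objective(population[i]))
        if objective(child) < objective(population[worst]):
            population[worst] = child
    return min(population, key=objective)

# Toy test: 6 vertices on a line with unit spacing, unit weights, p = 2;
# the optimal pair is {1, 4} with objective value 4.
D = [[abs(i - j) for j in range(6)] for i in range(6)]
obj = lambda M: sum(min(D[i][m] for m in M) for i in range(6))
print(sorted(hybrid(6, 2, obj, pop=10)))
```

The point of the seeding is that the GA starts from a population whose samples already respect the structure learned by the probability vector, instead of from uniformly random solutions.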
We performed 5 steps of Algorithm 1 (N_steps = 5 in Step 6) with the probability adaptation and the generation procedure of Algorithm 5, and used its last population {X_{5,i} | i = 1, ..., N_pop} as the initial population for the GA. The "chromosomes" M \in {X_{5,i} | i = 1, ..., N_pop} are then passed to the crossover procedure (Algorithm 4). The results shown in Fig. 5 and Fig. 6 were calculated for 10 runs of the original and combined algorithms (3 runs for n = 7500). The panels of Fig. 6 cover the problems pmed11 (n = 300, p = 5), pmed17 (n = 400, p = 10), pmed32 (n = 700, p = 10) and generated problems with n = 2000, p = 100 and n = 7500, p = 300 (average and worst cases). The dependency of the average and worst
values achieved by the algorithms on the spent time was calculated for the problems from the OR Library with a comparatively small value of l_avg/L_avg (see (9) and (19)).

Figure 6: Comparison of the original and combined GAs

The results for the problems "pmed11", "pmed12", "pmed14", "pmed16", "pmed18", "pmed19", "pmed21", "pmed23", "pmed35", "pmed31" show analogous tendencies. We used a combination of 3 stop conditions: reaching the best result announced in the OR Library (if such exists), reaching a prescribed number of successive steps which do not improve the best result, or reaching the time limit. For the problem with n = 7500, the results are shown for the combined algorithm with the reduced population only, due to memory allocation problems in the case of the standard population (21). In the case of using the probability changing method, the standard deviation of the objective function in the population decreases faster than in the case of the GA (Fig. 7). In the combined algorithm, the search continues with a comparatively "good" population.

Figure 7: Comparison of the standard deviation of the objective function for the original and combined GAs (pmed17, n = 400, p = 10, average results)

We used the local search (Algorithm 3) with randomly selected initial solutions for testing purposes. Also, the local search was implemented as a procedure of the algorithm based on the probability changing method. We modified Step 2 of Algorithm 1:

2: in accordance with the distribution given by the elements of the vector P, generate a set of N_pop vectors X_{k,i}, i \in {1, ..., N_pop};
2.1: if [k/5] = k/5 and k > 0 then apply Algorithm 3 to each vector X_{k,i};

The results are shown in Fig. 8. Both the original and the combined algorithm ran 10 times. The size of the population of the probability changing algorithm was calculated in accordance with (20). For small-size problems, local search in both variants is more efficient than the GA. For most problems included in the OR Library, the results of the combined method are the same as the results of the local search with multistart (case "a" in Fig. 8) because 2-100 starts of the local search procedure are enough for obtaining the exact solution. An exception is the problems with high density (p/n) and comparatively large size ("pmed19", "pmed24", "pmed25", "pmed29"). In this case (case "b" of Fig. 8), the combined algorithm allows reaching the exact result faster. For the large-scale problems (n > 2000, case "c") generated by Algorithm 7, the combined algorithm gives better results.

7 Conclusion

The proposed algorithm based on the probability changing method is useful for solving the p-median problem in a network under the assumption that the lengths of the edges are much smaller than the expectation of the path length from a randomly selected vertex to the closest vertex of the solution. The very slow convergence of the algorithm obstructs its implementation as a standalone algorithm. However, running it in combination with other algorithms improves their efficiency significantly. Adjusting the parameters of the algorithm is a subject of future research.

References

[1] O. Alp, E. Erkut and Z. Drezner (2003) An Efficient Genetic Algorithm for the p-Median Problem, Annals of Operations Research, Vol. 122, Issue 1-4, pp. 21-42, doi 10.1023/A:1026130003508
[2] A. Antamoshkin and I. Masich (2007) Pseudo-Boolean Optimization in Case of an Unconnected Feasible Set, in Models and Algorithms for Global Optimization, Optimization and Its Applications, Springer Verlag, Vol. 4, pp. 111-122
[3] A. N. Antamoshkin (1987) Optimizatsiya funktsionalov s bulevymi peremennymi (Optimization of Functionals with Boolean Variables), Tomsk University Press
[4] A. Antamoshkin, H. P. Schwefel, A. Toern, G. Yin, A.
Zhilinskas (1993) Systems Analysis, Design and Optimization. An Introduction, Krasnoyarsk, Ofset
[5] P. Avella, M. Boccia, S. Salerno and I. Vasilyev (2012) An Aggregation Heuristic for Large Scale p-median Problem, Computers & Operations Research, 39(7), pp. 1625-1632, doi 10.1016/j.cor.2011.09.016
[6] P. Avella, A. Sassano and I. Vasil'ev (2007) Computational Study of Large-Scale p-Median Problems, Mathematical Programming, 109(1), pp. 89-114, doi 10.1007/s10107-005-0700-6
[7] J. E. Beasley (1990) OR-Library: Distributing Test Problems by Electronic Mail, Journal of the Operational Research Society, 41(11), pp. 1069-1072
[8] M. Bischoff, T. Fleischmann and K. Klamroth (2009) The Multi-Facility Location-Allocation Problem with Polyhedral Barriers, Computers and Operations Research, 36, pp. 1376-1392

Figure 8: Comparison of the local search and the probability changing method with the local search procedure: a) standard problem with low density (pmed11, n = 300, p = 5, average results); b) standard problem with high density (pmed24, n = 500, p = 100, average results); c) large-scale problem (generated, n = 7500, p = 300, average results)

[9] B. Bozkaya, J. Zhang and E. Erkut (2002) A Genetic Algorithm for the p-Median Problem, in Z. Drezner and H. Hamacher (eds.), Facility Location: Applications and Theory, Springer
[10] A. V. Cabot, R. L. Francis and M. A. Stary (1970) A Network Flow Solution to a Rectilinear Distance Facility Location Problem, American Institute of Industrial Engineers Transactions, 2, pp. 132-141
[11] L. Cooper (1968) An Extension of the Generalized Weber Problem, Journal of Regional Science, Vol.
8, Issue 2, pp. 181-197
[12] A. S. Degterev, F. V. Kanashkin and A. D. Sumarokov (2004) Obobshenie geneticheskikh algoritmov i algoritmov skhemy MIVER (Generalization of Genetic Algorithms and Algorithms Based on the Probability Changing Method), Issledovano v Rossii, vol. 2004, pp. 1658-1665
[13] Z. Drezner and H. Hamacher (2004) Facility Location: Applications and Theory, Springer-Verlag, Berlin, Heidelberg
[14] Z. Drezner and G. O. Wesolowsky (1978) A Trajectory Method for the Optimization of the Multifacility Location Problem with lp Distances, Management Science, 24, pp. 1507-1514
[15] R. Z. Farahani and M. Hekmatfar, editors (2009) Facility Location: Concepts, Models, Algorithms and Case Studies, Springer-Verlag, Berlin, Heidelberg
[16] M. Gugat and B. Pfeifer (2007) Weber Problems with Mixed Distances and Regional Demand, Math. Methods of Operations Research, issue 66, pp. 419-449
[17] S. L. Hakimi (1964) Optimum Locations of Switching Centers and the Absolute Centers and Medians of a Graph, Operations Research, 12(3), pp. 450-459
[18] P. Hansen, J. Brimberg, D. Urosevic, N. Mladenovic (2009) Solving Large p-Median Clustering Problems by Primal-Dual Variable Neighborhood Search, Data Mining and Knowledge Discovery, vol. 19, issue 3, pp. 351-375
[19] C. M. Hosage and M. F. Goodchild (1986) Discrete Space Location-Allocation Solutions from Genetic Algorithms, Annals of Operations Research, 6, pp. 35-46
[20] O. Kariv and S. L. Hakimi (1979) An Algorithmic Approach to Network Location Problems. II: The p-Medians, SIAM J. Appl. Math., 37, pp. 539-560
[21] L. A. Kazakovtsev (2012) Adaptation of the Probability Changing Method for Weber Problem with an Arbitrary Metric, Facta Universitatis, series Mathematics and Informatics, vol. 27 (2), pp. 239-254
[22] L.
Kazakovtsev (2012) Random Constrained Pseudo-Boolean Optimization Algorithm for Multiprocessor Systems and Clusters, Proceedings of the IV International Congress on Ultra Modern Telecommunications and Control Systems 2012 (ICUMT), St. Petersburg, 23-25 September 2012, pp. 473-480, doi 10.1109/ICUMT.2012.6459711
[23] V. Marianov and D. Serra (2009) Median Problems in Networks, available at SSRN: http://ssrn.com/abstract=1428362 or http://dx.doi.org/10.2139/ssrn.1428362
[24] S. Masuyama, T. Ibaraki and T. Hasegawa (1981) The Computational Complexity of the m-Center Problems on the Plane, The Transactions of the Institute of Electronics and Communication Engineers of Japan, 64E, pp. 57-64
[25] N. Megiddo and K. Supowit (1984) On the Complexity of Some Common Geometric Location Problems, SIAM Journal on Computing, 13, pp. 182-196
[26] N. Mladenovic, J. Brimberg, P. Hansen, J. A. Moreno-Perez (2007) The p-Median Problem: A Survey of Metaheuristic Approaches, European Journal of Operational Research, Vol. 179, issue 3, pp. 927-939
[27] J. G. Morris (1981) Convergence of the Weiszfeld Algorithm for Weber Problems Using a Generalized "Distance" Function, Operations Research, vol. 29, no. 1, pp. 37-48
[28] I. A. Osinuga and O. N. Bamigbola (2007) On the Minimum Norm Solution to the Weber Problem, SAMSA Conference proceedings, pp. 27-30
[29] M. Rabbani (2013) A Novel Approach for Solving a Constrained Location Allocation Problem, International Journal of Industrial Engineering Computations, published online, doi 10.5267/j.ijiec.2013.02.003, http://www.growingscience.com/ijiec/IJIEC_2013_8.pdf
[30] A. W. Reza, K. Dimyati, K. A. Noordin, A. S. M. Z. Kausar, Md. S. Sarker (2012) A Comprehensive Study of Optimization Algorithm for Wireless Coverage in Indoor Area, Optimization Letters, September 2012, published online, doi 10.1007/s11590-012-0543-z, http://link.springer.com/article/10.1007%2Fs11590-012-0543-z?LI=true
[31] M. G. C.
Resende (2008) Metaheuristic Hybridization with Greedy Randomized Adaptive Search Procedures, in TutORials in Operations Research, Zhi-Long Chen and S. Raghavan (eds.), INFORMS, pp. 295-319
[32] M. G. C. Resende, C. C. Ribeiro, F. Glover and R. Marti (2010) Scatter Search and Path-Relinking: Fundamentals, Advances, and Applications, in Handbook of Metaheuristics, 2nd Edition, M. Gendreau and J.-Y. Potvin (eds.), Springer, pp. 87-107
[33] P. S. Stanimirovic, M. Ciric, L. A. Kazakovtsev and I. A. Osinuga (2012) Single-Facility Weber Location Problem Based on the Lift Metric, Facta Universitatis, series Mathematics and Informatics, vol. 27 (2), pp. 31-46
[34] V. A. Trubin (1978) Effective Algorithm for the Weber Problem with a Rectangular Metric, Cybernetics and Systems Analysis, 14(6), doi 10.1007/BF01070282; translated from Kibernetika, 6 (November-December 1978), pp. 67-70
[35] A. Weber (1922) Über den Standort der Industrien, Erster Teil: Reine Theorie des Standortes, Tübingen, Mohr
[36] E. Weiszfeld (1937) Sur le point pour lequel la somme des distances de n points donnés est minimum, Tohoku Mathematical Journal, 43, no. 1, pp. 355-386
[37] G. Wesolowsky (1992) The Weber Problem: History and Perspectives, Location Science, 1, pp. 5-23
[38] G. O. Wesolowsky and R. F. Love (1972) A Nonlinear Approximation Method for Solving a Generalized Rectangular Distance Weber Problem, Management Science, vol. 18, no. 11, pp. 656-663
[39] G. G. Zabudski (2004) A Minimax Planar Location Problem with Forbidden Zones: Its Solution Algorithm, Autom. Remote Control, 65, No. 2, pp. 241-247
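Appendix. The multistart local search used as the baseline in the experiments above (Algorithm 3 started from randomly selected initial solutions) can be sketched as follows. This is a minimal illustration only, assuming a precomputed distance matrix and the classical vertex-substitution (swap) neighborhood; the function names and the number of starts are ours, not the exact implementation of the paper.

```python
import random


def pmedian_cost(medians, dist):
    """Total cost: each demand point is served by its nearest median."""
    return sum(min(dist[i][m] for m in medians) for i in range(len(dist)))


def local_search(medians, dist):
    """Vertex-substitution local search: swap one median for one
    non-median vertex while the objective improves (Algorithm 3 sketch)."""
    n = len(dist)
    medians = set(medians)
    improved = True
    while improved:
        improved = False
        for out in list(medians):
            for inc in range(n):
                if inc in medians:
                    continue
                candidate = (medians - {out}) | {inc}
                if pmedian_cost(candidate, dist) < pmedian_cost(medians, dist):
                    medians = candidate
                    improved = True
                    break  # rescan neighborhood of the improved solution
            if improved:
                break
    return medians


def multistart(dist, p, starts=10, seed=0):
    """Run local search from several random initial solutions, keep the best."""
    rng = random.Random(seed)
    n = len(dist)
    best = None
    for _ in range(starts):
        init = rng.sample(range(n), p)
        sol = local_search(init, dist)
        if best is None or pmedian_cost(sol, dist) < pmedian_cost(best, dist):
            best = sol
    return best, pmedian_cost(best, dist)
```

In the combined method of Section 2, the same `local_search` procedure would additionally be applied to every vector of the population at each fifth iteration of the probability changing algorithm.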