2 zaua az & •,.- 3D STEREO CO OF BUILDINGS FR I .GES Drago Torkar Institut Jožef Stefan, Ljubljana Received November 13, 1995 Revised November 21, 1995 Abstract In the paper our approach to the stereo reconstruction problem is presented. First linear line segments are extracted from left and right image. Next structures that possibly represent buildings are detected using graph theo,y. Finally stereo correspondence is determined by graph matching algorithm. Keywords: building detection, building reconstruction, feature-basecl matching, feature exlraction, Geodetic workshop, graph theory, Otočec, 1995 1 INTRODUCTION UDC 528.77.011.56:711.6 061.3(497.12 Otočec)"1995":528 3 D stereo reconstruction from aerial images is a common application in stereo .. Since it is a tirne consuming task increased research activity toward automatic building reconstrnction (Huertas et al., 1993, Dang et al., 1994, Deren et al., 1994). ne of the prob!ems within automatic stereo reconstruction of objects is the determination of corrcspondent features on the left and right image. The position of correspondent points forms the base for depth estimation in camera and the global coordinate system. The determination of correspondent image elements involves stereo matching of the left and right image and/or feature extrc1ctton from both images and feature matching. Some methods use also the digital termin model (DTM) which can be extracted by area-based matching or by other methods (Weidner et al., 1994). The stereo correspondence det~rmination is accompanied by severa! problems caused by geometry const~aintso 2 AUTOMATIC MET.HOD he novel method for automatic building reconstruction from aerial stereo images is based on the work reported by Horaud and Skordas (Horaud et al., 1989). The correspondent elements determination was done using the procedure from the set of feature-based disparity estimation methods. Some buildings can be quite complex so the symbolic dcscription must be made by structmed features. Each feature represents one building or a set of buildings ver; close together. Therefore a procedure for building detection and their representation by structured features had Geodetski vestnik 39 (1995) 4 to be developed. Next an efficient method for stereo matching had to be used to determine correspondent buildings. The whole procedure was done in four steps: o primitive feature extraction -- • constmction of structured features using graphs o correspondent structures determination using graph matching algorithm o 3D building rcconstruction. 3 FEATURE EXTRACTION Straight line segments were used as primitive features. To extract them first Canny's edge detector (Canny, 1983) was applied. The width of the edge operator was determined experimentally for each image separately. It depends on image intensity and contrast as well as image content. Since buildings are composed of straight edges the line growing algorithm was used to detect straight segments that potentially represent building edges. Because of the noise present in almost every iniage and the tolerances introduced by the feature detection process the feature improvement module was developed. The algorithm was implemented in two steps. First ali line segments are investigated to find groups of collinear segments among them. Each such group is then replaced by a single straight line segment. Further, the distances between segment ends are compared to a certain threshold. H two ends are close enough the corresponding segments are prolonged or shortened to the point of intersection. This is done also in the case when the line segment end is near another line segment. In the case of multiple segment ends within the threshold distance, segments are extended to the center of gravity of intersection points. 4 BUILDING STRUCTURES USING GRAPHS The goal of building structures from primitive features is a set of structured features for the left and right image, each of them representing a symbolic description of a single building. Only buildings having only straight, closed edge segments are considered. The building of structures was implemented in two steps. First the relational graph for the left and right image was constructed from primitive . features Le. straight line segments. The result of this part was an unconnected, ' labeled relational graph representing the relaiionship among primitive features. In the second step some nodes were deleted and connected subgraphs representing one building were detected. · 4.1 Construction of relational graph from primitive featm:es very line segment is represented by a node in the graph weighted by the following values: the position of a segment, the segment size and the segment · contrast and the number of pointers of every type. Two types of pointers, representing graph edges, are also assigned to every node: is_connected_to and is_parallel_to. The pointers of the first type point to line segments connected to or intersecting with the segment in question. The pointers of the type is_parallel_to point to nearest parallel segments. Thus every segment can have only two pointers of this type. Geodetski vestnik 39 (1995) 4 )/], t, J, t. t5 u J, J,o -z-- ~ J, ' v J,, ~ Figure 1: Left and right image f eatures The values of weights are defined as follows: the position of a segment is determined by its end points. Segment size is defined by the length of a segment. Contrast is calculated as a difference of the average gray value of a stripe to the left of the segment and average gray value of the stripe to the right of the same segment. Number of connected segments means the number of adjacent segments intersecting or jointing with the segment in question. Number of parallel segments means the number of segments within a certain distance from the segment in question and being parallel to that segment. The pointers of the type is _ connected _ to ( on figure 2 labeled with 1) are determined in a way that the intersection points of a line containing particular segment with aU the other lines are calculated. If the intersection is within both line segments the corresponding pointer is initialized. If no po in ter of the type is _ connected _ to for the particular line segment is initialized the pointer number is set to zero. Similar the pointers of the type is_parallel_to (in figure 2 labeled with 2) are initialized. The slope for every line segment is calculated. Among other line segments having the slope within the ±10% and overlapping 60% or more, the nearest one in both normal directions is chosen. The relational graph building for example in figure 1 is shown in figure 2. Geodetski vestnik 39 (1995) 4 (!) Figure 2: Relational graphs of the left and right image 4.2 Deleting nodes and subgraph detectirnn So far the weighted relational graph constructed is unconnected because there exists a set of features neither having intersections with neighbors nor being parallel to them. The construction of the labeled relational graph is continued by deleting the nodes in the graph having no edges of both types or having only one edge of a particular type. This way the features that have only one parallel line segment or are connected to only one neighbor are eliminated. It rarely happens that two features organized this way would actually represent a building. If they do it is the consequence of bad edge detection errors during the line growing process, which is usually due to bad choice of free parameters. The task is to find subgraphs representing a single building or complex of buildings on the site. The buildings should be closed structures from the edge point of view. Therefore we search for cycles among all connected subgraphs of a relational graph. Their detection is simple and efficient enough. We do this only on graph edges representing connections using the well-known depth-first search. The subgraphs forming the cycles are then investigated again to eliminate the ones representing line segments intersecting at a single point. The remaining connected subgraphs are then used to determine correspondence between left and right image features. 5 DETECTING CORRESPONDENT FEATURES USING GRAPH lV"iATCHING ALGORITHM Due to severa! reasons, such as photometric differences between left and right image, occlusions, shadows, errors during primitive featurc extraction etc., double subgraph isomorphism among connected subgraphs of relational graphs can not be expected. Thcrefore the correspondence had to be detected based on similarity among subgraphs. First the similarity measure between two nodes of the relational graph was dcfined. Geodetski vestnik 39 ( 1995) 4 l = 2 ( min(Ki, Kj) 5 ma:x(Ki K) ' ' J .!_ minJJi, lj) 1 rt - 2$ii + + + 4 max(/i, lj) 4 rt ., .!_ min("Ep 1;, EfJ 1;) + 1_ 1ninCf/Yzi, L/]4)) r 2 max(Ip 1;, Ip 1) 2 max(I/J2i, 41·2) The similarity measure is a number from the interval (0,1). The first term represents similarity in contrast. The second one is weighted by and checks out matching in length. The third one measures slope similarity and is also weighted by. The fourth term determines the difference in the number of pointers of the type is_connected_to and is weighted by ½ . The fifth one finds out the difference in the number of pointers is_parallel_to and is also weighted by ½. The greatest weight was given to the similarity in,contrast. Tothe pointers of both types was given only the weight of ½ because frequentiy a certain feature is connected to or parallel to the features not belonging to the building. The slope and the length of line segments depend on the distance between the camera position taking the left and right image. ext the similarity measure between two connected subgraphs of relational graphs of the left and right image was defined as the sum of similarity measures between all possible node pairs of the two subgraphs and normalized to the interval (0,1). Nm Nn Mmn 1 II Nm Nn mpij i=l j=l The similarity measure that was defined reflects the relationship between every subgraph of the left image and every subgraph of the right image. Now a new graph, called a correspondence graph, can be constructed. The nodes are the subgraphs of the left and right relational graphs. The edges are made from all the nodes corresponding to the left image subgraphs to ali the nodes corresponding to the right image subgraphs. The edges are labeled (weighted) by similarity measures between the nodes. The task is to find such a subset of connected nodes in which every node appears only once. In other words every node can be connected to only the node to which it has a maximal measure of similarity. Some nodes can remain unconnected. The problem of finding the correspondent structu:res between the left and the right image is so translated to the maximum matching problem in the graph theory. This is solved using the solution of the classic stable marriage problem (Sedgewick, 1988). The algorithm assigns to every connected relational subgraph from the left image the most suitable pair among connected relational subgraphs of the right image. The same algorithm is applied again within these pairs of connected subgraphs to find out the correspondent nodes of relational graphs, i.e. correspondent features. Using the known positions of line segment ends in the left and right image, its orientation and camera orientations in the moments of taking the left and right image, the 3D position of feature points can be calculated in the left or right camera coo:rdinate system. If the camera perspective center is known in absolute coordinates the point position can be expressed in any of absolute coordinate systems. Geodetski vestnik 39 (1995) 4 6 3D STEREO RECONSTRUCTION OF BUILDINGS The described method for 3D building reconstruction from aerial stereo images was tested on aerial photographs during the systematic periodic aerial survey of Slovenian territory (CAS). The absolute camera orientations were known. Photographs at a scale of 1:5 000 were used. The scanning of photographs was made with pixel size of 20 m which means a resolution of app. 1 270 dpi. The size of the image from the photography of 23x23 cm was therefore about 11 500 x 11 500 pixels which means about 130 MB of data. Because of this enormous size overlapping squares of only 2x2 cm were used. The overlapping of images was 60%. Below the example of 3D reconstruction of buildings with an intermediate result is shown. 7 CONCLUSION A method for almost automatic stereo reconstruction of buildings was developed. The user must specify some parameters that define the behavior of the edge detection module and tolerances for the feature improvement module. The appropriate values can be obtained by trial and error or through experience. The extracted features are usually part of building roofs because of vertical camera position. This and errors during feature extraction and structure building cause only partial reconstruction of buildings. The number of correctly reconstructed points is too small to make this method a serious competitor to semi-automatic or · non-automatic methods. The described procedure thus represents only one of the possible research directions in the evolution of automatic methods for stereo reconstruction of buildings from aerial images. Geodetski vestnik 39 (1995) 4 Geodetski vestnik 39 (1995) 4 /\. ( /) v/ / / ~fr, \, 1 ,1) 1 '~\ ' 1 ,' / J . \: 4: Detected features on the left ( upper) and right (lower) image Geodetski vestnik 39 (1995) 4 , Figure 5: 3D rendering of successfully reconstructed points References Canny, J.F., Finding edges and lines in images. Technical report AI-TR 720, M.l T. Artificial Intel!. Lab., Camb,idge, MA, 1983 Dang, T. et al., Applying perceptual grouping of surface mode/s to the detection and stereo reconstniction of buildings in aerial imagery. ln: Proc. ISPRS 1994 - Spatial Infonnation from Digital Photogrammetry and Computer Vision, SPIE, DGPF, IEEE, 1994 Deren, L., Juliang S., House extraction with multiresolution analysis and infonnational fusion. In Comm. III Symposium 94, WG/2, 1994 Horaud, R., Skordas, T., Stereo correspondence through feature grouping and maximal cliques. IEEE Trans. Pattem Analy. Machine Intel!., 1989, vol. II, no. 11, p. 1168-1180 Huertas, A. et al., Detecting of buildings from monocular views of aerial scenes using perceptual grouping of shadows. In: Proc. DARPA Image Understanding Workshop, 1993 Sedgewick, R., A{gorithms. Addison-Wesley Publishing Company, 1988 Weidner, U., Foerstner, W., Toward automatic building extraction from high resolution digital elevation models. In: Digital Photogrammetry, Muenchen, 1994 Review: Vasja Bric, M.Se. Radovan Dalibor, M. Se. Geodetski vestnik 39 ( 1995) 4