Image Anal Stereol 2007;26:51-61
Original Research Paper

GRADUAL TRANSITION DETECTION FOR VIDEO PARTITIONING USING MORPHOLOGICAL OPERATORS

VALERY NARANJO1, JESUS ANGULO2, ANTONIO ALBIOL1, JOSE M. MOSSI1, ALBERTO ALBIOL1 AND SOLEDAD GOMEZ1

1Departamento de Comunicaciones, Universidad Politecnica de Valencia, Camino de Vera s/n, E-46022 Valencia, Spain, 2Centre de Morphologie Mathématique, Ecole des Mines de Paris, 35, rue Saint-Honoré, F-77305 Fontainebleau, France
e-mail: vnaranjo@dcom.upv.es, jesus.angulo@ensmp.fr
(Accepted June 12, 2007)

ABSTRACT

Temporal segmentation of video data, which partitions the sequence into shots, is a prerequisite in many applications: automatic video indexing and editing, old film restoration, perceptual coding, etc. The detection of abrupt transitions or cuts has been thoroughly studied in previous works. In this paper we present a scheme to identify the most common gradual transitions, i.e., dissolves and wipes, which relies on mathematical morphology operators. The approach is restricted to fast techniques with a low computational cost (no motion estimation, and adapted to compressed sequences) that are able to cope with random brightness variations (which often occur in old films). The present study illustrates how morphological operators can be used to analyze temporal series for detecting particular events, either working directly on the 1D signal or building an intermediate 2D image from the 1D signals to take advantage of the spatial operators.

Keywords: 1D morphological filtering, dissolve detection, geodesic reconstruction opening/closing, video shot segmentation, wipe detection.

INTRODUCTION

Temporal segmentation of video data, which partitions the sequence into shots, is a prerequisite in many applications: automatic video indexing and editing, old film restoration, perceptual coding, etc. The detection of abrupt transitions or cuts has been widely studied in many previous works (cf.
Brunelli et al., 1999; Cotsaces et al., 2006, for a comparative survey). We have also implemented a new method for cut detection, presented in Albiol et al. (2000), which is based on differences between consecutive frames and a set of morphological filters. Cut detection, however, must be accompanied by specific algorithms for detecting gradual effects such as dissolves (one scene gradually disappearing while another gradually appears) and wipes (one scene gradually entering across the view while another gradually leaves). This paper focuses only on the detection of gradual transitions.

Different approaches have been proposed to extract the shots defined by gradual transitions, using different algorithms and models for these phenomena (Meng et al., 1995; Yeo and Liu, 1995; Demarty and Beucher, 1999; Fernando et al., 1999; Lu et al., 1999; Truong et al., 2000; Joyce and Liu, 2006). However, the results are not completely satisfactory, even for very complex techniques.

Previous works have already illustrated the usefulness of mathematical morphology for processing temporal signals (video sequences, Pardas et al., 1992; Naranjo et al., 2004). In particular, morphological operators have also been used to filter temporal segmentation metrics computed from the video, mainly to detect cuts (Demarty and Beucher, 1999; Llach and Salembier, 1999; Albiol et al., 2000). The algorithms proposed by Demarty and Beucher (1999) and Demarty (2000) dealt with different kinds of transitions (cuts, dissolves and geometric transitions such as wipes); they were based on the morphological filtering of a metric for dissolves, and on the study of the geometry of a local difference image mask between successive frames for wipes. In this paper we present a scheme to identify the most common gradual transitions, i.e., dissolves and wipes, which also relies on mathematical morphology operators.
The algorithm for dissolves is based on the computation of a simple metric between frames, which is morphologically filtered to detect the dissolve effects, in combination with the variance of the frames (the variance-based detection method was proposed in Meng et al., 1995, and improved in other contributions such as Yoo et al., 2006). Combining the morphological analysis of the evolution of a metric with the modeling of the variance makes our method more robust than previous approaches based on a single parameter. The technique for wipes is totally original and uses the orthogonal projections of the frames, filtered by reconstruction, in order to define a “strip image”, where the wipe transitions are again identified by morphological filtering.

Naranjo V et al: Video gradual transition detection using morphological operators

The approach is restricted to fast techniques which have a low computational cost (no motion estimation), are adapted to compressed sequences (in fact the algorithms are applied to the DC image, Meng et al., 1995; Yeo and Liu, 1995) and are able to cope with random brightness variations (frequent in old films).

The algorithms can be used for sequences composed of grey level or color image frames. For color images, we can use the luminance or the sum of the RGB components to define the metrics of the algorithms. Previous works (Gargi et al., 1995) have evaluated the influence of the chosen color space on the detection of cuts, and the luminance, saturation and hue representation seems to give the best performance. We have tested our algorithms using only the luminance, and using the luminance together with the hue and the saturation; the additional components bring no improvement.

The organization of the rest of the paper is as follows.
Section “Methods” is divided into several parts: first, the notation is introduced together with a brief reminder on morphological operators for temporal series; second, the method for dissolve detection is introduced; then, the approach for wipe detection is presented. The experimental results obtained with our transition detector are described and analyzed in section “Results”, where we present not only the results of the detectors of gradual transitions described in this paper, but also the results of the cut detector proposed in Albiol et al. (2000). Finally, section “Discussion” gives some conclusions and perspectives.

METHODS

MORPHOLOGICAL OPERATORS FOR TEMPORAL SERIES

Image and signal lattices

Let $\{f_t(x,y)\}_{t=1}^{N}$ be a video sequence of $N$ frames, where frame $t$ is a grey level or color image $f_t(x,y)$. Assume that $s(t)$ is an equidistant time series (1D signal). Since we work in this paper on both images and time series, we need to fix some notation. Let us consider two complete lattices: $\mathcal{L}_{\mathrm{image}}$ and $\mathcal{L}_{\mathrm{signal}}$. An image is a function $f(x,y) : E \rightarrow \mathcal{L}_{\mathrm{image}}$, where the spatial domain is a discrete set $E \subset \mathbb{Z}^2$, $1 \le x \le X$, $1 \le y \le Y$ ($X$ and $Y$ are the numbers of image columns and rows, respectively), and the image lattice is an ordered set of grey levels $\mathcal{L}_{\mathrm{image}} \subset \mathbb{Z}$ (or $\subset \mathbb{Z}^3$ for color images). A temporal signal is a function $s(t) : T \rightarrow \mathcal{L}_{\mathrm{signal}}$, where $T \subset \mathbb{Z}$ is the discrete time index, i.e., $T = \{t : 1 \le t \le N\}$, with real values in the signal lattice $\mathcal{L}_{\mathrm{signal}} = \mathbb{R}$.

With each morphological operator $\Psi$ (Serra, 1982; 1988; Soille, 1999) we may associate the image mapping $\Psi^E_B : \mathcal{L}_{\mathrm{image}} \rightarrow \mathcal{L}_{\mathrm{image}}$ (where $B$ is the size/shape of the flat structuring element) or the signal mapping $\Psi^T_{\Delta t} : \mathcal{L}_{\mathrm{signal}} \rightarrow \mathcal{L}_{\mathrm{signal}}$ (where $\Delta t$ is the size or length of the temporal structuring element). The rest of this section recalls the main morphological operators $\Psi^T_{\Delta t}$ for temporal series.
Temporal erosion and dilation

The basic morphological operators for temporal series are
- Erosion: $\varepsilon^T_{\Delta t}(s(t)) = \{s(y) : s(y) = \bigwedge [s(z)], z \in \Delta t\}$,
- Dilation: $\delta^T_{\Delta t}(s(t)) = \{s(y) : s(y) = \bigvee [s(z)], z \in \Delta t\}$,
where $\Delta t$ is the temporal structuring element, which is typically an odd symmetric centered time window, i.e., $[t_0 - \Delta t/2, t_0 - \Delta t/2 + 1, \cdots, t_0 - 1, t_0, t_0 + 1, \cdots, t_0 + \Delta t/2 - 1, t_0 + \Delta t/2]$.

The erosion and the dilation are increasing operators, i.e., $s_1(t) \le s_2(t) \Rightarrow \varepsilon^T_{\Delta t}(s_1(t)) \le \varepsilon^T_{\Delta t}(s_2(t))$, $\forall t$. Moreover, the erosion is anti-extensive, i.e., $\varepsilon^T_{\Delta t}(s(t)) \le s(t)$, and the dilation is extensive, i.e., $s(t) \le \delta^T_{\Delta t}(s(t))$. In practice, the erosion shrinks the positive structures: “peaks of signal” shorter than the structuring element disappear by taking the value of the remaining neighboring signal structures. Dilation produces the dual effect, enlarging the positive peaks of the signal. Fig. 1b shows two examples of erosion/dilation, of sizes 3 and 7, for the same time series.

Temporal opening and closing, and derived operators

The two elementary operations of erosion and dilation can be composed to yield a new set of operators with desirable feature extraction properties, given by
- Opening: $\gamma^T_{\Delta t}(s(t)) = \delta^T_{\Delta t}[\varepsilon^T_{\Delta t}(s)]$,
- Closing: $\varphi^T_{\Delta t}(s(t)) = \varepsilon^T_{\Delta t}[\delta^T_{\Delta t}(s)]$.
The morphological openings (closings) filter out positive (negative) peaks (1D structures) from the signals according to the predefined length of the temporal structuring element; see the examples given in Fig. 1c, again using two sizes of operators. The opening (closing) is an anti-extensive (extensive) operator, and both are increasing and idempotent operators.

The top-hat transformation is a powerful operator which permits the detection of contrasted structures or relevant peaks on non-uniform backgrounds. There are two versions,
- White top-hat: The residue of the initial series $s$ and an opening $\gamma^T_{\Delta t}(s)$; i.e., $\rho^{+}_{\Delta t}(s) = s(t) - \gamma^T_{\Delta t}(s(t))$, extracts positive peaks.
- Black top-hat: The residue of a closing $\varphi^T_{\Delta t}(s)$ and the initial series $s$; i.e., $\rho^{-}_{\Delta t}(s) = \varphi^T_{\Delta t}(s(t)) - s(t)$, extracts negative peaks.

A granulometry is a family of openings $\Gamma = (\gamma^T_{\Delta t_n})_{n \ge 0}$, indexed by a positive size parameter, such that $\forall \Delta t_n > 0$, $\forall \Delta t_m > 0$, $\gamma^T_{\Delta t_n}\gamma^T_{\Delta t_m} = \gamma^T_{\Delta t_m}\gamma^T_{\Delta t_n} = \gamma^T_{\max(\Delta t_n, \Delta t_m)}$. Moreover, granulometries by closings (or anti-granulometries) can also be defined as families of increasing closings $\Phi = (\varphi^T_{\Delta t_n})_{n \ge 0}$. Performing the granulometric analysis of a series $s(t)$ with $\Gamma$ is equivalent to mapping each opening of size $\Delta t_n$ to a measure of the opened series. The pattern spectrum $PS(s,\Delta t_n)$ maps each size $\Delta t_n$ to some measure of the positive variations with this size (loss of positive peaks between two successive openings). The pattern spectrum $PS(s,\Delta t_n)$ is a probability density function: a large impulse in the pattern spectrum at a given time scale indicates the presence of many peaks at that time scale. It is also possible to use standard probabilistic definitions to compute the moments of $PS(s,\Delta t_n)$. An example of $PS(s(t),\Delta t_n)$, useful to analyze the frequencies of positive peaks, is given in Fig. 1e.

Temporal reconstruction

A morphological tool that complements the opening and closing operators for feature extraction (extracting the marked particles) is the morphological reconstruction. It is implemented using the geodesic dilation, an operator based on restricting the iterative dilation of a marker function $s_m(t)$ by the unitary temporal structuring element $\Delta t_1$ to a reference function $s_r(t)$, i.e., $\delta^{T,(n)}_{s_r}(s_m) = \delta^{T,(1)}_{s_r}\,\delta^{T,(n-1)}_{s_r}(s_m)$, where $\delta^{T,(1)}_{s_r}(s_m) = \delta^T_{\Delta t_1}(s_m(t)) \wedge s_r(t)$. The reconstruction by dilation, or opening by reconstruction, is then defined as $\gamma^{T\text{-}rec}(s_m,s_r) = \delta^{T,(i)}_{s_r}(s_m)$, such that $\delta^{T,(i)}_{s_r}(s_m) = \delta^{T,(i+1)}_{s_r}(s_m)$ (idempotence).

Fig. 1. Application of temporal morphological operators to the time series “Airline” from Falk et al.
(2006): (a) original time series $s(t)$; (b) erosions/dilations, $\varepsilon^T_{\Delta t}(s(t))$ and $\delta^T_{\Delta t}(s(t))$, $\Delta t = 3, 7$; (c) openings/closings, $\gamma^T_{\Delta t}(s(t))$ and $\varphi^T_{\Delta t}(s(t))$, $\Delta t = 3, 7$; (d) modified white top-hat, $s(t) - \gamma^T_{\Delta t_2}\varphi^T_{\Delta t_1}(s(t))$, $\Delta t_1 = 5$, $\Delta t_2 = 20$; (e) pattern spectrum, $PS(s(t),\Delta t)$, $1 \le \Delta t \le 31$; (f) dynamic-based opening by reconstruction, $\gamma^{T\text{-}rec}(s(t) - H, s(t))$, $H = 75$.

Whereas the adjunction opening $\gamma^T_{\Delta t}(s)$ (from an erosion/dilation) modifies the structures of the series, the associated opening by reconstruction $\gamma^{T\text{-}rec}(s_m, s)$ (where the marker is $s_m = \varepsilon^T_{\Delta t}(s)$ or $s_m = \gamma^T_{\Delta t}(s)$) is aimed at efficiently and precisely reconstructing the “shape” of the structural peaks which are not totally removed by the marker filtering process (peaks of length $\Delta t$). Other useful markers, which extract peaks according to their dynamics, are series of the type $s_m = s - H$, such that $\gamma^{T\text{-}rec}(s_m, s)$ removes the peaks of contrast lower than $H$; see the example of Fig. 1f.

DISSOLVE DETECTION

Dissolves are the most usual gradual transition between two shots (see Fig. 2 for an example). The blend between the two sequences is usually linear and involves several frames.

Fig. 2. An example of dissolve (from the film “Torbellino”).

Linear intensity metrics, $s_\rho(t)$

Our method is based on the following simple hypothesis: “The intensity of the pixels in the frames of a dissolve follows a monotonic variation”. We must then define a new metric, $s_\rho(t)$, to quantify the monotony of consecutive frames in order to detect dissolves. Consider the three successive frames $f_{t-1}(x,y)$, $f_t(x,y)$ and $f_{t+1}(x,y)$ of the video sequence $\{f_t(x,y)\}_{t=1}^{N}$. We define the following two differences for each pixel $(x,y)$:

$$d_t^-(x,y) = f_t(x,y) - f_{t-1}(x,y) \quad \text{and} \quad d_t^+(x,y) = f_{t+1}(x,y) - f_t(x,y).$$

The coefficient of monotonic linearity is given by

$$\rho_t(x,y) = \begin{cases} 1 & \text{if } (|d_t^-(x,y)| > th \text{ and } |d_t^+(x,y)| > th) \text{ and } \mathrm{sign}(d_t^-(x,y)) = \mathrm{sign}(d_t^+(x,y)) \\ -1 & \text{if } (|d_t^-(x,y)| > th \text{ and } |d_t^+(x,y)| > th) \text{ and } \mathrm{sign}(d_t^-(x,y)) \ne \mathrm{sign}(d_t^+(x,y)) \\ 0 & \text{otherwise} \end{cases}$$

where $th$ is a threshold to avoid the random variations due to noise (typically $th = 2$ yields satisfactory results). Using this pixel parameter, we can define a metric for each frame $t$ by computing

$$s_\rho(t) = \frac{\sum_{x=1}^{X}\sum_{y=1}^{Y} \rho_t(x,y)}{XY}.$$

If the difference between the pixel $(x,y)$ in the frame $f_t$ and the same pixel in the frame $f_{t-1}$ has the same sign as the difference between the same pixel in the frames $f_{t+1}$ and $f_t$, the luminance of this pixel varies monotonically and we can suppose that this point-to-point evolution is linear. When this situation occurs in most of the pixels of a frame, we obtain high values of $s_\rho$, which indicates a linear luminance variation in the whole image. Consequently, all the frames of a dissolve present high values of $s_\rho$. Fig. 3 shows the result of computing $s_\rho(t)$ on a sequence from the film Torbellino. In order to simplify the detection of peaks in $s_\rho(t)$ we propose to carry out a 1D morphological filtering.

Fig. 3. Dissolve metrics and corresponding morphological filtering (temporal closing of size 24 and temporal top-hat of size 12) using a sequence from the film “Torbellino” (horizontal axis: frame number).

Initially, a temporal closing of size $\Delta t_1$ removes the negative peaks of temporal length less than $\Delta t_1$:

$$s_\rho^{clos}(t) = \varphi^T_{\Delta t_1}(s_\rho(t)).$$

In fact, the value of $\Delta t_1$ allows us to fix the minimal duration between two dissolves, e.g., $\Delta t_1 = 24$ implies a minimal distance of 1 second (frame rate = 24 frames/second). Then, using a top-hat of size $\Delta t_2$, the positive peaks are extracted:

$$s_\rho^{tophat}(t) = s_\rho^{clos}(t) - \gamma^T_{\Delta t_2}(s_\rho^{clos}(t)).$$

In this case the value of $\Delta t_2$ defines the maximum duration of a dissolve, e.g., taking $\Delta t_2 = 12$ corresponds to 0.5 second (typical value). In the $s_\rho^{tophat}(t)$ of Fig. 3, we can observe a peak produced by the dissolve placed in the interval 522-525.
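The closing followed by a white top-hat can be illustrated on a toy signal. The sketch below is only a minimal illustration in plain Python, not the authors' implementation: the function names, the border handling (windows are truncated at the ends of the series) and the convention that a structuring element of size `size` spans `size // 2` samples on each side of the current frame are all assumptions, as are the toy signal and the small window sizes.

```python
def erode(s, size):
    """Flat temporal erosion: minimum over a centered window
    (size // 2 samples on each side, truncated at the borders)."""
    half = size // 2
    return [min(s[max(0, t - half):t + half + 1]) for t in range(len(s))]


def dilate(s, size):
    """Flat temporal dilation: maximum over the same centered window."""
    half = size // 2
    return [max(s[max(0, t - half):t + half + 1]) for t in range(len(s))]


def opening(s, size):
    """Erosion followed by dilation: removes short positive peaks."""
    return dilate(erode(s, size), size)


def closing(s, size):
    """Dilation followed by erosion: fills short negative peaks."""
    return erode(dilate(s, size), size)


def dissolve_candidates(s_rho, dt1, dt2, u_rho):
    """Closing of size dt1, white top-hat of size dt2, then threshold u_rho.
    Returns the frame indices whose top-hat value exceeds the threshold."""
    s_clos = closing(s_rho, dt1)
    s_open = opening(s_clos, dt2)
    s_tophat = [c - o for c, o in zip(s_clos, s_open)]
    return [t for t, v in enumerate(s_tophat) if v > u_rho]


# Toy metric (hypothetical values): a short peak with a one-frame gap at
# frame 12, and a long plateau that lasts longer than the top-hat size.
s_rho = [0.0] * 40
for t in (10, 11, 13, 14):
    s_rho[t] = 0.5
for t in range(20, 32):
    s_rho[t] = 0.4

print(dissolve_candidates(s_rho, dt1=5, dt2=7, u_rho=0.15))
# prints [10, 11, 12, 13, 14]
```

In this toy run the closing fills the one-frame gap inside the short peak, the top-hat keeps that peak because it is shorter than the second structuring element, and the long plateau is discarded because the opening preserves it.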
Applying a threshold value $u_\rho = 0.15$ to $s_\rho^{tophat}(t)$, the dissolve is detected. However, a false alarm is also detected in frames 626-628. Such false alarms are produced by high-motion objects (moving objects which take up many pixels in the frame). In order to reduce them, our method combines the information provided by $s_\rho(t)$ with the parabolic evolution of the variance during a dissolve.

Detection of a parabolic variance

Let $f^1_t(x,y)$ and $f^2_t(x,y)$ be two uncorrelated sequences whose intensity variances are $\sigma_1^2$ and $\sigma_2^2$, respectively. In a dissolve, the frames are obtained as the weighted average of $f^1_t(x,y)$ and $f^2_t(x,y)$ during the transition interval, in the following way:

$$f^{dissolve}_t(x,y) = f^1_t(x,y)[1-\alpha(t)] + f^2_t(x,y)\,\alpha(t),$$

where $t_1 \le t \le t_2$ is the dissolve interval. The weight is given by

$$\alpha(t) = \begin{cases} 0 & t < t_1 \\ (t-t_1)/(t_2-t_1) & t_1 \le t \le t_2 \\ 1 & t > t_2 \end{cases}$$

The variance of the dissolve sequence $f^{dissolve}_t(x,y)$ is a parabolic curve, such that for each frame $t$ in the dissolve

$$\sigma^2(t) = (\sigma_1^2+\sigma_2^2)\,\alpha^2(t) - 2\sigma_1^2\,\alpha(t) + \sigma_1^2.$$

Fig. 4. (a) Curve of variance for an ideal dissolve. (b) Variance of the sequence from the film “Torbellino” ($s_{\sigma^2}(t)$), where we can observe the dissolve placed between the vertical lines. Below, a zoom of the dissolve variance is shown.

Ideally, in the frames belonging to a shot the variance remains constant, while the variance of the dissolve frames has a parabolic shape (Fig. 4a). In real sequences, the variance signal $s_{\sigma^2}(t)$ in the dissolve region is approximately a parabola, but within a shot the variance may not remain constant and may even follow a parabolic curve. This last effect is due to the motion in the scene. Fig. 4b shows the variance of a sequence from the film Torbellino. The variance of the dissolve has been zoomed in order to show its shape. We can observe other regions where the variance is also a parabola.
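The parabolic variance model can be checked numerically. The following sketch is an illustration only, under simplifying assumptions that are not in the paper: each shot is represented by a single static four-pixel "frame" (hypothetical values chosen so that the two frames are exactly uncorrelated), and the helper names `variance`, `alpha`, `blend` and `model_variance` are invented for this example.

```python
def variance(pixels):
    """Population variance of a list of pixel intensities."""
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def alpha(t, t1, t2):
    """Linear dissolve weight: 0 before t1, 1 after t2, a ramp in between."""
    if t < t1:
        return 0.0
    if t > t2:
        return 1.0
    return (t - t1) / (t2 - t1)


def blend(frame1, frame2, a):
    """Dissolve frame: weighted average of the two shots."""
    return [p * (1 - a) + q * a for p, q in zip(frame1, frame2)]


def model_variance(var1, var2, a):
    """Parabolic model: (var1 + var2) a^2 - 2 var1 a + var1."""
    return (var1 + var2) * a ** 2 - 2 * var1 * a + var1


# Two toy frames with zero covariance (hypothetical values):
# variances 25 and 100, products of the centered values sum to zero.
frame1 = [0, 0, 10, 10]
frame2 = [0, 20, 0, 20]
var1, var2 = variance(frame1), variance(frame2)

for t in range(0, 11):
    a = alpha(t, t1=0, t2=10)
    measured = variance(blend(frame1, frame2, a))
    print(t, round(measured, 3), round(model_variance(var1, var2, a), 3))
```

For these toy frames the measured and modeled variances coincide at every step, and the minimum of the parabola is reached at $\alpha = \sigma_1^2/(\sigma_1^2+\sigma_2^2) = 0.2$, where the variance is 20. With real shots the two sequences are only approximately uncorrelated, so the match is approximate.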
The algorithm

In summary, the steps of the proposed algorithm to detect the dissolves are:
(1) Calculate the signals $s_\rho(t)$ and $s_{\sigma^2}(t)$ for each frame $t$ of the sequence.
(2) Fix $\Delta t_1$ and $\Delta t_2$ and filter $s_\rho(t)$ to obtain the signal $s_\rho^{tophat}(t)$. Then, apply a threshold $u_\rho$. All transitions with a value of $s_\rho^{tophat}(t)$ higher than the threshold are candidates for a dissolve.
(3) If the difference between the ideal variance model $\sigma^2(t)$ and the variance obtained for the candidate frames, $s_{\sigma^2}(t)$, is less than a threshold $u_{\sigma^2}$, the candidate transition is detected as a dissolve.

The values selected for the thresholds are $u_\rho = 0.15$ and $u_{\sigma^2} = 350$. To obtain these values we carried out a detailed study (Angulo, 1999) on a selection of sequences, estimating the probability density functions of the signals $s_\rho(t)$ and $s_{\sigma^2}(t)$ for transition and non-transition situations. Fig. 5 shows the probability density functions of $s_\rho(t)$. The threshold is selected as the intersection point between the two curves (pdf for transition and pdf for non-transition), i.e., the hypothesis selected is the one which originates $s_\rho(t)$ with the higher probability. Note that this is equivalent to using the maximum likelihood test:

$$P(H_1/x) \underset{H_0}{\overset{H_1}{\gtrless}} P(H_0/x),$$

where $x = s_\rho(t)$, $H_0$ is the hypothesis that a transition does not occur and $H_1$ is the hypothesis that a transition occurs. The same study has been carried out for the variance signal, obtaining the intersection between the two curves at the point $s_{\sigma^2}(t) = 350$; this value is therefore selected as the threshold $u_{\sigma^2}$.

Fig. 5. Probability density function of signal $s_\rho(t)$ for transitions and non-transitions.

WIPE DETECTION

A wipe is a gradual transition between two shots where one image belonging to a sequence $f^1_t(x,y)$ is linearly displaced by an image belonging to another sequence $f^2_t(x,y)$, and this effect lasts several frames. In each frame of a wipe, the second image superimposes $W$ pixels on the first image, from left to right (or from right to left) if the wipe is vertical, and from top to bottom (or from bottom to top) if the wipe is horizontal. An example of a vertical wipe is given in Fig. 6.

Fig. 6. An example of a vertical wipe where the second image replaces the first image from left to right (from the film “Torbellino”).

Fig. 7 shows the difference $d_t(x,y) = |f_t(x,y) - f_{t+1}(x,y)|$ between two consecutive frames $t$ and $t+1$ belonging to the vertical wipe. This difference image contains a band $W$ pixels wide, spanning the full height of the image, in which the pixels are brighter than in the rest of the image.

Fig. 7. Two consecutive frames from the film “Torbellino”, $f_t(x,y)$ and $f_{t+1}(x,y)$, and the difference image between both frames, $|f_t(x,y) - f_{t+1}(x,y)|$.

Orthogonal projections and reconstruction

The first phase of our method consists of determining the position of that band within the width of the image. Using the difference image $d_t(x,y)$, we calculate the normalized vertical projection signal by applying the following equation:

$s^{vp}(x)_t = \sum_{y=1}^{Y} d_t(x,y)$ [...]

[...] $1$ if $f_t(x,y) > \mu_t + th$, $-1$ if $f_t(x,y) < \mu_t - th$, $0$ otherwise. (2)

The signal $CMS(t)$ is in the range $-1 \le CMS(t) \le 1$. A cut occurs when $CMS(t)$ is near $-1$, which means that a high number of pixels behave differently with respect to the mean in frames $f_t$ and $f_{t-1}$. The cut detector is improved by applying a set of morphological filters to the $CMS$ signal, resulting in a detector that is highly robust to false alarms due to flicker. The whole process is presented in Albiol et al. (2000).

(2) Gradual transition detector, which consists of the dissolve and wipe detectors presented in sections “Dissolve Detection” and “Wipe Detection”, respectively.
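The gradual transition detector of item (2) relies on the wipe detector, whose first phase (difference image and vertical projection) can be sketched as follows. This is a hedged illustration only: the normalization by the number of rows, the fixed `level` used to pick out the bright band, the synthetic frames and all function names are assumptions made for the example, and the reconstruction filtering and strip-image steps of the method are not reproduced.

```python
def difference_image(frame_a, frame_b):
    """Absolute difference d_t(x, y) between two frames (lists of rows)."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def vertical_projection(diff):
    """Column-wise sum of the difference image, divided by the number of
    rows (this normalization is an assumption)."""
    rows = len(diff)
    cols = len(diff[0])
    return [sum(row[x] for row in diff) / rows for x in range(cols)]


def bright_columns(projection, level):
    """Columns whose projection exceeds a hypothetical level."""
    return [x for x, v in enumerate(projection) if v > level]


def wipe_frame(old, new, boundary, rows=4, cols=8):
    """Synthetic vertical wipe: the new shot fills columns [0, boundary)."""
    return [[new if x < boundary else old for x in range(cols)]
            for _ in range(rows)]


# Between frames t and t + 1 the wipe front advances W = 2 columns.
f_t = wipe_frame(old=10, new=200, boundary=3)
f_t1 = wipe_frame(old=10, new=200, boundary=5)

projection = vertical_projection(difference_image(f_t, f_t1))
print(projection)                      # non-zero only in columns 3 and 4
print(bright_columns(projection, 50))  # prints [3, 4]
```

In this synthetic case the projection is non-zero only in the $W$ columns swept by the wipe front between the two frames, which is the band the method locates. Stacking such projections over time would give one row per frame of a 2D image.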
PERFORMANCE EVALUATION

In order to evaluate the performance of our detectors, the recall and the precision are obtained using the following expressions:

$$\mathrm{Recall} = \frac{\mathrm{Detections}}{\mathrm{Detections} + \mathrm{MD's}} \quad \text{and} \quad \mathrm{Precision} = \frac{\mathrm{Detections}}{\mathrm{Detections} + \mathrm{FA's}},$$

where “Detections” is the number of detected effects, “MD’s” the number of missed detections and “FA’s” that of false alarms. Three different categories of video sequences have been used to test the detectors:

Synthetic sequence: the concatenation, using dissolves of different lengths and wipes of different lengths and directions, of the QCIF sequences Bridge close view, Bridge far view, Carphone, Claire, Container Ship, Foreman, Grandma, Highway drive, Mother and Daughter, Salesman and Silent, all of which are freely available at http://trace.eas.asu.edu/yuv/qcif.html. This sequence has 10563 frames and is available as an AVI file at http://personales.upv.es/~vnaranjo/effectfilm.zip

Real new sequences: a set of high-quality color sequences from current television movies and news:
· Sport sequences:
  - cycling: a fragment of the cycle race around Spain, with 1997 frames, 3 dissolves and 0 cuts.
  - football: a fragment of 7060 frames of a football match, with 14 cuts and 13 dissolves.
  - basket: 6750 frames of a basketball match, with 15 cuts and 20 dissolves.
· News sequences:
  - news a: a piece of a news report (from TVE: Spanish Television) with 1907 frames and several transition effects: some cuts, 3 dissolves and 1 vertical wipe, and also several sophisticated editing effects.
  - news b: a sequence similar to the previous one, with 1499 frames and only 1 dissolve.
  - news NBC: similar to the previous sequences, with 13727 frames, 35 cuts, 19 dissolves and 16 wipes.
· Film sequences:
  - movie: a fragment, 3010 frames, of the film La sombra del ciprés es alargada, with many cuts but no transition effects.
  - drama: a fragment, 3012 frames, of the Spanish TV series Pepa y Pepe, also with many cuts but no transition effects.
  - zorro: a sequence from the film “The Mask of Zorro”, 5075 frames, 35 cuts and no gradual transition.
· Other sequences:
  - cartoon: a fragment of a cartoon TV series called Don Quijote de la Mancha, 8778 frames, 55 cuts and 29 dissolves.
  - culture: a documentary about villages of Spain, 14896 frames, 75 cuts and 19 dissolves.

Real old sequences: a set of degraded black and white sequences from several old films:
  - malva: a fragment of 1789 frames from the Spanish film Malvaloca (1942).
  - torbe: 2214 frames from the Spanish film Torbellino (1940).

We have analyzed all these sequences, nearly 70000 frames in all, obtaining the results shown in Tables 1–3 for the detection of cuts, dissolves and wipes, respectively. In spite of the high number of processed frames, the number of gradual transition effects evaluated is not very high, because this kind of effect is rare in real sequences in comparison with cuts.

Table 1. Results of cut detection.
Sequence    Detections   MD’s   FA’s
Synthetic            4      0      0
Cycling              0      0      0
Basket              23      7      0
Football            21      6      0
news a              16      0      0
news b               4      0      1
news NBC            37      1      2
movie               14      0      0
drama               11      0      0
Zorro               87      5      1
Cartoon             52      3      2
Culture             75      0      0
malva                5      0      0
torbe                3      0      1
Total              352     22      7

Table 2. Results of dissolve detection.
Sequence    Detections   MD’s   FA’s
Synthetic            4      1      0
Cycling              3      0      0
Basket              20      3      2
Football            14      2      5
news a, news b, news NBC: 19 1 2 1 0 0 1 0 1 1
movie                0      0      0
drama                0      0      0
Zorro                0      0      0
Cartoon             26      3      5
Culture             19      2      2
malva                1      0      0
torbe                5      0      2
Total               96     12     18

Table 3. Results of wipe detection.
Sequence    Detections   MD’s   FA’s
Synthetic            3      0      0
Cycling              0      0      0
Basket               0      0      0
Football             0      0      0
news a               2      1      1
news b               0      0      0
news NBC            24      0      5
movie                0      0      0
drama                0      0      0
Zorro                0      0      0
Cartoon              0      0      0
malva                1      0      0
torbe                3      0      1
Total               33      1      7

With the detector results presented in Tables 1 (cut detection), 2 (dissolve detection) and 3 (wipe detection), i.e., 481 detections, 35 missed detections and 32 false alarms in total, the overall values of recall and precision are 93.2% and 93.7%, respectively. Even in the case of old films, the detector does not produce a high number of false alarms and missed detections. However, in very degraded old sequences a high number of false positives appears, associated with strong intensity degradations spanning several frames. A flicker correction step can be considered in order to improve the results (Naranjo and Albiol, 2000); after this flicker correction, similar values of recall and precision are achieved.

DISCUSSION

We have presented in this paper morphological techniques for detecting dissolves and wipes in video sequences. This is a necessary step which, combined with the detection of cuts, allows the temporal segmentation of sequences into shots. Experimental results have shown the satisfactory performance of our methodology on degraded sequences, and still better performance on sequences in good condition. The low-complexity algorithms developed lead to fast implementations and can therefore be adapted to (quasi) real-time applications.

From a methodological viewpoint, the present study illustrates how morphological operators can be used to analyze time series for detecting particular non-periodic events, either working directly on the 1D signal or building an intermediate 2D image from the 1D signals to take advantage of the spatial operators. In this respect, we can consider other applications of the “strip images”, for instance, in video surveillance algorithms, to identify pedestrians (Fig. 10) or other events.
Besides the orthogonal projections, we can use other “image parameters” for the non-temporal axis of the “strip images”, for instance, the luminance histogram to detect abrupt illumination variations, or color images combined with the saturation histogram to detect highlights and shadows, or a skin color-centered hue histogram to detect people, etc.

ACKNOWLEDGMENTS

This work has been supported by the Polytechnic University of Valencia interdisciplinary project 5607-2004 and the Cicyt project TIC 2002-02469. We would like to thank the Foreign Language Co-ordination Office at the Polytechnic University of Valencia for their help in revising this paper. The authors also wish to acknowledge the support received from the IVAC-Filmoteca de Valencia as regards the selection of the film material to be restored. The authors also gratefully thank the reviewers for their valuable comments and the improvements they suggested.

REFERENCES

Albiol A, Naranjo V, Angulo J (2000). Low complexity cut detection in the presence of flicker. Proceedings of IEEE International Conference on Image Processing (ICIP’00), Vol. III: 957–60.
Angulo J (1999). Temporal segmentation of video sequences. Master Thesis, Universidad Politécnica de Valencia, October 1999.
Brunelli R, Mich O, Modena CM (1999). A survey on the automatic indexing of video data. J Vis Commun Image R 10:78–112.
Cotsaces C, Nikolaidis N, Pitas I (2006). Video shot detection. A review. IEEE Signal Proc Mag 23:28–37.
Demarty CH, Beucher S (1999). Morphological tools for video indexing. Proceedings of IEEE International Conference on Multimedia Computing and Systems (ICMCS’99), Vol. 2: 991–2.
Demarty CH (2000). Segmentation et Structuration d’un Document Vidéo pour la Caractérisation et l’Indexation de son Contenu Sémantique. Ph.D. Thesis, Centre de Morphologie Mathématique-Ecole des Mines de Paris, January 2000.
Falk M et al. (2006). A First Course on Time Series Analysis, by Chair of Statistics. University of Würzburg.
Fernando WAC, Canagarajah CN, Bull DR (1999). Fade and dissolve detection in uncompressed and compressed video sequences. Proceedings of IEEE International Conference on Image Processing (ICIP’99), Vol. III: 299–303.
Gargi U, Oswald S, Kosiba DA, Devadiga S, Kasturi R (1995). Evaluation of video sequence indexing and hierarchical video indexing. In: Niblack W, Jain RC, eds. Proceedings of SPIE Conference on Storage and Retrieval for Image and Video Databases III, SPIE Vol. 2420, 144–51.
Joyce RA, Liu B (2006). Temporal segmentation of video using frame and histogram space. IEEE Trans Multimedia 8(1):130–40.
Llach J, Salembier Ph (1999). Analysis of video sequences: Table of contents and index creation. Proceedings of International Workshop on Very Low Bitrate Video (VLBV’99).
Lu HB, Zhang YJ, Yao YR (1999). Robust Gradual Scene Change Detection. Proceedings of IEEE International Conference on Image Processing (ICIP’99), Vol. III: 304–8.
Meng J, Juan Y, Chang SF (1995). Scene Change Detection in a MPEG Compressed Video Sequence. In: Rodriguez AA, Safranek RJ, Delp EJ, eds. Proceedings of IST/SPIE Symposium, Vol. SPIE 2419, 14–25.
Naranjo V, Albiol A (2000). Flicker reduction in old films. Proceedings of International Conference of Image Processing 2000 (ICIP’00), 1300–3.
Naranjo V, Albiol An, Mossi JM, Albiol Al (2004). Morphological λ-reconstruction applied to restoration of blotches in old films. In: Villanueva JJ, ed. Proceedings of the IASTED International Conference on Visualization, Imaging, and Image Processing (VIIP’04), 251–7.
Pardas M, Serra J, Torres L (1992). Connectivity filters for image sequences. In: Gader PD, Dougherty ER, Serra JC, eds. Proceedings of SPIE Symposium on Image Algebra and Morphological Image Processing III, SPIE Vol. 1769:318–29.
Serra J (1982). Image Analysis and Mathematical Morphology, Vol I. London: Academic Press.
Serra J (1988).
Image Analysis and Mathematical Morphology, Vol II: Theoretical Advances. London: Academic Press.
Soille P (1999). Morphological image analysis. Berlin, Heidelberg: Springer-Verlag.
Truong BT, Dorai C, Venkatesh S (2000). Improved fade and dissolve detection for reliable video segmentation. Proceedings of IEEE International Conference on Image Processing (ICIP’00), Vol III: 961–4.
Vincent L (1993). Morphological Grayscale Reconstruction in Image Analysis: Applications and Efficient Algorithms. IEEE Trans Image Process 2(2):176–201.
Yeo B, Liu B (1995). Rapid Scene Analysis on Compressed Video. IEEE Trans Circ Syst Vid 5(6):533–44.
Yoo HW, Ryoo HJ, Jang DS (2006). Gradual shot boundary detection using localized edge blocks. Multimed Tools Appl 28:283–300.