Image Anal Stereol 2014;33:231-234
doi:10.5566/ias.1155
Short Research Communication

FEEDBACK ON A PUBLICLY DISTRIBUTED IMAGE DATABASE: THE MESSIDOR DATABASE

ETIENNE DECENCIERE C,1, XIWEI ZHANG1, GUY CAZUGUEL2,8, BRUNO LAY3, BÉATRICE COCHENER4,8, CAROLINE TRONE5, PHILIPPE GAIN5, JOHN-RICHARD ORDÓNEZ-VARELA1, PASCALE MASSIN6, ALI ERGINAY6, BÉATRICE CHARTON7 AND JEAN-CLAUDE KLEIN1

1MINES ParisTech, PSL Research University, Centre for mathematical morphology, Fontainebleau, France; 2Télécom Bretagne, Institut Mines-Télécom, ITI Department, Brest, France; 3ADCIS, Saint-Contest, France; 4Department of Ophthalmology, Brest University Hospital, Brest, France; 5Department of Ophthalmology, University Hospital of Saint-Etienne, Saint-Etienne, France; 6Service d’ophtalmologie, Hôpital Lariboisière, AP–HP, Paris, France; 7Centre de Ressources Informatiques de Haute-Normandie, France; 8LaTIM - INSERM UMR 1101, SFR ScInBioS (IFR 148), Brest, F-29200 France

e-mail: etienne.decenciere@mines-paristech.fr, xiwei.zhang@mines-paristech.fr, guy.cazuguel@telecom-bretagne.eu, bruno.lay@adcis.net, beatrice.cochener@ophtalmologie-chu29.fr, caroline.trone@univ-st-etienne.fr, philippe.gain@univ-st-etienne.fr, ordonez@cmm.ensmp.fr, p.massin@lrb.aphp.fr, ali.erginay@lrb.aphp.fr, bg@crihan.fr, klein@cmm.ensmp.fr

(Received April 18, 2014; revised June 30, 2014; accepted July 31, 2014)

ABSTRACT

The Messidor database, which contains hundreds of eye fundus images, has been publicly distributed since 2008. It was created by the Messidor project in order to evaluate automatic lesion segmentation and diabetic retinopathy grading methods. Designing, producing and maintaining such a database entails significant costs. By publicly sharing it, one hopes to bring a valuable resource to the public research community. However, the actual interest of the research community, and the benefit it derives, are not easy to quantify. We analyse here the feedback on the Messidor database, after more than 6 years of distribution.
This analysis should apply to other similar research databases.

Keywords: diabetic retinopathy, image database, image processing, Messidor.

INTRODUCTION

Public databases are precious tools for researchers. They provide the data needed to develop and test new methods, and allow for quantitative comparisons between different approaches. The Messidor database is one such database. It was created within the Messidor project to evaluate different lesion segmentation methods for color eye fundus images, in the framework of diabetic retinopathy screening and diagnosis. It has been publicly distributed since 2008.

Other databases are available for researchers working on retinal image analysis. Concerning the segmentation of retinal blood vessels, the DRIVE (Staal et al., 2004) and STARE (Hoover et al., 2000) databases have become de facto standards. DRIONS-DB contains retinal images for optic nerve head segmentation benchmarking (Carmona et al., 2008). Concerning retinal lesions, like microaneurysms and exudates, several databases are available, such as DIARETDB0, DIARETDB1 (Kauppi et al., 2007), HEI-MED (Giancardo et al., 2012), ROC (Niemeijer et al., 2010) and e-ophtha (Decenciere et al., 2013).

THE MESSIDOR DATABASE

The Messidor download page (http://messidor.crihan.fr/download-en.php) gives an appropriate description of the database, which we quote here:

“The 1200 eye fundus color numerical images of the posterior pole for the MESSIDOR database were acquired by 3 ophthalmologic departments using a color video 3CCD camera on a Topcon TRC NW6 non-mydriatic retinograph with a 45 degree field of view. The images were captured using 8 bits per color plane at 1440*960, 2240*1488 or 2304*1536 pixels.

800 images were acquired with pupil dilation (one drop of Tropicamide at 0.5%) and 400 without dilation.

DECENCIERE E ET AL: Messidor feedback

The 1200 images are packaged in 3 sets, one per ophthalmologic department.
Each set is divided into 4 zipped sub sets containing each 100 images in TIFF format and an Excel file with medical diagnoses for each image.”

Note that, as the description indicates, the database contains a medical diagnosis for each image, but no manual annotations on the images, such as lesion contours or positions. This is an important difference with respect to other databases, such as DIARETDB1 and e-ophtha.

The download procedure asks the user to fill in the following fields: E-mail address; First Name; Last Name; Professional Interests; Country and University/Organization. An e-mail is then sent to a member of the Messidor team, who checks the validity of the request, and sends an appropriate link to the submitter. Some requests are not accepted, typically because the fields requested in the download procedure have clearly been filled in incorrectly. Precise statistics on refused requests are not kept, but we estimate that they represent less than 25% of the total number of requests. They are not taken into account in the statistics below.

It should be noted that Messidor database users are asked to acknowledge the Messidor project partners in their related publications.

EXPERIENCE FEEDBACK ON MESSIDOR

Most of the statistics on the distribution of the Messidor database presented in this section are summarized in Fig. 1.

Fig. 1. Evolution of the number of citations, web site visitors and download requests.

People tend to underestimate the support and maintenance costs associated with a publicly distributed database. For instance, given the increasing number of download requests for the Messidor database, processing these requests and related questions requires approximately one hour per week. On top of that, users ask general questions about the database, even though most answers are available on the website. Finally, hosting the database and web pages also takes resources.

Table 1 gives the number of download requests per year between 2011 and 2013, broken down by country.
It can be seen that download requests clearly increase over time: there were approximately three times as many requests in 2013 as in 2011. This increase comes mainly from less developed countries.

Table 1. Download requests for the Messidor database, per year. Countries from which only a few requests originated are not listed individually.

Country            2011   2012   2013
Algeria               7      6      7
Australia             0      2      4
Brazil                3      8      4
Canada                2      1      5
China                10     14     31
Egypt                 0      9     19
France                2      5      5
Germany               3      3      4
India                84    141    280
Indonesia             4     12     33
Iran                  9     19     19
Iraq                  0      0      8
Malaysia              9     12     16
Mexico                3      2      9
Netherlands           3      1      3
Portugal             10      6      4
Spain                 6      5     10
UK                    7      9      7
USA                   7     19      7
Other countries      22     50     72
Total               191    324    547

Another measure of the success of the database can be obtained through access statistics for the corresponding web page (see Table 2). Again, one can see a clear increase in web site access since 2008. The number of visitors is approximately twice as high in 2013 as in 2011. This trend clearly appears in Fig. 1.

Table 2. Web statistics on the access to http://messidor.crihan.fr/.

Year   Visitors   Visits   Pages
2008         54      129     561
2009        144      276    1011
2010        157      301    1761
2011        312      476    2706
2012        402      644    3913
2013        669      968    5595

The link between download requests or web access and the actual contribution to the research domain is not necessarily easy to assess. Indeed, people might download the database or consult the web site for reasons not related to public research. In order to clarify this point, we have looked into the number of citations of the Messidor database in scientific papers. The results are summarized in Table 3. Interestingly, it can be seen that the Messidor database was cited approximately three times as often in 2013 as in 2011, the same increase as for the number of download requests (Fig. 1).

Table 3. Citations per year. Values were obtained through Google Scholar using the keywords “Messidor diabetic retinopathy” on June 19, 2014.
Year    Citations
2008            1
2009            6
2010           16
2011           27
2012           50
2013           89
Total         189

Finally, if we pool the results for two of the most cited journals in the field of biomedical image processing, Medical Image Analysis and IEEE Transactions on Medical Imaging, we find that, since 2008, 47 papers deal with “diabetic retinopathy”, and that 10 of these papers cite the Messidor database.

Note that other databases used in the same domain follow similar trends. DIARETDB1, which has been distributed since 2007, has been cited 295 times (as of June 19, 2014), while HEI-MED, which was established in 2012, has been cited 26 times.

CONCLUSION

The Messidor database has been publicly distributed since 2008. It is of interest mainly for researchers in a relatively specialized domain: retinal image processing, and more specifically computer-assisted diagnosis of diabetic retinopathy. In spite of this, it has gathered a large number of citations. We have also shown that the number of web site visitors, as well as the number of download requests, seem to be well correlated with the number of citations, which provides a simple and convenient method to monitor the success of a database.

The experience gathered by our team on the management of the Messidor database allows us to propose some recommendations for the design of future databases:

– Hosting and managing the database takes resources; this point should be taken into account during the database design, in order to reduce this cost as much as possible.
– The database is typically described on a web page. This description has to be clear and complete, in order to limit the number of requests for additional information (and therefore to reduce the management cost).
– The database managers should ask potential users to acknowledge the database or, better, to cite a relevant paper on the database. This simplifies the evaluation of the success of the database.
– Last but not least, we have shown that an automatic validation procedure seems to be enough to treat download requests.

Moreover, we believe that this study confirms the important role that databases play in medical image processing. In the case of the Messidor database, this is true in spite of the fact that the images contained in the database are progressively getting outdated. Indeed, they were acquired before 2007, and modern fundus cameras offer increasing image resolutions and sensitivities. As far as we know, only two databases have been released in this field after 2010: HEI-MED (for exudate-based macular oedema detection) and e-ophtha (microaneurysm and exudate segmentation). This stresses the importance of new databases that reflect current clinical practice.

ACKNOWLEDGEMENTS

The Messidor project was funded by the French Techno-Vision program. The Messidor database and site have been hosted by Crihan (http://www.crihan.fr) since 2008. Researchers from the Center for Mathematical Morphology, including Adnan Rashid and Estelle Parra-Denis, have offered support to the Messidor database users and processed their download requests for several years.

REFERENCES

Carmona EJ, Rincón M, García-Feijoó J, Martínez-de-la-Casa JM (2008). Identification of the optic nerve head with genetic algorithms. Artif Intell Med 43:243–59.

Decenciere E, Cazuguel G, Zhang X, Thibault G, Klein JC, Meyer F et al. (2013). TeleOphta: Machine learning and image processing methods for teleophthalmology. IRBM 34:196–203.

Giancardo L, Meriaudeau F, Karnowski TP, Li Y, Garg S, Tobin Jr. KW, Chaum E (2012). Exudate-based diabetic macular edema detection in fundus images using publicly available datasets. Med Image Anal 16:216–26.

Hoover A, Kouznetsova V, Goldbaum M (2000). Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE T Med Imaging 19:203–10.
Kauppi T, Kalesnykiene V, Kamarainen JK, Lensu L, Sorri I, Raninen A et al. (2007). The DIARETDB1 diabetic retinopathy database and evaluation protocol. In: Rajpoot NM, Bhalerao AH, eds. Proc Brit Mach Vision Conf. Warwick, Sept 10–13. pp 15.1–10.

Niemeijer M, van Ginneken B, Cree M, Mizutani A, Quellec G, Sanchez C et al. (2010). Retinopathy online challenge: Automatic detection of microaneurysms in digital color fundus photographs. IEEE T Med Imaging 29:185–95.

Staal J, Abramoff M, Niemeijer M, Viergever M, van Ginneken B (2004). Ridge-based vessel segmentation in color images of the retina. IEEE T Med Imaging 23:501–9.
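
SUPPLEMENTARY NOTE: CHECKING THE REPORTED TRENDS

The growth factors and the correlation discussed in the feedback section and in the conclusion can be recomputed from the yearly totals of Tables 1 to 3. The following is only a minimal sketch: the paper does not say how the correlation was assessed, so the use of the Pearson coefficient is an assumption, and the names `growth_ratio` and `pearson` are illustrative. With three to six yearly points per series, the coefficients indicate a common trend rather than a statistically robust relationship.

```python
from math import sqrt

# Yearly totals transcribed from Tables 1, 2 and 3 of the paper.
DOWNLOADS = {2011: 191, 2012: 324, 2013: 547}
VISITORS = {2008: 54, 2009: 144, 2010: 157, 2011: 312, 2012: 402, 2013: 669}
CITATIONS = {2008: 1, 2009: 6, 2010: 16, 2011: 27, 2012: 50, 2013: 89}


def growth_ratio(series, start, end):
    """Return the value in year `end` divided by the value in year `start`."""
    return series[end] / series[start]


def pearson(series_a, series_b):
    """Pearson correlation of two yearly series over the years they share.

    Assumption: the paper does not name a correlation measure; Pearson's
    coefficient is used here purely as an illustration.
    """
    years = sorted(set(series_a) & set(series_b))
    xs = [series_a[year] for year in years]
    ys = [series_b[year] for year in years]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


if __name__ == "__main__":
    for name, series in (("downloads", DOWNLOADS),
                         ("visitors", VISITORS),
                         ("citations", CITATIONS)):
        ratio = growth_ratio(series, 2011, 2013)
        print(f"{name}: 2013/2011 ratio = {ratio:.2f}")
    print(f"visitors vs citations:  r = {pearson(VISITORS, CITATIONS):.3f}")
    print(f"downloads vs citations: r = {pearson(DOWNLOADS, CITATIONS):.3f}")
```

Restricting each correlation to the years the two series share keeps the download figures, which are only reported from 2011 onwards, comparable with the longer citation and visitor series.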