Informatica
An International Journal of Computing and Informatics

Special Issue: Information Society and Intelligent Systems
Guest Editors: Cene Bavec, Matjaž Gams

The Slovene Society Informatika, Ljubljana, Slovenia

Basic info about Informatica and back issues may be FTP'ed from ftp.arnes.si in magazines/informatica ID: anonymous PASSWORD: FTP archive may be also accessed with WWW (worldwide web) clients with URL: http://www2.ijs.si/~mezi/informatica.html

Subscription Information

Informatica (ISSN 0350-5596) is published four times a year in Spring, Summer, Autumn, and Winter (4 issues per year) by the Slovene Society Informatika, Vožarski pot 12, 1000 Ljubljana, Slovenia.

The subscription rate for 1999 (Volume 23) is
- DEM 100 (US$ 70) for institutions,
- DEM 50 (US$ 34) for individuals, and
- DEM 20 (US$ 14) for students
plus the mail charge DEM 10 (US$ 7).

Claims for missing issues will be honored free of charge within six months after the publication date of the issue.

Tech. Support: Borut Žnidar, Kranj, Slovenia.
Lectorship: Fergus F. Smith, AMIDAS d.o.o., Cankarjevo nabrežje 11, Ljubljana, Slovenia.
Printed by Biro M, d.o.o., Žibertova 1, 1000 Ljubljana, Slovenia.

Orders for subscription may be placed by telephone or fax using any major credit card. Please call Mr. R. Murn, Jožef Stefan Institute: Tel (+386) 61 1773 900, Fax (+386) 61 219 385, or send checks or VISA card number or use the bank account number 900-27620-5159/4 Nova Ljubljanska Banka d.d. Slovenia (LB 50101-678-51841 for domestic subscribers only).

According to the opinion of the Ministry for Informing (number 23/216-92 of March 27, 1992), the scientific journal Informatica is a product of informative matter (point 13 of the tariff number 3), for which the tax of traffic amounts to 5%.

Informatica is published in cooperation with the following societies (and contact persons):
Robotics Society of Slovenia (Jadran Lenarčič)
Slovene Society for Pattern Recognition (Franjo Pernuš)
Slovenian Artificial Intelligence Society; Cognitive Science Society (Matjaž Gams)
Slovenian Society of Mathematicians, Physicists and Astronomers (Bojan Mohar)
Automatic Control Society of Slovenia (Borut Zupančič)
Slovenian Association of Technical and Natural Sciences (Janez Peklenik)

Informatica is surveyed by: AI and Robotic Abstracts, AI References, ACM Computing Surveys, ACM Digital Library, Applied Science & Techn. Index, COMPENDEX*PLUS, Computer ASAP, Computer Literature Index, Cur. Cont. & Comp. & Math. Sear., Current Mathematical Publications, Engineering Index, INSPEC, Mathematical Reviews, MathSci, Sociological Abstracts, Uncover, Zentralblatt für Mathematik, Linguistics and Language Behaviour Abstracts, Cybernetica Newsletter

The issuing of the Informatica journal is financially supported by the Ministry for Science and Technology, Slovenska 50, 1000 Ljubljana, Slovenia.
Post tax payed at post 1102 Ljubljana. Slovenia taxe perçue.

Introduction: Information Society and Intelligent Systems

This "Information Society" special issue of Informatica consists of selected papers presented at the "Information society - IS'99" international conference held in Ljubljana, Slovenia. The conference papers were modified and typically extended by around 30%. The multiconference included five independent conferences: Information Society and Intelligent Systems, Education in Information Society, Data Mining and Warehouses, Development and Reengineering of Information Systems, Biology and Cognitive Sciences. All of them are related to information society. Most of the papers deal with information society in general and with information society in Slovenia.

No doubt we are rushing into the information society.
Information, the Internet and intelligent systems characterize this, the most developed era of human civilization. In the information age, everybody has to adapt constantly. Even the mighty Microsoft, the undisputed king of the software-tools generation, is trembling in the new circumstances. While big companies are intensively searching for new approaches and markets, an Internet society of millionaires is emerging in developed countries. These nouveaux riches do not need enormous investments or decades of hard work. What they need is a good idea to be implemented on the Internet. Risks may be high, but incomes are enormous. Internet business is growing by 70-80% each year.

One year ago we had to observe Europe lagging behind the USA. This year, Europe is speeding into the information society at the same rate as America. Finland and some other European countries are among the most developed information countries in the world. Nokia, the Finnish telecommunications and information giant, controls over 50% of the mobile telephone market.

Slovenia faces a kind of stagnation in 1999, in contrast to the year before. The previous successful introduction of the Internet and information society was strongly correlated with the growth of national providers such as ARNES ("Academic and Research Network of Slovenia") for science and education, and CVI ("Government Centre for Informatics") for governmental institutions. Mobile telephony in Slovenia is growing rapidly because of the end of the state monopoly. In other telecommunication areas, the state monopoly remains the greatest obstacle to further progress. This in turn slows the growth of the Internet and information society in Slovenia.

Institutions, politicians and leading managers in Slovenia spend much of their energy fighting for their positions. Instead, they should take a clear position towards the progress of information society and its introduction into private enterprises and governmental institutions.
For example, unlike many Eastern European countries, we have not managed to establish our national ACM chapter or a national Information Society Forum. Next year we are going to try again.

Information society is the only possible path of development if Slovenia wants to catch up with the most highly developed countries in Europe and elsewhere. The process of becoming a member of the European Union is a unique opportunity for Slovenia to decide clearly in favor of a development strategy, and thus to become prepared to enter the third millennium and the information society.

The special issue consists of contributions grouped into general papers and technical applications. The first five papers in the special issue deal with information society in general, and often also with its introduction in Slovenia:

The first paper, by M. Gams, "Information Society and the Intelligent Systems Generation", describes general properties of information society and intelligent systems. It is claimed that information society promotes a primitive network intelligence displayed as connected autonomous intelligent agents. A short history of the introduction of intelligent systems and agents in Slovenia is presented.

The paper "A New Perspective in Comparative Analysis of Information Society Indicators" by P. Sicherl analyses the number of hosts in European countries and, in particular, their relation to Slovenia. Although Slovenia's position is relatively good - above the European average - recent years indicate a certain stagnation in comparison to the most developed European countries like Finland.

V. Vehovar and M. Kovačič, similarly to Sicherl, conclude that Slovenia is generally at the European average with respect to the penetration of information technologies in society. In the paper "Measuring Information Society: Some Methodological Problems" they analyse different technical indicators and attitudinal measures, and not just the number of hosts.

Another modelling and visualisation of information society development is presented in "Modelling of an Information Society in Transition - Slovenia's Position in the CE Countries" by M. Krisper and T. Zrimec. Different techniques are used, including clustering, portfolio analysis, and diagramming. Six Central European countries associated with the European Union are compared. Results show that Slovenia is often the one most closely associated with the EU.

K. H. Iizuka and M. Wada in "Customer Satisfaction of Information System Integration Business in Japan" describe customer satisfaction as one of the major motivating factors in information society. They analyse and propose elements that affect total customer satisfaction.

The next group of papers describes information society together with an additional technical subject, e.g. digital signatures:

The first of these papers is "Digital Signatures Infrastructure" by T. Klobučar and B. Jerman Blažič. They present an infrastructure for the use of digital signatures, its technical aspects, and a short overview of several existing legal frameworks.

E. Jereb and B. Šmitek in "Using an Electronic Book in Distance Education" present the experience they gained by creating an electronic book for distance education, and the opinions of students studying in the new Distance learning centre.

In "Multi-Attribute Decision Modeling: Industrial Applications of DEX" M. Bohanec and V. Rajkovič describe applications of the decision-support system DEX.

M. Ankerst, C. Elsen, M. Ester and H.-P. Kriegel describe algorithms and a system for data visualisation in the "Perception-Based Classification" paper.

Large networks can be reduced to a smaller, comprehensible structure that is easier to interpret, as proposed by V. Batagelj, A. Ferligoj, and P. Doreian in "Generalized Blockmodeling".

Clustering methods are described in "Adapted Methods For Clustering Large Data-Sets Of Mixed Units" by S. Korenjak Černe.

I. Nančovska, L. Todorovski, A. Jeglič and D.
Fefer describe an application of two systems - an equation discovery system and a neural network - in "Equation Discovery System and Neural Networks for Short-term DC Voltage Prediction".

Soft computing and neural network methods are applied in "Adaptive On-line ANN Learning Algorithm and Application to Identification of Non-linear System" as presented by D. Sha and V.B. Bajič.

We hope that the special issue will be a contribution to information-society related efforts and that it will, from an independent, scientific aspect, give answers to a number of practical questions appearing in economy and in policy. Information society is based on the civil society of free-thinking individuals and of academic, economic and other groups offering the possibility to release the intellectual potential which will enlighten societies and countries, including Slovenia.

Cene Bavec, Matjaž Gams

Information Society And The Intelligent Systems Generation

Matjaž Gams
Jožef Stefan Institute, Jamova 39, Ljubljana, Slovenia
Phone: +386 61 1773900; Fax: +386 61 1251038
matjaz.gams@ijs.si, http://www2.ijs.si/~mezi/matjaz.html

Keywords: information age, Internet, intelligent agents, overview paper, viewpoint

Edited by: Cene Bavec
Received: October 12, 1999 Revised: December 8, 1999 Accepted: December 19, 1999

In this overview paper we analyze basic laws and properties of the information society in general, and its introduction in Slovenia. It is claimed that information society initiated the emergence of primitive network intelligence demonstrated through intelligent assistants on the Internet. One of the key reasons for the emergence of the new software generation is the growth of the Internet, and the other is information overload. The introduction of intelligent systems, and particularly intelligent agents, in Slovenia is analyzed. Finally, the EMA employment agent, one of the important intelligent agent applications in Central Europe, is described in detail.

1 Introduction

Information society is often seen as another step in the progress of human civilization. We are moving from a post-industrial society and economy into an information society and an economy dominated by information technology. The changes fundamentally influence the way we work and live.

By 2002, it is predicted that over 80 million Europeans will have access to the Internet, and that 5% of EU gross domestic product will be affected by the use of digital systems. Strong growth is expected in Internet commerce. In 1998, nearly $8 billion in sales were generated in the USA by around 9 million American households. In comparison, online shopping in Europe accounted for only $1.2 billion. Europe was and is significantly lagging behind the USA. But in 1999, Europe progressed much faster. Forecasts predict that 500,000 e-commerce-related jobs will be created within the next few years. Each year, Web sales to consumers are expected to grow by 70%. This growth is expected to remain exponential for the next couple of years.

The technological basis of the information age is the Internet with its constant growth (Etzioni 1996; http://www.cio.com/WebMaster/metcallel.html). Another important improvement is the emergence of network intelligence, which represents a natural step in the computational evolution heading towards more helpful, adaptive and creative programs. These programs are essential for humans because without intelligent assistants we cannot cope with information overload. At the same time, the pace of progress is so quick and so unpredictable in its details that we cannot determine the future in any detail. What we can do is recognize the major information society laws (Lewis 1998; Metcalfe 1997) as described in Section 2. These laws are related to electronics, informatics, and the Internet. In Sections 3 and 4 we analyze the Slovenian introduction of information society and intelligent systems.
The first major application of intelligent agents in Slovenia, the EMA employment agent, is described in Section 5.

2 Global Information Society

Information society is by definition global; however, its implementation depends to a large extent on the GNP of a particular country. Therefore, while the state and the pace of progress depend on each country, the basic information society laws stay valid for the global world.

Moore's Law (http://www.whatis.com/mooresla.htm) describes a constant trend in chip properties. The chip capacity doubles in a time span of 1.5 to 2 years, depending on the particular type of chip performance. The formula is:

$$\mathrm{Performance(new)} = \mathrm{Performance(old)} \cdot 1.5^{\mathrm{time}}$$

where "time" is the number of years. The basic property of the law - the constant exponential growth - has remained unchanged over several decades (Moore 1975, Hamilton 1999).

Metcalfe's Law (http://www.cuug.ab.ca/~branderr/csce/metcalfe.html) says that the value of a network is proportional to the square of the number of nodes connected by the network:

$$\mathrm{Value} = K \cdot \mathrm{nodes}^{2}$$

In other words, the value of a net grows with the square of its size. "K" is a constant.

Sidgemore's Law determines the growth of traffic over nets. The law says that the traffic doubles every three months:

$$\mathrm{Traffic(new)} = \mathrm{Traffic(old)} \cdot 2^{4 \cdot \mathrm{time}}$$

Andreesen's Law says that the cost of bandwidth is dropping exponentially and inversely proportional to Sidgemore's law:

$$\mathrm{Cost(new)} = \mathrm{Cost(old)} \cdot (1/2)^{4 \cdot \mathrm{time}}$$

Lewis/Flemig's Law describes the network type of capitalism. It denotes a nearly "friction-free economy" in the sense that there is a small marginal cost and a huge shelf space. The exponential growth indicates that a genuinely new market idea will be rewarded with huge profits. But in addition to the quick rise, an exponential decline is expected when new, more advanced systems appear on the market. The equation describing the law is:

$$\mathrm{MarketShare(time)} = \frac{1}{1 + K \cdot B \cdot \mathrm{time}}$$

where "K" is a constant.
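
As a rough numerical illustration of the four growth laws stated in this section (Moore, Metcalfe, Sidgemore, Andreesen), the following minimal Python sketch evaluates each formula. It assumes that "time" is measured in years and that the constant K in Metcalfe's Law defaults to 1; the function names are hypothetical and chosen only for this illustration. Lewis/Flemig's Law is left out because the text gives no values for its parameters.

```python
def moore_performance(old: float, years: float) -> float:
    """Moore's Law as stated here: performance grows by a factor of 1.5 per year."""
    return old * 1.5 ** years


def metcalfe_value(nodes: int, k: float = 1.0) -> float:
    """Metcalfe's Law: network value is K times the square of the node count.

    The default k = 1.0 is an assumption made only for illustration.
    """
    return k * nodes ** 2


def sidgemore_traffic(old: float, years: float) -> float:
    """Sidgemore's Law: traffic doubles every three months, i.e. 4 doublings per year."""
    return old * 2 ** (4 * years)


def andreesen_cost(old: float, years: float) -> float:
    """Andreesen's Law: bandwidth cost halves every three months (inverse of Sidgemore)."""
    return old * 0.5 ** (4 * years)


if __name__ == "__main__":
    # Print each law after 1, 2 and 5 years, starting from a unit baseline.
    for years in (1, 2, 5):
        print(
            f"after {years} year(s): "
            f"performance x{moore_performance(1.0, years):.2f}, "
            f"traffic x{sidgemore_traffic(1.0, years):.0f}, "
            f"cost x{andreesen_cost(1.0, years):.6f}"
        )
    # Doubling the number of nodes quadruples the value of the network.
    print(metcalfe_value(2000) / metcalfe_value(1000))
```

Under these assumptions the product of the Sidgemore and Andreesen factors stays constant over time, which is one way to read the statement that the cost of bandwidth falls inversely to the growth of traffic.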
In the Lewis/Flemig equation, the "B" parameter denotes the learning parameter.

Rules of thumb:

Put on the Internet all your information and information activities. This rule means that it is cheaper to put information and information activities on the Web sooner rather than later (Petrie et al. 1998). Not only is it cheaper and more cost-effective than the standard way, it is also the only way to keep up with the competition.

The cyber-world doubles fortune. Besides the material world we actually live in, the cyber-copy of our world matures. Since the introduction of the cyber-world in effect tends to double activities and money in circulation, stories of rich youngsters or a rich Internet population in the developed countries are well grounded in a general trend. It also guarantees further growth of the developed world despite saturation in other human activities, which are related to the classical material world. Another important trend is that our information systems on computers are becoming more and more a cyber-copy of ourselves.

A side-effect of information society is information overload. In the infosphere we have to cope with more and more information from one month to another just to stay competitive. As a consequence, information overload causes the disappearance of free time, brain overload, and a decline in classical human social life.

Information society demands intensive information knowledge for successful leadership. It is commonly accepted that there is a huge gap between the existing knowledge of top executives, politicians and other leaders, and the knowledge desired for successful management and leadership (O'Leary 1997). The gap is greater in Europe than in the USA, and greater in Slovenia than in the EU.

Information society belongs to all of us. In a democratic society there are several institutions cooperating in the process of governing and creating strategic directions. Among the essential institutions of democratic societies are civil institutions (Borenstein 1998).
Information society is by definition a civil society, although governmental institutions typically implement it. Examples are Clinton's advocacy of the information highway and several governmental information society projects in Europe, e.g. Bangemann's reports (http://europa.eu.int/comm/dg03/speechba.htm). In countries like Slovenia, lacking the richness of civil society structures developed over decades of Western democracy, the introduction of the Internet is a major inhibitor of faster progress.

The Internet is the most democratic and free medium in the world. This was legally established with the American Supreme Court decisions about pornography and free speech on the Internet. In the simplest way it can be observed in the fact that pornography (inside "reasonable" limits) is allowed on the Internet and not on TV. It also means that anarchists and even criminal organizations can exploit this freedom; freedom of speech implies that sooner or later we are going to hear things we don't like. Whatever the case, the Internet is the most democratic medium humans have ever had. While countries differ in their social and economic order, the Internet enhances democracy and civil society regardless of their previous level.

The Internet and information society are our hope for the future. There have been many technological innovations that spurred human progress. For example, we speak of the "iron age" historic period. These days we speak of the "information age" or of the "information society" age. Not only does new technology change the way we live in the technical sense; the changes are also essential in the way society functions (Negroponte 1998).

At the same time, the world of computer systems we use is rapidly changing due to the massive introduction of information activities. The trend is towards more user-friendly intelligent systems.

3 Slovenia and IS

The introduction of information society and intelligent systems in Slovenia was accompanied by problems of all kinds. Slovenia is one of the independent countries that emerged from the former Yugoslavia. In those turbulent transitions, funds for science continued to decrease for at least 5 years. Bearing in mind that there are only a couple of computer R&D institutions in the country with a sufficient critical mass of educated staff, the prospects for the introduction of information society looked bleak.

Surprisingly, the progress in turbulent times was faster than anticipated. There might be at least two reasons: first, Slovenia lacked state institutions, and by not being burdened by old institutions, new ones could be more up to date. Second, due to the inevitable conflict in the independence days, the Internet played a substantial role in helping to inform and thus motivate public opinion in the West.

The introduction of information society started with the introduction of the Internet at the Jožef Stefan Institute (http://www.ijs.si) as a result of a long-term cooperation with CERN, the European Laboratory for Particle Physics (www.cern.ch/CERN/Technology/index.html). Soon, the use spread through universities and schools. The major Internet provider was (and still is) ARNES (http://www.arnes.si/).

After the transformation into a market society, the economic trends turned positive, but several state firms still faced substantial problems. Foreign investors often bought firms and soon afterwards transformed research departments into production units. As a result, 45% of researchers and developers in the Slovenian R&D sector moved into other sectors. Universities and research institutions were not as heavily stressed by transition problems. University staff even increased, while, for example, the major research institute, the Jožef Stefan Institute, declined from over 900 employees to 700.

The initial fast growth of the number of Internet hosts slowed down after the majority of governmental institutions got connected. Private institutions and individuals followed cautiously. Another important indicator is the number of people earning money through the Internet. While Bill Gates and the CEOs of major computer companies dominate the list of the world's richest, Slovenian computer professionals seldom appear on the list of the nation's richest. While in the developed countries an Internet generation of rich youngsters is emerging, this phenomenon is substantially less present in Slovenia. Unlike in the developed countries, Slovenian political and, partially, business leaders often belong to the barely computer-literate generation. As a consequence, the real progress of information society is not as fast as it could be.

Overall, the unemployment rate in Slovenia has grown substantially in the last 10 years, especially in the first transition years. In recent years unemployment in Slovenia has been stable - with 125,000 unemployed and 2,000,000 inhabitants the unemployment rate is close to the European average.

On the other hand, Slovenia is currently one of the most promising candidates for joining the European Union, based on political stability and economic parameters. It borders Italy, Austria, Hungary and Croatia (see Figure 1). GNP per capita is roughly US$ 10,000. The number of computer connections to the Internet per capita is close to the average in Western Europe. In recent years, all economic trends tend to be positive (with the exception of 7% inflation and growing debt). Even science and research, while not as supported and valued as they should be, are facing better times.

Figure 1: Slovenia's position in Europe.

Today, practically all schools from universities to elementary schools, all research and development institutions, and a large majority of other governmental institutions are connected to the Internet.
Several R&D conferences are more or less strongly related to the introduction of IS in Slovenia: "Information society" (http://ai.ijs.si/i.s/indexa99.html), "Electronic commerce", Infos, BRK, Informatika, etc.

In 1999, the major improvement has been in the mobile phone area. The number of mobile phones is expected to soon exceed the number of classical telephones.

4 Intelligent Systems

Intelligent systems are advanced, user-friendly computer systems designed to work in real-life environments (Goonatilake, Treleaven 1995; Bielawski, Lewland 1991). The Internet is the medium enabling substantial advantages for intelligent systems (Etzioni 1996; Etzioni 1997). Intelligent systems use a wide variety of artificial intelligence techniques, typically implemented on top of classical modules (Bratko, Muggleton 1995): rule-based systems, production systems, expert systems, fuzzy logic systems, neural networks, memory-based reasoning. Advanced systems often combine various methods into one hybrid or integrated system (Gams et al., 1996). The emphasis of intelligent systems design is on the combination of AI methods and engineering techniques, enabling the construction of systems that perform practical tasks better than classical systems.

Intelligent agents (Bradshaw 1997; Mueller 1996) are a special branch of intelligent systems, capable of learning, adapting to the environment, to each specific user, and to each specific situation as much as possible.

Informatica 23 (1999) 449-454, M. Gams

According to Pattie Maes (Maes et al. 1999), intelligent agents are an important step ahead in humanising computers. Intelligent agents represent personal assistants collaborating with the user in the same environment (Maes, 1994; Minsky, 1987; 1991). Intelligent agents are basically intelligent interfaces providing specific utilities of the system, while the core of the system is typically an Internet-based query or database system.
Unlike passive query languages, agents and humans both initiate communications, monitor events and perform tasks. The essential properties of agents are autonomy and sociability (Bradshaw 1997; Jennings, Wooldridge 1995; 1997; Etzioni, Weld 1995).

Intelligent systems have been developed in a couple of software centers in Slovenia, among others in the Department of Intelligent Systems (headed by Prof. Ivan Bratko) at the Jožef Stefan Institute. One example is an intelligent system for controlling the quality of the Sendzimir rolling mill emulsion (Gams et al. 1996). Practically all national production of rolled steel is manufactured on this machine. The application represents one of the major national intelligent systems in regular industrial use. In addition to this application, the department has in the last ten years designed around 10 intelligent systems now available on the Internet (http://turing.ijs.si/Ales/katalog-a/KATALOG-A.html).

5 The EMA Employment Agent

In 1993, the first agent was designed in Slovenia - IOI, an Intelligent Operating Interface (Hribovšek 1994; Gams, Hribovšek 1996). The basic task of IOI was correcting typing errors and providing help for users communicating with the VAX/VMS operating system. IOI is an intelligent agent able to learn, adapt, and communicate in a relatively complex environment with human users. Its most important property is self-learning through observing the user performing tasks in the environment. Later, IOI uses the knowledge accumulated through user experience to advise new users. The system thus performs a task similar to MS Office Assistant, with the essential difference that knowledge in Assistant is coded in advance while IOI constructs most of its knowledge through user observation. IOI is implemented as a 2000-line program in Pascal, with parts of it written in the VAX/VMS command language.
The IOI agent was implemented as a research prototype only; however, its flexibility and adaptability as a personal intelligent agent have shown reasonable improvements over classical systems. The most positive properties, as observed by users in the testing period, are: IOI does not demand specific knowledge, is easy to learn and use, and is very transparent. These favorable properties are typical intelligent-agent properties.

Two other major intelligent agents developed in Slovenia are Personal WebWatcher (Mladenič 1998) and EMA (Gams et al. 1998).

The EMA project (see Figure 2) started seven years ago as an R&D project "An Integrated Information System for Employment in Slovenia" to provide help regarding unemployment problems. The project was partly funded by the Slovenian Ministry of Science and Technology and partly by the Employment Service of Slovenia (ESS). The system consists of two parts; one is applied at the Employment Service of Slovenia (http://www.ess.gov.si/English/elementi-okviriev/F-Introduction.htm), where one gets basic information about employment activities in Slovenia, about ESS, and about interesting employment functions. The top part of the system is the EMA agent. For the last three years, the system was further developed as part of the INCO-Copernicus Project 960154, Cooperative Research in Information Infrastructure, CRII (http://www-ai.ijs.si/~ema/proj.html). The intelligent system/agent EMA with a natural language interface consists of several modules.

Figure 2: The first version of the EMA employment agent was among the first in the world to offer a substantial amount of nationally available jobs on the Internet.

A user has to identify with a username (note that here security is not very relevant) or through a favorite/bookmark list. EMA has four basic functional modules: storing patterns and ordering e-mail notifications regarding vacant jobs; doing the same regarding available workers; storing and observing interesting Web sites chosen by users; and matching jobs and workers. EMA is a "classical" agent providing user-friendly information upon demand or when it notices relevant information for a particular user. The system is a 15,000-line program written mainly in C, partially in other languages. Together with text and data it occupies 30 MB on disk.

EMA receives data as limited Slovenian text (with the exception of bulletin boards with language-independent free input) and translates it into English. The translation is based on a dictionary of word combinations of up to four words previously observed in the employment data. New combinations are in the worst case translated word by word and stored for later review by humans. Stored combinations are sorted by frequency and translated by humans if reasonable. In addition, the translation system consults the morphology dictionary to capture different forms of the same words. Finally, a spell-checker submodule corrects spelling errors. The translation is not yet at the level achieved by systems translating between larger European languages; however, it is sufficiently good to enable understanding, since the syntax is quite limited.

In the next stage, the text is transformed into appropriate computer-readable forms and HTML forms as outputs. Two speech modules transform the data into speech. The English speech system is based on the Microsoft agents. We have designed our own Slovenian speech module (Sef et al. 1998).

The EMA agent was and still is among the most successful applications of intelligent agents in Slovenia.
In the first year of its implementation, our country was the third in Europe to offer national employment information through the Internet. At that time, we were the first country in the world to provide over 90% of all nationally available jobs on the Internet. 6 Conclusion Without any doubt human civilization further evolves into information society. We are able to establish certain laws and rules of this development while specific details and further progress remain enigma to all of us. Intelligent systems and agents through the Internet and partially through PCs form the new software generation, the intelligent systems generation. While this generation still lacks true human intelligence and consciousness, the primitive network intelligence emerges consisting of intelligent assistants capable of autonomous and social activities (Munindar 1997; Mylopoulos 1997). In Slovenia, one of the countries wishing to join European Union, information society is perceived as a global phenomenon and as a major technological field which can bring us fortune or stagnation. The essential question is whether the existing or at least the forthcoming generation of political and business leaders will fully embrace the information-age rules of the game. Acknowledgement: Financial support for the EMA project was provided by an international project INCO-Copernicus 960154, Cooperative Research in Information Infrastructure, CRII; by the Ministry of science and technology in Slovenia; and by ESS. We would like to thank the CEO of the Employment Service of Slovenia, Mr. J. Glazer. 7 References [1] L. Bielawski, R. Lewland, Intelligent Systems Design; Integrating Expert systems. Hypermedia and Database Technologies, Wiley, 1991. [2] N. S. Borenstein, "Whose Net is it Anyway", Communications of the ACM, April 1998, pp. 1921. [3] M. Bradshaw (ed.). Software Agents, AAAI Press/The MIT Press, 1997. [4] I. Bratko, S. 
Muggleton, "Applications of Inductive Logic Programming", Communications of the ACM, Vol. 38, No. 11, 1995, pp. 65-70.
[5] O. Etzioni, D.S. Weld, "Intelligent Agents on the Internet: Fact, Fiction, and Forecast", IEEE EXPERT, Intelligent Systems & Their Applications, Vol. 10, No. 4, 1995, pp. 44-49.
[6] O. Etzioni, "The WWW: Quagmire or Gold Mine?", Communications of the ACM, Vol. 39, No. 11, 1996, pp. 65-68.
[7] O. Etzioni, "Moving Up the Information Food Chain", AI Magazine, Vol. 18, No. 2, 1997, pp. 11-18.
[8] M. Gams, B. Hribovšek, "Intelligent-Personal-Agent Interface for Operating Systems", Applied Artificial Intelligence, Vol. 10, 1996, pp. 353-383.
[9] M. Gams, M. Drobnič, N. Karba, "Average-Case Improvements when Integrating ML and KA", Applied Intelligence 6, No. 2, 1996, pp. 87-99.
[10] M. Gams, A. Karalič, M. Drobnič, V. Križman, "EMA - An Intelligent Employment Agent", Proc. of the Fourth World Congress on Expert Systems, Mexico, 1998, pp. 57-64.
[11] S. Goonatilake, P. Treleaven (eds.), Intelligent Systems for Finance and Business, Wiley, 1995.
[12] S. Hamilton, "Taking Moore's Law into the Next Century", IEEE Computer, January 1999, pp. 43-48.
[13] B. Hribovšek, Intelligent Interface for VAX/VMS, M.Sc. Thesis (in Slovene).
[14] N.R. Jennings, M. Wooldridge, "Intelligent Agents and Multi-Agent Systems", Applied Artificial Intelligence, An International Journal, Vol. 9, 1995, pp. 357-369.
[15] N.R. Jennings, M. Wooldridge, Agent Technology, Springer, 1997.
[16] M. Lewis, "Designing for Human-Agent Interaction", AI Magazine, Vol. 19, No. 2, 1998, pp. 67-78.
[17] P. Maes, "Agents that Reduce Work and Information Overload", Communications of the ACM, 37, 1994, pp. 31-40.
[18] P. Maes, R.H. Guttman, A. G. Moukas, "Agents That Buy and Sell", Communications of the ACM, Vol. 42, No. 3, 1999, pp. 81-91.
[19] B. Metcalfe, "What's Wrong with the Internet", IEEE Internet Computing, 1997, pp. 6-8.
[20] M.
Minsky, The Society of Mind, Simon and Schuster, New York, 1987.
[21] M. Minsky, "Society of mind: a response to four reviews", Artificial Intelligence 48, 1991, pp. 371-396.
[22] D. Mladenič, "Turning Yahoo into an Automatic Web-Page Classifier", in Proceedings of the 13th European Conference on Artificial Intelligence ECAI'98, 1998, pp. 473-474.
[23] G. E. Moore, "Progress in digital integrated electronics", Technical Digest of 1975 International Electronic Devices Meeting 11, 1975.
[24] J.P. Mueller, The Design of Intelligent Agents, Springer, 1996.
[25] P.S. Munindar, "Agent Communication Languages: Rethink the Principles", Computer, December 1997, pp. 40-47.
[26] J. Mylopoulos, "Cooperative Information Systems", IEEE EXPERT, Intelligent Systems & Their Applications, Vol. 12, No. 5, 1997, pp. 28-30.
[27] N. Negroponte, "A Wired Worldview", EU RTD info, March 1998, pp. 28-30.
[28] D. E. O'Leary, "A Lack of Knowledge at the Top", IEEE Expert, November 1997, p. 2.
[29] C. J. Petrie, A. M. Rutkovski, M. Zacks etc., "Dimensioning the Internet", IEEE Internet Computing, April 1998, pp. 8-9.
[30] T. Sef, A. Dobnikar and M. Gams, "Improvements in Slovene Text-to-Speech Synthesis", Proceedings of the 5th International Conference on Spoken Language Processing (ICSLP'98), pp. 2027-2030, 1998, Sydney.

A New Perspective in Comparative Analysis of Information Society Indicators

Pavle Sicherl
Law School, University of Ljubljana and SICENTER, Brajnikova 19, 1000 Ljubljana, Slovenia
Tel: +386 61 1501510; fax: +386 61 1501514
Pavle.Sicherl@sicenter.si

Keywords: time distance, S-distance, two-dimensional comparison in time and indicator space

Edited by: Cene Bavec and Matjaž Gams
Received: October 8, 1999 Revised: November 23, 1999 Accepted: December 12, 1999

The analysis of information society indicators can be enriched by supplying a new view of data that can provide new insight from existing data.
The slowdown of growth of Internet hosts per 10000 inhabitants in Slovenia after mid-1997 increased the time lag of Slovenia behind leading Finland from 3 years at the end of 1996 to nearly 5 years by August 1999. Time distance methodology is used as a presentation and communication tool to raise awareness of the problem and its consequences in simple understandable terms and to signal the need for an in-depth analysis and action.

1 Introduction

Problem: In Slovenia, after a very high rate of growth in the indicator Internet hosts per 10000 inhabitants until mid-1997, growth slowed down substantially. One can describe the facts in various ways and with various measures.

Objective: To make the government, other agents and the general public aware of these developments and to signal the need for immediate action to correct them.

Method: Time distance will be used as a presentation and communication tool to raise awareness of the problem and its consequences in simple, widely understandable terms. Since this method can be a useful addition to existing methods of analysing differences between compared units in many fields, a further illustration is provided for the case when the benchmark for comparison is the average value of the analysed indicator for EU15.

2 Methodology: time distance concept and statistical measure S-distance

The time perspective, which no doubt exists in human perception when comparing different situations, is systematically introduced both as a concept and as a quantifiable measure. Since events are dated in time, the notion of time distance has always existed as a "hidden" dimension in time series comparisons, regressions, models, forecasting and monitoring. In order to systematise and formalise the approach and define an appropriate statistical measure for operational use, amendments to the present state of the art are needed on two levels: conceptual and analytical. First, a broader theoretical framework is required.
The conventional approach does not realise that, in addition to the disparity (difference, distance) in the indicator space at a given point in time, there exists in principle a theoretically equally universal disparity (difference, distance) in time when a certain level of the indicator is attained by the two compared units. Second, a statistical measure, S-distance, has been defined to suggest how the broader concept and reference framework can be measured in operational terms. The aim is to provide new insights from existing data due to an added dimension of analysis and thus to complement conventional statistical measures. Time distance in general means the difference in time when two events occurred. We define a special category of time distance, which is related to the level of the analysed indicator. The suggested statistical measure S-distance measures the distance (proximity) in time between the points in time when the two compared series reach a specified level of the indicator X. The observed distance in time (the number of years, quarters, months, days, minutes, etc.) is used as a dynamic (temporal) measure of disparity between the two series in the same way as the observed difference (absolute or relative) at a given point in time is used as a static measure of disparity [1,2,3]. For a given level $X_L$, with $X_L = X_i(t_i) = X_j(t_j)$, the S-distance, i.e. the time separating unit (i) and unit (j) for the level $X_L$, will be written as

$S_{ij}(X_L) = \Delta T(X_L) = t_i(X_L) - t_j(X_L)$

where T is determined by $X_L$. In special cases T can be a function of the level of the indicator $X_L$, while in general it can be expected to take more values when the same level is attained at more points in time, i.e. it is a vector which can, in addition to the level $X_L$, be related to time.
Three subscripts are needed to indicate the specific value of S-distance: (1 and 2) between which two units the time distance is measured and (3) for which level of the indicator (in the same way as the time subscript is used to identify the static measures). In the general case a fourth subscript would also be necessary to indicate to which point in time it is related ($T_1, T_2, \ldots, T_n$). The sign of the time distance comparing two units is important to distinguish whether it is a time lead (-) or time lag (+) (in a statistical sense and not as a functional relationship): $S_{ij}(X_L) = -S_{ji}(X_L)$. Using the comparison between two units it can be shown that the generic concept of time distance goes together very naturally with the existing concepts of static disparity at a given point in time and the notion of the growth rate over time. Table 1 provides a schematic example of such comparisons for a given indicator. Row one is the most frequently used type of comparative analysis: levels of the indicator at a given point in time are compared. In such a comparison two points are used, and for each of them we have three elements of information: (i) the respective level of the indicator, (ii) to which unit it belongs, and (iii) at what time it happened. In this case unit as well as time (since it is constant for static comparison) serve as identifiers, while the levels are used to calculate the static difference. Row two compares two levels of the indicator at two points in time, separately for each unit, which means that one calculation indicated in row two refers to unit 1, and another to unit 2. The simplest example would be the growth rate for unit 1 and the growth rate for unit 2. Here the unit is the identifier, while the numerical values of levels and time are used in calculating this measure. These two steps are standard procedures.
The first one represents the static type of comparison; the second one measures the dynamic properties of the indicator for each unit separately. Following the same logic, for the novel statistical measure S-distance in row three the level is the same, level and unit serve as identifiers, and time is used for calculating time distance. It is remarkable that the notion of time distance, which can in principle be developed from the same information used in steps one and two, has not been developed theoretically and as a standard statistical measure.

         TIME   UNIT   LEVEL   Measure
TIME     same   2      2       static difference
UNIT     2      same   2       change over time
LEVEL    2      2      same    time distance

Table 1. Points of comparison for static difference, change over time and time distance (two units)

While there may be different problems involved in the calculation of these three types of measures, in terms of availability and comparability of data, in principle these three types of measure can be integrated into a formally consistent analytical framework. There are alternative ways of doing this, following from the distinction between backward-looking (ex post) and forward-looking (ex ante) time distances. They relate to different periods, past and future: the first belongs to the domain of statistical measures based on known facts, while the second is important for describing the time distance outcomes of alternative policy scenarios for the future. Looking backwards, ex post or historical time distance indicates how many years ago the more developed unit experienced a specified level of the indicator of the less developed unit at a given point in time [3]. A very important relationship shows that, ceteris paribus, time distance is a decreasing function of the magnitude of the growth rate of the indicator. This conclusion shows that the S-distance as a dynamic (temporal) measure of disparity offers a perspective which may be quite distinct from that provided by static measures.
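The inverse relation between time distance and growth rate can be made explicit in a stylised special case. The following is only a sketch under assumptions that are not part of the text above: both units are assumed to grow at the same constant continuous rate $g > 0$, and $\rho > 1$ denotes the (hypothetical) ratio of the leader's level $X_j$ to the follower's level $X_i$ at time $t$.

\[
X_j(t) = \rho\, X_i(t), \qquad X_i(t+S) = X_i(t)\, e^{gS} = X_j(t)
\;\Longrightarrow\; S = \frac{\ln \rho}{g}.
\]

Under these assumptions, a given static ratio $\rho$ translates into a shorter time distance the faster the indicator grows: for $\rho = 2$, a growth rate $g = 0.10$ gives $S \approx 6.9$ years, while $g = 0.50$ gives $S \approx 1.4$ years.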
This new view of the information uses the level(s) of the variables as identifiers and time as the focus of comparison and numeraire. The approach and the broad range of its possible applications are much more complex and general, but time distance is the priority choice because of its intuitive nature and the importance of the time dimension in the semantics of describing various situations in real life and forming our perceptions about them. In this paper only the application to comparison of one indicator between several units will be used. However, the approach has been generalised to complement conventional measures in time series comparisons, regressions, models, forecasting and monitoring, and extended to the analysis of single time series [3] and to variables other than time [4]. In all such applications it can provide new insights from existing data due to an added dimension of analysis.

3 Data and results for Slovenia, EU15 countries and candidate countries

The data on Internet hosts per 10000 inhabitants relate to the period from the end of 1993 to August 1999 [5,6,7]. At present, the measurement and empirical analysis of information society indicators is beset with problems. It is stated that the single most important obstacle to effective data collection is the lack of standardised definitions of information technology and the exclusion of important costs associated with its use, like personnel and training expenses. A further weakness is the relative absence of systematic information on how information technology is actually being used [8]. In addition to these general obstacles there may also be some specific reasons why the slowdown of the increase in Internet hosts per capita in Slovenia in the last two years, as shown in RIPE data, may have been exaggerated [9]. We shall proceed by analysing the available RIPE data, but appropriate caution is needed about possible inaccuracy in the available data.
Comparative analysis of the differences among countries can be presented in two dimensions. The conventional static differences at a given point in time are in this paper complemented by the time distance dimension. Time distance in Table 3 is for practical reasons calculated for the levels of the indicator for those countries which are behind Slovenia, and for the level of Slovenia for the countries which are ahead of Slovenia.

         1993    1994    1995    1996    1997    1998   Aug. 1999
LUX        lA    12.5    46.0    85.2   113.4   182.3   218.8
DAN      16.1    35.4    96.9   203.3   321.1   571.1   608.3
BEL       7.0    17.3    30.2    64.0   104.8   202.4   307.5
AUT      18.9    34.0    66.3   110.2   134.4   214.0   235.7
DEU      13.7    24.4    58.0    84.4   137.7   177.0   186.3
FRA       9.3    14.4    26.0    40.6    60.7    84.2   106.4
NED      28.6    55.8   110.8   173.4   249.2   395.1   481.9
ITA       2.9     5.0    13.1    25.8    44.2    64.5    96.4
SVE      47.0    84.8   164.1   269.0   394.0   429.9   569.5
UK       19.1    38.7    75.1   122.4   167.3   247.4   272.2
FIN      65.2   133.9   416.7   612.1   945.8   902.6   930.5
IRL       6.5    15.3    37.3    74.2   109.3   155.6   181.7
ESP       3.6     7.0    13.1    28.8    49.9    78.2    94.2
PRT       3.6     5.1    11.9    23.6    42.7    56.3    67.4
GRE       1.7     3.4     7.4    16.0    26.7    47.1    63.5
SLO       3.1     8.2    28.3    69.5    98.2   115.3   116.3
CZE       4.3    10.1    21.1    39.6    55.2    83.6   101.8
SVK       0.7     2.6     5.6    14.8    27.0    41.0    48.3
HUN       3.0     6.6    15.4    29.2    66.7    87.8   106.2
POL       1.3     2.8     6.0    13.7    22.9    32.4    42.9
EST       2.9     7.7    24.1    54.3   108.4   151.2   180.8
ROM       0.0     0.2     0.8     3.5     6.0     9.9    14.1
LIT         -     0.3     1.2     4.7    10.9    26.0    32.7
LAT       0.2     2.0     5.2    23.1    28.6    54.4    63.7
BG        0.0     0.2     1.3     4.0     8.2    12.2    18.3
EU15     12.2    23.6    50.5    78.6   124.3   171.1   199.3

Table 2: Data on Internet host density per 10000 inhabitants
Source: International Telecommunication Union Database, Geneva 1998 for 1993-1997 [5]; RIPE [6] in RIS [7] for 1998 and 1999.

In Tables 2 and 3 the countries are sorted by the level of GDP per capita (at purchasing power standards) in 1997. Obviously, Internet hosts per capita are not firmly correlated with GDP per capita.
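The time distances between a country and Slovenia can be approximated from the annual figures in Table 2. The sketch below is one possible reading of the procedure, not the author's own code: it assumes end-of-year observations, linear interpolation between them, and uses only the 1993-1997 values for three countries. The names `YEARS`, `HOSTS`, `time_reached` and `s_distance` are hypothetical helpers introduced for illustration.

```python
from typing import Optional, Sequence

# End-of-year observations of Internet hosts per 10000 inhabitants (Table 2, 1993-1997).
YEARS = [1993.0, 1994.0, 1995.0, 1996.0, 1997.0]
HOSTS = {
    "SLO": [3.1, 8.2, 28.3, 69.5, 98.2],
    "FIN": [65.2, 133.9, 416.7, 612.1, 945.8],
    "GRE": [1.7, 3.4, 7.4, 16.0, 26.7],
}


def time_reached(times: Sequence[float], values: Sequence[float],
                 level: float) -> Optional[float]:
    """First time the series reaches `level`, by linear interpolation.

    Returns None when the level lies outside the observed range
    (assumption: this corresponds to a missing entry in the time-distance table).
    """
    for k in range(len(times) - 1):
        x0, x1 = values[k], values[k + 1]
        if x0 <= level <= x1:
            if x1 == x0:
                return times[k]
            return times[k] + (level - x0) * (times[k + 1] - times[k]) / (x1 - x0)
    return None


def s_distance(unit: str, reference: str, year: float) -> Optional[float]:
    """S-distance of `unit` relative to `reference`: negative = lead, positive = lag."""
    k = YEARS.index(year)
    x_unit, x_ref = HOSTS[unit][k], HOSTS[reference][k]
    if x_unit <= x_ref:
        # Unit is behind: when did the reference attain the unit's current level?
        t_ref = time_reached(YEARS, HOSTS[reference], x_unit)
        return None if t_ref is None else year - t_ref
    # Unit is ahead: when did the unit attain the reference's current level?
    t_unit = time_reached(YEARS, HOSTS[unit], x_ref)
    return None if t_unit is None else t_unit - year


if __name__ == "__main__":
    for country in ("FIN", "GRE"):
        print(country, round(s_distance(country, "SLO", 1996.0), 1))
```

With these assumptions the sketch gives about -2.9 years for Finland and 1.6 years for Greece at the end of 1996. Later columns, which rest on RIPE data and a mid-year observation, may need a different treatment of time than this sketch uses.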
In 1996 Slovenia occupied a comfortable comparative position in terms of Internet hosts per capita: it was lagging less than 3 years behind Finland as the leading country, and was ahead of several EU countries, i.e. Belgium, France, Italy, Spain, Portugal and Greece. The last four countries mentioned had substantially lower values than Slovenia. The slowdown of the growth rate in this indicator for Slovenia after mid-1997 led to a quick deterioration of the comparative situation of Slovenia. By August 1999 the lag behind Finland had increased to nearly 5 years. For indicators with high rates of growth the situation can change very quickly, as distinct from fields where the rate of change is slow. Figure 1 provides a visualization of these changes. Tables 2 and 3 and Figure 1 compare Slovenia with the EU15 countries and the nine candidate countries from Central and Eastern Europe. One could also speculate what the situation would be if the rate of growth for the period 1997-August 1999 continued until the end of 2000 (this should not be interpreted as a projection).

         1994   1995   1996   1997   1998   Aug. 1999
LUX      -0.9   -0.5   -0.4   -0.5   -1.0   -1.5
DAN      #N/A   -1.4   -1.5   -2.0   -2.8   -3.4
BEL      -0.9   -0.2    0.1   -0.2   -0.9   -1.5
AUT      #N/A   -1.4   -0.9   -1.3   -1.8   -2.3
DEU      #N/A   -0.9   -0.6   -0.7   -1.4   -2.0
FRA      #N/A    0.1    0.7    1.2    1.5    1.1
NED      #N/A   #N/A   -1.8   -2.2   -2.9   -3.5
ITA       0.6    0.8    1.1    1.6    2.1    1.7
SVE      #N/A   #N/A   -2.4   -2.8   -3.6   -4.2
UK       #N/A   -1.5   -1.2   -1.5   -2.2   -2.7
FIN      #N/A   #N/A   -2.9   -3.5   -4.3   -4.8
IRL      -0.8   -0.4   -0.1   -0.3   -0.9   -1.4
ESP       0.2    0.8    1.0    1.5    1.7    1.7
PRT       0.6    0.8    1.2    1.7    2.3    2.6
GRE       0.9    1.2    1.6    2.1    2.5    2.7
SLO       0.0    0.0    0.0    0.0    0.0    0.0
CZE      -0.3    0.4    0.7    1.4    1.5    1.4
SVK      #N/A    1.5    1.7    2.1    2.7    3.1
HUN       0.3    0.6    1.0    1.1    1.4    1.1
POL      #N/A    1.4    1.7    2.3    2.9    3.2
EST       0.1    0.2    0.4   -0.2   -0.8   -1.4
ROM      #N/A   #N/A    2.9    3.4    3.9    4.3
LIT      #N/A   #N/A    2.7    2.9    3.1    3.5
LAT      #N/A    1.6    1.3    2.0    2.4    2.7
BG       #N/A   #N/A    2.8    3.0    3.8    4.1
EU15     #N/A   -0.8   -0.3   -0.6   -1.2   -1.8

Table 3: Time distance between compared countries and Slovenia, S-distance in years: - time lead, + time lag, Slovenia=0
Source: Own calculation based on data in Table 2.

If no action were taken and the slowdown continued until the end of 2000, a further deterioration of the relative position of Slovenia for this indicator would take place. Within a period of only a few years Slovenia would move from a comfortable position near the EU15 average in 1996 (despite being more than 30 per cent below the average EU15 level of GDP per capita) to a position where the lag behind the forerunner Finland would already be 6 years. The lag behind Sweden, Denmark and the Netherlands would be around 5 years; France, Italy, Spain and Greece would surpass or catch up with Slovenia; and only Portugal out of the EU15 countries would still be behind it. Time distance seems to be an excellent way of presenting the danger of a rapidly deteriorating situation in a form which everybody can understand, and of signalling that an in-depth analysis and corresponding actions are necessary. Some other conventional measures may not provide such a warning. E.
g., static comparison showed that in 1996 Finland had 8.8 times the number of Internet hosts per capita of Slovenia, and in 2000 it would be 6.6 times. Time distance adds a qualitatively different conclusion. Similar consequences can be seen from comparison with selected Central and Eastern European countries. In 1996 Slovenia was, with Estonia, a clear leader in the region for the indicator Internet hosts per capita. In the meantime Estonia has moved ahead, and the gap would widen if the present trends continued. By August 1999 Slovenia was lagging behind Estonia by more than 1 year. The quality of the time distance measure, being transparent and easy to perceive and understand, can be appreciated even more when a larger set of indicators is analysed, involving more issues and different fields of concern. For instance, in 1997 Italy was 18.3 years ahead of Slovenia for GDP per capita at purchasing power parity, while Slovenia was 1.6 years ahead of Italy for Internet hosts per capita. Some of these indicators can change very quickly; others, like some demographic variables and some other characteristics of the human factor, change very slowly. Time distances will be different: smaller for those indicators that are more dynamic by their nature, more conducive to policy measures and given higher priority in the decision-making process.

[Figure 1: line chart, one line per country (LUX, DAN, BEL, AUT, DEU, FRA, NED, ITA, UK, FIN, IRL, ESP, PRT, SLO, CZE, SVK, POL, EST, ROM, LIT, LAT, BG); horizontal axis: Time]

Figure 1. Time distance for Internet host density per 10000 inhabitants, EU and candidate countries, Slovenia=0

[Figure 2: bar chart by country; horizontal axis from -6 to 2: S-distance (in years): - time lead, + time lag]

Figure 2.
Differences from EU15 average for Internet hosts per capita expressed in time (August 1999)

Figure 2 illustrates the application of the time distance presentation in a similar case of comparative analysis. In this example the average value of Internet hosts per capita for EU15 is the benchmark for comparison. The dispersion of situations in this respect for EU15 countries and Central and Eastern European candidate countries can be presented in various ways, like ratios, percentages, absolute values and absolute differences, etc. Furthermore, various summary measures of dispersion could be calculated. Absolute values of the indicator are presented in Table 2. A widely used conventional measure would be indices or percentage differences. For instance, in August 1999 the index for forerunner Finland would be 467, for Portugal and Greece about 33 as the lowest value for EU15 countries, and 9 for Bulgaria and 7 for Romania (EU15 = 100). Figure 2 presents another, complementary view of this set of data. For countries above the average, time distances are calculated for the level of the EU15 average; for countries below the average, they are calculated for the level of the indicator in these countries in August 1999. Finland had a lead of about 4.5 years over the EU15 average, Portugal and Greece were lagging behind the EU15 average by about 3 years, and Bulgaria and Romania by more than 4 years. Time distances allow for a distinct new insight that can help to form a richer perception of the situation. Since time distance is expressed in units of time, which everybody understands, from ministers and managers to the general public, it possesses one of the ideal characteristics of a presentation and communication instrument. It is expected that the analysis of and discussion about time distances will have considerable influence on how people form their perception of a situation and on public opinion.
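The conventional index quoted above (EU15 = 100) can be recomputed directly from the August 1999 column of Table 2. The following is a minimal sketch; the names `AUG_1999`, `EU15_AUG_1999` and `index_vs_benchmark` are hypothetical and introduced only for illustration.

```python
# August 1999 values of Internet hosts per 10000 inhabitants, taken from Table 2.
AUG_1999 = {"FIN": 930.5, "PRT": 67.4, "GRE": 63.5, "BG": 18.3, "ROM": 14.1}
EU15_AUG_1999 = 199.3


def index_vs_benchmark(value: float, benchmark: float) -> float:
    """Static index of `value` relative to `benchmark`, with the benchmark set to 100."""
    return 100.0 * value / benchmark


indices = {
    country: round(index_vs_benchmark(value, EU15_AUG_1999))
    for country, value in AUG_1999.items()
}

if __name__ == "__main__":
    for country, index in indices.items():
        print(f"{country}: {index}")
```

The index is a static measure: it compares levels at one point in time and says nothing about how long it took, or would take, to cover the gap, which is the information the time distance adds.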
For instance, in the EU the consideration of economic and social cohesion is an important goal. A series of presentations of results like Figure 2 for a number of relevant indicators would without any doubt provide additional insight into a complex multidimensional problem. Similarly, it would be very useful if results like those in Table 3 and in Figures 1 and 2 were provided for a broad selection of information society indicators. This offers improved semantics for analysis and policy debate, and can in many cases lead to qualitatively different conclusions from those reached in a static conceptual and analytical framework. By analogy, there is a wide-open possibility to apply this methodology to numerous business problems at the micro, corporate and sector levels. Another important advantage of this approach is that the results and conclusions based on the two-dimensional analysis add new information and new insight, while none of the earlier results are lost or replaced.

4 Conclusions

In empirical research the art of handling and understanding different views of data is crucial for discovering the relevant patterns. The time distance approach (with the associated statistical measure S-distance) is useful in at least two domains: it offers a new view of data that is exceptionally easy to understand and communicate, and it may allow for developing and exploring new hypotheses and perspectives that cannot be adequately dealt with without the new concept. The generic nature of the time distance concept and the S-distance measure leads to the conclusion that the methodology can be usefully applied as an important analytical and presentation tool in numerous applications in a wide variety of substantive fields.
Especially in the field of information technology indicators, which is characterised by great speed of change, it would be of great interest to complement rather than replace the conventional measurement of differences between countries or other units with this new perspective of the situation.

5 References

[1] P. Sicherl. A Novel Methodology for Comparisons in Time and Space. East European Series No. 45. Vienna Institute for Advanced Studies. Vienna. 1997.
[2] P. Sicherl. Time Distance in Economics and Statistics: Concept, Statistical Measure and Examples. In A. Ferligoj (ed.). Advances in Methodology, Data Analysis, and Statistics. Metodološki zvezki. 14. FDV. Ljubljana. 1998.
[3] P. Sicherl. The Time Dimension of Disparities in the World. XIIth World Congress of the International Economic Association. Buenos Aires. August 1999.
[4] P. Sicherl. Measuring disparities in two dimensions: proximity in time and proximity in indicator space. 10th International Conference on Socio-economics, Vienna, Austria, 13-16 July 1998.
[5] International Telecommunication Union. Database. Geneva. 1998.
[6] http://www.ripe.net
[7] http://www.ris.org
[8] National Science Board, Science & Engineering Indicators - 1998. Arlington, VA: National Science Foundation, 1998.
[9] V. Vehovar, Spremljanje informacijske družbe, in P. Sicherl, A. Vahcic (eds.). Model indikatorjev za podporo odločanju o razvojni politiki in za spremljanje izvajanja SGRS, Sicenter, Ljubljana, October 1999.
Measuring Information Society: Some Methodological Problems

Vasja Vehovar, Matej Kovačič
Faculty of Social Sciences, Kardeljeva ploščad 5, 1000 Ljubljana, Slovenia
Phone: +386 61 1805 100; Fax: +386 61 1805 102
vasja.vehovar@uni-lj.si, matej.kovacic@uni-lj.si

Keywords: information society, Internet indicators, survey research, electronic commerce

Edited by: Cene Bavec and Matjaž Gams
Received: October 10, 1999 Revised: November 30, 1999 Accepted: December 20, 1999

The paper addresses the methodological problem of measuring the information society. Both technical indicators and attitudinal measurements for Slovenia are discussed in this context. In particular, the results related to the interest in information society services are presented. The comparison between Slovenia and the European Union, despite some methodological problems, shows that the interest in these services is extremely high in Slovenia. Other figures also confirm that Slovenian households and businesses are generally at the European average with respect to the penetration of the basic information technologies. However, certain discrepancies with other sources of data call for more effort in performing this kind of analysis.

1 Preface

The concept of the information society has been around for many decades. Nevertheless, its definition is relatively unclear, and the same is true for the corresponding indicators. Some ad hoc measures have existed since as early as the sixties, particularly in the area of services, the information professions and the extent of the business information sector. However, it is extremely hard to establish official indicators for the phenomena in such a dynamic field. The Internet, in particular, has brought even more complications to these measurements. Even in the United States, the official estimates of the scope of electronic commerce will be available only in the year 2000, after electronic commerce transactions have already reached hundreds of billions of USD.
The first official estimates will thus arrive after several years of extreme variety in the estimates from numerous consulting agencies. In addition, there are considerable discrepancies in other measurements of the information society. The paper presents an overview of the most frequent divergences that arise from the interpretation of indicators of the information society. Methodological misunderstanding is also an important reason that unnecessarily hinders the understanding of the position of Slovenia.

2 Quantitative measurement

Quantitative indicators of the information society usually refer to numerical figures (expressed in numbers or percentages of users, or in the form of financial totals) which most often relate to the use and penetration of modern technologies, especially the Internet, mobile telephony and electronic commerce. However, we have to be extremely careful when interpreting these data. We can, for instance, classify the use of electronic payment orders of the Slovenian Agency for Payments as a form of electronic commerce. Many experts claim that this is also a specific form of Electronic Data Interchange (EDI). In this case the amount of electronic commerce in Slovenia can be measured in hundreds of billions of USD. But if we talk about transactions where the searching, ordering, billing and payment procedures are all performed in electronic form, without any paper recording, it is clear that the amount of electronic commerce is only a fraction of this amount, e.g. only around a few million USD. Therefore, we have to be very precise when stating such observations. It is not surprising that the estimates of leading consulting agencies on electronic commerce have varied by a ratio of 1:10 in the past years, and at present they still vary by a ratio of 1:2. Often, the source of the problems is not even in the statistical methodology or in the definitions, but in the simple fact that the methodological framework is not properly reported.
Recent international efforts for standardised measurement of electronic commerce, particularly those at OECD (1999), have already brought some results, and we can perform certain international comparisons of Internet and electronic commerce usage among companies. The available comparisons with Singapore, Scandinavian countries and Australia show that with respect to PC usage, Internet penetration and Web site penetration among companies there was, as of 1998, no significant lag for Slovenia. However, there is a certain time lag in the adoption of electronic commerce applications. Unfortunately, international comparisons in the area of electronic commerce are much more complicated. Recent experiments in the RIS 1999 survey of companies clearly demonstrated the sensitivity of these kinds of measurements. It has been shown that when electronic commerce was defined as any business transaction performed over computer networks, the percentage of companies claiming to use electronic commerce was 10% higher than under the definition restricted to transactions that lead to a purchase (RIS, 1999). A similar problem of definition is the estimate of the number of Internet users. There exist at least five categories of Internet users (Vehovar et al. 1998). This points to the need for an exact definition of the term "Internet user". For instance, the estimate of EITO (1999) talks about 60,000 Internet users in Slovenia in 1998; however, neither the definition of Internet user nor a description of how the estimate was obtained is available there. In addition, this estimate differs from all other estimates for Slovenia. Even the number of users with personal e-mail accounts in 1998 is much higher than 60,000. Of course, EITO's estimate could refer to some specific group of intensive users. When measuring information society phenomena we are also faced with divergence which originates from the methodology of data collection.
The IDC corporation, for instance, provides estimates based on distribution channels, and one of the figures states that 65,000 purchases (shipments) of new personal computers were made in Slovenia in 1998, 17% of them by households. This suggests that households buy around 10,000 personal computers yearly. Such an estimate also matches the ESIS (1999) figures stating that Slovenian households possess 100,000 personal computers (with a 486 processor or better). However, this does not match the survey estimates that Slovenian households have more than 200,000 personal computers, which is the result of practically all surveys (Statistical Office, Mediana, Slovenian Public Opinion, RIS...). Survey estimates consistently show that the number of personal computers has surpassed 200,000 in business use as well, which suggests that Slovenia is highly ranked by number of personal computers per 100 residents. There are more than 25 personal computers per 100 inhabitants in Slovenia. This is surprisingly high; however, as the usage of information technology is rather complex, the criticisms regarding low technology penetration in the Slovenian economy may still hold true (The World Competitiveness Yearbook 1999, IMD Lausanne). One of the most exposed indicators of the Internet and the information society is the statistics on Internet hosts (Vehovar, 1998). This indicator shows extremely unfavourable trends for Slovenia: the growth of hosts in the last two years has almost entirely stopped while all other countries progress rapidly. However, the number of hosts is a typical example of an indicator that is more complex than a casual observer might think. For instance, all the hosts which are not included in the *.si domain are excluded from Slovenian host counts. This does not happen so often in larger countries or in countries with more liberal legislation for assigning domains.
In Slovenia, non-domestic domains are very frequent, even among the most visited sites and among the largest Internet access providers: siol.net, s.net, s5.net, amis.net. It seems that the large majority of commercial dial-up modems/hosts are registered under *.net domains. The high usage of dial-up access in Slovenia is also a problem in itself and contributes to low host counts, because each host/modem serves many dial-up users. Multiple IP numbers located on one computer, e.g. virtual hosts, can present additional problems. This is more often the case in countries such as Estonia than in Slovenia. We have to understand that a "host" does not necessarily mean a computer connected to the Internet, but only an IP number. In Slovenia, a further problem is posed by computers that are connected to large local networks with full access to the Internet but without an IP number. Host counting is becoming even more complex because of the technical problems of the measurement procedures, which are increasingly difficult due to firewalls and other forms of security protection. This forced Network Wizards to change its methodology entirely and break the time series. The data on host numbers from RIPE (http://www.ripe.net) and Network Wizards (http://www.nw.com) thus vary considerably. The RIPE host count often shows a clear monthly recession in the number of hosts for some countries (Italy, for instance), which is not realistic. All the above arguments may explain the situation in Slovenia, where the host counts in the last two years show less than 10% yearly growth (Picture 1, Picture 3), while all other indicators (number of registered domains, number of companies connected to the Internet, number of households with access to the Internet, number of Internet users) demonstrated more than 50% growth (Picture 2).

[Picture: Number of hosts (Dec. 92 - Nov. 99). Source: RIPE]
[Picture: Number of domains, sum (Dec. 92 - Nov. 99). Source: ARNES]

• Indicators in some areas are not correlated with the economic power of the countries;
• Absolute investments are significantly lower than the EU average;
• The administrative structure is still developing;
• The legal and organisational framework is rapidly improving;
• The level of capitalisation in telecommunications is improving, although some monopolies still exist;
• Advanced ICT development is ahead of legal, organisational and administrative progress.

Figure 6: Portfolio showing GDP/pc and Telecommunication growth rate in 1997/1998

The proposed approach to data analysis and the developed tools are good not only for comprehensible modelling and presentation of the results, but also for permanent monitoring and study of the progress of Slovenia and other CE countries. Figure 7 shows a six-dimensional diagram for comparing CE countries, based on the total values of six groups of indicators. The leading position of Slovenia (SLO, white) can be seen very clearly. This diagram and the diagram in Figure 6 were generated using our ISIT system.
Figure 1: Example of electronic book screen

At the end of our inquiry we calculated the arithmetic mean of the answers for each statement, so that we could see whether the students had a positive or a negative attitude to the specific statement and thus to the specific element of the study process. The results of the inquiry are briefly described below:
• Students were satisfied with the new way of studying; they felt that the quality of learning was improving through use of the electronic book.
• They said that the division into chapters in the electronic book was good and that all parts were well linked.
• Opinions about the learning speed were divided.
• Students' opinions on whether concrete examples contribute to a better understanding of the material were also divided.
• Students were satisfied with the contents of the electronic book and felt that the extent of the learning material was appropriate.
• They responded positively to the methodological aspects of the electronic book.
• Students were satisfied with the level of difficulty of the electronic book, with the systematic work, with the learning goals and with the acquired knowledge.
• They could not tell whether the learning material conveys only facts or not. This could be a consequence of not knowing the learning material in full. Maybe the answers would have been different if the inquiry had been carried out after the final exam, when the students would have been better acquainted with the learning material, and not immediately after their first work with the electronic book.

6 Conclusion

In this article the use of an electronic book as a new method of distance learning is presented. The results of the research, which was carried out among students of the Faculty of Organisational Sciences who were using an electronic book in their studies, are shown.
Students are satisfied with this kind of study and are looking forward to using electronic books for other subjects as well. The use of an electronic book adds variety to studying and increases students' individual work and motivation. The positive experience with the electronic book indicates that the introduction of distance learning would probably also be well received by the students.

7 References

[1] Batagelj V., Rajkovič V.: Information Technology Project in Slovenia Schools, Proc. of 1st Euro Education Conference, Aalborg, 9.-15.
[2] Boud D.: The Challenge of Problem Based Learning, Kogan Page, London, 1992.
[3] Collins J.: Computers in Classroom and College, Computer Education, June 1994.
[4] Dhanarajan G. ed.: Economics of Distance Education: Recent Experience, Open Learning Institute, Hongkong, 1994.
[5] Hedberg J.: Converging Technologies in Education: Interactive Multimedia and Online Learning, The University of Wollongong, New South Wales, Australia, 1996.
[6] Jereb J., Jug J.: Učna sredstva v izobraževanju, Moderna organizacija, Kranj, 1987.
[7] Jereb J.: Računalnik v izobraževanju, Mc&Boss, Kranj, 1991.
[8] Jereb J.: Strokovno izobraževanje in razvoj kadrov, Moderna organizacija, Kranj, 1989.
[9] Jerram P., Gosney M.: Multimedia Power Tools, Verbum Inc. and Gosney Company, 1995.
[10] Jereb J., Jug J. et al.: Izobraževanje odraslih, poročilo o raziskovalni nalogi, Fakulteta za organizacijske vede, Kranj, 1992.
[11] Keegan D.: The Study of Distance Education: Terminology, Definition and Field of Study in Research and Distance Education, Peter Lang, Frankfurt am Main, 1991.
[12] Laurillard D.: Rethinking University Teaching: A Framework for the Effective Use of Educational Technology, Routledge, London, 1993.
[13] Marentič-Požarnik B.: Prispevek k visokošolski didaktiki, DZS, Ljubljana, 1978.
[14] Rowntree D.: Teaching through Self Instruction, How to Develop Open Learning Materials, Kogan Page, London, 1991.
[15] Rowntree D.: Exploring Open and Distance Learning, Kogan Page, London, 1992.
[16] Rowntree D.: Preparing Materials for Open, Distance and Flexible Learning, Kogan Page, 1994.
[17] Strmčnik F.: Sodobna šola v luči programiranega pouka, DDU Univerzum, Ljubljana, 1978.
[18] Van den Brande L.: Flexible and Distance Learning, John Wiley & Sons, Chichester, 1993.
[19] Zorman L.: Sestava testov znanja in njihova uporaba v šoli, Zavod za šolstvo, Ljubljana, 1974.

Multi-Attribute Decision Modeling: Industrial Applications of DEX

Marko Bohanec¹, Vladislav Rajkovič¹,²
¹ Jožef Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia
Phone: +386 61 1773 309, Fax: +386 61 1258 058
² University of Maribor, Faculty of Organisational Sciences, Kranj, Slovenia
{marko.bohanec, vladislav.rajkovic}@ijs.si

Keywords: decision support, multi-attribute decision making, qualitative decision models

Edited by: Cene Bavec and Matjaž Gams
Received: October 2, 1999 Revised: November 20, 1999 Accepted: December 12, 1999

DEX is an expert system shell for qualitative multi-attribute decision modeling and support. During the last decade, it has been applied over fifty times in complex real-world decision problems. In this article we advocate the applicability and great potential of this approach for industrial decision-making. The approach is illustrated by a typical industrial application in land use planning, and supplemented by an overview of some other completed industrial applications. The lessons learned indicate the suitability of the qualitative DEX methodology particularly for "soft", i.e., less structured and less formalized, decision problems. Practical experience also indicates the importance of methods that facilitate the analysis, simulation, and explanation of decisions.

1 Introduction

In complex decision-making processes, it is often necessary to deal with the problem of choice (Simon, 1977).
Given a set of options (or alternatives), which typically represent some objects or actions, the goal is (1) to choose the option that best satisfies the aims or goals of the decision maker, or (2) to rank the options from best to worst. One of the approaches to such problems, which is well known and commonly employed within Decision Support Systems (Andriole, 1989), is based on evaluation models (Figure 1). The idea is to develop a model that evaluates options, giving an estimate of their worthiness (utility) for the decision-maker. Based on this estimate, the options are ranked and/or the best one is identified. Usually, a decision model is designed in an interaction between the decision maker and a decision analyst. An important feature of evaluation models is that, in addition to the evaluation of options itself, they can be used for various analyses and simulations, which may contribute to a better justification and explanation of decisions. For example, a what-if analysis can provide better insight into the causal relation between problem parameters and outcomes. Another example is a sensitivity analysis, which can assess the sensitivity of the model with respect to small changes of options.

Figure 1: Evaluation-based decision modeling

Figure 2: Multi-attribute decision modeling

An evaluation model can be developed in many ways. The approach that prevails in decision practice is based on multi-attribute decomposition (Chankong and Haimes, 1983; Saaty, 1993; Buede and Maxwell, 1995): we take a complex decision problem and decompose it into smaller and less complex subproblems. The result of such development is a decision model that consists of attributes, each of which represents a decision subproblem.
Attributes are organized hierarchically and connected by utility functions that evaluate them with respect to their immediate descendants in the hierarchy. Figure 2 illustrates this basic principle of multi-attribute modeling by showing a simple hierarchy of attributes for the evaluation of cars. Real-life applications of multi-attribute methods conducted at the Jožef Stefan Institute in Ljubljana were all based on DEX (Bohanec and Rajkovič, 1990). This is an expert system shell for multi-attribute decision making that combines "traditional" multi-attribute decision making with some elements of Expert Systems and Machine Learning. The distinguishing characteristic of DEX is its capability to deal with qualitative models. Instead of numerical variables, which typically constitute traditional quantitative models, DEX uses qualitative variables; their values are usually represented by words rather than numbers, for example "low", "appropriate", "unacceptable", etc. Furthermore, to represent and evaluate utility functions, DEX uses if-then decision rules. In contrast, this is traditionally carried out in a numerical way, using weights or similar indicators of attributes' importance. An important additional feature of DEX is its capability to deal with inaccurate, uncertain or even missing data about options. In such cases, DEX represents options by distributions of qualitative values, and evaluates them by methods based on probabilistic and/or fuzzy propagation of uncertainty. During the last decade, DEX was used in more than fifty real-life decision problems. The aim of this article is to advocate the wide applicability of DEX to complex decision problems that occur in industry. In the next section, we first illustrate the approach by a typical industrial application in land use planning.
This is followed by an overview of several other completed industrial applications in performance evaluation of companies, evaluation of products, projects and investments, ecology, and loan allocation. Finally, we summarize the lessons learned in these applications, and propose some future directions for the development of the underlying methodology.

2 A Real-World Case

One of the most typical applications of DEX occurred with Goriške opekarne, a company located near the Slovenian city of Nova Gorica. The company is engaged in a very traditional business: production of bricks and tiles. Decades ago, they built a factory near a suitable clay pit that provided raw material for their production. By 1993, however, the clay pit had become almost completely exhausted, so the company was faced with a critical strategic decision of how to survive and continue with this type of production. Their only option was to find a new appropriate clay-pit location. An exploratory study revealed three possible candidate locations. Unfortunately, none of them was really appropriate, as numerous difficult problems were foreseen, ranging from technological, transportational and financial to environmental and socio-psychological. The latter two problems seemed particularly important, as the project was inevitably going to affect the environment, possibly leading to its rejection by local inhabitants. For these reasons, a group of experts was formed to thoroughly analyze the problem and propose alternative solutions (Bohanec, et al., 1993).

Figure 3: Topmost levels of clay-pit evaluation model

In the first stage, the experts developed the structure of a multi-attribute model for the evaluation of clay-pit locations. Two primary evaluation dimensions were taken into account: Environmental impact and Feasibility of the project.
For each of these, the most relevant attributes were identified and organized into a hierarchical structure (Figure 3). Note that only the topmost levels of the model are shown in the figure. In total, the model contained 49 attributes: 29 basic (terminal nodes) and 20 aggregate (internal nodes).

Table 1: Decision rules for Site suitability

     ENVIRONMENT   FEASIBILITY   SITE
 1   *             unacc         unacc
 2   unacc         *             unacc
 3   less-acc      less-acc      marg-acc
 4   ≥ acc         less-acc      less-acc
 5   less-acc      acc           less-acc
 6   acc           acc           acc
 7   good          acc           good

The second stage involved the definition of decision rules. Basically, these are simple if-then rules that, for each of the 20 internal nodes in the model, determine its evaluation with respect to its lower-level descendants in the hierarchy. Usually, they are represented in a tabular form. For example, Table 1 shows the decision rules that were defined by the experts for the topmost node Site suitability. In the table, an asterisk '*' represents any value, and '≥' means 'better or equal'. In the third stage of the decision-making process, the options are identified and described by the values of basic attributes. In our case, there were three clay-pit locations, each of which was represented by 29 data items that corresponded to the basic attributes of the model. Furthermore, as some of these items, such as Social-psychological feasibility, were inherently inaccurate or difficult to obtain, several variations of the descriptions were formed, anticipating either an "optimistic" or a "pessimistic" development of the project. Effectively, this increased the number of considered options to eight (Figure 4) and provided a foundation for the subsequent what-if analysis.

which clearly indicate the wide applicability of DEX for a variety of decision problems. The description of some other early industrial applications can also be found in (Urbančič, et al., 1991).
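The tabular rules for Site suitability (Table 1) amount to a first-match lookup, and the "optimistic" versus "pessimistic" variants can be compared by evaluating the same table twice. The following is a minimal sketch in Python, not the DEX implementation: the function names, the ordering of the value scale and the two example variants are assumptions made purely for illustration.

```python
# Qualitative scale ordered from worst to best (ordering assumed for illustration).
SCALE = ["unacc", "less-acc", "acc", "good"]

# Rules of Table 1 as (ENVIRONMENT, FEASIBILITY, SITE).
# "*" matches any value; ">=x" matches x or any better value on SCALE.
RULES = [
    ("*", "unacc", "unacc"),
    ("unacc", "*", "unacc"),
    ("less-acc", "less-acc", "marg-acc"),
    (">=acc", "less-acc", "less-acc"),
    ("less-acc", "acc", "less-acc"),
    ("acc", "acc", "acc"),
    ("good", "acc", "good"),
]


def matches(condition, value):
    """Check one cell of a rule against an attribute value."""
    if condition == "*":
        return True
    if condition.startswith(">="):
        return SCALE.index(value) >= SCALE.index(condition[2:])
    return condition == value


def site_suitability(environment, feasibility):
    """Return the SITE value of the first matching rule, or None if no rule applies."""
    for env_cond, feas_cond, site in RULES:
        if matches(env_cond, environment) and matches(feas_cond, feasibility):
            return site
    return None


# Hypothetical what-if comparison of two variants of one location.
variants = {"optimistic": ("acc", "acc"), "pessimistic": ("acc", "unacc")}
for name, (env, feas) in variants.items():
    print(name, site_suitability(env, feas))
```

A full model would apply one such table at each of the internal nodes, feeding the results of lower-level tables upward until the topmost attribute is reached.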
3.1 Performance Evaluation of Companies

Here, the general task is for a company or agency to develop an evaluation model that assesses the performance of other companies. The aim is, for example, to find a suitable business partner. The work with DEX in this area began in 1987, when a number of such models were developed in collaboration with the International Center for Public Enterprises (Bohanec and Rajkovič, 1990). An example hierarchy of attributes, which was used to assess the performance of 54 public enterprises in Pakistan, is shown in Figure 5. This work culminated in 1989 with the development of models that were used in the privatization of Peruvian public enterprises.

Figure 4: Visualization of clay-pit evaluation results

In the last stage, the model was utilized to evaluate the clay-pit locations. As shown in Figure 4, the best location was the one called Marjetnica, which was evaluated as "acceptable", but only in its "optimistic" instance. On the other hand, all the "pessimistic" instances were unacceptable, indicating the great sensitivity of the decision. Therefore, thorough what-if and sensitivity analyses were performed for each location. The most important result was achieved by comparing "optimistic" and "pessimistic" options with respect to basic attributes. The outcome of this comparison was a comprehensive list of possible problems that could occur with each location. On this basis, the expert team was able not only to find the best location, but also to foresee potential pitfalls and suggest how to avoid them.

3 Other Applications

In about ten years' time, DEX was used in more than fifty real-life decision problems in various areas. About one half of the problems can be classified as industrial, while the remainder were conducted in fields such as education, medicine and health care (Bohanec, et al., 1999).
Some of the industrial problems were very difficult and involved substantial financial and other risks for decision-making organizations. In what follows we briefly outline five representative application areas.

Figure 5: Topmost levels of the model for performance evaluation of public enterprises

3.2 Product portfolio evaluation

The problem is to assess the quality of products made by a company or production unit. This assessment is vital for the formation of strategies. The approach with DEX was based on the so-called portfolio method (Krisper, et al., 1991), which evaluates products using two primary evaluation dimensions: market attractiveness and competitive ability. Several practical cases were analyzed in this way, including the products of some well-known Slovenian companies: Fructal, Radenska, SRC, and DZS.

3.3 Evaluation of projects and investments

The evaluation of projects or investment strategies is the industrial application context in which DEX has had the largest number of applications. The most typical investments included various software, hardware and technology, such as data base management systems, production control software, meteorological radar equipment, or a production line. The decision problems were often related to various investment proposals and tenders. An example of such applications, which is documented in some detail, is a model for the evaluation of research and development projects (Bohanec, et al., 1995).

3.4 Remediation of dumpsites

This is a recent application in the field of environmental care. In order to alleviate the problem of illegal dumpsites in Slovenia, an expert system was developed that assesses the environmental impact of dumpsites and suggests activities for their remediation (Spendi, 1998). The environmental impact of dumpsites is assessed by a qualitative DEX model (Figure 6), which is embedded in the expert system.
Figure 6: Model for the assessment of dumpsite's environmental impact (topmost levels only)

3.5 Housing loan allocation

This is an example of a repetitive decision-making task being supported by a DEX model. The model is a part of a management decision support system that has been used since 1991 by the Housing Fund of the Republic of Slovenia for the allocation of housing loans with favorable terms to citizens (Bohanec, et al., 1996). By 1999, the Fund had issued 16 floats of loans, i.e., about two per year, and approved almost 20 thousand loans. The amount requested by applicants in a float typically far exceeds the available funds. Thus, the applicants must be ranked into a priority order. The procedure is required to be fast, reliable, transparent, and fair for all applicants. The requirement for transparency calls for effective explanations of the loan priority order, which have to be provided to both the decision-making committee and a large number (usually, several thousands) of applicants. In the Fund's system, these requirements were fulfilled by a qualitative model that ranks the applications into five priority classes and provides a foundation for various explanations, which are obtained by analyses and simulations of application data and the model itself.

4 Experience

Some important lessons have been learned in the applications of DEX. Here, we present some findings related to the duration of model development processes, the difficulty of development stages, and the categories of decision problems that seem to be particularly well suited for the application of DEX. The time needed to develop a DEX model turns out to be extremely problem-dependent: it may take from a few hours to several months. Most typically, however, the development requires about two working days for the development of model structure, from one to two days to define decision rules, and from one to several days to collect data about options, and to evaluate and analyze them.
Therefore, the process most typically lasts from two to ten working days. The most difficult stage of the process is its first one, in which the relevant attributes must be identified and appropriately organized into a hierarchical structure. This stage relies heavily on the knowledge and experience of decision-makers and experts, and requires a deep understanding of the decision problem. It can still be considered more art than science. The remaining stages have been found much less problematic. Therefore, an appropriate identification of model structure largely determines the success of the decision-making process. DEX, with its qualitative modeling and ability to handle inaccurate and/or incomplete data about options, appears particularly well suited for decision problems that involve qualitative concepts and a great deal of expert judgement. Also, it seems that the usefulness of DEX increases with the increasing difficulty, or "complexity", of the decision problem. So far, the best results were achieved in problems that required large models, consisting of at least 15 attributes, and/or involving a large number of options, i.e., from about 10 to several hundreds of options. On the other hand, DEX turned out to be unsuitable for problems that require exact formal modeling, numerical simulation and/or optimization.

5 Further Work

Currently, there are three limitations of the DEX approach that, we believe, can be greatly improved by appropriate extensions of the methodology. First, the difficult stage of model structure development could be additionally supported by a machine learning method that would develop (or at least suggest) model structure using decision examples taken either from an existing database of past decisions, or provided explicitly by the decision-maker. Considerable progress in this direction has already been made by the development of a learning method called HINT (Zupan, et al., 1999).
Given training examples, HINT develops a hierarchical multi-attribute evaluation model that explains and possibly generalizes the examples. The structure of the models developed by HINT is essentially the same as the structure of models developed "manually" using DEX. HINT's model development is based on function decomposition, an approach that was originally developed for the design of digital circuits. Another limitation of DEX is that it is strictly limited to qualitative decision models; it cannot use numerical variables or the analytically represented utility functions that are commonly used in traditional quantitative models. This is sometimes advantageous in comparison with other decision modeling systems, which rely exclusively on quantitative models. However, many real-life decision problems require both qualitative and quantitative attributes, so the integration of these two may have a great practical impact: it may increase the flexibility of the method and extend the range of decision problems that can be successfully approached. Methodologically, such integration appears quite difficult and requires more research. In the context of DEX, we consider it a long-term goal. Last but not least, the major part of the DEX software was developed about ten years ago and now appears quite outdated. Therefore, an overall redesign and renewal of the software is planned for the near future. Currently, we are developing a program called DEXi, an educational subset of DEX to be used by students and teachers in secondary schools and faculties. We plan to follow this with the development of a functionally complete state-of-the-art DEX system.

6 Conclusion

The DEX system effectively integrates two methodologies: multi-attribute decision making and expert systems. To a limited extent, it also includes some elements of machine learning and fuzzy logic. In this way, it facilitates a structured and systematic approach to complex decision problems.
So far, DEX has been successfully used in over fifty real-life decision problems in industry, medicine, health care and education, all of which speak in favor of its wide applicability and flexibility. From the practical viewpoint, the most important characteristics of DEX are:
1. Qualitative (symbolic) decision modeling, which is particularly well suited for "soft" decision problems, i.e., less structured and less formalized problems, which involve a great deal of expert judgement.
2. Focus on the explanation and analysis of options, which leads to better-understood and justified decisions.
3. Active support of the decision-maker in the acquisition of decision rules, which speeds up model development and reduces the number of errors.
The goals of further research and development related to DEX are twofold. First, we wish to improve the support in the difficult stage of model structure development, and propose to use machine learning methods, such as HINT, for that purpose. Second, to further improve the flexibility and general applicability of the approach, we suggest further research towards an integration of qualitative and quantitative decision models.

7 References

[1] S.J. Andriole: Handbook of Decision Support Systems. TAB Books, 1989.
[2] M. Bohanec, V. Rajkovič: DEX: An expert system shell for decision support. Sistemica 1, 145-157, 1990.
[3] M. Bohanec, B. Kontić, D. Kos, J. Marušič, S. Polič, J. Rakovec, B. Sedej, et al.: Comparison of clay-pit locations Okroglica, Bukovnik, Marjetnica with respect to environmental protection (in Slovenian). Ljubljana: Jožef Stefan Institute, Report DP-6742, 1993.
[4] M. Bohanec, V. Rajkovič, B. Semolič, A. Pogačnik: Knowledge-based portfolio analysis for project evaluation. Information & Management 28, 293-302, 1995.
[5] M. Bohanec, B. Cestnik, V. Rajkovič: A management decision support system for allocating housing loans. Implementing Systems for Supporting Management Decision (eds. P. Humphreys, L. Bannon, A.
McCosh, Migliarese, J.-C. Pomerol). Chapman & Hall, 1996.
[6] M. Bohanec, B. Zupan, V. Rajkovič: Hierarchical multi-attribute decision models and their application in health care. Proc. Medical Informatics Europe 99 (eds. P. Kokol, B. Zupan, J. Stare, M. Premik, R. Engelbrecht), Amsterdam: IOS Press, 670-675, 1999.
[7] D.M. Buede, D.T. Maxwell: Rank disagreement: A comparison of multi-criteria methodologies. Journal of Multi-Criteria Decision Analysis 4, 1-21, 1995.
[8] V. Chankong, Y.Y. Haimes: Multiobjective Decision Making: Theory and Methodology. North-Holland, 1983.
[9] M. Krisper, V. Bukvič, V. Rajkovič, T. Sagadin: Strategic planning with expert system based portfolio analysis. EXPERSYS-91: Expert system applications (eds. J. Hasemi, J.G. Gouardères, J.P. Marciano). IITT International, 1991.
[10] T.L. Saaty: Multicriteria Decision Making: The Analytic Hierarchy Process. RWS Publications, 1993.
[11] H.A. Simon: The New Science of Management Decision. Prentice-Hall, 1977.
[12] R. Spendi: An expert system for the evaluation of environmental impact and remediation of illegal dumpsites (in Slovenian). M.Sc. Thesis. University of Ljubljana, Faculty of Information and Computer Science, 1998.
[13] T. Urbančič, I. Kononenko, V. Križman: Review of Applications by Ljubljana Artificial Intelligence Laboratories. Ljubljana: Jožef Stefan Institute, Report DP-6218, 1991.
[14] B. Zupan, M. Bohanec, J. Demšar, I. Bratko: Learning by discovering concept hierarchies. Artificial Intelligence 109, 211-242, 1999.

Perception-Based Classification

Mihael Ankerst, Christian Elsen, Martin Ester, Hans-Peter Kriegel
Institute for Computer Science, University of Munich, Oettingenstr.
67, D-80538 München, Germany
{ankerst | ester | kriegel}@dbs.informatik.uni-muenchen.de, c.elsen@elsen.net

Keywords: classification, decision tree, data mining, visualization

Edited by: Cene Bavec and Matjaž Gams
Received: October 2, 1999 Revised: December 2, 1999 Accepted: December 19, 1999

Classification is an important problem in the emerging field of data mining. Given a training database of records, each tagged with a class label, the goal of classification is to build a concise model that can be used to predict the class label of future, unlabeled records. Decision trees are a very popular class of classifiers because they satisfy the basic requirements of accuracy and understandability. Instead of constructing the decision tree by a sophisticated algorithm, we introduce a fully interactive method based on a multidimensional visualization technique and appropriate interaction capabilities. Thus, the domain knowledge of an expert can be profitably included in the tree construction phase. Furthermore, after the interactive construction of a decision tree, the user has a much deeper understanding of the data than just knowing the decision tree generated by an arbitrary algorithm. The interactive approach also overcomes the limitation of most decision trees, which are restricted to binary splits for numeric attributes and which do not allow backtracking in the tree construction phase. Our performance evaluation with several well-known datasets demonstrates that even users with no a priori knowledge of the data construct a decision tree with an accuracy similar to the tree generated by state-of-the-art algorithms. Additionally, visual interactive classification significantly reduces the tree size and improves the understandability of the resulting decision tree.

1 Introduction

The success of computerized data management has resulted in the accumulation of huge amounts of data in several organizations.
There is a growing perception that analyses of these large databases can turn this "passive" data into useful information. The term Data Mining refers to the discovery of non-trivial, previously unknown, and potentially useful patterns embedded in databases.

Classification is one of the major tasks of data mining. The goal of classification is to assign a new object to a class from a given set of classes based on the attribute values of this object. Different methods [12] have been proposed for the task of classification, for instance decision tree classifiers, which have become very popular.

Decision tree classifiers are primarily aimed at attributes with a categorical domain, that is, a small set of discrete values. Numeric attributes, however, play a dominant role in application domains such as astronomy, earth sciences and molecular biology, where the attribute values are obtained by automatic equipment such as radio telescopes, earth observation satellites and X-ray crystallographs. [6] discusses an approach that splits numeric attributes into multiple intervals rather than just two intervals. The well-known algorithms, however, perform a binary split of the form $a \le v$ for a numeric attribute $a$ and a real number $v$. The SPRINT decision tree classifier [3] processes numeric attributes as follows. There are $n - 1$ possible splits for $n$ distinct values of $a$. The gini index is calculated at each of these $n - 1$ points and the attribute value yielding the minimum gini index is chosen as the split point. CLOUDS [4] draws a sample from the set of all attribute values and evaluates the gini index only for this sample, thus improving the efficiency.

A commercial system for interactive decision tree construction is SPSS CHAID [15] which - in contrast to our approach - does not visualize the training data but only the decision tree.
Furthermore, the interaction happens only before the tree construction, yielding user-defined values for global parameters such as maximum tree depth or minimum support for a node of the decision tree.

Visual representation of data as a basis for the human-computer interface has evolved rapidly in recent years. [8] gives a comprehensive overview of existing visualization techniques for large amounts of multidimensional data. Recently, several techniques of visual data mining have been introduced. [5] presents the technique of Independence Diagrams for visualizing dependencies between two attributes. The brightness of a cell in the two-dimensional grid is set proportional to the density of corresponding data objects. This is one of the few techniques which does not visualize the discovered knowledge but the underlying data. However, the proposed technique is limited to two attributes. [10] presents a decision table classifier and a mechanism to visualize the resulting decision tables. It is argued that the visualization is appropriate for business users not familiar with machine learning concepts.

In contrast to well-known decision tree classifiers, our novel interactive approach enables arbitrary split points for numeric attributes, the use of domain knowledge in the tree construction phase, and backtracking. In this paper, we introduce a novel interactive decision tree classifier based on a multidimensional visualization of the training data. Our approach allows the domain knowledge of an expert to be integrated in the tree construction phase, and it overcomes the limitation of binary splits for numeric attributes.

The rest of this paper is organized as follows. In section 2 we introduce our technique for visualizing the training data. The support for interactively constructing a decision tree - which we have implemented in the Perception-Based Classification (PBC) system - is discussed in section 3.
Section 4 reports the results of an extensive experimental evaluation on several well-known datasets. Section 5 summarizes this paper and outlines several issues for future research.

2 Visualizing the training data

In our approach, we visualize the training data in order to support interactive decision tree construction. We introduce a novel method for visualizing multidimensional data with a class label such that their degree of impurity with respect to class membership can be easily perceived by a user. Our pixel-oriented method maps the classes to colors in an appropriate way.

The basic idea of pixel-oriented visualization techniques [8] is to map each attribute value $v_i$ of each data object to one colored pixel and to represent the values belonging to different attributes in separate subwindows. The proposed techniques [9] differ in the arrangement of pixels within a subwindow. Circle Segments [2] is a recent pixel-oriented technique which was introduced for a more intuitive visualization of high-dimensional data. The Circle Segments technique maps $d$-dimensional objects to a circle which is partitioned into $d$ segments representing one attribute each. Figure 1 illustrates the partitioning of the circle as well as the arrangement. Within each segment, the arrangement starts in the middle of the circle and continues to the outer border of the corresponding segment in a line-by-line fashion. The lines upon which the pixels are arranged are orthogonal to the segment halving lines. An extension of this technique has been applied in the context of cluster analysis [1]. While most approaches of visual data mining visualize the discovered knowledge, our approach is to visualize the training data in order to support interactive decision tree construction.

Figure 1. Illustration of the Circle Segments technique for 8-dimensional data objects (segments labelled attr. 1 to attr. 8)

Let $D$ be a set of data objects consisting of $d$ attributes $A_1, \ldots, A_d$ and having a unique class label from the set $Classes = \{c_1, c_2, \ldots, c_k\}$. For each attribute $A_i$, let a total order $<$ be defined, for example the $<$-order for numeric attributes or the lexicographic order for string attributes.

To map each attribute value of $D$ to a unique pixel, we follow the idea of the Circle Segments technique, i.e. we represent all values of one attribute in a segment of a circle with the proposed arrangement inside a segment. We do not use, however, the overall distance from a query to determine the pixel position of an attribute value. Instead, we sort each attribute separately and use the induced order for the arrangement in the corresponding circle segment. The color of a pixel is determined by the class label of the object to which the attribute value belongs. In the following, we introduce our technique for mapping classes to colors.

Let $Colors$ be the set of all different colors which can be represented in a given color model such as the RGB model, denoted as $Colors = \{col_1, col_2, \ldots, col_m\}$. The function $map: \{1, \ldots, k\} \to \{1, \ldots, m\}$ maps class indices to color indices as follows:

$$map(i) = \begin{cases} 1 & \text{if } i = 1 \\ map(i-1) + \left\lceil \dfrac{dist(c_{i-1}, c_i)}{total\text{-}class\text{-}dist} \times (m-1) \right\rceil & \text{else} \end{cases}$$

Note that $(m-1)$ is the maximum difference between the indices of two elements from $Colors$ and $\lceil x \rceil$ denotes the smallest integer $i$ with $i \ge x$. Finally, we define the function $visualize: Classes \to Colors$ mapping classes to colors as follows:

$$visualize(c_i) = col_{map(i)}$$

Several color scales satisfying these requirements have been proposed [11].
These color scales are appropriate when a total or partial order is defined for the classes. For the purpose of comparability of the results, the experiments reported in this paper have been performed on several datasets where no semantics about the classes is known. If no order of the classes is given, we do not need the first requirement to preserve the order of the attribute values. Furthermore, the second requirement is weakened such that each pair of colors $col_i$ and $col_j$ is perceived as being different.

We have developed a color scale for class labels based on the HSI color model [7], a variation of the HSV model. The HSI model represents each color by a triple (hue, saturation, intensity). In our experiments, we observed the most distinctly perceived colors for the following parameter settings: for $col_1$ we set hue = 2.5 and intensity = saturation = 1.0, for $col_m$ we set hue = 0.5 and intensity = saturation = 1.0, and all other colors were obtained by partitioning the hue scale into equidistant intervals.

Our approach of visualizing the training data also considers attributes having a low number of distinct values. In that case, there are many objects sharing the same attribute value and their relative order is not uniquely defined. Depending on the chosen order, we might create homogeneous (with respect to the class label) areas within the same attribute value. To avoid the creation of artificial homogeneous areas, we use the technique of shuffling: for a set of data objects sharing the same attribute value, the required order for the arrangement is determined randomly, i.e. their class labels are distributed randomly.

The amount of training data that can be visualized at one time is approximately determined by the product of the number of attributes and the number of data objects. For example, 2,000 data objects with 50 attributes can be represented in a 374x374 window and 10,000 objects with 20 attributes fit into a 516x516 window.

3 Perception-based classification
For example, 2.000 data objects with 50 attributes can be Figure 2. A model for interactive classification 496 Informatica 23 ( 1999) 493-499 Ankerst et al. hie roois Oper«ions oijtioiis Vievj Help ~n-> 3 rawbluBTtnean ISplIlC- 0.31-38.11- 7 1 which determine a network M = {E,RuR2,... ,Rr) In the following we restrict our discussion to a single relation R described by a corresponding binary matrix R = ' zjjnxn where _ r 1 XiRXj \ 0 otherwi otherwise In some applications ry can be a nonnegative real number expressing the strength of the relation R between units Xi andXj. 1.1.1 Example: Student Government In Table 1 and Figure 1 the Student Government network is presented. It consists of communication interactions among twelve members and advisors of the Student Government at the University in Ljubljana (Hlebec, 1993). The results of the measurement are not real interactions among actors but cognition about communication interactions. Data were collected with face to face interviews, conducted in May 1992. Figure 1 : Network graph: Student Government - discussion, recall Table 1 : Student Government matrix m p m m m m m m a a a ! 2 3 4 5 6 7 8 9 10 11 minister 1 1 0 1 1 0 0 1 0 0 0 0 0 p.minister 2 0 0 0 0 0 0 0 1 0 0 0 minister 2 3 1 1 0 1 0 1 1 1 0 0 0 minister 3 4 0 0 0 0 0 0 1 1 0 0 0 minister 4 5 0 1 0 1 0 1 1 1 0 0 0 minister 5 6 0 1 0 1 1 0 1 1 0 0 0 minister 6 7 0 0 0 1 0 0 0 1 1 0 1 minister 7 8 0 1 0 1 0 0 1 0 0 0 1 adviser 1 9 0 0 0 1 0 0 1 1 0 0 1 adviser 2 10 1 0 1 I 1 0 0 0 0 0 0 adviser 3 11 0 0 0 0 0 1 0 I 1 0 0 Communication flow among actors was identified by the following question: Of the members and advisors of the Student Government, whom do you (most often) talk with? The content of the communication flow was limited to the matters of the Student Government. The time frame was also defined: the question was referred to the six months period. One respondent refiised to cooperate in the experiment. 
As he was not considered in the analysis, the network consists of eleven actors.

1.2 Cluster and Clustering

One of the main procedural goals of blockmodeling is to identify, in a given network, clusters (classes) of units that share structural characteristics defined in terms of $R$. The units within a cluster have the same or similar connection patterns to other units. They form a clustering $\mathbf{C} = \{C_1, C_2, \ldots, C_k\}$ which is a partition of the set $E$:

$$\bigcup_i C_i = E \quad \text{and} \quad i \ne j \Rightarrow C_i \cap C_j = \emptyset.$$

Each partition determines an equivalence relation (and vice versa).

1.3 Block

A clustering $\mathbf{C}$ also partitions the relation $R$ into blocks

$$R(C_i, C_j) = R \cap C_i \times C_j$$

Each such block consists of units belonging to clusters $C_i$ and $C_j$ and all arcs leading from cluster $C_i$ to cluster $C_j$. If $i = j$, a block $R(C_i, C_i)$ is called a diagonal block.

1.4 Blockmodel and Blockmodeling

The goal of blockmodeling is to reduce a large, potentially incoherent network to a smaller comprehensible structure that can be interpreted more readily. Blockmodeling, as an empirical procedure, is based on the idea that units in a network can be grouped according to the extent to which they are equivalent, according to some meaningful definition of equivalence.

Figure 2: Blockmodeling scheme.

Figure 3: Types of connection between two sets; the left set is the ego-set. (Panel labels recovered: complete, col-dominant, row-functional, col-functional.)

A blockmodel consists of structures obtained by identifying all units from the same cluster of the clustering $\mathbf{C}$. For an exact definition of a blockmodel (see Figure 2) we also have to be precise about which blocks produce an arc in the reduced graph and which do not, and of what type. Some types of connections are presented in Figure 3.

A block is symmetric if

$$\forall (X, Y) \in C_i \times C_j : (X R Y \Leftrightarrow Y R X)$$

Note that for nondiagonal blocks this condition involves a pair of blocks $R(C_i, C_j)$ and $R(C_j, C_i)$.
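The definitions of a clustering, a block and a symmetric block can be illustrated with a minimal Python sketch. The four-unit relation and the two clusters below are hypothetical toy data chosen only for illustration, and the function names are not taken from any program mentioned in the paper.

```python
def is_partition(clusters, units):
    """True if the clusters are pairwise disjoint and their union is the unit set E."""
    members = [x for c in clusters for x in c]
    return len(members) == len(set(members)) and set(members) == set(units)


def block(R, Ci, Cj):
    """R(Ci, Cj) = R intersected with Ci x Cj, where R is a set of arcs (x, y)."""
    return {(x, y) for (x, y) in R if x in Ci and y in Cj}


def is_symmetric(R, Ci, Cj):
    """For all (X, Y) in Ci x Cj: X R Y holds exactly when Y R X holds."""
    return all(((x, y) in R) == ((y, x) in R) for x in Ci for y in Cj)


# Hypothetical toy network on units 1..4 (not the Student Government data).
E = {1, 2, 3, 4}
R = {(1, 2), (2, 1), (1, 3), (3, 4)}
C1, C2 = {1, 2}, {3, 4}

print(is_partition([C1, C2], E))   # the two clusters cover E without overlap
print(block(R, C1, C2))            # arcs leading from C1 to C2
print(is_symmetric(R, C1, C1))     # diagonal block of C1
```

For a nondiagonal pair such as `(C1, C2)` the symmetry test inspects arcs in both `R(C1, C2)` and `R(C2, C1)`, which is why the condition involves a pair of blocks.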
Table 2: Block types and the image matrix.

      1 1 1 1   1 1 0 0
C1    1 1 1 1   0 1 0 1
      1 1 1 1   0 0 1 0
      1 1 1 1   1 0 0 0

      0 0 0 0   0 1 1 1
C2    0 0 0 0   1 0 1 1
      0 0 0 0   1 1 0 1
      0 0 0 0   1 1 1 0

        C1         C2
C1   complete   regular
C2   null       complete

The reduced graph can be presented by a relational matrix, also called an image matrix (see Table 2). A clustering and the induced blockmodel of the Student Government are presented in Figure 4.

Figure 4: Blockmodeling example.

2 Blockmodeling - Formalization

Let $U$ be a set of positions or images of clusters of units. Let $\mu : E \to U$ denote a mapping which maps each unit to its position. The cluster of units $C(t)$ with the same position $t \in U$ is

$$C(t) = \mu^{-1}(t) = \{X \in E : \mu(X) = t\}$$

Therefore $\mathbf{C}(\mu) = \{C(t) : t \in U\}$ is a partition (clustering) of the set of units $E$.

A blockmodel is an ordered sextuple $M = (U, K, \mathcal{T}, Q, \pi, \alpha)$ where:

- $U$ is a set of positions (types of units);
- $K \subseteq U \times U$ is a set of connections;
- $\mathcal{T}$ is a set of predicates used to describe the types of connections between different clusters in a network. We assume that $nul \in \mathcal{T}$.
- a mapping $\pi : K \to \mathcal{T} \setminus \{nul\}$ assigns predicates to connections;
- $Q$ is a set of averaging rules. A mapping $\alpha : K \to Q$ determines rules for computing values of connections.

A (surjective) mapping $\mu : E \to U$ determines a blockmodel $M$ of network $N$ iff it satisfies the conditions:

$$\forall (t, w) \in K : \pi(t, w)(C(t), C(w))$$

and

$$\forall (t, w) \in U \times U \setminus K : nul(C(t), C(w)).$$

2.1 Equivalences

Let $\approx$ be an equivalence relation over $E$ and $[X] = \{Y \in E : X \approx Y\}$. We say that $\approx$ is compatible with $\mathcal{T}$ over a network $N$ iff

$$\forall X, Y \in E \ \exists T \in \mathcal{T} : T([X], [Y]).$$

It is easy to verify that the notion of compatibility for $\mathcal{T} = \{nul, reg\}$ reduces to the usual definition of regular equivalence (White and Reitz 1983). Similarly, compatibility for $\mathcal{T} = \{nul, com\}$ reduces to structural equivalence (Lorrain and White 1971).
For a compatible equivalence $\approx$ the mapping $\mu : X \mapsto [X]$ determines a blockmodel with $U = E/\approx$.

3 Optimization

3.1 Criterion Function

The problem of establishing a partition of units in a network in terms of a selected type of equivalence is a special case of the clustering problem that can be formulated as an optimization problem: determine the clustering $\mathbf{C}^*$ for which

$$P(\mathbf{C}^*) = \min_{\mathbf{C} \in \Phi} P(\mathbf{C})$$

where $\mathbf{C}$ is a clustering of a given set of units $E$, $\Phi$ is the set of all feasible clusterings and $P : \Phi \to \mathbb{R}$ the criterion function.

Table 3: Characterizations of Types of Blocks.

null           nul   all 0 *
complete       com   all 1 *
row-regular    rre   each row is 1-covered
col-regular    cre   each column is 1-covered
row-dominant   rdo   there exists an all 1 row *
col-dominant   cdo   there exists an all 1 column *
regular        reg   1-covered rows and 1-covered columns
non-null       one   there exists at least one 1

* except may be diagonal

One of the possible ways of constructing a criterion function that directly reflects the considered equivalence is to measure the fit of a clustering to an ideal one with perfect relations within each cluster and between clusters according to the considered equivalence.

Given a set of types of connection $\mathcal{T}$ we can introduce the set of ideal blocks for a given type $T \in \mathcal{T}$ by

$$B(C_i, C_j; T) = \{B \subseteq C_i \times C_j : T(B)\}$$

Using Table 3 we can efficiently test whether the block $R(C_i, C_j)$ is of the type $T$, and define the deviation $\delta(C_i, C_j; T)$ of a block $R(C_i, C_j)$ from the nearest ideal block. For example

$$\delta(C_i, C_j; reg) = |C_i| \cdot (|C_j| - c_j) + |C_j| \cdot (|C_i| - r_i)$$

where $c_j$ is the number of non-zero columns, and $r_i$ is the number of non-zero rows in the block $R(C_i, C_j)$. For details see (Batagelj 1997).

For the proposed types all deviations are sensitive:

$$\delta(C_i, C_j; T) = 0 \Leftrightarrow T(R(C_i, C_j)).$$

Therefore a block $R(C_i, C_j)$ is of a type $T$ exactly when the corresponding deviation $\delta(C_i, C_j; T)$ is 0. In the deviation $\delta$ we can also incorporate values of lines $v$.

Based on the deviation $\delta(C_i, C_j; T)$ we introduce the block-error $\varepsilon(C_i, C_j; T)$ of $R(C_i, C_j)$ for type $T$.
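The characterizations in Table 3 translate almost directly into predicates on a block given as a 0/1 matrix. The following Python sketch is one possible reading, not the implementation used by the authors: the function names, the `diagonal` flag (used to apply the "except may be diagonal" clause) and the priority order in `first_type` are assumptions made for illustration. The 8x8 matrix is the one shown in Table 2.

```python
def cells(B, skip_diagonal=False):
    """All entries of block B, optionally leaving out the diagonal entries."""
    return [v for i, row in enumerate(B) for j, v in enumerate(row)
            if not (skip_diagonal and i == j)]

def is_null(B, diagonal=False):
    return all(v == 0 for v in cells(B, diagonal))

def is_complete(B, diagonal=False):
    return all(v != 0 for v in cells(B, diagonal))

def is_row_regular(B):              # each row is 1-covered
    return all(any(row) for row in B)

def is_col_regular(B):              # each column is 1-covered
    return all(any(col) for col in zip(*B))

def is_regular(B):
    return is_row_regular(B) and is_col_regular(B)

def is_row_dominant(B, diagonal=False):   # there exists an all-1 row
    return any(all(v != 0 for j, v in enumerate(row) if not (diagonal and i == j))
               for i, row in enumerate(B))

def is_col_dominant(B, diagonal=False):   # there exists an all-1 column
    return is_row_dominant([list(col) for col in zip(*B)], diagonal)

def first_type(B, diagonal=False):
    """First matching type in an ASSUMED priority order; 'one' matches any non-null block."""
    if is_null(B, diagonal):
        return "nul"
    if is_complete(B, diagonal):
        return "com"
    if is_regular(B):
        return "reg"
    if is_row_regular(B):
        return "rre"
    if is_col_regular(B):
        return "cre"
    if is_row_dominant(B, diagonal):
        return "rdo"
    if is_col_dominant(B, diagonal):
        return "cdo"
    return "one"

def sub(R, rows, cols):
    """Block of matrix R restricted to the given row and column indices."""
    return [[R[x][y] for y in cols] for x in rows]

# Relation matrix of Table 2 (units indexed 0..7).
R = [[1, 1, 1, 1, 1, 1, 0, 0],
     [1, 1, 1, 1, 0, 1, 0, 1],
     [1, 1, 1, 1, 0, 0, 1, 0],
     [1, 1, 1, 1, 1, 0, 0, 0],
     [0, 0, 0, 0, 0, 1, 1, 1],
     [0, 0, 0, 0, 1, 0, 1, 1],
     [0, 0, 0, 0, 1, 1, 0, 1],
     [0, 0, 0, 0, 1, 1, 1, 0]]
clusters = [[0, 1, 2, 3], [4, 5, 6, 7]]
image = [[first_type(sub(R, Ci, Cj), diagonal=(Ci is Cj)) for Cj in clusters]
         for Ci in clusters]
print(image)
```

With this clustering the computed types agree with the image matrix given in Table 2 (complete, regular, null, complete).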
An example of block-error is

$$\varepsilon(C_i, C_j; T) = w(T)\,\delta(C_i, C_j; T)$$

where $w(T) > 0$ is a weight of type $T$.

We extend the block-error to the set of feasible types $\mathcal{T}$ by defining

$$\varepsilon(C_i, C_j; \mathcal{T}) = \min_{T \in \mathcal{T}} \varepsilon(C_i, C_j; T)$$

and

$$\pi(\mu(C_i), \mu(C_j)) = \operatorname{argmin}_{T \in \mathcal{T}} \varepsilon(C_i, C_j; T)$$

To make $\pi$ well-defined, we order (prioritize) the set $\mathcal{T}$ and select the first type from $\mathcal{T}$ which minimizes $\varepsilon$.

We combine block-errors into a total error - the blockmodeling criterion function

$$P(\mathbf{C}(\mu); \mathcal{T}) = \sum_{(t, w) \in U \times U} \varepsilon(C(t), C(w); \mathcal{T}).$$

For the criterion function $P$ we have

$$P(\mathbf{C}(\mu)) = 0 \Leftrightarrow \mu \text{ is an exact blockmodeling}$$

The obtained optimization problem can be solved by local optimization. Once a partitioning $\mu$ and types of connection $\pi$ are determined, we can also compute the values of connections by using averaging rules.

3.2 Local Optimization

For solving the blockmodeling problem we use a local optimization procedure (relocation algorithm):

Determine the initial clustering $\mathbf{C}$;
repeat: if in the neighborhood of the current clustering $\mathbf{C}$ there exists a clustering $\mathbf{C}'$ such that $P(\mathbf{C}') < P(\mathbf{C})$, then move to clustering $\mathbf{C}'$.

The neighborhood in this local optimization procedure is determined by the following two transformations:

- moving a unit $X_k$ from cluster $C_p$ to cluster $C_q$ (transition);
- interchanging units $X_u$ and $X_v$ from different clusters $C_p$ and $C_q$ (transposition).

3.3 Benefits from Optimization Approach

- ordinary / inductive blockmodeling: Given a network $N$ and a set of types of connection $\mathcal{T}$, determine the model $M$, i.e., $\mu$, $\pi$ and $\alpha$;
- evaluation of the quality of a model, comparing different models, analyzing the evolution of a network (Sampson data, Doreian and Mrvar 1996): Given a network $N$, a model $M$, and blockmodeling $\mu$, compute the corresponding criterion function;
- model fitting / deductive blockmodeling: Given a network $N$, a set of types $\mathcal{T}$, and a model $M$, determine $\mu$ which minimizes the criterion function (Batagelj, Ferligoj, Doreian, 1998).
- we can fit the network to a partial model and analyze the residual afterward;
- we can also introduce different constraints on the model, for example: units X and Y are of the same type; or, types of units X and Y are not connected.

4 Pre-Specified Blockmodels

Figure 5: Symmetric acyclic blockmodel of Student Government.

Pre-specified blockmodeling starts with a blockmodel specified, in terms of substance, prior to an analysis. Given a network, a set of ideal blocks is selected, a reduced model is formulated, and partitions are established by minimizing the criterion function. Pre-specified blockmodeling is supported by the program MODEL 2 (Batagelj, 1996).

As an example of a pre-specified blockmodel we present in Figure 5 a symmetric acyclic blockmodel of the Student Government. The obtained clustering into 4 clusters is almost exact - an acyclic model with symmetric clusters. The only error is produced by the arc (a3, m5).

5 Final Remarks

The current programs for generalized blockmodeling, which are based on local optimization, can deal only with networks of at most some hundreds of units. What to do with larger networks is an open question. For some specialized problems, procedures for (very) large networks can also be developed (Doreian, Batagelj, Ferligoj, 1998).

Another interesting problem is the development of blockmodeling of valued networks.

MODEL 2 and related programs and data can be obtained from http://vlado.fmf.uni-lj.si/pub/networks/stran/

Acknowledgment: This work was supported by the Ministry of Science and Technology of Slovenia, Project J1-8532.

References

[1] Batagelj, V. (1991): STRAN - STRucture ANalysis. Manual, Ljubljana.
[2] Batagelj, V. (1997): Notes on Blockmodeling. Social Networks, 19, 143-155. Also in: Abstracts and Short Versions of Papers, 3rd European Conference on Social Network Analysis, München, 1993: DJI, 1-9.
[3] Batagelj, V., P. Doreian, and A. Ferligoj (1992): An optimizational approach to regular equivalence. Social Networks, 14:121-135.
[4] Batagelj, V., A. Ferligoj, and P. Doreian (1992): Direct and indirect methods for structural equivalence. Social Networks, 14:63-90.
[5] Batagelj, V., A. Ferligoj, and P. Doreian (1998): Fitting Pre-Specified Blockmodels. In Data Science, Classification, and Related Methods (C. Hayashi, N. Ohsumi, K. Yajima, Y. Tanaka, H. H. Bock, and Y. Baba, Eds.), Springer-Verlag, Tokyo, pp. 199-206.
[6] Borgatti, S.P. and M.G. Everett (1989): The class of all regular equivalences: Algebraic structure and computation. Social Networks, 11:65-88.
[7] Doreian, P., V. Batagelj, and A. Ferligoj (1994): Partitioning Networks on Generalized Concepts of Equivalence. Journal of Mathematical Sociology, 19/1:1-27.
[8] Doreian, P., V. Batagelj, and A. Ferligoj (1998): Symmetric-Acyclic Decompositions of Networks. To appear in Journal of Classification.
[9] Doreian, P. and A. Mrvar (1996): A Partitioning Approach to Structural Balance. Social Networks, 18:149-168.
[10] Faust, K. (1988): Comparison of methods for positional analysis: Structural and general equivalences. Social Networks, 10:313-341.
[11] Ferligoj, A., V. Batagelj, and P. Doreian (1994): On Connecting Network Analysis and Cluster Analysis. In Contributions to Mathematical Psychology, Psychometrics, and Methodology (G.H. Fischer, D. Laming, Eds.), New York: Springer.
[12] Lorrain, F. and H.C. White (1971): Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1:49-80.
[13] Hlebec, V. (1993): Recall versus recognition: Comparison of two alternative procedures for collecting social network data. Developments in Statistics and Methodology (A. Ferligoj and A. Kramberger, editors), Metodološki zvezki 9, Ljubljana: FDV, 121-128.
[14] White, D.R. and K.P. Reitz (1983): Graph and semigroup homomorphisms on networks of relations. Social Networks, 5:193-234.

Adapted Methods For Clustering Large Datasets Of Mixed Units

Simona Korenjak-Černe
IMFM Ljubljana, Dept. of TCS, Jadranska 19, 1000 Ljubljana, Slovenia
E-mail: simona.korenjak@fmf.uni-lj.si

Keywords: clustering, large datasets, mixed units, hierarchical clustering, cluster description compatible with merging of clusters, leaders method, adding clustering method

Edited by: Cene Bavec and Matjaž Gams
Received: October 17, 1999 Revised: October 30, 1999 Accepted: December 11, 1999

The proposed clustering methods are based on the recoding of the original mixed units and their clusters into a uniform representation. The description of a cluster consists, for each variable, of the frequencies of the variable values over its range partition. The proposed representation can also be used for clustering symbolic data. On the basis of this representation, adapted versions of the leaders method and the adding clustering method were implemented. We describe both approaches, which were successfully applied to several large datasets.

1 Introduction

Abstraction is the main tool for dealing with large amounts of data. The first step is to identify groups of similar units - clusters. In data analysis this is the task of clustering methods. The most popular are hierarchical clustering methods. Because they usually use a similarity/dissimilarity matrix, they are appropriate only for clustering datasets of a moderate size (some hundreds of units). On the other hand, well-known nonhierarchical methods are mostly implemented for datasets of variables measured in the same scale type (for example, the k-means method).

Because of these limits we are searching for new clustering methods, or at least trying to adapt known methods so that they are appropriate for clustering large datasets of mixed units, where variables (properties) of the units are measured in different scales.

Let $E$ be a finite set of units. A nonempty subset $C \subseteq E$ is called a cluster. A set of clusters $\mathbf{C} = \{C_i\}$ forms a clustering. In this paper we shall require that every clustering $\mathbf{C}$ is a partition of $E$.
The clustering problem can be formulated as an optimization problem: Determine the clustering $\mathbf{C}^* \in \Phi$ for which

$$P(\mathbf{C}^*) = \min_{\mathbf{C} \in \Phi} P(\mathbf{C})$$

where $\Phi$ is a set of feasible clusterings and $P : \Phi \to \mathbb{R}_0^+$ is a criterion function.

In many clustering methods the criterion function measures the deviation of units from representatives (leaders) of the corresponding clusters. In our method we select the criterion function in one of the most frequent forms

$$P(\mathbf{C}) = \sum_{C \in \mathbf{C}} \sum_{X \in C} d(X, R_C)$$

where $R_C$ is a representative of cluster $C$ and $d$ is a dissimilarity.

The cluster representatives usually consist of variable-wise summaries of variable values over the cluster. For homogeneous units with only numerical variables, their means are usually selected as representatives of clusters. For mixed (nonhomogeneous) units a new description has to be selected. In this paper we investigate a description satisfying two additional requirements:

1. it should require a fixed space per variable;
2. it should be compatible with merging of clusters - knowing the description of two disjoint clusters we can, without additional information, produce the description of their union.

Note that only some of the cluster descriptions are compatible with merging, for example the mean (as sum and number of units) for numerical variables and (min, max) intervals for ordinal variables.

2 A description of a cluster

For our adaptation of clustering methods to large datasets of mixed units, we choose a cluster description based on frequencies. For this purpose, the ranges of the variables are partitioned into a selected number of classes.

Let $\{V_i, i = 1, \ldots, k(V)\}$ be a partition of the range of values of variable $V$ (the number of classes $k(V)$ depends on the variable). Then we can define for a cluster $C$ the sets

$$Q(i, C; V) = \{X \in C : V(X) \in V_i\}, \quad i = 1, \ldots, k(V)$$

where $V(X)$ denotes the value of variable $V$ on unit $X$.

In the case of an ordinal variable $V$ (numerical scales are a special case of ordinal scales) the partition $\{V_i, i = 1, \ldots, k(V)\}$ usually consists of intervals determined by selected threshold values

$$t_0 < t_1 < t_2 < t_3 < \cdots < t_{k(V)-1} < t_{k(V)}, \quad t_0 = \inf V, \quad t_{k(V)} = \sup V.$$

For nominal variables we can obtain the partition, for example, by selecting $k(V) - 1$ values $t_1, t_2, t_3, \ldots, t_{k(V)-1}$ from the range of variable $V$ (usually the most frequent values on $E$), setting $V_i = \{t_i\}$, $i = 1, \ldots, k(V) - 1$, and putting all the remaining values in class $V_{k(V)}$.

Units are not necessarily represented with a single value for each variable; they can also be represented with frequencies over the classes of the variables' ranges.

Using classes of ranges we get frequencies

$$q(i, C; V) = \operatorname{card} Q(i, C; V)$$

and relative frequencies

$$p(i, C; V) = \frac{q(i, C; V)}{\operatorname{card} C}$$

Note that

$$\sum_{i=1}^{k(V)} p(i, C; V) = 1$$

When only a single unit $X$ is in the cluster $C$ we get

$$p(i, \{X\}; V) = \begin{cases} 1 & X \in Q(i, E; V) \\ 0 & \text{otherwise} \end{cases}$$

We can add, for each variable, a new class for a missing value and treat it as a special value, or we can also consider a missing value on $V$ for a unit $X$ by setting $p(i, \{X\}; V) = \frac{1}{k(V)}$, $i = 1, \ldots, k(V)$ (or by some other distribution).

It is easy to see that such a description is compatible with merging, because for two disjoint clusters $C_1$ and $C_2$ we have

$$Q(i, C_1 \cup C_2; V) = Q(i, C_1; V) \cup Q(i, C_2; V),$$
$$q(i, C_1 \cup C_2; V) = q(i, C_1; V) + q(i, C_2; V).$$

The threshold values are usually determined in such a way that, for the given set of units $E$ (or the space of units), it holds that

$$p(i, E; V) \approx \frac{1}{k(V)}, \quad i = 1, \ldots, k(V).$$

As a compatible description of a nominal variable over a cluster $C$, its range $V(C)$ can also be used, since we have $V(C_1 \cup C_2) = V(C_1) \cup V(C_2)$.

Example: Recoding of flags dataset

Original data are taken from the address ftp://ftp.ics.uci.edu/pub/machine-learning-databases/flags (Flags from Collins Gem Guide to Flags, donated by Richard S. Forsyth.)
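As a sketch of how such a recoding and the resulting frequency descriptions could be computed for this dataset, consider the following Python fragment. The class boundaries and the five population values are those listed in the example that follows; the function names and the split of the five countries into two clusters are hypothetical and serve only to show that the description is compatible with merging.

```python
from bisect import bisect_left


def recode_ordinal(value, upper_bounds):
    """1-based class index for classes {t0}, (t0, t1], (t1, t2], ... given their upper bounds."""
    return bisect_left(upper_bounds, value) + 1


def frequencies(class_indices, k):
    """q(i, C; V): number of units of the cluster falling into class i = 1..k."""
    q = [0] * k
    for i in class_indices:
        q[i - 1] += 1
    return q


def relative(q):
    """p(i, C; V) = q(i, C; V) / card C."""
    n = sum(q)
    return [x / n for x in q]


def merge(q1, q2):
    """Description of the union of two disjoint clusters: frequencies simply add up."""
    return [a + b for a, b in zip(q1, q2)]


# Classes of 'population': {0}, (0,4], (4,18], (18,158], (158,1100]
POPULATION_BOUNDS = [0, 4, 18, 158, 1100]
population = {"Austria": 8, "New-Zealand": 2, "Saudi-Arabia": 9,
              "Switzerland": 6, "USA": 231}
recoded = {name: recode_ordinal(v, POPULATION_BOUNDS) for name, v in population.items()}

# Hypothetical split of the five units into two disjoint clusters.
c1 = [recoded["Austria"], recoded["New-Zealand"]]
c2 = [recoded["Saudi-Arabia"], recoded["Switzerland"], recoded["USA"]]
q_union = merge(frequencies(c1, 5), frequencies(c2, 5))
print(recoded)
print(q_union, relative(q_union))
```

The merged description equals the one computed directly from the union of the two clusters, so no access to the original units is needed when clusters are joined.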
Let us consider the following three variables:

- population (in round millions),
- mainhue (predominant color in the flag (tie-breaks decided by taking the topmost hue, if that fails then the most central hue, and if that fails the leftmost hue)),
- text (1 if any letters or writing on the flag (e.g., a motto or slogan), 0 otherwise).

The range of the variable population is divided into 5 classes with approximately the same number of units in each of them. The ranges of the other variables are so small that, for their discretization, we put each possible value in a separate class:

var = population   map: 1 = {0}, 2 = (0,4], 3 = (4,18], 4 = (18,158], 5 = (158,1100]
var = mainhue      map: 1 = {red}, 2 = {green}, 3 = {blue}, 4 = {gold}, 5 = {white}, 6 = {black}, 7 = {orange}
var = text         map: 1 = {0}, 2 = {1}

ORIGINAL DATA

unit ID        population  mainhue  text
Austria             8       red      0
New-Zealand         2       blue     0
Saudi-Arabia        9       green    1
Switzerland         6       red      0
USA               231       white    0

RECODED DATA

unit ID        population  mainhue  text
Austria             3         1       1
New-Zealand         2         3       1
Saudi-Arabia        3         2       2
Switzerland         3         1       1
USA                 5         5       1

In our case, for each variable a unit is represented with the index of the appropriate class.

The description of a cluster $C_6$ (only for the considered variables) obtained with the leaders method is

q(C6; population)   8  1  1  0  0
q(C6; mainhue)      1  0  9  0  0
q(C6; text)         6  4

This cluster is one of the seven clusters obtained with the adapted version of the leaders method with maximal allowed dissimilarity between a unit and its nearest leader 0.5. From this description we can see that in eight countries the population is less than a million, in one country it is between 1 and 4 million, and in one country it is between 4 and 18 million. In one of the countries' flags red is the dominant color, and all of the remaining units have blue mainhue. Six units in cluster $C_6$ have no text in their description and four units have some text.

For better understanding, cluster $C_6$ consists of: Bermuda, Brit. Virg. Isles, Cayman Islands, Falklands Malvi, Fiji, HongKong, Montserrat, St. Helena, Turks Cocos Islands and Tuvalu.

3 Dissimilarity between clusters

Let us return to our approach to the clustering problem as an optimization problem. After deciding to use the uniform representation of units and clusters, we have to define a measure of dissimilarity between clusters (a unit is a special case of a cluster with only one element).

First the dissimilarity between clusters for an individual variable $V$ is defined as

$$d(C_1, C_2; V) = \frac{1}{2} \sum_{i=1}^{k(V)} |p(i, C_1; V) - p(i, C_2; V)|$$

We shall use the abbreviation $d(X, C; V) = d(\{X\}, C; V)$.

In both cases it can be shown that

1. $d(C_1, C_2; V)$ is a semidistance on clusters; i.e.
   (a) $d(C_1, C_2; V) \ge 0$
   (b) $d(C, C; V) = 0$
   (c) $d(C_1, C_2; V) + d(C_2, C_3; V) \ge d(C_1, C_3; V)$
2. $d(C_1, C_2; V) \in [0, 1]$

and for the representation of a single unit also

$$X \in Q(i, E; V) \Rightarrow d(X, C; V) = 1 - p(i, C; V)$$

The semidistances on clusters for individual variables can be combined into a semidistance on clusters for complete descriptions by

$$d(C_1, C_2) = \sum_{j=1}^{m} \alpha_j \, d(C_1, C_2; V_j)$$

where $m$ is the number of variables and $\alpha_j$ are weights ($\alpha_j \ge 0$ and $\sum_{j=1}^{m} \alpha_j = 1$); often $\alpha_j = \frac{1}{m}$. We can use weights to consider dependencies among variables or to tune the dissimilarity to a given learning set in AI applications.

4 Clustering procedures

In the proposed approach the original nonhomogeneous data are first recoded to a uniform representation. For the recoded data, efficient clustering procedures can be built by adapting the leaders method (Hartigan, 1975) or the adding clustering method (Zupan 1982, Jambu and Lebeaux 1983, Batagelj and Mandelj 1993).

4.1 The adapted version of the leaders method

The adapted version of the leaders method is a variant of a dynamic clustering method (Diday 1979, Batagelj 1985).
To describe the dynamic clustering method for solving the clustering problem, let us denote ($\Phi$ being the set of feasible clusterings):

- $\Lambda$: a set of representatives;
- $L \subseteq \Lambda$: a representation;
- $\Psi$: a set of feasible representations;
- $P : \Phi \to \mathbb{R}_0^+$: a criterion function;
- $G : \Phi \to \Psi$: a representation function;
- $F : \Psi \to \Phi$: a clustering function;

and suppose that the functions $G$ and $F$ tend to improve (diminish) the value of the criterion function $P$. Then a simple version of the dynamic clustering method can be described by the scheme:

    L := L_0;
    repeat
        C := F(L);
        L := G(C)
    until the leaders stabilize

We begin with the initial representation $L_0$, and then repeatedly assign each unit to the nearest leader and select new leaders for each (new) cluster, until we reach the minimum of the criterion function or until the leaders no longer change (local minimum).

Let us assume the following model: $C = \{C_i\}_{i \in I}$, $L = \{L_i\}_{i \in I}$, $L(X) = L_i : X \in C_i$ (the nearest leader to the unit $X$), $L = [L(V_1), \ldots, L(V_m)]$, $L(V) = [s(1, L; V), \ldots, s(k(V), L; V)]$ with $\sum_{j=1}^{k(V)} s(j, L; V) = 1$ (the description of a leader has the same form as the description of a cluster), and

$$d(C, L; V) = \frac{1}{2} \sum_{j=1}^{k(V)} |p(j, C; V) - s(j, L; V)|.$$

For the selected criterion function

$$P(C) = \sum_{X \in E} d(X, L(X)) = \sum_{i \in I} p(C_i, L_i), \quad \text{where} \quad p(C, L) = \sum_{X \in C} d(X, L),$$

we define $F(L) = \{C'_i\}$ with

$$X \in C'_i : \; i = \min \operatorname{Argmin} \{ d(X, L_j) : L_j \in L \}.$$

This means that each unit is assigned to the (first) nearest leader. We define $G(C) = \{L'_i\}$ with

$$L'_i = \operatorname{argmin}_{L} \; p(C_i, L).$$

The unique symmetric optimal solution of this optimization problem is

$$s(j, L; V) = \begin{cases} 1/t & \text{if } j \in M \\ 0 & \text{otherwise} \end{cases}$$

where $M = \{ j : q(j, C; V) = \max_i q(i, C; V) \}$ and $t = \operatorname{card} M$. The representative (leader) of a cluster is obtained from the most frequent range(s) of values of variables on this cluster.
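The scheme above can be illustrated with a short Python sketch. It is only a minimal reading of the method, under several assumptions that are not taken from the paper: units are already recoded to 0-based class indices (the paper numbers classes from 1), all variables have equal weights $1/m$, the iteration stops when the assignment of units no longer changes (which implies that the leaders no longer change), and an empty cluster simply keeps its previous leader. The function names `dissimilarity`, `optimal_leader` and `leaders_method` are hypothetical.

```python
from typing import List, Sequence, Tuple

Unit = Sequence[int]        # recoded unit: one 0-based class index per variable
Leader = List[List[float]]  # per variable V: weights s(j, L; V) that sum to 1


def dissimilarity(x: Unit, leader: Leader) -> float:
    """d(X, L) with equal weights 1/m (an assumption of this sketch).

    For a unit in class i of variable V, (1/2) * sum_j |p - s| reduces to
    1 - s(i, L; V), because the unit's description is an indicator vector.
    """
    m = len(x)
    return sum(1.0 - leader[v][x[v]] for v in range(m)) / m


def optimal_leader(cluster: Sequence[Unit], n_classes: Sequence[int]) -> Leader:
    """G: put weight 1/t on each of the t most frequent classes of every variable."""
    leader = []
    for v, k in enumerate(n_classes):
        counts = [0] * k                      # frequencies q(j, C; V)
        for x in cluster:
            counts[x[v]] += 1
        top = max(counts)
        modal = [j for j in range(k) if counts[j] == top]   # the set M
        leader.append([1.0 / len(modal) if j in modal else 0.0 for j in range(k)])
    return leader


def leaders_method(units: Sequence[Unit], n_classes: Sequence[int],
                   initial_leaders: Sequence[Leader],
                   max_iter: int = 100) -> Tuple[List[int], List[Leader]]:
    """Alternate F (assign to the first nearest leader) and G (recompute leaders)."""
    leaders = [[list(row) for row in l] for l in initial_leaders]  # private copy
    assignment: List[int] = []
    for _ in range(max_iter):
        # F: min() returns the first index among ties, i.e. the first nearest leader
        new_assignment = [
            min(range(len(leaders)), key=lambda i: dissimilarity(x, leaders[i]))
            for x in units
        ]
        if new_assignment == assignment:
            break                             # clusters, hence leaders, are stable
        assignment = new_assignment
        for i in range(len(leaders)):
            members = [x for x, a in zip(units, assignment) if a == i]
            if members:                       # assumption: empty clusters keep their leader
                leaders[i] = optimal_leader(members, n_classes)
    return assignment, leaders
```

Calling `optimal_leader` on ten units whose class frequencies are [8 1 1 0 0], [1 0 9 0 0 0 0] and [6 4] returns a leader with all weight on the first population class, the third mainhue class and the first text class.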
Example: Leader of a cluster

For the description of a cluster $C_6$

    q(C_6; population) = [8 1 1 0 0]
    q(C_6; mainhue)    = [1 0 9 0 0 0 0]
    q(C_6; text)       = [6 4]

the optimal leader $L_6$ is

    L_6(population) = [1 0 0 0 0]
    L_6(mainhue)    = [0 0 1 0 0 0 0]
    L_6(text)       = [1 0]

The characteristics of the cluster are

    population = less than a million    80 %
    mainhue in the flag = blue          90 %
    text in the flag = no               60 %

For example, 80% of all countries in the cluster $C_6$ have less than a million inhabitants, 90% of the flags have blue mainhue, and 60% of the flags in the cluster have no text in their descriptions.

Properties of the leaders method

The main properties of the adapted version of the leaders method are:

1. Selection of the leaders and formation of new clusters diminish the value of the criterion function.
2. The program always stops (converges). The number of iterations is usually less than 10.
3. The program is suitable for clustering (very) large datasets.
4. The leaders' descriptions provide us with simple interpretations of clustering results.

4.2 The adapted adding method

The adding clustering method is a hierarchical clustering method in which a new unit is added to a clustering tree. Each vertex corresponds to a cluster. For large datasets usually only the upper part of the hierarchy is maintained; the lower-level subtrees are replaced by 'bags' containing all units from a subtree.

We shall use the same description of a cluster (vertex) and the same definition of dissimilarity as in the leaders method. Every time we add a unit to a cluster (vertex), the frequencies are recalculated. There are two possible ways to add a new unit:

a) to maximize the dissimilarity between clusters (sons) of the current vertex, or
b) to minimize the dissimilarity from clusters (sons) of the current vertex.

In the first case (see Figure 1) the dissimilarities between both sons of a current vertex are calculated.
Because of the greedy approach, the case with the biggest dissimilarity is chosen:

$$\max \{ d(C_p \cup \{X\}, C_q), \; d(C_p, C_q \cup \{X\}), \; d(C, \{X\}) \}.$$

Figure 1: Maximize the dissimilarity between clusters

Figure 2: Minimize the dissimilarity from clusters (if $d(C_p, X) < d(C_q, X)$ then $X$ is added to $C_p$, else to $C_q$)

In the second case (see Figure 2) the dissimilarities from each of the sons of the current vertex are calculated and the unit is added to the nearest one:

$$\min \{ d(C_p, \{X\}), \; d(C_q, \{X\}) \}.$$

The proposed approaches can also be extended to non-binary trees. The adding clustering method has some advantages:

1. Presentation of the result with a tree.
2. It can be used for classification.
3. Speed-up: if the tree has many (hundreds of) leaves which represent the leaders, it is more efficient to add a unit into the tree with this method than to calculate the dissimilarities to each of the leaders.

A drawback of the adding method is that the result strongly depends on the ordering of the input sequence of units. A possible way to avoid this problem is to select a 'good' initial tree. We suggest building the initial tree with some agglomerative hierarchical clustering method on leaders obtained with the leaders method. The other possibility is to include balancing of the tree in the process of adding a new unit. Both possibilities are still under development.

5 Conclusion

We successfully applied the proposed approach on the dataset of types of cars (1 349 units, 26 variables), on the ISSP data (45 784 units, 21 variables) and also on some large datasets from the AI collection http://www.ics.uci.edu/~mlearn/MLRepository.html

The first version of the program ClaMix (based on the adapted version of the leaders method) and some of the results are available at http://www.educa.fmf.uni-lj.si/datana/

Acknowledgment: This work was supported by the Ministry of Science and Technology of Slovenia, Project J1-8532.

References

[1] Batagelj, V. (1985) Notes on the dynamic clusters method. Proceedings of the IV conference on applied mathematics, Split, May 28-30, 1984. University of Split, Split, p. 139-146.

[2] Batagelj, V. & Bren, M. (1995) Comparing Resemblance Measures. Journal of Classification, 12, 1, p. 73-90.

[3] Batagelj, V. & Mandelj, M. (1993) Adding Clustering Algorithm Based on L-W-J Formula. Paper presented at: IFCS'93, Paris, 31.aug-4.sep 1993.

[4] Brucker, P. (1978) On the complexity of clustering problems. Lecture Notes in Economics and Mathematical Systems 175, in: Optimization and Operations Research, Proceedings, Bonn. Henn, R., Korte, B., Oettli, W. (Eds.), Springer-Verlag, Berlin 1978.

[5] Diday, E. (1979) Optimisation en classification automatique. Tome 1, 2. INRIA, Rocquencourt (in French).

[6] Diday, E. (1997) Extracting Information from Extensive datasets by Symbolic Data Analysis. Indo-French Workshop on Symbolic Data Analysis and its Applications, Paris, 23-24. September 1997, Paris IX, Dauphine, p. 3-12.

[7] Hartigan, J.A. (1975) Clustering Algorithms. Wiley, New York.

[8] Jambu, M. & Lebeaux, M.O. (1983) Cluster Analysis and Data Analysis. North-Holland Publishing Company.

[9] Korenjak-Černe, S. & Batagelj, V. (1998) Clustering large datasets of mixed units. Advances in Data Science and Classification. Rizzi, A., Vichi, M. and Bock, H.-H. (Eds.), Springer, Berlin, 1998, p. 43-48.

[10] Tukey, J.W. (1977) Exploratory Data Analysis. Addison-Wesley, Reading, MA.

[11] Zupan, J. (1982) Clustering of Large Data Sets. Research Studies Press, John Wiley & Sons LTD.

[12] Flags from Collins Gem Guide to Flags. Collins Publishers (1986). Donated by Richard S. Forsyth. ftp://ftp.ics.uci.edu/pub/machine-learning-databases/flags
Equation Discovery System And Neural Networks For Short-Term DC Voltage Prediction

Irena Nančovska, Anton Jeglič and Dušan Fefer
Faculty of Electrical Engineering, Tržaška 25, Ljubljana, Slovenia
Phone: +386 61 1768 216, Fax: +386 61 1768 214
E-mail: {Irena.Nancovska,Anton.Jeglic,Dusan.Fefer}@fe.uni-lj.si
AND
Ljupčo Todorovski
Jožef Stefan Institute, Jamova 39, Ljubljana, Slovenia
Phone: +386 61 1773 307, Fax: +386 61 1258 058
E-mail: Ljupco.Todorovski@ijs.si

Keywords: neural networks, equation discovery, machine learning

Edited by: Cene Bavec and Matjaž Gams
Received: October 12, 1999 Revised: November 25, 1999 Accepted: December 15, 1999

The aim of the paper is to compare the predictive abilities of a novel method for time series prediction, based on equation discovery, with those of neural networks. Both methods are used for short-term (one-step ahead) prediction and have the ability to learn from examples. In order to validate the predictive models, they are applied to several data sets. The successful predictive models could be used for voltage monitoring in a high precision solid-state DC voltage reference source (DCVRS) without the presence of a high level standard, and further for voltage correction as a segment in the software controlled voltage reference elements (VRE).

1 Introduction

Measured time series can be described as mixtures of a dynamic, deterministic part, which drives the process, and observational noise, which is added in the measurement process but does not influence the future behavior of the system. Much current research on predicting the future behavior of a system is based on modelling of the deterministic part. Examples range from the irregularity in the annual number of sunspots to the changes of currency exchange rates. To make a forecast when the underlying deterministic equations of the observed system are not known, one must find out both the rules governing the system dynamics and the present state of the system (Gershenfeld & Weigend 1992). Mainstream statistical techniques for predicting include variations of the auto-regressive technique that Yule invented in 1927. The technique uses a weighted sum of previous observations of the series to predict the next value. However, there are a number of cases for which this paradigm is inadequate because of the non-linearity of the underlying model (Gershenfeld & Weigend 1992). In the paper we present two different paradigms for forecasting: neural networks and equation discovery. Neural networks used for prediction are characterized as black-box models, whereas models obtained with equation discovery systems are transparent (white-box).

Neural networks represent an emerging technology with some important characteristics: universal approximation (input-output mapping), the ability to learn from and adapt to their environment, and the ability to evoke weak assumptions about the underlying physical system which generates the input data (Haykin 1998). In the paper we use three types of neural networks. The first one is a supervised multilayer feedforward network, which is trained with the back-propagation learning algorithm (Haykin 1998, Nielsen 1990, Pham 1995). The second type emphasizes the role of time as an essential dimension of learning. It is a natural extension of the first type, replacing the ordinary synaptic weights with finite-duration impulse response (FIR) filters (Haykin 1998, Gershenfeld & Weigend 1992). The third type of network is a recurrent structure with hidden neurons, which introduces time into the network processing by virtue of the built-in feedback loop (Alippi 1996, Haykin 1998).

Equation discovery systems explore the hypothesis space of all equations that can be constructed given a set of arithmetical operators, functions and variables, searching for an equation that fits the input data best. In the paper, we present an equation discovery system Lagramge that uses context free grammars for restricting the hypothesis space of equations. The hypothesis space of Lagramge is a set of equations, such that the expressions on their right hand sides can be derived from a given context free grammar. For the purpose of time series prediction, we use difference equations that predict the present value of the time series from its previous values. Three different grammars, for linear, quadratic and piecewise linear equations, are used.

In order to compare the predictive abilities of the two described paradigms, we performed experiments on two synthetic and three real world time series prediction problems. The domains used in the experiments present models with different amounts of non-linear dynamics (determinism) and noise (randomness).

The predictive models obtained in the experiment with the reference voltage domain can be used to improve the metrological characteristics of a DCVRS in two different manners: voltage monitoring and voltage correction. For the purpose of voltage monitoring, predictive models could be used during the inter-calibration period without the presence of a high level standard, while the predictors are obtained during the calibration period by using a high precision instrument. Further, the models could be used for voltage correction, as a segment in a software controlled VRE. By implementing a control loop for voltage correction, based on the obtained predictors, the sensitivity of the reference source could be reduced, which contributes to the robustness of the system and thereby to the stability of the reference voltage (Nancovska 1997).

The paper is organized as follows. The next two sections describe the techniques used for time series prediction. In Section 2 a brief description of the neural networks used is given, and Section 3 gives an overview of the equation discovery system Lagramge.
The results of applying both techniques on five time series data sets are presented in Section 4. Finally, Section 5 concludes with a summary of the results and directions for further work.

2 Neural Networks

The time series $x(1), x(2), x(3), \ldots$ which describes the system is given. From the series we generate vectors $\mathbf{x}(n) = [x(n-1), x(n-2), \ldots, x(n-p)]^T$, which describe the last $p$ values of the phenomenon until time $n-1$. We are trying to find a map $F(\mathbf{x}(n)) = \hat{x}(n)$ such that the predicted value $\hat{x}(n)$ at time $n$ is as similar as possible to the original signal value $x(n)$ at time $n$. To realize $F$ we use three different types of neural networks (Gershenfeld & Weigend 1992, Narendra 1990, Pham 1995).

2.1 Multilayer perceptron (MP)

We use a general multilayer feed-forward network (Lippmann 1987) whose learning algorithm is the generalized $\delta$-rule, or back-propagation (BP). The user interface provides regulation of the following parameters: number of layers, number of neurons in each layer, learning rate $\eta$ and momentum term $\alpha$. Parameters $\eta$ and $\alpha$ can be changed during the training. Neurons in the input layer act as buffers for distributing the input signals $\mathbf{x}(n)$ to neurons in the hidden layer (Haykin 1998, Lippmann 1987, Nielsen 1990, Pham 1995). MP is usually used as a pattern recognition tool, but from a systems theoretic point of view it can also be used for approximation of non-linear maps (Narendra 1990).

2.2 FIR multilayer perceptron (FIR-MP)

In order to allow time to be represented by the effect it has on signal processing, that is, to make the network dynamic, time delays are introduced into the synaptic structure of the network and their values are adjusted during the learning phase (Haykin 1998). In fact, each synapse is represented by a finite-duration impulse response (FIR) filter (Figure 1). The FIR-MP network is a vector generalization of the MP, and its learning algorithm is a vector generalization of the standard BP algorithm, called temporal BP (TBP).
The basic form of TBP is non-causal because the computation of weights requires knowledge of future values of the weight changes $\delta$ and the weights $w$. It can be made causal by adding a finite number of delay operators on the feedback connections, so that only present and past values of $\delta$ and $w$ are used. We hypothesize that by introducing tapped delay feedbacks the NN performance on problems involving time dependencies could be improved. In (Lin 1996) FIR-MP is compared to the NARX recurrent network by its computational power. NARX is computationally as strong as a fully connected recurrent network and is thus Turing machine equivalent (Siegelmann 1995, Siegelmann & Sontag 1995).

2.3 Recurrent network in real time (RN)

The net (Haykin 1998, Pham 1995) consists of connected input-output layers and a processing layer. RN has the ability to connect the external time-varying input with its previous output by using a delay operator. The learning algorithm used is real time recurrent learning (RTRL) (Haykin 1998), which is a gradient-descent learning algorithm that minimizes the error function by changing the weights of all visible neurons. This architecture is capable of representing an arbitrary non-linear dynamical system and it is Turing equivalent (Alippi 1996, Siegelmann 1995, Siegelmann & Sontag 1995). However, learning simple behavior can be quite difficult using gradient descent¹. RTRL is not guaranteed to follow the negative gradient of the error function. This is a consequence of the feedback connection, and it can be improved by changing the weights slowly. Although RN has difficulty capturing the global behavior (Lin 1996), it is useful for learning short-term dependencies and thus can be used for short-term predictions.

¹ For example, even though it is Turing equivalent, it has been difficult to get it to successfully learn finite-state machines from example strings encoded as sequences.
Figure 1: Signal-flow graph of a synaptic FIR filter

3 Equation Discovery

The problem of equation discovery, as addressed by Lagramge, can be defined as follows. Given are

- a context free grammar $G = (N, T, P, S)$ (see next section) and
- input data $D = (V, v_d, M)$, where
  - $V = \{v_1, v_2, \ldots, v_m\}$ is a set of domain variables,
  - $v_d \in V$ is the dependent variable and
  - $M$ is a set of one or more measurements. Each measurement is a table of measured values of the domain variables at successive time points:

        time    v_1       v_2       ...   v_m
        t_0     v_{1,0}   v_{2,0}   ...   v_{m,0}
        t_1     v_{1,1}   v_{2,1}   ...   v_{m,1}
        t_2     v_{1,2}   v_{2,2}   ...   v_{m,2}
        ...
        t_N     v_{1,N}   v_{2,N}   ...   v_{m,N}

Find an equation for expressing the dependent variable $v_d$ in terms of variables in $V$. This equation is expected to minimize the discrepancy between the measured and calculated values of the dependent variable. The equation can be:

- differential, i.e. of the form $dv_d/dt = \dot{v}_d = E$, or
- ordinary, i.e. of the form $v_d = E$,

where $E$ is an expression that can be derived from the context free grammar $G$.

3.1 Restricting the space of possible equations

The syntax of the expressions on the right hand side of the equation is prescribed with a context free grammar (Hopcroft & Ullman 1979). A context free grammar contains a finite set of variables (also called nonterminals or syntactic categories), each of which represents expressions or phrases in a language (in equation discovery, nonterminals represent sets of expressions that can appear in the equations). The expressions represented by the nonterminals are described in terms of nonterminals and primitive symbols called terminals. The rules relating the nonterminals among themselves and to terminals are called productions. The original motivation for the development of context free grammars was the description of natural languages. For example, a simple grammar for deriving sentences consists of the productions sentence → noun verb, noun → network, noun → equation, and verb → predicts.
Here sentence, noun and verb are nonterminals, while words that actually appear in sentences (i.e. network, equation, predicts) are terminals. The sentences "network predicts" and "equation predicts" can be derived with this grammar.

We denote a context free grammar as a tuple $G = (N, T, P, S)$, where $N$ and $T$ are finite disjoint sets of nonterminals and terminals, respectively. $P$ is a finite set of productions; each production is of the form $A \to \alpha$, where $A$ is a nonterminal and $\alpha$ is a string of symbols from $N \cup T$. We use the notation $A \to \alpha_1 \mid \alpha_2 \mid \ldots \mid \alpha_k$ for a set of productions for the nonterminal $A$: $A \to \alpha_1$, $A \to \alpha_2$, ..., $A \to \alpha_k$. Finally, $S$ is a special nonterminal called the starting symbol.

Grammars used in the equation discovery system Lagramge have several symbols with special meanings. The terminal $const \in T$ is used to denote a constant parameter in an equation that has to be fitted to the input data. The terminals $v_i$ are used to denote variables from the input domain $D$. Finally, the nonterminal $v \in N$ denotes any variable from the input domain. Productions connecting this nonterminal symbol to the terminals $v_i$ are attached to $v$ automatically, i.e., $\forall v_i \in V : v \to v_i \in P$.

The only restriction on the grammar $G$ is that the right sides of the productions in $P$ have to be expressions that are legal in the C programming language. This means that we can use all C built-in operators and functions in the grammar. Additional functions, representing background knowledge about the domain at hand, can be used as long as they are defined in conjunction with the grammar. Note that the derived equations may be non-linear in both the constant parameters and the system variables.

Expressions can be derived by grammar $G$ from the nonterminal symbol $S$ by applying productions from $P$. We start with the string $w$ consisting of $S$ only. At each step, we replace the leftmost nonterminal symbol $A$ in string $w$ with $\alpha$, according to some production $A \to \alpha$ from $P$.
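The leftmost derivation process can be illustrated with a small Python sketch. It is not Lagramge's implementation: the dictionary representation of the grammar and the function name `derive` are assumptions made for illustration, and the bound used here limits the number of derivation steps, which is a simplification of a bound on derivation depth.

```python
from typing import Dict, List

# A grammar maps each nonterminal to its list of productions; every
# production is a list of symbols. Symbols that are not keys are terminals.
Grammar = Dict[str, List[List[str]]]


def derive(grammar: Grammar, start: str, max_steps: int) -> List[str]:
    """Enumerate the terminal strings reachable from `start` by leftmost
    derivations that use at most `max_steps` production applications."""
    results: List[str] = []
    stack = [([start], 0)]                  # (current string w, steps used)
    while stack:
        w, steps = stack.pop()
        # position of the leftmost nonterminal in w, if there is one
        idx = next((i for i, s in enumerate(w) if s in grammar), None)
        if idx is None:                     # only terminals left: derivation is over
            results.append(" ".join(w))
            continue
        if steps == max_steps:              # step bound reached: abandon this branch
            continue
        # push in reverse order so that the first production is explored first
        for alpha in reversed(grammar[w[idx]]):
            stack.append((w[:idx] + alpha + w[idx + 1:], steps + 1))
    return results


sentence_grammar: Grammar = {
    "sentence": [["noun", "verb"]],
    "noun": [["network"], ["equation"]],
    "verb": [["predicts"]],
}
print(derive(sentence_grammar, "sentence", 3))
```

With the sentence grammar from the text, three derivation steps are enough to reach both sentences. The same function can be applied to a grammar over `const` and variable terminals, where the step bound keeps the recursive productions from generating an infinite set of expressions.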
When $w$ consists solely of terminal symbols, the derivation process is over.

3.2 Lagramge - the algorithm

Expressions generated by the context free grammar $G$ contain one or more special terminal symbols $const$. A nonlinear fitting method is applied to determine the values of these parameters. The fitting method minimizes the value of the error function $Error(c)$, i.e. if $c$ is the vector of constant parameters in expression $E$, then the result of the fitting algorithm is a vector of parameter values $c^*$, such that $Error(c^*) = \min_{c} \{Error(c)\}$. The error function $Error$ is a sum of squared errors function, defined in the following manner:

- for a differential equation of the form $dv_d/dt = E$:

$$Error(c) = \sum_{i=0}^{N} \left( v_{d,i} - \left( v_{d,0} + \int_{t_0}^{t_i} E(c, v_1, \ldots, v_m) \, dt \right) \right)^2,$$

  and

- for an ordinary equation of the form $v_d = E$:

$$Error(c) = \sum_{i=0}^{N} \left( v_{d,i} - E(c, v_{1,i}, \ldots, v_{d-1,i}, v_{d+1,i}, \ldots, v_{m,i}) \right)^2,$$

where $N$ is the size of the measurement table and $v_{j,i}$ is the value of the system variable $v_j$ at time $t_i$.

Note that in the case of calculating the error function for differential equations we use the integral of the expression on the right hand side of the equation instead of the derivative of the dependent variable. This is done because the error of algorithms for numerical integration is in general smaller than the error involved in numerical differentiation. We use a simple trapezoid formula for numerical integration, with the same step size as the time step between successive measurements in the measurement table. The downhill simplex and Levenberg-Marquardt algorithms (Press et al. 1986) can be used to minimize the error function.

Furthermore, the value of a heuristic function for the expression is evaluated. It is equal to the sum of squared errors value $SSE$ calculated by the fitting method ($SSE(E) = Error(c^*)$).
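The two error functions can be sketched in Python as follows. This is only an illustration under stated assumptions: the expression $E$ is passed as a hypothetical callable `expr(c, row)` that receives the parameter vector and one row of the measurement table, and the minimization over $c$ (for example by downhill simplex) is deliberately left out.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]                          # values v_1 .. v_m at one time point
Expr = Callable[[Sequence[float], Row], float]  # E(c, row), supplied by the caller


def sse_ordinary(expr: Expr, c: Sequence[float], rows: List[Row], d: int) -> float:
    """Sum of squared errors for an ordinary equation v_d = E.

    `d` is the 0-based column of the dependent variable; `expr` is expected
    not to read that column.
    """
    return sum((row[d] - expr(c, row)) ** 2 for row in rows)


def sse_differential(expr: Expr, c: Sequence[float], times: Sequence[float],
                     rows: List[Row], d: int) -> float:
    """Sum of squared errors for a differential equation dv_d/dt = E.

    The right-hand side is integrated with the trapezoid rule on the
    measurement time grid, and v_{d,0} + integral is compared with v_{d,i}.
    """
    total = 0.0
    integral = 0.0
    previous = expr(c, rows[0])
    for i in range(len(rows)):
        if i > 0:
            current = expr(c, rows[i])
            integral += 0.5 * (previous + current) * (times[i] - times[i - 1])
            previous = current
        total += (rows[i][d] - (rows[0][d] + integral)) ** 2
    return total
```

For instance, for measurements of $v = 1 + 2t$ at $t = 0, 1, 2, 3$ and the candidate equation $dv/dt = const$, the differential error is zero for $const = 2$ and grows as the constant moves away from that value.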
An alternative heuristic function MDL (minimal description length) can be used, which takes into account the length $l$ of expression $E$:

$$MDL(E) = SSE(E) + \frac{l}{10 \cdot l_{max}} \, \sigma_{v_d},$$

where $l_{max}$ is the length of the largest expression generated by the grammar and $\sigma_{v_d}$ is the standard deviation of the dependent variable $v_d$. The length is measured as the number of terminals in the expression. The MDL heuristic function prefers shorter equations.

A context free grammar can in principle derive an infinite number of expressions (equations). Lagramge thus uses a bound on the complexity (depth) of the derivation used to produce the equation (Todorovski & Džeroski 1997). The Lagramge algorithm exhaustively or heuristically searches for the best equation (according to the selected heuristic function) within the allowed complexity (depth) limits.

3.3 Time series prediction with equation discovery

We reformulate the problem of time series prediction into the equation discovery problem in the following way. Given a time series $x(1), x(2), x(3), \ldots$, we choose a constant $p$ and build the matrix $M$ as follows:

    time    v_1     v_2     ...   v_{p+1}
    t_0     x(1)    x(2)    ...   x(p+1)
    t_1     x(2)    x(3)    ...   x(p+2)
    t_2     x(3)    x(4)    ...   x(p+3)
    ...

Now the input domain for the equation discovery problem equivalent to the problem of time series prediction is $D = (\{v_1, v_2, \ldots, v_{p+1}\}, v_{p+1}, M)$. We search for an ordinary equation of the form $v_{p+1} = F(v_1, v_2, \ldots, v_p)$. The obtained equation can be interpreted as a difference equation for predicting the next value of the time series, $x(n) = F(x(n-1), x(n-2), \ldots, x(n-p))$. The form of function $F$ on the right-hand side of the equation is biased with a context free grammar $G$. We used three different context free grammars for restricting the space of possible equations in time series prediction domains.
The first grammar is used to produce linear models:

    E → const | const * v | E + const * v

The second grammar generates quadratic multivariate polynomials:

    E → const | const * F | E + const * F
    F → v | v * v

Finally, the third grammar used in the experiments generates piecewise linear models. The breakpoint is set to 0.5, which is the middle of the interval of normalized values of the time series:

    /* Returns e1 when v is below the breakpoint 0.5, and e2 otherwise. */
    double If(double v, double e1, double e2) {
        return ((v < 0.5) ? e1 : e2);
    }

    IfE → E | If(v, E, E)
    E → const | const * v | E + const * v

4 Experiments

4.1 Data sets descriptions

We applied the techniques described in the previous two sections to two synthetic and three real world data sets:

Lorenz system
The model of the Lorenz attractor is one of the most frequently used examples of a deterministic chaotic system. It is described with the following differential equations:

$$\dot{x} = \sigma (y - x), \qquad \dot{y} = x (R - z) - y, \qquad \dot{z} = x y - b z.$$

The values of the constant parameters were chosen to be $\sigma = 16.0$, $R = 45.92$, $b = 4.0$. For the initial state $x(0) = 0.06735$, $y(0) = 1.8841$, $z(0) = 15.7734$ the system is well-conditioned. The equations were simulated for 2000 time steps of length $h = 0.001$. For the prediction task we use the time series for variable $z$, because it clearly reflects the non-linear dynamics of the system.

Reference voltage
The observed time series are generated by solid-state VREs, based on 7 V zener diodes LTZ1000, of a group DCVRS. They are produced by measuring the absolute voltage values of the VRE, which is controlled by a PC. The PC communicates with the DCVRS via the serial port RS232. The measuring instrument is a digital voltmeter HP3458A. The time series consists of 2000 samples taken at time intervals of 15 minutes during a 500 hour measurement.

Fractional Brownian motions or 1/f noises
Fractional Brownian motion (FBM) is a random function introduced by Mandelbrot and Van Ness.
The most important feature of FBM is that its increments

$$[B_H(t + T) - B_H(t)] = h^{-H} [B_H(t + hT) - B_H(t)]$$

are stationary, statistically self-similar and have a Gaussian distribution with a standard deviation $C_H T^H$, where $C_H$ is a constant. This is usually called the $T^H$ law of FBM. The parameter $H$ is directly related to the fractal (Hausdorff) dimension $D$. For generation of the FBM signals we use the method of spectral synthesis (Nancovska 1997).

Lorenz-like chaos in NH3-FIR lasers
Far infrared lasers have been proposed as examples of a physical realization of the Lorenz model mentioned earlier (Hubner et al. 1992). However, the actual laser systems are more complex than simple coherently coupled three-level systems. The data set was chosen to obtain several of the important quantities pertinent in comparison to the parameters of numerical data sets obtained by the integration of the Lorenz equations. We took into consideration the first 2000 time points of the time series.

Sunspots
The data set is a standard benchmark test for various techniques for time series prediction. It contains the observations of the annual number of sunspots for 280 years.

4.2 Experimental setting

The following methodology was used in the experiments with the time series prediction data sets. Each data set was divided into two parts of equal size (1000 time points per part, except for the Sunspots data set, which has only 280 time points). The first part was used as input for the learning system in the training phase, and the second one was used for testing the performance of the obtained predictive model. Furthermore, in the training phase 80% of the training set was used directly for learning and the rest was used for error evaluation only (the evaluation estimates the performance of the predictor learned so far). The length of the input vector $p$ varied between 1 and 6. The criterion for choosing the best predictor was the root mean square error (RMSE).
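The experimental setting can be summarized in a short Python sketch: building rows of $p$ past values plus a target from a series, splitting the data into learning, evaluation and testing parts, and computing the RMSE of a one-step predictor. The function names are hypothetical, the inputs are ordered from oldest to newest (an arbitrary choice of this sketch), and any learned model, whether a neural network or a discovered equation, is assumed to be wrapped as a callable `predictor`.

```python
import math
from typing import Callable, List, Sequence, Tuple


def lag_rows(series: Sequence[float], p: int) -> List[List[float]]:
    """Rows [x(n-p), ..., x(n-1), x(n)]: p inputs followed by the target."""
    return [list(series[i:i + p + 1]) for i in range(len(series) - p)]


def split_sets(series: Sequence[float], learn_fraction: float = 0.8
               ) -> Tuple[Sequence[float], Sequence[float], Sequence[float]]:
    """First half for training (split into learning and evaluation parts),
    second half for testing."""
    half = len(series) // 2
    train, test = series[:half], series[half:]
    cut = int(len(train) * learn_fraction)
    return train[:cut], train[cut:], test


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between predictions and measured values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(predicted, actual)) / len(actual))


def one_step_rmse(predictor: Callable[[List[float]], float],
                  series: Sequence[float], p: int) -> float:
    """RMSE of a one-step-ahead predictor that sees the last p values."""
    rows = lag_rows(series, p)
    predictions = [predictor(row[:p]) for row in rows]
    targets = [row[p] for row in rows]
    return rmse(predictions, targets)
```

As a usage example, the linear extrapolation `lambda past: 2 * past[-1] - past[-2]` predicts a linearly growing series without error, so `one_step_rmse` returns zero for it with `p = 2`.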
In the experiments with neural networks, the value of the parameter $\eta$ is gradually decreased from 0.95 to 0.003 to avoid local minima of the error surface. When training the recurrent network, $\eta$ is set to the lower value from the beginning, to make the time scale of the weight changes small enough to allow the learning algorithm to follow the negative gradient of the error function.

We used the beam search strategy in the equation discovery system Lagramge, with the beam width set to 50 and both heuristic functions (SSE and MDL), with the downhill simplex method for fitting the constant parameters. The depth complexity parameter was set to 10, and three different context free grammars (from the previous section) were used. The equation that minimizes the RMSE on the test training set was then chosen as the best one.

4.3 Results

The results of the experiments for neural networks and the equation discovery system Lagramge are given in Table 1 and Table 2, respectively. The architecture of the MP neural network is represented with x - y - z, where x, y and z denote the numbers of neurons in the first, second and third layer, respectively. FIR-MP architectures are written in a similar way, together with the numbers p and q of time operators (taps) between the corresponding layers, where p equals the length of the input vector. The architecture of the recurrent neural network is represented with x <-> y - 1, where x denotes the length of the input vector and y the number of feedback connections.

Recurrent neural networks have the best performance for the Reference voltage and FBM data sets. Both data sets represent time series with very fast changing values without a long-term trend. The recurrent neural network has the worst performance for the time series with a trend. In that case the MP and FIR-MP networks better identify the underlying system, as we can see from the results of the experiments for the Lorenz and Sunspots data sets. For the Lorenz-like chaos data set the best performance is, surprisingly, achieved with the MP network².
For all data sets, except the Lorenz-like, the prediction performances of the different types of neural networks are comparable. In the experiments with Lorenz-like chaos, MP is significantly better than the other two types.

² Finding a simple representation for a complex signal might require looking for relationships among input variables. In the case of Lorenz-like chaos, input vectors are representative enough to allow the MP to find the "simple" regression model which is good enough for a local description of the model.

    Data set       Winning NN                       RMSE
                   Type     Architecture    p       Training              Testing
    Lorenz         FIR-MP   [illegible]     5       9.9·10^-[illegible]   [illegible]
    Ref. voltage   Rec.     6 <-> 4 - 1     6       0.0646                0.0749
    FBM            Rec.     5 <-> 4 - 1     5       0.0895                0.0823
    Lorenz-like    MP       6-3-1           6       0.0153                0.0260
    Sunspots       FIR-MP   [illegible]     5       0.09795               -

Table 1: Results of the experiments with neural networks

    Data set       Winning equation                 RMSE
                   Type                 p           Training                Testing
    Lorenz         piecewise linear     6           1.919·10^-[illegible]   2.465·10^-[illegible]
    Ref. voltage   piecewise linear     6           0.06375                 0.07405
    FBM            linear               6           0.08846                 0.0827
    Lorenz-like    quadratic            3           0.03889                 0.05939
    Sunspots       quadratic            4           0.103                   -

Table 2: Results of the experiments with equation discovery

In the experiments with equation discovery for the Lorenz, Reference voltage and FBM data sets, the discovered equations are linear. Although the Lorenz data set is obtained by simulating three non-linear differential equations, the interval enclosed in the data set does not expose the non-linearity of the underlying equations (a small time step was chosen for the stability of the numerical integration). For the Lorenz-like and Sunspots data sets quadratic equations were discovered, which was expected because of their non-linearity. The parameter $p$ (number of previous values used for prediction) is significantly smaller in cases where quadratic equations were discovered. As in the experiments with neural networks, for all data sets, except the Lorenz-like, the prediction performances of the different types of equations are comparable.
In the experiments with Lorenz-like chaos, quadratic equations are significantly better than the other two types.

Both methods manifest comparable performance on three data sets. Equation discovery outperforms neural networks for the Lorenz data set, which was expected because of the determinism of the underlying model. Neural networks have better performance on the Lorenz-like and Sunspots data sets, where quadratic equations were discovered by Lagramge. Figure 2 shows the performance of the obtained predictors for different types of neural networks and equations.

5 Discussion

In the paper, we presented the equation discovery system Lagramge, which uses context free grammars for restricting the hypothesis space of equations. Background knowledge from the domain of use, in the form of function definitions, can be used along with the common arithmetical operators and functions built into the C programming language. The hypothesis space of Lagramge is a set of equations such that the expressions on their right hand sides can be derived from a given context free grammar. In contrast with system identification methods, where the structure of the model has to be provided explicitly by the human expert, Lagramge can use a more sophisticated form of representing the expert's theoretical knowledge about the domain at hand. A context free grammar can be used to specify a whole range of possible equation structures that make sense from the expert's point of view. Therefore, the discovered equations are in a comprehensible form and can give domain experts better or even new insight into the measured data. This also distinguishes Lagramge from other system identification methods like neural networks, which can be used for obtaining black-box models, i.e., models with incomprehensible structure.

On the equation discovery side, the presented work is related to equation discovery systems, such as Bacon (Langley et al.
1987), EF (Zembowitz & Zytkow 1992), E* (Schaffer 1993), LAGRANGE (Džeroski & Todorovski 1993) and GoldHorn (Križman et al. 1995). However, none of them was applied to the task of time series prediction.

Various architectures of neural networks have already been used for system identification and prediction. Some of them are closely related to the architectures used in this paper (Haykin 1998, Lippmann 1987, Narendra 1990, Pham 1995), and others are different, such as the radial basis function (RBF) neural network (Haykin 1998) and the Group-Method-of-Data-Handling (Pham 1995). The NARX recurrent neural network architecture (Alippi 1996, Lin 1996) is suitable for learning long-term dependencies in time series. For the task of short-term prediction, addressed in this paper, learning the local structure is good enough (Narendra 1990). A support vector machine (SVM) for non-linear regression (Haykin 1998), which is an approximate implementation of the method of structural risk minimization, could also be used. Finally, an SVM may be implemented in the form of a polynomial learning machine, an RBF network or an MP.

Figure 2: RMSE of the predictors for different types of neural networks and equations

The outcomes of the experiments confirm the potential applicability of both paradigms for time series prediction. For each domain, the performances of the predictors obtained with both methods are comparable. The best predictors were obtained for the deterministic Lorenz-z time series. The efficiencies of the predictors for the real world domains (Reference voltage and Lorenz-like chaos) are comparable. The predictors with the worst efficiency were obtained for the FBM and Sunspots time series, due to the randomness of the underlying model (FBM) and the small number of measurements available (Sunspots).

The predictive models could be used as a segment of a software controlled voltage reference element (VRE), which consists of three main parts: a measuring, a predictive and a control part. The stability of a reference voltage source could be enhanced by implementing voltage control that includes a correction function (feedback loop). This could be done by correcting the current voltage using a prediction based on the measurements made before. A solid-state voltage reference source is anticipated to achieve a stability better than 1 ppm/1000 h (Nancovska 1997).

The first step of further work will be the comparison of the methods described in the paper with mainstream statistical methods for time series prediction, such as ARIMA or exponential smoothing. It will be of great interest to apply these methods to a variety of data sets from different domains, where some background knowledge from the concrete domain can be used for restricting the equation space. Alternative types of neural network architectures (NARX recurrent network, SVM for non-linear regression (Haykin 1998)) could also be implemented.

References

[1] Alippi, C., and Piuri, V. (1996) Experimental Neural Network for Prediction and Identification, In IEEE Transactions on Instrumentation and Measurement, Vol. 45, No. 2, 1996, pages 670-676.
[2] Džeroski, S., and Todorovski, Lj. (1993) Discovering dynamics. In Proc. Tenth International Conference on Machine Learning, pages 97-103. Morgan Kaufmann, San Mateo, CA.
[3] Gershenfeld, N. A., and Weigend, A. S. (1992) The future of time series: learning and understanding. In Time series prediction: Forecasting the future and understanding the past. Proceedings of the NATO Advanced Research Workshop on Comparative Time Series Analysis held in Santa Fe, New Mexico, May 14-17, 1992, pages 1-70. Addison-Wesley Publishing Company, Reading, MA.
[4] Haykin, S.
(1998) Neural Networks - A Comprehensive Foundation, Second Edition, Macmillan College Publishing Company, Inc.
[5] Hopcroft, J. E., and Ullman, J. D. (1979) Introduction to automata theory, languages, and computation. Addison-Wesley, Reading, MA.
[6] Hübner, U., and Weiss, C. O., and Abraham, N. B., and Dingyuan T. (1992) Lorenz-like chaos in NH3-FIR lasers (Data Set A). In Time series prediction: Forecasting the future and understanding the past. Proceedings of the NATO Advanced Research Workshop on Comparative Time Series Analysis held in Santa Fe, New Mexico, May 14-17, 1992, pages 1-70. Addison-Wesley Publishing Company, Reading, MA.
[7] Hecht-Nielsen, R. (1990) Introduction to back-propagation, In Neurocomputing, HNC, Inc. and University of California, San Diego, Addison-Wesley Publishing Company.
[8] Križman, V., and Džeroski, S., and Kompare, B. (1995) Discovering dynamics from measured data. Electrotechnical Review, 62: 191-198.
[9] Langley, P., and Simon, H., and Bradshaw, G. (1987) Heuristics for empirical discovery. In Bolc, L., editor, Computational Models of Learning. Springer, Berlin.
[10] Lin, T., and Horne, B. G., and Tino, P., and Giles, C. L. (1996) Learning Long-Term Dependencies in NARX Recurrent Neural Networks. In IEEE Transactions on Neural Networks, Vol. 7, No. 6, November 1996, pages 1329-1338.
[11] Lippmann, R. P. (1987) An Introduction to Computing with Neural Nets, In IEEE ASSP Magazine, April 1987, pages 4-22.
[12] Mandelbrot, B., and Van Ness, J. W. (1968) Fractional Brownian Noises and Applications, In SIAM Rev. 10(4), 1968, pages 422-436.
[13] Nančovska, I., and Kranjec, P., and Fefer, D., and Jeglič, D. (1998) Case Study of the Predictive Models Used for Improvement of the Stability of the DC Voltage Reference Source, In IEEE Transactions on Instrumentation and Measurement, Vol. 47, No. 6, 1998, pages 1487-1491.
[14] Narendra, K. S., and Parthasarathy, K. (1990) Identification and Control of Dynamical Systems Using Neural Networks.
In IEEE Transactions on Neural Networks, Vol. 1, No. 1, March 1990, pages 4-27.
[15] Siegelmann, H. T., and Horne, B. G., and Giles, C. L. (1995) Computational capabilities of recurrent NARX neural network, In Tech. Rep. UMIACS-TR-95-12 and CS-TR-3408, Inst. of Adv. Comp. Stud., Univ. of Maryland.
[16] Siegelmann, H. T., and Sontag, E. D. (1995) On the Computational Power of Neural Networks. In Journal of Comp. Systems in Science, Vol. 50, No. 1, pages 132-150, 1995.
[17] Pham, D. T., and Liu, X. (1995) Neural Networks for Identification, Prediction and Control. Springer-Verlag, London, GB.
[18] Press, W. H., and Flannery, B. P., and Teukolsky, S. A., and Vetterling, W. T. (1986) Numerical Recipes. Cambridge University Press, Cambridge, MA.
[19] Schaffer, C. (1993) Bivariate scientific function finding in a sampled, real-data testbed. Machine Learning, 12: 167-183.
[20] Todorovski, Lj., and Džeroski, S. (1997) Declarative bias in equation discovery. In Machine learning. Proceedings of the 14th international conference (ICML '97), pages 376-384. Morgan Kaufmann publishers, San Francisco, CA.
[21] Zembowitz, R., and Zytkow, J. (1992) Discovery of equations: experimental evaluation of convergence. In Proc. Tenth National Conference on Artificial Intelligence. Morgan Kaufmann, San Mateo, CA.

Adaptive On-line ANN Learning Algorithm and Application to Identification of Non-linear Systems

Daohang Sha and Vladimir B. Bajić
Centre for Engineering Research, Technikon Natal, P.O.Box 953, Durban 4000, South Africa
dhsha@hotmail.com, bajic.v@umfolozi.ntech.ac.za
tel/fax: +27-31-2042560, http://nsys.ntech.ac.za

Keywords: Soft computing, neural networks, gradient descent method, real-time algorithms, Input/Output modelling

Edited by: Cene Bavec and Matjaž Gams
Received: October 13, 1999 Revised: November 10, 1999 Accepted: Decembers, 1999

A new on-line adaptive learning rate algorithm for I/O identification based on two ANNs is proposed.
The algorithm is derived from the convergence analysis of the conventional gradient descent method. Simulation experiments are given to illustrate the advantages of the proposed algorithm in its application to an identification problem of some non-linear dynamic systems.

1 Introduction

Feedback linearization can be used for controlling a broad class of non-linear processes. To perform feedback linearization it is necessary that the process allows description by a particular model structure which fits into the affine non-linear form (see [1]-[4]). When the process is unknown we can let two multilayer perceptron (MLP) models approximate the linear relationship between the input and output data of the process. In this paper, we will develop an on-line variable rate training algorithm for this double neural network system.

Determination of the fixed learning rate of the conventional back-propagation (BP) algorithm for feedforward neural networks (FNNs) [5] has to be made with care. If the learning rate is large, learning may occur quickly, but it may also become unstable. To ensure stable learning, the learning rate must be sufficiently small. However, with a small learning rate, an ANN, for example an MLP, may adapt reliably, but it may take quite a long time. It is thus difficult to select a suitable fixed learning rate for different initial values of the parameters of the ANNs and for different structures that ANNs may have. Moreover, a good fixed learning rate for one system is not necessarily good for another. These are characteristics of the basic ANN learning rule that relies on the gradient descent (GD) method and the chain rule [6]. For the convergence of such algorithms see [7] and [8]. The GD method is known for its slowness and its tendency to become trapped in local minima.
To reduce these shortcomings, a number of faster ANN training algorithms have been developed, such as different adaptive learning algorithms (see [9], [10] and [11]), and other modified algorithms (see [12]-[23]). In spite of their improved convergence, these methods are not based on the optimal instantaneous learning rates of the GD approach. One may use second-order nonlinear optimizing methods to accelerate the MLP learning, such as the conjugate gradient algorithm (see [24]) or the Levenberg-Marquardt based method (see [25]). The critical drawbacks of these methods, however, are the ill conditioning of the Hessian matrix in many applications and the computational complexity related to the Hessian. In addition, most of these algorithms are developed only for off-line training of the ANNs.

Recently, layer-by-layer optimizing algorithms were proposed in which each layer of an MLP is decomposed into a linear part and a nonlinear part [26]-[33]. The linear part of each layer is solved via a least squares problem formulation. Although these algorithms show fast convergence with much less computational complexity than the conjugate gradient or Newton methods, they suffer from an unavoidable problem caused by the target assignments at the hidden nodes. When the targets for a hidden layer cannot be linearly separated, it is impossible to reduce the MSE sufficiently at both the hidden layer and the output layer.

Another class of fast learning algorithms is the one based on the extended Kalman filter (EKF) technique for training of a multilayer FNN. It has received considerable attention recently (see [34]-[38]). These algorithms improve the convergence rate considerably and exhibit good performance. However, their numerical stability is not guaranteed. This may degrade learning convergence, increase training time and, generally, can make on-line implementation questionable.
In this paper, we develop an on-line variable rate learning algorithm for a double neural network system, which can speed up the learning process substantially and can simultaneously provide stability of the learning process. In [39], we proposed a variable rate algorithm for the on-line training of an MLP. Here we extend this solution to on-line training of a double neural network system for I/O modelling of SISO time-invariant non-linear systems. As ANNs are widely used for the identification of non-linear systems (see [40]-[61]), as well as for prediction of their behavior (see [62]-[64]), we will test this algorithm in the on-line identification and prediction of behavior of three non-linear systems.

Informatica 23 (1999) 521-529, D. Sha et al.

Figure 1: Double neural network system for identification of non-linear plant (legend: linear node, nonlinear node, on-line training, multiplier)

2 Modeling Non-linear Plants by ANNs

2.1 Problem Description

Consider a SISO time-invariant non-linear system for which we will attempt to obtain an I/O model in the form of

$$y(k+1) = f[\varphi(k)] + g[\varphi(k)]\,u(k). \qquad (1)$$

Here $\varphi(k) = [-y(k)\ \ldots\ -y(k-n+1)\ \ u(k-1)\ \ldots\ u(k-m)]^T$. The integer parameter $n$ may be associated with the system order; $m$ is a non-negative integer. Let the functions $f$ and $g$ of the above model be unknown. We will use two neural networks to model $f$ and $g$, in order to obtain their approximations $\hat f$ and $\hat g$, respectively. The assumption is that these networks are governed by

$$y_{NN}(k+1 \mid k, \theta) = v_f^T S(W_f \varphi_f) + v_g^T S(W_g \varphi_g)\,u(k),$$

where, for each network $k = f, g$,

$$v_k = [\,v_{k,0}\ \ v_{k,1}\ \ \ldots\ \ v_{k,HN_k}\,]^T, \qquad S(W_k \varphi_k) = [\,s(net_{k,0})\ \ s(net_{k,1})\ \ \ldots\ \ s(net_{k,HN_k})\,]^T,$$

$W_k$ is the matrix of hidden-layer weights with entries $w_{k,ij}$, $i = 1, \ldots, HN_k$, $j = 0, \ldots, IN_k$, and $IN_k$ and $HN_k$, $k = f, g$, denote the numbers of nodes of the input layer and of the hidden layer, respectively. Then, starting from (1), one can consider a double three-layer feedforward neural network as an identification model for a non-linear plant (Fig.1), where the network model is governed by the above expression. The activation function for non-linear nodes is a symmetric hyperbolic tangent function, i.e.
$s(x) = \tanh(\mu_0 x)$; its derivative is $s'(x) = \mu_0\,[1 - s^2(x)]$, where $\mu_0$ is the shape factor of the activation function.

3 Derivation of an On-line Learning Algorithm

3.1 GD Method

We define the error function as

$$J(k) = \tfrac{1}{2} e^2(k) = \tfrac{1}{2}\,[y(k) - y_{NN}(k)]^2 = \tfrac{1}{2}\,[y(k) - v_f^T S(W_f \varphi_f) - v_g^T S(W_g \varphi_g)\,u(k)]^2 .$$

Applying the GD method to $J$ and using Lemmas 2 and 3 from [39], one obtains the increments for the parameters of the ANNs as

$$\Delta v_f = -\eta\,\frac{\partial J(k)}{\partial v_f(k)} = \eta\,S(W_f \varphi_f)\,e(k), \qquad (3)$$
$$\Delta W_f = -\eta\,\frac{\partial J(k)}{\partial W_f(k)} = \eta\,S'(W_f \varphi_f)\,v_f\,\varphi_f^T\,e(k), \qquad (4)$$
$$\Delta v_g = -\eta\,\frac{\partial J(k)}{\partial v_g(k)} = \eta\,S(W_g \varphi_g)\,u(k)\,e(k), \qquad (5)$$
$$\Delta W_g = -\eta\,\frac{\partial J(k)}{\partial W_g(k)} = \eta\,S'(W_g \varphi_g)\,v_g\,\varphi_g^T\,u(k)\,e(k). \qquad (6)$$

Our intention is to make $\lim_{k \to \infty} e(k) = 0$ as the number of iterations $k$ increases. For this, the condition $|1 - \eta\,C(k)| < 1$ on the error recursion $e(k+1) \approx [1 - \eta\,C(k)]\,e(k)$ has to be satisfied, i.e. $0 < \eta < 2/C(k)$. Apparently, the upper bound $2/C(k)$ of the learning rate $\eta$ is variable, because the value of $C(k)$ depends on the input $\varphi_i$ and the current values of the neural network parameters $v_i$ and $W_i$, for $i = f, g$.

3.3 Variable Learning Rate Algorithm

To obtain the variable learning rate algorithm we consider the case when the fastest learning occurs. This will be when $\eta = 1/C(k)$, i.e. it will imply $e(k+1) \approx 0$. Substituting $\eta = 1/C(k)$ into (3)-(6), one obtains the adaptive learning rate back-propagation increments of the ANN parameters, i.e.

$$\Delta v_f = \frac{1}{C(k)}\,S(W_f \varphi_f)\,e(k), \qquad \Delta W_f = \frac{1}{C(k)}\,S'(W_f \varphi_f)\,v_f\,\varphi_f^T\,e(k),$$
$$\Delta v_g = \frac{1}{C(k)}\,S(W_g \varphi_g)\,u(k)\,e(k), \qquad \Delta W_g = \frac{1}{C(k)}\,S'(W_g \varphi_g)\,v_g\,\varphi_g^T\,u(k)\,e(k).$$

We use these formulae in the on-line algorithm in the following simulation experiments.

4 Simulation Experiments

Extensive simulation studies were carried out with several examples of nonlinear dynamic systems which were used in [40], [65]. Two typical sets of results are illustrated in the following examples.

Example 1 We will apply the new algorithm to a spring-mass-damper system with a hardening spring governed by

$$y''(t) + y'(t) + y(t) + y^3(t) + F_L = u(t),$$

and we will compare its performance with the fixed learning rate algorithm. Let us use a network with 5 hidden units for approximating $f$, and a network with 4 hidden units for approximating $g$. We will use $\varphi(k) = [-y(k)\ \ -y(k-1)\ \ u(k-1)] = \varphi_f(k) = \varphi_g(k)$.
The input u(t) for both the non-linear system and the neural network is taken as a random band-limited white noise signal. The target for the neural network is the output y of the system. The sampling interval is Ts = 0.1 sec. The total simulation time is 300 sec, i.e. we will have 3000 iteration steps. The neural network is trained during the first 2500 iterations in the on-line mode. After training, at the time instant of 250 sec, the trained neural network is used to predict the output of the non-linear system during the subsequent 500 iterations. This is done only for the purpose of assessing the quality of the training process. The adaptive learning rate algorithm proposed in this paper and the algorithm with the fixed learning rates of η = 0.001, 0.002 and 0.004 are compared. The time evolution of the training error for these four cases is shown in Fig.2. The training data of the neural network with the adaptive learning rate and with the fixed learning rate of η = 0.002 are depicted in Fig.3 and Fig.4, respectively.

As mentioned previously, the fixed learning rate for on-line training must be chosen with some care. Unlike the situation when training is done off-line, in the on-line situation it is not known which input vectors will be presented to the network. Therefore the learning rate should be fairly conservative so as to assure stable learning. However, if it is too small, the learning will take a long time (as for η = 0.001). If the learning rate is large, learning occurs quickly, but if it is too large learning can become unstable and errors may even increase (as for η = 0.004). To reduce these problems the fixed learning rate must be set to a suitable value. But, in general, different initial values of the weights and different processes will require different optimal fixed learning rates. It is thus impossible to find the best one for all of the cases. These problems do not exist with the adaptive learning rate algorithm.
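The contrast between a fixed and a variable rate can be illustrated on a single training sample. The sketch below is not the authors' implementation: it omits bias nodes, the helper names (`forward`, `model_error`, `online_step`, `make_net`) and all numerical values are hypothetical, and, since the definition of C(k) is not legible in this text, it assumes that C(k) is the squared norm of the gradient of the network output with respect to all weights, which is the choice that gives e(k+1) ≈ 0 to first order.

```python
import math

def forward(v, W, phi, mu0=1.0):
    """Output v^T S(W phi) of a one-hidden-layer tanh network, plus hidden activations."""
    s = [math.tanh(mu0 * sum(w * x for w, x in zip(row, phi))) for row in W]
    return sum(vi * si for vi, si in zip(v, s)), s

def model_error(net_f, net_g, phi, u, y, mu0=1.0):
    """e = y - y_NN, with y_NN = f_hat(phi) + g_hat(phi) * u."""
    yf, _ = forward(net_f[0], net_f[1], phi, mu0)
    yg, _ = forward(net_g[0], net_g[1], phi, mu0)
    return y - (yf + yg * u)

def online_step(net_f, net_g, phi, u, y, eta=None, mu0=1.0):
    """One in-place update of both networks; eta=None selects the variable rate 1/C."""
    e = model_error(net_f, net_g, phi, u, y, mu0)
    grads = []
    for (v, W), scale in ((net_f, 1.0), (net_g, u)):
        _, s = forward(v, W, phi, mu0)
        gv = [scale * si for si in s]                          # d y_NN / d v
        gW = [[scale * vi * mu0 * (1.0 - si * si) * x for x in phi]
              for vi, si in zip(v, s)]                         # d y_NN / d W
        grads.append((gv, gW))
    # ASSUMPTION: C(k) is the squared norm of the gradient of y_NN.
    C = sum(g * g for gv, gW in grads for g in gv)
    C += sum(g * g for gv, gW in grads for row in gW for g in row)
    rate = eta if eta is not None else 1.0 / (C + 1e-12)
    for (v, W), (gv, gW) in zip((net_f, net_g), grads):
        for i in range(len(v)):
            v[i] += rate * e * gv[i]
            for j in range(len(phi)):
                W[i][j] += rate * e * gW[i][j]
    return e, C

def make_net(hidden, v0, row):
    """Deterministic toy initialisation (hypothetical values, for illustration only)."""
    return ([v0 * (-1) ** i for i in range(hidden)], [list(row) for _ in range(hidden)])

phi, u = [1.0, -1.0, 1.0], 0.5   # toy regressor [-y(k), -y(k-1), u(k-1)] and input
errors = {}
for label, eta in (("fixed", 0.002), ("variable", None)):
    net_f = make_net(5, 0.5, [0.4, -0.3, 0.2])   # 5 hidden units for f
    net_g = make_net(4, 0.5, [0.4, -0.3, 0.2])   # 4 hidden units for g
    y = 0.01 - model_error(net_f, net_g, phi, u, 0.0)  # target 0.01 above current output
    online_step(net_f, net_g, phi, u, y, eta)
    errors[label] = model_error(net_f, net_g, phi, u, y)
print(errors)
```

With the fixed rate the error on this sample barely changes after one step, whereas the variable-rate step removes almost all of it; in an on-line run the same step would be applied once per incoming sample.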
It can be observed from the simulation results that the adaptive rate algorithm is far better than the fixed rate algorithm in terms of both the learning speed and the training error.

Figure 2: Time evolution of training error (curves for η = 0.001, η = 0.002, η = 0.004 and the variable rate; horizontal axis: epoch, 0.1 sec per epoch)

Figure 3: On-line learning and prediction with adaptive learning rate (panel title: On-line learning and prediction of NNs with variable rate; horizontal axis: samples, Ts = 0.1 sec)

Figure 4: On-line learning and prediction with fixed learning rate η = 0.002 (panel title: On-line learning and prediction of NNs with fixed rate)

Example 2 Two plants considered are governed by

$$y(k+1) = 0.3\,y(k) + 0.6\,y(k-1) + f[u(k)], \qquad (7)$$

where $y(k)$ and $u(k)$ are the output and input, respectively, at time $k$, and the function $f$ is assumed unknown to the ANNs. For the purpose of plant simulation, it is taken in the form

$$f_1(u) = 0.6\sin(\pi u) + 0.3\sin(3\pi u) + 0.1\sin(5\pi u),$$
$$f_2(u) = 0.8\sin(2y(k)) + 1.2\,u(k).$$

The first system (A) is defined by (7) with $f(u) = f_1(u)$, and the other system (B) is defined by (7) with $f(u) = f_2(u)$. The MLPs used for identification and prediction in both cases have the same structure as in Example 1. The input to the plant is a sinusoid defined as $u(k) = \sin(2\pi k/250)$ when $k < 500$, and $u(k) = 0.5\sin(2\pi k/250) + 0.5\sin(2\pi k/25)$ when $k > 500$. In the case of system A, the results of on-line identification with fixed and variable learning rates are shown in Fig.5 and Fig.6, and the time evolution of the training error in Fig.7, respectively.
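For readers who want to reproduce the data that the networks are trained on, system A can be simulated directly from the difference equation of Example 2. This is a minimal sketch: the function names, the zero initial conditions and the treatment of the step k = 500 (assigned here to the two-sinusoid input, since the text specifies only k < 500 and k > 500) are assumptions.

```python
import math

def f1(u):
    """Nonlinearity of system A."""
    return (0.6 * math.sin(math.pi * u)
            + 0.3 * math.sin(3 * math.pi * u)
            + 0.1 * math.sin(5 * math.pi * u))

def plant_input(k):
    """Input sequence u(k); the case k == 500 is an assumption."""
    if k < 500:
        return math.sin(2 * math.pi * k / 250)
    return 0.5 * math.sin(2 * math.pi * k / 250) + 0.5 * math.sin(2 * math.pi * k / 25)

def simulate_system_a(steps=1000):
    """Iterate y(k+1) = 0.3 y(k) + 0.6 y(k-1) + f1(u(k)) from assumed zero initial state."""
    u = [plant_input(k) for k in range(steps)]
    y = [0.0, 0.0]                       # y(0) and y(1), assumed to be zero
    for k in range(1, steps - 1):
        y.append(0.3 * y[k] + 0.6 * y[k - 1] + f1(u[k]))
    return u, y

u_seq, y_seq = simulate_system_a()
print(len(y_seq), max(abs(v) for v in y_seq))
```

Each pair (regressor built from past values of `y_seq` and `u_seq`, next output) then serves as one on-line training sample for the double network.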
Note that we selected by trial and error (experimentally) a fixed learning rate of η = 0.34 to satisfy both the learning speed and learning stability requirements. The variable rate learning algorithm achieves similar or better results directly, without any need for tuning the learning process. However, for system B, if the fixed rate learning algorithm is used and the learning rate η is kept the same, i.e. η = 0.34, then the learning process becomes unstable, as can be observed from Fig.8 and Fig.10. At the same time, the variable learning rate algorithm retains its good behavior (see Fig.9 and Fig.10). These examples serve to illustrate the convenience of the variable learning rate algorithm and its suitability for real-time identification.

Figure 5: On-line learning with fixed rate η = 0.34 for f_1(u) (horizontal axis: samples, Ts = 0.1 sec)

Figure 6: On-line learning with variable rate for f_1(u) (horizontal axis: samples, Ts = 0.1 sec)

5 Conclusion

Based on the analysis of the convergence of the gradient descent method, a new on-line adaptive rate learning algorithm for I/O modelling by a double ANN system is proposed. Compared to a fixed rate learning algorithm, the adaptive rate learning algorithm can achieve a much better performance in terms of the learning speed and the training error.

References

[1] E. B. Kosmatopoulos and P. A. Ioannou, A Switching Adaptive Controller for Feedback Linearizable Systems, IEEE Transactions on Automatic Control, Vol. 44, No. 4, pp. 742-750, 1999.
[2] A. Yesildirek, F. L. Lewis, Feedback linearization using neural network, Automatica, Vol. 31, No. 11, pp. 1659-1664, 1995.
[3] G. A. Rovithakis, M. A. Christodoulou, Neural Adaptive Regulation of Unknown Nonlinear Dynamical Systems, IEEE Transaction on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 27, No. 5, pp. 810-822, 1997.
[4] K. Nam, Stabilization of Feedback Linearizable Systems Using a Radial Basis Function Network, IEEE Transactions on Automatic Control, Vol. 44, No. 5, pp. 1026-1031, 1999.
[5] D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing. Cambridge, MA: MIT Press, 1986.
[6] B. Widrow, M. A. Lehr, 30 Years of Adaptive Neural Networks: Perceptron, Madaline, and Backpropagation, Proceedings of The IEEE, Vol. 78, No. 9, Special Issue on Neural Networks, I: Theory & Modeling, pp. 1415-1442, September 1990.
[7] C.-M. Kuan, K. Hornik, Convergence of Learning Algorithms with Constant Learning Rates, IEEE Transactions on Neural Networks, Vol. 2, No. 5, pp. 484-489, 1991.
[8] Q. Song and J. Xiao, On the Convergence Performance of Multi-layered NN Tracking Controller, Neural & Parallel Computation, Vol. 5, No. 3, 1997.
[9] R. A. Jacobs, Increased rates of convergence through learning rate adaptation, Neural Networks, Vol. 1, pp. 295-307, 1988.
[10] T. P. Vogl, J. K. Mangis, A. K. Rigler, W. T. Zink, and D. L. Alkon, Accelerating the convergence of the back-propagation method, Biol. Cybern., Vol. 59, pp. 257-263, 1988.
[11] D. C. Park, M. A. El-Sharkawi, R. J. Marks II, An Adaptively Trained Neural Network, IEEE Transactions on Neural Networks, Vol. 2, No. 3, pp. 334-345, 1991.
[12] R. Batruni, A Multilayer Neural Network with Piecewise-Linear Structure and Back-Propagation Learning, IEEE Transactions on Neural Networks, Vol. 2, No. 3, pp. 395-403, 1991.
[13] M. Fukumi, S. Omatu, A New Back-Propagation Algorithm with Coupled Neuron, IEEE Transactions on Neural Networks, Vol. 2, No. 5, pp. 535-489, 1991.
[14] S.-H. Oh, Improving the error backpropagation algorithm with a modified error function, IEEE Trans. Neural Networks, Vol. 8, pp. 799-803, 1997.
[15] A. van Ooyen and B. Nienhuis, Improving the convergence of the back-propagation algorithm, Neural Networks, Vol. 5, pp.
465-471, 1992.
[16] P. Saratchandran, Dynamic Programming Approach to Optimal Weight Selection in Multilayer Neural Networks, IEEE Transactions on Neural Networks, Vol. 2, No. 4, pp. 465-467, 1991.
[17] M. A. Sartori, P. J. Antsaklis, A Simple Method to Derive Bounds on the Size and to Train Multilayer Neural Networks, IEEE Transactions on Neural Networks, Vol. 2, No. 4, pp. 467-471, 1991.

Figure 7: Time evolution of training error for f_1(u) (horizontal axis: epoch, 0.1 sec per epoch)

[18] S. Shah, F. Palmieri, M. Datum, Optimal filtering algorithms for fast learning in feedforward neural networks, Neural Networks, Vol. 5, No. 5, pp. 779-787, 1992.
[19] C. M. Bishop, Curvature-Driven Smoothing: A Learning Algorithm for Feedforward Networks, IEEE Transactions on Neural Networks, Vol. 4, No. 5, pp. 882-884, 1993.
[20] A. Back, E. Wan, S. Lawrence, A. C. Tsoi, A Unifying View of Some Training Algorithms for Multilayer Perceptrons with FIR Filter Synapses, Neural Networks for Signal Processing 4, Edited by J. Vlontzos and J. Hwang and E. Wilson, IEEE Press, pp. 146-154, 1995.
[21] Q. Song, Implementation of Two Dimensional Systolic Algorithms for Multilayered Neural Networks, JSA Journal of Systems Architecture, Vol. 44, No. 8, 1998.
[22] E. D. Sontag, A learning result for continuous-time recurrent neural networks, Systems & Control Letters, Vol. 34, No. 3, pp. 151-158, 1998.
[23] S. Cavalieri, O. Mirabella, A novel learning algorithm which improves the partial fault tolerance of multilayer neural networks, Neural Networks, Vol. 12, No. 1, pp. 91-106, 1999.
[24] R. P. Brent, Fast Training Algorithms for Multilayer Neural Nets, IEEE Transactions on Neural Networks, Vol. 2, No. 3, pp. 346-354, 1991.

Figure 8: On-line learning with fixed rate η = 0.34 for f_2(u) (horizontal axis: samples, Ts = 0.1 sec)

[25] M. T. Hagan and M. Menhaj, Training feedforward networks with Marquardt algorithm, IEEE Transactions on Neural Networks, Vol. 5, No. 6, pp. 989-993, 1994.
[26] R. Parisi, E. D.
Di Claudio, G. Orlandi, and B. D. Rao, A generalized learning paradigm exploiting the structure of feedforward neural networks, IEEE Trans. Neural Networks, Vol. 7, pp. 1450-1459, 1996.
[27] G.-J. Wang and C.-C. Chen, A fast multilayer neural networks training algorithm based on the layer-by-layer optimizing procedures, IEEE Trans. Neural Networks, Vol. 7, pp. 768-775, 1996.
[28] F. Biegler-Konig and F. Marmann, A learning algorithm for multilayered neural networks based on linear least squares problems, Neural Networks, Vol. 6, pp. 127-131, 1993.
[29] J. Y. F. Yam and W. S. Chow, Extended least squares based algorithm for training feedforward networks, IEEE Trans. Neural Networks, Vol. 8, pp. 806-810, 1997.
[30] S.-H. Oh, S.-Y. Lee, A new Error Function at Hidden Layers for Fast Training of Multilayer Perceptrons, IEEE Transaction on Neural Networks, Vol. 10, No. 4, pp. 960-964, 1999.

Figure 9: On-line learning with adaptive variable rate for f_2(u) (horizontal axis: samples, Ts = 0.1 sec)

[31] Y. Lee, S.-H. Oh, and M. W. Kim, An analysis of premature saturation in back-propagation learning, Neural Networks, Vol. 6, pp. 719-728, 1993.
[32] S.-H. Oh and Y. Lee, Effect of nonlinear functions on correlation between weighted sums in multilayer perceptrons, IEEE Trans. Neural Networks, Vol. 5, pp. 508-510, 1994.
[33] S. Ergezinger and E. Thomsen, An accelerated learning algorithm for multilayer perceptrons: Optimization layer by layer, IEEE Trans. Neural Networks, Vol. 6, pp. 31-42, 1995.
[34] Y. Zhang, X. R. Li, A Fast U-D Factorization-Based Learning Algorithm with Applications to Nonlinear System Modeling and Identification, IEEE Transaction on Neural Networks, Vol. 10, No. 4, pp. 930-938, 1999.
[35] G. Chen, H. Ogmen, Modified extended Kalman filtering for supervised learning, Int. J. Syst. Sci., Vol. 24, No. 6, pp. 1207-1214, 1993.
[36] Y.
Iiguni, H. Sakai, H. Tokumaru, A real-time learning algorithm for a multilayered neural network based on the extended Kalman filter, IEEE Trans. Signal Processing, Vol. 40, No. 4, pp. 959-966, 1992.
[37] G. Puskorius, L. A. Feldkamp, Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks, IEEE Trans. Neural Networks, Vol. 5, pp. 279-297, 1994.

Figure 10: Time evolution of training error for f_2(u) (horizontal axis: epoch, 0.1 sec per epoch)

[38] S. Shah, F. Palmieri, M. Datum, Optimal filtering algorithms for fast learning in feedforward neural networks, Neural Networks, Vol. 5, pp. 779-787, 1992.
[39] D. Sha, V. B. Bajić, On-line Variable Learning Rate BP Algorithm for Multilayer Feedforward Neural Networks, in Development and practice of artificial intelligence techniques (V. Bajic and D. Sha, Editors), pp. 51-58, IAAMSAD, Durban, South Africa, September 1999.
[40] K. S. Narendra, K. Parthasarathy, Identification and control of dynamical systems using neural networks, IEEE Transactions on Neural Networks, Vol. 1, No. 1, pp. 4-27, 1990.
[41] K. S. Narendra, K. Parthasarathy, Gradient methods for optimization of dynamical systems containing neural networks, IEEE Transactions on Neural Networks, Vol. 2, pp. 252-263, 1991.
[42] S. Bhama, H. Singh, Single Layer Neural Networks for Linear System Identification Using Gradient Descent Technique, IEEE Transactions on Neural Networks, Vol. 4, No. 5, pp. 884-888, 1993.
[43] J. Patra, R. N. Pal, B. N. Chatterji, C. Panda, Identification of Nonlinear Dynamic Systems Using Functional Link Artificial Neural Networks, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 29, No. 2, pp. 254-262, 1999.
[44] S. Chen, S. A. Billings, P. M. Grant, Nonlinear System Identification using Neural Networks, Int. J. Contr., Vol. 51, No. 6, pp. 1191-1214, 1990.
[45] S. Chen, S. A. Billings, P. M. Grant, Recursive hybrid algorithm for nonlinear system identification using radial basis function networks, Int. J.
Contr., Vol. 55, No. 5, pp. 1051-1070, 1992.
[46] S. Chen, S. A. Billings, Neural networks for nonlinear dynamic system modeling and identification, Int. J. Contr., Vol. 56, No. 2, pp. 319-346, 1992.
[47] N. Sadegh, A perceptron based neural network for identification and control of nonlinear systems, IEEE Transactions on Neural Networks, Vol. 4, pp. 982-988, Nov. 1993.
[48] T. Yamada, T. Yabuta, Dynamic system identification using neural networks, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 23, pp. 204-211, Jan./Feb. 1993.
[49] B. Srinivasan, U. R. Prasad, N. J. Rao, Back Propagation Through Adjoints for the identification of Nonlinear Dynamic Systems Using Recurrent Neural Models, IEEE Transactions on Neural Networks, Vol. 5, No. 2, pp. 213-228, March 1994.
[50] S. V. T. Elanayar, Y. C. Shin, Radial Basis function neural network for approximation and estimation of nonlinear stochastic dynamic systems, IEEE Transactions on Neural Networks, Vol. 5, pp. 594-603, July 1994.
[51] A. Parlos, K. T. Cheng, A. F. Atiya, Application of the Recurrent Multilayer Perceptron in Modeling Complex Process Dynamics, IEEE Transactions on Neural Networks, Vol. 5, No. 2, pp. 255-266, March 1994.
[52] P. S. Sastry, G. Santharam, K. P. Unnikrishnan, Memory Neural Networks for Identification and Control of Dynamical Systems, IEEE Transactions on Neural Networks, Vol. 5, No. 2, pp. 306-319, March 1994.
[53] S. Mukhopadhyay, K. S. Narendra, Disturbance rejection in nonlinear systems using neural networks, IEEE Transactions on Neural Networks, Vol. 4, pp. 63-72, 1993.
[54] E. B. Kosmatopoulos, M. M. Polycarpou, M. A. Christodoulou, P. A. Ioannou, High-order neural network structures for identification of dynamical systems, IEEE Transactions on Neural Networks, Vol. 6, pp. 422-431, 1995.
[55] A. U. Levin, K. S. Narendra, Recurrent identification using feedforward neural networks, Int. J. Contr., Vol. 6, No. 3, pp. 533-547, 1995.
[56] A. Alessandri and T.
Parisini, Nonlinear Modeling of Complex Large-Scale Plants Using Neural Networks and Stochastic Approximation, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, Vol. 27, No. 6, pp.750-757,1997. [57] Q. Song, Robust Training Algorithm of Multi-layered Neural Network for Identification of Nonlinear Dynamic Systems, lEE Proceedings-D, Control Theory and Applications,Vol 145, No. 1, 1998. [58] C. L. Philip Chen, J. Z. Wan, À Rapid Learning and Dynamic Stepwise Updating Algorithm for Flat Neural Networks and the Application to Time-Series Prediction, IEEE Transactions on Systems, Man, And Cybernetics, Part B: Cybernetics, Vol. 29, No. 1, pp. 6272, 1999. [59] S. Moon, Ali Keyhani, S. Pillutla, Nonlinear Neural -Network Modeling of an Induction Machine, IEEE Transactions on Control Systems Technology, Vol. 7, No. 2, pp. 203-211, 1999. [60] Y. Z. Tsypkin, J. D. Mason, E. D. Avedyan, K. Warwick, I. K. Levin, Neural Networks for Identification of Nonlinear Systems Under Random Piece-wise Polynomial Disturbances, IEEE Transactions on Neural Networks, Vol. 10, No.2, pp. 303-311, MARCH 1999. [61] M. latrou, T. W. Berger, V. Z. Marmarelis, Modeling of Nonlinear Nonstationary Dynamic Systems with a Novel Class of Artificial Neural Networks, IEEE Transactions on Neural Networks, Vol. 10, No.2, pp. 327-339, MARCH 1999. [62] J. T. Connor, R. D. Martin, L. E. Atlas, Recurrent Neural Networks and Robust Time Series Prediction, IEEE Transactions on Neural Networks, Vol. 5, No.2, pp. 240-254, March 1994. [63] E. S. Chang, S. Chen, B. Mulgrew, Gradient radial basis function networks for nonlinear and nonstationary time series prediction, IEEE Transactions on Neural Networks, VoL 7, pp. 188-194,1996. [64] A. G. Kogiantis, T. Papantoni-Kazakos, Operations and Learning in Neural Networks for Robust Prediction, IEEE Transactions on Systems, Man, And Cybernetics, Part B: Cybernetics, Vol. 27, No. 3, pp. 402-411,1997. [65] J.-S. R. 
Jang, ANFIS: Adaptive-Network-Based Fuzzy Inference System, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 23, No. 3, pp. 665-685, 1993.

A Spanish Interface To LogiMoo: Towards Multilingual Virtual Worlds

Veronica Dahl, Stephen Rochefort and Marius Scurtescu
School of Computing Science, Simon Fraser University
Burnaby, BC, Canada V5A 1S6
{veronica,srochefo,mas}@cs.sfu.ca
AND
Paul Tarau
Department of Computing Science, University of Moncton
Moncton, NB, Canada E1A 3E9
tarau@info.umoncton.ca

Keywords: virtual worlds, Internet applications, natural language processing, Assumption Grammars, LogiMOO

Edited by: Vladimir Fomichov
Received: December 21, 1998 Revised: February 8, 1999 Accepted: March 12, 1999

LogiMOO is a BinProlog-based Virtual World running under Netscape for distributed group-work over the Internet and user-crafted virtual places, virtual objects and agents. LogiMOO is implemented on top of a multi-threaded blackboard-based logic programming system featuring Linda-style coordination. Embedding in Netscape allows advanced VRML and HTML frame-based navigation and multi-media support, while LogiMOO handles virtual presence and acts as a very high-level universal object broker. In this talk we shall briefly describe Assumption Grammars (the logic grammar tool used in our Spanish interface to LogiMOO) and how they can help solve some crucial computational linguistic problems such as anaphora and coordination. We shall then discuss our translator from Spanish sentences into LogiMoo commands. Finally, we shall discuss what is needed to parameterize a single language processor into specific natural languages, with the ultimate objective of transforming LogiMoo into a multilingual virtual world. In it, users from various linguistic backgrounds could communicate using their own language, to be automatically translated into LogiMoo as universal interlingua.

1 Introduction

In a world where the distance from information is constantly shrinking due to the world wide web, that enormous repository of easily accessible knowledge, one crucial obstacle to the true availability of information remains: language.

Ideally, a user should be able to retrieve documents of interest in his/her own native tongue. Automatic translation of documents is not possible, because even when the domain is restricted (say, to legal documents, or to technical jargon), the problem of translating otherwise free language is too complex to be amenable to automatic solution. Automatic machine translation usually requires downsizing both the language coverage and the application to manageable proportions.

In the specific domain of application of LogiMOO (a virtual world for distributed group-work over the Internet and user-crafted virtual places, virtual objects and agents), language coverage is naturally restricted, giving rise to a form of controlled English in which, for instance, sentences are subjectless and in imperative form, since LogiMOO is typically used to invoke commands, requests, etc. Thus it would make an ideal candidate for machine translation. But this very simplicity makes it attractive to explore another avenue which can provide the illusion of an automatic translator while being much simpler: we can abstract from our English parser a simple core grammar which is language independent, and complement it with as many language-dependent modules as languages we want to admit for our front ends. This article describes how this is done for Spanish, how it could easily be done for other languages as well, and discusses possible Spanish-specific extensions to the language coverage.

2 Background

2.1 MUDs and MOOs

MUDs and MOOs (Multi User Domains - Object Oriented) have started with virtual presence and interaction. Their direct descendants, Virtual Worlds, are a strong unifying metaphor for various forms of net-walk, net-chat and Internet-based virtual presence in general. They start where usual HTML shows its limitations: they do have state and require some form of virtual presence. "Being there" is the first step towards full virtualization of concrete ontologies, from entertainment and games to schools and businesses.

Some fairly large-scale projects (Intel's Moondo [Int], Sony's Cyber Passage [Son], Black Sun's CyberGate [Bla], Worlds Inc.'s AlphaWorld [Wor]) all converge towards a common interaction metaphor: an avatar represents each participant in a multi-user virtual world. Information exchange reuses our basic intuitions with almost instant learnability for free.

The sophistication of their interaction metaphor, with VRML landscapes and realistic 'avatars' (i.e., visual representations on screen of the user) moving in shared multi-user virtual spaces, will soon require high-level agent programming tools, once the initial fascination with 'looking' human is translated into automation of complex behavior.

Towards this end, high-level consultation abilities are among the most important additions to various virtual world modeling languages. These should proceed in the user's mother tongue, whose sentences are automatically parsed into a common form which is invisible to all users but is used by LogiMOO to perform its internal operations. Thus sociability can proceed among distant users, each typing in their own language, and receiving feedback in it as well, while internally all proceeds in the invisible common language which triggers the LogiMOO actions requested.

2.2 LogiMOO: a multi-paradigm virtual world

LogiMOO [DBPT96, TDB96, Tar96] is a BinProlog-based Virtual World running under Netscape or Internet Explorer for distributed group-work over the Internet and user-crafted virtual places, virtual objects and agents. LogiMOO is implemented on top of a multi-threaded blackboard-based logic programming system (Multi-BinProlog 5.25) [Tar97] featuring Linda-style coordination.¹
Virtual blackboards [DBT96] allow efficient mirroring of remote sites over TCP/IP links while Solaris 2.x threads ensure high-performance local client-server dynamics. Embedding in Netscape allows advanced VRML or HTML frame-based navigation and multi-media support, while LogiMOO handles virtual presence and acts as a very high-level universal object broker.

The LogiMOO kernel behaves as any other MOO while offering a choice between interactive Prolog syntax and a Controlled Natural Language parser allowing people unfamiliar with Prolog to get along with the basic activities of the MOO: place and object creation, moving from one place to the other, giving/selling virtual objects, talking ('whisper' and 'say').

At login time, a main interactive shell and background notifier (for messages and events targeted to the user) are created. Netscape is used to implement a CGI-based BinProlog remote toplevel interacting with a remote LogiMOO server (Fig. 1). Objects in LogiMOO are represented as hyperlinks (URLs) towards their owners' home pages, where their 'native' representation actually resides in various formats (HTML, VRML, GIF, JPEG etc.).

¹ An important difference between Multi-BinProlog and predecessors like [BC91] is direct use of Linda operations, instead of a guard notation.

2.3 Assumption Grammars

Assumption Grammars are logic grammars augmented with a) linear and intuitionistic implications scoped over the current continuation, and b) hidden multiple accumulators, useful in particular to make the input and output strings invisible.

2.3.1 Linear and intuitionistic implications

Implications are additional information which is only available during the continuation, i.e., the remainder of the current proof. If declared to be intuitionistic (noted assumei), they can be used (noted assumed) an indefinite number of times. In contrast, linear implications (noted assumel) can be used at most once, and then they disappear.
For instance, the Prolog query:

    ?- assumei(p(5)), assumed(p(X)), assumed(p(Y)).

instantiates both X and Y to 5; whereas the query:

    ?- assumel(p(5)), assumed(p(X)), assumed(p(Y)).

fails after instantiating X to 5, since p(5) is no longer available. Both types of implication vanish upon backtracking.

Intuitionistic implications have scoped versions as well: Clause=>Goal and [File]=>Goal make Clause, or respectively the set of clauses found in File, available only during the proof of Goal. Clauses assumed with => are usable an indefinite number of times in the proof, e.g. a(13) => (a(X), a(Y)) will succeed.

The scoped version of linear implication, Clause -: Goal or [File] -: Goal, makes Clause, or respectively the set of clauses found in File, available only during the proof of Goal. They vanish on backtracking and each clause is usable at most once in the proof, i.e. a(13) -: (a(X), a(Y)) will fail. Note however that a(13) -: a(12) -: a(X) will succeed with X=12 and, alternatively, X=13 as answers, while its non-affine counterpart a(13) -o a(12) -o a(X), as implemented in Lolli or Lygon, would fail.

We can see the assumel/1 and assumei/1 builtins as linear affine and, respectively, intuitionistic implication scoped over the current AND-continuation, i.e. having their assumptions available in future computations on the same resolution branch.

[Figure 1: LogiMOO on the Web (screenshot of LogiMOO under Netscape 3.0)]

The following sample Assumption Grammar illustrates the use of linear assumptions to handle relativization. It also makes use of lambda calculus representations. These are attractive if one views the meaning of syntactic categories as mappings from either variables or other properties into new properties. For example, we can consider the category "noun" ("n" for short in the grammar below) as a "meaning device" that takes a variable X and constructs a property from it, such as dilemma(X) or lesson(X).
A determiner ("det" in our grammar), on the other hand, can be viewed as a device that takes two properties (roughly representing the subject and the predicate of the sentence) and constructs a new property tying those two up and rendering the meaning of the specific determiner.

The meanings of constituents are, then, compositionally built up by beta-reduction² from the representations of their subconstituents. Thus, for instance, the rule that produces a sentence's meaning representation S from the meaning representations of its subject noun phrase (represented NP) and its verb phrase (represented VP) can be stated as:

    s(S) :- np(NP), vp(VP), apply(VP,NP,S).

where "apply" implements beta-reduction, and is (strikingly simply) defined in turn as follows ("\" stands for "lambda", used as an operator in infix notation):

    apply(X\P,X,P).

² Beta-reduction is the lambda-calculus operation that applies an expression lambda(X,P), written X\P in our grammar, to an expression Q, obtaining P', which is equal to P except that all occurrences of X have been replaced by Q.

When representing a noun phrase with a relative clause, we can make use of the general definition for sentence given above to parse the relative clause itself. But a relative clause is a sentence with an implicit noun phrase recoverable from its antecedent (e.g. "the flower that Florence planted" exhibits a relative clause, "that Florence planted", whose missing direct object is understood to be the antecedent "the flower"). In constructing its semantics we therefore need to save the antecedent in some place from which we can recover it when we come across a missing noun phrase. We can save it as an assumption, made by a noun phrase rule, that the variable representing the noun phrase's head noun will be needed in some missing noun phrase to be found later in the relative clause:

    np(NP) :- det(D), n(X\P),
        +missing(X),
        relclause(P1),
        apply(D,X\and(P,P1),NP).
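As an aside, the difference between linear and intuitionistic assumptions from Section 2.3.1, and the way a noun phrase can park its antecedent with +missing(X) for later retrieval, can be mimicked outside BinProlog. The following Python sketch is only an analogue under stated assumptions: the class `AssumptionStore`, its method names and the use of `None` to signal failure are hypothetical, and neither unification nor backtracking is modelled.

```python
class AssumptionStore:
    """Toy analogue of assumel/assumei/assumed (hypothetical names, no backtracking)."""

    def __init__(self):
        # Each entry is (kind, name, value); the newest entry is last.
        self._entries = []

    def assumel(self, name, value):
        """Record a linear assumption: it may be used at most once."""
        self._entries.append(("linear", name, value))

    def assumei(self, name, value):
        """Record an intuitionistic assumption: it may be used any number of times."""
        self._entries.append(("intuitionistic", name, value))

    def assumed(self, name):
        """Return the newest value assumed under `name`, or None if there is none.

        A linear assumption is removed by this lookup, so a second lookup fails.
        """
        for index in range(len(self._entries) - 1, -1, -1):
            kind, entry_name, value = self._entries[index]
            if entry_name == name:
                if kind == "linear":
                    del self._entries[index]
                return value
        return None

    def all_consumed(self, name):
        """True when no linear assumption called `name` is left unused."""
        return not any(kind == "linear" and entry_name == name
                       for kind, entry_name, _ in self._entries)


def demo():
    # Intuitionistic: both lookups see p(5).
    reusable = AssumptionStore()
    reusable.assumei("p", 5)
    first = (reusable.assumed("p"), reusable.assumed("p"))

    # Linear: the second lookup finds nothing, mirroring the failing query.
    once = AssumptionStore()
    once.assumel("p", 5)
    second = (once.assumed("p"), once.assumed("p"))

    # Relative clause: the head noun's variable is parked, then picked up by the gap.
    gap = AssumptionStore()
    gap.assumel("missing", "X")            # plays the role of +missing(X)
    pending = not gap.all_consumed("missing")
    antecedent = gap.assumed("missing")    # plays the role of -missing(X)
    return first, second, pending, antecedent, gap.all_consumed("missing")
```

Calling `demo()` exercises the three situations in turn. The final `all_consumed` check corresponds to the test the grammar makes to ensure that no antecedent is left dangling.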
All we have to do now is to recover this assumption when we expect a noun phrase that does not materialize:

    np(P\Q) :- -missing(X), % retrieve antecedent
        apply(P,X,Q).

Its representation X will be used in the call to beta-reduction. The predicates "used" and "all_consumed", which we have not bothered with here but which appear in our complete listing below, serve to ensure, at appropriate points, that no assumptions made are left unconsumed.

Before showing the complete grammar, here are some sample tests. Our grammar translates into an English-like interlingua, to anticipate the LogiMOO grammar we shall present later, which has this same characteristic. But it would be just as easy, of course, to provide a Spanish-based internal representation instead.

2.3.2 Sample Tests

    % Every dilemma costs
    sentence([todo,dilema,cuesta]).
    every(_x2376,dilemma(_x2376) => costs(_x2376))

    % John baffles
    sentence([juan,desconcierta]).
    baffles(john)

    % Every lesson that costs baffles
    sentence([todo,aprendizaje,que,cuesta,desconcierta]).
    every(_x2380,and(lesson(_x2380),costs(_x2380))
        => baffles(_x2380))

    % Every dilemma that John solves costs
    sentence([todo,dilema,que,juan,resuelve,cuesta]).
    every(_x3303,and(man(_x3303),saw(mary,_x3303))
        => paints(_x3303))

    % John solves every dilemma that costs
    sentence([juan,resuelve,todo,dilema,que,cuesta]).
    every(_x2407,and(dilemma(_x2407),costs(_x2407))
        => resolves(john,_x2407))

    % Every dilemma that John solves
    % solves Prolog
    sentence([todo,dilema,que,juan,resuelve,resuelve,prolog]).
    every(_x2384,and(dilemma(_x2384),resolves(john,_x2384))
        => resolves(_x2384,prolog))

    % Every dilemma that presents a dilemma
    % solves a dilemma
    sentence([todo,dilema,que,presenta,un,dilema,resuelve,un,dilema]).
    every(_x2388,and(dilemma(_x2388),
        exists(_x2551,and(dilemma(_x2551),presents(_x2388,_x2551))))
        => exists(_x2629,and(dilemma(_x2629),resolves(_x2388,_x2629))))

    % John solves a dilemma that presents
    % every dilemma
    sentence([juan,resuelve,un,dilema,que,presenta,todo,dilema]).
    exists(_x2411,and(and(dilemma(_x2411),
        every(_x2592,dilemma(_x2592) => presents(_x2411,_x2592))),
        resolves(john,_x2411)))

    % John solves a dilemma that solves
    % every dilemma that presents a dilemma
    sentence([juan,resuelve,un,dilema,que,resuelve,todo,dilema,que,presenta,un,dilema]).
    exists(_x2419,and(and(dilemma(_x2419),
        every(_x2600,and(dilemma(_x2600),
            exists(_x2780,and(dilemma(_x2780),presents(_x2600,_x2780))))
            => resolves(_x2419,_x2600))),
        resolves(john,_x2419)))

2.3.3 The Complete Grammar

    % N.B. Words are noted
    % with # preceding them

    :- op(300,xfy,\).

    apply(X\P,X,P).

    all_consumed :- \+ assumed(missing(_)).

    % Grammar:

    % Lexicon:
    pn(P\Q) :- #juan, apply(P,john,Q).
    pn(P\Q) :- #prolog, apply(P,prolog,Q).

    det(P1\P2\every(X,Q1 => Q2)) :- #todo,
        apply(P1,X,Q1), apply(P2,X,Q2).
    det(P1\P2\exists(X,and(Q1,Q2))) :- #un,
        apply(P1,X,Q1), apply(P2,X,Q2).

    n(X\dilemma(X)) :- #dilema.
    n(X\lesson(X)) :- #aprendizaje.

    vi(P\Q) :- #cuesta, apply(P,X\costs(X),Q).
    vi(P\Q) :- #desconcierta, apply(P,X\baffles(X),Q).

    vt(P1\P2\Q2) :- #presenta,
        apply(P2,X\Q1,Q2),
        apply(P1,Y\presents(X,Y),Q1).
    vt(P1\P2\Q2) :- #resuelve,
        apply(P2,X\Q1,Q2),
        apply(P1,Y\resolves(X,Y),Q1).

    % Syntax:
    s(S) :- np(NP), vp(VP), apply(VP,NP,S), all_consumed.

    np(NP) :- pn(NP).
    np(P\Q) :- -missing(X), % retrieve antecedent
        apply(P,X,Q).
    np(NP) :- det(D), n(X\P), apply(D,X\P,NP).
    np(NP) :- det(D), n(X\P),
        +missing(X),
        relclause(P1),
        used(X),
        apply(D,X\and(P,P1),NP).

    used(X) :- -missing(X), !, fail.
    used(_).

    relclause(Rel) :- #que, s(Rel).

    vp(VP) :- vi(VP).
    vp(VP) :- vt(V), np(NP), apply(V,NP,VP).

    test :- sentence(X), dcg_def(X), s(S), dcg_val([]),
        write(S), nl.
    % dcg_def gives in X the value of
    % the input stream;
    % dcg_val puts the stream's current
    % value in its argument.

This grammar is an extension of a grammar developed by Alain Colmerauer which only handled simple noun phrases (i.e., with no relative clauses). Although in our grammar we only treat relative clauses, our methodology for treating them is also applicable to other long-distance dependency phenomena. In [DTL97] we examine the uses of AGs for three crucial such problems in natural language processing: free word order, anaphora and coordination.

An alternative to programming beta-reduction as in the above grammar would be to use a language such as lambda-Prolog. In our opinion this would not provide too much economy with respect to our already very concise code, and would lose the advantage of portability.

3 The Core LogiMOO Grammar

We first use an English lexicon for exemplifying purposes. Next we explain how to make this core grammar language independent, taking Spanish as an example. The resulting Spanish grammar is shown in the Appendix.

3.1 The Lexicon

3.1.1 Noun Definitions.

The rule:

    proper_name(john-[masc,sing]) :- #john.

defines the word john as a masculine and singular proper name represented by the constant john. Objects are introduced by noun definitions, e.g.,

    noun(car-[neut,sing]) :- #car.

Virtual places are also introduced by nouns, and are always set to a neutral gender and singular form, e.g.

    noun(south-[neut,sing]) :- #south.

3.1.2 Verb Definitions.

Intransitive verbs, which require no other objects or persons, correspond to actions performed by an avatar her/himself and involve no other specific avatar, place, or object, e.g.

    intrans_vb(smile) :- #smile.

Transitive verbs, which require one extra object or person Y, involve one other avatar, object or place in the virtual world. For instance, the rule:

    trans_vb(Y,go(Y)) :- #go.

corresponds to a user actioning her/his avatar to go someplace in the virtual world.
The two arguments identify the object Y, and the translation to the unary predicate go, which can be used to associate some action to this verb.

Bitransitive verbs, requiring two extra objects/persons, specify an action by an avatar that involves any two objects, avatars, or places. As an example,

    bitrans_vb(Y,Z,give(Y,Z)) :- #give.

specifies the action of giving somebody Y some object Z. The third argument is the predicate, give(Y,Z), which can again be used to associate some action to this verb.

Notice that in all verbs, the subject is left implicit. In the application we further describe, it will be clear from the context who should be the subject, given that the LogiMOO kernel recognizes it as the avatar who logged in.

3.1.3 Pronouns, Determiners, and Prepositions.

Pronouns are specified in a similar way as nouns:

    pronoun(_X-[fem,sing]) :- #she.

This specification identifies a pronoun, she, with a feminine gender and singular form. Agreement information is used to resolve pronoun references into the correct object or person.

Determiners and prepositions are specified as

    det :- #the.
    preposition :- #to.

3.2 Syntactic Rules

All sentences are in imperative form, with their subject left implicit. Thus they reduce to verb phrases, which can be of the following forms:

(VP1) An intransitive verb.
(VP2) A transitive verb followed by a noun.
(VP3) A transitive verb followed by a noun phrase.
(VP4) A transitive verb followed by a prepositional phrase.
(VP5) A bitransitive verb followed by two noun phrases.
(VP6) A bitransitive verb followed by a noun phrase and a prepositional phrase.

A prepositional phrase is defined as

(PP1) A preposition followed by a noun phrase.

The noun phrase forms allowed are

(NP1) A proper name.
(NP2) A pronoun (anaphora).
(NP3) A determiner followed by a noun.
(More complex noun phrases will be explained in the next section.)

In addition, we identify communication inputs, which occur when a user wants their avatar to say, whisper or yell some message, e.g.

    say hi how are you.

This form of input is introduced by either:

(F1) The word "whisper" followed by a prepositional phrase followed by a message.
(F2) The word "say" followed by a message.
(F3) The word "yell" followed by a message.

Table 1 shows some sample parses.

4 Adapting the Core Grammar to Spanish

Our technique for splitting the grammar into a language-independent core subset of rules and a language-dependent one is very simple. It comes from the observation that there are two types of rules in the English grammar which are language dependent:

a) rules that create a predicate name which is reminiscent of the noun, verb or adjective from which they spring,
b) rules containing a lexical item (i.e., a symbol preceded by '#').

For rules of type a), we simply translate the predicate name into its Spanish equivalent by means of the BinProlog builtin "means", e.g.

    whisper means susurra

For rules of type b), we replace the lexical item by a non-terminal of the same name, and relegate its final rewriting into a word to the language-specific lexical module which is called for each language. E.g., for "wizard" we would have:

    % wizard is now a non-terminal
    name --> wizard.

    % English lexicon:
    wizard --> #wizard.

    % Spanish lexicon:
    wizard --> #brujo.

Of course, more realistically we will need features such as gender and number in order to produce the right words in each language. For instance, whereas in English we have only one lexical form for the definite article, whether it is singular, plural, feminine or masculine, in Spanish we have four different lexical items covering all these forms.

The same technique used here for Spanish can be used for implementing at least the other Romance languages within our language coverage.
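The split between a language-independent core and a language-specific lexicon can be pictured with a small stand-alone sketch. It is written in Python rather than BinProlog, and everything in it is an illustrative assumption: the function `parse_command`, the verb sets and the two dictionaries are hypothetical and cover only a handful of the words listed in the Appendix. Words missing from a dictionary map to themselves, in the spirit of the trivial dictionary entry X@X.

```python
# Hypothetical per-language dictionaries: surface word -> interlingua word.
SPANISH = {
    "cave": "dig", "vaya": "go", "construya": "craft", "mire": "look",
    "un": "a", "una": "a", "el": "the", "la": "the",
    "a": "to", "al": "to", "vestibulo": "lobby", "brujo": "wizard",
}
ENGLISH = {}  # trivial dictionary: every word stands for itself

# Language-independent core: it only ever sees interlingua words.
GLUE_WORDS = {"a", "an", "the", "to"}      # optional words that are skipped
INTRANSITIVE = {"look"}
TRANSITIVE = {"dig", "go", "craft"}


def to_interlingua(words, lexicon):
    """Map each surface word through the lexicon; unknown words are kept as is."""
    return [lexicon.get(word, word) for word in words]


def parse_command(sentence, lexicon):
    """Parse one imperative sentence into a command tuple, or None if not understood."""
    surface = sentence.lower().rstrip(".!?").split()
    words = to_interlingua(surface, lexicon)
    if not words:
        return None
    verb, rest = words[0], words[1:]
    if verb in INTRANSITIVE and not rest:
        return (verb,)
    if verb in TRANSITIVE:
        # Simplification: glue words are dropped wherever they occur.
        arguments = [word for word in rest if word not in GLUE_WORDS]
        if len(arguments) == 1:
            return (verb, arguments[0])
    return None
```

With these definitions, `parse_command("Cave una cocina.", SPANISH)` and `parse_command("dig a cocina", ENGLISH)` go through exactly the same core rules and differ only in the dictionary that is plugged in. Gender and number agreement, which the text above notes a realistic lexicon would need, is left out of this sketch.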
Thus the core grammar can largely be made language independent with relatively little effort.

5 Conclusion and future work

We have provided another dimension of extensibility to an already extensible English front end to LogiMOO, which was described in [DTL97]. This front end was extensible in the sense that "content" words could be added to the grammar definition through a high-level description of their syntactic type (e.g. verb requiring one complement, etc.) plus the sequence of LogiMOO commands into which they should translate.

In this article we have explored extensibility into different natural languages, not by using the usual machine translation approach, but by abstracting a core set of language-independent rules from our English parser and then adding a language-specific lexicon (English, Spanish or other) to complete the grammar definition. A simple change of one lexicon module into another effects the language change invisibly, so that users across the world can type in their interactions in their own language. These are recorded in a "neutral" but invisible form, from which any retrieval continues to respect the language of the caller.

The syntax covered by our controlled natural language subset should not be expanded much more, because it is precisely owing to the controlled nature of our subsets that we are able to get away with such an easy specialization of our language-independent core grammar into one language or another. However, within a single language and a single application our techniques can be fruitfully used to cover more ambitious natural language subsets.

An extension that would not result in a larger subset of language, but which would increase human-like comprehension, would be that of recovering implicit meanings from various forms in language.
For instance, lexical definitions for the Spanish singular definite article could include an indication of its presupposition of existence and uniqueness, so that if the presupposition fails this could be indicated on the fly.

Table 1: Sample Parses

NL Input                        | Translation                                  | LogiMOO Action
look.                           | look.                                        | Provides the user with a description of the room that their avatar currently occupies.
craft a car.                    | craft(car).                                  | Creates a virtual object, car, owned by the avatar.
craft a car. give it to john.   | craft(car), give(john, car).                 | Creates a virtual object, car, and gives it to john.
take the car that john crafted. | and(take(X), crafted(john, X), is_a(X, car)) | Puts a car object crafted by john into the avatar's possession.
whisper to john                 | whisper(john, 'How are you').                | Sends the message 'How are you' to john.

6 Appendix

6.1 The LogiMOO Grammar

    % NL interface to LogiMOO
    % Authors: Veronica Dahl, Paul Tarau

    :- op(400,xfx,(@)).
    :- op(400,fx,(?)).
    :- op(400,fx,(++)).

    % Dictionary expressed as
    % OtherLanguage@English in file
    % english.pl ==> trivial def: X@X

    @EnglishW :- #OtherW, OtherW@EnglishW.

    look_ahead(W) :- \+ \+ (@ _, #W).
    look_ahead(W1,W2) :- \+ \+ (@ _, #W1, #W2).

    % optional 'glue' words: skip if
    % present, do not mind if not
    {W} :- nonvar(W),!,(@W->true;true).
    {W} :- logimoo_err(should_be_nonvar(W)).

    ?X :- is_assumed(future(X)).

    % this is just heuristics: occasionally
    % we get it wrong
    % anaphora requires more: for instance,
    % feature matching
    % at most 2 uses for an anaphora
    ++X :- assumeal(X), assumel(X).
    % we assume both in asserta and
    % assertz order

    is_avatar(X) :- is_a_fact(user(X,_,_)) ; ?avatar(X).
    is_crafted(A,C) :- is_a_fact(crafted(A,C)) ; ?crafted(A,C).
    is_place(P) :- is_a_fact(place(P)) ; ?place(P).
    is_port(P) :- is_a_fact(port(_,P,_)) ; ?port(P).
    is_in(C,X) :- is_a_fact(contains(C,X)) ; ?contains(C,X).
    does_have(Who,What) :- is_a_fact(has(Who,What)) ; ?has(Who,What).

    % EVALUATES A LIST OF CHARS
    eval_nat(Chars) :- translate_nat(Chars).
    translate_nat([S,A,Y|Cs]) :-
        member([S,A,Y],["say","Say"]),!,
        that_mes(Cs).
    translate_nat(Chars) :-
        % ignore case
        !,toLowerChars(Chars,Cs),
        % split in words
        chars_words(Cs,Ws),
        patch_words(Ws,Words),
        % generate and execute commands
        translate(Words).

    % split in sentences, then parse each
    translate(Words) :-
        !,writeln(['WORDS:',Words]),
        split_nat(Words,Sentences),!,
        writeln(['SENTENCES:'|Sentences]),!,nl,
        writeln(['==BEGIN CMD RESULTS==']),
        ( text(Sentences)->true ; true ),!,nl,
        writeln(['==END CMD RESULTS==']),
        nl.

    % parse each sentence
    text([]).
    text([S|Ss]) :- parse_sent(S,C), evaluate(C), text(Ss).

    evaluate(T) :- metacall(T),!,
        quietmes(2,'SUCCEEDING'(T)).
    evaluate(T) :- logimoo_err(
        unexpected_evaluation_failure_in(T)).

    % parse a sentence
    parse_sent(S,Cs) :-
        dcg_def(S), % open dcg stream
        % Cs contains a list of commands
        sent(Cs),
        dcg_val([]). % close dcg stream
    % send errors to 'wizard' over the net
    parse_sent(S,_) :-
        logimoo_err(unable_understand_sentence(S)).

    % look for a verb phrase
    sent(R) :- vp(R).

    do_relative(crafted,Who,What) :- is_crafted(Who,What).
    do_relative(has,Who,What) :- does_have(Who,What).
    do_relative(have,Who,What) :- does_have(Who,What).
    % to add: others,
    % based on: is_avatar,is_place,is_in

    % generates a command to craft an object
    def_crafted(X) :- do_craft(X), whoami(I),
        ++future(crafted(I,X)).

    % recognizes a place using
    % domain knowledge
    place(X) :- @X, is_place(X).

    % generates a command
    % to build a new place
    def_place(X) :- @X, ++future(place(X)).

    % recognizes a port
    direction(X) :- @X, is_port(X),
        ++future(direction(X)).

    % handles him,her,it
    anaphora(avatar,X) :- @P, get_avatar(P,X).
    anaphora(crafted,X) :- @it, -future(crafted(_,X)).

    % recognizes an avatar,
    % using world knowledge
    avatar(X) :- art, @X, is_avatar(X), !,
        ++future(avatar(X)).
    avatar(X) :- anaphora(avatar,X).

    get_avatar(i,X) :- !, whoami(X).
    get_avatar(P,X) :- member(P,[him,her]),!,
        -future(avatar(X)).
    % recognizes objects using
    % world knowledge
    crafted(What) :- @the, @What,
        is_crafted(Who,What),
        relative(What),!,
        ++future(crafted(Who,What)).
    crafted(What) :- anaphora(crafted,What).

    % handles relatives,
    % based on world knowledge
    % uses relative sentences as filters
    relative(What) :- @that,!,avatar(Who),
        @Verb, do_relative(Verb,Who,What).
    relative(What) :- nonvar(What).

    % VERBS
    vp(go(Where)) :- @go,!,do_go(Where).
    vp(come(Who)) :- @come,!,avatar(Who).
    vp(craft(What)) :- @craft,!,art,def_crafted(What).
    vp(dig(Place)) :- @dig,!,art,def_place(Place).
    vp(open_port(Port,Place)) :- @open,!,
        art,@port,@Port,{to},art,
        def_place(Place).
    vp(close_port(Port)) :-
        @close,!,{the},@port,{to},art,@Port.
    vp(take(What)) :- @take,!,art,crafted(What).
    vp(drop(What)) :- @drop,!,art,crafted(What).
    vp(show(What)) :- @show,!,art,crafted(What).
    vp(give(Whom,What)) :- @give,!,obp(Whom,What).
    vp(iam(Who)) :- @i,@am,!,art,@Who,
        ++future(avatar(Who)).
    vp(Cmd) :- @who,!,do_who(Cmd).
    vp(what(Verb,Object)) :- @what,!,do_what(Verb,Object).
    vp(Cmd) :- @where,!,do_where(Cmd).
    vp(please(Who,What)) :- @please,!,avatar(Who),vp(What).
    vp(X) :- @X, nonvar(X),
        member(X,[look,list,listo,users,online,
            whoami,whereami,test,ttest,
            listing,save,help,lobby,vanish,
            messages,sstop,sstart]).

    art :- @a ; @an ; @the ; true.

    obp(Whom,What) :- @to,avatar(Whom),crafted(What).
    obp(Whom,What) :- crafted(What),@to,avatar(Whom).

    do_where(whereami) :- @am,@i ; @i,@am.
    do_where(where(X)) :- @(is),art,
        (avatar(X) ; crafted(X)).

    do_craft(X) :- @Pref, {-}, @Suf,!,
        namecat(Pref,'.',Suf,X).
    do_craft(X) :- @X.

    do_go(Where) :- {to},{the},
        (place(Where) ; direction(Where)).
    do_go(Where) :- @there, -future(place(Where)).

    do_who(whoami) :- @am,@i,!.
    do_who(whoami) :- @i,@am,!.
    do_who(who(Verb,Object)) :- @Verb,crafted(Object).

    do_what(Verb,X) :- @Verb,avatar(X).

    % LOW LEVEL TOOLS

    % can be used as message sender
    logimoo_err(Mes) :- errmes('LogiMOO error',Mes).

    patch_words(Ws,Words) :-
        append(_,[Last],Ws),
        is_dot(Last),!,Words=Ws.
    patch_words(Ws,Words) :-
        is_dot(X),!,append(Ws,[X],Words).

    is_dot(X) :- member(X,['.',',','?','!']),!.

    dot :- #X, is_dot(X).

    nl_word(_) :- dot,!,fail.
    nl_word(X) :- #X.

    split_nat(Ws,Ss) :-
        dcg_def(Ws), plus(a_sent,Ss), dcg_val([]),!.

    a_sent(S) :- plus(nl_word,S), dot.

    toLowerChar(X,Y) :-
        [A]="A", [LA]="a", [Z]="Z",
        X >= A, X =< Z, !,
        Y is LA+X-A.
    toLowerChar(X,X).

    toLowerChars(Cs,Ls) :- map(toLowerChar,Cs,Ls).

    toLower(X,LX) :- term_chars(X,Cs),
        toLowerChars(Cs,Ls),
        term_chars(Ls,LX).

    test :- test_data(Cs),
        write_chars("TEST: "),
        write_chars(Cs),nl,
        eval_nat(Cs),nl,
        fail
        ; nl.

6.2 The Spanish Grammar

    construya@craft. puerta@port.
    dese@give. dele@give.
    esta@is. estoy@am. soy@am.
    tiene@has. cave@dig. diga@dig.
    abra@open. vaya@go. venga@come.
    mire@look. construi@crafted.
    construya@craft.
    un@a. una@a.
    perro@dog. perra@dog.
    gato@cat. gata@cat.
    brujo@wizard. bruja@wizard.
    yo@i.
    dormitorio@bedroom. vestibulo@lobby.
    habitacion@room.
    lo@it. el@the. del@the. la@the.
    a@to. al@to. de@to.
    donde@where. quien@who. que@that.
    alsur@south. alnorte@north.
    alli@there. porfavor@please.
    X@X.

6.3 Sample Tests

    test_data("Yo soy Paul.").
    test_data("Cave una habitacion_huespedes. Vaya alli. Cave una cocina.").
    test_data("Vaya al vestibulo. Mire.").
    test_data("Yo soy el brujo. Donde estoy yo?").
    test_data("Cave el dormitorio. Vaya alli. Cave una cocina, abra una puerta alsur de la cocina, vaya alli, abra una puerta alnorte del dormitorio. Vaya alli. Construya un cuadro. Dese lo al brujo. Mire.").
    test_data("Yo soy Diana. Construya un automovil. Donde está el automovil?").
    test_data("Construya un Gnu. Quien tiene lo? Donde está el Gnu? Donde estoy yo?").
    test_data("Dele al brujo el Gnu que yo construi. Quien tiene lo?").

    /*
    TRACE:
    ==BEGIN COMMAND RESULTS==
    TEST: Yo soy Paul.
    WORDS: [yo,soy,paul,.]
SENTENCES: [yo,soy,paul]
==BEGIN COMMAND RESULTS==
login as: paul with password: none
your home is at http://199.60.3.56/~veronica
SUCCEEDING(iam(paul))
==END COMMAND RESULTS==
TEST: Cave una habitacion_huespedes. Vaya alli. Cave una cocina.
WORDS: [cave,una,habitacion_huespedes,.,vaya,alli,.,cave,una,cocina,.]
SENTENCES: [cave,una,habitacion_huespedes] [vaya,alli] [cave,una,cocina]
==BEGIN COMMAND RESULTS==
SUCCEEDING(dig(habitacion_huespedes))
you are in the habitacion_huespedes
SUCCEEDING(go(habitacion_huespedes))
SUCCEEDING(dig(cocina))
==END COMMAND RESULTS==
TEST: Vaya al vestibulo. Mire.
WORDS: [vaya,al,vestibulo,.,mire,.]
SENTENCES: [vaya,al,vestibulo] [mire]
==BEGIN COMMAND RESULTS==
you are in the lobby
SUCCEEDING(go(lobby))
user(veronica,none,'http://...').
user(paul,none,'http://...').
login(paul).
online(veronica).
online(paul).
place(lobby).
place(habitacion_huespedes).
place(cocina).
contains(lobby,veronica).
contains(lobby,paul).
SUCCEEDING(look)
==END COMMAND RESULTS==
TEST: Yo soy el brujo. Donde estoy yo?
WORDS: [yo,soy,el,brujo,.,donde,estoy,yo,?]
SENTENCES: [yo,soy,el,brujo] [donde,estoy,yo]
==BEGIN COMMAND RESULTS==
login as: wizard with password: none
your home is at http://199.60.3.56/~veronica
SUCCEEDING(iam(wizard))
you are in the lobby
SUCCEEDING(whereami)
==END COMMAND RESULTS==
TEST: Cave el dormitorio. Vaya alli. Cave una cocina, abra una puerta alsur de la cocina, vaya alli, abra una puerta alnorte del dormitorio. Vaya alli. Construya un cuadro. Dese lo al brujo. Mire.
WORDS: [cave,el,dormitorio,.,vaya,alli,.,cave,una,cocina,(,),abra,una,puerta,alsur,de,la,cocina,(,),vaya,alli,(,),abra,una,puerta,alnorte,del,dormitorio,.,vaya,alli,.,construya,un,cuadro,.,dese,lo,al,brujo,.,mire,.]
SENTENCES: [cave,el,dormitorio] [vaya,alli] [cave,una,cocina] [abra,una,puerta,alsur,de,la,cocina] [vaya,alli] [abra,una,puerta,alnorte,del,dormitorio] [vaya,alli] [construya,un,cuadro] [dese,lo,al,brujo] [mire]
==BEGIN COMMAND RESULTS==
SUCCEEDING(dig(bedroom))
you are in the bedroom
SUCCEEDING(go(bedroom))
SUCCEEDING(dig(cocina))
SUCCEEDING(open_port(south,cocina))
you are in the cocina
SUCCEEDING(go(cocina))
SUCCEEDING(open_port(north,bedroom))
you are in the bedroom
SUCCEEDING(go(bedroom))
SUCCEEDING(craft(cuadro))
logimoo:# 'wizard:I give you cuadro'
SUCCEEDING(give(wizard,cuadro))
user(veronica,none,'http://...').
user(paul,none,'http://...').
user(wizard,none,'http://...').
login(wizard).
online(veronica).
online(paul).
online(wizard).
place(lobby).
place(habitacion_huespedes).
place(cocina).
place(bedroom).
contains(lobby,veronica).
contains(lobby,paul).
contains(bedroom,wizard).
contains(bedroom,cuadro).
port(bedroom,south,cocina).
port(cocina,north,bedroom).
has(wizard,cuadro).
crafted(wizard,cuadro).
SUCCEEDING(look)
==END COMMAND RESULTS==
TEST: Yo soy Diana. Construya un automovil. Donde està el automovil?
WORDS: [yo,soy,diana,.,construya,un,automovil,.,donde,està,el,automovil,?]
SENTENCES: [yo,soy,diana] [construya,un,automovil] [donde,està,el,automovil]
==BEGIN COMMAND RESULTS==
login as: diana with password: none
your home is at http://199.60.3.56/~veronica
SUCCEEDING(iam(diana))
SUCCEEDING(craft(automovil))
automovil is in lobby
SUCCEEDING(where(automovil))
==END COMMAND RESULTS==
TEST: Construya un Gnu. Quien tiene lo? Donde està el Gnu? Donde estoy yo?
WORDS: [construya,un,gnu,.,quien,tiene,lo,?,donde,està,el,gnu,?,donde,estoy,yo,?]
SENTENCES: [construya,un,gnu] [quien,tiene,lo] [donde,està,el,gnu] [donde,estoy,yo]
==BEGIN COMMAND RESULTS==
SUCCEEDING(craft(gnu))
diana has gnu
SUCCEEDING(who(has,gnu))
gnu is in lobby
SUCCEEDING(where(gnu))
you are in the lobby
SUCCEEDING(whereami)
==END COMMAND RESULTS==
TEST: Dele al brujo el Gnu que yo construi. Quien tiene lo?
WORDS: [dele,al,brujo,el,gnu,que,yo,construi,.,quien,tiene,lo,?]
SENTENCES: [dele,al,brujo,el,gnu,que,yo,construi] [quien,tiene,lo]
==BEGIN COMMAND RESULTS==
logimoo:# 'wizard:I give you gnu'
SUCCEEDING(give(wizard,gnu))
wizard has gnu
SUCCEEDING(who(has,gnu))
==END COMMAND RESULTS==
SUCCEEDING(test)
==END COMMAND RESULTS==
*/

Acknowledgement

We are grateful for support from NSERC (grants OGP0107411 and 611024) and from the FESR of the Université de Moncton. Special thanks go to Daniel Perron for the long discussions that helped shape the initial idea of LogiMOO, and to Koen De Bosschere for the implementation of the Multi-BinProlog Linda engine.

References

[BC91] A. Brogi and P. Ciancarini. The Concurrent Language, Shared Prolog. TOPLAS, 13(1):99-123, 1991.
[Bla] BlackSun. CyberGate. http://www.blacksun.com/.
[DBPT96] Koen De Bosschere, Daniel Perron, and Paul Tarau. LogiMOO: Prolog Technology for Virtual Worlds. In Proceedings of PAP'96, pages 51-64, London, April 1996.
[DBT96] K. De Bosschere and P. Tarau. Blackboard-based Extensions in Prolog. Software: Practice and Experience, 26(1):49-69, January 1996.
[DTL97] Veronica Dahl, Paul Tarau, and Renwei Li. Assumption Grammars for Processing Natural Language. In Lee Naish, editor, Proceedings of the Fourteenth International Conference on Logic Programming, pages 256-270, MIT Press, 1997.
[DTRS98] Veronica Dahl, Paul Tarau, Stephen Rochefort, and Marius Scurtescu. Assumption Grammars for Knowledge Based Systems. Informatica: An International Journal of Computing and Informatics, 22(4), 1998. Special Issue on NLP and Agent Communication.
[Int] Intel. Moondo.
http://www.intel.com/iaweb/moondo/index.htm.
[Son] Sony. Cyber Passage. http://vs.sony.co.jp/VS-E/vstop.html.
[Tar97] Paul Tarau. BinProlog 5.75 User Guide. Technical Report 97-1, Département d'Informatique, Université de Moncton, April 1997. Available from http://clement.info.umoncton.ca/BinProlog.
[TDB96] Paul Tarau and Koen De Bosschere. Virtual World Brokerage with BinProlog and Netscape. In Paul Tarau, Andrew Davison, Koen De Bosschere, and Manuel Hermenegildo, editors, Proceedings of the 1st Workshop on Logic Programming Tools for INTERNET Applications, JICSLP'96, Bonn, September 1996. http://clement.info.umoncton.ca/lpnet.
[Wor] Worlds. AlphaWorld. http://www.worlds.net/products/alphaworld.
[Tar96] Paul Tarau. Logic Programming and Virtual Worlds. In Proceedings of INAP96, Tokyo, November 1996. Keynote Address.

Efficient Computation Of Frequent Itemsets In A Subcollection Of Multiple Set Families

Hong Shen
School of Computing and Information Technology, Griffith University, Nathan, QLD 4111, Australia
AND
Weifa Liang
Department of Computer Science, Australian National University, Canberra, ACT 2600, Australia
AND
Joseph Ng
Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong

Keywords: Algorithm, data mining, frequent itemset.

Edited by: Rudi Murn
Received: March 26, 1998 Revised: September 16, 1998 Accepted: February 26, 1999

Many applications need to deal with the additive and multiplicative subcollections over a group of set families (databases). This paper presents two efficient algorithms for computing the frequent itemsets in these two types of subcollections respectively. Let T be a given subcollection of set families of total size m whose elements are drawn from a domain of size n. We show that if T is an additive subcollection we can compute all frequent itemsets in T in $O(m2^n/(pn) + \log p)$ time on an EREW PRAM with 1 Ti ® • • • ® Tk-i the group covers all items in the whole database.
On the = ■ • • >1 f € T,, 0 < i < fe - 1, other hand, a multi-category grade maintenance system in R{t° ,... ^)holds}. (2) In Tg, each element is a fc-tuple (sets) satisfying relation R. For any element U - ... in T®, we define t C ti iff t C t^ for all O < j < A; - 1. In this paper, we consider an interesting problem of computing the frequent itemsets across a well-defined subcol-lection of additive and multiplicative types on a group of set families. We show how to compute these frequent itemsets efficiently by applying the relevant bit-vector operations. We organize the paper as follows. As the main technical body of the paper the next two sections present algorithms for computing the frequent itemsets in additive and multiplicative subcollections respectively. We conclude the paper in Section 4 with some open problems for future research. 2 All As shown in [10], efficient parallel solution to the problem of finding all frequent itemsets in Ti requires to first compute the frequencies of all itemsets and then "filter" them according to the values of their frequencies. We now show how to apply this algorithm to the problem we are addressing and to produce efficient solution to our problem. Let Te = Tq || Ti || ... || Tfc_i, where "||" is the operator of simple set concatenation. Assume that all elements of Tj are dravm from (itemset) domain I, i.e. I = UiT,, and let U contain all subsets generatable on I. Clearly U covers all possible subsets in any Ti and \U\ = 2" — 1. In our approach, we spend an extra space A of \U\\T\/d words, where d is the machine word-length, to record all frequent itemsets' occurrences in T. A is organized as an |;7| X |T| bit-array, where A[i][j] = 1 if the ith itemset of U occurs in the jth element (set) of T, and A[i] [j\ = 0 otherwise. Computing A can be viewed as a precomputation which is invoked only once. We also assume that T© = To ® Ti © ... 
, $\oplus\, T_{k-1}$ is given as input in the form of a bit-vector $V$ of $|T|$ bits, where $V[i] = 1$ if $T[i] \in T_\oplus$ and $V[i] = 0$ otherwise. In most practical applications, $T_\oplus$ can be computed at little cost, because each $T_i$ is usually stored on disk under some classification scheme. The basic idea of our approach is to use these bit-vectors to reduce the computation.

Algorithm AddFrequentSets(A, V, L)
{* Input A and V; output L containing all frequent itemsets in $T_0 \oplus T_1 \oplus \cdots \oplus T_{k-1}$. *}
for $i = 0$ to $|U| - 1$ do
  1. $A'[i][0..|T|-1] := A[i][0..|T|-1] \wedge V[0..|T|-1]$;
  2. Compute the number of 1's in $A'[i][0..|T|-1]$ and store it in $C[i]$;
  3. if $C[i]/|T_\oplus| > \delta$ then $L := L \cup \{I_i\}$. {* L is initialized to $\emptyset$. *}

We now analyze the time complexity of the algorithm. We assume a machine with word length $d$ that can perform basic arithmetic and logic operations, and can also extract the number of 1's in a single word, in one step (constant time). The latter assumption is reasonable because extracting the number of 1's in a word should not be more difficult than performing an arithmetic/logic operation (e.g. multiplication). Step 1 of the algorithm requires $O(|U||T|/d)$ time, since the logic AND operator "$\wedge$" is carried out word by word over all $|U||T|/d$ words in $A$. Step 2 also requires $O(|U||T|/d)$ time, for extracting the number of 1's word by word and adding up the counts of each row. Step 3 takes only constant time. Therefore the whole algorithm requires $O(|U||T|/d)$ time.

We say that a computation model is conservative if it has a word of $\Theta(\log N)$ bits for processing data of magnitude $N$. Clearly this word length is the minimum for processing such data, as it is required to store each datum in a single word and process it in a single step. Since $|U| = 2^n - 1$ and $|T| = m$, in our case a conservative machine should have $\max\{\log |U|, \log |T|\} \ge n$ bits. Throughout this paper, unless otherwise stated, all the computation models are conservative.
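The steps of AddFrequentSets can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: Python integers stand in for the bit-rows of A and for the bit-vector V, and the names `build_occurrence_rows`, `add_frequent_sets`, and the sample data are hypothetical. Because the table has one row per non-empty itemset (2^n - 1 rows), the sketch is only practical for small domains.

```python
from itertools import combinations


def build_occurrence_rows(transactions, domain):
    """Precompute A: one bit-row per non-empty itemset of the domain.

    Bit j of a row is 1 when the itemset is contained in transactions[j].
    """
    rows = {}
    items = sorted(domain)
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            wanted = set(itemset)
            row = 0
            for j, transaction in enumerate(transactions):
                if wanted <= transaction:
                    row |= 1 << j
            rows[itemset] = row
    return rows


def add_frequent_sets(rows, v, delta):
    """Return itemsets whose frequency inside the subcollection exceeds delta.

    v is the bit-vector V: bit j is 1 when transaction j is in the subcollection.
    """
    selected = bin(v).count("1")           # size of the subcollection
    frequent = []
    for itemset, row in rows.items():
        count = bin(row & v).count("1")    # Step 1 (AND) and Step 2 (count 1's)
        if selected and count / selected > delta:  # Step 3 (threshold test)
            frequent.append(itemset)
    return frequent


# Hypothetical data: four transactions over the domain {a, b, c}.
transactions = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b", "c"}]
rows = build_occurrence_rows(transactions, {"a", "b", "c"})
print(add_frequent_sets(rows, 0b1011, 0.6))  # subcollection = transactions 0, 1, 3
```

The precomputation is done once; each later query only needs a new V, which mirrors the paper's point that A is computed a single time and reused.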
Thus we have

Lemma 1 For a group T of set families of total size m drawn from a domain of size n, by spending an extra $m2^n/n$ space to maintain all subset occurrences in T, we can compute all frequent itemsets in a given additive subcollection of T in $O(m2^n/n)$ sequential time.

If we are given an EREW PRAM with $1 < p < m2^n/n$ processors, it is easy to see that Steps 1 and 3 in the algorithm can be completed in $m2^n/(pn)$ time, while Step 2 requires $O(m2^n/(pn) + \log p)$ time by parallel summation. This results in the following theorem:

Theorem 1 Given an EREW PRAM with 1

Table 2: Trajectory of error and its derivative drawn on the FLC's truth table

loop system is optimized, it is not possible to select directly the appropriate signals e(t) and de(t)/dt. Instead one has to select the reference signal appropriately. In Table 2, a typical trajectory of the error and its derivative caused by a simple reference step change shows that with such an input many consequent parameters cannot be appropriately optimized. To overcome this problem, several types of reference signals were tested:
- a signal consisting of sinusoids with different amplitudes and frequencies,
- a signal consisting of steps with different amplitudes and delays,
- a square wave signal with increasing amplitude.
After substantial testing, the square wave signal with increasing amplitude seemed to be the best solution. It is shown in Figure 4, together with the control error for a typical example, while the corresponding trajectory is shown in Figure 5.

Figure 3: Optimization of fuzzy controller

FLC design is much more complex than linear controller design, because its output characteristic is non-linear. Output values for all possible input values have to be defined. The range of input values is divided into smaller intervals. The number of these intervals depends on the number of membership functions.
Table 2 shows the truth table for control error e and its derivative de/dt. The fuzzyfied values are NB (negative big), NM (negative medium), ZE (zero), PM (positive medium) and PB (positive big). Eq. 5 shows that control signal u is influenced only by those consequent parameters Cij which have appropriate non zero membership grades rij. In other words, only the rules with membership functions, which are defined on the domains of the current input variables are active. So it is obvious that the optimization of controller parameters Cij is efficient only when the control error and its derivative cover the whole area defined by both variables during transient responses (simulation runs). Unfortunately as close 6 4 2 r 0 -2 -4 U 40 60 t Figure 4: Reference signal (dashed line) and control error As the whole phase area is fulfilled, it is possible to optimize all consequent parameters. However, in most cases GA finds solution close to the optimum, not the exact optimum. In our case this means, that some consequent parameters are slightly smaller than they should be, while the B. Zupančič et al. 5 Experimental results: Optimization of the PLC controller of a hydraulic system Our laboratory hydraulic set-up consists of three tanks and a main reservoir [5]. The transfer function which describes the relation between the input flow of liquid (incoming flow in the first tank) and the level in the third tank (controlled variable) is Figure 5: Trajectory in the error - derivative plane others (perhaps the neighbours, see Table 2) are slightly larger. If the FLC output characteristics is considered as a non-linear function of two independent variables e and de/dt the surface is not very smooth as it has many local minima and maxima. Such surface can not assure the appropriate performance of the control system. As this inconvenience can be considered as a kind of noise introduced by stochastic features in GA, the idea to use a kind of iiltering arose. 
The idea of filtering is, to calculate the consequent parameter value by averaging in which the parameter itself and all parameter's neighbours are included (see Table 2) q' _ f + C't-i.j + i + C'ž+ij-i + C'ž+l.j + l + + a + h-\-c + -h Cj-ij + Cj+ij) a + b + c c-Cij a + b + c Gp{s) = yjs) ^ 1 U{s) s^ + 2s2 + 3s -)-1 (7) As mentioned FLC was Sugeno 0"' order with two inputs (error and its derivative), for each input five equally spaced membership functions were defined (Figure 1). Knowledge base was described with 25 rules (Figure2). The controller output was calculated with Eq.(5). Optimization with GA was used to calculate optimal values of 25 consequent parameters Cij, each was coded with 12 bits. GA selects the best controllers from the generation and performs other operations (crossover and mutation). Selection is made on the basis of fitness values that depend on the control error (see Eqs. (1), (2), (3)). The important parameters of GA are shown in Table 3. number of generations 70 crossover probability Pc 1 number of crossover points Nc 3 mutation probability Pm 0.01 number of individuals in gen. N, 30 Table 3: GA parameters (6) The overall scheme is shown in Figure 3. where a a,b,c new (filtered) value of the consequent parameter current values of consequent parameters (m = i-l,i,i + l, n = j - l,j,j + 1) weights (parameters of the filter) Using the filtering, the best results are obtained with optimization in several steps. After each step the filtering is used, what means that the new individuals are calculated from all individuals of the last generation of previous optimization step. Values of the filtering parameters a, b and c depend mostly on the type of process. There is no strict rule how to set them, but in our examples the starting values were set to 1 (a = 1,6 = land c - 1). In some experiments b and c were intensified during optimization steps. 
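The filtering step can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the filter of Eq. (6): it assumes weight a for the parameter itself, weight b for each edge neighbour, and weight c for each diagonal neighbour, and it normalises by the sum of the weights actually used, so that cells on the border of the truth table are averaged over the neighbours that exist. The function name `filter_consequents` and this weight assignment are hypothetical.

```python
def filter_consequents(C, a=1.0, b=1.0, c=1.0):
    """Smooth a table of consequent parameters by weighted neighbourhood averaging.

    Assumed weighting (not taken from the paper): a for the cell itself,
    b for each edge neighbour, c for each diagonal neighbour. The result is
    divided by the sum of the weights actually used, so it is a true average.
    """
    rows, cols = len(C), len(C[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            weight = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    m, n = i + di, j + dj
                    if 0 <= m < rows and 0 <= n < cols:
                        if (di, dj) == (0, 0):
                            w = a          # the parameter itself
                        elif 0 in (di, dj):
                            w = b          # edge neighbour
                        else:
                            w = c          # diagonal neighbour
                        total += w * C[m][n]
                        weight += w
            out[i][j] = total / weight
    return out


# A single outlier in a 3 x 3 table is spread over its neighbourhood.
spike = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
smoothed = filter_consequents(spike)
```

With a = b = c = 1 this reduces to a plain mean over the available neighbourhood; raising b or c relative to a pulls each parameter more strongly toward its neighbours.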
The first generation of individuals was initialized with random numbers; no knowledge about the process was included. The optimal control system performance is shown in Figure 6. The small oscillations are caused by the rough FLC output characteristic, which is confirmed by Figure 7, where a gray scale is used to denote the profile of the plane (darker means lower). After this study the filtering was introduced. Three optimization iterations were performed, each with 70 generations. After each optimization the filtering was used. After the first optimization and filtering the criterion function was 672, after the second 275 and after the third 262. As the change between the second and the third iteration was not significant, the iterative procedure was terminated. Other examples also confirm that three iterations are usually reasonable. Figures 8, 9 and 10 show the optimal results (reference sp, controlled variable y and control signal u) after the first, second and third iteration of optimization.

Figure 6: Optimal results (without filtering, y ... controlled variable, u ... control variable, sp ... reference)
Figure 7: Output characteristic of the FLC optimized with GA (without filtering)
Figure 8: Results after the first iteration of optimization
Figure 9: Results after the second iteration of optimization
Figure 10: Results after the third iteration of optimization

It can be seen that the responses at rising and falling edges are not the same, because the FLC characteristic is not symmetrical (linear). The response in Figure 10 is much better, as it is fast and has a small overshoot. As the fitness function depends only on the control error, the values of the control signal u are very high at the points of reference change. If such values are unacceptable for the actuator, the control variable u should be included in the criterion function in some way. With filtering, the FLC output characteristic is much smoother.
Figure 11 depicts the output characteristic of the FLC.

Figure 11: Output characteristic of the FLC optimized with GA (with filtering)

6 Conclusion

Genetic algorithms seem to be an efficient optimization approach for complex control systems with many tuning parameters. In fuzzy logic control systems many parameters influence the behaviour: the number and shape of the membership functions, the consequent parameters, etc. Conventional optimization techniques are therefore usually not efficient enough, or are even unusable. In the presented study the consequent parameters of the FLC were optimized with a GA. Experience shows that some of the parameters can end up with more or less random values after the optimization if certain facts are not taken into account. To avoid this, an appropriate type of reference signal should be used. It is also recommended to plot the trajectory of the FLC inputs (e.g. in the plane e, de/dt) to see which parts of the truth table are adequately covered by the inputs and to find out which consequent parameters cannot be optimized. Observing the FLC output characteristic is also useful, because smooth shapes indicate that the optimization has produced at least near-optimum values. Filtering of the FLC characteristic is another useful method, which makes the output characteristic smoother and so improves the responses. The procedure also reduces the adverse influence of parameters that are not satisfactorily optimized, because their values move closer to the average of the other optimized parameters. In the future, more effort should be devoted to additional experiments with different reference or disturbance signals that more closely resemble the shapes occurring in reality. Different types of FLCs, as well as the study of the influence of different approaches in GA, also offer many possibilities.

References

[1] Dixon L. C. W. (1972) Nonlinear optimization. The English Universities Press Limited.
[2] Goldberg D. E.
(1989) Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley Publishing Company.
[3] Åström K. J. & Hägglund T. & Hang C. C. (1990) Automatic tuning and adaptation for PID controllers - a survey. Automatica, 20, p. 645-651.
[4] McMillan G. K. (1990) Tuning and Control Loop Performance - A Practitioner's Guide (2nd ed.). Instrument Society of America, NC.
[5] Matko D. & Karba R. & Zupančič B. (1992) Simulation and Modelling of Continuous Systems: A Case Study Approach. Prentice Hall Int., London.
[6] Beasley D. & Bull D. & Martin R. (1993) An Overview of Genetic Algorithms: Part 1, Fundamentals. University Computing, 15, 2, p. 58-69.
[7] Beasley D. & Bull D. & Martin R. (1993) An Overview of Genetic Algorithms: Part 2, Research Topics. University Computing, 15, 4, p. 170-181.
[8] Punch W. F. & Goodman E. D. & Min Pei & Lai Chia-Shun & Hovland P. & Enbody R. (1993) Further research on Feature Selection and Classification Using Genetic Algorithms. ICGA93, Champaign, Ill., p. 557-564.
[9] Beasley D. & Bull D. & Martin R. (1993) A Sequential Niche Technique for Multimodal Function Optimization, Technical Report No. 93001. University of Bristol, Bristol, UK.
[10] Beasley D. & Bull D. & Martin R. (1993) Reducing Epistasis in Combinatorial Problems by Expansive Coding. Proc. 5th International Conference on Genetic Algorithms, Ed. S. Forrest, Morgan Kaufman, p. 400-407.
[11] Prong van Hoogeveen J. W. D. (1995) Optimization using Genetic Algorithms. Ph.D. thesis. Technical University Delft. 3-14.
[12] Jager R. (1995) Fuzzy Logic in Control. Ph.D. thesis. Technical University Delft.
[13] Zhao J. & Gorez R. & Wertz V. (1996) Genetic algorithms for the elimination of redundancy and/or rule contribution assessment in fuzzy models. Mathematics and Computers in Simulation, 41, p. 139-148.
[14] Babuška R. & Verbruggen H. B. (1996) An Overview of Fuzzy Modelling for Control. Control Eng. Practice, 4, 11, p. 1593-1606.
PNNI And The Optimal Design Of High-Speed ATM Networks Abdella Battou Center for Computational Science, U.S. Naval Research Laboratory. Washington D.C. AND Bilal Khan and Sean Mountcastle, ITT Industries Systems Division, at the Center for Computational Science, U.S. Naval Research Laboratory. Washington D.C. E-mail: {battou,bilal,mountcas}@cmf.nrl.navy.mil Keywords: ATM Network Design, PNNI Simulation, PRouST Edited by: Mohsen Guizani Received: January 15, 1999 Revised: August 25, 1999 Accepted: My 31, 1999 In addition to being well-dimensioned and cost-effective, a high-speed ATM network must pass some performance and robustness tests. We propose an approach to ATM network topology design that is driven by the performance of its routing protocol, PNNI. Towards this end, we define performance indicators based on the time and traffic required for the protocol to first enter and subsequently return to the meta-stable state of global synchrony, in which switch views are in concordance with physical reality. We argue that the benefits of high call admission rate and low setup latency are guaranteed by our indicators. We use the PNNI Routing and Simulation Toolkit (PRouST), to conduct simulations of PNNI networks, with the aim of discovering how topological characteristics such as the diameter, representation size, and geodesic structure of a network affect its performance. 1 Introduction The size of operational ATM clouds continues to grow at an increasing pace. Both in anticipation of this changing scale, and to insure smooth inter-operation of these networks, the ATM Private Network-Network Interface standard (PNNI) was recently adopted (ATM Forum 1996). PNNI defines a set of protocols for hierarchical networks of ATM switches, and is designed to provide efficient and effective routing. 
In the long term, however, the degree to which PNNI succeeds in this regard will depend crucially on two factors: First, because PNNI does not mandate specific policies for call admission, route selection, or topology aggregation, these aspects of the protocol remain "implementation-specific". Clearly the degree to which PNNI meets the challenges posed by tomorrow's ATM networks will depend significantly on the success of switch designers in devising effective algorithms for the admission and routing of connections, and for the aggregation of topology information. Second, network designers must have the tools and information necessary to design ATM network topologies that are (i) capable of meeting anticipated traffic demands, and (ii) optimized for performance under PNNI. In this paper, we shall not address the first of these two issues, that of dimensioning networks to satisfy known costs and traffic demands. Our investigations begin at the point where a network designer, having been given projected traffic profiles and switch/fiber specifications, has arrived at a set of candidate ATM network topologies which appear equally adequate. We argue that although two topologies may appear indistinguishable in terms of the mathematics of QoS requirements, the PNNI protocol exhibits significant differentiation in their performance. Understanding how the PNNI protocol affects network performance is a necessary first step to determining how the adoption of PNNI should affect ATM network design. In subsequent sections, we shall describe our simulation experiments and begin developing guidelines for ATM network topology design that take into consideration the specific nature of the PNNI protocol. 2 PNNI Performance Indicators There are many candidate performance criteria for evaluating the relative merits of network topologies. 
Here we shall assume that topology design is motivated by increasing the ATM network's call admission rate and decreasing the average connection setup latency. Additionally, we desire that the background traffic due to the PNNI protocol itself should not be excessively high.

Setup Latency. In the absence of crankback, setup latency within a peergroup is seen to be linearly correlated with the number of hops in the selected path (Niehaus et al. 1997) and thus may be estimated in the worst case by the network diameter. When crankbacks occur, each failed attempt at valid route selection contributes significant additional latency, required for backtracking to the entry border node, computing an alternate route and then re-traversing the peergroup along the new path. Reduction of setup latency thus requires minimizing the crankback frequency.

Crankbacks and Call Admission Rate. Recent results on PNNI aggregation schemes [2,4] indicate that ATM call admission rate and crankback frequency are directly related to the "distortion" present in switches' views of the network. In particular, the experimental data presented in (Awerbuch et al. 1998) confirms the intuition that when a switch has inaccurate (e.g. outdated) views of network topology and metric information, the likelihood increases that calls entering the peergroup at that switch will be assigned sub-optimal routes. A larger discrepancy between a switch's local information and the underlying reality of the network's state results in a larger fraction of calls originating at the switch being rejected en route, hence undergoing crankbacks (and possibly even unwarranted rejection at the source). Thus, beyond the problem of dimensioning, selecting topologies that will yield high ATM call admission rates and low average setup latency requires that one be able to characterize which topologies minimize the divergence in switches' views.

2.1 Our Approach

Local synchrony.
We define local synchrony of a peergroup to be the state where every switch in the peergroup has knowledge of the same set of PNNI Topology State Elements (PTSEs). It follows from the logic of the PNNI NodePeer finite state machine that if a peergroup reaches local synchrony, then all member switches agree about the topology metrics describing their peergroup. Within a PNNI peergroup, each member switch is responsible for originating and flooding accurate information about its internal state and the resource availabilities on its incident links. Thus, modulo any loss due to aggregation schemes, local synchrony may be interpreted as a state in which all members of the peergroup are in agreement not only amongst themselves, but also with the underlying reality of the peergroup's state.

Global synchrony. We define global synchrony of a connected ATM network to be the state where every peergroup at every level has reached local synchrony, and the PNNI network hierarchy has reached a unique apex. Admittedly, the notion of global synchrony is "artificial" in the sense that it may rarely be achieved in real dynamic networks where connections are constantly arriving and departing. However, in a simulated network this state is attainable, and we shall use it to probe the rate of PNNI information propagation.

When a switched virtual circuit (SVC) is established in an ATM network, bandwidth availability is altered for the links that the SVC traverses. Assuming this change is significant, updated information is re-originated and flooded by each switch incident to the affected links. If the network had reached global synchrony prior to the SVC setup request, these re-originations cause the network to fall out of global synchrony for a brief time, until the new information has reached every node. This naturally leads us to consider:
- Resynchronization time: Average time required for the network to return to global synchrony, after a single, isolated, random SVC setup.
- Resynchronization traffic: Average PNNI traffic required for the network to return to global synchrony, after a single, isolated, random SVC setup. The basic "trial" involved in measuring the above resynchronization parameters is as follows: allow the network to reach global synchrony, then inject a connection request between randomly chosen source and destination nodes and measure the time required and bytes transmitted before the network returns to a state of global synchrony. By repeating this trial a large number of times, we obtain an average value, which we call the resynchronization time. To illustrate the importance of resynchronization time, let us consider two extreme scenarios. First, consider a network where the average time between SVC requests (i.e. 1 / SVC arrival rate) is much higher than the network resynchronization time. In this situation, changes in bandwidth availability induced by an SVC setup will, on average, have time to flood to all other switches in the network prior to the arrival of the next SVC setup request. Thus, routing decisions for each SVC will, on average, be made according to completely accurate information at the originating switch. If an SVC setup is rejected or experiences unacceptably high latency, this undesirable behavior is attributable solely to inadequate dimensioning of the network, because there is no legitimate way to fulfill the request. In contrast, consider a network where the average time between SVC requests is much lower than the network resynchronization time. In this situation, changes in bandwidth availability induced by an SVC setup have not yet propagated to many switches in the network by the time the next SVC setup request arrives. Thus, the routing decision for an SVC is likely to be made according to stale information. The extent to which the information is stale is determined by the extent to which network resynchronization time exceeds average inter-SVC arrival time. 
If an SVC setup is rejected or experiences unacceptably high latency, this undesirable behavior may be due to inadequate dimensioning of the network, or it may be due to suboptimal routes and unwarranted rejections induced by inaccurate information at the switches.

Major changes in network topology, such as network partitioning due to link/node failure, or re-merging of components upon subsequent recovery, will cause the PNNI hierarchy to undergo severe restructuring. We define the boot parameters below to be indicators of the time and traffic required to reinstate consistent routing information after such catastrophic changes.

- Boot time: Time at which the network first reaches global synchrony.
- Boot traffic: PNNI traffic required for the network to reach global synchrony for the first time.

We take the parameters above as a worst-case estimate of the time and traffic resources required to return to global synchrony. By comparison, the resynchronization parameters described earlier aim to measure the same quantities for the average case, i.e. during normal (stable) operation of the network.

We contend that a network designer, given two topologies that are equivalent with regard to meeting anticipated QoS requirements, must take into consideration their resynchronization and boot times. In particular, reducing these two quantities increases the fraction of time that the network spends in a state of global synchrony. At the same time, the designer must keep a watchful eye on the boot traffic and resynchronization traffic to make sure that not too much of the network bandwidth is being expended by PNNI itself.

One way in which the designer might determine the values of the four parameters mentioned above is to take physical measurements of them on two live networks, each configured in the appropriate topology.
But this would be difficult to do accurately because of the inherent problems in distributed measurement, and moreover it would completely defeat the intention of design before implementation! Alternatively, one could simulate the candidate PNNI networks to determine both time and traffic for boot and resynchronization; we follow this latter approach here.

3 Experiments

3.1 The Simulation Environment

Our simulation experiments were carried out using the PNNI Routing and Simulation Toolkit (PRouST), which was developed by the Signaling Group at the Naval Research Laboratory's Center for Computational Sciences. PRouST is a faithful and complete implementation of version 1.0 of the PNNI standard and can be used both to simulate large networks of ATM switches and to emulate live ATM switch stacks. In particular, PRouST includes the Hello, NodePeer, and Election finite state machines, all relevant packet encoding and decoding libraries, routing database management, and full support for hierarchy. In addition, a "plug-in" interface for call admission, path selection and aggregation policies is provided. For interswitch signaling PRouST makes use of the NRL Signaling Entity for ATM Networks (SEAN), which is a complete simulation/emulation library implementing version 4.0 of the ATM User-Network-Interface (UNI) standard. The fidelity of PRouST's PNNI implementation has been demonstrated extensively in live interoperability tests with commercial switches. Both PRouST and SEAN are written in C++ over the Component Architecture for Simulation of Network Objects (CASiNO) described in (Mountcastle et al. 1999), and both are available in the public domain.

In the simulations that follow, all network links operate at the OC3 rate. Link jitter varies uniformly between +10 µs and -10 µs for each transmitted message. PNNI messages that enter the switch control port experience a latency of 10 ms.
The backplane of the switch routes data traffic on existing virtual circuits at OC48 rates. These figures, while artificial, are projections of current switch vendor specifications. We found that jitter did not noticeably alter the outcome of our simulations from one trial to the next. The variations were typically less than 1% for boot and resynchronization time, and less than 5% for boot and resynchronization traffic. The values we have listed in our tables are mean values.

An outline of results presented: We seek to understand what factors influence resynchronization and boot parameters. To this end, we start by simulating single-peergroup networks with grid, chain, ring and star topologies; these results are presented in sections 3.2 and 3.3. We compare these very particular families of topologies with similar experiments using all possible topologies on 7 nodes; these results are described in sections 3.4-3.5. In addition, we simulate 100 randomly generated topologies on 20 nodes, the results of which are described in section 3.6. Finally, in sections 3.7 and 3.8 we address the impact of peergroup size and hierarchy, by simulating several different hierarchic configurations of linear networks.

3.2 Boot and Resynchronization Time

We start by investigating the characteristics of network topology that influence boot and resynchronization times. The NodePeer protocol floods PNNI Topology State Elements over a link whenever the switches incident to the link have discrepant databases. Thus, information will flow from each switch radially until it has reached every other switch in the peergroup. Given this, network boot time and worst-case resynchronization times should be linearly dependent on the diameter of the peergroup.

Simulation Results

We simulated PNNI networks of increasing size, specifically chains, grids, rings, and star networks. Tables 1-4 show the results of the simulations, which are depicted in figures 1 and 2.
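The expected linear dependence on diameter can be checked directly against the chain measurements reported in Table 1. The following is a minimal sketch, not part of PRouST: it fits an ordinary least-squares line to the published chain values, and the names `CHAIN_ROWS` and `least_squares_slope` are illustrative assumptions introduced here.

```python
# Chain measurements copied from Table 1:
# (network diameter, boot time in seconds, resynch. time in seconds).
CHAIN_ROWS = [
    (1, 30.0864, 0.0108),
    (3, 30.1290, 0.0323),
    (5, 30.1719, 0.0537),
    (7, 30.2146, 0.0755),
    (9, 30.2588, 0.0962),
    (19, 30.4562, 0.2035),
    (29, 30.6859, 0.3104),
]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


diameters = [row[0] for row in CHAIN_ROWS]
boot_slope = least_squares_slope(diameters, [row[1] for row in CHAIN_ROWS])
resync_slope = least_squares_slope(diameters, [row[2] for row in CHAIN_ROWS])

if __name__ == "__main__":
    # Slopes are in seconds per unit of diameter; print them in milliseconds.
    print(f"boot time:     {boot_slope * 1000:.1f} ms per unit of diameter")
    print(f"resynch. time: {resync_slope * 1000:.1f} ms per unit of diameter")
```

For the chain data, the fitted slopes should come out close to 21 ms and 11 ms per unit of diameter respectively. The same function can be applied to the grid and ring tables to compare families.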
The figures indicate that PNNI boot time grows linearly in network diameter; for each unit increase in diameter, PNNI boot time increases by approximately 21.4 ms, while resynchronization time increases by 10.7 ms.

Chain     Network     Boot Time    Resynch. Time
Length    Diameter    (seconds)    (seconds)
2         1           30.0864      0.0108
4         3           30.1290      0.0323
6         5           30.1719      0.0537
8         7           30.2146      0.0755
10        9           30.2588      0.0962
20        19          30.4562      0.2035
30        29          30.6859      0.3104
Table 1: Chains, boot/resynch. times

Grid      Network     Boot Time    Resynch. Time
Size      Diameter    (seconds)    (seconds)
2x2       2           30.1073      0.0212
3x3       4           30.1498      0.0428
4x4       6           30.1926      0.0636
6x6       10          30.2785      0.1067
Table 2: Grids, boot/resynch. times

Ring      Network     Boot Time    Resynch. Time
Length    Diameter    (seconds)    (seconds)
10        5           30.1654      0.0518
20        10          30.2694      0.1037
30        15          30.3732      0.1555
40        20          30.4932      0.2172
50        25          30.6002      0.2688
Table 3: Rings, boot/resynch. times

Star      Network     Boot Time    Resynch. Time
Size      Diameter    (seconds)    (seconds)
5         2           30.0872      0.02172
10        2           30.0886      0.02179
15        2           30.0894      0.02188
20        2           30.0896      0.02191
25        2           30.0909      0.02198
Table 4: Stars, boot/resynch. times

Figure 1: Boot time vs. network diameter
Figure 2: Resynch. times vs. network diameter

3.3 Boot and Resynchronization Traffic

To study PNNI boot and resynchronization traffic, we will introduce the notion of the representation size of a network: the minimum number of PNNI topology state elements required to fully describe its topology. In our subsequent discussion, we take the representation size of a network to be twice the number of links plus the number of switches.

Because the NodePeer protocol is responsible for transmission of the peergroup's current representation to each component switch, traffic required to reach initial global synchrony should be bounded below by a function proportional to the product of the representation size and the number of switches. For chains, grids and rings the number of nodes and the representation size are linearly related, so this product is quadratic in the representation size.

Resynchronization involves flooding updated information about links affected by the SVC to all members of the peergroup. In the best case, flooding takes place over a spanning tree and the total traffic required to resynchronize is proportional to the number of switches in the peergroup. In the worst case, when flooding occurs over every edge in the network, traffic due to resynchronization is proportional to the number of links in the network. For this reason, we consider resynchronization traffic in PNNI networks as a function of representation size.

Simulation Results

We interpret the traffic data collected from the simulations of the previous section. This data is shown in tables 5-8. The traffic required to reach initial global synchrony in each of these cases grows super-linearly with the representation size, as can be seen in figures 3 and 4. Resynchronization traffic also manifests super-linear growth as a function of representation size.

Chain     Repres.    Boot Traffic    Resynch. Traffic
Length    Size       (bytes)         (bytes)
2         4          10292           2967
4         10         56028           13399
6         16         133976          30127
8         22         248124          53247
10        28         394476          82575
20        58         1629364         324487
30        88         3702444         719548
Table 5: Chains, boot/resynch. traffic

Grid      Repres.    Boot Traffic    Resynch. Traffic
Size      Size       (bytes)         (bytes)
2x2       12         73284           14412
3x3       33         182232          43240
4x4       64         1674880         260640
6x6       156        3521512         816003
Table 6: Grids, boot/resynch. traffic

Ring      Repres.    Boot Traffic    Resynch. Traffic
Length    Size       (bytes)         (bytes)
10        30         201812          2720
20        60         765388          5440
30        90         1379620         108160
40        120        2167876         816680
50        150        3353088         1286000
Table 7: Rings, boot/resynch. traffic

Star      Repres.    Boot Traffic    Resynch. Traffic
Size      Size       (bytes)         (bytes)
5         16         30864           9144
10        31         129024          43976
15        46         292724          104776
20        61         526464          191624
25        76         820152          302496
Table 8: Stars, boot/resynch. traffic

Figure 3: Boot traffic vs. representation size
Figure 4: Resynch. traffic vs. representation size

3.4 General Tests I: All 7 Node Networks

In order to confirm the validity of the above conclusions, we conducted the same experiments on the class of all (853 topologically distinct) 7 node connected graphs.¹

¹ The software used to generate all non-isomorphic 7 node graphs was the NAUTY program developed by Brendan McKay.

Simulation Results

Because the 7 node networks all have relatively small diameter, there was little differentiation in their boot and resynchronization times. As the graphs in figures 5 and 6 indicate, boot and resynchronization times were clustered at discrete values spaced 10 ms apart. We note that 10 ms is the time required for processing of a single PNNI message at the control port of a switch. While the plot does not immediately illustrate this, the distribution of the data points was not uniform over the cluster points; we have plotted the average as a function of diameter to emphasize this. Somewhat surprisingly, on average, resynchronization time seems to grow super-linearly with network diameter. More simulations need to be conducted over a larger class of graphs to determine the cause of this phenomenon. The PNNI boot traffic for these simulations is shown in figure 7, and is linear with an approximate growth rate of 6000 bytes per unit of representation.

Figure 5: Boot times for 7 node networks
Figure 6: Resynch. times for 7 node networks
Figure 7: Boot traffic for 7 node networks
Figure 8: Resynch. traffic for 7 node networks

570 Informatica 23 (1999) 565-573 A. Battou et al.

3.5 Network Geodesic Structure

The plot shown in figure 8 indicates that over 7 node graphs of any fixed representation size, there is considerable variation in PNNI resynchronization traffic. We attempted to understand the cause for this differentiation by determining, for each representation size, which topologies achieved the highest and lowest resynchronization traffic. Figure 9 is a partial list of the best and worst performers (only those with representation sizes between 19 and 33 are shown). We noted that the worst performers had a large number of redundant geodesics between pairs of switches. That is to say, topologies with the worst performance contained a multiplicity of shortest paths between nodes. Scrutiny of the simulations revealed that this multiplicity results in redundant flooding in the PNNI NodePeer protocol. Figure 10 illustrates the redundant flooding of a PTSE along one edge when two nodes are connected by more than one shortest path. The fact that the worst performers in figure 9 contain many unchorded 4-cycles, whereas the best performers contained many triangles, supports this conclusion.
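The contrast between unchorded 4-cycles and triangles can be made concrete on two toy peergroups that share the same representation size (twice the number of links plus the number of switches). The sketch below is illustrative only: the two 4-switch graphs are hypothetical examples rather than topologies from figure 9, and counting the switch pairs joined by more than one shortest path is an assumed proxy for geodesic redundancy, not a quantity measured in the simulations.

```python
from collections import deque


def representation_size(num_switches, links):
    """Twice the number of links plus the number of switches."""
    return 2 * len(links) + num_switches


def shortest_path_counts(num_switches, links, source):
    """Breadth-first search from `source`.

    Returns a dict mapping each reachable switch to the number of
    distinct shortest paths (geodesics) from `source` to it.
    """
    adjacency = {v: [] for v in range(num_switches)}
    for a, b in links:
        adjacency[a].append(b)
        adjacency[b].append(a)
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                # First time v is reached: inherit u's geodesic count.
                dist[v] = dist[u] + 1
                count[v] = count[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # Another shortest route into v through u.
                count[v] += count[u]
    return count


def redundant_pairs(num_switches, links):
    """Number of unordered switch pairs joined by more than one geodesic."""
    total = 0
    for s in range(num_switches):
        count = shortest_path_counts(num_switches, links, s)
        total += sum(1 for t, c in count.items() if t > s and c > 1)
    return total


# Hypothetical 4-switch peergroups with 4 links each.
FOUR_CYCLE = [(0, 1), (1, 2), (2, 3), (3, 0)]          # unchorded 4-cycle
TRIANGLE_WITH_TAIL = [(0, 1), (1, 2), (0, 2), (2, 3)]  # triangle plus one pendant link

if __name__ == "__main__":
    for name, links in (("4-cycle", FOUR_CYCLE), ("triangle+tail", TRIANGLE_WITH_TAIL)):
        print(name, representation_size(4, links), redundant_pairs(4, links))
```

Both graphs have representation size 12, yet only the 4-cycle contains pairs of switches (the two pairs of opposite corners) with more than one shortest path between them, which is the structure associated above with redundant flooding.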
Figure 9: Best and worst 7 node topologies (in terms of boot traffic). Columns: worst performing topology, representation size (19, 21, 23, 25, 27, 29, 31, 33), best performing topology.
Figure 10: Effects of geodesic structure on traffic. Annotation in the figure: "On this link, only database summaries are exchanged."

3.6 General Tests II: Randomly Generated 20 Node Networks

We also conducted measurements of boot and resynchronization parameters for randomly generated networks. A random topology is generated as follows: 20 nodes are assigned random locations on a grid. Links are added via a random process that repeatedly generates a random node-pair and adds a link between them with a probability that decays exponentially with the Euclidean distance between the two nodes (Waxman 1988). Links that would cause the degree of a node to exceed eight are rejected by the random process in order to keep the graph reasonably sparse. The process of adding links terminates when the graph is connected and every node has degree at least 2. In this manner, we generated 8000 random topologies on 20 nodes. These were then sorted into classes, based on the diameter of the generated network, and 10 networks were chosen randomly from each class.

Simulation Results

The results of simulations of 100 random 20-node networks are shown in figures 11 and 12. The data indicates that for any given value of diameter, there is a significant variation in the resynchronization and boot times. We were not able to determine the topological property that is responsible for this variance, but hope to address the question in our future work. The results of previous sections indicate that if we are given two networks from the same restricted family (such as grids, for example), then diameter alone can serve to predict the approximate value of the resynchronization time. The experiments presented in this section show that, unfortunately, this simple estimation criterion does not hold as well over large mixed families of graphs (e.g. our random set). On the other hand, we remark that the average value of resynchronization and boot times (over the randomly chosen topologies) does increase linearly with diameter.

Figure 11: Boot times for 100 random 20-node networks
Figure 12: Resynch. times for 100 random 20-node networks

3.7 Peergroup Size

Another aspect of network design we consider is the choice of peergroup size. First, we note that partitioning a network into peergroups will result in more SVCs needing to be established between leaders at the next higher level. It also requires the logical Hello finite state machines to stabilize over these SVCs and the election process to conclude at the higher level. These two factors together are responsible for a discrete jump in the PNNI boot time when one moves from single peergroup to multiple peergroup configurations. As we begin to decrease the size of the peergroups, each requires less information to reach local synchrony, since detailed information about distant peergroups is being represented by a single logical node at the higher level. This causes boot traffic and resynchronization traffic to decrease. As we make peergroup size smaller still, an SVC setup on average traverses a larger number of peergroups, which in turn triggers re-aggregation of links at the higher level. Thus, beyond a certain point, decreasing peergroup size causes an increase in resynchronization traffic and time.

Simulation Results

We simulated chains of 16 and 32 switches, configured with peergroups of sizes 2, 4 and 8 (the 16 node examples are illustrated in figure 13). The data from the simulations is presented in tables 9 and 10. In the 32 node chain, going from a single peergroup of 32 nodes to 4 peergroups of 8 nodes produces a jump in the PNNI boot time of over 59 seconds.
On the other hand, boot traffic decreases by a factor of 3, and resynchronization traffic by a factor of 6.4. Reduction of the peergroup size from 4 to 2 causes an increase in resynchronization traffic of 30%. This happens because higher level links are re-aggregated and flooded downward in response to the change in resources at the lower level.

Topology name       16-16     16-4      16-2
# of PGs            1         4         8
PG size             16        4         2
Boot time           30.38s    90.26s    90.27s
Boot traffic        1034K     639K      587K
NNI boot traffic    0         1.5K      1.8K
Resync. time        0.30s     0.23s     0.24s
Resync. traffic     494K      156K      233K
Table 9: Varying peergroup size for a 16 node chain

Topology name       32-32     32-8      32-4      32-2
# of PGs            1         4         8         16
PG size             32        8         4         2
Boot time           30.73s    75.43s    75.39s    75.49s
Boot traffic        4217K     1401K     1121K     1420K
NNI boot traffic    0         3.1K      3.7K      3.9K
Resync. time        0.64s     0.46s     0.46s     0.51s
Resync. traffic     1981K     306K      689K      898K
Table 10: Varying peergroup size for a 32 node chain

Figure 13: Examples of peergroups simulated (panels 16:PG-16, 16:PG-4, 16:PG-2).

3.8 Hierarchy

Finally, we consider how the presence of hierarchy affects our performance indicators. First, we note that a network with many levels of hierarchy requires SVCs to be established between leaders in the same higher level peergroup. It also requires the logical Hello finite state machines to stabilize over these SVCs and the election process to conclude in each peergroup at each level. These two factors are the principal cause of the discrete jump seen in PNNI boot time and boot traffic as one considers networks with a greater number of levels. On the other hand, hierarchy localizes the side effects of network changes. In particular, it reduces the number of switches to which updated information must be flooded. Thus hierarchical addressing lowers both resynchronization time and traffic.

Simulation Results

We simulated chains of 16 and 32 switches, configured with various hierarchical structures (the 16 node scenarios are depicted in figure 14). The data collected is presented in table 11. A 3-level configuration of the 32 switch chain boots in 75.4897 seconds, while the 5-level one requires 150.3220 seconds to reach initial global synchrony. The 5-level hierarchy has the advantage of improving resynchronization time and traffic by factors of 1.3 and 1.8 respectively.

Topology name          16/3      16/5          32/3       32/5
Chain length           16        16            32         32
Hierarchy structure    16,8,1    16,8,4,2,1    32,16,1    32,16,8,4,1
Boot time              90.27s    120.32s       75.49s     150.32s
Boot traffic           587K      611K          1420K      1717K
NNI boot traffic       1.8K      3.1K          3.9K       6.8K
Resync. time           0.24s     0.16s         0.51s      0.38s
Resync. traffic        233K      215K          1191K      638K
Table 11: Simulation results for several hierarchy configurations

Figure 14: Two examples of hierarchic structures simulated (panels 16:HIER-3, 16:HIER-5).

4 Conclusion and Future Work

The simulations conducted using the PNNI Routing and Simulation Toolkit (PRouST) confirmed that topological characteristics such as the diameter, representation size, and geodesic structure do affect the boot and resynchronization times and traffic. These four indicators determine the discrepancy between switches' views of the network and its underlying physical state, and are thus critical to call admission rates and crankback frequency in ATM networks. Partitioning a network into peergroups reduces the boot and resynchronization traffic, and introducing hierarchy results in improved resynchronization time and traffic, although it has the minor side effect of increasing the boot time.

Peergroup size and hierarchy structure are two of the most important choices a network designer must make. Our future research efforts will focus further on these two parameters. As we have shown in section 3.7, there is an optimal value beyond which reduction of peergroup size results in increased resynchronization traffic, due to the re-aggregation and downward flooding of higher level links in response to changes at lower levels. We intend to quantify precisely both this optimal value and the relative tradeoffs between peergroup size, hierarchy structure, and resynchronization traffic and time.

5 Acknowledgements

We would like to thank David Talmage and Jack Marsh at the NRL Signaling Group for their contributions to PRouST, SEAN and CASiNO. We also wish to acknowledge Sandeep Bhat for his assistance, particularly with the internals of the NodePeer finite state machine.

References

[1] ATM Forum. (1996) Private Network-Network Interface Specification, Version 1.0.

[2] Baruch Awerbuch, Yi Du, Bilal Khan, and Yuval Shavitt. (1998) Routing Through Networks with Hierarchical Topology Aggregation. Journal of High Speed Networks, vol. 7(1), p. 57-73.

[3] W. C. Lee. Topology aggregation for hierarchical routing in ATM networks. Computer Communications Review, p. 82-92.

[4] S. Mountcastle, et al. (1999) CASiNO: A Component Architecture for Simulating Network Objects. Proceedings of the 1999 Symposium on Performance Evaluation of Computer and Telecommunications Systems, July 11-15, 1999, p. 261-272.

[5] D. Niehaus, et al. (1997) Performance Benchmarking of Signaling in ATM Networks. IEEE Communications Magazine, vol. 35, number 8, p. 134-142.

[6] Bernard M. Waxman. (1988) Routing of multipoint connections. Journal on Selected Areas of Communications, vol. 6, p. 1617-1622.

Intelligent Agent Technology

The first Asia-Pacific conference on intelligent agent technology, with accompanying tutorials and workshops, was held at Hong Kong Baptist University, 14-17 December 1999. Over 120 participants attended the conference. The acceptance rate was 27.8% for long conference papers and 22% for single-track workshop (WS) papers.

The WS on agents in electronic commerce, WAEC'99, was chaired by Yiming Ye, IBM T.J. Watson Research Center. At the WS, researchers presented several ideas on how to incorporate intelligent agents into electronic commerce.
The WS was nicely concentrated on a specific subject and dominated by IBM researchers. Among the invited speakers was the head of the IBM Deep Blue team that defeated Kasparov.

The conference had a much broader spectrum than the WS. While major agent conferences are already specialized into specific agent subfields, the first Pacific conference was a uniform one, with the aim of bringing together all researchers in the agent area. Papers were grouped into the following categories: agent architectures, multi-agent cooperation, distributed intelligence, formal agent theories, knowledge discovery and data mining agents, personalized Web agents, software agents, mobile agents, agent-supported enterprise. The program chairs were Jiming Liu and Ning Zhong. They were also editors of the conference proceedings, published as a book by World Scientific.

There were four invited presentations: Ohsuga, Bradshaw, Zytkow, Ling. Dr. Ohsuga, in his presentation "How can AI systems deal with large and complex problems - Model building as problem solving", analyzed the reasons for the increasing introduction of complex problems into our lives and into AI. The major way to tackle complex problems is to design advanced problem models that enable automatic program generation.

Dr. Zytkow is one of the best-known researchers in the automatic discovery field. Learning by discovery is the most challenging subfield of machine learning, sometimes treated as a stand-alone area related to intelligence and top human creative capabilities. Since agents have to be autonomous by definition, they have to create new knowledge, i.e. autonomously pursue knowledge in purposeful interaction with the internal world. More autonomous means more able to make discoveries in more complex circumstances. Zytkow's approach, presented in "Robot-discoverer: a role model for any intelligent agent", might be welcome, since at least some of the agents today seem to be labeled as agents only in order to join the mainstream.
There is little doubt that agents capable of discovery, that is, of creating new knowledge, are more advanced than agents without this possibility.

Dr. Bradshaw from Boeing presented an invited paper "Steps toward the permanent colonization of cyberspace". Bradshaw's book on intelligent agents is among the most often used textbooks on agents worldwide. His presentation was also one of the most futuristic. Bradshaw pointed out that agent research is a new field and that it progresses quickly and changes along the way. Today, there are several types of agents and several areas of agent research. As an interesting new application, a space ball capable of checking the status of vital functions in a station was presented. The ball will soon be applied in real-life circumstances. However, it seems that the application was man-dominated, thus not leaving room for full agent autonomy. The major part of the presentation was devoted to long-lived heterogeneous colonies of agents colonizing cyberspace. Such colonies need life-support services, essentials, and legal and social services.

Dr. Ling, head of the Microsoft Richmond research laboratories, presented an invited paper on "Intelligent agents: embodied and disembodied". In the first part of the presentation he analyzed Bayesian networks and the advantages of this methodology. A limited version of a Bayesian network is applied in the Microsoft Office Assistant; a slightly more advanced version is implemented in the troubleshooting agents. In the second part of his presentation, Ling presented embodied agents, capable of animation. They imitate the visual effects of feelings by showing different facial expressions or by body language. This might be a desirable function in videoconferencing. Overall, in this way computers become more personal, closer to and more acceptable to humans.

The panel was another very interesting event.
Bradshaw presented the old Apple animation showing a futuristic agent capable of information gathering and communication with a user. Many of these functions have not been fulfilled yet. In other words, some of the challenges that agents face are the old challenges of artificial intelligence: how to simulate human-acceptable intelligence on computers.

Throughout the event, there were lots of interesting papers on agent research. One of the dominating research directions is multi-agent research; the other deals with information overflow. Several papers dealt with information gathering from heterogeneous sources such as mobile telephones, the Web and TV.

We presented our employment agent in the single-track WS session. It seems that our idea of uniform information gathering from any Internet employment database through normal HTML is original. Other approaches are usually based on agent languages such as KQML, or on database wrappers. But these approaches demand cooperation from the database, while in our approach the agent takes care of modifying its queries according to the database forms.
Matjaž Gams

First Announcement and Call for Papers

JCKBSE 2000
Fourth Joint Conference on Knowledge-Based Software Engineering
Brno, Czech Republic, September 12-14, 2000

Organized by:
- Czech Society for Computer Science
- Department of Computer Science and Engineering, Faculty of Electrical Engineering and Computer Science, Brno University of Technology, Czech Republic
- SIG-KBSE, The Institute of Electronics, Information and Communication Engineers, Japan

In cooperation with:
- IEEE Tokyo Section Computer Chapter
- Japanese Society of Artificial Intelligence
- Russian Association for Artificial Intelligence
- Slovak Society for Computer Science
- The Institution of Electrical Engineering - Slovakia Centre
- VEMA Brno, computers and projects, Ltd., Brno

Steering Committee:
- Christo Dichev, IIT, Bulgarian Academy of Sciences
- Morio Nagata, Keio University
- Pavol Navrat, Slovak University of Technology
- Vadim L. Stefanuk, IITP, Russian Academy of Sciences
- Haruki Ueno, Tokyo Denki University

About the Conference

JCKBSE aims to provide a forum for researchers and practitioners to discuss the latest developments in the areas of knowledge engineering and software engineering. Particular emphasis is placed upon applying knowledge-based methods to software engineering problems. The Conference originated in order to provide a forum in which the latest developments in the field of knowledge-based software engineering could be discussed. Although initially targeting scientists from Japan, the CIS countries and countries in Central and Eastern Europe, JCKBSE warmly welcomes participants from all countries. JCKBSE 2000 will continue with this tradition and is anticipating even wider international participation. Furthermore, the scope of the conference as indicated by its topics has been updated to reflect the recent development in all the three covered areas, i.e. knowledge engineering, software engineering, and knowledge-based software engineering.
The conference will include several invited talks and a plenary talk by a distinguished speaker.

Topics
- architecture of knowledge, software and information systems including collaborative, distributed, multi-agent and multimedia systems, internet and intranet,
- requirements engineering, domain analysis and modeling, formal and semiformal specifications,
- knowledge engineering for domain modeling, system family engineering, and software product lines,
- intelligent user interfaces and human-machine interaction,
- knowledge acquisition and discovery, data mining,
- automating software design and synthesis,
- object-oriented and other programming paradigms, metaprogramming,
- reuse, re-engineering, reverse engineering,
- knowledge-based methods and tools for software engineering, including testing, verification and validation, process management, maintenance and evolution, CASE,
- decision support methods for software engineering,
- applied semiotics for knowledge-based software engineering,
- knowledge systems methodology, development tools and environments,
- practical applications and experience of software and knowledge engineering,
- information technology in control, design, production, logistics and management,
- enterprise modeling, workflow,
- knowledge management for business process,
- intelligent agents for software engineering,
- program understanding, programming knowledge, learning of programming, modeling programs and programmers,
- knowledge-based methods and tools for software engineering education,
- software engineering and knowledge engineering education, distance learning.

Program Committee

Bailes P. (Australia), Banaszak Z. (Poland), Benczur A. (Hungary), Bielikova M. (Slovakia), Brusilovsky P. (USA), Devedzic V. (Yugoslavia), Dichev Ch. V. (USA), Dicheva D. (USA), Dochev D. (Bulgaria), Ehrlich A. (Russia), Eisenecker U. (Germany), Far Behrouz H. (Japan), Fukazawa Yoshiaki (Japan), Gams M. (Slovenia), Gladun V. P. (Ukraine), Hashimoto Masa-aki (Japan) - co-chair, Hori Masahiro (Japan), Hruška T. (Czech Republic) - co-chair, Kaijiri Kenji (Japan), Kang Kyo-Chul (Korea), Khoroshevsky V. F. (Russia), Komiya Seiichi (Japan), Koyama Temo (Japan), Kumeno Fumihiro (Japan), LloydWilliams M. (UK), Lozovskiy V. S. (Ukraine), Moessenboeck H. (Austria), Molnar L. (Slovakia), Nagata Morio (Japan), Navrat P. (Slovakia), Okamoto Toshio (Japan), Oomori Yasumasa (Japan), Osipov G. S. (Russia), Pechersky Y. (Moldova), Stefanuk V. L. (Russia), Sugawara Kenji (Japan), Tan Chew Lim (Singapore), Ueno Haruki (Japan), Welzer T. (Slovenia), Yamada Hiroyuki (Japan), Yamamoto Shuichiro (Japan), Young Gilbert H. (Hong Kong), Zendulka J. (Czech Republic).

Important Dates
February 1, 2000 - Preliminary registration
March 1, 2000 - Paper submission deadline
May 1, 2000 - Notification of acceptance
May 20, 2000 - Camera-ready deadline
September 12-14, 2000 - Conference dates

Correspondence Address
JCKBSE 2000
Department of Computer Science and Engineering, Brno University of Technology, Božetěchova 2, 612 66 Brno, Czech Republic
fax: +420-(0)5-4121 1141
e-mail: jckbse@fee.vutbr.cz
www: http://www.fee.vutbr.cz/UIVT/JCKBSE/

Venue

Brno, the metropolis of the South Moravian region, is the second largest city in the Czech Republic, with a population of more than 400,000. From the north-east and the north-west the town is surrounded by promontories of the Drahany Uplands and the Czech-Moravian Highlands, while to the south Brno's streets run into gently undulating plains around the massif of the Palava Hills. Brno is connected by a motorway and a railway with Prague and Bratislava, and by a railway and a good road with Vienna and Ostrava. There is also an international airport in Brno.

The conference is taking place in the Santon Hotel, which is located in one of the most popular recreational areas in Brno, the Brno Dam Lake. The distance from public transport facilities is approx. 200 m.
In Brno, it is possible to visit numerous cultural and social events, theatres, concert halls, museums and historical monuments.

Proceedings
All accepted papers will be published in the conference proceedings and will be available at the conference. The official language of the conference is English.

Fees
The cost of an integrated package including the conference fee, accommodation (three nights), meals (starting with dinner on September 11 and finishing with lunch on September 14), and social activities is 350 USD (before July 20, 2000). After July 20, 2000 the cost of the integrated package is 400 USD. For participants from Central and Eastern Europe, including the newly independent states, the cost of the integrated package is 150 USD. Ask about the fee if your country fits this condition. For a limited number of students, the program committee will grant additional financial support. The fee for an accompanying person is 250 USD. The cost of the integrated package including some special services (bus transport from the Vienna or Prague airport to Brno and back, a guided tour, a boat trip on the dam) is 550 USD. This program can be arranged for groups only. We cannot refund part of the fee if you cannot stay for the whole period of the conference.

Paper Submission
Full papers should not exceed 8 pages. Short papers should not exceed 4 pages. Papers will be reviewed according to: technical quality, originality, clarity, appropriateness to the conference focus, and adequacy of references to related work. Authors should submit their papers electronically. In addition, one copy of the manuscript should also be sent. For details see the conference web site.

THE MINISTRY OF SCIENCE AND TECHNOLOGY OF THE REPUBLIC OF SLOVENIA
Address: Trg OF 13, 1000 Ljubljana, Tel.: +386 61 178 46 00, Fax: +386 61 178 47 19.
http://www.mzt.si, e-mail: info@mzt.si
Minister: Lojze Marinček, Ph.D.
Slovenia realises that its intellectual potential and all activities connected with its beautiful country are the basis for its future development. Therefore, the country has to give priority to the development of knowledge in all fields. The Slovenian government uses a variety of instruments to encourage scientific research and technological development and to transfer the results of research and development to the economy and other parts of society. The Ministry of Science and Technology is responsible, in co-operation with other ministries, for most public programmes in the fields of science and technology.

Within the Ministry of Science and Technology the following offices also operate:

The Slovenian Intellectual Property Office (SIPO) is in charge of industrial property, including the protection of patents, industrial designs, trademarks, copyright and related rights, and the collective administration of authorship. The Office began operating in 1992, after the Slovenian Law on Industrial Property was passed.

The Standards and Metrology Institute of the Republic of Slovenia (SMIS). By establishing and managing the systems of metrology, standardisation, conformity assessment, and the Slovenian Award for Business Excellence, SMIS ensures the basic quality elements enabling the Slovenian economy to become competitive on the global market, and Slovenian society to achieve international recognition, along with the protection of life, health and the environment.

The Office of the Slovenian National Commission for UNESCO is responsible for affairs involving Slovenia's co-operation with UNESCO, the United Nations Educational, Scientific and Cultural Organisation, the implementation of UNESCO's goals in Slovenia, and co-operation with National Commissions and bodies in other countries and with non-governmental organisations.
General Approaches - Science Policy
Educating top-quality researchers/experts and increasing their number, increasing the extent of research activity and achieving a balanced coverage of all the basic scientific disciplines necessary for:
- quality undergraduate and postgraduate education,
- the effective transfer and dissemination of knowledge from abroad,
- cultural, social and material development,
- promoting the application of science for national needs,
- promoting the transfer of R&D results into production and to the market,
- achieving stronger integration of research into the networks of international co-operation (resulting in the complete internationalisation of science and partly of higher education),
- broadening and deepening public understanding of science (long-term popularisation of science, particularly among the young).

General Approaches - Technology Policy
- promotion of R&D co-operation among enterprises, as well as between enterprises and the public sector,
- strengthening of the investment capacities of enterprises,
- strengthening of the innovation potential of enterprises,
- creation of an innovation-oriented legal and general societal framework,
- supporting the banking sector in financing innovation-orientated and export-orientated business,
- development of bilateral and multilateral strategic alliances,
- establishment of ties between the Slovenian R&D sector and foreign industry,
- accelerated development of professional education and the education of adults,
- protection of industrial and intellectual property.

An increase of total invested assets in R&D to about 2.5% of GDP by the year 2000 is planned (of this, half is to be obtained from public sources, with the remainder to come from the private sector). Regarding the development of technology, Slovenia is one of the most technologically advanced countries in Central Europe and has a well-developed research infrastructure.
This has led to a significant growth in the export of high-tech goods. There is also a continued emphasis on the development of R&D across a wide field, which is leading to the foundation and construction of technology parks (high-tech business incubators), technology centres (technology-transfer units within public R&D institutions) and small private enterprise centres for research.

R&D Human Potential
There are about 750 R&D groups in the public and private sector, of which 102 research groups are at 17 government (national) research institutes, 340 research groups are at universities and 58 research groups are at medical institutions. The remaining R&D groups are located in business enterprises (175 R&D groups) or are run by about 55 public and private non-profit research organizations. According to the data of the Ministry of Science and Technology there are about 7000 researchers in Slovenia. The majority (43%) are lecturers working at the two universities, 15% of researchers are employed at government (national) research institutes, 22% at other institutions and 20% in research and development departments of business enterprises.

JOŽEF STEFAN INSTITUTE
Jožef Stefan (1835-1893) was one of the most prominent physicists of the 19th century. Born to Slovene parents, he obtained his Ph.D. at Vienna University, where he was later Director of the Physics Institute, Vice-President of the Vienna Academy of Sciences and a member of several scientific institutions in Europe. Stefan explored many areas in hydrodynamics, optics, acoustics, electricity, magnetism and the kinetic theory of gases. Among other things, he originated the law that the total radiation from a black body is proportional to the 4th power of its absolute temperature, known as the Stefan-Boltzmann law.
The Jožef Stefan Institute (JSI) is the leading independent scientific research institution in Slovenia, covering a broad spectrum of fundamental and applied research in the fields of physics, chemistry and biochemistry, electronics and information science, nuclear science technology, energy research and environmental science.

The Jožef Stefan Institute (JSI) is a research organisation for pure and applied research in the natural sciences and technology. Both are closely interconnected in research departments composed of different task teams. Emphasis in basic research is given to the development and education of young scientists, while applied research and development serve for the transfer of advanced knowledge, contributing to the development of the national economy and society in general.

At present the Institute, with a total of about 700 staff, has 500 researchers, about 250 of whom are postgraduates, over 200 of whom have doctorates (Ph.D.), and around 150 of whom have permanent professorships or temporary teaching assignments at the Universities.

In view of its activities and status, the JSI plays the role of a national institute, complementing the role of the universities and bridging the gap between basic science and applications.

Research at the JSI includes the following major fields: physics; chemistry; electronics, informatics and computer sciences; biochemistry; ecology; reactor technology; applied mathematics. Most of the activities are more or less closely connected to information sciences, in particular computer sciences, artificial intelligence, language and speech technologies, computer-aided design, computer architectures, biocybernetics and robotics, computer automation and control, professional electronics, digital communications and networks, and applied mathematics.

The Institute is located in Ljubljana, the capital of the independent state of Slovenia (or S♥nia).
The capital today is considered a crossroad between East, West and Mediterranean Europe, offering excellent productive capabilities and solid business opportunities, with strong international connections. Ljubljana is connected to important centers such as Prague, Budapest, Vienna, Zagreb, Milan, Rome, Monaco, Nice, Bern and Munich, all within a radius of 600 km.

In the last year on the site of the Jožef Stefan Institute, the Technology park "Ljubljana" has been proposed as part of the national strategy for technological development to foster synergies between research and industry, to promote joint ventures between university bodies, research institutes and innovative industry, to act as an incubator for high-tech initiatives and to accelerate the development cycle of innovative products.

At the present time, part of the Institute is being reorganized into several high-tech units supported by and connected within the Technology park at the Jožef Stefan Institute, established as the beginning of a regional Technology park "Ljubljana". The project is being developed at a particularly historical moment, characterized by the process of state reorganisation, privatisation and private initiative. The national Technology Park will take the form of a shareholding company and will host an independent venture-capital institution.

The promoters and operational entities of the project are the Republic of Slovenia, Ministry of Science and Technology and the Jožef Stefan Institute. The framework of the operation also includes the University of Ljubljana, the National Institute of Chemistry, the Institute for Electronics and Vacuum Technology and the Institute for Materials and Construction Research among others. In addition, the project is supported by the Ministry of Economic Relations and Development, the National Chamber of Economy and the City of Ljubljana.
Jožef Stefan Institute
Jamova 39, 61000 Ljubljana, Slovenia
Tel.: +386 61 1773 900, Fax: +386 61 219 385
Tlx.: 31 296 JOSTIN SI
WWW: http://www.ijs.si
E-mail: matjaz.gams@ijs.si
Contact person for the Park: Iztok Lesjak, M.Sc.
Public relations: Natalija Polenec

CONTENTS OF Informatica

Ankerst, M., C. Elsen, M. Ester & H.P. Kriegel. 1999. Perception-based classification. Informatica 23:493-499.
Batagelj, V., A. Ferligoj & P. Doreian. 1999. Generalized blockmodeling. Informatica 23:501-506.
Battou, A., B. Khan & S. Mountcastle. 1999. PNNI and the optimal design of high-speed ATM networks. Informatica 23:565-573.
Battou, A. & B. Khan. 1999. PNNI and the optimal design of high-speed ATM networks. Informatica 23:359-367.
Bevilacqua, A. 1999. A dynamic load balancing method on a heterogeneous cluster of workstations. Informatica 23:49-56.
Bohanec, M. & V. Rajkovič. 1999. Multi-attribute decision modeling: industrial applications of DEX. Informatica 23:487-491.
Chung, K.-L. & J.-G. Wu. 1999. Improved representations for spatial data structures and their manipulations. Informatica 23:211-221.
Claramunt, C. & M. Mainguenaud. 1999. A revisited database projection operator for network facilities in a GIS. Informatica 23:187-201.
Clement, B.E.P., P.V. Coveney, M. Jessel & P.J. Marcer. 1999. The brain as a Huygens machine. Informatica 23:389-398.
Cselényi, I. & R. Szabo. 1999. Service specific information based resource allocation for multimedia applications. Informatica 23:317-324.
Dahl, V., S. Rochefort, M. Scurtescu & P. Tarau. 1999. A Spanish interface to LogiMoo: towards multilingual virtual worlds. Informatica 23:531-542.
Dai, H. 1999. Extended predicate logic and its application in designing MKL language. Informatica 23:289-299.
De Florio, V., G. Deconinck & R. Lauwereins. 1999. An application-level dependable technique for farmer-worker parallel programs. Informatica 23:275-281.
Dougherty, J.P. 1999. Structured performability analysis of parallel applications. Informatica 23:107-111.
Forlizzi, L. & E. Nardelli. 1999. Characterization results for the poset based representation of topological relations - I: Introduction and models. Informatica 23:223-237.
Gams, M. 1999. Information society and the intelligent systems generation. Informatica 23:449-454.
Havran, V. 1999. Analysis and cache sensitive representation for binary space partitioning trees. Informatica 23:203-210.
Hedley, N.R., C.H. Drew, E.A. Arfin & A. Lee. 1999. Hagerstrand revisited: Interactive space-time visualization of complex spatial data. Informatica 23:155-168.
Helman, D.R. & J. JáJá. 1999. Sorting on clusters of SMPs. Informatica 23:113-121.
Hlupic, V. 1999. Discrete-event simulation software: A comparison of users' surveys. Informatica 23:249-258.
Iizuka, K.H. & M. Wada. 1999. Customer satisfaction of information system integration business in Japan. Informatica 23:473-476.
Jereb, E. & M. Gradišar. 1999. Research on telework in Slovenia. Informatica 23:137-142.
Jereb, E. & B. Šmitek. 1999. Using an electronic book in distance education. Informatica 23:483-486.
Kapus-Kolar, M. 1999. More efficient functionality decomposition in LOTOS. Informatica 23:259-273.
Katevenis, M.G.H., E.P. Markatos, P. Vatsolaki & C. Xanthaki. 1999. The remote enqueue operation on networks of workstations. Informatica 23:29-39.
Kebbal, D., E.G. Talbi & J.M. Geib. 1999. Fault tolerance of parallel adaptive applications in heterogeneous systems. Informatica 23:77-85.
Klobučar, T. & B. Jerman-Blažič. 1999. An infrastructure for support of digital signatures. Informatica 23:477-481.
Korenjak-Černe, S. 1999. Adapted methods for clustering large datasets of mixed units. Informatica 23:507-511.
Kremien, O., K. Michael & E. Irit. 1999. Preserving mutual interests in high performance computing clusters. Informatica 23:41-48.
Krisper, M. & T. Zrimec. 1999. Modelling of an information society in transition - Slovenia's position in the CE countries. Informatica 23:467-471.
Kwong, P. & S. Majumdar. 1999. Scheduling of I/O in multiprogrammed parallel systems. Informatica 23:67-76.
Laurini, R., K.-J. Li, S. Servigne & M.-A. Kang. 1999. Modeling an auditory urban database with a field-oriented approach. Informatica 23:169-185.
Lin, W.-M. & W. Xie. 1999. Minimizing communication conflicts with load-skewing task assignment techniques on network of workstations. Informatica 23:57-66.
Liotopoulos, F.K. 1999. Issues on gigabit switching using 3-stage Clos networks. Informatica 23:335-346.
Maleković, M. 1999. Agent properties in multiagent systems. Informatica 23:283-288.
Manolakos, E.S. & D.G. Galatopoullos. 1999. JavaPorts: An environment to facilitate parallel computing on a heterogeneous cluster of workstations. Informatica 23:97-105.
Marisits, T., S. Molnár & G. Fodor. 1999. Supporting all service classes in ATM: A novel traffic control framework. Informatica 23:305-315.
Marugesan, S. 1999. Intelligent agents on the Internet and Web: Applications and prospects. Informatica 23:437-443.
Mickle, M.H. 1999. On the determination of absolute network performance. Informatica 23:383-387.
Mordonini, M. & A. Poggi. 1999. SISTER: a flexible system for image retrieval. Informatica 23:549-558.
Nančovska, I., A. Jeglič, D. Fefer & L. Todorovski. 1999. Equation discovery system and neural networks for short-term DC voltage prediction. Informatica 23:513-520.
Nong, G., M. Hamdi & J.K. Muppala. 1999. Performance evaluation of a scheduling algorithm for multiple input-queued ATM switches. Informatica 23:369-381.
Omondi, A.R. 1999. Floating-point arithmetic and the IEEE-754 standard, I: Number-system design. Informatica 23:413-429.
Provost, F. & A. Pohoreckyj Danyluk. 1999. Problem definition, data cleaning, and evaluation: A classifier learning case study. Informatica 23:123-136.
Raković, D., M. Tomašević, E. Jovanov, V. Radivojević, P. Šuković, Ž. Martinović, M. Car, D. Radenović, Z. Jovanović-Ignjatić & L. Škarić. 1999. Electroencephalographic (EEG) correlates of some activities which may alter consciousness: The transcendental meditation technique, musicogenic states, microwave resonance relaxation, healer/healee interaction, and alertness/drowsiness. Informatica 23:399-412.
Rayhan, A., F. Elguibaly & A. Almulhem. 1999. Fault-tolerant ATM switch using logical neighborhood network. Informatica 23:325-334.
Sang, J. 1999. High-performance cluster computing over gigabit/fast Ethernet. Informatica 23:19-27.
Setz, T. 1999. Fault tolerant execution of computer-intensive distributed applications in LiPS. Informatica 23:87-95.
Sha, D. & V.B. Bajić. 1999. Adaptive on-line ANN learning algorithm and application to identification of non-linear systems. Informatica 23:521-529.
Shen, K., W. Liang & J. Ng. 1999. Efficient computation of frequent itemsets in a subcollection of multiple set families. Informatica 23:543-547.
Sicherl, P. 1999. A new perspective in comparative analysis of information society indicators. Informatica 23:455-460.
Šilc, J. & B. Robič. 1999. Asynchronous microprocessors. Informatica 23:239-247.
Vehovar, V. & M. Kovačič. 1999. Measuring information society: some methodological problems. Informatica 23:461-465.
Venkatesan, R., Y. El-Sayed, R. Thuppal & H. Sivakumar. 1999. Performance analysis of pipelined multistage interconnection networks. Informatica 23:347-357.
Zazula, D., B. Viher, D. Korošec, E. Avdičaušević, M. Lenič & B. Potočnik. 1999. Conceptual interactive learning tools based on computer simulators.
Zheng, H., R. Buyya & S. Bhattacharya. 1999. Mobile cluster computing and timeliness issues. Informatica 23:5-17.
Zupančič, B., M. Klopčič & R. Karba. 1999. Tuning of fuzzy logic controller with genetic algorithm. Informatica 23:559-564.

Editorials
Buyya, R. & M. Paprzycki. 1999. Clustering in search for scalable commodity supercomputing. Informatica 23:1-3.
Guizani, M. 1999. Introduction: Design issues of gigabit networking. Informatica 23:303-304.
Petry, F.E., M.A. Cobb & K.B. Shaw. 1999. Introduction: Special Issue on Spatial Data Management. Informatica 23:153.
Bavec, C. & M. Gams. 1999. Introduction: Information Society and Intelligent Systems. Informatica 23:447-448.

A Conference Report
Gams, M. 1999. Intelligent Agent Technology. Informatica 23:575.

Calls for Papers
Information Society - IS'99. An international multiconference. 1999. Informatica 23:143-144.
ERK'99 - Electrotechnical and Computer Science Conference. 1999. Informatica 23:145.
8th International Conference on Computer Analysis of Images and Patterns. 1999. Informatica 23:146-147.
Special Issue on Group Support Systems. 1999. Informatica 23:148.
Special Issue on Design Issues of Gigabit Networking. 1999. Informatica 23:149.
Special Issue on Advances in Simulation and Control. 1999. Informatica 23:150.
Fourth Joint Conference on Knowledge-Based Software Engineering. 1999. Informatica 23:576-577.

Professional Societies
The Ministry of Science and Technology of the Republic of Slovenia. 1999. Informatica 23:151, 301, 445, 578.
Jožef Stefan Institute. 1999. Informatica 23:152, 302, 446, 579.

INFORMATICA
AN INTERNATIONAL JOURNAL OF COMPUTING AND INFORMATICS

INVITATION, COOPERATION

Submissions and Refereeing
Please submit three copies of the manuscript with good copies of the figures and photographs to one of the editors from the Editorial Board or to the Contact Person. At least two referees outside the author's country will examine it, and they are invited to make as many remarks as possible directly on the manuscript, from typing errors to global philosophical disagreements. The chosen editor will send the author copies with remarks. If the paper is accepted, the editor will also send copies to the Contact Person. The Executive Board will inform the author that the paper has been accepted, in which case it will be published within one year of receipt of e-mails with the text in Informatica LaTeX format and figures in .eps format.
The original figures can also be sent on separate sheets. Style and examples of papers can be obtained by e-mail from the Contact Person or from FTP or WWW (see the last page of Informatica). Opinions, news, calls for conferences, calls for papers, etc. should be sent directly to the Contact Person.

QUESTIONNAIRE
Please complete the order form and send it to Dr. Rudi Murn, Informatica, Institut Jožef Stefan, Jamova 39, 61111 Ljubljana, Slovenia.

Since 1977, Informatica has been a major Slovenian scientific journal of computing and informatics, including telecommunications, automation and other related areas. In its 16th year (more than five years ago) it became truly international, although it still remains connected to Central Europe. The basic aim of Informatica is to impose intellectual values (science, engineering) in a distributed organisation.

Informatica is a journal primarily covering the European computer science and informatics community - scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the Refereeing Board.

Informatica is free of charge for major scientific, educational and governmental institutions.
Others should subscribe (see the last page of Informatica).

ORDER FORM - INFORMATICA
Name: ....................................................
Title and Profession (optional): ....................................................
Home Address and Telephone (optional): ....................................................
Office Address and Telephone (optional): ....................................................
E-mail Address (optional): ....................................................
Signature and Date: ....................................................

Informatica WWW:
http://ai.ijs.si/informatica/
http://orca.st.usm.edu/informatica/

Referees:
Witold Abramowicz, David Abramson, Adel Adi, Kenneth Aizawa, Suad Alagič, Alan Aliu, Richard Amoroso, John Anderson, Hans-Jurgen Appelrath, Grzegorz Bartoszewicz, Catriel Beeri, Daniel Beech, Fevzi Belli, Istvan Berkeley, Azer Bestavros, Balaji Bharadwaj, Jacek Blazewicz, Laszlo Boeszoermenyi, Damjan Bojadžijev, Jeff Bone, Ivan Bratko, Jerzy Brzezinski, Marian Bubak, Leslie Burkholder, Frada Burstein, Wojciech Buszkowski, Rajkumar Buyya, Netiva Caftori, Jason Ceddia, Ryszard Choras, Wojciech Cellary, Wojciech Chybowski, Andrzej Ciepielewski, Vic Ciesielski, David Cliff, Travis Craig, Noel Craske, Matthew Crocker, Tadeusz Czachorski, Milan Češka, Honghua Dai, Deborah Dent, Andrej Dobnikar, Sait Dogru, Georg Dorfner, Ludoslaw Drelichowski, Matija Drobnič, Maciej Drozdowski, Marek Druzdzel, Jozo Dujmović, Pavol Ďuriš, Hesham El-Rewini, Warren Fergusson, Pierre Flener, Wojciech Fliegner, Vladimir A. Fomichov, Terrence Forgarty, Hans Fraaije, Hugo de Garis, Eugeniusz Gatnar, James Geller, Michael Georgiopoulos, Jan Golinski, Janusz Gorski, Georg Gottlob, David Green, Herbert Groiss, Inman Harvey, Elke Hochmueller, Rod Howell, Tomas Hruška, Alexey Ippa, Ryszard Jakubowski, Piotr Jedrzejowicz, A. Milton Jenkins, Eric Johnson, Polina Jordanova, Djani Juričič, Subhash Kak, Li-Shan Kang, Roland Kaschek, Jacek Kierzenka, Jan Kniat, Stavros Kokkotos, Kevin Korb, Gilad Koren, Henryk Krawczyk, Ben Kroese, Zbyszko Krolikowski, Benjamin Kuipers, Matjaž Kukar, Aarre Laakso, Phil Laplante, Bud Lawson, Ulrike Leopold-Wildburger, Joseph Y-T. Leung, Xuefeng Li, Alexander Linkevich, Raymond Lister, Doug Locke, Peter Lockeman, Matija Lokar, Jason Lowder, Kim Teng Lua, Andrzej Malachowski, Bernardo Magnini, Peter Marcer, Andrzej Marciniak, Witold Marciszewski, Vladimir Marik, Jacek Martinek, Tomasz Maruszewski, Florian Matthes, Timothy Menzies, Dieter Merkl, Zbigniew Michalewicz, Roland Mittermeir, Madhav Moganti, Reinhard Moller, Tadeusz Morzy, Daniel Mossé, John Mueller, Hari Narayanan, Rance Necaise, Elzbieta Niedzielska, Marian Niedźwiedziński, Jaroslav Nieplocha, Jerzy Nogieč, Stefano Nolfi, Franc Novak, Antoni Nowakowski, Adam Nowicki, Tadeusz Nowicki, Hubert Österle, Wojciech Olejniczak, Jerzy Olszewski, Cherry Owen, Mieczyslaw Owoc, Tadeusz Pankowski, William C. Perkins, Warren Persons, Mitja Peruš, Stephen Pike, Niki Pissinou, Uliin Place, Gustav Pomberger, James Pomykalski, Gary Preckshot, Dejan Rakovič, Cveta Razdevšek Pučko, Ke Qiu, Michael Quinn, Gerald Quirchmayer, Luc de Raedt, Ewaryst Rafajlowicz, Sita Ramakrishnan, Wolf Rauch, Peter Rechenberg, Felix Redmill, David Robertson, Marko Robnik, Ingrid Russell, A.S.M. Sajeev, Bo Sanden, Vivek Sarin, Iztok Savnik, Walter Schempp, Wolfgang Schreiner, Guenter Schmidt, Heinz Schmidt, Dennis Sewer, Zhongzhi Shi, William Spears, Hartmut Stadtler, Oliviero Stock, Janusz Stokłosa, Przemyslaw Stpiczynski, Andrej Stritar, Maciej Stroinski, Tomasz Szmuc, Zdzislaw Szyjewski, Jure Šilc, Metod Škarja, Jiří Šlechta, Chew Lim Tan, Zahir Tari, Jurij Tasič, Piotr Teczynski, Stephanie Teufel, Ken Tindell, A Min Tjoa, Wieslaw Traczyk, Roman Trobec, Marek Tudruj, Andrej Ule, Amjad Umar, Andrzej Urbanski, Marko Uršič, Tadeusz Usowicz, Elisabeth Valentine, Kanonkluk Vanapipat, Alexander P. Vazhenin, Zygmunt Vetulani, Olivier de Vel, John Weckert, Gerhard Widmer, Stefan Wrobel, Stanislaw Wrycza, Janusz Zalewski, Damir Zazula, Yanchun Zhang, Zonling Zhou, Robert Zorc, Anton P. Železnikar

EDITORIAL BOARDS, PUBLISHING COUNCIL

Informatica is a journal primarily covering the European computer science and informatics community; scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees.
Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors.

The coordination necessary is made through the Executive Editors who examine the reviews, sort the accepted articles and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Science and Technology.

Each author is guaranteed to receive the reviews of his article. When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article.

Executive Editor - Editor in Chief
Anton P. Železnikar
Volaričeva 8, Ljubljana, Slovenia
s51em@lea.hamradio.si
http://lea.hamradio.si/~s51em/

Executive Associate Editor (Contact Person)
Matjaž Gams, Jožef Stefan Institute
Jamova 39, 61000 Ljubljana, Slovenia
Phone: +386 61 1773 900, Fax: +386 61 219 385
matjaz.gams@ijs.si
http://www2.ijs.si/~mezi/matjaz.html

Executive Associate Editor (Technical Editor)
Rudi Murn, Jožef Stefan Institute

Publishing Council:
Tomaž Banovec, Ciril Baškovič, Andrej Jerman-Blažič, Jožko Čuk, Jernej Virant

Editorial Board
Suad Alagić (Bosnia and Herzegovina), Vladimir Bajić (Republic of South Africa), Vladimir Batagelj (Slovenia), Francesco Bergadano (Italy), Leon Birnbaum (Romania), Marco Botta (Italy), Pavel Brazdil (Portugal), Andrej Brodnik (Slovenia), Ivan Bruha (Canada), Se Woo Cheon (Korea), Hubert L. Dreyfus (USA), Jozo Dujmović (USA), Johann Eder (Austria), Vladimir Fomichov (Russia), Georg Gottlob (Austria), Janez Grad (Slovenia), Francis Heylighen (Belgium), Hiroaki Kitano (Japan), Igor Kononenko (Slovenia), Miroslav Kubat (Austria), Ante Lauc (Croatia), Jadran Lenarčič (Slovenia), Huan Liu (Singapore), Ramon L. de Mantaras (Spain), Magoroh Maruyama (Japan), Nikos Mastorakis (Greece), Angelo Montanari (Italy), Igor Mozetič (Austria), Stephen Muggleton (UK), Pavol Navrat (Slovakia), Jerzy R. Nawrocki (Poland), Roumen Nikolov (Bulgaria), Marcin Paprzycki (USA), Oliver Popov (Macedonia), Karl H. Pribram (USA), Luc De Raedt (Belgium), Dejan Raković (Yugoslavia), Jean Ramaekers (Belgium), Wilhelm Rossak (USA), Ivan Rozman (Slovenia), Claude Sammut (Australia), Sugata Sanyal (India), Walter Schempp (Germany), Johannes Schwinn (Germany), Zhongzhi Shi (China), Branko Souček (Italy), Oliviero Stock (Italy), Petra Stoerig (Germany), Jiří Šlechta (UK), Gheorghe Tecuci (USA), Robert Trappl (Austria), Terry Winograd (USA), Stefan Wrobel (Germany), Xindong Wu (Australia)

Board of Advisors:
Ivan Bratko, Marko Jagodič, Tomaž Pisanski, Stanko Strmčnik

An International Journal of Computing and Informatics

Title                                                                      Author(s)                 Page
Introduction                                                                                         447
Information Society And The Intelligent Systems Generation                 M. Gams                   449
A New Perspective In Comparative Analysis Of Information Society Indicators  P. Sicherl              455
Measuring Information Society: Some Methodological Problems                V. Vehovar, M. Kovačič    461
Modelling Of An Information Society In Transition - Slovenia's Position In The CE Countries  M. Krisper, T. Zrimec  467
Customer Satisfaction Of Information System Integration Business In Japan  K.H. Iizuka, M. Wada      473
An Infrastructure For Support Of Digital Signatures                        T. Klobučar et al.        477
Using An Electronic Book In Distance Education                             E. Jereb, B. Šmitek       483
Multi-Attribute Decision Modeling: Industrial Applications of DEX          M. Bohanec, V. Rajkovič   487
Perception-Based Classification                                            M. Ankerst et al.         493
Generalized Blockmodeling                                                  V. Batagelj et al.        501
Adapted Methods For Clustering Large Datasets ...                          S. Korenjak-Černe         507
Equation Discovery System And Neural...                                    I. Nančovska et al.       513
Adaptive On-line ANN Learning Algorithm And...                             D. Sha et al.             521
A Spanish Interface To LogiMoo: Towards ...                                V. Dahl et al.            531
Efficient Computation Of Frequent Itemsets In...                           H. Shen et al.            543
SISTER: A Flexible System For Image Retrieval                              M. Mordonini et al.       549
Tuning Of Fuzzy Logic Controller With...                                   B. Zupančič et al.        559
PNNI And The Optimal Design Of High-speed ...                              A. Battou et al.          565
Reports and Announcements                                                                            575