Informatica
An International Journal of Computing and Informatics

Rough Sets: Facts Versus Misconceptions
Integrating Knowledge and Database
Distributed Shared Memory
Neural Networks in Econometric Analysis
Book Review: The Conscious Mind

The Slovene Society Informatika, Ljubljana, Slovenia

Informatica
An International Journal of Computing and Informatics

Basic info about Informatica and back issues may be FTP'ed from ftp.arnes.si in magazines/informatica (ID: anonymous, PASSWORD: ). The FTP archive may also be accessed with WWW (World Wide Web) clients at URL: http://www2.ijs.si/~mezi/informatica.html

Subscription Information: Informatica (ISSN 0350-5596) is published four times a year in Spring, Summer, Autumn, and Winter (4 issues per year) by the Slovene Society Informatika, Vožarski pot 12, 61000 Ljubljana, Slovenia. The subscription rate for 1996 (Volume 20) is DEM 50 (US$ 35) for institutions, DEM 25 (US$ 17) for individuals, and DEM 10 (US$ 7) for students, plus the mail charge of DEM 10 (US$ 7). Claims for missing issues will be honored free of charge within six months after the publication date of the issue.

LaTeX technical support: Borut Žnidar, DALCOM d.o.o., Stegne 27, 61000 Ljubljana, Slovenia.
Lectorship: Fergus F. Smith, AMIDAS d.o.o., Cankarjevo nabrežje 11, Ljubljana, Slovenia.
Printed by Biro M, d.o.o., Žibertova 1, 61000 Ljubljana, Slovenia.

Orders for subscription may be placed by telephone or fax using any major credit card. Please call Mr. R. Murn, Jožef Stefan Institute: Tel (+386) 61 1773 900, Fax (+386) 61 219 385, or use the bank account number 900-27620-5159/4, Ljubljanska banka d.d. Slovenia (LB 50101-678-51841 for domestic subscribers only).

According to the opinion of the Ministry for Informing (number 23/216-92 of March 27, 1992), the scientific journal Informatica is a product of informative matter (point 13 of tariff number 3), for which the turnover tax amounts to 5%.

Informatica is published in cooperation with the following societies (and contact persons):
Robotics Society of Slovenia (Jadran Lenarčič)
Slovene Society for Pattern Recognition (Franjo Pernuš)
Slovenian Artificial Intelligence Society (Matjaž Gams)
Slovenian Society of Mathematicians, Physicists and Astronomers (Bojan Mohar)
Automatic Control Society of Slovenia (Borut Zupančič)
Slovenian Association of Technical and Natural Sciences (Janez Peklenik)

Informatica is surveyed by: AI and Robotic Abstracts, AI References, ACM Computing Surveys, Applied Science & Techn. Index, COMPENDEX*PLUS, Computer ASAP, Cur. Cont. & Comp. & Math. Sear., Engineering Index, INSPEC, Mathematical Reviews, Sociological Abstracts, Uncover, Zentralblatt für Mathematik, Linguistics and Language Behaviour Abstracts, Cybernetica Newsletter.

The issuing of the Informatica journal is financially supported by the Ministry for Science and Technology, Slovenska 50, 61000 Ljubljana, Slovenia.

Distributed Shared Memory on Loosely Coupled Systems

Vicente Cholvi-Juan
Department of Computer Science, University Jaume I
Campus Penyeta Roja, Castellò, Spain
E-mail: vcholvi@inf.uji.es

AND

Roy Campbell
Department of Computer Science, University of Illinois at Urbana-Champaign
1304 W. Springfield Av, Urbana, IL 61801
E-mail: roy@cs.uiuc.edu
Keywords: distributed systems, distributed shared memory, concurrency, operating systems
Edited by: Rudi Murn
Received: April 4, 1996  Revised: November 5, 1996  Accepted: November 29, 1996

The distributed shared memory model (DSMM) is considered a feasible alternative to the traditional communication model (CM), especially in loosely coupled distributed systems. While the CM is usually considered a low-level model, the DSMM provides a shared address space that can be used in the same way as local memory. This paper provides a taxonomy of distributed shared memory systems, focusing on different implementations and the factors which affect the behavior of those implementations.

1 Introduction

Many computational problems benefit from the availability of parallel-processing power: the computational problem is split into subproblems and each one is solved concurrently. There are many multiprocessor computers, ranging from only a few to thousands of processors. Typically, such a multicomputer is much more expensive than a collection of loosely coupled computers, each having only a few processors. The main advantage of the large multicomputer systems is the speed of the interconnection network joining their processors. However, trends in network technology will make it possible to have high-performance networks joining loosely coupled systems. In fact, the number of loosely coupled distributed systems being used as parallel computers is quickly increasing [4, 12, 32]. Thus, such systems constitute a low-cost entry into the parallel computing domain without necessarily requiring special (and often expensive) hardware. They can be easily upgraded and customized, and even though the performance gap between them and supercomputers is still relatively big, a notable reduction is expected as high-speed networks (e.g., ATM or HiPPI networks) become more popular. We will focus our work on this type of system.

A typical (loosely coupled) distributed system is composed of a collection of independent computers interconnected through some type of network. To cooperate, applications written to operate on the computers of such a system need some mechanism that allows each of their processes to exchange information.

Under the communication model (CM) [17, 18, 28], this information exchange is accomplished by messages: a given process may use the following primitives,

- send(data,address)
- receive(data)

[Figure 1: Distributed Shared Memory (DSM). The figure shows the memories of several nodes joined by an interconnection network, together forming a single distributed shared memory.]

The CM provides the programmers with explicit control over the communication, making it relatively easy to overlap communication with computation. Nevertheless, that explicit control constitutes the main disadvantage of the CM [17, 18], as it increases its complexity. Thus, the source process of a message must know the target processes. In addition, target processes must exist when data is sent, and must eventually be able to receive that data. Finally, each process must dynamically extract its state when receiving random messages.
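As a minimal illustration of the two primitives (this sketch is ours and is not taken from any of the systems cited; queues simply stand in for the network and the "addresses" are ordinary dictionary keys):

```python
# Minimal sketch of the communication model's send/receive primitives.
# Queues stand in for the network; "addresses" are just dictionary keys.
# All names here are illustrative, not taken from any particular system.
import threading
import queue

mailboxes = {"node_a": queue.Queue(), "node_b": queue.Queue()}

def send(data, address):
    mailboxes[address].put(data)        # explicit: the sender must know the target

def receive(address):
    return mailboxes[address].get()     # blocks until a message arrives

def worker_b():
    request = receive("node_b")         # the target must exist and be ready to receive
    send({"reply_to": request}, "node_a")

threading.Thread(target=worker_b, daemon=True).start()
send("block 17?", "node_b")
print(receive("node_a"))                # -> {'reply_to': 'block 17?'}
```

Even in this tiny exchange, the programmer has to name the target explicitly and arrange for it to be listening, which is exactly the burden described above.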
On the other hand, the shared memory model (SMM) [51] provides a shared address space which can be used by processes in the same way as local memory, even if they are executed concurrently on different processors. Thus, every process can access any address by means of two basic operations:

- data = read(address)
- write(address,data)

read returns the data stored at address, and write associates data with address.

Using the SMM has several important benefits. In the first place, it hides the particular communication mechanisms employed; thus application developers do not need to be involved in the management of messages, or to know whether the application runs on a multiprocessor or on a distributed system (they should know, however, the cost of exchanging information, so they can decide on an efficient partition). Besides, it allows complex shared structures to be passed by reference, providing a simple and well-known paradigm.

When an SMM is built on top of a distributed system, we get what is known as a DSMM. Even though a DSMM is built on top of a CM (suggesting a decrease in performance), it has been shown that the DSMM can perform well [15]. Factors such as high locality of references [23] allow communication costs to be amortized over multiple accesses. Multiple replicas can also reduce transfers between nodes, while distributing the communication over a larger interval of time (transfers of data are made on demand), increasing concurrency. Of course, those paradigms do not have to be mutually exclusive. Indeed, systems such as SAM [49], Locust [19] and CarlOS [38] support the DSMM while providing at the same time mechanisms for communication and synchronization.

The rest of the paper is organized as follows: Section 2.1 contains an overview of different approaches to implementing the DSMM. Section 2.2 addresses implementation mechanisms. Section 2.3 focuses on the problem of consistency between shared units, while Section 2.4 analyzes the importance of the structure of the shared units. Finally, in Section 3 we give some concluding remarks and suggest future research directions.

2 Characterization of the DSMM

As we have pointed out previously, the DSMM has to be built on the CM in such a manner that it transforms memory access requests into messages between processes. There are many factors that affect the way such transformations take place. In the next sections we identify the principal issues that characterize the behavior of DSM systems, presenting some of the proposed implementations.

2.1 Implementation Approaches

The field of research in DSM systems was opened up in 1985 by D.R. Cheriton [17]. Since then, a great deal of work has been done in that area. The earliest DSM systems implemented the DSMM principally by using operating system resources, through virtual memory management mechanisms. IVY [43, 44] constitutes a classical example of a system that implements the DSMM by adding coherence mechanisms¹ to a distributed demand-paging policy. More recently, Choices [48] incorporates custom-designed distributed virtual memory protocols for different applications, which can be altered to trade off characteristics such as resilience to packet loss, network loading, etc. In the same way, the virtual memory management system of Mach [47, 54], a well-known operating system kernel that runs on a wide variety of architectures, is designed to be architecture and operating system independent, allowing programmers to handle memory directly as a system resource. Thus, individual memory manager systems that implement the DSMM can be customized for specific applications (e.g., Agora [11] or Midway [10]).
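The demand-driven idea behind these operating-system-level implementations can be sketched schematically as follows (the sketch is ours, not taken from IVY, Choices or Mach; real systems trap the access through virtual-memory protection rather than an explicit test):

```python
# Schematic simulation of a demand-paging DSM layer (IVY-like idea):
# reading a block that is not cached locally raises a "block fault",
# which is served by fetching the block from a manager node. Real systems
# trap the access with page protection instead of an explicit membership test.
class ManagerNode:
    """Owns the master copy of every block (simplified single manager)."""
    def __init__(self):
        self.blocks = {}

    def fetch(self, block_id):
        return self.blocks.get(block_id, bytes(4096))   # default: zero-filled page

    def store(self, block_id, data):
        self.blocks[block_id] = data

class DsmNode:
    def __init__(self, manager):
        self.manager = manager
        self.cache = {}                                  # locally present blocks

    def read(self, block_id):
        if block_id not in self.cache:                   # block fault
            self.cache[block_id] = self.manager.fetch(block_id)
        return self.cache[block_id]

    def write(self, block_id, data):
        self.cache[block_id] = data
        self.manager.store(block_id, data)               # write-through, no replicas

manager = ManagerNode()
node = DsmNode(manager)
node.write(17, b"hello")
print(node.read(17))                                     # served from the local cache
```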
Another approach consists of making use of hardware components. For instance, MemNet [22, 52] is an entirely hardware implementation of the DSMM. Every node has a MemNet device, attached to both the host's system bus and the network interface, and a MemNet cache (structured in blocks of 32 bytes) divided into a large cache and a reserved area. The cache is used to store blocks whose reserved area is on another node, while the reserved area is used to store blocks which have to be flushed when a cache area becomes full. On every memory access, the local MemNet device decides whether it can handle that request alone. If it needs the cooperation of other devices, it sends a message and blocks the node until a reply is received. That message circulates through the network (a token ring) and is inspected by every MemNet device (thus, the maximum reply time is bounded). For a read access, the first MemNet device with a copy will send it to the requesting node; for a write access, it will in addition be necessary to invalidate all the replicas in order to maintain some type of consistency between them.

Compilers can also provide support for transforming shared accesses into primitives that manage both coherency and synchronization. Among the languages for implementing the DSMM we can mention EDS Lisp [30], an extension of an existing sequential language, and Orca [6], a new language designed from scratch in such a way that shared data structures can be accessed through higher-level operations.

(¹ Basically, they are very similar to those used in the Berkeley multiprocessor system [5].)

However, currently most of the effort is directed at implementing DSM environments. These consist of user-level libraries providing operations that programmers can use directly [21]. For instance, TreadMarks [35] constitutes a DSM environment that implements the DSMM on standard Unix systems such as SunOS and Ultrix without requiring any modification of them (the implementation is done at user level), avoiding the performance problems by focusing on reducing the communication between nodes. Also SAM [49], a shared object system for distributed memory machines, has been implemented as a C library on a variety of platforms: on the CM-5, Intel iPSC/860, Intel Paragon, IBM SP1 and on heterogeneous networks of workstations using PVM. Other DSM environments are Quarks [16] and CarlOS [38].

2.2 Implementation Issues

Placement. The DSMM provides a shared address space which can be used by processes in the same way as local memory. However, the implementation of such a shared address space requires physically placing shared units (blocks) in the local address spaces that compose the global one. That placement can be done statically, in such a way that the same block is always placed at the same node. A simple way to implement static placement consists of employing a central server which stores all the blocks and thus manages every access to them [17, 18, 51]. Unfortunately, this implementation needs twice as many messages as the CM. Besides, the central server constitutes a potential bottleneck and, although this problem can be solved by using several servers, trouble will still remain if the load is not properly distributed. Another possibility consists of using dynamic placement. In this case, blocks are transferred to the requesting node before being accessed. That approach avoids any communication between nodes if data is locally available, although it may force superfluous data transfers.
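The trade-off between the two placement policies can be made concrete with a small accounting sketch (ours; the block identifiers, trace and message counts are purely illustrative):

```python
# Sketch comparing placement policies by counting messages for one node
# that repeatedly accesses the same block. Purely illustrative accounting:
# with a static central server every access costs a request and a reply,
# while dynamic placement pays one transfer and then works locally.
def static_central_server(accesses):
    return 2 * len(accesses)                 # request + reply per access

def dynamic_placement(accesses):
    local, messages = set(), 0
    for block in accesses:
        if block not in local:               # block fault: fetch and keep the block
            messages += 2
            local.add(block)
    return messages

trace = [17] * 100                           # high locality: same block 100 times
print(static_central_server(trace))          # 200 messages
print(dynamic_placement(trace))              # 2 messages
```

Under low locality, of course, dynamic placement loses this advantage and may even generate superfluous transfers, as noted above.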
Location. While finding blocks can be done in a straightforward way when using static placement, if the placement is dynamic it is necessary to keep track of the circulating blocks. As in the placement of blocks, the simplest way of controlling circulation consists of using a single node. But, analogously to that case, if the node becomes heavily loaded, the entire system will also become overloaded. That problem can also be solved by using several controller nodes, but the effectiveness of that solution will still depend on the proper distribution of load. Also, it requires maintaining a mechanism to find the proper controller node, thus loading the system with a new task.

Replication. To increase concurrency, most DSM systems support replication of data. That allows different processes to use the same data at the same time. However, in order to guarantee the consistency of shared data, systems using replication must exercise control over the replicas. That control can be done by invalidating outdated replicas, as in systems such as IVY [43] or Clouds [36], or by propagating data to outdated replicas. Stumm et al. [1, 51] have proposed several algorithms intended to propagate values. Basically they use a single node, varying only the moment when the propagation takes place. Whereas propagation is more expensive than invalidation, because data have to be sent in addition to the invalidating messages, with invalidation each block fault (a block fault happens when a request cannot be served locally) starts a process that creates a new replica, thus increasing latency.

Application Customization. Application-specific protocols constitute a well-known approach to improving performance [17, 18]. However, although they have been shown to be an efficient means of reducing the extra communication of general-purpose protocols [26], they require writing protocols from scratch, which has also been shown to be difficult and error-prone. System-provided protocols, even though offering reduced performance, seem to be a compromise solution to that problem. Indeed, experimental studies of several shared-memory parallel programs [7, 15] support the hypothesis that a system employing a type-specific memory coherency scheme may outperform systems using only a single mechanism. Nevertheless, that technique requires a relatively small number of identifiable patterns that characterize the behavior of the majority of blocks (so that customized mechanisms can be developed).

Fault tolerance. Fault tolerance and error recovery are topics also addressed by using the DSMM. Let us introduce the approach taken by Wu and Fuchs [53]. They have designed a recoverable distributed virtual memory system which stands up to fail-stop processors [50] without any global restart. To do that they use security copies that store the data necessary to restart the execution [8]. Given that every process shares the global memory, a backward propagation might be needed if each process simply creates an independent security copy [37]. That happens if a process, after creating a security copy, modifies the value of a page and sends it to another process. Then, if the first process fails, the second one will have to revert to a security copy created prior to that failure. To solve this problem, every node creates a security copy before sending any page modified since the last checkpoint (the operating system or even the program can also create additional copies). That is done by using twin disk pages.
One of them is a security copy. The other is either a working copy or a wrong copy (due to a failure or because it is an old security copy). Thus, on every restart, the "right" page is chosen, which avoids a backward propagation because data do not have to be invalidated in any node. However, to develop truly reliable systems, both processor and memory failures must be considered. Along this line, Hoepman et al. [33] have addressed the construction of self-stabilizing wait-free shared memory objects (such objects occur naturally in systems in which both processors and memory may be faulty).

2.3 Coherency Models

As has been previously pointed out, the use of replication may increase concurrency. In turn, it is necessary to maintain some kind of coherency between the replicas. This problem is similar to the cache coherency problem in multiprocessor systems [5, 24], where several processors share the same data in local caches. In that case, the size of the caches is relatively small, the connections are fast and the coherency protocols are implemented in hardware. On the contrary, in distributed systems the communication cost is higher, and the coherency protocols are usually implemented in software.

A memory coherency model is characterized by its constraints on the initiation and completion of memory accesses [20]. Depending on the properties guaranteed by the coherency model, algorithms will vary in complexity. Programmers must ensure that accesses to data conform to the rules of the model. Basically, coherency models can be split into non-synchronized and synchronized. Non-synchronized models use only read and write operations, while synchronized ones have, in addition, other operations (synchronizations) intended to enforce dependencies at specific points. Whereas most systems support only one coherency model, there are systems which support multiple coherency models within a single parallel program. For instance, Midway [10], which has been implemented using Mach 3.0 with CMU's Unix server on MIPS R3000-based DECstation 5000/120s, supports release consistency, entry consistency and processor consistency (described below).

2.3.1 Non-Synchronized Models

One of the most widely known non-synchronized models is the atomic model. It was formalized by Lamport [41] in the case of one writer, and by Misra [46] in the case of several writers. Also, the linearizability condition for objects introduced by Herlihy and Wing [31] is equivalent to the atomic model when restricted to objects that support read and write operations. This model requires each read operation to obtain the "most recently written" value. It also preserves the "real-time" ordering of operations without blocking every process while an operation is taking place. An interesting property of this model is that to guarantee that a system is atomic, it is enough to guarantee that each variable in isolation is atomic, i.e., the atomic model is compositional.

The sequential model [40] resembles the atomic one, although it does not preserve any kind of global order between operations (only operations from the same process are forced to preserve real-time ordering). Sequential memory, contrary to what happens with atomic memory, does not satisfy the compositional property. Thus, in contrast with the atomic model, it is not possible in general to obtain a sequential system out of the composition of independent sequential components.
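A standard two-process litmus test (our illustration, not taken from the paper) makes concrete what the sequential model does and does not allow: each process writes one shared variable and then reads the other. The sketch below enumerates every interleaving a sequential memory admits and the read values they produce.

```python
# Two-process litmus test for the sequential model. Each process writes one
# variable and then reads the other; we enumerate the interleavings that a
# sequential memory allows (program order of each process is preserved).
from itertools import permutations

def run(order):
    mem, regs = {"x": 0, "y": 0}, {}
    for proc, kind, var in order:
        if kind == "w":
            mem[var] = 1
        else:
            regs[proc] = mem[var]
    return regs["p1"], regs["p2"]

p1 = [("p1", "w", "x"), ("p1", "r", "y")]     # P1: x := 1; read y
p2 = [("p2", "w", "y"), ("p2", "r", "x")]     # P2: y := 1; read x

outcomes = set()
for order in permutations(p1 + p2, 4):
    # keep only interleavings that preserve each process's own program order
    if order.index(p1[0]) < order.index(p1[1]) and order.index(p2[0]) < order.index(p2[1]):
        outcomes.add(run(order))
print(outcomes)   # (0, 0) never appears; the weaker models described next allow it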
On the other hand, in order to improve performance, other coherency models do not preserve the "most recently written" property. For instance, the cache model (introduced by Goodman as cache consistency [29]) forces only operations affecting the same variable to "appear" as executed under the sequential model. That condition is also fulfilled by the PRAM (Pipelined RAM) model [45]. Only now, the operations required to appear sequential are those issued by the same process, together with the write operations. That allows pipelining of the write operations, which, even though it may potentially delay the effect of write operations on different processes, permits programs to take advantage of the better performance of a PRAM implementation as compared to a sequential implementation. The causal model [2], in addition to the conditions of the PRAM model, forces read operations to return the value written by the last causally ordered operation [42]. Similarly to PRAM implementations, implementations of the causal model result in far less communication than sequential ones, while also providing good scalability. Also, the processor model [29] imposes additional conditions on the PRAM one; now, restrictions are imposed on the write operations to the same variable. Finally, the safe and the regular models (introduced by Lamport [41] in order to provide a way of implementing stronger models in terms of weaker ones) require the restriction of their executions to write and non-overlapping operations to be atomic. Moreover, in the case of the regular model, read operations are forced to return the value of some previous or overlapping write operation to the same variable.

[Figure 2: Relations between non-synchronized models. The sets represent the executions they allow.]

2.3.2 Synchronized Models

The approach of synchronized models consists of obtaining algorithms that behave sequentially by forcing explicit dependencies between events (by using synchronizations) when necessary. However, that requires identifying dependencies in a proper way, which may induce additional complexity in the design of programs.

We begin the description of synchronized models with the weak model [25]. It uses only a single synchronization type (weak). Roughly speaking, it forces dependencies between synchronizations and the preceding and following operations. However, slightly different versions of this model have been proposed, varying the set of operations forced to be related with synchronizations.

Contrary to the weak model, both the lazy-release (LR) [34] and the eager-release (ER) models [27] use two types of synchronizations (acq and rel). That permits addressing typical problems (e.g., implementing critical sections) in an easier way. Whereas the ER model sets up dependencies from the rel synchronizations to the whole set of operations, the LR model sets up dependencies from the rel synchronizations to the acq synchronizations. Moreover, and independently of the dependencies set up, they require the first synchronization operation of each process to be an acq synchronization and impose an alternating use of the acq and rel synchronizations. Besides, after an acq synchronization completes, the next completing synchronization has to be executed by the same process.
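The usage pattern that these release models assume can be sketched as follows (an illustration of ours; Python's threading.Lock only plays the role of the acq and rel synchronizations here and does not by itself implement either model):

```python
# Schematic use of acq/rel synchronizations as the release models assume.
# Under eager release (ER), the writer's updates are pushed to other nodes when
# rel completes; under lazy release (LR), they travel only when another process
# later performs the matching acq, which defers the communication.
import threading

lock = threading.Lock()      # plays the role of the acq/rel synchronization
shared_counter = 0

def update():
    global shared_counter
    lock.acquire()           # acq: obtain the updates of the previous holders
    shared_counter += 1      # ordinary reads/writes, no coherence actions needed here
    lock.release()           # rel: ER would propagate now; LR waits for the next acq

threads = [threading.Thread(target=update) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared_counter)        # 4: each increment is protected by an acq/rel pair
```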
The last synchronized model we introduce is the entry model [9]. It is very similar to the LR model, only now synchronizations are associated with "synchronization variables". Like the release models, it requires the first synchronization of each process to be an acq synchronization and imposes an alternating use of the acq and rel synchronizations. Also, a rel synchronization must be executed on the same variable as the previous acq synchronization, and after an acq synchronization completes, the next completing synchronization on the same variable has to be executed by the same process.

2.4 Shared Data Characteristics

DSM systems are intended to provide an address space where data can be shared among several nodes. Therefore it is not surprising that the characteristics of those data may affect the behavior of such systems.

Heterogeneous size and structure greatly affect system performance. That is due to the data conversion needed when interchanging information between modules (e.g., MMUs having to manage pages with different sizes [55]). On the other hand, in loosely coupled distributed systems, sending a big packet of data is not, relatively speaking, much more expensive than sending a small packet. Therefore, if programs have high locality and we use dynamic placement, using a large size for the shared units may reduce the number of block faults. But the more we increase the size, the more false sharing arises. False sharing occurs when two unrelated variables, each one referenced from a different node, are located in the same shared unit, thereby inducing unnecessary coherence operations. It is believed to be a serious problem for parallel program performance, and this belief is also supported by experimental evidence [13]. Multi-writer protocols address that problem by allowing multiple nodes to write one block at the same time and merging the changes in a consistent way at specified points. Examples of systems using multi-writer protocols are Munin [14] and TreadMarks [35]. Delayed protocols attack false sharing by communicating updates at the latest possible moment. For instance, synchronized models, because they only suffer delays at synchronization points, are used to reduce false sharing. Systems supporting structured data provide the user with control over the shared units, which can be used to avoid false sharing. Orca [6], Indigo [39], Linda [3] and Agora [11] are examples of systems that allow data structures to be shared between nodes. In this case, a careful analysis must be done so that data manipulated mostly by one process are allocated on shared units holding no data for other processes. However, the analysis of data dependencies tends to be a difficult task.
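The cost of false sharing can be made tangible with a small, purely schematic count of invalidations (the block size, addresses and write counts below are made up for the example):

```python
# Illustrative sketch of false sharing: two unrelated counters, each written by
# a different node, generate invalidation traffic whenever they share a block,
# and none when they are placed in separate blocks.
BLOCK_SIZE = 4096

def invalidations(addr_a, addr_b, writes=100):
    count = 0
    for step in range(writes):
        # nodes alternate writes to their own counter; under a write-invalidate
        # protocol the other node's replica of the written block is invalidated
        written_block = (addr_a if step % 2 == 0 else addr_b) // BLOCK_SIZE
        other_block = (addr_b if step % 2 == 0 else addr_a) // BLOCK_SIZE
        if written_block == other_block:
            count += 1
    return count

print(invalidations(addr_a=0, addr_b=8))       # same block: 100 invalidations
print(invalidations(addr_a=0, addr_b=4096))    # separate blocks: 0 invalidations
```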
3 Conclusions

While many studies have shown the usefulness of the DSMM and a great deal of work has been done to improve the performance of DSM systems, some areas still seem to require more attention [16, 19]. The performance of the DSMM is greatly affected by memory access patterns. As a matter of fact, the consistency mismatch between DSM systems and application programs constitutes one of the most important factors behind low performance. Therefore, an important approach to avoiding performance problems consists of exploiting data dependencies. However, that requires knowing the access patterns, which may not always be available. Real-time implementations and auto-configuring systems are other areas which also need deeper study. In contrast to available message-passing systems such as MPI or PVM, the DSMM has not yet had a significant impact on non-researcher users. The earliest systems provided experimental environments mainly useful as benchmarks. Now, new-generation DSM systems are overcoming the former problems, which allows us to envisage a wider acceptance of the DSMM.

References

[1] A. Krishnamurthy and K. Yelick. Optimizing parallel programming with explicit synchronization. In Programming Language Design and Implementation, June 1995.
[2] M. Ahamad, G. Neiger, J.E. Burns, P. Kohli, and P.W. Hutto. Causal memory: Definitions, implementation and programming. Distributed Computing, 9(1):37-49, August 1995.
[3] S. Ahuja, N. Carriero, and D. Gelernter. Linda and friends. IEEE Computer, 19(8):26-34, August 1986.
[4] T.E. Anderson, D.E. Culler, and D.A. Patterson. A case for NOW (networks of workstations). IEEE Micro, 15(1):54-64, February 1995.
[5] J. Archibald and J.L. Baer. Cache coherence protocols: Evaluation using a multiprocessor model. ACM Transactions on Computer Systems, 4(4):273-298, November 1986.
[6] H.E. Bal, M.F. Kaashoek, and A.S. Tanenbaum. Orca: A language for parallel programming of distributed systems. IEEE Transactions on Software Engineering, 18(3):190-205, March 1992.
[7] J.K. Bennett, J.B. Carter, and W. Zwaenepoel. Munin: Distributed shared memory based on type-specific memory coherence. In Proceedings of the 1990 International Conference on Parallel Processing, pages 168-176. ACM, 1990.
[8] P.A. Bernstein, N. Goodman, and V. Hadzilacos. Recovery algorithms for database systems. In IFIP, pages 799-807, 1983.
[9] B.N. Bershad and M.J. Zekauskas. Midway: Shared memory parallel programming with entry consistency for distributed memory multiprocessors. Technical Report CMU-CS-91-170, Carnegie Mellon University, September 1991.
[10] B.N. Bershad, M.J. Zekauskas, and W.A. Sawdon. The Midway distributed shared memory system. In COMPCON, 1993.
[11] R. Bisiani and A. Forin. Architectural support for multilanguage parallel programming on heterogeneous systems. In Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems, pages 21-30, October 1987.
[12] M.A. Blumrich, K. Li, R. Alpert, C. Dubnicki, E.W. Felten, and J. Sandberg. Virtual memory mapped network interface for the SHRIMP multicomputer. In Proceedings of the 21st International Symposium on Computer Architecture, April 1994.
[13] W.J. Bolosky and M.L. Scott. False sharing and its effects on shared memory performance. Technical report, Computer Science Department, University of Rochester, 1994.
[14] J.B. Carter, J.K. Bennett, and W. Zwaenepoel. Implementation and performance of Munin. Operating Systems Review, 25(5):152-164, October 1991.
[15] J.B. Carter, J.K. Bennett, and W. Zwaenepoel. Techniques for reducing consistency-related communication in distributed shared memory systems. ACM Transactions on Computer Systems, 13(3):205-244, August 1995.
[16] J.B. Carter, D. Khandekar, and L. Kamb. Distributed shared memory: Where we are and where we should be headed. In Proceedings of the 5th Workshop on Hot Topics in Operating Systems, May 1995.
[17] D.R. Cheriton. Preliminary thoughts on problem-oriented shared memory: A decentralized approach to distributed systems. ACM Operating Systems Review, pages 26-33, October 1985.
[18] D.R. Cheriton. Problem-oriented shared memory: A decentralized approach to distributed system design. In Proceedings of the Sixth International Conference on Distributed Computing Systems, pages 190-197. IEEE, May 1986.
[19] T. Chiueh and M. Verma. A compiler-directed distributed shared memory system. In International Conference on Supercomputing, 1995.
[20] V. Cholvi-Juan.
Formalizing Memory Models. PhD thesis, Department of Computer Science, Polytechnic University of Valencia, December 1994.
[21] V. Cholvi-Juan and J.M. Bernabéu-Aubàn. Implementing a distributed compiler library that provides a N-mixed memory model. In IEEE/USP International Workshop on High Performance Computing, pages 229-244, March 1994.
[22] G. Delp. The Architecture and Implementation of MemNet: A High-Speed Shared Memory Computer Communication Network. PhD thesis, Computer Science Department, University of Delaware, 1988.
[23] P.J. Denning. On modeling program behavior. In Proceedings of the AFIPS Spring Joint Computer Conference, pages 937-944, 1972.
[24] M. Dubois and C. Scheurich. Synchronization, coherence and event ordering in multiprocessors. IEEE Computer, pages 9-21, February 1988.
[25] M. Dubois, C. Scheurich, and F. Briggs. Memory access buffering in multiprocessors. In Proceedings of the 13th Annual Symposium on Computer Architecture, pages 434-442, June 1986.
[26] B. Falsafi, A.R. Lebeck, S.K. Reinhardt, I. Schoinas, M. Hill, J.R. Larus, A. Rogers, and D.A. Wood. Application-specific protocols for user-level shared memory. In Supercomputing, November 1994.
[27] K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. Hennessy. Memory consistency and event ordering in scalable shared-memory multiprocessors. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 15-26. ACM, May 1990.
[28] D.K. Gifford and N. Glasser. Remote pipes and procedures for efficient distributed communication. ACM Transactions on Computer Systems, 6(3):258-283, August 1988.
[29] J.R. Goodman. Cache consistency and sequential consistency. Technical Report 61, IEEE Scalable Coherence Interface Working Group, March 1989.
[30] C. Hammer and T. Henties. Using a weak coherency model for a parallel Lisp. In A. Bode, editor, Distributed Memory Computing, volume 487 of Lecture Notes in Computer Science, pages 42-51, 1991.
[31] M.P. Herlihy and J.M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3):463-492, July 1990.
[32] M.D. Hill, J.R. Larus, and D.A. Wood. Tempest: A substrate for portable parallel programs. In COMPCON Spring '95, March 1995.
[33] J.-H. Hoepman, M. Papatriantafilou, and P. Tsigas. Toward self-stabilizing wait-free shared memory objects. Technical Report CS-R9514, Centrum voor Wiskunde en Informatica, 1995.
[34] P. Keleher, A. Cox, and W. Zwaenepoel. Lazy release consistency for software distributed shared memory. In Proceedings of the 19th Annual Symposium on Computer Architecture, pages 13-21, May 1992.
[35] P. Keleher, A.L. Cox, S. Dwarkadas, and W. Zwaenepoel. TreadMarks: Distributed shared memory on standard workstations and operating systems. In Winter USENIX, 1994.
[36] Y.A. Khalidi. Hardware Support for Distributed Object-Based Systems. PhD thesis, School of Information and Computer Science, Georgia Institute of Technology, 1989.
[37] K.H. Kim. Programmer-transparent coordination of recovering concurrent processes: Philosophy and rules for efficient implementation. IEEE Transactions on Software Engineering, 14(8):810-821, June 1988.
[38] P.T. Koch, R.J. Fowler, and E. Jul. Message-driven relaxed consistency in a software distributed shared memory. In First Symposium on Operating Systems Design and Implementation, pages 75-85. USENIX Association, November 1994.
[39] P. Kohli, M. Ahamad, and K. Schwan.
Indigo: User-level support for building distributed shared abstractions. Technical Report GIT-ICS-94/53, School of Information and Computer Science, Georgia Institute of Technology, March 1995.
[40] L. Lamport. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Transactions on Computers, 28(9):690-691, September 1979.
[41] L. Lamport. On interprocess communication: Parts I and II. Distributed Computing, 1(2):77-101, 1986.
[42] L. Lamport. Time, clocks and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, July 1978.
[43] K. Li and P. Hudak. Memory coherence in shared virtual memory systems. In Proceedings of the 5th Annual ACM Symposium on Principles of Distributed Computing, pages 229-239. ACM, August 1986.
[44] K. Li and P. Hudak. Memory coherence in shared virtual memory systems. ACM Transactions on Computer Systems, 7(4):321-359, November 1989.
[45] R.J. Lipton and J.S. Sandberg. PRAM: A scalable shared memory. Technical Report CS-TR-180-88, Princeton University, Department of Computer Science, September 1988.
[46] J. Misra. Axioms for memory access in asynchronous hardware systems. ACM Transactions on Programming Languages and Systems, 8(1):142-153, January 1986.
[47] R.F. Rashid et al. Machine-independent virtual memory management for paged uniprocessor and multiprocessor architectures. In Proceedings of the 2nd Symposium on Architectural Support for Programming Languages and Operating Systems, October 1987.
[48] A. Sane, K. MacGregor, and R. Campbell. Distributed virtual memory consistency protocols: Design and performance. In Second IEEE Workshop on Experimental Distributed Systems, 1990.
[49] D.J. Scales and M.S. Lam. The design and evaluation of a shared object system for distributed memory machines. In First Symposium on Operating Systems Design and Implementation. USENIX.
[50] F.B. Schneider. Fail-stop processors. In Proceedings IEEE, pages 66-70. IEEE, 1983.
[51] M. Stumm and S. Zhou. Algorithms implementing distributed shared memory. IEEE Computer, 23(5):54-64, May 1990.
[52] S. Sureshchandran and T.A. Gonsalves. The performance of the MemNet distributed shared memory architectures. Technical Report TR-CSE-90-02, Department of Computer Science, Indian Institute of Technology, January 1990.
[53] K.-L. Wu and W.K. Fuchs. Recoverable distributed shared virtual memory. IEEE Transactions on Computers, 39(4):460-469, April 1990.
[54] M. Young et al. The duality of memory and communication in the implementation of a multiprocessor operating system. In Proceedings of the Eleventh ACM Symposium on Operating Systems Principles, pages 63-76. ACM, November 1987.
[55] S. Zhou, M. Stumm, and T. McInerney. Extending distributed shared memory to heterogeneous environments. In Proceedings of the 10th International Conference on Distributed Computing Systems, pages 30-37. IEEE, 1990.

Human Adaptation to Qualitatively Novel Environment: The Role of Information and Knowledge in Developing Countries

A. Dreimanis
Environmental State Inspectorate
25 Rupniecibas Str., LV-1877 Riga, Latvia
Keywords: developing countries, human adaptation, information, knowledge, novel environment
Edited by: Anton P. Železnikar
Received: April 4, 1996  Revised: November 5, 1996  Accepted: December 3, 1996

A systemic analysis of the functions of information and knowledge in human adaptation to a qualitatively novel environment is proposed. The term "environment", treated in line with K. Popper's and J.
Eccles's concept of the human's three worlds — the set of 1) physical and 2) mental objects and states, as well as that of 3) mental products — would include a multitude of various economic-material, social, cultural and psychological conditions. Knowledge and information are the necessary factors for humans and society to develop and elevate their internal variety. Acquired and transferred knowledge is an efficient source of proper adaptation and harmonization in a qualitatively novel environment. The capabilities of a developing system to be in harmony with a changing environment will depend on the mutual interrelations between information, adaptation and self-creativity and, in particular, on the creative use of available knowledge and information.

1 Introduction

The main tendencies of the developmental processes of our world are certainly directed towards a permanent increase of its complexity and, correspondingly, towards the emergence of a whole set of actual problems of various degrees of complexity and solvability, which has been manifested especially clearly during the last decade. Just in this period an essentially novel set of economic, social, political and psychological problems has arisen, caused by the disappearance of the previous two mutually incompatible political and socio-economic systems. Accordingly, we are faced with a general global phenomenon — the breakdown of the previous long-lasting political, economic and social relations in the East European countries, corresponding to the socialist order — and, therefore, with a concomitant transition in these countries from the previous closed society to an open society. It is just the system's openness which provides the versatile phenomenon under consideration with a multitude of external factors which affect the elements of the developing system, as well as impart to the aforementioned transition a non-equilibrium character. Furthermore, such a transition period is characterized by an elevated degree of chaos on the level of the whole society and its political and economic order.

2 Key approaches to the study of basic principles of socio-economic development

2.1 The key methodological principles

Therefore, in seeking the basic principles of a favourable socio-economic development of these countries in transition to an open society (system), let us refer to: 1) the concepts of a generalized non-linear science — synergetics — in particular to the aspects of self-organization (SO) processes of qualitatively novel structures, as well as of chaotic phenomena, these being, in particular, the basis of such fundamental concepts as adaptation, information and complexity, and 2) W.R. Ashby's principle of requisite variety [2], requiring that for the successful development and survival of a given system its own, or inherent, complexity should exceed the complexity of its environment.

2.2 The basic survival and adaptation problem

Such openness actualizes to a significant degree the problem of the individual's survival in a changing complex environment: namely, an urgent necessity arises for the people of the East European countries to cope with a whole set of conditions and requirements that are essentially novel for them.
Therefore, one of the basic problems of the aforementioned survival strategy could be defined in the following manner: how should an individual of such a society organize his activities and evolve his own world (the inner as well as the outer one) in order to adapt to the changing fundamental values of socio-economic life and thus, as far as possible, to harmonize his own world. Furthermore, "general thinking and economic motivations of the individual people who are the microelements of a society, and millions of these people combine to make a macroscopic system, which undergo oscillations and unstable behaviour" [8]. Such an individual problem is self-consistently connected with the more global transition process — of non-linear character — of the whole society to a generally novel system, where, according to the self-consistency principle, "the individual members of society contribute, via their cultural and economical activities, to the generation of a general field of civilization" [16], consisting of political, economic, social and cultural components and determining the socio-political atmosphere and the economic situation of society. Therefore, one can argue that it is the collective field which governs the basic activities of society and thus could be considered as the order parameter of society.

3 The extended concept of environment and the principle of requisite variety

When stressing the complex system-environment interaction approach to the analysis of the evolving society and of the adaptation of its members to the qualitatively novel conditions, then — for the purposes of a detailed study of human-environment interaction — the concept "environment" could be understood in line with the concept of the human's three worlds, argued by Popper and Eccles [12]: the set of 1) physical and 2) mental objects and states, as well as that of 3) mental products. Thus, the concept "environment" would include a multitude of various economic-material, social, cultural and psychological conditions. Having specified the meaning of the extended concept "environment" in the above-considered sense, we may subsequently conclude that, according to the aforementioned principle of requisite variety, the necessary condition for efficient adaptation of the individual to the new, open environment will be the predominance of the human's inherent self-variety over the environmental variety. Correspondingly, it is necessary for the individuals as well as for the whole society to evolve themselves, in particular by acquiring knowledge, or organized information, which is considered by Kuhn and Lehman [11] as the complexity factor and is envisaged in order to better comprehend the human's functioning in a complex environment. Furthermore, according to the thesis [15] that "these are the systems with self-learning ability which can react with environment at two different levels", such evolving to a significant degree enhances the capabilities and efficiency of adaptation to a changing environment. Moreover, it is reasonable to consider the viewpoint argued by Keel-Sleswik [9] that it is just information, knowledge and meaning that will be the basic ways we relate to our environment by means of SO processes.

4 Adaptation: its role and interrelations with information, knowledge and SO

4.1 Adaptation, information and uncertainty

In order to elaborate further the concept of the inherent self-variety, as well as to find out the possible routes of enhancing it, it is reasonable to develop a unified analysis of the SO, adaptation and information concepts.
First of all, one should once more emphasize that both adaptation and information are ultimately based on such concepts as chaos and diversity: namely, the probability of the occurrence of adaptation, as well as of the generation of information, is basically provided by chaotic attractors. In particular, it is just the chaotic attractor which will be able to create sufficient flexibility of behaviour due to random change of "initial conditions" [17]. Besides, information as well as adaptation will proceed by means of self-selection. Furthermore, as the next essential joint manifestation between chaos, adaptation and information (or its deficiency), it is to be emphasized that a system's adaptability will emerge from the system's ability to anticipate environmental perturbations [13, 14]. At the same time, one of the three basic adaptability components is the behavioural uncertainty component of adaptability [5]. Actually, it is just the behavioural uncertainty which will allow the system to cope with the unpredictability of the environment and with its external disturbances. Thus, this last thesis clearly substantiates the necessity of non-standard and unexpected approaches and activities of individuals in the processes of choosing and achieving their economic and social goals, especially in the transition period to an open society. On the other hand, let us note that, at the same time, adaptability is the use of information to handle environmental uncertainty [5] and is brought in in order to replenish a deficiency in the necessary information. Furthermore, the existing marked uncertainty, corresponding to a lack of necessary information and being especially strong during the transition period, when there are no certain regulations and guidelines for actions, puts in the forefront the process of generation of adequate, reliable information. It is just such information which is urgently necessary for the individuals to design their activities, which should be directed towards their adaptation and harmonization in the changing environment.

4.2 Knowledge, its SO and role in adaptation

Taking into account the aforementioned interrelations between information, adaptation and SO, as well as the view of the concept "knowledge" as an organized form of information, one can deduce that a necessary condition of successful adaptation to a qualitatively novel environment would be the possession of adequate knowledge about the world. For the problem under consideration it would be particularly reasonable to distinguish the following components of the knowledge about the world [10]: a) knowledge about our own abilities (or limitations thereof), b) possible human intentions, and c) possible relations between objects. Moreover, knowledge as such is not only the product of SO (of information), but proposedly is itself capable of further SO into efficient high-level forms. Namely, an essential favourable condition for the origin of novel advantageous solutions of adaptation to novel conditions would be a synthesized and self-organized knowledge. Thus, knowledge synthesis and integration in the form of interdisciplinary knowledge would be the basis of the emergence of a new, higher-level knowledge about complex current problems. On the other hand, as argued by [7], "the SO of the work which roots in creative man, but transcends him becomes evident in the structure of knowledges no less than the structures of art".
4.3 Informational stressors and self-organization

In particular, such adaptation proceeds by means of SO processes initiated by small internal and/or external (environmental) perturbations or stressors. The most important informational stressors or agents to be distinguished are the following: 1) the lack of available information necessary in order to plan subsequent activities, 2) the prevalence of information which is subjectively negative for the particular individual, and 3) the necessity of information processing within a too limited time interval. These informational stressors manifest themselves as actual factors in the emergence of considerable psycho-emotional stress, with a subsequent unfavourable effect on the individual's viability, health and capability to perform a qualitative and efficient job. In general, the problem of beneficial adaptation to such informational stressors is to be regarded as one of the key elements of the harmonization of the human's Self in a crucially changing, stressful environment.

5 Variety of Adaptation Manifestations in the Transition Process to Openness

5.1 The two levels of an adaptation process

Further, let us take into account the thesis, proposed by Ashby, of a self-organizing system as a system consisting of an organism and of its environmental medium. In the present problem under consideration — the socio-economic evolution of society and its members — it will be the following complex parameter, namely the environmental novelty, which drives the SO processes of the whole society as well as of its members — individuals — towards a crucially distinct open system and the correspondingly emerging requirements. In particular, the environmental novelty is formed by two distinct phenomena: (a) the processes arising due to the transition from a closed to an open environment in a local area of the human's Self, and (b) the second-level adaptation of the local environment (the elementary structure of the newly forming society) to the prevailing environment on the global scale. As the most urgent element of the first, i.e., (a), of these two abovementioned phenomena, one can specify the following: the individual's social, economic and psychological adaptation to the demands of competition and cooperation as the necessary prerequisites of his successful functioning and survival in conditions of environmental openness and competitiveness. Moreover, the fact that these basic principles of the socio-economic system in an open, market-type society — namely, cooperation and competition — correspond to the key principles of the SO of forming new systems substantiates, therefore, the appropriateness of the synergetic approach to the problem of a human's adaptation to an open society and socio-economic system. On the other hand, the basic feature of the second, i.e., (b)-type, phenomena consists in the dual nature of the local environment — such an environment presents itself simultaneously as 1) the adaptation target, as well as 2) an adapting subject. Therefore, it is reasonable to expect that just on this level there will be manifested a synergic coadaptation of the human's Self and of his local environment to the global environment and its corresponding requirements, which the evolving society and its particular members are faced with.
5.2 Mutual interrelations between passive and active adaptation

In view of the occurrence of such multilevel adaptation, which characterizes the processes of socio-economic development in the transition countries, as well as noting the viewpoint [15] of a system with self-learning ability as reacting with the environment at two different levels, it is reasonable to set up the problem of the mutual interrelation of a pure, or passive, adaptation to the existing environment (namely, the system mainly evolves within the environment) and an active adaptation (or self-creativity), where the system evolves its environment. For the problem of the harmonious involvement of a human in a changing socio-economic environment, the following point would be essential [15]: between "traditional stabilizing adaptation and the active self-creativity various tensions exist", which have harmonizing functions and are manifested most efficiently just at the border between passive adaptation and active adaptation. Just this border region is characterized by an optimal flexibility, which is necessary for the beneficial multilevel coadaptation of individuals and of their environment. Actually, a human should, on the one hand, adapt to the novel socio-economic environment but, on the other hand, at the same time create such an environment by himself, in order to provide maximally favourable conditions for the system to accomplish efficient adaptation and evolution. Therefore, in considering the problems of the socio-economic development of a new type of society and of the harmonization of its members to the novel complex environment, a fundamental problem of the mutual interrelation of active and passive adaptation emerges prominently.

6 Harmonization to a Changing Environment: Essence and Principles

6.1 Self-creativity as an active adaptation factor

In particular, the capabilities of an evolving system to be in harmony with a changing environment will depend on the mutual interrelations between the active self-creativity, information and adaptation. Because one of the necessary requirements of the successful involvement of a human in the novel socio-economic environment consists in breaking the previous stereotypes of economic thinking and developing nonlinear, flexible and integral thinking, such an evolution of the human's approach to the forming process of his interrelations with the novel socio-economic environment could be regarded, according to Banathy's proposal [3], as creative reaching out of the system's own boundaries, where "under self-creativity and adaptability self-organizing system will reach the upper limit of the ordinary environment at the end of evolution by gradual and sudden changes and finally break the limits of ordinary environment" [15]. The importance of such a flexible and creative approach could be emphasized by the fact that markets, as argued by Allen and Phong [1], will always drive themselves to the edge of predictability, and, therefore, first of all, one should learn to manage the changes in the economic environment. Moreover, Crutchfield [6] argued that there exists the following fundamental tendency of most complex systems — to behave between chaos and order, to move at the border between structure and uncertainty.
Therefore, it is just in such a border region, between the traditional status (the order or structure) and the novel and still uncertain factors, that the necessary high flexibility and efficient adaptability, and, in general, a sufficient level of complexity and self-variety of the developing system, will most likely emerge.

6.2 Current problems of education optimization

The practical improvement of human adaptation to a novel environment and of people's competence in different areas and, therefore, of the readiness to cope with current problems is one of the basic tasks of contemporary education at all levels and, thus, the main objective of an efficient optimization of the education system. Here it is reasonable to distinguish (conventionally) three basic levels of education: 1) primary and secondary school level — the main emphasis proposedly is to be put on the development of individuals' capability for a creative, non-standard approach to the solving of complex problems and, in general, of such a mental characteristic as integral thinking; a possible route to achieving such capabilities would be the intensive development and training of the right-hemisphere functions of our brain, by means of lessons in different arts; 2) high and higher school level — optimization of education programmes, taking into account current and future needs, as well as strengthening of international cooperation with developed countries and participation in international education and research programmes and networks; 3) adult education — the most significant problem, in view of the urgent necessity to carry out the basic reforms within as short a time period as possible, with the main emphasis on providing the individual with such a general education basis as would efficiently lead to the acquiring of several specialities, requalification and efficient competitiveness in the job market.

7 Conclusion

Having revealed and analyzed the nature of some of the basic problems of the human's involvement in a new, open socio-economic world, it is worthwhile to emphasize that K. Gödel and, thereafter, G. Chaitin [4] have proven that even simple problems can have answers so complicated that they contain more information than the human's entire logical pattern. Correspondingly, it is reasonable to expect beneficial results from a further elaboration of the fore-considered nonlinear approach, in particular the joint integral analysis and use of the concepts of self-organization, information, adaptation and creativity, as well as the analysis of their mutual interrelations, in order better to comprehend and solve vital contemporary human problems, for example the problem of adaptation to the novel socio-economic environment, which, in turn, is itself being formed during the transition process to a qualitatively distinct state system. One of the key elements of an individual's efficient adaptation of such a type would be the elevation of his self-variety by evolving his knowledge system, in particular about the environment in its wide sense (taking into account that such evolving is a typical feature of the developing system), as well as the creative, flexible use of this knowledge, with the tendency to develop not only an open but also an informational society, which would elevate economic processes to a totally new level [7].

References

[1] Allen, P. and Phong, H. (1993). Evolution, creativity and intelligence. In Herman Haken, editor, Interdisciplinary Approaches To Nonlinear Complex Systems, 12-31, Springer, Berlin.
[2] Ashby, W.R. (1959).
[3] Banathy, B. (1993). From evolutionary creativity to guided evolution. World Futures, 36, 73-79.

[4] Chaitin, G. (1987). Information, Randomness and Incompleteness. World Scientific, Singapore.

[5] Conrad, M. (1983). Adaptability. Plenum Press, New York.

[6] Crutchfield, J. (1991). Reconstructing language hierarchies. In H. Atmanspacher and H. Scheingraber, editors, Information Dynamics, Plenum Press, New York.

[7] Jantsch, E. (1980). The Self-Organizing Universe. Pergamon Press, Oxford.

[8] Jaynes, E. (1985). Macroscopic prediction. In H. Haken, editor, Complex Systems — Operational Approaches in Neurobiology, Physics and Computers, 254-269, Springer, Berlin.

[9] Keil-Slawik, R. (1992). Artifacts in software design. In C. Floyd et al., editors, Software Development and Reality Construction, 168-188, Springer, Berlin.

[10] Kobsa, A. (1984). Knowledge representation: a survey of its mechanisms, a sketch of its semantics. Cybernetics and Systems, 15, 41-89.

[11] Kuhn, H. and Lehman, U. (1984). Transition from the non-living state into living state. In R.K. Mishra, editor, The Living State, 300-317, Delhi.

[12] Popper, K. and Eccles, J. (1977). The Self and Its Brain. Springer, Berlin.

[13] Rosen, R. (1985). Anticipatory Systems: Philosophical, Mathematical and Methodological Foundations. Pergamon Press, Oxford.

[14] Salthe, S. (1992). Hierarchical self-organization as the new post-cybernetic perspective. In G. van de Vijver, editor, New Perspectives on Cybernetics: Self-Organization, Autonomy and Connectionism, 49-58, Kluwer, Dordrecht.

[15] Tao, H. (1993). The structure of multistasis: on the evaluation of self-organizing systems. World Futures, 37, 1-28.

[16] Weidlich, W. (1991). Physics and social science — the approach of synergetics. Physics Reports, 240, 1-163.

[17] Zak, M. (1990). Creative dynamics approach to neural intelligence. Biological Cybernetics, 64, 15-23.

On the Performance of Back-Propagation Networks in Econometric Analysis

Montserrat Guillén and Carlos Soldevilla
Dept. Econometria, Estadistica i Economia Espanyola, Universidad de Barcelona
Tte. Cor. Valenzuela, 1-11. 08034 Barcelona, Spain
Phone: (343) 402 14 09, Fax: (343) 402 18 21
E-mail: guillen@riscd2.eco.ub.es, csolde@riscd2.eco.ub.es

Keywords: neural networks, discriminant analysis, logistic regression, time series

Edited by: Matjaž Gams

Received: July 10, 1996   Revised: November 5, 1996   Accepted: November 11, 1996

Neural networks may be applied in the context of econometric analysis, both when discussing issues that have traditionally been attached to multivariate analysis and in the field of time series. This paper compares the performance of backpropagation networks with classical approaches. Firstly, an example in banking is presented. The network outperforms discriminant analysis and logistic regression when conditional classification error is considered. Secondly, the identification of simple stationary time series is analyzed. Some series following simple autoregressive and moving average schemes were simulated, and the network successfully identified them. Conclusions are presented in the closing section.

1 Introduction

To date, neural network techniques have been applied in many areas, such as pattern recognition, robotic control and decision making [6], [16]. In the field of applied economics and econometrics, the experience is still recent and rather short (see, for example, [1], [2], [4], [9] and [15]).
In this respect, several outstanding contributions that relate neural networks and statistical analysis ([10], [11]) must also be cited.

Artificial neural networks mimic the neurophysical structure of the brain using mathematical models and interaction algorithms. Every neural unit in this process transforms input signals into a single output, which is transmitted to other elements. Interconnections are constantly adjusted throughout the learning process. In the first step, every signal is multiplied by its corresponding connection weight. The sum of these products is called the net input. In the second step, an activation function converts the net input into an output signal. Typically, two kinds of activation functions are used for this purpose: step functions, which compare the net input to a certain threshold, and sigmoidal functions, which allow for a nonlinear relationship.

Although several authors have successfully applied neural network models in the context of applied economics, some of those situations had already been discussed using classical statistical or econometric methodology (see [1] and [4]). Studies tend either to apply neural networks without comparing their performance with that of other methods, or to discuss their similarities with traditional statistical techniques only within a theoretical framework. This article is concerned with artificial neural networks for data analysis. The aim of our paper is twofold: a) a comparison between Discriminant Analysis, Logistic Regression and a Neural Network approach will be discussed using a classical data set in the field of banking; b) the performance of a neural network model when identifying the structure of a time series, given the ACF (autocorrelation function) and the PACF (partial-autocorrelation function), will also be studied.

Figure 1. Structure of a feed-forward network.

2 Feed-Forward Neural Networks and Statistical Methods

Several types of neural network models are available. This paper concentrates on feed-forward networks, which have the following structure: neural network processing units are grouped in three (or more) layers. The input layer directly receives information from the input data, hidden layers connect the input and output layers and, finally, the output layer provides the output information. Figure 1 shows the simple topology just described. An extensive treatment of the relationship between neural networks and graph theory may be found in [2].

A network learning process aims to find the "optimal" connection between input and output. Given $P$ vectors $(x_1, y_1), \dots, (x_P, y_P)$, where $x_i$ is the input data and $y_i$ the corresponding output data, with an unknown correspondence $Y = \phi(X)$, an algorithm must train the network so that $\phi^*(\cdot)$, an approximation of $\phi(\cdot)$, is found. Weights $w_{ij}$ are calculated in order to minimize the approximation error. Quadratic errors, i.e.,

$$E = \sum_{i=1}^{P} \left( y_i - \phi^*(x_i) \right)^2,$$

are usually considered.

One of the most popular algorithms that has been used successfully in many applications is the backpropagation learning algorithm in a feed-forward network [11]. It is based on numerical methods for functions of several variables. Starting at the output, the algorithm modifies the weights in order to reduce the observed prediction errors using a backward sweeping process. This process is repeated with each new input-output combination, and over several iterations across the learning sample, until a certain precision criterion is met. After the training process, the network must be able to generalize and predict the output of a new input.
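As an illustration of the training scheme just described, the following sketch implements a tiny feed-forward network with one sigmoidal hidden layer, trained by gradient descent on the quadratic error. It is a minimal sketch only, not the software used by the authors; bias terms are omitted, and the function names, layer sizes, learning rate and number of cycles are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_feedforward(X, y, n_hidden=3, lr=0.5, cycles=4000, seed=0):
    """Backpropagation on the quadratic error E = sum_i (y_i - phi*(x_i))^2.

    X: (n_samples, n_inputs) array; y: (n_samples, n_outputs) array with values in [0, 1].
    Returns the two weight matrices (no bias terms, for brevity)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))   # input -> hidden weights
    W2 = rng.normal(scale=0.5, size=(n_hidden, y.shape[1]))   # hidden -> output weights
    for _ in range(cycles):
        # forward pass: fitted outputs phi*(x)
        H = sigmoid(X @ W1)
        out = sigmoid(H @ W2)
        # backward pass: propagate the error derivatives layer by layer
        d_out = (out - y) * out * (1.0 - out)
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        # weights are adjusted proportionally to the error derivatives
        W2 -= lr * H.T @ d_out
        W1 -= lr * X.T @ d_hid
    return W1, W2

def predict(X, W1, W2):
    """Network output for new inputs, used after training to generalize."""
    return sigmoid(sigmoid(X @ W1) @ W2)
```

Given a learning sample (X, y), train_feedforward returns the fitted weights and predict produces the network output for previously unseen inputs, in the spirit of the learning/test split used later in the paper.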
The basic idea underlying the standard backpropagation algorithm is as follows: given an input data set and starting values for the weights relating the layers, the output values are calculated (fitted). The error function is also computed. In the next step, the weights are slightly modified (proportionally to the derivative of the error function with respect to the $w_{ij}$). Iterations (cycles) are repeated until the error is small enough. Details about this algorithm and some modifications that accelerate its convergence are available in [13]. The backpropagation algorithm has proved to be very efficient in multi-layer networks, especially when input and output are non-linearly related.

Differences between the neural network approach and the statistical methodology have been discussed extensively in [10] and [12]. Essentially, they agree on the fact that, although neural networks are useful for statistical applications, they remain black boxes that predict rather than explain. Here, we overview some examples of typical statistical models and their corresponding neural network counterparts. A Simple Linear Regression may be represented by a neural network with one unit in the input layer (taking the input from the independent variable), a single output unit and an activation function equal to the identity. Figure 2 shows this kind of structure. Its generalization to Multiple Linear Regression is easily achieved by increasing the number of input neural units. Whenever step functions are used as activation functions, the network corresponds to the Discriminant Analysis scenario. Moreover, when logistic functions are taken in the structure shown in Figure 3, the equivalent of a logistic model is found. In logistic regression, the behaviour of a dichotomous dependent variable (output) is modelled so that its probability of being equal to one, $p_i$, depends non-linearly on a linear combination of covariates:

$$p_i = \frac{1}{1 + \exp(\beta' x_i)}, \qquad (1)$$

where $\beta$ is a parameter vector [7].

Figure 2. Linear Regression Neural Network (independent variables as inputs, dependent variable Y as output).

Figure 3. Logistic Regression Neural Network (independent variables as inputs, dependent variable Y as output).

Altman et al. [1] claim that "neural networks are not a clearly dominant mathematical technique compared to traditional statistical techniques, such as discriminant analysis". These authors recommend using both approaches in tandem. There seems to be general agreement in encouraging balanced discussions that not only show advantages and disadvantages, but also compare the performance of neural network modelling against classical statistical methodologies. In fact, a similar controversy arose some years ago, when discrete choice models (logit and probit) entered the econometric scene as a novel alternative to multivariate analysis. In Application 1 we study the performance of an artificial neural network and compare its results with those obtained using standard techniques. It is well known that Discriminant Analysis is based on strong statistical hypotheses about the distribution of the explanatory variables, while Logistic Regression introduces a specification restriction through the use of the link function in equation (1). Application 2 will be devoted to a completely different problem, where no alternative statistical procedure is widely accepted.
The identification of autocorrelation function patterns in time series analysis has traditionally been based on expertise, due to the poor performance of alternative approaches. Results from a neural network analysis will be shown.

3 Application 1: Interest Rate Choice

An important decision process in the context of banking is the choice between fixed and adjustable rate mortgages. This issue has been discussed by several authors ([5], [14] and the references therein). In the theoretical literature, some authors agree that individual characteristics do not influence the choice, but that the terms of the contract do. Others suggest that, in the presence of asymmetric information, borrowers' characteristics may have a potential impact on the choice. A straightforward method to study this problem is either Discriminant Analysis or Logistic (or Probit) Regression.

3.1 The Data

The sample includes 78 loans from a USA national mortgage banker, collected over the period January 1983 to February 1984. Details about the data set may be found in [5]. The variables considered are the following:

Dependent variable:
ADJ: Dichotomous, equals 0.5 if the client chooses an adjustable interest rate and -0.5 otherwise.

Exogenous variables indicating market and contract characteristics:
FI: Fixed interest rate.
MAR: Margin on the adjustable rate mortgage.
YLD: Difference between the 10-year Treasury rate and the 1-year Treasury rate.
PTS: Ratio of points paid on adjustable to fixed rate mortgages.
MAT: Ratio of maturities on adjustable to fixed rate mortgages.

Exogenous variables indicating personal characteristics:
BA: Age of the borrower.
BS: Number of years of school.
FTB: Dichotomous, equals 1 if the borrower is a first-time homebuyer, 0 otherwise.
CB: Dichotomous, equals 1 if there is a co-borrower, 0 otherwise.
MC: Dichotomous, equals 1 if the borrower is married, 0 otherwise.
SE: Dichotomous, equals 1 if the borrower is self-employed, 0 otherwise.
MOB: Number of years at present address.

Exogenous variables indicating economic characteristics:
NW: Net worth of the borrower.
LA: Liquid assets.
STL: Short-term liabilities.

As suggested by Serrano [14] and using graphical plots, it can easily be seen that there is an outlier observation that may distort the analysis. Therefore, this observation was eliminated from the study. Following [4], a Stepwise Discriminant Analysis and a Stepwise Logit were used to select the variables to be included in the models. Finally, only eight variables were used in the specification (FI, MAR, PTS, BA, CB, SE, LA and STL). Since the variable ranges were very dissimilar, the following scaling was needed:

$$x^{*}_{ij} = \frac{x_{ij} - \min x_i}{\max x_i - \min x_i},$$

where $x_{ij}$ is observation $j$ for variable $i$ and $x_i$ is the vector of all observations of variable $i$.

3.2 Results

The learning set was formed by randomly selecting 50 observations. Figure 4 reproduces the structure of the neural network that was used for this purpose: 8 units in the input layer, 3 units in the hidden layer and 2 output units. In fact, the outputs were taken to be (ADJ_j, -ADJ_j).

Figure 4. Neural Network for Interest Rate Choice (input layer, hidden layer and output layer; output units for the fixed and the adjustable rate).

Using the Aspirin/MIGRAINES software [8] on an IBM RISC/6000 machine, and after iterating, the conditional classification rates shown in Tables 1 and 2 were obtained. The backpropagation algorithm described above was used and the learning process took 4,000 cycles.
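For concreteness, the hypothetical train_feedforward and predict sketches shown earlier could be configured to match the setup just described (8 inputs, 3 hidden units, 2 outputs, 4,000 cycles), with the covariates rescaled as in Section 3.1. This is only an illustration of the configuration under stated assumptions, not a reproduction of the Aspirin/MIGRAINES run reported in the paper; the data arrays named below are placeholders.

```python
import numpy as np

def minmax_scale(X):
    """Column-wise scaling x*_ij = (x_ij - min x_i) / (max x_i - min x_i), as in Section 3.1."""
    X = np.asarray(X, dtype=float)
    xmin, xmax = X.min(axis=0), X.max(axis=0)
    return (X - xmin) / (xmax - xmin)

# Hypothetical setup mirroring the description above; X_raw (77 x 8 covariates after dropping
# the outlier) and adj (the +/-0.5 ADJ labels) are placeholders for data not reproduced here.
# X = minmax_scale(X_raw)
# y = np.column_stack([adj, -adj]) + 0.5      # outputs (ADJ_j, -ADJ_j), shifted into [0, 1]
# W1, W2 = train_feedforward(X[:50], y[:50], n_hidden=3, cycles=4000)
# test_pred = predict(X[50:], W1, W2)         # classification rates on the remaining test set
```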
The results indicate that the learning set was correctly classified and that, for example, 27.2% of the clients that chose an adjustable interest rate were classified incorrectly in the test set.

                       Predicted
Observed       Adjustable rate   Fixed rate
Adjustable          100.0%            0%
Fixed                  0%          100.0%

Table 1: Classification rates for the learning set

                       Predicted
Observed       Adjustable rate   Fixed rate
Adjustable           72.7%          27.2%
Fixed                12.5%          87.5%

Table 2: Classification rates for the test set

Table 3 shows the conditional classification results in the overall sample for every method, i.e., neural network (NN), discriminant analysis (DA) and logistic regression (LR).

                                Predicted
Observed           Method   Adjustable rate   Fixed rate
Adjustable         NN            90.6%            9.4%
                   DA            84.3%           15.6%
                   LR            84.3%           15.6%
Fixed              NN             4.4%           95.5%
                   DA            20.0%           80.0%
                   LR            11.1%           88.8%

Table 3: Classification rates for the data set

3.3 Discussion

As shown in Table 3, the prediction performance of the neural network is better in the group of customers that chose a fixed interest rate than in the group that chose an adjustable rate. It is also the method with the least classification error, both globally and conditionally. Of course, the classification rates of the classical methods may be improved by optimizing the classification threshold, but the results do not vary substantially. We conclude that, in this example, a neural network approach is recommended when prediction performance is crucial. On the other hand, the structure of the network does not make it possible to identify the relative importance of each input variable in the decision process and, therefore, does not allow testing the hypothesis that individual characteristics influence the choice. This application leaves some room for further research, such as the effect of not excluding the outlier observation from the data set and the behaviour of the network in the presence of sample selection.

4 Application 2: Time Series Identification

Box-Jenkins analysis of time series is based on the identification of the ARIMA process by recognizing the theoretical pattern of the autocorrelation (ACF) and partial-autocorrelation (PACF) sample functions (see [7]). By examining a plot of those functions, it is possible to classify an observed time series into a type of ARIMA model. In econometric analysis time series are usually short; therefore, sampling variability makes it difficult to identify the model. A lot of experience, and sometimes several attempts, are needed to classify the series correctly. Identification is important not only for prediction purposes, but also as an explanatory tool for economic trends.

4.1 The Data

We focused on a limited number of ARIMA models. We chose first- and second-order stationary autoregressive schemes (i.e., AR(1) and AR(2)) and first- and second-order invertible moving average schemes (i.e., MA(1) and MA(2)). We took the first eight values of both the ACF and the PACF, because they are usually enough to recognize the time series structure. Thus, 16 values were the input to the neural network model. Since an AR(1) model with a positive parameter has a decreasing ACF, which is similar to the behaviour expected for an MA(1), the output layer distinguished between positive and negative parameters in the first-order models. Therefore, the network used 6 neural units in the output layer. Figure 5 shows the neural network structure used to identify the series.
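To make the input representation concrete, the sketch below simulates a series from a chosen ARMA scheme and assembles the 16-value input vector from its first eight sample ACF and PACF values. It assumes the statsmodels package and its lag-polynomial sign convention; note that the paper builds the learning set from theoretical ACF/PACF values, so this is only an illustration of the feature layout, and the function name is an assumption.

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import acf, pacf

def acf_pacf_features(x, nlags=8):
    """First `nlags` sample ACF and PACF values (lag 0 dropped): 16 numbers for nlags=8."""
    a = acf(x, nlags=nlags)[1:]
    p = pacf(x, nlags=nlags)[1:]
    return np.concatenate([a, p])

# Hypothetical example: an AR(1) series with parameter 0.7.
# ArmaProcess takes lag polynomials, so the AR side is written as [1, -0.7].
ar1 = ArmaProcess(ar=[1.0, -0.7], ma=[1.0])
x = ar1.generate_sample(nsample=200)
features = acf_pacf_features(x)   # candidate 16-value input for the identification network
```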
4.2 Results

The learning set for the network was formed by the theoretical values of the ACF and the PACF for the models AR(1) and MA(1), with parameters -0.9, ..., -0.1, 0.1, ..., 0.9. For second-order models, parameters were chosen such that

Figure 5. Neural network for time series identification: the ACF and PACF values feed the input layer, and the output units correspond to the model classes (AR(1) > 0, AR(1) < 0, MA(1) > 0, MA(1) < 0, ...).