H 9 informatica 1 YU ISSN 0350 5596 informatica JOURNAL OF COMPUTING AND INFORMATICS YU ISSN 0350-5596 Published by Informatika, Slovene Society for Infnrmntics, Parmeva 41, 61000 Ljubljana, Yugoslavia VOLUME 13, 1989 - No. 1 Rditorial Board Suad Alagić, Sarajevo; Damjan Bojadžiev, Ljubljana; Jozo Dujmović, Beograd; Janez Grad, Ljubljana, Bogomir Horvat, Maribor; Ljubo Pipan, Kranj; Tomo Pisanski , Ljubljana; Oliver Popov, Skopje; SaSo Prešern, Ljubljana; Viljem Rupnik, Ljubljana; Branko Souček, Zagreb Editor-in-Chief : Prof. Dr. Anton P. Železni kar F.xocutive Kditor : Or. Rudolf Murn Pviblishing Council: T. Banovec, Zavod SR Slovenije za statistiko, Vožarski pot 12, 61000 Ljubljana; .4. .Jermnn-Blaf, ič, Iskra Telematika, Kardeljeva ploSòad 24, 61000 Ljubljana; S. Klemenčič, Iskra Telematika, 64000 Kranj; S. Saksida, Institut za sociologijo Univerze Edvarda Kardelja, 61000 Ljubljana; J. Virant, Fakulteta za elektrotehniko, Tr?,aftka 20, 61000 Ljubljana. Headquarters: Informatica, Journal for Computing and Informatics, Iskra Delta Computers, Stegne 15C, 6 1000 I.jubljana, Yugoslavia Phone: (+38 61) 57 45 54. Telex: 31366 yu delta Fax: (+38 61) 32 88 87 and (+38 61) 55 32 61. Annual Subscription Rate: USS 30 for companies, and l'S$ 15 for individuals. Opinions expressed in the contributions are not necessarily shared by the Editorial Board Printed by: Tiskarna Kresija, Ljubljana CONTHMTS A.P.Zeleznikar 3 Possibilities of Parallel In foriDBtion Processing in the 19908 6 Syntactic Parsing and Plotting of Mathematical ExpresBions 11 Relational Schema Description Language N. Bogunović R. Miladinović D. VelaSeviü Z. Kri bel B. l.egac M. MaruSič A. Novak 22 Selecting the Fourth Generation Programming Tools A.P.Zeleznikar 25 Informational Logic III 43 Transputers B. Jereb I,. Pipan A. Kl o filtar V'. Rupnik J. RugelJ J. Zerovnik M. Radovan Helena Tvrdy Tatjana Kapua 48 Estimation of Information Loss in Expense and Income Transformed Production Sya terns 53 Management of Distributed Systems 58 A Probaliatic Model of Computation 67 Data Modeling: ER Language and Normal Forms 79 A Distributed Directory 88 Authors Subject Index of Infomatica 12 ( 1988) 90 "Constructive Methods in Computer Science" (A Report) informatica Časopis izdaja Slovensko društvo Informatika, 61000 Ljubljana, Parmova 41, Jugoslavija ČASOPIS ZA TEHNOLOGIJO RAČUNALNIŠTVA IN PROBLEME INFORMATIKE ČASOPIS ZA RAČUNARSKU TEHNOLOGIJU I PROBLEME INFORMATIKE SPISANIE ZA TEHNOLOGIJA NA SMETANJETO I PROBLEMI OD OBLASTA NA IN FORMATI KATA Uredniški odbor: Suad Alagić, Sarajevo; Damjan Bojadžiev, Ljubljana; Jozo Dujmović, Beograd; Janez Grad, Ljubljana, Bogomir Horvat, Maribor; Ljubo Pipan, Kranj; Tomo Pisanski, Ljubljana; Oliver Popov, Skopje; SaSo PreSern, Ljubljana; Viljem Rupnik, Ljubljana; Branko Souček, Zagreb YU ISSN 0350-5596 LETNIK 13, 1989 - ŠT. 1 Glavni in odgovorni urednik: prof. dr. Anton P. Zeleznikar Tehnični urednik: dr. Rudolf Murn Založniški svet: T. Banovec, Zavod SR Slovenije za statistiko, Vožarski pot 12, 61000 Ljubljana; A. Jerman-Blažič, Iskra Telematika, Kardeljeva ploSCad 24, 61000 Ljubljana; B. Klemendič, Iskra Telematika, 64000 Kranj; S. Saksida, Institut za sociologijo Univerze Edvarda Kardelja, 61000 Ljubljana; J. Virant', Fakulteta za elektrotehniko. Tržaška 25, 61000 Ljubljana. Uredništvo in uprava: Časopis Informatika, Iskra Delta, Stegne 150, 61000 Ljubljana, telefon (061) 574 554; teleks 31366 YU Delta; fax (061) 328 887 in (061) 553 261. Letna naroCnina za delovne organizacije znaSa 48000 din, za zasebne naroCnike 12000 din, za Studente 4000 din; posamezna Številka 16000 din Številka žiro računa: 50101-678-51841 Pri financiranju časopisa sodeluje Raziskovalna skupnost Slovenije Na podlagi mnenja Republiškega komiteja za informiranje št. 23-85, z dne 29. 1. 1986, je časopis oproščen temeljnega davka od prometa proi zvodov. Tisk: Tiskarna Kresija, Ljubljana V S B B I N A A.P.Zeleznikar 3 Possibilities of Parallel Information Processing in the 1990s 6 Syntactic Parsing and Plotting of Mathemtioal Kxpressiona 11 Relational Scheaa Description Language N. BosunoviÓ R. MiladinoviC D. VelaSevió Z. Kri be J B. Legac M. Maruš iÖ A. Novak 22 Izbor prograaskega orodja Četrte generacije A.P.Zeleznikar 2S Inforaatlonal Logic III 43 Transputerji B. Jereb L. Pipan A. Klofutar V. Rvpnik J. SugelJ J. Zerovnik M. Radovan Helena Tvrdy Tatjana Kapus 48 Ocena infornaciJske škode stroškovno in dohodkovno transformiranih proizvodnih sistemov 53 Upravljanje porazdeljenih sistemov 58 Verjetnostni model računanja 67 Modeliranje podataka: ER Jezik i normalne forme 79 Porazdeljeni elektronski imenik 88 Avtorsko stvarno kazalo časopisa Informatica 12 (1988) 90 "Constructive Methods in Computer Science" (poročilo) POSSIBILITIES OF PARALLEL INFORMATION PROCESING IN THE 1990s* Descriptors: PROCESSING PARALLEL, PARSYS PROJECT DEVELOPMENT PERSPECTIVE Anton P. Železnikar Iskra Delta This essay presents a short overview of a research project in which informational logic is developed and investigated in a general and the informational parallelism in a particular Banner. This investigation opens a new outlook on possibilities of parallel information processing in architectural as well as in operational philosophy and logic. In this project information is understood as an extremely parallel dynamic phenomenon, which arises in the realm of informational. Primarily the essay gives comments on the following topics: parallelism as information, informational machine, informational program, and some aspects of informational logic, its formalism, and axiomatization. 1. Introduction The carrying out of the Parsys Project of Iskra Delta Computers has brought to light and accentuated also several questions concerning possible computer scene in the next decade. The criticism of distinguished philosophers and computer scientists in the field of artificial intelligence has been understood as a substantial change of the optimistic research initiative and as the arising of a new philosophy considering the possibilities of information processing in the future. In the leading professional journals relating computer science, new generation computing, parallel processing and the traditional field of artificial intelligence, a new philosophical and technological paradigxo is coming into existence. Roughly, this paradigm tells that, for instance, the possibilities of numerous new logics of knowledge, belief, awareness, reasoning, epi stemo1ogy, cognition, information, etc., will certainly influence the' structure and organization of intelligent machines. Simultaneously, the appearance of new technologies exploring the concepts of biological, neural, and solid state nets will enable the implementation of extremely complexly structured and organized parallelism of information. » This essay was read at the 1st Yugoslav-Italian Meeting on Parallel Processing, held at the International Centre for Applied Sciences, in Gradisca d'Isonzo (Italy), on September 22, 1988. In this essay, our attention within the mentioned paradigm will be focused on the possibilities of parallel information processing. In this context, the necessity to redefine the notion of information is becoming essential. The common sense of the contemporary information era tells us that the notion of information is used in its narrowed meaning prevailing in the previous century, when information processing, for instance, in living organisms, was mostly a product of metaphysical imagination. Thus, together with the new philosophical paradigm in the field of artificial intelligence also the new paradigm of information has to be considered [IJ. 2. What is Parallelism as Information? Nowadays we imagine information as an extremely dynamic phenomenon appearing in the cosmic, living, and artificial realm. By its belief, awareness, and commonsense reasoning, a human being can look into its own information processing by se 1f-observati on, self-investigation, self-cognition, and self-comprehension. Through such a looking into its own self, a being understands information in a much more complex way than it can be comprehended from the positions and theories of contemporary information science. For instance, the measuring of information within mathematically founded information' theory does not concern the meaning of information or how this meaning or understanding of the meaning is coming into existence in living systems. What is information aa information? Information informs itself and other information and is informed by Itself and by other information. This principle says that information informs actively and is informed passively. Information performs as subject and aa object. Information can be viewed as operator and as operand. Information arises in its active and passive informin«. In general, informing of information is a synonym for the so-called informational arising. The adverb informational is introduced to represent the described nature of information. In this context we can speak about informational form and informational process, of informational structure and informational organization, and certainly of informational apoutanelty, circularity, recurrence, parallelism, sequentialness, serialness, etc., which are all informational notions and perform as Information. Informational arising means circularly spontaneous coming of information into existence, but also changing, vanishing and disappearing of information. Informational arising has to be understood as a creative component of information and its informing. Information which comes to existence is called counter-information and it has to be Informationally embedded into existing information, otherwise it vanishes as Informational noise. The "informational1y creative" has the meaning of spontaneous, unforeseeable, autopoietic, self-structuring, and self-organizing informing of information. It becomes evident that the so-called cultural forms, for instance, philosophy, art, science, technology, ethics, etc., to mention only a few of them, can be brought into the informational framework of understanding. It seems that informational parallelism is one of the most complex notions the mankind was ever capable to think, comprehend, and Implement. Informational parallelism is a structure and organization of informationally interwoven informational forms and Informational processes, which inform each other in a spontaneous, circular, recurrent, unforeseeable way. In this interweaving of informational forms and processes, oomplex information is coming into existence. informational parallelism. Thus a philosophical background of informational parallelism has to be constructed and investigated, for it cannot be discovered sufficiently in detail 121, 3. Parallel Informational Machine The principle of parallel informational machine can be described [1] in the following sense: The parallel informational machine performs as information. Informational parallelism is inherent to informational machine. This means that its structure of forms, components, constituents, and architecture is informational, for instance, in its physical, biological, or, generally, technological constitution. If informational, the architecture of informational machine has the property of parallel informational arising, for Instance, in varying of the machine connectedness, coming of structural components into existence, growing and vanishing of architecture, etc. In the same way the organization of informational machine is performing informational1y in functional flexibility, controllability, programmability under changing inward and outward conditions. Architecture of an informational machine is dynamic (brain-like, for instance), is dynamically controlled, interchangeable, interweaving, arising, and depends upon the machine's environments. The parallelism of an informational machine ia a regular property. Parallelism of information is the most general form of informational interweaving. Information is always interNOven, which is only a synonym for informational parallelism. Interweaving corresponds to Informational net structure and organization. Evidently, a particular strategy is needed for exploring informational nets in an optimal or economically complex parallel way. Parallel informational machine is only the synonym for various informational nets, irrespective of their specific natures, which can differ a great deal by their structure and organization. 4. Parallel Informational Program Currently, informational parallelism belongs to the most concealed and unrevealed phenomena and calls for philosophical and technological illumination. Parallel informational processes open the most complex spatial and temporal interweaving, dependence, and arising, as have ever been imaginable. Although, human mind in its basic structure and organization operates in a parallel-serial manner, on the higher cortical levels, in its global informational organization, for instance, in functions of belief, awareness, reasoning, world model, Intention, and understanding, it appears, behaves, and experiences primarily as a serial or sequential apparatus. Since the basic parallel-serial informational structure and organization of the mind are not directly, reflected in human awareness and conscious reasoning, the conscious part of human mind does not dispose of the required fundamental experience for an adequate conceptualization of The principle of parallel informational program can be described (11 in the following way: An informational program is simply information which spontaneously informs, embeds, arises, and counter-informs in an informationally circular manner, within an informational machine. An informational program informs, which means generates, changes itself and other informational programs during their execution and is influenced in such an informational way by other informational programs. It is used and embedded into an informational machine for production of information. This information can be, for instance, intelligence, specialized creativity, expertise, problem solving, dedicated informational functions, etc. Evidently, there is an essential difference between ,a computer program and an Informational program. The former is algorithmic. mathematical, procedural and informationally static, whereas the latter is informational. Intelligent and informationally dynamic. As a rule, a computer program has a stable, nonvariable program structure and program organization. Its definition (declaration) cannot be changed dynamically during its execution, by the parallel execution of itself and other programs, data, etc. An informational program performs as information. In this respect, such a program is also an informational object '(operand), which can be informationally changed during its execution. A typical computer program is always performing as a subject, by which non-program objects can be changed and can arise as results of its performing. In principle, the request to informatize a program concerns essentially different programming tools from those that are j.n use today. in parallel, possible informing in parallel, necessary informing in parallel, non-informing in parallel, informational carrying off, informational bringing, blocking to carry off, blocking to bring, sending and informing, receiving and informing, non-sending and non-informing, non-receiving and non-informing, causing of informational appearance, non-causing of informational appearance, coming into existence, non-coming into existence, causing informational end, coming to end, non-causing informational end, non-coming to end, choosing among informational alternatives, choosing and informing, being informationally impressed or memorized, recalling, disintegrating and informing, informing similarly, interrupting, breaking down, enriching, cyclical parallel informing, etc. 6. Conclusion 5. Parallel Informational Logic At its beginning, parallel to.the Parays Project, the study of informational logic was initiated, with the goal to construct adequate tools concerning informationally parallel machines and programs in particular. This study of informational logic has to be understood as a free-lance undertaking within IDC, which embraces the philosophy of the redefined notion of information and the so-called axiomatization of informational logic. The logic, which in its main part deals with informational parallelism, is being developed up to the form of an informational axiomatic system. In this regard, the system is not a usual axiomatic approach of pure mathematical doctrines, but above all, incorporates the so-called informational principles. This logical theory represents a generative axiomatic system the intention of which is to preserve semantically the arising nature of the redefined notion of information. The new formalism introduces several symbols for operators with the already known, but also a new semantics. Two types of variables appear: the operand and the operational ones. Operational variables can be particularized and universalized according to the needs, goals, and applications of a case. In operands, also the so-called functional operators can appear. In these cases operands perform operationally in an implicit manner. The most general operational variable has the meaning of informing. This operator, called informational metaoperator , can be particularized and univesalized, i.e., substituted in a logical expression in a non-uniform way. The axiomatic system of informational logic is constituted by basic informational operand and operator variables and informational constants of both types, of formation rules defining the syntax of informationally well-formed formulas, of informational axioms, and of the so-called transf orjnation rules for transformation of axioms and formulas. The theory of parallel processing within informational logic can be the basis for conception, design, and application of future parallel computing systems. In this context, the development of informational philosophy and to it adequate axiomatization and formalization of informational concepts are becoming necessity of the future. This could be the way to really intelligent systems which would cope with living minds and living systems. References 1. Zeleznikar, a. p., Principles of Information, Cybernetica 31 (1988) 99-122. 2. Zeleznikar, a. p., Informational Logic I, Informatica 12 (1988), No. 3, 26-38. 3. Zeleznikar, a. p., Informational Logic II, Informatica 12 (1988), No. 4, 3-20. The axiomatic basis of informational logic is still in the phase of development and formal construction 12, 3]. Operators of general parallel informing are, for instance, informing SYNTACTIC PARSING AND PLOTTING OF MATHEMATICAL EXPRESSIONS Descriptors: SYNTAX, ANALYSIS, TEXT NATURAL, SOFTWARE, TREE GRAMMAR, MATHEMATICAL ANALYSIS, LANGUAGE ANALYSIS Nikola Bogunović, Institut R. Bošković Zagreb The paper presents the parsing problem of a simple context free language. The "language" sentences are mathematical expressions with one variable. A computer program parses the expression according to the developed context free grammar rules. Upon building a parse tree, the program evaluates the expression over a given range of variable values, and plots the result on the screen. Even though the program is developed on a PC AT personal computer, it is highly portable since the C programming language is used, and graphics hardware dependent routines are removed in a separate module. SINTAKTIČKA ANALIZA I GRAFIČKI PRIKAZ MATEMATIČKIH IZRAZA U radu je predstavljen problem analize reeenitnog sloga u kontekstno slobodnom jeziku. "Jeziinu reòenicu" predstavlja matematički izraz s jednom varijablom. Računarski program razlaže izraz u skladu s razvijenim kontekstno slobodnim gramatiikim pravilima. Program gradi stablo sastavnih dijelova matematičkog izraza, nalazi vrijednosti izraza za dati niz vrijednosti varijable, i grafički prikazuje rezultat. Iako je program razvijen na računalu PC AT, jednostavno je prenosiv na druga računala, jer je pisan u viSem Jeziku (C), a grafičke, sklopovski ovisne rutine premještene su u odvojen programski modul. INTRODUCTION Language parsing has traditionally been one of the most intriguing research areas of artificial intelligence. The problem here is to take the information provided from the outside world and translate it into a precise internal representation. By the internal representation we mean a semantic representation, a common data structure produced or operated on by various program modules. Even though, a common data structure of internal representation may have different forms, it is assumed that the translations from one to another is easy, and all forms are variants of the same abstract representation. The application of language parsing, in the context of this paper, is directed to engineering problems, e.g. intelligent industrial process control, rather than attempt to solve an over aspiring and not very well defined problem like automatic language translation. It is more sensible to work on the internal representation generation, since this is an intermediate point between words and actions . The problem of language parsing can be divided into three areas, with the apparent ambiguity at each level [1]: 1. acoustic-phonetic: time domain and freguency domain analysis of the Incoming sound, and translation the input into words. 2. morphological-syntactic: taking words and establishing the syntactic form of the utterance. 3. semantic-pragmatic: finding out the meaning from the syntactically analyzed utterance. In this paper we will concentrate on the problems of second and third level only. Our goal is to develop an internal representation from the correctly received input stream of information, looking at the major data structures the computer program uses. The problems at first level, and partially at second, can be bound loosely to speech recognition, with major research advances and results given in [2] and [3]. A computer program that translates from any natural language to internal representation, must in the first step syntactically analyze, or parse, the sentence. In the process, one needs to know the rules of syntax for the language, specifying the legal syntactic structures for a sentence. A parsed sentence is usually presented in a tree structure known as parse tree or phrase marker. Specifying the structure of a sentence in a forma 1 knowledge structure, by a series of production rules i.e. grammar, could be far too complex for any natural I'anguage [4]. The grammar must be, in that case, context-free. Indicating how to replace a nonterminal node of the parse tree with lower constituents, without any reference to the context in which the nonterminal node finds itself [5]. Since the question, whether a natural language is context-free or not, is yet to be answered and a context-free grammar for such a language devised, in this paper we will concentrate on a much easier task of parsing the arithmetic expressions, which are inherently controlled by a context-free grammar. Nevertheless, the presented parser program applies the very same methodology of natural language parsing, and develops a parse tree of an arithmetic, expression of considerable complexity. grammar_rule --> g r am.ma r _he ad [-->] grammar_body grammar_head --> nonterm ina1_node grammar-head --> nontermina1_node termina 1_node grammar_body --> grammar_body grammar_body gramroar_body --> grammar_item grammar_item --> nontermina1_node g'r amma r_ i tem --> t e r m i na 1 _node The application of the context-free grammar to a s 1mp1 e sentence like "The parser makes a tree." is exceedingly clear from the following parse tree: / np / \ d n vp / \ np / \ d n The parser makes a tree A rough outline of context-free parsing can also be found in [6]. This paper extends the presented idea with different and more efficient algorithms and data structures available in C language, introduces complex expressions based on a family of new functions, evaluates the function of one variable and develops a graphics interface for the presentation of the evaluated function over a given range of values on the terminal of a PC XT/AT or PS/2 personal computer. The computer' program was written in Microsoft C, and linked with an assembly language program to run industry standard monochrome or colour graphics routines. THE SYNTAX O.F A LANGUAGE A context-free language is one whose syntax can be specified by a set of rules, usually called productions, or simply grammar. A context- free derivation of a sentence always start with a nonterminal node of the parser tree, or a nonterminal constituent, i.e. with a node that appears only in the interior, of the tree structure, and not in the final sentence. Each nonterminal node is then replaced by the right hand side of the rule, until we have only terminal nodes, or terminal constituents, i.e. nodes that appear in the final sentence. A very simple context-free might have the following rules: grammar This simple example creates more questions than it really indicates the solution of the problem. If is extremely difficult to express every rule of grammar with context-free rules. The problem of number agreement ( s ingular-p1ura 1 ) , morphology (differences between go, goes, going), ref1 ex1 v i zat i on (reflexive pronouns). Imperatives, passive case, etc. are just the few most common. One can add weakly context-free structure in the variation of the above example with the sentence like "It does." Programming languages, on the other hand, are context-free, and principal compiler task for such a language is to parse it. In that case we may even talk about the efficiency of parsing, optimal parsing, and so forth. The expansion of RISC computer architecture, as a contemporary industry trend, causes the compiler construction and language parsing problem to be of outmost importance. Mathematical expressions are the most simple case of finite character strings whose syntax can be captured in a context-free grammar. Since our primary goal in this paper is to show the principles of a working parser, we have constrained the presented application to a "language" that can be described with only a few production rules. Let us assume a mathematical equation with one variable, floating point constants, four arithmetic (binary), operators, and five unary operators, with left to right evaluation in the traditional fashion. An example of such an expression is: sentence(s) --> noun-phrase(np ) verb-phrase(vp) verb-phrase(vp) --> verb(v) noun-phrase(np) verb-phrase(vp ) --> verb(v) noun-phrase(np) --> determi ner(d ) noun(n) noun-phrase(np ) --> proper-noun(prn ) f(x)=2.5+3.4-4.5Ein(cos(5.8x+2.2)) ( 1 ) We have decided upon these basic set of operations with the following order of precedence: The last two rule pairs may be combined with the logical operator OR (1): vp. np V np d n V prn The syntax of grammar rules can be also described by the recursive set of grammar rules themselves: sin(X ) cos(X ) log(x) exp(x ) « \ ■ + unary minus sine function cosine function natural log exponent 1 a 1 multiplication division addition subtraction The syntax of the expression (1) can be captured in the recursive set of context-free grammar rules. We may use the notation introduced at the beginning of this paper or instead, we may use the familiar and traditional Backus-Naur form, from the computer science literature: :;= | + | - ::= | * | \ ::= | | - I sin | cos 1 log | exp | (expr) It is evident that the functions sin, cos, log, and exp are implemented as unary functions, like unary minus. Implied multiplication, used in the input expression, is later changed to explicit (»). term factor nuinßer 2.5 term t actor number 3.1 expr term factor number 4.5 term (actor expr term factor The application of rules to the equation (1), Fig.l. Parsing starts with according to the first rule have three forms: a , -. Since it +, further appi rules to part yields which is a floating the-se production is presented in the , which of our grammar may + or a is obviously a ication of grammar , then a point constant. term ♦ expr term factor • term f actor number factor number 5.B variable 2.2 Parsing the other part needs a recursive application of the same rules. Since it is an , we apply the first rule again, which yields -. The part is a , a and a constant. The part is a , which is a *. The process proceeds until the terminus of all branches is reached, yielding a or a . Fig.l The parse tree of equation (1! Once a parse tree is constructed, the expression may be evaluated starting at the top of the tree by recursively evaluating left and right branches, and then performing addition or subtraction at the top. DATA STRUCTURES AND ALGORITHMS The principal elements of a parse tree are nodes. Looking at Fig.l, we may deduce that there are four kinds of nodes. A node is either a number (a floating point constant), a variable (x), a unary operator node (-,sin,cos,log,exp ) , or a binary operator node (+,-,*,/). In our case the root node is a binary operator (+) with left and right operands, i.e. +, according to the first grammar rule. Left operand will be parsed into --> --> 2.5. Right operand will be parsed recursively starting with the first grammar rule again. All four kinds of nodes can be captured in a structure, in C language sense (7), with the following components: struct node { int tag; char operator; float number; struct node *left_operand; struct node *rlght_operand; } TREENODE, »TREEPTR; Integer tag identifies a node as a number (tag=0), a variable (tag=l), a binary operator (tag=2), or a unary operator (tag=3). Character operator identifies operator as +, -, *, /, sin, cos, log, or exp. If the node is a number, the floating point value Is held in the "number" structure member. If the node is a binary operator, pointers left_operand and right_operand point to the left and right "child" nodes (structures). If the node is a unary operator, only left_operand pointer is used. It is absolutely important to note that the root node structure embeds the whole parse tree, because left and right operands, as the structure members, are pointers to the structures of the same type as the root node itself. This recursive declaration of a node ia correct, as given in [7]. Typedef TREENODE and TREEPTR, define a node type structure and a pointer to such a node type structure. At the beginning o string, which correspond tion, is subject to the p tion. The string is conv and all surplus spaces are cos, log, and exp function unary operators, they are unique character operators implicit multiplication pllcit. After the prepr equation (1) would fill an that would look like: f the program, the s to the input equa-reprocessxng opeta-erted to lower case, removed. Since sin, s are implemented as stripped to a single (s,c,l,e). Finally, is changed to ex-ocesslng phase, our array of characters 2.5+3.4-4,5*s(c(5.8*x+2.2) (2) In the next step a parse tree is constructed. Any expression, if not constrained with parenthesis to a subexpression, must start with a term, which must start with a factor (number,vari ab1e or unary operator applied to ). In the process of building a parse tree, we actually make the instances of node structures defined earlier. As already stated, the root node contain pointers to the neighbouring structures, and in essence embeds the whole tree. There is an initial function expr(), which calls the function term(), which finally calls function factor(). These functions mirror our grammar rules. The function factor() analyses the beginning of the string. In our case it will find a number 2.5 (a constant), and it will make a number node and return a pointer to its caller. The callee, function term(), will further analyze the string to find if there is a multiplication (•) or a division (/) sign according to the second grammar rule. If not, which happens in our case, term() will return a pointer of a number node to expr(). The expr() function will analyze the string further and find an addition sign, hence the root node is a binary operator node with left operator already estab11 shed (previous1y found number node). The right . node will be found by a recursive call to expr() function again. To Illustrate the creation of the number node structure, we give the function numbernode(), which is called by the factor() after it has extracted the number and its value from the beginning of the string. f lo«t ^v^l (n.jil TRCCI'TIi n; /• Ipl -.r fl lo-.r.',-- L.'i tti' root slrucvjr- flodt X; /■ 2'iifl parom»;tTr is a vei'isLl.- vnlu-r '' < float opl.op2; awitch (n-'laa) ( case 0: /* ll 13 a nuriit-^r no'lf '/ r^turn(r.->rniffihert : br^ak. cap? 1; /• It 1? a variable nodr */ rpt urn f X I ; braaV. : CO?' 7.\ /• It 15 a binaiy operator nod* opl - ava 1 ( n->I t _op^rantl. xl ; op.'^ - »VA 1 ( n->r ! ght_opffrand , X t : sul teh(n->operator) ( case "*■": return(opl+op2) ; break: case return{opl-op2); break: case returnlopl*op2) : break; case "/": returnlopl/op2) ; break: J case 3: /* it is a unary operator node *f switchtn->operator) i case •-• : return(-eva1{n->)eft_operand.x)t ; break: case 's ■ : returnfsi n(eval(n->1 eft_oper8nd.y1)): break ; case 'c■; return( cos leva 1 {n->left_oper8nd . xM 1 : break: case ■e■: return(exp(eva1(n->1eft_operand.x))I ; break; case ■1': return(lofl(«val(n->left operand.x))): break ; ) I ) List 1. Evaluation of the expression. ♦ define new( ) (TREPTR) cal loc( 1, s 1zeof(TREENODE) ) /* This Is a global creation of the space which will hold a node, and return a pointer to that space. */ »define NULL 0 TREEPTR numbernode(value) /• take the number value and return a pointer to a structure */ float value; /* the type of passed parameter ♦/ { TREEPTR n; /' declaration of the pointer */ n = new( ) ; /* creation of an empty struct. ' / n->tag = 0; /♦ it is a number node */ n->number = value; /• fill in the value */ n->left_operand = NULL; /• numbernode has no neighbours •/ n->r1ght_operand = NULL; r e t u r n ( n ) ; /* returning a pointer ■* / ! Since this paper describes the equation parser and plotter, we have included in List 1, an evaluation function which, for a given variable value, will traverse through the parse tree in a recursive search fashion, finding the value of the whole expression. The function eval(root_node_polnter,X ) will test the tag of the root node and act accordingly. If the node is a unary or binary operator node, eval() will call itself with new pointers. The ■ presented function eval() is only a basic skeleton of the implemented function, because one has to take great care how binary and unary functions are defined (no negativ«? values for log, divide with zero, etc.), ar.ij whether a parenthesis is encountered indicating a subexpression. Finally, after obtaining domain and value points of the equation, we can display it graphically. The graphics functions greatly depend on the used hardware and can not be given generally. However, since industry standards like personal computers PC XT/AT and PS/2 are readily available, we will show the principles of implementation some sir.pie graphics procedures for these computers. Even within the PC and PS family of computers, graphics standards vary from 320x200 pixels to an impressive 1024x768 pixels (with additional advanced display adapter). In this paper we have chosen to show Hercules monochrome graphics implementation, in belief to represent a popular, yet fully acceptable med i urn resolution (720x348) graphics standard. Hercules graphics functions library is a set of memory resident routines, set up by INT10.COM, a program supplied and copywritpd by the vendor [8]. We have chosen to implement graphics routines in the assembly language and link them with the main C program to achieve maximum portability. The assembly language program treats Hercules graphics functions as the extension of the standard display control software interrupt procedures (int 10h). All parameters are simply loaded in registers, with the function code in AH register, prior to int 10h.call. Unlike the original set c.f functions within int 10h group, only segment registers are preserved, along with all registers used to pass parameters. An example of assembly language function, which moves the cursor to the x,y position (move(x,y)), is given below. The caller from the C program will leave x and y coordinates, as parameters, on the stack. It was assumed that C program will run on a PC XT/AT in the small model ("Microsoft" restriction to 54K byte), hence near procedure type. _text segment byte public 'code' assume cs : _text ; definitions as required by Microsoft C public _move _move proc near push bp mo V bp,sp push di , mov di , [bp + 4] ; get X from stack mov bp,[bp + 6 ] ; get y from stack mov ah, 48h ; it is function "move" Int 10h ; call function pop dl pop bp ret _move endp _text ends end ».i-i-i : I I ; \ ; ' / \ / 1 , \ ; Fig.2 The graph of equation (1) REFERENCES : CONCLUSIONS AND REMARKS We have studied string parsing techniques, applied to the simple mathematical equations. The syntax of these strings can be described by an elementary context free grammar. Nevertheless, the same principles apply to a broad range of languages described by context free grammars. To illustrate the parsing, evaluating and plotting operations of the working program, we have presented the graph of the equation (1) in Fig.2. The function is plotted over a domain range (-4,+4). The scale of ordinate values is appropriately chosen to display points between -6 and +15. The program prompts for the scale before it plots the function. By changing the coordinate scale one can easily zoom, scroll and pan the graph, sustaining the same resolution. It is worth noting that the program embeds implicit precedence rules (multiplication and division before addition and subtraction), and enforced precedence by parentheses, according to the given grammar rules . 1. E.Charniak, D.HcDermott, Introduction to Artificial Intelligence, Add 1 son-Wes1 ey, Reading, Mass., 1985. 2. J.L.Flanagan, Speech Analysis, Synthesis, and Perceptions, Springer Ferlag, New York, 1972. 3. S.E.Levinson, L.R.Rabiner, M.M.Sondhi, An Introduction to the Application of the Theory of Probabilistic Functions of a Markov Process to Automatic Speech Recognition, Bell Syst. Tech. J., Vol. 62, No. 4, 1982. 4. N.Chomsky, Syntactic Structures, Mouton, The Hague, 1957. B. A.V.Aho, J.D.Ullman, The Theory of Parsing, Translation and Compiling, Prent1 ce-Ha 11 , Englewood Cliffs, N.J., 1972. 6. J.Amsterdam, Context-free parsing of Arithmetic Expressions, Byte, Vol. 10, No. 8, August 1985. 7. B.W.Kernighan, D.M.Ritchie, The C Programming Language, Prentice-Hall, Englewood Cliffs, N.J., 1978. GRAPHX VI.1 Manual, Hercules Computer Technologies, 2550 Ninth Street, Berkeley, CA 94710, USA. RELATIONAL SCHEMA DESCRIPTION LANGUAGE INFORMATICA 1/89 Descriptors: RERLATION BASES, PROGRAMMING LANGUAGE, SCHEMA DEFINITION, INTERPRETERS, SDL LANGUAGE Radojko Miladinović Dušan Velasević ABSTRACT: In this paper a schema description language for relational databases is described. The language provides a schema description on which any query language can be defined. The implemented multiuser incremental interpreter for that language is also described. SADRŽAJ: U ovom danku opisan je jezik za definisanje seme u relacionim bazama podataka. Taj jezik omogućava definisanje seme na kojoj bilo koji upitni jezik moze biti definisan. Za taj jezik realizovan je višekorisnički inkrementalni interpreter. INTRODUCTION Since 1970., when E.F. Codd had defined the relational data model f4,5,6,7], a lot of relational database management systems (RDBMS) was developed and implemented. All these systems can be classified into two groups according to the way of the schema definition. The systems from the first group, for example Relational Database Management System [9,10], have a stand alone schema definition language. The systems from the second group do not provide such a language: the schema definition is realized as a function of data sublanguage, i.e. query language. SYSTEM R [1,3,11] belongs to this group of RDBMS. The relation between schema description language (SDL) and other languages in RDBMS (query language, data manipulation language -DML, subschema description language and physical database description language) represents a special problem in the database design. All these languages can be implemented as stand alone languages or as extensions of standard programming languages. The majority of RDBMS does not have the independent SDL, query language and DML; in fact, the data definition, data manipulation and query facility are realized as the functions of the special language called the data sublanguage. The data sublanguage can be implemented as a stand alone programming language or as an extension of the host language. If it is implemented as a stand alone language, it is usually called the query language. Although the most of modern RDBMS have not a separate SDL, there exists a need for such a language which should be general, simple, structured and user-friendly. This language should provide the means for a schema description over which any query language can be defined. A complete functional independence of the schema description process from the data manipulation and queries is achieved in this way. Bearing this in mind, we developed a new SDL and multiuser incremental interactive interpreter for that language. The language design was influenced by the general principles applied to the other programming languages. We especially emphasized the language reliability, precise syntax and semantics description of the language, orthogonallity and language independence. The SDL structure was designed bearing in mind that the language should be interactive. For this reason, the commands for direct communication with the users are defined, each statement must be written in one line and the language is structured to provide better readability and documentabi1 ity of SDL programs. BASIC LANGUAGE ELEMENTS The following notation is description of SDL elements; used in the the { ) - Braces indicate that one of elements enclosed must be specified. ( 1 - Square brackets indicate that one of the elements enclosed may be optionally specified. ... - Ellipses indicate that the immediately preceding part of the format may be repeated. - Upper case words are SDL reserved words. - Lower case words indicate the information that should be supplied by the user. Language alphabet The complete SDL character set consists of 52 characters. All SDL characters are presented in table 1. Table 1. character A,B.....X,Y,Z 0123456789 + t / < > uppercase letters decimal digits plus minus asterisk slash less than greater than equals left parenthesis right parenthesis poi nt comma colon quotation marks dollar sign hyphen space Identifiers A SDL identifier (or word) is a character string that forms a user defined or a reserved word of not more than 20 characters. It can be any combination of letters, digits and hyphen sign, but hyphen cannot be the first or the last character of the identifier. A reserved word has specific meaning and may be used only in the manner presented in the statement format. Reserved words are chosen carefully so they enable writing of reliable and synoptic programs. The list of the reserved words is given in APPENDIX A. The user defined words are SDL words that must be supplied by the user to satisfy the format of the statements. The user defined words are variable names, constants, function names, procedures and comments. Constants and variables The classification of constants and variables according to their format and type is given in Fig 1. Variables in SDL are relations and attributes. Attributes are one-dimensional arrays with elements of the same type. Relations are considered as two-dimensional arrays in which all elements in the same column are of the same type; elements in the same row are not obligatory of the same type. The attribute type is explicitly defined by a particular statement in the SDL, and the relation type is implicitly defined through attributes which constitute that relation. The external representation of constants is very simple. It corresponds to the syntax representation of integer and real numbers. For example: 1000, -3.5 0.75E10, 0.32DU, -34E10 The internal representation of constants depends on the context in which constants appear. For example, if variable A.B in the relational expression A.B > 10000 is declared as a real variable of the extended precision (binary floating point format), then the constant 1000 is presented in the same format. Alphanumeric strings (literals) are externally represented as a character strings delimited by the quotation marks. The internal representation of literals is completely the same if the data compression is not applied. Arithmetic expressions Arithmetic expressions are used to compute new attribute values during updating. Arithmetic expressions are formed with arithmetic operands and arithmetic operators. An arithmetic operand may be a numeric constant and a variable (attribute). Arithmetic operators specify a computation to be performed using the values of arithmetic operands; they produce a numeric value as a result. The operators are: addition(+l, subtraction (-) , mul tiplication ( * ) anri diviaion(/). Arithmetic expressions are evaluated in an order determined by a precedence associated with each operator. The precedence of the operators is: first (» and /) and second (+ and -). Parentheses can be used to override the normal evaluation order. The attributes which appear in arithmetic expressions must be of type real or integer. The attribute names in arithmetic expressions have the format: OLD- [( > ] relation_name.] attribute_name NEW . The relation name must be specified in front of attribute name if an attribute appears in more than one relation. If an attribute belongs to only one relation, then the relation name is optional. The reserved words OLD and NEW can appear in front of attribute name. These words denote attribute values before and after updating. If they are omitted the immediate attribute values are considered. The examples of arithmetic expressions: ORDER.QUANTITY - PRODUCT.QUANTITY OLD EMPLOYEE.SALARY - 1000 Relational expressions A relational expression may be a simple relational expression, or may be a combination of simple relational expressions, functions and logical operators. A simple relational expression consists of two operands separated by a comparison operator. The comparison operators are: less than(<), greater than(>), less than or equal to(<=), greater than or equal to(=>>, equal to(=> and not equal to(<>). An operand may be a constant and a variable (attribute) of any type. An attribute name has the same format as in an arithmetic expression. The logical operators are AND and OR. The precedence of the logical operators is: first AND and second OR. In a relational expression, the simple relational expressions are evaluated first to obtain their values. Evaluation of relational expressions is performed according to an order of precedence assigned to logical operators. Logical operators of equal rank are evaluated from left to right. Parentheses may be used to alter the normal sequence of evaluation, just DATA TYPES INTEGER STANDARD EXTENDED PRECISION PRECISION REAL FLOATING POINT FIXED POINT CHARACTER COMPRESSED NONCOMPRESSED STANDARD EXTENDED PRECISION PRECISION Fig 1. Data typesdefined in SDL as in arithmetic expressions. The examples of relational expressions: EMPLOYEE.SALARY > 2000 AND EMPLOYEE.SEX = "M" OLD EMPLOYEE.SALARY < NEW EMPLOYEE.SALARY Functions The functions in SDL are used to define relational expressions that appear in more than one statements. They are defined through FUNCTION statements. The functions also enable the decomposition of long relational expre'Ssions, which cannot be written in one line, into several logical parts. Each logical part must represent the complete relational expressions defined by a separate FUNCTION statement. In that way, the long statements containing relational expressions can be spread over several lines. This must be done because a SDL statement can be written in only one line. For example: {EMPLOYEE.JOB="PILOT" OR EMPLOYEE.JOB="DRIVER") AND EMPLOYEE.SEX="M" AND EMPLOYEE.AGE>60 This relational expression cannot be written in one line, so function FUNI is defined as follows : FUN1:=EMPL0YEE.J0B="PIL0T" OR EMPLOYEE.J0B="DRIVER" The relational expression becomes now: FUNI AND EMPLOYEE.SBX="M" AND EMPLOYEE.AGE>60 The names of other functions can appear in the function definition. LANGUAGE DESCRIPTION Programs written in SDL have specific structure due to thè fact that SDL is used only for the relational schema description. The program structure is determined by the structure of the relational database description. Each SDL program contains four type of entries: schema entry, domain entry, attribute entry and relation entry. The entries appear in the program in cited order. Each entry consists of a sequence of statements. Some statements can appear in different entries. The entry specification contains the following parts: 1. Narrative description of entry function 2. General format in which all statements of the entry are presented 3. Description of each statement The statement specification consists of the following parts: a. Narrative description of statement function b. General format that defines the statement parts and their functions c. Language rules that explain the usage of the statement Line format A SDL statement must be written in one line, i.e. the continuation of the statement in a few succeeding lines is not allowed. If any statement can't be written in one line, that statement is divided in several logical parts which should be defined as functions. One or more spaces can appear at the beginning of each line in other to achieve a better program structure and readability. All lexical entities in a statement may be separated by one or more spaces. Data base example The application of each statement described is illustrated by the examples which are composed of the following relations: PRODUCT(CODE, NAME, PRICE, CJUANTITY) DOMCODE DOMNAME DOMPRICE DOMQUANTITY EMPLOYEE(CODE, NAME, SEX, SALARY) DOMCODE DOMNAME DOMSEX DOMSALARY SUPPLY(CODEPRO, CODESUP, DATE, QUANTITY I DOMCODE DOMCODE DOMDATE DOMQUANTITY DEI'T(C0DE, NAME, ADDRESS, EMPNOI DOMCODE DOMNAME DOMADDRESS DOMNUMBER As we can see, an attribute can appear in more different relations. For each attribute, domain on which the attribute is defined is also represented. SCHEMA ENTRY 1. The schema entry uniquely identifies schema by its name. Besides, the textual description of the schema contents and cripto protection method applied, if any, is given in the schema entry. 2. Entry format SCHEMA statement [DESC statement] [CRIPTO_PROTECTION statement] (domain entry) (attribute entry) (relation entry) ENDSCHEMA statement 3. Statements description The schema entry begins with SCHEMA statement and ends with ENDSCHEMA statement. The statements that define the schema are located between the SCHEMA statement and first domain entry in the schema. The order of these statements is irrelevant. In the schema entry, the entries of all domains are defined first, then the entries of all attributes, and finally the relation entries. SCHEMA statement a) The SCHEMA statement uniquely identifies schema by its name. b) Format SCHEMA achema_name c) Language rules - Schema name must be unique in the database. DESC statement a) The DESC statement is used to describe the content and function of schema, domain, attribute and relation. This statement can appear in any entry in the schema. b) Format DESC comment c) Language rules - An arbitrary number of DESC statements can appear in the schema entry - Comment in DESC statement can contain any character from SDL character set. - This statement has no meaning for interpreter. - If a comment can't be written in one line, several DESC statement must be used. CRIPTO_PROTECTION statement a) The CRIPTO_PROTECTION statement can appear in the schema and in relation entry. If it appears in schema entry, all relations in the schema are cripto protected. The cripto protection can be implemented by a special procedure defined by the user or as a standard function in RDBMS. b) Format fprocedure_name" CRIPTO_PROTECTION { L SYSTEM c) Language rules - Only one CRIPTO_PROTECTION statement can appear in the schema or relation entry. - If the option SYSTEM is specified, the cripto protection is implemented as one of the activities of RDBMS., If the procedure_name is specified, the cripto protection is implemented by a special procedure. ENDSCHBMA statement a) This statements denotes the end of the schema entry, i.e. the end of the whole schema description program. b) Format ENDSCHEMA DOMAIN ENTRY 1. The domain entry uniquely identifies the domain by its name. Besides, the domain mode, the names of all attributes defined over that domain and the domain units are specified in the domain entry. 2. Entry format DOMAIN statement DEFINES statement [DESC statement] MODE statement [UNIT statement] ENDDOMAIN statement 3. Statements description The domain entry begins with the DOMAIN statement and ends with the ENDDOMAIN statement. The order of other statements in the entry is irrelevant. DOMAIN statement a) The DOMAIN statement uniquely identifies domain by its name. b) Format DOMAIN domain_name c) Language rules - Domain_name has to be unique in the schema. DEFINES statement a) The DEFINES statement specifies the names of all attributes defined over that domain. b) Format DEFINES attr_l ,attr_2.....attr_n c) Language rules - An arbitrary number of DEFINES statements can appear in the domain entry. - Attributes declared in this statement must be defined in the separate attribute entri es . MODE statement a> The MODE statement defines the data types in the domain, b) Format /"CHARACTER int_l [COMPRESSED proc_name] MODE REAL FIXED_POINT integer_2,int_3 FLOATING_POINT [EXTENDED] ■BINARY [EXTENDED] INTEGER I DECIMAL integer_4 c) Language rules - Only one MODE statement can appear in domain entry. - Integer data in the database may be represented in the binary or decimal form; the binary integer data can have the standard and extended format. If the decimal base is specified, it is necessary to give the number of decimal digits (integer_4). Default format for integer data is the binary representation in the standard format. - Real data can have the fixed or floating point representation in the database. Real data in floating point format is always represented in binary form with standard or extended precision. Real data in fixed point format is always represented in packed decimal form; integer_2 is the total number of decimal digits and integer_3 is the number of digits in the fractional part. The default format for real data is the binary floating point format in standard precision. - Character data can be represented in compressed or source form. Tnteger_t specifies the number of characters. If COMPRESSED option is declared, the name of the procedure (proc_name) which performs the compression must be declared too. UNIT statement a) The UNIT statement defines the input unit for data in the domain. Because the domain values stored in the database can be expressed in different units, the name of the procedure which performs the conversion from input to internal units may be also specified in this statement. b) Format UNIT unit [,procedure_nameJ c) Language rules - If the procedure_name is not specified, the data in the database are measured by the same units as input data. If the procedure_name is specified, the conversion from input units to internal units must be done. For example, the input unit can be dollar but the internnl unit can be million dollars, etc. - If this .statement is not given in the domain entry, the domain values are numbers or character strings, without a specific context. ENDDOMAIN statement a) This statement denotes the end of the domain description. b) Format ENDDOMAIN ATTRIBUTE ENTRY 1. The attribute entry uniquely identifies the attribute by its name. Besides, the domain name over which the attribute is defined, the names of relations in which the attribute appears and the total number of different values the attribute may assume are given in the attribute entry. 2. Entry format ATTRIBUTE statement ORIGIN statement BELONGS statement (DESC statement] [CARDINALITY statement] [VALUE statement) ENDATTRIBUTE statement 3. Statements description The attribute entry begins with the ATTRIBUTE statement and ends with the ENDATTRIBUTE statement. The order of other statements is irrelevant. ATTRIBUTE statement a) The ATTRIBUTE statement uniquely identifies the attribute by its name. b) Format ATTRIBUTE attribute_name c) Language rules - Attribute_name has to be unique in the schema - Attribute_name must appear in DEFINES statement in the entry of the domain whose name is declared in the ORIGIN statement in this attribute entry. - Attribute_name must appear in the CONTAINS statement in the entry of the relation whose name is declared in the BELONGS statement in this attribute entry. An attribute may appear in an arbitrary number of relations. ORIGIN statement a) The ORIGIN statement identifies the domain over which this attribute is defined. b) Format ORIGIN domain_name c) Language rules - Domain whose name appears in this statement must be declared in the separate domain entry. BELONGS statement a) This statement identifies the relations in which this attribute appears. b) Format BELONGS relation_l, relation_2, ... c) Language rules - An arbitrary number of the BELONGS statements may appear in the attribute entry. - Relations whose names appear in this statement must be declared in the separate relation entries. CARDINALITY statement a) The CARDINALITY statement specifies the total number of different attribute values. This statement specifies the total number of n-tuples in the relation, if the attribute is the primary key of the relation. b) Format CARDINALITY integer c) Language rules - Only one CARDINALITY statement may appear in the attribute entry. - The number of different attribute values is unlimited if this statement does not appear in the attribute entry. VALUE statement a) This statement defines an implicit value that should be assigned to the attribute if the attribute value is not assigned during the loading of the database. b) Format VALUE constant c) Language rules - Only one VALUE statement may appear in the attribute entry. - The constant can be of the type real, integer or character depending on the attribute type. - If this statement is omitted, the attribute values must be assigned during the database loading. ENDATTRIBUTE statement a) This statement declares the end of the attribute entry. b) Format ENDATTRIBUTE RELATION ENTRY 1. The relation entry uniquely identifies the relation by its name. Besides, the attributes belonging to that relation, the integrity constraints, the primary key, the cripto protection method and dependencies among this relation and other relations are specified in the relation entry. 2. Entry format RELATION statement CONTAINS statement KEY statement [DESC statement! tCRIPTO_PROTECTION statement] [INTEGRITY_CONSTRAINT statement) [UNIQUE statement] (FUNCTION statement] tACCESS_CONTROL statement] [TRIGGER structure) ENDRELATION 3. Statements description The order of the statements in the relation entry is irrelevant. RELATION statement a) The RELATION statement uniquely identifies schema by its name. b) Format RELATION relation_name c) Language rules - Relation_narae has to be unique in the schema. - Relation_name must be declared in the BELONGS statement of all attributes that belongs to the relation. CONTAINS statement a) The CONTAINS statement specifies the names of all attributes in the relation. b) Format CONTAINS attribute_l, attribute_2, ... c) Language rules - At least one attribute name must be declared in this statement. - An arbitrary number of the CONTAINS statements can appear in the relation entry. - The attributes declared in this statement must be defined in the separate attributes entries. KEY statement a) The KEY statement defines the primary key of the relation. b) Format KEY attribute_l,attribute_2, ... c) Language rules - Only one KEY statement can appear"in the relation entry. - At least one attribute name must be declared in the KEY statement. - The attributes declared in this statement must be defined in the separate attribute entries. - The attributes declared in this statement must belong to this relation, i.e. they must be defined in the CONTAINS statement of this relation entry. INTEGRITY_CONSTRAINT statement a) This statement defines the integrity constraints in the relations. The constraints can be static and dynamic. b) Format ■ r_exp_2 INTEGRITY CONSTRAINT ■r_exp_l .fun_l [IF fun 2 J of in but the this the relation can appear in not in the c) Language rules - An arbitrary number statements can appear entry; - Reserved words OLD and NEW the relational_exp_l relational_exp_2; - An arbitrary number statements can be attribute because the attribute can have more than one integrity constraint; - Static and dynamic integrity constraints for one attribute must be defined in different INTEGRITY CONSTRAINT statements of the this defined for one - Integrity constraint is defined by relational_exp_l or by function_l. Function is defined in the FUNCTION statement ; - Integrity constraint represents the comparison between old (OLD) and new (NEW) attribute values, if the integrity constraint is dynamic. In that case, attribute names must include reserved words OLD and NEW. If the integrity constraint is static, the attribute names in the relational expressions represents immediate values and reserved words OLD and NEW are not included in the attributes names. - IF option specifies the attribute values on which the integrity constraint is applied. If this option is omitted, the integrity is valid for all attribute values. Examples : INTEGRITY_CONSTRAINT SUPPLY.CODEPRO = PRODUCT.CODE INTEGRITY_CONSTRAINT NEW EMPLOYEE.SAL > OLD EMPLOYEE.SAL INTEORITY_CONSTRAINT FUN3 IF EMPLOYEE.DEPTCODE="B" INTEGRITY_CONSTRAINT FUN2 IF EMPLOYEE.SEX:"M" FUNCTION FUN2:=EMPL0YEE.SAL > 5000 FUNCTION FUN3:=NEW EMPLOYEE. SADOLD EMP.SAL The integrity constraint in the first example specifies that all values of the attribute CODEPRO in the relation SUPPLY must be equal to the values of the attribute CODE in relation PRODUCT. Second example specifies that new salaries of all employees must be greater than old salaries. The usage of the IF option is shown in the third example. This example specifies that new salary must be greater than old salary, but only for employees in department "B". The fourth example defines integrity constraint that the male employees have salary greater than 5000. In the third and fourth example functions FUN2 and FUN3 are used. These functions are defined by the FUNCTION statements. UNIQUE statement a) The UNIQUE statement identifies attribute or attributes group having unique values in the relation. That attribute or attributes group do not represent the primary key. b) Format UNIQUE attribute_l, attribute_2, ... c) Language rules - An arbitrary number of the UNIQUE statements can appear in the relation entry. - The attributes declared in this statement must be defined in the separate attributes entries; - The attributes declared must belong to this relation, i.e. they must be declared in the CONTAINS statement of the relation entry. - If more than one attribute is declared in the UNIQUE statement, that attribute group has unique values, not the particular attributes in that group. - If more than one attribute or attributes group in the relation have unique values, that should be specified by separate UNIQUE statements; - If UNIQUE statement doesn't appear in the relation entry, only attributes that constitute primary key have unique values. FUNCTION statement a) The FUNCTION statement defines a function. b) Format FUNCTION function_name:=relational_exp c) Language rules - Function_name must be unique in the schema; - The other function names can appear in the relational expression. ACCBSS_CONSTRAINT statement a) The ACCESS_CONSTRAINT statement specifies the operations that cannot be performed on all or particular n-tuples, i.e. on all or particular attribute values. b) Format r_exp fON attr_l) [IF ACCESS C FOR funot. /"READ UPDATE INSERT DELETE c) Language rules - An arbitrary number of this statements can appear in the relation entry. - One ACCESS_CONSTRAINT statement must be specified for each forbidden operation; - The operations INSERT and DELETE denote the insertion and deletion of n-tuples. They must be applied to n-tuples, not to the attribute values. The READ operation denotes the reading of n-tuples or attribute values, and UPDATE operation denotes the updating of one or more attributes in the relation. The READ and UPDATE operations can be applied to n-tuples and attribute values. If ACCESS_CONSTRAINT specifies that UPDATE or READ are forbidden on some attributes, the ON option specifies these attributes. If ON option is omitted, UPDATE or READ are forbidden for all attributes of the relation. The ON option can appear in this statement only for READ and UPDATE operations. - IF option specifies n-tuples or attribute values for which the given operation is forbidden. If this option is omitted the requested operation is forbidden for all n-tuples or for all attribute values; - If the ACCESS_CONSTRAINT statement is not included in the relation entry description all operations over n-tuples or attributes of the relation are allowed Examples : ACCESS_CONSTRAINT FOR DELETE ACCESS_CONSTRAINT FOR UPDATE ON CODE ACCESS_CONSTRAINT FOR READ IF CODE="C" ACCESS_CONSTRAINT FOR READ ON SAL IF CODE="C" All examples are defined over the relation EMPLOYEE. First example means that the DELETE operation is forbidden for all n-tuples of relation EMPLOYEE and the second example denotes that the updating of the attribute CODE is not allowed. Third example means that n-tuples in the relation EMPLOYEE having the value of attribute CODE equal "C" cannot be read, and the fourth example means that the values of attribute SALARY of the n-tuples with attribute CODE="C" cannot be read. TRIGGER structure a) The TRIGGER structure defines the set of forced operations that must be performed over other relations upon the finishing of the current operation. b) Format INSERT 1 TRIGGER FOR UPDATE DELETE (ON attribute_U INSERT] cconst UPDATES rel_l [SET atr_2:='^ [DELETE J (a_expj (IF fun ]] ENDTRIGGER c) Language rules - TRIGGER contains the head line, one or more lines in which the set of forced operations is defined and the end line; - An arbitrary number of triggers can appear in the relation entry; - Relation_l is the name of the relation to which the forced operation is applied. Attribute_2 is an attribute from that relation; - Reserved words OLD and NEW can appear in the relational expression of the IF option. - The FOR option in the head line specifies the operation that causes the trigger; - If the operation declared in head line is UPDATE, then the ON option must exist. -The ON option specifies the attribute in the current relation over which the UPDATE operation (causing the trigger) is performed. If the operation declared in the head line is INSERT or DELETE,, the ON option must not appear in the head line because these operations are performed over the n-tuples not over the attribute values; - The SET option exists only if the forced operation declared is UPDATE. One SET option exists in the same trigger for each attribute updated. For example, if the trigger causes the updating of two attributes (attr_2 and attr_3) in relation_l, then the trigger must contain the following lines: ■ constant UPDATE rel_l SET attr_2:= .arithmetic_exp /'constant UPDATE rel_l SET attr_3:= V .arithmetic_expj The arithmetic expression can be defined in the SET option only if the attr__2 and attr_3 are REAL or INTEGER. - The IF option specifies the n-tuples, i.e. attribute values in relation_l over which the operations defined by the trigger are performed. If this option is omitted, the forced operations are applied to all n-tuples in relation_l, i.e. to all attribute values; - The n-tuples, i.e. attribute values to which the forced operations must be applied are determined by relational expression or by the function whose name appears in the IF option. Examples : TRIGGER FOR UPDATE ON CODE UPDATE SUPPLY SET CODEPRO :=PRODUCT.CODE ENDTRIGGER TRIGGER FOR UPDATE ON DEPTCODE UPDATE DEPT SET EMPNO:=DEPT.EMPNO+1 IF FUNI UPDATE DEPT SET EMPNO:=DEPT.EMPNO-1 IF FUN2 ENDTRIGGER FUNCTION FUN1:=DBPT.C0DE=NEW EMPLOYEE.DEPTCODE FUNCTION FUN2:=DEPT.C0DE=0LD EMPLOYEE.DEPTCODE First trigger updates the relevant CODEPRO entries in the relation SUPPLY whenever the CODE of the PRODUCT is updated. This trigger is defined in the relation PRODUCT. Second trigger updates the relevant EMPNO entries in the relation DEPT whenever the DEPTCODE of the EMPLOYEE is updated. This trigger is defined in the relation EMPLOYEE. BNDRELATION statement a) This statement denotes the end of the relation entry. b) Format ENDRELATION The key words of SDL are given in APPENDIX A, the formal syntax description in APPENDIX B and the schema description for the example in APPENDIX C. THE INTERPRETER REALIZATION SDL commands Bearing in mind that SDL is an Interactive language it contains a set of commands which enables the user to list the program, run it, modify, etc. SDL commands are: LIST, INSERT, DELETE, SUBSTITUTE, RUN, PURGE, SAVE, OLD, TERMINATE, PROMPT, TEST and EXIT. The majority of these commands are the standard commands that exist in each interactive language and they have a usual meaning. For example, RUN command runs the SDL program, the OLD command brings an existing SDL program from the file to the user area in the main memory, the PURGE command deletes all program versions but the last one etc. Only TEST and TERMINATE commands have specific role in SDL. The TEST command enables the user to control and track the interpretation- process, and the TERMINATE command requests from the interpreter to finish the program interpretation after its modification is done. Interpreter structure The interpretation is performed in two phases. In the first phase, the source program is analyzed (statement by statement) and translated into an internal form (postfix notation). The internal form of the program is interpreted and the result is generated in the second phase. Two requests are imposed to the implementation of the SDL interpreter: a) the interpreter must be incremental, b) the multi user environment must be provided. As a result of the interpretation process, the description file which contains the description of all elements of the database and their internal dependencies is created. Each record in that file corresponds to one element of the database (schema, domain, attribute and relation). The interpreter is implemented as a set of six internally connected program modules. Fig. 2. The interpreter modules are: command module, lexical analyzer, parser, semantic analyzer, result generator and editor. The control communications between modules are presented in full lines, and the data flow in dashed lines. The data structures (static and dynamic) that are used during the interpretation process are: source program, internal program, dictionaries and description file. The dictionaries contain all information about the objects of the source program (operators and operands). These information are organized so they can be accessed from any part of the interpreter. There exist two static dictionaries which are parts of the interpreter (dictionary of reserved words and dictionary of delimiters) and one dynamic dictionary (symbol table). The static dictionaries are one dimensional arrays with the elements of fixed length. The dictionary of delimiters contains all arithmetic operators, comparison operators and special characters as comma, colon etc. The symbol table is formed during the translation of the source program. All information about constants and variables from a SDL program are stored into it. The symbol table has the fixed and variable part. The fixed part contains five fields: identifier name, object type indicator (R - relation, A -attribute, D - domain, etc.), special indicator used for semantic control (if the given object is completely defined this indicator has the value one, otherwise null) and the pointer to the variable part of the symbol table. The fields of the variable part depend on the database element to whom the 18 EDITOR COMMANO MODULE RELATIONAL DATABASE MANAGEMENT SYSTEM V \ \ SOURCE PROURAM LEXICAL PARSER SEMANTIC ANALYZER ANALYZER V V h \ X \ \ r 1 1 \ \ \ \ 1 V SVMHüL TABLE UICTIONARY OF DELIMITERS UICTIONARY OF RESERVED WORDS INTERNAL PROGRAM RESULT GENERATOR / DESCRIPTION FILE Fig. 2. The interpreter structure given entry belongs. The number of fields in the variable part and their format for different element types (relations, domains etc.) are different, because the quantity of information that must be stored is different. All errors detected during the program translation can be classified into five groups: lexical, syntax, semantic, editor and command errors. These errots are detected by the corresponding modules. Upon the detection of any error the interpretation process is interrupted, the command module takes over the control and sends the error message to the user. Due to the concept of the incremental interpretation, each error can be immediately corrected by the user , and the interpretation process continued. Some diagnostic facilities are built in to enable the tracking of the interpretation process. Each phase of the interpretation process can be easily tracked independently of other phases by printing the results of that phase. For example, the lexical analysis can be tracked by printing the set of tokens which was generated during the lexical analysis. A special SDL command (TEST) specifies the interpretation phase that the user wants to track. The communication of the SDL interpreter with other parts of the RDBMS is,provided through virtual calls. In this way, the interpreter is made self-contained. In the RDBMS environment, the virtual calls must be replaced by the actual ones. Functional analysis of interpreter parts The lexical analyzer, parser,. semantic analyzer and editor are implemented using the well known methods, because they are the standard parts of all interpreters [2,8]. The other modules are specific because their functions are dictated by the SDL characteristics. The command module is the main interpreter module which coordinates the work of other modules and communicates with other parts of the RDBMS. This module starts and terminates the interpretation process; it accepts the statements and commands and sends messages to the user, activates other modules to do their job, performs the memory management and provides the multi-user environment. The command module accepts the statements and commands either from terminal and or file. If the user entered a command, the command nodule performs the appropriate action. For example if RUN command was given, the command module activates the execution of a SDL program (in fact, activates the result generator). If the user entered a statement, the lexical analyzer, parser and semantic analyzer are activated to perform the analysis and translation of the source statement into the internal form. If any of these modules detects an error, it informs the command module by setting the error indicator. In that case, the command module interrupts the interpretation process and sends the message to the user. The lexical analyzer extracts tokens, forms the symbol table (fixed part) and writes into it all available information about identificators. The parser and semantic analyzer fill in the rest of the symbol table. The tokens are divided into four classes: identifiers (variable names), constants, reserved words and delimiters. The parser performs syntax analysis of the source statements and their translation into the internal form. The internal form of the statements is postfix notation. The parser is implemented by recursive descent method [2]. The semantic analyzer performs the semantic analysis and writes corresponding information into the symbol table and internal program. The semantic analysis includes the control of the whole program and particular statements. The control of semantic correctness of the whole program includes the checking the presence of all prerequisite statements, checking the order in which statements appear and checking the uniqueness of the statements that can appear only once in the program, in the entry or in the some block structure. The consistency of the operand and operator types and the uniqueness of the identifiers (that must be unique) are also checked. The implementation of the semantic analyzer, ■ egpecially the part which controls the program form, is based on the theory of the finite state machine. The semantic control of the single statements is done by the control routines that are called at particular places in the program structure which simulates that finite state machine [8]. The result generator, which is activated by the RUN command, creates the description file from the symbol, table and internal program. The description file contains the descriptions of all database elements (schema, domains, attributes and relations). Due to the fact that SDL has no imperative statements, the result is generated by searching the symbol table and internal program and gathering all necessary information for the description file. The records in the description file have four different formats, and each record format corresponds to one data structure in" the database (schema, domain, attribute and relation). The records of the same format are grouped together, so the physical description file contains four logical files. The result generator is implemented as a set of procedures and each procedure forms a description of one data type. The editor is the module whose task is to provide the editing of the source program and support the incremental interpretation of the SDL program. The multiuser SDL interpreter is implemented on the IBM 4331 computer using COBOL programming language. The system software for interactive work CICS [13] (which enables dynamic memory and procesa management, data interchange with disks and communications with the users over terminals) is used for implementation. The description file is formed as IBM VSAM KSDS data file with variable length records [12]. The interpreter volume expressed by the number of program lines is 5500. 3. Chamberlain D. D. etc: "A history and evolution of SYSTEM R"; Communications of ACM, no 10, october 1981. 4. Codd E.F.:"A relational model of data for large shared data banks"; Communications of the ACM, no 6, june 1970. 5. Codd E.F.:"Further Normalization of the Database Relational Model";Current Computer Science Simposia, Vol 6, Database Systems, New York city. May 1971, Prentice Hall. 6. Codd E•F.:"Normalized Database Structure: A Brief Tutorial"; Current Computer Science Simposia, Vol 6, Database Systems, New York city, May 1971, Prentice Hall. 7. Codd E.F.:"Relational Completeness of Database Sublanguages"; Current Computer Science Simposia, Vol 6, Database Systems, New York city, May 1971, Prentice Hall. 8. Gries D.; "Compiler Design Computers"; John Wiley and Sons. for Digital 9. Hütt A. T. F.:"Organizing the Description of a Relational' Database"; Software - Practice and Experience, vol k9, 1979. 10. Hütt A.T.F.: Management System" "A Relational Data Base John Wiley and Sons, 1979. 11. Gray J., McJones P.:"The recovery manager of SYSTEM R database"; ACM Computing Surveys, no 2, June 1982. 12. VSE (Virtual Storage Extended) System Data Management Concepts - IBM Laboratory, Programming Publication Department, Boebling, W. Germany. CONCLUSION The schema description language for relational database is a stand alone language completely independent from other languages in the DBMS (query language, data manipulation language etc.). Due to the fact that the relational database description have four different data types (schema , attribute, domain and relation), the SDL statements are grouped into four entries, so each entry describes one data type. The SDL is a structured language which provides a high readability of the SDL programs . The interpreter developed for this language is characterized by its functional independence from the query language, the multiusesr environment and incremental interpretation. The implementation of the interpreter provides the conditions for its easier portability to different machines. These conditions are: a) the application of standard programming language characteristics without any extensions, b) concentration of input/output activities in command module c) marking all machine dependent points in the interpreter (for example, calls of assembler routines, system service calls etc.) so they become immediately visible. 13. CICS/VS (Customer Information Control System/Virtual Storage) Introduction to Program Logic - IBM Laboratory, Technical Documentation Department, Hampshire, England. APPENDIX A: Reserved words ACCESS_CONSTRAINT AND ATTRIBUTE BELONGS BINARY CARDINALIT.Y CHARACTER COMPRESSED CONTAINS CRIPTO_PROTECTION DECIMAL DEFINES DELETE DESC DOMAIN ENDATTRIBUTE ENDDOMAIN ENDRELATION ENDSCHEMA ENDTRIGGER EXTENDED FIXED_POINT FLOATING_POINT FOR FUNCTION IF INSERT INTEGER INTEGRITY_C0NSTRA1NT KEY MODE NEW OLD ON OR ORIGIN READ REAL RELATION SCHEMA SET SYSTEM TRIGGER UNIT UNIQUE UPDATE VALUE REFERENCES 1. Astrahan M. M. : "SYSTEM R: A relational database management system"; Computer, may 1979. 2. Brown P. J.: "Writing Interactive Compilers and Interpreters"; John Wiley and Sons. APPENDIX B: SDL syntax Here we present the BNF syntax definition for SDL. In this notation square brackets [] indicate optional constructs, and braces (} indicate constructa that appear one or more times. () {) {) 1 : : :<8chema_stat> () [] < endschfema_stat> ;=SCHEMA _ ;= ::={ I ) : := I <8chema_stat>; : : : : : : :=DESC : : = 1) 1 : := ! I «;(;>! : := + I : : := < I / I - ; = » .>I = 1=>1<=1<> : :=CRIPTO_PBOTECTION : : = ; SYSTEM : ::ENDSCHEMA : := [] [] ::=DOMAIN : : = : :=DEFINES () : :=MODK : : = I I ::=CHARACTBR [COMPRESSED ] ::=FIXED_POINT , < integer_number> I FLOATING_POINT [EXTENDED 1 :::INTEGER : :=DECIMAL I BINARY [EXTENDED] : :=UNIT [,] : : = : :=ENDDOMAIN : := . ::=D[]< integer_number> 1 E[] : : = "" : :=ENDATTRIBUTE : := [) [| ( J () [) () [] : :=RELATION : : = : :=KEY () ::=CONTAINS (I : :=INTEGR1TY_C0NSTRAINT (IF ) : : = 1 : ; = 1 : : = <8imple_exp> I <8imple_exp> 1 () :AND ; OR <8imple_exp>: := : := | : : = : ::OLD NEW : : = (]. ::= : :=UNIQUE (1 ::=FUNCTION := : :=ACCESS_CONSTRAINT FOR [ON ] [IF ) : := READ ; ::= INSERT | UPDATE 1 DELETE : : = : :rTRIGGER FOR (ON ; : s(operation) [SET : = (expression> [IF (condition)]] (expression)::= ! (arithmetic_exp> (ari thnietic_exp> : : = (sign) (operand) 1 (sign)(operand)(arit_operator> ((arithmetic_exp)) (sign)::=+;- (operand)::=(variable) ! (numeric_constants) : := +1-1*1/ (end line)::=ENDTRIGOBR (attribute_entry>: : = (attribùte_8tat) [deacription_atat)] [(value_stat)] (origin_stat) (belong8_8tat> [(cardinality_stat>] (endattribute_8tat) (attribute_stat>: :=ATTRIBUTE (origin_stat> : :=ORIGIN (belong8_8tat>: :=BELONOS {(relation_name>) (cardinality_8tat>: :=CARDINALITY (integer_number) ( integer_nuinber> :: = { (digit) ) (value_atat>::=VALUE (constant) (constant)::=(numeric_con8tant) ! (literal) (numeric_constant): : = ((sign)] (integer_number) 1 [(sign)] (real_number) (sign)::=+;- APPENDIX C: SDL program SCHEMA MARKETING DKSC There is no protection at schema level DOMAIN DOMADDRESS DEFINES ADDRESS MODE CHARACTER COMPRESSED PROCI ENDDOMAIN DOMAIN DOMCODE DEFINES CODE,CODEPRO,CODESUP,DEPTCODE MODE INTEGER BINARY ENDDOMAIN DOMAIN DOMDATE DEFINES DATE MODE CHARACTER ENDDOMAIN DOMAIN DOMNO DEFINES EMPNO DEFINES EMPNO MODE INTEGER DECIMAL ENDDOMAIN DOMAIN DOMPRICE DEFINES PRICE MODE REAL FIXED_POINT UNIT DOLLAR ENDDDOMAIN DOMAIN DOMQUANTITY DEFINES QUANTITY MODE REAL FIXED_POINT ENDDOMAIN DOMAIN DOMNAME DEFINES NAME MODE CHARACTER COMPRIMED PR0C2 ENDDOMAIN DOMAIN DOMSALARY " DEFINES SALARY MODE REAL FLOATING_POINT UNIT DOLLAR ENDDOMAIN DOMAIN DOMSEX DEFINES SEX MODE CHARACTER ENDDOMAIN ATTRIBUTE ADDRESS ORIGIN DOMADDRESS BELONGS DEFT ENDATTRIBUTE ATTRIBUTE CODE ORIGIN DOMCODE BELONGS SUPPLY,PRODUCT,EMPLOYEE,DEPT ENDATTRIBUTE ATTRIBUTE CODEPRO ORIGIN DOMCODE BELONGS' SUPPLY CARDINALITY 5000 ENDATTRIBUTE ATTRIBUTE CODESUP ORIGIN DOMCODE BELONGS SUPPLY CARDINALITY 100 ENDATTRIBUTE ATTRIBUTE DEPTCODE ORIGIN DOMCODE BELONGS EMPLOYEE ENDATTRIBUTE ATTRIBUTE DATE BELONGS SUPPLY ORIGIN DOMDATE VALUE "01/01/1980" DESC Data when the product waa bought ENDATTRIBUTE ATTRIBUTE EMPNO ORIGIN DOMNO BELONGS DEPT DESC Number of employees in the department VALUE 0 ENDATTRIBUTE ATTRIBUTE QUANT ORIGIN DOMQUANTITY BELONGS PRODUCT,SUPPLY VALUE 0 ENDATTRIBUTE ATTRIBUTE NAME ORIGIN DOMNAME BELONGS PRODUCT,EMPLOYEE,DEPT VALUE "XXXX" ENDATTRIBUTE ATTRIBUTE PRICE ORIGIN DOMPRICE BELONGS PRODUCT VALUE 100 ENDATTRIBUTE ATTRIBUTE SALARY ORIGIN DOMSALARY BELONGS EMPLOYEE VALUE 1000 ENDATTRIBUTE ATTRIBUTE SEX ORIGIN DOMSEX BELONGS EMPLOYEE CARDINALITY 2 VALUE "M" ENDATTRIBUTE RELATION PRODUCT CONTAINS CODE,NAME,PRICE,QUANT KEY CODE DESC Products in the supply INTEGRITY_CONSTRAINT CODE>0 AND CODE<32767 INTEGRITY_CONSTRAINT PRICE<999.99 TRIGGER FOR UPDATE ON QUANT UPDATE SUPPLY SET QUANT :=SUPPLY.QUANT+NEW PRODUCT.QUANT ENDTRIGGER TRIGGER FOR UPDATE ON CODE UPDATE SUPPLY SET CODEPRO:=NEW PRODUCT.CODE ENDTRIGGER ENDRELATION RELATION EMPLOYEE CONTAINS CODE,NAME SEX,SALARY,DEPTCODE KEY CODE DESC Company employees in the last 5 years UNIQUE DEPTCODE TRIGGER FOR UPDATE ON DEPTCODE UPDATE DEPT SET EMPNO:rEMPNO+1 IF FUNI UPDATE DEPT SET EMPNO:=EMPNO-1 IF FUN2 ENDTRIGGER FUNCTION FUNI:=DEPT.CODE:NEW DEPTCODE FUNCTION FUN2::DEPT.C0DE=0LD DEPTCODE INTEGRITY_CONSTRAINT SEX="M" OR SEX-"F" INTEGRITY_CONSTRAINT C0DE>1 AND CODE<32767 INTEGRITY_CONSTRAINT DEPTCODE:DEPT.CODE ACCESS_CONSTRAINT FOR READ ON SALARY IF CODE Ol ENDRELATION RELATION SUPPLY CONTAINS CODEPRO,CODESUP,DATE,QUANT KEY CODEPRO,CODESUP INTEGRITY_CONSTRAINT CODEPRO=PRODUCT.CODE INTEGRITY_CONSTRAINT DATE >"01/01/1980" ENDRELATION RELATION DEPT CONTAINS CODE,NAME,ADDRESS,EMPNO KEY CODE INTEGRITY_CONSTRAINT C0DE>1 AND CODE<10 UNIQUE NAME ENDRELATION ENDSCHEMA IZBOR PROGRAMSKEGA ORODJA ČETRTE GENERACIJE Deskriptorji: REACIJSKE BAZE, IZBOR SISTEMA, PROGRAMSKA ORODJA, LUKA KOPER, ČETRTA GENERACIJA Zvone Kribel, Boris Legac, Marjan Marušič, Andrej Novak Luka Koper, Sektor za informacijski sistem Podajamo kriterije, s katerimi je? Luka Koper, oz. komisija zb izbor nove programske opreme i;;birala med različnimi ponudniki. Komisija je sestavila do podrobnosti specificirano vzorčno aplikacijo, ki jo sorazmeroma lahko preslikamo v CICS-ov program. Ob preverjanju lastnoisti opreme, smo prišli do zaključka, da izbor programske opreme ni tehnično, temveč muJ-ti di sciplinarno vprašanje. SELECTING THE: FÜURTH GENERATION PROGRAMMING TOOLS: In this paper we describe the criterians that were used in Port of Koper for the c?valuation of 4GL and DBMS. We describe in detail one case application that was tested with the PL/1 programming language and VSAM under CICS/VS environment. During the evaluéìi-tion we came to a conclusion that the selection is not only a technological matter, it should be a mu11 i di sc ipli ne process. 1. UVOD V Luki Koper Se tri UNIVERSE, relacijsko leta koristimo CA-banko podatkov, s programskim jezikom ADL in aplikacijskim generatorjem. S postavitvijo plana razvoja luškega IS in po triletnih izkuSnjah dela s to relacijsko banko smo prišli do ugotovitve, da potrebujemo programska orodje, ki nam bo omogočala doseganje boljših performans aplikacij v proizvodnj i.Dne 9. marca 88 smo v ta namen ustanovili komisijo za izbor nove programske opreme. S 1. oktobrom smo pričeli prenašati produkcijsko zahtevne aplikacije iz CA-Universe v produkte ameriške družbe Applied Data Research. Glavni zahtevi do nove opreme sta: - omogoča izgradnjo enotne produkcijsko stabilne podatkovne banke, - nuditi mora visoko produktivno programsko okolje. V sestavku podajamo kriterije za izbor opreme glede na potrebe razvoja informaci jskega sistema Luke Koper. Poleg tega podajamo razlike med produkti dveh družb, ki sta ostali v ožjem izboru. Opisali smo tudi vire informacij o lastnostih teh produktov. V posebnem razdelku je podrobno definirana aplikacija, Vci smo jo posredovali specialistom v obeh družbah. Ob koncu sledi zaključek in naše izkušnje pri izboru nove opreme. 2. RAZLIKE Po analizi tržišča je prišlo v širši produktov naslednjih proizvajalcev: - SQL. CSP: i zbor - Computer Associate: CA UNIVERSE - IBM: SOL/DS, CSP - ADR: DATACOM, DATADICTIONARY. IDEAL - Software AG: ADABAS, PREDICT. NATURAL - CINCGM: SUPRA, MANTIS - ORACLE: ORACLE - CULL INET: IDMS Po prvih primerjalnih analizah sta v ožjem izboru ostala le še ADR in SAG iz naslednjih raz 1ogov: - CA-UNIVERSE: v produkcijskem okolju zah- teva zmogljivo strojno opremo, ki je Luka Koper ne more z agotovi ti. ni pravo produkcijsko okolje - SUPRA,MANTIS: sistem je prekompleksen za naše razpoložljive resurse - ORACLE: Ni v DOS/CICS okolju. večina instalacij na DEC-VAX - IDMS: firma ima organi zacijske. finančne in tehnične težave. Po izvršitvi vrednotenja posameznih prodMktov smo ugotovili, da so razlike med produkti družb Software AG in ADR minimalni, tj. samo 47. možnih točk. Te razlike so v naslednjem: - podatkovna baza (ADABAS je nekoliko enostavnejša, ima možnost dinamičnega dodajanja atributov, ponavljajoče se skupine) . - razvojni jezik (NATURAL ima dostop do VSAM datotek,in ima možnost oken; IDEAL je enostavnejši 2a uporabo in ima boljSo programsko dokumentacijo) - podatkovni slovar DATADICTIONARY ni možno izključiti pri prevodu programov, v PREDICT pa lahko vpiSemo tudi nekatera integracij ska pravila. Obe družbi s svojo ponudbo zagotavljata izpolnitev kriterijev, ki smo jih postavili kot obvezne. ADABAS koristi amerižka FBI, DATACOM/DB pa ameriSka vojska, ADABAS ima cca 2200 instalacij. DATACOM/DB ima 1800 instalacij. Družba So-ftware AG ni imela v bližini instalacije s podobnim HW in v podobni gospodarski veji. Produkte družbe ADR smo si ogledali v dveh centrih v Trstu in v Inter-frigo v Baslu v èvici. Zanimiv je pristop tujcev, ki sploh niso vzeli v precep konkurenčnih produktov. Ob potovanju v Basel smo se pogovorili s predsednikom evropske veje družbe ADR v Klotenu pri Zurichu in istočasno smo obiskali nemško družbo So-ftware AB v Darmstadtu. V obe družbi smo poslali podrobno de-finiojo vzorčne aplikacije in zahtevali ocene CICB partici j, odzivnega easa in prosili, naj nam razvijejo spec ificirano aplikacijo v NATURALu, oz. IDEAL-u. 3. APLIKACIJA Aplikacija dodaja posamezen slog v tabelo delavcev. CICB-ovo ime transakcije je ZDDA in program naj bo ZDDOADD, ki je del projekta ZDD. Ta aplikacija mora koristiti psevdokon-verzacijski način na 250CICS terminalih (200 VTAM in 50 lokalnih 3270 ali 3170). Povprečna se doda 2.5 sloga v sekundi, največ pa 5 slogov v sekundi ob obremenitvi. Odzivni čas naj bo največ 2 sekundi, zaželjeno je povrečje ene sekunde v 8 urni izmeni. a). Osnovno tabelo delavcev specificiramo z ANSI SQL-om z: CREATE TABLE DELAVEC ( SIFDEL INTEĐER NOT NULL, IME VARCHAR (20), PRIIMEK VARCHAR (20), DAT ROJ CHAR(6) ) i Pogled (view), ZDDVDEL vsebuje vsa' polja z izjemo DAT_ROJ polja. Polje SIFDEL mora imeti masko PIC99999' povsod, kjer se pojavi. Prvi znak polj IME in PRIIMEK naj bo vedno črka. b). Aplikacija koristi dve BMS mapi ZDDMCTL kontrolni zaslon in ZDDMDTL za detaljni izgled sloga. ZDDMCTL ima samo SIFDEL polje, kamor operator vnese šifro delavca, ki ga želi dodati. Zaslon ZDDMDTL ima IME, PRIIMEK in SIFDEL polja, vendar SIFDEL je v tem zaslonu samo izhodno polje. Oba zaslona imata polje sporočil na spodnjem robu zaslona. Ob vsaki napaki napačno polje osvetlimo. Qbe mapi vsebujeta standardne tekste kot so ime organizacije, ime projekta, ime zaslona mape, datum in uro ob zadnjem pritisku neke AID tipke. c). Z vnosom ZDDA poženemo program ZDDOADD, prikaže se zaslon ZDDMCTL z LMIOOl sporočilom. Vsa sporočila so v standardni tabeli. Ko uporabnik vnese vrednost polja SIFDEL (vse numerično) in ENTER, program preveri PIC99999' masko. če je napaka, program ponovno pošlje mapo na terminal z osvetljenim poljem SIFDEL in standardnim sporočilom LMEOOl. Program zaključimo z CLEAR AID tipko, s PFl program prikaže zaslon za pomoč ZDDHAOO (help). Ko operator uspešno vnese SIFDEL, program preveri obstoj sloga s tem ključem. će tabela že vsebuje tega delavca, program prikaže LME020 sporočilo. će tega delavca še ni v tabeli, program prikaže ZDDMDTL mapo s sporočilom LMIOlO, kamor uporabnik vnese ime in priimek delavca. Vrednost polja SIFDEL program samo prenese iz predhodnega zaslona. Obe polji program preveri ali imata prvi znak črko, sicer sporočilo LME022. će je vse v redu, program doda slog s ključem SIFDEL v ZDDVDEL. Spet se prikaže zaslon ZDDMCTL s sporočilom LMI002, ki mu sledi sporočilo LMIOOl sporočilo (konkatenirano). d). Upoštevajoč hišni standard mora vsaka aplikacija koristiti standardni modul GETMS6,ki pričakuje vhodni parameter 6 znakov dolgo šifro sporočila in vrne preko globalnega področja (npr. CICS Common Work Area) 39 znakov dolg niz sporočila iz posebne tabele. Ta aplikacija koristi sledeča sporočila: LMIOOl LMI Oi:)2 LMIOlO LMEOOO LMEOO1 LME020 LME022 Prosim, vnesite ključ. Vrstica je uspešno dodana v tabel o Prosim, vnesite vrednosti stolpcev. Ničesar niste vnesli. Ključ mora biti številka. Ta vrstica že obstaja v tabeli. Prvi znak niza mora biti črka. 4. CENE Ponujeni finančni pogoji obeh firm so podobni; Družba cena letno vzdrže- vanje ADR 263.688 US$ 35.477 US« Teden šolanja 600 US$ Po petih letih znaša celotna vsota z osnovnim šolanjem (na dveh procesorjih) 461.297 US» SAG Za dva CPU Teden šolanja: prvo leto drugo leto ostal a 201.850 US* 278.438 US« 36.700 US« 50.625 US« 5.000 US« 7.500 US« 11.000 US« Po petih letih znaša celotna vsota z osnovnim šolanjem 480.938 US« 5. ZAKLJUĆEK Nakup nove programske opreme je v našem okolju dokaj zahtevna in nehvaležna naloga. Pogosto imajo glavno besedo pri izboru opreme samo vodilne osebe, ki običajno niso dovolj tehnično usmerjene. Delno se lahko izognemo napačni odločitvi z ustanovitvijo posebne komisije za izbor opreme. V Luki Koper smo izbrali slednje. Hitro se je pokazalo, da je naloga zelo zahtevna in da jo je zelo težko kvalitetno rešiti v zastavljenih šestih mesecih. Dokaj hitro smo ugotovili utesnjenost in tehnično zaostalost našega okolja. Naštejmo nekatere najbolj pereče težave, ki smo jih srečali pri sodelovanju v komisiji: - pol leta je nedvomno premalo časa za ocenitev programskih orodij, ki so strateškega pomena za delovno organi zaci j o. - Vsak posamezen programski produkt iz razreda sistemov za upravljanje bank podatkov je plod cca 10 do 20 človek/let in tega dela ni možno spoznati v treh tednih. - V komisiji smo močno pogrešali strokovnjake s finančnimi in pravnimi izkušnjami. Literatura za napotke pri sklepanju pogodb s tujimi softwarskimi hišami ni uporabna v naèem okolju.■ 6. REFERENCE 1. RELATIONAL DBHSs, Xephon Buyer's Guide, Xephon Technology Trans-fer Ltd, London, 1986 2. A BUYER'S GUIDE TO DATA BASE MANAGEMENT SYSTEM, Datapro Research Corp., Delran NJ, fBS 3. Diane L. HERDT: DBMS EVALUATION CRITERIA, Auerbach System Development Management port-folio 34-02-40, Auerbach Publishers Inc., Boston MA, 1987 4. Gordon C. EVEREST: DATABASE MANAGEMENT, Objectives, System Functions, and Admi-ni stration, McGraw Hill 1986 PRILOGA: Kriteriji za izbor opreme z uteimi glede na pomen posameznega kriterija za izgradnjo IS Luke Koper: 3. RAZVOJ APLIKACIJ 3.1. Programski jezik, ki deluje preko slovarja pod. 3.2. Generator map in reportov (map za PRINT) 3.3. Pomoč pri izdelavi dokumentacije 3.4. Interaktivni vpogledi v podatkovni slovar 3.5. Generator aplikacij (vodenje preko menijev, help) 3.6. Možnost izvajanja batch aplikacij v samostojni particiji 3.7. Moinost orodij za pomoč pri metodah analize in designa na PC (CASE Tools) 3.8. podpora poslovne grafike na PC 3.9. Razvoj aplikacij na PC, ki imajo dostop do centralne podatkovne baze preko ■file transfer 3.10.Popolna zamenjava za klasične programske jezike (vse kar lahko v PL/I) 3.11. Relacijski pristop do podatkov izključna preko VieW-ov 4. MIGRACIJA * 8 10 10 3 3 10 10 1. PRODUKCIJSKO OKOLJE UTEŽ 1. 1. 1.2. 1.3. 1.4. 1.5. 1.6. 1.7. 1.8. Operacijski sistem DOS pod VM CICS okolje Možnost vsaj 250 CICS terminalov (za isto transakcijo) na obstoječem HW Odzivni čas 2 sekunde (5 transakcij/s) Sledi spremembam releas-ov IBM prog. Omogoča spremembo (strojne opreme, DOS -> MVS Vključitev drugih uporabnikov v Lu4k\ IS preko SNA File transfer na PC 2. PODATKOVNA BAZA 2.1 Enoten podatkovni slovar (ne glede-na st. DOS-ov) 2.2. Vsebuje orodja za poročila na DD, (crossreference) 2.3. Recovery (avtomatski restart) 2.4. Monitoring (moinost fizične namestitve podatkov) 2.5. Moinost sprememb podatkovnih definicij (ne prizadene obstoječih apli-kaci j ) 2.6. Zaščita podatkov (permiti) je na nivoju DB/DD 2.7. Zaklene samo slog pri spreminjanju (locking) 2.8. Moien je pristop z ANSI SQL jezikom 2.9. Tip podatkov datum in preverjanje pravilnosti datuma 2.10.Null vrednost (nedefinirana vrednost) 2.11. iransact1 on control, rollback, commit 2.12.Agregatne funkcije (svg,sum,group,by) 2. 13.1-lojnoBt dostopa do banke s PL/1 pro- -jr aniom 2.14.Moinost definiranja VIEW-ov z dostopom tudi do datotek, ki niso v bazi V S A M i 2.15.Omogoča kompresijo podatkov 2.16.Formiranje viewov tipa SELECT, JOIN z WHERE pogojem z moinost update na eno tabelo 2.17.Podpora nacionalnim znakom, latin 2 2.IB. Integri tete entitet 2.19. Integri tete domene 2.20. Integri tete referenčne 2.21.Kriptozaščita 10 * 10 10 3 * 10 7 10 4. 1. 4.2. 4.3. 4.4. 4.5. 5.1. 5.2. 5.3. S. 4. D. 5.6. i.7. Stacih aplikacij ni potrebno prevajati DL/I (DL/I transparenca) Starih aplikacij ni potrebno prevajati VSAM (VSAM transparenca) Omogoča postopen prenos v novo okoljt časi pri migriranih aplika-DB naj ne padejo več kot Odz ivni cijah v z a 10'/. Uti1ity za prenos job control 5. PODPORA PROIZVAJALCA Osnovno šolanje v Luki Koper Kvalitetno šolanje specialistov (sistemci, DBA) Literatura (aiurna, original 2 izvoda, številka licence, interni časopis proizvajalca) Nepretrgana tehnična podpora <24 urni telefon) UP-DOWN LOAD 5 G BVTOV < 8 urah Menjava releasov ne sme vplivati izdelave aplikacije in na banko podatkov Zagotovljeno svetovanje pri razvoju api i kaci j na 6. NABAVNI POGOJI 6.1. Cena enkratnega nakupa do 270.000 % možnost obročnega odplačevanja 6.2. Testna instalacija 6.3. Osnovno šolanje vključeno v ceno 10 7 5 5 10 10 10 10 7 10 4 10 10 rpi 5 Uj ; N 5 OP i Vrednost polja UTE2 pove pomembnost te lastnosti za rai-voj IS Luke Koper. Zvezdica v tem stolpcu pomeni, da je označega lastnost obvezna. Način izračuna točk /a posamezne produkte: N TP I = D Uj * OPij J = 1 - točke i - tega produkta - utei j - tega produkta - števila kriterijev izbora - ocena j - tega kriterija za i-ti produkt (v razponu od O do 10 točk) INFORMATIONAL LOGIC III INFORMATICA 1/89 Descriptors: LOGIC INFORMATIONAL, RULES FORMATIONAL AXIOMS INFORMATIONAL, INFORMATIONAL Anton P. Zeleznikar WELL-FORMED FORMULA (IWFF) Iskra Delta, Ljubljana In this part of the essay two main topics of the informational logic (IL) are discussed: formation rules which govern the structure of informationally well-for»ed formulae (iwffs) and infornational axioms. In the continuation of this essay informational transfornation rules of IL will be examined in a formal informational way. Formation rules of IL have to answer the question how to construct initial informational formulae which will belong to the so-called class of iwffs. Within this question the so-called operational, operand, and parenthetic constituents and their compositions into iwffs are determined. In formatting a formula (iwff) several basic processes can be applied, for Instance, beginninjl of formula formation, introducing of operands and operators in implicit and explicit forms into the context of an iwff, particularization and universai ization of formulae, etc. Afterwards, formation rules of IL are exposed in a short and concise manner. At the end, the question can be put what could be the form of a non-informational formula. Within the topic of informational axioms the following subjects are discussed: axiomatization of informational principles, how informational axioms can be generated, axioms of Informing, informational difference, informational circularity. Informational spontaneity, informational arising, counter-infornat ion, counter-Informing, informational embedding, informational embedding of counter-information, informational differentiation, informational integration, informational particularization and universalization, informational structure and organization, informational parallelism, informational cyclicity, openness of informational axiomatization, influence of metaphysical beliefs on axiomatization,, and axiomatic consequences of informational arising. Informacijska logika III. V tem delu spisa se obravnavata dve glavni naslovni poglavji informacijske logike (IL): formacijska (oblikovalna) pravila, ki urejajo strukturo informacijsko dobro oblikovanih formul (iwff) in informacijski aksiomi. V nadaljevanju tega spisa se bodo na informacijsko formalen naöin preučevala Se informacijska transformacijska pravila IL. Formacijska pravila IL morajo odgovoriti na vpraSanJe, katere informacijske formule pripadajo t. i. razredu iwff. V okviru tega vprašanja se opredeljujejo operacijski, operandni in oklepajni konstituenti in njihove kompozicije v iwff. Pri oblikovanju informacijskih formul (iwff) se uporabljajo nekateri osnovni procesi, kot so npr. začenjanje oblikovanja formul, uvajanje operandov in operatorjev v implicitni in eksplicitni obliki v kontekst formul, stikanje operatorjev, partikulariziranje in univerzalieiranje formul itd. Oblikovalna pravila IL je mogoCe opisati kratko in jedrnato. Vprašanje, ki ga je mogoCe postaviti pri oblikovanju formul Je, kakdna bi lahko bila oblika neinformaciJake formule. V okviru problematike informacijskih aksiomov pa se obravnavajo tale naslovna vpraSanja; aksiooatizacija t.i. informacijskih principov, kako Je mogoCe generirati aksiome, aksiomi informiranja, informacijske diference, informacijske cirkularnosti, informacijske spontanosti, nadalje aksiomi informacijskega nastajanja, prot i informacij e, protiinformiranja, informacijskega vmefiCanja, informacijskega vmeBCanja proti informaci je, informacijske diferenciacije, informacijske integracije, informacijske partiku1 ar i zac i je in un i ver zal i zac i J e , informacijske strukture in organizacije, informacijskega paralelizma, informacijske cikliCnosti ter Se odprtost informacijske aksiomatizaciJe in aksiomatiCne posledice informacijskega nastajanja. II. 2. FORMATION RULES OF INFORMATIONAL LOQIC ... in rejecting aentaj representations as the objects oP belief one is not thereby rejecting the empirica! iiypothesia that the brain ia an information processor and thus processes in a neural machine language. Stephen Schiffer [11] 5 II.2.0. Introduction In this section (II. 2.) we have to say clearly which kind of informational formula will belong to informational logic (IL). Thus, we shall deal with the question how to construct informational formulae which will belong to the legal form of informational formulae. The word legal will have the meaning of well—formed. We have to state precisely what is an expression composed of informational operands, informational operators, and parenthetic delimiters, in such a way that It will represent the so-called informational well-formed formula (iwff). In the context of formation rules we must consider that an iwff has to be capable of representing any information, most abstract as well as most ordinary life information, simple as well as most complex one. In this respect, it seems senseful to put the following question: "Is it possible to state explicitly what will be the limits of formula formation or ii: it at all possible to set any fixed limits which would disable realization of any general principle of information?" Within this dilemmatic view of formation of an informational formula we shall develop some basic rules of formation, not saying that these rules are the only possible ones. Already within the principles of information ([4] or [10] respectively) it was shown how informational formulae can be composed on the level of natural language. This experience tella us that informational formulae are in no way limited sequences of informational operands and informational operators in relation to spoken and written language. Relationships within information are objective (operand-characteristic) and subjective (operational) and can be changed from the operand to operational states and vice versa during an informational process. Thus, a local informational operator can be viewed as an operational variable in a wider informational observation. Due to this phenomenology of informational compositions of operands and operators, iwffs are, in general, not structurally limited in any particular way. Limitations can be determined in cases of particular informational theories, concerning, for instance, formal logical systems as traditional symbolic logic, modal logic, etc., which can be conceived as special projections (particularizations) of informational logic. The next fact to be explicated ia, that sets of objects of IL are generative, i.e. not limited (determined once and for ever) in advance. Only static theories deal with finite and strictly determined sets of objects and as such are understood to be the most informationally primitive (static) forms of IL. In this respect, the set of formation rules of IL will not be semantically limited and informational operators will be recruited depending on needs, goals, and applications, which arise during an informational process. Similar will hold for informational operands occurring in Informational formulae. The principle of informational arising will govern the arising of concrete formation rules (concerning concrete informational operators). However, it will be possible to present the essential framework of the arising of formation rules within IL. II.2.1. Informational Operanda as Constituents of IWFF« The nature of information is variable in its arising, changing, vanishing, and disappearing. An informational operand is such a sort of variable information. We have the following basic definition concerning an informational operand as iwff: [Operandsl®^^: Informational variable a, defined as informational operand, is iwff. This operand can represent various kinds of information belonging to an informational realm. Thus, ('ot is informational operand') ^ Co is iwff) ■ In many cases it is reasonable to separate informational entitles as variable operands. So, one can set the following definition: i [Operands)"''^: A set of informational variables a, ß, ... , T» in which a, ß, ... , ^ are defined as informational operands, is iwff. These variables of the set can represent various kinds of information belonging to given informational domains. It is: ('a, ß, ... , f are informational operands') ('a, ß, ... , T is iwff) ■ In the last definition, the commas can be understood as inforutional operators, which connect operands into an informational set. The sequence a, ß, ... , could as well be written in the following way: at:: B f^comma '^comma ' • ' '^comma ^ where _____ ie the set-connective comma informational operator representing the delimiter A set of operands a, ß, ... , Tf can be represented by a resultant operand, say 5, where ? = a. ß.....T In this case, the symbol '=' is the informational operator of representation. It has the meaning that ? representatively Inforns o, ß, .... T or 5 a. ß.....If Let us set now the definition reverse to [Operands : [Operanda 1®": If o represents an iwff, then informational operand: i a an ( 'a is iwff ) ^ {'=con^=J>' "here Nj 6 l=. i8 a recursive definition of OC. If is an OC, one can write instead of (1) con and (2): (1)' (l=i e H ^ H h,) (2)* € A O marks OC')) ^ «»=con»=J> Expression (2)* is a kind of informational modus (a particular form of the so-called modus informationia): l=j € denotes 00') The consequence of [Operators is that is a word belonging to the set (1=1, \ (1 where () denotes the empty set. Ì8 a composition of informational operators in an arbitrary complex (interweaving) way. In a particular case, W can be the linear (usual, con mathematical) composition of operators. The complex composition means a parallel (interweaving) activity of operators constituting . For instance, we shall allow con the notation instead of |=HI, where in 1="' the relation of dominance |= A HI will be assumed. It is evident that among operators, constituting Kcon' additional dependencies (relations) can be determined. (Operatora If t= infornational operator (or operational variable): con represents a sub-iwff, then h is an con ''^con 8"b-iwff') :> is an operational variable') This definition says that irrespective of how '"con sub-iwff is structured, in fact. it represents a composite (complex) operational variable in which its components (suboperators ) are variables too. Formally, as a concatenation of operational variables, |= functions as an con operator composition. In regard to an iwff, a sub-iwff is in some way a reduced form of the iwff concept. This reduction is semantically presented as the prefix 'sub' in sub-iwff, which is a concatenation of informational operators and is marked by operator II.2.3. Parenthetic Delimiters and Parenthesizing of IWFFs In fact, parenthetic delimiters can be understood ae the delimiting informational operators within an iwff. Their function is to determine the so-called iwff's unities within a formula. A unity of an iwff can be used as operand of a higher operator structure. For parenthetic delimiters arbitrary symbols can be introduced, for instance, parentheses, brackets, etc. Besides parentheses, it is possible to introduce the so-called non-substantial delimiteri by which the so-called non-substantial part of an iwff will be marked. So, let us have the following definition: [Delimiters]®''^: Irrespective of their choice, the parenthetic delimiters occurring in an iwff unite parts of the iwff or the whole iwff, with the intention to define the unit they delimit to be used for some operation over the unit. Parenthetic delimiters occur always in pairs, consisting of the begioning and the ending delimiter, and can be nested within Other pairs of delimiters. Usually, parenthetic delimiters will be denoted by ' ( ' for the beginning and by ' ) ' for the ending delimiter. However, also other kinds of delimiters can be introduced, for instance, the pairs [, } and (, ], all of which can be understood as particularizations of the delimiter operators and , denoting the beginning and the ending parenthesis. ■ [Delimiters]®^^: Parenthetic delimiters can be used in pairs in such a way, that they delimit an expression within an iwff, which is either an iwff or a sub-iwff. In this case the usual nesting principle of parenthesizing is valid. ■ [Delimiters] : A special, unary delimiter is in fact the symbol, markini; the so-called non-substantial or self-comprehensive part of an iwff. This delimiter is composed of three consecutive dots, thus '...'. The three-dot delimiter is a legal symbol of an iwff. ■ {Delimiterai^^: Considering the previous three definitions, the legal formulae or iwffa are, for instance: ß, )) With the last formula we can even determine the positioning of parentheses in an iwff. Fvidently, this formula can be rewritten as a(P(T))i where the entities a, p, and T are understood as unsubstantial parts of the formula in question. ■ II.2.4. Some Basic Processes within the Formation of Formulae So,far, we have used the following basic processes in the formation of a formula: (1) Introducing of informational operands as constituents of an arising formula, where the operand as a variable represents an iwff. <2) Setting of informational operands into sets of variables, where distinct variables were separated by commas. Such a set of variables was declared to be the iwff. (3) Introducing of explicit informational operators of the type in an arising formula and concatenating them by other informational operators into a sub-iwff (OCs) within an iwff. (4) Introducing of explicit informational operators and their operational concatenations (OCs) and concatenating them with operands and their informational sets, thus formatting an iwff. (5) Introducing of implicit informational operators of the type gr (functional operands) into the operand parts of an arising formula and concatenating them (functionally) to an parenthesized informational set of operand variables. (6) Introducing of explicit and implicit informational operators in a particularized and universalized way of their choice. Hven if operational particulari zat i on and universalization are the basic formatting principles, they could be understood first of all as formula transformation principles (see subsection II. 5). (7) Formattin« a complete formula (iwff) means to use rules (l)-(6) in an arising manner. In this respect an instantaneous formula can be always developed by further steps proceeding from one formational state to the other in a growing (enlarging) or a vanishing (reducing) manner. II.2.5. Formation Rules of IL In the previous definitions of the subsection II.2 we gave the rules for formation of an iwff in the following way: it was said what operands and operators constituting a formula are. Then it was said how operands and operators can be combined or concatenated into a formula. Also, the use of the so-called delimiters, which determine the units or subunits of a fornuln, was described. There were not any particular restraints for formula formation. In general, combining of operands, operators, and delimiters in the described way, leads to the formation of a formula. In such formatting processes, particulari zat Ion and universalization of operators are still possible, [Formatting of formulae]®^': An informational well-formed formula (iwff) can be constructed by the use of the definitions ,DF4 2 [Operands]®'''^ - [Operands]®^"*, [Operators] ,DF3 DF44 [Operators] DFl and [Delimiters] [Delimiters] by concatenating (combining) operands, operators, and delimiters according to the description in subsection II. 2.5. With this procedure production of an iwff is assured. B [Formatting of formulae: Informational well-formed formulae, being constructed according to the previous definition, can be composed into the so-called informational system. In this system, distinct, iwffs are separated by commas or/and can appear in different linea. Similarly as a single iwff can be marked by a symbol of operand, the informational system can be marked by a symbol of operand. Informational system performs as iwff. ■ Later on, discussing the transformation rules, we shall show how formulae can be decomposed into systems and how systems of formulae can be composed into formulae under certain circumstances. In this essay we have given several examples, which clarify and illustrate the principles of formula formatting. II.2.6. What Could Be a Mon-Informational Formula? The first approach to the topical question could be in questioning what are already informational formulae. The answer to this counter-question is that all mathematical formulae certainly belong to the class of informational formulae. However, informational formulae are not necessarily mathematical, for mathematics does not deal with the creative component of information (Informing, arising, generating, functional changing, etc.). The semantics of informational operators and operands as variable subjects and objects is generative, while mathematically chosen objects are invariably determined. The answer to the topical question is that anything we form out of informational operanda, operatora, and parenthetic delimiters is either iwff or sub-iwff. Certainly, there exist sone unessential distinctions. For instance, if we take two informational variables, say a and ß, we can form several compositions, such as aß , ß, a(ß), V (a =1 ß)) (6) ( ((ß 1= a) v V (ß H a)) (a / p) (3) (('a is datum') A ('ß is datum')) (a = ß)) (4) (('a is information') A Cß is information')) :> (k*^ (a = ß)) (5) ((a N) V (1= a)) ^ (L 8^(a)) (6) ((4 a) V (a 4)) (»tt««) -J) (7) 8^(a) u(a) In (5)-(7), 8^(a) denotes the difference arising as Informing of a over itself and marks the possible equivalence between informational difference and arising of counter-informàtion from a. For instance, a pure logical axiomatic conclusion would be that (a ^ ß) :> (f, (a = ß)) From (3) it is understood that only data can be equal. Thus, the equality between two informational items is possible in the realm of information, which represents information as data, that is on the informationally static basis. Sooner or later informational equality remains very unnatural and lifeless informational property. lAxioms]®^«: If ß is information and if a = p, then a is the marker for ß. In this case oc is the so-called marking information, which has the nature of information ß whose marker it is. Formally, 32 (CP is information') A (a = ß)) ('a is the marker of ß') II.3.1.3. Axioms of Info Ciroularity ,ttonal [Axioms] DF5. (1) (a 1=) (a N oc) (2) (a h) a) <3) (H a) (a J= P) > ((a A ß)) <ß ^ o) > <(« 1= ß) l=> ((a h a) A <ß ». ß) A <=) V (t= a)) ((a t= 3„(a)) V (9^(a) N «)) < be counter-information which arises from information a. Let the meaning of operators L and J be 'causes the appearance of or 'comes into existence from', respectively. There is possible to set several axioms, for instance: (1) ((a h) V (I- a)) (a L u)(a)) (2) ((=1 a) V (« H)) (u(ot> -I a) <3) ((a V (H a)) ^ ((a u(a)) A (u respectively. ■ II.3.1.8. Axions of Informational Babedding Let, for instance, sensory or outward information ((« J a, ff) A (e J E) ) (4) ((cr (= a) L E, e) ( (« = E ) A (e = C (0-, a) 1 o' I a c,a (5) (e , E J (a =) cr) ) i ((E = E ot (6) () H a)) a a, a ((« V (t=jj a)) (2) ( (=j a) V (a H) ) ((Hj, o() V (a (3) (a a) ^ (a oc) II.3.1.10. Axioms of Informational Differentiation Differentiation of information is an inherent property of information which informs and is informed. Differentiation is not only informational arising, but arising of informational difference in comparison to the present state or processing of an informational entity. Differentiation is a component of informational arising with the Intention to arise differently to existing information. The consequence of this fact is that information arises differently. To enter into the discourse concerning informational differentiation, we can introduce two basic and general differential operators which govern the so-called explicit and implicit informational differentiation. _____,DF45. (Operators] : The explicit informational operator of differentiation can be determined in the following way: (4) (a ß) (a, ß a, ß) etc. The last axiom can be constructed from a more general one, namely from, (a t= ß) :> (a, ß h a, ß) by the non-uniform substitution of the second operator, i.e., by its particularization. ■ Through informational differentiation of information also several differences can be determined. These differences can be marked by special symbols. For instance. 8 oc,ß , ,(5. T).....K) S <(o, ß.....tI=5. .....?)v (5. T), ... , ? =1 a, ß.....t) ) This formula shows the possibilities of conversion between implicit and explicit informational operators, where marking of a difference becomes equivalent to the result of an implicit operation. For instance. (N^ V H^) "Df ('differentiates' ) V ('differentiate') V ('i8_differentiated(_by)') V ('are_differentiated(_by)') »a,ß........... ®a,ß, ,(5. TI.....K) The implicit informational operator of differentiation can be determined as 36 II.3.1.11. AxIobb of Inforaational Integration Integration of information ia an inherent property of information which informs and is informed. By Integration the incoming, arriving, and arising information is informat i ona11y integrated into source information or into information in question. Informational integration is a consequence of the appearing information which has to be integrated into an existing informational realm, otherwise it will be lost as informational noise. Similarly as in the case of differentiation, it is possible to introduce two basic and general operators of integration which govern the so-called explicit and implicit informational integration. [Operators One kind of the explicit informational operator of integration can be determined in the following way: "Df {'integrates') v ('integrate') v ('Ì8_integratod(_by)') V (•are_integrated(_by)') The primitive implicit informational operator of integration can be determined as a.ß , T), K) with the following meaning: a, ß, ... , t integrate ?, i), . . . , ^ or ?, t), . . . , ? are integrated by a, ß, ... , f into a, ß, ... , T-Evidently, the operators Hgf and 3 integrate the given information into the integrating information itself. At this point a clear difference between differentiation and integration comes into the foreground. Certainly, the complete implicit informational operator of integration can be determined in the following way: ^[X.n.....v)a,ß.....T*^' .......^^ The meaning of this implicit operator is as follows: informational entities a, ß..... T integrate informational entities tj, ... , ? into informational entities X, (i, ... , v. ■ [Operators]®''": Now, we have to define an informational operator of location with the meaning "into", to enable the expression of this particular need, for instance, to be integrated "into" information. This operator has to be of explicit informational type, for difficulties of expressing the process concerning the "into" occur, for instance, by the use of the operator Hn■ We have: o "Df ('into') Further, we can introduce the left to the right and the opposite version of this operator by and respectively. The proposed operator is general and introduces substantial semantics into our further discourse. ■ [Operator]"^'"': In the explicit case the informational operator of integration t=g the need has arisen to express into which informational entity information will be integrated. Wc have now the following possibility: a Ng ß T A^ ß =1, The meaning is the following: information a integrates information ß in one or another way into information T- By the operator 1 we can even capture the most subtle phenomenon of coming of information into existence. Thus, we can decompose the operator L to some degree by splitting its meaning into "coming_of' (>=„„ ) come and 'into'. We can introduce the following implication: {L a) <«) J» come At this point the question what is existence can arise. "Existence" has the meaning of information of existence or of existing information. Similarly, "coming" has the meaning of information which comes or of coaing information. This kind of Information is, for instance, coun ter-informat i on or sensory information. Coming of information into existence is, for instance, embedding of the arisen counter-information into the exinting Information. Here, coming into exislfncp concerns informational differentiation as well as informational integration. In other words, arising of information is nothing else but counter-Informing and embedding or differentiation and integration of information. These two types of Informing constitute the so-called Informational cycle, which is the cyclo of coming into existence: from the existing information arises the counter-information and is embedded again into the existing information, enlarging (or decreasing) its informational realm. This informational cycling is the fundamental process of any informational arising. Therefore, we can say that information informs (differentiates) and is informed (integrates) cyclically or, in a more general sense, circularly. [Operators]®*'®: Let us clear the meaning of the explicit informational operator of integration in composition with the informational operator 'into*. We can list the following examples: a or (ct (=g) 1 a information a integrates or information a integrates into itself; « or (hg ct) 1 a information a is integrated or information a is integrated into itself; « t=g ß (a 1=3 ß) IT information a integrates information ß or information a integrates information ß into information t ; the last formula can also be read as informatio'n ß Is integrateci by information a into information t I a, p, ... , T t=3 5. 7), ... , K or (a, ß, ... , T t=g .......A X, ... , V informational entities a, p, ... , T integrate informational entities , t), ... , ^ or informational entities a, ß, ... , T integrate informational entities T), ... , ? into informational entities X, (1, ... , v; the last formula can also be read as informational entities t), ... , ? are integrated by informational entities a, ß, ... , y into informational entities A, ji, ... , v, etc. These cases of informational integration, using the explicit operator |=g, can certainly be determined also for parallel, cyclical, and parallel-cyclical cases (l=g. Hg, and H-g). ■ [Axioms! (1) ((a t=) V (H ct)) f <(a Hg) V (h=3 a)) (2) <(=! a) V (a ^ ((^g a) V (a ^g>) <3) (a N «) ( ß h oc, ß) N a, ß)) by the non-uniform substitution of operators (e.g. particularization of the second operator on the left side of implication). ■ (AxioBsl®^^: The axiom (4) in the last definition can be decomposed in details in the following way: (((oc •=3 a) JL a) V ((a ß) 1 oc) V ((ß •=3 oc) 1 oc) V ((ß ß) 1 a) V ((oc a) 1 ß) V ((oc P) JI ß) V ((ß >=3 oc) X ß) V ((ß •=3 ß) 1 ß)) This " is the well-known principle of thè iwff decomposition. The so-oalled parallel decomposition of the last case would be as follows : (a N ß> ((a a) 1 a, (a |=g ß) 1 a, (ß 1=3 a) I a. (ß tg ß) 1 o. (oc 1=3 a) 1 P. 1 ß) This example is in fact the axiom of the parallel decomposition of the case a H ß- ■ II.3.12. Axioms of laformational Particularization and Universalization In informational logic iwffs can be particularized and universalized. This principle permits various substitutions of explicit operators, enabling specialization (particularization) and generalization {universalization ) of informational formulae. Processes of informational particularization and universalization are the basic, i.e. axiomatic properties of an iwff. These processes could be included as well into the domain of the so-called transformation rules, for through their application, formulae are transformed from original semantic domains into other special or general ones. ,DF16. [Axioms] (1) ('a is iwff) (2) ( 't= is sub-iwff • ) con ('?(a) ia iwff) (3) (oc N ß) (4) (ß H a) (5) V. ('a can be put into the for« of aa Iwff'))) This axioB says: for each a, which is infonaation, irrespective of its complexity and inforaational nature, there exists at least one iwff such that (•) inforaation can be put into the fori» of this iwff. (Va).(('a is inforaation'> OCiwff'). <'iwff Is an adequate interpretation of a'>)) This axioa says: for each ((3,.).((a 9) V (9 a)) V (3 fi'V2.....^n* "•''2'■ ■ "'n where 9 symbolizes iwff, is informational operator of adequate interpretation, and each of the three conductive parts on the right side of implication concerns one of the three ax i oma. ■ II.3.14. Axloas of Inforaational Structure and Inforaational OrganlEation What are informational structure and informational organization and how do they reflect in informational ax i oma t ization ? To answer this question we have to consider the principle of informational structure and informational organization ([4] or [10]) as follows : "Informational structure is a constitution of information, that is, a coasti tution of inforaational forms and informational processes that are composed as information. These forma and processes are informational components. The inforaational relations among informational components which determine a composite information cons ti tute in formati onal organi zation. In terms of informational epistemologjr, informatioaal structure is closer to the form, uheream informational orgaaixation ia closer to the prooemm. Vithin information, informational forms and iaformational processes are in forma tionalljr interwoven components. Informational components integrate information. Informational structure and informational organization are information by themmelves." What are forms and processes constituting a particular inforaational case? Speaking in the language of informational formulae, these forms and processes are inforaational operands and operators, where forms are, for instance, self-standing operands and processes are foraally grouped (parenthesized) operands and operators. In this respect, informational structure appears as a more or less pure syntactic structure. We suppose that given Information always has a structure. This structure, which is observed as information concerning the structure of information in question, can always be interpreted through an iwff in a simple informational case or through an informational system of iwffs in a complex informational case. The structure of information is interpreted in iwffs by particular informational forms and processes, consisting of informational operands and operators. It is possible to list several axioos concerning the structure of information, for instance: [Axioas)»"«: (1) (Va).(('a is information') ^ ((3ff).(').('a ia interpreted by an adequately syntactically structured <(>'))) In this and in the next axioms, 9 marks the so-called iwff. (3) (Va).(Ca is information') ^ ((39).C9 as inforaation syntactically constitutes inforaation a'>)) (4) (Va). (Ca is inforaation') ^ ((39).('9 interprets the syntactic structure of information a'))) etc. These axioms constitute the so-called structural hypothesis of information. This hypothesis, in fact, is the inforaational principle of structural constructibility of information and its adequate iwff. Here iwff is understood to be an informational system or any form of informationally connected iwffs. ■ The structure of information a is inforaation which concerns the coaponential syntax of a. This syntax is interpreted into the structure of iwff of a. Structural interpretation of a onto its iwff does not represent the sufficient condition for a completely adequate interpretation of a by its iwff. The second component, called informational organization of a, has to be considered when the iwff adequate to a is constructed. While the structure of information predominantly concerns the so-called syntax or form of coBipon en t i al constitution of information, organization of information predominantly concerns the so-called semantics or componential processes, relations, and informational interweaving of informational processes. Among possible interweaving of informational forms and processes within information, the most important seems to be informational parallelisn, which is the synonym for interweaving nature of informational forms and processes. It becomes evident that information, by its nature, is nothing else but extremely interwoven structure (topology, granularity) and organization (selectivity, relationship) of information. [Axioms] DF19. (1) (Va).(('a is information') ^ ((3u).('io interprets the semantic nature of information a'))) In this axiom, u denotes information concerning the organization of information a. (2) (Va).(('ot is organized information') ((3 ae information semantically constitutes information a'))) Instead of the consequence in the last implication it could be ( (a h CT) A (o- C a) ) If information a is structured, then it informs (gives, transmits) information ct of its structure (c C a). (4) ('a is interpreted by an adequately syntactically structured 9') ^ ( ( (a C 9) ) A (3(9 t=).(9 o<))) The iwff 9 in fact constitutes also the structure of information o. The operator of this syntactic constitution is ^gy^^- There exists such Informing of 9 that 9 syntactically informs a. (6) ('9 interprets the syntactic structure of information a') ^ ( (o- C a) (9 a) ) (7) ('u interprets the semantic nature of information a') (u N «(a)) Instead of the consequence in the last implication it could be u . «(a) or, conventionally, u = «(a) int a(a) has the meaning of "semantic, i.e. organizational nature of a". (4) (Va).(('a is information') ^ ((39).<'9 interprets the semantic organization of information ot'))) etc. (AxiomBl BX3. DF18 DF19 The axioms [Axioms] and [Axioms) can be interpreted in a more symbolically compact and instructive aanner. Let us construct the following implications: (8) ('a is organized information') ((«h«) A (wCa)) If information a is organized, then it informs (gives, transmits) information oj of its organization (u C a). (9) (('a is interpreted by an adequately semantically organized 9') v ('9 semantically interprets o as informational organization')) ^ ((u C a) A (u C 9) A (9 (1) Ca is information') ^ ( (a t=) V (>= a) ) (2) (' ((3 ( (9 N^y^, «)))) DF19 etc. Further, for [A.xiomsl there Is: a It CT, at bi, (T, u m syn "^sea ' ' ^ form * The consequence of this system is a

.(9 K sem =) V (1= a)) :> ({39).(<(uCa) :> (uC9)) A (3(9 h).( a))))) If a is information, then it informs and is informed in parallel in one or another way. This fact can be expressed also in the form of parallel informational system, i.e. , ('a is information') (a h, ¥ 4 a, a HI) (2) (Va).(('a is information') etc. ■ DPI8 DF19 The axioms (Axioms] and [Axioms] assure the existence of an adequate (Informationally complete) interpretation of any information a onto its Iwff 9 in the sense of informational structure ct and informational organization u. This leads to the fundamentally important axioms of cona truet ibi 11ty of iwffa for arbitrarily occurring informational cases. [Axiomsl'^'^O, The axioms which follow govern . the interpretation of information a onto the structure ct and organization tu of an iwff (or of a system of iwffa) 9, which models a in the informationally complete way. The process of constructing iwff from given information can be expressed in the following manner: a CT, a 1= w, CT, lü t=- 9 If a is information, then there exist counter-Informing caused by a, (a I- (a, I- w). («• w ^ e^), {a , , u, I- a ) ) a a This cyclic parallelism can be captured in the most complex form by the iwff II.3.16. Axioas of Informational Cyclioity Information is a cyclic informational phenomenon in itself as well as in its interaction with other or outward information. It means that informational forms and informational processes appear, inform, change, vanish, etc. in a cyclic manner. Cyclicity of information can be viewed to be purely serial, parallel, or serial-parallel phenomenon. The last case seems to be the most obvious one. Within this phenomenology, cyclicity can be understood not only topologically and temporally, but also symbolically, abstractly, expressively. The basic question is how information.performs cyclically in itself. Why informational interaction is in fact always a cyclic Informing? Let us frame these observations axiomatically in the following manner: lAxio.sl®''": (1) Ca is information') :> (((ah) V (J- a)) V ((a |-) V o)) V ((^ a) V (a -I)) V ((HI a) V (a -J) ) ) If a 1b information, then it informs and is Informed cyclically and parallel-cyclically in such or another way. This fact can be expressed also in the form of parallel-cyclical informational system, i.e., in a particular form: ('a is information') ^ (-1 a, a -I, HI a, a HI) (2) (Va).(('a is information') :> ((a N S^) A (a, 1= u) A (a, E , u h «„) A (a, (! their continuation in any construction of tuff. References of IL I, II, and III (11 H. L. Dreyfus and S. E. Dreyfus?: Mind over Machine. The Free Press, Macmillan, New York 1986. [2] A. P. Zeleznikar: On the »'ar to Information. Informatica 11 (1987), No. 1, 411. [3] A. P. Zeleznikar: Information Determinations I. Informatica 11 (1987), No, 2, 3-17. (41 A. P. Zeleznikar: Principles of Information. Informatica 11 (1987), No. 3, 9-17. (Published also in Cybernetica 31 (1988), 99122. 1 [5] A. P. Zeleznikar: Information Peteraina-tions II. Informatica II (1987), No. 4, S-25. [6] A. P. Zeleznikar: Problema of Rati ona I Understanding of Information. Informatica 12 (1988), No. 2, 31-46. [7] h. A. Steen (Editor): Mathematics Today. Twelve Informal Essays. Springer-Verlag. New York (Third Printing, 1984). [8] B. de Bono: The Mechanisms of Mind. Penguin Books, Harmondsworth, Middlesex, England (Reprinted 1977). [9] G.W.F. Hegel: The Science of Logic. In Hegel's Logic, being Part One of the Philosophical Sciences (1830), translated by W. Wallace. Oxford, At the Clarendon Press, reprinted 1985. [10] A.P. Zeleznikar: Principles Information. Cybernetica 31 (1988) 99-122. of [11) S. Schiffer: Symposium on Remnants of Meaning: 1. Overview of the Book. Mind and Language 3 (1988), No, 1, 1-8 (Basil Blackwell). [12] T. Winograd: On Understanding Computers and Cognition: A New Foundation for Design (A response to the reviews). Artificial Intelligence 31 (1987) 250-261. At the end of section II. 3, in which we have discussed informational axlons, it aeems necessary to stress again the arising (or at least variable) nature of informational operands, operators, and formulae. Such as they are, all of the listed axioms in this essay concern the arising principle of occurring informational entities. Thus, this informational nature is found not only In the semantics of explicit informational operators TRANSPUTERJI INFORMATICA 1/89 Descriptors: TRANSPUTERS, PARALLEL PROCESSING Borut Jereb,* Ljubo Pipan,** Aleš Klofutar Institut J. Stefan * Gorenje raziskave in razvoj, T. Velenje ** Fakulteta za elektrotehniko, Ljubljna Za dcweganje večje imogljivosti računalnikov, je «nanih veliko teoretičnih pristopov. Nekatere od njih je moine realiiirati, nekatere debo, nekateri pristopi pa so s sedaj posnanimi tehnologijami nerealni. Pri transpnteijih je načrtovalcem uspelo realiiirati nekaj sanimivih lamisli. Članek je pregleden in obravnava predvsem najnovqSi transputer T800, ki je relativno cenen in «mogljiv gradnik v eno ali večprocesorskem sistemu. Pri večprocesorskem sistemu so lahko transputeiji močno ali Mbko povesani. Model T800 Ima obenem veliko nnmerično moč. l^i moč je posledica nekaterih iivimih reiitev, ki so opisane v članka. TRANSPUTERS Several theoretical principles for the achievement of greater computer capability already exist. Some of theae can be carri«d out, some are only partially aplicable, while many theories can not yet be realised, given today's technology. Since experts have succesed in discovering several good and applicable ideas and possibilities m transputer designing, this paper is meant to give the reader a lucid survey of transputers, with special emphasis placed on the newest model: T800. This is a relatively cheap and capable unit, designed for UM in both single or multiprocessor systems. In the latter, the transputers can be either tightly or loosely linked and the T800 model has, at the same time, great numerical power - a result of the several original solutions presented further on in this paper. OUVOD Clanek opisuje dniimo transputerjev. To so isdelki podjetja In-moe, ki se od klasičnih mikroprocesoijev raslikujejo po prilagojenosti paralelnemu procesiranju. Prn itlri poglavja podajajo sploien opis dniiine transputer-jev. Peto poglavje na kratko opisuje nekatere imoSnoeti transpu-terja IMS T800 (predvsem enoto >a delo s Itevili v plavajoči vejici). Sledi ie laključek. 1 KONCEPT m ARHITEKTURA TRAN8PUTBRJA VLSI tehnologija omogoča ceneno iadelavo velikega števila enakih mtegriranih veijj velike imo^ivosti. Z uporabo več enakih mte-griranih veiij lahko realliiramo sistem ■ neko mero paralelizma Ol. sočasnosti. FPU - CPU RAM — LINKS MEMORY INTERFACE slika 1 Bločni diagram transputerja IMS T800 TVansputer je VLSI mtegrirano veije, ki vsebuje: procesor (CPU), pomnilnik (RAM), komunikacijske kanale (LINKS) za direktno povesavo i ostalimi transputerji v tako imenovani mreJi transpateijev in vmesnik la zunanji pomnilnik (MEMORY INTERFACE); nekateri, odvisno od tipa, tudi enoto la delo » Itevili v plavajoči vejici (FPU), (slika 1). Načrtovalcem tega veija je uspelo napraviti veije, ki je dobro prilagojeno paralelnemu procesiranju. Dodatek k paralelizmu realiziranem z večimi transputeiji je velika mera notranjega paralelizma v samem transputerju. Samostojen sistem lahko predstavlja ie m sam transputer aJi pa več v mreio povezanih trans-puterjev. Kot primer slika 1 podaja bločni diagram transputerja IMS T800. 1.1 PROCESOR IN POMNILNIK NA SKUPNEM VEZJU V sistemih sestavljenih ii več VLSI vezij (procesor, pomnilnik itd), je hitrost prenosa podatkov med veiji, glede na hitrost delovanja samih veiij, zelo majhna. Obenem pa vsaka operacija, ki jo izvede procesc«-, lahteva uporabo ponmilnika. Zaradi slednjega sta pri transputerjih procesor in ponmilnik sestavna dela enega samega vesja. Tipična velikost pomnilnika na vezju je pri sedanjih transputerjih nekaj Kilogov. 1.2 KOMUNIKACUE MED TRANSPUTERJI Povezave med vezji in dodatna vezja za upravljanje povezav pomenijo ^avno om b order := gt a enostaven is. >1 <» > integer, real AND, OR, NOT boolean A (bitwise and), V (bitwise or) >< (bitwise xor), ~ (bitwise not) mtegers «, » (premikanje) integer Tabela 1 Operatorji v Occamu Vsak proces lahko ima svoj neodvisen časovnik, ki ga uporabi la svoje meritve ali sa rasdeljevanje dela v realnem času. Časovnik prebere vrednost v spremenljivko tipa INT. Npr: tim 7 v postavi spremenljivko v na trenutno vrednost prostotekoče ure, ki je deklarirana kot časovnik tim. Dostop do lunanjih enot je v Occamu omogočena i mehanij-mom vhodno/ishodnih vrat. Vrata se uporabljajo podobno kot kanali. Podobno lahko le en proces bere iz v/i vrat in le eden daje na v/i vrata. 8 KODIRANJE INSTRUKCU Vsi transputeijl imajo enak osnovni nabor maloltevilčnih instnik-cij. Vsaka instrukcija vsebuje osem bitov, ki so rasdeljeni v dve skupini po itiri bite. (Glej sliko 3.) F\mkdj3 Podatek 4 3 slika 3 Ibrmat transputerjeve instrukcije. Pomembnejši «tiije biti tvorijo funkcijsko kodo, preostali Štirje so podatki. Dobimo 16 'direktnih* funkcij (nalaganje, shranjevanje, skoke, klice...). Vse instrukcije se isvedejo tako, da se spodnji Stiije biti prepišejo v spodnje itiri bite operandskega registra, katerega vsebina se kasneje uporabi kot operand instrukcije (glej sliko 4). Pri tem se vse instrukcije, rasen Prefix instrukcij (njihova funkcija bo ra-zloiena kasneje), končajo z brisanjem operandskega regpstra. "Wso je le ta pripravljen sa naslednjo instrukcijo. Ker pa takšno kodiranje dopnSča samo itiri bitne operande, imamo med zgornjimi 16. funkcijami dve, ki omogočata rasiirjavo velikosti operandov. To sta Prefix in Negative Prefix. Prefix instrukcija napobi spodnje itiri bite operandskega registra s svojim podatkovnim poljem in potem premakne vsebino operandskega registra za itiri mesta v levo (iiftanje). Negative Prefix instrukcija je podobna pravkar opisani Prefix instrukciji, le da komplemen-tira operandski register pred premaknitvijo vsebine sa itiri mesta. "Riko lahko operand z uporabo Prefix instrukcij povečujemo do velikosti operandskega re^tra. Nadednja od sgomjih 16. funkcij je Operate, ki tretira svoj operandski del - osiroma vrednost operandskega registra - kot operacijo nad vrednostmi v registrih procesorja. funkcija omogoča kodiranje ie dodatnih 16. operacij v enem zlogu. F^mkdja Podatek 7 4 3 0 Operuidu register slika 4 Polnjenje operandnega registra pri IMS T800. S Prefix funkcijo lahko raeiirimo tudi operande funkcije Operate, kar je ekvivalentno povečanju nabora instrukdj. Seveda so instrukcije kodirane tako, da lo najpogosteje uporabljane instrukcije predstavljene bres Prefix instrukcij. Meijenja so po kas ala, da je okoli 70% instrukcij, ki se isvajajo, kodiranih v enem »logu. [3] T800 ima dodatne instrukcije la delo i FPU. Pravtako vsebuje instrukcije la barvno grafiko, raspoinavanje viorcev in implementacijo kod la odpravljanje napak. To je realiiirano i Egoraj opisano moSnostjo, ki omogoča railiritev naJbora instrukcij. [5,7| 4 K0MUNIKACU8KE POVEZAVE Stiri identične dvosmerne povesave omogočajo sinhroniiirano komunikacijo med procesorji in komunikacijo i lunanjim svetom. Vsaka povesava vsebuje vhodni in iihodni kanal. Poveiava med dvema transputeijema je realiiirana s poveiovanjem vmesnika poveiave enega transputeija na vmesnik povesave drugega trans-putetja. Vsak poslüi podatkovni slog mora biti potijen preko vhodnega kanala iste povenve. Tb je sinliromiacija, ki poteka na vseh Itirih poveiavah transputeija avtomatično in ne lahteva dodatnega programiranja. Ko linija ni aktivna je iihodni kanal na niskem nivoju. Vsak podatkovni ilog s« prenese kot laporedje visokega start bita, enega visokega bita, tema bitoma sledi osem podatkovnih bitov in nizek stop bit. Potrditev, ki jo čaka oddajnik vsebuje visok in nizek bit (glej sliko 5) in je indilrator sa dvoje: proces je sprejel podatek in vmesnik je pripravljen la sprejem naslednjega sloga. Pofiljanje potrditvenih paketov, preden se podatkovni paket popolnoma sprejme, poveča amolnoet povezav. IMS T4M nima implementiranega pravkar opisanega pošiljanja in dosega le 0.8 Mik^gov na sekundo. Z implementacijo prekrivanja in zadostnih izravnalnikov za povezavo, je IMS TSOO omogočena več kot dvakrat vijja hitrost prenosa. [1,2,3,6,7] 1 1 Podatek 0 Podatkovni ilog I Potrditveno «poročilo sUkaS Elementi komunikacijskega protokola S IMS TSOO IMS T800 je naslednik T4I4, ki je bil prvi Sirie uporabljen predstavnik iz dniline transputeijev. T800 vsebuje, za razliko s T414, integrirano enoto za delo s itevili v plav^oči vejici (FPU). To predstavlja, glede na običajne reäitve s koprocesotji, le malo površbo dodatnega silicija. Običajno zahteva zadovoljivo povečanje numerične zmogljivosti dodatno povriino silicija, ki se giblje v razredu velikosti povriine silicija, porabljene za izvedbo mikroprocesorja. Seveda to avtomatično zahteva eno ali več integriranih vezij I vso logiko, ki je potrebna sa delovanje samega vezja (Weitek WTL1167 koprocesor za mikroprocesor Intel 80386 zahteva tri mtegrirana vezja). FPU dduje sočasno s centralno procesno enoto (CPU) in pod kontrolo CPU. (Glej sliko 1.) 5.1 ARHITEKTURA Procesor transputeija T800 ima malo registrov, kar je kompenzirano z zelo hitrim pomnilnikom na vezju. Sest repairov in enostaven instrukcijski nabor omogoča enostavno kontrolno logiko in enostavne ter hitre podatkovne poti. Registri procesoija so: kazalec na dekjvno področje, kjer so shranjene lokalne spremenljivke; kazalec instrukcij, ki kaie na naslednjo instrukcijo, ki se naj izvrši; operandni re^er, kjer se nahaja operand instrukcije; registri A, B in C, ki predstavljajo strojno izveden sklad. Slednji trije registri se uporabljajo sa aritmetiko pri naslavljanju, za celoštevilčno aritmetiko ter za logične operacije. Tudi FPU ima tri repstre (AF, BF in CF) v obliki sklada ta računanje. Nazlovi za podatke, ki so zapisani v plavajoči vejici se formirajo na skladu CPU. Obenem je pod kontrolo CPU prenos vrednosti med naslovljenimi pomnibiškimi lokacijami in FPU. Ker je CPU sklad uporabljan le za naslavljanje vrednosti v plavajoči vejici, je doUina besede CPU neodvisna od dolline besede FPU. Tiko doeeiemo, da isti FPU na vezju T800 lahko uporabljata T212 (16 bitna beaeda) in T4M (32 bitna beseda). Registrski sklad FPU je podvojen. Pomembnost tega se vidi ob preklopu T800 v delovanje z visoko prioriteto, ne da bi bilo potrebno prepisovanje vsebine sklada v pomnilnik. Rezultat slednjega je zelo ugoden časovni odziv na prekinitve |3,6,7]. 6.2 ORGANIZACUA NASLOVNEGA POUA Celoten pomnilniški prostor je naslovljiv po zlogih. Naslovi med #80000000 in #80000FFP naslav^ajo pomnilnik na vezju (to je 4 Kzloge). Uporabnifld pomnilnik se začenja na nazlovu #80000070. Lokacija s tem naslovom je označena kot MemStart (Memory Start) |3]. Glq sliko 6. Naslovi povezav »o v spodnjem delu pomnibika na vezju. ZaDaüji pomollnik Pomnilnik na veijn Povesave Uporabnifto doetop«n pomnilnik MemStart slika 6 Organizacija naslovnega polja 5.3 INSTRUKCIJE ZA DELO S ŠTEVILI V PLAVAJOČI VEJICI Jedro mnolice instrukcij za delo v plavajoči vejici so določili v fazi pred načrtovanjem IMS T800. To jedro vsebuje enostavne operacije vpisovanja in branja FP operandov ter cenovne aritmetične operacije. Po drugi strani pa je statistika, ki je bila izdelana na osnovi fortranzkih programov pokazala, da bi z dodatkom nekaterih bolj kompleksnih instrukcij povečali ujUnkovitoet in kompaktnoet kode. Odločiti so se morali za najustreznejši nabor instrukcij. Zato 80 opravljali raziskave učinkovitosti predlaganih razširjav naborov instrukcij. Tak nabor so potem testirali v numerične orientiranih programih. Pri tem so za vsak predlagan nabor instrukcij skonstruirali prevajalnik, program prevedli t njim in tako dobljeno kodo testirali na simulatorju. V nadaljevanja sledi opis rezultirajočega nabora instmkcij. IMS T800 prenaša operande med pomnilnikom transputer] a in skladom FPU z uporal» instrukcij za shranjevanje in nalaganje števil v plavajoči vejici. Obstajata dve skupini takšnih instrukcij: ena za števila enojne doliine in ena za števila dvojne doliine. Naslov operandov v plavajoči vqid se izračuna na CPU skladu, nakar se operand naloži iz naslovljene pomnilniške lokacije na FPU sklad. Omogočena sta dva načina naslavljanja FP operandov: direktni in indirektni. Slednji način naslavljanja olajša delo s polji. Operandi na FPU skladu imajo oznake, ki predstavljajo njihovo dolžino. Oznaka operanda se nastavi, ko operand naložimo oz. izračunamo. Te oznake zmanjšajo število instrukdj, ki jih rabimo pri aritmetiki v plavajoči vejici. Npr. ne rabimo instrukcije za seštevanje dolgih besed in za seštevanje kratkih besed, temveč preprosto le instrukcijo za seštevanje. Operaciji «a branje resultatov ii FPU shranita prebrano vrednost il FPU «klada v transputeijev pomnilnik. Za branje ne obstajajo indeksne instrukcije. To is^eda na prvi pogled presenetljivo, vendar isvira i« dejstva, da je v programih manj operacij branja is FPU, kot pa vpisovanja v FPU. Zato indeksno naslavljanje pri branju ni resdiiirano. Enojne instrukcije omogočajo najpogostejie operac^e v FP: seštevanje, odštevanje, mnoienje, deljenje in primerjavo. Resultat primerjave se prenese v register CPU. Zaradi pogostega seštevanja in mnoienja v programih so v smisln čim večje kompaktnosti kode in hitrosti imjanja nekatere instrukcije sestavili ii več osnovnih instrukcij. Npr instrukcija prištevanja operanda operandu na skladu je enakovredna dvema operacijama: vpisu operanda na sklad in seštevanju operandov na skladu. |3,6,7] 5,4 SOČASNE OPERACUE FPU IN CPU Pri IMS T800 dela FPU sočasno t CPU. sočasnost selo iiboljia inačibosti v realnih problemih, kjer so elementi polja relativno teiko dostopni. To je raividno is 'Livermore Loops* testa, ki bo opisan v nadaljevanju. Tä test je mnoSica majhnih jeder napravljenih tako, da predstavljajo moine tipe isračunov. Posebno vsebuje dostope do dvo in trodimeniionalnih polj, to je tam kjer transputeijeva sočasnost pokale selo dobre resultate. Prevajalnik isbere najugodnejši vrstni red računanja naslovov in s tem poveča časovno prekrivanje. Pri testu »Livermore Loo^ je ttlS T800-30 dosegel 2.25 MFLOPS, IMS T800-20 1.5 MFLOPS, T414-20 0.09 MFLOPS in VAX 11/780 (i PF pospdtevalnikom) 0.54 MFLOPS. Program napisan v Occamu sa ta test ima obliko |7): - LIVERMORE LOOP 7 SEQ k = O FOR n x[kl u[kl-f ((( r*(.[kl + (ror[k]))) + k-l-3 k-(-6 + + (r*(o k-l-2 M + {r*u -I- (r*u k+1 k+4 5.5 ZMOŽNOST IMS T800 ZA FP OPERACUE Casi izvajanj FP operacij niso zanes^ivo merilo hitrosti izvajanja pravih numerično orientiranih programov. Zaradi tega primeijamo numerično učinkovitost procesorjev s Whetstonovim preizkusom. Tb je program, ki je dobra imitacga znanstveno-tehničnega programa. Vsebuje ustrezno število in strukturo operacij v plavajoči vejici, klicov procedur, indeksiranja polj in računanja transcendentnih funkcij. Tibela 2 podaja zmošnoeti IMS T414 in IMS T80G v primerjavi z ostalimi procesorji glede na Whetstonov test. IMS T414 je trikrat počasnejši kot koprocesor MC68881, vendar ima kombinacija MC68000/MC68881 le 25% večje zmoinosti kot T414. To je zato ker je hitrost izračuna FP izraza odvisna od dveh stvari: prvič od hitrosti prenosa operandov v in iz koprocesorja in od hitrosti same FP enote. S skrbnim uravnotelenjem teh faktorjev, postane eno samo Integrirano vezja IMS T800-20 več kot petkrat hitrejše od kombinacije MC68000/MC68881 [7]. Procesor Intel 80286/80287 IMS T414-20 NS 32332-32081 MC68000/MC68881 VAX 11/780 FPA IMS T800-20 IMS T800-30 Tip 8 MHz 20 Mhz 15 Mhz 16/12 Mhz Sun3 Unfat 4.3 BSD 20 Mhz 30 Mhz Količina/s 300K 663K 728K 860K lOSdK 4000K 6000K Drugi pomembni kriteriji so še učinkovitost glede na površino silicija, zasedanje prostora na tiskanem vezju in potrebno it. dodatnih vezij. Poleg tega IMS T800 v mnogih aplikacijah ne potrebuje zunanjega pomnilnika, saj ga je na čipu le 4 Kzloge. štirje IMS T800-30 zavzemajo enako površino na tiskanem vezju kot 80386 z WTL1167, obenem pa omogočajo šestkratno učinkovitost v vsaki sočasni aplikaciji. Procesor ftIS T414-20 ktel 80286/80287 NS 32332-32081 MC68000/MC68881 IMST800-20 IMS T800-30 Količina/s 33K 37.5K 48.5K 54K 200K 200K Tibela 2 Primerjava procesorjev na osnovi Whetstonovega testa Tabela 3 > Normirana primerjava procesorjev na osnovi Whetstonovega testa pri taktu iMHz Tudi iz tabele 3 je razvidna moč T800 v primerjavi a predhodnikom T414. Ta moč izvira iz naslednjih dejstev: - V testih T800 uporablja 4 Kzloge velik pomnilnik, ki je integriran v vezju. - FPU je prav tako integrirana v vezje. ■ FPU in CPU delujeta sočasno, pri čemer CPU računa naslove operandov v FP operacijah. Rezultati bi bili za T800 manj ugodni, če bi naslavljal pomnilnik, ki ni na vezju: še slabši bi bili, če bi uporabljal serijske povezave za doseganje operandov. 6 ZAKLJUČEK IMS T800 transputer je zelo zmogljiv gradnik za paralelne gisteine. Obenem dokazuje, da ni potrebno uporabljati koprocesorja, če lelimo veliko numerično računalniško moč. Pri transputerju T800 najdemo na enem integriranem vezju centralno procesno enoto, numerično enoto za delo s Števili v plavajoči vejici, pomnilnik in komunikacijski sistem. Skratka cel računalnik na enem vezju. Na primer: štiri Kzloge ponmilnika je v aplikacijah, kjer obdelujemo signale, ponavadi dovolj. Pri tem ne potrebujemo zunanjega pomnibika. Zanimivo je tudi dejstvo, daje zmogljivost T800 pri računanja v plavajoči vejici večja od znoogljivosti mnogih drugih procesorjev pri računanju v številskih sistemih z fiksno piko. Tb vezje bo osnova najmočnejšega evropskega guperračunal-nika, ki ga načrtujejo na univerzi v Edinburgu. Vsebovalo bo 1000 transputerjev in 1 Gzlog porrmilnika. S takšno rešitvijo bo super-računalnik velik le slab kubičnih meter. Današnja tehnologija a katero je izdelan T800-20 ali T800-30 nudi 1.5 GFLOPS na 0.028 kubičnega metra atrojne opreme. Literatura 1 Inmos reference manual for transputers, 1986 2 Engineering data - IMS T414 transputer, 1986 3 Engineering data - IMS T800 transputer, 1987 4 Inmofl product overwiew {transputer development system), 1986 5 Inmos compilers writer's guide, 1987 6 IMS T800 Architecture • technical note 6, 1988 M Homewood, D May, D Shepherd, R Shepherd: The ft4S T800 transputer IEEE Micro, October 1987 (8| B Mihovilovič, S Mavrič, P Kolbesen: TVansputer - osnovni gradnik večprocesorskih sistemov: Informatica 4,1986 [9) F Mayer-Lindenberg: FIFTH on the transputer: Microprocessing and microprogramming, December 1987 48 OCENA INFORMACIJSKE ŠKODE STROŠKOVNO IN DOHODKOVNO TRANSFORMIRANIH PROIZVODNIH SISTEMOV INFORMATICA 1/89 Deskriptorji: PROIZVODNJA, ORGANIZACIJA, VODENJE, MODELIRANJE, STROŠEK, DOHODEK Viljem Rupnik Reboljeva 16, Ljubljana Modeliranje proizvodnih struktur je Se posebej občutljivo na kriterij modeliranja. Tu obravnavamo stroškovni in dohodkovni kriterij za meritve uspešnosti delovanja proizvodnega sistema in njun vpliv na izgubo informacij, pomembnih za upravljanje. 1. uvod Na področju organizacije in vodenja proizvodnje poznamo zelo pestro množico modelov, ki so (vsaj v praksi) pretežno statične narave in ki kljub morebitni visoki parametrizaciji zaradi svoje arhitekture spravljajo v zadrego, ko želimo doseči največji možni sinergetski efekt tako pri planiranju, kakor tudi pri upravljanju proizvodnih procesov. Ta primanjkljaj si-nergetske mase - kot se da pokazati - je še najmočnejši pri operativnih oblikah upravljanja, manj pa pri taktičnih verzijah modelov proizvodnih sistemov, kjer se ta defekt izgubi oziroma preglasi z razponom možne disperzije, ki jo povzroča slučajnostni značaj pogojev delovanja proizvodnega sistema predvsem v njegovi okolici. Naravna se zdi zahteva po ločeni "odgovornosti" ekonomista od "odgovornosti" tehnika, zato je smiselna takšna gradnja modelov proizvodnih sistemov, ki so zlahka dostopni za različne vrednostne transformacije, kot so npr. stroškovne in dohodkovne transformacije. V dosedanji literaturi, še bolj pa v praksi, prevladujejo modeli, ki so mešanega tipa, torej vsebujejo tako tehnične in tehnološke kot tudi ekonomske kategorije kot konstituante v istem modelu, pri tem pa je redkokdaj ločljiva ena množica konstituant od druge na način, ki bi omogočal preproste razumljive preslikave. Da bi ocenili informacijske Izgube, ki nastanejo pri prehodu modela proizvodnega sistema v "naturalni" obliki v vrednostno obliko modela, predpostavimo poljubni proizvodni sistem, ki naj bo po svoji naravi dinamični sistem ter sintetizirane oblike glede na posamične proizvodne podsisteme (npr. stroškovna mesta). Za tako oblikovani model proizvodnega sistema lahko uporabimo rezultate o njegovi multlkri-terialni upravljivosti (/1/). Znano je namreč, da praksa zahteva že v tehnološki sferi opti- ' mizacijo proizvodnih sistemov po več kriterijih hkrati. Eksistenčni izreki multlkriterlal-ne upravljivosti dinamičnega modela splošnega proizvodnega sistema fs (1) slonijo na razmeroma ostrih lastnostih posameznih informacijskih razredov, vendar ne tako težko izpolnjivih, da se ne bi potrudili zanje v zameno za optimalnost, ki daleč presena koncepcijo neoaretovske optimalnostl. Zgoraj pomenijo Sj^, ,.., Sg proizvodne podsisteme, od katerih je okolica proizvodnega sistema, B pomeni množico binarnih, triadnih, itd. rela- cij med podsistemi S, 3g, tj. pa je ooe- rator upravljanja dinamičnega sistema v multi-krlterialnem prostoru upravijalsklh kriterijev. Glede na /1/ natanko vemo, kdaj je tak proizvodni sistem popolno in hkrati polno upravijiv na danem upravijalskem horizontu T; podali smo tudi pogoje absolutne In enakomerne upravljivosti kot tisto zaostritev polne upravljivosti, ki jo terjajo proizvodni sistemi, ki jih moramo upravljati v realnem času (procesna kontrola) . Dovolj bo, da matematičnemu sistemu (1) dopustimo kalmanovsko naravo, saj ta zelo visoko prekriva večino proizvodnih sistemov, ki so hibrldl končne množice elementarnih tehnoloških procesov in operacij. Opozoriti pa je treba, da takrat, ko proizvodni sistem (1) oklenemo z organizacijskim sistemom, predpostavka o kalmanovski naravi sistema (1) ne zadošča več. To velja toliko bolj za oba sistema, tj. proizvodni in organizacijski sistem (za katere se da ne tako enostavno dokazati, da organizacijski sistem vsebuje proizvodni sistem in ne narobe). Toda oblikovanje poslovnega sistema terja še dve Inkluzijl; objem pravkar zgrajenega tandema s strani informacijskega sistema in objem vseh treh v upravljalski sistem. Pri določenih sistemskih lastnostih lahko tako dobljeno četverico obravnavamo kot poslovni sistem. Zanj se da uvideti, da je njegova matematična struktura mnogo bolj zahtevna (/2/) kot pa je matematična struktura, ki smo jo privzeli za (I) . Oznaka proizvodnega sistema (1) kot "natural-nega" modela je opravičena z lastnostjo, da so vsi inputl kot tudi outputi nevrednostne dimenzije, medtem ko operator upravljanja »j. lahko vsebuje tudi kakšno vrednostno kategorijo kot kriterij upravljanja. Vrednostna transformacija, kot sta npr. stroškovna, dohodkovna, prl-hodkovna in druge, je definirana kot tista transformacija, ki obravnava vrednostne Inpute in outpute, pri čemer z Izrazom "vrednost" označujemo poljubno vrednost v najširšem pomenu besede (korist, "benefit"). Preden se lotimo ocene informacijskih izgub, ki jih s seboj prinašata dve najbolj pogosti vrednostni transformaciji, tj. stroškovna in dohodkovna transformacija proizvodnega sistema (1), povzemlmo peterico izrekov v (/I/) s tem, da navedemo osnovni izrek o zadostnem pogoju terminalne multikrlterlalne upravljivosti: Proizvodni sistem S^. z vektorskim kriterijem (ij.) in omejenim normiranim vektorskim prostorom upravljanja Uj, kot komutativnlma grupama, z monoidnim prostorom običajnih vhodov Xj., operatorjem Jj., ki je aditivno separabilen glede na Xj. in Uj. v smislu f X1 kjer npr. operator C^ zavisi o cenovnih parametrih na nabavnem tržišču (npr. materialov). Piri tem v splošnem pričakujemo - f^l I- o v x,u: in ki je aditivno homomorfen z ozirom na u (t) v smislu je na A - popolnoma upravijiv in na T polno upravljiv, če je 1) S J. popolnoma upravljiv na A J (T, X., U'ì, ÌU'. C u^, u' k 9 2) če velja Ä n it. {J er, }} ''z Pri tem naj spomnimo, da popolna upravljivost pomeni upravljivost glede na vse izbrane kriterije istočasno in da ta upravljivost ne dopušča paretovskega kompromisa. Opozarjamo tudi, da je naloga, oceniti informacijsko škodo zaradi transformacije (1) na stroškovno in dohodkovno izražene konstituante proizvodnega sistema, precej ožja od naloge, oceniti uprav-Ijalsko škodo, ki utegne nastati pri kakšni drugi transformaciji. Naša naloga torej pomeni napor, ugotoviti, ali stroškovna in dohodkovna transformacija vplivata na definicijske prostore proizvodnega sistema (1). Pri tem se bomo zaradi obsežnosti problematike informacijskih izgub, ki nastanejo zaradi dimenzijske redukcije proizvodnega sistema (1), popolnoma izognili vprašanjem reducibilnosti tega sistema in jih obravnavali kdaj drugič. 2. Stroškovne in dohodkovne transformacije proizvodnega sistema Med najrazličnejšimi vrednostnimi transformacijami proizvodnega sistema (1) si oglejmo dohodkovno in stroškovno transformacijo posameznih ali celo vseh naturalnih konstituant modela proizvodnega sistema (1). Predpostavimo, da je Sj, dimenzijsko nereduciran in si najprej oglejmo prostor običajnih vhodov Xj. = = (Xj.p, Xj.^, Xj.^), kjer pomeni Xj.^ podprostor neinformacijskih vhodov (proizvodni materiali, energija itd.), Xj.^ podprostor karakteristik nematerialnega vhoda in Xj.^ podprostor informacijskih vhodov v informacijski sistem (1). Na te podprostore uporabimo naslednje stroškovne transformacije Jxì 'to tp J'J (C) X —► y (1) J") f, vc) lu = (X) r^(x) x (X) in ta transformacija vedno A A _j eksistira, če S^. deluje kot samostojna proizvodna celota, ki komunicira z okolico." Ker se lastnosti posredno izražajo preko nabavnih cen za X " (i) ' imamo običajno C^ ' =0; vendar pa fx) ]e v splošnem C^ ' f 0. Naslonimo se na rezultate iz (/2/) o načinu sinteze posameznih proizvodnih podsistemov S , Vi. Proizvodnemu sistemu tako vedno lahko najdemo totalni primarni običajni vhod Xj. = ir X^/y''^'; temu glede na (1) pripada stroškovno transformirani prostor (2Ì ® tyì X = TT X./Y' ' i. k ki je stroškovni totalni primarni običajni vhod za S^ s strukturo = , 0., )'. Tako ^ ^ fx) fv^ smemo predpostaviti C^ ' = (C^ ' i p imamo spet O , in (3) kot osnovno stroškovno transformacijo proizvodnega sistema S^,. Za stroškovni transformat (rezultat transformacije, ki pripada totalnemu navadnemu sekun-(x) darnemu izhodu Y , moramo dovoliti obstoj operatorja kot kompozitum operatorja C lastnostjo (u) in imamo s tem (4) (5) .(u) 'ki ,(x) , -k j stroškovna transformacija navadnega in uprav-Ijalskega vhoda v proizvodni sistem. V poslovnem svetu velja načelo, da se vrednost ne sme izgubljati, zato to načelo - imenovano načelo kompenzacije - določa identiteto Z^ e O, s ^kj" definiciji celotnega izhoda iz S J. pa imamo sedaj s-l ^ = V (y'."' X xil') C Y. ^ jk ^ jk jk J (6) kjer so y^ì^', in y. stroškovni transfor- mati prostorov y?!^' in y . . Na tem mestu 3K ^K seveda še nismo ugotovili, ali ustrezni stroškovni operatorji sploh obstajajo. Iz (6) sledi, da je y:. ' stroškovni navadni sekundarni ■J K vhod, ki ga generira podsistem S^j za celotni proizvodni sistem pa imamo s S-; Jk = Y fx) (7) 3 k kot totalni stroškovni navadni sekundarni vhod 50 za S_. Glede na rezultate v /2/ pa lahko skle-£ pamo, da stroškovni operatorji, ki se nanašajo na sekundarne vhode, ne zavisijo samo od nabavnih cen, temveč tudi od operatorjev tipa G (operatorji outputov) in relacije R med podsistemi Sj,. Ker operatorji G^ zavisijo od prostora stanj Zj v vsakem podsistemu, si najprej oglejmo stroškovne transformacije -^ Zr- rp zp (8} z,. -^ Z,. Is £s in ki torej stroškovno izražajo stanje proizvodnega sistema (npr. revalorizirana neodpisana vrednost osnovnih sredstev, ki še "Saka" za vkalkulacijo v stroške bodoče proizvodnje). Pri tem pomeni kompozitum za posamične Sj, j=l,...,s, tako kot prej. Za proizvodni podsistem S^ npr. Z^^ vsebuje vsa možna stanja v pogledu cepitve materialnih tokov tako znotraj Sj kot tudi med njimi. V prvem primeru ^jp ^Ip preko Gj na y^ pa je zato ustrezna stroškovna transformacija sposobna vključevati posledice takšnih dogajanj za y^ kot stroškovni transformat vektorskega outputa yj proizvodnega podsistema S^. V drugem primeru Z sodeluje pri oblikovanju medfaznih l p transformatov, tj. stroškovnih izrazov notranje navadne emisije Y., , kar pomeni, da zaradi Zjp t O načelo kompenzacije dobi naslednjo obliko ?, . = X, . + c'^' . (Z , J kj kj p,kj p,kj (9) * (Z + c"> (Z ^J X,kj x,kj s,kj s,kj Tako lahko sedaj stroškovni princip kompenzacije izrazimo krajše Jz) (z) Jz) 'kj * '^X,kj' '^B,kj (9'i %,kj Diskusija trojice stroškovnih operatorjev izhaja iz same definicije prostorov Z. , Z. in (zl " " Z^ . Tako se C definira kot enotna projek- iS p,Kj _ ci ja Stroškov transporta izhoda Xj^^ od Sj^ k S^, kar pomeni, da je ta projekcija stroškovni parameter povsem internega značaja. Podobno se pri C^^}. definirajo enotne projekcije stro-X »KJ škov, ki povzročajo procesne lastnosti proizvodnega sistema na relaciji od S. k S.. Kar (z) ^ zadeva transformacijo C„ J. : . . + . = »»"J »»i^j velja opozoriti na naslednji problem, V prostor Zg spadajo fizikalne lastnosti osnovnih sredstev, naprav itd., izražene z njihovimi Iz) projekcijami na 5as. Zato se C^ definira s projekcijo kontingenčne cene na življenjski horizont materialnega substrata takšnega elementa; nabavna vrednost npr, se projeclra na življenjsko dobo neke naprave. Tako torej Cg^' skupek "cen" odnosno stroškov vzdrževanja eksistenčnih lastnosti materialnih tokov (vhodov, izhodov in vmesnih produktov) med proizvodnimi podsistemi. Stroškovne transformacije izhodov Y, . v Y. . so S tem v celoti opravljene s pomočjo operatorja C^V = (C^^l-, .,.). Z UDorabo presli- K J P r K J (v) ""(x) kav -»■ » ^ ii^ uporabo zaporedja ope- ratorjev tipa C^*', C^^' in C^"^ smo našli kj kj y. J yf' k] ""kj )c=l,...,s 1, s kar vodi k stroškovnim transformacijam Jxì ZO') Jo "(u) (10Ì kar pomeni, da smo našli stroškovne transformate totalnega navadnega izhoda in totalnega upravijalskega izhoda za vsak proizvodni podsistem Sj. Sedaj je možna tudi stroškovna transformacija ■(C) v. k ^ (1!) kar pomeni, da poznamo tudi totalni stroškovni primarni upravljalskl vhod, medtem ko je totalni stroškovni sekundarni upravljalski vhod vključen v X^.. Da bi dobili popoln stroškovni obračun procesov v proizvodnem sistemu, moramo podvreči stroškovni transformaciji tudi vse izhode. V ta namen moramo osvetliti obstoj Y^, Vj. Predvsem pričakujemo naslednje implikacije + O iP 7P Snem lahko rezultat + ° ' + " y. io. v splo- JU ^ ostroškovanja" izrazimo + kajti stroški, ki imajo kot Y. = i. j 3P svoj "izvor" v lastnostih materialnih vhodov, dodatno obremenjujejo stroške "proizvodnje" totalnega izhoda. Vendar pa redkeje lahko pričakujemo eksplicitno poznavanje ker stroške izhodov proizvodnega sistema ne zajemamo Istočasno kot za materialni substrat izhoda in pa njegove lastnosti kot posledice sistemskih izhodov. Zaradi prjgglednosti lahko v nadaljnjem povzamemo kar Y, . = O oziroma ^ ^ ^ J ) 'j G. J t, t odkoder sledi (X) C^^'zlt^), rt, C. x(t), c'.^'ult) H Jx) (12) če le poznčuno operator izhoda G^ in stroškovne transformacije na desni strani v (12). Tako smo ugotovili, da so za poznavanje stroškovnega transformata y^ potrebne vse stroškovne transformacije upravljalskega vhoda, običajno vhoda in stanja, kar lahko zapišemo takole J») Ju) J^) Podobno se lajjko prepričamo tudi o transformaciji H^ -► Hj preko istih stroškovnih transformacij. V celoti smo našli stroškovne transformacije in sicer: za totalni stroškovni navadni primarni in sekundarni vhod, za totalni stroškovni upravljalski primarni in sekundarni vhod in za totalni stroškovni upravljalski in navadni izhod. Na kratko zapišemo Cj. = (Cj*', Cj^^ , C^"^), s tem pa stroškovno izražena operatorja prehoda stanj v^sistemu in izhoda iz sistema Gj = Cj, Gj in H^ = Cj, H^ omogočata, da je definicijsko področje stroškovno transformiranega proizvodnega sistema 3)fSj. ^ > = z^, E s^, k'=o,...,s,(R, Cj.;) = k,j = 0,...,Sf Xj., I/j.) oziroma da smo prišli na stroškovno transformirani proizvodni sistem (stroškovno-proizvod-ni transformat) { V (13) Tako smo torej spoznali, da je stroškovna transformacija definirana v načelu nad celotno množico definicijskih prostorov proizvodnega sistema (1), da pa so oblike takšne transformacije le prostori, katerih število je manjše od števila definicijskih prostorov prvotnega sistema. Sistem (13) ima torej manjše informacijsko bogastvo, čeprav se na tem omejenem prostoru ni mogoče lotiti obsežnega razmišljanja o posledicah zmanjšanega informacij-" skega ozadja na upravljivost sistema š. , i. ,c vg^ndarle že vidimo, da je zaradi obstoja {Hj^} in {Gj^} stroškovno transformirani sistem S_ upravljiv na isti način kot so upravlji- vi vsak Sj posebej. Dalje, stroškovni operator C_ v š definira množico , n 4) imamo tedaj z operatorjem določeno transformacijo finalnih izhodov za njihove tržne vrednosti. Zaradi preglednosti razmišljanja predpostavimo, da se ta tržna vrednost tudi realizira, torej je operator ir definiran takole I,o T.,o h,o z,o I,o <15i oziroma krajše ir j. = ^ j. -Cj. je operator, ki finalnim izhodom prireja njihove dohodke. Odtod vidimo, da Hj, deluje na precej manjšen številu . prostorov kot pa delujejo operatorji in C^., namreč samo na prostorih in y'*^' V (2) L f O i. ^ O smo pokazali, kako ta dva prostora zavlsita ® (x) ® od TT {y}.' X k=l . Y^-^, Y in k=0 ^^ ^ TT y''^^ X n ki SO preko operatorja k=0 k=l ^^ tipa G povezani s prostori Uj., Zj. in Xj,. Analogno kot prej, imamo torej dohodkovno-trans-formlrani proizvodni sistem podan kot 52 ki je definiran na podranožici {"i^i , L ,0 L,0 celotne množice definicijskih prostorov proizvodnega sistema, če informacijsko ozadje pripravljamo za analizo upravljivosti proizvodne-gi sistana samo z oziram na in w Ta vrsta upravljivosti je L / O L f o L najbolj pogosta v praktičnih primerih za naše gospodarstvo; kadar pa gre za močnejše vzvode optimizacije, kakršne terja sanacija gospodarjenja, procesi prestrukturiranja itd., pa je treba definirati celotno definicijsko področje proizvodnega sistema. Ta dva primera v nadaljnjem ne bomo formalno ločevali in bomo torej splošneje predstavljali dohodkovni transformat proizvodnega sistema v obliki (t 6' i Tako smo torej prišli do naslednjega spoznanja: dohodkovna transformacija deluje nad manjšim prostorcm Iz množice konstituant proizvodnega sisitema ter je zato tudi operator ^^ enostavnejši. Informacijsko ozadje dohodkovno transformiranih proizvodnih sistemov je torej še skromnejše kot pa informacijsko ozadje stroškovno transformiranih proizvodnih sistemov. To pa nadalje pomeni, da v primeru (16') pričakujmo vse slabše efekte upravljanja, kot so v primeru (13). Vprašanje upravljivosti sistema (16') glede na funkcional i(i j. je analogno prejšnjemu vprašanju, in je treba prav tako upoštevati primere a), b) in c), vendar dodatno še primer d): je lahko prostor upravljal- skih parametrov in to celo v kombinaciji z ostalimi prostori, pri katerih je S^ definiran. To pa je osnova za najbolj splošen primer upravljivosti dohodkovno transformiranega proizvodnega sistema. Viri: 1) Viljem Bupnlk, Eksistenčni izreki multl-kriterialne upravljivosti dinamičnih sistemov; Ekonomska revija, 1981, št. 3-4. 2) Viljem Rupnik, Matematična teorija sistemov; RCEF, Ljubljana, 1977. UPRAVLJANJE PORAZDELJENIH SISTEMOV INFORMATICA 1/89 Deskriptorji: SISTEM PORAZDELJENI, VODENJE, UPRAVLJANJE, SISTEM OPERACIJSKI Jože Rugelj IJS Ljubljana POVZETEK Upravljanje omogoča ohranjanje konsistentnosti sistema in pomaga uporabnikom pri delu s sistemom. Pomen uprav-Ijalskih aktivnosti se povečuje z naraščanjem zmogljivosti in kompleksnosti sistemov in z njihovo porazdeljenostjo. NajpomembnejŠe upravljalske funkcije so upravljanje konfiguracije, spremljanje delovanja sistema, ravnanje ob odpovedih, varnostne in zaSčitne aktivnosti, obračunavanje in upravljanje z imeni. ABSTRACT Management enables keeping the consistency of a system and facilitates the usage of a system. The importance of management activities increases with the growing of performances and complexity of the systems and with their distribution. The most important management functions are configuration management, monitoring, fault handling, protection and security activities, accounting and name management. Upravljanje porazde^enih sistemov vključuje mnogo vidikov porazdeljenega procesiranja in je v enakem odnosu do porazdeljenih aplikacij kot klasičen operacijski sistem do lokalnih aplikacij [COST83]. Zato pogosto imeniuemo nabor upravljalskih funkcij za porazdeljen sistem porazdeljen operacyski sistem. Najbolj sploäno bi lahko kot cilj upravljanja porazdeljenega sistema opredelili ohranjanje sistema v stanju, ki ga zahtevajo uporabniki, in to s Cimmanjäimi stroäki. To pomeni, da sistem omogoča uporabnikom in upravljalcem sistema načrtovanje, organiziranje, nadzorovanje in vodenje uporabe porazdeljenih virov informacijskega procesiranja. 1 Cilji upravljanja Upravljanje sistema se izvaja v skladu z upravljalsko politiko, ki določa osnovna pravila v zvezi z upravljalskimi odločitvami [Slom87]. Problemi, s katerimi se ukvarja upravljalska politika, so komercialne in tehnične narave. V podjetjih je njen cilj maksimizacija prihodkov, v izobraževalnih ustanovah mini-mizacija stroSkov in pri vojaäkih projektih zagotovljena visoka zanesljivost. Upravljalsko politiko uporabljamo za določitev strategije in taktike pri upravljanju. Strategija je dolgoročen načrt, kaj je treba storiti za uresničenje ciljev, taktika pa nam pove, kako bomo to storili. Upravljalske aktivnosti bi lahko razdelili na • dolgoročne • operativne. Dolgoročne odločitve se nanaSajo predvsem na načrtovanje razvoja in Širjenja porazdeljenega sistema, pa tudi na strategije optimizacije uporabe virov, poimenovanja virov, varnosti in zanesljivosti sistema. Take dolgoročne odločitve sprejemajo lju4ie, ki vodijo in upravljajo porazdeljen sistem. V primeru tesno sklopljenih sistemov, ki jih obravnavamo v tem delu, pri tem ni posebnih težav, saj je lastnik porazdeljenega sistema ena organizacija. Zato ne prihaja do konfliktov pri reSevanju teh problemov tako kot pri äibko sklopljenih sistemih. Operativno upravljanje porazdeljenega sistema se izvaja v času delovanja sistema, torej sočasno z izvajanjem uporabniških programov. Operativno upravljanje skrbi za optimizacijo izrabe virov v sistemu in možnost spreminjanja konfiguracije sistema, za reSevanje iz napak, za varnost sistema in za obračunavanje stroškov za uporabo virov. Večina teh funkcij je soodvisna med seboj. Odločitve morajo biti zelo hitre, v nekaj desetinkah ali tisočinkah sekunde in zato je skoraj nemogoče, da bi pri njih sodeloval človek. Upravljalske aktivnosti so zato izvedene kot strežniki znotraj sistema. Pogosto upravljalske aktivnosti niso eksplicitno ločene in delujejo kot vključene komponente v standardnih strežnikih. Spremljajo delovanje posameznih komponent in sproti ocenjujejo njihovo delovanje. V primeru, ko delovanje komponente ne ustreza pričakovanjem, strežniki uprav-Ijalci izvedejo predvidene operacije [COST83]. Slika .1 prikazuje sploBno shemo upravljalske operacije. Vhod Procesiranje Izhod (Spremljanje) (Odločanje) (Krmiljenje) Slika .1: Shema upravljalske operacije Samo v skrajnih primerih ali na zahtevo upravljalske funkcije javljajo rezultate svoje aktivnosti človeku, upravljalcu in uporabniku sistema. Oblika in obseg informacije, ki jo pripravi upravljalska aktivnost, je odvisna od tega, komu je namenjena. Uporabnika predvsem zanimajo osnovne informacije o stanju sistema, ki naj bodo kratke in pregledne. Za upravljalce, ki skrbijo tudi za vzdrževanje sistema in dolgoročno upravljanje, pa so pomembne vse podrobnosti. 2 Zgodovinski pregled upravljalskih pristopov v računalniških sistemih Glavna cilja upravljanja računalniških sistemov sta čimboljša izraba računalniške opreme in čimbolj enostaven vmesnik proti 3.2 SpremUanJc delovanja sistema Spremljanje delovanja sistema je povezano e delovanjem vseh upravljalskih funkcij, saj sluli kot vhodni podatek ea proces odločanja in itvajanja upravljalskih aktivnosti. Ta funkcija upravljanja torej nudi servis ostalim upravljalskim funkcijam, redkeje pa neposredno uporabniku. Poseben pomen imajo resultati spremljanja delovanja sistema za ravnanje ob odpovedih in reSevanje iz napak. Funkcija spremljanja delovanja sistema je lahko realizirana centralizirano ali porazdeljeno. Tudi pri tej funkciji imata ena in druga reSitev dobre in slabe lastnosti. Centralizirana reSitev je enostavnejša, a manj zanesljiva in prilagodljiva, porazdeljena pa ima komplementarne lastnosti. Pri obeh se pojavlja problem konsistentnosti starya sistema, ki je posledica nedeterminizma pri prenosu sporočil po komunikacijski mreži in sinhronizacijskih problemov, ki sledijo iz tega. Pri spremljanju delovanja se zbere velike količine podatkov in strežnik, ki opravlja funkcijo spremljanja, mora te podatke zbrati in urediti tako, da nudijo ustrezno informacijo o delovanju sistema. 5.3 Ravnanje pri odpovedih Ravnanje pri odpovedih je upravljalska funkcija, ki nudi servis strežnikom in uporabnikom porazdeljenega sistema, ko pride do odpovedi ali nepravilnega delovanja ene ali več komponent v sistemu. Posledica nenormalnega stanja sistema ali kateregakoli njegovega dela se pokaže v obliki napak pri izvajanju operacij pri nekaterih strežnikih v sistemu. Ravnanje pri odpovedih vključuje zaznavo napak in njihovo analizo, iskanje okvarjenih komponent, odpravo okvare ali ustrezno zamenjavo komponente, odpravljane posledic napak in obvestila o napaki up-ravljalcem sistema in po potrebi uporabnikom. V porazdeljenem sistemu so problemi ravnanja t odpovedmi 5e večji, saj se napake zaradi sodelovanja med porazdeljenimi komponentami hitro Sirijo, nadzor in odpravljanje posledic pa sta zaradi porazdeljenosti težja [LeLaSlj. Pri pregledu aktivnosti se bomo posvetili tistim servisom, ki jih v primeru odpovedi nudijo strežniki v sistemu. Odkrivanje napak Če hočemo odkrivati napake v sistemu med delovanjem, moramo vnaprej poznati vsaj približno obnašanje sistema. Tako lahko v vsakem trenutku primerjamo dejansko stanje sistema s predvidenim in če se pojavi odstopanje, to pomeni napako. Ker pa v sploSnem stanj sistema ne poznamo vnaprej, dodamo posebne komponente strojne in programske opreme, ki izvajajo različne teste ob začetku delovanja sistema in med samim delovanjem, periodično ali ob posebnih pogojih. Rezultate teh testov pri pravilno delujočem sistemu poznamo in zato lahko odkrijemo odpovedi in nepravilnosti, če se pojavyo. Taki dodatni testi sicer zmanjäujejo zmogljivost sistema, vendar je to cena, ki jo moramo plačati za povečano zanesljivost delovanja. Pogostost in natančnost testiranja je odvisna od zahtev po zanesljivosti. Tudi zmanjšanje zmogljivosti sistema je lahko znak za napake v sistemu, vendar je take značilnosti razmeroma težko formalno opisati in zato tudi niso primerne za uporabo pri sprotni diagnostiki. Analiza napak In odkrivanje okvarJcniFi komponent Analiza rezultatov testov pomaga pri odkrivanju napak in vzrokov, zaradi katerih je prišlo do napak. V porazdeljenih sistemih je težko določiti izvor napake, saj se rezultati prenašajo med posameznimi strežniki, ki so na različnih lokacijah. Napako lahko zaradi premalo pogostega testiranja spregledamo in jo zaznamo Sele, ko se je že razširila po sistemu. Zato kljub očitni napaki težko odkrijemo njen izvor. Sodelovanje strežnikov, ki se ukvarjajo t analizo napak in so porazdeljeni po sistemu, omogoča analizo dogiyanj v celotnem sistemu. Popravilo ali lamenjava okvarjene komponente Ko odkrijemo vzrok za napako in s tem tudi komponento, ki ne deliy'e pravilno, najprej tako komponento izločimo, da ne bi povzročala dodatnih težav. To storimo tako, da o napaki obvestimo upravljalca konfiguracije. Potem poskušamo tako komponento delno ali v celoti popraviti s servisi, ki so na razpolago v sistemu, če tako popravilo uspe, komponento vrnemo v sistem. Pri komponentah, ki so bolj kompleksne, je možno tudi delno popravilo. V takih primerih komponenta izvaja samo omejeno število funkcij. Kadar okvare ne moremo odpraviti s servisi, ki so na razpolago v sistemu, ali jo lahko le delno odpravimo, o tem obvestimo upravljalca sistema; le-ta o okvari obvesti vzdrževalce programske ali strojne opreme in po potrebi tudi uporabnika. To jé potrebno predvsem v primerih, ko servis zaradi odpovedi komponent ne more nuditi predvidenega servisa, saj določeni viri niso replicirani. Če se zaradi odpovedi posamezne komponente delo porazdeli med druge in to pomeni samo zmanjäanje zmogljivosti sistema, to običajno ni usodno za uporabnika in ga zato ni treba vznemirjati s poročili o odpovedih. Odprava posledic odpovedi Odpravljanje posledic odpovedi po odstranitvi ali popravilu okvarjene komponente je osredotočeno predvsem na razveljavitev rezultatov, ki vsebujejo napake, posledice odpovedi. Kompleksnost problema lahko zmanjšamo z uvedbo principa nedeljivih akcij. To 80 segmenti programske opreme, ki se izvedejo skupaj. Poznamo vse povezave, ki jih ima komponenta, ki vsebuje tak segment, znotraj segmenta z drugimi komponentami. Takoj po izvedbi tega segmenta izvedemo potrebne teste in v primeru napake razveljavimo dobljene rezultate. Ker poznamo vse povezave z drugimi komponentami in je le-teh omejeno število, je to dokaj preprosto. V končni fazi postavimo vse komponente v stanje, v kateri so bile, tik preden se je pojavila napaka. S tem smo odpravili posledice odpovedi. 3.4 Varnost in zaSCita V vsakem računalniškem sistemu morajo obstajati metode in mehanizmi za zaščito vseh objektov v sistemu. S tem je vsakemu uporabniku zagotovljena varnost. To pomeni, da so podatki in servisi dostopni samo pooblaščenim strežnikom in uporabnikom, da so servisi, ki jih nudijo strežniki, ustrezni, ter da vedno poznamo identiteto uporabnikov servisov. To potrebujemo zaradi nadzora delovanja sistema in analize ob zlorabi in za obračunavanje. Zaščita objektov pomeni nadzor dosega do le-teh. Objekti v sistemu morajo biti zaščiteni pred nepooblaščenimi uporabniki in pred nenadzorovano uporabo zaradi napak in odpovedi v programski ali strojni opremi. Pri problemih zaščite v računalniških sistemih je treba ugotoviti, komu lahko zaupamo. V centraliziranih sistemih bo bili mehanizmi la zaščito zbrani v jedru operacijskega sistema. Jedro je bilo namreč ustrezno zaščiteno pred ostalimi deli sistema in so mu lahko vsi zaupali. Pri porazdeljenih sistemih sta dva problema, ki narekujeta drugačno reševanje. Ker so jedra operacijskega sistema porazdeljena v sistemu, zelo lahko pride do poskusov spreminjanja le-teh in do nedovoljenih operacij v sistemu. Drugi razlog za uporabniku. V teku razvoja sistemov so se naCini in možnosti ta tago-tavljarye teh ciljev spremiryali. Pri prvih računalnikih so bile procesne in pomnilniäke zmogljivosti zelo drage. Zato si ni bilo mogoče predstavljati, da bi računalnik sam izvajal kakršnekoli upravljalske funkcije. Uporabnikov je bilo zelo malo in vsak je bil hkrati programer, operater in upravljalec sistema, Sam je poskrbel za nalaganje programov in podatkov, za izvajanje programa in za izpis rezultatov. Ko so se zmogljivosti računalnikov povečevale in se je tudi cena procesiranja zniževala, je bilo uporabnikov vedno več. Računalniki so že imeli enostavne upravljalske mehanizme, ki so se imenovali paketni sistemi. Le-ti so skrbeli za zaporedno nalaganje programov in podatkov, za izvajanje programov in za shranjevanje rezultatov. Vsak program se je izvedel do konca, in 9ele potem je sistem naložil nov program. Tak sistem je bistveno izboljäal uporabo virov v sistemu, saj je bilo mnogo manj izgube časa med posameznimi posli kot prej pri ročnem upravljanju. Naslednja težava, ki je onemogočala boljäo izrabo zmogljivosti, je bila razlika v hitrosti delovanja procesnih enot in vhodnò-izhodnih enot. Zato so uvedli tako imenovano istočasno računanje in vhod-izhod, dobro poznano po začetnicah angleškega opisa te dejavnosti kot SPOOLer (simultaneous peripheral operation on line ). To je bila prva oblika kvazi-paralelnega izvajanja več dejavnosti, ki je zahtevala določene zmožnosti v strojni opremi. Tu mislimo predvsem na prekinitvene mehanizme. Nata način je računalnik prevzel skrb za upravljarye mnogih virov in v večji meri omogočil doseganje zgoraj oprede^enih ciljev. Naslednji korak, ki je računalnik Se bolj približal uporabniku, je bil uporaba operacijskih sistemov z dodeljevanjem časa (time sharing). Tak opcracyski sistem hkrati več uporabnikom omogoča interaktivno delo. Za razliko od paketnega sistema se tu program praviloma ne izvrSi naenkrat. Uporabnik dobi pravico do uporabe virov v sistemu samo za kratke intervale, vendar ima zaradi velike hitrosti delovanja občutek, kot da uporabna računalnik sam. To seveda velja, če sistem ni preobremenjen. Viri v sistemu so dobro izkoriščeni, vendar zahteva režijsko delo pri menjaryu dejavnosti precej časa. Čim boljši so bili računalniki in čim večje so bile zmogljivosti, več uporabnikov jih je uporabljalo in pojavljale so se nove zahteve. Z optimizacijo algoritmov in programov in z novimi tehnološkimi reSitvami so nekaj časa uspeli zadovoljevati zahteve. Bolj dolgoročna reSitev pa je porazdelitev procesiranja in paralelnost. Prvi problem, ki so se ga lotili strokovnjaki, je bilo upravljanje komunikacij. Prenos podatkov med porazdeljenimi procesnimi mesti mora biti zanesljiv in brez napak. Vsako procesno mesto je imelo svoj lokalni operacijski sistem. Lokalni operacyski sistem uporablja servise komunikacijskega sistema in nudi servise uporabniku. Razen zanesljive komunikacije v teh zgodnjih oblikah porazdeljenih sistemov ni bilo posebnih pripomočkov, ki bi uporabniku pomagali pri delu v porazdeljenem sistemu [Fort86]. Uporabniki pa so potrebovali mehanizme, ki bi jim omogočili bolj celovit pogled na porazdeljen sistem in čimvečjo transparentnost porazdeljenosti sistema. Mreini opemcijiìà tit-tem je bil nov nivo, ki so ga dodali med komunikacijski sistem in lokalni operacijski sistem na vsakem procesnem mestu. Značilnost mrežnega operacijskega sistema je, da vsak uporabnik Se vedno v glavnem dela ne enem procesnem mestu, da se zaveda porazdeljenosti objektov v sistemu in da sistem vsebvge zelo malo mehanizmov za reSevanje iz napak [Trip87|. Glavna področja, kjer uporabik čuti porazdeljenost so datotečni sistem, zaičita in izvajarye programov (Tane85|. To pomeni, da mora uporabnik Se vedno sam reSevati množico problemov, povezanih z upravljanjem sistema. Porazdeljeni opemcijtid tiatem omogoča uporabniku pogled na porazdeljen sistem kot celoto, brez mej med posameznimi procesnimi mesti. Upravljanje se nanaSa na vse objekte v sistemu in optimira rabo vseh virov, za razliko od mrežnega operacijskega sistema, kjer gre za optimizacijo delovanja vsakega posameznega procesnega mesta. Porazdeljen operacijski sistem je v enakem odnosu do porazdeljenega računalruSkega sistema, kot lokalen operacijski sistem do centraliziranega računalniškega sistema. Porazdeljen operacijski sistem deluje enotno, vendar ima svoje kopije na vseh procesnih mestih, njegove funkcije pa se izvajajo paralelno in med seboj sodelujejo. 3 UpravUaUke funkcije Pri pregledu upravljalskih funkcij se bomo omejili predvsem na tiste, ki so povezane z operativnim upravljanjem porazdeljenih sistemov in so realizirane v porazdeljenih operacijskih sistemih. 3.1 Upravljanje konfiguracije Porazdeljen računalniški sistem lahko predstavimo kot množico komponent programske in strojne opreme, ki so lahko prostorsko porazdeljene. Vendar pa opisana množica tvori porazdeljen sistem Sele, ko so komponente povezane in sodelujejo pri reSevanju problemov. Funkcija upravljanja konfiguracije sistema poskrbi za začetno povezavo komponent in vzpostavitev delovanja sistema in tudi za dinamično prilagajanje sistema, ki je potrebno zaracU napak ali spremenjenih potreb uporabnikov. Določene operacije pri konfiguraciji so dolgoročnega značaja in jih mora opraviti operater. Mislimo predvsem na instalacijo osnovnih komponent programske in strojne opreme in fizično povezavo posameznih komponent sistema. Ko je najbolj osnovna konfiguracija sistema vzpostavljena, lahko nadzor nad konfiguracijo sistema prevzame ustrezen strežnik. Večina ostalih aktivnosti, ki jih vključuje upravljanje konfiguracije, se izvaja dinamično. Tako se razporejajo aktivnosti sistema glede na trenutno zasedenost posameznih virov v sistemu in njihovih sposobnosti za opravljanje posameznih operacij. To je tudi edini način za optimizacijo zmogljivosti, ki se lahko izvaja dinamično. Ostali načini so dolgoročnega značaja in zato ne sodijo v okvir naSe obravnave. Tudi v primeru odpovedi komponent se mora konfiguracija sistema spremeniti tako, da uporabniku ostanejo na razpolago vsi servisi, čeprav se zmogljivost sistema zmanjSa. Komponente, ki so bistvene za zagotavljanje določenih servisov, ni pa jih mogoče nadomestiti z drugimi komponentami, morajo biti zato replicirane. Poleg skrbi za povezave med posameznimi komponentami sistema mora upravljalska aktivnost v vsakem trenutku posredovati informacije o stanju sistema, če uporabnik to zahteva. V sploSnem pa je zaželjena transparentnost lokacije virov, torej se uporabnik ne ukvarja z iskanjem lokacije vira. Transparentnost lokacije vira je tesno povezana s poimenovanjem virov in semantično konsistentnostjo strežnikov in pomeni, da je njihov servis enak, ne glede na to, kje v sistemu se strežnik nahaja. Večje težave se pri tem pojavijo v porazdeljenih sis-terhih, ki ga sestavljajo zelo različne komponente strojne ali programske opreme. Transparentnost lokacije virov lahko dosežemo na različnih nivojih, vendar je najbolje, če je realizirana v jedru operacijskega sistema, saj je tako dostopna najSirSemu krogu uporabnikov. realaacijo mehaniEinov zaSčitc izven jedra pa je želja po čim manjäem jedru. Za lažčito sta v porazdeljenih sistemih uporabljena dva mehanizma: seznam za nadzor dostopa in mehanizem prenaäanja pravic. Seznam za nadzor dostopa do vira vsebuje vse uporabnike, ki imajo pravico dostopa do tega vira. Seznam je pridružen strežniku, ki vsebuje vir. Zahtevek, s katerim uporabnik zahteva servis, vsebuje tudi enoliino oznako uporabnika. Ta del sporočila je tako zaSčiten, da ga ni mogoče ponarediti. Dobra lastnost mehanizma s seznamom je tudi to, da je mogoCe enostavno razveljaviti pravice v zvezi z dostopom, če je to potrebno. To je potrebno ob odpovedih in pojavih napak, pri spremembah konfiguracije sistema ali zaradi optimizacije zmogljivosti sistema. Osnovni element drugega mehanizma je posebna vstopnica ali žeton, ki ga imenujemo pravica [Tane84]. Le-ta omogoča svojemu lastniku dostop do virov ali izvajanje določenih operacij, ki BO vključene v dani strežnik. Uporabnik, ki želi generirati nov objekt, poSlje zahtevek za to ustreznemu strežniku. Zahtevek vsebuje tudi neko naključno Število iz določenega intervala. Strežnik generira nov objekt, hkrati pa zgenerira tudi pravico za dostop do tega objekta. S pomočjo naključnega Števila, ki ga je poslal uporabnik, Šifrira pravico in jo vrne uporabniku. Le-ta jo lahko potem razmnoži in jo preda tudi drugim strežnikom in uporabnikom. Strežnik shrani naključno StevUo v posebno tabelo, povezano z novim objektom. Ko pride zahtevek za doseg novega objekta, strežnik preveri pristnost pravice s Številom, shrar^jenim v tabeli. Pravica je zaSčitena tako, da je ni mogoče ponarediti. Pravice je težko razveljaviti in to je slaba lastnost tega mehanizma. Ko strežnik, ki je odgovoren za podani objekt, generira pravico, jo Šifrira z naključnim Številom. To pomeni, da uporabnik, ki je zahteval kreiranje novega objekta, sam ne more spremeniti vsebine pravicc in omejiti pravice uporabe posameznih operacij nad objektom uporabniku, ki bi mu želel dati kopijo pravice. To stori lahko le tako, da poSlje zahtevek za kreiranje take pravice strežniku. Opisane lastnosti mehanizma s pravicami omogočajo porazdeljeno realizacijo zaSčite. Zaradi porazdeljenosti sistema potrebujemo metode za Šifriranje sporočil, ki onemogočajo prisluškovanje na komunikacijskih medijih, in overjanje sporočil s tako imenovanim digitalnim podpisom. Najbolj je razSirjeno Šifriranje po tako imenovanem DES (Data Encryption Standard) standardu. Oba osebka, ki komunicirata, morata poznati ključ, to je neko dovolj veliko Število, ki služi kot osnova za Šifriranje in dešifriranje sporočil. Metoda Sifrirarya je poznana, to pomeni, da je vsa skrivnost v ključu. Funkcije za Šifriranje sporočil so realizirane v siliciju in zato je 5ifrirar\je v realnem času mogoče tudi za velike hitrosti prenosa podatkov, ki so potrebne v porazdeljenih sistemih. Problem predstavlja samo generacija in distribucija ključev. Zato sta DifHe in Hellman razvila nov sistem, imenovan sistem Šifriranja z javnim ključem. Poznana sta sistem Šifriranja in deSifrirarya in tudi ključ za šifriranje. Ključa za Šifriranje in dešifriranje sta pridobljena iz istega osnovnega ključa. Vsa skrivnost sistema je v posebnosti funkcije, s katero iz osnovnega ključa dobimo ključ za Šifriranje. Ta funkcija je zelo enoatavno izračunljiva, njena inverzna funkcija pa ni izračunljiva v realnem času. Zato ni mogoče učinkovito izračunati osnovnega ključa in iz tega ključa za dešifriranje. 3.5 Obračunavanje Obračunavanje je sistemska funkcija, ki spremlja in beleži uporabo virov in servisov v sistemu. Poleg sestavljanja računov na osnovi pogodb in tarif za posamezne servise, ta funkcija omogoča tudi omejevanje uporabe virov in servisov in zagotavlja poštenost pri uporabi le-teh. V tesno sklopljenih sistemih, ki so običajno v lasti ene same organizacije in kjer se stroSki pogo?to ne delijo med uporabnike, prevladujejo predvsem ti drugi motivi. Tarife za uporabo posameznih servisov ae določijo na osnovi stalnih stroBkov, spremenljivih stroSkov in posebnih stroškov. Stalni stroSki se plačujejo periodično in 80 neodvisni od uporabe sistema. Posebni stroSki so odvisni od časa uporabe ali količini obdelanih in preneSenih podatkov, kvalitete servisa in oddaljenosti strežnika. Posebni stroSki so povezani s servisi, ki niso vključeni v osnovni nabor [Slom87]. Pravice za uporabo posameznih virov lahko izrazimo z virtualnim denarjem, ki ga ima uporabnik (COST84). Z virtualnim denarjem si kupi pravice za uporabo virov oz. servisov. Virtualni denar lahko uporabnik dobi na osnovi dogovora med uporabniki ali kot protivrednost denarja, ki ga plača za uporabo sistema. Virtualni denarje v različnih valutah, za vsak vir ali servis druga valuta. Vse operacije t denarjem so tesno povezane z uporabo mehanizmov za varnost in zaSčito. 3.6 UpravUanJe i imeni Imena imajo v računalniških sistemih zelo pomembno vlogo, saj z njihovo pomočjo identifldramo izbrane objekte v sistemu na vseh nivojih abstrakcije. Imena uporabljamo na različnih področjih upravljanja. Se posebno pomembna je vloga imenskega sistema v tesno sklopljenih računalniških sistemih zaradi dovoljene raznolikosti gradnikov sistema in želje, da bi uporabnik videl sistem kot enovito celoto. Zaradi čimboljSe prilagodljivosti sistema želimo, da ime vsebuje čim manj informacij o objektih, saj se lahko le-te dinamično spreminjryo |Terr86]. Te informacije morajo zato biti shranjene neodvisno in organizirane tako, da je spreminjanje in dodajanje podatkov o objektih enostavno, hkrati pa je enostavna tudi povezava z imeni. To namreč omogoCa učinkovito delovanje imenskega sistema. 4 Literatura ICOST83] T. Kalin (edt.), Management issues in Local Area Networks, COST 11 bis LAN Management Group Report, Proc. EUTECO '83, North Holland, 1983 jCOST84] A. Langsford (edt.). Distributed Systems Management in Wide Area Networks, report by COST 11 bis DSM Group, February 1984 [Fort86] P.J. Fortier, Design of Distributed Operating Systems McGraw-Hill, New York, 1986 |LeLa8l) G. LeLann, Error recovery, in Distributed systems, LCNS 105, Springer-Verlag, Berlin 1981 [Slom87) M. Sloman, J. Kramer, Distributed Systems and Computer Networks, Prentice-Hall International, 1987 [Tane84l A.S. T:inenbaum et.al., Capability Based Protection in Distributed Operating Systems, Proc. Symp. Certificcring van Software, Utrecht, November 1984 [TaneSS] A.S. Tanenbaum, S.J. Mullender, Distributed Operating Systems, ACM Computing Surveys, vol.17/4, December 1985 (Tanc87] A.S. Tanenbaum, R. van Renesee, Reliability Issues in Distributed Operating Systems, Proc. 6th Symp. Reliability of Distr. Softw. Sc Datab. Systems, Williamsburg, Virginia, March 1987 (Terr86l D.B. Terry, Structure-free Name Management for Evolving Distributed Enviroments IEEE Proc. 6th ICDCS, May 1986 [Trip87] A. Tripathi, Distributed Operating Systems, Tutorial no.4, 7.ICDCS, W.Berlin, September 1987 VERJETNOSTNI MODEL RAČUNANJA INFORMATICA 1/89 Deskriptorji: RAČUNANJE, VERJETNOSTNI MODEL, ALGORITMI HEVRISTIČNI Janez Žerovnik Inštitut za matematiko, fiziko in mehaniko, Ljubljana POVZETEK: V preglednem {Unka predrtsrimo vcijetDostai model ratuiiiqj& in podamo delnküo Dckaterih rauedcnr tMorae uhtevnosti vetietaostnih algoritmov. V utetnih rsidelkib vpeljemo kUsÜsi (đetenniniatKiii) model ratonaiO» In ponovimo mane deAnic^e iz teorije tasovne uhtevnoat! al^tmov. ABSTEACT: A PioUbUUtk Mod«! of CompuUtlon In artide a survey ot a probabilistic modd of eompntatioo is given. Some claaaa o( probabilistic algorithms are deHned. PieUmisary sections give delnition d classical (deterministic) model ci compatatloi and bitrodnctiao to the theory of time complexity of computation. 1. UVOD V članku bomo vpeljali nekatere pojme in definicije iz teorije iasovne uhtevnosti algoritmov. V sloveničlni s teg» podroCja poznamo dve deli [Plaa83, PePi82| in prevod |AhUi86|, ti^ja literatura pa je precej bolj bog&tv Samo problemu trgovskega potnika na primer Je posvečena cela knjiga |LLRS86|. Ze klasična je knjiga Gareya in Johnsona |GaJo79|, m! pa bomo zaradi novejfìh definidj nekaterih razredov slučajnih algoritmov sledili knjigi [Melh84], oprii pa se bomo «e na |HoU179| in 1011177]. V zadnjem desetletju In pol so razred NP-polnIh odločitvenih problemov veliko raziskovali. Vpratenje, ali je razred P enak razredu NP Je verjetno eden najpomembnejfih odprtih problemov teoretičnega računalništva. Teoretično računalniitvo tu Imenujemo vedo, ki Jo nekateri Imenujejo tudi Informatika (informatique), kibemetlka uporabna matematika (Prikladnaja matematika) sii 'slgoritmlka' (algorikhmics) [KnutS!]. Ce bi za katerikoli NP-poln problem uspeli pokazati, da je v P, bi sledilo P=NP. Ker Je veliko praktičnih problemov NP-polnIh, si v praksi pomagamo do delnih rejltev na različne načine. Aproksimativni algoritmi na primer so taki algoritmi, ki v pollnomskem času najdejo refitev, ki je 'dovolj blizu' optimalne reiitve. Deterministični hevrističnl * algoritmi imajo pogosto lepo lastnost, da najdejo dobre refltve na neki podmnofic! problemske domene. Običajno Imajo taki algoritmi 'Ahilove pete': obstajajo primeri, na katerih se algoritem obnala zelo slabo. Avtorji algoritmov običajno te sami navajajo primere, ki so neugodni za algoritem [WePo67,DuBr81|. Pri slučajnih hevristlčnlh algoritmih * SploSno privzete definicije hevrističnega algoritma ni |Tro883|. Po Verblnčevem Slovarju tujk je hevristika: 'nauk o metodah raziskovanja (z uporabljanjem le nedokazanih metod, hipotez, itd.)'. Za naio rabo naj hevristični algoritem pomeni postopek, ki ga uporabljamo, čeprav nimamo dokaza, da so njegove relitve vedno točne (optimalne). Vemo pa, da na primer velja, da je točen na neki podmnoticl problemske domene ali da točno reiuje nek "podoben' problem ali kakšen podoben delni rezultat. nekatere korakenaredlmo slučajno. S tem se včasih izognemo 'Ahilovim petam' na račun nekaj večje časovne zahtevnosti. Eden od intuitivno najlaije razumljivih NP-polnih problemov je problem barvanja točk grafa. Ker je sestavek nekoliko predelano poglavje avtorjeve magisterske naloge 'Verjetnostni algoritem za barvanje točk grafa', je tudi tu najpogosteje obravnavani primer ravno problem barvanja točk grafv Grtf lt komblnatoričnl objekt, ki ga sestavlja ta mnotica točk V in mnotica E, v kateri 80 pari točk. Elementom mnotìce E pravimo paveitve. Točki, ki sta povezani i povezavo, sta $e»tinji Btrvanje (točk grafa) Je prireditev barv točkam graf». Za barve običajno vzamemo kar naravna itevlla. Barvanje Je ioiro, če je vsak par sosednjih točk pobarvan z različnima barvams- Formalno to zapišemo na primer takole: 6 : V Af je dobro barvanje «=> (»,»)€£=> 6(«) ^ b{v). Odločitveni problem barvanja grafa lahko zdsj zastavimo takole: Ali ca dani graf G In mnolico barv {!,...,)[) obstaja dobro barvanje? Ce tako barvanj« obstaja, rečemo, da je G k-poitrvijiv. Pogosto je zanimiva optimizacljska varianta problema: PoHči mlnimalnU, da Je G »e t-pobarvljWt Ta minimalni * Imenujemo kremttično ftevUe grafa G. V nadaljevanju bomo predpostavljali, da bralec pozna osnovne pojme teorije grafov, kot so vpeljani na primer v [BaPiSS] ali [CvMiTT]. Problem barvanja grafa Je zanimiv tudi iz zgodovinskih razlogov. Problem itirih barv je bil skoraj sto let velika uganka, ob kateri se Je razvijala teorija grafov. Problem si Je prvi zastavil leta 1852 londonski Student Francis Guthrie, ko je barval zemljevid angleiklh grofij. Domneval je, da je mogoče vsak zemljevid pobarvati s štirimi barvami, seveda tako, da ozemlja, ki mejijo, niso pobarvana t isto barvo. Pri tem ozemlja, ki se dotikajo samo v končno mnogo točkah ne itejemo za sosednja. Problem običajno zastavimo v nekoliko drugačni obliki. Namesto ozemelj barvamo točke (lahko si mislimo, da to to glavna mesta), povezani pa sta točki, ki sta v sosednjih drtavah. Problem se potem glasi: ali je mogoče točke vsakega planamega grafa pobarvati s Itlrimi barvami (tako da pobarvamo poljubni povezani točki z različnima barvama)? Francisov brat Frederic je problem predstivil profesorju Augustu de Morgaru. Sirie uian je problem postal leta 1878, ko je Arthur Cayley na sreCanju Londonske matematiine druJbe (London Math. Society) vpraJal, ali je problem le reéen. Prvi se je r«*«vanja problema resno lotil A.B.Kempe, ki je leta 1879 objavil dokai, daJtiri barve ladoičajo. Deset let kasneje je P. J.Heawood odkril napako v Kempejevem dokazu, ki pa so jo mnogi imeli bolj u tehnično pomanjkljivost kot pa u resno napako. Ko pa so leta minevala in nikomur ni uspelo popraviti dokaia, je postalo jasno, da problem ni od muh. Leta 1976 sta Kenneth Appel in Wolfgang Haken objavila, da sta dokaiala iirek o Mirih barvah. Ideja dokaiaje v bistvu enaka Kempejevl, leMevllo posebnih primerov, ki jih je bilo potrebno pregledati, je precej večje (okoli 1500 namesto 51). Ob»e«no«t je tudi glavna hiba dokata,saj je ta pregled (okoli 600 strani) dolgega dokaia potrebno ogromno Casa, tudi Ce la nekatera rutinska opravila uporabimo raCunalnik, kot sta to storila avtorja dokaia |WoWi78,ApHa86l. i. Primer NP-polnega problema Za motivacijo si v tem razdelku ie pred formalnimi definicijami oglejmo problem klike (skupka). Zastavimo ga takole: Dan je (neusmeijen) graf G = (V, E) in naravno število k. Ali v G obstaja podgr&f, ki je izomorfen polnemu grafu Kut Obstaja enostaven, toda neučinkovit algoritem la reievanje tega problema. Pregledamo vse podgrafe v G, ki so inducirani i množicami toCk kardinalnosti k. Za vsak tak podgraf pa preverimo, Ce je poln. Med n toCkami lahko liberemo k loCk na (J) načinov. V primeru t = n/2 torej naj algoritem preveri > 2*/(n+l) podmnolic. Preverjanje, ali je graf poln, je enostavno in hitro: ta vsako od n(n-1)/2 povezav pogledamo, ali je v induciranem grafu. Nai naivni algoritem torej zahteva Ca«, ki je eksponentna funkcija Jtevila toCk danega grafa. Preden trdimo, da je to neučinkovito, povejmo, kaj mislimo z velikostjo problema. Predpostavljali bomo, da «o kombinatorični objekti (grafi, množice, cela Mevila,...) podani v 'normalni obliki' s končno abecedo. Tako bo na primer graf z n točkami zakodlran v obliki zaporedja n' bitov, v katerem so zlotene vrstice matrike sosednosti. Cela Q x T x {L, Ä}, ki ni n^^no definirana na vsej mnotici Q x T e 1 potem zapiJemo XìXj.. ...X, h X\Xi ...Xt-ipXt-iYX(^i.. .A, Ce se nam na koncu zapisa konfiguracije pojavijo prazni simboli, jih po dogovoru zbriiemo. Podobno definiramo korak, če je %X() = (p,K,Ä). XjX2... Xi-iqXi ...Xm h XiX}... Xi-i ypX(+i ...Xt V primeru » - 1 = n je niz ... Jf, je zapis konfiguracije po koraku daljtì od zapisa prejJnje konfiguracije. Konfiguracija C = XiX3...Xi-iiXi...X, je i*<«t)i(imii, če velja = t. Konfiguracija je tpr^emtjee*, če je j e F. Konfiguracija je jštetm pri podatku i = Xi, Jfj,...,X„ če je j = fo, in i = 1. Jirtitin pri podatku i e T* je zaporedje konfiguracij Co,Ci,...,Ct, za katere velja £7( I- Ci+i za O < i < *. Izračun »e »»«sm, če je končna konfiguracija C» ustavitvena. Izračun je tfrqemtjoi, če je ivjegova končna konfiguracija Ct sprejem^oča. Brez ikode za splolnost lahko privzamemo, da se vsak sprejemajoč račun ustavi. 2.DEFINICUA; Ntdeterrmnitiični TVinnjoc » 5AT € P Problemov > to lastnostjo je »e veliko, prihajajo pa i r&ilienlh podroSlj. li teorije gra/ov omenimo problem klike in problem barvanja točk grafa. Raivpiti problem trgovskega potnika je samo eden iimed telkih (beri NP-polnih) transportnih problemov v operacijskih raziskavah. Mnogo dela mnogih raiiskovalcevje bilo brei uspeha, ko so iskali učinkovite (polinomske) algoritme lA te probleme. Se več, b) Ka mnoge algoritme, ki so jih predlagali, se je ickazalo, da imajo več kot polinomsko časovno zahtevnost. Za konec razdelka navedimo te izrek, ki primetja deterministično in nedetermlnistično časovno zahtevnost. Enostavna simulacija da za eksponentni red večjo časovno zahtevnost pri simulaciji nedeterminističnega TS z determinističnim. BoljJe simulacije ne poznamo. Ce velja P NP potem je to tudi nsjboljie, kar lahko dosetemo. 7.DEFINIC1JA: Funkcija T : N S ]e luema funkajt, It obstaja deterministični TS, ki se ustavi po natanko T(n) korakih pri vsakem vhodu dolžine n. Nekatere običajne funkcije so tudi časovne funkcije, na primer T(n) = n, r(W) = npog n|, T(n) = n', T(n) = V. 8.IZR£K: (Simulacija nedeterminističnega TS z determinističnim) Bodi T(n) časovna funkcija, t C T* jezik in N nedeterministični TS, ki sprejema L v času TN(n) = 0(r(n)). Potem obstiya deterministični TS W z L{U) = L in 7w(n) = 0(c''("') za neko konstanto c. DOKAZ: Za vsak x € L obstaja izračun dolflne < Tw(|i|), ki sprejme I. Bodi k = max{(»ri<(«(r,o)); t Btu\le TS W in o e F). Potem ima TS ^ na vsakem koraku nsjveč k raiRčnih mogočih operacij. Torej je največ fXI'l) različnih izračunov, ki se ustavijo, dolUne < rf/(|i|) pri vhodu I. Spomnimo se, da je z € L natanko ted^j, ko je vsaj eden od izračunov, ki se ustavijo, sprejemajoč. Uporabimo to pri slmulacijU Izberimo tak m, da bo rw(n) < mr(n) za vsak n. Poglejmo cela števila od O do t^n») pri osnovi k. Vsaka od mogođh izvedb TS JV je zakodirana kot eno od teh itevll na naslednji način: i-ta cifra v t-adičnem zapisu itevila določa operacijo na i-tem koraku Izvedbe. (Nekatera Itevila ne predstavljajo moine izvedbe, vendv nas to ne moti.) Enostavno je videti, da je vsako konkretno izvedbo TS N potrebno največ p(mT(|z|)) korakov, za neki polinom p. (R«s, dodatno moramo na vsakem koraku poiskati naslednjo cifro zakodirane izvedbe, za to K moramo največ dvakrat zapeljati čet trak.) Torej celotna simulacija zahteva čas p(mT(|i|))t"''"(l'l) = 0(c'"(MI)) a neko konstanto c. Q.E.D. 4. Problemi, Jeiiki in optimUacijski problemi Razreda P In NP smo deflnirali za jezike. Praktični problemi pa običajno niso formulirani v obliki problema razpoznavanja nekega jezika. V tem razdelku bomo pokazali zvezo med optlmizacüskimi problemi in njihovimi odločitvenimi inačicami. Za primer si poglejmo odločitveni problem barvanja točk grafa in ustrezni optimizacijski problem. Problem: Optimizacijski problem barvanja točk grafa Vhod: Graf G bhod: Dobro barvanje grafa G z x(G) barvami Optimizacijski problem barvanja točk grafa očitno ni jezik, pač pa zahteva izračun neke transformacije iz matrike n x n bitov v zaporedje n Jtevil 6 {1,2, ...,x(G)} (ki označujejo barve). Pripadajoči odločitveni problem je Problem: Odločitveni problem pobarvljivosti grafa (k-COL) Vhod: Graf G in celo Iteviio k Vpraianje: Ali obstaja dobro barvanje grafa G z največ k barvami? Odločitvenemu problemu je mogoče na naraven način prirediti jezik. To je mnotìca vseh primerov problema (G, k) ali točneje zapisano m, kjer je to koda grafa G in v koda Števila k, za katere je odgovor na dano vpraianje pozitiven. Ik-coL = {(G, (t); G je fc-pobarvljiv ) Oziroina l'k-coL = { u>i); w je matrika povezav nekega grafa G in « je binarni zapis nekega celega Itevila k in graf G je *-pobarvljlv } Mimogrede smo privzeli, da je kodiranje nsÄh problemov 'naravno'. Kot smo omenili le v uvodu poglavja, se dogovorimo, kako bomo kodirali kombinatorične objekte: naravna Itevila bomo zapisali binarno, splolne grafe pa z matriko sosednosti. Bistvena je zahteva, da kodlma shema ne sme umetno znitati zahtevnosti problema. Ce bi na primer naravna Itevila zapisali unamo (n = Ul^..lJ, bi namesto vhoda dolUne O(logtt) Imeli vhod dottine ■ 0{n). Tako bi eksponentni algoritem navidez postal polinomski! Podrobneje so zahteve 'naravnega' kodlraqje opisane na primer v |PeP182). Odločitvenemu problemu smo priredili jezik, torej se lahko vpraiamo, ali je problem v P. V nadaljevanju bomo pokazali, da bi v tem primeru optimizacijski problem lahko refili v polinomskem času. Torej lahko tudi za optimizacijski problem rečemo, daje polinomski, če je njegov pripadajoči odločitveni problem v razredu P. Za začetek poglejmo kvakteristično lastnost razreda NP. 9.IZREK: NP = {L; t C E* za neJto končno abecedo E in obst^a poflnom p in polinomtko izraiiin{jiv predikat Q(i,») C E' x E' , da za vsak x 6 E' ve(ja; i6L' ■3,€E':|y| k. Deflninjmo = '»j« koda aprejemajoieja iira E* iiračunljlva v polinomakem čmu, potem Isto velja za funkciji optval : E* —• E* In witnen ■.t'—'Z'. b) C« je vsaj ena od funkcij untne«« in opjvsi izračunljiv» v polinomsktra času, potem je := {(i,C);r € E',C 6 Nin optval < C} e P DOKAZ: Očitno iz prej povedanega. Manj očitno je, da velja tudi obrat zadnje leme. V »ploinem znamo dokazati nekoliko libkejJo obratno trditev. Primer, ki ga v »ploJnem ne znamo dokazati, pa bomo dokazali za poseben primer barvanja točk grafa. 13.LEMA: Bodi (Q,c) polinomako omejen optimlzacijaki problem. a) Ce ata funkciji wUnett in optval izračunljivi v polinomakem ča«u, potem isto velja za opttol. b) Ce je problem razpoznavanja v razredu P in za neki polinom j velja opJi;o((i) < 2«(l»l) za vae i, potem je funkcija optval izračunljiv» v polinomakem čaau. DOKAZ: a) Trivialno, a^j za vaak z velja opttoi(z} = witnest{z, optval(T}]. b) Za izračun optimalne refitve optval bomo uporabili binarno iakanje med vrednoatmi {l,...,2»(t'"}. Poglejmo naslednji algoritem: hw := 0; high := 1; while t nima refitve < hi/h do hifh := 2 • hifh; while hifh -low> i do middle := tr«nc{(M'jfi + lou!)/2) tf » ima refitev < midJte then high := middle else low := middle^ od optval(z) := lour, Ker je problem po predpostavki polinomsko omejen, vemo, da je reJilev omejena z 2«(l*l), kjer je g neki polinom. Prva while zanka torej v polinomakem Itevilu korakov določi zgornjo mejo reiltve. Hitro vidimo, da je low < optval(z) < high invariant» do zanke. Torej gornji algoritem pravilno Izračuna vrednoat optval{z) v (o^(2»(l*l)) = }(|i|) ponovitvah zanke. Ce v drugi while zanki uporabimo polinomaki algoritem za razpoznavanje, ki po predpostavki obstaja, amo torej rea naJli polinomski algoritem za izračun optimalne vrednosti optral(i). Q.E.D. Primerjavo med problemom C-vrednosti in problemom razpoznavanja naredimo za konkretni primer barvanja. V ktvjigi (Melh84| najdemo dokaza za problem trgovskega potnika in za izpolnjivoet, pa tudi trditev, da podobni dokazi, v katerih uporabimo tehniko redukcije problema na manjii problem iste vrate, obstajajo tudi za mnoge druge NP-polne probleme. 14.LEMA: Ce je odločitveni problem C-barvanja v P potem obataja algoritem, ki v polinomakem čaau poliče C-barvanje. DOKAZ: Idejo it |PePi82l Mpliimo v obliki algoritmi: if not C - COL{G) then {»barvinia ni*) elie for C := C downto 2 do begin liberi poljubno točko u 6 V(C'); pob&rvij H i birvo c, for all toCke v e kl niso sosede u-ji do If not C - COL(V(Cr), E{G') U («, «)) then begin (*toikl sta neodvisni!») pobarvaj v t barvo c, identificiraj « in t; v <7*; end; er := C \{tt} end preostale točke v C pobarvaj i barvo 1 Identifikacijo dveh točk grafa očitno lahko naredimo v linearnem času (potrebno je namesto dveh vrstic (stolpcev) enega pobrisati, v drugega pa vpisati konjunkcljo vrstic (stolpcev)), tato je gornji algoritem polinomski, seveda pri predpostavki, da je algoritem, ki odloči, ali je graf c-pobarvijiv, polinomski. Q.E.D S. PoUnomsko prevajanje In NP-polnl problemi Prevajanje problemov je uporabna tehnika ta ugotavljanje 'težavnosti' problemov. (Primer smo videli v prejSnjem raidelku.) 15.DEFINICUA: a) Naj bosta E in T končni abecedi. Preslikava/ : E* -r* je poKnonuka Irtniformacija, če lahko / itračunamo v polinomskem ča /(i) e Lj ta vse I e E'. c) Jeiik L je NP-poln, če velja 1) t€NP 2) L' (X i, ta vse i' 6 NP Pomembnost definicije pojasni naslednji 16JZREK: Bodi Lo NP-poln. Potem a) Lo e P <=>• P = NP b) če je Lo oc Li In Li e NP, potem Je Li NP-poln Gomji algoritem očitno ratpoinava L. Potreben čas je ?(|z|) + |/(i)| + p(|/(»)|) < '■(1*1) " n«''' polinom r. Zadnja neenakost je posledica dejstva, da je |/(i)| < |i| + ?(|i|), saj lahko TS v enem koraku porabi nsjveč en nov kvadratek traku. b) Pokatati moramo, da je L « Li ta vsak L € NP. Vtemlmo poljuben L € NP. Ker je Lo NP-poln, obstaja polinomska transformacija /, da jt L = /-'(Lo). Ker je Lo a Li obst^a tudi polinomska transformacija tako da je Lo = Otnačimo h = } o f. Potem je L = f'^U) = /-'(j-'(Li)) = {goJ)-^Ll) = h-^L^). S podobnimi argumenti kot v točki a) pokažemo, daje h polinomska transformacijv Q.E.D. NP-polni problemi so torej najtežji med problemi ratreda NP. Cc bi bil eden Itmed njih rešljiv v polinomskem času, potem bi bili vsi refljivi v polinomskem času in veljalo bi P=NP. Po drugi strani pa velja; ii N/NP sledi, da noben NP-poln problem ni rešljiv v polinomskem času. Točka b) nam da enostavno tehniko ta dokai NP-polno«ti danega problema. Ce telimo pokatati, da je L NP-poln, moramo dokaiati L € NP in Lo « L ta nek problem Lo, ta katere^ le vemo, da je NP-poln. Prvi dokat NP-polnosti nekega problema je naredil Cook v svojem klasičnem članku |Cook71]. Dokatal je, da je problem iipolnjivosti NP-poln. Dokat tu opuičamo, bralec ga lahko najde v knjigah, na primer v |GaJo79| ali 1M^841, v sloveničlni pa v |PePi82l. Ko tak dokat imamo, lahko relativno enostavno poka/emo NP-polnost drugih problemov, kot je prvi storil Karp |Karp72|. Setnam NP-polnih problemov se od tedaj Siri, tajetne setname lahko bralec najde v omenjenih knjigah. Za problem k COL (odločitveni problem barvanja točk grafa) je dokatal NP-polnost le Karp |Karp72l. Dokaz je posreden, vmes pokafe NP-polnost problema 3-iipolnjivostl (i-SAT). Dokat tu opuščamo, bralec ga lahko v slovenščini najde v |Gode83). Pravkar povedano tapiSimo v obliki itrekov: 17.IZREK: Problem itpolnjivosti je'NP-poln. 18.IZREK: SAT co-RP C co-NP DOKAZ: a) Ce je P=NP potem je gotovo Lo € P. Dokazati moramo še drugo smer ekvivalence. Bodi Lo G P in L € NP poljuben. Ker je Lo e P, obstaja deterministični TS M, ki sprejme Lo v času omejenem t nekim polinomom p. Ker je Lo NP-poln in L 6 P, velja L o< Lo. Torej obstaja polinomsko itračunljiva preslikava /, tako da je L = /"' (Lo). Bodi N deterministični TS, ki itračunava/ v času, omejenem s polinomom f. It TS M in N konstruirajmo nov TS A, ki bo ratpotnaval L takole: (1) Itračun^ /(») v času ?(|i|) t TS N. (2) Premakni glavo ('oko') na prvi simbol /(i) v času |/(i)| (3) Razpotnaj /(i) uporabljijoč M v času p(|/(i)|). Ce je /(i) G Lo, potem sprejmi i, sicer ga tavmi. C. Slika 1. Nekateri razredi jetikov kjer je co-C:={r' - L; L € C in T konint abeceda} ta vsak razred jezikov Nove ratrede problemov bomo definirali s pomočjo verjetnostnega Turingovega »troja. Verjetnostni Turingov stroj je TS t motnostjo slučajnih odločitev. Tukaj samo omenimo, da s tem računska moč stroja nI nič veijv Mogoče pa je, da se lahko kak problem reši hitreje ali na manjšem prostoru, kot t determinističnim Turingovim strojem. 19.DEFINICUA: Verj'etnorlni Turiniov rtrtj (VTS) je DTS s posebnimi stanji, v katerih sta mogoči dve nadaljevanji, ki sta enako verjetni. Tika definicija VTS je ugodna laradi enostavnosti. Ker so vse veje enako verjetne, dobimo enostavno poraideiitev motnih retultatov. Model, v katerem bi dovolili neuravnoteiene verjetnosti nadaljevanj, ni nit močnejJi od tuksy definiranega (Sch»86|. V sploinem VTS računa slučajno funkcijo: ta vsak vhod x itračuna VTS retultat y i verjetnostjo Pr{M(i) = y}. 20.DEFINIC1JA: Delna funkcija, ki jo iiračunava VTS W je definirana t nedefinirano ,te takega y ni = 21.DEFINICIJA: Verjclnoit napake VTS M je euU) = I ^ i« dtflninino ^ ' I nedefinirano ,gicer 22.DEFINICUA: VTS M iiraiunava ^u t omejeno verjetnostjo napake, Je obstaja konstanta« < J, tako da je eu{x) è v ma/u kot n konkih ,če je definirano 00 .sicer Tw(n) = Zadnja definicija je analogna definiciji časovne lahtevnosti NTS kot dolline najkrajšega sprejemajočega iiračuna (definicija 5.). 24.DEFIN1C1JA: VTS je polinonukc omejen, te obstaja polinom p, tako da se vsak molen iiraCun vhodov dolline n ustavi po največ p(n) korakih. 25.DEFINICIJA: VTS raipoinava jeiik, če iiražuna njegovo karakteristično funkcijo. Ali, dnigače upisano: x G L O Pr{M(x) = 1}>^ In zjTL«» Pr{W(i) = 0}>i 26.DEFINICIJA: PP = 'raired jeiikov, ki jih raipotnavajo polinomsko omejeni VTS' BPP = 'raired jeiikov, ki jih raipotnavajo polinomsko omejeni VTS t omejeno verjetnostjo napake' ZPP (ali LVP) = 'ratred jeiikov, ki jih raipotnavajo VTS s polinomsko povprečno časovno lahtevnostjo in t verjetnostjo napake O' 27.IZREK: LVP C BPP C PP C PSPACE S PSPACE smo oinačili raired jeiikov, ki jih raipotnavajo TS s polinomsko omejeno doltino uporabljenega traku. likate se, da Je vseeno, če vtamemo v definiciji deterministični ali nedetermlnistiini TS |Savl74|. DOKAZ: Relacija BPP C PP je očitna ii definicije. PP C PSPACE je posledica itreka o deterministični simulaciji verjetnostnega TS, ki ga tu nismo navedli. Dokai bi iel podobno kot dokai iireka 8. Pokaßmo »e LVP CBPP. Naj bo L jerik, ki ga raipoinava VTS W i verjetnostjo napake O in povprečnim časom omejenim s polinomom p(n]. Bodi o 2 poljubna konstanta. Z M' oinačimo VTS, ki simulira delovanje VTS M do največ ep(n) korakov. Ce se A/ do tedaj ne bi ustavil. M' odgovori karkoli. Ker M naredi več kot cp(n) korakov i vetjetnostjo manj kot l/c, je verjetnost napake polinomsko omejenega stroja M' največ l/c < 1/2. Q.E.D. 28.IZRf;K: P C LVP C NP C PP DOKAZ: Ker vsak polinomsko omejeni TS računa t verjetnostjo n&pxkf O, je očitno PC LVP. Bodi L eLVP in M VTS, ki raipotn&va L t verjetnostjo napakt O in polinomsko omejenim povprečnim časom. Ce na W gledamo kot na nedeterministični TS vidimo, da M raipotna L v polinomskem času, saj mora biti ta vsak vhod vsaj en iiračun s polinomsko omejenim Itevilom korakov. Torej L g NP in LVP C NP. Pokstimo Se NP C PP. Erti Jkode ta spioSnost lahko privtamemo, da L razpotnavapolinomsko omejen NTS, ki ima v vsakem stanju največ dva mogoča koraka. Ce na M gledamo kot na verjetnostni stroj, potem je L mnolica nitov, ta katere obstaja sprejemajoč iiračun. Torej i 6 i Pr{ W(i) sprejme} > 0. Za dokai, da je i v rairedu PP konstruiramo stroj M' takole: M' nsjprej vrte kovanec; če je retultat grb, potem sprejme vhod, sicer pa simulira delovanje stroja A/, tako da vsakič, ko ima na voljo dve nadaljevanji, libere eno »Ti drugo i enako verjetnostjo (J). Vendar verjetnostni stroj W »e ni povsem dober. Ce i ^ L je mogoče, da velja Pr{M'(i) sprejme} = 1/2. Torej je mogoče, da M' ne liračuna karakteristične funkcije L. Definirajmo VTS M", la katerega bo veljalo Pr{A^"(i) sprejme} < 1/2 ta vse z-e, ki niso v L. Bodi p(n) polinom, ki omejuje Število korakov iiv»janja stroja M. Vsak i€ L doltine n je sprejet t verjetnostjo vsaj 2-'(*>, saj gotovo obstaja vsaj en sprejemajoč Iiračun in je verjetnost vsakegaod Itračunov dolfine n vsaj 2-''*'. Verjetnostni TS M" deluje takole: Najprej M" vrte p(n) +1 kovanec in sprejme vhod bret nadaljevanja i verjetnostjo Potem M" simulira delovanje M in sprejme vhod, če ga sprejme A/. M" torej lavme vsaki ( L i verjetnostjo vsaj ^ + in sprejme vsak i 6 £. t verjetnostjo vsaj ^ + Torej VTS M" raipoinava L v polinomskem času, tato NPCPP. Q.E.D. 29.DEFINICIJA: VPP (ali RP) je raired jetikov, ki jih raipoinajo polinomsko omejeni VTS, ki imajo ta vhode, ki niso v jeiiku, verjetnost napakr 0. Ali, drugače lapisano: L€ RP ^L raipoinava po/inomsJto omejeni VTS M in Pr{M{x) sprejme} = O la Vi ^ L. Opomba: Tu bi dobili isti raired, če bi ta i € L lahtevali Pr{Af(z)«prejme) > « ta katerokoli konstanto O < 7 < 1. Po n ponovitvah algoritma je namreč verjetnost napake enaka 1 - (;)(1 - ^j*. 30.IZREK: 1.) RP C NP n BPP 2.) LVP = RP n co-RP DOKAZ: 1.) Inkluiiji RP C BPP in RP C NP sta očitni. 2.) Pokaiall bomo LVP C RP, LVP C co - RP in RP n co - RP C LVP. Najprej pokažimo LVP C RP. Ce je L € LVP, lahko L raipotnamo TS M, katerega pričakovani čas livajanjaje omejen t nekim polinomom, imenujmo ga q, reiultatl pa so popolnoma tanesljivi. Konstruirajmo TS M' takole: • pri vhodu X simulira delovanje TS M največ ?(|i|) korakov. Ce bi se M ustavil, preden se ustavi M' (to se tgodl i verjetnostjo vs&j 1/2), potem M' odgovori isto kot M. Sicer A/' odgovori NE. Vidimo, da so pritrdilni odgovori M' povsem tanesljivi, negativni odgovori pa so pravilni t verjetnostjo 1/2. Torej je 6 RP in LVP C RP. Podoben »rgument pokaže LVP C co - RP. OiiHi. nun je 8e inkluiija RP n co - RP C LVP. Bodi L € RP n co - RP. Potem obstajata VTS A/i in A/j, za katera velja: - ta» itvajanja obeh VTS je omejen s polinomom - poiitivnl odgovori Mi in negativni odgovori Mj »o povsem lanesljivi - negativni odgovori Mi in potitivni odgovori Mj so pravilni i verjetnostjo, ki je vetja od neke konstante, na primer 1/2. Pokaümo, kako lahko konstruiramo TS i verjetnostjo napake O in s polinomskim povprečnim časom iivajanjv Bodi z poljuben. Simulirajmo delovanje Mi in delovanje Mj, kar lahko naredimo v polinomskem iasu. Ce je vs^ en odgovor zanesljiv, koniamo. Zanimiv je primer, ko oba algoritma dasta odgovor, ki ni povsem zanesljiv, (verjetnost tega dogodka je manjSa od 1/4). V tem primeru poskus ponovimo, dokler se ne zgodi prvi primer. Pričakovano Jtevilo ponovitev poskusa je omejeno j' povprečni potrebni čas omejen s polinomom. Q.E.D. Za konec navedimo Je nekaj primerov. Začnimo s problemom določitve, ali je neko »tevilo praitevilo. Problem: PraStevila (PRIMES) Vhod: Naravno število n v binarnem zapisu. Odgovor: Da, če je n praštevilo in ne, če ni. 31.IZREK: ».) PRIMES eco-W b) PRIMES eco-UP DOKAZ; a) trivialno b) . Naslednji algoritem (Solovay-Strassen) dokazuje, da je problem PRIMES v razredu co-RP. izberia€{l,2,...,n-l} te (o, n) ^ 1 potem n ni praétevilo te (a/n) [mod n) potem n ni praitevilo sicer n je (z verjetnostjo > 1/2) praitevilo kjer je (a/n) Legendrov simbol (prim. (GrasTS)). Legendrov simbol znamo učinkovito (v polinomskem času) izračunati z algoritmom, ki je nekoliko podoben Evklidovemu. Temelji na recipročnem zakonu, ki pravi, da je [g/p) = -(p/?), če je p = « = 3 in (j/p) = (p/g) sicer. Poleg tega za fl > p, torej ? = mp + r, velja (j/p) = (r/p). Ce je p praitevilo, je kongraenca (a/p) = a'^imod p) izpolnjena za vsa »tevila o, 1 < o < p -1. Ce pa p ni praitevilo, pa kongruenca velja za n^več polovico a-jev [SoStTT]. Q.E.D. Neodvisno je podoben dokaz naiel Rabin |Rabi76|. Kot drugi primer si poglejmo naslednja problema: Problem: MAJ Vhod: (KNO) formulastavčnegaračunaF(ii,...,i.) Vpraiaole: Ali F velja za večino (za več kot 2*-') naborov vrednosti za Il.....X.? Problem: #5X7 Vhod: (KNO) formula «tavčnega računa F{zi.....*;) in naravno itevilo Vpraiaije: Ali obstaja vsiyj i različnih naborov vrednosti ca spremenljivke ii,..., i«, ki zadoičajo F. KNO je kratica za konjuktivno normalno obliko. Brez dokaza navedimo (glej npr. (GÌ1177]) 32.IZREK: MAJ in fSAT sta PP-polna problema. 7. Reievaqje NP-polnih problemov Doslej najboljša ocena za ča», potreben za reievanje NP- polnega problema je eksponentna funkcija doltine vhoda. Pri velikih n je stvar videti brezupna. Po drugi strani so mnogi praktični problemi dokazano NP-polnl. Kaj storiti? Poglejmo nekaj splošnih pristopov: a) Posebni primeri: Pogosto s« zgodi, da v praksi ne potrebujemo reSti NP-polnega problema v vsej sploinosti. Podproblemi imajo zaradi dodatnih omejitev včasih poHnomske reiltve. b) Dinamično programiranje in razveji-omejl sta tehniki, ki ju je mogoče uporabiti pri reievanju nekaterih NP-polnih problemov. V obeh primerih s pametno izbiro nadaljevanja precej zmanjiamo potrebno računanje na neperspektivnih refitvah. Cas obeh metod pa je ie vedno eksponenten. Več o omenjenih metodah najdemo na primer v knjigi [KozaSe]. c) Z verjetnostno analizo lahko včasih pokafemo, da so resnično 'teiki' primeri NP-polnega problema dokaj redki. V takem primeru lahko najdemo algoritem z dobrim pritakovanim časom računanja, čeprav zgornje meje ne moremo omejiti s polinomom. Seveda je potrebno paziti pri izbiri porazdelitve primerov NP-polnega problema. Dokaz, da je izbrana porazdelitev prava, pa utegne biti resen problem [Karp76). Za primer omenimo metodo simpleksov za reievanje problema linearnega programiranja, ki ima eksponentno časovno zahtevnost. Metoda simpleksov se v praksi dobro obnese, verjetno zaradi izredne redkosti 'teiklh' primerkov problema. Dokazano je namreč, daje metoda v povprečju pollnomska |Smal83|. Kljub temu, da je problem linearnega programiranja v razredu P |Khac79), pa ne poznamo polinomskega algoritma, ki bi bil v praksi boljii od metode simpleksov. d) Aproksimacijski algoritmi lahko včasih dajo zadovoljive reiitve v kratkem času. Seveda ni vseeno, kaj nam pomeni v konkretnem primeru zadovoljiva rejltev. V primera barvanja točk grafa na primer velja, da bi bil tudi algoritem, ki bi poiskal skoraj optimalno reiitev NP-poln. Carey in Johnson sta namreč pokazala |GaJo76], da velja: če za kakšni konstanti r < 2 in . Taj jezik obično se naziva EP model om/Jezi kom (Entity-Relationship), pa demo ga ovdje tako i nazivati. Polaznu osnovu ER modela/jezika možemo izreći slijedećim riječima: Podaci su znanja o objektima, vezama (medu tim objektima) 1 svojstvima (objekata odnosno veza). Stoga, je cilj jezika za modeliranje podataka da omogući precizno 1 jednostavno predstavljanje forme tih triju temeljnih kategorija znanja. Na Chenov prijedlog ER modela podataka uslijedilo je viie terminoloških 1 notacijskih nadopuna, kao i proširenja samoga modela. Zbog ograničenosti prostora, ovdje ne Iznosimo eksplicitno sam ER model podataka. Dobar prikaz toga modela/jezika te načina njegova prevođenja na relacijski jezik dat je npr. u . Relacijski Jezik Relacija, kao temeljni element relacijskog jezika/modela, ima dva aspekta: značenje i sadržaj. Značenje relacije naziva se Intenzl-Jom, a formalno se Iskazuje shemom celaci Je. Sadriaj relacije naziva se ekstenz1 Jam, a Iskazuje se naslovljenom tabelom podataka. Tabelu tvore n-torke atomar.nlh vrijednosti. Pored "tabelarnog izgleda", relacijski model karakterizira i skup operatora definiranih na skupu tabela-relacija. Ti operatori omogućavaju da se, pored znanja (podataka) ekspjIcltno datlh pojedinim relacijama, deduclraja (izraču-naju) i ona znanja koja iz toga skupa relacija 68 logički slijede. 0 kontekstu modeliranja podataka bitni su operatori projekcije i spajanja. Prikaz relacijskog modela dat je npr. u , , , , ... . Normal 1 zaci Ja vor da to "očito treba biti tako" nije neumjesan. Međutim, kod opsežnijih ( koitipl eksn 1 j 1 h ) modela stvari obièno nisu toliko očite. S druge strane, mogli bismo isto tako reći da data podjela podataka slijedi iz ER modela sa slike 2. Neformalno rečeno, ( standardnim )" procesom normalizacije nastoji se razviti dobar model podataka na taj način da se iz datog modela podataka otklanjaju slabosti. Stoga, prikaz problematike normalizacije otpočnimo analizom slabosti koje mogu karakterizirati neki dati model podataka. Na slici 1 dat je ilustrativni primjer skupa relacija KUPAC, ARTIKAL i NARUDŽBA. niPdC i-m IflE SJEDISTE Kl Bil lb Zigrib K2 Arint Puli a Hlrni Rovinj K4 Bidil Iigrili ARTIKAL S-ART NAIIV BOJA CIJENA Al eUvki crvini 3 A2 guilci bUili 7 A] pinkili plm B M pinktlt ervri 9 MRUIltt« J-KUP i-ART KOLICIKA Kl Al 100 Kl A2 200 K2 Al 200 K2 200 K3 A3 100 Bliki 1. Postavlja se pitanje zaito je dati skup podataka (o fragmentu realnog svijeta) razđijeijen upravo u tri zasebne relacije. Eventualni odgo- KUPAC Naime, prijevod tog ER modela podataka na relacijski jezik daje sheme relacija: KüPAC(a-KÜP, IME, SJEDISTE), ARTIKAL(a-ART, NAZIV, BOJA, CIJENA), NARUDžBA(S-KUP, B-ART, KOLIČINA), tj. točno sheme relacija sa slike 1. Međutim, to je prije odgovor na pitanje odaVJe nego na pitanje zažto tri relacije. Da bi odgovorili na pitanje zašto, pokuiajmo isti fragment realnog svijeta opisati nekorektnim ER modelom .podataka datlm na slici 3. 81 i kl 3. Relacijski zapis ER modela podataka sa slike 3 glasi ! NARUDžBAl(6-KUP, IME, SJEDISTE, 6-ART, NAZIV, CIJENA, KOLIČINA) Primjer ekstenzlja relacije NARUDžBAl dat Je na slici 4. Prva slabost relacije NARUDZBAl, Jeste prisustvo redundance. Na primjer, ime i sjedište kupca Javljaju se toliko puta koliko artikala je naručio pojedini kupac. Zbog prisustva redundance javljaju sè problemi 1 kod mijenjanja ekstenzije (tj. sadriaja) relacije. Te probleme zajedničkim imenom nazivamo anomalijama odriavanja. NARUItSAl J-KUP IHE 8JE011TE J-ART NAZIV BOJA CIJENA KOUCINA Kl Bilib Zigrib Al olovlii crvini 3 100 Kl Sil lb Iigrib A2 guli Cl bljili 7 200 K2 Anni Puli At olovki crvini 3 200 K7 Ar ini Puli A2 guliti .bljili 7 200 K3 Nirm Rovinj A3 pinkili plivi 5 100 Sliki 4. üplB ^ Podatke o pojedinom kupcu nije moguće upisati u relaciju NARUDŽBAI sve dok taj kupac neSto ne naruči. Analogno vrijedi 1 za artikle. V --> W ako 1 satno ako svakoj Instanci od V moie biti pridružena točno jedna Instanca od W. Tada kažemo da V funkcljski determinira W, odnosno da W funkcijski zavisi od V. Za FZ oblika X --> Y kažemo da je trivijalna ako je Y podskup od X. Zavisnosti ne iskazuju odnose unutar neke date eksterni je relacije (tj. nekog trenutnog sadržaja relacije), veé odnose koji - u datom fragmentu realnog svijeta - uvijek vrijede medu promatranim entitetima. Na temelju zavisnosti, definiran je 1 pojam JegaJne eksterni je relacije. Brisanje 'i Brisanjem pojedine narudžbe mogu biti izgubljeni 1 svi podaci o kupcu odnosno artiklu. Mijenjanje Ako kupac promjeni ime 1/111 sjedlSte, onda se ta promjena mora provesti na toliko mjesta koliko narudžbi ima za toga kupca. Analogno vrijedi 1 za artikle. Prlmjetimo da se nijedna od iznad navedenih slabosti relacije NARUDžBAl ne javlja u skupu relacija sa slike 1, gdje su podaci razdjeljenl u tri zasebne relacije. Stoga, iznijeti primjer navodi na zaključak da je relaciju sa "mnogo atributa" poželjno dekomponirati (razdijeliti) na "viie relacija sa manje atributa". Cilj teorije normalizacije jeste da definira kriterije kada 1 proces kako se dekompozicija date sheme relacije treba izvesti. Kažemo da se normalizacija date sheme relacije izvodi na temelju dodatnih znanja o odnosima medu entitetima realnog svijeta, Sijl model podataka oblikujemo. Ta znanja nazivamo zavisnostima. Govorimo o tri vrste zavisnosti, i to: funkcijskoj zavlusnost (FZ), viSeznainoj zavisnost (VZ) i zavisnosti spajanja (ZS). 2. Funkcijska zavisnost Neka bude data shema relacije R sa skupom atributa A(R). Nadalje, neka V 1 W budu podskupovi od A(R). Na shemi relacije R vrijedi funkcijska zavisnost Neka bude data shema relacije R, sa pripadnlm skupom funkcijskih zavisnosti F. Ekstenzija relacije R legalna je ako i samo ako su na njoj zadovoljeni svi uvjeti iskazani sa FZ iz skupa F. Legalnost ekstenzije relacije ne garantira točnost podataka. Međutim nelegalnost garantira da su netočne barem neke od tvrdnji iz date ekstenzije relacije. Iz datog skupa funkcijskih zavisnosti F mogu logički slijediti 1 FZ koje u F nisu eksplicitno sadriane. Na primjer, iz skupa FZ koji sadrži zavisnosti A --> B 1 B --> C logički slijedi zavisnost A --> C. Skup svih FZ koje logički slijede iz F naziva se zatvorenjem od F, a označava se sa F+. Niz pravila koja omogućavaju da se iz datog skupa F Izvedu sve 1 samo FZ koje spadaju u F+ naziva se Armstrongov i m aksiomi ma. Formalna definicija postupka norma 11 zac i je temelji na skupu zavisnosti F+. No, radi pojednostavljenja prikaza, ovdje proces normalizacije temeljimo na skupu eksplicitno datlh zavisnosti, tj. na skupu F. Ü praktičkim terminima, to ne umanjuje valjanost iznijetih postupaka. Naime, projektant je taj koji - promatranjem odnosa u realnom svijetu - utvrđuje skup eksplicitnih zavisnosti F. Stoga, nema razloga da u taj skup ne uključi sve relevantne FZ, čime skup F+ postaje praktički beznačajan. 3. Dokoinpoz 1 ci ja baz gubitka informacija 4. Prva, druga i treća normalna forma Neka bude data shema relacije R. Relacijske sheme R1 i R2 su dekompozicija od R ako i samo ako vrijedi: A(Ri) unija A(R2) = A(R) . Dakle, sheme relacija Ri i R2 tvore dekompoziciju sheme relacije R ako i samo ako se svaki atribut iz A(R) javlja u barem Jednoj od shema relacija Ri odnosno R2. Dekompozicija (Rl, R2) sheme relacije R je bez gubitaka informaci ja ako i samo ako se svaku légalnu ekstenziju relacije R dade rekonstruirati spajanjem njenih projekcija na skupove atributa A{R1) i A(R2 ) . Prva norinaina forna Shema relacije Je u prvoj normal noj forni (INF) ako i samo ako Je domena svakog od njenih atributa skup atomar-nih vrijednosti. Shema relacije NARÜD2BA1 jeste u INF. Međutim, javljanje redundance i anomalija održavanja u relaciji NARUD2BA1 pokazuje da INF sheme relacije nije dovoljan uvjet za dobar model podataka. Druga normalna forma Ta definicija ne daje prikladan kriterij za utvrđivanje da li dekompozicija jeste ili nije bez gubitaka informacija. Pogledajmo, stoga, slljede A(R1) A(R1) presjek A(R2) --> A(R2). Drugim riJeSlma, presjek skupova atributa shema relacija Rl i R2 mora bit kandidat ključa u barem jednoj od tih relacija. Shema relacije R na kojoj vrijedi FZ oblika X --> Y dekomponira se na sheme relacija Rl i R2 tako da vrijedi: A(R1) = X unija Y A(R2) = A(R) minus Y. FZ oblika X --> Y nazivamo potpunom FZ ako na postoji skup V, V pravi podskup od X, za koji vrijedi V --> Y. Tada kažemo da Y potpuno zavisi od X. Funkcijsku zavisnost (od X) koja nije potpuna nazivamo parcijalnom PZ (od X). Shema relacije R nalazi se u 2NF ako je svaki neključni atribut od R potpuno zavisan od kandidata ključa. 2NF nema većeg praktičkog značaja jer se prevođenjem sheme relacije šarao u 2NF slabosti modela (tj. redundanca i anomalije) općenito ne otklanjaju. Stoga 2NF možemo smatrati samo "prirodnim prethodnikom" treće normalne forme. Takva dekompozicija očito Jeste bez gubitaka informacija jer je A(R1) presjek A(R2) = X, a zbog X --> Y, skup atributa X Je kandidat ključa u Rl, tako da vrijedi X --> A(R1). Datu shemu relacije R općenito se dekomponira na proizvoljan broj shema relacija Rl, Rn. No, radi pojednostavljenja, ovdje se ograničavamo na dekompoziciju na dvije sheme relacija. Proces dekomponiranja može se dalje sukcesivno izvoditi na shemama relacija generiranim u prvom koraku dekompozicije. Treća normalna forma Naka bude data relacija R, i neka X, Y i Z budu podskupovi od A(R). Funkcijska zavisnost X --> Y je tranzitivna FZ na R ako na R vrijede zavisnosti X --> Z i Z —> Y. Shema relacije R nalazi se u trećoj normalnoj formi (3NF ) ako neključni atributi nisu tranzitivno zavisni od kandidata ključa. Na Elici 5 dat je graflSkl prikaz FZ koje vrijede na shemi relacije NARUDžBAl. Ta shema relacije nije u 2NF - a time ni u 3NF - jer je parcijalna zavisnost oblik tranzltlvne zavisnosti . KOLIČINA J k Sheme relacija R3 i R4 jesu u 3NF, jer ne sadrie tranzitlvnlh zavisnosti. Dekompozicijom sheme relacije NARUDiBAl na relacije Rl, R3 1 R4 generirane su shema relacija sa slike 1 (ovdje sa drukčijim imenima, naravno). To ujedno pokazuje da proces normalizacije sheme relacije NARUDiBAl (koja slijedi iz nekorektnog ER modela podataka sa slike 2) daje one sheme relacija koje daje 1 sam korektan ER mode podataka sa slike 1! Dakle, normalizacija se ovdje svela na otklanjanje nekorektnosti ER modela podataka. A to sugerira da korektna Izrada ER modela podataka dovodi do shema relacija koje veé Jesu "u optimalnoj (normalnoj) formi", 5ime se ukida i sama potreba po normalizaciji. Sliki Polazeći od FZ e-KUP --> IME, SJEDISTE, shemu relacije NARODžBAl dekomponirajmo na sheme relacija: Rl(e-KOP, IME, SJEDISTE) R2(S-K0P, 6-ART, NAZIV, BOJA, CIJENA, KOLIČINA). Shema relacije Rl jeste u 3NF, jer IME ne determinira SJEDISTE tako da nema tranzltivnih zavisnosti. Međutim, dijagram FZ za shemu relacije R2, dat na slici 6 pokazuje da ta shema relacije nije u 2NF (a time ni u 3NF). 5. Dekompozicija bez gubitka zavisnosti Funkcijske zavisnosti iskazuju odnose koji vrijede u realnom svijetu. Stoga, ako želimo da ti odnosi vrijede i u modelu podataka (a to je globalni kriterij dobrog modeliranja 1 ) , onda prilikom dekomponlranja sheme relacije nijedna FZ ne smije biti ukinuta. Dekompozicija (Rl, R2) sheme relacije R je bez gubitaka funkclJskih zavisnosti ako sve FZ definirane na R logički slijede iz unije skupova FZ definiranih na Rl odnosno R2. KOLICIMA Sllki i. Polazeći od FZ e-ART --> NAZIV, BOJA, CIJENA, shemu relacije R2 dekomponirajmo - prema iznad datom principu - na sheme relacija R3 i R4: R3(S-ART, NAZIV, BOJA, CIJENA) R4(a-K0P, e-ART, KOLIČINA). Slijedeća definicija daje operativno pravilo za dekomponiranje bez gubitaka FZ : Shema relacije R dekomponira se (bez gubitka FZ) ako se dekomponira prema FZ koja nije od kandidata ključa. Dosadainjl prikaz procesa modeliranja podatakć zakljuSlmo slijedećom tvrdnjom: Svaka shema relacije R koja nije u 3NF dade se - sukcesivnom primjenom ovdje opisanog postupka - dekomponirati na skup shema relacija Rl, ..., Rn, tako da vrijedi; - Svaka Rl, 1 =< i =< n. Jeste u 3NF; - Dekompozicija je bez gubitka informacija; - Dekompozicija je bez gubitka funkcijskih zavisnosti. Možemo redi da je 3NF (za praksu) najvainlja normalna forma. Naime, shema relacije koje nije u 3NF redovito dovodi do ranije iznijetih slabosti (redundanca, anomalije odriavanja). S druge strane, shemu relacije koja Jaste u 3NF ponekad nije mogućne dekomponlratl bez gubitka PZ. Iznijetim primjerom ilustrirano Je da sam korektan ER model podataka daje sheme relacija koje Jesu (barem) u 3NF. Utoliko se i normalne forme mogu ovdje smatrati formalnim kriterijima za kontrolu Ispravnosti BR jnodeia podataka. 6. Boyce/Coddova normalna forma Za Boyce/Coddovu normalnu formu (BCNF) inoiemo reći da predstavlja stroži oblik 3NF. BCNF Je relevantna za one sheme relacija koje imaju viie sastavljenih kandidata klJuSa koji se međusobno dJelomiSno prekrivaju. Shema relacije Je u BCNF ako 1 samo ako su sve njene netrivijalne FZ ujedno FZ od kandidata ključa. Problematiku vezanu za BCNF iznijeti demo na primjeru trojne veze. Najprije Je data trojna veza koja - prevedena na relacijski Jezik daje shemu relacije koja Je u 3NF, ali J u BCNF. Zatim Je taj primjer modificiran tako da shema relacije jeste u 3NF, ali ne i u BCNF. Na slici 7 dat je ER model podataka koji predstavlja sVljedede tvrdnje: 1. Jedan nastavnik za Jedan predmet koristi jedan udžbenik. 2. Jedan predmet prema Jednom udžbeniku predaje viäe nastavnika. 3. Jedan udžbenik jedan nastavnik koristi za jedan predmet. Vezu KOLEGIJ zapisujemo na relacijskom Jeziku slijededom shemom relacije: K0LEGIJ(6-PRED, 6-NAST, S-UDŽ) Da su <8-PRED, S-NAST> i kandl-da- ti kljuäa slijedi iz tvrdnje (1) odnosno (3). Odatle Blijede i FZ: FZl: e-PRED, S-NAST --> S-UD2, FZ2: S-NAST, S-UDž --> g-PRED. Obzirom da shema relacije KOLEGIJ nema neklju-čnih atributa, ne može imati ni tranzltlvnlh zavisnosti takvih atributa te se stoga nalazi u 3NF. Nadalje, obzirom da su obje iznad date netrivijalne FZ od kandidata ključa, shema relacije KOLEGIJ nalazi se i u BCNF. Izmijenimo sada treću od iznad datih tvrdnji, koja neka glasi: 3'. Jedan udžbenik koristi se samo za jedan predmet. Tvrdnja (3") Implicira tvrdnju (3). Naime, ako se jedan udžbenik koristi samo za Jedan predmet (tvrdnja 3'), onda to mora poštivati svaki nastavnik (tvrdnja 3). Tvrdnju (3') - samu za sebel - predstavili bi u ER modelu podataka binarnom vezom. Međutim, u kontekstu tvrdnji (1) i (2) predstavljamo ju u okviru trojne veze, točno kako Je to učinjeno ER modelom podataka na slici 7. Odatle slijedi da je 1 shema relacije - nazovimo Ju KOLEGIJI -za vezu datu tvrdnjama (1), (2) i (3') Jednaka pređainjoj shemi relacije KOLEGIJ. Dakle, K0LEGIJ1(6-PRED, 6-NAST, a-UD2) Međutim, na tim shemama ne vrijede isti skupovi FZ. Obzirom da je tvrdnja (1) ostala nelzmje-njena, očito ovdje vrijedi FZl. Nadalje, obzirom da (3') implicira (3) vrijedi i FZ2. No, iz tvrdnje (3') slijedi 1 FZ3: e-UDa --> S-PRED koja ne vrijedi na shemi relacije KOLEGIJ. Sliki 7. Na slici 8 dat je primjer legalne ekstenzije relacije KOLEGIJI. Ekstenzlja je legalna Jer su njome zadovoljene sve tri iznad date F2. Nadalje, obzirom da vrijednost atributa S-UD2 ne "adreslra" samo Jednu n-torku iz relacije KOLEGIJI, taj atribut očito nije kandidat ključa. A to, nadalje, znači da FZ3 nije funkcijska zavisnost od kandidata ključa, tako da ni shema relacije KOLEGIJI nije u BCNF. KOIESIJI J-PREt J-NftST J-U« Pl m Ui Pl - H2 Ul P2 NI U2 PJ n U2 Bilki S. S druge strane, ta shema relacije jeste u 3NF, jer nema neključnlh atributa. Međutim, prlmje-tlmo da se u relaciji KOLEGIJI - uprkos 3NF -javlja redundanca (a sa njom 1 anomalije održavanja). Ta redundanca je posljedica od FZ3. Nalme, ako sam udibenlk determinira predmet, onda se podatak o udibenlku ne bi trebao zapisivati toliko puta koliko nastavnika predaje taj predmet (kako je to slučaj u relaciji KOLEGIJI). Postavlja se, stoga, pitanje Sto uélnltl sa shemom relacije KOLEGIJI. Ü datom primjeru, preporuäljlv odgovor glasi nJita. Dakle, zadržati ju u modelu podataka takvu kakva jeste. Nalme, shemu relacije KOLEGIJI nlje moguće dekomponlratl bez gubitka FZ. Pogledajmo.to. jer sadrži FZ koje nisu od kandidata ključa. BCNF modela podataka možemo doseći dekompozlci j.om sheme relacije NARUD2BA1 prema e-KUP --> IME-KUP koja nije od kandidata ključa) na: Rl(e-KUP, IME-KÜP) R2(S-KUP, S-RRT, KOLIČINA). Ta dekompozicija je bez gubitaka informacija i bez gubitaka funkcijskih zavisnosti. Naime: - treća 1 četvrta FZ Iz F definirane su na Rl, - prva FZ iz F definirana je na R2, - druga FZ iz F logički slijedi iz treće 1 prve FZ. Međutim, shema relacije NARUDžBAl slijedi iz grube greike u ER modelu podataka. Naime, svojstvo IME-KUP očito pripada entitetu KUPAC a ne entitetu NARUDŽBA. Dakle, sam korektan ER model podataka doveo bi do shema relacija Rl i R2, a ne do sheme relacije NARUD2BA1. Dakle, sheme relacija Rl i R2 - formirane iz sheme relacije NARUDžBAl - pokazuju da iz korektno izrađenog ER modela podataka slijede sheme relacija koje nisu samo u 3NF već 1 u BCNF. Polazeći od "kritične" FZ3, shemu relacije KOLEGIJI možemo - bez gubitka Informaci ja 1 dekomponlratl na: Rl(6-UD2, 8-PRED) R2(fl-NAST, 6-UD2) . Međutim, tom dekompozicijom izgubljene sz FZl 1 FZ2. Naime, od FZ sa sheme relacije KOLEGIJI, ovdje vrijedi sano FZ3 (na Rl), Iz koje zacjelo ne slijeda FZl 1 FZ2. Dakle, takvu dekompoziciju ne valja Izvoditi, tako da prikaz veze KOLEGIJI ER modelom sa slike 7 Jeste korektan. Postoje sheme relacija koje jesu u 3NF a nisu u BCNF, i koje se dadu prevesti u BCNF bez gubitaka FZ. Pogledajmo primjer. Neka bude data shema relacije NARUDžBAl(S-KUP, IME-KUP, 6-ART, KOLIČINA) i neka skup F sadrži slijedeće FZ: S-KÜP, e-ART --> KOLIČINA, IME-KUP, e-ART --> KOLIČINA, IME-KUP --> 8-KÜP, S-KUP --> IME-KUP Shema relacije NARUDžBAl jeste u 3NF jer (jedini) neključnl atribut KOLIČINA ne zavisi tran-zitivno. Međutim, ta shema relacije nije u BCNF Ako pak ER model podataka ne daje shemu relacije u BCNF (slučaj sheme KOLEGIJI), onda valja provjeriti da 11 se takva shema uopće dade dekompnlrati (bez gubltakal) na sheme relacija koje jesu u BCNF. Ovdje nemamo formalnog dokaza da takva dekompozicija nije moguća. No, nije nam poilo za rukom naći primjer u kojem bi takva dekompozicija bila moguća. 7. Višeznačna zavisnost Neka bude data shema relacije R 1 neka X, y 1 Z budu podskupovl od A(R). Na shemi relacije R vrijedi VZ X -->> Y ako 1 samo ako skup vrijednosti atributa iz Y za dati par vrijednosti atributa iz X 1 Z zavisi Isključivo od vrijednosti atributa iz X, a nezavisan je od vrijednosti atributa iz Z. Kažemo da X vlieznaóno determinira Y, odnosno da Y vi šeznaino zavisi od X. Grafički prikaz VZ dat na slici 9, čini gornju definiciju razumljivijom. PREDHET NASTAVNIK UDtBEIlIK (u bazi), a time 1 do anomalija održavanja. Pogledajmo primjer. Na slici 10 dat je primjer ekstenzije relacije K0LEGIJ2, čija shema glasi KOLEGIJ2(6-PRED, g-NAST, 6-UDŽ) K0LE6IJ2 Slikl 9. S-PRED J-NAST J-UtJ FiiUi Diiekrit Oinovf FUUi ttiokrlt Principi FtlUl Nnton Oinovi Fiiiki Niifton Principi Na slici 9 date su dvije VZ, i to: PREDMET -->> NASTAVNIK PREDMET -->> UDŽBENIK. VZ se javljaju u "parovima". Vrijedi slijedeće pravi lo : Ako na shemi relacije R vrijedi VZ X -->> Y onda vrijedi i VZ X -->> R minus (X unija Y). Dakle, ako skup atributa vlSeznačno determinira skup atributa Y, onda X viSeznaSno determinira i skup preostalih atributa sheme relacije R. To obično zapisujemo na slijedeći način: X -->> Y R minus (X unija Y) VZ koje su zadovoljene u svakoj eksten-ziji relacije nazivamo trivijalnima. Na primjer, u svakoj ekstenzij binarne relacije R(X, Y) zadovoljene su trivijalne VZ X -->> Y 1 Y -->> X. Naime, svakoj vrijednosti atributa X pripada skup (od jedne ili viie) vrijednosti atributa Y, i obrnuto. VZ predstavljaju generalizaciju FZ. Pod time podrazumijevamo da svaka FZ - po definiciji 1 jeste ujedno i VZ. Naime, FZ oblika X --> Y je ujedno VZ kod koje skup vrijednosti atributa Y za Jednu vrijednost atributa X sadrii samo Jedan element. Siliti 10. Shema relacije K0LEGIJ2 jeste u BCNF jer na njoj nije definirana nikakva netrivijalna FZ. Međutim, relacija K0LEGIJ2 sa slike 10 očito sadrii redundancu: zapis o tome da pojedini nastavnik predaje pojedini predmet javlja se toliko puta koliko udžbenika ima za taj predmet. Analogno vrijedi i za udžbenike. Kao i FZ, VZ daju osnovu za definiranje (nove) normalne forme, odnosno dekompon1ranje shema relacija, u cilju otklanjanja redundance i anomalija održavanja. Shema relacije je u četvrtoj normalnoj formi (4NF) ako i samo ako su sve njene netrivijalne VZ ujedno FZ od kandidata ključa. Jednostavnije rečeno, shema relacije je u 4NF ako je u BCNF i ako nema "pravih VZ". (Dakle, ako nema VZ koje nisu FZ.) Naime, ako shema relacije nema "pravih VZ", onda se zahtjev iskazan samom definicijom «NF svodi na zahtjev iskazan definicijom BCNF. (Dakle, da sve FZ budu od kandidata ključa.) Shema relacije KOLE(3IJ2 nije u 4NF. Naime, ona očito sadrii VZ S-PRED -->> e-NAST e-0D2, koje, nisu FZ (pa ni FZ od kandidata ključa). U cilju otklanjanja redundance i anomalija održavanja, shemu relacije koja sadrži VZ valja dekonponirati. 8. četvrta normalna forma Shema relacije koja se nalazi u BCNF 1/111 3NF može svejedno dovesti do javljanja redundance Shema relacije na kojoj vrijedi netrivijalna VZ oblika X -->> Y dekomponira se bez gubitaka Informacija na sheme relacija R1 i R2, pri čemu vrijedi : A(R1) = X unija Y A(R2) = A(R) minus Y. Polazeél od datlh VZ, shema relacije K0LEGIJ2 dekomponira se na sheme relacija RKfl-PRED, 6-NAST) R2(Ž-PRED, a-UDžl. Na slici 11 date su ekstenzije relacija RI i R2, dobivene projekcijom ekstenzije relacije K0LEGIJ2 na skupove atributa A(R1) odnosno A(R2). J-PRED S-NAST R2 J-PRED >-UI)< Flilki Diiokrlt Fltlki Oinovi rillki Niaton Flllki Principi Sliki 11. Sheme relacija R1 i R2 jesu u 4NF. Naime, na tim shemama vrijede samo trivijalne VZ. Ranije Isticana nuinost očuvanja zavisnosti kod dekompozicije, odnosi se i na višeznačne zavisnosti. Formalni prikaz toga problema prelazi potrebe ovoga prikaza. Stoga, navodimo samo slijedeće (praktičko) naželoi Sllkl 12. vezani u Jednu vezu, ne opisuje stvarno stanje stvari u fragmentu realnog svijeta Siju strukturu modeliramo. Drugim rijeòima, javljanje VZ na shemi relacije Je posljedica nekorektnost i ER modela podataka sa slike 12. Naime, ako vrijede iznad date VZ, onda stvarno stanje stvari ne opisuje taj model, već ER model podataka sa slike 13. PREOAJC PREDHET »ASTAVNIK KITERATURA UDtBEMIK Ako se dekompozicija izvodi polazeći od netrivijalne VZ definirane na polaznoj shemi relacije (kako je iznad učinjeno), onda dekompozicija neće dovesti do gubitka VZ. Činjenica da neka shema relacije jeste u BCNF a nije u 4NF ukazuje na evidentna grešku u EP modeJu podataka, iz kojeg takva shema relacije slijedi. Na primjer, shema relacije K0LEGIJ2 očito slijedi iz ER modela podataka datog na slici 12. Napomenimo da tip povezanosti mnogo svih triju tipova objekata koji tvore vezu slijedi Iz toga ito je ključ sheme relacije K0LEGIJ2 sastavljen iz identifikatora svih trlju tipova objekata. "Otkriće" VZ e-PRED -->> S-NAST a-ÜD2 na shemi relacije K0LEGIJ2 ukazuje na to da su tipovi objekata NASTAVNIK i UDŽBENIK međusobno potpuno nezavisni. A to onda znači da trojna veza K0LEGIJ2, kojom su ti tipovi objekata po- Sllki 13. Iz potonjeg ER modela očito direktno slijede sheme relacija R1 i R2 (tj. PREDAJE 1 LITERATURA), koje su ranije bi le generi rane dekompozicijom sheme relacije K0LEGIJ2 dobivene iz nekorektnog ER modela podataka. Dakle, normalizacija ima i u ovom slučaju samo ulogu otkrivanja (l/ili otklanjanja) greiaka učinjenih u procesu izrade ER modela podataka. Naravno, veza K0LEGIJ2 mogia bi biti 1 trojna. Međutim, u tom slučaju na pripadnoj shemi relacije K0LEGIJ2 ne bi vrijedile iznad date VZ. Dakle, tada bi shema relacije K0LEGIJ2 bila ne samo u BCNF već i u 4NF. A to znači da bi i ER model podataka sa slike 12 tada dao shemu relacije koja jeste "u optimalnoj normalnoj formi". 9. Zavisnost spajanja Postoje relacije koje se ne mogu dekomponirati bez gubitka informacija na dvije relacije, ali se mogu bez gubitka informacija dekomponirat i na tri ili viie relacija. Pogledajmo primjer. Neka bude data shema relacije K0LEGIJ3(e-PRED, 6-NAST, S-ODŽ) na kojoj nisu definirane nikakve netrivijalne V2. Dakle, ta shema relacije Je u 4NF. Međutim, ekstenzija relacije K0LEGIJ3 data na slici 14 sadrži redundancu. pn k0lhij3 J-PRED !-NSST »-U01 Pl NI U1 Pl NI U2 Pl N2 U1 P2 NI U1 smki 14. Na primjer, podatak da nastavnik NI predaje predmet Pl zapisan je toliko puta koliko udžbenika taj nastavnik koristi za taj predmet. Međutim, obzirom da nastavnik N2 za isti predmet ne koristi isti skup udžbenika kao i nastavnik NI, na shemi relacije K0LEGIJ3 ne vrijedi VZ fi-PRED -->> S-NAST fi-0d2. Odatle slijedi da tu shemu relacije nije mogude dekomponirati na dvije sheme relacija (na temelju VZ), kako je to užinjeno sa shemom relacije K0LEGIJ2. Međutim, na slici 15 pokazano je da se ta relacija dade dekomponirati na trJ relacije, i to bez gubitka Informacija. Relacija PNU, dobivena spajanjem relacija PN i NO sadrži jednu suvlžnu n-torku. Drugim rljeSi-ma, rezultat tog spajanja sadrži 1 tvrdnju da za predmet P2 nastavnik NI koristi i udžbenik 02, Sto - prema relaciji K0LEGIJ3 - nije istina. Dakle, dekompozicija relacije K0LEGIJ3 samo na relacije PN i NU dovela bi do gubitka informa- cijskog sadržaja relacije K0LEG1J3. Međutim, spajanje relacije PNU sa relacijom PU daje toino polaznu relaciju K0LEGIJ3. Obzirom da na shemi relacije K0LEGIJ3 nisu definirane nikakve netrivijalne VZ, tro- J-PREO J-NA8T NU !-NA8T J-UtJ PU 5-PRED J-UDJ Pl NI NI Ul Pl Ul Pl N2 NI U2 Pl U2 P2 NI N2 Ul P2 Ul pnü ■ p* bpooi hu PNU prtdodin! «-PRED J-HS8T J-UDl Pl NI Ul Pl NI U2 Pl N2 Ul P2 NI Ul P2 NI U2 kolebuj • pk« spoji pu I kolegij: S-PREt i-MS »-UDI Pl NI Ul Pl m U2 Pl N2 Ul P2 NI Ul sllki 19. dekomponibiInoBt te relacije ne može biti formalno utemeljena (obrazložena) na osnovu takvih zavisnosti. Stoga je uvedena (definirana) nova vrsta zavisnosti: zavisnost spajanja. Na shemi relacije R vrijedi zavisnost spajanja (ZS) •(Rl, .. . , Rn) ako i samo ako je Rl, ..., Rn dekompozicija od R bez gubitka informacija. Zavisnost spajanja '(Rl, ..., Rn) nazivamo trivi jalnom ako je R jednaka nekoj od Ri, 1 =< 1 =< n. Zavisnost spajanja omogućava da se iznad dato svojstvo tro-dekomponlbiinosti (odnosno, općenito n-dekomponlblInosti) definira na razini sheme celaci je. Za promatrani primjer to èinimo pomoću slijedeće zavisnosti spajanja: *(PN, NU, PU). Naravno, time smo ujedno definirali klasu legalnih eksterni Ja relacije K0LEGIJ3. Stoga, takvu ZS ne valja definirati samo na temelju jedne ekstenzije relacije, već to smijemo uSiniti samo ako ta ZS iskazuje stvarne odnose u realnom svijetu. O tom (problemul) biti će viie riječi u slijedećem odjeljku. Iz definicije ZS slijedi da su FZ i VZ samo posebni slučajevi ZS. Već je ranije pokazano da je FZ samo poseban slučaj VZ. Nadalje, pokazano je da se shema relacije oblika R(X, Y, Z) može dekomponlrati (bez gubitka informacija) na sheme relacija R1(X, Y) i R2(X, Z) ako na shemi relacije R vrijedi VZ X -->> Y I Z. Prema definiciji ZS, to ujedno znači da na shemi relacije R vrijedi ZS »(XY, XZ). Dakle, gornja VZ dade se na ekvivalentan način izraziti kao ZS. S druge strane, očito postoje ZS koje nisu VZ. Takva je, na primjer, ZS *(PN, NU, PÜ), definirana na shemi relacije K0LEGIJ3, na kojoj nije definirana nikakva VZ. Iz definicije ZS slijedi da je to najopćenitiji mogući oblik zavisnosti, sve dok se sheme relacija dekomponiraju primjenom operacije projekcije 1 regeneriraju primjenom operacije spajanja. Kao i prethodne zavisnosti, ZS daje osnovu za definiranje nove (a u kontekstu projekcije-spajanja i najviie moguće) normalne forme. 10. Peta normalna forma Shema relacije na kojoj su definirane netrivijalne ZS moie se dalje dekomponlrati, i to upravo na one sheme relacija koje su date samim zapisom ZS. Cilj takve dekompozicije jeste i. doseći petu normalnu formu shema relacija, kao najviiu moguću normalnu formu, koja definitivno ne dovodi do nikakvih redundanci ni anomalija održavanja. Shema relacije R nalazi se u petoj normalnoj formi (5NF) ako i samo ako za svaku netrivijalnu ZS oblika •(Rl, ..., Rn) definiranu na R vrijedi da je svaki A(Ri), 1 =< i =< n, suparključ od R. Jednostavnije rečeno, shema relacije je u 5NF ako se ne da "suvislo dekomponlrati". Pokušajmo to ilustrirati na primjeru sheme relacije OSOBA(MAT-BROJ, IME, GOD-ROD). Takvom shemom relacije može u relacijskom jeziku biti predstavljen objekt OSOBA iz ER modela podataka. Obzirom da je MAT-BROJ ključ sheme relacije OSOBA, na toj shemi relacije zacijelo vrijedi FZ MAT-BROJ --> IME. Polazeći od te FZ, shemu relacije OSOBA može se dekomponlrati (prema ranije iznijetom principu) na sheme relacija OSOBAl(MAT-BROJ, IME), 0S0BA2(MAT-BROJ, DAT-ROD). Prema definiciji ZS, to ujedno znači da na shemi relacije OSOBA vrijedi ZS *((MAT-BROJ, IME), (MAT-BROJ, DAT-ROD)). Međutim, prlmjetimo da je ključ sheme relacije OSOBA (tj. MAT-BROJ) sadržan u obje sheme relacija generirane dekompozicijom. Dakle, A(OSOBAl) 1 A(0S0BA2) Jesu superkl Jučevi^ sheme relacije OSOBA. A, prema definiciji 5NF, to znači da relacija OSOBA Jeste u 5NF. Dakle, kao što je objekt OSOBA Jedan entitet ER modela podataka, tako je i shema relacije OSOBA u 5NF, ' te ne postoje formainJ razlozi za njeno dalnje dekomponlranje. S druge strane, .shema relacije K0LEGIJ3, uz pretpostavku da na njoj vrijedi ZS *(PN, NU, PÜ), nije u 5NF. Naime, ključ sheme relacije K0LEGIJ3 je trojka atributa , tako da A(PN), A(NU) i A(PO) nisu superk1JučevI sheme relacije K0LEGIJ3. Stoga je ta shema relacije mogla biti "suvislo dekomponi rana" na sheme relacija PN, NU i PU. ^^Sheme relacija PN, NU 1 PU jesu u 5NF Jer se binarne relacije ne da netrivijalno dekomponlrati, tako da na njima nema netrivijalnih ZS. Stoga, svaka binarna relacija Jeste u 5NF. Postavlja se pitanje da 11 bi (i na temelju čega) i iz ER modela podataka slijedile sheme relacija P'N NU i PU, a ne shema relacije K0LEGIJ3. Prije odgovora, izneslmo slijedeća činjenice o 5NF. Obzirom da Je ZS najopćenitiji oblik zavisnosti, i 5NF najviša normalna forma, mogli bismo zaključiti da ZS 1 5NF imaju dominantnu ulogu u oblikovanju modela podataka. No, tome nije baš tako. Na primjer, u nalazimo tvrdnju da se BNF (a time ni 2S) u praksi "općenito ne koriste". Razlog za to iznosi.Date: "... dok je otkrivanje FZ 1 VZ relativno jednostavno, Isto se ne bi moglo reći za ZS (koje nisu VZ), jer je intuitivno značenje ZS daleko od Jednostavnog. Stoga je i postupak utvrđivanja kada data relacija jeste u 4NF ali nije u 5NF (te bi mogla biti dalje de-komponirana) joS uvijek nejasan." (Podcrtao M.R.) Odatle direktno slijede i sheme relacija PN, NU 1 PU. Međutim, problem 6NF je u njenom otkrivanju. Točnije: u otkrivanju ZS koje nisu VZ. A taj problem je jednako prisutan u formalizmu normalnih formi kao 1 u (bitno) jednostavnijem ER jeziku. Dakle, i s tog aspekta, ER jezik nudi mogućnost Izrade jednako kvalitetnog modela podataka kao i "sofisticirani" proces normal 1 zaci je . Napomenimo da je u gornja tvrdnja (Iz prethodnog izdanja Iste knjige) neito ublaiena; umjesto "daleko od jednostavnog" nalazimo "može ne biti očito'. Međutim, zaključak o nejasnoći postupka otkrivanja ZS ostaje nelzmjenjen. Date, nadalje, sugerira mogućnost da su relacije koje jesu u 4NF a nisu u 5NF "patološki slučajevi" koji se, izgleda, rijetko javljaju u praksi. Taj stav, zajedno sa ranije rečenim o BCNF 1 4NF, potkrepljuje ocjenu da Je sa praktičkog aspekta 3NF najvažnija. Naime, ako je ER model podataka korektno izrađen (i ako relacija nije "patološka"), onda će sheme relacija generirane Iz tog ER modela koje jesu u 3NF biti ujedno i u BCNF, 4NF 1 5NF. Stoga, ne izgleda primjerenim reći da se 5NF "u praksi općenito ne koristi", već da se (u praksi) ne pokušava eksplicitno doseći. Međutim, to ne znači da sheme relacije generirane iz dobro oblikovanih ER modela podataka nisu i u 5NF. Iznad rečeno ujedno ukazuje na odnos ER modela 1 5NF. Naime, ako se uspije otkriti zavisnost spajanja poput "(PM, NU, PU) onda Ista moio biti Iskazana tako da se umjesto trojne veze K0LEG1J3 (koja Je po formi jednaka vezi K0LEGIJ2 sa slike 12), ER modelom iskažu tri binarne veze, kako je to učinjeno na slici 16. P* REFERENCE Brackett, H.M.: Developing Data Structured Databases, Prent i ce-Ha 1 1 , 1987 . Chen, P.P.: The Entity-Relationship Model -Toward a unified Vie« of Data, ACK Trans, on Database S., No.l, 1976. Date, C.J.: Introduction to Database Systems, Voi I, Addlson-Wesley, 19S6. Korth, F.H., S11 ber schatz, A.: Database System Concepts, McGraw-Hill, 1986. Maler, 0.: The Theory of Relational Databases, Comp. Sci. Press, 1983.