f f f f i n f 0 r m a t i f: a 2 YU ISSN 0350-5596 L Časopis Izdaja Slovensko društvo Informatika, BlOOO Ljubljana, Parmova 41, Jugoslav i ja ČASOPIS 2A TEHNOLOGIJO RAČUNALNIŠTVA IN PROBLEME INFORMATIKE ČASOPIS ZA RAČUNARSKU TEHNOLOGIJU I PROBLEME INFORMATIKE SPISANIE ZA TEHNOLOGIJA NA SMETANJETO I PROBLEMI OD OBLASTA NA INFORMATIKATA Uredniški odbor: T. Alekslć, Beograd; D. Bitrakov, Skopje; P, Dragojlovic, Rijeka; S. Hodžar, Ljubljana; B. Horvat, Maribor; A. MandSiö, Sarajevo; S. Mlhalie, Varaždin; S. Turk, Zagreb YU ISSN 0350^596 LETNIK 12,1988-ŠT. 2 Glavni in odgovorni urednik: prof. dr. Anton P. 2eleznlkar - . . ' -J - A V i Tehnični «urednik: dr. Rudolf Murn ZaložnfSki svet: ,, T. Banovee, Zavod SR.Slovenije za_statisti ko, Voearski pot 12, 61000 Ljubi jahši; A. Jerman-Blažt e'. Iskra Telematika, Kardeljeva plosead 24, SlOOO LJubljana; . B. KlemenCiö, Iskra Telematika, 64000 Kranj; S. Saksida, institut za sociòlogijo Univerze Edvarda Kardelja,' 61000 Ljubljana; , . J. VirTnt, Fakulteta zat elektrotehniko, Trzaäka 25. G1Đ00 Ljubljana. ,- Uredništvo in uprava; Časopis Informatika, Iskra Delta,' Stegne Ì5B, 61000 Ljubljana, telefon (061) 574 554; teleks 31366-;:YU f Delta. , Letna naroönina za delovne organizacije znaša 24000 din, za zasebne naročnike 6000 din, za Studente 2000 din; posamezna Številka 8000 din. Številka 2iro računa: 50101-678-51841 Pri fInane i ranju ćasopi sa sodeluj e Razi skovalna skupnost Slovenije - ; . Na podlagi mnenja RepublTskega komiteja za informiranje fit. 23-85, z dne 29. 1. 1986, je Časopis oproščen temeljnega davka od prometa proizvodov. Printed Tiskarna Kresija, Ljubljana, in March, 1988 v s E B INA A.P.Zeleznikar 3 O razvojni potudi raftunalni- äke-iindustrije M. Cams 12 Approaching the Linit of M. DrobniO . Classification Accuracy N. Cuid B. Zalik . 18 Standards in Computer Graphics - . J, Györkös 24 A Survey of Host Important I, Rozman and Outstanding Methods ■for Tatjana Welzer - Software Engineering ; A.P.Zeleznikar 31 Problems of the Rational Understanding of Inforoation L, M, Patnaik 47 A Critique on Parallel Compu-R.Goviiidarajan ter Architecture M. Spegel J. Silc . ' ;, D. Surla Nonika K^apus-Kolar A. M. Jenkins J. V. Carl is 65 The Relation between a Point and a.Simple Polyhedron 69 Deriving Protocols from Services In the Finite State Machine Representation 76 Control Flowcharting for Data Driven Systems P. Kolbessen 83 Paralelni vektorski procescr- S, Mavrió ji in njihova uporaba I B. Mlhoviloviä ■V. Vouk 94 Veöopravilno okolje za delo v J. Ferbežar realnem Času na rafiunalniku A. Brodnik IBM-PC ' I. Kononeako 102 Terminologija programiranja v Nada Lavraù Prologu J, JlugelJ 107 Sinhronizacija v porazdeljenih računalniških sistemih L, Vuga 114 Matematično sodilo o možnosti strojne inteligence info Published by Informatika, Slovene-SoaTèty for' Informatics, Parmova 41, 61000 Ljubljana, Yugoslavia Editorial Bqarđ„. : _ T. Alckstö, Beograd} D. Bitrakov, Skopje; P. Dragojlovlö, Rijeka; S. HodSar, Ljubljana; B, Horvat, Maribor; A. MandZiCi, .Sarajevo;^ S. MihaliC, Varaždin; S. Turk, Zagreb JOURNAL OF COMPUTING AND INFORMATICS YU ISSN 0350-5596 VOLUME 12,1988-No. 2 Edi tor-ln-Chief : Prof. Dr. Anton P. Zeleznikar Executive Editor': ' ' ' Dr. Rudolf- Mu'rn" : • Publishing Council: T. Banovec, Zavod SR Slovenije za statistiko, Vožarskt pot.12, 61000 Ljubljana; A-.-'Jerinan-BlaZi«, Iskra Te lema tika, Kardeljeva plosead 24 61000^ Ljubi Jana ; ■ .. v B..Klenienöiö, Iskra Telematlka, 64000 Kranj;' ' S..,Saksida, Institut za- sociologi jo Univerze Edvarda Kardelja,: 61000 Ljubljana J. Virant, Fakulteta za elektrotehniko, Tr2aska-25, 61000 Ljubljana. .. .. . /" .Headquarters^; .. Informatica, ; . Iskra Deità, Stegne 15B, 6TÖ0Ö"" Ljubljana, Yugoslavia. Phoné; 61 57 45 54. Telex; 31366 yu' delta . " " Annual Subscription ;Rate: "uSi 30 companies, and US$ 15'for Individuals." for Opinions Expressed in the Vontributions are .not' necessarily shared by the,Ed i tor iaTBoard Tisk: Tiskarna Kresijà, Ljubljana, v marcu 1988 • CONTENTS A,P.Zeleznikar 3 On the Developmental Inicia- tive of the Computer Industry' M, Gams iz Approaching the Limit of M. Drobniß Classification Accuracy N, Guid B. ZAlik 18 Standards in Computer Graphics J. Gyärkäs 24 A Survey of Most Important I. Rozman and Outstanding Methods for Tatjana Welzer Software Engineering A . P. Zeleznikar 31 Problema of the Ratibnaì Un-^ derstariding of Information L. M. Patnaik 47 A Critique on Parallel Compu-R.GovlndaraJan ter Architecture M. Spegel T J. Silt:, D. Surla Monika Kapüa-Xolar A, M. Jenkins J. V. Carl is 65 The Relation between a Point arid a Simple Polyhedron 69 Deriving Protocols' from Services in the Finite State Machine Representation 76 Control Flowcharting for Data Driven Systems P, Kolbezen 83 Parallel Array Processors and S. Mavrió Their Application I B, Mihovilovid V.'Vouk J. Ferbežar A. Brodnik 94 Multitasking Real-time Environment "for the IBH-PC' I. Kononenko 102 Terrainology of Prolog Pro-Nada Lavrdä graiitining J. BugelJ 107 Synchronization in Distribu- ted Computing Systems I. Vüga 114 A Mathematical Judgement on ' possibility of Machine Intelligence o RAZVOJNI POBUDI RAČUNALNIŠKE INDUSTRIJEV INFORMATICA 2/88 Anton P. Železnikar UDK 681.3.001.2 Iskra Delta Ta tlanek je avtorjev govor ob otvoritvi Deltinega razvojno-proizvod-nega centra v Iskrini tehnološki dolini v Stegnah pri Ljubljani. Članek poudarja vefikot tridesetletno zamisel avtentične strukture in organizacije računalniške tovarne, ki nI le privesek neke raznorodne, računalništvu tuje industrije, temveC je sodobna proizvodna zmogljivost, ki obsega dovolj širok in tehnoloèko zahteven računalniški proizvodni program, Ta proizvodnja naj bi bila podprta s sodobno zamišljenimi raziskavami in razvojem računalniških in informacijskih sistemov. Ze danes je Iskra Delta povezana z raziskovalnimi, razvojnimi in proizvodnimi srediSči doma in na tujem. V prihodnje naj bi bila nacionalna računalniška industrija tudi deleSna posebne skrbi in podpore vlade, industrije in akademije. Iskra Delta sproža tudi že novo industrijsko raziskovalno pobudo za sodobne tehnoloSke raziskave na področju računalniške znanosti, tehnologije in uporabe. Pod okrlljetn Slovenskega društva Informatika je bil ustanovljen t.i. Forum informationis, ki ga sestavljajo najvidnejši računalniški strokovnjaki in novinarji, z namenom, da bi lahko nudil javno in kritično podporo hitro rastoči domači računalniški industriji. Na koncu članka je sporočen poziv vodstvu in inženirjem Iskre Delte, ki naj bi pospešil intelektualno motivacijo in ustrezno poklicno reagiranje inženirjev. On the Developmental Iniciative of the Computer Industry. This paper presenta the opening speech, held by the author, at the occasion of the opening ceremony of the Iskra Delta's Researoh, Developmen t, and Manufacturing Center at the so-called Iskra Technology Valley in Stegne, near Ljubljana. The paper apostrophizes more than the thirty-years growing concept of authentic structural and organizational principles of a computer factory, which is not only an appendix of a heterogenous, to the field of computing strange industry, but is a modern manufacturing facility embracing sufficiently broad and technologically pretentious computer manufacturing program. This manufacturing should be supported by contemporarily imagined research and development of computer and information systems. Already today, Iskra Delta is we 11-connected with research, development and manufacturing places in the country and abroad. In future, the national computer industry should receive the necessary care and support from governmental authorities, industry and acaderaia. Practically, Iskra Delta is also giving the so-called industrial research initiative for modern technological and conceptional research in the area of computer science, computer technology, and to other computer-related application. In the framework of the Slovene Society for Informatics the so-called Forum Informationis, uniting the most distinguished and experienced computer professionals and journalists was established, to offer the public and critical support to the fast growing domestic computer industry. At the end of the article, a call to the managerial and professionally diverse engjneering staff of Iskra Delta is communicated, to rise the necessary intellectual motivation and the corresponding professional reaction. Slika 1 IzgotaJ). Na sliki je prikazan izgled proizvodno-razvojnega objekta Iskre Delte med izgradnjo. Sprednji del objekta (levo) je v glavnem namenjen raziskovalno-razvo jnemu delu, zadnji del (desno) pa pretežno proizvodnemu delu. V kletnih etažah se nahajajo skladiSöa, na vrhu zadaj pa infrastrukturne enote (kuhinja, menza, ambulanta, predavalnica). Slika 2 (spodaj). Ta slika ka2e proizvodna montažo in preizkušanje vefiproceaorskih mini-raäunalniä kih sistemov Gemini in Delta 4860, ko ae njihova izdelava te približuje konfini fazi. To so znani in že uveljavljeni računalniški sistemi Iskre Delte iz razreda super miniraöu-nalnikov, z lastnimi modutakimi aparaturniroi in prog ras kimi komponentami. Slika 3 (desno). Pri izdelavi modulov na tiskanih vezjih, ki je sicer avtomatizirana, so večkrat potrebne tudi dodatne ali korekturne ročne operacije. Tako se moduli dopolnjujejo oziroma popravljajo. Vse to pa zahteva vestno oziroma natančno delo. Slika 4 f apoda jV. Tiskano vezje z vstavljenimi in elektriöno povezanimi elementi se preizkuäa z uporabo posebne preizkuSevalne naprave, ki opravlja meritve in a tem preverja avtomatično ustreznost vezja. To je visoko učinkovito diagnosticiranje modula na tiskanem vezju, torej ugotavljanje kritičnih odstopov. SJlira 5 (zgoraj). Na aliki je prikazana proizvodnja mikroraCunalniS hih sistemov Triglav (stolpna ali obmizna izvedba) s 16-bitniroi procesorji Jll, M68010 in I80286. Te konfiguracije je mogoCe raznovrstno dopolnjevati z disketnimi, diskovnimi in trafinini enotami. Slika 6 (desno). Namizno izvedba mikroraćunainiskih sistemov Triglav je namenjena njihovi uporabi s funkcijo visoko zmogljivih delovnih poataj za različne namene (CAD/CAM, grafika za različne uporabne naloge, komunikacijske mre2e itd.). Sistemi Triglav uporabljajo lastne, tj. domaČe module s tremi razliónimi mi k roproce Bor ji na vodilu VME. To vodilo je industrijski standard. Tudi operacijski sistemi za računalnike družine Triglava so različni in omogočajo uporabo najSirSega spektra uporabniških programov. Ti óperacijski sistemi so Delta-M, Xenix, Unix in MS-DOS. ■ Some of the riskiest work we do is concerned with altering organization structures. Emotions run wild and almost everyone feels threatened. Why should they be? The answer is that if companies do not have strong notions of themselves, as reflected in their values, stories, myths, and legends, people's onJ.y security comes from where they live on the organization chart. Threaten that, and in the absence of some grander corporate purpose, you have threatened the closest thing they have to meaning in their business li ves. T.J. Peters and R.H. Waterman; In Search of Exellence, p. 77 (Harper & Row, 1981) Zamisel, ki stno Jo Jugoslovanski računalniški inženirji nosili vefi kot trideset let^ v svojem hotenju in znanju, se v teh dneh materialno uresničuje. Pred nami je prava in na lastnih tehnoloških in razvojnih temeljih/ na dolgotrajni avtentični zamisli' zgrajena in organizirana tovarna računalnikov, ki ni le privesek neke drugorodne, računalništvu mačehovske ali tuje industrije, temveč je sodobna proizvodna zmogljivost, ki obsega dovolj airok in tehnološko zahteven računalniški proizvodni program. Ta nova zmogljivost naj bi se podpirala tudi s sodobno koncipiranim! raziskavami in razvojem, ki naj bi zagotavljale novonastajajoče računalniške produkte, tj. informacijske in računalniške sisteme za svetovno konkurenčni plasma in ' Slavnostni govor ob otvoritvi proizvodno-razvojnega srediSCa Iskre Delte v Stegnah, dne 4. decembra 1987. ' Zamisli o oblikovanju slovenske računalniške industrije so nastajale že v drugi polovici 50-ih let. Na Institutu JoZefa Stefana so potekala interdisciplinarna predavanja, s katerimi se je de facto pripravljal tudi razvoj računalnika, ki naj bi ga proizvajala domača industrija. V 60-ih letih Je ta zamisel ugaäala, saj se je na Zavodu za avtomatizacijo pojavila licenčna montaža (neke vrste proizvodnja) računalnika Zuse Z23. ' Avtentična zamisel računalniške tovarne pomeni med drugim na lastnih možnostih in izkušnjah strukturirano in organizirano proizvodnjo, podprto s sodobno (lastno) računalniško avtomatizacijo tipa CIM (Computer Integrated Manufacturing) in tipa CIC (Computer Integrated Communication) , kot jo uvajajo sodobno organizirane tovarne v avtomobilski (General Motors) in računalniški industriji (npr. avtomatizacija proizvodnje mikroračunal-niSkih sistemov tipa PS in velikih računalniških sistemov pri IBM). uporabo. To Je vizija in odločenost ob tej otvoritvi in prav zaradi tega Selim kritično poudariti, da bosta svetovno konkurenčni plasma in uporaba mogoča le tedaj, če bo sodobno organizirana znanstveno-produkci j s k a veriga dobivala ustrezne tržno-strateške inpute in le če se bo vzpenjala na skrajne meje znanstvene in tehnološke i no va t i vnos t i ob izredno samokritičnem uresničevanju vseh dejavnosti sklenjene verige raziskav, razvoja, proizvodnje in trženja. Vsestranska znanstveno-tehnolofika 'povezanost Iskre Delte s svetovnimi in domačimi raziskovalnimi, razvojnimi, tehnološkimi, proizvodnimi in poslovnimi srediSči bo odločilna in mora neglede na trenutne krizne okoliščine slej ko prej ostajati, nastajati in se razvijati kot glavna strateška usmeritev računalniškega podjetja. TakSna usmeritev, ki edina zmore zagotavljati obstoj in prepotrebne razvojne impulze lastne in druge sodobne industrializacije na naSih tleh in v evropskih koordinatah, bo lahko noSena le a strokovno in intelektualno vrhunsko usposobljenimi kadri v vodenju, raziskavah, razvoju, proizvodnji in trženju. V naporih za svoje preživetje in razvoj se bo Iskra Delta vedno znova soočala s problemom zahtevne kadrovske kvalifikaciJe, izkušenosti in intelektualne prodornosti. Značilni kadrovski kompromisi danaànje stagnantnosti, neaktivnosti in disfunkcije ne bodo zadostovali niti za golo preživetje. Računalniška industrija in razvoj te industrije sleherne države sta danes pomemben integracijski dejavnik na področju narodnega gospodarstva, raziskav, razvoja, proizvodnje in trženja. Ta industrija je zato skrbno opazovana in varovana z državno regulativo, saj je izhodiščna za razrast druge industrije, malih podjetij materialne in programske proizvodnje, trgovine, nacionalnih šolskih, zdravstvenih in drugih infrastrukturnih dejavnosti. Računalniška industrija je temeljna podlaga prihajajoče informacijske epohe. Narodi ali mednarodne skupnosti, ki Se domišljajo in gradijo svoje vizije preživetja, morajo jasno in načrtovano predvidevati svoje vire informaciJskega znanja in tehnologije. Tudi funkcija današnje Iskre Delte postaja čedalje bolj raznovrstno integrativna na področju poslovnega povezovanja, postaja pa tudi bistveno inicializirajoča skladno in spontano s tistimi potrebami sodobnega tehnološkega in socialnega razvoja, ki zmore odločilno posegati v prestTukturi ranJ e papirnatih vrhunskih raziskav in tudi v navidezno oziroma nebistveno akademsko tehnološko ' napredovanje. Raziskovalna industrijska iniciativa Iskre Delte že oblikuje tisto bistveno polje razvojne relevantnosti, ki bo lahko podlaga ključnim preobratom v razvojni strategiji sozda Iskre in Gorenja in t.i. raziskovalnih skupnosti. Iskra Delta vendarle in to navkljub občutjem krizne brezizhodnosti odpira obetajoče in motivacijsko bistvene Slika 7 (zgoraj). Na tej sliki je viden del laboratorija za razvoj mikroraäunalniäkega sistema Triglav, in sicer za modul z 32-bitnini mikroprocesorjom Intel 80386. Iz slike je razvidno, kako so svetli laboratorijski prostori bogato opremljeni a najrazliònejèiml aparaturnimi pripomočki za kompleksen razvoj (načrtovanje, risanje, simulacija itd.) vezij in podsistemov. Preko posebnih terminalov je laboratorij povezan z visoko zmogljivimi miniracunalniskimi sistemi lastne proizvodnje (glej npr. sliko spodaj). Ti razvojni računalniki so locirani na veC mestih v proizvodno-raziskovalnem arediäöu Iskre Delte. Slika 8 (spodaj). RaCunalniäki center projektov za operacijske sisteme in super miniraćunalnike je le eden od petih računalniških centrov v proìzvodno-razvojnem arediSöu Iskre Delte v Stegnah. V ozadju je v sredini slike viden veC-procesorski rsCunalniSki sistem ID Gemini, s skupnim diskovnim obsegom 4,2 Gbyte. Ta sistem je namenjen med drugim tudi razvoju IDA orodij in lastnih operacijahih sistemov iz družine Deltix. Na levi strani slike je viden sistem VAX n/780, ki se uporablja pri razvoju super tniniraCunalnikov (za simulacijo, naCrto~ vanje in druge razvojne naloge). T^'iv- ■ : - " ■■ ■ „■'t-. '^.^■JS razvojne perspektive, ki jih letargiCno in nemotivirano okolje že sprejema s tihim odobravanjem, saj vidi in Cuti v njih svoj izhod, razvojno upanje in preatrukturirno zaupanje. V tem spoznavnem kontekstu je razvojna uspeSnost Iskre Delte izredno pomembna za obćutje ihdustri jakih in drugih infrastrukturnih razvojnih moSnosti jugoslovanskih nacionalnih in federalnih znanstveno-tehnoloökih strategij. V bližnjem razdobju se bo z nadaljnim Deltinem razvojem pojavila tudi v obeh sozdlb pa tudi v strokovni in SirSi javnosti bistveno nova, oživljajoča pobuda. Slovensko druStvo Informatika bo oblikovalo t.i. Forum informationis' , ki ga bodo sestavljali najvidnejši računalniški strokovnjaki in novinarji, kot kritično, problemsko in intelektualno usmerjeno telo za podroóje računalništva in informatike. Delovno polje relevantnosti tega telesa bo raznovrstna strokovna, raziskovalna, organizacijska in proizvodna pobuda. Hkrati bo Slovensko druätvo Informatika reformiralo svoj strokovni Časopis Informatica tako, da bo njegova vsebina bistveno povezana z raziskovalno in razvojno problematiko obeh aozdov, da ae bo skozi časopis razvijala tudi potrebna motivacija za ràziskovalnost, razvojnost, strokovnost in znanstveno-tehnoloéko obveSčenoat raziskovalcev in inženirjev na področju računalništva in informatike. Z odpiranjem Deltine tovarne se tako de facto zaöenja tudi bistveno nova, oživljajoča pobuda strokovne aktivnosti kot nujnost iti kot predhodnica in posledica industrializacije, kot oko in uho potrebne razvojne relevantnosti. Z vstopom Iskre Delte v računalniško industrializacijo se odpirajo nove možnosti za infornacijako-značilno in spremljajočo drugo industrijo v ožjem in äiräem okolju. Državni, gospodarski in politični centri odločanja naj bi take in drugačne, smiselne in integracijske pobude in tokove, ki so povezani z informatizacijo ključnih dejavnosti in z njihovim nujnim razvojem, z razumevanjem podpirali. Iskra Delta je eden tistih inovativnih gospodarskih poganjkov, ki je äe sposoben izvenkrizne rasti, motivacije in pozitivnih gospodarskih učinkov. Zaradi tega pričakujem, da bo politična moč podpirala prav svoj lasten, demokratičen razvoj v največji meri tudi tako, da ne bo postavljala umetnih zaprek Deltinemu razvoju tam, kjer bi ga bilo smiselno kvečjemu pospeševati. Seveda pa mora tudi Iskra Delta v prihodnje sprejeti strokovno in vsestransko odgovornost za skladen in ekonomsko učinkovit razvoj računalništva in informatike na naSih tleh in v SirSem evropskem in svetovnem prostoru. Podobno kot ae s tem govorom obračam na Sirèo slovensko in jugoslovansko javnost, se obračam Se posebej na Deltine inženirje. Predvsem od vas Deltini inženirji, od vaše strokovne zavezanosti, organiziranosti, brezkompromisar-skega izražanja vaakrSnjih kvalitativnih zahtev, od vaSega hotenja in raziskovalno-tehnoloSkega izpopolnjevanja je in bo odvisen razvojni in gospodarski potencial in uspeSnoat Iskre Delte. Slej ko prej se bo potrebno odpovedati konzervati vni in že danes razvojno neperspektivni tehnologiji' in osvajati naprednejšo, za vse nas napornejso in zahtevnejšo tehnologijo in znanje. Znani koncepti kopiranja in golega posnemanja ne bodo več zadostovali. Zato pričakujem, da bo vaSa strokovna in organizacijska glasnost glasnejša, strokovno bolj inte 1ektua1 i z i rana in konceptualno smelejSa, kot je bila doslej; da inženirji ne boste le nemočni opazovalci in nekritični sprejemniki tistih usmeritev, katerim se argumentirano potihoma ali vsaj ne dovolj glasno upirate. Inženirji * Forum informationis je latinskega izvora in pomeni trg {javno mesto) predočbe (informacije). Pomen foruma je dvojen: informacijsko strokoven , tj . utemeljen s strokovnostjo informacije kot informacije (računalnik je danes pojem za informacijski stroj) in javen, tj. kvalificiran za javno razpravo o računalništvu in informatiki. • Konzervativna te/inoJo^lja je v svojem bistvu subkulturno oblikovana (funkcionalno in '"oblikovno "dizajnirana"), modificirana, največkrat reducirana tehnologija, ki ne sledi več glavnim, tudi konjunkturnim razvojnim, tržnim in atandarizacij-skim trendom, se pa na značilen nesposobnosten način odeva v svoje zaščitne znake in ščiti tako le sama sebe pred nikomer. TehnoloSka konzervativnost je lahko le trenutna tržna poteza, ki jo je potrebno čimprej nadomestiti s tržno relevantno noviteto. ® Po otvoritvenem govoru' so se pojavili očitki, da je ta govor poziv Deltiniai inženirjem na upor. Moj odgovor na te očitke je bil in ostaja, da se Deltini inženirji dejansko morajo argumentirano in zavestno, torej sistematično in strokovno organizirano, z vso intelektualno silo, ki je vest in zavest sodobnega inženirja, upirati kriznemu napredovanju, ki grozi z razrušitvijo doseženega industrijskega, predvsem pa v tem kompleksu nujnega, vse drugo pogojujočega raziskovalno-razvojnega potenciala in s tem k razrušitvi določenega preživetvenega minimuma, Inženirji kot vlečni konji so moralno poklicani, da vlečejo skupaj z drugimi voz visokotehnoloSkega podjetja iz krize, da gradijo svojo moralno in strokovno motivacijo predvsem tudi na elementu vleke iz kriee. Brez te vleke, ki Je moralna in intelektualna. Je tudi sama populacija inženirjev obsojena na propadanje, na drsenje v letargijo in v skrajni posledici na izumrtje. Slika 9. V novih prostorih proizvodno-raz vojnega centra Iskre Dette v Ljubljani, v Stegnah 15b, je dobilo avoj prostor tudi uredništvo öasopiaa Informatica. Iz slike je razvidna funkcionalnost proatora, ki je razen 2 uredniškimi mizami in policami opremljen tudi a teriDinaloiD, mikroračunalnikom in pomožnim tiskalnikom. To je hkrati tudi delovni prostor glavnega urednika. Ob vsem tem je gotovo treba poudarili, da je Iskra Delta materialno In vsebinsko z vsestranskim razumevanjem podpirala delo Časopisa Informatica od leta 1980 naprej. Le pod takimi pogoji je bilo mogoöe vzdrževati utefieno, standarizirano in zanesljivo urejanje in izhajanje časopisa Informatica. Seveda Je pri tem nujno poudariti tudi urejeno razmero v Raziskovalni skupnosti Slovenije, ki je z vaakolelniiD financiranjem omogoCala praktično neproblematično izhajanje Časopisa. Okolje uredništva Časopisa Informatica je izjemno primerno, saj je uredništvo locirano v eamem erediäCu t.i. raziakovalno-raz vojne enote Iskre Delte; ta enota predstavlja bržkone jedro prihodnjega (korporativnega, tj. Iskrinega) instituta za raCunalnid tvo (in informatiko). Prav s tem pa se vpliv stroka in aktivne publicistične kulture v taki in drugaCni obliki prenaSa tudi na delo vrste raziskovalcev, razvojnikov in inaenirjev Iskre Delte. Uredništvo lahko na ta naCin tudi sistematično vzpodbuja in povezuje potencialne avtorje v Iskri Delti in izven nje. V nove prostore uredništva prihajajo tudi avtorji in Študentje iz drugih krajev, iz zunanjih institutov in jugoslovanskih univerz. Na ta naCin se smiselno oblikujejo interesne povezave nas ih in tujih avtorjev in strokovnjakov. (Nadaljevanje podnapisa k sliki 9.) Zanimivo je tudi, da krizne razmere ne vplivajo negativno na rast obsega, kakovosti in na distribucijo časopisa Informatica. Število prispevkov naraSCa in urejanje se ne sooCa veC a problemom premjahnega obaega, katerejra spodnja meja je bila ob ustanovitvi Časopisa v letu 1977 postavljena na 72 strani za eno Številko. Čedalje veCja je tudi pripravljenost avtorjev, da piftejo svoje prispevke v anglefikem jeziku, s Čemer se bistveno poveCuje komunikativnost Časopisa oziroma prispevkov posameznih avtorjev s tujino. V bližnji prihodnosti bo Časopis Informatica nekoliko spremenjen, NaCrtuje se sodobnejša oblika oziroma vsebina platnic. Z razvojem t.i. namiznega urejanja in izdajanja pa bi bilo mogoCe vpeljati tudi izpisovanje vseh tekstov z laserskim tiskalnikom. V tem primeru bi morati avtorji svoje Članke dostavljati v uredništvo (Se) na disketah oziroma z elektronsko poŠto. Na ta naCln bi dobili Časopis, ki bi bil v vseh ozlrih (In po moanostl ob nebistveno spremenjenih stroSkih) tudi tehnološko kvalitetno oblikovan. Ob veem tem je seveda potrebna tudi spremenjena uredniška politika, ki bi zagotavljala kvalitetni dotok prispevkov In morda tudi pogostnejSe izhajanje Časopisa (v prvi fazi bi bilo mogoCe brez posebnih naporov izdajati letno Sest namesto dosedanjih Štirih Številk v enakem obsegu). Na koncu tega zapisa velja omeniti, da je potrebno navajati novi naslov uredništva, ker se dogaja, da posta vraCa pošiljke, ki uporabljajo stari naslov. Novi naslov je: Časopis Informatica, Iskra Delta, Stegne 15Đ, 61000 Ljubljana. raziskovalni, razvojni, proizvodni, finanöni in prodajni - ste nosilci in vleòni konji visokotehnolodkega podjetja. Prav zato naj bi tudi vsi drugi - vsi delavci Iskre Delte in uporabniki Deltinih proizvodov - vlekli v vami in ne proti vam*. Današnja otvoritev računalniške tovarne in raziskovalno-razvojnega centra ob njej je praznik SirSega okolja neglede na to, kako se to v tem trenutku mani fest ira^ . To je zgodovinski korak v domafii industrializaciji računalništva. Zato končujem s spoštovanjem, priznanjem in z zahvalo vsem, ki so na tej dolgi poti načrtovali, pripravljali in nosili kamne gradnike' . Hvala! ' Manifestacija otvoritve računalni&ke tovarne Iskre Delte je pokazala značilno nerazumevanje in g tem soc!Ìa.lno izoliranost slovenskega univerzitetnega, znanstveno-institucionalnega, gospodarsko-zborničnega in političnega estalishmenta do dejanskega tehnolo&kega .razvoja in s tem do bistvenih preZivetvenih možnosti slovenske in z njo vred jugoslovanske populacije. Kljub prisotnosti najvišjih predstavnikov slovenske znanosti in umetnosti, njenega predsednika akademika Janeza Milčinskega, visokih vojaSkih osebnosti in vrhunskih posameznikov strok, je lahko Mija Repovž naslednjega dne resignirano ugotovila: "Ce bi pomen neke tovarne presojali po goatib iz političnega vrha, ki naj bi s avojo navzočnostjo otvoritvenemu ceremonialu dali pečat družbene skrbi in poaor/josti, Je bila za ta podalpski kot Jeseniška Jeklarna dogodek številka 1 zadnjih desetletij, novi razvojno-proizvodni center Iskre Delte pa epizodica, ki naj ne moti velikega duha ..." {Neelaatićna in soft pravila. Delo, stran 2, 5. decembra 1987). ' K načrtovanju, pripravljanju in noSenju kamnov gradnikov bi morali dodati fio trpljenje, tragično dinamiko in dramatizacijo epopeje hitre rasti podjetja Iskre Delte in same graditve tovarne in njenih brezizhodnih posledic. Vrsta vodilnih delavcev je bila postavljenih v iskanje izhodov in je občutila nečloveške, brezobzirne in skrajno nehumane (totalitarne) pritiske, pod katerimi je doživljala tudi svojo lastno deformacijo in razcep. Doživljala je svojo lastno nehumanost in izgubljenost (suicidnost), MiJa Bepovä* priznava Janezu Skrvbeju, generalnemu direktorju Iskre Delte vlogo jugoslovanskega Stevena Jobsa. Osebno sem prepričan, da bi vsak direktor, ki je s svojo osebno odgovornostjo zgradil tako veliko računalniško podjetje in s svojo äirino omogočal razmah vrhunskih industrijskih raziskav in razvoja^«, moral imeti na voljo vsaj en izhod, ki bi bil priznanje (ne nagrada) njegovemu požrtvovalnemu, izčrpljajočemu delu, v katerem nikoli ni videl tiste lastne koristi, ki so jo lahko zaradi njegovega dela pobirali drugi. Skrajno neetično in za tehniäko inteligenco nesprejemljivo je atalifiče, da lahko določena višja struktura, popolnoma birokratsko, post-totalitarno, po logiki svoje totalitarne samo-poklicanosti atigne ustvarajalno osebnost v poligon svojih brezobzirnih in ruftilnih manipulacij. • Mija Repovž: Računalništvo na nas način. Delo, Sobotna priloga, stran. 18 (5. decembra 1987) . 1® A. P. Zeleznikar: Parsys Expeditions to New Worlds. Informatica 11 (1987), No. 3, 76-80. V tem svojem članku sem opozoril na posebno, seveda odločujočo, atrateäko smiselno vlogo generalnega direktorja Iskre Delte, Janeza Skrubeja: "... Quite at the beginning, Iskra Delta has introduced and incorporated strategic thinking and acting in its managerial decision making. As a fast growing enterprise in the field of computer industry, it has been confronted not only with the very basic organizational and technological problems of modern computer industry of the developed hemisphere, but alao with specific problems of a technologically and even civi 1 i zationally (socially, ideologically) underdeveloped environment. In these times, up to this day, the general director of Iskra Delta - Mr, Janez SkrubeJ -was not only the real strategist of the company, but also the optimistic fighter, organizer, believer of the progress, and the carrier of several hard, arduous, and exhausting business situations and developmental processes of the company. And all of this in irregularly and unforeseeably changing circumstances of the underdeveloped hemisphere. Thus, it is to say, that he was the main initiator and supporter of the innovative-and independent Parsys project. APPROACHING THE LIMIT OF CLASSIFICATION ACCURACY INFORMATICA 2/88 UDK 519.226.3 Matjaž Gams, Matija DrobnIč Jožef Stefan Institute ABSTRACT. One of the recently developed systems for machine learning (GINGSYS) significantly outperformed all compared systems including theoretically optimal Bayesian classifier, which was the second in both tests. We tested several options in Bayesian classifier to investigate the teal cause for nonoptimal results and to estimate the upper limit in classification accuracy. The conclusion is that while it is possible to achieve even higher classification accuracy with suitable parameter adjustment in Bayesian classifier, it seems that GINESYS practically achieved the optimal classification accuracy. Sistem za empirično u&enje GINESYS je v praktičnih meritvah presegel primerjane^sisteme z Bayesovim klasifikatorjem vred. Podrobna analiza kaže, da je dosežene reeultate mogoče preseči, vendar so iže zelo blizu optimalne meje. 1. INTRODUCTION SWITCHES status Kachine learning is a quickly developing area of Artificial intelligence (Winston). According to the major inference type used it can be divided into rote learning, learning from instruction, learning by deduction, by analogy, from examples and from observation and discovery [Carbonell et al; Hichalski]. The scope of this article is learning from examples or Empirical Learning (EL). The aim of EL is to induce general descriptions of concepts from examples (instances) of these concepts. Examples are usually objects of a known class described in terms of attributes and values. The final product of learning are symbolic descriptions in human understandable forms. Induced descriptions of concepts, representing different classes of objects, can be used for classifying new objects. EL systems basically perform the same task (classification) as statistical methods and can be directly compared to them from the point of classification accuracy. On the other hand, EL systems offer further advantages, namely a) explanation during classification of new examples and b) the insight into the laws of the domain by observing classification rules. Explanation during classification (a) is important since it enables the usee to check the line of reasoning and verify the system's decision. The knowledge base (b) can be viewed as a new representation of the domain knowledge, which can be of great value to domain experts, especially in domains that are not yet well formalized and understood. 2. a simple example For a simple example let us consider a case where we have a device with 8 binary switches representing 256 legal combinations. Device reports errors in some combinations and we want to find out what subsequence causes them. 1 2 3 4 5 6 7 8 1 1 0 1 0 1 0 1 0 1 1 0 1 1 1 1 0 0 1 0 error ok ERROR OK error Table 1. Device reports error in some combinations of switches. Which subsequence of switches causes them? Probably the most common answer in EL systems would be, that error is reported when switches 5 and 8 are on (»1). In practical tasks EL systems deal with domains with 10 to 10.000 examples (typically some hundred) with 2 to 500 attributes (typically ten or some ten) [Breiman et al; Qulrlan; Lavrac et al]. Attributes can be real, integer or categorical with many possible values. 3. EMPIRICAL LEARNING The whole process of empirical learning consists of four steps: - preprocessing of learning examples, - construction of a classification rule, - classification of new instances and - analysing the laws of the domain. Detailed description can be found elsewhere, e.g. [Kononenko] or [Gams, LavracJ with detailed overview of some well known algorithms - C4 [Quinlan], CART [Breiman et al], ASSISTANT 86 (Cestnik et al], CN2 [Clark, Niblett], aqIS [Carbonell et al}. We shall formally represent here only a domain area and a classification rule. A set of learning examples L - {(x,c)} consists of pairs .'.,c), where x is a vector (denoting propercies of the object) in a measurement space x and c represents index of the class of example x. the Components of vectors x are called attributes or variables. The values of attributes can be numerical or categorical. A classification or a decision rule đ(x) is a mapping which maps every x from X into some c from C or into the probability distribution (pl,p2,..pj) where pi is a real number between 0 and 1. A classification rule d(x) splits the whole space X into spaces X1,X2,...XJ, such that for every Xi only a certain subset of d(x) is relevant. The syntax of a classification rule d(x) is; Si- 1 and classification rule if rule | and complex selector Atr It- val ( val or values !!- 112|3...|J class ■ J I- < I , I > operators Atr corresponds to the name of the attribute and Val is a categorical or numerical value. This syntax is transformable into DNF and is similar to the syntax of most rule-based systems or expert systems [Waterman, Hayes-Roth]. Note that the actual syntax is slightly more complicated [Gams]. 4. DOMAIN DESCRIPTION We performed practical measurements on two real-world domains. Data were obtained by I. Kononenko and represent descriptions and diagnoses of patients from the Oncological Institute Ljubljana. The only correction was replacement of missing values by the most probable ones for a given class. More detailed description is in [Gamsi, here we present only cumulative data about these domains: Domain 1 number of attributes 18 no. of possible values per attribute 2-8 ( average 3.3 ) . number of classes 9 total number of examples 150 distribution of examples amongst classes number of examples in CI to C9 12 .34 56789 2 1 12 8 69 53 1 4 0 Domain 2 number of attributes 17 no. of possible values per attribute 2-3 ( average 2.2 ) number of classes 22 total number of examples 339 distribution of examples amongst classes number of examples in CI to C22 1 2 3 4 5 6 7 8 9 10 11 84 20 9 14 39 1 14 6 0 2 28 12 13 14 15 16 17 18 19 20 21 22 16 7 24 2 1 10 29 6 2 1 24 importance of attributes - Al to A17 (counting how many examples overlap when omitting the i-th attribute) 123456 7 89 80 80 58 85 60 53 65 55 68 10 11 12 13 14 15 16 17 63 55 60 53 54 57 65 65 5. GINESYS importance of attributes - Al to Aie of them is redundant none 5.1. ALGORITHM DESCRIPTION The top level description of GINESVS (Generic iNductive Expert SYstem Shell) is as follows; repeat initialize Rule; generate Rule; add Rule to d(x); L - L - {examples covered by Rule) until satisfiable(d(x)) In this general view GINESVS represents a prototype of a unifying algorithm for empirical learning covering many other systems. In a slightly more specified description we obtain the following algorithm! repeat generalise Rule; repeat specialize Rule until stop(Rule)f postprocess(Rule); add Rule to d(x); L - L - (examples covered by Rule] until 6atisfiable(d(x)) The main difference between other EL systems and GINESYS is in "confirmation rules". Basic idea of confirmation rules is using several sources of information for classification. That seems to be common practice in every day life. For example when we try to predict the weather, we look at the official weather report, but also look at the sky and ask our neighbour. The implementation of this idea in GINESYS is that instead of using only one rule for classification several rules confirm or confute the first one. In case of a confrontation between these rules the Bayesian classifier is consulted as a method of a conflict resolution (Waterman, Hayes-Roth]. One confirmation rule in our simple example in Table 1 could be ; Error is reported, when switches 3, 5 and 8 are on. This rule could be redundant or even wrong, but on the other hand it could be the only correct onel From examples in Table 1 it if not clear which of these possibilities is the right one, so both (and other) rules are 14 Stored and consulted. In more detailed tests [Gams] it was shown that this method of consulting several rules (> using different kinds of Information) significantly improved classification accuracy. 5.2. COMPARATIVE RESULTS A detailed comparison was made with other well known EL systems in two noisy medical domains (oncology). Table 2 shows results in classification accuracy. GINESYS BAYES other systems domain 1 69.9 68.4 67.3 domain 2 51.9 50.1 48.7 Table 2. Classification accuracy measured as the percentage of correct guesses. While GINESYS achieved bost results and Bayesian classifier the second ones, none of the compared systems [Gamsl outperformed results in the last cow In Table 2. These results ace actually an avecage over ten cuns on randomly chosen 70% of data for learning and remaining 30% of data for testing. In further tests (t-tests, [Gams]) it was shown that the number of tests) distribution and the difference between classification accuracies was sufficient to ensure that differences are a result of some deeper cause (e.g. better algorithm) and not a chance choice. Other measurements proved superiority not only from the point of classification accuracy, but also in generality, complexity of classification rule and explanation [Gams]. GINESYS and other algorithms discussed in this paper were implemented In Pascal on VAX 11/750. 5.3. IRREPROACHABILITY OF MEASUREMENTS He argue that our measurements are irreproachable (unbiased) since; all systems were measured on exactly the same data - no "cleaning" of data was performed - no special form of data was allowed - no unusual method of measuring classification accuracy was used - no domain dependent parameters were allowed - the number of data and tests was sufficient (t-tests) to avoid chance choice - results were strictly checked and verified by many supervisors from the program source level to the level of classification trace. accuracy as other statistical methods, in our measurements some of the EL systems, especially those without special mechanisms for noisy domains, gave unexpectedly poor performance compared to the results published by the originators of algorithms. Since our implementations of those systems were the same as published, several possible explanations remain. It might be that actual Implementations use some unpublished extra features, maybe the domains used foe testing were especially suitable for specific algorithms etc. The authors of this article also find questionable comparing between the system and the expert, since we regard EL systems mainly as a helping tool and not as a stand-alone program. The other reason is that fair comparison between machine and human is extremely difficult. The correct comparison should be (system + user) : user. In most complex realistic domains mechanisms for dealing with noise are of greatest importance as Independently discovered in [Breiman et al; Kononenko] and it is not realistic to achieve even tolerable results without them [Kononenko; Gams]. 6. BAYESIAN CLASSIFIER 6.1, THEORETICAL FOUNDATIONS The concept of the Bayes rule is one of the most important concepts in the field of classification and also learning. For the data drawn from a probability distribution P(A,j), the most accurate rule can be given in the terms of P(A,J} and this rule Is called the Bayes rule. It is normally denoted by dB(x). precisely, suppose that (x,y), x« K, y« Y is a random sample from the probability distribution P(A,j) on X x C, i. e., P(x «■ A, y=j)"P(A,3). Then we define dB(x) as the Bayes rule if for any other classifier d(x), Let us assume that X is H-dlmensional euclidean space and for every j, j=l, ... ,J, P(A/j) has the probability density fj(x) and for sets AC X Then we can prove the following theorem; 5.4. DISCUSSION ABOUT RESULTS Some of the systems for empirical learning achieved good results in practical testing In several real life domains, practically approaching or even outperforming domain experts and statistical methods [Kononenko; Hichalski, Chilausky; Breiman et al; Gams]). Hore acceptable is the opinion [Breiman et al), that although all methods are more or less domain dependent, EL systems in general achieve about the same classification where J is the number of classes and P(j) is the prior probability of the class j. The proof can be found In [Breiman et al). in practice, neither the P{j) nor the fj(x) are known. The three most common classification procedures, used to approximate the Bayes rule by using the learning sample data, are discriminant analysis, kernel density estimation and k-th nearest neighbour. Accuracy of two of them have been compared with the results of Glnesys on both domains. 6.2. PRACTICAL IMPLÉHENTATIOMS 6.3. A PRACTICAL EXAMPLE The k-th nearest neighbour method [Fix, Hod9e6) was implemented as simple as possible. The algorithm eearches through the set of learning examples and determinates distance between learning and test example as the number of mismatches in- their attribute values. Test example Is then classified by its first nearest neighbour, and if there are more equally distant neighbour, the last one found is picked for classification. It is so called Nearest-neighbour classifier iBatcheles 1974], Although it is so primitive, this method classifies test examples with 72.9% average accuracy in the domain 1, what is even better than GINESYS. However, in the domain 2, which is far more complex, the classification accuracy is only 40.4% what is considerably lower than that òf the other methods. The following approximation of the Bayes rule (Clark, Niblett; Kononenkoj is one of the most commonly used. In general, the rule is farmed as At this point the assumption is made, that all attributes are independent : mAijc) \}nAiic] (11 when classifying a new example we need to evaluate formula (1), One practical solution when dealing with categorical values is to store all factors into a 3-diraenslonal table TB[i,j,k] with the following indexes : i - attribute index j - attribute's value's index k - class index TB(i,j,k] is the number of examples in the learning set with the properties, denoted by index values. When evaluating formula (1) during classification of a new example, one of the factors can be 0. The result can be either 0 or undefined. The solution in the second case is obvious - delete this attribute from the formula. The same solution is sometimes used when the result is 0. In Table 2 GINESYS (without domain dependent parameters or other adjustments [Gams]) achieved higher classification accuracy than the practical implementation of theoretically optimal Bayesian classifier. The reason for this must be in practical Implementation, especially in a) approximation of probabilities from the learning set, b) assumption, that attributes are independent, c) practical solutions to numerical problems. Problem (a> can be discarded, since all systems [Gams] processed exactly the same data. But it could be the case, that ditferent classifiers (also different implementations of Bayesian classifier) are more and other less sensible to the number and distribution of input data. Problems appearing during the evaluation of formula (1) can be shown in a simple example. Let us try to classify examples el and e2 from data obtained from Table 1. el - 0000 0000 e2 -10111011 P{OK/ü)=l P{ERRORlel) = t P{ERRORfa) None of the two examples gives the sum of all classes equal to 1. Even if we delete all columns having 0 we obtain results like P(OK/el) - 2.6. Furthermore, in case of example e2 the probability for class OK is greater than for the class ERROR, although example e2 is the same as example e5 from Table 1, belonging to class ERROR. However note that the nearest neighbour method would classify correctly in this case, A small number of examples, is insufficient for most statistical methods and also foe Bayesian classifier. In real measurements in domain 1 and 2 the number of examples was always greater than one hundred and was considered sufficient. Nevertheless these counterexamples show that better classification accuracy is possible. 7. ADJUSTING PARAMETERS CLASSIFIER IN BAYESIAN Probabilities used in evaluating formula (1) are approximated by prior probabilities in the learning set, what yields some error in classification. The formula is then evaluated as follows pm (2) where ai is the number of examples of the class c with the same value of the i-th attribute as the test example, b is the number of examples of the class c, ci is the nurober of all examples with the same value of the i-th attribute as the test example, and d is the number of all learning examples. When dealing occur during with noisy data, errors may evaluation of formula (2). Two methods have'been usèd to avoid this errors. 7.1. OMITTING THE UNRELIABLE FACTORS The main idea of this.method is that' the accuracy of estimations in formula (2) grows 'with the nfjmber of examples. Therefore, if b oc d (only b in practice) is smaller than parameter HINH, which is set before evaluation, then the factor with this b is omitted during evaluation of formula (2) and »Btimated by its shows the results in classification accuracy with different values of HINN. WHlAWVeW W dX Vid ^Wk'i the class probability Is es prior probability. Table 3 st KINN : domain 1 domain 2 0 66.4 ( 0.0) 50.1 ( 0.0) 1 66.4 (11.1) 50.1 ( 4.5) 2 66.4 (47.6) 49.7 (31.8) 3 69.3 (52.2) 49.7 (32.3) 4 68.7 (56.7) 49.8 (33.6) 6 68.0 (60.0) 50.9 (47.7) 10 66.7 (75.6) 48.7 (60.5) IS 67.1 (77.8) 46.2 (72.7) Table 3. Classification accuracy measured as Liie peEctiiita^e oc cocreci; guesses, values in brackets are percentages of classifications with prior probability. It is interesting that accuracy is almost independent of the number of classifications with prior probabilities and it decreases only if we classify approximately 75% of examples this way. Yet we can see that accuracy on both domains reaches its maximum when approximately 50« of claesificatlons are done by the prior probability of classes and this maximal accuracy le near the accuracy of GIHESYS. \ EPS : 0.00 0.01 0.05 0,10 hinn ; 0 0.0 0.3 2.7 3.8 2 0.0 1.4 1.1 1.1 4 0.3 1.2 0.9 0.3 6 -0.4 0.0 -0.4 -0.4 Table 5. Increase of classification accuracy by combination of both methods in domain 1 (basic accuracy 68.4%). \ EPS : 0.00 0.01 0.05 0.10 MINN ; 0 0.0 2.7 2.8 1.8 2 -0.4 1.9 2.2 2.0 4 -0.3 1.8 2.1 2.0 6 0.8 2.7 3.2 3.2 Table 6. Increase of classification accuracy by combination of both methods in domain 2 (basic accuracy 50.1%). The accuracy of GINBSYS is in both cases exceeded by more than 1% what is also the difference between basic Bayesian classifier and GINESYS. Yet there is a problem which combination of BPS and hinn values to choose and how far this decision is domain independent. Therefore, GINESYS can still be considered to reach the practical upper bound of classification accuracy. 7.2. ADJUSTING ZERO FACTORS 8. EXTERNAL RULE CLASSIFIER directed bayesiam A problem occurs, what to do when ai in the formula (2) is 0. One solution is to omit this factor from the formula. Another idea could be to set ai to a very small number EPS in such case. The idea is that this zero is the result of domain noise and, with more learning examples, we would sooner or later find such example and therefore we made almost no mistake and we also don't lose the information contained in the distribution of other attributes. The results of this method are shown in Table 4. EPS : domain 1 domain 2 0.00 68.4 50.1 0.01 68.7 52.8 0.05 71.1 52.9 0.10 72.2 51.9 0.50 24.2 27.7 1.00 2.2 7.5 Table 4. Classification accuracy achieved by adjusting zero factors in formula (2) by some small value EPS. In both cases this method achieves accuracy higher than GINESYS. But, rapid drop of accuracy also shows that this method is very sensible to the value of EPS. Whenever we set zero factor to some value different from 0 we introduce an error into the evaluation of formula (2) and if the value of EPS is too big the results are no more reliable at all. Bayesian classifier itself does not derive any explicit rules and therefore rules generated by some other system (in our case ginesys) can be used to control the evaluation of formula (1). Two such methods have been tested. The idea of the first one is that any (more or less successful) rule denotes a complex of attributes which are logically connected and therefore a deviation from the optimal Bayesian classifier is somewhat corrected. The second one is an attempt to cross GIHESYS and Bayes together to yield better results. 8.1. CLASSIFICATION ATTRIBUTES with important This method uses rules, generated by GIHESYS. During evaluation of formula (1) the rule which matches the current example is searched for. If it is found, we calculate adequate probabilities by searching through the table for entire complex and not by decomposing the attribute complex to basic attributes. On the other hand, if the matching rule is not found, undisturbed evaluation of formula (1) follows. The results are shown in Table 7. Measurements show that introducing of externally generated rules into Bayesian classifier only slightly disturbs its classification accuracy. 7.3. COMBINATION OF BOTH METHODS In this case, both methods described in 7.1. and 7.2. are used together. First we look whether b and d are bigger than HINN and than we set zero factors in formula (2) to BPS where needed. The results of measurements on both domains are shown in Table 5 and Table 6. 6.2. CLASSIFICATION ATTRIBUTES ONLY with important The main idea of this method is to use important attribute complexes in classification if possible. Por each example classified we first search for the matching GINESYS rule. If such one is found, it is used for classification, if not (when only Null rule of GINESYS is found), the classification is carried out by formula (1). The results are shown in Table 7. Bayes Rule Directed Bayes Important Attr. & Bayes domain 68.4 68.4 69.3 domain 2 50.1 49.2 47.9 Table 7. Classification accuracy measured as the percentage of correct guesses. 9. COMPARISON BETWEEN EL STATISTICAL CLASSIFIERS SYSTEMS AMD Let us summarize conclusions from previous paragraphs: It is possible to further improve classification accuracy of implementations of Bayesian classifier, even to overpass best results of GINESYS. - Among compared methods without domain-independent parameters GINESYS performs best and is very close to the practical limit in classification accuracies in measured domains. Statistical classifiers are basically unable to perform explanation during classification and to build a human understandable knowledge base. Besides these, other disadvantages can be pointed out [Bceiraan et al; Gams Is - they can not deal with domains with small number of learning examples; - it is difficult to deal with unusual situations (deleting by 0, unknown values, ... ) ; - their results vary according to the suitability of problem domain. It is only fair to notice that more advanced statistical methods eliminate some of these disadvantages. However these properties remain basically unchanged. Another reason for so good results of EL systems like GINESYS compared to statistical methods is shown in Figure 1. Real-life complex domains probably contain logical laws which cover greater areas regardless of noise in given examples. On the contrary statistical methods depend on variations of probability distribution. In Figure 1 the correct probability distribution for classes 1 and 2 is presented by bold lines. Dotted lines represent probability distribution, obtained from given examples. Because of the fact that probability distribution is more sensible to chance choice it is possible that dotted lines 1 and 2 overlap causing incorrect classification. Figure 1; A graphical representation of one of the possible reasons why GINESYS performs so well compared to statistical methods. 10. CONCLUSION AND DISCUSSION older systems for empirical learning (EL) outperformed the statistical methods from the point of explanation during classification and possibility of building a human understandable knowledge base, while it was in some cases reported that older EL systems outperformed statistical methods as well as domain experts this opinion is not undoubtedly shared with the authors of this article, nore acceptable is the conclusion [Breiman et al], that the best EL systems achieve about the same classification accuracy as statistical methods. Nevertheless it seems that the new breed of EL systems with GINESYS as one of the most promising representatives outperforms statistical methods even in classification accuracy (at least in so far measured domains). ACKNOWLEDGES We are grateful for suggestions to prof. Ivan Bratko. Marjan Petkovsek provided mathematical background for our analysis. Students Izidor Jerebic, Borut Znidar, Aram Karallc and Darko Zupanic were of great help in programming tasks. Igor Rononenko and the Oncological institute Ljubljana enabled us to do the measurements on real-world domains. This work was partly supported by the Slovene Research Council and COST-13. Research facilities were provided by the "Jozef Stefan" institute. REFERENCES Breiman L., Friedman J.H., olshen R.A., Stone C.J. (1984)! "Classification and Regression Trees", Wardsworth Int. Group. Carbonell J.G., Michalski R.S., Mitchell T.H. (1983): "An overview of Machine Learning", in Michalski R.S., Carbonell J.G., Mitchell T.M. (ed.). Machine Learnings an Artificial intelligence Approach, Tioga Publishing. Cestnik B., Kononenko i., Bratko I. (1987): "ASSISTANT 86: A Knowledge Elecitation Tool For Sophisticated Users", Progress in Machine Learning, Sigma Press. Clark P., Niblett P. (1987); "Induction, in Noisy Domains", Progress i-n Machine Learning, SIGMA Press. Kononenko I. (1985): "Razvoj sistema za induktivno učenje ASISTENT", magistrsko delo. Fakulteta za eletrotehniko, Ljubljana. Kononenko I. (1985): "Strukturno avtomatsko učenje". Informatica 3, str. 44 - 56. Gams M. (1987): "Principi poenostavljanja v sistemih za avtomatsko učenje", doktorska disertacija, Ljubljana. Gams M., Lavra« N. (1987): "Review of Five Empirical Leaning Systems Within a Proposed Schemata", Progress in Machine Learning, (ed. Bratko I., Lavrac N.), Sigma Press. Lavra5 N., Varšek A., Gams M., Kononenko I., Bratko I. (1986): "Automatic construction of the knowledge base for a steel classification expert system". The 6th International workshop on Expert Systems, Avignon. Michalski R.S. (1987): "Machine Learning", Tutorial 6, IJCAI 1987, Milano. Michalski R.S., chilausky L.R. (1980): "Learning by Being Told and Learning from Examples: an Experimental Comparison for Soybean Disease Diagnosis", Policy Analysis and information Systems, Vol. 4, No 2. Quinlan J.R. (1986): "Induction of Decision Trees", AI Summer Seminar, Dubrovnik. Waterman Đ.A., Hayes-Roth F. (ed.) (1977): "Pattern-Directed Inference Systems", Academic Press. Winston H.P. (1984): "Artificial Intelligence", Addison-Wesley. STANDARDS IN COVPUTER GRAPHICS INFORMATICA 2/88 UDK 681.3.083.7 Niko Guld, Borut Žalik Tehnical Faculty Maribor Abstract: In this paper standards in computer graphics are described. fit -first the reasons for evolution these standards are given and then the ways of accepting the international standards are presented. Afterwards the evolution phases of the graphical standards under ISO and ANSI are interpreted and current stage of particular standards are given. In proceeding the place of graphical standards and standard proposals in a graphical system are shown. Finally, the position and rale of the graphical standards in a modern CftD system is presented. Povzetek: V ćlanku podamo pregled standardov v raćunalnižki grafiki. Najprej opišemo vzroke za razvoj teh standardov, nato pa prikaieno poti, preko katerih nek predlog lahko postane mednarodni standard. Zatem predstavimo raevojne fase ISO in ANSI standardov ter podamo trenutne razvojne stopnje posameznih standardov za raćunalnižko grafiko. V nadaljevanju opiéemo mesto grafitnih standardov oziroma predlogov standardov v grafičnem sistemu. Naaadnje podamo mesto in vlogo grafitnih standardov v modernem CflD sistemu. 1. THE BEGININQS OF STANDARDS DEVELOPMENT Today, a large number of different graphical hardware and even more different graphical software exist. A big part of this graphical software is device-dependent. The consequences aret 1. It is impossible to exchange graphical software between different graphical systems 2. There are problems by installation of old programs on new graphical equipment, although it has been suplied by the same producer, etc. Because of these problems an idea has been apps^rsd to make a device-independent graphical packet. Advantages_of this device-independent graphical packet are: 1. It could serve different device generations. 2. Programs could work on different graphical systems. 3. Programmers could immediately different graphical systems. work 4. Graphical systems are distinguished only by quality, price, and efficiency. Of course, this device-independent graphical packets have also some weaknesses. They are slower than device-dependent and more memory space is needed. Because the power of computers is rapidly increasing and their prices are decreasing, advantages of standardization are going over its weaknesses. People) who are opposite to standards in computer graphics, affirm that standards are against inovations. It is clear, when a standard is accepted, it could not be changed immediately. Portability of api i cation programs could be achieved in some different ways /ENDE84/'t with development of computer languages, with extension of existing program languages with graphical features or with libraries of graphical subroutines which could be linked into application program. Experts from the field of computer graphics have chosen the last possibility by the construction of international device-Independent graphical standard. However, it is least elegant of all but it Is the best way to awoid confusing in structures of program languages. The place of the device-Independent graphical standard in a graphic system is shown by figure 1, APPLICATION PROGRAM DEVICE-IMDEPEHDENT ORAPMICAL PACKAGE DEVICE ORAPHICAL DRIVER DEVICE 1 DEVICE QRAPHICAL DRIVER DEVICE N Figure 1. The place of graphical standard in graphical system The development o-f graphical standards began in the year 1974, when the Qraphics Standard Planing Cainmittee <6SPC) was found by ACM SISQRAPH ("ftssDciation for Computing Mashinery Special Interest Sroup on Graphics"). This committee met with other international members, Involved in computer graphics in Seillac in 1976. This meeting had a great influence on first draft standard called" Core Systeoi. It was introduced on SI6GRAPH 1977. Two years later, on SIGGRAPH in 1979, an improved version of Core was appeared. Soon after that a new group has been found by German Institute -For Standards DIN which has been worked on a neM graphical standard basis. The group has been directed by Jose Encarnacao and it prepared in 1977 a draft standard called GKS (Graphical Kernel System). Two propositions were apeared by ISO in 1979: Core and GKS. Working Group WI32 by ISO decided that only efforts on BKS continued. GKS was much more simple, it was 2D, and it was intended for raster devices. On other side Core was 3D and destined for vector devices. The first draft proposal of GKS was made by ISQ in 19B2. QK8 was accepted as an ISO standard in 1985. In 1981 SIGGRAPH SSPC committee was disbanded and passed over to the ANSI X:3H3 committee, which was founded in 1979. XIH) gtipkkal lUMirdt XlHl.l »Hl.t PHI GS (Funi) S»««lllettlM) MHt.) CGUCGU (Luguigi Blndlagi) OHl.t SKS »HM (WU40WI) Fi gurE 2. )C3H3 Tehnical Committee and its subcommi ttees It is similar for International Organization for Standardization (ISO). ANSI is only a secretariate in ISO's Technical Committee TC97. Working Group WG2 in subcommittee SC21 is responsible for graphical standards. Its sign is ISD/TC97/SC21/WB2. ' Some standards which are set up by ISO or ANSI', are effective, but others are even ignored. From this point of view standards could be considered as de facto and formal standards /STRAST/. For example, IBM's Color Graphics Adapter (CGA) is a de facto display standard for PCs, just as the IBM PC is a de -facto standard for personal computers. Neither CGA neither IBM PC was formal standards, but market factors has adopted them as standards. Among formal standards we distinguish successful and unsuccessful ones. A case of formal standard, which has been widely used, is RS-232-C. On the other hand, who has heard about ANS X3.23 standard for keyboard layout. This standard has been totally eclipsed by de facto standards, first by the IBM Selectric layout and later by PC and AT layouts. 3. DEVELOPMENT PROCESS OF STANDARDS GKS has become a basis for many other proposals of standards including PHIGS, CQM, and CGI. I6ES, as a standard for transferring CAD/CAM data bases, has been developed in completely another way than Core and BKS (through others ANSI committees)i IGES was accepted as an ANSI standard in 1981. 2. WHO SETS UP THE STANDARDS? That a proposal becomes a standard, there are several phases it must go through. It takes much more time for standardi zati on process in computer graphics than for standards from other areas, because the projects are very large and completly new. Evolution process of an ISO standard Evolution phases of an ISO standards are /BGNOaS, EiDNOBi/i American National Standard Institute (ANSI) does not set up the standards, but it only whaches over the process, through which the standards are accepted. ANSI has to notice if a standard draft is acceptable by most wide part of industry. Only such standard could be adopted and used in industry and other institutions. ANSI adopts a standard as a national standard when it is acceptable by most companies and organizations. ANSI consists of by several committees. So the ANSI X3 is the standards development committee for information processing and has about 3i3 committees, each wi t);i about 15 to 80 members /BQNQSè/. One of them Is )t3H3 tehnical committee, which ie responsible for computer graphics standards. X3H3 committee consists of 6 subcommittees, which is showed by figure 2 /STRA86/. More than llM participants, representing about aä companies (CalComp, Control Data, DEC, HP, Honeywell, IBM, Intel, Tektronix, TI, etc.), attend )(3H3 meetings. the first; New Work Item proposal (NWI); discussion about new project is started, when subcommittee (like SC21) or a member body (like ANSI) makes a proposal. Representati ves of different countries decide if they accept the definition of the work item and if the'work is continuing on this proposal. This stage can take 5 to 8 months. the second: Working Draft- (WD)s document could be in this stage 6 to IS months; the thirds Draft Proposed (DP); this stage can take 12 to 14 months; - the fourth: Draft International Standard (DIS)! document could take place in this stage for 9 to 12 months; - the final: International Standard (IS), Evolution process of an ANSI standard Evolution stages o-f an ANSI standard differ from stages of an ISG standard and are followingi - the firsts Standing Document 3 CSD-3) is an initial proposal which can take no less than b months) - the secondi Working Drafts; X3H3 prepares a series of working drafts that are circulated among X3H3 members. This stage typically takes several years. the third: Draft Proposed American National Standard (dp ANS); this stage takes é to 10 months; - the fourth! Public Review; document could be in this stage 8 months or more, which depends an the number of public reviews. At least two public reviews are required by X3H3. the finali Final Approval takes 6 to 9 months. The current stage of graphical standards under ISO and ANSI is shown by table 1 /BONOBA, SELEB7/. Table 1. The stage of graphical standards project Project EKS GKS Fortran SKS Pascal EKS Ada ISO status IS 7942 published in Avgust, 198S. Known as ISO DIS B631/Ì. CIS ballot closed in Avgust, 198à. Known as ISO DIS 8^51/2. DIS Ballot closed in August, 198à. Known as ISO DP 06S1/3. Second DP ballot closed in Apri 1 , 1966. ANSI status ANS X3.123-1985. Published in October, 1985. ANS X3.124.1-1903. Published in October 19B5. ANS X3.124.2-1987. Public review closed in May, 1987. ANS s<3. 124.3-19BX . Public review closed in 1986. GKS C GKS-3D Not yet an ISO standard language. UJD available now (SC21/N669). Known as ISO DP B80S. Second DP ballot closed in March 1986. ANS X3.l24.2-19aK. Public review will be -finished by October,19S7. Public review finished in 1980. SKS-3D Fortran Known as ISO DP 8806. Public rewiew finished 19B6. GKS-3D Pascal Not yet available. PHI GS WD finished in 1986. ANS X3.144~198m. Second public review finished in 19B7. PHI GS Fortran UID availablt (SC21/N667> Second public review finished in 1987. PHIGS Ada WD available (SC21/N819) Public review finished in 1986. CSM (former VDM) IS 8632 published in 1987. ANS X3.122-1986 published in 1986. CGI (former VDI) DP began in 1986. ANS X3.161-198«. Public review finished in Juny, 1987. 4. THE PLACE OF THE GRAPHICAL STANDARDS IN THE GRAPHICAL SYSTEM Six known could bE /DEUSa4/! standards (suggested or accepted) devi ded into three chategorles 1. Core, BKS, and PHIGS represent an aplication prograoiming interface (API). This API standards are usually implemented as a set of the external procedures and an application programmer could link them into his application code. 2. IQES and CGM are used by transfering and storing the graphical information. 3. CGI represent interface. graphical device Figure 3. Interfaces of the graphical system These tree classes could help us by de-fining common -features o-f current and future standards with regard an performance, price, and usefulness of graphical so-ftware and hardware. A comparison and valuation of the graphical standards is not easy, because the majority of them are very compi e» and because they are comming from different areas. Figure 3 gives a review of the standard graphical interfaces. The most important part o-f each language and device independent graphical system Is a member o-f the BKS standards family. These are GKS . In regard of development phases of graphical standards and their language bindings we can expect, that all GKS language bindings (with exception the language bindings far C, because it is not a standard language yet) will become the international standards in a short time. The basi cal request for EKS-3D is fully compatibility with GKS. Currently, there are di scusions about compatibility among PHISS and GKS. It seems, that there will not be a compatibility at all, because GKS uses only one-leveled graphical data structure (segments), while on other hand PHIGS manages with hisrarhical data structures, which are suitable for presenting the graphical models. PHIGS is intended for tirae-demand applications and so it more than likely will not be available on PC computers. SPECIFIC ADDITIONS PROGRAMMING ir^mRFACE Core of CAD «yttem MODELLING PRESENTATION CALCULATION DIALOGUE other CAD tyttem* LANGUAGE BINDINGS DEVICE AND LANGUAGE INDEPENDENT GRAPHCyU- PACKAGE Standard Piclure Exchange Format CGM Other ir graphical »yttemt COMPUTER GRAPHICS INTERFACE DEVICE DRIVER CURRENT GRAPHICAL DEVICE FUTURE C»U>HICAL DEVICE Figure 4. A place of the standards itr CAD system C6M has already became an inter-national standard in its elementary version. The work is proceeding now on an Kupansion o-f CGM, that it could support also GKSM (GKS meta-file) -format (this is format in which SKS saves graphical in-formatlon through logical workstation MO and MI)> The evolution o-f CQI has been already started. There is a long way until CGI as an international standard will be accepted, because this standard should, not limiting the development o-f the graphical hardware. So we can expect a great interest and in-fluence o-f manu+acturers o-f the graphical hardware. Re-ferences B0NO85 Bono, P.R., "A Survey o-f Graphics Standards and Their Role in In-formation Interchange", IEEE Computer, Vol. 18, No. 10, October 1985 EONOai Bono, P.R. , "Guest Editor's Introduction Graphics Standards", IEEE CG&ft, Vol. b, No. 8, August 1986 CflOas CflDCAM Communications, "A Practical Guide to Exchanging CADCAM Data Using the IGE5 Format", Department o-f Trade and Industry, 19B5 DEU584 Deusen, E., "Graphics Standard Handbook", CC exchange. Laguna Beach, 1984 ENDE84 Enderle, 6., K. Kansy, and G. P^a-ff , "Computer Graphics Programming", Sprlngei—Verlang Berlin, Heidelberg 1984 ENDESà Enderle, 0., "Inter-faces -for Storage and Communication o-f Computer Graphics In-formation", from the book Hopgood,F.R.A, R.J.Hubbold, and D.A. Duce,"Advaces in Computer Graphics", Eurographics Seminars, Springei—Verlang, Heidelberg, 1986 SELEST Selective Update; "Two draft standards available -for public review", IEEE CG & A, Vol.7, No. 3, Mach 1987 STRA86 Straayer, D.H., "Setting Standards",, Computer Graphics World, November 1986 WILSB7 Wilson, P.R., "A Short History of CAD Data Transfer Standards", IEEE CS5