Robust Computer Algebra, Theorem Proving, and Oracle AI

Gopal P. Sarma
School of Medicine, Emory University, Atlanta, GA USA
E-mail: gopal.sarma@emory.edu

Nick J. Hay
Vicarious FPC, San Francisco, CA USA
E-mail: nnickhay@gmail.com

Keywords: Oracle AI, AI safety, CAS, theorem proving, math oracles

Received: June 24, 2013

In the context of superintelligent AI systems, the term "oracle" has two meanings. One refers to modular systems queried for domain-specific tasks. Another usage, referring to a class of systems which may be useful for addressing the value alignment and AI control problems, is a superintelligent AI system that only answers questions. The aim of this manuscript is to survey contemporary research problems related to oracles which align with long-term research goals of AI safety. We examine existing question answering systems and argue that their high degree of architectural heterogeneity makes them poor candidates for rigorous analysis as oracles. On the other hand, we identify computer algebra systems (CASs) as primitive examples of domain-specific oracles for mathematics and argue that efforts to integrate computer algebra systems with theorem provers, systems which have largely been developed independently of one another, provide a concrete set of problems related to the notion of provable safety that has emerged in the AI safety community. We review approaches to interfacing CASs with theorem provers, describe well-defined architectural deficiencies that have been identified with CASs, and suggest possible lines of research and practical software projects for scientists interested in AI safety.

Povzetek: Research problems related to computational systems with oracles and the safety of artificial intelligence are discussed.

1 Introduction

Recently, significant public attention has been drawn to the consequences of achieving human-level artificial intelligence. While there have been small communities analyzing the long-term impact of AI and related technologies for decades, these forecasts were made before the many recent breakthroughs that have dramatically accelerated the pace of research in areas as diverse as robotics, computer vision, and autonomous vehicles, to name just a few [1–3].

Most researchers and industrialists view advances in artificial intelligence as having the potential to be overwhelmingly beneficial to humanity. Medicine, transportation, and fundamental scientific research are just some of the areas that are actively being transformed by advances in artificial intelligence. On the other hand, issues of privacy and surveillance, access and inequality, or economics and policy are also of utmost importance and are distinct from the specific technical challenges posed by most cutting-edge research problems [4, 5].

In the context of AI forecasting, one set of issues stands apart, namely, the consequences of artificial intelligence whose capacities vastly exceed those of human beings. Some researchers have argued that such a "superintelligence" poses distinct problems from the more modest AI systems described above. In particular, the emerging discipline of AI safety has focused on issues related to the potential consequences of mis-specifying goal structures for AI systems which have significant capacity to exert influence on the world. From this vantage point, the fundamental concern is that deviations from "human-compatible values" in a superintelligent agent could have significantly detrimental consequences [1].
One strategy that has been advocated for addressing safety concerns related to superintelligence is Oracle AI, that is, an AI system that only answers questions. In other words, an Oracle AI does not directly influence the world in any capacity except via the user of the system. Because an Oracle AI cannot directly take physical action except by answering questions posed by the system's operator, some have argued that it may provide a way to bypass the immediate need for solving the "value alignment problem" and would itself be a powerful resource in enabling the safe design of autonomous, deliberative superintelligent agents [1, 6–9].

A weaker notion of the term oracle, what we call a domain-specific oracle, refers to a modular component of a larger AI system that is queried for domain-specific tasks. In this article, we view computer algebra systems as primitive domain-specific oracles for mathematical computation which are likely to become quite powerful on the time horizons on which many expect superintelligent AI systems to be developed [10, 11]. Under the assumption that math oracles prove to be useful in the long-term development of AI systems, addressing well-defined architectural problems with CASs and their integration with interactive theorem provers provides a concrete set of research problems that align with long-term issues in AI safety. In addition, such systems may also be useful in proving the functional correctness of other aspects of an AI architecture.

In Section 2, we briefly discuss the unique challenges in allocating resources for AI safety research. In Section 3, we briefly summarize the motivation for developing oracles in the context of AI safety and give an overview of safety risks and control strategies which have been identified for superintelligent oracle AIs. In Section 4, we analyze contemporary question answering systems and argue that, in contrast to computer algebra systems, current consumer-oriented, NLP-based systems are poor candidates for rigorous analysis as oracles. In Section 5, we review the differences between theorem provers and computer algebra systems, efforts at integrating the two, and known architectural problems with CASs. We close with a list of additional research projects related to mathematical computation which may be of interest to scientists conducting research in AI safety.

2 Metascience of AI safety research

From a resource allocation standpoint, AI safety poses a unique set of challenges. Few areas of academic research operate on such long and potentially uncertain time horizons. This is not to say that academia does not engage in long-term research. Research in quantum gravity, for example, is approaching nearly a century's worth of effort in theoretical physics [12]. However, the key difference between open-ended, fundamental research in the sciences or humanities and AI safety is the possibility of negative consequences, indeed significant ones, of key technological breakthroughs taking place without corresponding advances in frameworks for safety [1, 13].

These issues have been controversial, largely due to disagreement over the time horizons for achieving human-level AI and the subsequent consequences [10, 11].
Specifically, the notion of an "intelligence explosion," whereby the intelligence of software systems dramatically increases due to their capacity to model and re-write their own source code, has yet to receive adequate scientific scrutiny and analysis [14].

We affirm the importance of AI safety research and also agree with those who have cautioned against proceeding down speculative lines of thinking that lack precision. Our perspective in this article is that it is possible to fruitfully discuss long-term issues related to AI safety while maintaining a connection to practical research problems. To some extent, our goal is similar in spirit to the widely discussed manuscript "Concrete Problems in AI Safety" [15]. However, we aim to be a bit more bold. While the authors of "Concrete Problems" state at the outset that their analysis will set aside questions related to superintelligence, our goal is to explicitly tackle superintelligence-related safety concerns. We believe that there are areas of contemporary research that overlap with novel ideas and concepts that have arisen among researchers who have focused purely on analyzing the consequences of AI systems whose capacities vastly exceed those of human beings.

To be clear, we do not claim that the strategy of searching for pre-existing research objectives that align with the aims of superintelligence theory is sufficient to cover the full spectrum of issues identified by AI safety researchers. There is no doubt that the prospect of superintelligence raises entirely new issues that have no context in contemporary research. However, considering how young the field is, we believe that the perspective adopted in this article is a down-to-earth and moderate stance to take while the field is in a critical growth phase and a new culture is being created.

This article focuses on one area of the AI safety landscape, Oracle AI. We identify a set of concrete software projects that relate to more abstract, conceptual ideas from AI safety, in order to bridge the gap between practical contemporary challenges and longer-term concerns which are of an uncertain time horizon. In addition to providing concrete problems for researchers and engineers to tackle, we hope this discussion will be a useful introduction to the concept of Oracle AI for newcomers to the subject. We state at the outset that within the context of Oracle AI, our analysis is limited in scope to systems which perform mathematical computation, and not to oracles in general. Nonetheless, considering how little effort has been directed at the superintelligence control problem, we are confident that there is low-hanging fruit in addressing these more general issues which is awaiting discovery.

3 Brief overview of Oracle AI

As described above, an Oracle AI is a system which only answers questions. Although the term has traditionally been used to denote fully general AI systems, we can also imagine oracles whose capabilities are more limited. The latter systems are what we refer to as domain-specific oracles. AI safety researchers have argued that superintelligent oracles have the potential to help address safety concerns of autonomous, deliberative superintelligent AI systems, as an Oracle AI's only means of influencing the world is through the user of the system. Consequently, an Oracle AI itself may be in a position to assist researchers in solving difficult technical and conceptual problems related to the design of independent, deliberative agents.
Yet superintelligent oracle AIs pose their own distinct set of safety concerns. The problems identified in [6], for example, presume a superintelligent AI system which, while limited in its ability to interact with the world, may nonetheless have the potential to negatively impact the world by manipulating the user of the system. As Armstrong, Bostrom, and Sandberg write:

We are all vulnerable to the OAI's social engineering. An advanced OAI can literally offer us the world. Genuine altruism is no defence against this; it can offer a world free of poverty, hunger, war, disease or death, as well as all personal trinkets we could want. It could guess the weaknesses of each individual, and find the right arguments to convince us that granting it power or liberty is the moral and profitable thing to do. It will be able to play on rivalries, institutional weaknesses and ethical dilemmas with more virtuosity than any human could. [6]

Given the potential dangers of human manipulation by an OAI, researchers have searched for control strategies to mitigate such risks. A few examples of control methods that have been proposed are:

• Physical security: physically containing the OAI system.

• Limiting or eliminating external manipulators: like the previous security measure, precluding the OAI from physically manipulating the world in any direct manner.

• Epistemic capability control: the OAI's output could be restricted even further beyond question answering, for example, restricting it to answer "yes/no" questions.

• Virtual worlds: the OAI could be run in a simulated world so that even if the system's capabilities were expanded, its behavior could be observed and modeled.

• Resetting the OAI after each question: this would prevent the OAI from engaging in long-term social engineering by manipulating the answers to the users' questions.

Although the capacities of domain-specific oracles are more limited, they can nonetheless pose safety risks. Architectural deficiencies in such oracles might be exploited by a larger AI system to manipulate the human user. Such a system could give answers which are difficult to verify and which allow the OAI to execute complex and intricate plans unbeknownst to the user. Therefore, while flaws in domain-specific oracles are not inherently risky if the oracles are used solely in their domain of applicability, they may very well be dangerous as part of a larger system with more general capabilities. Though not a "control strategy" in the narrowest sense, creating "robust" domain-specific oracles is an important objective in designing safe OAIs. Furthermore, ensuring the robustness of domain-specific subsystems might mitigate the need for stronger control strategies, as the OAI would have fewer weaknesses to exploit.

It should go without saying that the arguments presented above are highly schematic and do not depend on specific technologies. To our knowledge, there is very limited work on translating analyses of superintelligent oracle AIs into the concrete language of modern artificial intelligence [8, 9, 16]. Our goal in this manuscript is in this spirit, that is, to anchor schematic, philosophical arguments in practical, contemporary research. To do so, we will narrow our focus to the mathematical domain.
In the remainder of the article, we will use the term oracle in the more limited sense of a domain-specific subsystem, and in particular, oracles for performing mathematical computations. We hope that the analysis presented here will be of intrinsic value in developing robust math oracles, as well as provide some intuition and context for identifying concrete problems relevant to developing safe, superintelligent oracle AI systems.

4 Are there contemporary systems which qualify as oracles?

The obvious class of contemporary systems which would seem to qualify as oracles are question answering systems (QASs). As we stated above, a basic criterion characterizing oracles is that their fundamental mode of interaction is answering questions, whether posed by a human user or issued as domain-specific queries by a larger AI system.

Contemporary QASs are largely aimed at using natural language processing techniques to answer questions pertaining to useful facts about the world such as places, movies, historical figures, and so on. An important point to make about QASs is the highly variable nature of the underlying technology. For instance, IBM's original Watson system, which competed in Jeopardy!, was developed prior to the recent advances in deep learning which have fundamentally transformed areas ranging from computer vision, to speech recognition, to natural language processing [17]. In this particular task, the system was nonetheless able to perform at a level beyond that of the most accomplished human participants. The introduction of "info panes" into popular search engines, on the other hand, has been based on more recent machine learning technology, and indeed, these advances are also what power the latest iterations of the Watson system [18]. On the other end of the spectrum is Wolfram|Alpha, which is also a question answering system, but which is architecturally centered around a large, curated repository of structured data, rather than datasets of unstructured natural language [19].

While these systems are currently useful for humans in navigating the world, planning social outings, and arriving at quick and useful answers to ordinary questions, it is not clear that they will remain useful in quite the same capacity many years from now, or as standalone components of superintelligent AI systems. Although the underlying techniques of deep learning or NLP are of fundamental interest in their own right, the fact that these systems are QASs at all seems to be more of an artifact of their utility for consumers. Another important observation about contemporary QASs is that much of their underlying NLP-based architecture can be replaced by taking advantage of structured data, as the example of Wolfram|Alpha demonstrates. For the other NLP or machine learning based systems, the underlying technology can be used as part of larger, semi-automated pipelines to turn unstructured data from textual sources into structured data. Once again, this fact simply underscores that contemporary QASs are not particularly appealing model systems to analyze from the Oracle AI safety perspective.¹

¹We emphasize that our argument that contemporary QASs are not good candidates for analysis as Oracle AIs is not an argument against the traditional formulation of Oracle AI as a tool for AI safety. We fully expect significant breakthroughs to be made in advancing the theory and practice of oracle-based techniques for AI safety, and we hope that this manuscript will provide some motivation to pursue such research. Rather, our point is that when viewing contemporary systems through the lens of superintelligence, there seems little reason to believe that current NLP-based QASs will remain sufficiently architecturally stable to be used as standalone components in AI systems many years from now. On the other hand, there are certainly important present-day problems to examine when evaluating the broader impact of QASs, such as bias in NLP systems, overgeneralization, and privacy, to name just a few. Some of these issues overlap with the set of problems identified in [15] as examples of concrete problems in AI safety. In addition, we are beginning to see conferences devoted to contemporary ethical issues raised by machine learning. See, for example, the workshop Ethics in Natural Language Processing (https://www.aclweb.org/portal/content/first-workshop-ethics-natural-language-processing).

4.1 Computer algebra and domain-specific oracles for mathematical computation

The question answering systems described above all rely on natural language processing to varying degrees. In addition, their domain of applicability has tended towards "ordinary" day-to-day knowledge useful to a wide array of consumers.
Another type of question answering system is a computer algebra system (CAS). Computer algebra has traditionally referred to systems for computing specific results to specific mathematical equations, for example, computing derivatives and integrals, group-theoretic quantities, etc. In a sense, we can think of computer algebra as a set of algorithms for performing what an applied mathematician or theoretical physicist might work out with paper and pencil. Indeed, some of the early work in computer algebra came from quantum field theory—one of the first computer algebra systems was Veltman's Schoonschip, built for performing the field-theoretic computations that led to the theory of electroweak unification [20].

As computer algebra systems have grown in popularity, their functionality has expanded substantially to cover a wide range of standard computations in mathematics and theoretical physics, including differentiation, integration, matrix operations, manipulation of symbolic expressions, symbolic substitution, algebraic equation solving, limit computation, and many others. Computer algebra systems typically run in a read-evaluate-print loop (REPL), and in the research and education context, their popularity has also grown as a result of the notebook model pioneered by the Mathematica system, which allows computations in CASs to closely mimic the sequential, paper-and-pencil work of mathematicians and theoretical physicists.
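To make the preceding list concrete, the short sketch below exercises several of these operations. We use the open-source SymPy library as a stand-in CAS; this is purely an illustrative choice on our part, and the same computations could be performed in Mathematica, Maple, or any comparable system.

    # A minimal illustration of standard CAS functionality, with SymPy
    # standing in for the commercial systems discussed in the text.
    from sympy import symbols, diff, integrate, limit, solve, sin

    x = symbols('x')

    print(diff(x * sin(x), x))           # differentiation: x*cos(x) + sin(x)
    print(integrate(1 / (1 + x**2), x))  # integration: atan(x)
    print(limit(sin(x) / x, x, 0))       # limit computation: 1
    print(solve(x**2 - 2, x))            # equation solving: [-sqrt(2), sqrt(2)]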
In assessing the long-term utility of CASs, it is important to note that there is little reason to believe that computer algebra will be subsumed by other branches of AI research such as machine learning. Indeed, recent research has demonstrated applications of machine learning to both computer algebra and theorem proving (which we discuss in more detail below), via algorithm selection in the former case [21] and proof assistance in the latter [22, 23]. While certainly not as visible as machine learning, computer algebra and theorem proving are very much active and deep areas of research which are also likely to profit from advances in other fields of artificial intelligence, as opposed to being replaced by them [24]. On the time horizons on which we are likely to see human-level artificial intelligence and beyond, we can expect that these systems will become quite powerful and possess capabilities that may be useful in the construction of more general AI systems. Therefore, it is worth examining such systems from the perspective of AI safety.

4.2 Briefly clarifying nomenclature

Before proceeding, we want to explicitly describe issues relating to nomenclature that have arisen in the discussion thus far, and state our choices for terminology. Given that the phrase "Oracle AI" has become common usage in the AI safety community, we will continue to use this phrase, with the first word capitalized, as well as the acronym OAI. Where clarification is needed, we may also use the full phrase "superintelligent oracle AI," without capitalization. For more modest use cases of the word oracle, we will either refer to "domain-specific oracles," or state the domain of knowledge where the oracle is applicable. We can, at the very least in the abstract, consider extending this terminology to other domains such as "physics oracles," "cell biology oracles," or "ethics oracles," and so on. The remainder of the article will be concerned with safety and robustness issues in the design of "math oracles."

5 Robust computer algebra and integrated theorem proving

Today we should consider as a standard feature much closer interaction between proof assistance and computer algebra software. Several areas can benefit from this, including specification of interfaces among components, certification of results and domains of applicability, justification of optimizations and, in the other direction, use of efficient algebra in proofs.

- Stephen Watt, in On the future of computer algebra systems at the threshold of 2010

As we described above, computer algebra systems can be thought of as question answering systems for a subset of mathematics. A related set of systems are interactive proof assistants, or interactive theorem provers (ITPs). While ITPs are also systems for computer-assisted mathematics, they serve a different mathematical context: computations in which one wishes to construct a proof of a general kind of statement. In other words, rather than computing specific answers to specific questions, ITPs are used to show that candidate mathematical structures (or software systems) possess certain properties.

In a sense, the distinction between theorem proving and computer algebra should be viewed as a historical anomaly. From the perspective of the philosophical and logical efforts in the early 20th century that led to the "mechanization of mathematics," the distinction between computing the nth Laguerre polynomial and constructing a proof by induction might have been viewed as rather artificial, although with the benefit of hindsight we can see that the two types of tasks are quite different in practice [25].

The role of ITPs in the research world is very different from that of CASs. Whereas CASs allow researchers to perform difficult computations that would be impossible with paper and pencil, constructing proofs using ITPs is often more difficult than even the most rigorous methods of pure mathematics.
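To make this contrast concrete, the fragment below shows the CAS side of the distinction drawn earlier: a specific answer to a specific question, computed in one line. The corresponding ITP task would be a machine-checked proof that some property holds for all Laguerre polynomials. As in our other sketches, SymPy serves purely as an illustrative stand-in.

    # The CAS half of the distinction: compute the 4th Laguerre polynomial,
    # rather than prove a statement about all Laguerre polynomials.
    from sympy import symbols, laguerre, expand

    x = symbols('x')
    print(expand(laguerre(4, x)))  # x**4/24 - 2*x**3/3 + 3*x**2 - 4*x + 1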
In broad terms, the overhead of using ITPs to formalize theorems arises from the fact that proofs in these systems must proceed strictly from a set of formalized axioms so that the system can verify each computation. Consequently, ITPs (and related systems, such as automatic theorem provers) are largely used for verifying properties of mission-critical software systems which require a high degree of assurance, or for hardware verification, where mistakes can lead to costly recalls [26–30].

As the quotation above suggests, many academic researchers view the integration of interactive proof assistants and computer algebra systems as desirable, and there have been numerous efforts over the years at exploring possible avenues for achieving this objective [31–34] (a more complete list is given below). By integrating theorem proving with computer algebra, we would be opening up a wealth of potentially interoperable algorithms that have to date remained largely unintegrated. To cite one such example, in [35], the authors developed a framework for the exchange of information between the Maple computer algebra system and the Isabelle interactive theorem prover. They show a simple problem involving the proof of an elementary polynomial identity that could be solved with the combined system, but in neither system alone (see Fig. 1). We cite this example to demonstrate how a simply stated elementary problem cannot be solved in existing environments for either computer algebra or proof assistance alone: the computer algebra system does not have the capacity for structural induction, and theorem provers generally have rather weak expression simplifiers. There are numerous examples such as this one in the academic literature.

Another key difference between CASs and ITPs is the architectural soundness of the respective systems. As we will discuss below, computer algebra systems have well-defined architectural deficiencies, which, while not a practical issue for the vast majority of use cases, pose problems for their integration with theorem provers, which by their nature are designed to be architecturally sound. In the context of superintelligent AI systems, the architectural problems of CASs are potential points of weakness that could be exploited for malicious purposes, or that could simply lead to unintended and detrimental consequences. Therefore, we use the phrase "robust computer algebra" to refer to CASs which lack the problems that have been identified in the research literature. In the section below, we combine the discussion of robust computer algebra and integration with interactive theorem provers, as there is a spectrum of approaches which address both of these issues to varying degrees.

5.1 A taxonomy of approaches

There are many possible avenues to tackle the integration of theorem provers with computer algebra systems. We give four broad categories characterizing such integration efforts:²

1. Theorem provers built on top of computer algebra systems: These include Analytica, Theorema, RedLog, and logical extensions to the Axiom system [34, 36–39].

2. Frameworks for mathematical exchange between the two systems: This category includes MathML, OpenMath, OMSCS, MathScheme, and Logic Broker [40–44].

3. "Bridges" or "ad hoc" information exchange solutions: The pairs of systems in this category include bridges combining PVS, HOL, or Isabelle with Maple, NuPRL with Weyl, Omega with Maple/GAP, Isabelle with Summit, and most recently, Lean with Mathematica [35, 45–51].
The example given above, bridging Isabelle and Maple, falls into this category.

4. Embedding a computer algebra system inside a proof assistant: This is the approach taken by Kaliszyk and Wiedijk in the HOLCAS system. In their system, all expressions have precise semantics, and the proof assistant proves the correctness of each simplification made by the computer algebra system [32].

²This classification was first described by Kaliszyk and Wiedijk [32] in a paper arguing for an architecture which we list as the fourth category given above.

Figure 1: Example of a polynomial identity proven by integrating the Maple computer algebra system with Isabelle. Maple's simplifier is used for expanding polynomials—a powerful complement to the theorem proving architecture of Isabelle, which allows for the setup of a proof by induction. (The reproduced proof structure shows the conjecture n^5 ≤ 5^n for n ≥ 5: Isabelle handles the base case and the induction setup, while Maple is invoked to expand (x + 1)^5 and to rewrite 5^(x+1) as 5 · 5^x.)

One primary aspect of integration that differentiates these approaches is the degree of trust the theorem prover places in the computer algebra system. Computer algebra systems give the false impression of being monolithic systems with globally well-defined semantics. In reality, they are large collections of algorithms which are neatly packaged into a unified interface. Consequently, there are often corner cases where the lack of precise semantics can lead to erroneous solutions. Consider the following example:

Figure 2: Example of an incorrect solution to a simple polynomial equation by a computer algebra system.

The system incorrectly gives 1 as a solution, even though the given expression has an indeterminate value for x = 1. However, because the expression is treated as a fraction of polynomials, it is first simplified before the solve operation is applied. In other words, there is an unclear semantics between the solver module and the simplifier, which leads to an incorrect result.
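The mechanism is easy to reproduce in miniature. In the sketch below, SymPy stands in for the CAS, and the rational expression is a representative instance of the failure mode rather than the exact input of the original figure, which is not recoverable from the text. Note that SymPy's own solver may reject the spurious root; the point is what happens when simplification precedes solving.

    # Mechanism behind Figure 2: simplifying a rational expression before
    # solving can produce a "solution" at which the original expression is
    # undefined. The expression here is representative, not the figure's.
    from sympy import symbols, simplify, solve, Eq

    x = symbols('x')
    lhs = (x**2 - 1) / (x - 1)         # indeterminate (0/0) at x = 1

    canceled = simplify(lhs)           # the simplifier cancels to: x + 1
    print(solve(Eq(canceled, 2), x))   # [1] -- solves the *simplified* equation

    # Substituting the "solution" back into the original expression exposes
    # the mismatch between the simplifier's and the solver's semantics:
    print(lhs.subs(x, 1))              # nan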
Another simple example is the following integral:

Figure 3: A problem arising in symbolic integration due to the non-commutativity of evaluation and substitution.

The generic antiderivative of x^n returned by the system is x^(n+1)/(n+1). Making the substitution n = −1 gives an indeterminate result, while it is clear by inspection that the solution to the integral for n = −1 is simply ln(x). This belongs to a class of problems known as the specialization problem, namely that expression evaluation and variable substitution do not commute [31].
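The non-commutativity is straightforward to exhibit, again using SymPy as an illustrative stand-in (the system shown in the original figure is not identified in the text). Note that newer SymPy releases guard against this particular trap by returning a piecewise antiderivative.

    # The specialization problem of Figure 3: evaluation and substitution
    # do not commute.
    from sympy import symbols, integrate

    x, n = symbols('x n')

    generic = x**(n + 1) / (n + 1)  # textbook antiderivative of x**n
    print(generic.subs(n, -1))      # zoo (division by zero): indeterminate
    print(integrate(1 / x, x))      # log(x), the correct result at n = -1

    # Guarded behavior in recent SymPy versions:
    print(integrate(x**n, x))       # Piecewise((x**(n+1)/(n+1), Ne(n, -1)), (log(x), True))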
So while we have seen above that theorem proving can benefit tremendously from the wealth of algorithms for expression simplification and mathematical knowledge in computer algebra, there is the potential cost of compromising the reliability of the combined system. As a possible application to current research in AI safety, consider the decision-theoretic research agenda for the development of safe, superintelligent AI systems outlined in [52–56]. If we required formal guarantees of correctness at any point in a sequence of computations in which computer algebra is used, current systems would be unable to provide the necessary framework for constructing such a proof.

5.1.1 Qualitatively certified computations

In our taxonomy of approaches to bridging theorem provers with computer algebra, we described how a key distinction was the degree of trust that the theorem prover places in the computer algebra system. For instance, approaches which build theorem provers on top of computer algebra systems do not address the architectural issues with CASs; they are integrative, but no more sound. On the other extreme, building a computer algebra system on top of a theorem prover allows for a degree of trust that is on par with that of the theorem prover itself. However, this approach has the distinct disadvantage that computer algebra systems represent many hundreds of man-years worth of effort.

The more intermediate approaches, involving common languages for symbolic exchange or ad hoc bridges, bring to light an important notion in the spectrum of provable safety, namely the ability to assign probabilities to the correctness of computations. In [57], the authors present an algorithm for assigning probabilities to any statement in a formal language. We might ask what strategies would look like that have a similar goal in mind, but are significantly weaker. Interfaces between theorem provers and computer algebra systems provide a concrete example where we can ask a question along these lines. Fundamentally, in such an interface, the computer algebra system is the weaker link and should decrease our confidence in the final result. But by how much? For instance, in the example given in Figure 1, how should we revise our confidence in the result knowing that the polynomial simplification was conducted within a computer algebra system?

It is worth asking whether there are simple answers to this question that do not require major theoretical advances. For instance, we might imagine curating information from computer algebra experts about known weaknesses, and using this information to give a qualitative degree of confidence in a given result. Or, for example, in a repository of formal proofs generated using integrated systems, steps of the proof that require computer algebra could be flagged and assigned a qualitative measure of uncertainty.

The relationship between this highly informal method of giving qualitative certification to computations and the formal algorithm developed in [57] can be compared to the relationship between existing techniques in the software industry for ensuring correctness. On the one hand, unit testing is a theoretically trivial, yet quite powerful practice, something along the lines of automated checklists for software. The complexities of modern software would be impossible to handle without extensive software testing frameworks [58–62]. On the other hand, formal verification can provide substantially stronger guarantees, yet is a major undertaking, and the correctness proofs are often significantly more demanding to construct than the software itself. Consequently, as discussed earlier in this section, formal verification is much less frequently used in industry, typically only in exceptional circumstances where high guarantees of correctness are required, or for hardware verification [26–30]. Integrated systems for computer algebra and theorem proving give rise to a quite interesting (and perhaps ironic) opportunity to pursue simple strategies for giving qualitative estimates of the correctness of a computation.

5.1.2 Logical failures and error propagation

As the examples described above demonstrate, errors in initial calculations may very well propagate and give rise to nonsensical results. As AI systems capable of performing mathematical computation become increasingly sophisticated and embedded in design workflows for science and engineering (beyond what we see today), we can imagine such errors being quite costly and difficult to debug. In the case of a superintelligent AI system, the more concerning scenarios are those in which systematic errors in computer algebra could be exploited for adversarial purposes, or in which they lead to unintentional accidents on a large scale.

The issue of error propagation is another concrete context for pursuing simple strategies for assigning qualitative measures of certainty to computations performed by integrated theorem proving / computer algebra systems. For instance, we may be less inclined to trust a result in which the computer algebra system was invoked early in a computation as opposed to later. With curated data from computer algebra experts on the reliability and failure modes of various algorithms, we might also chain together these informal estimates to arrive at a single global qualitative estimate, as in the sketch below. If multiple systems were developed independently, or were based on fundamentally different architectures, we might also be significantly more confident in a result which could be verified by two separate systems.
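The following deliberately naive sketch renders this idea in code: each step of a hybrid ITP/CAS derivation carries a qualitative confidence tag, and the global estimate is the weakest link in the chain. The step names, the three-level scale, and the minimum-combination rule are all illustrative assumptions on our part, not an established scheme.

    # A naive weakest-link scheme for qualitatively certifying a hybrid
    # ITP/CAS derivation. All names and rules here are illustrative.
    from dataclasses import dataclass
    from enum import IntEnum

    class Confidence(IntEnum):
        LOW = 0     # e.g., CAS routine with known corner cases
        MEDIUM = 1  # e.g., CAS routine with no curated failure reports
        HIGH = 2    # e.g., step checked by the theorem prover's kernel

    @dataclass
    class Step:
        description: str
        system: str  # "ITP" or "CAS"
        confidence: Confidence

    def overall_confidence(steps):
        # Chain per-step estimates into one global qualitative estimate.
        return min(step.confidence for step in steps)

    # The proof of Fig. 1, coarsely annotated:
    proof = [
        Step("induction setup", "ITP", Confidence.HIGH),
        Step("polynomial expansion", "CAS", Confidence.MEDIUM),
        Step("ground inequality checks", "CAS", Confidence.MEDIUM),
    ]
    print(overall_confidence(proof).name)  # MEDIUM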
5.1.3 Additional topics

Some related ideas merit investigation in the broader context of mathematical computation:

• Integrating SMT solvers with interactive theorem provers: Satisfiability modulo theories (SMT) solvers are an important element of automated reasoning, and there have been efforts analogous to those described above to bridge SMT solvers with interactive theorem provers [63, 64].

• Identifying the most important / widely used algorithms in computer algebra: Computer algebra systems have grown to become massive collections of algorithms extending into domains well outside the realm of mathematics. If the purely mathematical capacities of CASs prove to be useful in future AI systems, it would be valuable to rank order algorithms by their popularity or importance. One approach would be to do basic textual analysis of source code from GitHub or StackExchange, along the lines of the sketch following this list. This would also allow for more targeted efforts to directly address the issues with soundness in core algorithms such as expression simplification or integration. In the context of the HOLCAS system described above, for example, it would be valuable to have rough estimates of the number of man-hours required to implement a minimal CAS with the most widely used functionality on top of a theorem prover.

• Proof checkers for integrated systems: Proof checkers are important tools in the landscape of formal verification and theorem proving. Indeed, as it is often much less computationally expensive to verify the correctness of a proof than to generate it from scratch, the availability of proof checkers for the widely used interactive theorem provers is one reason we can be confident in the correctness of formal proofs [65, 66]. As we described above, strategies for integrating computer algebra with theorem provers can potentially result in a combined system which is less trustworthy than the theorem prover alone. Therefore, the availability of proof checkers for combined systems would be a valuable resource for verifying proof correctness and, in certain mathematical domains, could potentially provide an avenue for surmounting the need to directly make the CAS itself more architecturally robust. The development of integrated proof checkers is likely to be a substantial undertaking and to require novel architectures for integrating the core CAS and ITP systems, distinct from what has been described above. However, it is a largely unexplored topic that merits further investigation.

• Analyzing scaling properties of algorithms for computer algebra and theorem proving as a function of hardware resources: The premise of the analysis presented above is that CASs (and integrated theorem proving) are likely to remain sufficiently architecturally stable and useful on a several-decade time horizon in the construction of AI systems. On the other hand, as we argued earlier, it is much less clear that the same will be true of the most visible, NLP-based, consumer-oriented question answering systems. To make these arguments more rigorous, it would be valuable to develop quantitative predictions of the capabilities of existing algorithms for computer algebra and theorem proving when provided with substantially expanded hardware resources. For instance, we might examine problems in mathematics or theoretical physics for which naïve solutions in CASs are intractable with current resources, but which may be feasible with future hardware.

• The cognitive science of computer algebra: What role has computer algebra played in theoretical physics and mathematics? How has it influenced the thinking process of researchers? Has computer algebra simply been a convenience that has shifted the way problems are solved, or has it fundamentally enabled new problems to be solved that would have been completely intractable otherwise? The cognitive science of mathematical thought is a substantial topic which overlaps with many established areas of research [67–71]. However, a systematic review of research in mathematics and theoretical physics since the advent of computer algebra, and of its role in the mathematical thought process, is an underexplored topic. It would be an interesting avenue to pursue in understanding the role that CASs, ITPs, and integrated systems may come to play in superintelligence, particularly in the case of neuromorphic systems that have been modeled after human cognition. These questions also relate to understanding the scaling properties of CAS and theorem proving algorithms, as well as to cataloguing the most widely used algorithms in computer algebra.
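As a starting point for the algorithm-ranking project in the second bullet above, the toy sketch below counts calls to a handful of core CAS functions across a directory of source files. The function list, the Mathematica-style file extension, and the corpus path are all illustrative assumptions.

    # A toy version of the proposed textual analysis: rank core CAS
    # functions by how often they are called in a scraped corpus.
    import re
    from collections import Counter
    from pathlib import Path

    CAS_FUNCTIONS = ["Simplify", "Integrate", "Solve", "Series", "Limit", "D"]

    def count_calls(corpus_dir: str) -> Counter:
        # Match Mathematica-style calls such as "Integrate[...]".
        pattern = re.compile(r"\b(" + "|".join(CAS_FUNCTIONS) + r")\s*\[")
        counts = Counter()
        for path in Path(corpus_dir).rglob("*.m"):
            text = path.read_text(errors="ignore")
            counts.update(m.group(1) for m in pattern.finditer(text))
        return counts

    # Usage on a hypothetical scraped corpus:
    # print(count_calls("corpus/github-mathematica").most_common(10))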
6 Conclusion

The aim of this article has been to examine pre-existing research objectives in computer science and related disciplines which align with problems relevant to AI safety, thereby providing concrete, practical context for problems which are otherwise of a longer time horizon than most research. In particular, we focused on the notion of "Oracle AI" as used in the AI safety community, and observed that the word oracle has two meanings in the context of superintelligent AI systems. One usage refers to a subsystem of a larger AI system queried for domain-specific tasks, and the other to superintelligent AI systems restricted to only answer questions.

We examined contemporary question answering systems (QASs) and argued that, due to their architectural heterogeneity, consumer-oriented, NLP-based systems do not readily lend themselves to rigorous analysis from an AI safety perspective. On the other hand, we identified computer algebra systems (CASs) as concrete, if primitive, examples of domain-specific oracles. We examined well-known architectural deficiencies of CASs identified by the theorem proving community and argued that the integration of interactive theorem provers (ITPs) with CASs, an objective that has been an area of research in the respective communities for several decades, provides a set of research problems and practical software projects related to the development of powerful and robust math oracles on a multi-decade time horizon. Independent of their role as domain-specific oracles, such systems may also prove to be useful tools for AI safety researchers in proving the functional correctness of other components of an AI architecture. Natural choices of systems to use would be interfaces between the Wolfram Language, the most widely used computer algebra system, and one of the HOL family of theorem provers or Coq, both of which have substantial repositories of formalized proofs [72–75], or a more modern ITP such as Lean [51, 76].

Rather than representing a bold and profound new agenda, we view these projects as concrete and achievable goals that may pave the way to more substantial research directions. Because the topics we have discussed have a long and rich academic history, there are a number of "shovel-ready" projects appropriate for students anywhere from undergraduates to PhD students and beyond. Good undergraduate research projects would probably start with some basic data science to catalogue core computer algebra algorithms by their usage and popularity.
From there, it would be useful to have an estimate of what certified implementations of these algorithms would entail, whether formally verified implementations, or implementations along the lines of Kaliszyk and Wiedijk's HOLCAS system, where the CAS is built on top of a theorem prover. Also useful would be a systematic study of the role that computer algebra has played in mathematics and theoretical physics. This would have some interesting overlap with cognitive psychology, and these three projects together would make for an approachable undergraduate thesis, or a beginning project for a graduate student. A solid PhD thesis devoted to the topic of Oracle AI might involve tackling approaches to oracles stemming from reinforcement learning (RL) [8, 16], as well as more advanced theorem proving and CAS related topics, such as investigating the development of a hybrid architecture that would allow for proof checking. A student who worked on these projects for several years would develop a unique skill set spanning philosophy, machine learning, theorem proving, and computer algebra.

In the context of superintelligent oracle AIs which may possess the ability to manipulate a human user, we differentiate between addressing architectural or algorithmic deficiencies in subsystems and general control methods or containment strategies. Given that strong mathematical capabilities are likely to be useful in the construction of more general AI systems, designing robust CASs (and any other domain-specific oracle) is an important counterpart to general control strategies, as the top-level AI system will have fewer loopholes to exploit. Controlling OAIs poses a distinct set of challenges for which concrete mathematical analysis is in its infancy [8, 9, 16]. Nonetheless, considering how little attention has been given to the superintelligence control problem in general, we are optimistic about the potential to translate the high-level analyses of OAIs that have arisen in the AI safety community into the mathematical and software frameworks of modern artificial intelligence.

Acknowledgements

We would like to thank Stuart Armstrong, David Kristoffersson, Marcello Herreshoff, Miles Brundage, Eric Drexler, Cristian Calude, and several anonymous reviewers for insightful discussions and feedback on the manuscript. We would also like to thank the guest editors of Informatica, Ryan Carey, Matthijs Maas, Nell Watson, and Roman Yampolskiy, for organizing this special issue.

References

[1] N. Bostrom, Superintelligence: Paths, Dangers, Strategies. Oxford University Press, 2014.

[2] M. Shanahan, The Technological Singularity. MIT Press, 2015.

[3] D. Chalmers, "The Singularity: A Philosophical Analysis," Journal of Consciousness Studies, vol. 17, no. 9-10, pp. 7–65, 2010.

[4] M. Tegmark et al., "An Open Letter: Research Priorities for Robust and Beneficial Artificial Intelligence (Future of Life Institute)," 2015.

[5] S. Russell, D. Dewey, and M. Tegmark, "Research Priorities for Robust and Beneficial Artificial Intelligence," AI Magazine, vol. 36, no. 4, pp. 105–114, 2015.

[6] S. Armstrong, A. Sandberg, and N. Bostrom, "Thinking inside the box: Controlling and Using an Oracle AI," Minds and Machines, vol. 22, no. 4, pp. 299–324, 2012.

[7] B. Fallenstein, J. Taylor, and P. F. Christiano, "Reflective oracles: A foundation for game theory in artificial intelligence," in Logic, Rationality, and Interaction, pp. 411–415, Springer, 2015.
[8] S. Armstrong, "Value and policy networks as Oracle AIs." In preparation, 2017.

[9] S. Armstrong, "Good and safe uses of AI Oracles," ArXiv e-prints, Nov. 2017.

[10] V. C. Müller and N. Bostrom, "Future Progress in Artificial Intelligence: A survey of expert opinion," in Fundamental Issues of Artificial Intelligence, pp. 553–570, Springer, 2016.

[11] K. Grace, J. Salvatier, A. Dafoe, B. Zhang, and O. Evans, "When Will AI Exceed Human Performance? Evidence from AI Experts," ArXiv e-prints, May 2017.

[12] C. Rovelli, "Quantum gravity," Scholarpedia, vol. 3, no. 5, p. 7117, 2008.

[13] S. Russell, "Should We Fear Supersmart Robots?," Scientific American, vol. 314, no. 6, pp. 58–59, 2016.

[14] A. H. Eden, J. H. Moor, J. H. Soraker, and E. Steinhart, Singularity Hypotheses: A Scientific and Philosophical Assessment. Springer Verlag, 2012.

[15] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mané, "Concrete Problems in AI Safety," ArXiv e-prints, June 2016.

[16] S. M. Armstrong and L. Orseau, "Safely Interruptible Agents." Submitted, 2016.

[17] D. Ferrucci, E. Brown, J. Chu-Carroll, J. Fan, D. Gondek, A. A. Kalyanpur, A. Lally, J. W. Murdock, E. Nyberg, J. Prager, et al., "Building Watson: An overview of the DeepQA project," AI Magazine, vol. 31, no. 3, pp. 59–79, 2010.

[18] W. Knight, "IBM Pushes Deep Learning with a Watson Upgrade," MIT Technology Review, July 2015.

[19] S. Wolfram, "Jeopardy, IBM, and Wolfram|Alpha," Stephen Wolfram Blog, January 2011.

[20] S. Weinzierl, "Computer Algebra in Particle Physics," ArXiv High Energy Physics - Phenomenology e-prints, Sept. 2002.

[21] Z. Huang, "Machine Learning and Computer Algebra," tech. rep., University of Cambridge, Computer Laboratory, 2016.

[22] G. Irving, C. Szegedy, A. A. Alemi, F. Chollet, and J. Urban, "DeepMath—Deep Sequence Models for Premise Selection," in Advances in Neural Information Processing Systems, pp. 2235–2243, 2016.

[23] E. Komendantskaya, J. Heras, and G. Grov, "Machine Learning in Proof General: Interfacing Interfaces," ArXiv e-prints, Dec. 2012.

[24] A. Bundy, D. Hutter, C. B. Jones, and J. S. Moore, "AI meets Formal Software Development (Dagstuhl Seminar 12271)," Dagstuhl Reports, vol. 2, no. 7, pp. 1–29, 2012.

[25] M. J. Beeson, "The Mechanization of Mathematics," in Alan Turing: Life and Legacy of a Great Thinker, pp. 77–134, Springer, 2004.

[26] G. Klein, K. Elphinstone, G. Heiser, J. Andronick, D. Cock, P. Derrin, D. Elkaduwe, K. Engelhardt, R. Kolanski, M. Norrish, et al., "seL4: Formal verification of an OS kernel," in Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, pp. 207–220, ACM, 2009.

[27] R. Kaivola, R. Ghughal, N. Narasimhan, A. Telfer, J. Whittemore, S. Pandav, A. Slobodová, C. Taylor, V. Frolov, E. Reeber, et al., "Replacing Testing with Formal Verification in Intel Core i7 Processor Execution Engine Validation," in International Conference on Computer Aided Verification, pp. 414–429, Springer, 2009.

[28] L. Fix, "Fifteen years of formal property verification in Intel," in 25 Years of Model Checking, pp. 139–144, Springer, 2008.

[29] C. Kern and M. R. Greenstreet, "Formal verification in hardware design: a survey," ACM Transactions on Design Automation of Electronic Systems, vol. 4, no. 2, pp. 123–193, 1999.

[30] T. Kropf, Introduction to Formal Hardware Verification. Springer Science & Business Media, 2013.
[31] C. Ballarin, Computer Algebra and Theorem Proving. PhD thesis, University of Cambridge, Computer Laboratory, 1999.

[32] C. Kaliszyk and F. Wiedijk, "Certified computer algebra on top of an interactive theorem prover," in Towards Mechanized Mathematical Assistants, pp. 94–105, Springer, 2007.

[33] S. M. Watt, "On the future of Computer Algebra Systems at the Threshold of 2010," Proceedings ASCM-MACIS, pp. 422–430, 2009.

[34] W. Windsteiger, "Theorema 2.0: a system for mathematical theory exploration," in International Congress on Mathematical Software, pp. 49–52, Springer, 2014.

[35] P. G. Bertoli, J. Calmet, F. Giunchiglia, and K. Homann, "Specification and integration of theorem provers and computer algebra systems," in International Conference on Artificial Intelligence and Symbolic Computation, pp. 94–106, Springer, 1998.

[36] E. Clarke and X. Zhao, "Analytica—A theorem prover in Mathematica," in International Conference on Automated Deduction, pp. 761–765, Springer, 1992.

[37] A. Dolzmann and T. Sturm, "Redlog: Computer algebra meets computer logic," ACM SIGSAM Bulletin, vol. 31, no. 2, pp. 2–9, 1997.

[38] R. D. Jenks and R. S. Sutor, AXIOM: The Scientific Computation System. Springer, 2013.

[39] E. Poll and S. Thompson, "Adding the axioms to Axiom," tech. rep., Computing Laboratory, University of Kent, 1998.

[40] R. Miner, "The importance of MathML to mathematics communication," Notices of the AMS, vol. 52, no. 5, pp. 532–538, 2005.

[41] S. Buswell, O. Caprotti, D. P. Carlisle, M. C. Dewar, M. Gaetano, and M. Kohlhase, "The OpenMath Standard," tech. rep., The OpenMath Society, 2004.

[42] J. Calmet and V. Lefevre, "Toward the Integration of Numerical Computations into the OMSCS Framework," in 7th International Workshop on Computer Algebra in Scientific Computing-CASC, pp. 71–79, 2004.

[43] J. Carette, W. M. Farmer, and R. O'Connor, "MathScheme: project description," in International Conference on Intelligent Computer Mathematics, pp. 287–288, Springer, 2011.

[44] A. Armando and D. Zini, "Towards Interoperable Mechanized Reasoning Systems: the Logic Broker Architecture," in AI*IA-TABOO Workshop: From Objects to Agents: Evolutionary Trends of Software Systems, pp. 70–75, 2000.

[45] A. Adams, M. Dunstan, H. Gottliebsen, T. Kelsey, U. Martin, and S. Owre, "Computer algebra meets automated theorem proving: Integrating Maple and PVS," in International Conference on Theorem Proving in Higher Order Logics, pp. 27–42, Springer, 2001.

[46] J. Harrison and L. Théry, "A skeptic's approach to combining HOL and Maple," Journal of Automated Reasoning, vol. 21, no. 3, pp. 279–294, 1998.

[47] C. Ballarin, K. Homann, and J. Calmet, "Theorems and algorithms: An interface between Isabelle and Maple," in Proceedings of the International Symposium on Symbolic and Algebraic Computation, pp. 150–157, ACM, 1995.

[48] P. Jackson, "Exploring abstract algebra in constructive type theory," in International Conference on Automated Deduction, pp. 590–604, Springer, 1994.

[49] J. Siekmann, C. Benzmüller, V. Brezhnev, L. Cheikhrouhou, A. Fiedler, A. Franke, H. Horacek, M. Kohlhase, A. Meier, E. Melis, et al., "Proof development with OMEGA," in International Conference on Automated Deduction, pp. 144–149, Springer, 2002.
[50] C. Ballarin and L. C. Paulson, "A pragmatic approach to extending provers by computer algebra—with applications to coding theory," Fundamenta Informaticae, vol. 39, no. 1-2, pp. 1–20, 1999.

[51] R. Y. Lewis, "An extensible ad hoc interface between Lean and Mathematica." In preparation, 2017.

[52] E. Yudkowsky and M. Herreshoff, "Tiling agents for self-modifying AI, and the Löbian obstacle," tech. rep., Machine Intelligence Research Institute, 2013.

[53] P. LaVictoire, "An Introduction to Löb's Theorem in MIRI Research," tech. rep., Machine Intelligence Research Institute, 2015.

[54] M. Barasz, P. Christiano, B. Fallenstein, M. Herreshoff, P. LaVictoire, and E. Yudkowsky, "Robust Cooperation in the Prisoner's Dilemma: Program Equilibrium via Provability Logic," ArXiv e-prints, Jan. 2014.

[55] B. Fallenstein and N. Soares, "Problems of self-reference in self-improving space-time embedded intelligence," in International Conference on Artificial General Intelligence, pp. 21–32, Springer, 2014.

[56] N. Soares and B. Fallenstein, "Toward Idealized Decision Theory," ArXiv e-prints, July 2015.

[57] S. Garrabrant, T. Benson-Tilsen, A. Critch, N. Soares, and J. Taylor, "Logical Induction," ArXiv e-prints, Sept. 2016.

[58] K. Beck, Test Driven Development: By Example. Addison-Wesley, 2002.

[59] R. Osherove, The Art of Unit Testing: with examples in C#. Manning Publications, 2013.

[60] E. M. Maximilien and L. Williams, "Assessing test-driven development at IBM," in Proceedings of the 25th International Conference on Software Engineering, pp. 564–569, IEEE, 2003.

[61] H. Erdogmus, "On the effectiveness of the test-first approach to programming," IEEE Transactions on Software Engineering, vol. 31, no. 1, 2005.

[62] G. P. Sarma, T. W. Jacobs, M. D. Watts, S. V. Ghayoomie, S. D. Larson, and R. C. Gerkin, "Unit testing, model validation, and biological simulation," F1000Research, vol. 5, 2016.

[63] C. Keller, A Matter of Trust: Skeptical Communication Between Coq and External Provers. PhD thesis, École Polytechnique, 2013.

[64] M. Armand, G. Faure, B. Grégoire, C. Keller, L. Théry, and B. Werner, "A modular integration of SAT/SMT solvers to Coq through proof witnesses," in International Conference on Certified Programs and Proofs, pp. 135–150, Springer, 2011.

[65] J. Harrison, "Towards self-verification of HOL Light," in International Joint Conference on Automated Reasoning, pp. 177–191, Springer, 2006.

[66] R. Pollack, "How to believe a machine-checked proof," Twenty Five Years of Constructive Type Theory, vol. 36, p. 205, 1998.

[67] G. Hardy and J. Hadamard, "The Psychology of Invention in the Mathematical Field," 1946.

[68] S. Dehaene, The Number Sense: How the Mind Creates Mathematics. Oxford University Press, 2011.

[69] P. Drijvers and K. Gravemeijer, "Computer Algebra as an Instrument: Examples of Algebraic Schemes," in The Didactical Challenge of Symbolic Calculators, pp. 163–196, Springer, 2005.

[70] P. Drijvers, "Learning mathematics in a computer algebra environment: obstacles are opportunities," Zentralblatt für Didaktik der Mathematik, vol. 34, no. 5, pp. 221–228, 2002.

[71] G. Lakoff and R. Núñez, Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being. Basic Books, 2000.

[72] S. Wolfram, An Elementary Introduction to the Wolfram Language. Wolfram Media, 2015.

[73] L. C. Paulson, "The foundation of a generic theorem prover," Journal of Automated Reasoning, vol. 5, no. 3, pp. 363–397, 1989.
[74] L. C. Paulson, Isabelle: A Generic Theorem Prover, vol. 828. Springer Science & Business Media, 1994.

[75] Y. Bertot and P. Castéran, Interactive Theorem Proving and Program Development: Coq'Art: The Calculus of Inductive Constructions. Springer Science & Business Media, 2013.

[76] L. de Moura, S. Kong, J. Avigad, F. van Doorn, and J. von Raumer, "The Lean Theorem Prover," in International Conference on Automated Deduction, pp. 378–388, Springer, 2015.